Re: Help needed on Boyce-Codd Normal Form.

From: Kira Yamato <kirakun_at_earthlink.net>
Date: Sat, 15 Dec 2007 05:16:51 -0500
Message-ID: <2007121505165116807-kirakun_at_earthlinknet>


On 2007-12-14 21:50:13 -0500, Jan Hidders <hidders_at_gmail.com> said:

> On 15 dec, 02:16, Kira Yamato <kira..._at_earthlink.net> wrote:

>> On 2007-12-14 17:08:02 -0500, TroyK <cs_tr..._at_juno.com> said:
>> 
>> 
>> 
>>> On Dec 14, 4:42 am, Kira Yamato <kira..._at_earthlink.net> wrote:
>>>> On 2007-12-14 06:24:06 -0500, Kira Yamato <kira..._at_earthlink.net> said:
>> 
>>>>> Suppose I want to create a schema for
>>>>> S = student,
>>>>> J = subject,
>>>>> T = teacher,
>>>>> that enforces
>> 
>>>>> 1) For each subject, each student taking that subject is taught by
>>>>> exactly one teacher.  But a student can take multiple subjects.
>>>>> 2) Each teacher teaches exactly one subject.  But a subject can be
>>>>> taught by multiple teachers.
>> 
>>>>> How do you create a schema that is at least BCNF and enforces 1) and 2)?
>> 
>>>> Ok.  Here is my attempt:
>> 
>>>> One obvious relation is
>>>> T --> J.
>>>> This relation is obviously BCNF and it captures 2).
>> 
>>>> But I'm having trouble identifying the other relation(s) for 1).
>> 
>>>> First, I tried the relation
>>>> SJ --> T,
>>>> but it is not BCNF because of the functional dependency T --> J.
>> 
>>>> Next, I tried the relation
>>>> ST --> {},
>>>> but this relation does not enforce 1).  For example, a student can take
>>>> two teachers teaching the same subject --- a scenario not allowed.
>> 
>>>> So, I'm in search of some hints or suggestions.
>> 
>>>> --
>> 
>>>> -kira
>> 
>>> This looks like the example Date gives in An Introduction To Database
>>> Systems (or am I thinking of another of his writings?).
>> 
>>> Basically, your choices are to decompose the relvar into 2 BCNF
>>> relvars, in which case you will have a multi-relvar constraint to deal
>>> with in order to preserve one of the FDs, or forgo BCNF and leave the
>>> relvar as is.
>> 
>> Hmm...  Interesting.  I had thought that you can always decompose a
>> scheme into BCNF while keeping the FD independent from each other.

>
> That depends a bit on what you mean by "keeping the FDs independent".
> In some sense it is always possible to normalize to BCNF without
> introducing functional dependencies that span multiple relations.

Nice way to put it. This is what I was thinking of intuitively. Thanks for showing me how to state it more precisely.

> Your
> example can be represented in pseudo-BCNF as follows:
>
> R1(S,J,T) with candidate key {S,J}
> R2(T,J) with candidate key {T}
> inclusion dependencies:
> - from R1(T,J) to R2(T,J)
>
> Note that I omitted the candidate key {S,T} for R1. It is implied by
> the other specified dependencies. So you see it is quite possible to
> come up with a set of relations and a set of dependencies that consist
> of inclusion dependencies and local functional dependencies such that
> all original dependencies are implied.
>
> What is a bit dodgy about this solution is that it pretends that the
> FD T-->J does not exists for R1 and so it will only find out that this
> FD has been violated by an update on R2 if the update is propagated to
> R2. It is in that sense that this is usually not considered a valid
> solution: not all local FDs are implied by the CKs. It is also in this
> sense that the claim is correct that says that you cannot always
> normalize to BCNF while being dependency preserving. Note that in
> practice it could be that pseudo-BCNF is acceptable, provided that
> your DBMS supports it efficiently.

Nice explanation. So, the trouble is that some constraint is needed that must span multiple relations, like the inclusion dependency you mentioned above.

So, when you say "provided that your DBMS supports it [pseudo-BCNF]," do you mean that a DBMS that allows specifying a constraint across multiple tables?

>

>>> Here are the FDs I see given the rules... they may give you a hint how
>>> to proceed:
>> 
>>> { S, J } -> { T }
>>> Loosely, if you know a student and the subject they're taking, you
>>> know the teacher, and
>> 
>>> { T } -> { J }
>>> if you know the teacher, you know the subject being taught.
>> 
>> Yes, these two FD's are exactly the two constraints I was trying to
>> achieve with BCNF.  But, I suppose this cannot always be done then.
>> 
>> Is there a way to identify when a set of FD's cannot be decomposed into BCNF?

>
> Yep. Determine an irreducible cover. Check if there are two different
> FDs in it such that the set of all attributes mentioned in one is a
> subset of that of the other. If this is the case you cannot get to
> BCNF and be dependency preserving.

I am guessing that by an "irreducible cover" you mean some minimal set of FD's such that all of its combinations will produce all of the original FD's in question.

So your claim is that if the procedure of getting to BCNF while being dependency preserving cannot be met, then every irreducible cover of the original FD's will have two distinct FD's such that the set of attributes mentioned in one will be a subset of those in the other.

This is quite a strong claim.

>
> But you can also simply start splitting off 3NF violating dependencies
> as usual and see what you end up with. If that is not in BCNF then you
> cannot get there and be dependency preserving. The order in which you
> pick the dependencies does not matter.

Very cool results. I will try to look up some books that prove these assertions. Do you have a good reference to recommend?

Thanks so much.

-- 

-kira
Received on Sat Dec 15 2007 - 11:16:51 CET

Original text of this message