Re: Sixth normal form

From: Brian Selzer <brian_at_selzer-software.com>
Date: Tue, 21 Aug 2007 03:52:10 GMT
Message-ID: <Kjtyi.47081$Um6.24324_at_newssvr12.news.prodigy.net>


"Jan Hidders" <hidders_at_gmail.com> wrote in message news:1187647809.393912.318860_at_57g2000hsv.googlegroups.com...
> On 20 aug, 16:32, "Brian Selzer" <br..._at_selzer-software.com> wrote:
>>
>>
>> For this particular case, you're right, but it doesn't hurt to look at the
>> FDs, and in the general case, it is necessary. Here are some simple
>> decompositions.
>>
>> (1) R(A,B,C) such that A --> B and A --> C into R1(A,B) and R2(A,C):
>> Since A --> C, there can't be a value for A without a value for C, so the
>> IND R1[A] in R2[A] is needed. Since A --> B, there can't be a value for A
>> without a value for B, so the IND R2[A] in R1[A] is needed.
>
> The conclusion is correct, but your argumentation is false. It is not
> because of A --> C that there cannot be an A value without a C value.
> Also if that FD did not hold then there would have to be for every A
> value at least one C value. All that the FD says is that in addition
> there can be at most one.
>
>> (2) R(A,B,C) such that A --> B and B --> C into R1(A,B) and R2(B,C):
>> Since A --> C is implied by the cover for R, there can't be a value for A
>> without a value for C, so the IND R1[B] in R2[B] is needed.
>
> But in R there cannot be a B value without a value for A, so the IND
> R2[B] -> R1[B] is needed.
>
>> (3) R(A,B,C) such that AB --> C and C --> B into R1(A,C) and R2(B,C):
>> Since C --> B, there can't be a value for C without a value for B, so the
>> IND R1[C] in R2[C] is needed.
>
> In R there cannot be a C value without at least one associated A value,
> so you also need IND R2[C] -> R1[C].
>
>>
>> (4) R(A,B,C) such that A --> B and B --> A into R1(A,B) and R2(A,C):
>> Since A --> B, there can't be a value for A without a value for B, so the
>> IND R2[A] in R1[A] is needed.
>
> In R there cannot be a B value without at least one associated C
> value, so you also need IND R1[A] -> R2[A].
>
> So you see that in all cases both INDs are required. Which FDs exactly
> hold is in fact completely immaterial.
>

In each case above that you added an IND, the reason was something like, "There cannot be a value for Y without at least one associated value for X." In each case above where I added an IND, the reason was something like, "There cannot be a value for X without one and only one associated value for Y."

Did you notice that each IND that I defined above restores the functional relationship between sets of attributes, whereas each IND that you added restores only the surjective nature of those functions?

In (1) the FDs AB --> AC and AC --> AB appear in the closure of R.
In (2) the FD AB --> BC appears in the closure of R.
In (3) the FD AC --> BC appears in the closure of R.
In (4) the FD AC --> AB appears in the closure of R.
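
The four closure claims above can be checked mechanically with the standard attribute-closure algorithm. The following is only a minimal sketch: the helper names `closure` and `implies` and the encoding of each FD as a pair of strings are assumptions made for illustration, not anything from the thread.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of FDs.

    Each FD is a (lhs, rhs) pair of attribute strings, e.g. ("AB", "C").
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD if its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs --> rhs follows from the given FDs."""
    return set(rhs) <= closure(lhs, fds)


# case number -> (given FDs, FDs claimed to be in the closure)
cases = {
    1: ([("A", "B"), ("A", "C")], [("AB", "AC"), ("AC", "AB")]),
    2: ([("A", "B"), ("B", "C")], [("AB", "BC")]),
    3: ([("AB", "C"), ("C", "B")], [("AC", "BC")]),
    4: ([("A", "B"), ("B", "A")], [("AC", "AB")]),
}

results = {
    n: all(implies(fds, lhs, rhs) for lhs, rhs in claimed)
    for n, (fds, claimed) in cases.items()
}
print(results)
```

Each claimed FD is tested by computing the closure of its left-hand side and checking that the right-hand side is contained in it.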

In (1) the functional relationship is bijective, so both INDs are required to ensure that the relationship remains a function in both directions.

In (2) the IND R1[B] in R2[B] ensures that the relationship from AB to BC remains a function, but the IND R2[B] in R1[B] only ensures that it remains a surjection.

In (3) the IND R1[C] in R2[C] ensures that the relationship from AC to BC remains a function, but the IND R2[C] in R1[C] only ensures that it remains a surjection.

In (4) the IND R2[A] in R1[A] ensures that the relationship from AC to AB remains a function, but the IND R1[A] in R2[A] only ensures that it remains a surjection.
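
To make the function/surjection distinction concrete, here is a small sketch for decomposition (2), R1(A,B) and R2(B,C). It takes one possible reading of the argument: the "relationship from AB to BC" is the mapping that sends each R1 row to the R2 rows sharing its B value. The sample rows and all helper names are hypothetical, and B is assumed to be a key of R2 in the sample data.

```python
def project(rel, idx):
    """Set of values found in column idx of a relation (a set of tuples)."""
    return {row[idx] for row in rel}


def ind_holds(sub, sub_idx, sup, sup_idx):
    """Inclusion dependency: sub[sub_idx] is a subset of sup[sup_idx]."""
    return project(sub, sub_idx) <= project(sup, sup_idx)


def images(r1, r2):
    """Map each (A, B) row of R1 to the (B, C) rows of R2 with the same B."""
    return {ab: {bc for bc in r2 if bc[0] == ab[1]} for ab in r1}


def is_total_function(r1, r2):
    """Every R1 row has exactly one image in R2."""
    return all(len(img) == 1 for img in images(r1, r2).values())


def is_surjective(r1, r2):
    """Every R2 row is the image of at least one R1 row."""
    return set().union(*images(r1, r2).values()) == set(r2)


r1 = {("a1", "b1"), ("a2", "b2")}
r2_missing = {("b1", "c1")}                          # no row for b2
r2_extra = {("b1", "c1"), ("b2", "c2"), ("b3", "c3")}  # b3 unused by R1
r2_both = {("b1", "c1"), ("b2", "c2")}

for name, r2 in [("missing", r2_missing), ("extra", r2_extra), ("both", r2_both)]:
    print(
        name,
        "R1[B] in R2[B]:", ind_holds(r1, 1, r2, 0),
        "R2[B] in R1[B]:", ind_holds(r2, 0, r1, 1),
        "function:", is_total_function(r1, r2),
        "surjective:", is_surjective(r1, r2),
    )
```

In this sample data, R1[B] in R2[B] holds exactly when every R1 row has an image, and R2[B] in R1[B] holds exactly when no R2 row is left without a preimage.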

My argument is that if there is a functional relationship from one set of attributes to another in a less normalized schema, then there should still be a functional relationship after decomposition. I don't think it's always necessary that that functional relationship also remains a surjection. (This coincides with the definition of independence I supplied earlier.) In fact, it is often undesirable: the update anomalies that BCNF is supposed to eliminate can be linked to the fact that the relationships are surjective, so adding an IND in the opposite direction would be counterproductive. I would note, however, that when moving from 5NF to 6NF, the relationships between the sets of attributes in each pair of projections are bijective, and therefore a cyclical referential constraint is always required.

> -- Jan Hidders
>
Received on Tue Aug 21 2007 - 05:52:10 CEST
