Re: Sixth normal form

From: Brian Selzer <brian_at_selzer-software.com>
Date: Tue, 07 Aug 2007 04:38:35 GMT
Message-ID: <fHSti.13059$eY.2161_at_newssvr13.news.prodigy.net>


"Jan Hidders" <hidders_at_gmail.com> wrote in message news:1186391813.648681.90280_at_w3g2000hsg.googlegroups.com...

[big snip]

>>
>> > Agreed. I would add that this is not specific for going from 5NF to
>> > 6NF but anywhere you decompose to go from a lower to a higher normal
>> > form.
>>
>> Not always. If two sets of attributes are independent, as is the case
>> when
>> moving from 1NF to 2NF, then there is no need for a referential
>> constraint;
>> if two projections are independent, as is the case when moving from BCNF
>> to
>> 4NF, then there is no need for a referential constraint.
>
> I have no idea what you mean here. In all those cases you need a
> cyclic pair of nclusion dependencies if you want the new schema to be
> equivalent with the old. And if you only want the new schema to
> contain at least as much information as the old then you strictly
> speaking don't need any inclusion dependency at all.
>

If two sets of attributes are independent, then for each combination of values for one of the sets of attributes, there is a copy of the projection over all of the other attributes. What I mean is, the cardinality of the 1NF relation is equal to the product of the cardinalities of the projections over each independent set of attributes. A cyclical constraint would only prevent (pathologically, I should add) the case where the 1NF relation is empty and where either but not both of the 2NF relations is empty: the whole point of moving to 2NF is to avoid the update anomalies caused by lumping two independent sets of attributes in the same relation schema. The same can be said for independent projections of relations, as is the case when moving from BCNF to 4NF.

I guess I need to clarify what I mean by /at least the same/ information content. Consider the following example relation schema that is in 5NF:

(1) {Whse, Item, TranId, RcvdDate, QtyRcvd, QtySold, Cost}

such that

{Whse, Item, TranId} --> {RcvdDate, QtyRcvd, QtySold, Cost}

An instance of this schema enumerates a set of cost tiers for items in a warehouse. Cost is the total cost for the quantity received. The unit cost is therefore Cost / QtyRcvd, but is not calculated ahead of time to minimize rounding errors. There can be several cost tiers for an item in a warehouse, and the cost assigned to a sale is computed by pulling from the oldest tier first, and continuing with successive tiers until the quantity needed for the sale has been met. Now decomposing the above schema into 6NF gives:

(2) {Whse, Item, TranId, RcvdDate}
(3) {Whse, Item, TranId, QtyRcvd}
(4) {Whse, Item, TranId, QtySold}
(5) {Whse, Item, TranId, Cost}

This schema allows a Cost without a QtyRcvd, thus preventing the calculation of unit cost. It permits the existence of a QtySold without a QtyRcvd, thus preventing the calculation of the remaining quantity. It also permits QtyRcvd without RcvdDate, thus preventing the selection of the oldest tier. Clearly the existence of QtySold and Cost depend upon the existence of QtyRcvd, and the existence of QtyRcvd depends upon the existence of RcvdDate. Consequently, the set of 6NF relation schemata can't contain /at least the same/ information content as the 5NF schema, because some valid instances of the 6NF schemata do not make sense.

Suppose that tuples exist for a particular {Whse, Item, TranId} in instances of the 6NF schemata for (3) and (4), but not for (2) and (5). A query seeking the total quantity on hand for the Item (SUM(QtyRcvd - QtySold)) could indicate that quantity required for a sale is available, but since the tuples for (2) and (5) do not exist, the cost to be assigned to the sale transaction cannot be computed. It certainly doesn't sound to me like a good thing to carry inventory that you can't sell!

It should be obvious that if sets of dependent attributes are truly independent of each other (which is the only case that an inclusion dependency is not indicated), then they should have already been separated into their own relation schemata as part of the normalization process. It means that a multivalued dependency exists that is not implied by the candidate keys. It means that the relation schema is not even in 4NF. Therefore, when moving from 5NF to 6NF, a cyclical constraint is /always/ necessary to avoid information loss.

[snip] Received on Tue Aug 07 2007 - 06:38:35 CEST

Original text of this message