Re: How to normalize this?

From: Erwin <e.smout_at_myonline.be>
Date: Mon, 6 May 2013 08:23:41 -0700 (PDT)
Message-ID: <f25d9b02-7abc-4325-bb06-bb2166d4e8d9_at_googlegroups.com>


On Monday, 6 May 2013 00:52:44 UTC+2, Jan Hidders wrote the following:
> On 2013-05-05 19:32:35 +0000, Erwin said:
>
> > On Sunday, 5 May 2013 10:10:49 UTC+2, Jan Hidders wrote the following:
> >
> >> On 2013-05-04 23:32:18 +0000, ... said:
> >>
> >>> On Thursday, May 2, 2013 6:23:07 AM UTC-7, Jan Hidders wrote:
> >>>
> >>>> But most textbooks I know are correct and clear on this and will
> >>>> state that the goal is, for example, a lossless-join and
> >>>> dependency-preserving decomposition.
> >>>
> >>> Can you name some? Because I don't think books are clear about
> >>> normalization within design, which involves generated predicates and
> >>> constraint preservation. Normalization per se assumes the generated
> >>> components are projections of the original, i.e. are suitably
> >>> constrained, without addressing the management of those constraints
> >>> (e.g. introducing FKs for lost FDs).
> >>>
> >>>> 1. Apply the 3NF normalization procedure we discussed earlier
> >>>> 2. Include with each component only the FDs that generated that component
> >>>
> >>> It's not clear to me what you are describing or how it relates to
> >>> arguing against stopping at 3NF.
> >>
> >> I've described to you a normalization algorithm that seems to produce
> >> a schema in BCNF that is also lossless-join and dependency preserving,
> >> but also has no inter-component dependencies. That's not supposed to
> >> be possible according to the usual normalization explanation. So what
> >> went wrong here?
> >
> > That's simply not true, is it?
> >
> > You are referring to the three-table decomposition, right?
>
> No. I'm referring to the algorithm that is described by steps 1. and 2.
> above. It's an adapted variant of the usual algorithm that produced the
> three-table decomposition. That older algorithm does not always produce
> a schema in BCNF. Mine seems to do so.
>
> > That one does have the inter-component rule that R1 JOIN R2 === R1 JOIN
> > R3 (as I pointed out), no?
>
> Only if you want to be information-equivalent, but conventional
> normalization does not require that. It only aims at obtaining a
> lossless decomposition. What I am now pointing out is that, if that is
> our goal, it seems actually possible to always normalize to BCNF
> without introducing inter-component dependencies. That cannot be right,
> now, can it? So what then is exactly meant by the conventional claim
> that this is not possible?
>
> > "What went wrong", imo, is that your decomposition did not match the
> > original "spirit" of the operation, in which relation schemas get
> > split in _exactly_ two, with the dependent attributes of the "small" FD
> > (bc->e) moved _out of_ the schema. Your three-way split is either not
> > a split in exactly two, or it is two subsequent splits-in-two where the
> > first does _not_ move any attribute "out of the schema" (precisely
> > because it is still needed for the second split).
>
> Note that I was applying one of the usual textbook 3NF normalization
> algorithms. Normalizing stepwise with pair-wise splits is not the only
> option and will in fact sometimes not allow you to find a perfectly
> valid 3NF decomposition.
>
> -- Jan Hidders

My previous reply was written in too much of a hurry.

I'll repeat the running {abcde} example with the FDs { ... bc->e, cd->e }.

The way I understand it, conventional normalization procedure would take, say, bc->e, introduce a relvar {bce} in which that FD was to hold, _and then reduce the original schema_ by the RHS of that FD, {e}.
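As a rough sketch (not from the thread; the helper name and representation are mine), that conventional step on the running example might look like this in Python, with schemas and FD sides as attribute sets:

```python
# Hypothetical sketch of the conventional decomposition step described
# above: split out a component for the FD, then shrink the remaining
# original schema by the FD's right-hand side.

def conventional_step(schema, fd):
    """Split `schema` on an FD lhs -> rhs: emit the component lhs|rhs
    (the relvar in which the FD is to hold) and remove rhs from the
    original schema."""
    lhs, rhs = fd
    component = lhs | rhs      # new relvar {b, c, e}
    remainder = schema - rhs   # original schema reduced by the RHS
    return component, remainder

# Running example: {abcde} with bc->e.
schema = set("abcde")
component, remainder = conventional_step(schema, (set("bc"), set("e")))
print(sorted(component))  # ['b', 'c', 'e']
print(sorted(remainder))  # ['a', 'b', 'c', 'd']
```

Note that the reduction here ignores the fact that cd->e still mentions e; that is exactly the point of contrast with the variant below.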

"Your" normalization procedure would take the same FD and introduce the same relvar {bce}, but it would then do a different kind of "reduction" on the original schema. Instead of removing the RHS of the FD just 'processed' without any consideration for the remaining FDs in the original schema, you first consider the remaining FDs in the original schema, and remove _only those attributes_ of the RHS of the 'processed' FD that are no longer mentioned in any remaining FD.

In the running example, you are _not_ going to remove {e} because e is still mentioned in cd->e, hence you are going to leave the original {abcde} schema unaltered. Only after you've 'processed' cd->e as well, will {e} be "removed" from the original schema.
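A minimal sketch of that variant step, assuming the FDs are processed in the order bc->e then cd->e (again, the helper name is mine, not from the thread):

```python
# Hypothetical sketch of the variant step: an RHS attribute is removed
# from the original schema only once no remaining (unprocessed) FD
# still mentions it.

def variant_step(schema, fd, remaining_fds):
    """Process lhs -> rhs: emit component lhs|rhs, but keep any RHS
    attribute that some remaining FD still mentions."""
    lhs, rhs = fd
    component = lhs | rhs
    if remaining_fds:
        still_needed = set().union(*(l | r for l, r in remaining_fds))
    else:
        still_needed = set()
    remainder = schema - (rhs - still_needed)
    return component, remainder

schema = set("abcde")
fds = [(set("bc"), set("e")), (set("cd"), set("e"))]

# Process bc->e: e is still mentioned in cd->e, so {abcde} is unaltered.
c1, schema = variant_step(schema, fds[0], fds[1:])
# Process cd->e: no FDs remain, so e is finally removed.
c2, schema = variant_step(schema, fds[1], [])
print(sorted(c1), sorted(c2), sorted(schema))
# ['b', 'c', 'e'] ['c', 'd', 'e'] ['a', 'b', 'c', 'd']
```

The first step is exactly the "degenerate decomposition" discussed below: a relvar {bce} is introduced while the original schema stays {abcde}.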

I presume that such "degenerate decompositions", in which an extra relvar is introduced and the original schema left unaltered, were never considered. If that sort of answers your actual question (meaning I've probably grasped the essence of what your actual question is), I'm tempted to suggest that for a definitive answer, the question would have to be posed to the Chris Dates and/or Ron Fagins of this world. They're the ones who "were there when it happened".

Received on Mon May 06 2013 - 17:23:41 CEST
