# Re: 3vl 2vl and NULL

From: David Cressey <dcressey_at_verizon.net>
Date: Sat, 18 Feb 2006 19:10:15 GMT
Message-ID: <riKJf.1600\$lR2.1422_at_trndny01>

> The thing that I can't help but feel, though, is that the whole
> thing is a false dichotomy. If you can really always do the
> composition/decomposition, then it would appear that the
> choice between nested structure or not could be expressed
> simply as a view. How come no one ever talks about that
> idea?
>
I can't help but agree with you. I think that Codd's choice of the term "normalization", rather than some other term such as "regularization", is instructive in this regard. The word "normalization" had already been used in other areas of mathematics.

I have only an informal understanding of what "normalization" means, in general, but here goes:

We start off with a set (in our case, the set of all possible schemas). We discover an equivalence relationship that is possible between distinct elements of the set. (In our case, the equivalence relationship is something like "capable of expressing the same facts", although I would want to tighten that up a bit.) Once we discover the equivalence relationship, we divide the original set into groups. (The word "group" here is NOT used in its specialized mathematical sense.)

The grouping is such that two elements are in the same group if and only if they satisfy the equivalence relationship with each other. (In our case, a given group is the set of all possible schemas that are capable of expressing a given set of facts.) In each group, we set apart one element, which we are going to use as the "archetype" for all of the other members of its group. We also devise a rule for discovering the archetype, given any element of any group.

We call the archetype the "normal form" for all the elements of the group, and we call the process of transforming any element into the archetype of its group "normalizing".
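To make the general idea concrete, here is a minimal sketch using an analogy that is not from the discussion itself: fractions written as (numerator, denominator) pairs. Two pairs are equivalent if they denote the same number, and "lowest terms with a positive denominator" plays the role of the archetype. The function names `equivalent` and `normalize` are just illustrative.

```python
from math import gcd


def equivalent(a, b):
    """The equivalence relationship: two pairs denote the same number."""
    return a[0] * b[1] == b[0] * a[1]


def normalize(frac):
    """The rule for finding the archetype of a pair's group:
    reduce to lowest terms and make the denominator positive."""
    n, d = frac
    if d == 0:
        raise ValueError("denominator must be non-zero")
    g = gcd(n, d)          # always positive here, since d != 0
    if d < 0:
        g = -g             # dividing by a negative g flips both signs
    return (n // g, d // g)


# Two elements are in the same group exactly when they share an archetype.
pairs = [(2, 4), (-3, -6), (1, 2), (1, 3)]
for p in pairs:
    print(p, "->", normalize(p))
```

In this analogy every group has exactly one archetype, so comparing normal forms is enough to decide equivalence.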

In the case at hand, the rule was to split out repeating groups into separate tuples (I don't actually remember whether Codd, in the 1970 paper, decomposed the relation into two relations), and an interesting thing happened. As soon as people started looking into the "normal form", they discovered that the rule about eliminating repeating groups did not result in a single unique schema, but rather in a set of schemas that were all capable of expressing the same facts, and all without repeating groups.

This eventually gave rise to "second normal form", and the renaming of "normal form" into "first normal form". And the rest is history.

I apologize for any sloppiness in the above. I'm speaking a little above my level of mathematical sophistication, and that means that sometimes my phrasing is clumsy, or just plain incorrect.

To come back to your point: it's clear among those who read and understand Codd and Date that for every schema that uses nested sets or lists (repeating groups) to express certain facts, there exists an equivalent schema that does not use repeating groups, and that expresses the same facts.

That means that we can, if we want to, restrict ourselves to the design of schemas that do not contain repeating groups. That is how I learned to design databases when I first learned database design. It was a little bit of trouble, but only a little bit.
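The equivalence between a schema with repeating groups and one without can be sketched roughly as follows. This is only an illustration under simplifying assumptions: a nested "relation" is modelled as a Python dict from a key to a set of values, the flat one as a set of (key, value) tuples, and the names `unnest`, `nest` and the sample data are hypothetical.

```python
from collections import defaultdict


def unnest(nested):
    """Split each repeating group out into separate (key, value) tuples."""
    return {(key, value) for key, values in nested.items() for value in values}


def nest(flat):
    """Rebuild the repeating groups from the flat tuples."""
    grouped = defaultdict(set)
    for key, value in flat:
        grouped[key].add(value)
    return dict(grouped)


# A person with a repeating group of phone numbers (made-up data).
nested = {
    "alice": {"555-0100", "555-0101"},
    "bob": {"555-0199"},
}

flat = unnest(nested)
print(sorted(flat))

# Both forms express the same facts: the round trip recovers the original.
print(nest(flat) == nested)
```

One caveat with this sketch: a key whose repeating group is empty would not survive the round trip, so a faithful flat design would also need a separate relation listing the keys.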

What I understand Dawn to be saying is that conformance to this rule has had a devastating effect on the productivity of programmers who would have been more productive if they had migrated to Pick. I don't believe it. But that's the claim being put forth.

>
> > There is a large body of other professionals, who use the term 1NF
> > differently, and would claim that a schema of relations is already in 1NF,
> > even if there are repeating groups. If I'm not greatly mistaken, the quote
> > cited from Date's introduction defines 1NF precisely this way.
>
> Yes, that is Date's position as I understand it.
>
>
> Marshall
>
Received on Sat Feb 18 2006 - 20:10:15 CET
