Re: Multiple-Attribute Keys and 1NF

From: David Cressey <>
Date: Fri, 31 Aug 2007 12:58:28 GMT
Message-ID: <UfUBi.12075$Eh5.9962_at_trndny06>

"JOG" <> wrote in message

> Well I've never suggested multiple values contained in a collection.
> But yes as I said, multiple roles does break the guaranteed access
> rule. My question now (in the continuing hunt for the theory
> behind 1NF) is why on earth would that be a problem? I don't see any
> effect on the relational algebra.

I honestly think that the impetus behind "normalization" in the Codd 1970 paper is more of a stopgap than a theory. (I'm not familiar with the 1969 paper, and I only read the 1970 paper after I began participating in the discussions in c.d.t.) In the 1970 paper, Codd suggests that it may be worthwhile to consider the subset of schemas that contain only atomic attributes. (He didn't use the word "schemas", but I hope I can use it without introducing confusion.)

He pointed out that such a restriction did not reduce the expressiveness of the system, in that for every unnormalized schema there existed an equivalent normalized schema. What the 1970 paper calls "normalized" is called 1NF in later writings, once further normal forms were discovered.
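To make that equivalence concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than anything taken from the 1970 paper: the sample data, the names `to_1nf` and `from_1nf`, and the simplification that each person appears in exactly one row with a non-empty set of languages.

```python
def to_1nf(rows):
    """Flatten (person, set_of_languages) rows into atomic (person, language) rows."""
    return {(person, language) for person, languages in rows for language in languages}


def from_1nf(flat_rows):
    """Regroup atomic (person, language) rows into (person, set_of_languages) rows."""
    grouped = {}
    for person, language in flat_rows:
        grouped.setdefault(person, set()).add(language)
    return {(person, frozenset(languages)) for person, languages in grouped.items()}


# Hypothetical unnormalized relation: one set-valued attribute per person.
unnormalized = {
    ("Jack", frozenset({"English", "German"})),
    ("Jill", frozenset({"French"})),
}

flat = to_1nf(unnormalized)

# Under the stated assumptions, no information is lost in either direction.
round_trip_ok = from_1nf(flat) == unnormalized
```

Note that a person with an empty set of languages would vanish from the flattened form, so a fuller treatment would need a separate relation listing the people themselves.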

There is one other piece of the 1NF definition in the 1970 paper, the "no duplicates rule". The no duplicates rule has to do with the representation of a relation, and not with the relation itself. Codd imagined (correctly) that the first relational database systems would use records to represent tuples and (virtual) arrays of records to represent relations. In a relation, there is no such thing as "a tuple appearing twice". However, in an array of records, there is such a thing as two of the records having identical contents. Codd ruled that out as a practical stopgap, in order to prevent implementations from diverging from the properties of mathematical relations in an unnecessary and harmful way. This is my reading of the 1970 paper, in regard to 1NF theory.

There's a connection between the "atomic values" rule and the "no duplicates rule", at the implementation level.

Consider the following fact:

Jack speaks English and German.

Let's say we are about to include this fact in a relation stored somewhere in a relational database, and that one of the columns of a relational table is "set of languages spoken".

Further, let's say that there is already a tuple in the relation with the following fact stored:

Jack speaks German and English.

As a practical matter, in terms of the representation of data inside a database, it can be extraordinarily difficult to ascertain that these two propositions, together, violate the "no duplicates rule".
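A small Python sketch of the difficulty, under assumptions of my own: the set-valued column is stored as an ordered list inside a record, and the names `stored`, `incoming`, and `canonical` are purely illustrative. It is not a description of any actual DBMS.

```python
# Two records as a naive implementation might store them: the
# "set of languages spoken" column is held as an ordered list.
stored = ("Jack", ["German", "English"])
incoming = ("Jack", ["English", "German"])

# Field-by-field comparison of the records does not see a duplicate,
# because the two lists differ in order.
naive_duplicate = stored == incoming


def canonical(record):
    """Put the set-valued column into an order-independent form."""
    name, languages = record
    return (name, frozenset(languages))


# Only after canonicalizing the nested value does the duplicate show up.
canonical_duplicate = canonical(stored) == canonical(incoming)

# With atomic attributes, each (name, language) pair is its own tuple,
# and an ordinary set of tuples absorbs the repetition without extra work.
table = set()
for name, languages in (stored, incoming):
    for language in languages:
        table.add((name, language))
```

Here `naive_duplicate` is false while `canonical_duplicate` is true, and `table` ends up with just two tuples. With atomic values the duplicate check is plain tuple equality; with set-valued columns the implementation has to define and apply set equality on the nested values before it can enforce the rule.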

Notice that my focus has been entirely on the implementation, and not on the relational algebra itself. With regard to the relational algebra itself, I believe your understanding is correct.

So what the heck are implementation-oriented issues doing in the 1970 paper? I believe Codd wanted to get across two main ideas: that building a system for relational databases would be a good idea, and that building such a system was feasible. It's for this second reason that I believe Codd added some material that is primarily about implementation, rather than about the power of relational algebra itself.

This is my insight, such as it is. I hope it helps.
