Re: Issues with the logical consistency of The Third Manifesto

From: Laconic2 <laconic2_at_comcast.net>
Date: Tue, 16 Nov 2004 08:01:27 -0500
Message-ID: <t8CdnSte1-u2ZwTcRVn-iw_at_comcast.com>


"Marshall Spight" <mspight_at_dnai.com> wrote in message news:SGhmd.102152$R05.26010_at_attbi_s53...

> Okay, so you're right; if we find ourselves is this situation, then we
> have to, as it were, normalize the representation. This is hard. If
> we only have one possible representation for a type and it's always
> normalized, then this isn't an issue. For example, we can have
> a rational type, and normalize the fraction after every operation.
> In this case, comparing the representation and comparing the
> values is the same.

OK, now we're getting somewhere. The question I have is: did Codd find himself in that situation when he was preparing the 1970 paper, and if he did, was this a great blunder or not?

Codd made the distinction in the paper between what he called arrays, made up of rows and columns, and relations themselves. He said that we often use the terminology of arrays when discussing relations, but we should remember that
arrays are really the representation of relations (meaning, I take it, the form of relations) and not relations themselves.

I take "Arrays" in the 1970s paper to be the equivalent of "tables" in later usage.

And now the question is, how do you test two relations for equality? And the answer is, it depends on how they are represented. If there is more than one way to represent a relation using arrays, such that the arrays are not equal, but the relations thus represented are equal, then we have to normalize at equality testing time.

Now, order of the rows is inherent in an array, but order of the tuples is irrelevant in a relation. So, if our equality tester has to check two arrays, to see if they both represent the same relation, it might have to sort both of them, in order to do a row by row comparison. This is hell. It's probably cheaper to decompose the relvars out of the original relations, and then test the foreign keys for equality. It means more complexity, and more joins, but less sorting.

So if arrays are the "form" of a relation, then "normal form" (later known as 1st normal form) is what we get when we decompose out all the relvars. Right? Received on Tue Nov 16 2004 - 14:01:27 CET

Original text of this message