Re: what are keys and surrogates?

From: David BL <>
Date: Wed, 9 Jan 2008 18:23:57 -0800 (PST)
Message-ID: <>

On Jan 10, 1:22 am, Marshall <> wrote:
> On Jan 9, 8:07 am, David BL <> wrote:
> > On Jan 9, 1:25 pm, Marshall <> wrote:
> > > This issue goes away if we relax 1NF and allow attributes that are
> > > lists or relations. This gives us nested structures. (Nested relations
> > > are not particularly controversial around here.)
> > In addition to my previous post, I wish to add another comment
> > regarding my suspicion with RVAs. The tuples of a relation are
> > supposed to represent facts, but what does it mean when a relation
> > merely represents a value?
> The question is meaningless. The distinction you are drawing
> does not exist.

In what sense do tuples of an RVA represent propositions in *the* UoD?

> > Isn't the RM meant to have some close
> > association with FOPL?
> Yes.
> > It seems to me there is a fundamental difference between
> > a) a large collection of propositions relevant to a particular UoD;
> > and
> > b) a composite data structure such as an AST which simply
> > "is what it is"
> This is an illusion. There is no difference.

Hmmm. Unfortunately you didn't respond to my last paragraph, which was more tangible.

I don't believe the distinction is an illusion. I'll have a go at providing an objective measure for a given relational database d.

Let B(d) equal some measure of the amount of information in d, quantified as the total number of bits required to store all the data after compression (so that redundancy is not counted).

Let P(d) equal the total number of tuples across all top-level relvars. Do not count tuples in nested relations. This is a measure of the number of propositions about the UoD.

Now take the ratio bpp(d) = B(d)/P(d) to give the "average bits per proposition".

An alternative measure could divide B(d) by the number of attribute values that appear in the top-level propositions about the UoD, rather than by the number of tuples. This gives bpa(d), an "average bits per attribute".

In a conventional use of the RM, where attributes are "reasonably atomic", bpa(d) will be relatively small. However, for an unconventional use of the RM (such as the representation of source code using nested RVAs), bpa(d) will be very large. An extreme example is a database that represents a single AST as one tuple, so that P(d) = 1.
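To make the comparison concrete, here is a rough Python sketch of the measures above. It rests on several assumptions that are not part of the definitions: a database is modelled as a dict from relvar name to a list of tuples (dicts), a nested relation is simply a list of dicts stored as an attribute value, and B(d) is approximated by the size of the zlib-compressed JSON dump, which is only a crude stand-in for "compressibility". All function names and the sample data are hypothetical.

```python
import json
import zlib


def info_bits(db):
    """Crude stand-in for B(d): bits in the zlib-compressed JSON dump (an assumption)."""
    raw = json.dumps(db, sort_keys=True).encode("utf-8")
    return 8 * len(zlib.compress(raw, 9))


def proposition_count(db):
    """P(d): tuples across top-level relvars only; nested relations are not counted."""
    return sum(len(tuples) for tuples in db.values())


def attribute_count(db):
    """Number of attribute values appearing in the top-level tuples."""
    return sum(len(t) for tuples in db.values() for t in tuples)


def bpp(db):
    """Average bits per proposition: B(d) / P(d)."""
    return info_bits(db) / proposition_count(db)


def bpa(db):
    """Average bits per top-level attribute value."""
    return info_bits(db) / attribute_count(db)


# Conventional use: many small tuples with "reasonably atomic" attributes.
conventional = {
    "works_in": [{"emp": i, "dept": i % 5} for i in range(200)],
}

# Unconventional use: a single tuple whose "ast" attribute is a nested
# relation (a list of tuples, each with its own nested "children" relation).
nested = {
    "source": [{
        "module": "m",
        "ast": [{
            "op": "*",
            "children": [
                {"op": "+", "children": [{"op": "a", "children": []},
                                         {"op": "b", "children": []}]},
                {"op": "-", "children": [{"op": "c", "children": []},
                                         {"op": "d", "children": []}]},
            ],
        }],
    }],
}

for name, db in (("conventional", conventional), ("nested", nested)):
    print(name, proposition_count(db), round(bpp(db), 1), round(bpa(db), 1))
```

In the nested example P(d) = 1, so bpp(d) is just B(d) itself, and the whole AST is charged to two top-level attribute values. The exact figures depend on the compressor chosen, so only the relative sizes are meaningful.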

Now for the part you won't agree with: I think bpa(d) provides an (inverse) indicator of how "relational" the DB is.

Received on Thu Jan 10 2008 - 03:23:57 CET
