Re: Relational Databases and Their Guts

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Sun, 22 Jun 2003 22:32:04 +0100
Message-ID: <bd59nu$165s$2_at_gazette.almaden.ibm.com>


"Marshall Spight" <mspight_at_dnai.com> wrote in message news:LxjJa.1018625$OV.1115113_at_rwcrnsc54... [snip]
> But let me take your question to be simple "how does one deal with
> optional data?" If we imagine a record type with attributes a, b, c, and
> optionally d, then I see three approaches:
>
> 1) table t has attributes (a, b, c, d) and there is a distinguished value of
> the type of d that represents "no d". For example, a zero length string
> for addr2 would indicate "no address line 2." This is distinctly different
> from null, although it may not seem like an important difference at first.
>
> 2) Declare the column for d to be of a type that consists of all the values
> of the type of d union a specific value of a specific different type. I
believe
> these are called "sum types" in sum languages (ha ha) such as ML and
> unfortunately, most popular languages don't have this facility, but this
> could be compensated for by the call-level interface to the database.
>
> 3) A second table with two columns: the primary-key-foreign-key of the
> first table, and a d column.
>
> (If anyone is still reading, there is a question I've had about this for a
while:
> couldn't these three logically-equivalent cases be made to be simply
> syntactic differences? There are details to be worked out, but the idea
> intrigues me.)

If the cases are indeed information equivalent, then logical data indepence takes case of this. Or at least a full exposition of logical data independence would. By that I mean that logical data independence is more that just the principle of interchangebility between real and virtual relvars (although that is part of it). In particular we need a way to define and switch between 'equivalent cases' *within* the confines of the relational model. Email me for further thoughts.

> Application logic is about equal for all three cases, but query logic is
> somewhat complicated by the third.
>
> Note also that this is a model-level description, and so the question of
> which one is "efficient to implement" is specifically excluded; the
> implementation may be anything that can represent the model.

Absolutely.

> > But, how would you, in terms of basic relational operations (select,
> > project, etc), create a view st each street is in the same tuple as
> > the company city, state, and zip. This is one area where SQL the
> > language (let alone any implementation of it), falls flat on its face.
>
> Au contraire; this is where SQL stands tall, by not allowing such a
> thing.
>
> And what would you expect to get out of such a multivalue dbms?
> There isn't any data you can store as multivalue that you can't store
> as relations.
>
>
> > Then, even if you did that,
> > what would you use to represent those attributes that do not have the
> > maximum number of streets defined. Does relational theory require
> > that each tuple have the same number of relations in a domain? Did I
> > say that right?
>
> I don't think so. Relations have tuples, not vice verse.

You guys would be well be advised to pay up and read Chris Date's, "What First Normal Form Really Means" from dbdebunk.com.

In short, why should relations be the only values not allowed in tuples??

Regards
Paul Vernon
Business Intelligence, IBM Global Services Received on Sun Jun 22 2003 - 23:32:04 CEST

Original text of this message