Re: Relational Databases and Their Guts

From: Todd Bandrowsky <anakin_at_unitedsoftworks.com>
Date: 22 Jun 2003 16:04:34 -0700
Message-ID: <af3d9224.0306221504.68914683_at_posting.google.com>


> > I'm not clear I understand you example. Assuming I understand your
> > But let me take your question to be simple "how does one deal with
> > optional data?"

That's correct. The use of street was contrived.

> If we imagine a record type with attributes a, b, c, and
> > optionally d, then I see three approaches:

<snip>

> > the type of d that represents "no d". For example, a zero length string
> > for addr2 would indicate "no address line 2." This is distinctly different
> > from null, although it may not seem like an important difference at first.

It's a huge difference. Null means "I don't know what d is", and blank means "nothing". You have null propogation issues with the former and none with the latter.

> >
> > 2) Declare the column for d to be of a type that consists of all the

You could probably fake this with a user defined function of some sort.

> >
> > 3) A second table with two columns: the primary-key-foreign-key of the
> > first table, and a d column.

Yes.

> One can make the above options equivalent using CREATE VIEW....

Yes.

All of the above answers are good, but they didn't really address the biggest part of my question, so I will try and ask it differently. To avoid the misunderstandings caused by asking for a general solution to specific instance of a problem, I'll try ask the question in a general way:

Assume I have two tables R1 and R2. The number of columns in R1 is irrelevant, except that there exists in R1 a primary key, R1PK, and the rest of the columns in R1 we will call R1O. R2 has a minimum of 3 columns, the first of which is a foreign key into R1, the second is a unique identifier to satisfy a one to many relationship between R1 and R2, and the third we will call RCOL.

Let [] be used to refer to the specific row for the primary key of a given table Rn. Thus, for any given R1[ R1PK ], there are 0..N of RCOL. We will later refer to the set of names for a specific PK as RCOL( R1PK ). Let () be used to refer, by index, or by name, to a specific column on a view. Thus, one might say R1( R1PK ) refers to the column name, and R1( 0..N ) refers to columns 0 through N, and, finally, R1( 0...N - R1PK ) refers to all of the columns except for the primary key of R1.

  1. Is there any way using the basic relational algebra operators, to arrive at a final view RResult such that the columns of RResult are as follows:

R1( R1PK + RCOL( R1PK ) + R1O )? 2) If the above is legal, what would that values be for those RCOL( R1PK ) not defined? How does one do the above without introducing NULL. And, if you do introduce NULL, how can you do that without violating the information rule, the systematic treatment of null values rule, and finally, worst of all, how do you handle the updatable view rule when your update MUST include NULLs for some PK! Received on Mon Jun 23 2003 - 01:04:34 CEST

Original text of this message