Re: 3vl 2vl and NULL

From: David Cressey <david.cressey_at_earthlink.net>
Date: Tue, 06 Dec 2005 00:41:04 GMT
Message-ID: <A65lf.10634$wf.6359_at_newsread3.news.atl.earthlink.net>


"paul c" <toledobythesea_at_oohay.ac> wrote in message news:wOZkf.47166$ki.43132_at_pd7tw2no...
> David Cressey wrote:
> > I've stayed away from the 3vl versus 2vl discussion over in the monster NULL
> > thread. Mainly, I'm not sure exactly what Codd and Date have to say on the
> > subject. I think that both of them are smarter than I am, and that they
> > don't agree on the subject, and that they agree that they disagree. That
> > pretty much leaves it up to us, doesn't it?
> >
> > Here's what I think: There are two Boolean constants: FALSE and TRUE.
> > A boolean variable with a value must have one of these two values. I think
> > that's the 2vl position, although I'm not sure.
>
>
> Well, I'm glad for the sake of my eyes at least, that David C. has
> renamed this thread. For me, most of it is playing with words (no
> insult intended). That can be fun but it can also get us into trouble.
>
> EG., "a ... variable ... might have no value" are dangerous words
> because we're on the verge of stepping out of the context we really
> mean, namely the value of an attribute in a tuple.
>
>
> My basic attitude on the whole question goes back to where Codd started
> from, namely first-order predicate logic. As far as I know, FOL has no
> notion of nulls in the senses they are being talked about here. So, if
> one implants nulls into the RM, surely one must be prepared either to
> throw FOL out the window or invent a new flavour of FOL, which seems to
> me to be an undertaking that only a very few logic philosophers are
> capable of taking on.
>
>
> I thought I'd mention this rather sticky point since nobody else seems
> to have brought it up and it seems fundamental to me that one must
> address it if one is going to entertain nulls.
>
>
> Maybe I've got it wrong and Codd himself did try to do this sometime
> around 1979 but from what material is widely available, it seems he was
> prepared to discard FOL, here and there, for the sake of expediency.
> When I see people talking about special 'marks', I can't help but wonder
> if Codd wasn't subtly influenced by some of the hardware he would have
> been very familiar with and that had features like 'word marks' and
> 'field marks'.
>
I'm pretty much exhausted by this topic, and, as JOG said, it goes around in circles.

But for your sake, I'll try and respond. (I'm still grateful to you for "it's easier to understand 600 tables than 100,000 lines of code").

There is no reason whatsoever for FOL or any other order of logic to add the notion of nulls. It's simply outside the subject matter. It would make no more sense, and no less sense, to add the notion of nulls to the mathematics of integers.

But when you're dealing with data systems, that's another story. If you deal with data systems, sooner or later you are going to have to deal with missing data. It's inherent in data systems. When you deal with missing data, you are either going to deal with it in a systematic way, or else you are going to deal with it in an unsystematic way.

NULLs aren't the only systematic way of dealing with missing data, nor are they necessarily the best way. But the approach is definitely systematic, it's definitely type independent, and it's definitely the way SQL chose. And, from my point of view, it's definitely good enough for a lot of practical work.

And it's definitely more systematic and more type independent than the practice, presented by some pickies in here, of treating the empty string and the null as equivalent. Doing that blurs a distinction that is sometimes useful to make. And the fact that pickies are able to program their way around the absence of nulls is no proof that the notion is not a worthwhile one. You can program your way through a spaghetti bowl full of GOTOs as well. But that doesn't mean that you should.

Notice I said "data systems", not "database systems". While it's clear that Codd, in the 1970 paper, introduced the relational data model in connection with the problem of managing large shared databases, it's by no means clear that the data model is limited to that application.

Indeed, if you'll let me quote from the beginning of the 1970 paper,

"This paper is concerned with the application of elementary relation theory to systems which provide shared access to large banks of formatted data. Except for a paper by Childs [1], the principal application of relations to data systems has been to deductive question-answering systems. Levein and Maron [2] provide numerous references to work in this area."

It's clear that some earlier work had been done with data systems other than databases, using the concept of mathematical relations.

To my knowledge, Codd addressed the need for dealing with missing data as early as 1985, when he published the 12 rules.

Here's rule 3:

"Rule 3: Systematic Treatment of Null Values. A field should be allowed to remain empty. This involves the support of a null value, which is distinct from an empty string or a number with a value of zero. Of course, this can't apply to primary keys. In addition, most database implementations support the concept of a non-null field constraint that prevents null values in a specific table column."
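
To make the distinctions in that wording concrete, here is a minimal sketch using Python's standard sqlite3 module. SQLite is only a convenient stand-in for an SQL engine, and the table and column names (person, nickname, age) are invented for illustration; nothing here comes from Codd's text. It shows a null being kept apart from the empty string and from zero, and a non-null constraint rejecting a null.

```python
import sqlite3


def null_demo():
    """Return (ids with missing nickname, ids with empty nickname, constraint fired?)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: nickname may be missing, age may not.
    conn.execute(
        "CREATE TABLE person ("
        " id INTEGER PRIMARY KEY,"
        " nickname TEXT,"
        " age INTEGER NOT NULL)"
    )
    # Row 1: nickname is missing (NULL); age is a real value that happens to be zero.
    conn.execute("INSERT INTO person VALUES (1, NULL, 0)")
    # Row 2: nickname is present but is the empty string.
    conn.execute("INSERT INTO person VALUES (2, '', 30)")

    # IS NULL finds only the missing nickname, not the empty one.
    missing = [r[0] for r in conn.execute(
        "SELECT id FROM person WHERE nickname IS NULL ORDER BY id")]
    # Comparing with '' finds only the empty string, not the NULL.
    empty = [r[0] for r in conn.execute(
        "SELECT id FROM person WHERE nickname = '' ORDER BY id")]

    # The NOT NULL constraint refuses a missing age.
    try:
        conn.execute("INSERT INTO person VALUES (3, 'x', NULL)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True

    conn.close()
    return missing, empty, rejected


if __name__ == "__main__":
    print(null_demo())
```

A system that folds the empty string into the null would have no way to tell rows 1 and 2 apart.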

It's also clear that the null is distinct from FALSE or TRUE, and is NOT another truth value. The wording "null value" above is unfortunate. The term "marker" is better. And I wouldn't get too bent out of shape by the word "marker". All we are dealing with, all the time, in data systems, are markers.

I hope this makes my position clear: the NULL is a feature of DATA SYSTEMS and not a feature of any of the domains of the primitive datatypes, including the BOOLEAN datatype.
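
One way to picture this position, offered as a rough analogy rather than as anything SQL itself defines: in Python the bool domain has exactly two values, and "no value here" is expressed by the container (an Optional field), not by a third boolean. The function name describe below is hypothetical.

```python
from typing import Optional


def describe(flag: Optional[bool]) -> str:
    """Report what a possibly missing boolean field holds."""
    # None plays the role of the marker: it says the field has no value.
    # It is not a member of the bool domain.
    if flag is None:
        return "missing"
    # Once a value is present, it must be one of the two Boolean constants.
    return "TRUE" if flag else "FALSE"


if __name__ == "__main__":
    for value in (True, False, None):
        print(describe(value))
```

The check for the marker happens before any boolean reasoning starts, which is the sense in which the marker belongs to the data system and not to the datatype.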

I'm close to the end of this topic now. Michael Preece is probably still thinking that SQL-relational people are "poor deluded fools". Maybe something will one day change his mind. Maybe not.

Received on Tue Dec 06 2005 - 01:41:04 CET
