Re: 3vl 2vl and NULL

From: paul c <toledobythesea_at_oohay.ac>
Date: Tue, 06 Dec 2005 01:44:50 GMT
Message-ID: <m26lf.53756$Gd6.35030_at_pd7tw3no>


David Cressey wrote:
> "paul c" <toledobythesea_at_oohay.ac> wrote in message
> news:wOZkf.47166$ki.43132_at_pd7tw2no...
>

>>David Cressey wrote:
>>
>>>I've stayed away from the 3vl versus 2vl discussion over in the monster NULL
>>>thread.  Mainly, I'm not sure exactly what Codd and Date have to say on the
>>>subject.  I think that both of them are smarter than I am, and that they
>>>don't agree on the subject, and that they agree that they disagree.  That
>>>pretty much leaves it up to us, doesn't it?
>>>
>>>Here's what I think:  There are two Boolean constants:  FALSE and TRUE.
>>>A boolean variable with a value must have one of these two values. I think
>>>that's the 2vl position, although I'm not sure.
>>
>>
>>Well, I'm glad for the sake of my eyes at least, that David C. has
>>renamed this thread.  For me, most of it is playing with words (no
>>insult intended).  That can be fun but it can also get us into trouble.
>>
>>E.g., "a ... variable ... might have no value" are dangerous words
>>because we're on the verge of stepping out of the context we really
>>mean, namely the value of an attribute in a tuple.
>>
>>
>>My basic attitude on the whole question goes back to where Codd started
>>from, namely first-order predicate logic.  As far as I know, FOL has no
>>notion of nulls in the senses they are being talked about here.  So, if
>>one implants nulls into the RM, surely one must be prepared either to
>>throw FOL out the window or invent a new flavour of FOL, which seems to
>>me to be an undertaking that only a very few logic philosophers are
>>capable of taking on.
>>
>>
>>I thought I'd mention this rather sticky point since nobody else seems
>>to have brought it up and it seems fundamental to me that one must
>>address it if one is going to entertain nulls.
>>
>>
>>Maybe I've got it wrong and Codd himself did try to do this sometime
>>around 1979 but from what material is widely available, it seems he was
>>prepared to discard FOL, here and there, for the sake of expediency.
>>When I see people talking about special 'marks', I can't help but wonder
>>if Codd wasn't subtly influenced by some of the hardware he would have
>>been very familiar with and that had features like 'word marks' and
>>'field marks'.
>>

>
> I'm pretty much exhausted by this topic, and, as JOG said, it goes around in
> circles.
>
> But for your sake, I'll try and respond. (I'm still grateful to you for
> "it's easier to understand 600 tables than 100,000 lines of code").
>
> There is no reason whatsoever for FOL or any other order of logic to add the
> notion of nulls. It's simply outside the subject matter. It would make no
> more sense, and no less sense, to add the notion of nulls to the mathematics
> of integers.
>
> But when you're dealing with data systems, that's another story. If you
> deal with data systems, sooner or later you are going to have to deal with
> missing data. It's inherent in data systems. When you deal with missing
> data, you are either going to deal with it in a systematic way, or else
> you are going to deal with it in an unsystematic way.
>
> NULLS isn't the only systematic way of dealing with missing data, nor is it
> necessarily the best way. But it's definitely systematic, it's definitely
> type independent, and it's definitely the way SQL chose. And, from my point
> of view, it's definitely good enough for a lot of practical work.
>
> And it's definitely more systematic and more type independent, than the
> practice, presented by some pickies in here, of treating the empty string and
> the null as equivalent. Doing that blurs a distinction that is sometimes
> useful to make. And the fact that pickies are able to program their way
> around the absence of nulls is no proof that the notion is not a worthwhile
> one. You can program your way through a spaghetti bowl full of GOTOs as
> well. But that doesn't mean that you should.
>
> Notice I said "data systems" , not "database systems". While it's clear
> that Codd, in the 1970 paper, introduced the relational data model in
> connection with the problem of managing large shared databases, it's by no
> means clear that the data model is limited to that application.
>
> Indeed, if you'll let me quote from the beginning of the 1970 paper,
>
> "This paper is concerned with the application of elementary relation theory
> to systems which provide shared access to large banks of formatted data.
> Except for a paper by Childs [1], the principal application of relations to
> data systems has been to deductive question-answering systems. Levein and
> Maron [2] provide numerous references to work in this area."
>
> It's clear that some earlier work had been done with data systems other than
> databases, using the concept of mathematical relations.
>
> To my knowledge, Codd addressed the need for dealing with missing data as
> early as 1985, when he published the 12 rules.
>
> Here's rule 3:
>
> "Rule 3: Systematic Treatment of Null Values
> A field should be allowed to remain empty. This involves the support of a
> null value, which is distinct from an empty string or a number with a value
> of zero. Of course, this can't apply to primary keys. In addition, most
> database implementations support the concept of a non-null field constraint
> that prevents null values in a specific table column.
> "
>
> It's also clear that it's distinct from FALSE or TRUE, and is NOT another
> truth value. The wording "null value" above is unfortunate. The term
> "marker" is better. And I wouldn't get too bent out of shape by the word
> "marker". All we are dealing with, all the time, in data systems, are
> markers.
>
>
> I hope this makes my position clear: the NULL is a feature of DATA SYSTEMS
> and not a feature of any of the domains of the primitive datatypes,
> including the BOOLEAN datatype.
>
> I'm close to the end of this topic now. Michael Preece is probably still
> thinking that SQL-relational people are "poor deluded fools".
> Maybe something will one day change his mind. Maybe not.
>
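
To make the quoted Rule 3 concrete, here is a minimal sketch using Python's standard sqlite3 module. The table `t` and column `s` are made up for illustration, and SQLite is only one SQL implementation, so treat this as a picture of typical SQL behaviour rather than anything Codd or David specified. It shows a null staying distinct from an empty string, and a comparison involving NULL coming back as neither true nor false.

```python
import sqlite3

# Hypothetical one-column table; the names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (s TEXT)")
conn.executemany("INSERT INTO t (s) VALUES (?)", [("",), (None,), ("abc",)])

# The empty string and the null are stored and counted separately.
empty_count = conn.execute("SELECT COUNT(*) FROM t WHERE s = ''").fetchone()[0]
null_count = conn.execute("SELECT COUNT(*) FROM t WHERE s IS NULL").fetchone()[0]

# Comparing NULL with anything, even NULL, yields NULL (Python None),
# not 1 (true) or 0 (false).
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
null_eq_empty = conn.execute("SELECT NULL = ''").fetchone()[0]

# WHERE keeps only rows whose predicate is true, so the null row is
# selected by neither s = 'abc' nor s <> 'abc'.
eq_rows = conn.execute("SELECT COUNT(*) FROM t WHERE s = 'abc'").fetchone()[0]
ne_rows = conn.execute("SELECT COUNT(*) FROM t WHERE s <> 'abc'").fetchone()[0]

print(empty_count, null_count, null_eq_null, null_eq_empty, eq_rows, ne_rows)
conn.close()
```

The table holds three rows, yet the two complementary predicates together select only two of them; the row holding the null falls through both.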

Thanks David. You always seem to have a position that you can defend rather well, even if I, being or trying to be a rather strict interpreter of the RM, don't agree. But your points have more than a little practical weight to them. I did notice that you jumped about 15 years between Codd's 2nd paper and the notorious twelve rules, which I mention only because I think Codd, as smart as he was (and entertaining - I once had the privilege of listening to him even though I didn't understand most of what he was talking about!), did diverge from his original two papers in a significant way after about 1979 or so. But that's all okay, and there are some other writers quoted with respect by one or more of what I think of as the RM's "gang of four" (in alphabetical order: Darwen, Date, McGoveran and Pascal) who give a fair deal of weight to your position. William Kent comes to mind. I have a little book of his somewhere, I forget the title, but he is very good at saying practical things very simply. I think he has (had?) a great deal of what you call 'data system' experience.

When it comes to a strict treatment of the RM, avoiding nulls, I could say that I would prefer 700 tables without nulls (along with a mere ten thousand lines of code) over 600 tables with nulls! I suspect you have modelled many, many more applications than I have. For the most part I only got to do the ones that nobody else wanted to do (because they involved some strange process control machine, or because several failed versions that wouldn't run on less than six mainframes had already been built and nobody else wanted to be the next scapegoat), and the rest of the time I was regarded as a mere interloper, aka a 'sh...disturber' or a 'bare-metal' type who was inherently incapable of understanding highly abstract concepts for modelling reality. So I personally never designed more than seventy or eighty tables or so for a single app.

Of course, being the oddball sometimes has its advantages because you can suggest unorthodoxy without the other developers feeling threatened. I'll admit that I often agreed with the others, in the sense that I do tend to think fairly often in terms of the bare metal. When push comes to shove, I think we must always remember that the machinery can never be the same thing as reality, and must remain to some degree or other a mere representation of aspects of it. Indeed, I'd go so far as to say that we must expect some distortion as a result of putting an application on a computer. From my own little practical viewpoint, I did find that, without needing to give much explanation to the higher-level business users to whom I usually had some access, I was often able to get away with zero instead of a 'null' number and blanks instead of 'null' string or character data. But I never thought of 'empty string' being equivalent to null; rather, I thought of it being equivalent to 'empty string'!
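
For what it's worth, here is a rough sketch of where zero-for-null can start to matter, again using Python's standard sqlite3 module. The two tables and the `reading` column are hypothetical, invented only for this illustration, and the behaviour shown is SQLite's. The same three observations are stored twice, once with the missing one as NULL and once as 0, and the aggregates come out differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one records a missing reading as NULL, the other as 0.
conn.execute("CREATE TABLE with_null (reading INTEGER)")
conn.execute("CREATE TABLE with_zero (reading INTEGER)")
conn.executemany("INSERT INTO with_null VALUES (?)", [(10,), (20,), (None,)])
conn.executemany("INSERT INTO with_zero VALUES (?)", [(10,), (20,), (0,)])

# AVG and COUNT(column) skip NULLs but include zeros.
avg_null, n_null = conn.execute(
    "SELECT AVG(reading), COUNT(reading) FROM with_null").fetchone()
avg_zero, n_zero = conn.execute(
    "SELECT AVG(reading), COUNT(reading) FROM with_zero").fetchone()

print(avg_null, n_null)  # 15.0 2
print(avg_zero, n_zero)  # 10.0 3
conn.close()
```

Whether that difference matters depends on the application; where nobody averages or counts the column, substituting zero may well be harmless, which fits the experience described above.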

Even though I disagree with lots on this group, I enjoy it. I'm also grateful to the people here because even when I disagree, it forces me to examine my own attitudes.

cheers,
paul c.

Received on Tue Dec 06 2005 - 02:44:50 CET
