Re: 3vl 2vl and NULL

From: Hugo Kornelis <hugo_at_pe_NO_rFact.in_SPAM_fo>
Date: Mon, 12 Dec 2005 01:03:00 +0100
Message-ID: <aaepp1l4g1h3ud0imb120471k08g9rdaf0_at_4ax.com>


On Fri, 9 Dec 2005 23:34:03 +0100, Jon Heggland wrote:

>In article <36dhp1dv7d9r2r2ujp4qeor3cnunbsmp8m_at_4ax.com>,
>hugo_at_pe_NO_rFact.in_SPAM_fo says...
>>
>> Interesting here is that Date changes Codd's definition "value is
>> missing" to something that suits his argument better ("value unknown").
>> He doesn't explicitly mention this change, nor does he defend this
>> change with as much as one word.
>
>My problem with Codd here is that "value is missing" is not a definition
>or a meaning. It is a *fact*; anyone can see that the value is missing,
>like an empty cell in a spreadsheet. But what does it *mean* that it is
>missing, why is it missing? I find the answer "it just means that it is
>missing" circular and unsatisfying.

Hi Jon,

And yet, that is all there is to it. A NULL in a database is exactly like an empty cell in a spreadsheet - there's room where you'd expect a value, but when you look there is no value.

If you want to know why it's missing, you'll have to add another column to record that information. Just as you would do if you desire to know why a cell in a spreadsheet is empty.

>
>> In my opinion, Codd should have sticked to his original model. NULL is
>> not "value unknown", not "value does not apply". NULL is "no value".
>
>Be that as it may, SQL does create some meaning through the way its
>operators work. Is NULL greater than 7, or less than it? SQL answers
>"don't know". In my opinion, it is very reasonable to assume then that
>NULL in this case is a number (otherwise the comparison engine ought to
>report an error),

All true. The DB engine can conclude the datatype of the NULL from the datatype of the column that holds the NULL. Or when used as a literal in a query (though I fail to see why anyone would do that), it can either choose a datatype from the context or choose a default datatype and use implicit conversion rules to get at the destination datatype.

> but nobody knows what it is. It is unknown.

I would hasard a guess that at least Uncle Vernon knows his age. His parents, his wife and his children too.

But *within the context of the database*, Uncle Vernon's age is indeed unknown. That's the result of the value for the proposition "Age of <person> is <age> years" being missing in the database.

>
>> "The age of Aunt Marge is 47 years" is a fact.
>> "The age of Uncle Vernon is unknown" is also a fact - but it's not the
>> same fact type as the former.
>
>Then what is the interpretation of the tuple (Uncle Vernon, NULL), given
>your interpretation of (Aunt Marge, 47)?

For (Aunt Marge, 47), my interpretation would be:

  • I have a family member who is uniquely identified (in the UoD of my family database) by the name "Aunt Marge";
  • My family member Aunt Marge has an age of 47 years.

For (Uncle Vernon, NULL), my interpretation would be:

  • I have a family member who is uniquely identified (in the UoD of my family database) by the name "Uncle Vernon".

>
>> But the database itself can not make any of these interpretations. For
>> the database, NULL should never mean anything more than "no value here".
>
>Then it shouldn't define operators for it, IMO. It should throw an
>exception whenever it encounters one, except for "IS [NOT] NULL" checks.

That might possibly be a valid position from a theoretic POV, but it's definitely not valid from a practical POV. Suppose you search a jobs database for a job in the neighbourhood of (insert your city here) that does not require you to work more than 45 hours a week, would you want it to fail because two or three companies forgot to fill in the "hours to work" field on the form?

Best, Hugo

-- 

(Remove _NO_ and _SPAM_ to get my e-mail address)
Received on Mon Dec 12 2005 - 01:03:00 CET

Original text of this message