Re: NULLs: theoretical problems?

From: Hugo Kornelis <>
Date: Wed, 08 Aug 2007 21:30:26 +0200
Message-ID: <>

On Wed, 8 Aug 2007 06:43:53 -0400, sinister wrote:

>Many discussions point out one deficiency of NULLs: that they collapse
>multiple, distinct concepts into one ("no value possible," "value missing,"
>"value not available at this time", etc).

Hi sinister,

The deficiency in these discussions is that they fail to normalize the design before starting to look at NULLs.

The ANSI standard defines NULL as follows:

  "null value: A special value that is used to indicate the    absence of any data value."

Thus, the only concept collapsed into NULL is that of "no value here".

The concepts you mention are, indeed, various reasons why a value might be missing. The specific reason might or might not be of interest to the users of a database. If it's not, there's no need to let it bother us; if it is, we need to properly model it.


How would you react if someone showed you a database model where the contracts table had a single column to store both termination date and the reason why the contract was terminated - I'm pretty sure that you'd send the junior DBA who made that model back to class to study normalisation rules, right?

How then would you react if you saw a model that tries to combine the termination date and the reason why the contract is NOT terminated in a single column?


And yet, that is exactly what some people try to do in the discussion of NULLs. They see that a value might be missing (NULL) for various reasons, which is of course correct. They assume that the reason why the value is missing should be somehow stored or encoded in the database, which is not necessarily correct but might be requireed for some applications. And then they go on to state that this reason should be stored or encoded in the same column where the missing value should have been - i.e. they want to use a single column to store both the termination date of a terminated contract and the reason why there is no termination date of a (probably not terminated, though that's not even sure) contract. And that is of course a violation of 1NF.

Unfortunately, many database theorists, including Codd and Date, either failed to see this basic flaw in the discussion or deliberately sidestepped it for convenience. This has led to many pointless discussions on using A-marks and I-marks, and how that would lead to fourvalued logic and still not suffice, and so on....

I made this same point, but in different words, about a month ago on my blog:

Best, Hugo Received on Wed Aug 08 2007 - 21:30:26 CEST

Original text of this message