Re: What does this NULL mean?

From: David Cressey <dcressey_at_verizon.net>
Date: Sun, 11 Dec 2005 13:46:59 GMT
Message-ID: <n5Wmf.1065$tR6.713_at_trndny09>


"Eric Junkermann" <eric_at_deptj.demon.co.uk> wrote in message news:mG8LO6Gul0mDFwjR_at_deptj.demon.co.uk...
> The number of long threads about NULLs indicates that they are a source
> of difficulty and disagreement.

Good post, Eric.

>
> I have been prompted to start yet another by the www.dbdebunk.com quote
> of the week for December 10, 2005 . It says
>
> "There have been, there are, and there
> always will be NULLS in the real world."
>

I don't quite agree with this quote. I would say that there has been, is, and always will be missing data in the real world.

Whether there will always be NULLS depends on other factors: whether data will always be stored in tables, and whether tables will always be constructed in such a way that there are rows where the key data is present, but some dependent data is missing. I think those two will remain true during my lifetime, but "always" is beyond my ken.

Consider the following: A table with a missing row. The entire row ought to be there, but it isn't. It's just as much "missing data" as a NULL is, but it isn't as conspicuous. And it doesn't call attention to what some people consider a defect in the data model the way NULL does. I disagree with most of the people who try to call attention to defects in the RDM, or even in the SQLDM. But that's the source of the controversy.

> and is attributed to "Pete Brown, dbmonster.com", but it is from a
> comp.databases.theory thread "Does Codd's view of a relational database
> differ from that of Date & Darwin?" on June 4, 2005 .
>
> At the time, I followed up with a request for a definition and examples,
> but I never pursued the answer I got (which I thought less than
> satisfactory). My favourite bit out of the rest of the thread was from
> Ged Byrne, who said
>
> ... If my Manager points to a NULL in a report, I don't say 'That is a
> NULL.' I say something like 'That customer is retired, and does not
> have a work address' or 'The user left that field blank.' That is if I
> am familiar with the data. If I am unfamiliar then the NULL tells me
> nothing, I can only shrug my shoulders and say 'Sorry, I don't know.'
> ...

If I prepare a report, by the time the manager sees it, there are no NULLS in it. There may be blanks in it, but there aren't any NULLS. I consider that a matter of "presentation" rather than "data management". You can look at my thread about empty cells in MS Excel to see where I'm coming from on this.

>
> Finally, this leads to where I want to be with this post:
>
> A NULL does not actually provide enough information to interpret it
> properly, additional knowledge is required.

This leads me back to the curious incident of the dog in the night. I'm going to put that in a separate thread this time.

>
> The best way to look at a NULL is as a sort of denormalisation. If we
> have a table X {A, B, C} where A is the key, B is a column we are not
> currently interested in, and C is the column which is NULL in at least
> one row, then we are really talking about two tables X1 {A, B} and Y {A,
> C}, where at least one row in X1 does not have a corresponding row in Y.
>

Exactly.

Every NULL can be construed as the result of an outer join. The outer join may have been materialized in the table design of an actual database, for reasons that are best explained by the database designer. But you can always decompose a table with NULLS in it into a system of tables with no NULLS.
> (This is like vertical decomposition in
> http://web.onetel.com/~hughdarwen/TheThirdManifesto/Missing-info-without-
> nulls.pdf)
>
> All this situation is telling us is that there is no C value in the
> database for this value of A. It does not matter how many possible
> reasons there are for the existence of this situation (the
> unsatisfactory answer I referred to above), we do not know why there is
> no value. Actually presenting lots of possible reasons was a very bad
> argument because only if there was exactly one reason would we know what
> was going on.
>
> There is no such thing as an acceptable implementation of NULL, because
> each occurrence of a potential NULL in a database design may have many
> possible reasons, each of which must be represented in the database
> somehow. There is no guarantee that there is a single form of
> representation for each of the reasons, and hence no general way of
> implementing a NULL.

I disagree.

DEC Rdb was, in my experience, an "acceptable implementation of NULLS". That's not the same thing as a "universal convention regarding the meaning of NULLS". That's a differnt story. It's about interpretation, not representation.

And I'm back to the curious incident of the dog in the night.

>
> So in most cases, the meaning of NULL is "This database has no value
> here, the design must be incomplete"
>
Or, better, "this database has limited scope". Every database I have ever built has, by golly, limited scope. I'm never going to try to build one that doesn't.

> --
> Eric Junkermann

Thanks again for a good post. Received on Sun Dec 11 2005 - 14:46:59 CET

Original text of this message