Re: Date, Darwen, Pascal and the alternative to Nulls in the RM

From: Christopher Browne <cbbrowne_at_acm.org>
Date: Tue, 21 Mar 2006 21:57:57 -0500
Message-ID: <87hd5rmbii.fsf_at_wolfe.cbbrowne.com>


Martha Stewart called it a Good Thing when "Paul Mansour" <paul_at_carlislegroup.com> wrote:
> Assume one accepts, as I do, the argument against nulls put forward
> by Date et al. Would it be fair to say that at this point in time
> they really don't have a solution to missing information?
>
> In the latest edition, just published, of the Third Manifesto (TTM)
> Date and Darwen do not include any recommendations on handling
> missing information other than pointing to a set of PowerPoint
> slides by Darwen, dated 2003. In fact, as the years go by, each of
> Date's books seems to have stronger and stronger proscriptions
> against nulls, and fewer and fewer ideas about how to handle missing
> information. The fact that at least 3 years after Darwen's slides on
> distributed keys they still did not include the concept in the new
> TTM book seems a pretty clear indication that they are really not
> that confident in the concept.

That seems entirely disappointing.

> Meanwhile, Pascal has a 2005 paper, "The Final Null in the Coffin",
> which seems a bit prematurely named (I haven't read it yet)
> considering the fact that it is advertised on his website as only a
> starting point or recommendation on how to avoid nulls, and that
> much research needs to be done.
>
> So, where does that leave an implementer who does not want to
> implement nulls? The leading theorists don't seem to have
> answers. Their proposed solutions may well cause more long run
> damage, just as nulls did, to the relational model. From nulls to
> special values, the proposed cures to missing information may well
> be worse than the disease. At this point in time, it seems that the
> prudent implementer would prohibit nulls, and essentially leave it
> up to users to design well, avoiding inapplicable information, and
> using application logic where necessary to handle missing
> information. Obviously this is not a good solution, but Date,
> Darwen and Pascal can't seem to offer anything better. Am I wrong
> here?

The Darwen paper comes off as considerably disappointing.

Yes, it offers a thin method of expressing multiple notions of "missing" values. Unfortunately, it doesn't provide anything that will be meaningful for systematic *further* analysis of the results.

Summarizing the results gets no easier, either; the paper does nothing to tell you what count(*) or sum(something) ought to return when some of the values are missing.
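
To make the aggregate problem concrete, here is a minimal sketch using Python's sqlite3 module. The "pricing" table, its columns, and its rows are made up for illustration, and other DBMSs may differ in detail. It shows count(*), count(column), sum() and avg() each quietly taking a different view of the missing price, which is the question any replacement for NULLs would still have to answer.

```python
import sqlite3

# Hypothetical table for illustration only; not taken from any real system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pricing (item TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO pricing VALUES (?, ?)",
    [("widget", 10), ("gadget", None), ("gizmo", 30)],  # None is stored as NULL
)

# count(*) counts rows; count(price), sum(price) and avg(price) skip NULLs.
row_count, price_count, price_sum, price_avg = conn.execute(
    "SELECT count(*), count(price), sum(price), avg(price) FROM pricing"
).fetchone()

print(row_count, price_count, price_sum, price_avg)
```

Three rows, but only two prices counted, and an average computed over two rows rather than three. Whether that is the "right" answer depends entirely on what the missing price was supposed to mean.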

Really, the paper doesn't try to solve the 3-valued-logic problem of NULLs; it instead replaces it with something more like a 6-valued logic. This isn't a conceptual improvement.
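
For anyone who hasn't been bitten by it, a small sketch of the 3-valued logic as SQLite implements it, again via Python's sqlite3 module and an invented "pricing" table. It only illustrates the existing NULL behaviour; it does not attempt to model the scheme in Darwen's slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons with NULL yield "unknown", which sqlite3 hands back as None.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
unknown_or_false = conn.execute("SELECT NULL OR 0").fetchone()[0]
# "unknown" is absorbed when the other operand decides the result.
unknown_and_false = conn.execute("SELECT NULL AND 0").fetchone()[0]
unknown_or_true = conn.execute("SELECT NULL OR 1").fetchone()[0]

# Hypothetical table for illustration only.
conn.execute("CREATE TABLE pricing (item TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO pricing VALUES (?, ?)",
    [("widget", 10), ("gadget", None), ("gizmo", 30)],
)

# "p OR NOT p" no longer selects every row: the NULL row falls through.
total_rows = conn.execute("SELECT count(*) FROM pricing").fetchone()[0]
either_way = conn.execute(
    "SELECT count(*) FROM pricing WHERE price = 10 OR price <> 10"
).fetchone()[0]

print(null_eq_null, unknown_or_false, unknown_and_false, unknown_or_true)
print(total_rows, either_way)
```

With one kind of "unknown" the truth tables already have nine entries per binary operator. Each additional kind of "missing" that the logic has to distinguish makes those tables larger, not smaller.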

Now, I'm not arguing that NULLs are a wonderful thing, and we should be using them more. They *are* trouble. I had trouble with some NULLs today; a table that, at this point, is quite misdesigned. The NULLs in this pricing table were driving one of our QA guys batty; he was having great trouble trying to delete some rows. The absence of a primary key on the table was entirely troublesome; that will *HAVE* to change before the system gets into production...

The only systematic "NULL evasion" seems to be, in effect, to "fork off" extra tables so that, where a value is missing, there simply is no row to link in. Unfortunately, that complicates the data model...
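
A rough sketch of what that forking-off looks like, using Python's sqlite3 module. The "item" and "item_price" tables are hypothetical names chosen for the example. A price that isn't known is represented by the absence of a row in item_price, so no column ever holds a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the optional attribute lives in its own table.
conn.executescript("""
CREATE TABLE item (
    item_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE item_price (
    item_id INTEGER PRIMARY KEY REFERENCES item(item_id),
    price   INTEGER NOT NULL
);
INSERT INTO item VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo');
INSERT INTO item_price VALUES (1, 10), (3, 30);  -- gadget has no price row
""")

# Items with a known price: a plain inner join, no NULLs anywhere.
priced = conn.execute(
    "SELECT i.name, p.price FROM item i "
    "JOIN item_price p ON p.item_id = i.item_id ORDER BY i.item_id"
).fetchall()

# Items without a price need a separate question.
unpriced = conn.execute(
    "SELECT i.name FROM item i WHERE NOT EXISTS "
    "(SELECT 1 FROM item_price p WHERE p.item_id = i.item_id)"
).fetchall()

# Asking for "everything in one result" brings the NULL straight back.
combined = conn.execute(
    "SELECT i.name, p.price FROM item i "
    "LEFT JOIN item_price p ON p.item_id = i.item_id ORDER BY i.item_id"
).fetchall()

print(priced, unpriced, combined)
```

Two tables and three queries where one of each used to do, and the outer join reintroduces the NULL the moment someone wants a single report. That is the complication.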

-- 
(reverse (concatenate 'string "gro.mca" "_at_" "enworbbc"))
http://linuxfinances.info/info/emacs.html
All ITS machines now have hardware for a new machine instruction --
XOI     Execute Operator Immediate.
Please update your programs.
Received on Wed Mar 22 2006 - 03:57:57 CET
