Re: 3 value logic. Why is SQL so special?

From: Roy Hann <specially_at_processed.almost.meat>
Date: Tue, 19 Sep 2006 09:32:56 +0100
Message-ID: <or6dne8UYrTTNpLYnZ2dnUVZ8qCdnZ2d_at_pipex.net>


"Chris Lim" <blackcap80_at_hotmail.com> wrote in message news:1158651482.204488.296170_at_i42g2000cwa.googlegroups.com...
> However in one of the databases I currently work on, there are dozens
> of attributes captured (particularly for personal customers) for
> marketing purposes. These are all optional as a customer does not have
> to supply them. To avoid NULLs, I would end up with dozens of tables,
> complicating the extremely large number of queries that reference them.

I am afraid I don't see the benefit of using NULL to "record" the certain knowledge that someone declined to supply a piece of information. However, if this were the only way to use NULL I wouldn't bother objecting. Unfortunately there is a very different way to use NULL and it is only your self-discipline (and mine) that keeps us out of trouble. So as far as I can see nullability is a barely useful feature that can also very easily cause serious problems in return. Too many of my predecessors had no self-discipline whatever and created a legacy of complication and uncertainty that I have to live with every day.

> There is also a duplicate-checking process which identifies potential
> duplicates between customers, using several of the optional attributes
> (e.g. First Name, Middle Name, Last Name, Birth Date, as a unique
> identifier is not always available).

Having worked on both medical and police intelligence applications I know very well how hard the problem of identification is. But there are solutions to the problem based on a subtle understanding of what the business problem really is. NULLs have no part in any of the robust solutions.

> The current queries are
> complicated enough (and NULLs are easily handled with ISNULLs).

No doubt--because you are doing at run-time what could have been done at design time. How is your life made simpler by ISNULL?

> Increasing the number of tables just to avoid NULLs would be a
> nightmare in this scenario (unless outer joins were permitted, but then
> you'd end up dealing with NULLs anyway).

I will let that pass, although I have some opinions on it.

>> But if more were required, so what? You've got to write the code to sort
>> it
>> out somewhere, so why not in the query?
>
> I would weigh up the advantages of avoiding NULLs with the extra
> overhead incurred by creating the extra tables (in terms of complexity
> in querying the data).

The compexity doesn't go away. Any additional tables merely reveal the complexity that is otherwise hidden, lying in wait but still very much there. If you don't resolve it with a complex query you have to resolve it with complex code somewhere else. As horrid as it is, SQL is still almost always more concise than whatever ghastly 3GL language we might use otherwise.

> In some cases I do create extra tables, but I
> just find the thought of taking this approach to the extreme to avoid
> all NULLs to be a bit, well, extreme.

I think perhaps we are not as far apart as it seems. For instance we agree that entity subtyping makes a large number of spuriously nullable attributes unnecessary. Many remaining nullable attributes probably don't need to be nullable even given the limitations of SQL data types; they are used freely only because many people see no harm in them.

I won't go so far as to insist you don't ever need to introduce nullable attributes into an SQL database, but only because I couldn't defend that position except to say that I don't remember the last time I had absolutely no choice but to do so.

Roy Received on Tue Sep 19 2006 - 10:32:56 CEST

Original text of this message