Re: WWW/Internet 2009: 2nd CFP until 21 September x

From: Walter Mitty <>
Date: Sat, 15 Aug 2009 20:11:33 GMT
Message-ID: <VDEhm.2272$>

"paul c" <> wrote in message news:n1Dhm.41413$PH1.3194_at_edtnps82...
> Walter Mitty wrote:
>> ....
>> The way I think of it is that every table with nulls in it is a
>> materialized outer join. If you can decompose the table into multiple
>> tables each of which has no nulls, what you discover is that a null in
>> the combined table corresponds to an absent row in one of the decomposed
>> tables.
>> Let me shift gears back into practical mode for a minute. In any
>> database I've ever worked with, the majority of columns are not a primary
>> key, or a foreign key or a part of a primary or foreign key, or ever
>> appear in a where or having clause. Nulls in those columns are of almost
>> no consequence at all in the overall behavior of queries. Shunning nulls
>> in those cases is being overly picky.
>> Nulls in "important" columns almost always cause more trouble than
>> decomposing tables would cause, but nulls in unimportant columns help
>> keep things simple.
> ...
> An implicit suggestion here is that there is a way to determine which apps
> are 'simple' in some sense, eg., not capable of contradictions, other than
> by avoiding nulls. I don't mind people advocating nulls as long as they
> don't pretend they have a theory, logic and algebra that makes their use
> contradiction-free. If they want to have their cake and eat it too, they could
> get to work figuring out how SQL might prevent the definitions and statements
> that cause apps to be 'non-simple' in some useful sense.

I don't so much advocate nulls as decline to avoid them entirely. There are some places where missing data does no harm, or at least very little harm.

As far as having a theory goes, I don't think we should go so far as to generate an algebra of missing data. I'm perfectly content with a mathematical base that stipulates that the data we are given is all we need to compute with. But information systems are not just mathematics. When Codd said that a DBMS should have a systematic method of dealing with missing data, I think I know what he meant. And I agree. Whether SQL's method is systematic enough is something that could be argued either way. I'm willing to live with the SQL implementation, but I wouldn't want to try to defend it. I'm sure a better job can be done.
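
To make the "could be argued either way" point concrete, here is a minimal sketch of one commonly cited oddity in SQL's treatment of nulls. It uses Python's standard sqlite3 module; the table `person` and its columns are hypothetical, invented only for this illustration.

```python
import sqlite3

# Hypothetical table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", 30), ("Bob", None)],  # Bob's age is missing (NULL)
)

# A predicate OR-ed with its own negation looks exhaustive, but a NULL age
# makes both comparisons evaluate to "unknown", so Bob's row is filtered out.
matched = conn.execute(
    "SELECT name FROM person WHERE age = 30 OR age <> 30"
).fetchall()

total = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]

print(matched, total)
```

The query returns only one of the two rows. If `age` never appears in a where or having clause, this behavior never shows up, which is the practical case made above for tolerating nulls in unimportant columns.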

Where a lot of database designers go awry is in assigning "meaning" to missing data. For example: if the spouse's first name is missing, it means the person has no spouse. People who set up information systems with this kind of conventional interpretation are entering the quicksand. I guess, to come back to your comment of a few replies back, this means that I share your qualms about the closed world assumption (CWA). If there is no passenger named Yussuf Islam on the passenger list, it could mean that there is no such passenger on the plane, or it could mean that the passenger list is incomplete or incorrect. (I may be misrepresenting your position on the CWA, but you get the idea.)
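
The spouse example can be sketched in code, tying it back to the "materialized outer join" view quoted earlier. This is only an illustration under assumed names: the tables `person` and `spouse` and their columns are hypothetical, and Python's standard sqlite3 module stands in for whatever DBMS is actually in use.

```python
import sqlite3

# Hypothetical decomposition: neither table contains any nulls.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE spouse ("
    " person_id INTEGER PRIMARY KEY REFERENCES person(person_id),"
    " spouse_first_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO person VALUES (1, 'Ann')")
conn.execute("INSERT INTO person VALUES (2, 'Bob')")
conn.execute("INSERT INTO spouse VALUES (1, 'Carl')")  # no spouse row for Bob

# The outer join rebuilds the single wide table; the absent spouse row for
# Bob surfaces as a NULL in the spouse_first_name column.
rows = conn.execute(
    "SELECT p.name, s.spouse_first_name"
    " FROM person p LEFT OUTER JOIN spouse s ON s.person_id = p.person_id"
    " ORDER BY p.person_id"
).fetchall()

print(rows)
```

Bob's row comes back with a null spouse name, but nothing in the data says whether Bob is unmarried or the spouse was simply never recorded. Reading the null as "no spouse" is a convention layered on top of the data, not something the database asserts.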

Received on Sat Aug 15 2009 - 22:11:33 CEST
