Re: So what's null then if it's not nothing?

From: David Cressey <david.cressey_at_earthlink.net>
Date: Tue, 22 Nov 2005 01:30:28 GMT
Message-ID: <Uwugf.1892$aA2.961_at_newsread2.news.atl.earthlink.net>


"DonR" <donr_work_at_yahoo.com> wrote in message

> UniVerse is one of several Pick-like system, I'll use Pick as a generic
> name here.
> The IBM UniVerse programmers manual has this to say about the NULL
> value.
>
> <Beginning of quote>
> Unknown Data: The Null Value
>
> The null value has a special run-time data type in UniVerse BASIC. It
> was added to UniVerse BASIC for compatibility with UniVerse SQL. The
> null value represents data whose value is unknown.
> Note: Do not confuse the null value with the empty string. The empty
> string is a character string of zero length which is known to have no
> value. Unlike null, whose value is defined as unknown, the value of the
> empty string is known. You cannot use the empty string to represent the
> null value, nor can you use the null value to represent "no value."
> <end of quote>
>
> As stated, the null value was added to comply with SQL standards.

Now let's compare with one of Codd's Twelve Rules:

3. Systematic treatment of null values. Null values (distinct from an empty character string or a string of blank characters and distinct from zero or any other number) are supported in a fully relational DBMS for representing missing information and inapplicable information in a systematic way, independent of the data type.

I pasted this copy from one of the websites that list the 12 rules. It may not be Codd's exact words, but it'll do. Also, it's worth noting that Codd wasn't specifically addressing SQL as such when he wrote the 12 rules.

Both the universe description, and the rule 3 description are careful to point out that Null values are different from the empty character string. A programmer could use an empty string to represent a missing string, by convention, but such a convention by a relational DBMS would run counter to the intent of both Codd and the UniVerse document writers.

Moving on, the phrase "missing information and inapplicable information" seems to me to imply something rather different from "unknown". In particular, "inapplicable information" is fairly clearly something other than "unknown".

So I dislike the use of "unknown" to describe the message conveyed by NULL. I prefer simply "missing."

I would also suggest that the treatment of missing data is somewhat more critical in the relational data model (and the SQL model) than it would seem to be in the Pick model, if I read you right. You see, relational tables and SQL tables are both constrained by their nature to have a rectangular array of containers (what I call "cells", but could easily be called "variables").

But the available (or applicable) information may not always fit neatly into the rectangular form. Therefore some mechanism is needed for explicitly marking a container that has no contents. The SQL Null meets this need. If the Pick model is more fleixible, then it's possible that, at least in some cases, the Pick programmer can leave out the missing or inapplicable content by simply not including the container. This would be analogous to the case in which an entire row was left out of a table, which could be done without recourse to NULLS.

Whether the inflexibility of the SQL table model comes with some simplicity and power that justifies the difficulty is a matter of some difference of opinion between relational enthusiasts and Pick enthusiasts. I'll admit to being a relational enthiast myself, but I would never claim that all information, everywhere, should be made to conform to the relational model. It's a great model, but is range of applicability is not universal.

Finally, I'll point out that the rule 3 description says that the systematic treatment of missing data should be independent of data type. This is quite important in SQL and RDM. Received on Tue Nov 22 2005 - 02:30:28 CET

Original text of this message