Re: So what's null then if it's not nothing?

From: David Cressey <david.cressey_at_earthlink.net>
Date: Fri, 02 Dec 2005 12:04:45 GMT
Message-ID: <xLWjf.8114$N45.3542_at_newsread1.news.atl.earthlink.net>


<michael_at_preece.net> wrote in message news:1133490255.501081.275050_at_g44g2000cwa.googlegroups.com...

> Sorry vc - was it Jon that made that point? Can Jon, or perhaps someone
> else, explain how an empty string differs from a NULL string? I don't
> want to know about a place-holder for a missing string - I mean a
> string that is known to be empty.

They differ in two ways: in representation and in interpretation. The difference in representation is an implementation detail. The difference in interpretation might need some explanation.

The number ZERO is the identity over the operation of addition. This means that ZERO has the nice property that:

     X + 0 == 0 + X == X for every integer X

Among mathematicians, some wag has said, "if the number ZERO did not exist, it would be necessary to invent it."

The EMPTY string (the VARCHAR string of length zero) is the identity over the operation of concatenation, a fundamental operation on strings. EMPTY can be represented in SQL by a couple of single quotes with nothing in between them. For
purposes of these postings, I prefer to use the word EMPTY. EMPTY has the nice property that

  A || EMPTY == EMPTY || A == A for every string A.

So EMPTY plays the same role with respect to concatenation that ZERO plays with regard to addition. People who are doing simple things like mailing lists often don't need top operate on strings in a mathematical sense. But there are some operations on strings that do require this.

Now the history of ideas is such that all flavors of nothingness begin as an abstraction, indicating the non existence of something real, and then take on a reality of their own.

For centuries, the number represented by the single cipher ZERO was used to represent, "there isn't a number here". Then ZERO was added to the set of integers, and it no longer meant what it used to mean. It was a number, like any other number.

The cipher ZERO went through a similar evolution. Originally, in decimal place value notation, a little dot, looks kinda like a decimal point, was used to indicate "there isn't a cipher here". So 1..3 meant one thousand and three, while 13 meant thriteen. Then the Cipher ZERO was added to the set [1,2,3,4,5,6,7,8,9] and it grew into the oh shaped cipher we know today, and the set now looks like this: [1,2,3,4,5,6,7,8,9,0].

A similar evolution happened to the ASCII NUL (char(0)). In the days of paper tape and teletypes, NUL was used to indicate "tape feed". Tape feed was a series of sprocket holes, with no data punched in the eight data tracks. It was understood as "there isn't a character here". At the time the C language was developed, they adopted the convention that, since NUL was unnecessary, it could be used as a sentinel. indicating the end of a variable length string.

But what if the very first character is a NUL? Now the empty string is born.... Yet another flavor of nothingness.

Now, you wouldn't use NUL to represent ZERO (either the number or the cipher). You wouldn't use NUL to represent SPACE. And, if you want to represent the empty string as a string, you wouldn't use it to represent the absence of a string.
The Oracle engineers unfortunately did just this, and it's a mistake. The mistake is not apparent when you do mailing lists, but if you use strings for linguistic analysis, it comes back to bite you.

There are two cases of nothingness that I haven't dealt with yet: the Lisp "nil", and the empty set "[]". I'll skip the lisp nil, because it's beside the point here. The empty set is NOT the absence of a set. It's one of the possible sets. So if you have a container that might contain a set, or might not contain a set, you need two representations:

one for "a set is present and it's empty" and the other for "no set is present". If you use the same representation for both cases you can't tell them apart. This distinction and its necessity ought to be apparent to you, even if the EMPTY and NULL distinction is not.

This is about the fifth time I've tried to explain this. I don't know how to make it any clearer. Received on Fri Dec 02 2005 - 13:04:45 CET

Original text of this message