Re: Joins with nulls

From: Finarfin <finarfin_at_sympatico.ca>
Date: Sun, 24 Nov 2002 00:20:40 -0500
Message-ID: <7mZD9.11031$kS3.1845079_at_news20.bellglobal.com>


Bob Badour wrote:

>
> If first name and last name are always known, I see no need to put
> them in separate relations. The absurdity lies in your example and
> not in any problem of data management. For someone with a single
> name, like Meatloaf, make the last name an empty string. We know
> what the value of an empty string is: a string of zero characters.
> An empty string equals any other empty string and collates "less
> than" any non-empty string.

Hello:

I admit the tone of my follow up may have been a little harsh, but:

Isn't your method a means of dealing with missing information? As such, isn't it a method of dealing with a null?

Also, with respect to the example, it is one everyone can relate to. More obscure examples I must deal with, involving data entered by labourers, would require more background. I struggled for some time to determine how to set things up so that when someone failed to enter a piece of information, I wouldn't be faced with a null. It is just not possible. At times there will be blanks. The correct answer when asking a question of the data in these cases is NULL, no known answer.

Using the Name table as an example. An analogy to the situation I face is: Operator must get a person's name and enter it. Operator gets the first name, but due to circumstances, not the last. The first name is useable for some purposes and is kept as data. We don't know what (or if) the last name was. We have missing data or . . . a null. It is inherent in the underlying data.

For another analogy, consider Date's Suppliers and Parts database. Examine his table S. The person who gets the data is able to determine that S1 is Smith with status 20, but we don't know and can no longer determine the city. We now have a table with missing data. A null. This type of situation represents 10% to 40% of the data I work with. It's not my job to change how the operators work (i.e. the collective agreement), it's my job to capture the information (sometimes incomplete, which in itself is information). From what I have seen, there is no rational way of doing this without nulls. I believe it is precisely this type of situation that Codd envisioned when he stated that a true relational system must have a means of dealing with missing data.

That being said, I must also get information from a particular database where no attempt has been made to eliminate NULLS that do NOT represent missing data. This is a nightmare. We don't know if NULL represents missing data or poor design .

JE Received on Sun Nov 24 2002 - 06:20:40 CET

Original text of this message