Re: NULLs

From: Hugo Kornelis <hugo_at_perFact.REMOVETHIS.info.INVALID>
Date: Fri, 04 Jan 2008 00:12:36 +0100
Message-ID: <63qqn31k38u8ad1of2tb7n6r70cekr39bg_at_4ax.com>


On Sat, 29 Dec 2007 07:00:47 -0800 (PST), JOG wrote:

(snip)
>> That would be the case if I were intentionally storing the fact that an
>> age is not on file, and using (or rather: abusing) the age column for
>> that. But I'm not. I'm just storing ages I have on file.
>
>Then surely you are trying to fit propositions of different
>structures into a mathematical construct (i.e. a relation) that
>proscribes such an undertaking. The conclusion must be that relations
>will not do the job that you require of them?

Agreed (more below)

>> Yes, if there is no value (as represented by the NULL marker) in this
>> column in a specific row, then you can infer from this that I have no
>> age on file for the person described by that row - but this is an
>> unintended though unavoidable by-effect. Just as the fact that you can
>> infer "Jack is an adult" from "Jack is 43 years old" - surely you
>> wouldn't use that to maintain that storing an age violated 1NF?
>
>I think there is a difference Hugo. Stating the Jack is an adult is an
>inference from correct mathematical application of a relation to store
>values, whereas a NULL flag (whatever its meaning) is not a value. I
>understand what you aim to achieve but it seems perhaps that relations
>are the wrong vehicle.

Agreed.

I tried to avoid the term relation in my previous messages (if one accidentally slipped through, my bad). I talked aboout tables.

I'm a big supporter of ORM (not object-relational mappper -ugh, I have to wash my hands every time I type that- but Object Role Modeling) and other similar (but less known) methods such as NIAM. ORM and NIAM base on elementary fact types. I truly believe that all relations should be based on elementary fact types only.

When implementing relations in what is currently usually called a relational database, it *is* possible to create a table per elementary fact type. In that case, no NULLs are ever needed. The facts

are represented in two tables, like this:

EmployeeAge           EmployeeGender
name | age            name | gender
-----+-----           -----+--------
Jack | 43             Mary | female
JJ   | 32             Jack | male

But for various reasons (maintainability and performance being the most important), it is often preferable to combine these relations in a single table, like this:

Employee
name | age | gender
-----+-----+-------

Jack | 43  | male
JJ   | 32  |
Mary |     | female

Some way has to be found to ensure that JJ's gender is not mistaken as '' (empty string), and to ensure that no datatype mismatch error is raised on the blank age of Mary (as a blank is not numeric data). The best (or, if you prefer, least bad) answer to that problem is NULL.

Coming from this background, you'll probably understand why I have little issue with NULL - as it is (to me) just a marker that says "hey, if you hadn't insisted on stuffing the fact types EmployeeAge and EmployeeGender in the same table, there simply would have been no row for Mary in the EmployeeAge table".

>> > It still appears to be a hack imo, and one still ends up
>> >with 3VL.
>>
>> 3VL is not a result of using NULL to represent missing information, but
>> a result of allowing missing information. IMO, there are only two
>> options: either you deal with missing information, and with the 3VL that
>> results from it -- or you somehow alter reality so that information is
>> never missing again, for any reason.
>
>I think I disagree there. In predicate logic I can state (and
>question) everything I know about the world without turning to 3VL, so
>it must be possible in a data model surely? If I know an attribute
>does or doesn't exist, then I can state that formally (with
>existential quantifiers), and If I don't know anything about that
>attribute...well I just don't say anything about it at all.

Yes, indeed: "If I know an attribute does or doesn't exist, then I can state that formally" - turning the fact that you know an attribute does or doesn't exist into a new attribute. Add this attribute to your relational model, and you can use it to avoid comparisons to NULL. But I doubt if the actual results of the queries will ever be different from what the current NULL handling does. (See my reply to David).

(snip remainder - either already adressed in reply to David, or no need to react)

>Happy New Year, Jim.

And the same to you!

Best, Hugo Received on Fri Jan 04 2008 - 00:12:36 CET

Original text of this message