Re: NULLs

From: JOG <jog_at_cs.nott.ac.uk>
Date: Sat, 29 Dec 2007 07:00:47 -0800 (PST)
Message-ID: <8e3bc8f8-3bbd-42fe-8512-f6611f5c57b9_at_1g2000hsl.googlegroups.com>


On Dec 28, 10:59 pm, Hugo Kornelis
<h..._at_perFact.REMOVETHIS.info.INVALID> wrote:
> On Thu, 27 Dec 2007 16:41:51 -0800 (PST), JOG wrote:
> >On Dec 27, 11:31 pm, Hugo Kornelis
> ><h..._at_perFact.REMOVETHIS.info.INVALID> wrote:
> >> On Wed, 26 Dec 2007 23:17:52 -0400, Bob Badour wrote:
> >> > Codd pointed out that a single NULL marker did not suffice and
> >> >suggested 2 markers. Date pointed out that one can apply the same
> >> >argument to 2 markers leading to an infinite progression once one heads
> >> >down that path, which suggests the path was never a productive one to
> >> >head down in the first place.
>
> >> Hi Bob,
>
> >> Unfortunately, both Codd and Date forgot that the NULL marker in (for
> >> instance) the age column should represent only that the age is not on
> >> file and not try to represent a reason for this as well.
>
> >> Codd's suggestions to use two markers (for "not applicable" and
> >> "unknown", IIRC) assumes that we want to store both the age of a person
> >> (if on file), and the reason why an age is not on file (if it isn't).
> >> That can of course be necessary - but in that case, we have two
> >> attributes that should be stored in two seperate columns.
>
> >> Attempting to store both the age and the reason why an age is unknown in
> >> a single column violates first normal form.
>
> >Is the Zaniolo approach you are favouring not doing exactly the same
> >thing? Surely it is attempting to store both an age and the fact that
> >an age is not on file in a single column, which also violates first
> >normal form.
>
> Hi J,
>
> That would be the case if I were intentionally storing the fact that an
> age is not on file, and using (or rather: abusing) the age column for
> that. But I'm not. I'm just storing ages I have on file.

Then surely you are trying to fit propositions of different structures into a mathematical construct (i.e. a relation) that proscribes such an undertaking. The conclusion must be that relations will not do the job that you require of them?

>
> Yes, if there is no value (as represented by the NULL marker) in this
> column in a specific row, then you can infer from this that I have no
> age on file for the person described by that row - but this is an
> unintended though unavoidable by-effect. Just as the fact that you can
> infer "Jack is an adult" from "Jack is 43 years old" - surely you
> wouldn't use that to maintain that storing an age violated 1NF?

I think there is a difference Hugo. Stating the Jack is an adult is an inference from correct mathematical application of a relation to store values, whereas a NULL flag (whatever its meaning) is not a value. I understand what you aim to achieve but it seems perhaps that relations are the wrong vehicle.

>
> > It still appears to be a hack imo, and one still ends up
> >with 3VL.
>
> 3VL is not a result of using NULL to represent missing information, but
> a result of allowing missing information. IMO, there are only two
> options: either you deal with missing information, and with the 3VL that
> results from it -- or you somehow alter reality so that information is
> never missing again, for any reason.

I think I disagree there. In predicate logic I can state (and question) everything I know about the world without turning to 3VL, so it must be possible in a data model surely? If I know an attribute does or doesn't exist, then I can state that formally (with existential quantifiers), and If I don't know anything about that attribute...well I just don't say anything about it at all.

>
> > There has to be a more elegant way....Regards, J.
>
> I doubt it. Given this information:
>
> * "Employee Jack is male"
> * "Employee Mary is female"
> * "Employee Jack is 43 years old"
> * "Employee JJ is 32 years old"
>
> how would you answer the below questions:
>
> * "List all employees aged 40 and above".
> * "What is the average age of our employees?"
> * "For each employee, how many years left until retirement (assume a
> country with laws for retirement age of 65)"
> * "List all employees that are female, under 35 years old, or both"
> * "Is JJ older than Mary?"
> * etc

Yup, I catch your drift. However I believe these questions should have "in propositions that you know" concatenated to them, so we still have 2VL, and leaving us humans in the real world to interpret the results. The CWA is a nonsense imo. But I do agree that with CWA, in RM, 3VL is very difficult to avoid, and nulls are incredibly tempting to use. My personal conclusion therefore is that perhaps we are trying to address /symptoms/ and not the root cause of the issue, and a step back is required.

>
> Try to describe how you would answer those questions, in an elegant way
> and without using anything even remotely resembling 3VL. If you succeed
> at that, a database without 3VL is just around the corner - because the
> hardest part is not implementing, but finding out what to implement.

And I shall call it the Propositional Model. And it shall glisten like golden sunlight, before industry completely ignores it. And I'll publish it in a journal wellllll in advance of letting you lot rip it to shreds ;)

>
> I can give answers to all questions above, but I have to use 3VL in all
> cases. For me, that signifies that 3VL is part of reality and hence
> can't be left out of a database that attempts to model (aspects of)
> reality.
>
> Best, Hugo

Happy New Year, Jim. Received on Sat Dec 29 2007 - 16:00:47 CET

Original text of this message