Re: Proposal: 6NF
Date: Fri, 13 Oct 2006 00:19:17 +0200
Message-ID: <j9eti25ftf87hkqqqfp7vj51s72h5ed6jl_at_4ax.com>
On Wed, 11 Oct 2006 02:59:26 GMT, J M Davitt wrote:
(snip)
>There's a bit of a wrinkle here: NULL -- as found in those
>non-relational databases which claim to be relational -- is
>a marker that denotes several things, only one of which is
>absence of a value.
Hi J M,
I disagree. And so does the ANSI standard for SQL, that holds most of the definitions that are valid in most non-relational databases which claim to be relational.
"ISO/IEC 9075-1:2003 (E), 3.1.1.11 null value: A special value that is used to indicate the absence of any data value."
> It is because those NULLs have no clear
>genesis that one can't make sense of anything holding a NULL.
>We can say, "There's a NULL here because ___ or because ___
>or because ___ ..."
Sure, you can say that. But if you do, you have modeled a predicate. And that predicate should end up in the table as a seperate column. Unless, of course, the UoD allows only one reason for the data to be absent, or unless the reason for the data being absent is totally irrelevant.
> In other words, it's impossible to know
>what any given NULL means.
It means "there's no data here". Nothing more, nothing less.
> Consequently, it's impossible to
>construct declarative, conjunctive-only, quantifiable
>statements using some thing that holds one of those NULLs.
Correct. Constructing such statements requires either using the knowledge that in this particular UoD, this particular column can only be NULL for one single reason; or using a seperate predicate (attribute, column) to hold the reason why that particular instance of data is missing in the DBMS' population.
There have been attempts to have several types of NULL, depending on the reason why the data is missing. They have fallen flat on their face. But what everyone apparently overlooked is that they were doomed to fail right from the start, because the very notion of different types of NULL for different reasons of missing data is a violation of first normal form. Just think about it - consider a column "Birthdate" in which you can store John's age as Nov 6, 1962, Mary's birthdate as unknown and Oracle's birthdate as not applicable. You are now storing facts of the type "<Person> is born on <Date>" and "<Person>'s birthdate is absent because of <Reason>" in a single column!
Everyone who believes in 1NF should store this in two columns. One for facts of the type "<Person> is born on <Date>", and a second one for facts of the type "<Person>'s birthdate is absent because of <Reason>". The reason for a NULL in the former column can be found in the latter column. And a NULL in the latter can, in this UoD, only mean that the former column is not NULL (which, by the way, can easily be verified).
Best, Hugo Received on Fri Oct 13 2006 - 00:19:17 CEST
