Re: NULLs

From: Hugo Kornelis <hugo_at_perFact.REMOVETHIS.info.INVALID>
Date: Tue, 08 Jan 2008 01:05:18 +0100
Message-ID: <c2f5o3171fa992pf9b4a191857si79s7gc_at_4ax.com>


On Fri, 04 Jan 2008 14:50:55 GMT, Brian Selzer wrote:

>Excuse me, but I don't think normalization has any bearing whatsoever. Even
>with a fully normalized schema it may be that there can be missing
>information. Even if there is one table per elementary fact type, there can
>still be missing information: the difference is that instead of a null that
>indicates that there should be a value but it just hasn't been supplied, you
>have to decide whether the absence of a row indicates that the attribute
>does not apply or that it does apply but that a value just hasn't been
>supplied.

Hi Brian,

No, you don't have to decide that at all. There are two possibilities:

  1. The business wants to store this fact if available, but if not they don't give a hoot why it's unavailable.
  2. The business wants to store this fact if available, and if not they do care why it is unavailable.

In the first case, you don't have to decide why a fact is absent bacause the business doesn't care about that reason.

In the latter case, the reason for the fact's absence becomes a new proposition: "The age of person <PersonID> is not recorded in the database because of <reason>", which eventually ends up as a seperate column in the table - so instead of having to decide why the age of Aunt Maggie is not recorded, you just fetch the reason from this column.

> Splitting the tables only alters the indicator from an explicit
>one to an implicit one, and information is lost in the process.

Information that the business didn't want to track in the first place. Unless we assume that the original model was incomplete.

(snip)
> the database no longer directly represents reality (as
>is the case when using an explicit indicator), but rather what is known to
>be true about reality.

No database can ever directly represent reality. Since a database always relies on some external process or operator to feed it with data, it can never represent anything but what is known (to the database) to be true about reality.

And even that only under the assumption that the data entered into the database can only be assumed to be true.

Best, Hugo Received on Tue Jan 08 2008 - 01:05:18 CET

Original text of this message