Re: NULLs

From: Brian Selzer <brian_at_selzer-software.com>
Date: Fri, 04 Jan 2008 14:50:55 GMT
Message-ID: <jJrfj.2407$jJ5.2224_at_newssvr11.news.prodigy.net>


"Hugo Kornelis" <hugo_at_perFact.REMOVETHIS.info.INVALID> wrote in message news:60rqn35dm3og64dt42hgdisgm8rut5lqr2_at_4ax.com...
> On Sat, 29 Dec 2007 10:34:06 -0800 (PST), Marshall wrote:
>
>>On Dec 28, 2:59 pm, Hugo Kornelis
>><h..._at_perFact.REMOVETHIS.info.INVALID> wrote:
>>>
>>> 3VL is not a result of using NULL to represent missing information, but
>>> a result of allowing missing information. IMO, there are only two
>>> options: either you deal with missing information, and with the 3VL that
>>> results from it -- or you somehow alter reality so that information is
>>> never missing again, for any reason.
>>
>>Full normalization is a third option.
>
> Hi Marshall,
>
> As in "one table per elementary fact type"? Full agreement here, see my
> message to Jim.
>

Excuse me, but I don't think normalization has any bearing whatsoever. Even with a fully normalized schema it may be that there can be missing information. Even if there is one table per elementary fact type, there can still be missing information: the difference is that instead of a null that indicates that there should be a value but it just hasn't been supplied, you have to decide whether the absence of a row indicates that the attribute does not apply or that it does apply but that a value just hasn't been supplied. Splitting the tables only alters the indicator from an explicit one to an implicit one, and information is lost in the process. Moreover, decomposition isn't enough, you must also alter the intended interpretation for those elementary fact tables so that each represents what /is known to be/ the case instead of what /is/ the case. Each table would then be closed with respect to what is known about the world, but this can lead to confusion: if it is true that it is known that 'X' is the case, then 'X' is the case, but if it is not true that it is known that 'X' is the case, then 'X' may or may not be the case. So you again are faced with 3VL.

Using nulls that indicate only that there should be a value but it just hasn't been supplied interferes only with the Domain Closure Assumption, since it can no longer be determined with certainty which individuals exist, but using one table per elementary fact type instead plays havoc with both the Domain Closure and Closed World Assumptions, and introduces an extra level of indirection: the database no longer directly represents reality (as is the case when using an explicit indicator), but rather what is known to be true about reality.

> I think I addressed the rest of your message in my reply to David. Let
> me know if you think I failed to address any of your points.
>
> Best, Hugo
Received on Fri Jan 04 2008 - 15:50:55 CET

Original text of this message