Re: NULLs

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sat, 05 Jan 2008 04:49:43 GMT
Message-ID: <H%Dfj.1804$pA7.1522_at_newssvr25.news.prodigy.net>


"JOG" <jog_at_cs.nott.ac.uk> wrote in message news:44eb5e67-8990-436a-97d2-745564952739_at_x69g2000hsx.googlegroups.com...

> On Jan 4, 2:50 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:

>> "Hugo Kornelis" <h..._at_perFact.REMOVETHIS.info.INVALID> wrote in message
>>
>> news:60rqn35dm3og64dt42hgdisgm8rut5lqr2_at_4ax.com...
>>
>>
>>
>> > On Sat, 29 Dec 2007 10:34:06 -0800 (PST), Marshall wrote:
>>
>> >>On Dec 28, 2:59 pm, Hugo Kornelis
>> >><h..._at_perFact.REMOVETHIS.info.INVALID> wrote:
>>
>> >>> 3VL is not a result of using NULL to represent missing information,
>> >>> but
>> >>> a result of allowing missing information. IMO, there are only two
>> >>> options: either you deal with missing information, and with the 3VL
>> >>> that
>> >>> results from it -- or you somehow alter reality so that information
>> >>> is
>> >>> never missing again, for any reason.
>>
>> >>Full normalization is a third option.
>>
>> > Hi Marshall,
>>
>> > As in "one table per elementary fact type"? Full agreement here, see my
>> > message to Jim.
>>
>> Excuse me, but I don't think normalization has any bearing whatsoever.
>> Even
>> with a fully normalized schema it may be that there can be missing
>> information.
>
> I think we have to be careful on terminology here - if a model has no
> nulls in it then as far as the /logical model/ is concerned absolutely
> nothing is missing. All predicates are satisfied by full
> instantiations.There is a distinction between what it means for
> information to be missing at the conceptual layer and the logical
> layer, and I fear a lot of arguments about the issue on cdt are down
> to not specifying which layer is being discussed.
>

I'm not sure what you mean. Can you please elaborate on the distinction between what it means for information to be missing at the conceptural layer and the logical layer? As far as I'm concerned, if there is a difference between what is the case and what is represented in the database, that is, if there is information that could be represented in the database but isn't, then that information is missing. If a model has no nulls but there is a difference between what is the case and what is represented in the database, then at least one relation must be subject to an open world interpretation (if there's no row, then you don't know) instead of the closed world interpretation (if there's no row, then it ain't so). That it can be that you don't know raises the spectre of 3VL, regardless of what layer you're at.

>> Even if there is one table per elementary fact type, there can
>> still be missing information: the difference is that instead of a null
>> that
>> indicates that there should be a value but it just hasn't been supplied,
>> you
>> have to decide whether the absence of a row indicates that the attribute
>> does not apply or that it does apply but that a value just hasn't been
>> supplied.
>> Splitting the tables only alters the indicator from an explicit
>> one to an implicit one, and information is lost in the process.
>> Moreover,
>> decomposition isn't enough, you must also alter the intended
>> interpretation
>> for those elementary fact tables so that each represents what /is known
>> to
>> be/ the case instead of what /is/ the case. Each table would then be
>> closed
>> with respect to what is known about the world, but this can lead to
>> confusion: if it is true that it is known that 'X' is the case, then 'X'
>> is
>> the case, but if it is not true that it is known that 'X' is the case,
>> then
>> 'X' may or may not be the case. So you again are faced with 3VL.
>>
>> Using nulls that indicate only that there should be a value but it just
>> hasn't been supplied interferes only with the Domain Closure Assumption,
>> since it can no longer be determined with certainty which individuals
>> exist,
>> but using one table per elementary fact type instead plays havoc with
>> both
>> the Domain Closure and Closed World Assumptions, and introduces an extra
>> level of indirection: the database no longer directly represents reality
>> (as
>> is the case when using an explicit indicator), but rather what is known
>> to
>> be true about reality.
>>
>> > I think I addressed the rest of your message in my reply to David. Let
>> > me know if you think I failed to address any of your points.
>>
>> > Best, Hugo
> Received on Sat Jan 05 2008 - 05:49:43 CET

Original text of this message