Re: NULLs

From: Brian Selzer <brian_at_selzer-software.com>
Date: Mon, 31 Dec 2007 03:32:05 -0500
Message-ID: <9O1ej.2355$se5.1148_at_nlpi069.nbdc.sbc.com>


"JOG" <jog_at_cs.nott.ac.uk> wrote in message news:aaf9d91c-fb76-412c-9dbb-213a0f3f70ce_at_i12g2000prf.googlegroups.com...

> On Dec 30, 10:46 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:

>> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>>
>> news:c9ec9790-f950-4407-b1f8-e5b8f4d07e91_at_v4g2000hsf.googlegroups.com...
>>
>>
>>
>> > On Dec 30, 7:23 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
>> >> "stevedtrm" <steved..._at_hotmail.com> wrote in message
>>
>> >>news:a3716365-7bc4-4a75-ab58-ee7ac957c299_at_n20g2000hsh.googlegroups.com...
>>
>> >> >> > If everyone is clear NULLS shouldn't be used, why the debate as
>> >> >> > to
>> >> >> > what
>> >> >> > to do about them ?
>>
>> >> >> Because SQL allows NULL and even promotes the idea that NULL solves
>> >> >> some
>> >> >> problem instead of introducing many.
>>
>> >> > So everyone is agreed that NULLs shouldn't appear anywhere, and its
>> >> > just a matter of time before NULLS become a legacy problem and a
>> >> > relational language supercedes SQL?
>>
>> >> No, Steve, not everyone is agreed.
>>
>> >> > Are the two solutions I suggested before the widely accepted as
>> >> > resolutions to the two problems NULLs were introduced to eradicate?
>>
>> >> >>Missing information? Then surely there should just be no tuple?
>>
>> >> A row may contain a lot more information that would otherwise be
>> >> useful.
>> >> Would you tell a potential customer that you won't serve them because
>> >> you
>> >> don't know everything about them? I think not! Haven't you been
>> >> asked
>> >> for
>> >> your e-mail address at a retailer? What would you think if the
>> >> cashier
>> >> told
>> >> you to get lost for refusing to supply it?
>>
>> >> >>To indicate that there can be no value? Why not a seperate table
>> >> >>with
>> >> >>a boolean value in the non-key column?
>>
>> >> In my opinion, null should only ever be used to indicate "There should
>> >> be
>> >> a
>> >> value here, but it hasn't been supplied." That there should be a
>> >> value
>> >> or
>> >> that there must be a value are structural issues, and therefore should
>> >> not
>> >> be subject to interpretation. "No value here" invites interpretation,
>> >> because it begs the question, "Should there have even been a value?"
>>
>> >> Structural information is lost by separating a nullable column into
>> >> its
>> >> own
>> >> table: the fact that there should always be a value, even if it hasn't
>> >> been
>> >> supplied is lost.
>>
>> > I think this could be put better. It is not that there should always
>> > be a value, but rather that "a value exists in the real world".
>> > However I don't really see what is structurally lost by complete
>> > decomposition to eliminate nulls, that can't be remedied via
>> > constraints.
>>
>> If you complete the interpretation such that all missing values are
>> supplied, then the resulting database should be a set of relations. When
>> there is a nullable column on a table, there will a 1:1 correspondence
>> between values in the completed relation. The problem with separating a
>> column that can contain nulls into another table is that when all missing
>> values are supplied, there is no requirement that for every tuple in the
>> referenced relation there must be one and only one tuple in the
>> referencing
>> relation. Adding a constraint to guarantee that 1:1 correspondence would
>> be
>> the same as not permitting the information to be missing in the first
>> place.
>> Thus the only ways that I can see to remedy that would be either to add a
>> boolean column, or to add another table.
>
> Could you clarify why you think  using nulls maintain structural
> information? If one is saying that a datum /must/ exist as part of the
> proposition, why is there a nullable column allowing it not to?
> Perhaps you are saying that the nullable column is stating that the
> value "should be" part of the proposition, but "we're being a bit lax
> about it" whereas a decomposed version does not?
>
> If so, I am unclear at the moment what the value of this is - perhaps
> you could refer me to an example of when it would be necessary?
>

There is a functional relationship between a key and each non-key column. If null can only indicate that there should be a value but it hasn't been supplied, then the presence of a null doesn't alter that functional relationship. In other words, for each particular key value, there is one and only one value for each non-key column--even if it is unclear what that value is. In a perfect world, there is no missing information, and a table that permits nulls becomes a relation that doesn't. In such a relation, the relationship I'm referring to would be called a functional dependency. By separating a nullable column into its own table, that functional relationship is lost. Just as is the case when a relation R {A, B, C} where A is the key is separated into a pair of relations, S {A, B} and T {A, C} such that T[A] IN S[A], since the values for A, B and C no longer appear in the same tuple, it is now possible for there to be a value for A and B without a value for C: the functional relationship between A and C is lost.

>>
>> >> That's not to say that it isn't possible to eliminate
>> >> nulls altoghether, it's just that you need several separate tables
>> >> instead
>> >> of one: one table for the values that must always be present, one
>> >> table
>> >> for
>> >> each nullable column (or set of columns in those cases where whenever
>> >> one
>> >> value is supplied, another value must also be supplied) that indicates
>> >> whether or not there should be a value (or set of values), and one
>> >> table
>> >> with a column (or set of columns) for those values that just happen to
>> >> have
>> >> been supplied.
>>
>> >> Note also that the spectre of 3VL disappears when null indicates only
>> >> "there
>> >> should be a value here, but it hasn't been supplied." Since there
>> >> should
>> >> be
>> >> a value, and only one value, a null can simply be replaced by an
>> >> exclusive
>> >> disjunction that encompasses all of the values in the attribute's
>> >> domain.
>> >> So there is no need for 3VL, 2VL suffices. (For infinite domains, a
>> >> quantifier could be used that means instead of "there is...," "there
>> >> is
>> >> exactly one....")
Received on Mon Dec 31 2007 - 09:32:05 CET

Original text of this message