Re: NULLs

From: JOG <jog_at_cs.nott.ac.uk>
Date: Mon, 31 Dec 2007 05:33:35 -0800 (PST)
Message-ID: <b844301b-bcdd-4098-8c73-08ce49165238_at_w56g2000hsf.googlegroups.com>


On Dec 31, 8:32 am, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>
> news:aaf9d91c-fb76-412c-9dbb-213a0f3f70ce_at_i12g2000prf.googlegroups.com...
>
>
>
> > On Dec 30, 10:46 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> >> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>
> >>news:c9ec9790-f950-4407-b1f8-e5b8f4d07e91_at_v4g2000hsf.googlegroups.com...
>
> >> > On Dec 30, 7:23 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> >> >> "stevedtrm" <steved..._at_hotmail.com> wrote in message
>
> >> >>news:a3716365-7bc4-4a75-ab58-ee7ac957c299_at_n20g2000hsh.googlegroups.com...
>
> >> >> >> > If everyone is clear NULLS shouldn't be used, why the debate as
> >> >> >> > to
> >> >> >> > what
> >> >> >> > to do about them ?
>
> >> >> >> Because SQL allows NULL and even promotes the idea that NULL solves
> >> >> >> some
> >> >> >> problem instead of introducing many.
>
> >> >> > So everyone is agreed that NULLs shouldn't appear anywhere, and its
> >> >> > just a matter of time before NULLS become a legacy problem and a
> >> >> > relational language supercedes SQL?
>
> >> >> No, Steve, not everyone is agreed.
>
> >> >> > Are the two solutions I suggested before the widely accepted as
> >> >> > resolutions to the two problems NULLs were introduced to eradicate?
>
> >> >> >>Missing information? Then surely there should just be no tuple?
>
> >> >> A row may contain a lot more information that would otherwise be
> >> >> useful.
> >> >> Would you tell a potential customer that you won't serve them because
> >> >> you
> >> >> don't know everything about them? I think not! Haven't you been
> >> >> asked
> >> >> for
> >> >> your e-mail address at a retailer? What would you think if the
> >> >> cashier
> >> >> told
> >> >> you to get lost for refusing to supply it?
>
> >> >> >>To indicate that there can be no value? Why not a seperate table
> >> >> >>with
> >> >> >>a boolean value in the non-key column?
>
> >> >> In my opinion, null should only ever be used to indicate "There should
> >> >> be
> >> >> a
> >> >> value here, but it hasn't been supplied." That there should be a
> >> >> value
> >> >> or
> >> >> that there must be a value are structural issues, and therefore should
> >> >> not
> >> >> be subject to interpretation. "No value here" invites interpretation,
> >> >> because it begs the question, "Should there have even been a value?"
>
> >> >> Structural information is lost by separating a nullable column into
> >> >> its
> >> >> own
> >> >> table: the fact that there should always be a value, even if it hasn't
> >> >> been
> >> >> supplied is lost.
>
> >> > I think this could be put better. It is not that there should always
> >> > be a value, but rather that "a value exists in the real world".
> >> > However I don't really see what is structurally lost by complete
> >> > decomposition to eliminate nulls, that can't be remedied via
> >> > constraints.
>
> >> If you complete the interpretation such that all missing values are
> >> supplied, then the resulting database should be a set of relations. When
> >> there is a nullable column on a table, there will a 1:1 correspondence
> >> between values in the completed relation. The problem with separating a
> >> column that can contain nulls into another table is that when all missing
> >> values are supplied, there is no requirement that for every tuple in the
> >> referenced relation there must be one and only one tuple in the
> >> referencing
> >> relation. Adding a constraint to guarantee that 1:1 correspondence would
> >> be
> >> the same as not permitting the information to be missing in the first
> >> place.
> >> Thus the only ways that I can see to remedy that would be either to add a
> >> boolean column, or to add another table.
>
> > Could you clarify why you think using nulls maintain structural
> > information? If one is saying that a datum /must/ exist as part of the
> > proposition, why is there a nullable column allowing it not to?
> > Perhaps you are saying that the nullable column is stating that the
> > value "should be" part of the proposition, but "we're being a bit lax
> > about it" whereas a decomposed version does not?
>
> > If so, I am unclear at the moment what the value of this is - perhaps
> > you could refer me to an example of when it would be necessary?
>
> There is a functional relationship between a key and each non-key column.
> If null can only indicate that there should be a value but it hasn't been
> supplied, then the presence of a null doesn't alter that functional
> relationship. In other words, for each particular key value, there is one
> and only one value for each non-key column--even if it is unclear what that
> value is. In a perfect world, there is no missing information, and a table
> that permits nulls becomes a relation that doesn't. In such a relation, the
> relationship I'm referring to would be called a functional dependency. By
> separating a nullable column into its own table, that functional
> relationship is lost.

But by allowing a column to be nullable aren't you saying that the "functional relationship" wasn't valid in the first place.

> Just as is the case when a relation R {A, B, C} where
> A is the key is separated into a pair of relations, S {A, B} and T {A, C}
> such that T[A] IN S[A], since the values for A, B and C no longer appear in
> the same tuple, it is now possible for there to be a value for A and B
> without a value for C: the functional relationship between A and C is lost.

I appreciate the argument for requiring extra constraints to maintain all information after decomposition under full info. However, if you are allowing nulls then surely you are already acceding that those constraints can be violated.

>
>
>
> >> >> That's not to say that it isn't possible to eliminate
> >> >> nulls altoghether, it's just that you need several separate tables
> >> >> instead
> >> >> of one: one table for the values that must always be present, one
> >> >> table
> >> >> for
> >> >> each nullable column (or set of columns in those cases where whenever
> >> >> one
> >> >> value is supplied, another value must also be supplied) that indicates
> >> >> whether or not there should be a value (or set of values), and one
> >> >> table
> >> >> with a column (or set of columns) for those values that just happen to
> >> >> have
> >> >> been supplied.
>
> >> >> Note also that the spectre of 3VL disappears when null indicates only
> >> >> "there
> >> >> should be a value here, but it hasn't been supplied." Since there
> >> >> should
> >> >> be
> >> >> a value, and only one value, a null can simply be replaced by an
> >> >> exclusive
> >> >> disjunction that encompasses all of the values in the attribute's
> >> >> domain.
> >> >> So there is no need for 3VL, 2VL suffices. (For infinite domains, a
> >> >> quantifier could be used that means instead of "there is...," "there
> >> >> is
> >> >> exactly one....")
Received on Mon Dec 31 2007 - 14:33:35 CET

Original text of this message