Re: NULLs: theoretical problems?

From: JOG <jog_at_cs.nott.ac.uk>
Date: Mon, 13 Aug 2007 05:29:23 -0700
Message-ID: <1187008163.120347.20000_at_22g2000hsm.googlegroups.com>

On Aug 13, 1:06 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>
> news:1186971690.979746.159320_at_l70g2000hse.googlegroups.com...
>
>
>
> > On Aug 8, 11:43 am, "sinister" <sinis..._at_nospam.invalid> wrote:
> >> Many discussions point out one deficiency of NULLs: that they collapse
> >> multiple, distinct concepts into one ("no value possible," "value
> >> missing,"
> >> "value not available at this time", etc).
>
> >> What are the other theoretical problems? My impression from skimming
> >> some
> >> threads in this ng is that some anomalies might occur, maybe having to do
> >> with NULLs and joins, or NULLs and keys composed of more than one field,
> >> but
> >> I'm not sure.
>
> > A database stores true propositions. A statement with a hole in it (an
> > SQL null) is not a proposition, and hence is a theoretical abhorrence.
>
> Not necessarily. If there can always be a value for an attribute, but it
> simply hasn't been supplied, then the "hole" isn't a hole, but rather a
> disjunction that means one of the values in the domain for the attribute.
> The statement is therefore a true proposition, even if it is indeterminant.

I disagree, even in that situation propositions can't have holes or gaps according to relational theory. If one is kludging in a flag that means "one of the values from the domain", one is in fact trying to state something completely different using the existential quantifier. And that will make it a predicate not a proposition due to the inclusion of "metadata" (for want of a better term). For example:

Name(Tom) ^ Age(38) ^ Ex HairColour(x)

or if you don't want to include attribute names: Ex Person(Tom, 38, x)

which is entirely different kettle of fish from: Person(Tom, 38, brown)

The former is a statement in FOL, the latter a statement in propositional logic. In RM only the relation is predicated, with tuples satisfying that predicate, and hence the null flag is theoretically and mathematically incorrect.

>
> > Or from a different angle you might want to consider that a relation
> > is a set of tuples. A tuple must contain a value in every position, or
> > it is not a valid tuple (not being a subset of the cartesian product
> > of the domains being considered). Hence an SQL-null is a theoretical
> > abhorrence.
>
> I wouldn't say that it is not a valid tuple, but rather that the set of
> tuples is no longer strictly a relation. A tuple is simply a set of named
> values, but a relation in its strictest sense is a named set of tuples that
> each have one value for each of the attributes in the heading. As can
> easily be seen above, the lack of regularity is not necessarily a problem,
> unless there are instances where there cannot be a value for an attribute in
> each tuple. I therefore wouldn't call it a "theoretical abhorrence," but
> rather a more concise way to represent the same information. There can be
> merit in a more concise representation, not the least of which is the
> reduced computational complexity (big-O) for both queries and updates.
>
> I do agree that the 3VL involving SQL-null is definitely broken.

There we are agreed.

>
> > The solution is of course to decompose a relation around its key so
> > that no nullable columns are required. This results in a schema with
> > more relations, and tends to produce queries with more joins, but
> > without any theoretical or logical flaws.
>
> > Some view such decomposition as being computationally inefficient,
> > others reply that this is a physical implementation issue and not a
> > concern of the logical model.
>
> > Given the fact that it can generate much longer queries, I sometimes
> > find myself allowing nulls in personal home-brew projects out of sheer
> > laziness. However I do suffer from internal pangs of guilt during the
> > process ;)
Received on Mon Aug 13 2007 - 14:29:23 CEST

This message: [ Message body ]
Next message: Bob Badour: "Re: A pk is *both* a physical and a logical object."
Previous message: Cimode: "Re: DATA MINING: Use of modern heuristics to transform and select regressors for linear modelling"
In reply to: Brian Selzer: "Re: NULLs: theoretical problems?"
In reply to JOG: "Re: NULLs: theoretical problems?"
Next in thread: Brian Selzer: "Re: A pk is *both* a physical and a logical object."

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message