Re: Examples of SQL anomalies?

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sat, 5 Jul 2008 10:41:59 -0400
Message-ID: <XKLbk.6511$LG4.3140_at_nlpi065.nbdc.sbc.com>


"David Cressey" <cressey73_at_verizon.net> wrote in message news:LGJbk.201$iU.196_at_trndny02...
>
> "David BL" <davidbl_at_iinet.net.au> wrote in message
> news:cfb587f5-ff58-48c7-ad1d-53540c7b8d7d_at_e39g2000hsf.googlegroups.com...
>> On Jul 5, 8:49 am, Marshall <marshall.spi..._at_gmail.com> wrote:
>> > What can be meaningfully asked is determined by the schema.
>> >
>> > If the schema specifies that the weight attribute is nullable,
>> > then the question of how much a shipment weighs in total
>> > is a question that cannot be asked.
>> >
>> > Marshall
>>
>> IMO aggregate functions are only defined on well defined sets and only
>> give well defined answers. SUM should never return NULL.
>> Nevertheless it is meaningful to calculate SUM on a nullable weight
>> attribute. Rather than upset the mathematical simplicity of aggregate
>> functions one can easily make sense of the result by correctly
>> interpreting the intensional definition of the set being aggregated.
>> For example in this case it is the set of *known* weights. It is
>> absurd to not allow this to be calculated, by returning NULL.
>
> This is degenerating into yet another philosophical exercise on the
> meaning
> of NULL.
>
> The only way to make sense out of all this is as follows:
>
> Consider a column the might actually contain NULLS to be (part of) a
> materialized OUTER JOIN. Decompose the tables back to the point where all
> columns of all tables can be required data. (I guess, from prior
> discussions, that this is 6th normal form. Someone could check this for
> me). Compute how the column with missing weights appears in this
> equivalent
> 6th normal form representation. Figure out what the SUM aggregate does
> (or
> alternatively, figure out what it ought to do) with the data in 6th normal
> form. Cause the SUM aggregate to do the same thing with the table in its
> original form.
>

6NF requires that each relation have at most one non-prime attribute. (I think it goes further than that, allowing that for each key only 1 attribute in the relation that is not part of that key--even if that attribute is part of a different key, but I could be wrong.) It is possible for there to be more than one column in a table that is not part of the key and is also not nullable, so the result of decomposition would not necessarily be in 6NF. In addition, when you decompose a relation by taking projections, you eliminate duplicates within each projection--the result of taking a projection is a set, not a bag, which would alter the sum. For example, if you have three packages that weigh 1 pound--say three pounds of coffee--the projection would have one row instead of three, so the sum of weight would be 1 pound instead of 3:

Lines{{line#:1, item:coffee},{line#:2, item:coffee},{line#:3, item:coffee}} Weights{{item:coffee, weight:1lb}}

> I haven't done the computation, but I'm guessing that you'll end up with
> exactly the same behavior you get from the present SQL SUM aggregate.
>
> Missing data, which shows up as NULL values in the original form
> (original
> only in the sense that that's where we started the discussion) will show
> up
> in the form of missing rows in the 6th normal form equivalent.
>
> As far as getting "right" answers when some rows are missing, that's
> beyond
> the province of the DBMS.
>
>
Received on Sat Jul 05 2008 - 16:41:59 CEST

Original text of this message