# Re: Examples of SQL anomalies?

Date: Sat, 5 Jul 2008 10:41:59 -0400

Message-ID: <XKLbk.6511$LG4.3140_at_nlpi065.nbdc.sbc.com>

"David Cressey" <cressey73_at_verizon.net> wrote in message
news:LGJbk.201$iU.196_at_trndny02...

*>
*

> "David BL" <davidbl_at_iinet.net.au> wrote in message

*> news:cfb587f5-ff58-48c7-ad1d-53540c7b8d7d_at_e39g2000hsf.googlegroups.com...
**>> On Jul 5, 8:49 am, Marshall <marshall.spi..._at_gmail.com> wrote:
**>> > What can be meaningfully asked is determined by the schema.
**>> >
**>> > If the schema specifies that the weight attribute is nullable,
**>> > then the question of how much a shipment weighs in total
**>> > is a question that cannot be asked.
**>> >
**>> > Marshall
**>>
**>> IMO aggregate functions are only defined on well defined sets and only
**>> give well defined answers. SUM should never return NULL.
**>> Nevertheless it is meaningful to calculate SUM on a nullable weight
**>> attribute. Rather than upset the mathematical simplicity of aggregate
**>> functions one can easily make sense of the result by correctly
**>> interpreting the intensional definition of the set being aggregated.
**>> For example in this case it is the set of *known* weights. It is
**>> absurd to not allow this to be calculated, by returning NULL.
**>
**> This is degenerating into yet another philosophical exercise on the
**> meaning
**> of NULL.
**>
**> The only way to make sense out of all this is as follows:
**>
**> Consider a column the might actually contain NULLS to be (part of) a
**> materialized OUTER JOIN. Decompose the tables back to the point where all
**> columns of all tables can be required data. (I guess, from prior
**> discussions, that this is 6th normal form. Someone could check this for
**> me). Compute how the column with missing weights appears in this
**> equivalent
**> 6th normal form representation. Figure out what the SUM aggregate does
**> (or
**> alternatively, figure out what it ought to do) with the data in 6th normal
**> form. Cause the SUM aggregate to do the same thing with the table in its
**> original form.
**>
*

6NF requires that each relation have at most one non-prime attribute. (I think it goes further than that, allowing that for each key only 1 attribute in the relation that is not part of that key--even if that attribute is part of a different key, but I could be wrong.) It is possible for there to be more than one column in a table that is not part of the key and is also not nullable, so the result of decomposition would not necessarily be in 6NF. In addition, when you decompose a relation by taking projections, you eliminate duplicates within each projection--the result of taking a projection is a set, not a bag, which would alter the sum. For example, if you have three packages that weigh 1 pound--say three pounds of coffee--the projection would have one row instead of three, so the sum of weight would be 1 pound instead of 3:

Lines{{line#:1, item:coffee},{line#:2, item:coffee},{line#:3, item:coffee}} Weights{{item:coffee, weight:1lb}}

> I haven't done the computation, but I'm guessing that you'll end up with

*> exactly the same behavior you get from the present SQL SUM aggregate.
**>
**> Missing data, which shows up as NULL values in the original form
**> (original
**> only in the sense that that's where we started the discussion) will show
**> up
**> in the form of missing rows in the 6th normal form equivalent.
**>
**> As far as getting "right" answers when some rows are missing, that's
**> beyond
**> the province of the DBMS.
**>
**>
*

Received on Sat Jul 05 2008 - 16:41:59 CEST