Re: does a table always need a PK?
Date: Tue, 02 Sep 2003 08:22:18 GMT
Message-ID: <3F5452A0.1AB_at_ix.netcom.com>
Bob Badour wrote:
>
> "Lee Fesperman" <firstsql_at_ix.netcom.com> wrote in message
> news:3F510C49.2A6E_at_ix.netcom.com..
> > Bob Badour wrote:
> > > Does the mathematical identity "SUM(A)+SUM(B)=SUM(A+B)" demonstrate
> > > a flaw or limitation of SQL's NULL?
> >
> > No, if you ignore the empty set issue.
>
> I am surprised by this answer, because I know we have discussed it before.
> Given the following table, T:
>
> A B
> = =
> 1 2
> 3 NULL
> NULL 4
>
> What are the results of the following queries?
>
> select SUM(A) from T;
> select SUM(B) from T;
> select SUM(A+B) from T;
>
> Am I missing something?
I didn't intend to mislead. I decided against a longer response to keep all the answers succinct.
I answered 'no' because I consider it an inherent of aspect of NULL processing rather than a flaw or limitation. IOW, I am saying that the sub-expression above is not an identity when either A or B can contain a NULL.
If I do a query on T and add (A+B) as a calculated attribute:
select A, B, (A+B) as C from T;
I get:
A B C = = = 1 2 3 3 NULL NULL NULL 4 NULL
Applying the SUM() function to each result attribute:
select SUM(A), SUM(B), SUM(C) from (select A, B, (A+B) as C from T);
produces:
SUM(A) SUM(B) SUM(C)
- ====== ====== 4 6 3
(SUM(A) + SUM(B)) is not equal to SUM(C)
SUM(A) is the sum of the values -- A from tuple 1, A from tuple 2, A from tuple 3.
Using subscripts to indicate tuples, I can write it as:
SUM(A) is sum of A[1], A[2], A[3]
or:
SUM(A) is sum of 1, 3, NULL, giving 4.
and:
SUM(B) is sum of B[1], B[2], B[3]
or:
SUM(B) is sum of 2, NULL, 4, giving 6.
For SUM(A+B):
SUM(A+B) is sum of (A+B)[1], (A+B)[2], (A+B)[3]
SUM(A+B) is sum of 3, NULL, NULL, giving 3
It is not:
SUM(A+B) is sum of A[1], B[1], A[2], B[2], A[3], B[3]
producing:
SUM(A+B) is sum of 1, 2, 3, NULL, NULL, 4, giving 10
... because the SUM() function and the '+' operator, while similar, are different operations. SUM() throws away any NULL results and only sums non-NULL values. The + operator yields NULL if either operand is NULL.
Aggregate functions, like SUM(), operate on values from tuples. A is a value from a tuple. B is a value from a tuple. And, the expression (A+B) is a value from a tuple. The expression (A+B) yields 3 in tuple 1 and NULL in tuples 2 and 3, thus SUM(A+B) yields 3.
Codd recommended that aggregate functions have a modifier that specifies the disposition of NULLs in the aggregation. IIRC, it was intended as part of the aggregate function.
However, specifying it on the aggregate function like this pseudo-syntax:
SUM(A+B) substitute 0 for NULL
will not resolve the identity mismatch.
SQL provides the coalesce() function for finer grained specification:
SUM(coalesce(A,0)+coalesce(B,0))
Using coalesce():
SUM(coalesce(A,0)) + SUM(coalesce(B,0)) = SUM(coalesce(A,0)+coalesce(B,0))
evaluates to True.
--
Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)
Received on Tue Sep 02 2003 - 10:22:18 CEST
