Re: Idempotence and "Replication Insensitivity" are equivalent ?

From: vc <boston103_at_hotmail.com>
Date: 24 Sep 2006 08:54:43 -0700
Message-ID: <1159113283.509184.27550_at_m7g2000cwm.googlegroups.com>


Aloha Kakuikanu wrote:
> vc wrote:
> > Aloha Kakuikanu wrote:
> > > vc wrote:
> > > > According to ANSI SQL, nulls are ignored in the aggregate functions
> > > > (except count(*)). It may not make sense to you, but it makes sense
> > > > to a lot of other folks.
> > >
> > > Or yeah, those folks defined null as a neutral element for sum
> > > aggregate too. Apparently they are unaware of identity:
> > >
> > > 0 + 0 + ... = 0
> >
> > The above does not make any obvious sense. What's a "neutral element"
> > ?

>

> Ugh,
>

> http://www.google.com/search?hl=en&q=neutral+element+wikipedia&btnG=Google+Search
>

What I was objecting to was you usage of "neutral element". The neutral element is defined for a binary operation, the aggregate function is not a binary operation so the notion is inapplicable.

>

> There are actually 2 issues:
> 1. How sum is defined on empty set. It is certainly 0, not null.

In SQL'92, sum over an empty set [of numbers] is defined as null.

> 2. How the sum is defined when there are unknown elements. Unknown
> elements are interpreted as 0s.

The usual interpretation is that nulls are ignored: Let B be a bag of numbers, then Sum(B) = Sum({{ e in B| e is not null }}). Interpreting null as a neutral element contradicts the "x + null = null, where x is an arbitrary number" definition.

>

> > > What I'm saying is that you have to define different neutral elements
> > > for different aggregates in order to maintain elementary consistency
> >
> > Could you, like, you know, rephrase that ?
>

> OK, I should retract this statement back. What consistency one can hope
> for when
>
> null != null

That "null != null" evaluates to "unknown" is the consequence of the paticular kind of 3-valued logic used in SQL. The logic may not match you expectations, though.

>

> And you don't have to go far for the concrete examples, e.g. summing
> null and x explicitly gives null, while doing this via aggregation
> gives x!

Right, that's why the SQL'92 folks talk about ignoring nulls in aggregate functions. The choice to ignore null implicitely was apparently pragmatic (saving on typing "where x is not null"). Whether or not it's confusing is probably a matter of taste. After all, one can choose not to use nulls at all thus eliminating the confusion issue (along with some other problems ;) ) Received on Sun Sep 24 2006 - 17:54:43 CEST

Original text of this message