Re: Idempotence and "Replication Insensitivity" are equivalent ?

From: Chris Smith <cdsmith_at_twu.net>
Date: Tue, 19 Sep 2006 09:43:10 -0600
Message-ID: <MPG.1f79bdfb5920857a989718_at_news.altopia.net>


<pamelafluente_at_libero.it> wrote:
> Chris Smith ha scritto:
>
> > > - median
> > > - Any quantile
> > > - standard deviation
> > > - variance
> >
> > These are not definable as aggregate functions.
>
> No? Really?
>
> http://www.adp-gmbh.ch/ora/sql/agg/index.html

So Oracle calls them aggregate functions. That doesn't mean that aggregate functions is a correct term for them. As Marshall points out, and I suppose I must agree, there is *a* definition in terms of aggregate functions. Specifically, define an aggregate function COLLECT that accumulates the entire multiset set as its result, and then perform the operation in post-processing of the results. This is technically valid, but it sorta bucks the whole spirit of an aggregate function, which is that it can be evaluated incrementally as the data is found.

(Perhaps this is a good reason for Marshall's definition, as it is problematic that M(A) can be a subset of A.)

In the case of variance and standard deviation, there's actually a better way than to use an accumulating aggregate function. I'd be willing to bet that Oracle actually does it differently; by scanning the data twice, once to determine the mean, and again to determine the variance. That's *not* evaluating it as an aggregate function, whatever Oracle calls it.

[The counterexample]

> Assume that we restrict ourself to the space {0, 1}
>
> Consider f := [Product of operands], which allows the above
> representation
>
> We have
>
> 1.
> f(x,y) = x
>
> does not hold, e.g. f(0, 1) = 0
>
> Therefore f is NOT idempotent

Actually, f(0,0)=0 and f(1,1)=1, so actually f is idempotent. The value of f(0,1) is not relevant to idempotence.

> 2.
>
> f is "duplication insensitive".

Yes, it certainly is.

-- 
Chris Smith
Received on Tue Sep 19 2006 - 17:43:10 CEST

Original text of this message