Re: Idempotence and "Replication Insensitivity" are equivalent ?

From: <pamelafluente_at_libero.it>
Date: 19 Sep 2006 08:12:41 -0700
Message-ID: <1158678761.385503.267010_at_e3g2000cwe.googlegroups.com>

Chris Smith ha scritto:

> > - median
> > - Any quantile
> > - standard deviation
> > - variance
>
> These are not definable as aggregate functions.

No? Really?

http://www.adp-gmbh.ch/ora/sql/agg/index.html

>
> > - regression coefficient
> > - Pearson's skewness
>
> I don't know what these are.

I believe you.
>
> --
> Chris Smith

I propose a counterexample, specific to the case of Marshall's concern:

Consider a commutative, associative Aggregate function f "expressible through a a binary generator", in the sense that it allows the following representation:

f(x,y,z) = f( f (x,y) , z)

Assume that we restrict ourself to the space {0, 1}

Consider f := [Product of operands], which allows the above representation

We have

f(x,y) = x

does not hold, e.g. f(0, 1) = 0

Therefore f is NOT idempotent

f is "duplication insensitive". Infact:

f can take values only in {0, 1}

if f(.) is 0 this result cannot be changed by any further operands if f(.) is 1 it means that all operands are 1. Hence the duplication of an operand (a new 1) does not change the result.

Therefore we have provided, a case where "NOT idempotent" and "duplication insensitivity" co-exist.

Hence "duplication insensitivity" => idempotence does not hold.

Did I do right?

-P

PS.

1.
Restricting aggregate function to above contraints is in real word dbms' is useless, imho.

I have coded countless times aggegate functions such as:

Ranges
Quantiles
standard deviation / variance
regression coefficient
any moment (from origin or central)
Any statistical estimator and especially "sufficient" statistics ...

why should we care about requiring the above representation? People want complex KPIs, who cares about SUM and COUNT ?

2.
avg(x) = sum(x) / count(x)

average is not computed that way

http://www.databasejournal.com/features/mssql/article.php/3399931

"The reason for the different results is that most aggregate functions ignore columns that contain a NULL value, where as the COUNT function does not ignore columns that have a NULL value. Because the AVG function
ignores columns with a NULL value, the row where the color was "White" and
the number value was NULL was not considered when calculating the average for
the color "White". However, since the COUNT function does not ignore columns
that contain a NULL value, the row that contained a NULL value was considered
when the SUM and COUNT functions where used to calculate the average. " Received on Tue Sep 19 2006 - 17:12:41 CEST

This message: [ Message body ]
Next message: Marshall: "Re: Idempotence and "Replication Insensitivity" are equivalent ?"
Previous message: Bob Badour: "Re: 3 value logic. Why is SQL so special?"
In reply to Chris Smith: "Re: Idempotence and "Replication Insensitivity" are equivalent ?"
Next in thread: Marshall: "Re: Idempotence and "Replication Insensitivity" are equivalent ?"
Reply: Chris Smith: "Re: Idempotence and "Replication Insensitivity" are equivalent ?"
Reply: pamelafluente_at_libero.it: "Re: Idempotence and "Replication Insensitivity" are equivalent ?"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message