Re: Real world issue:- OT recreational interval

From: Marshall <marshall.spight_at_gmail.com>
Date: 17 Sep 2006 18:30:43 -0700
Message-ID: <1158543043.918873.26530_at_m73g2000cwd.googlegroups.com>


pamelafluente_at_libero.it wrote:
>
> Ok I found this reference:
>
> http://www.latrobe.edu.au/philosophy/phimvt/joy/j04alg.html
>
> "Idempotency, zero elements and arities"
>
> A binary function f(x,y) is called idempotent if for all x
>
> f(x,x) = x

Yes, this is exactly what I've been saying, and what's been in the wikipedia entry, from the beginning. It's almost word for word what was in my last post.

> According to the above definition:
>
> countDistinct (5,6) = 2
> countDistinct is "not idempotent" (1 counterexample is enough)

You still don't understand how binary functions are used to create aggregate functions. Remember back in the beginning I said you needed to understand that? There's no way to make the aggregate count distinct out of your above binary function. So this is not a counterexample.

> max(x,x) = x for any x
> max is "idempotent"
>
> (1) Your statement is: "Duplication Sensitive" <=> Non Idempotent
>
> if we negate both sides, we obtain an equivalent statement:
>
> (2) "Not Duplication Sensitive" <=> Idempotent
>
> Now observe that CountDistinct is "Not Duplication Sensitive function"
> example: CountDistinct (2,2) = 1.
>
> At the same time CountDistinct is "not idempotent" (see above).
>
> Therefore we have found 1 case where:
>
> (3) "Not Duplication Sensitive function" => "idempotent"
>
> is false. Therefore (2), (1) are false.

All of the above is irrelevant because the aggregate count distinct is not related to the binary count distinct you've supplied.

Remember this message?

http://groups.google.com/group/comp.databases.theory/msg/a639271c4355ec5d

In it, I said "If the binary form of the aggregate function is idempotent, the aggregate will return the same value even if values are repeated arbitrarily. Since + is not idempotent, sum() is "sensitive" to repeated values. Since binary min *is* idempotent, aggregate min() is not "sensitive" to repeated values."

Until you understand the above paragraph, you will not understand the issue you are asking about. In particular, you need to understand how binary functions relate to aggregates.

I will refer to your above binary count distinct as "cd()". I will apply it to {5, 6, 2}.

Aggregating cd() into a full aggregate count distinct would work as follows:

  cd(cd(5,6),2)

This evaluates as follows:

  • cd(cd(5,6),2)
  • cd(2,2)
  • 1

So if we try to aggregate your cd() function over {5,6,2} it evaluates to 1. However, there are 3 distinct values in {5, 6, 2}, so the cd() function you supplied is not a representation of the aggregate count distinct.

Consider how you would use + to make sum:

(5+6)+2

or binary min to make aggregate min:

min(min(5,6),2)

Consider also why every binary function used to make an aggregate is both commutative and associative. What if it wasn't?

Extra credit:
avg() is one aggregate that doesn't seem to fit with the rest. Why is that?

Good luck.

Marshall Received on Mon Sep 18 2006 - 03:30:43 CEST

Original text of this message