Re: Idempotence and "Replication Insensitivity" are equivalent ?

From: <pamelafluente_at_libero.it>
Date: 20 Sep 2006 00:37:12 -0700
Message-ID: <1158737832.398750.160770_at_h48g2000cwc.googlegroups.com>


William Hughes ha scritto:

> That the study of functions can be writting in the form
> g(x1,g(x2,g(x3,...g(xn,e)) is standard is reasonable. To suggest that
> only these functions are interesting is not. This is what you
> seem to be doing by restricting the use of the term aggregate function.

...
..
..

> No, but this is a data point that suggests the standard usage
> of aggregate function is non-restrictive.
...
>
> Here we have no restriction whatsoever on the output of an aggregate
> function. This seems much more reasonable than your very
> restrictive definition in which an aggregate function could only map
> into A. Still there is a second more fundamental issue. What could
> the binary form of a function that find the largest five elements be?
>
..
> > > Yes, Median cannot be put into the form of your
> > > aggregate function. Despite this median is very important
> > > and can be computed efficiently.
> >
> > Can you clarify what you mean by computed efficiently? Perhaps I'm
> > wrong again.
> >
>
> Just that it is practical to compute the median routinely. I did
> not have anything more formal in mind. My point is that
> the median function is clearly of practical importance,
> so defining aggregate function in such a way as to exclude
> the median needs justification.
>

Actually,

Median can be computed O(n) like mean.
e.g.

http://valis.cs.uiuc.edu/~sariel/research/CG/applets/linear_prog/median.html

 Further, memory is a problem, I could show you at least one algorithm which
 compute the median on a stream of data (no need to sort them)

 What is this g(x1,g(x2,g(x3,...g(xn,e)))) why you need that for?

 If I should code according the above, it would take forever to compute  things, and then do not complain dbms are slow !

 I find a little weird that some db experts are not familiar with  computational complexity of the most common functions.

 The discussion about what is an aggregate function probably should not

 be done on a pure theoretical perpective. Oracle nor I am going to  define an aggregate function your way. We do the software and it is  the other way round ;) You must keep up ;)

 If it takes O(log n) instead of O(n) and the user is willing  to wait a little more because he needs that result, what are you going  to do ? (Of course, I agree, forget about quadratic complexity, I didn't
 reach that far)

 Are you going to tell him: "No sorry, I will not put this function  in my software because someone say that this is not an aggregate function" ?

 You surely can. But then do not complain if your competitors make more money
 than you :)

 Common sense would define an aggregate function as any function  whose best ratio

    computational complexity / market request

 is under some threshold. :)) (quite a provocation eh? )

 Also, if you begin to write some line of code for a complex system like
 a dbms, then you quickly realize that mere computational complexity  of an algorithm is not the only factor you have been watching.

  • * We continuosly strive to make complex things simple (divide et impera) ** not the other way round: make complicate the simple things.

 This is the only way software can grow and deal with higher level of  abstraction and complexity (in the sense of complication).

-P

>
> - William Hughes
Received on Wed Sep 20 2006 - 09:37:12 CEST

Original text of this message