Re: Idempotence and "Replication Insensitivity" are equivalent ?
Date: 20 Sep 2006 00:37:12 -0700
Message-ID: <1158737832.398750.160770_at_h48g2000cwc.googlegroups.com>
William Hughes ha scritto:
> That the study of functions can be writting in the form
> g(x1,g(x2,g(x3,...g(xn,e)) is standard is reasonable. To suggest that
> only these functions are interesting is not. This is what you
> seem to be doing by restricting the use of the term aggregate function.
...
..
..
> No, but this is a data point that suggests the standard usage
> of aggregate function is non-restrictive.
...
>
> Here we have no restriction whatsoever on the output of an aggregate
> function. This seems much more reasonable than your very
> restrictive definition in which an aggregate function could only map
> into A. Still there is a second more fundamental issue. What could
> the binary form of a function that find the largest five elements be?
>
..
> > > Yes, Median cannot be put into the form of your
> > > aggregate function. Despite this median is very important
> > > and can be computed efficiently.
> >
> > Can you clarify what you mean by computed efficiently? Perhaps I'm
> > wrong again.
> >
>
> Just that it is practical to compute the median routinely. I did
> not have anything more formal in mind. My point is that
> the median function is clearly of practical importance,
> so defining aggregate function in such a way as to exclude
> the median needs justification.
>
Actually,
Median can be computed O(n) like mean.
e.g.
http://valis.cs.uiuc.edu/~sariel/research/CG/applets/linear_prog/median.html
Further, memory is a problem, I could show you at least one algorithm
which
compute the median on a stream of data (no need to sort them)
What is this g(x1,g(x2,g(x3,...g(xn,e)))) why you need that for?
I find a little weird that some db experts are not familiar with computational complexity of the most common functions.
The discussion about what is an aggregate function probably should not
be done on a pure theoretical perpective. Oracle nor I am going to define an aggregate function your way. We do the software and it is the other way round ;) You must keep up ;)
If it takes O(log n) instead of O(n) and the user is willing
to wait a little more because he needs that result, what are you going
to do ? (Of course, I agree, forget about quadratic complexity, I
didn't
reach that far)
Are you going to tell him: "No sorry, I will not put this function in my software because someone say that this is not an aggregate function" ?
You surely can. But then do not complain if your competitors make more
money
than you :)
Common sense would define an aggregate function as any function whose best ratio
computational complexity / market request
is under some threshold. :)) (quite a provocation eh? )
Also, if you begin to write some line of code for a complex system
like
a dbms, then you quickly realize that mere computational complexity
of an algorithm is not the only factor you have been watching.
- * We continuosly strive to make complex things simple (divide et impera) ** not the other way round: make complicate the simple things.
This is the only way software can grow and deal with higher level of abstraction and complexity (in the sense of complication).
-P
>
> - William Hughes
Received on Wed Sep 20 2006 - 09:37:12 CEST