Re: Teach SELECT DISTINCT first!

From: Mikito Harakiri <mikharakiri_at_iahu.com>
Date: Tue, 27 Apr 2004 15:00:44 -0700
Message-ID: <uUAjc.47$BD6.629_at_news.oracle.com>


"Paul" <paul_at_test.com> wrote in message news:j7Ajc.36911$Y%6.5021809_at_wards.force9.net...
> Mikito Harakiri wrote:
> >>There's no way in general an optimizer can tell whether or not there are
> >>duplicates without doing the calculation. Even if f were a relatively
> >>simple function it would be too much to expect that a relational engine
> >>should have knowledge of maths built-in.
> >
> > CAS usually assumed to have this kind of knowledge, so why (in
principle)
> > relational engine couldn't have it? The technical difference might be
that
> > everything in CAS is an expression, while in SQL we allow function
> > programmed in 3GL language. In foreseeable future no relational
optimizer is
> > expected to look inside procedural code, of course. (Note an interesting
> > analogy: it is easy to reason about a function defined decalratively as
an
> > expression, and its very hard to reason about fuction defined
procedurally.)
>
> CAS=Computer Algebra System?

Yes

> OK here's a pathological example:
> Define a function f on the integers by:
> f(i) = i (if Fermat's Last Theorem is true
> = 0 (if Fermat's Last Theorem is false

This is not closed-form expression. Normally you supply a formula with arithmetic operations and elementary functions.

> I can't remember offhand whether the proof has been verified officially
> yet but let's assume it has for the sake of argument.
>
> Now if we know this we can see at a glance that the function values
> don't coincide for any two different arguments.

CAS have their own problems even within practical realm of closed-form expressions, there is no need to go as far as FLT. The idea is however, that in vast majority of the cases CAS does an excellent work.

> But can we expect a CAS or a DBMS to know this?
> Could a CAS prove Fermat's Last Theorem even in theory?
> Don't some proofs require you to step into the meta-language - they
> can't be proved in the language itself? (think Godel).
>
> In practice though I don't think it's going to be practical for a DBMS
> to have a full computer algebra system on board. It would probably be
> quicker for the DBMS to do things the simple way anyway.

It depends how powerful DBMS engine you want. Query transformation optimizations, for example, are essentially based on the idea similar to expression manipulation in CAS.

> Now functions are actually special cases of (mathematical) relations. So
> maybe a relational system could think of them in this way and access
> them as logical (DBMS) relations. Whether they are physically
> implemented as an actual table or not would be up to the optimizer. For
> example say you define a complicated function on the integers. It might
> be best for the DBMS to store a table that pre-calculates this.

But in most cases the relation is infinite. This is why CAS represents a function as an expression tree. Finite representations of infinite objects, that is what CAS is all about.

> For non-integer domains you might have more problems though. But then
> you've already got problems with rounding etc. where two decimals may be
> different theoretically but the same according to the DBMS.

What is the excuse for RDBMS having fixed-point arithmetics? All CAS implemented unlimited precision arithmetics from the very beginning. Business users don't need anything that sophisticated? That is really lame. Received on Wed Apr 28 2004 - 00:00:44 CEST

Original text of this message