# Re: Teach SELECT DISTINCT first!

Date: Wed, 28 Apr 2004 10:01:48 -0700

Message-ID: <eCRjc.21$WN3.79_at_news.oracle.com>

"Paul" <paul_at_test.com> wrote in message
news:UELjc.36980$Y%6.5047943_at_wards.force9.net...

> Mikito Harakiri wrote:

*> >> I'm sure you could expand it out in terms of arithmetic operations
**> >> and elementary functions. It might be very long though...
**> >
**> > No, you would need quantifiers.
**>
**> OK so we exclude all expressions that have "for all" or "there exists"
**> quantifying over some variable? What are we classing as elementary
**> functions, does it include things like trigonometric functions and
**> exponentiation?
*

Yes, exponentiation and trionometric functions are elementary.

> Suppose I say "SELECT * FROM R WHERE i^4 + j^4 = k^4"

*> where i,j,k are integer columns?
*

Well, it sounds a little bit different from your original message where you focused on functions, not predicates. Also, case n=4 is relatively easy. You probably have meant

"SELECT * FROM R WHERE power(i,n) + power(j,n) = power(k,n) where i,j,k,n are integer columns.

But, in general, you are correct: the time spent on optimization can be much higher than query execution with "suboptimal" plan.

> Consider a function f that is a polynomial expression of order 5.

*> Suppose it has no integer roots, and I know this.
**> This is a simple arithmetic function.
**> Now if I do "SELECT * FROM R WHERE f(i) = 0"
**> Galois theory tells us there is no formula to tell us roots of a quintic
**> equation, so a CAS can't help here. But if I could hint to the query
**> engine it would be useful in this case.
*

"assume" statements in Maple, for example, essentially are the hints. Also CAS is able solving quintics:

*> solve(x^5-23/3*x^4+68/3*x^3-32*x^2+64/3*x-16/3,x);
*

1, 2/3, 2, 2, 2

sometimes. It is unable to do it in certain cases, of course

*> solve(x^5-23/3*x^4+67/3*x^3-32*x^2+64/3*x-16/3,x);
*

RootOf(%1, index = 1), RootOf(%1, index = 2), RootOf(%1, index = 3),

RootOf(%1, index = 4), RootOf(%1, index = 5)

5 4 3 2 %1 := 3 _Z - 23 _Z + 67 _Z - 96 _Z + 64 _Z - 16

> If the hint I give is the five actual roots, none of which are integers,

*> a CAS can easily verify this. But what if I incorrectly hint to the
**> engine that there are no integer roots whereas in fact there are? Should
**> the optimizer protect me from my errors? Should I be allowed to choose
**> whether it does or not?
*

<snip>

> Yes, I suppose this is driven by economics. If a pure RDBMS was

*> something that would out-perform current DBMSs and was possible to write
**> maybe someone would have done it by now. Though I guess the problem is
**> SQL has become so entrenched in terms of "mindshare" it's difficult to
**> get people to switch.
*

Polynomial expressions don't seem to be that essential to RDBMS everyday users. Therefore, polynomial optimization discussion really has very little touch with practical optimization problems.

Is SQL really blocking relational progress? There are certainly several quirks here and there, plus those new fashioned extensions are insane, but the major obstacle is IMO the lack of understanding. What is aggregation concept in relational? How is it related to (non)-distinctivenes? Then, ordering clause invokes some interesting parallels with combinatorics (and, perhaps, the keyword distinct is borrowed from there?) Received on Wed Apr 28 2004 - 19:01:48 CEST