Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?
Date: 15 Feb 2003 09:11:49 -0800
Message-ID: <3c8cab4c.0302150911.5ed901a_at_posting.google.com>
Lauri Pietarinen <lauri.pietarinen_at_atbusiness.com> wrote in message news:<3E4E2224.4050306_at_atbusiness.com>...
>Jan Hidders wrote:
>>
>> SET(SET(A) INTERSECT SET(B UNION C)) =
>> SET((SET(A) INTERSECT SET(B)) UNION (SET(A) INTERSECT SET(C)))
>>
>>So, although things get a bit more complicated, there is absolutely no
>>reason why a query optimizer that is based on a bag algebra would
>>perform worse than one that is based on a set algebra.
>
>So the point here seems to be that in SQL we can always use 'DISTINCT'
>anyway and the optimizer will be able to take that into account and
>optimize accordingly?
Yes. But let me stress that I am not a fan of SQL; as a bag calculus it
is also less elegant than it could have been, to put it mildly.
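The identity quoted above can be checked on a small example. This is only a minimal sketch, assuming bags are modelled as Python `collections.Counter` objects, UNION is additive bag union, INTERSECT takes the minimum multiplicity, and SET is duplicate elimination. The helper names (`to_set`, `bag_union`, `bag_intersect`) and the sample data are made up for illustration.

```python
from collections import Counter

def to_set(bag):
    """SET(...): duplicate elimination, every element kept exactly once."""
    return Counter({x: 1 for x, n in bag.items() if n > 0})

def bag_union(a, b):
    """Additive bag union: multiplicities are summed (assumed semantics)."""
    return a + b

def bag_intersect(a, b):
    """Bag intersection: minimum of the two multiplicities."""
    return a & b

# Hypothetical sample bags with duplicates.
A = Counter(["x", "x", "y", "z"])
B = Counter(["x", "y", "y"])
C = Counter(["x", "z", "z", "w"])

# Left-hand side: SET(SET(A) INTERSECT SET(B UNION C))
lhs = to_set(bag_intersect(to_set(A), to_set(bag_union(B, C))))

# Right-hand side:
# SET((SET(A) INTERSECT SET(B)) UNION (SET(A) INTERSECT SET(C)))
rhs = to_set(bag_union(bag_intersect(to_set(A), to_set(B)),
                       bag_intersect(to_set(A), to_set(C))))

print(lhs == rhs)
```

Note that the inner union on the right-hand side does produce a duplicate ("x" occurs in both intersections), which is why the outer SET is needed there.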
>In principle this is correct, but:
>
>- more work for the optimizer groups because it has to support two "modes";
> hence bigger and buggier products.
There is only one mode, the bag mode, and the optimizer has to take bags into
account anyway, even if the user only sees sets, because for optimization
purposes a bag view is sometimes more efficient. And as I already said, the
added complexity is very little. Don't you find it remarkable that most people
who claim that bags make things much more difficult are the ones who have never
done any real research on query optimization or built a full-scale query
optimizer? I don't. In that respect Ullman, Widom and Garcia-Molina can
run circles around Date, whose expertise in this area is very shallow,
to say the least.
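One possible reading of "a bag view is sometimes more efficient" is that an optimizer may skip intermediate duplicate elimination and apply it once at the end. The following is a rough sketch of that idea only, not a description of any real optimizer; the `project` and `distinct` helpers and the order tables are hypothetical, and bags are again modelled as `collections.Counter`.

```python
from collections import Counter

def project(rows, column):
    """Bag projection: keep one column, duplicates included."""
    return Counter(row[column] for row in rows)

def distinct(bag):
    """Duplicate elimination, the step a set-only plan applies everywhere."""
    return Counter({value: 1 for value, count in bag.items() if count > 0})

# Hypothetical input tables.
orders_2002 = [{"customer": "ann"}, {"customer": "bob"}, {"customer": "ann"}]
orders_2003 = [{"customer": "bob"}, {"customer": "cho"}]

# Set-style plan: eliminate duplicates after every operator (three passes).
eager = distinct(distinct(project(orders_2002, "customer"))
                 + distinct(project(orders_2003, "customer")))

# Bag-style plan: work on bags internally, eliminate duplicates once.
lazy = distinct(project(orders_2002, "customer")
                + project(orders_2003, "customer"))

print(eager == lazy)
```

Both plans give the same set to the user, but the second one performs a single duplicate-elimination pass instead of three.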
>- optimizing efforts will be concentrated on the most used features, that
> being queries without DISTINCT.
If those queries are the ones that are the most used, then that is what the
users apparently want, and so indeed those should be optimized the most.
>I would draw a parallel with using GOTO's in programming languages:
>- GOTO's give programmers more freedom to do what
> they want
>- however, the programs are harder to optimize, so programs will
> end up being bigger and running slower
>- compilers will be harder to write
>- the programs will probably be buggier
>- and as Dijkstra has pointed out, everything can be done
> _without_ GOTO's anyway.
The optimization of imperative programs is very different from optimizing declarative query languages, so such an analogy is virtually meaningless to people who actually know a thing or two about optimization. Btw. What makes you think that programs with GOTOs are hard to optimize? Dijkstra did not mention that as an argument in his famous letter.
That bags are hard to understand is a myth. I have been teaching SQL at university level and below that; there are several things about SQL that are hard to explain, but bags were not one of them.
Kind regards,
- Jan Hidders