Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Jan Hidders <jan.hidders_at_REMOVE.THIS.ua.ac.be>
Date: 3 Mar 2003 12:00:02 +0100
Message-ID: <3e6335b2.0_at_news.ruca.ua.ac.be>


Lauri Pietarinen wrote:
>Jan Hidders wrote:
>
>>>And, regarding "Distinct" support in DBMS's, you suggested that bad support
>>>resulted from the fact that the keyword was not used.
>>
>>When did I say that?
>
>[snipped from previous posting...]
>
>>- optimizing efforts will be concentrated on the most used features, that
>> being queries without disticnt.
>
>If those queries are the ones that are the most used, then that is what the
>users apparently want, and so indeed those should be optimized the most.
>
>[end snip]

Er, yes, so you see that *you* suggested it, and I merely said that if that was the case then IMO this wouldn't be a big problem.

>I will offer one last example to try to clarify what I mean.

Promises, promises. :-)

>Take the standard S, SP and P -tables. I want to hide the joins from
>the user so I create a view:
>
>CREATE VIEW V_S_SP_P
> ( S#, SNAME, QTY, P#, PNAME)
>AS
>SELECT S.S#, S.SNAME, SP.QTY, P.P#, P.PNAME
> FROM S, SP, P
> WHERE S.S# = SP.S# AND
> SP.P# = P.P#;
>
>There are lots of interesting questions that can be
>easily answered using this view. [...]
>
>Ok, let's try
>
>SELECT DISTINCT P#, PNAME
> FROM V_S_SP_P;
>
>Correct answer, but why is performance
>so bad?
>
>Oh, the DBMS materialises the whole join
>(even reading S, which is not needed at all)
>and then groups by P# and PNAME!
>
>(Just tried it on SQLServer2000).

Excellent example. No, really. What happens if you add DISTINCT to the view definition?

>So while you might be right in theory, i.e. that supporting two modes at
>once (sets and bags), is no burden for the DBMS optimiser, you are
>certainly wrong in practice! If it is so easy, why did they (the
>SQL-vendors) not support it a long time ago?

Well, your specific example involves recognizing that duplicate elimination is not necessary in the view. That was something that was researched as (relatively) recently as 1994:

  http://www.informatik.uni-trier.de/~ley/db/conf/icde/PaulleyL94.html

But you also have to solve that problem if you have a set-only model. So it could very well that for this particular problem you are right.

>Date is trying to take the relational model to it's full potential and I
>would claim that the SQL-mode of thinking is hindering us from achieving
>this. And this (in my view) is ultimately Date's criticism of Hector
>Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom, DATABASE SYSTEM
>IMPLEMENTATION.
Is he really criticizing this book? I don't get that impression from the "double trouble" article on dbdebunk.

  • Jan Hidders
Received on Mon Mar 03 2003 - 12:00:02 CET

Original text of this message