Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Jan Hidders <jan.hidders_at_REMOVE.THIS.ua.ac.be>
Date: 10 Mar 2003 13:35:42 +0100
Message-ID: <3e6c869e.0_at_news.ruca.ua.ac.be>

Bob Badour wrote:
>"Jan Hidders" <jan.hidders_at_REMOVE.THIS.ua.ac.be> wrote in message
>news:3e6bd183.0_at_news.ruca.ua.ac.be...
>> Bob Badour wrote:
>> >"Jan Hidders" <jan.hidders_at_REMOVE.THIS.ua.ac.be> wrote in message
>> >news:3e620dec.0_at_news.ruca.ua.ac.be...
>> >>
>> >> Lauri Pietarinen wrote:
>> >> >
>> >> >>>So actually your views are quite close to Date's, it seems! What
>> >> >>>you are describing sounds like Tutorial-D, to me. Does "SQL with
>> >> >>>set semantics" mean what I think it means?
>> >> >>
>> >> >>Yes, it does.
>> >> >>
>> >> >Maybe I am missing something here, but what is it _exactly_ you are
>> >> >criticizing Date of? Is it the issue of equating SQL with bag
>> >> >algebra?
>> >>
>> >> Yes, that too, but mainly that he overestimates the complexity that is
>> >> added to the optimizer when bags are exposed to the user.
>> >
>> >The optimizer may not be any more complex, but it is nowhere near as
>> >effective either.
>>
>> No. It can be just as efective.
>
>If that is the case, why haven't they?

To know if they have or not you would have to be able to compare their query optimization to that of an existing implementation based on a set-only approach.

>> >> >>>The query
>> >> >>>
>> >> >>>SELECT CITY
>> >> >>> FROM P
>> >> >>>
>> >> >>>answers the question
>> >> >>>"what cities do parts come from?"
>> >> >>
>> >> >>Yes, it does, but under SQL semantics it also contains the
>> >> >>information how many parts come from each city.
>> >> >>
>> >> >In a sence, yes, but then you have to have a program to tidy up the
>> >> >result or count the duplicates. Why would you want to do that in a
>> >> >program?
>> >>
>> >> If that is what the user asked then that is what he or she will get. If
>> >> they wanted something else, they would have asked a different query.
>> >
>> >Are you not now requiring that all users have expert level knowledge of
>> >the dbms internals?
>>
>> No.
>
>What level of knowledge would a user need to predict which of the 12
>formulations of the query in Date's example will give the user the correct
>answer out of the 9 different answers received?

The semantics of SQL, which is at the logical level. But as I already said several times before, I'm not definding SQL here, that is another issue.

>> >When I look at your statement above, I think: "Well that totally
>> >invalidates the argument that duplicate removal costs too much in
>> >performance."
>>
>> Why do you think that?
>
>I answered that in the part you snipped. The user will just have to
>formulate and execute multiple queries until the dbms delivers the answer
>the user needs.

Yes, but how does that invalidate the argument that duplicate removal sometimes costs too much?

>> >> >Well, it was not used because users were advised against using it:
>> >> >"Don't use 'DISTINCT', it will result in a sort!". So it was kind of
>> >> >a vicious circle.
>> >>
>> >> That's nonsense. DISTINCT is usually not used because you don't need
>> >> to, because for example the SELECT clause contains a candidate key. It
>> >> is very rare in my experience that users deliberately accept
>> >> duplicates.
>> >
>> >Jan, with all due respect, I cannot count how many times I have heard
>> >alleged database experts tell users to "Never use DISTINCT." If the
>> >result is already distinct, the keyword should have no cost.
>>
>> Yes. *should* is the right word. Deriving that "at compile time" is not a
>> trival problem.
>
>It is trivial in a system based on sets and requiring logical identity.

No, it is just as difficult.

>> >> >Let's say for sake of argument that P has 1 row, and SP has 1000000
>> >> >rows.
>> >>
>> >> That assumption makes the example extremely unrealistic because if that
>> >> were the case it is very unlikely that the user would accept that many
>> >> duplicates in the answer.
>> >
>> >Why should they accept any duplicates?
>>
>> I didn't say they should.
>
>You implied that users would more likely accept duplicates with lower
>cardinality.

would <> should

Jan Hidders

Received on Mon Mar 10 2003 - 13:35:42 CET

This message: [ Message body ]
Next message: Jan Hidders: "Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?"
Previous message: Jason W. Paul: "Re: Tabe Relationship"
Maybe in reply to: Lauri Pietarinen: "Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?"
In reply to Bob Badour: "Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?"
Next in thread: Jan Hidders: "Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message