Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Jan Hidders <jan.hidders_at_REMOVE.THIS.ua.ac.be>
Date: 10 Mar 2003 17:43:02 +0100
Message-ID: <3e6cc096.0_at_news.ruca.ua.ac.be>


Bob Badour wrote:
>"Jan Hidders" <jan.hidders_at_REMOVE.THIS.ua.ac.be> wrote in message
>news:3e6c869e.0_at_news.ruca.ua.ac.be...
>> Bob Badour wrote:
>> >"Jan Hidders" <jan.hidders_at_REMOVE.THIS.ua.ac.be> wrote in message
>> >news:3e6bd183.0_at_news.ruca.ua.ac.be...
>> >> Bob Badour wrote:
>> >> >"Jan Hidders" <jan.hidders_at_REMOVE.THIS.ua.ac.be> wrote in message
>> >> >news:3e620dec.0_at_news.ruca.ua.ac.be...
>> >> >>
>> >> >> Yes, that too, but mainly that he overestimates the complexity that
>> >> >> is added to the optimizer when bags are exposed to the user.
>> >> >
>> >> >The optimizer may not be any more complex, but it is nowhere near as
>> >> >effective either.
>> >>
>> >> No. It can be just as effective.
>> >
>> >If that is the case, why haven't they?
>>
>> To know if they have or not you would have to be able to compare their
>> query optimization to that of an existing implementation based on a
>> set-only approach.
>
>Lauri has already provided examples in this thread where one vendor or
>another has not implemented an available set optimization.

Sure. But to support that claim you have to show that those optimizations are implemented in DBMSs that take a set-only approach. It's not enough to show that the algebraic rules are simple, because they are not complicated in the bag approach either.
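[Editor's note: a toy illustration, not from the original post, of why the two rule sets differ at all. Some rewrite rules an optimizer relies on hold under set semantics but not under bag semantics; union idempotence is the classic case. Python's `set` and `collections.Counter` stand in for set and bag semantics here.]

```python
from collections import Counter

# Under set semantics, R UNION R = R, so an optimizer may drop the
# redundant branch. Under bag semantics (UNION ALL), every row's
# multiplicity doubles, so the same rewrite would change the result.
r = [("a",), ("b",)]

set_union = set(r) | set(r)          # set semantics: idempotent
bag_union = Counter(r) + Counter(r)  # bag semantics: multiplicities add

assert set_union == set(r)           # rewrite R UNION R -> R is valid
assert bag_union != Counter(r)       # same rewrite is invalid for bags
```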

>> >> >When I look at your statement above, I think: "Well that totally
>> >> >invalidates the argument that duplicate removal costs too much in
>> >> >performance."
>> >>
>> >> Why do you think that?
>> >
>> >I answered that in the part you snipped. The user will just have to
>> >formulate and execute multiple queries until the dbms delivers the
>> >answer the user needs.
>>
>> Yes, but how does that invalidate the argument that duplicate removal
>> sometimes costs too much?
>
>You have never established that it costs too much. In order for the user to
>get the desired result, the user will eventually have to force the dbms to
>remove the duplicates.

If the user doesn't mind the duplicates, or somehow knows that there won't be any, then any time spent on duplicate elimination is wasted.
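[Editor's note: a minimal sketch of this point, not from the original post, using an in-memory SQLite database with a hypothetical `emp` table. When the selected column is a key, every row is already unique, so `DISTINCT` adds an elimination step without changing the result.]

```python
import sqlite3

# Hypothetical table: id is a primary key, so rows of SELECT id are
# already distinct and the DISTINCT keyword does only redundant work.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany("INSERT INTO emp VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

plain    = sorted(con.execute("SELECT id FROM emp").fetchall())
distinct = sorted(con.execute("SELECT DISTINCT id FROM emp").fetchall())

assert plain == distinct  # identical results; the elimination was wasted effort
```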

>> >> >Jan, with all due respect, I cannot count how many times I have heard
>> >> >alleged database experts tell users to "Never use DISTINCT." If the
>> >> >result is already distinct, the keyword should have no cost.
>> >>
>> >> Yes. *should* is the right word. Deriving that "at compile time" is
>> >> not a trivial problem.
>> >
>> >It is trivial in a system based on sets and requiring logical identity.
>>
>> No, it is just as difficult.
>
>The input to every relational operation is a set of distinct tuples with a
>given predicate and the output from every relational operation is a set of
>distinct tuples with a derivable predicate. Other than projection and
>summarization, what operations have intermediate duplicate tuples?

The union, if implemented in a naive way. But is the above somehow supposed to be a proof that the problem is trivial?
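[Editor's note: a small illustration, not from the original post, of the union case using in-memory SQLite with hypothetical tables `r` and `s`. A naive union (plain concatenation, i.e. `UNION ALL`) produces intermediate duplicates even when both inputs are duplicate-free, which `UNION` must then eliminate.]

```python
import sqlite3

# Both inputs are duplicate-free, but they overlap on the value 2,
# so concatenation yields a duplicate that set union must remove.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE r (x INTEGER)")
con.execute("CREATE TABLE s (x INTEGER)")
con.executemany("INSERT INTO r VALUES (?)", [(1,), (2,)])
con.executemany("INSERT INTO s VALUES (?)", [(2,), (3,)])

union_all = con.execute("SELECT x FROM r UNION ALL SELECT x FROM s").fetchall()
union     = con.execute("SELECT x FROM r UNION SELECT x FROM s").fetchall()

assert len(union_all) == 4  # naive union: 2 appears twice
assert len(union) == 3      # set union: duplicate eliminated
```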

>> >> >Why should they accept any duplicates?
>> >>
>> >> I didn't say they should.
>> >
>> >You implied that users would more likely accept duplicates with lower
>> >cardinality.
>>
>> would <> should
>
>Oh, I see. Deconstructionism again.

The distinction between ontology and deontology is a bit older than that.

>Or are you saying that vendors should ram down their users' throats
>whatever crap they can get away with ramming?

Try reading what I write and not what you want me to write. That would help the discussion.

  • Jan Hidders
Received on Mon Mar 10 2003 - 17:43:02 CET
