Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Jan Hidders <>
Date: 26 Feb 2003 15:11:22 +0100
Message-ID: <>

Lauri Pietarinen wrote:
>> Yes. But let me stress that I am not a fan of SQL; as a bag calculus it
>> is also less elegant then it could have been, to put it mildly.
>Can you point to some reference with what you think would be a decent query
>language? What would be the "ideal" db-language in your view?

Nothing fancy. Basically SQL with set semantics, no NULLs (except as explicit values in domains), two-valued logic, orthogonal in the sense that everywhere where I can use a table I can also put a query, and to top it off an EXISTS and FORALL that look more like their counterparts in tuple calculus (so EXISTS (SELECT * FROM a WHERE ...) can be written as for example EXISTS a : ...). After that I'd probably want some limited form of recursion such as inflationary fixpoints or something.

>> >- optimizing efforts will be concentrated on the most used features,
>> > that being queries without disticnt.
>> If those queries are the ones that are the most used, then that is what
>> the users apparently want, and so indeed those should be optimized the
>> most.
>So the user is always right!

No, not always. Users sometimes contradict themselves or have conflicting requirements.

>Tell me why any user would really want duplicates in the result of his/her
>query. Just give me _one_ example applicable to a real world situation in
>an actual program.

I don't think users "really want" duplicates (although I'm not sure that this is really a well-defined notion). If duplicate elimination means a big performance hit and the user doesn't really need this then in some sense you could say that her or she "really wants" to be able to determine if the duplcates are eliminated or not.

>Taking the table 'P', having, say 1M rows, what is the meaning
>of the following query:
>select city
> from P

Roughly the same as:

  GROUP BY city;

>> The optimization of imperative programs is very different from
>> optimizing declarative query languages, so such an analogy is
>> virtually meaningless to people who actually know a thing or two about
>> optimization.
>The point here is that the more stuff you have to consider the
>harder it is to do a good job. "Less is More"!

That's just a general rule of thumb. The question at hand is if in this particular case it is really as "more" as Date wants us to believe. I am skeptical about that and I believe I already gave some arguments why that is. For example, suppose we would "expose" bags to the user, but under the hood we would translate them to true relations in the obvious way and do the query optimization on those. Could you indicate how much "more" that would cause?

>> Btw. What makes you think that programs with GOTOs are hard to optimize?
>> Dijkstra did not mention that as an argument in his famous letter.
>Am I wrong? Are programs with GOTOs as easy to optimise as programs
>not using GOTO's, assuming that we don't have any restrictions?

That is a very complex question. How exactly would you define optimizability? And do you take into account that programs with GOTOs are often smaller than their counterparts with only WHILEs?

Perhaps we should ask this question to somebody who has written a classic on compiler design, like, ... Jeffrey Ullman?

>[...] I have the feeling that optimising over a cleaner language would have
>been easier in the first place. This involves also giving the user less

That could be a good argument if you had an example of an optimization that was not found by the optimizer and that would have been much easier to spot if it had not been working with bags.

  • Jan Hidders
Received on Wed Feb 26 2003 - 15:11:22 CET

Original text of this message