Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Costin Cozianu <>
Date: Tue, 25 Feb 2003 06:58:13 -0800
Message-ID: <b3g02b$1l0j69$>


Thanks for the clarifications. Indeed I did some sloppy reading of this thread, but can you blame me ?

Jan Hidders wrote:
>>I mean we are arguing here about which model is more convenient,
>>aren't we ?
> Er, not exactly. I'm arguing that some arguments that are presented by Date
> to disallow bags are not entirely correct.

Ok, now I understand better.

>>>Bags can be defined as a special kind of set, and sets can be seen as a
>>>special case of bags. For every set operation it is easy to think of one
>>>(or more) corresponding bag operation that is the same up to duplicates.
>>Another doubt that I have is that we have several axiomatic constructions
>>that go from set theory to the whole math. I don't know if you can as
>>easily construct sets from bags. Remember, sets are how you *define* bags.
>>Saying that every set is a bag is a one to one mapping between some bags
>>and sets, not a constructive definition. There's also a one to one mapping
>>from cartesian coordinates and polar coordinates, yet we don't seem to
>>argue about sacrificing one vs. the other.
> That's not really a good analogy because Cartesian coordinates cannot be
> seen as a special case of polar coordinates while sets are clearly a special
> case of bags. So choosing for bags doesn't sacrifice anything in this case.

That's what I don't agree with. It doesn't look to me that sets are "clearly a special case" of bags.

All you might be able to say is that you can make set algebra isomorphic to a sub-algebra of bag algebra.

But that's not the point. The point is that bags are already defined as a special kind of set.

To make it the other way around, you'll have to show me a proper definition of bags that doesn't rely on sets being well defined.
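
For illustration, here is a minimal Python sketch of the usual construction, in which a bag is nothing more than a set of (element, multiplicity) pairs. All function names are hypothetical, and picking max/min as the bag counterparts of union/intersection is an assumption (other bag operations, such as additive union, exist as well). It shows the embedding of sets into bags commuting with those two operations, which is the "isomorphic to a sub-algebra" statement, while the bag itself is built out of sets.

```python
def embed(s):
    """Embed a set into the bags: every element gets multiplicity 1."""
    return frozenset((x, 1) for x in s)


def multiplicity(b, x):
    """Number of occurrences of x in bag b (0 if absent)."""
    return dict(b).get(x, 0)


def bag_max_union(a, b):
    """Bag union that takes the larger multiplicity of each element."""
    da, db = dict(a), dict(b)
    return frozenset((x, max(da.get(x, 0), db.get(x, 0)))
                     for x in da.keys() | db.keys())


def bag_min_intersection(a, b):
    """Bag intersection that takes the smaller multiplicity of each element."""
    da, db = dict(a), dict(b)
    return frozenset((x, min(da[x], db[x])) for x in da.keys() & db.keys())


def bag_additive_union(a, b):
    """Bag union that adds multiplicities (roughly SQL's UNION ALL)."""
    da, db = dict(a), dict(b)
    return frozenset((x, da.get(x, 0) + db.get(x, 0))
                     for x in da.keys() | db.keys())


A, B = {1, 2}, {2, 3}

# The embedding commutes with max-union and min-intersection ...
union_ok = embed(A | B) == bag_max_union(embed(A), embed(B))
intersection_ok = embed(A & B) == bag_min_intersection(embed(A), embed(B))

# ... but additive union leaves the image of the embedding.
added = bag_additive_union(embed(A), embed(B))
```

Here `union_ok` and `intersection_ok` are both true, while `multiplicity(added, 2)` is 2, so `added` is a bag that corresponds to no set. Note that `frozenset` and `dict` do all the work: the sketch defines bags in terms of sets, not the other way around.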

>>In the meantime the point I was trying to make vis-a-vis Date vs. Garcia
>>Molina is that I might issue a conjecture that Garcia Molina and all
>>have to study optimization for bags because that's what is required for
>>their students who probably get to work at Oracle et. comp.
> That certainly is the case, and I already said that at the beginning of the
> thread:
> | The book is agnostic on this issue. Its starting point is that you want
> | to build an SQL database (and so there are duplicates in your results) and
> | then explains how you should do that.
> But my point was that apart from that there are other reasons to talk about
> a bag-based algebra, even if you don't expose bags to the end users. And
> of course it also works the other way around: even though you have bags in
> your logical data model, you could still use a set-based algebra for
> optimization.
>>I'd like to be contradicted and let me know if their stated opinion is
>>that bags are better suited for data management with regards to
>>optimization, ease of use and everything.
> That's not what I said. Here is what I said at the beginning of the thread:
> | There are two separate questions here:
> | 1. Do we want duplicates in the data model, i.e., in the original relations
> | and the results of queries?
> | 2. Do we want duplicates in intermediate results?
> |
> | I'm not completely sure what their answer to 1. is but I suspect it is
> | something like "probably not". [...]

OK, here are my 2 cents:

  1. Probably not. I'd consider that operating on bags as sets, maybe with specially defined bag operators but still within a set algebra, is easier and more manageable for the purpose of building information systems. The natural semantics of the relational model in first-order logic is too damn important.
  2. I don't see why we couldn't have duplicates in the intermediate results even if we operate on sets at the logical level. That was one of my puzzles with regards to your position. A logical set can be represented at the physical implementation level by a whole equivalence class of bags.

   For example, if you implement a final "merge join", you might decide that it is easier to eliminate the duplicates in the final merge operation, and leave the intermediate results containing duplicates.

   What is the big issue here ?
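
A rough Python sketch of that idea, under stated assumptions: relations are lists of tuples, the join key is the first column, and every name and sample row below is made up for illustration. The projections keep their duplicates, and the duplicates are only dropped inside the sort-merge step, where equal rows are adjacent after sorting anyway.

```python
def project_bag(rows, idx):
    """Projection WITHOUT duplicate elimination: the result is a bag."""
    return [tuple(r[i] for i in idx) for r in rows]


def distinct_sorted(rows):
    """Sort rows and drop adjacent duplicates in one pass."""
    out = []
    for r in sorted(rows):
        if not out or out[-1] != r:
            out.append(r)
    return out


def merge_join_distinct(left, right):
    """Sort-merge join on column 0 that removes duplicates while merging."""
    l, r = distinct_sorted(left), distinct_sorted(right)
    i = j = 0
    out = []
    while i < len(l) and j < len(r):
        if l[i][0] < r[j][0]:
            i += 1
        elif l[i][0] > r[j][0]:
            j += 1
        else:
            key = l[i][0]
            # Find the end of the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(l) and l[i_end][0] == key:
                i_end += 1
            j_end = j
            while j_end < len(r) and r[j_end][0] == key:
                j_end += 1
            for a in l[i:i_end]:
                for b in r[j:j_end]:
                    out.append(a + b[1:])
            i, j = i_end, j_end
    return out


# Hypothetical sample data: (name, dept, city) and (dept, project, year).
emp = [("alice", "sales", "NY"), ("bob", "sales", "NY"), ("carol", "it", "SF")]
proj = [("sales", "crm", 2001), ("sales", "crm", 2002), ("it", "dw", 2002)]

left_bag = project_bag(emp, (1, 2))    # ("sales", "NY") appears twice
right_bag = project_bag(proj, (0, 1))  # ("sales", "crm") appears twice

result = merge_join_distinct(left_bag, right_bag)

# Reference answer computed purely on sets at the logical level.
reference = {a + b[1:] for a in set(left_bag) for b in set(right_bag)
             if a[0] == b[0]}
```

With this data `result` contains no duplicates and `set(result) == reference`, even though both inputs to the join were bags. The logical answer is the same set whichever member of the equivalence class of bags the intermediate steps happen to produce.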

> Kind regards,
> -- Jan Hidders
Best regards,
Costin

Received on Tue Feb 25 2003 - 15:58:13 CET
