# Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

Date: 25 Feb 2003 18:59:00 +0100

Message-ID: <3e5baee4.0_at_news.ruca.ua.ac.be>

In article <b3g02b$1l0j69$1_at_ID-152540.news.dfncis.de>,
Costin Cozianu <c_cozianu_at_hotmail.com> wrote:

>Thanks for the clarifications. Indeed I did some sloppy reading of this
>thread, but can you blame me ?

No, I cannot. This war on bags is getting a bit out of hand. :-)

>Jan Hidders wrote:
>>>>Bags can be defined as a special kind of set, and sets can be seen as a
>>>>special case of bags. For every set operation it is easy to think of one
>>>>(or more) corresponding bag operation that is the same up to duplicates.
>>>
>>>Another doubt that I have is that we have several axiomatic constructions
>>>that go from set theory to the whole math. I don't know if you can as
>>>easily construct sets from bags. Remember, sets are how you *define* bags.
>>>Saying that every set is a bag is a one to one mapping between some bags
>>>and sets, not a constructive definition. There's also a one to one
>>>mapping from cartesian coordinates and polar coordinates, yet we don't
>>>seem to argue about sacrificing one vs. the other.
>>
>> That's not really a good analogy because cartesian coordinates cannot be
>> seen as a special case of polar coordinates while sets are clearly a
>> special case of bags. So choosing for bags doesn't sacrifice anything in
>> this case.
>
>That's what I don't agree with. It doesn't look to me that bags are
>"clearly a special kind of sets".

I presume you meant that the other way around.

>All you might be able to say is that you can make set algebra isomorphic
>to a sub-algebra of bag algebra.
>
>But that's not the point. The point is already that bags are a special
>kind of sets.
>
>To make it the other way around, you'll have to show me a proper definition
>of bags that doesn't rely on sets being well defined.

The question is whether that definition of "are a special kind of" is the one that the user cares about. I would argue that all he or she cares about is whether all the data can still be stored without too much shoe-horning into the data model, and whether all the queries and updates that he or she wants to perform can be expressed in a convenient way.
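To make both readings of "special kind of" concrete, here is a minimal sketch in Python. It assumes a bag is modelled as a multiplicity function (a `collections.Counter`), which is itself built on a set of keys, so it illustrates the definitional point rather than refuting it. The helper names `bag_from_set`, `support` and `is_set_like` are made up for this illustration.

```python
from collections import Counter

def bag_from_set(s):
    """Embed a set into the bags: every element gets multiplicity 1."""
    return Counter({x: 1 for x in s})

def support(bag):
    """Forget multiplicities: the elements that occur at least once."""
    return {x for x, n in bag.items() if n > 0}

def is_set_like(bag):
    """A bag behaves like a set when no element occurs more than once."""
    return all(n == 1 for n in bag.values() if n > 0)

a = {1, 2, 3}
b = {3, 4}

# Counter's | takes the maximum multiplicity, & the minimum,
# so on set-like bags they coincide with set union and intersection.
bag_union = bag_from_set(a) | bag_from_set(b)
bag_inter = bag_from_set(a) & bag_from_set(b)

# Counter's + adds multiplicities (an additive union), so it agrees
# with set union only "up to duplicates".
bag_sum = bag_from_set(a) + bag_from_set(b)

print(bag_union == bag_from_set(a | b))
print(bag_inter == bag_from_set(a & b))
print(is_set_like(bag_sum), support(bag_sum) == (a | b))
```

The set-like bags are closed under the max/min operations, which is the "isomorphic to a sub-algebra" reading. The additive union leaves that sub-algebra, and only its support matches the set result.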

>>>I'd like to be contradicted and let me know if their stated opinion is that
>>>bags are better suited for data management with regards to optimization,
>>>ease of use and everything.
>>
>> That's not what I said. Here is what I said at the beginning of the
>> thread:
>> | There are two separate questions here:
>> | 1. Do we want duplicates in the data model, i.e., in the original
>> | relations and the results of queries?
>> | 2. Do we want duplicates in intermediate results?
>> |
>> | I'm not completely sure what their answer to 1. is but I suspect it is
>> | something like "probably not". [...]
>>
>
>OK, here are my 2 cents:
> 1) probably not. I'd consider that operating on bags as sets with
>specially defined bag operators maybe, but within a set algebra is
>easier and more manageable for the purpose of building information
>systems. The natural semantics of the relational model in the first
>order logic is too damn important.

Agreed, and you can be assured that Ullman et al. are very much aware of this. You can tell this by just reading his "Principles of Database and Knowledge-Base Systems", especially the chapters on, surprise surprise, logic.

> 2) I don't see why we couldn't have duplicates in the intermediate
>result even if we operate on sets at the logical level. That was one of
>my puzzles with regards to your position. A logical set can be
>represented in the physical implementation level by a whole
>equivalence class of bags.
>
> For example, if you implement a final "merge join", you might decide
>that it is easier to eliminate the duplicates in the final merge operation,
>and leave the intermediaries as containing duplicates.
>
> What is the big issue here ?

I don't know. Perhaps that some people already find the suggestion that a bag algebra might be useful for query optimization suspect.
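The merge-join example from the quoted text can be sketched as follows, in Python. This is an illustration under assumptions, not how any particular DBMS does it: the relations `R` and `S`, the helpers `project` and `merge_join_distinct`, and the choice of a single join column are all invented for the example. The projections keep their duplicates (bags as intermediate results), and only the final merge step removes them.

```python
def project(rows, idx):
    """Bag projection: keep one column and do NOT remove duplicates."""
    return [row[idx] for row in rows]

def merge_join_distinct(left, right):
    """Sort-merge join of two single-column bags on equality,
    emitting each matching value exactly once."""
    left, right = sorted(left), sorted(right)
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            v = left[i]
            out.append(v)
            # Skip the whole run of duplicates on both sides, so
            # duplicate elimination happens inside the merge itself.
            while i < len(left) and left[i] == v:
                i += 1
            while j < len(right) and right[j] == v:
                j += 1
    return out

# Hypothetical relations R(name, dept) and S(dept, tag).
R = [("ann", 10), ("bob", 10), ("cy", 20), ("dee", 30)]
S = [(10, "x"), (10, "y"), (30, "z"), (40, "w")]

# Plan with bags as intermediates: duplicates survive the projections.
bag_plan = merge_join_distinct(project(R, 1), project(S, 0))

# Purely set-based plan at the logical level, for comparison.
set_plan = sorted({r[1] for r in R} & {s[0] for s in S})

print(bag_plan == set_plan)
```

Both plans return the same logical set. They differ only in where the duplicates are removed, which is a physical-level choice.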

Kind regards,

- Jan Hidders