Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Lauri Pietarinen <lauri.pietarinen_at_atbusiness.com>
Date: Thu, 13 Feb 2003 13:27:51 +0200
Message-ID: <3E4B8137.2080204_at_atbusiness.com>


>
>
>There are two separate questions here:
>1. Do we want duplicates in the data model, i.e., in the original relations
> and the results of queries?
>2. Do we want duplicates in intermediate results?
>
>I'm not completely sure what their answer to 1. is but I suspect it is
>something like "probably not".
>

That's not the impression I get from Date's article (see part II), but I have not read the book.

< quotes from the book DATABASE SYSTEM IMPLEMENTATION by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom >

[Relational] algebra was originally defined as if relations were sets [sic!--italics added]. Yet relations in SQL are really bags ... Thus, we shall introduce relational algebra as an algebra on bags.

...

For instance, you may have learned set-theoretic laws such as A INTERSECT (B UNION C) = (A INTERSECT B) UNION (A INTERSECT C), which is formally the "distributive law of intersection over union." This law holds for sets, but not for bags.

< quotes from book/ >

This does not look like it is dealing with intermediate results only...
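
To make the quoted claim concrete, here is a minimal sketch of a counterexample. It assumes the usual bag semantics (union adds multiplicities, intersection takes the minimum); the class and helper names are made up for illustration and are not taken from the book.

```java
import java.util.HashMap;
import java.util.Map;

public class BagDistributivity {
    // A bag is modelled as a map from element to multiplicity (assumption: counts > 0 only).
    static Map<String, Integer> bagOf(String... items) {
        Map<String, Integer> bag = new HashMap<>();
        for (String item : items) {
            bag.merge(item, 1, Integer::sum);
        }
        return bag;
    }

    // Bag union: multiplicities are added.
    static Map<String, Integer> union(Map<String, Integer> a, Map<String, Integer> b) {
        Map<String, Integer> result = new HashMap<>(a);
        b.forEach((k, v) -> result.merge(k, v, Integer::sum));
        return result;
    }

    // Bag intersection: the minimum of the two multiplicities.
    static Map<String, Integer> intersect(Map<String, Integer> a, Map<String, Integer> b) {
        Map<String, Integer> result = new HashMap<>();
        a.forEach((k, v) -> {
            int n = Math.min(v, b.getOrDefault(k, 0));
            if (n > 0) {
                result.put(k, n);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = bagOf("x");
        Map<String, Integer> b = bagOf("x");
        Map<String, Integer> c = bagOf("x");

        // A INTERSECT (B UNION C)
        Map<String, Integer> lhs = intersect(a, union(b, c));
        // (A INTERSECT B) UNION (A INTERSECT C)
        Map<String, Integer> rhs = union(intersect(a, b), intersect(a, c));

        System.out.println(lhs); // {x=1}
        System.out.println(rhs); // {x=2}
    }
}
```

With sets, both sides would reduce to {x}; with bags, the right-hand side counts x twice, so the law fails.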

> But how your algebra looks depends on how you
>answer question 2, because query optimization is the main raison d'etre of
>the algebra, and there it is a completely different story. It can for
>example be more efficient to postpone duplicate elimination. If you don't
>have a bag algebra you cannot express this in your algebra.
>
I don't think anybody is suggesting that duplicates must be removed from intermediate results. It's
the end result that counts. E.g., in the following code fragment

  int i;
  i = 5;
  i = 6;

  System.out.println(i);

an optimizing compiler would not bother to assign 5 to i, because nobody is interested in that intermediate value. In the same spirit, intermediate values
of relational expressions would be of interest only to the system internally.
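
The same point can be sketched for relational expressions. The example below is only an illustration under assumed names (`unionAll`, `distinct`, and the sample rows are hypothetical): an implementation may let an intermediate result carry duplicates and eliminate them once at the end, and the final result is the same as if duplicates had been removed at every step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class DeferredDistinct {
    // Internal "bag" union: concatenate rows and keep duplicates.
    static List<String> unionAll(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    // Duplicate elimination, preserving first-seen order.
    static List<String> distinct(List<String> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        List<String> r = Arrays.asList("a", "b", "b");
        List<String> s = Arrays.asList("b", "c");

        // Eager: duplicates are removed after every step.
        List<String> eager = distinct(unionAll(distinct(r), distinct(s)));
        // Deferred: the intermediate result is a bag; duplicates are removed once at the end.
        List<String> deferred = distinct(unionAll(r, s));

        System.out.println(eager.equals(deferred)); // true
    }
}
```

The user only ever sees the final, duplicate-free result; whether the system carried duplicates along the way is invisible from outside.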

>Note that in the writings you mention Date only addresses the first
>question, where what you actually asked concerned mostly the second
>question.
>

See above...

regards,
Lauri Pietarinen

-- 
________________________________________________________________

 Lauri Pietarinen, Senior Consultant, Databases

 AtBusiness Communications Oyj, Kaapeliaukio 1, FIN-00180 Helsinki

 tel. +358-9-2311 6632,  mob. +358-50-594 2011,  fax +358-9-2311 6601
 http://www.atbusiness.com,  email: lauri.pietarinen_at_atbusiness.com
_____________________________________________________________________
Received on Thu Feb 13 2003 - 12:27:51 CET
