Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Lauri Pietarinen <>
Date: Mon, 10 Mar 2003 00:03:42 +0200
Message-ID: <>

Jan Hidders wrote:

>Lauri Pietarinen wrote:
>>Jan Hidders wrote:
>>>Let's say I have a set X and I want to know the average of f(x) for all x
>>>in the set X. In the bag algebra this is straightforward: an iteration
>>>over X where you compute f(x), followed by the bag function AVG. If you
>>>simulate this with sets ( the bag {{x, x, y}} becomes {(x,2), (y,1)} )
>>>then you get an iteration followed by a grouping on the values of f(x)
>>>and then you have to compute the average over the product of every
>>>element and its cardinality. It's not trivial to recognize that this is
>>>equivalent with the previous sequence of bag operations, which is often a
>>>more efficient implementation.
>>Could you clarify?
>>In "SQL-speak" what would this mean?
>>Something like this?
>>-- "tuple bag" version
>> FROM X;
>>instead of
>>-- "set" version
>>SELECT AVG'( fx, count )
>> FROM (
>> SELECT F(x) as fx, count(*) as count
>> GROUP BY x )
>Yes. With the correction you yourself already gave that the final GROUP BY
>should be on fx.
OK. Many of the questions regarding the bags<-->sets discussion seem - in the end - to focus on aggregates.

Here is Date's and Darwen's take on aggregates:

Aggregate operators are relational operators that take two arguments: a relation and a column name

e.g. we could have AVG( EMP, SALARY) which would be equivalent to the SQL-statement  

I can't get into all the details here but there there is lot's of discussion on different issues relating aggregates in
Relational Database Writings 1994-1997
(ISBN 0-201-39814-1) installments 44, 45 and 50.

Taking this approach, it would seem that there is no advantage for bags over sets when optimising aggregates.

Lauri Pietarinen       Received on Sun Mar 09 2003 - 23:03:42 CET

Original text of this message