# Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

Date: 26 Feb 2003 07:17:13 -0800

Message-ID: <51d64140.0302260717.356b7e70_at_posting.google.com>

Steve Kass <skass_at_drew.edu> wrote in message news:<b3ga4e$4nh$1_at_slb0.atl.mindspring.net>...

> >Yes, it has. But how are you going to express in your algebra that you are
> >not going to eliminate duplicates immediately after a projection?
> >
> Do you mean "are going to eliminate", meaning duplicate a_i ?
> Given { (<101, 'abc'>, 3), (<102, 'abc'>, 2)}, a projection onto
> the second column naively gives { (<'abc'>, 3), (<'abc'>, 2)}.
> Since we might not want multiple representations of a bag of
> 5 'abc's, we can aggregate after set-like projection or make
> aggregation a part of bag projection, analogous to a
> non-bag algebra's need to eliminate duplicates.

Well, my initial definition of a bag explicitly stated that in the set
{(a,2),(b,3),(c,1),...} the elements a, b, c were distinct. So by this
definition {(<'abc'>, 3), (<'abc'>, 2)} is not a bag, and the projection
would have to be {(<'abc'>, 5)}. In fact, all we've really done is shift
the elimination of duplicates elsewhere.
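
To make the point concrete, here is a rough sketch of a bag projection that folds the aggregation into the projection itself. It assumes a bag is held as a mapping from distinct tuples to counts (Python's `collections.Counter`); the function name `project` and the zero-based column indexes are made up for illustration and are not part of anyone's algebra.

```python
from collections import Counter

def project(bag, columns):
    """Project a bag of tuples onto the given column indexes.

    Counts of tuples that become identical after projection are summed,
    so the result is again a bag with distinct elements.
    """
    result = Counter()
    for row, count in bag.items():
        projected_row = tuple(row[i] for i in columns)
        result[projected_row] += count
    return result

# The example from the quoted text: {(<101,'abc'>, 3), (<102,'abc'>, 2)}
bag = Counter({(101, 'abc'): 3, (102, 'abc'): 2})
projected = project(bag, [1])
print(dict(projected))  # {('abc',): 5}
```

The summing step inside the loop is exactly where the "elimination of duplicates" has moved to.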

Maybe you could amend the definition slightly so that the bag
[a,b,b,b,c,c] is the equivalence class of all collections like
{(a,1),(b,3),(c,2)} that give the same total count for each element.

So, for example, [a,a] would be the equivalence class that contains both
{(a,2)} and {(a,1),(a,1)}.
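
One possible way to test whether two such representations fall in the same equivalence class is to reduce each to a canonical form with one count per distinct element. This is only a sketch: the representations are written as Python lists of (element, count) pairs rather than sets, because a Python set would silently collapse the two identical (a,1) pairs, and the names `canonical` and `same_bag` are invented here.

```python
from collections import Counter

def canonical(pairs):
    """Sum the counts per element, giving one entry per distinct element."""
    total = Counter()
    for element, count in pairs:
        total[element] += count
    return total

def same_bag(p, q):
    """True if both representations denote the same bag."""
    return canonical(p) == canonical(q)

# [a,a] written two ways
print(same_bag([('a', 2)], [('a', 1), ('a', 1)]))  # True
```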

So I think I'm beginning to see what we are getting at now:
if you have a series of procedures to carry out (I'm thinking of the
physical level, internal to the DBMS), do you do:

step 1

eliminate duplicates

step 2

eliminate duplicates

step 3

eliminate duplicates

step 4

eliminate duplicates

or might it be equivalent and more efficient to leave the
deduplication to the end i.e:

step 1

step 2

step 3

step 4

eliminate duplicates

So what you need is an algebra to tell you that the two processes will give you the same result.
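
As a small illustration of the kind of equivalence I mean, the sketch below runs the same pipeline both ways and compares the results. Everything in it is an assumption for the sake of the example: rows are plain tuples in a list, the three steps (one selection and two projections) are hypothetical, and `eager` and `lazy` are just names for the two strategies.

```python
def distinct(rows):
    """Eliminate duplicates, keeping the first occurrence of each row."""
    return list(dict.fromkeys(rows))

# Hypothetical steps: one selection followed by two projections.
def select_b_positive(rows):
    return [r for r in rows if r[1] > 0]

def project_b_c(rows):
    return [(r[1], r[2]) for r in rows]

def project_c(rows):
    return [(r[1],) for r in rows]

def eager(rows, steps):
    """Eliminate duplicates after every step."""
    for step in steps:
        rows = distinct(step(rows))
    return rows

def lazy(rows, steps):
    """Run all steps on the bag, eliminating duplicates only at the end."""
    for step in steps:
        rows = step(rows)
    return distinct(rows)

rows = [(1, 5, 'x'), (2, 5, 'x'), (3, 0, 'y'), (4, 7, 'x'), (4, 7, 'x')]
steps = [select_b_positive, project_b_c, project_c]

print(set(eager(rows, steps)) == set(lazy(rows, steps)))  # True
```

A check like this only confirms the equivalence for one input. For selections and projections the two strategies always agree, but a step that counts rows would not: deduplicating before the count changes the answer. That is why you want the algebra to tell you which steps it is safe to postpone the deduplication past.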

Paul.

Received on Wed Feb 26 2003 - 16:17:13 CET