Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?
Date: 26 Feb 2003 07:17:13 -0800
Message-ID: <51d64140.0302260717.356b7e70_at_posting.google.com>
Steve Kass <skass_at_drew.edu> wrote in message news:<b3ga4e$4nh$1_at_slb0.atl.mindspring.net>...
> >Yes, it has. But how are you going to express in your algebra that you are
> >not going to eliminate duplicates immediately after a projection?
> >
> Do you mean "are going to eliminate", meaning duplicate a_i ?
> Given { (<101, 'abc'>, 3), (<102, 'abc'>, 2)}, a projection onto
> the second column naively gives { (<'abc'>, 3), (<'abc'>, 2)}.
> Since we might not want multiple representations of a bag of
> 5 'abc's, we can aggregate after set-like projection or make
> aggregation a part of bag projection, analogous to a
> non-bag algebra's need to eliminate duplicates.
Well, my initial definition of a bag explicitly stated that in the set
{(a,2),(b,3),(c,1),...} the elements a, b, c were distinct. So by this
definition {(<'abc'>, 3), (<'abc'>, 2)} is not a bag, and the projection
would have to be {(<'abc'>, 5)}. In fact, all we've really done is shift
the elimination of duplicates elsewhere.
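To make that concrete, here is a rough Python sketch of a bag projection that merges multiplicities as it goes. The name project and the choice of collections.Counter to hold (tuple, multiplicity) pairs are just my assumptions for illustration, not anything from the definitions above.

```python
from collections import Counter

def project(bag, columns):
    """Project a bag (tuple -> multiplicity) onto the given column indexes.

    Tuples that collapse to the same projected value have their
    multiplicities summed, so the result is again a bag with distinct keys.
    """
    result = Counter()
    for row, count in bag.items():
        result[tuple(row[i] for i in columns)] += count
    return result

# The example from the quoted text: {(<101,'abc'>, 3), (<102,'abc'>, 2)}
bag = Counter({(101, 'abc'): 3, (102, 'abc'): 2})
projected = project(bag, [1])
print(dict(projected))  # {('abc',): 5}
```

The summing inside the loop is exactly the "shifted" duplicate elimination: the work has not gone away, it has just moved into the projection.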
Maybe you could amend the definition slightly so that the bag
[a,b,b,b,c,c] is the equivalence class of sets like
{(a,1),(b,3),(c,2)} etc.
So, for example, [a,a] would be the equivalence class that contains both
{(a,2)} and {(a,1),(a,1)}.
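A small Python sketch of that equivalence, with made-up helper names canonical and same_bag. One assumption to flag: I hold each representation as a list of (element, count) pairs rather than a true set, because a real set would collapse the two copies of (a,1) into one.

```python
from collections import Counter

def canonical(pairs):
    """Reduce a list of (element, count) pairs to one total count per element."""
    total = Counter()
    for element, count in pairs:
        total[element] += count
    return total

def same_bag(p, q):
    """Two representations denote the same bag if their canonical forms agree."""
    return canonical(p) == canonical(q)

print(same_bag([('a', 2)], [('a', 1), ('a', 1)]))  # True
print(same_bag([('a', 2)], [('a', 1)]))            # False
```

Here canonical picks one representative of each class, namely the one with distinct elements, which is the original definition again.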
So I think I'm beginning to see what we are getting at now:
if you have a series of procedures to carry out (I'm thinking at the
physical level internal to the DBMS), do you do:
step 1
eliminate duplicates
step 2
eliminate duplicates
step 3
eliminate duplicates
step 4
eliminate duplicates
or might it be equivalent and more efficient to leave the
deduplication to the end, i.e.:
step 1
step 2
step 3
step 4
eliminate duplicates
So what you need is an algebra to tell you that the two processes will give you the same result.
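As a toy illustration of the two strategies, here is a Python sketch. The steps (select_over_100, project_second, count_rows) and the sample rows are invented for the example; they are not meant to reflect how any actual DBMS works internally.

```python
def dedup(rows):
    """Eliminate duplicates by collecting the rows into a set."""
    return set(rows)

def select_over_100(rows):
    """Selection: keep rows whose first column exceeds 100."""
    return [r for r in rows if r[0] > 100]

def project_second(rows):
    """Projection onto the second column, keeping any duplicates."""
    return [(r[1],) for r in rows]

def count_rows(rows):
    """A duplicate-sensitive step: return the number of input rows."""
    return [(len(list(rows)),)]

def eager(rows, steps):
    """Eliminate duplicates after every step."""
    for step in steps:
        rows = dedup(step(rows))
    return set(rows)

def lazy(rows, steps):
    """Run all the steps, then eliminate duplicates once at the end."""
    for step in steps:
        rows = step(rows)
    return dedup(rows)

rows = [(101, 'abc'), (101, 'abc'), (102, 'abc'), (99, 'xyz')]
safe = [select_over_100, project_second]
unsafe = [project_second, count_rows]

print(eager(rows, safe) == lazy(rows, safe))      # True
print(eager(rows, unsafe) == lazy(rows, unsafe))  # False
```

For selection followed by projection the two orders agree, but as soon as a step depends on how many copies there are (the count here gives 2 one way and 4 the other), they don't. Telling those cases apart is the job you'd want the algebra to do.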
Paul.
Received on Wed Feb 26 2003 - 16:17:13 CET