Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Steve Kass <skass_at_drew.edu>
Date: Tue, 25 Feb 2003 12:47:40 -0500
Message-ID: <b3ga4e$4nh$1_at_slb0.atl.mindspring.net>


Jan Hidders wrote:

>In article <51d64140.0302250629.37501202_at_posting.google.com>,
>Paul <pbrazier_at_cosmos-uk.co.uk> wrote:
>
>
>>jan.hidders_at_REMOVE.THIS.ua.ac.be (Jan Hidders) wrote in message
>>news:<3e4bca2b.0_at_news.ruca.ua.ac.be>...
>>
>>
>>>No. In fact, in theory, all optimizations that can be done in a set-based
>>>algebra can also be done in a bag-based algebra but not the other way
>>>around.
>>>
>>>
>>We can define a bag [a,a,a,b,c,c,d] as the set {(a,3),(b,1),(c,2),(d,1)}
>>where a,b,c,d are distinct. I assume this is the standard definition?
>>
>>
>
>Yes, it is.
>
>
>
>>So doesn't any bag algebra have a isomorphic algebra of sets of the form:
>>{(a1,n1),(a2,n2),(a3,n3), ...} where a1,a2,a3,... are distinct members of
>>our domain set and n1,n2,n3,... are natural numbers (not necesarily
>>distinct)?
>>
>>
>
>Yes, it has. But how are you going to express in your algebra that you are
>not going to eliminate duplicates immediately after a projection?
>
>
Do you mean "are going to eliminate", meaning duplicate a_i ? Given { (<101, 'abc'>, 3), (<102, 'abc'>, 2)}, a projection onto the second column naively gives { (<'abc'>, 3), (<'abc'>, 2)}. Since we might not want multiple representations of a bag of 5 'abc's, we can aggregate after set-like projection or make aggregation a part of bag projection, analogous to a non-bag algebra's need to eliminate duplicates.

SK

>
>
>>The other thing I still don't understand:
>>The underlying interpretation of a relvar is a predicate, with the
>>rows being propositions. How can it make sense for a relvar to contain
>>any proposition more than once? Does the bag-based "relational"
>>algebra have a different starting point and what is it?
>>
>>
>
>It's starting point is roughly that we are just talking about a data
>structure here and that it is up to the user to decide what it means. One
>possible interpretation is that you are talking about entities which can be
>distinguished but not by attributes that are stored in the database.
>
>
>
>>Should a bag-based SQL have some sort of extensions to cope with duplicate
>>rows? For example if I had n identical rows in a relvar and I wanted to
>>update m of them (m < n) it should have the syntax to say "only update a
>>maximum of m identical rows" (it wouldn't matter which ones because they
>>are identical). Or if I wanted to delete one of them there would be a
>>corresponding DELETE extension. At the moment I can only delete all or
>>nothing.
>>
>>
>
>Yes, it should. Even if you look at SQL as a bag calculus, it is very
>inadequate.
>
>
>
>>I'm having trouble working out which of the pro-bag posters are just
>>playing devils advocate or trolling and which are genuine (if any).
>>
>>
>
>If you read my postings carefully you will notice that I have been
>consistently arguing that the issue is more complicated than being pro-bag
>or contra-bag. Questions such as "are bags in SQL a good idea?", "should we
>allow bags in the logical data model?" and "should we use a bag algebra for
>query optimization?" are related but not exactly the same. Moreover, if
>someone thinks that pointing out that some of Date's contra-bag arguments
>are not entirely correct or very convincing means that I am pro-bag, then
>that is just sloppy thinking.
>
>
>
>>I assume from the cited papers that it must be a legitimate area of
>>research (the algebra of a particular family or sets) but I can't really
>>see its applicability to relational theory.
>>
>>
>
>In relational theory they have their use in query optimization. Besides,
>relational theory is not the only database theory around.
>
>Kind regards,
>
>-- Jan Hidders
>
>
Received on Tue Feb 25 2003 - 18:47:40 CET

Original text of this message