# Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

Date: Tue, 25 Feb 2003 12:47:40 -0500

Message-ID: <b3ga4e$4nh$1_at_slb0.atl.mindspring.net>

Jan Hidders wrote:

>In article <51d64140.0302250629.37501202_at_posting.google.com>,

*>Paul <pbrazier_at_cosmos-uk.co.uk> wrote:
**>
**>
**>>jan.hidders_at_REMOVE.THIS.ua.ac.be (Jan Hidders) wrote in message
**>>news:<3e4bca2b.0_at_news.ruca.ua.ac.be>...
**>>
**>>
**>>>No. In fact, in theory, all optimizations that can be done in a set-based
**>>>algebra can also be done in a bag-based algebra but not the other way
**>>>around.
**>>>
**>>>
**>>We can define a bag [a,a,a,b,c,c,d] as the set {(a,3),(b,1),(c,2),(d,1)}
**>>where a,b,c,d are distinct. I assume this is the standard definition?
**>>
**>>
**>
**>Yes, it is.
**>
**>
**>
**>>So doesn't any bag algebra have a isomorphic algebra of sets of the form:
**>>{(a1,n1),(a2,n2),(a3,n3), ...} where a1,a2,a3,... are distinct members of
**>>our domain set and n1,n2,n3,... are natural numbers (not necesarily
**>>distinct)?
**>>
**>>
**>
**>Yes, it has. But how are you going to express in your algebra that you are
**>not going to eliminate duplicates immediately after a projection?
**>
**>
*

Do you mean "are going to eliminate", meaning duplicate a_i ?
Given { (<101, 'abc'>, 3), (<102, 'abc'>, 2)}, a projection onto
the second column naively gives { (<'abc'>, 3), (<'abc'>, 2)}.
Since we might not want multiple representations of a bag of
5 'abc's, we can aggregate after set-like projection or make
aggregation a part of bag projection, analogous to a
non-bag algebra's need to eliminate duplicates.

SK

*>
**>
*

>>The other thing I still don't understand:

*>>The underlying interpretation of a relvar is a predicate, with the
**>>rows being propositions. How can it make sense for a relvar to contain
**>>any proposition more than once? Does the bag-based "relational"
**>>algebra have a different starting point and what is it?
**>>
**>>
**>
**>It's starting point is roughly that we are just talking about a data
**>structure here and that it is up to the user to decide what it means. One
**>possible interpretation is that you are talking about entities which can be
**>distinguished but not by attributes that are stored in the database.
**>
**>
**>
**>>Should a bag-based SQL have some sort of extensions to cope with duplicate
**>>rows? For example if I had n identical rows in a relvar and I wanted to
**>>update m of them (m < n) it should have the syntax to say "only update a
**>>maximum of m identical rows" (it wouldn't matter which ones because they
**>>are identical). Or if I wanted to delete one of them there would be a
**>>corresponding DELETE extension. At the moment I can only delete all or
**>>nothing.
**>>
**>>
**>
**>Yes, it should. Even if you look at SQL as a bag calculus, it is very
**>inadequate.
**>
**>
**>
**>>I'm having trouble working out which of the pro-bag posters are just
**>>playing devils advocate or trolling and which are genuine (if any).
**>>
**>>
**>
**>If you read my postings carefully you will notice that I have been
**>consistently arguing that the issue is more complicated than being pro-bag
**>or contra-bag. Questions such as "are bags in SQL a good idea?", "should we
**>allow bags in the logical data model?" and "should we use a bag algebra for
**>query optimization?" are related but not exactly the same. Moreover, if
**>someone thinks that pointing out that some of Date's contra-bag arguments
**>are not entirely correct or very convincing means that I am pro-bag, then
**>that is just sloppy thinking.
**>
**>
**>
**>>I assume from the cited papers that it must be a legitimate area of
**>>research (the algebra of a particular family or sets) but I can't really
**>>see its applicability to relational theory.
**>>
**>>
**>
**>In relational theory they have their use in query optimization. Besides,
**>relational theory is not the only database theory around.
**>
**>Kind regards,
**>
**>-- Jan Hidders
**>
**>
*

Received on Tue Feb 25 2003 - 18:47:40 CET