Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?
Date: Thu, 13 Feb 2003 10:54:30 -0000
"Jan Hidders" <hidders_at_REMOVE.THIS.ua.ac.be> wrote in message
> Lauri Pietarinen wrote:
> >what is your take on Garcia-Molina, Ullman and Jennifer Widom
> >regarding their stand on duplicates?
> >(see http://www.dbdebunk.com/cjddtdt.htm and cjddtdt2)
> >Are they just realists accepting that SQL is the de facto
> >standard, so one might as well take it as a basis?
> There are two separate questions here:
> 1. Do we want duplicates in the data model, i.e., in the original relations
> and the results of queries?
> 2. Do we want duplicates in intermediate results?
> I'm not completely sure what their answer to 1. is but I suspect it is
> something like "probably not". But how your algebra looks depends on how you
> answer question 2, because query optimization is the main raison d'etre of
> the algebra, and there it is a completely different story. It can for
> example be more efficient to postpone duplicate elimination. If you don't
> have a bag algebra you cannot express this in your algebra.
In other words, this is a matter of possibly having two algebras: one for database users that is compliant with the relational model and closure (i.e. no duplicates), and possibly an extended algebra used internally by an implementation so that duplicate elimination can be postponed in intermediate results.
That is a useful distinction to make, but not one I've really picked up on before, so thanks Jan.
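To make the two-algebra idea concrete for myself, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the EMP rows, the function names (project_set, project_bag, distinct) and the choice of query are hypothetical, and it is not taken from Date, from Garcia-Molina et al., or from any real optimizer. It only shows that a plan built from set operators and a plan built from bag operators with one final duplicate elimination give the same user-visible relation.

```python
from collections import Counter

# Hypothetical toy relation: (employee, department, city).
EMP = [
    ("alice", "sales", "London"),
    ("bob", "sales", "London"),
    ("carol", "research", "Paris"),
    ("dave", "sales", "Leeds"),
]

def project_set(rows, cols):
    """User-level (set) projection: duplicates are removed at once."""
    return {tuple(r[c] for c in cols) for r in rows}

def project_bag(rows, cols):
    """Internal (bag) projection: duplicates are kept with a count."""
    return Counter(tuple(r[c] for c in cols) for r in rows)

def distinct(bag):
    """Explicit duplicate elimination, turning a bag back into a set."""
    return set(bag)

# Plan A: pure set algebra, duplicates eliminated after every step.
set_plan = project_set(project_set(EMP, (1, 2)), (0,))

# Plan B: bag algebra internally, duplicates eliminated once at the end.
intermediate = project_bag(EMP, (1, 2))        # still contains duplicates
bag_plan = distinct(project_bag(intermediate.elements(), (0,)))

# Both plans hand the same relation back to the user.
print(set_plan == bag_plan)
```

The intermediate bag in plan B is not a relation, which is exactly why it could only live in the internal algebra. Whether plan B is actually cheaper depends on the implementation; the sketch says nothing about cost.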
> Note that in the writings you mention Date only addresses the first
> question, where what you actually asked concerned mostly the second
I guess Date would say that the second question is an implementation issue, and so not something of great concern to him. I've certainly seen mentions of delaying duplicate elimination in his writings, but only in passing.
Business Intelligence, IBM Global Services
Received on Thu Feb 13 2003 - 11:54:30 CET