Re: Idempotence and "Replication Insensitivity" are equivalent ?

From: David Cressey <dcressey_at_verizon.net>
Date: Wed, 20 Sep 2006 05:51:52 GMT
Message-ID: <YF4Qg.2420$x11.286_at_trndny02>


"Chris Smith" <cdsmith_at_twu.net> wrote in message news:MPG.1f7a57acc83a3133989720_at_news.altopia.net...

> If anyone is attached to the definition of "aggregate function" provided
> by a vendor like Oracle, they are welcome to be so attached. It doesn't
> change the fact, though, that Marshall was talking about something more
> restricted in this thread. It also doesn't make Oracle an authority of
> database theory, terminology or otherwise. They have their own goals,
> and their goals are related more toward helping people use their product
> to write applications than helping people understand how aggregate
> functions are defined at a theoretical level.

I've stayed out of the discussion over in sci.math because I'm unwilling to get either your or Marshall's formalism through my thick skull, and I didn't want to speak up in sci.math without following that part of the discussion.

But over in comp.databases.theory there is another aspect to the original discussion that has gotten lost in the discussion of "replication sensitivity". I'm going to break my silence over here in sci.math just to fill in what's going on in the other discussion.

It has to do with the fact that SQL misleads the mathematically naive concerning the difference between a bag and a set. Note that I'm saying "SQL" and not "Oracle". I'm being quite precise on that point. I'm perhaps marginally mathematically naive myself, although I do believe that I think fairly logically.

For the most part, those of us who are SQL jockeys specify SQL operations, including aggregate functions, on collections of "rows" (or "records" if you prefer) as if those collections were always sets. But an SQL table is inherently a bag of rows, not a set of rows.

Those of us who want to apply the power of relational calculus using a tool like SQL are well advised to keep constantly in the back of our minds the distinction between a bag and a set, both in the underlying tables and in the queries based on those tables. This discussion of "replication sensitivity" is, at bottom, about sensitivity to whether the input is a bag or a set. (If you project a bag onto a set, you get rid of the replication, or "duplication", and that changes the value you get out of an aggregate if the aggregate is "replication sensitive".)
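To make the bag/set point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name t, the column x, and the sample values are all made up for illustration, and SQLite merely stands in for "an SQL engine"; other products should behave the same way on queries this simple, but that is an assumption, not something tested here.

```python
import sqlite3

# Hypothetical one-column table with no key, so it is a bag of rows:
# the value 2 appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (2,), (2,), (3,)])


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Aggregates over the bag (the table as stored).
bag_sum = scalar("SELECT SUM(x) FROM t")
bag_count = scalar("SELECT COUNT(*) FROM t")
bag_max = scalar("SELECT MAX(x) FROM t")

# The same aggregates after projecting the bag onto a set.
set_sum = scalar("SELECT SUM(x) FROM (SELECT DISTINCT x FROM t) AS d")
set_count = scalar("SELECT COUNT(*) FROM (SELECT DISTINCT x FROM t) AS d")
set_max = scalar("SELECT MAX(x) FROM (SELECT DISTINCT x FROM t) AS d")

print("SUM:  ", bag_sum, "vs", set_sum)      # differs: replication sensitive
print("COUNT:", bag_count, "vs", set_count)  # differs: replication sensitive
print("MAX:  ", bag_max, "vs", set_max)      # same: replication insensitive
```

On this data SUM and COUNT change when the duplicate row is removed, while MAX does not, which is the distinction the "replication sensitivity" discussion is about.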

Beyond that, I'm fairly lost in the math discussion that's going on over here.
I apologize for butting in.

Received on Wed Sep 20 2006 - 07:51:52 CEST
