Idempotence and Projection

From: David Cressey <dcressey_at_verizon.net>
Date: Tue, 19 Sep 2006 09:04:36 GMT
Message-ID: <EoOPg.2438$zs6.2087_at_trndny07>



If f(x) = SELECT * FROM x GROUP BY *

then I would expect f(x) = x provided that x is a relational table. In other words, I expect that projecting a relation into (or is it onto?) its own space is an idempotent operation.

However, if x contains duplicate rows, and the semantics are such that the duplicates do not make the same assertion (the "cat food problem"), then f(x) will not equal x. Tables with duplicate rows are not relational tables (again, the "cat food problem").
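
To make the distinction concrete, here is a minimal sketch in Python. It assumes a relation can be modelled as a set of tuples and a table with duplicates as a list of tuples; project_all is a hypothetical stand-in for f, and the rows are made-up sample data.

```python
from collections import Counter

def project_all(rows):
    """Hypothetical stand-in for f(x): project onto every column, dropping duplicates."""
    return set(rows)

# Made-up sample data: a true relation (set) and a bag with a duplicate row.
relation = {("cat food", 0.50), ("dog food", 1.20)}
bag = [("cat food", 0.50), ("cat food", 0.50), ("dog food", 1.20)]

# On a relation, f returns exactly what it was given.
relation_unchanged = project_all(relation) == relation

# On a bag, f drops the duplicate, so f(x) differs from x as a multiset.
bag_unchanged = Counter(project_all(bag)) == Counter(bag)

# Applying f a second time changes nothing further, bag or not.
idempotent_on_bag = project_all(project_all(bag)) == project_all(bag)

print(relation_unchanged, bag_unchanged, idempotent_on_bag)
```

Note that f(f(x)) = f(x) holds even for the bag; what fails there is f(x) = x.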

As Bob has pointed out, GROUP BY in the absence of an aggregate function is a projection. As Bob has also pointed out, "count distinct" is the count of a projection.
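
Both of Bob's points can be checked against a real SQL engine. The sketch below uses Python's sqlite3 module with an in-memory database; the purchases table and its rows are invented for illustration, and the columns are declared NOT NULL because COUNT(DISTINCT ...) skips NULLs while a projection would keep one NULL row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; NOT NULL keeps the two counts below comparable.
conn.execute("CREATE TABLE purchases (item TEXT NOT NULL, price REAL NOT NULL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("cat food", 0.5), ("cat food", 0.5), ("dog food", 1.2)],
)

# GROUP BY over every column, with no aggregate function ...
grouped = sorted(
    conn.execute("SELECT item, price FROM purchases GROUP BY item, price").fetchall()
)
# ... returns the same rows as a duplicate-eliminating projection.
projected = sorted(
    conn.execute("SELECT DISTINCT item, price FROM purchases").fetchall()
)

# COUNT(DISTINCT item) versus counting the rows of the projection onto item.
count_distinct = conn.execute(
    "SELECT COUNT(DISTINCT item) FROM purchases"
).fetchone()[0]
count_of_projection = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT item FROM purchases)"
).fetchone()[0]

print(grouped == projected, count_distinct, count_of_projection)
```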

Adding an "Amount" column to a table is really a way of disguising NFNF data so that it looks like 1NF data.
But the cat food problem will exist even if there are no duplicates in the table.
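
One way to read the "Amount" remark, sketched in Python with made-up rows: collapsing duplicates into an amount gives a duplicate-free table, yet each amount can be expanded straight back into the repeated rows it stands for. This is only my illustration of the idea, not a formal treatment of NFNF.

```python
from collections import Counter

# Hypothetical rows with a duplicate (two separate cans of cat food).
bag = ["cat food", "cat food", "dog food"]

# Collapse duplicates into (item, amount) rows: no duplicates remain.
with_amount = sorted(Counter(bag).items())

# Expand again: each amount unfolds into that many identical rows.
expanded = [item for item, amount in with_amount for _ in range(amount)]

round_trips = sorted(expanded) == sorted(bag)
print(with_amount, round_trips)
```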

I'm beginning to think that "replication sensitive" and the cat food problem are really the same problem.
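
A small Python sketch of what I take "replication sensitive" to mean: an aggregate whose result changes when every row is repeated. The rows and helper names are hypothetical.

```python
# Hypothetical (item, price) rows, and the same rows replicated once.
rows = [("cat food", 0.5), ("dog food", 1.2)]
replicated = rows + rows

def total(rs):
    """SUM of price: every extra copy of a row adds to the result."""
    return sum(price for _, price in rs)

def highest(rs):
    """MAX of price: extra copies of a row cannot change the result."""
    return max(price for _, price in rs)

def distinct_items(rs):
    """COUNT(DISTINCT item): computed on a projection, so copies vanish."""
    return len({item for item, _ in rs})

sum_is_sensitive = total(replicated) != total(rows)
max_is_stable = highest(replicated) == highest(rows)
count_distinct_is_stable = distinct_items(replicated) == distinct_items(rows)

print(sum_is_sensitive, max_is_stable, count_distinct_is_stable)
```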

In this connection, perhaps we should borrow a term from science, and refer to "intensive measures" versus "extensive measures". Kimball's books on DW have some other terminology for the same phenomenon, but I forget what it is.

Received on Tue Sep 19 2006 - 11:04:36 CEST
