Re: Which normal form is this violating?

From: Daniel Guntermann <guntermann_at_uswest.net>
Date: Mon, 6 May 2002 08:11:29 -0700
Message-ID: <oPwB8.86$vz5.29898_at_news.uswest.net>


[snip]

First off, I would like to apologize for any slight that might have been perceived by my previous posting, particularly to Mr. Hamm. I too have a high regard for Mr. Celko, though I still believe his argument was flawed. That said...

It is indeed true that a schema should not
> describe two tables that are logically the same, i.e., represent the same
> predicate, because then one table would be redundant.

This statement seems to imply any horizontal partitioning, snapshots, materialized views, or several other types of conventions used in databases are now invalidated. Though these are implementation details and techniques, they still are implemented as tables, and thus can loosely be translated into relation variable definitions. And yes, they typically support the same predicates.

Even in my short time developing and modeling databases and user information, I've seen cases where data have been structured to meet the same predicate, but are not integrated in a relational sense, and even purposely segregated in terms of base tables. I find these instances especially in cases where I am working with temporal data, decision support functions, and somewhat antiquated distributed systems/database processing, where deltas (changes) across tables are periodically computed.

Also, it seems that most textbooks I've come across, stress the fact that the relational model does a great deal to "reduce" redundancy (Maybe it says differently in Celko's book). They have never claimed to say that redundany is entirely eliminated, nor do they say that this is always desirable.

If one views a table definition as that of a relation "variable" definition, then we can start thinking of a table as a variable. And as with most variables, certain operators are desireable -- comparison (=), assignment, etc. Although such operators are not explicit within the original set of eight operators for relational algebra, they can easily be contrived using either relational algebra, relational calculus, or SQL.

Maybe one could be, if a minus operation both ways for two relation variables of the same type returns one or more rows, then false is returned for a comparison operator.

Assignment could be constructed as a retrieve, insert, delete, commit series of statements on two tables of the same type.

Stating that such duplicate definitions, aside from the name or identifier, are out-and-out wrong seems like quite a remarkable statement. Taking away the ability to create "table variables" of the same type would take a lot away from relational model, as it would from any other computer programming language or system.

So if you use the
> term "table" synonymous with "predicate" then it is correct, but if you
> define the term as "a set of tuples" then it is not. So if you are going
> to say that "a schema is a set of tables" and "a table is a set of tuples"
> in the same paragraph you have to make clear that the term "table" has two
> different meanings here.

Obviously I agree with discerning between table values and table definitions. However, I still have trouble swallowing the "no duplicate predicate/table definitions (aside from identifier)" argument, no matter if it used in the context of either a relation or a relation variable. But if my reasoning has holes, feel free to blast away.

Daniel Guntermann Received on Mon May 06 2002 - 17:11:29 CEST

Original text of this message