Re: Surrogate primary key plus unique constraint vs. natural primary key: data integrity?

From: <compdb_at_hotmail.com>
Date: Sat, 9 Mar 2013 15:02:29 -0800 (PST)
Message-ID: <8cdaee4b-1106-46df-bd2a-a01c3eca64c9_at_googlegroups.com>

On Friday, March 8, 2013 2:11:29 PM UTC-8, Wolfgang Keller wrote:
> The issue was whether a surrogate primary key plus a unique constraint
> on the natural key is *equivalent* concerning data integrity, especially
> relational integrity, to a natural primary key.

Wolfgang,

Key columns should be declared to be unique. If there are no other columns then declaring such a constraint is unnecessary but not wrong. ("Primary" might have some meaning to a product but it doesn't in the relational model.)

Applying this to having the natural key columns plus a surrogate key column means that each of these key column sets needs an explicit unique constraint even if there aren't any further columns, because each is the other's "other columns".

A row is in a table when and only when it makes some membership condition true. The way you are using keys, your membership condition for the original table is that the natural key column values identify something AND that some further condition is true of those plus any further column values. Also, your membership condition for the new table is that condition/conjunction AND that the surrogate key column value identifies the same thing as the natural key column values do.

Clearly there is a one-to-one correspondence between the tables/designs given that each surrogate & natural key value pair in the new table is in fact how you are identifying things.

You want tables/designs to be "equivalent". What does this mean? One sense is that for an old expression involving the old table giving an old value you want a corresponding new expression involving the new table to give a corresponding new value. Note that this involves corresponding new/old table variable membership conditions, values of table variables and query results, and query and constraint expressions (including literals).

One correspondence is "the new version of a condition/value/expression everywhere looks like the old one with the appropriate surrogate key columns/values added to the natural key columns/values". Another correspondence is "the new version of a condition/value/expression involving just that one table variable adds the appropriate surrogate key columns/values but in every other condition/value/expression it replaces the natural key columns/values by the the appropriate surrogate key columns/values".

In this sense, each of those new designs is "equivalent" to the old one.

Note that "appropriate column/value" means that you have to use/expect a surrogate-natural value pair that appears in the new table!

Note also that the correspondence applies to membership conditions and constraint expressions, not just values and query expressions. A foreign key (or equivalent) constraint on the natural key has to be converted to the corresponding constraint on the new key (natural+surrogate pair or just surrogate).

If you do not convert constraints appropriately when a new table has both natural+surrogate key columns then the two databases will not be equivalent. Because there will possibly be new rows not corresponding to old rows, because their natural+surrogate pairs will not be constrained appropriately. Integrity will not have been maintained.

In summary: I explained a notion of equivalence of designs under a correspondence. Every key needs a constraint. Wherever a natural+surrogate same-identification pair appears in any new version of a table variable it must be constrained to appear in the new version of the natural+surrogate pairs table variable. If your approach is NOT that then the argument of the preceding paragraph kills it.

philip Received on Sun Mar 10 2013 - 00:02:29 CET

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message