Re: Primary Key of a table

From: Jack L. Swayze Sr. <keystrk_at_feist.com>
Date: 1996/04/22
Message-ID: <4lh8it$iip_at_wormer.fn.net>


Alvin Nonaka <xea0005_at_co.honolulu.hi.us> wrote:

>Michael Kustermann <MKUSTERM_at_worldnet.att.net> wrote:
 

>There is no need to banish the alternate keys to some other status (as
>being foreign keys, say). The alternate keys have independent existence
>on their own right.

Yes, and an independent uniqueness also. This is the essence of Boyce-Codd normal form, that all non-unique columns of a table be indentification dependent upon each and every unique identifier, the whole part of each and every unique identifier, and no other column (owned, or foriegn key) within the table.

In other words (despite the significance placed upon them by being named 'Primary') the column(s) that make up the primary key have no more need to have their uniquenes inforced than any and all alternate unique identifiers.

Or in even simpler terms, it is good to define as many unique indexes on a table as make business sense.

>Let's now add a sequence generated key with the intention of making it
>the primary key:
 

><sk>, <ck2>, <ck1>, ... <ckn>, <data1>, <data2>, ....
>pk ak' ak1 akn
 

>Now the table constraints are:
 

> sk, the artifical sequence generated item, is the primary key
> ck2 is not null and unique (which it was when it was a pk)
> ck1 is not null and unique
> ....

I recognize the mathematics of this, and agree that the <sk> is just as much unique as the (now) <ak> indexes. However, using the <sk> has now lost something. What it has lost is meaning. This raises a question: Why assign an Arbitrary Unique Identifier (my term for your <sk>) when a naturaly occuring unique identifier exists?

The only answer I can come up with is a performance one. It may be easier for the DBMS to inforce uniqueness across one column rather than several. If this is the case, and as perfromance issues are best left to design decisions, then the only justification for the use of an arbitrary unique identifier is to solve a particular design issue. Therefore, unless no naturally occuring unique identifier can be found, arbitrary unique identifiers should not be put into a data model (as the data model should not include design compromises, but instead should reflect the meaningful nature of the data).

Now, in refering to arbitrary unique identifiers as a design compromise, I indicate that there is a downside to them. The downside to arbitrary unique identifiers is the loss of meaning. This forces either one of two things to occur: 1) the end user must associate the meaning with the arbitrary unique identifier (as in 678-4321 on your caller-id means that Betsy is calling), and use the arbitrary unique identifier when communicating with the system or 2) the programs, processes, and procedures that manipulate and present the data to the user must interpret the arbitrary unique identifier for the user each and every time it is used.

Situation 1) is what can cause the system to become less friendly, and therefore less useful and less trusted. Situation 2) forces a programming solution to every business problem (that involves the arbitrary unique identifier), thereby disallowing for ad-hoc query solutions in many cases.

By stating that arbitrary unique identifiers need to be interpreted I point out one other aspect of them, that they are actualy a 'codification' of data, and that they are a codification of data that already exists in the database. Codifying data that already exists in the database is one of the classic forms of redundancy in that you now have more than one data item that represents a single real-world fact.

Therefore, to summarize: Arbitrary unique identifiers: 1) create a data redundancy that must be managed, 2) hide meaning, 3) increase dependency on programming solutions or 4) cause the system to be less usfull and perhaps less trusted.

So, the question is: Is the effecency purchased by the use of arbitrary unique identifiers worth the effectiveness lost by the use of them?

Only your DBA and DA can tell for sure!

>The real trick is to hide the artificially generated key from the users
>as much as possible and to allow users to reference rows by the (more)
>natural keys (which are meaningful to them).

>aloha
>ayn


'Keystroke'
KeystrkTX_at_AOL.COM Received on Mon Apr 22 1996 - 00:00:00 CEST

Original text of this message