Re: Primary Key Theory Question

From: Bernard Peek <bap_at_shrdlu.com>
Date: Sat, 3 Apr 2004 22:06:41 +0100
Message-ID: <fF9x6sphdybAFwgi_at_shrdlu.com>


In message <7cadnVFbnulxVvPdRVn-hQ_at_comcast.com>, Laconic2 <laconic2_at_comcast.net> writes
>What is a surrogate key and what is a natural key is in the eye of the
>beholder. There exists a certain context in which SSN is a surrogate key.
>It's a key issued for the purposes of identification to an object whose
>natural features are unusable for identification. It's just not invented by
>"our system."
>
>SSN only looks natural to the IT person because the subject matter expert
>uses it as if it were natural.

I've just looked into this newsgroup after a wile away, and found the same debate continuing.

My personal definition of a natural key (which I've mentioned here in the past) is a key whose uniqueness is guaranteed by the laws of physics. They are quite rare.

More often there is some sort of surrogate used. Someone, somewhere is assigned the task of ensuring that there is a mapping between the surrogate and the objects that it is intended to identify. If that person does their job well we can treat the surrogate as if it was a natural key. In most cases people doing that job make occasional mistakes, or someone does something that screws the system. For that reason it's usually best to either use your own surrogate, or build some sort of error-handler into your systems.

Adding an identity field has risks. Identity fields don't appear in the logical data structure and every move away from the logical data structure introduces risks that need to be evaluated. In essence the identity field adds a new entity with only one attribute and a 1:1 relationship with an existing entity. But there is nothing in the nature of the data to enforce that relationship. It is up to the system designer to enforce it. That mapping needs to be checked whenever data is added, removed or altered.

When a new instance is created it's trivially easy to create a new identity number. But the system needs to ensure that the instance really is new, and doesn't have a mapping to an existing identity value.

When you delete an entry for "Fred Smith" are you removing the correct identity value, or is there another Fred Smith somewhere in the system. (And if you have some other way of identifying all of the Fred Smiths why are you using an identity value?)

When you alter data for an individual have you found the right Fred Smith.

Remember that if you publish your surrogate key other database designers may come to rely on it and treat it as a natural key. If you change the way it is used, or perhaps re-use an old number, you may wreck someone else's systems.

-- 
Bernard Peek
London, UK. DBA, Manager, Trainer & Author. Will work for money.
Received on Sat Apr 03 2004 - 23:06:41 CEST

Original text of this message