Re: Surrogate Keys: an Implementation Issue

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Mon, 24 Jul 2006 15:39:35 GMT
Message-ID: <XQ5xg.14298$pu3.333248_at_ursa-nb00s0.nbnet.nb.ca>


Brian Selzer wrote:

> "Bernard Peek" <bap_at_shrdlu.com> wrote in message
> news:xn0ep1e3sn11oo000_at_news.individual.net...
>

>>In comp.databases.theory Paul Mansour wrote:
>>
>>>The second is perhaps more profound. Consider a rollback database, or
>>>a database that must provide a complete audit trail of every change.
>>>For example, the database must provide the answer to "who changed
>>>this SS number from  X to Y, and when did they change it? As far as I
>>>can tell, if there is no way to answer this without an immutable
>>>identifier. ( I suppose you could design the DB to handle specific
>>>cases, but I'm interested in DBMS with native rollback and audit
>>>trail support.)
>>
>>If what you think of as the key is mutable it's not a natural key.
>>Natural keys aren't mutable, at all, ever.

>
> Untrue! Natural keys are often mutable--especially compound keys.
>
>
>>The problem here is that there isn't a usable natural key for
>>identifying people. So all that's left are surrogates of varying
>>quality. The SSN is a surrogate that works most of the time. If you
>>assign someone a payroll number and tell tham that thay won't get paid
>>unless they can quote it then you have a close to immutable key. But
>>it's still a surrogate.
>>
>>I think the original post was part correct. Use a natural key when you
>>have one. I'd just add to that, don't use a surrogate unless there is a
>>compelling reason for it. Efficiency is unlikely to be a compelling
>>argument in most cases.

>
> There is always a compelling reason to use surrogates: natural keys can
> change. This makes it difficult--if not impossible--to detect changes to
> rows. For example, Bob is preparing to update a row--that is, he has read
> the row and is in the process of keying in a change. During that time,
> another process updated several rows in the same table. Unless the key is
> immutable, when Bob issues the update, there's no guarantee that the row
> he's updating is the same one that was read out.

So? Bob issues an update statement to change some attribute identified by a logical identifier. As long as the data Bob is entering is correct, why should he care what happens to any other attributes?

   This problem is magnified
> if there are rows related via a foreign key constraint because it's possible
> for the referenced row to appear unchanged. So you're left with either
> maintaining an exclusive lock on the row until Bob returns from the golf
> outing, or adding additional columns and code in order to determine with
> certainty whether or not a change occurred between the time that a row was
> read and the time of the update.

Or you can just write better applications that don't update anything that didn't change. By introducing some 'under-the-covers' identifying attribute, you create a risk that Bob will change some data identified by a familiar logical identifier and some other process in the meantime will associate that identifier with a different surrogate. Your application will then record the updates against the wrong logical identifier.

It seems to me you just went to a lot of trouble to invent a problem that never existed in the first place.

> Natural keys are necessary to maintain the integrity of the information
> stored in the database; surrogates are necessary to maintain the integrity
> of information during the time that it is in use by applications.

I disagree entirely. Received on Mon Jul 24 2006 - 17:39:35 CEST

Original text of this message