Re: Surrogate Keys: an Implementation Issue
Date: Mon, 24 Jul 2006 15:39:35 GMT
Message-ID: <XQ5xg.14298$pu3.333248_at_ursa-nb00s0.nbnet.nb.ca>
> "Bernard Peek" <bap_at_shrdlu.com> wrote in message
> news:xn0ep1e3sn11oo000_at_news.individual.net...
>
>>In comp.databases.theory Paul Mansour wrote: >> >>>The second is perhaps more profound. Consider a rollback database, or >>>a database that must provide a complete audit trail of every change. >>>For example, the database must provide the answer to "who changed >>>this SS number from X to Y, and when did they change it? As far as I >>>can tell, if there is no way to answer this without an immutable >>>identifier. ( I suppose you could design the DB to handle specific >>>cases, but I'm interested in DBMS with native rollback and audit >>>trail support.) >> >>If what you think of as the key is mutable it's not a natural key. >>Natural keys aren't mutable, at all, ever.
>
> Untrue! Natural keys are often mutable--especially compound keys.
>
>
>>The problem here is that there isn't a usable natural key for >>identifying people. So all that's left are surrogates of varying >>quality. The SSN is a surrogate that works most of the time. If you >>assign someone a payroll number and tell tham that thay won't get paid >>unless they can quote it then you have a close to immutable key. But >>it's still a surrogate. >> >>I think the original post was part correct. Use a natural key when you >>have one. I'd just add to that, don't use a surrogate unless there is a >>compelling reason for it. Efficiency is unlikely to be a compelling >>argument in most cases.
>
> There is always a compelling reason to use surrogates: natural keys can
> change. This makes it difficult--if not impossible--to detect changes to
> rows. For example, Bob is preparing to update a row--that is, he has read
> the row and is in the process of keying in a change. During that time,
> another process updated several rows in the same table. Unless the key is
> immutable, when Bob issues the update, there's no guarantee that the row
> he's updating is the same one that was read out.
So? Bob issues an update statement to change some attribute identified by a logical identifier. As long as the data Bob is entering is correct, why should he care what happens to any other attributes?
This problem is magnified
> if there are rows related via a foreign key constraint because it's possible
> for the referenced row to appear unchanged. So you're left with either
> maintaining an exclusive lock on the row until Bob returns from the golf
> outing, or adding additional columns and code in order to determine with
> certainty whether or not a change occurred between the time that a row was
> read and the time of the update.
Or you can just write better applications that don't update anything that didn't change. By introducing some 'under-the-covers' identifying attribute, you create a risk that Bob will change some data identified by a familiar logical identifier and some other process in the meantime will associate that identifier with a different surrogate. Your application will then record the updates against the wrong logical identifier.
> Natural keys are necessary to maintain the integrity of the information
> stored in the database; surrogates are necessary to maintain the integrity
> of information during the time that it is in use by applications.
I disagree entirely. Received on Mon Jul 24 2006 - 17:39:35 CEST