Re: Surrogate Keys: an Implementation Issue

From: paul c <toledobythesea_at_oohay.ac>
Date: Tue, 25 Jul 2006 03:34:08 GMT
Message-ID: <Qigxg.225864$iF6.182231_at_pd7tw2no>


Bob Badour wrote:

> Brian Selzer wrote:
> 

>> "Bob Badour" <bbadour_at_pei.sympatico.ca> wrote in message
>> news:XQ5xg.14298$pu3.333248_at_ursa-nb00s0.nbnet.nb.ca...
>>
>>> Brian Selzer wrote:
>>>
>>>> "Bernard Peek" <bap_at_shrdlu.com> wrote in message
>>>> news:xn0ep1e3sn11oo000_at_news.individual.net...
>>>>
>>>>> In comp.databases.theory Paul Mansour wrote:
>>>>>
>>>>>> The second is perhaps more profound. Consider a rollback database, or
>>>>>> a database that must provide a complete audit trail of every change.
>>>>>> For example, the database must provide the answer to "who changed
>>>>>> this SS number from X to Y, and when did they change it? As far as I
>>>>>> can tell, if there is no way to answer this without an immutable
>>>>>> identifier. ( I suppose you could design the DB to handle specific
>>>>>> cases, but I'm interested in DBMS with native rollback and audit
>>>>>> trail support.)
>>>>>
>>>>> If what you think of as the key is mutable it's not a natural key.
>>>>> Natural keys aren't mutable, at all, ever.
>>>>
>>>> Untrue! Natural keys are often mutable--especially compound keys.
>>>>
>>>>> The problem here is that there isn't a usable natural key for
>>>>> identifying people. So all that's left are surrogates of varying
>>>>> quality. The SSN is a surrogate that works most of the time. If you
>>>>> assign someone a payroll number and tell tham that thay won't get paid
>>>>> unless they can quote it then you have a close to immutable key. But
>>>>> it's still a surrogate.
>>>>>
>>>>> I think the original post was part correct. Use a natural key when you
>>>>> have one. I'd just add to that, don't use a surrogate unless there
>>>>> is a
>>>>> compelling reason for it. Efficiency is unlikely to be a compelling
>>>>> argument in most cases.
>>>>
>>>> There is always a compelling reason to use surrogates: natural keys
>>>> can change. This makes it difficult--if not impossible--to detect
>>>> changes to rows. For example, Bob is preparing to update a
>>>> row--that is, he has read the row and is in the process of keying in
>>>> a change. During that time, another process updated several rows in
>>>> the same table. Unless the key is immutable, when Bob issues the
>>>> update, there's no guarantee that the row he's updating is the same
>>>> one that was read out.
>>>
>>> So? Bob issues an update statement to change some attribute
>>> identified by a logical identifier. As long as the data Bob is
>>> entering is correct, why should he care what happens to any other
>>> attributes?
>>
>> He may be overwriting a change made by another user.
> 
> The only relevant concern is whether the update is correct. If the 
> update is correct, it matters not at all that it overwrites something, 
> which is presumably no longer correct.
> ...

Exactly. For me, the only question remaining is whether an update to a 'line item' for example, presupposes, somehow, a particular value for the 'invoice', say, what do users consider 'correct'? The rub is in the 'somehow'. If the user ignores the 'invoice', then he can do whatever he wants to the line item, including escrow increments (even though I think it is crazy to allow those for most users I've met). This is what bugs me about 2PL protocols.

p Received on Tue Jul 25 2006 - 05:34:08 CEST

Original text of this message