Re: Surrogate Keys: an Implementation Issue

From: Brian Selzer <brian_at_selzer-software.com>
Date: Fri, 28 Jul 2006 05:45:35 GMT
Message-ID: <3whyg.183111$F_3.59230_at_newssvr29.news.prodigy.net>


"paul c" <toledobythesea_at_oohay.ac> wrote in message news:AH5yg.250333$IK3.94968_at_pd7tw1no...
> Brian Selzer wrote:

>> "Bob Badour" <bbadour_at_pei.sympatico.ca> wrote in message 
>> news:PPMxg.21411$pu3.353655_at_ursa-nb00s0.nbnet.nb.ca...
>>> Brian Selzer wrote:
>>> ...
>> I know what an external predicate is.  What does that have to do with the 
>> fact that Bob just updated the wrong row?  How could you prevent that in 
>> application code, or in the middle-tier?  You can't, unless you either 
>> (1) lock the row until Bob gets back from Tahiti, or (2) use a surrogate 
>> to guarantee that the row that's about to be updated is the same as the 
>> one that was read out.
>> ...
>

> This paragraph shows a muddling of different ideas. Locking rows is a
> heavy-handed implementation technique that often locks more values than
> the application intends (using values in the way Bob Badour's later post
> does). Also, early lock managers didn't recognize external predicates so
> were unable to lock the non-existence of a row, maybe some are still
> unable to. (Maybe there are products today that will use views to 'lock'
> only the values an application is concerned with but I doubt it.)
>

> Regardless, so-called surrogates aren't going to help an application whose
> updates depends on the non-existence of some row (logically the falseness
> of some proposition), so the idea that surrogates somehow substitute for a
> lock manager is false.
>

I don't see your point. Bob just read out a row and is about to modify it. He isn't trying to insert a new row.

There are two ways to deal with contention for a shared resource in a concurrent environment: collision avoidance and collision detection. Collision avoidance involves one actor taking control of a resource until he's finished with it--all other actors must wait until it's released; collision detection requires that each actor be able to detect any other actor's activity and then to react accordingly. For databases, collision avoidance involves obtaining a lock on a row and then holding that lock until the update is completed; collision detection requires each user to be able to detect changes made by other users and then to either commit or restart. In order to detect a change, it is necessary that the value of at least one candidate key remain constant throughout the interval bounded by the initial read and the detection read. (First you read the row, then you work with it, then you read the row again to make sure that nothing has changed, and finally, if it hasn't, commit.) Surrogates do not substitute for a lock manager: they guarantee that the key value remains constant.

> It has to be said that RT has nothing to do with 'duration' (let alone
> lock managers) nor does it need to. Theoretically, all an implementation
> needs to guarantee is 1) serialize the changes that an application desires
> to be made atomic, ie. give the effect of applying all changes at a point
> in time and 2) at update time, allow the application to re-iterate all
> (and only) the assertions it is depending on (not just true ones but also
> false ones). I doubt that there are any such rdbms's today. It is like
> criticism of RT because of SQL shortcomings - the lack of faithful
> implementations only serves to encourage naive extensions to RT.
>

I contend that it does need to. While it is imperative that the results of a transaction become current at the same instant, it is naive to assume that all updates occur instantaneously or that transactions have no duration, especially in a concurrent environment and especially when human interaction is involved. If RT doesn't take into account that reality, then what's the point? In a concurrent environment, there may be many intervening database states between the time that a transaction starts and the time that it completes. This means that the premises upon which assertions made during a transaction are based may no longer be valid at commit time. Because the only operation available is assignment, and because assignment essentially replaces the previous value, it is not necessarily true that the assertions in one database state mean the same thing as those in a subsequent one, even if the constituent values are the same. Unless it is known that the value of at least one candidate key will remain constant during an update, there can be no continuity between the database state prior to the update and that as a result of the update. In order to span more than two states, it must be known that that same key will have the same value throughout all intervening updates. Surrogates make it possible to correlate the facts in one database state with the facts in another, providing the continuity necessary to account for changes in what is known to be true during a transaction.

> If the Information Principle is to be followed, Bob Badour is right when
> he says that all keys are surrogates. However, updates that are faithful
> to an application need not involve declared keys (eg., keys stated in a
> catalogue).

>

> So-called predicate locking techniques bring other problems, all of which
> have to do with implementation, not RT, for example the need for session
> managers that enforce a two-phase locking protocol. A thoroughly pure RT
> implementation would have no such component and no lock manager either.
>
> p
Received on Fri Jul 28 2006 - 07:45:35 CEST

Original text of this message