Re: Surrogate Keys: an Implementation Issue

From: paul c <>
Date: Fri, 21 Jul 2006 03:10:12 GMT
Message-ID: <oAXvg.214209$Mn5.50893_at_pd7tw3no>

Paul Mansour wrote:
> paul c wrote:

>>> Name (pk)    City
>>> =====         ======
>>> Palu            New York          // Lastweek
>>> Palu            New Jersey       // yesterday
>>> Paul            New Jersey      //  today
>>> Since my primary key has changed, I need some immutable identifier so I
>>> know these are in fact part of the same set.
>>> Not sure how a timestamp helps here.
>> 'Lastweek', 'yesterday', 'today' look like timestamps to me!

> Sure, but how does the DBMS know that these three rows are somehow
> related? How would it even produce the result? That is what I'm getting
> at, and a timestamp does not help.

By the information principle.

(Sorry for delay, internet here has been busted for most of the day.)

I feel like my mouth has been taped over, hands cuffed and head tied to a bouncing ball and my neck is aching from trying to follow it. First we have surrogates, then we don't, then we do, now we're back to the physical - physical pointers, sector addresses, whatever, but fundamentally, a pair of timestamps could tie the 'rows' together. Most redo-logs don't bother with a pair since the redo-log's purpose is usually more limited than auditing and it is usually accessed only in a sequential way, but the timestamp is still effectively there, directly or indirectly, if only to make sure the dbms doesn't invoke checkpoint twice and a pair of them, while not the only way, certainly does express each element of the set you're talking about. Bob B mentioned the importance of separating the logical and physical. For me the really important reason to do that is because Codd's Information Principle is at odds with what must happen physically, namely ordered indexes or hashes or pointers or offsets or timestamps, even truly random 'oids' might suffice. Some dbms's have used nothing but user 'row' values to implement logs. But if you expose anything as part of a relation, you are no longer talking only about a physical dbms and must acknowledge TTM or equivalent (if one exists), eg. system candidate keys or invent a substitute with equal or better coherence and consistency.

"produce the result" is a tricky phrase because it hints at both logical and physical. We must separate how we produce from what results.

p Received on Fri Jul 21 2006 - 05:10:12 CEST

Original text of this message