Re: Surrogate Keys: an Implementation Issue

From: Brian Selzer <>
Date: Sat, 29 Jul 2006 16:03:12 GMT
Message-ID: <4FLyg.77239$>

Your points are well taken. A relation value is a set of statements of fact. I think it's important to be able to determine whether or not the reality underpinning one of those statements changed from one database state to the next, and to a limited extent, how. This requires the ability to correlate the premises in one database state to those in the next. This is not possible using the operations available in the Relational Model without knowing for certain that there is a candidate key on each affected relation whose values will remain constant throughout an update. I may be guilty of conflating terms when I speak of updating the wrong row, but the problems I described in my earlier post are real.

"David Cressey" <> wrote in message news:WxJyg.1226$sy2.598_at_trndny01...
> "Anith Sen" <> wrote in message
> news:eaavt3$5c2$
>> >> I doubt that there are any such rdbms's today. It is like criticism
> of
>> >> RT because of SQL shortcomings - the lack of faithful implementations
>> >> only serves to encourage naive extensions to RT.
>> The idea of hidden surrogates and associated myths is mistakenly nurtured
> by
>> many who are intimately familiar with current crop of SQL products. For
>> instance, a cursory inspection of some sql programming newsgroups reveals
>> several blatant claims like:
>> -- Surrogates are immutable
>> -- Surrogates should not be exposed the user
>> -- Surrogates help performance
>> -- Surrogates belong to the physical model
>> -- Surrogates are never verifiable in reality
>> -- ...
>> In most cases, claimant misses the fundamental that a surrogate key is
>> nothing but a simple, stable and generally an irreducible candidate key.
> And
>> that it has nothing to do with the physical model and that once used
> outside
>> the DBMS, it is verifiable as well.
>> --
>> Anith
> I think Anith's comments above are correct.
> I've been skipping over this thread for the most part, because the signal
> to
> noise ratio of the thread didn't look good to me.
> So my comment may duplicate what somebody else already wrote.
> The elephant in the living room here is whether a surrogate key identifies
> a
> row one the one hand, or identifies a subject matter entity described by
> the
> (key) contents of the row on the other. Comments like "updated the wrong
> row" lead inescapably to the idea that there is a "right row", and that
> the
> programmer coding an update has the responsibility to identify the right
> row, and manipulate that row.
> If surrogate keys are used as row identifiers, then they are not
> surrogates
> for natural keys. They are indirect addressing of the row, as a data
> structure. The indirection allows for a certain flexibility in
> reorganizing
> data physically, without losing linkages. But other than that, it largely
> misses the point of using either RDM or SQL. If you're going to link rows
> with other rows, you're back to the graph data model, whether you know
> it
> or not.
> If on the other hand, surrogates are surrogates for natural keys, about
> all they provide is immutability in the face of natural key mismanagement,
> or real world collisions. That's very useful in some circumstances. But
> it
> doesn't allow Dave to update "the correct row". It allows Dave to update
> "the correct data".
> An example of real world collisions is having a surrogate key for
> Employee_ID that will survive the merger of Delta Airlines with Epsilon
> Airlines. Data warehouses often use surrogate keys for this kind of
> purpose. Then again, data warehouses often avoid updates completely,
> except for error correction. So the issues raised by updates are moot in
> that context.
> But this has nothing to do with the substance of most of this discussion
> (again, unless I missed a key part of the discussion).
Received on Sat Jul 29 2006 - 18:03:12 CEST

Original text of this message