Re: Surrogate Keys: an Implementation Issue

From: J M Davitt <jdavitt_at_aeneas.net>
Date: Sat, 29 Jul 2006 04:55:36 GMT
Message-ID: <cTByg.59088$Eh1.8821_at_tornado.ohiordc.rr.com>


JOG wrote:
> J M Davitt wrote:
>

>>JOG wrote:
>>
>>>J M Davitt wrote:
>>>
>>>
>>>>[big snip]
>>>>
>>>>
>>>>
>>>>>What does that have to do with the
>>>>>fact that Bob just updated the wrong row?
>>>>
>>>>The wrong row?  What was wrong about it?  The database was
>>>>told, for example, "Make Suzie's record look like this."
>>>>
>>>>The confusion arising because "Suzie is George and William
>>>>is Suzie but Bob thought George was still Suzie" isn't
>>>>going away just because you slap another value on all your
>>>>records; you're just moving the confusion around.
>>>
>>>
>>>
>>>There is miscommunication in this thread. Brian is coming from an
>>>assumption that primary keys are mutable, and everyone else from the RM
>>>specification that they should be immutable. The primary key identifies
>>>a tuple, and if something's identity changes... well, conceptually, it
>>>is a completely different thing. Hence in the example you supply Brian,
>>>the RM is responding perfectly and as we would want it to.
>>
>>A couple things:  I'm not presuming that primary key values are
>>immutable.  I'm not sure what you mean when you say "from the RM
>>specification that they should be immutable."
>>
>>I don't think I agree that a "thing" changes because one of
>>its attributes changes.  Yes, the tuple representing it does
>>but the thing is the same.  For instance, the change in the
>>database may occur because the value for some attribute was
>>incorrectly recorded; that doesn't mean there was a different
>>thing.  Or instance of an entity.  Whatever

>
>
> Ok, I understand your viewpoint, but I think its a mistake. Liebniz,
> whose definition of identity is key to formal logic, states that
> something is identifiable and distinguishable through its attributes,
> and its attributes alone. If a comparison is made between item 1 and
> item 2 and they have identical attributes, then they are the same
> thing. If all attributes are not the same then they are by definition
> different things, and the fact that the items being considered exist at
> different points in time makes no difference to this equality
> comparison. Thinking in terms of permanent 'entities' clouds this logic
> (which granted is unintuitive to everyday life), and hence the
> preference for thinking in terms of statements of fact alone.

I'm familiar with the idea you've described and have to say I agree -- but only to the extent that the time-varying values of attributes associated with an entity can be ignored. That, I think, is the crux of the problem that the suggested surrogate keys don't solve.

Analyzed statically, any things which are indistinguishable are the same thing just as any things which are different are different things. In the case presented, the different things may be different states (or versions or values) of an attribute which the designer chose to use as a candidate key. While I feel the described key wasn't suitable for the intended use, the proposed solution was to invent an attribute which had the properties necessary to make the application design work.

Coarsely speaking, these issues related to the time-varying values of the entities of interest are lumped into the category of temporal data and are beyond the scope of relational theory. The unfortunate circumstance is that some of the worthwhile proposed approaches call themselves models and, in other cases, features were implemented "under the covers" in the database engine in ways which were inconsistent with relational theory. (I'm not suggesting that the rest of the SQL-esque features are consistent with RT.)

But just because they're beyond the scope doesn't mean that RM databases can't handle such databases; in fact, they certainly can.

Given the state of databases available to most of us, the trick is to define operations that do what the database can't: updates to an entity that we wish to track over time does not map to an 'UPDATE <table>' SQL statement. So sad.

[snip] Received on Sat Jul 29 2006 - 06:55:36 CEST

Original text of this message