Re: Surrogate Keys: an Implementation Issue

From: JOG <jog_at_cs.nott.ac.uk>
Date: 2 Aug 2006 04:17:00 -0700
Message-ID: <1154517420.845266.5110_at_75g2000cwc.googlegroups.com>


Brian Selzer wrote:
> "JOG" <jog_at_cs.nott.ac.uk> wrote in message
> news:1154468782.194660.305140_at_75g2000cwc.googlegroups.com...
> > Brian Selzer wrote:
> > [snip for brevity]
> >> Not exactly. Any candidate key value is sufficient to identify a
> >> proposition within a single database state, but that doesn't mean that it
> >> is sufficient across multiple successive states. While it's true that if two
> >> things are indistinguishable, they are the same thing, the reverse is not
> >> necessarily true: it's not a caterpillar that emerges from a chrysalis,
> >> it's a butterfly.
> >
> > An interesting point - but what makes the caterpillar and the butterfly
> > the same thing? How do you know it's the same creature at all? Well, of
> > course, the two states of the creature have the same DNA. That's the
> > primary key in this case - and that's the whole point. It's a bit of a
> > daft example, but again this is still in line with standard RT - there
> > is _nothing_ outside of the attributes. One may not be able to record
> > that DNA, and so use a surrogate to represent it (hence the term
> > surrogacy). But whatever field represents that distinguishability, it is
> > still denoting an explicit attribute (or set thereof), and given that it
> > is part of the logical model, it must not be hidden. If there were no
> > common key, according to logic the two 'states' are wholly different
> > things.
> >
>
> So, what you're saying is (1) that anything that can be discussed must be
> distinguishable from every other thing that can be or has been discussed,

Aye.

> (2) that anything that can be discussed must have at least one identifying
> set of properties that is guaranteed to remain constant throughout the
> discussion,

Aye.

> and (3) that one or more other properties can be different at
> different stages of the discussion without affecting the logical identity of
> the thing under discussion.

This one no, or rather only with a qualifier: if the primary key changes, the proposition has a different logical identity. Values which are merely functionally dependent on the key can change without altering logical identity. If you are recorded in a db by your DNA sequence and you change your hair colour, that's fine. Change your DNA and you're a different person.
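
A minimal sketch of that distinction, using Python's built-in sqlite3 module. The table name, column names and DNA strings are made up purely for illustration; nobody's real schema is implied.

```python
import sqlite3

# Hypothetical table: the DNA sequence is the key, hair colour depends on it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (dna TEXT PRIMARY KEY, hair TEXT)")
conn.execute("INSERT INTO person VALUES ('ACGT-1', 'brown')")

# Changing a dependent value: the key still picks out the same proposition.
conn.execute("UPDATE person SET hair = 'blond' WHERE dna = 'ACGT-1'")
after_dye = conn.execute(
    "SELECT hair FROM person WHERE dna = 'ACGT-1'"
).fetchone()

# Changing the key: nothing is recorded under the old identifier any more,
# so as far as the database has been told this is a different person.
conn.execute("UPDATE person SET dna = 'ACGT-2' WHERE dna = 'ACGT-1'")
old_key = conn.execute(
    "SELECT hair FROM person WHERE dna = 'ACGT-1'"
).fetchone()
new_key = conn.execute(
    "SELECT hair FROM person WHERE dna = 'ACGT-2'"
).fetchone()
```

After the first update the lookup by key still finds the row; after the second, the old key finds nothing at all.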

>
> If that's what you're saying, then I'm in complete agreement.
>
> The Relational Model falls short in that (1) the definition of a candidate
> key assumes only that anything that can be discussed is distinguishable from
> every other thing that can be discussed--it does not require that anything
> that can be discussed be distinguishable from every other thing that has
> been discussed,

If something at time t and something at time t+1 have the same identifier, then they are the same thing. If not, then as far as the database has been told they are different things. It's as simple as that to me. If in the real world it is clear that they _should_ have been viewed as the same item, then we chose the wrong key and need to find the correct one - or use a surrogate to represent the unique identifying property if we somehow cannot record it. However, it is vital that this surrogate not be hidden, so that it can be used externally as a representative for those identifying properties in communication with the database.
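
To make that concrete, here is a rough sketch in Python. It assumes each database state is just a mapping from key value to dependent attributes; the `correlate` function and the sample identifiers are hypothetical, invented for this example.

```python
def correlate(state_t, state_t1):
    """Pair up propositions in two successive states by key value alone."""
    shared = state_t.keys() & state_t1.keys()
    # Same identifier in both states: the same thing, whatever else changed.
    same = {k: (state_t[k], state_t1[k]) for k in shared}
    # Identifier only in the earlier state: that thing is no longer recorded.
    gone = {k: state_t[k] for k in state_t.keys() - shared}
    # Identifier only in the later state: a different thing, as far as we know.
    new = {k: state_t1[k] for k in state_t1.keys() - shared}
    return same, gone, new


# Hypothetical states at time t and t+1, keyed by a visible identifier.
state_t = {"dna-001": {"hair": "brown"}, "dna-002": {"hair": "black"}}
state_t1 = {"dna-001": {"hair": "blond"}, "dna-003": {"hair": "black"}}

same, gone, new = correlate(state_t, state_t1)
```

Note that "dna-002" and "dna-003" have identical dependent values, yet nothing here can say they are the same thing. If they should have been, the key was wrong. And the correlation is only possible at all because the identifier is visible to the code doing the comparing.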

> (2) the definition of a candidate key does not require that
> it remain constant throughout the discussion, and as a result, (3) the
> logical identity of a thing under discussion cannot be determined at
> different stages of the discussion because the candidate key value for a
> proposition about a thing at one stage of the discussion can be different
> from the candidate key value for a proposition about the same thing at
> another.

As I said: different identifier, different things. Remember, I'm agreeing that if you want to compare something between time periods that you know to be the same, then you need something to identify it between those periods. I'm just saying that that identifier must not be hidden.

>
> In other words, (1) the scope of a candidate key's ability to logically
> identify the things under discussion is limited to a single database state,
> (2) candidate key values can be different at different stages of the
> discussion, and consequently, (3) the propositions in one database state
> cannot be correlated with propositions about the same things in the next.

I just don't think we're going to agree, Brian. I don't believe in perpetual entities that exist outside of their identifying properties. I did once (see my very first discussions with Marshall, for example), and as such I agree it's an instinctive view, but studying the philosophy of information and the discussions on this board has changed my opinion. All best, J.

Received on Wed Aug 02 2006 - 13:17:00 CEST
