Re: A pk is *both* a physical and a logical object.

From: Brian Selzer <brian_at_selzer-software.com>
Date: Wed, 15 Aug 2007 17:23:40 GMT
Message-ID: <wEGwi.49581$YL5.17655_at_newssvr29.news.prodigy.net>


"JOG" <jog_at_cs.nott.ac.uk> wrote in message news:1187168879.650991.260650_at_57g2000hsv.googlegroups.com...
> On Aug 15, 6:45 am, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> [snip]
>> A simple example: Suppose that you built several identical computers.
>> Each has one motherboard, one DIMM, one hard drive, one video board,
>> and one case. Each component is serialized, so the serial number for
>> each component would be a candidate key value in a relation describing
>> the composition of each computer. You're having trouble with one of
>> the computers, but you're not sure which component is failing, or even
>> if it is a hardware problem, so you swap the hard drives from two of
>> the computers to see if the problem moves. For either of the two
>> computers affected by the swap, the motherboard is the same, the DIMM
>> is the same, the video board is the same, and the case is the same. So
>> obviously, since those serial numbers are the same before and after
>> the swap, it's the same computer, right? But wait, the hard drive is
>> different; therefore, it must not be the same computer because the
>> serial numbers for the hard drives are different.
>
> Nice little example. The answer is that in your schema you have no
> concept of a computer. None whatsoever. A computer does not exist.
> There is just a relationship between several components. A computer
> (which you have defined as a set of components) has no identity of its
> own over time outside the things it contains - and as such, if these
> components can be changed, it simply cannot be identified throughout
> its lifetime.
>
> Hence if you switch a component, well then you have a different set of
> components. And if you want to call that set of components a computer,
> yes, you have a different computer.
>
> If this was not what was desired (as I imagine) then your schema was
> broken. Your computer should have an id on the box to allow it to be
> identified / give it an identity.
>
> Make sense?
>

Yes, and no. The example certainly does have a concept of a computer. You can pick it up, plug it in on a user's desk, attach peripherals to it, and turn it on. Its components are useless apart from each other. (Well, maybe not useless: at one point I was using an old 5.25" full height hard drive to hold my door open.)

Suppose that there is an asset tag affixed to the outside of the case. Does each computer now have identity? Suppose, then, that a GUID was assigned to the computer when the OS was installed. Aren't we now in the same boat? From the perspective of the directory service and thus the users of the network, the computers involved in the swap are different, but from the perspective of the bean counters, it's the same computer.
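
To make the two perspectives concrete, here is a minimal sketch in Python. It assumes, as the paragraph above implies but does not state outright, that the GUID assigned at OS install lives on the hard drive and so travels with it. The names `Computer`, `swap_drives`, and all tag, GUID, and serial values are hypothetical, invented only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Computer:
    asset_tag: str   # sticker on the case (hypothetical value)
    os_guid: str     # ASSUMPTION: stored on the hard drive, so it moves with the drive
    hard_drive: str  # hard drive serial number


def swap_drives(a: Computer, b: Computer) -> tuple:
    """Swap the hard drives of two computers; each GUID follows its drive."""
    return (
        Computer(a.asset_tag, b.os_guid, b.hard_drive),
        Computer(b.asset_tag, a.os_guid, a.hard_drive),
    )


before = (
    Computer("TAG-1", "guid-A", "HD-100"),
    Computer("TAG-2", "guid-B", "HD-200"),
)
after = swap_drives(*before)

# The bean counters follow the asset tag: the first box is still TAG-1.
same_by_tag = before[0].asset_tag == after[0].asset_tag

# The directory service follows the GUID: the first box now answers as guid-B.
same_by_guid = before[0].os_guid == after[0].os_guid
```

Both keys identify a computer in their own context, yet after the swap they disagree about which computer is which.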

The problem is that no single key value is, by itself, either a rigid designator for or a rigid description of a computer. If you define a historical relation whose key combines the initial component serial numbers, the time at which they were originally assembled, and an interval attribute, then it should be clear that as time progresses the individual components in the computer may be swapped, but it's still the same computer--even if all of the original components were swapped! It is the one computer that had a particular combination of components at a particular point in time.

Since the value of a database represents what has been true since the last update, the combination of the serial numbers of the current components along with the point in time of the last update is another rigid description for the computer that had a different configuration at the time it was originally assembled. The absence of that initial configuration and assembly time in the relation does not change that fact. A projection over the attributes representing the component serial numbers that is restricted to only those tuples with indefinite intervals /is/ the non-historical relation. Just because the history isn't represented in the database doesn't mean that there isn't any! It just means that it isn't relevant.
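
The historical relation described above can be sketched roughly as follows. This is only one possible encoding, under stated assumptions: times are plain integers, an indefinite interval is represented by `valid_to = None`, and only three components are tracked. The names `HistoryRow`, `current_relation`, and `origin_of`, and all serial numbers, are hypothetical.

```python
from typing import NamedTuple, Optional


class HistoryRow(NamedTuple):
    initial_config: tuple     # serials at assembly: (motherboard, DIMM, hard drive)
    assembled_at: int         # time of original assembly
    valid_from: int           # start of the interval
    valid_to: Optional[int]   # end of the interval; None means indefinite (current)
    config: tuple             # component serials during this interval


# Two computers assembled at time 0; their hard drives are swapped at time 5.
history = [
    HistoryRow(("MB-1", "DIMM-1", "HD-1"), 0, 0, 5, ("MB-1", "DIMM-1", "HD-1")),
    HistoryRow(("MB-1", "DIMM-1", "HD-1"), 0, 5, None, ("MB-1", "DIMM-1", "HD-2")),
    HistoryRow(("MB-2", "DIMM-2", "HD-2"), 0, 0, 5, ("MB-2", "DIMM-2", "HD-2")),
    HistoryRow(("MB-2", "DIMM-2", "HD-2"), 0, 5, None, ("MB-2", "DIMM-2", "HD-1")),
]


def current_relation(rows):
    """Project onto the component serials, restricted to indefinite intervals."""
    return {r.config for r in rows if r.valid_to is None}


def origin_of(rows, config, at_time):
    """Return (initial_config, assembled_at) for the computer that had
    `config` at `at_time`, or None if no row matches."""
    for r in rows:
        end = float("inf") if r.valid_to is None else r.valid_to
        if r.config == config and r.valid_from <= at_time < end:
            return (r.initial_config, r.assembled_at)
    return None
```

In this sketch, `origin_of(history, ("MB-1", "DIMM-1", "HD-1"), 0)` and `origin_of(history, ("MB-1", "DIMM-1", "HD-2"), 7)` return the same pair, which is the sense in which a configuration plus a point in time describes one and the same computer before and after the swap. `current_relation(history)` is the non-historical relation: it holds only the two present configurations.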

>> [snip]
>> >> In addition, identification is not identity!
>>
>> > Aha! That's where we differ then. That is /exactly/ what identity is
>> > in my opinion. Identification is stating that if I know one attribute
>> > (or set of attributes) I can functionally determine the rest. Perhaps
>> > we should discuss that and then the rest of the arguments might fall
>> > into place? Let me start the ball rolling, with a catfood example for
>> > the new century ;)
>>
>> An individual's identification is a set of properties that
>> distinguishes the individual from all others in the context of a
>> picture of the universe; an individual's identity is that set of
>> properties that defines the individual. These are two different things.
>
> "identity is that set of properties that defines the individual." -
> what does that mean, if not the properties that distinguish it from
> other entities of the same type in that universe?
>

Does the chit holding the number you took at the pharmacy define you? I wouldn't think so, but it certainly identifies you, at least for as long as you're holding it.

>>
>> > ------------------------------------------
>> > I am shown a can of catfood from an identical batch of three. Its
>> > only, single identifying feature is a number on it. I read it, and the
>> > can is taken away. I am then shown a new can. Is it the same can? Does
>> > it have the same identity? I read the number on it, which is
>> > different. I conclude therefore, quite sensibly, it is a different can
>> > to the first.
>>
>> > Unbeknownst to me someone had shuffled the can numbers up at random
>> > after I'd read the first one. Even this mischievous soul himself has
>> > no idea if the original can I was given ended up with the same number
>> > on it as before (the shuffling was done blind). In fact /no one/ in the
>> > world now knows.
>>
>> > Where does identity stand now?
>> > ------------------------------------------
>>
>> Identity stands as it always did. What is different is identification.
>
> If /no one/ can identify it any longer you still think the original can
> has some god-given soul-number? We humans give something an identity
> - it doesn't just exist as a force of nature.
>

I beg to differ. We humans identify something, or name something. If you see identical twins, wearing nothing but smiles, look away, and then look back, can you be certain that each is in the same position as the first time you looked? I would venture to say that they each have identity regardless of their position, even if you can't tell them apart.

>> This is a perfect example of why update is primitive and assignment
>> isn't. An assignment replaces the current relation value with a new
>> one, blindly, but an update specifies which tuples are different and
>> how each differs,
>
> An update replaces a relation with a new one with a delete, and then
> replaces it again with another new relation with an insert. Just
> because the system does some jiggery pokery to work out which
> proposition to delete, and the content of the new one to be inserted
> does not change this.
>

That's Date and Darwen's interpretation. They're entitled to their opinion, even if it's wrong. From what I've read, Codd didn't share it. In his book, RMv2, he defines relational assignment for variables in volatile memory, for temporary objects, as a tool to simplify complicated queries. As far as I know he never bought into Date's relvar notion for base relations. I also remember an exchange between Hugh Darwen and Fabian Pascal in which they disagreed about the notion of relvars. I'm not the only one who isn't sold.
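
The distinction at issue (an assignment replaces the relation value blindly, whereas an update says which tuples differ and how) can be illustrated with the cat food cans. The following is a minimal sketch, not anyone's formal definition: a relation is modelled as a set of can numbers, an update as a mapping from old tuple to new tuple, and the names `apply_update`, `no_change`, and `shuffle` are hypothetical.

```python
# Relation value before the shuffle: one tuple per can, holding its number.
before = frozenset({101, 102, 103})


def apply_update(relation, mapping):
    """Apply an update given as {old tuple: new tuple}.
    Tuples not mentioned in the mapping are left as they are."""
    return frozenset(mapping.get(t, t) for t in relation)


# Update 1: nothing is touched.
no_change = {}

# Update 2: every can gets a different number, but the set of numbers is the same.
shuffle = {101: 102, 102: 103, 103: 101}

after_no_change = apply_update(before, no_change)
after_shuffle = apply_update(before, shuffle)

# An assignment carries only the resulting value, and the two values are equal.
values_equal = after_no_change == after_shuffle

# The updates themselves differ: only the mapping records which can is which.
updates_differ = no_change != shuffle
```

Seen purely as before and after values, the two cases cannot be told apart. The mapping is what carries the information an observer would have had by watching each can throughout.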

>> which has the same effect as observing each can of cat food throughout
>> the interval from the first reading to the second. Obviously, if you
>> were able to simultaneously observe each can, then there would be no
>> doubt as to whether the new can is the original can.
>
> In that (let us note different) situation there would be two different
> definitions of the relative identity of a can. One would be its point
> in space (which one observer has tracked) and the other its no. ID,
> which someone else is observing. Two different constructs. Neither is
> right, neither is wrong. Two things with their own distinct identity.
>
> And given I'm sure you're not suggesting that someone should walk around
> pointing at things continually to give them their identity, they must
> have an observable property that does not change in the propositions
> in which they feature, otherwise they simply could not be recognized
> as the same thing in real life, never mind a database encoding.
>

Something like identical twins?

> I know this is a subtle, uncomfortable way of viewing things - but
> I've managed to shake off the idea that something has an identity even
> if no one is observing it, so I'm sure others can. There is nothing to
> relative identity outside an objective human decision. Recognizing
> this means good conceptual modelling, as making the right decision is
> down to us. Regards, J.
>
>> >> Identification is used by one individual to pick another out of a
>> >> crowd, whereas identity is what one individual is. It may be that
>> >> much of the confusion is caused by misinterpreting this simple
>> >> distinction. Identification is the nominative form of the verb
>> >> "to identify."
>>
>> >> Update is a primitive operation. It is not a shortcut--it cannot be a
>> >> shortcut, because not all key values permanently identify individuals.
>
>
Received on Wed Aug 15 2007 - 19:23:40 CEST
