Re: A pk is *both* a physical and a logical object.

From: JOG <jog_at_cs.nott.ac.uk>
Date: Wed, 15 Aug 2007 11:14:56 -0700
Message-ID: <1187201696.390580.326600_at_19g2000hsx.googlegroups.com>


On Aug 15, 6:23 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>
> news:1187168879.650991.260650_at_57g2000hsv.googlegroups.com...
>
>
>
> > On Aug 15, 6:45 am, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> > [snip]
> >> A simple example: Suppose that you built several identical computers. Each
> >> has one motherboard, one DIMM, one hard drive, one video board, and one
> >> case. Each component is serialized, so the serial number for each component
> >> would be a candidate key value in a relation describing the composition of
> >> each computer. You're having trouble with one of the computers, but you're
> >> not sure which component is failing, or even if it is a hardware problem, so
> >> you swap the hard drives from two of the computers to see if the problem
> >> moves. For either of the two computers affected by the swap, the
> >> motherboard is the same, the DIMM is the same, the video board is the same,
> >> and the case is the same. So obviously, since those serial numbers are the
> >> same before and after the swap, it's the same computer, right? But wait,
> >> the hard drive is different; therefore, it must not be the same computer
> >> because the serial numbers for the hard drives are different.
>
> > Nice little example. The answer is that in your schema you have no
> > concept of a computer. None whatsoever. A computer does not exist.
> > There is just a relationship between several components. A computer
> > (which you have defined as a set of components) has no identity of its
> > own over time outside the things it contains - and as such, if these
> > components can be changed, it simply cannot be identified throughout
> > its lifetime.
>
> > Hence if you switch a component, well then you have a different set of
> > components. And if you want to call that set of components a computer,
> > yes, you have a different computer.
>
> > If this was not what was desired (as I imagine) then your schema was
> > broken. Your computer should have an id on the box to allow it to be
> > identified / give it an identity.
>
> > Make sense?
>
> Yes, and no. The example certainly does have a concept of a computer. You
> can pick it up, plug it in on a user's desk, attach peripherals to it, and
> turn it on. Its components are useless apart from each other. (Well, maybe
> not useless: at one point I was using an old 5.25" full height hard drive to
> hold my door open.)

Agreed, a set of components. But absolutely no identity outside that.

>
> Suppose that there is an asset tag affixed to the outside of the case. Does
> each computer now have identity? Suppose, then, that a GUID was assigned to
> the computer when the OS was installed. Aren't we now in the same boat?
> From the perspective of the directory service and thus the users of the
> network, the computers involved in the swap are different, but from the
> perspective of the bean counters, it's the same computer.

Yes, if it had a GUID it would then have an identity over time, independent of the particular set of components it contains at any one moment. Previously it only had an identity at a single, static point in time.
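
To make the difference concrete, here is a minimal Python sketch of the two schemas. It is only an illustration under my own assumptions: the Computer type, the serial numbers and the "TAG-A"/"TAG-B" asset tags are all invented for the example and are not taken from your schema.

```python
from typing import NamedTuple


class Computer(NamedTuple):
    """One tuple of the 'composition' relation: nothing but component serials."""
    motherboard: str
    dimm: str
    hard_drive: str
    video_board: str
    case: str


# Hypothetical serial numbers, invented for this example.
pc1 = Computer("MB-1", "DIMM-1", "HD-1", "VB-1", "CASE-1")
pc2 = Computer("MB-2", "DIMM-2", "HD-2", "VB-2", "CASE-2")

# Schema 1: a computer is identified only by its set of components.
before = {pc1, pc2}
after = {
    pc1._replace(hard_drive=pc2.hard_drive),
    pc2._replace(hard_drive=pc1.hard_drive),
}
# No tuple survives the swap: as far as this schema can tell,
# two computers vanished and two new ones appeared.
survivors = before & after

# Schema 2: an assumed asset tag (or GUID) is the key; components are attributes.
tagged_before = {"TAG-A": pc1, "TAG-B": pc2}
tagged_after = {
    "TAG-A": pc1._replace(hard_drive="HD-2"),
    "TAG-B": pc2._replace(hard_drive="HD-1"),
}
# The keys are unchanged, so each computer can still be identified after the swap.
same_computers = tagged_before.keys() == tagged_after.keys()
```

In the first schema there is nothing left to say which "new" computer used to be which; in the second the tag carries the identity and the hard drive is just an attribute that changed.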

>
> The problem is that not one of the key values by itself is a rigid
> designator for nor a rigid description of a computer. If you define a
> historical relation, and combine the initial component serial numbers along
> with the time that they were originally assembled and an interval attribute
> to be the key, then it should be clear that as time progresses, the
> individual components in the computer may be swapped, but it's still the
> same computer--even if all of the original components were swapped! It is
> the one computer that had a particular combination of components at a
> particular point in time. Since the value of a database represents what has
> been true since the last update, the combination of the serial numbers of
> the current components along with the point in time of the last update is
> another rigid description for the computer that had a different
> configuration at the time it was originally assembled. The absence of that
> initial configuration and assembly time in the relation does not change that
> fact. A projection over the attributes representing the component serial
> numbers that is restricted to only those tuples with indefinite intervals /is/
> the non-historical relation. Just because the history isn't represented in
> the database doesn't mean that there isn't any! It just means that it
> isn't relevant.

If the history is important, as it is when identifying attributes might change, it should have been incorporated into the schema from the start, not bolted on as some add-on kludge. Your construct should have been stable and identifiable over time, not just for one instant. Problems ensue when the wrong context is picked.

>
>
>
> >> [snip]
> >> >> In addition, identification is not identity!
>
> >> > Aha! That's where we differ then. That is /exactly/ what identity is
> >> > in my opinion. Identification is stating that if I know one attribute
> >> > (or set of attributes) I can functionally determine the rest. Perhaps
> >> > we should discuss that and then the rest of the arguments might fall
> >> > into place? Let me start the ball rolling, with a catfood example for
> >> > the new century ;)
>
> >> An individual's identification is a set of properties that distinguishes the
> >> individual from all others in the context of a picture of the universe; an
> >> individual's identity is that set of properties that defines the individual.
> >> These are two different things.
>
> > "identity is that set of properties that defines the individual." -
> > what does that mean, if not the properties that distinguish it from
> > other entities of the same type in that universe?
>
> Does the chit holding the number you took at the pharmacy define you? I
> wouldn't think so, but it certainly identifies you, at least for as long as
> you're holding it.

To the chemist it might. To me, no it doesn't. Voila. Different contexts.

>
>
>
>
>
> >> > ------------------------------------------
> >> > I am shown a can of catfood from an identical batch of three. Its
> >> > only, single identifying feature is a number on it. I read it, and the
> >> > can is taken away. I am then shown a new can. Is it the same can? Does
> >> > it have the same identity? I read the number on it, which is
> >> > different. I conclude therefore, quite sensibly, it is a different can
> >> > to the first.
>
> >> > Unbeknownst to me someone had shuffled the can numbers up at random
> >> > after I'd read the first one. Even this mischievous soul himself has
> >> > no idea if the original can I was given ended up with the same number
> >> > on it as before (the shuffling was done blind). In fact /no one/ in the
> >> > world now knows.
>
> >> > Where does identity stand now?
> >> > ------------------------------------------
>
> >> Identity stands as it always did. What is different is identification.
>
> > If /no one/ can identify it any longer you still think the original can
> > has some god-given soul-number? Us, humans, give something an identity
> > - it doesn't just exist as a force of nature.
>
> I beg to differ. We, humans, identify something, or name something. If you
> see identical twins, wearing nothing but smiles, look away, and then look
> back, can you be certain that each is in the same position that they were
> the first time that you looked? I would venture to say that they each have
> identity regardless of their position, even if you can't tell them apart.

To themselves they have identity regardless of position, sure. To me there is just the twin on the left and the twin on the right. Whatever happens, without further information, their identities will always be left twin and right twin to me.

>
> >> This is a perfect example of why update is primitive and assignment isn't.
> >> An assignment replaces the current relation value with a new one, blindly,
> >> but an update specifies which tuples are different and how each differs,
>
> > An update replaces a relation with a new one with a delete, and then
> > replaces it again with another new relation with an insert. Just
> > because the system does some jiggery pokery to work out which
> > proposition to delete, and the content of the new one to be inserted
> > does not change this.
>
> That's Date's and Darwen's interpretation. They're entitled to their
> opinion, even if it's wrong.

As are you - but I'm afraid there is just no contention here. It is simply the mathematics of the model, and there is nothing to interpret because it's just an application of set theory. A set's contents cannot be "updated" - one set must be replaced by another. If x := {2} and later on x := {3}, the number 2 has not been updated. x has been given a new set value.
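
For what it's worth, the same point as a small Python sketch. It assumes relation values are modelled as immutable frozensets; the update helper and the can tuples are hypothetical names of my own, not anything from Date, Darwen or Codd.

```python
# Relation values modelled as immutable frozensets; all names are illustrative.
x = frozenset({2})
first_value = x          # keep a handle on the original set value
x = frozenset({3})       # "x := {3}": the variable now names a different set

# The set {2} itself is untouched; only the variable x changed.
unchanged = first_value == frozenset({2})


def update(relation, old_tuple, new_tuple):
    """An 'update' expressed as delete-then-insert, yielding a new relation value."""
    after_delete = relation - {old_tuple}
    return after_delete | {new_tuple}


cans = frozenset({("can", 1), ("can", 2), ("can", 3)})
cans_after = update(cans, ("can", 2), ("can", 7))
```

After the call, cans still holds its original three tuples and cans_after is a separate value. Nothing inside either set was modified; a variable would simply be pointed at the second one.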

> From what I've read, Codd didn't share it. In
> his book, RMv2, he defines relational assignment for variables in volatile
> memory, for temporary objects, as a tool to simplify complicated queries.
> As far as I know he never bought into Date's relvar notion for base
> relations.
> I also remember an interchange between Hugh Darwen and Fabian
> Pascal where they disagree about the notion of relvars. I'm not the only
> one that isn't sold.
>
>
>
> >> which has the same effect as observing each can of cat food throughout the
> >> interval from the first reading to the second. Obviously, if you were able
> >> to simultaneously observe each can, then there would be no doubt as to
> >> whether the new can is the original can.
>
> > In that (let us note different) situation there would be two different
> > definitions of the relative identity of a can. One would be its point
> > in space (which one observer has tracked) and the other its no. ID,
> > which someone else is observing. Two different constructs. Neither is
> > right, neither is wrong. Two things with their own distinct identity.
>
> > And given I'm sure you're not suggesting that someone should walk around
> > pointing at things continually to give them their identity, they must
> > have an observable property that does not change in the propositions
> > in which they feature, otherwise they simply could not be recognized
> > as the same thing in real life, never mind a database encoding.
>
> Something like identical twins?
>
> > I know this is a subtle, uncomfortable way of viewing things - but
> > I've managed to shake off the idea that something has an identity even
> > if no one is observing it, so I'm sure others can. There is nothing to
> > relative identity outside an objective human decision. Recognizing
> > this means good conceptual modelling, as making the right decision is
> > down to us. Regards, J.
>
> >> >> Identification is used by one individual to pick another out of a crowd,
> >> >> whereas identity is what one individual is. It may be that much of the
> >> >> confusion is caused by misinterpreting this simple distinction.
> >> >> Identification is the nominative form of the verb "to identify."
>
> >> >> Update is a primitive operation. It is not a shortcut--it cannot be a
> >> >> shortcut, because not all key values permanently identify individuals.

I'm not sure you are ready (or perhaps you are too entrenched) to get over the hurdle of viewing identity as subjective, and not omnipotently defined. I'm not sure I can explain it any better than I have attempted, so there I will gracefully retire, and let you have the last word if you so desire.

I hope there will be a time when you will be convinced, as it is something that now seems very clear to me, while also providing elegant theory and solid practical consequences for schema.

Received on Wed Aug 15 2007 - 20:14:56 CEST
