Re: A pk is *both* a physical and a logical object.

From: JOG <jog_at_cs.nott.ac.uk>
Date: Wed, 25 Jul 2007 02:27:39 -0700
Message-ID: <1185355659.582994.59190_at_k79g2000hse.googlegroups.com>


On Jul 24, 8:27 am, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>
> news:1185211122.628701.219130_at_g4g2000hsf.googlegroups.com...
>
>
>
> > On Jul 23, 3:57 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> >> [snip]
> >> I don't miss the point: I dismiss it not only because it is a gross
> >> oversimplification based upon faulty assumptions but also because it is
> >> incorrect.
>
> > I have no faulty assumptions, but feel free to try and knock something
> > down specifically. It is good practice for me to defend a standpoint,
> > even a solid one. However, as ever, I only post because I think I
> > might be able to help.
>
> >> Although a key value identifies (or is a surrogate for) an
> >> individual in the Universe of Discourse, it is not correct to assume that
> >> the same key value identifies the same individual at every database value
> >> in
> >> which it appears.
> >> Here's proof: given a relation schema with two keys, one
> >> whose values rigidly designate individuals and one whose values represent
> >> non-rigid definite descriptions for individuals,
>
> > "non-rigid definite desciptions for individuals". This makes no sense,
> > but this is probably a symptom of the points below.
>
> >> there can be a tuple in
> >> each of two possible relation values that has the same value for the
> >> rigid
> >> key, but different values for the non-rigid key.
>
> > Forget the database - it starts way before then brian, back in real
> > life, when we are stating propositions about the item types of which
> > we speak. It is how we can permanently identify these items when we
> > talk about them that defines what they are. It has nothing to do with
> > keys, which just identify propositions - by the time you get to the RM
> > the mistake has already happened.
>
> "The right front tire on my car is going flat."
>
> "The tire with serial number BC324J5367 is going flat."

Ok, first let me say i totally understand what you are saying. Second let me add that we advocate the same solution of using a surrogate key to solve any ensuing problems, we just argue about /why/ we are doing it.

Let me consider the propistions you stated: Construct identified in the first proposition - "Right Front Tyre" Construct identified in the second proposition - "Tyre BC324j5637"

These are different things. "What", you no doubt cry, "of course they're the same bloody thing". Well, consider that I also tell the mechanic:

"The right front tyre on my car always goes flat"

What do I mean: The tyre that is on now, or whatever tyre I put there. It could be either. One bit of rubber, which I'm continually repairing, or new tyres that I keep changing. Its probably going to be the latter in this case, and the mechanic might check my suspension.

The current overlap I have with tyre BC324j5637 is lost. The two constructs are not the same thing over time, and hence should not be encoded in a database as such. The design should consider which is appropriate - the construct of a tyre with a code identifier, or the construct of a tyre with a position identifier.

Note that change over time is not the only example of this being an issue.

>
> Both of these statements state that a particular tire is going flat;
> however, the first depends upon the current state of affairs to identify the
> tire, whereas the second does not. The first employs a non-rigid definite
> description, and the second a rigid definite description, since the serial
> number is part of the physical makeup of the tire.
> Obviously, the first
> statement is what I would tell my mechanic, not the second. Who in their
> right mind would get out of the car and clean off all of the mud so that the
> tire can be examined to determine what the serial number is?

Well noone. We as humans can flick to different overlapping constructs with ease because of contextual and situational knowledge, in a way that differs from the needs of a formal database encoding.

> Now, I
> understand that as soon as the tire is replaced, or as soon as the tires are
> rotated, or as soon as I trade in my car, "the right front tire on my car"
> denotes a different tire.
>
> The point is that a description need not be rigid to be definite. It may
> also be the case that the serial number of the tire is not relevant to the
> problem at hand. Its absence from the database schema does not necessarily
> mean that it does not exist. It certainly doesn't mean that "the right
> front tire on my car" denotes the same tire in all possible states of
> affairs.
>
> >> Here each individual has
> >> two sets of identifying properties: one that identifies the individual at
> >> all possible states of affairs and one that identifies the individual at
> >> a
> >> particular state of affairs.
>
> > Utterly and totally incorrect. A construct that is one thing (i.e. has
> > the same identity) "over all its possible states" is a completely
> > different kettle of fish to an item which is only the "same thing in
> > one state". They are different constructs, with different identifiers
> > - even if they overlap.
>
> I beg to differ. The combination of a non-rigid definite description and
> the rest of the information about a particular state of affairs rigidly
> designates a particular individual. One alternative to using surrogates
> where keys can be updated is to record for each tuple in a separate set of
> attributes the key value that identified an individual when it was
> originally inserted along with the transaction time of the original insert.
> Since only one tuple can have a particular key value at a particular point
> in time, the original key value when combined with the original insertion
> time permanently identifies the individual.

Yes that would be fine in terms of uniqueness - There is a symmettric functional dependency between a tyre id and its position at a specified time.

However in the real world, the "position at a specified time" key would be impossible to determine if you just found a tyre on the floor. The attribute is not an adequate identifier because it requires historical knowledge that one might not possess. Identity occurs outside of the database.

> In this case the original key
> value represents the non-rigid definite description that identified the
> individual at the state of affairs denoted by the original insertion time.
> Once an individual is identified, everything else that is true about it can
> be determined, including other rigid designators.
>
> > (This explains how I am both the same person _and_ a different person
> > to the picture of me as a ten year old. The concept of a person type
> > has two different constructions, with different identifers. I can
> > quickly flick between the two interpretations in real life by
> > assessing the context of any particular question)
>
> > Which of these constructs is applicable depends on the context,
> > situation and application. If you have picked the wrong one when you
> > are designing a schema, well then, eventually your database will break
> > (as I know you have seen and remedied yourself in the past).
>
> > I think this is pretty much the whole sticking point so for now i have
> > snipped the rest.
>
> > Regards, Jim.
>
> >> [huge snip]
Received on Wed Jul 25 2007 - 11:27:39 CEST

Original text of this message