Re: A pk is *both* a physical and a logical object.

From: JOG <jog_at_cs.nott.ac.uk>
Date: Thu, 26 Jul 2007 03:14:32 -0700
Message-ID: <1185444872.730943.83600_at_o61g2000hsh.googlegroups.com>


On Jul 25, 10:32 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>
> news:1185355659.582994.59190_at_k79g2000hse.googlegroups.com...
>
> [snip]
>
> >> "The right front tire on my car is going flat."
>
> >> "The tire with serial number BC324J5367 is going flat."
>
> > Ok, first let me say i totally understand what you are saying. Second
> > let me add that we advocate the same solution of using a surrogate key
> > to solve any ensuing problems, we just argue about /why/ we are doing
> > it.
>
> > Let me consider the propistions you stated:
> > Construct identified in the first proposition - "Right Front Tyre"
> > Construct identified in the second proposition - "Tyre BC324j5637"
>
> > These are different things.
>
> No, these are not necessarily the same thing. "Different" is absolute, and
> incorrect because it may be the case that they are the same thing.

My bad, I shouldn't have used the word 'thing' as its so overloaded. I meant to say they are different constructs.

> You
> speak below of "constructs" overlapping, but that just doesn't make sense.
> Can there be two different "constructs" that occupy exactly the same space
> at exactly the same time?

Ok, yes, that's exactly what I think. Now I'm not crazy, and I do realise for a tyre there is obviously just one set of atoms there, but it is our definition of what a tyre /is/ (the construct) that overlaps. These construct-types are _defined_ by how we identify them (that being qualative identity not numerical identity of course) - and if that identifier changes then we end up with a different instance. And if this is unacceptable to our application then we picked the wrong construct!

If we are designing a database, and we pick the wrong construct, we then as a consequence pick the wrong identifier, which ends up as the wrong key to use, and eventually the database hits a problem when things change. Thats why I think its a design issue, rather than any flaw in the DB theory.

>
> > "What", you no doubt cry, "of course
> > they're the same bloody thing". Well, consider that I also tell the
> > mechanic:
>
> > "The right front tyre on my car always goes flat"
>
> > What do I mean: The tyre that is on now, or whatever tyre I put there.
> > It could be either. One bit of rubber, which I'm continually
> > repairing, or new tyres that I keep changing. Its probably going to be
> > the latter in this case, and the mechanic might check my suspension.
>
> Actually, in this case it is the tire that is currently on your car, since
> you didn't say,
>
> "The right front tire on my car has always gone flat."
>
> But I see what you mean.
> Clearly "The right front tire on my car has always
> gone flat." is ambiguous, because it could mean "Every tire that has been on
> the right front of my car has gone flat." or "The tire that is on the right
> front of my car has always gone flat." In either case, you can expect the
> tire that is on the right front of your car to go flat.
>
> This illustrates what happens when the only key on a relation schema permits
> updates. It can't be determined if a new individual is being selected, or
> if the state of the current individual is now different. The problem I have
> is with the assumption that it is always the case that a new individual is
> being selected. This implies that there is a requirement for all keys to be
> rigid, which is clearly not the case.
>
> > The current overlap I have with tyre BC324j5637 is lost. The two
> > constructs are not the same thing over time, and hence should not be
> > encoded in a database as such. The design should consider which is
> > appropriate - the construct of a tyre with a code identifier, or the
> > construct of a tyre with a position identifier.
>
> Are you saying, then, that whenever a key can be updated, there must also be
> another that can't?

Well, I'd say a key can never be updated. A proposition is removed and it is replaced by another, and there is no connection between them as far as the database is concerned.

All we can say is that two propositions have in common an item that they discuss. We can only recognize that item because of its identifier. And if the identifier changes there is no connection to be made. So we better bloody well pick the right identifer ;)

> That's certainly one way to solve the problem of
> identification across database values.

I'm just saying that the attribute that identifies the construct uniquely in the real world should be used to identify it in propositions (and that we can't rely on contextual and situational knowledge that we do in everyday conversation). I think in general database designers are pretty sloppy about such an important design points, and it results in a lot of broken databases.

I know you may not be interested but what the hell ;) - a guy called Peter Geach wrote a lot about this sort of thing (and he might be worth a google on relative identity), I just don't think its ever been applied to database design.

All best, Jim.

>
> > Note that change over time is not the only example of this being an
> > issue.
>
> >> Both of these statements state that a particular tire is going flat;
> >> however, the first depends upon the current state of affairs to identify
> >> the
> >> tire, whereas the second does not. The first employs a non-rigid
> >> definite
> >> description, and the second a rigid definite description, since the
> >> serial
> >> number is part of the physical makeup of the tire.
> >> Obviously, the first
> >> statement is what I would tell my mechanic, not the second. Who in their
> >> right mind would get out of the car and clean off all of the mud so that
> >> the
> >> tire can be examined to determine what the serial number is?
>
> > Well noone. We as humans can flick to different overlapping constructs
> > with ease because of contextual and situational knowledge, in a way
> > that differs from the needs of a formal database encoding.
>
> >> Now, I
> >> understand that as soon as the tire is replaced, or as soon as the tires
> >> are
> >> rotated, or as soon as I trade in my car, "the right front tire on my
> >> car"
> >> denotes a different tire.
>
> >> The point is that a description need not be rigid to be definite. It may
> >> also be the case that the serial number of the tire is not relevant to
> >> the
> >> problem at hand. Its absence from the database schema does not
> >> necessarily
> >> mean that it does not exist. It certainly doesn't mean that "the right
> >> front tire on my car" denotes the same tire in all possible states of
> >> affairs.
>
> >> >> Here each individual has
> >> >> two sets of identifying properties: one that identifies the individual
> >> >> at
> >> >> all possible states of affairs and one that identifies the individual
> >> >> at
> >> >> a
> >> >> particular state of affairs.
>
> >> > Utterly and totally incorrect. A construct that is one thing (i.e. has
> >> > the same identity) "over all its possible states" is a completely
> >> > different kettle of fish to an item which is only the "same thing in
> >> > one state". They are different constructs, with different identifiers
> >> > - even if they overlap.
>
> >> I beg to differ. The combination of a non-rigid definite description and
> >> the rest of the information about a particular state of affairs rigidly
> >> designates a particular individual. One alternative to using surrogates
> >> where keys can be updated is to record for each tuple in a separate set
> >> of
> >> attributes the key value that identified an individual when it was
> >> originally inserted along with the transaction time of the original
> >> insert.
> >> Since only one tuple can have a particular key value at a particular
> >> point
> >> in time, the original key value when combined with the original insertion
> >> time permanently identifies the individual.
>
> > Yes that would be fine in terms of uniqueness - There is a symmettric
> > functional dependency between a tyre id and its position at a
> > specified time.
>
> > However in the real world, the "position at a specified time" key
> > would be impossible to determine if you just found a tyre on the
> > floor. The attribute is not an adequate identifier because it requires
> > historical knowledge that one might not possess. Identity occurs
> > outside of the database.
>
> >> In this case the original key
> >> value represents the non-rigid definite description that identified the
> >> individual at the state of affairs denoted by the original insertion
> >> time.
> >> Once an individual is identified, everything else that is true about it
> >> can
> >> be determined, including other rigid designators.
>
> >> > (This explains how I am both the same person _and_ a different person
> >> > to the picture of me as a ten year old. The concept of a person type
> >> > has two different constructions, with different identifers. I can
> >> > quickly flick between the two interpretations in real life by
> >> > assessing the context of any particular question)
>
> >> > Which of these constructs is applicable depends on the context,
> >> > situation and application. If you have picked the wrong one when you
> >> > are designing a schema, well then, eventually your database will break
> >> > (as I know you have seen and remedied yourself in the past).
>
> >> > I think this is pretty much the whole sticking point so for now i have
> >> > snipped the rest.
>
> >> > Regards, Jim.
>
> >> >> [huge snip]
Received on Thu Jul 26 2007 - 12:14:32 CEST

Original text of this message