Re: candidate keys in abstract parent relations

From: David Cressey <dcressey_at_verizon.net>
Date: Mon, 23 Jan 2006 13:37:05 GMT
Message-ID: <5_4Bf.10306$mj3.8749_at_trndny06>


"Roy Hann" <specially_at_processed.almost.meat> wrote

> When we record a driver's licence number or an employee number, and name,
> all we are saying is that we believe that number was assigned to a person
> who will claim to have that name. Most of the time that also happens to
> work as identification for the person, but IMO that is an accidental
> side-effect of most people being honest. But I have a colleague who works
> on a criminal justice system, and they tie themselves in fantastic knots
> because they refuse to understand this.

It's a side effect of most people being honest, but by no means accidental. We have been keeping records since the dawn of recorded history (sic), and we have always relied on records to contain "truth". We use persistent, machine readable storage today, instead of clay tablets or papyrus scrolls, but the idea is the same: we record what we believe to be true, and we rely on records to be true.

Most information systems have some means of detecting error or deceit, but they are limited. For example, a driver's license usually has a photo on it.

The use of a driver's license number to identify a person is actually the use of a foreign key as if it were a primary key. The license number is an attribute of the license, not of the person. We are relying on the relationship between persons and drivers licenses to get away with this philosophical slieght of hand.

>
> I have never met a database designer yet, including myself, who doesn't
> struggle to remember that the database only has to assert what we are
told,
> NOT what is objectively true. It doesn't "matter" if the database
contains
> lies as long as it (and the application) doesn't invent or derive new
lies.
> (Obviously there is a burden on the person entering the data to be
diligent
> about establishing the truth as best they can before they enter it, but
> that's not a database design problem. Also obviously, it is nice to test
> the internal consistency of the assertions we record, if we can. And just
> as obviously there has to be a way to remove lies and all their false
> derivations when they are discovered.)
>
Two points:

Your post reminds me of the "database contents as opinion" concept.

Second, there are two ways in which value can be measured in the information industry: content and conveyance.

Considering the database as pure conveyer, what you say above is true. But if the database is valued for content, not merely conveyance, then the above comment is merely finger pointing.

Consider a dictionary, as a database of words and definitions. If a word is mispelled in the dictionary, the guy who runs the printing press can claim that it's not his fault. But somebody in the production of the dictionary is responsible.

> I am in danger of turning the conversation full-circle here by remarking
on
> how surrogate keys are helpful, so I will stop now. (The problem IMO is
how
> to discourage using them, not the opposite!)

Surrogate keys are a hammer. Not every design case is a nail. Received on Mon Jan 23 2006 - 14:37:05 CET

Original text of this message