Re: Newbie question

From: Jon Heggland <heggland_at_idi.ntnu.no>
Date: Mon, 20 Jun 2005 10:04:03 +0200
Message-ID: <MPG.1d2096e332422a4f9896a6_at_news.ntnu.no>


In article <392dnQc1p91Yay_fRVnyhA_at_pipex.net>, specially_at_processed.almost.meat says...
> > That is nonsense. Primary key != natural key.
>
> So a bogus counter that makes things trivially distinct is an acceptable
> key? Explain. I am keen to understand. Be sure to explain what I would be
> identifying (not distinguishing) with such a thing. Also explain why I
> wouldn't then just prefer a pointer instead.

Yes, a bogus counter (or surrogate key, which is my preferred term) is an acceptable key. A key is simply a set of attributes so that no two tuples have identical values for those attributes.

I'm not sure why you think surrogate keys are not acceptable---they are a staple of relational database design---but I'll assume you think they are useless in the presence of natural keys (which is the term for keys that come from the "real world"). Off the top of my head, the main advantages of surrogate keys vs. natural keys are that they are stable (I.e. they don't change), and they are never unknown. And of course: It's not always *possible* to find a reasonable natural key.

(But that does not mean that you shouldn't record any natural key(s) as well, and designate them as keys, too. The distinction between "primary" and "alternate" keys really *is* bogus.)

Anyway, many "real-world" keys are really surrogates---SSNs, account numbers, registration numbers, student numbers and so on. They are bogus counters (even if they have some internal structure (which in most cases is a bad idea)).

What you are identifying with a surrogate key is exactly the same as what you are identifying with another key: a tuple.

Note that the use of surrogate keys does not force you to present them in the user interface. Natural keys are often better for that, but they are not mutually exclusive in that regard.

As for pointers vs. surrogate keys: I think it is more reasonable to ask *you* why you would prefer pointers. We need the ability to compare values anyway, and this is sufficient for identifying "entities" and establishing "relationships" too. Why should we introduce another mechanism with additional operators in order to do what we already can? It serves no purpose; it increases complexity without adding value. Pointer chasing (while perhaps trivial to a computer professional) is significantly more difficult and error-prone than comparing values. And of course, pointers *point*. They are one-way, which means we need pairs of them. More useless complexity.

Pointers may be fast, but that is an implementation issue. Feel free to implement your database with pointers, but don't expose them at the logical level---it's a bad idea.

If you want to know more about these issues, I can recommend a few articles by C. J. Date, from his Relational Database Writings series:

Don't Encode Information into Primary Keys! Composite Keys
The Primacy of Primary Keys: An Investigation Object Identifiers vs. Relational Keys
Don't Mix Pointers and Relations!
Don't Mix Pointers and Relations---Please!

-- 
Jon
Received on Mon Jun 20 2005 - 10:04:03 CEST

Original text of this message