Re: Newbie question about db normalization theory: redundant keys OK?

From: JOG <jog_at_cs.nott.ac.uk>
Date: Wed, 19 Dec 2007 15:53:05 -0800 (PST)
Message-ID: <a5cca123-f9a1-496a-a6a4-ba43b1f1eccd_at_i12g2000prf.googlegroups.com>


On Dec 19, 7:12 pm, "David Cressey" <cresse..._at_verizon.net> wrote:
> "Bob Badour" <bbad..._at_pei.sympatico.ca> wrote in message
>
> news:476937e0$0$5290$9a566e8b_at_news.aliant.net...> David Cressey wrote:
>
> > > "Kevin Kirkpatrick" <kvnkrkpt..._at_gmail.com> wrote in message
>
> news:40957504-43df-453b-8e8e-396f66b2b1e7_at_r29g2000hsg.googlegroups.com...
>
>
>
>
>
> > >>On Dec 17, 9:57 am, "David Cressey" <cresse..._at_verizon.net> wrote:
>
> > >>When it comes to keys, it comes down to one question: is the
> > >>underlying attribute meaningful within the universe of discourse? If
> > >>it is (i.e. the clients know what it means, will use it in
> > >>conversation, want it in reports, specify it in queries, etc.), it is
> > >>a natural key and belongs in the database - regardless of whether it
> > >>is a phonetic name assigned at birth, a social security number
> > >>assigned upon citizenship, or an id generated and assigned by the
> > >>database itself. Otherwise, it is garbage (artificial, surrogate,
> > >>whatever you call it).
>
> > > Fair enough. If this is the accepted meaning of "natural", I can
> certainly
> > > use it in that way.
> > > And I'll agree that the difference between "assigned" like a name, or a
> > > social security number and "naturally occurring" like biometric data is
> > > largely moot from the point of view of database theory. Many of the
> > > discussions that get started in here start from the other point of view,
> > > but I can adapt.
>
> > > In addition to the disctinction you've made between "natural" and
> > > "surrogate", there is a third kind of key, one that doesn't always
> surface
> > > in these disucssions. It's a key that generated inside the information
> > > system but, unlike a surrogate key it isn't generated inside the databse
> > > layer. It's generated by some process in the application layer. And,
> in
> > > general, such keys are not hidden from the users.
>
> > > An example might be when you reserve a rental car at some airport on a
> given
> > > day. You are issued a "reservation number". This key is going to look
> like
> > > a "system generated key" to the clerk behind the auto rental desk, and
> to
> > > you, but it's going to look like a natural key from the point of view
> of
> > > the database layer.
>
> > > For purposes of database design, the case I've outline can be treated
> as
> > > yet another case of a natural key. But for purposes of information
> system
> > > design, it has to be treated differently.
>
> > A natural key is neither more nor less than a familiar surrogate. It
> > doesn't matter whether the surrogate originates in a person's head, in
> > an application layer, or in the bowels of a dbms. As soon as users
> > interact with it making it familiar, it becomes a natural key.
>
> Well put.
>
> There can, however, be confusion if different people construe the natural
> key to refer to a different entity, or to differing levels of abstraction.
> That confusion is harmless enough until two or more people start to use the
> new naturalized key to facilitate communication between or among them.
>
> The worst confusion that occurs here in c.d.t. regarding surrogate key is
> whether a surrogate key refers to the person, place, or thing, that the
> contents of some tuple describes, or whether it refers to a "row of a
> table".

I think part of the confusion comes from the use of keys in both E/R and RM, (when there really should be different terms, perpetuating the (understandable) difficulty people have with distinguishing conceptual and logical layers.

> In the second case, the surrogate key is indeed being used as a
> substitute for a pointer, and that's the start of a slippery slope. I
> don't think you, Bob, are affected by that confusion, except to the extent
> of plonking a few more people, but it does affect a lot of the discussion
> in c.d.t. nonetheless.

Let's see what people make of the following definitions. I'm going to distinguish the use of the term 'key' as used in different layers as:

  1. A 'Key' is a set of attributes, used at the logical layer to uniquely identify a proposition.
  2. An 'Identifier' is a set of attributes, used at the conceptual layer to uniquely identify an entity.

Any comments? Received on Thu Dec 20 2007 - 00:53:05 CET

Original text of this message