Re: Natural keys vs Aritficial Keys

From: Brian Selzer <>
Date: Sat, 16 May 2009 09:36:36 -0400
Message-ID: <FjzPl.15750$>

"Walter Mitty" <> wrote in message news:7ahPl.1492$
> I'm confused about some recent trends in database design.
> When I learned databases, about a quarter of a century ago, the preferred
> practice was to use natural keys to refer to individual table rows, unless
> there's some real good reason to go with an artificial key. I've run into
> a
> few cases where I chose to use an artificial key, but most of the time
> I've
> used a natural key. (I should clarify: if a university's registration
> office assigns each student a StudentID, before entering that student
> into
> the database, I'm treating that key as "natural" in the context of the
> database, even though one might argue that it's "artificial" in some other
> context.) This generally surfaced in the choice of one or more columns to
> be declared as a PRIMARY KEY.
> Every time the issue of natural keys comes up in, the
> prevailing view seems to be that the best primary keys are artificial and
> opaque. And responses that take this view get the votes. They emphasize
> efficiency (mainly efficiency in the index that you usually get
> automatically when you declare a PRIMARY KEY). But data integrity seems
> to
> be forgotten, here. The same is not generally tru when the conversation
> turns to referential integrity. Slowly but surely the programming
> community
> seems to heve been dragged, kicking and screaming, into turning over
> referential integrity enforcement to the DBMS.
> What I don't get is whether I've understood something that today's
> trendsetters never learned or whether they are taking into consideration
> matters that I never considered important, back in the day. I've read
> their
> arguments carefully, and it seems to me that failure to declare UNIQUE and
> NOT NULL constraints on the most widely used candidate keys is simply an
> invitation to degradation of data integrity. PRIMARY KEY is a convenient
> way to get those two constraints with one declaration.
> Is data integrity less important than it used to be? Are most people
> building databases that get embedded in some appplication to be sold to
> the
> non technical buyer? Is something else going on that I'm unaware of? Can
> some body explain to me what these people are thinking?

I'm going to use a term that provokes derision from the all-natural crowd: rigidity. If a natural key rigidly designates or describes something--that is, if it permanently identifies somthing in the universe, then there is no /logical/ reason to add an artificial key. But not every instance of a key is a permanent identifier: consider a table where the entire heading is the key. Do you permit updates to such a table? If updates are allowed, then what does it "mean" when one occurs? Did what was represented in the database merely change in appearance, or was it replaced by something else? And if it were replaced, then wouldn't it have been better to issue a delete followed by an insert instead? Wouldn't that more closely represent what actually occurred?

The delay between the time that a row is read and the time that an update occurs, and the possibility that a change initiated by another user can occur during that interval justifies the introduction of an artificial or surrogate key, but only in the event that instances of the natural key do not rigidly designate or describe something in the universe. Received on Sat May 16 2009 - 15:36:36 CEST

Original text of this message