Re: Normalization, Natural Keys, Surrogate Keys

From: Malcolm Dew-Jones <yf110_at_vtn1.victoria.tc.ca>
Date: 21 May 2002 15:41:09 -0800
Message-ID: <3ceacd05_at_news.victoria.tc.ca>


Kai Ponte (cybermusicdude_at_aol.com) wrote:
: My state (CA) is involved in a multi-hundred-million dollar project
: with a consulting company to re-write one of our county mainframe
: systems into a client-server system. (I probably should leave the
: names out.) Anyway, we recieved this document which they called
: "detailed design" that was supposed to describe the application they
: are writing for us. Unfortunately, I firmly believe the company is
: blowing smoke and has sent us a bunch of ERD's, UML's and various
: descriptions of data theory. (There are no screen descriptions or
: workflow items.)

: The database being proposed for this applicaiton is Oracle. Fine
: enough. (I prefer MS SQL or MySQL, just because I'm used to them.) In
: one of the descriptions in the document, they indicate that they are
: creating a databse design which uses Natural Keys vs. Surrogate Keys
: because they "were necessary for the nomalization of the data model."
: The document went on to say, "natural keys are necessary to determine
: functional dependence and the efficient normalization of the the data
: model."

: Now, I may not be a database genius, but this doesn't sound right. It
: may have been years since my last database design class, but I don't
: remember anywhere where surrogate keys or natural keys had any impact
: on the ability of a database designer to normalize a database.

$0.02

What they say sounds backwards to me, though I can't claim to be an expert on this.

When importing old data it may be necessary to create arbitrary keys because the existing keys are not reliable and it may be impractical to fix them (in time, or ever). When the data is brought into the new application then the data is assigned the arbitrary key so that work can continue ("arbitrary" is misleading, the new key usually tries to have some understandable connection to the data). New data is constrainted to be correct. The old data has errors, but they are never worse than they were in the existing application, and can be more readily fixed over time within the new application.

Either way, I think I would expect them to want to use surrogate keys, not natural keys. Received on Wed May 22 2002 - 01:41:09 CEST

Original text of this message