Re: Normalization, Natural Keys, Surrogate Keys

From: Daniel Morgan <dmorgan_at_exesolutions.com>
Date: Thu, 16 May 2002 16:28:53 GMT
Message-ID: <3CE3DE41.36582605_at_exesolutions.com>


Kai Ponte wrote:

> My state (CA) is involved in a multi-hundred-million dollar project
> with a consulting company to re-write one of our county mainframe
> systems into a client-server system. (I probably should leave the
> names out.) Anyway, we recieved this document which they called
> "detailed design" that was supposed to describe the application they
> are writing for us. Unfortunately, I firmly believe the company is
> blowing smoke and has sent us a bunch of ERD's, UML's and various
> descriptions of data theory. (There are no screen descriptions or
> workflow items.)
>
> The database being proposed for this applicaiton is Oracle. Fine
> enough. (I prefer MS SQL or MySQL, just because I'm used to them.) In
> one of the descriptions in the document, they indicate that they are
> creating a databse design which uses Natural Keys vs. Surrogate Keys
> because they "were necessary for the nomalization of the data model."
> The document went on to say, "natural keys are necessary to determine
> functional dependence and the efficient normalization of the the data
> model."
>
> Now, I may not be a database genius, but this doesn't sound right. It
> may have been years since my last database design class, but I don't
> remember anywhere where surrogate keys or natural keys had any impact
> on the ability of a database designer to normalize a database.
>
> To make matters worse, they later describe the use of a "code table"
> to store various codes. This is something I've often seen in IMS and
> various btrieve-based databases and I don't believe belongs anywhere
> in a relational model.
>
> Can somebody please comment on this? Am I totally wrong?
>
> KP

They are blowing smoke. How do I know? They are selling to a government agency that has far more money than knowledge. Government agencies, with few exceptions, make a practice of bad practices. That anyone is paying money to a vendor without knowing precisely what is being purchased is a nightmare that is played out every single day all over the world. Right this second a huge aerospace company is in the process of flushing hundreds of millions of dollars down the proverbial pipe trying to prove that somehow it is exempt from the rules of reality: It will, of course, painfully fail.

That being said Oracle is probably a far better choice the than products that you know (lets not start a flame war it is jump my personal opinion) but that doesn't mean the implementation of a system by this vendor won't be a pile of trash. You folks should bring in an OUTSIDE expert to evaluate their proposal. A couple of hundred thousand dollars could save tens of millions.

With respect to natural kes vs surrogate keys I go with natural keys every time. The problem with surrogate keys, especially in an environment such as yours is that they allow, by their very nature, duplicate data entry. Unless, of course, you wish to create a second set of constraints and indexes to enforce what the primary key would normally enforce.

I've no doubt the State of California will make another mistake. And also have little doubt that it will blow up in the press. And that Oracle Corp. will take the PR hit even though there is nothing wrong with the underlying database. It is sort of the equivalent of blaming the manufacturer of the hammers because the house fell down. You would think they would learn from their mistakes: But they don't do they?

Daniel Morgan Received on Thu May 16 2002 - 18:28:53 CEST

Original text of this message