Re: Normalization, Natural Keys, Surrogate Keys

From: Daniel Morgan <dmorgan_at_exesolutions.com>
Date: Fri, 17 May 2002 22:28:09 GMT
Message-ID: <3CE583F3.7F5D20D5_at_exesolutions.com>


Paul Tiseo wrote:

> In article <3ce567e8_3_at_news.teranews.com>, pablo_at_dev.null says...
> > Hi Ed,
> >
> > Seems like the issues that you're seeing could have been caused
> > regardless of whether surrogate keys were used or not. Seems like
> > what should have been done are the following items:
> >
> > * UNIQUE index on the natural key
> > * Implement RI to ensure no orphaned rows and ensure valid
> > relationships
> >
> > Perhaps I'm missing something obvious.
>
> Most people who are suspicious of natural keys are suspicious of
> them for the wrong reasons, as you illustrate above. They don't imply
> any integrity hardship. With the proper constraints and RI, the
> surrogate key stands in for the natural key, hence the name. If you
> can't get the proper controls in on the surrogate, then they can be
> trouble wrt to integrity of data.
>
> This said, the only thing I see wrong with surrogate keys is that
> they can cause you extra work. You must, in some cases, think about the
> "properness" of both the surrogate and the natural, alternate key. You
> now have two columns instead of one, although the impact these days on
> modern DBMSes should be negligeable. Furthermore, it requires additional
> rules on the physical implementation which could hamper performance or
> which might not be doable depending on the engine used. (PK on the
> surrogate and unique index on the complex natural key rather than just a
> single PK on a single complex natural.)
>
> Plus, I don't understand how someone can go ballistic over
> creating surrogate keys and then turn around and use an SSN or an
> EmployeeID as a PK, for example. These are basically someone else's
> surrogate key, yet are commonly used as PKs.
>
> (Any opinions expressed are strictly mine only and not my employer's)
> --------------------------------------------------------------------
> Paul Tiseo, Intermediate Systems Programmer
> Reply To: 123tiseo.paul_at_123mayo.edu (drop the numbers)

The reason some people go ballistic is as follows:

CREATE TABLE employees (

   surrogate_key NUMBER,

   ssn                    VARCHAR2(11));

ALTER TABLE employees
ADD CONSTRAINT pk_employees PRIMARY KEY (surrogate_key);

CREATE SEQUENCE seq_surrogate_key START WITH 1;

INSERT INTO employees
VALUES
(seq_surrogate_key.NEXTVAL '555-55-5555');

INSERT INTO employees
VALUES
(seq_surrogate_key.NEXTVAL '555-55-5555');

COMMIT; It happens all the time.

If I didn't get paid so darned much money dumping surrogate keys and cleaning up the messes I'd actually be upset. And not once has there been a resulting performance problem that warranted recreating a surrogate key and instituting a fake primary key using a unique index with a not null constraint.

The only justification I can see for surrogate keys is for performance. And in my experience 90% of performance issues relate to SQL that has never been Explain Planned running on systems under CBO without statistics where the last time someone looked at an alert log was ... well they can't remember. And I am sure this doesn't apply to any of the systems worked on by people here.

Daniel Morgan Received on Sat May 18 2002 - 00:28:09 CEST

Original text of this message