Re: Choosing a surrogate key vs. a composite key - "holy war" issue or clear best practice?

From: Mark D Powell <Mark.Powell_at_eds.com>
Date: Sat, 28 Jun 2008 07:11:13 -0700 (PDT)
Message-ID: <72803f7d-1bf2-4f9a-890e-18110bd4e238@c58g2000hsc.googlegroups.com>


On Jun 28, 9:15 am, Frank van Bortel <frank.van.bor..._at_gmail.com> wrote:
> dana..._at_yahoo.com wrote:
> > Early in my career, I was accustomed to seeing simple, sequence-
> > generated surrogate keys. Seemed clean and logical. Later in my
>
> They are not. They're called technical keys for a reason. They
> are invented by technocrats.
>
> > career, I saw seemingly monstrous (to me) composite keys of up to 5
> > fields. Found them initially repulsive because they made queries
> > difficult to read and write. I'm wondering if there is a clear best
> > practice on choosing a surrogate key over a composite key--or if this
> > is a "holy war" issue no one (collectively) currently agrees on.
>
> > Some pros to Composite primary keys:
> > --------------------------------------------------------
>
> > * They enforce uniqueness; meaning you could have a sequence-generated
> > surrogate key, yet redundant rows may still creep into your table.
> > They will be uniquely identifiable in that, yes, the surrogate key
> > values are unique; but all other column values in > 1 rows could be
> > identical.
>
> You do not seem to know the difference between Primay and Unique keys
>
>
>
> > * Simpler in the aspect of not having to create, manage, and have
> > programmers reference a surrogate key sequence
>
> Yeah - let's keep it simple. Lets call all PK "ID".
>
>
>
> > * Can't think of any other Pros to composite keys. Anyone?
>
> They are logical.
> Because of that, they are self-documenting.
>
>
>
> > Some cons to Composite primary keys:
> > ---------------------------------------------------------
>
> > * Composite PKs with a large # of columns make queries difficult to
> > read and write
>
> Well, learn how to read and write!
>
>
>
> > * Composite PKs with a large # of columns makes referencing tables
> > larger; e.g. a foreign key in a child table is a copy of a composite
> > PK in a parent table. This means more data storage is required in
> > referencing child tables and the burdens that presents.
>
> Storage - who cares? I don't, but I know some employers
> do. They just cannot understand many cheap disks will be as fast
> as many expensive "enterprise-class" disks. No - those disks
> rotate faster, so the "have less latency" - yeah, and so what?
>
>
>
> > * Some applications, like ESRI's ArcGIS Desktop, seem not to be
> > composite key aware, e.g. to those who know the product, you seemingly
> > can't do a "relate" to any existing external tables that use composite
> > keys; this makes database integration with legacy relational and
> > current relational databases, difficult. This may be the case
> > generally in OO programming and approaches and not at all unique to
> > ESRI--the presumption that all primary keys are only ever composed of
> > a single-column.
>
> ESRI and Oracle is not a match, made in heaven. I understand they
> are in therapy, so who knows...
>
> > Some pros to surrogate primary keys:
> > -------------------------------------------------------
>
> > * Aside from the obvious ones, given the cons I've listed for
> > composite keys, it seems you can have a surrogate key yet still
> > enforce more than nominal row uniqueness by creating a unique
> > constraint/index on the fields you'd otherwise.
>
> > * Sometimes there's no good combination of keys for making a composite
> > primary key
>
> Then, redo your (home)work, and apply analysis.
>
>
>
> > What have I missed in this debate?
>
> There is no debate. You're mumbling to yourself, and throwing
> flamebait. This seems to happen every now and them - searching the
> archives is not difficult, but hey, getting your name out  is
> far more important.
> [snip]
>
> --
>
> Regards,
> Frank van Bortel

A well designed system will likely have a mix of natural and surrogate keys. Never use a surrogate key where a suitable natural key exists. The practice of tacking a surrogate key on every table complicates SQL more than multi-column PK keys. You still end up with multi-column keys from inheritance when you attach a surrogate key to every table plus you often have to create indexes on the natural key anyway since the natural key value is what the customer uses to search the data.

Worse while the FK will enforce that no child rows exist without a valid parent key value the FK is usually useless for enforcing business rules involving data integrity. So if you did not have to create an index on the natural key for access reasons you end up creating unique indexes to enforce rules like no two rows will exist with the same value combination in col-C and col-D.

Even in the case where the natural key value might change it is usually fairly easy to handle the requirement by just duplicating the existing key row data with a new PK, updating all existing child rows, and then deleting the old PK row. We have discovered that using surrogate keys does not always eliminate the need to update child data where a PK value changes where corporate mergers, distributors buying dealers, and similar activities exist. In fact updating or recognizing the change with historical data is often easier where natural keys exist within the data rather than surrogate keys especially if the FK relationship was not defined to the database.

There are pro’s and con’s to each approach. The use of intelligence in making the key selections rather than relying on dogma will result in a better design.

IMHO -- Mark D Powell -- Received on Sat Jun 28 2008 - 09:11:13 CDT

Original text of this message