Re: Normalization, Natural Keys, Surrogate Keys

From: Pablo Sanchez <pablo_at_dev.null>
Date: Thu, 16 May 2002 17:10:37 -0600
Message-ID: <3ce43a9a$1_7_at_news.teranews.com>


"Tobin Harris" <comedyharris_at_hotmail.com> wrote in message news:ac1dmp$m5u4k$1_at_ID-135366.news.dfncis.de...
> > * we can efficiently join across columns (assuming a NUMBER(16)
in
> > Oracle or an INT in Sybase), we're looking at four bytes, and
> > * our implementation is independent of the business requirements
> > without losing the business requirments
> > * our index density is very high so index range scans/'covering an
> > index' is extremely efficient
> >
> > What's key (pun intended! <go>) is that surrogate keys and natural
> > keys can peacefully co-exist. Kumbaya, kumbaya ...
>
> I suppose there's 3 things I'm concerned about here:
>
> 1. By making the parent id a foreign key, it seems like you're
losing
> information in the design. More specifically, a child is related to
a
> parent, but not dependent on it for it's existance. So here, the
business
> requirements are not well comunicated through the design?

We don't lose that knowledge per se, it's just a bit indirect via the surrogate key. What I clearly failed to communicate is that we keep the natural keys but we don't use them as our primary key.

> 2. Everywhere I go people are trading off performance over
everything else.
> Ok, I admit I'm a little touchy on the subject, but 2 of your 3
points were
> directly related to performance. In my limited experience, I've seen
more
> projects fail due to 'understandability' than performance.

One should never trade off performance for business functionality. I'm totally in agreement with you. As I mention above, I hope clearer, we use surrogate keys as a bridge between the natural key and the actual implementation. Developers are free to discuss surrogate or natural keys.

> 3. Hymns in database discussion groups. Now that's definately
violating some
> integrity consraints!?

Wow, excellent come back! I won't go down the path of being database-agnostic.

> I'm keen to learn more, so how can your 'surrogate key' version
communicate
> the same thing as a composite primary key? I now have to examine the
> parent/buesiness rules to realise what it's 'real ' key is, since
you've
> replaced it with a surrogate key.

On its own, you cannot. But I don't believe I stated that you could. It's a level of abstraction that we as developers can easily map between. It's not that complicated. I'm working on an application that has 450+ tables with this concept implemented. Of course the number of tables means nothing but just trying to give an example that it does work.

> BTW, I'm not going to argue about the
> performance issues, although database vendors could probably create
their
> own surrogate keys internally to replace the composite key. Which
would
> remove the performance probs. (I'm thinking on the fly here, so be
gentle)

Some do (Oracle's ROWID) and some don't. Oracle's ROWID's are re-used though so it's not exactly as you're intending. It's definitely possible but at the end of the day, being that I want to control everything, I rather know about data placement than let the RDBMS handle it 'behind the scenes' for me.

Thx!

--
Pablo Sanchez, High-Performance Database Engineering
mailto:pablo_at_hpdbe.com
http://www.hpdbe.com
Available for short-term and long-term contracts
Received on Fri May 17 2002 - 01:10:37 CEST

Original text of this message