Re: Surrogate Key vs Production Key

From: Noons <wizofoz2k_at_yahoo.com.au>
Date: 17 Oct 2004 09:15:13 -0700
Message-ID: <73e20c6c.0410170815.6b7876e6@posting.google.com>

jcelko212_at_earthlink.net (--CELKO--) wrote in message news:<18c7b3c2.0410141854.62dc56c8_at_posting.google.com>...

> I hate to tell you this, but even if you did not use them as keys, the
> industry standard codes will have to be maintained and updated. But
> if you used an exposed physical locator, then you will have to
> co-ordinate the locator and the new values.

Sure. The difference if I use them as natural keys is that I will also have to change them EVERYWHERE they are part of a FK, compound or not. Whereas if I use them as a UK and I use surrogate keys for my PK/FK tree, then I change them ONCE. If this is done by the db or app code is immaterial, it must be done and can represent a serious overhead. Of course: not in all cases.

>
> In a good SQL engine I have an ALTER DOMAIN statement and ON UPDATE
> CASCADE to a lot of my work.

Yeah, but most of us have to work with Oracle. Or worse.... :(

> I am trying to figure where that idea came from. The main
> characteristic of a key is that it is unique and invariant anywhere it
> appears. Then there is familarity and some others I have forgotten
> off the top of my head.

Let me see if I can make my point better. As soon as you imply a structure to a column, it stopped being based on an attribute. An attribute should have no implicit structure. Otherwise it is a type, not an attribute. A (simplistic) analogy is: think of atoms and molecules. Keys should be based on attributes, not on types. Or else first time someone alters a method of that type, there is the potential for all sorts of things breaking in that database.

> A data element which can be verified and validated leads to data
> quality. It is a solution, not a problem. Do you really think that,
> say, ISBNs would be better without a check digit? If they did not
> have the publisher codes in them? Would books be easier to locate?
> Would data entry errors increase or decrease if they were a random
> number of varying length?

If on entering ISBNs a check digit is not used, then there is no guarantee whatsoever the data will be good. And that goes for ANY pre-codified data. REGARDLESS of it ever being used as a natural key. Or a surrogate key being used instead. Keys have nothing to do with data validation.

The two things are completely removed from one another. I still have a problem with ISBN having its check digit stored in the same attribute, although I do accept it is common practice. But that brings in the subject of "what is a type" and type/subtype support and it has nothing to do with RI. Let's not go there now.

Once again: a surrogate key is NOT to be used for human-interaction searches or assigned a meaning. Its function is to be used as a PK/FK. That's it, nothing more. If you want to search on ISBN then you store the ISBN, index it and search on it, period. That has nothing to do with referential integrity. Searches are NOT the same as RI. Neither is data validation.

While I'm here: your new book on trees and graphs looks very interesting. I'll check it out once it arrives down under. It's a subject very close to me, I've had some success in using tree queries to pull out possible paths between two travel points: a common graph traversal problem. Received on Sun Oct 17 2004 - 11:15:13 CDT