Re: Normalization, Natural Keys, Surrogate Keys
Date: Tue, 21 May 2002 14:37:18 +0100
In message <usn4mxfa2.fsf_at_rcn.com>, Galen Boyer <galendboyer_at_yahoo.com> writes
>> If you allow this to happen then your data has been corrupted, and
>> corrupted in a way that's not immediately obvious.
>What happens if a customer id is email address? There are apps around
>where you can't get a natural, non-changing primary key for the
>customer. But, a completely logical identifier for, say, an online app
>could be an email address, or maybe even a credit card. But, this
>identifier can change, or they could have more than one. What do you do
You find out who decided to use e-mail address as a unique identifier and tell them to fix it before they leave. I have multiple entries in several on-line databases because I use multiple e-mail addresses. This is fine for me as long as I can remember which address I used where and when.
A common fix is to have the customer generate their own identifier as a login ID and just check to make sure that it is unique within your own database. You can't prevent a customer from using multiple IDs if they choose, so don't bother trying. You might de-duplicate the data later. There are companies that exist to do nothing else but de-duplicate lists of names.
>Or, is the whole surrogate/natural argument centered around creating
>keys for data that already has "natural" keys and has nothing to do with
We've been through this one before in comp.databases.theory and I
proposed that we use latitude, longitude, altitude and birth-time as a
natural key for people. Anything else is a surrogate. We can't usually
get a real natural key for people so surrogates are all we have.
Other types of data have natural keys and I would strongly advise using
them wherever possible. I know that there are people who always use
surrogate keys even when they have natural keys available. I hope I
don't have to use any of the databases they designed.
Other types of data have natural keys and I would strongly advise using them wherever possible. I know that there are people who always use surrogate keys even when they have natural keys available. I hope I don't have to use any of the databases they designed.
true natural keys guaranteed by the laws of physics.
surrogate keys that I generate and therefore control.
surrogate keys issued by *trusted* third-parties.
Anything else shouldn't be completely trusted and perhaps shouldn't have a unique index in a database. If you choose to make the SSN (or any other third-party surrogate) a unique field in the database then you must have a procedure in place to handle duplicate values.
-- Bernard Peek bap_at_shrdlu.com In search of cognoscentiReceived on Tue May 21 2002 - 15:37:18 CEST