Re: Separate foreign keys with shared ID space

From: Howard J. Rogers <hjr_at_dizwell.com>
Date: Sun, 1 Aug 2004 10:01:17 +1000
Message-ID: <410c32c4$0$16107$afc38c87_at_news.optusnet.com.au>


"Erland Sommarskog" <esquel_at_sommarskog.se> wrote in message news:Xns9538D45EA661Yazorman_at_127.0.0.1...
> Howard J. Rogers (hjr_at_dizwell.com) writes:
> > The problem I have with synthetic primary keys is precisely that they
> > abstract themselves so far away from business logic and hence become
> > 'meaningless' that they cease to do the job they were intended to do. If
> > I insert a new record into the PERSONS table, using only a sequence
> > number generator to supply a new, unique ID for the row, I am pretty
> > well guaranteed to be able to insert the new record, even for a criminal
> > clone of Al Capone. If I use the SSN as a primary key field, however, it
> > is highly likely that my insert will (correctly) fail whenever someone
> > is trying to do a bit of identity theft.
>
> So you use SSN. Now you want to enter me in your database. What is my
> SSN?
I'm not entirely sure what point you're making. Either you have one, in which case I must know it, and then I can enter your details. Or you don't have one, in which case, I can't enter your details. (And I can't tell you the number of times that is exactly what has happened to me: "I need to see your driver's licence". "I don't have one". "Well, your passport then". "I've left it in the hotel". "Well, tough. I cannot process your application further without that information".) Unless the business rules are not as simple as that, and in fact I can ask for some other form of identification from you, and can use that/those as the primary key.

In other words, if obtaining and entering your SSN is not a problem, SSN is a good primary key candidate. If there's a problem with obtaining your SSN, or if there is some other potentital complexity associated with it, my primary key design needs to reflect that reality.

> > Sure, I could use a synthetic primary key, and then
> > add a unique constraint to a field such as SSN...
>
> At least then you have less problem when someone's SSN changes. And
> you can handle persons without SSNs. (That is, your constraint should
> be unique-when-NULL.)

But you are assuming that I would *want* to handle people without SSNs. I'll say up-front, I live in Australia, and I have no idea what the rules on US SSNs are. So let's try to keep it generic. The point is that, either my business rule states that two people *cannot* share SSNs (in which case, SSN is a good candidate for a primary key), or the business rules state some other point of uniqueness, and hence use that (or a combination of factors) as the primary key. But what you shouldn't be doing is claiming some profound weakness in how SSNs work, and from that claim that synthetic keys are the only way to go.

Again, suppose my business rules state: you can open an account for someone who cannot supply a US SSN. If that's my business rule, then SSN is a poor choice for a primary key, obviously. But if my business rule states "no SSN, no account", then your statement that "at least you can handle persons without SSNs" is meaningless... my company has chosen not to contemplate that possibilty, and the use of SSN as our primary key is therefore not a problem.

> You can also apply a partial validation rule. If the person is a Swedish
> tax payer, I may require that is person identifcation number is unique
> (because our customers reports to Swedosh tax authorities), but if he
> pays tax in Upper Fragonia I could care less what his identification
> numbers they may use there.

A multi-column primary key might be in order, then. Such as passport number, surname, date of birth, SSN if available etc etc etc. Point is, I could still construct a primary key from the available possibilities, dependent only on what my business requirements are, and still not resort to synthetic sequence numbers, which mean nothing and cannot prevent duplication on their own.

> > For me, a primary key should absolutely embody a business rule ("no two
> > people can share a SSN", for example),
>
> The problem is that the real-world business rule is different. Two
> persons *can* have the same person identification number.

Well, I think you're getting too specific and missing the actual point. If in the real world two people can share an identity number X, then X is not a candidate for a primary key. So find me something else that they *cannot* share. But do that extra thinking, please, before giving up and declaring only that a meaningless sequence number is the only answer.

What you are saying here is merely that the business rule is not described correctly or completely. In which case, fine: describe it further and in more refined detail. But doing so doesn't invalidate the whole idea of 'natural' primary keys.

>It can be
> because of systems that have goofed, or it can be because they live
> in different countries - but still lives in the same system.

It simply means the business rule gets more complicated. If I was designing a system that was to store US and British and Aussie citizens, I would have an 'ID' field that could accept the US SSN, the British National Insurance Number (assuming it still exists!!) and the Aussie Tax File number. As an example. I can call that field anything I want, but provided I make it a VARCHAR2, and make it suitably long, it can do duty for storing the requisite information from any of the three countries. Now you tell me that in the US, the SSN is not unique... fine, I believe you: so add something else to the primary key that would qualify it. When you finally report to me that, to guarantee uniqueness in the US I would have to include 32 other fields as part of the primary key, I might then think about letting you use a synthetic sequence number!

In short, it is not cut-and-dried one way or the other, nor without merit to discuss these things. It certainly isn't the brainless decision that the original quotation that started this part of the thread suggested it was.

Regards
HJR Received on Sun Aug 01 2004 - 02:01:17 CEST

Original text of this message