Re: Help finding natural keys

From: Bernard Peek <bap_at_shrdlu.com>
Date: Mon, 20 Jan 2003 12:01:25 +0000
Message-ID: <3K3QGa2VU+K+Ew3G_at_diamond9.demon.co.uk>


In message <OniW9.10233$I55.385455_at_news2.east.cox.net>, Alan Gutierrez <ajglist_at_izzy.net> writes

>--CELKO-- wrote:
>
>AG>>> In the real world, patients are identified by name by someone who
>> is responsible for a dozen or so patients. <<
>>
>> Maybe the name and the name of the responsible person form a key?
>>
>>>> These health care organizations use questionnaires designed by the nurses,
>>>> usually in Word ... questionnaire is a model of a blank questionnaire, it
>>>> is meta data, a completed questionnaire is an instance of the
>>>> questionnaire ... a questionnaire can have sub-questionnaires ... <<
>
>> Arrrgh!!
>
>That's what I needed to hear...
>
>> This is looking like more of a problem than we can do in a newsgroup
>> thread ...
>
>...I wanted to make sure that it wasn't as simple as it sounded in some of
>these debates.

Each newsgroup posting deals with small points in isolation from all of the others.

>
>In the archives I'd read that the lack of well defined natural keys indicates
>a weak model. This was the cause of much hand-wringing on my part. It seems
>that this is not a hard and fast rule. It is reassuring that people are asking
>me the same questions I asked myself to determine natural keys, and that the
>answers are not forthcoming. SSN, for example, has even more caveats than I'd
>imagined. Illegal immigration? My goodness. Better to learn that here than in
>production.

You are facing the same problem that many different organisations have faced, how to identify a person. The SSN is the government's attempt to fix that. They assign an SSN to everyone they have to deal with. If they discover a duplication they can change someone's SSN. If you discover a duplicate SSN you have to live with it.

The alternative is for you to issue your own codes, call them Case Numbers. You have to make sure that your case number gets assigned to one person for the duration of their stay in hospital, and that there is no way that the Case Number can be confused with any other person, even if there is another patient with the same name (or even one with the same name and date of birth).
>
>That the application is to be shared by dozens of small organizations makes
>the choice of natural keys that much more difficult. Some organizations will
>naturally identify a patient by first name, last name, and to break a tie,
>room number. Home health care organizations won't have rooms however. In
>emergency situations, the name might not be known at all.
>
>I better understand the costs of a surrogate key (maintianence, an extra
>index, etc.), and I am prepared to pay them. I will reconsider how these
>surrogate keys are generated.

I've been looking for an excuse to post some thoughts about surrogate keys. This looks like a good time.

When a table has a real and surrogate key it is not fully normalised. That is the surrogate key is (you hope) completely dependent on the real key. But that isn't always certain. There are ways that dependency can be broken. When that happens your database is broken.

Insert Anomaly: A new record can be inserted that duplicates all of the data from an existing record. But it will have a new surrogate key. The system will duplicate data.

Update Anomaly: The data for the real key is changed, but without changing the surrogate. As a result the data in associated tables linked by the surrogate key now refers to the wrong person.

Neither of these situations can be detected from within the database, they need additional processing systems to detect the anomalies, or the risk of anomalies. There may have to be manual processes or additional programs. This additional processing makes the system as a whole less robust. That is my main objection to using surrogate keys unnecessarily.

-- 
Bernard Peek
bap_at_shrdlu.com
www.diversebooks.com: SF & Computing book reviews and more.....

In search of cognoscenti
Received on Mon Jan 20 2003 - 13:01:25 CET

Original text of this message