Re: Surrogate primary key plus unique constraint vs. natural primary key: data integrity?

From: Roy Hann <specially_at_processed.almost.meat>
Date: Sat, 9 Mar 2013 14:52:27 +0000 (UTC)
Message-ID: <khfibb$j0m$1_at_speranza.aioe.org>


Wolfgang Keller wrote:

>> (a) there's no such thing as a primary key;
>
> PostgreSQL seems to think differently. >;->

Well like any product PostgreSQL will manifest the psychological quirks of its implementors. I'll take your word that PostgreSQL does this.

> It requires definition of a primary key afaik.

If it is made to insist that you must prefer one key over all others, even if you *really* don't want to, just because it suits someone sometimes, that would be a serious flaw IMO.

>> a key is a key or it's not. Preferring one candidate key over
>> another is purely psychological. There is no theoretical/logical
>> basis to introduce a surrogate primary key in the presence of a
>> natural key.
>
> Don't tell that to me. ;-)
>
> Tell that to those DB admins who refuse natural (composite) keys

I encounter those people frequently. Usually they refuse on the basis of some second-hand dogma about efficiency or performance, if they have any argument at all.

Provided a key is familiar, stable, unique and simple it is a good key to use. The desire for simplicity does not exclude composite keys. It excludes unweildy composite keys. The cut-off beyond which you judge a composite key to be unweildy is psychological, but we can probably all agree that more than four or five components is a nuisance. Two is definitely manageable. Three probably is too.

If a composite key leads to simpler, more comrehensible code and fewer joins to get required data, with no great loss of performance, the case for a composite is water-tight.

> and to the developers of certain DB application frameworks that
> don't allow composite primary keys.

Again, frameworks are products and they will manifest the mental tics and quirks of their developers. They will also be designed around the misconceptions of the developers, and since they are invariably programmers--usually OO programmers--they will usually be deeply ignorant of database technology, and proud of it.

Choosing a so-called "framework" (I'll pick on Hibernate by name) is the first step to completely botching the database design. You end up with an inefficient object repository in which all knowledge of the business is subordinated to knowledge of the objects.

No one gives a shit about objects except an object-oriented programmer. The end-users and the business CERTAINLY don't want to know about objects.

>> Nor, of course is there a basis for excluding a surrogate primary key
>> either. Use surrogates or don't, but don't pretend the surrogate key
>> improves your data integrity/quality because it doesn't.
>
> The surrogate key was not supposed to *improve* data integrity.
>
> The issue was whether a surrogate primary key plus a unique constraint
> on the natural key is *equivalent* concerning data integrity, especially
> relational integrity, to a natural primary key.

OK, I will admit I missed the point of the question.

Assuming the unique constraints are defined that's the job done. You can add any other columns you like to the table, including a surrogate key.

A surrogate key plus a unique constraint is not some much equivalent to using natural keys; the surrogate key is irrelevant. You can have it or not. Your choice.

Repeating an argument I made above, if a surrogate key leads to simpler, more comrehensible code and fewer joins to get required data, with no great loss of performance, the case for a surrogate is water-tight.

The point here being dogma isn't helpful. You have to think.

But I can be as dogmatic as the next man, and my dogma is use natural keys till you know they're a problem.

>> But why introduce surrogate keys at all?
>
> I don't want to open *that* can of worms, since for me the decision is
> taken anyway. By the requirement to federate/exchange data
> referring to the same "real world" item but coming from different
> sources. Which would require prohibitive amounts of handwork with
> surrogate keys, if it is possible at all.

I think you are arguing that using natural keys make federation not just easier, but more possible. I would agree, in general.

> I just wanted to know if there's a "killer" argument concerning
> data integrity *against* the approach to use a surrogate primary key
> plus a unique constraint.

Most DBAs and all programmers love writing code to solve problems. The more the better. So my killer argument--natural keys are usually way easier to work with--won't sway them one bit.

-- 
Roy
Received on Sat Mar 09 2013 - 15:52:27 CET

Original text of this message