Re: Surrogate primary key plus unique constraint vs. natural primary key: data integrity?

From: Wolfgang Keller <feliphil_at_gmx.net>
Date: Sun, 25 Aug 2013 14:23:44 +0200
Message-ID: <20130825142344.546d4034c2d2d3ed65355c51_at_gmx.net>


> I am a engineer too and develop applications for work preparation in
> plant inspection and maintainance.

CMMS/EAM? I happen to be a reliability engineer, btw. and thus a user of such systems.

And the datamodel I work with is MIMOSA's CRIS, in case you happen to know that one. It uses composite primary keys well over ten columns in length for certain entities.

> In our field of work unique surrogate keys never violate referential
> integrity.

They inevitably will, depending on the database schema, as robur.6_at_gmail.com has explained. As soon as you have overlapping foreign key relationships. Which you will inevitably get sooner or later during the evolution of your data model, according to Murphy's law.

Meanwhile I've found out that half-decent database design textbooks mention the issue right under that label ("overlapping foreign keys").

> On the contrary we had to migrate from natural (composite) keys to
> Guids (surrogate keys with guaranteed uniqueness) for referential
> integrity in complex structured data (moving subsystems with all
> components from one system to another).

GUIDs (which are "unique" only within the virtual IT "world" and have absolutely no identification function in the real world) *must* fail in this case.

If you need to identify data relating to a specific real-world physical item (such as a compoent of a machine) throughout its entire lifespan (from forge to scrapyard...), across different IT systems from different vendors operated by different companies (original manufacturer, operator, maintenance workshop, sometimes several different owners over the lifespan...), a natural key is the only way to accomplish this.

And with technical components, which usually hold a serial number from the manufacturer, the "natural" key to use it just that: naturally obvious; manufacturer's key (can be a composite itself) + serial number.

That's what MIMOSA uses, btw. Although they had the problem that there seems to be no world-wide unique identification method for companies or other organisations.

> The advantage of surrogate keys is, they will be created, but never
> changed.

They will. See the "information logistics" chain over the lifespan of those physical items.

Or do you want to create a world where everything in the "real world" is identified by its "SAP number" (or "personal IDs" in Sweden and Norway)? I know the mess that results from that requirement in companies that use SAP. The users who have to deal with that *hate* it.

> Changing the identification of entities never affects the
> referential integrity between entities. Using natural keys for
> linking of entities results in massive updates.

In the technical world, a serialised component never changes its serial number during its entire lifespan. In fact this would be a crime in any half-civilised country, since it makes tracability impossible. Which is a legal requirement for safety-critical components.

Sincerely,

Wolfgang Received on Sun Aug 25 2013 - 14:23:44 CEST

Original text of this message