Re: Surrogate primary key plus unique constraint vs. natural primary key: data integrity?

From: Jan Hidders <hidders_at_gmail.com>
Date: Mon, 11 Mar 2013 14:38:50 -0700 (PDT)
Message-ID: <fed4982d-e425-4c1d-bc6b-93c6f912a61e_at_googlegroups.com>


Op maandag 11 maart 2013 19:53:56 UTC+1 schreef Wolfgang Keller het volgende:
> > For a single relation, obviously there is no difference. But at the
>
> > database level there is at least one case - composite overlapping
>
> > keys
>
>
>
> I have lots of those in the schema that I use. The entities are in
>
> fact arranged into a deeply nested hierarchy (which corresponds very
>
> well to the semantics of the application domain), where the (natural)
>
> keys of "child" entities include foreign keys to (often several
>
> different) "parent" entities. With lots of overlapping.
>
>
>
> Due to the level of depth of the nesting, primary keys, especially for
>
> n:m(:o:...) tables can be long - I stopped counting at 15 columns or so.
>
>
>
> > - when a surrogate key design can lead to anomalies if not
>
> > supplemented by other restrictions.
>
>
>
> That's exactly the kind of application case I had in mind.
>
>
>
> Lots of tables with lots of foreign key constraints, often several of
>
> them in one primary key of a "child" entity.
>
>
>
> > Maybe is not the best example, but let’s take the following database
>
> > (with the obvious foreign keys):
>
> >
>
> > clients {client}
>
> > client_orders {client, order}
>
> > client_locations {client, location}
>
> >
>
> > deliveries {delivery_no, client, order, location}
>
> >
>
> > Now add surrogate keys to all relations, and replace the last one
>
> > with:
>
> >
>
> > deliveries {delivery_no, order_id, location_id}
>
> >
>
> > You may see that client attribute is no longer needed since it can be
>
> > determined from either order_id or location_id. However this design
>
> > no longer enforce the rule that the order and the location should
>
> > refer to the same client.
>
>
>
> Thanks a *lot* for that one. And even if you do include the client_id in
>
> deliveries (as the standard suggestion would probably be), that still
>
> won't enforce the consistency here. And I wouldn't even *want to* try
>
> and investigate what constraints I would need to enforce integrity in my
>
> case, given the shear number of entities and the complexity of the
>
> dependencies between the primary/foreign keys.

Not so fast. :-)

Are your hierarchies perhaps inheritance or part-of hierarchies? In that case there is probably still no problem. It's not so much the overlapping that creates the problem, but the partial overlapping. Pretty much anything that you can model in EER models is not really a problem under that type of mapping transformation.

The example shows indeed an important insight, namely that this particular way to transform your schema does not work. However, it is certainly possible to map it to an equivalent relational schema while only using keys and common foreign keys over surrogate identifiers. For example:

clients {id, client}
  key {id}
  key {client}

orders{id, order}
  key {id}
  key {order}

client_orders {id, client-id, order-id}
  key {id}
  key {client-id, order-id}
  fk (client-id) -> clients(client-id)
  fk (order-id) -> orders(order-id)

locations{id, location}
  key {id}
  key {location}

client_locations {id, client-id, location-id}   key {id}
  key {client-id, location-id}
  fk (client-id) -> clients(id)
  fk (location-id) -> locations(id)

deliveries {id, delivery_no, client-id, order-id, location-id}   key {id}
  key {delivery_no}
  fk (client-id, order-id) -> client_orders(client-id, order-id)   fk (client-id, location-id) -> client_locations(client-id, location-id)

I'm not saying it's pretty, just possible. ;-)

  • Jan Hidders
Received on Mon Mar 11 2013 - 22:38:50 CET

Original text of this message