Re: Surrogate primary key plus unique constraint vs. natural primary key: data integrity?
Date: Sat, 31 Aug 2013 13:56:48 -0400
Message-Id: <20130831135648.145ff53b.jklowden_at_speakeasy.net>
On Sat, 31 Aug 2013 07:10:09 -0700 (PDT) karl.scheurer_at_o2online.de wrote:
> Referencing clients with natural keys vioalates referential integrity
> when companies change names. One of our customer recently changed
> from BHR to EHR.
The trading symbol "T" on the NYSE currently belongs to AT&T. The same was true in 1970. But those aren't the same company or the same T! During that same time, the company that was AT&T underwent many changes and had many names (and symbols) until it was acquired by one of its spinoffs, IIRC Cingular, which then opted to call itself AT&T.
Is that a reason not to use "exchange, symbol" as a natural key for the stock? It is not. On any one day, there is only one T on the NYSE. Permitting two in the database would make the model not reflect the real world events it was created to capture.
In your case, you had a customer BHR and now have a customer EHR. Same company, same address, same people, maybe. But the fact -- the real-world fact -- is that you received some orders from BHR and nowadays you're getting them from EHR. If you record those facts as they occur, you can't go wrong. If you don't, you can.
The name change might be incidental or important. Old orders might be handled by the new entity, or they might not. (The specifics of the BHR/EHR transaction don't matter. What matters is that some name changes are significant, and recording actual reality deals with both cases equally well.)
Let's say all your orders use the customer name as a FK. You can distinguish orders according to the name used. You can get the union of orders with the union of names used (probably relying on some kind of customer history table). The RI rules can require that each order carry an order date for a name that lies between the name's valid start and end dates.
There is no need to update all the history, or *any* of the history. Just record the facts, and be prepared for the fact that facts change over time.
Let's say instead you use CUSTID for the key, and just keep the name by reference. If you don't keep the history, you have no way to query by historical name, and no way to distinguish between orders received under new and old names. If you do keep the history, you have at best a fragile way: you can *infer* that orders received prior to the change used the old name, but that inference relies on that date being correct, something the DBMS cannot ensure. Changing that date changes history, intentionally or not.
Perhaps the course I'm recommending would be more work, perhaps not worthwhile, at least not yet. That's OK. But more work is not a failing of RI, nor is using natural keys a problem for it.
--jkl Received on Sat Aug 31 2013 - 19:56:48 CEST