Re: HELP: Referential integrity or NOT ?

From: DanHW <danhw_at_aol.com>
Date: 2 Nov 1998 04:44:45 GMT
Message-ID: <19981101234445.29758.00002458_at_ng69.aol.com>


> have recently joined a team involved in a big project, and found myself
>really confused. Data model made by experienced specialists is designed
>deliberately with absence of referential integrity.
>Tables in the model are designed as in the example:
> CUST
> cust.id ---------------------- part of the primary key
> cust.valid_date_from ----- part of the primary key
> cust.valid_date_to
> cust.etc...
>
>A table with reference to CUST is as:
> INVOICE
> invoice.id ---------- primary key
> invoice.date
> invoice.custid -------- foreign key refering cust.id ( note: no
>reference
> to cust.valid_date_from ! )
> ivoice.etc...
>
>
>When I spotted that part of CUST primary key didn't migrate toward INVOICE
>entity I started asking questions about referential integrity and got
>answers:
> It is done to keep tracking of historical changes of data(?). All
>queries
>will be made:
> SELECT cust.id,......
> FROM CUST, INVOICE
> WHERE cust.id = invoice.custid
> AND invoice.date BETWEEN cust.valid_date_from AND
> cust.valid_date_to;
>
>Since 'date_valid_from' and 'date_valid_to' are to be maintained manually
>by the user I believe that there is a great chance the integrity wiil be
>easily corrupted. My project leader claims that this model is easy to
>maintain, all tasks can be easily performed, it will cause no problem....
> All my examples of what problems might occur were neglected. I am simply
>outvoted.
>

Just to add my 2 cents worth... (I just finished building a system with the some of the same structure as you have in your system - primarily time-dependent thing that are 'masters'.) I will fill you in on some of the logic I followed, and explain where at least one potential problem may arise in your system.

I created an ID (based on sequence) that the user never saw, but was used for all the joins. I do not like multi-part PKs - joins to other tables then require both components to get there. In addition, usually the 2nd part is a date component. That means if the date changes, now your keys are all messed up. In this case, the CUST_ID is the unique number that really identifies the customer, once and for all. The time-dependent aspects where kept in another table, with valid date ranges.

In these tables, there was a PK, also based on a sequence (because they we used by other tables also). However, there was the business rule that only one [name] could be valid at one time. You cannot do this check directly though constraints or indexes; you need a After-statement trigger to check for this condition. However, you can get PART of the solution by putting a Unique key on CUST_ID and VALID_START_DATE. That will prevent 2 with the same start dates, but will not prevent overlapping dates. That is where your problem might arise.

Ask your DB specialist what is supposed to happen if this is entered (maybe there is a mechanism to prevent it...):

User enters CUST_ID #1, valid dates 2/1/98 - 2/28/98, and an invoice for them dated 2/15/98.
Later, CUST_ID#1 has a new record added, with a valid date range of 1/1./98 to 12/31/98. based on what you have described, this is allowed. If someone pulls up the invoice for 2/15/98, the join will return BOTH CUST_ID records, which is probably not the desired result. More importantly, if queries are written to expect a single row, you may get the error 'single-row query returns multiple rows'.

If you present this to your specialist, be tactful - I don't want you to get in trouble; but just get the system done correctly.

Hope that helps...

Dan Hekimian-Williams
Hightouch Technologies Received on Mon Nov 02 1998 - 05:44:45 CET

Original text of this message