Re: 4 the FAQ: Are Commercial DBMS Truly Relational?

From: Kenneth Downs <firstinit.lastname_at_lastnameplusfam.net>
Date: Sun, 10 Oct 2004 13:49:42 -0400
Message-ID: <nnsbkc.0h8.ln_at_mercury.downsfam.net>


Christopher Browne wrote:

>
> If you have two employee 'base' records that are entirely identical
> save for some 'sequence' ID, you've got an _application_ problem,
> because those "two employees" are clones of one another:
>
> - These two employees have the same employee ID
> - They have the same SSN/SIN number
> - They share references to the same spouse, the same children,
> the same manager, the same set of employee benefits.
>
> The fact that these "clones" have two different "updated on"
> timestamps does nothing to help the "application problems" that fall
> out of non-uniqueness.

Let me drift OT from the OP to the question of the relationship between theory and practice. It seems that the example you give brings us to the conclusion that the RDM, while useful for ensuring some kinds of correctness, it not sufficient to completely guarantee correctness.

You give the example of data that is truly relational (all columns unique), but is in fact wrong. Yes us suppose that the bogus timestamp column is the culprit, leading us into false security and subsequent error, and so we get rid of it. Now the rows are a dupe and would be kicked out.

But what if I mistype the last name when entering the second row? Now I have "John Smith" and "John Smyth", with all other columns identical. The rows are unique but wrong. So then we add a unique constraint on SSN, but to every constraint we add I have a user who is going to put in a typo and give a unique but incorrect answer.

What body of theory do we depend upon to deal with this next level of error?

Practical experience seems to be that the Employee_ID column is the answer. By requiring that single unique column, I can eventually trace anomalies in the payroll system back to the presence of two "John Smith" rows that have different IDs. Great! Having discovered this human error after the fact, I use the Employee_ID to merge the records together and delete or inactivate one row. But what body of theory describes this practice?

Could it be back to the RDM? The Employee_ID column establishes the system as truly relational by brute force, every row is unique because that column will be unique.

-- 
Kenneth Downs
Use first initial plus last name at last name plus literal "fam.net" to
email me
Received on Sun Oct 10 2004 - 19:49:42 CEST

Original text of this message