Re: Normalization, Natural Keys, Surrogate Keys
Date: Wed, 29 May 2002 23:06:32 +0100
Message-ID: <tFi0JoKoDV98EwUU_at_shrdlu.com>
In message <c1ec9b8f.0205290820.706424bc_at_posting.google.com>, Kai Ponte <cybermusicdude_at_aol.com> writes
>> The right way to do it is to have an UNIQUE constraint, the
>> application will try to insert the data (without checking for
>> duplicates), and the database will fail the INSERT in case of invalid
>> data, then the application can catch the exception and follow the
>> business rules for dealing with these exceptions.
>>
>
>Okay, you've made me think. (I hate it when that happens.) I've
>discussed this issue with some of my co-workers and here's what I've
>come up with.
>
>1. Yes, having a unique column (or group of columns) in the table is
>beneficial. This is done even if the column(s) isn't (arent) the PK.
>
>2. Allowing the DBMS to catch the duplicates and failing on an insert
>is a benefit when programmers don't do their due dilligence. (That
>never happens, right?)
>
>HOWEVER...
>
>I thought about the exception table idea and I believe it is a bad
>design in most instances that I can forsee. (Remember this is just my
>narrow-minded opinion.) Here's why. Let's say, as in the examples
>we've been using, we have a table which includes a "supposedly" unique
>SSN. Let's say too that we want the DBMS to catch any duplicate SSNs
>and not allow for an INSERT. You suggest that an exception table be
>created where the duplicates reside. I can see how this would benefit
>the enforcement of a uniqe column.
>
>The problem that I see is in how to deal with the exceptions. In a
>real-world situation, the exceptions may never get handled and they
>will live in the "exeption" table forever. Thus you end up with
>"duplicate" data in two tables. As a result, problems will crop up
>with foreign key relationships since they will need to be related to
>both the primary and exeption tables.
>
>Now, let's say we setup a design with only one table as so:
>
>RowID(PK), SSN, Occurance
>
>We still use the surrogate key to maintain a separation of the
>business data from the primary key. The SSN column will allow
>duplicates. The new column added is the occurance column which is set
>to zero unless a duplicate SSN is entered. Our UNIQUE constraint is
>the combination of the SSN and occurance. Using my example with data
>we would see something like the following:
>
>RowID SSN Occurance
>1 123-45-6789 0
>2 123-43-8989 0
>3 123-45-6789 1
>
>The data still is allowed to be entered and it exists in only one
>table. The occurance flag is incremented each time a duplicate SSN is
>entered. Programming can be created to deal with the exceptions and
>provide for a clean up when time permits. (Read Steven Covey's parable
>on sharpening the saw to determine when this happens.) This design
>allows for the exeptions you mentioned without forcing the DBMS to
>maintain a second table.
That's close to what my solution would be. If I was really expecting the SSN to be unique I would probably call the extra column Error and set up a mechanism for investigating the circumstances that forced it to be used.
-- Bernard Peek bap_at_shrdlu.com In search of cognoscentiReceived on Thu May 30 2002 - 00:06:32 CEST