Re: A real world example

From: Brian Selzer <brian_at_selzer-software.com>
Date: Thu, 17 Aug 2006 02:14:21 GMT
Message-ID: <1iQEg.8919$1f6.2533_at_newssvr27.news.prodigy.net>


"Keith H Duggar" <duggar_at_alum.mit.edu> wrote in message news:1155773774.470979.111080_at_m73g2000cwd.googlegroups.com...
> Brian, let's just take a step back to where this began. It
> began back in the "Surrogate Keys: an Implementation Issue"
> thread when you argued the following:
>
> Brian Seizer wrote:
>> There is always a compelling reason to use surrogates:
>> natural keys can change.
>
> Now, my opinions on relational theory count for very little
> at this point given my n00b status. However, I've followed
> the discussion as best I can, and given recent compelling
> arguments that natural = familiar surrogate, I believe and
> think you may even agree that your above argument is false?
>

I think that there is a fundamental difference. Over the years I've worked with many databases that used surrogates and many that didn't. In all cases where surrogates were used, it was always assumed that surrogates could not change. That meant less application code that was more reliable, more scalable, and in most cases, performed better. There were times when it was better for some queries to have all columns of a compound natural key enumerated in a single table. There were also times when the number of joins could have a detrimental effect on performance. Ten or twelve years ago, this was in some cases a show stopper, but today, you can use indexed views and other implementaional improvements to achieve similar performance while still retaining the reduced code requirements and increased reliability that surrogates provide. More importantly, in the last several years the paradigm has shifted. With the advent of ADO.NET, most of the work an application does involves disconnected datasets. (You have to go way out of your way to bypass this, unlike with ADO.) This has significant implications because the pattern is based on collision detection rather than collision avoidance. First you read the data without locking it, work on it, and then when you're ready to commit, flush it to the database, usually in a stored procedure. The flush procedure re-reads the data with an update lock to make sure that nothing changed and that nobody else can change it, and then if nothing has changed, writes the new data. The use of disconnected datasets will result in that there will more often be a significant interval between the time that the data is read and the time that the data is flushed. This increased interval means that it is that much more likely that the data read out will have been changed by another user by the time that it is ready to be flushed. This makes it much more important to be able to ensure that the rows that you're about to update haven't changed, and more importantly, that they're the same rows. With a surrogate, there's no question. If the row isn't found, then it no longer exists. If it is found, then you can be sure that it's the same row. With natural keys, however, it's possible that when the row isn't found, that someone else changed the key. Of course, you can't tell: all you know is that the row no longer exists. Whether a row was changed or deleted may be important information that a user could use. But worse, it's possible that between the time that the data was read out and the time that the data is about to be flushed that more than one change occured that resulted in a different row having the same key value that was read out. I've seen this happen, and it's not pretty. (But then, I've seen a teller terminal execute the video memory, where symptoms of the failure depended on the balance that was displayed on the screen, so the fact that I've seen something isn't indicative as to how often it occurs.) Usually, something other than the key is also different, so this doesn't happen very often, but it definitely shouldn't be ignored. This situation cannot occur with surrogates.

> If a natural can change so can a surrogate since a natural
> /is a/ surrogate, a familiar surrogate. Therefore, both can
> change and hence the possibility of change cannot argue in
> favor of one over the other.
>
> You not knowing this and Bob Badour knowing, seems to have
> led to a spiral of communication failure and ultimate doom.
>
> Now I wish that had not happened because I have learned a
> lot from the discussion. And frankly, you seem to me a very
> civil and intelligent person.
>
> You certainly did not cause the kind of c'motion that
> was the dawning of Fraud 6 ;-)
>
> -- Keith -- Fraud 6
>
Received on Thu Aug 17 2006 - 04:14:21 CEST

Original text of this message