Re: candidate keys in abstract parent relations

From: Tony Rogerson <tonyrogerson_at_sqlserverfaq.com>
Date: Fri, 20 Jan 2006 07:55:36 -0000
Message-ID: <dqq50p$dco$1$8302bc10_at_news.demon.co.uk>


Good post Roy.

If you check out almost any website that has a drop down list, have a look at the html for an example.

Basically, yes, you do need to pass the natural primary key value out; but you'd have to do it twice, for instance on a list box you need the display element as well as the value element - its the value element that we are interested and gets passed back.

I know what i'm talking about when dropping into the application, but, as somebody whos done both camps, development and database design I feel strongly that a lot of the time the database designer does not consider the applications that may use the database. It's the good old DBA v Developer war; we need to consider both camps when looking at getting a scalable, performant and easily maintained design.

With the primary key change, I was more thinking of all the locking and page splitting there would be - it would result in a large transaction and in the default SQL Server world would cause a lot of blocking because of the default writer blocks reader. In a lot of db's its ok for single key changes, but imagine if you put your primary key on a phone number and it was London, suddenly you need to add an extra number, that is one hell of a big update - it would touch your whole database and take hours - you did mention this but i just wanted to highlight its one of my big concerns.

But its worse than that, on changing the primary key in the db, the old primary key will be on a user's screen and they click 'update' there is no record to come back to, or worse still - its a different record; surrogate keys get round that problem because they never change.

My experience over the pas 12ish years has been within investment banking, and I still am shocked at some of the 'data' practices that go on there; but, at the end of the day its an environment where speed and availability are the key. Prior to going into banking I never used surrogate keys either, performance just wasn't an issue.

Thanks for a good debate.

Tony.

-- 
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials


"Roy Hann" <specially_at_processed.almost.meat> wrote in message 
news:SLCdnR-38eMwnU3enZ2dnUVZ8tydnZ2d_at_pipex.net...

> "Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message
> news:dqoi24$505$1$8302bc10_at_news.demon.co.uk...
>> > What exactly is wrong with a long value? Who cares how long the value
> is?
>> > (Or more to the point, what does "long" really mean in this context?) >> > If >> > we're talking performance, the time it takes to transmit a 1kb value is
> an
>> > inconsequential fraction of the time it takes to lookup a surrogate key
> to
>> > get the value. >> >> No No - you see thats just the point! >> >> If the value where 1KB and you had 100 items in the list box, I now have
> to
>> transmit 100KB to the client, on the internet thats a big deal; now
> multiply
>> that up to 10, 100, 1000 users and it becomes a really really big deal;
> even
>> saving a couple of KB's on a page really makes a difference in
> scalability.
>
> Since network transmission time dominates, I don't see how the smallness
> of
> the value that gets transmitted back when the user makes a selection
> really
> makes much difference. You must already have delivered the *all* the text
> of *all* the possible options to *all* the users so that they know what
> they
> are choosing. Whether they send back a small surrogate key or an entire
> large composite key is really irrelevant by then.
>
> Your point is still a valid one; it just has nothing to do with databases.
> You are describing how a bad user interface is also a badly behaved
> network
> client. I sure as heck don't want to be presented with a 100+ item list
> box
> even on a LAN, never mind the Internet. I'd say the DBMS would be doing
> you
> a favour by making it hard to do stuff like that. I do fully understand
> what you are saying because I have seen just exactly what you describe.
> It
> was an Internet interface to a taxation system where accounting companies
> were presented with a list box to select one of their clients. Some had
> list boxes of over a thousand choices, weighing in at +500Kb each. But
> the
> bigger problem was that it was humanly unusable no matter how fast it was.
> (And it wasn't.)
> >> Now consider a search page with say 10 listboxes of average 100 items in >> each, using the surrogate key approach you can very easily cache these on >> the middle tier and refresh every 'x' minutes or seconds; you don't care >> that the natural key underneath changes say from 'Clarke' to 'Rogerson' >> because the surrogate key is a read only - never changing constant. >
> (Names are a bad example for us to use here because they are a poor key
> anyway, but let's ignore that.) Most of the time good keys don't change
> (unless there is a typo or some such error to be corrected). That is
> handled quite effectively if the DBMS supports foreign key constraints
> that
> allow ON UPDATE CASCADE. Admittedly that can be quite intensive when it
> is
> triggered, but it is triggered very rarely in practice (usually), plus you
> are spared the trouble of building any surrogate key generation machinery
> and the trouble caused by duplicates that went undetected because of a
> spuriously unique surrogate key value.
> >> >> or even worse a >> >> composite key in there - which begs the question, if your primary key
> is
>> >> a >> >> composite key which often it is, how would you code that? >> > >> > Don't understand the question. >> >> Say you had a composite primary key, surname, firstname, dob for instance >> just for example sake. >> >> The data would be Rogerson, Tony, 10/04/1970 >> >> The list box value would need to be Value="Rogerson, Tony, 10/04/1970";
> but
>> what if that data contained comma's - what delimiter would you use to >> seperate the composite keys? You'd probably need to use fixed length. >> >> Whereas, plop a surrogate key on there say id=5 it would just be
> Value="5".
>
> Again, this is an interface/programming issue intruding (unnecessarily
> IMO)
> on a database design. It is quite possible to imagine an API that would
> handle the problem entirely transparently. For instance, since the
> composite key is unique one could compute a single hash key
> *transparently,
> in the API, at run-time* and use that instead. No one will ever provide
> such an API of course... :-(
> >> > Please don't imagine that I reject the use of surrogate keys, but I do >> > find >> > there are great benefits if you can avoid them. They should be >> > employed >> > only with the greatest reluctance. >> >> My experience is the complete opposite, perhaps its just i've got used to >> cranking the most performance out of SQL Server in the early days because >> the hardware and disks weren't that powerful. I remember DB2 I didn't >> care >> because it was on a mainframe :) >
> Well my experience is with national taxation systems and batch
> applications.
> My goal is to have zero code that the runs outside the DBMS. I have
> observed over the years that the less stuff there is in the database
> that's
> not naturally occurring business data, the easier it is to achieve that
> goal
> and the faster those applications can be made to go (by large factors,
> e.g.
> 5, 10, more). When you are dealing with processes that run for days,
> facing
> deadlines set in law, that's just as important as a fast on-line
> interface.
>
> My point (yes, I have one) is that different types of application will put
> different demands on the DBMS, but restrained use of surrogate keys is
> universally beneficial.
>
> Roy
>
>
Received on Fri Jan 20 2006 - 08:55:36 CET

Original text of this message