Re: candidate keys in abstract parent relations

From: Tony Rogerson <tonyrogerson_at_sqlserverfaq.com>
Date: Fri, 20 Jan 2006 07:55:36 -0000
Message-ID: <dqq50p$dco$1$8302bc10_at_news.demon.co.uk>

Good post Roy.

If you check out almost any website that has a drop down list, have a look at the html for an example.

Basically, yes, you do need to pass the natural primary key value out; but you'd have to do it twice, for instance on a list box you need the display element as well as the value element - its the value element that we are interested and gets passed back.

I know what i'm talking about when dropping into the application, but, as somebody whos done both camps, development and database design I feel strongly that a lot of the time the database designer does not consider the applications that may use the database. It's the good old DBA v Developer war; we need to consider both camps when looking at getting a scalable, performant and easily maintained design.

With the primary key change, I was more thinking of all the locking and page splitting there would be - it would result in a large transaction and in the default SQL Server world would cause a lot of blocking because of the default writer blocks reader. In a lot of db's its ok for single key changes, but imagine if you put your primary key on a phone number and it was London, suddenly you need to add an extra number, that is one hell of a big update - it would touch your whole database and take hours - you did mention this but i just wanted to highlight its one of my big concerns.

But its worse than that, on changing the primary key in the db, the old primary key will be on a user's screen and they click 'update' there is no record to come back to, or worse still - its a different record; surrogate keys get round that problem because they never change.

My experience over the pas 12ish years has been within investment banking, and I still am shocked at some of the 'data' practices that go on there; but, at the end of the day its an environment where speed and availability are the key. Prior to going into banking I never used surrogate keys either, performance just wasn't an issue.

Thanks for a good debate.

Tony.

-- 
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials


"Roy Hann" <specially_at_processed.almost.meat> wrote in message 
news:SLCdnR-38eMwnU3enZ2dnUVZ8tydnZ2d_at_pipex.net...


> "Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message


> news:dqoi24$505$1$8302bc10_at_news.demon.co.uk...

>> > What exactly is wrong with a long value?  Who cares how long the value


> is?

>> > (Or more to the point, what does "long" really mean in this context?) 
>> > If
>> > we're talking performance, the time it takes to transmit a 1kb value is


> an

>> > inconsequential fraction of the time it takes to lookup a surrogate key


> to

>> > get the value.
>>
>> No No - you see thats just the point!
>>
>> If the value where 1KB and you had 100 items in the list box, I now have


> to

>> transmit 100KB to the client, on the internet thats a big deal; now


> multiply


>> that up to 10, 100, 1000 users and it becomes a really really big deal;


> even

>> saving a couple of KB's on a page really makes a difference in


> scalability.


>


> Since network transmission time dominates, I don't see how the smallness 


> of

> the value that gets transmitted back when the user makes a selection 

> really

> makes much difference.  You must already have delivered the *all* the text

> of *all* the possible options to *all* the users so that they know what 

> they

> are choosing.  Whether they send back a small surrogate key or an entire

> large composite key is really irrelevant by then.

>


> Your point is still a valid one; it just has nothing to do with databases.


> You are describing how a bad user interface is also a badly behaved 

> network

> client.  I sure as heck don't want to be presented with a 100+ item list 

> box

> even on a LAN, never mind the Internet.  I'd say the DBMS would be doing 

> you

> a favour by making it hard to do stuff like that.  I do fully understand

> what you are saying because I have seen just exactly what you describe. 

> It

> was an Internet interface to a taxation system where accounting companies

> were presented with a list box to select one of their clients.  Some had

> list boxes of over a thousand choices, weighing in at +500Kb each.  But 

> the

> bigger problem was that it was humanly unusable no matter how fast it was.

> (And it wasn't.)

>
>> Now consider a search page with say 10 listboxes of average 100 items in
>> each, using the surrogate key approach you can very easily cache these on
>> the middle tier and refresh every 'x' minutes or seconds; you don't care
>> that the natural key underneath changes say from 'Clarke' to 'Rogerson'
>> because the surrogate key is a read only - never changing constant.
>


> (Names are a bad example for us to use here because they are a poor key


> anyway, but let's ignore that.)  Most of the time good keys don't change

> (unless there is a typo or some such error to be corrected).  That is

> handled quite effectively if the DBMS supports foreign key constraints 

> that

> allow ON UPDATE CASCADE.  Admittedly that can be quite intensive when it 

> is

> triggered, but it is triggered very rarely in practice (usually), plus you

> are spared the trouble of building any surrogate key generation machinery

> and the trouble caused by duplicates that went undetected because of a

> spuriously unique surrogate key value.

>
>> >> or even worse a
>> >> composite key in there - which begs the question, if your primary key


> is

>> >> a
>> >> composite key which often it is, how would you code that?
>> >
>> > Don't understand the question.
>>
>> Say you had a composite primary key, surname, firstname, dob for instance
>> just for example sake.
>>
>> The data would be Rogerson, Tony, 10/04/1970
>>
>> The list box value would need to be Value="Rogerson, Tony, 10/04/1970";


> but

>> what if that data contained comma's - what delimiter would you use to
>> seperate the composite keys? You'd probably need to use fixed length.
>>
>> Whereas, plop a surrogate key on there say id=5 it would just be


> Value="5".


>


> Again, this is an interface/programming issue intruding (unnecessarily 


> IMO)

> on a database design.  It is quite possible to imagine an API that would

> handle the problem entirely transparently.  For instance, since the

> composite key is unique one could compute a single hash key 

> *transparently,

> in the API, at run-time* and use that instead.  No one will ever provide

> such an API of course...  :-(

>
>> > Please don't imagine that I reject the use of surrogate keys, but I do
>> > find
>> > there are great benefits if you can avoid them.  They should be 
>> > employed
>> > only with the greatest reluctance.
>>
>> My experience is the complete opposite, perhaps its just i've got used to
>> cranking the most performance out of SQL Server in the early days because
>> the hardware and disks weren't that powerful. I remember DB2 I didn't 
>> care
>> because it was on a mainframe :)
>


> Well my experience is with national taxation systems and batch 


> applications.

> My goal is to have zero code that the runs outside the DBMS.  I have

> observed over the years that the less stuff there is in the database 

> that's

> not naturally occurring business data, the easier it is to achieve that 

> goal

> and the faster those applications can be made to go (by large factors, 

> e.g.

> 5, 10, more).  When you are dealing with processes that run for days, 

> facing

> deadlines set in law, that's just as important as a fast on-line 

> interface.

>


> My point (yes, I have one) is that different types of application will put


> different demands on the DBMS, but restrained use of surrogate keys is

> universally beneficial.

>


> Roy

>


>

Received on Fri Jan 20 2006 - 08:55:36 CET

This message: [ Message body ]
Next message: Tony Rogerson: "Re: candidate keys in abstract parent relations"
Previous message: -CELKO-: "Re: Relate sets of data"
Maybe in reply to: Forrest L Norvell: "candidate keys in abstract parent relations"
In reply to Roy Hann: "Re: candidate keys in abstract parent relations"
Next in thread: Tony Rogerson: "Re: candidate keys in abstract parent relations"
Reply: Tony Rogerson: "Re: candidate keys in abstract parent relations"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message