Re: candidate keys in abstract parent relations
Date: Fri, 20 Jan 2006 07:55:36 -0000
Message-ID: <dqq50p$dco$1$8302bc10_at_news.demon.co.uk>
Good post, Roy.
If you check out almost any website that has a drop-down list, have a look
at the HTML for an example.
Basically, yes, you do need to pass the natural primary key value out; but
you'd have to do it twice: a list box needs a display element as well as a
value element, and it's the value element that we are interested in and
that gets passed back.
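[As an aside, the display/value split described here can be sketched like
this; the names and data are invented for illustration, and the HTML is
generated in Python just so the output can be checked.]

```python
# A minimal sketch (made-up data) of why a list box carries the key twice:
# in HTML each <option> has a value element (posted back to the server)
# and a display element (the text the user actually sees).
people = [
    ("Rogerson, Tony", "Tony Rogerson"),   # (value element, display element)
    ("Hann, Roy", "Roy Hann"),
]

options = "\n".join(
    f'<option value="{value}">{display}</option>'
    for value, display in people
)
print(options)
```

Only the value element comes back in the form post; the display element
never leaves the browser.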
With the primary key change, I was thinking more of all the locking and page
splitting there would be: it would result in a large transaction, and in the
default SQL Server world it would cause a lot of blocking, because by
default writers block readers. In a lot of databases it's fine for single
key changes, but imagine you put your primary key on a phone number and it
was London: suddenly you need to add an extra digit, and that is one hell of
a big update. It would touch your whole database and take hours. You did
mention this, but I just wanted to highlight that it's one of my big
concerns.
But it's worse than that: on changing the primary key in the database, the
old primary key will still be on a user's screen, and when they click
'update' there is no record to come back to - or worse still, it's a
different record. Surrogate keys get round that problem because they never
change.
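[The stale-key problem can be demonstrated in a few lines; the table and
data below are invented, using sqlite3 only because it is convenient to
run.]

```python
# Sketch of the stale-key problem: the natural primary key changes in the
# database while the old key is still on a user's screen, so the user's
# UPDATE matches no rows.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (phone TEXT PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO customer VALUES ('0171 555 0100', 'Tony')")

# Someone else changes the key (London gains a digit).
con.execute("UPDATE customer SET phone = '020 7555 0100' "
            "WHERE phone = '0171 555 0100'")

# The user's screen still holds the old key; their update silently hits
# nothing - there is no record to come back to.
cur = con.execute("UPDATE customer SET name = 'Anthony' "
                  "WHERE phone = '0171 555 0100'")
print(cur.rowcount)  # 0
```

A surrogate key sidesteps this because the value on the user's screen can
never have changed underneath them.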
My experience over the past 12 or so years has been within investment
banking, and I am still shocked at some of the 'data' practices that go on
there; but at the end of the day it's an environment where speed and
availability are key. Before going into banking I never used surrogate keys
either; performance just wasn't an issue.
Thanks for a good debate.
Tony.
--
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials
"Roy Hann" <specially_at_processed.almost.meat> wrote in message
news:SLCdnR-38eMwnU3enZ2dnUVZ8tydnZ2d_at_pipex.net...
> "Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message
> news:dqoi24$505$1$8302bc10_at_news.demon.co.uk...
>> > What exactly is wrong with a long value? Who cares how long the value
>> > is? (Or more to the point, what does "long" really mean in this
>> > context?) If we're talking performance, the time it takes to transmit a
>> > 1KB value is an inconsequential fraction of the time it takes to look
>> > up a surrogate key to get the value.
>>
>> No no - you see that's just the point!
>>
>> If the value were 1KB and you had 100 items in the list box, I now have
>> to transmit 100KB to the client; on the internet that's a big deal. Now
>> multiply that up to 10, 100, 1000 users and it becomes a really, really
>> big deal; even saving a couple of KB on a page really makes a difference
>> in scalability.
>
> Since network transmission time dominates, I don't see how the smallness
> of the value that gets transmitted back when the user makes a selection
> really makes much difference. You must already have delivered *all* the
> text of *all* the possible options to *all* the users so that they know
> what they are choosing. Whether they send back a small surrogate key or an
> entire large composite key is really irrelevant by then.
>
> Your point is still a valid one; it just has nothing to do with databases.
> You are describing how a bad user interface is also a badly behaved
> network client. I sure as heck don't want to be presented with a 100+ item
> list box even on a LAN, never mind the Internet. I'd say the DBMS would be
> doing you a favour by making it hard to do stuff like that. I do fully
> understand what you are saying, because I have seen exactly what you
> describe. It was an Internet interface to a taxation system where
> accounting companies were presented with a list box to select one of their
> clients. Some had list boxes of over a thousand choices, weighing in at
> 500KB+ each. But the bigger problem was that it was humanly unusable no
> matter how fast it was. (And it wasn't.)
>
>> Now consider a search page with, say, 10 list boxes averaging 100 items
>> each. Using the surrogate key approach you can very easily cache these on
>> the middle tier and refresh every 'x' minutes or seconds; you don't care
>> that the natural key underneath changes, say from 'Clarke' to 'Rogerson',
>> because the surrogate key is a read-only, never-changing constant.
>
> (Names are a bad example for us to use here because they are a poor key
> anyway, but let's ignore that.) Most of the time good keys don't change
> (unless there is a typo or some such error to be corrected). That is
> handled quite effectively if the DBMS supports foreign key constraints
> that allow ON UPDATE CASCADE. Admittedly that can be quite intensive when
> it is triggered, but it is triggered very rarely in practice (usually),
> plus you are spared the trouble of building any surrogate key generation
> machinery and the trouble caused by duplicates that went undetected
> because of a spuriously unique surrogate key value.
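[The ON UPDATE CASCADE behaviour Roy describes can be shown concretely; the
tables below are invented, and sqlite3 stands in for whichever DBMS is
actually in use.]

```python
# Sketch of ON UPDATE CASCADE: correcting a natural key in the parent
# relation propagates automatically to every referencing row, with no
# surrogate key machinery involved.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # sqlite needs this switched on
con.execute("CREATE TABLE customer (surname TEXT PRIMARY KEY)")
con.execute("""CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    surname  TEXT REFERENCES customer(surname) ON UPDATE CASCADE
)""")
con.execute("INSERT INTO customer VALUES ('Clarke')")
con.execute("INSERT INTO orders VALUES (1, 'Clarke')")

# Correct the key in one place; the DBMS cascades it to orders for us.
con.execute("UPDATE customer SET surname = 'Rogerson'")
print(con.execute("SELECT surname FROM orders").fetchone()[0])  # Rogerson
```

As the post says, the cascade can be expensive when it fires, but with a
well-chosen key it fires rarely.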
>
>> >> or even worse a composite key in there - which begs the question, if
>> >> your primary key is a composite key, which often it is, how would you
>> >> code that?
>> >
>> > Don't understand the question.
>>
>> Say you had a composite primary key - surname, firstname, dob for
>> instance, just for example's sake.
>>
>> The data would be Rogerson, Tony, 10/04/1970
>>
>> The list box value would need to be Value="Rogerson, Tony, 10/04/1970";
>> but what if that data contained commas - what delimiter would you use to
>> separate the composite key parts? You'd probably need to use fixed
>> length.
>>
>> Whereas, plop a surrogate key on there, say id=5, and it would just be
>> Value="5".
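[One answer to the delimiter question, offered as an editor's sketch rather
than anything from the thread: encode the composite key with a format that
already handles embedded commas and quoting, such as JSON, instead of fixed
length.]

```python
# Serialising a composite key whose parts may contain the delimiter
# character: JSON quoting makes the round trip lossless.
import json

key = ["Rogerson, Tony", "10/04/1970"]   # surname part contains a comma

value = json.dumps(key)                  # one safe string for Value="..."
print(value)

# Posting the value back recovers the exact composite key.
assert json.loads(value) == key
```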
>
> Again, this is an interface/programming issue intruding (unnecessarily
> IMO) on a database design. It is quite possible to imagine an API that
> would handle the problem entirely transparently. For instance, since the
> composite key is unique one could compute a single hash key
> *transparently, in the API, at run-time* and use that instead. No one
> will ever provide such an API of course... :-(
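[A sketch of the transparent hash key imagined here: the API hashes the
composite key at run time and uses the digest as the wire value, so the
client never needs a delimiter scheme. SHA-256 and the lookup table are
assumptions; any stable hash with a server-side mapping back to the real
key would do.]

```python
# Hypothetical API-level hashing of a composite key into a single opaque
# wire value.
import hashlib

def wire_key(*parts: str) -> str:
    # Length-prefix each part so ("ab","c") and ("a","bc") hash differently.
    material = "".join(f"{len(p)}:{p}" for p in parts)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

lookup = {}                                   # server side: digest -> key
key = ("Rogerson", "Tony", "10/04/1970")
digest = wire_key(*key)
lookup[digest] = key                          # remember which row it names

# The list box carries only the short digest; posting it back finds the row.
assert lookup[digest] == key
```

Unlike a surrogate key, nothing here is stored in the database; the digest
exists only at the interface.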
>
>> > Please don't imagine that I reject the use of surrogate keys, but I do
>> > find there are great benefits if you can avoid them. They should be
>> > employed only with the greatest reluctance.
>>
>> My experience is the complete opposite; perhaps it's just that I got used
>> to cranking the most performance out of SQL Server in the early days,
>> because the hardware and disks weren't that powerful. With DB2 I didn't
>> care because it was on a mainframe :)
>
> Well, my experience is with national taxation systems and batch
> applications. My goal is to have zero code that runs outside the DBMS. I
> have observed over the years that the less stuff there is in the database
> that's not naturally occurring business data, the easier it is to achieve
> that goal and the faster those applications can be made to go (by large
> factors, e.g. 5, 10, more). When you are dealing with processes that run
> for days, facing deadlines set in law, that's just as important as a fast
> on-line interface.
>
> My point (yes, I have one) is that different types of application will put
> different demands on the DBMS, but restrained use of surrogate keys is
> universally beneficial.
>
> Roy
>
>
Received on Fri Jan 20 2006 - 08:55:36 CET