Re: candidate keys in abstract parent relations

From: Tony Rogerson <tonyrogerson_at_sqlserverfaq.com>
Date: Fri, 20 Jan 2006 08:03:25 -0000
Message-ID: <dqq5fe$1uo$1$8300dec7_at_news.demon.co.uk>

> I know what i'm talking about when dropping into the application, but, as
> somebody whos done both camps, development and database design I feel
> strongly that a lot of the time the database designer does not consider
> the applications that may use the database. It's the good old DBA v
> Developer war; we need to consider both camps when looking at getting a
> scalable, performant and easily maintained design.

Sorry, have to proof read so when i start writing and a sentance and then start again it doesn't come out like that; think its time for a cuppa tea and breakfast :)

What i meant say was...

I know i'm dropping into the application, but, as somebody whos done both camps, development and database design I feel strongly that a lot of the time the database designer does not consider the applications that may use the database. It's the good old DBA v Developer war; we need to consider both camps when looking at getting a scalable, performant and easily maintained design.

-- 
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials


"Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message 
news:dqq50p$dco$1$8302bc10_at_news.demon.co.uk...


> Good post Roy.


>

> If you check out almost any website that has a drop down list, have a look 

> at the html for an example.

>

> Basically, yes, you do need to pass the natural primary key value out; but 

> you'd have to do it twice, for instance on a list box you need the display 

> element as well as the value element - its the value element that we are 

> interested and gets passed back.

>

> I know what i'm talking about when dropping into the application, but, as 

> somebody whos done both camps, development and database design I feel 

> strongly that a lot of the time the database designer does not consider 

> the applications that may use the database. It's the good old DBA v 

> Developer war; we need to consider both camps when looking at getting a 

> scalable, performant and easily maintained design.

>

> With the primary key change, I was more thinking of all the locking and 

> page splitting there would be - it would result in a large transaction and 

> in the default SQL Server world would cause a lot of blocking because of 

> the default writer blocks reader. In a lot of db's its ok for single key 

> changes, but imagine if you put your primary key on a phone number and it 

> was London, suddenly you need to add an extra number, that is one hell of 

> a big update - it would touch your whole database and take hours - you did 

> mention this but i just wanted to highlight its one of my big concerns.

>

> But its worse than that, on changing the primary key in the db, the old 

> primary key will be on a user's screen and they click 'update' there is no 

> record to come back to, or worse still - its a different record; surrogate 

> keys get round that problem because they never change.

>

> My experience over the pas 12ish years has been within investment banking, 

> and I still am shocked at some of the 'data' practices that go on there; 

> but, at the end of the day its an environment where speed and availability 

> are the key. Prior to going into banking I never used surrogate keys 

> either, performance just wasn't an issue.

>

> Thanks for a good debate.

>

> Tony.

>

> -- 

> Tony Rogerson

> SQL Server MVP

> http://sqlserverfaq.com - free video tutorials

>

>

> "Roy Hann" <specially_at_processed.almost.meat> wrote in message 

> news:SLCdnR-38eMwnU3enZ2dnUVZ8tydnZ2d_at_pipex.net...

>> "Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message

>> news:dqoi24$505$1$8302bc10_at_news.demon.co.uk...

>>> > What exactly is wrong with a long value?  Who cares how long the value

>> is?

>>> > (Or more to the point, what does "long" really mean in this context?) 

>>> > If

>>> > we're talking performance, the time it takes to transmit a 1kb value 

>>> > is

>> an

>>> > inconsequential fraction of the time it takes to lookup a surrogate 

>>> > key

>> to

>>> > get the value.

>>>

>>> No No - you see thats just the point!

>>>

>>> If the value where 1KB and you had 100 items in the list box, I now have

>> to

>>> transmit 100KB to the client, on the internet thats a big deal; now

>> multiply

>>> that up to 10, 100, 1000 users and it becomes a really really big deal;

>> even

>>> saving a couple of KB's on a page really makes a difference in

>> scalability.

>>

>> Since network transmission time dominates, I don't see how the smallness 

>> of

>> the value that gets transmitted back when the user makes a selection 

>> really

>> makes much difference.  You must already have delivered the *all* the 

>> text

>> of *all* the possible options to *all* the users so that they know what 

>> they

>> are choosing.  Whether they send back a small surrogate key or an entire

>> large composite key is really irrelevant by then.

>>

>> Your point is still a valid one; it just has nothing to do with 

>> databases.

>> You are describing how a bad user interface is also a badly behaved 

>> network

>> client.  I sure as heck don't want to be presented with a 100+ item list 

>> box

>> even on a LAN, never mind the Internet.  I'd say the DBMS would be doing 

>> you

>> a favour by making it hard to do stuff like that.  I do fully understand

>> what you are saying because I have seen just exactly what you describe. 

>> It

>> was an Internet interface to a taxation system where accounting companies

>> were presented with a list box to select one of their clients.  Some had

>> list boxes of over a thousand choices, weighing in at +500Kb each.  But 

>> the

>> bigger problem was that it was humanly unusable no matter how fast it 

>> was.

>> (And it wasn't.)

>>

>>> Now consider a search page with say 10 listboxes of average 100 items in

>>> each, using the surrogate key approach you can very easily cache these 

>>> on

>>> the middle tier and refresh every 'x' minutes or seconds; you don't care

>>> that the natural key underneath changes say from 'Clarke' to 'Rogerson'

>>> because the surrogate key is a read only - never changing constant.

>>

>> (Names are a bad example for us to use here because they are a poor key

>> anyway, but let's ignore that.)  Most of the time good keys don't change

>> (unless there is a typo or some such error to be corrected).  That is

>> handled quite effectively if the DBMS supports foreign key constraints 

>> that

>> allow ON UPDATE CASCADE.  Admittedly that can be quite intensive when it 

>> is

>> triggered, but it is triggered very rarely in practice (usually), plus 

>> you

>> are spared the trouble of building any surrogate key generation machinery

>> and the trouble caused by duplicates that went undetected because of a

>> spuriously unique surrogate key value.

>>

>>> >> or even worse a

>>> >> composite key in there - which begs the question, if your primary key

>> is

>>> >> a

>>> >> composite key which often it is, how would you code that?

>>> >

>>> > Don't understand the question.

>>>

>>> Say you had a composite primary key, surname, firstname, dob for 

>>> instance

>>> just for example sake.

>>>

>>> The data would be Rogerson, Tony, 10/04/1970

>>>

>>> The list box value would need to be Value="Rogerson, Tony, 10/04/1970";

>> but

>>> what if that data contained comma's - what delimiter would you use to

>>> seperate the composite keys? You'd probably need to use fixed length.

>>>

>>> Whereas, plop a surrogate key on there say id=5 it would just be

>> Value="5".

>>

>> Again, this is an interface/programming issue intruding (unnecessarily 

>> IMO)

>> on a database design.  It is quite possible to imagine an API that would

>> handle the problem entirely transparently.  For instance, since the

>> composite key is unique one could compute a single hash key 

>> *transparently,

>> in the API, at run-time* and use that instead.  No one will ever provide

>> such an API of course...  :-(

>>

>>> > Please don't imagine that I reject the use of surrogate keys, but I do

>>> > find

>>> > there are great benefits if you can avoid them.  They should be 

>>> > employed

>>> > only with the greatest reluctance.

>>>

>>> My experience is the complete opposite, perhaps its just i've got used 

>>> to

>>> cranking the most performance out of SQL Server in the early days 

>>> because

>>> the hardware and disks weren't that powerful. I remember DB2 I didn't 

>>> care

>>> because it was on a mainframe :)

>>

>> Well my experience is with national taxation systems and batch 

>> applications.

>> My goal is to have zero code that the runs outside the DBMS.  I have

>> observed over the years that the less stuff there is in the database 

>> that's

>> not naturally occurring business data, the easier it is to achieve that 

>> goal

>> and the faster those applications can be made to go (by large factors, 

>> e.g.

>> 5, 10, more).  When you are dealing with processes that run for days, 

>> facing

>> deadlines set in law, that's just as important as a fast on-line 

>> interface.

>>

>> My point (yes, I have one) is that different types of application will 

>> put

>> different demands on the DBMS, but restrained use of surrogate keys is

>> universally beneficial.

>>

>> Roy

>>

>>

>

>

Received on Fri Jan 20 2006 - 09:03:25 CET

This message: [ Message body ]
Next message: x: "Re: Early and late binding."
Maybe in reply to: Forrest L Norvell: "candidate keys in abstract parent relations"
In reply to Tony Rogerson: "Re: candidate keys in abstract parent relations"
Next in thread: x: "Re: Early and late binding."

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message