Re: candidate keys in abstract parent relations

From: Tony Rogerson <tonyrogerson_at_sqlserverfaq.com>
Date: Fri, 20 Jan 2006 08:03:25 -0000
Message-ID: <dqq5fe$1uo$1$8300dec7_at_news.demon.co.uk>


> I know what i'm talking about when dropping into the application, but, as
> somebody whos done both camps, development and database design I feel
> strongly that a lot of the time the database designer does not consider
> the applications that may use the database. It's the good old DBA v
> Developer war; we need to consider both camps when looking at getting a
> scalable, performant and easily maintained design.

Sorry, have to proof read so when i start writing and a sentance and then start again it doesn't come out like that; think its time for a cuppa tea and breakfast :)

What i meant say was...

I know i'm dropping into the application, but, as  somebody whos done both camps, development and database design I feel  strongly that a lot of the time the database designer does not consider the  applications that may use the database. It's the good old DBA v Developer  war; we need to consider both camps when looking at getting a scalable,  performant and easily maintained design.

-- 
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials


"Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message 
news:dqq50p$dco$1$8302bc10_at_news.demon.co.uk...

> Good post Roy.
>
> If you check out almost any website that has a drop down list, have a look
> at the html for an example.
>
> Basically, yes, you do need to pass the natural primary key value out; but
> you'd have to do it twice, for instance on a list box you need the display
> element as well as the value element - its the value element that we are
> interested and gets passed back.
>
> I know what i'm talking about when dropping into the application, but, as
> somebody whos done both camps, development and database design I feel
> strongly that a lot of the time the database designer does not consider
> the applications that may use the database. It's the good old DBA v
> Developer war; we need to consider both camps when looking at getting a
> scalable, performant and easily maintained design.
>
> With the primary key change, I was more thinking of all the locking and
> page splitting there would be - it would result in a large transaction and
> in the default SQL Server world would cause a lot of blocking because of
> the default writer blocks reader. In a lot of db's its ok for single key
> changes, but imagine if you put your primary key on a phone number and it
> was London, suddenly you need to add an extra number, that is one hell of
> a big update - it would touch your whole database and take hours - you did
> mention this but i just wanted to highlight its one of my big concerns.
>
> But its worse than that, on changing the primary key in the db, the old
> primary key will be on a user's screen and they click 'update' there is no
> record to come back to, or worse still - its a different record; surrogate
> keys get round that problem because they never change.
>
> My experience over the pas 12ish years has been within investment banking,
> and I still am shocked at some of the 'data' practices that go on there;
> but, at the end of the day its an environment where speed and availability
> are the key. Prior to going into banking I never used surrogate keys
> either, performance just wasn't an issue.
>
> Thanks for a good debate.
>
> Tony.
>
> --
> Tony Rogerson
> SQL Server MVP
> http://sqlserverfaq.com - free video tutorials
>
>
> "Roy Hann" <specially_at_processed.almost.meat> wrote in message
> news:SLCdnR-38eMwnU3enZ2dnUVZ8tydnZ2d_at_pipex.net...
>> "Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message
>> news:dqoi24$505$1$8302bc10_at_news.demon.co.uk...
>>> > What exactly is wrong with a long value? Who cares how long the value
>> is?
>>> > (Or more to the point, what does "long" really mean in this context?)
>>> > If
>>> > we're talking performance, the time it takes to transmit a 1kb value
>>> > is
>> an
>>> > inconsequential fraction of the time it takes to lookup a surrogate
>>> > key
>> to
>>> > get the value.
>>>
>>> No No - you see thats just the point!
>>>
>>> If the value where 1KB and you had 100 items in the list box, I now have
>> to
>>> transmit 100KB to the client, on the internet thats a big deal; now
>> multiply
>>> that up to 10, 100, 1000 users and it becomes a really really big deal;
>> even
>>> saving a couple of KB's on a page really makes a difference in
>> scalability.
>>
>> Since network transmission time dominates, I don't see how the smallness
>> of
>> the value that gets transmitted back when the user makes a selection
>> really
>> makes much difference. You must already have delivered the *all* the
>> text
>> of *all* the possible options to *all* the users so that they know what
>> they
>> are choosing. Whether they send back a small surrogate key or an entire
>> large composite key is really irrelevant by then.
>>
>> Your point is still a valid one; it just has nothing to do with
>> databases.
>> You are describing how a bad user interface is also a badly behaved
>> network
>> client. I sure as heck don't want to be presented with a 100+ item list
>> box
>> even on a LAN, never mind the Internet. I'd say the DBMS would be doing
>> you
>> a favour by making it hard to do stuff like that. I do fully understand
>> what you are saying because I have seen just exactly what you describe.
>> It
>> was an Internet interface to a taxation system where accounting companies
>> were presented with a list box to select one of their clients. Some had
>> list boxes of over a thousand choices, weighing in at +500Kb each. But
>> the
>> bigger problem was that it was humanly unusable no matter how fast it
>> was.
>> (And it wasn't.)
>>
>>> Now consider a search page with say 10 listboxes of average 100 items in
>>> each, using the surrogate key approach you can very easily cache these
>>> on
>>> the middle tier and refresh every 'x' minutes or seconds; you don't care
>>> that the natural key underneath changes say from 'Clarke' to 'Rogerson'
>>> because the surrogate key is a read only - never changing constant.
>>
>> (Names are a bad example for us to use here because they are a poor key
>> anyway, but let's ignore that.) Most of the time good keys don't change
>> (unless there is a typo or some such error to be corrected). That is
>> handled quite effectively if the DBMS supports foreign key constraints
>> that
>> allow ON UPDATE CASCADE. Admittedly that can be quite intensive when it
>> is
>> triggered, but it is triggered very rarely in practice (usually), plus
>> you
>> are spared the trouble of building any surrogate key generation machinery
>> and the trouble caused by duplicates that went undetected because of a
>> spuriously unique surrogate key value.
>>
>>> >> or even worse a
>>> >> composite key in there - which begs the question, if your primary key
>> is
>>> >> a
>>> >> composite key which often it is, how would you code that?
>>> >
>>> > Don't understand the question.
>>>
>>> Say you had a composite primary key, surname, firstname, dob for
>>> instance
>>> just for example sake.
>>>
>>> The data would be Rogerson, Tony, 10/04/1970
>>>
>>> The list box value would need to be Value="Rogerson, Tony, 10/04/1970";
>> but
>>> what if that data contained comma's - what delimiter would you use to
>>> seperate the composite keys? You'd probably need to use fixed length.
>>>
>>> Whereas, plop a surrogate key on there say id=5 it would just be
>> Value="5".
>>
>> Again, this is an interface/programming issue intruding (unnecessarily
>> IMO)
>> on a database design. It is quite possible to imagine an API that would
>> handle the problem entirely transparently. For instance, since the
>> composite key is unique one could compute a single hash key
>> *transparently,
>> in the API, at run-time* and use that instead. No one will ever provide
>> such an API of course... :-(
>>
>>> > Please don't imagine that I reject the use of surrogate keys, but I do
>>> > find
>>> > there are great benefits if you can avoid them. They should be
>>> > employed
>>> > only with the greatest reluctance.
>>>
>>> My experience is the complete opposite, perhaps its just i've got used
>>> to
>>> cranking the most performance out of SQL Server in the early days
>>> because
>>> the hardware and disks weren't that powerful. I remember DB2 I didn't
>>> care
>>> because it was on a mainframe :)
>>
>> Well my experience is with national taxation systems and batch
>> applications.
>> My goal is to have zero code that the runs outside the DBMS. I have
>> observed over the years that the less stuff there is in the database
>> that's
>> not naturally occurring business data, the easier it is to achieve that
>> goal
>> and the faster those applications can be made to go (by large factors,
>> e.g.
>> 5, 10, more). When you are dealing with processes that run for days,
>> facing
>> deadlines set in law, that's just as important as a fast on-line
>> interface.
>>
>> My point (yes, I have one) is that different types of application will
>> put
>> different demands on the DBMS, but restrained use of surrogate keys is
>> universally beneficial.
>>
>> Roy
>>
>>
>
>
Received on Fri Jan 20 2006 - 09:03:25 CET

Original text of this message