Re: candidate keys in abstract parent relations
Date: Thu, 19 Jan 2006 21:10:19 -0000
Message-ID: <SLCdnR-38eMwnU3enZ2dnUVZ8tydnZ2d_at_pipex.net>
"Tony Rogerson" <tonyrogerson_at_sqlserverfaq.com> wrote in message
news:dqoi24$505$1$8302bc10_at_news.demon.co.uk...
> > What exactly is wrong with a long value? Who cares how long the value
is?
> > (Or more to the point, what does "long" really mean in this context?) If
> > we're talking performance, the time it takes to transmit a 1kb value is
an
> > inconsequential fraction of the time it takes to lookup a surrogate key
to
> > get the value.
>
> No No - you see thats just the point!
>
> If the value where 1KB and you had 100 items in the list box, I now have
to
> transmit 100KB to the client, on the internet thats a big deal; now
multiply
> that up to 10, 100, 1000 users and it becomes a really really big deal;
even
> saving a couple of KB's on a page really makes a difference in
scalability.
Since network transmission time dominates, I don't see how the smallness of
the value that gets transmitted back when the user makes a selection really
makes much difference. You must already have delivered the *all* the text
of *all* the possible options to *all* the users so that they know what they
are choosing. Whether they send back a small surrogate key or an entire
large composite key is really irrelevant by then.
Your point is still a valid one; it just has nothing to do with databases.
You are describing how a bad user interface is also a badly behaved network
client. I sure as heck don't want to be presented with a 100+ item list box
even on a LAN, never mind the Internet. I'd say the DBMS would be doing you
a favour by making it hard to do stuff like that. I do fully understand
what you are saying because I have seen just exactly what you describe. It
was an Internet interface to a taxation system where accounting companies
were presented with a list box to select one of their clients. Some had
list boxes of over a thousand choices, weighing in at +500Kb each. But the
bigger problem was that it was humanly unusable no matter how fast it was.
(And it wasn't.)
> Now consider a search page with say 10 listboxes of average 100 items in
> each, using the surrogate key approach you can very easily cache these on
> the middle tier and refresh every 'x' minutes or seconds; you don't care
> that the natural key underneath changes say from 'Clarke' to 'Rogerson'
> because the surrogate key is a read only - never changing constant.
(Names are a bad example for us to use here because they are a poor key anyway, but let's ignore that.) Most of the time good keys don't change (unless there is a typo or some such error to be corrected). That is handled quite effectively if the DBMS supports foreign key constraints that allow ON UPDATE CASCADE. Admittedly that can be quite intensive when it is triggered, but it is triggered very rarely in practice (usually), plus you are spared the trouble of building any surrogate key generation machinery and the trouble caused by duplicates that went undetected because of a spuriously unique surrogate key value.
> >> or even worse a
> >> composite key in there - which begs the question, if your primary key
is
> >> a
> >> composite key which often it is, how would you code that?
> >
> > Don't understand the question.
>
> Say you had a composite primary key, surname, firstname, dob for instance
> just for example sake.
>
> The data would be Rogerson, Tony, 10/04/1970
>
> The list box value would need to be Value="Rogerson, Tony, 10/04/1970";
but
> what if that data contained comma's - what delimiter would you use to
> seperate the composite keys? You'd probably need to use fixed length.
>
> Whereas, plop a surrogate key on there say id=5 it would just be
Value="5".
Again, this is an interface/programming issue intruding (unnecessarily IMO)
on a database design. It is quite possible to imagine an API that would
handle the problem entirely transparently. For instance, since the
composite key is unique one could compute a single hash key *transparently,
in the API, at run-time* and use that instead. No one will ever provide
such an API of course... :-(
> > Please don't imagine that I reject the use of surrogate keys, but I do
> > find
> > there are great benefits if you can avoid them. They should be employed
> > only with the greatest reluctance.
>
> My experience is the complete opposite, perhaps its just i've got used to
> cranking the most performance out of SQL Server in the early days because
> the hardware and disks weren't that powerful. I remember DB2 I didn't care
> because it was on a mainframe :)