Re: candidate keys in abstract parent relations

From: Tony Rogerson <tonyrogerson_at_sqlserverfaq.com>
Date: Thu, 19 Jan 2006 15:41:13 -0000
Message-ID: <dqobtq$rha$1$8302bc10_at_news.demon.co.uk>

But don't forget there is another phase after the logical model and thats the implementation of it.

This is where surrogate keys really come into play.

It helps create a highly scalable database and provides an efficient interface into the applications using the database.

Consider a drop down listbox in say HTML, it has a VALUE which indicates which item in the drop down is selected, if we hadn't used a surrogate key (auto generated number) then we could well have a long value or even worse a composite key in there - which begs the question, if your primary key is a composite key which often it is, how would you code that?

I get a real bee in my bonet over the surrogate key issue, I'm not saying you do this David because I don't think you do; but a lot of designers (especially celko) try to directly implement the logical model without regard for anything other than writing pure SQL against it. The database from a business stand point is there to store data.

Tony.

-- 
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials


"David Portas" <REMOVE_BEFORE_REPLYING_dportas_at_acm.org> wrote in message 
news:1137672084.357920.144470_at_g44g2000cwa.googlegroups.com...


> Forrest L Norvell wrote:

>

>> In this case, I guess I'm being finicky because if I use a generated

>> key column, it's *not* going to be a truly surrogate key: a surrogate

>> key implies that there is a candidate key for which the new key is

>> acting as a surrogate, and as defined, my album table lacks any

>> candidate keys. It's entirely possible that duplicate rows will be

>> added to the table without some kind of nonce or generated identifier

>> being added to its definition, so the entire justification for adding a

>> generated column is to guarantee uniqueness over the relation.

>> According to my (definitely shaky) grasp of the relational model, this

>> is a bad idea, and I'm afraid doing things this way exposes a weakness

>> in the underlying data model, even if I can't identify it.

>

> Yes it's a bad idea. Here's a generic example. Consider two tuples:

>

> a tuple = {'A','B','C'}

> a tuple = {'A','B','C'}

>

> They are the same. It serves no purpose to represent the same

> information twice, which is why we can safely say that a relvar must

> have at least one candidate key - the generality of the model isn't

> compromised by prohibiting redundancy. Now let's add an arbitrary

> artificial key on the understanding that this key has no external

> meaning whatever:

>

> a tuple = {1,'A','B','C'}

> another tuple = {2,'A','B','C'}

>

> Isn't this still just as redundant as in our first case? The artificial

> key isn't supplying any extra information in either tuple. In fact the

> artificial key usually isn't exposed to business users AT ALL. A

> relation containg these two tuples won't violate any principle of RM

> but it would cause us a major problem - the redundancy means that we

> have the potential for anomalies to arise and so we may get incorrect

> results from our database.

>

> Possibly {'A','B','C'} represents some entity that we are interested in

> counting - we want to know how many there are in our universe - but we

> don't need to add an artificial key to do that. We can just add another

> attribute ("quantity") and then a single tuple would be suffficient to

> represent any quantity of whatever we are modelling.

>

> So conceptually nothing is gained by adding the artificial key and

> nothing is lost by removing it. An artificial surrogate key may have

> certain practical advantages but an artificial key is redundant in the

> logical model by definition.

>

> The candidate key is defined by a subset of the attributes, so the real

> question is: what attributes do you need to model about an entity? If

> you aren't sure what the key is then you probably haven't answered that

> question properly.

>

> Now let's reconsider your example. You define an album by the tracks

> that it contains. There may be more than one album with the same name

> and in fact with all other attributes in common and those albums will

> be distinguishable only by the fact that they represent a different

> collection of tracks. Your mistake though is to identify a one-to-one

> correspondence between rows in your Album table (information about an

> album) and the album itself (a collection of tracks). Those are two

> different things. The missing piece in your model is a table or

> attribute(s) to supply the information that relates the two entities. I

> would have expected to see a "joining" table to represent the concept

> of a many-to-many relationship between the Album table and the Tracks

> table.

>

> --

> David Portas

>

Received on Thu Jan 19 2006 - 16:41:13 CET

This message: [ Message body ]
Next message: x: "Re: MV and SQL"
Previous message: dawn: "Re: Announcing New Blog"
Maybe in reply to: Forrest L Norvell: "candidate keys in abstract parent relations"
Next in thread: David Portas: "Re: candidate keys in abstract parent relations"
Reply: David Portas: "Re: candidate keys in abstract parent relations"
Reply: Roy Hann: "Re: candidate keys in abstract parent relations"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message