Re: Separate PK in Jxn Tbl?

From: Brian Selzer <>
Date: Sat, 26 Jan 2008 03:45:44 -0500
Message-ID: <YqCmj.5515$>

"Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)> wrote in message news:uTvWO%23%23XIHA.1532_at_TK2MSFTNGP04.phx.gbl...
> To that, I would add that the increased simplicity of using a surrogate
> (or artificial or autonumber) key as the primary key in place of a
> composite key is only half their advantage.
> The biggest problem that I have with composite keys is that they share the
> same fundamental problem as natural keys: using them as the primary key is
> allowing the fact that a primary key can change its value over time.
> IMHO, a primary key should never be allowed to change its value once it
> has been created; a assumption which will forbid the use of a composite
> key in many cases. (Of course, if you don't mind to see a primary key
> changing its value after its creation then you are not concerned by this
> argument.).

This argument has an inherent fallacy in it. Just because a key is composed from multiple columns doesn't necessarily mean that its values can be different in different database states. For example, in an Inventory table that has the key, {ItemKey, WarehouseKey}, with references to an Item table and a Warehouse table respectively, the combination values that comprise each key value can never change from database state to database state. A particular combination of values identifies a particular individual in the Universe of Discourse in /every/ database state in which it appears. It can /never/ identify any other individual. Therefore, it should be obvious that adding an additional autonumber primary key in this instance would be superfluous, since each {ItemKey, WarehouseKey} combination already rigidly designates a distinct individual in the Universe of Discourse.

The same can be said for many natural keys. For example, suppose you have a table, Queue, that has an integer key, {Position}. Each value for Position rigidly designates a distinct individual in the Universe of Discourse (3 always means "third in line" in any database state in which there are 3 or more elements), so therefore there is no need for an additional autonumber primary key.

> This is not only a theoritical argument as many interfaces - like Access -
> won't like to see a primary key that could change it value. But even if
> you take out such interfaces out of the equation, the use of a surrogate
> key for all tables reveals itself to be advantageous in many database
> problems. For example, if you want to add a log of all changes to a
> table, it's much more easier to design it if the table use a surrogate key
> for its primary key than a natural key or a composite key.
> Personally, I stopped using natural keys and composite keys many years ago
> and probably that something like half of my problems with the design of
> databases have vanished with them. On these occasions when I was called
> to work on a problematic database, chances was much higher to see that the
> problems were associated with the use of natural keys and/or composite
> keys than with the use of a surrogate keys and the solutions were usually
> much more complicated to solve in the first case than in the second case.
> Also, I've remember some peoples who have done like me and have stopped
> using natural and composite keys in favor of the exclusive use of
> surrogate keys but I don't remember anyone doing the opposite; ie. going
> from the use of surrogate keys to the use of natural and composite keys.
> --
> Sylvain Lafontaine, ing.
> MVP - Technologies Virtual-PC
> E-mail: sylvain aei ca (fill the blanks, no spam please)
> <> wrote in message
> On Jan 25, 9:12 am, Jamie Collins <> wrote:
>> On Jan 24, 11:00 pm, "James A. Fortune" <>
>> wrote:
>> > Whenever I have multiple key fields, natural or not, I create an
>> > AutoNumber PK for pragmatic reasons. The main reason is that it makes
>> > it easier to create the joins. The theorists are champions at joining
>> > tables and don't have to be concerned with the complexity of the SQL
>> > they write.
>> Word to the wise: 'theorists' hate SQL.
>> > Queries involving many to many relationships often add
>> > additional tables later and highlight the need to keep joins as simple
>> > as possible.
>> I think I'm with Bob Badour (gulp!) on the issue of complexity,
>> though: if you think more columns in the ON clause makes a SQL join
>> more 'complex' then I think you could be looking at things wrong.
>> Having more characters to type increases the risk of typos? More
>> columns mean you may omit one in error? The SQL engine may be twice as
>> slow in handling two columns rather than one? Is it more 'complex' to
>> split a post address into 'subatomic' columns (address lines from
>> postal code/zip etc)?
>> Surely the issue you allude to (I think) is the one that Access
>> Relationships (as distinct from Jet foreign keys) were invented to
>> solve? i.e. you pre-define the join columns and 'join type' (inner
>> join, left outer join or right outer join) and the join clause gets
>> written as SQL for you when you drop the tables into the Query Builder
>> thing. I would have thought the 'theorists' would love the fact that
>> you also create foreign keys in the same Relationships dialog i.e. you
>> end up with a natural join (not having to explicitly specify the
>> columns yourself) because one table references the other.
>> [I tend to be dismissive of tools that write SQL code for me but I
>> think I should perhaps review my stance e.g. I still write all my SQL
>> Server procs by hand whereas I have tasked myself to investigate CRUD
>> generators. But, for the time being, ...] As a SQL coder myself, I
>> find it more annoying that I have to create multiple joins to get the
>> 'natural key' values, having to discover what the 'artificial key'
>> columns are in the first place.
>> > Lately, I've increased the amount of normalization in one
>> > of my databases and the joins got even more complicated, adding about a
>> > line or so in the SQL view in Access for every new query using those
>> > tables.
>> Bad luck: I think you might have got way with "reduced the amount of
>> denormalization" ;-) In this thread I've already broken my personal
>> rule (!!) about not mentioning normalization [formulated because the
>> average 'replier' around here thinks "fully normalized" is BCNF, which
>> they think is 3NF anyhow, and doesn't pay much attention to anomalies
>> that normalization doesn't address, unless the 'asker' mentions
>> storing calculations...]
>> > I keep Jamie's advice in the back of my mind, about how enforcing
>> > constraints at the table level is better than enforcing them through
>> > code
>> ..and best to do it in both places! Bear in mind that it's a rule of
>> thumb i.e. "strict rules modified in practise." Checking something in
>> the in front end allows you to give timely user feedback and could
>> save them some keying, not to mention a database roundtrip. Checking
>> in the database catches anything neglected in the front end by
>> omission of validation or introduction of bugs. In practice, some
>> things are better done in one place but not the other: contrast the
>> validation of the basic pattern of an email address with the
>> verification that an addressable entity can be contacted at that email
>> address; I don't think it would be sensible to put the latter test
>> into a table constraint, even if it were possible.
>> Jamie.
>> --
> What part of simpler don't you understand :-). Only one expression in
> the ON is simpler. Needing less indexes is simpler. Not having to
> look for your multi-key fields is easier, although your point that
> Relationships can handle that is valid. If the AutoNumber key has a
> one-to-one relationship with the multi-key fields then it's fine to
> use it. There's no down side that I can see. I also like to rely on
> coding to detect inconsistent data rather than on error trapping, so I
> have to check the multi-key values anyway before adding a new record.
> I think that your idea about enforcing constraints at both the table
> level and in code is an excellent idea. The OP wanted to know what
> people did and why. I still don't see any reason put forward for me
> to change to a multi-field key. Are totals queries easier when multi-
> field keys are used? BTW, "reduced the amount of denormalization"
> works just as well. Real databases experience denormalizing
> influences.
> James A. Fortune
Received on Sat Jan 26 2008 - 09:45:44 CET

Original text of this message