Re: Separate PK in Jxn Tbl?

From: Brian Selzer <brian_at_selzer-software.com>
Date: Wed, 30 Jan 2008 09:32:17 GMT
Message-ID: <BuXnj.2018$0w.1637_at_newssvr27.news.prodigy.net>


"Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)> wrote in message news:%23GHRcuxYIHA.3652_at_TK2MSFTNGP02.phx.gbl...
> Sorry if I didn't responded before but I wasn't sure to understand what
> you have wrote. I partage your opinion that a composite key can not only
> unequivocally identifying any item in a particular database but also
> remain constant (ie. never change from database state to database state)
> in many systems.
>
> But as I said, this is not true for all systems and there are occasions
> where the value will change from state to state. For instance, is instead
> of an inventory system you have a sport ligue system, with a table making
> a jonction between a list of players and a list of team; it's easy to see
> that in such a table, the composite key may change its value from state to
> state; for example when a player is exchanged between two teams.
>

I was just pointing out that there are times when composite key values and natural key values permanently identify individuals, and in those instances, the autonumber primary key is just an added complication that serves no material purpose. One can claim that they speed up queries, when in fact they they slow things down due to the additional indexes that need to be maintained. And as has been shown in other posts, the number of joins necessary to answer even simple queries increases with the use of autonumber primary keys, so in fact they may degrade query performance.

> I don't want to enter into a discussion of the full range of possibilities
> (for example, do you want the database to remember the previous
> relationship, what about the player number (if a player change his/her
> number, do you want to keep the older information?), the individual
> statistics, etc., etc.) but it doesn't take too long to see that in such a
> situation, the use of a composite key to express the relationships between
> tables will rapidly become like hell. Like someone else has said: « been
> there, done that » and personally, it's not my intention to go back there.
> Everyone know that when it's time to make a decision, one personal
> experience has more weight than a thousand opinions so for me, my first
> reaction about using a composite primay key will be a no go.
>

I think that if you distill all of the reasons you're citing, they all boil down to whether or not a key's values are permanent identifiers or not. If they're not, and there is a need that they be, then the addition of a permanent identifier may be necessary. Now whether that's an autonumber or some other kind of surrogate isn't really that important.

> But there are other occasions, like your inventory system, where the
> possibility that a composite primary key can change its value don't exist.
> In these occasions, would it be overkill to use a separate primary key?
> Personally, I don't mind using a separate primary key even on these
> occasions but I fully understand that other people might feel
> unconfortable to do the same. Everyone know that the devil like to hide
> in the details; probably that besides the details, the composite keys are
> also one of his favorite places to hide.
>
> --
> Sylvain Lafontaine, ing.
> MVP - Technologies Virtual-PC
> E-mail: sylvain aei ca (fill the blanks, no spam please)
>
>
> "Brian Selzer" <brian_at_selzer-software.com> wrote in message
> news:YqCmj.5515$Rg1.711_at_nlpi068.nbdc.sbc.com...
>>
>> "Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
>> wrote in message news:uTvWO%23%23XIHA.1532_at_TK2MSFTNGP04.phx.gbl...
>>> To that, I would add that the increased simplicity of using a surrogate
>>> (or artificial or autonumber) key as the primary key in place of a
>>> composite key is only half their advantage.
>>>
>>> The biggest problem that I have with composite keys is that they share
>>> the same fundamental problem as natural keys: using them as the primary
>>> key is allowing the fact that a primary key can change its value over
>>> time. IMHO, a primary key should never be allowed to change its value
>>> once it has been created; a assumption which will forbid the use of a
>>> composite key in many cases. (Of course, if you don't mind to see a
>>> primary key changing its value after its creation then you are not
>>> concerned by this argument.).
>>>
>>
>> This argument has an inherent fallacy in it. Just because a key is
>> composed from multiple columns doesn't necessarily mean that its values
>> can be different in different database states. For example, in an
>> Inventory table that has the key, {ItemKey, WarehouseKey}, with
>> references to an Item table and a Warehouse table respectively, the
>> combination values that comprise each key value can never change from
>> database state to database state. A particular combination of values
>> identifies a particular individual in the Universe of Discourse in
>> /every/ database state in which it appears. It can /never/ identify any
>> other individual. Therefore, it should be obvious that adding an
>> additional autonumber primary key in this instance would be superfluous,
>> since each {ItemKey, WarehouseKey} combination already rigidly designates
>> a distinct individual in the Universe of Discourse.
>>
>> The same can be said for many natural keys. For example, suppose you
>> have a table, Queue, that has an integer key, {Position}. Each value for
>> Position rigidly designates a distinct individual in the Universe of
>> Discourse (3 always means "third in line" in any database state in which
>> there are 3 or more elements), so therefore there is no need for an
>> additional autonumber primary key.
>>
>>> This is not only a theoritical argument as many interfaces - like
>>> Access - won't like to see a primary key that could change it value.
>>> But even if you take out such interfaces out of the equation, the use of
>>> a surrogate key for all tables reveals itself to be advantageous in many
>>> database problems. For example, if you want to add a log of all changes
>>> to a table, it's much more easier to design it if the table use a
>>> surrogate key for its primary key than a natural key or a composite key.
>>>
>>> Personally, I stopped using natural keys and composite keys many years
>>> ago and probably that something like half of my problems with the design
>>> of databases have vanished with them. On these occasions when I was
>>> called to work on a problematic database, chances was much higher to see
>>> that the problems were associated with the use of natural keys and/or
>>> composite keys than with the use of a surrogate keys and the solutions
>>> were usually much more complicated to solve in the first case than in
>>> the second case.
>>>
>>> Also, I've remember some peoples who have done like me and have stopped
>>> using natural and composite keys in favor of the exclusive use of
>>> surrogate keys but I don't remember anyone doing the opposite; ie. going
>>> from the use of surrogate keys to the use of natural and composite keys.
>>>
>>> --
>>> Sylvain Lafontaine, ing.
>>> MVP - Technologies Virtual-PC
>>> E-mail: sylvain aei ca (fill the blanks, no spam please)
>>>
>>>
>>> <CDMAPoster_at_fortunejames.com> wrote in message
>>> news:a12c7757-aea7-4d11-bf7c-3bd4b7feb442_at_n20g2000hsh.googlegroups.com...
>>> On Jan 25, 9:12 am, Jamie Collins <jamiecoll..._at_xsmail.com> wrote:
>>>> On Jan 24, 11:00 pm, "James A. Fortune" <MPAPos..._at_FortuneJames.com>
>>>> wrote:
>>>>
>>>> > Whenever I have multiple key fields, natural or not, I create an
>>>> > AutoNumber PK for pragmatic reasons. The main reason is that it makes
>>>> > it easier to create the joins. The theorists are champions at joining
>>>> > tables and don't have to be concerned with the complexity of the SQL
>>>> > they write.
>>>>
>>>> Word to the wise: 'theorists' hate SQL.
>>>>
>>>> > Queries involving many to many relationships often add
>>>> > additional tables later and highlight the need to keep joins as
>>>> > simple
>>>> > as possible.
>>>>
>>>> I think I'm with Bob Badour (gulp!) on the issue of complexity,
>>>> though: if you think more columns in the ON clause makes a SQL join
>>>> more 'complex' then I think you could be looking at things wrong.
>>>> Having more characters to type increases the risk of typos? More
>>>> columns mean you may omit one in error? The SQL engine may be twice as
>>>> slow in handling two columns rather than one? Is it more 'complex' to
>>>> split a post address into 'subatomic' columns (address lines from
>>>> postal code/zip etc)?
>>>>
>>>> Surely the issue you allude to (I think) is the one that Access
>>>> Relationships (as distinct from Jet foreign keys) were invented to
>>>> solve? i.e. you pre-define the join columns and 'join type' (inner
>>>> join, left outer join or right outer join) and the join clause gets
>>>> written as SQL for you when you drop the tables into the Query Builder
>>>> thing. I would have thought the 'theorists' would love the fact that
>>>> you also create foreign keys in the same Relationships dialog i.e. you
>>>> end up with a natural join (not having to explicitly specify the
>>>> columns yourself) because one table references the other.
>>>>
>>>> [I tend to be dismissive of tools that write SQL code for me but I
>>>> think I should perhaps review my stance e.g. I still write all my SQL
>>>> Server procs by hand whereas I have tasked myself to investigate CRUD
>>>> generators. But, for the time being, ...] As a SQL coder myself, I
>>>> find it more annoying that I have to create multiple joins to get the
>>>> 'natural key' values, having to discover what the 'artificial key'
>>>> columns are in the first place.
>>>>
>>>> > Lately, I've increased the amount of normalization in one
>>>> > of my databases and the joins got even more complicated, adding about
>>>> > a
>>>> > line or so in the SQL view in Access for every new query using those
>>>> > tables.
>>>>
>>>> Bad luck: I think you might have got way with "reduced the amount of
>>>> denormalization" ;-) In this thread I've already broken my personal
>>>> rule (!!) about not mentioning normalization [formulated because the
>>>> average 'replier' around here thinks "fully normalized" is BCNF, which
>>>> they think is 3NF anyhow, and doesn't pay much attention to anomalies
>>>> that normalization doesn't address, unless the 'asker' mentions
>>>> storing calculations...]
>>>>
>>>> > I keep Jamie's advice in the back of my mind, about how enforcing
>>>> > constraints at the table level is better than enforcing them through
>>>> > code
>>>>
>>>> ..and best to do it in both places! Bear in mind that it's a rule of
>>>> thumb i.e. "strict rules modified in practise." Checking something in
>>>> the in front end allows you to give timely user feedback and could
>>>> save them some keying, not to mention a database roundtrip. Checking
>>>> in the database catches anything neglected in the front end by
>>>> omission of validation or introduction of bugs. In practice, some
>>>> things are better done in one place but not the other: contrast the
>>>> validation of the basic pattern of an email address with the
>>>> verification that an addressable entity can be contacted at that email
>>>> address; I don't think it would be sensible to put the latter test
>>>> into a table constraint, even if it were possible.
>>>>
>>>> Jamie.
>>>>
>>>> --
>>>
>>> What part of simpler don't you understand :-). Only one expression in
>>> the ON is simpler. Needing less indexes is simpler. Not having to
>>> look for your multi-key fields is easier, although your point that
>>> Relationships can handle that is valid. If the AutoNumber key has a
>>> one-to-one relationship with the multi-key fields then it's fine to
>>> use it. There's no down side that I can see. I also like to rely on
>>> coding to detect inconsistent data rather than on error trapping, so I
>>> have to check the multi-key values anyway before adding a new record.
>>> I think that your idea about enforcing constraints at both the table
>>> level and in code is an excellent idea. The OP wanted to know what
>>> people did and why. I still don't see any reason put forward for me
>>> to change to a multi-field key. Are totals queries easier when multi-
>>> field keys are used? BTW, "reduced the amount of denormalization"
>>> works just as well. Real databases experience denormalizing
>>> influences.
>>>
>>> James A. Fortune
>>> CDMAPoster_at_FortuneJames.com
>>>
>>
>>
>
>
Received on Wed Jan 30 2008 - 10:32:17 CET

Original text of this message