| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
Home -> Community -> Usenet -> comp.databases.theory -> Re: Help finding natural keys
>>>>> "Alan" == Alan Gutierrez <ajglist_at_izzy.net> writes:
Alan> Tim Cross wrote:
>>>>>>> "Alan" == Alan Gutierrez <ajglist_at_izzy.net> writes:
Alan> ISSUE 2
>>
>> >> What I found of far more concern was the way you generate your
>> >> surrogate keys. Determineing a new primary key by looking at
>> >> the largest current primary key and adding one to it is very
>> >> dangerous. The big danger is that first selecting the maximum
>> >> value from a field, adding some amount to it and then inserting
>> >> the resulting key back into the table is not an atomic action.
>>
Alan> INSERT INTO Patient (firm_id, patient_id) VALUES (:firm_id, 0)
Alan> ;
>>
Alan> The above is atomic in PostgreSQL. The first transaction to Alan> execute this statement will block any other transaction Alan> attempting to execute this statement until this first Alan> transactions rolls back or commits.
Alan> UPDATE Patient SET patient_id = (SELECT MAX(patient_id) FROM Alan> Patient AS P1 WHERE P1.firm_id = firm_id) WHERE firm_id = Alan> :firm_id AND patient_id = 0 Alan> ;
>> The actual insert/update operations might be atomic in themselves,
>> but not between each other. For example, in your above example,
>> what would happen if another process has inserted a new record
>> after you inserted a new record, but before you issue the update
>> command to set the patient id?
Alan> In PostgreSQL the second process would wait until the first Alan> process ended it's transaction.
>> My interpretation of what you have above would indicate the
>> possibility both records would get the same patient id - the
>> possibility might be slight because of the firm_id part in the
>> where clause and it is 'unlikely' two records from the same firm
>> will be inserted and waiting for update at the same time. However
>> 'unlikely' is not never.
Alan> I understand race conditions. I have tested this sequence of Alan> statments to ensure that with concurrent transactions, the Alan> first one to preform the insert statement will force other Alan> transactions to wait until transaction end to resume Alan> execution. This method for generating unique ids does work.>> >> guaranteeing your id numbers will be unique, it will probably
>> >> Most commercial (and even most free) DBMS have some facility
>> >> for generating id numbers which are guaranteed to be unique
>> >> (e.g. sequence). If your system has such a facility, use it
>> >> rather than the max(some_column)+1 approach. Apart from
Alan> I'm going to put more thought into how I generate my surrogate Alan> keys. You are not the only one to raise and eye-brow to SELECT Alan> MAX. I chose this over sequences since a sequence was one more Alan> database object to maintain. This is an area where I have Alan>
Alan> The maintianence overhead of sequences still exists, and the Alan> problem of generating duplicate keys has been resolved. Are Alan> sequences still the better bet? Alan> Thank you for your concern. Data races and deadlock are a Alan> frequently a source of trouble for those new to multi-process Alan> environments. They still trip me up on (not-so-rare)Alan> occasion. I appreciate all the frantic hand waving. I'll post a Alan> sanity check over at a PostgreSQL forum. But this does work.
First, I preface what follows with the admission I know nothing about postgres and what follows is based largely on assumption.
I would still go with the sequence and here is why -
Finally, you say you have tested your technique to make sure it works. I just want to point out how difficult it is to test for something like this. It is not sufficient to just create a number of concurrent sessions and try it out because there are too many variables which can affect the results, machine speed, number of cpus, amount of memory, network latency, load from other processes, timing etc. Really the only way to be certain it works is to do an in-depth analysis of how the dbms works, its locking mechanisms etc. This would involve an audit of the code etc. I'm not trying to be critical of your testing skills etc, simply just point out how complex it can be to make a solid determination that any form of race condition does not exist - you really need to prove it theoretically rather than practically based on the source code. Note also that you do NOT need to have multiple processors for this situation to occur. It can happen with a single processor because there is no guarantee on the order of execution of commands between two processes on a single processor - it all depends on issues like how the OS schedules jobs, priorities of the processes etc.
HTH Tim
-- Tim Cross The e-mail address on this message is FALSE (obviously!). My real e-mail is to a company in Australia called rapttech and my login is tcross - if you really need to send mail, you should be able to work it out!Received on Sat Jan 18 2003 - 20:42:44 CST
![]() |
![]() |