Re: Deadlock problem

From: Chris Dawe <the-labyrinth_at_ntlworld.com>
Date: Mon, 18 Mar 2002 11:25:22 +0000
Message-ID: <rkjb9us8decdo2sf0cd4fru05d22imaujq_at_4ax.com>


On Mon, 18 Mar 2002 10:23:03 +0100, Ruud de Koter <ruud_dekoter_at_hp.com> wrote:

>Hi Tommi,
>
>Tommi Korhonen wrote:
>>
>> I have a client program executing database operations like
>>
>> op1: newid := select max(id)+1 from mytable
>> op2: insert into mytable values (newid,data)
>>
>> where op1 and op2 are executed inside the same transaction. If two clients are
>> running at the same time, they generate transactions T1 and T2. If the
>> isolation level of the database is set to serializable, it's possible to
>> schedule operations like
>>
>> T1: start
>> T2: start
>> T1: op1
>> T2: op1
>> T1: op2
>> T1: commit
>> T2: op2
>> T2: commit
>>
>> which will lead to deadlock, since
>>
>> T1: op1 -> locks the table with shared-read lock to prevent phantom rows
>> T2: op1 -> locks the table with shared-read lock to prevent phantom rows
>> T1: op2 -> waits for T2 to unlock
>> (T1: commit -> never gets here, because T1 is still waiting)
>> T2: op2 -> waits for T1 to unlock --> DEADLOCK
>>
>> (details taken from sybase, but the result was the same with all other
>> databases tried)
>>
>> If the isolation level is lower than serializable, the deadlock is avoided, but
>> both clients will get the same id-number (and that's not the purpose).
>>
>> Triggers, auto-increment fields or stored procedures are not the answer, because
>> clients must work with every database, so proprietary features can not be used.
>> Locking the whole table with exclusive-lock would solve the situation, but I
>> think this also requires some proprietary actions.
>>
>> How can I fix this? I would think this is hardly a unique problem; op1+op2 must
>> be a very basic operation in every client-server application. Of course, the DBMS can
>> handle the deadlock by killing one of the transactions, but I would like to
>> avoid the deadlock situation here altogether.
>
>One solution would be to construct an additional table, which stores the
>max(id). You can then query this table, and update it with a new id value for
>each operation. If you need to get the max(id) on more than one table, simply
>add the table name to the new table's definition and query on the table name.
>
>This is probably faster also, as you don't need to determine the max(id) for
>every action you perform.
>
>Locking problems on the new table are easier to prevent:
>- lock rows exclusively, not the complete table (in Oracle I would use 'FOR
>UPDATE OF'),
>- make sure reading and updating the row is (part of) one transaction.
>
>As one transaction performs all actions before releasing the exclusive lock, no
>risk exists that other transactions will interfere in the meantime. One catch: you
>should keep the transaction as short as possible!
>
>Hope this gets you under way,
>
>Ruud.

Hello Tommi,

Ruud's suggestion of an "id" table is good for scalability - some DBMSs are not very good at optimising MAX(id) on a table and end up reading more data off disc than is really required.

To minimise locking problems in the "id" table, change the order of the statements so the UPDATE comes first. Taking the exclusive lock straight away avoids the shared-to-exclusive lock upgrade that caused the deadlock above:

	UPDATE my_ids
	SET next_id = next_id + 1
	WHERE category = 'my_table';

	SELECT next_id - 1 INTO :new_id
	FROM my_ids
	WHERE category = 'my_table';
	[a]
	INSERT INTO my_table (id, data)
	VALUES (:new_id, :data);
	COMMIT;

At [a] you can add a COMMIT. This reduces contention at the expense of occasionally leaving unused numbers (does the number sequence need to be continuous?).
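The UPDATE-first pattern can be sketched like this - a minimal illustration using Python's sqlite3 rather than Sybase, with the table and column names taken from the statements above (the `allocate_id` helper is mine, not from the original post):

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so we control
# transactions with explicit BEGIN/COMMIT statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE my_ids (category TEXT PRIMARY KEY, next_id INTEGER)")
cur.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, data TEXT)")
cur.execute("INSERT INTO my_ids VALUES ('my_table', 1)")

def allocate_id(cur, category):
    # UPDATE first: the write lock is taken immediately, so a concurrent
    # transaction blocks here instead of deadlocking on a lock upgrade.
    cur.execute("BEGIN IMMEDIATE")
    cur.execute("UPDATE my_ids SET next_id = next_id + 1 WHERE category = ?",
                (category,))
    cur.execute("SELECT next_id - 1 FROM my_ids WHERE category = ?",
                (category,))
    (new_id,) = cur.fetchone()
    cur.execute("COMMIT")  # the COMMIT at [a]: release the id-table lock early
    return new_id

new_id = allocate_id(cur, "my_table")
cur.execute("INSERT INTO my_table (id, data) VALUES (?, ?)", (new_id, "data"))
```

Because the id allocation commits at [a], a second client calling `allocate_id` at the same moment simply waits for the short id-table transaction, never for the whole insert.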

The simple but less scalable solution is

	INSERT INTO my_table (id, data)
	SELECT 1+MAX(id), 'this is my new data'
	FROM my_table;
	COMMIT;

Exactly how this works will depend on your DBMS: whether it takes the lock for the INSERT before or after the lock for the SELECT, and whether those locks are table-, page- or row-level.
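For illustration, here is that single-statement variant run against sqlite3 in Python. The COALESCE is my addition, not part of the original statement - it covers the empty-table case, where MAX(id) is NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, data TEXT)")

# Allocate the id and insert the row in one statement, so the DBMS holds
# whatever locks it needs for the whole allocate-and-insert step at once.
cur.execute("""
    INSERT INTO my_table (id, data)
    SELECT COALESCE(MAX(id), 0) + 1, 'this is my new data'
    FROM my_table
""")
cur.execute("""
    INSERT INTO my_table (id, data)
    SELECT COALESCE(MAX(id), 0) + 1, 'second row'
    FROM my_table
""")
conn.commit()
ids = [row[0] for row in cur.execute("SELECT id FROM my_table ORDER BY id")]
```

SQLite serialises writers, so this sketch cannot show the lock-ordering differences mentioned above; on a row- or page-locking DBMS the same statement may behave quite differently under concurrency.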
-- 
Chris Dawe
Database Consultant
cedawe_at_bcs.org.uk
Received on Mon Mar 18 2002 - 12:25:22 CET
