Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: Effective deadlock handling

Re: Effective deadlock handling

From: Saeed <sr_ng_at_goawaynms-sys-lts.demon.co.uk>
Date: Sat, 3 Jan 2004 16:06:10 +0000
Message-ID: <jaqbyOIyhu9$Ewol@nms-sys-ltd.demon.co.uk>


In message <1073118217.32288_at_yasure>, Daniel Morgan <damorgan_at_x.washington.edu> writes
>Galen Boyer wrote:
>
>> On Fri, 02 Jan 2004, anacedent_at_hotmail.com wrote:
>>
>>>With Oracle, readers do NOT block writers & writers do NOT
>>>block readers. Please provide specific & actual evidence of a
>>>deadlock within your Oracle instance.
>> In any old sqlplus session execute the following;
>>
>> ORA> create table deadlock (id number, fld char(1));
>>
>> ORA> insert into deadlock values (1,'A');
>> ORA> insert into deadlock values (2,'B');
>> ORA> commit;
>> Now, open up another sqlplus session
>> In Session1, issue the following:
>> ORA> update deadlock set fld = 'M' where id = 1;
>> Next, In Session2, issue the following;
>> ORA> update deadlock set fld = 'N' where id = 2;
>> Next, in Session1, issue the following;
>> ORA> update deadlock set fld = 'X' where id = 2;
>> Hm... Notice a waiting session?
>> Next, in Session2, issue the following;
>> ORA> update deadlock set fld = 'Y' where id = 1;
>> Hm... Does one of those sessions now show the following error?
>> ERROR at line 1:
>> ORA-00060: deadlock detected while waiting for resource
>>
>
>But of course. Anyone can intentionally create a deadlock.
>
>But you weren't paying attention to the OP's code. The OP had SELECT FOR UPDATE. And if you use SELECT FOR UPDATE no deadlock
>can occur. Not even with your contrived example.
>
>Given what the OP posted I don't believe any deadlock existed. Though I stand ready to be proven incorrect if he responds to our
>request that he explain what it is he thinks demonstrates that a deadlock has taken place.
>

Thanks to all for the responses, a couple of things to clear up.

The reason for using the SELECT FOR UPDATE is to ensure that the UPDATE's only be attempted if we know that the locks have been acquired. If the locks aren't acquired, then we employ some scheme to work around it (retry, or save to an intermediate table), rather than simply bombing out with the deadlock exception, as has been happening.

If this was just a single UPDATE transaction, there would be no problem, but there are 2 UPDATEs, and other sessions may require the 2 UPDATEs to operate on rows in reverse order to some other sessions, the chances of deadlock occurring are very real.

I am well aware that within Oracle readers do not block writers and vice versa, that much I have learnt in my 13 years experience with Oracle. I would like to point out that this is not an application I built - I have been assigned this problem to fix.

Second, having trumpeted my 13 years experience, I must eat a huge slice of humble pie and admit there was a bug in my code as follows:

IF p_old_date < p_old_date THEN

   OPEN get_lock ( p_per_id, p_old_date );    CLOSE get_lock;
   OPEN get_lock ( p_per_id, p_new_date );    CLOSE get_lock;
ELSE
   OPEN get_lock ( p_per_id, p_new_date );    CLOSE get_lock;
   OPEN get_lock ( p_per_id, p_old_date );    CLOSE get_lock;
END IF; instead of

IF p_old_date < p_new_date THEN
.
.

In tests (2 scripts that call the procedure in a loop 100 times, run simultaneously) the bug version yielded many deadlock errors. The fixed version thus far appears to be reliable, though I am continuing to run the tests.

In arriving at this current logic, I have tried a number of schemes, and there are occasions when even the SELECT FOR UPDATE does not prevent deadlocks. For example, my first attempt had a cursor as follows:

CURSOR get_lock ( cp_per_id NUMBER,
                  cp_old_date DATE,
                  cp_new_date DATE ) IS
SELECT id,
       num_cancelled,
       num_booked

FROM job_stats
WHERE per_id = cp_per_id
AND ( job_date = cp_old_date OR job_date = cp_new_date ) FOR UPDATE OF num_cancelled, num_booked;

This does produce deadlocks, see the trace file excerpt:

Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production With the Partitioning, OLAP and Oracle Data Mining options JServer Release 9.2.0.1.0 - Production
ORACLE_HOME = /opt/oracle/prod/9i_R2
System name: Linux

Node name:      nms3
Release:        2.4.18-4GB
Version:        #1 Wed Mar 27 13:57:05 UTC 2002
Machine:        i686

Instance name: DEVDB9I
Redo thread mounted by this instance: 1
Oracle process number: 13
Unix process pid: 8334, image: oracle_at_nms3 (TNS V1-V3)

SELECT id,

             num_cancelled,
             num_booked
      FROM job_stats
      WHERE per_id = :b3
      AND   ( job_date = :b2 OR job_date = :b1 )
      FOR UPDATE OF num_cancelled, num_booked
End of information on OTHER waiting sessions.

I tried imposing some form of ordering with this as follows:

CURSOR get_lock ( cp_per_id NUMBER,
                  cp_old_date DATE,
                  cp_new_date DATE ) IS

IF p_old_date < p_old_date THEN

   OPEN get_lock ( p_per_id, p_old_date, p_new_date );    CLOSE get_lock;
ELSE
   OPEN get_lock ( p_per_id, p_new_date, p_old_date );    CLOSE get_lock;
END IF; But this had no effect. I even tried using indexes with hints to influence the order in which Oracle finds and subsequently locks the rows, but this too did not work. Using the previous logic (2 separate OPEN's) appears to be working fine now, though there have been a few false dawns before.

The production system on which the error was first noticed is Oracle 8.1.7.4 on Solaris 8 (E10000). The error has been reproduced on the development Oracle 8.1.7.4 instances on both Windows NT and Solaris 8. In addition I have conducted tests on a 9i instance on Linux, and have again been able to reproduce the error.

If anyone would like me to send them some standalone scripts that reproduce the problem, feel free to email me (remove the "goaway" from the email address), and I will oblige.

Kind regards,

Saeed

sr_ng 786 Received on Sat Jan 03 2004 - 10:06:10 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US