RE: determining bind values in deadlock situations

From: Barun, Vlado <Vlado.Barun_at_JTV.com>
Date: Wed, 4 Mar 2009 07:54:32 -0500
Message-ID: <0181A069127F1944B484ED5B858D0C160C119879D2_at_KPMSPW02.jewelry.acn>



Your test case proves that I was incorrect :)

The trace files only contained two copies of update t set a=9 where a=3;
update t set a=10 where a=1;

Thank you for spending the time on educating me on this :)

Regards,
 
Vlado Barun, M.Sc.
Sr. Manager, Database Engineering and Operations Jewelry Television
Mobile: 865 335 7652
Email: vlado.barun_at_jtv.com

-----Original Message-----
From: Bobak, Mark [mailto:Mark.Bobak_at_proquest.com] Sent: Wednesday, March 04, 2009 12:30 AM To: Barun, Vlado; tanel_at_poderc.com; oracle-l_at_freelists.org Subject: RE: determining bind values in deadlock situations

Hi Vlado,

I'm very surprised. I don't see how the trace can identify the SQL that originally took the lock.

I ran on 11.1.0.6 RAC, but, was not able to duplicate results. How did you get the lmd trace for both sessions? I was only able to get the lmd trace for the session that caught the ORA-00060.

Unfortunately, my only 10gR2 RAC is 10.2.0.3 and doesn't have the patch you mentioned.

Can you please run the following, slightly modified, test case? My testcase:

create table t(a int primary key);

insert into t select rownum from dba_tables where rownum <5; commit;

session 1:
update t set a=5 where a=1;
update t set a=6 where a=2;

session 2:
update t set a=7 where a=3;
update t set a=8 where a=4;

session 1:
update t set a=9 where a=3;

session 2:
update t set a=10 where a=1;

I fully expect the trace files to contain: update t set a=9 where a=3;
update t set a=10 where a=1;

What I'm curious about is, will they contain: update t set a=5 where a=1;
and
update t set a=7 where a=3;

or

update t set a=6 where a=2;
and
update t set a=8 where a=4;

or all 4? Or something else entirely?

If it's really identifying the statement that originally created the lock that caused the block, I'd expect the traces to contain: update t set a=5 where a=1;
and
update t set a=7 where a=3;

Note that the other updates:
update t set a=6 where a=2;
and
update t set a=8 where a=4;

are holding TX locks, but are not involved in causing any blocks related to the deadlock.

The purpose of my test, is to determine if the trace is really showing the statement that created the TX lock causing the block, or just the previous SQL executed before the SQL that caught the deadlock....?

-Mark



From: Barun, Vlado [Vlado.Barun_at_JTV.com] Sent: Tuesday, March 03, 2009 11:27 PM
To: Bobak, Mark; tanel_at_poderc.com; oracle-l_at_freelists.org Subject: RE: determining bind values in deadlock situations

That is not correct.

Sql_a, sql_b, sql_c and sql_d are in the lmd0 trace files in a RAC environment, assuming you have the necessary patches applied (in my case 4441119 for 10.2.0.3 RAC/ASM on Solaris 10).

Following is a test case:

***Session 1:***

vbarun_at_qs01> create table t(a int);
Table created.

vbarun_at_qs01> insert into t values (1); 1 row created.

vbarun_at_qs01> insert into t values (2); 1 row created.

vbarun_at_qs01> commit;
Commit complete.

vbarun_at_qs02> update t set a=1 where a=1; 1 row updated.

***Session 2:***

vbarun_at_qs02> update t set a=1 where a=2; 1 row updated.

vbarun_at_qs02> update t set a=1 where a=1;

***Session 1: ***

vbarun_at_qs01> update t set a=1 where a=2; update t set a=1 where a=2

       *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

***Session 1 lmd0 trace file: ***

user session for deadlock lock a00a79e50   pid=43 serial=41143 audsid=803118528 user: 6607/VBARUN   O/S info: user: vla1, term: W75, ospid: 6012:6028, machine: W75

            program: sqlplus.exe
  application name: SQL*Plus, hash value=3669949024   Current SQL Statement:
  update t set a=1 where a=1

user session for deadlock lock 6813e19c8   pid=43 serial=41143 audsid=803118528 user: 6607/VBARUN   O/S info: user: vla1, term: W75, ospid: 6012:6028, machine: W75

            program: sqlplus.exe
  application name: SQL*Plus, hash value=3669949024   Current SQL Statement:
  update t set a=1 where a=1

Global Wait-For-Graph(WFG) at ddTS[0.c] :
BLOCKED a00cc3618 5 [0x8001c][0x2e11d9],[TX] [2004-0047-00001C7C] 1
BLOCKER a00a79e50 5 [0x8001c][0x2e11d9],[TX] [1002-002B-00000736] 0
BLOCKED 6813e19c8 5 [0x12002a][0x4454e0],[TX] [1002-002B-00000736] 0 BLOCKER 4701c8478 5 [0x12002a][0x4454e0],[TX] [2004-0047-00001C7C] 1

***Session 2 lmd0 trace file: ***

user session for deadlock lock a00cc3618   pid=71 serial=8786 audsid=803130021 user: 6607/VBARUN   O/S info: user: vla1, term: W75, ospid: 8108:6940, machine: W75

            program: sqlplus.exe
  application name: SQL*Plus, hash value=3669949024   Current SQL Statement:
  update t set a=1 where a=2

user session for deadlock lock 4701c8478   pid=71 serial=8786 audsid=803130021 user: 6607/VBARUN   O/S info: user: vla, term: W75, ospid: 8108:6940, machine: W75

            program: sqlplus.exe
  application name: SQL*Plus, hash value=3669949024   Current SQL Statement:
  update t set a=1 where a=2

As you can see, all 4 statements involved in the deadlock are in the trace files.

Regards,

Vlado Barun, M.Sc.
Sr. Manager, Database Engineering and Operations Jewelry Television
Mobile: 865 335 7652
Email: vlado.barun_at_jtv.com

-----Original Message-----
From: Bobak, Mark [mailto:Mark.Bobak_at_proquest.com] Sent: Tuesday, March 03, 2009 10:35 PM
To: Barun, Vlado; tanel_at_poderc.com; oracle-l_at_freelists.org Subject: RE: determining bind values in deadlock situations

If you don't mind me jumping in here, Vlado, I think I can clarify what Tanel was trying to say.

Assume you have a simple two session deadlock. Session 1 takes a TX lock after executing a SQL statement, call it sql_a. Session 2 takes a TX lock after executing a SQL, call it sql_b. Now, no deadlock yet, no waiting yet. Now, session 1 executes a SQL, call it sql_c, and it tries to take a TX lock that's not compatible w/ the TX lock held by session 2. So, session 1 starts waiting on session 2 to commit or rollback. Still no deadlock though. Finally, session 2 executes some SQL, call it sql_d, that wants to take a TX lock that's not compatible w/ the TX lock held by session 1. So, it starts waiting on session 1. This is a deadlock.

So, oracle raises ORA-60, and causes a tracefile to be written. The SQLs involved in the deadlock are written to the trace file. Those are the sql_c and sql_d mentioned above. But, this does not tell you what SQL statements initially took the TX locks. That is, the sql_a and sql_b from the example above are not in the tracefile, and are generally not possible to identify, short of having 10046 tracing enabled.

Hope that helps,

-Mark



From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] On Behalf Of Barun, Vlado [Vlado.Barun_at_JTV.com] Sent: Tuesday, March 03, 2009 9:07 PM
To: tanel_at_poderc.com; oracle-l_at_freelists.org Subject: RE: determining bind values in deadlock situations

Tanel, thank you for the very useful information you posted on your blog about this.

One correction though to your statement ".but you don't necessarily know when, why and by which statement these locks were taken for.". The lmd0 trace files do show the statements involved in the deadlock. Maybe I'm misunderstanding your statement.

Regards,

Vlado Barun, M.Sc.
Sr. Manager, Database Engineering and Operations Jewelry Television
Mobile: 865 335 7652
Email: vlado.barun_at_jtv.com

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Tanel Poder Sent: Tuesday, March 03, 2009 7:38 PM
To: oracle-l_at_freelists.org
Subject: RE: determining bind values in deadlock situations

On a related topic,

It's possible to make Oracle to automatically dump more stuff when ORA-60 or any other error happens, with KSD diagnostic events...

You can issue something like this at system level for example (after thinking what you want to achieve and what events levels are safe):

SQL> alter session set events '60 trace name hanganalyze_global level 4, forever; name systemstate level 522, lifetime 1';

Session altered.
That would give you global hanganalyze every time the error occurs and a local systemstate dump only the first time it happens in a session. System state dumps process stacks thanks to level 512 + 10, that can be useful for diagnosing low-level deadlocks and hangs.

Btw, knowing the current SQL statement and bind variable values for both sessions might still not give the full picture of the root cause of the deadlock. You know what for the sessions are currently waiting, but you don't necessarily know when, why and by which statement these locks were taken for (and why they are still being held).

I just published a related blog entry about the diagnostic event setting syntax and some more complex constructs:

http://blog.tanelpoder.com/2009/03/03/the-full-power-of-oracles-diagnostic-events-part-1-syntax-for-ksd-debug-event-handling/

--
Regards,
Tanel Poder
http://blog.tanelpoder.com<http://blog.tanelpoder.com/>

________________________________
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Scott
Sent: 03 March 2009 20:43
To: Vlado.Barun_at_JTV.com
Cc: oracle-l_at_freelists.org
Subject: Re: determining bind values in deadlock situations
The big question is how do you know you are getting a deadlock?

ORA-60, RAC or not will still get generated. It may take longer in RAC because the detection method does change. However you can still get non-table deadlocks in RAC that are not always dumped in to a tracefile. Then you need to run hanganalyze or a systemstate dump and look for open chains. There serveral bugs in 10g that related to library cache related deadlocks that are not always indicated in a tracefile.

Scott

________________________________
From: "Barun, Vlado" <Vlado.Barun_at_JTV.com>
To: Jared Still <jkstill_at_gmail.com>
Cc: "oracle-l_at_freelists.org" <oracle-l_at_freelists.org>
Sent: Monday, March 2, 2009 7:21:12 PM
Subject: RE: determining bind values in deadlock situations
Jared,

Was your test done in a non-RAC environment?
I can find the rowid's the way you mentioned in a non-RAC environment, but not in a RAC environment.

There is no trace file generated for a deadlock in udump on any of my RAC nodes.

Are you able to get a deadlock trace file in a RAC environment?

Thank you for the time you are spending on this.

Regards,

Vlado Barun, M.Sc.
Sr. Manager, Database Engineering and Operations
Jewelry Television
Mobile: 865 335 7652
Email: vlado.barun_at_jtv.com

From: Jared Still [mailto:jkstill_at_gmail.com]
Sent: Monday, March 02, 2009 1:28 PM
To: Barun, Vlado
Cc: oracle-l_at_freelists.org
Subject: Re: determining bind values in deadlock situations






--
http://www.freelists.org/webpage/oracle-l
Received on Wed Mar 04 2009 - 06:54:32 CST

Original text of this message