RE: help with a GES deadlock

From: Walker, Jed S <Jed_Walker_at_cable.comcast.com>
Date: Tue, 16 Aug 2011 16:45:27 +0000
Message-ID: <BAA6E28B6241F046AED1E62D8697516C4E3A4E56_at_COPDCEXMB08.cable.comcast.com>



I'm pretty sure, but I can't be certain. I checked v$pwfile_users and it only contains SYS. I had checked the triggers on the table and there is only one trigger that is owned by the schema, not SYS. I checked the schema and found no triggers on any of the schema tables that weren't owned by the same schema. As far as I can tell, there are no connections in the system as SYS that are actually application, but apparently that isn't the case (unless the 0/SYS is a bug).

From: Tim Gorman [mailto:tim_at_evdbt.com] Sent: Tuesday, August 16, 2011 10:21 AM
To: Walker, Jed S; oracle-l_at_freelists.org Subject: Re: help with a GES deadlock

Jed,

That session is running the conflicting SQL statement "delete from STB_SETTING where STB_SETTING_ID=:1" -- are you certain that it is not running from a session connected as SYSDBA? If they are not connected explicitly as "/ as sysdba" or "sys", perhaps they are another user who has been granted SYSDBA and has connected as SYSDBA. Run the following query "select * from v$pwfile_users" and see if anyone other than the usual suspects have SYSDBA privileges, perhaps?

Also, are there any triggers on the FLUX_STB_MESSAGE_QUEUE table? If so, who is the owner of that trigger?

Hope this helps...

-Tim
-----Original Message-----

From: Walker, Jed S [mailto:Jed_Walker_at_cable.comcast.com] Sent: Tuesday, August 16, 2011 09:55 AM
To: tim_at_evdbt.com, oracle-l_at_freelists.org Subject: RE: help with a GES deadlock
Thanks. I?m curious though about something in that deadlock message below. It appears to be showing that some of the transactions are by SYS, which doesn?t make sense. Note: pid=41 serial=4770 audsid=156523088 user: 0/SYS

These commands are all run by the application connections, and I checked and don?t see SYS connections from any of the app servers.

any ideas?

From: Tim Gorman [mailto:tim_at_evdbt.com] Sent: Monday, August 15, 2011 1:59 PM
To: Walker, Jed S; oracle-l_at_freelists.org Subject: Re: help with a GES deadlock

Jed,

Don't get lost in the RAC wait events, they are all just mirrors to the non-RAC wait events. Roughly, RAC wait events break down in two categories: global enqueues and global cache. The former (global enqueue waits) mirror all the non-RAC wait events starting with the phrase "enq:", the latter (global cache waits) mirror the datafile I/O waits (i.e. db file xxx read) and I/O contention waits (i.e. "buffer busy wait", "read on other session", etc).

My advice is to filter out the RAC waits and perform diagnosis based on the non-RAC waits. Take a global view, of course -- look across all the instances, but try not to get distracted by the amplifications that the RAC waits present -- focus on the underlying non-RAC issues. Fix the problem that way, make the non-RAC waits go away, and the RAC waits disappear commensurately.

Hope this helps...

-Tim

-----Original Message-----

From: Walker, Jed S [mailto:Jed_Walker_at_cable.comcast.com] Sent: Monday, August 15, 2011 09:33 AM
To: oracle-l_at_freelists.org
Subject: help with a GES deadlock
Hi all,

Here?s my next new to RAC question. We are getting a Global Enqueue Services Deadlock quite often. Looking at the trace file it looks to me like it is a regular deadlock, so I?m wondering if something else is going on here, or if RAC just reports a regular deadlock as a GED deadlock. This is a five node 10.2.0.4.0 RAC cluster.

Each schema should only be used on one instance of the database (each ?market? has its own schema and all connections for that market are sent to a single service running on a single instance, so there should not be any blocks being worked by multiple instances).

Would a standard deadlock show up as a GED deadlock in RAC?

If I read this correctly, sid 42 is doing a delete and then an update, but 41 is doing an update and then a delete (presumably affecting the same rows).

Am I reading this correctly?

  • 2011-08-01 17:17:45.785 Setting 3-way CR grants to 1 global-lru off? 0
  • 2011-08-01 18:14:11.067 user session for deadlock lock 0x4560db490 pid=41 serial=4770 audsid=156523088 user: 0/SYS O/S info: user: , term: , ospid: 1234, machine: flux-atlt-04.westchester.pa.bo.comcast.net program: Current SQL Statement: delete from STB_SETTING where STB_SETTING_ID=:1 user session for deadlock lock 0x4560db340 pid=42 serial=16019 audsid=156522961 user: 70/RDVRGW_ATLT O/S info: user: , term: , ospid: 1234, machine: flux-atlt-01.westchester.pa.bo.comcast.net program: Current SQL Statement:

update FLUX_STB_MESSAGE_QUEUE set SETTING_TYPE_ID=:1, SETTING_TYPE_RECORD_ID=:2, MAC_ADDR=:3, MESSAGE_SEQUENCE=:4, REQUEST_TAG=:5, STB_SYNC_STATE_ID=:6, ACTION_ID=:7, MESSAGE_SENT_DATE=:8, REC_START_TIME=:9, REC_END_TIME=:10, MESSAGE_CREATED=:11, PRIORITY=:12, STB_SETTING_ID=:13 where FLUX_STB_MESSAGE_QUEUE_ID=:14 user session for deadlock lock 0x457130888   pid=42 serial=16019 audsid=156522961 user: 70/RDVRGW_ATLT   O/S info: user: , term: , ospid: 1234, machine: flux-atlt-01.westchester.pa.bo.comcast.net

            program:
  Current SQL Statement:

update FLUX_STB_MESSAGE_QUEUE set SETTING_TYPE_ID=:1, SETTING_TYPE_RECORD_ID=:2, MAC_ADDR=:3, MESSAGE_SEQUENCE=:4, REQUEST_TAG=:5, STB_SYNC_STATE_ID=:6, ACTION_ID=:7, MESSAGE_SENT_DATE=:8, REC_START_TIME=:9, REC_END_TIME=:10, MESSAGE_CREATED=:11, PRIORITY=:12, STB_SETTING_ID=:13 where FLUX_STB_MESSAGE_QUEUE_ID=:14 user session for deadlock lock 0x457130738   pid=41 serial=4770 audsid=156523088 user: 0/SYS   O/S info: user: , term: , ospid: 1234, machine: flux-atlt-04.westchester.pa.bo.comcast.net

            program:
  Current SQL Statement:
  delete from STB_SETTING where STB_SETTING_ID=:1

Global blockers dump start:---------------------------------
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x2fa0019][0xc6eb2],[TX]
----------resource 0x0x4329c7288----------------------
resname       : [0x2fa0019][0xc6eb2],[TX]
Local node    : 0
dir_node      : 0
master_node   : 0
hv idx        : 31
hv last r.inc : 10
current inc   : 10
hv status     : 0
hv master     : 1
open options  : dd
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_node   : 0
vbreq_state   : 0
state         : x0
resp          : 0x4329c7288
On Scan_q?    : N
Total accesses: 43
Imm.  accesses: 38
Granted_locks : 1
Cvting_locks  : 1

value_block: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 GRANTED_Q :
lp 0x4560db340 gl KJUSEREX rp 0x4329c7288 [0x2fa0019][0xc6eb2],[TX]   master 0 gl owner 0x45b3b2330 possible pid 13832 xid 2B000-0001-00000BB0 bast 0 rseq 3 mseq 0 history 0x14951495   open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0x4560db490 gl KJUSERNL rl KJUSEREX rp 0x4329c7288 [0x2fa0019][0xc6eb2],[TX]   master 0 gl owner 0x45b41d3b0 possible pid 13830 xid 2B000-0001-00000BC8 bast 0 rseq 3 mseq 0 history 0x1495149a   convert opt KJUSERGETVALUE
??.
  • Jed Walker

--

http://www.freelists.org/webpage/oracle-l Received on Tue Aug 16 2011 - 11:45:27 CDT

Original text of this message