RE: Session With oper EXCL is also waiting - where to now? (systemstate dump)

From: <Christopher.Taylor2_at_parallon.net>
Date: Wed, 19 Sep 2012 15:34:50 -0500
Message-ID: <F05D8DF1FB25F44085DB74CB916678E88515AA373D_at_NADCWPMSGCMS10.hca.corpad.net>



Ah hah (see what it's waiting on...also any additional guidance from here?) Here's the HOLDER

PROCESS 152:



  SO: 0xa7aba8578, type: 2, owner: (nil), flag: INIT/-/-/0x00   (process) Oracle pid2, calls cur/top: 0x6f9156810/0x6f8710648, flag: (0) -

            int error: 0, call error: 0, sess error: 0, txn error 0   (post info) last post received: 0 0 163

              last post received-location: kqrbtm
              last process to post me: a7ab873c0 1 6
              last post sent: 0 0 24
              last post sent-location: ksasnd
              last process posted by me: a7ab873c0 1 6
    (latch info) wait_event=0 bits

--> holding (efd) a5b747220 Child shared pool simulator level=8 child# <---
Location from where latch is held: kglsim_scan_lru: scan: Context saved from call: 0 state=free, wlstate=free holding (efd) 600fab30 Child shared pool level=7 child#=1 Location from where latch is held: kghfrunp: alloc: wait: Context saved from call: 0 state=busy, wlstate=free holding (efd) a5b6bd2a0 Child row cache objects level=4 child# Location from where latch is held: kghfrunp: clatch: wait:

Here's the WAITS for the HOLDER
  Dumping Session Wait History

     for 'latch: shared pool' count=1 wait_timeW
                address`0fab30, number, tries=0
     for 'latch: shared pool' count=1 wait_time&
                address`0fab30, number, tries=0
     for 'latch: row cache objects' count=1 wait_time7
                addressb6bd2a0, number, tries=1
     for 'latch: row cache objects' count=1 wait_timeY5844
                addressb6bd2a0, number, tries=0
     for 'latch: shared pool' count=1 wait_time
                address`0fab30, number, tries=0
     for 'latch: shared pool' count=1 wait_timee
                address`0fab30, number, tries=0
     for 'latch: row cache objects' count=1 wait_time16
                addressb6bd2a0, number, tries=1
     for 'latch: row cache objects' count=1 wait_timeE7111
                addressb6bd2a0, number, tries=0
     for 'latch: row cache objects' count=1 wait_timeq
                addressb6bd2a0, number, tries=1
     for 'latch: row cache objects' count=1 wait_timeR7243
                addressb6bd2a0, number, tries=0
    temporary object counter: 0

From: tanel_at_poderc.com [mailto:tanel_at_poderc.com] On Behalf Of Tanel Poder Sent: Wednesday, September 19, 2012 3:26 PM To: Taylor Christopher - Nashville
Cc: oracle-l_at_freelists.org
Subject: Re: Session With oper EXCL is also waiting - where to now? (systemstate dump)

Search for the address a5b6bd2a0 (case insensitive) in the systemstate dump file. If some process still happened to be holding that latch while the dump read its state object, it should dump out which one it was (and which session state object was under it).

But allowing shared pool to automatically shrink is a bad idea in super-critical latency-sensitive (OLTP) systems ... better "waste" some extra memory for shared pool instead of having to live with CPU usage and share pool latch/library cache mutex contention spikes...

Of course in your case the symptom was even more extreme (a hang), which may be due to a bug ...
--

Tanel Poder
Blog - http://blog.tanelpoder.com
App - http://voic.ee

On Wed, Sep 19, 2012 at 11:01 PM, <Christopher.Taylor2_at_parallon.net<mailto:Christopher.Taylor2_at_parallon.net>> wrote: (apologies for bad formatting - I'm trying to keep it simple tho) (pastebin is blocked, but used gist https://gist.github.com/3751878) 3-node RAC 10.2.0.4

We had a "hang" last night from 2:50 to ~3:48 and Oracle was kind enough to provide me with an automated system state dump. (the bottom of this email is where I'm confused - where do I go from here?)

Looking at the ASH and AWR reports for the 2 hour window:

I see:
1.) An SGA Resize occurred:
Buffer Cache 30,608M increases to 30,656M Shared Pool Size 5,136M *decreases* to 5,088M

--

http://www.freelists.org/webpage/oracle-l Received on Wed Sep 19 2012 - 15:34:50 CDT

Original text of this message