Re: using ASH to track down library cache lock contention

From: Tim Gorman <tim_at_evdbt.com>
Date: Thu, 18 Sep 2014 14:51:24 -0600
Message-ID: <541B45CC.9020701_at_evdbt.com>



A mystery! Love it!

I recall that in astronomy, one finds "black holes" by the behavior of the things near to it.

In this case, the waits on "library cache lock" are the behavior of the things near to the black hole. So, the black hole itself is hopefully something not mundane, something exclusively modifying the library cache, which I understand could be just about anything, such as an everyday SQL statement, but perhaps there was something out of the ordinary being executed?

Since you've got it nailed down already to a 10 minute period, perhaps you could run something like...

    SQL> select instance_number, sql_opname, session_state, event, count(*)

       2    from dba_hist_active_sess_history
       3    where sample_time >= to_timestamp( '2014-09-15 14:19',
    'YYYY-MM-DD HH24:MI' )
       4      and sample_time <  to_timestamp( '2014-09-15 14:28',
    'YYYY-MM-DD HH24:MI' )
       5      and sql_opname not in ('SELECT','UPDATE','INSERT','DELETE')
       6     group by instance_number, sql_opname, session_state, event;


This should filter out the majority of "usual" traffic and perhaps reveal an unexpected ALTER SYSTEM FLUSH SHARED_POOL running from a cron script somewhere? Unlikely I agree, but perhaps something suspect may pop out?

For follow-up investigation, you could look at whatever of interest popped up and see if it started prior to the arrival of the flood of "library cache lock" events?

Another approach might be to search sessions by the parameter values of the event...

    SQL> select parameter1, parameter2, parameter3 from v$event_name     where name = 'library cache lock';

    PARAMETER1 PARAMETER2 PARAMETER3

  • -------------------- -------------------- handle address lock address 100*mode+namespace

Granted the parameters for the "library cache lock" event are addresses within the SGA, and not really useful in and of themselves, but perhaps these values can be used as IDs to identify the commonly-sought object, without caring what the value means. For example...

      SQL> select instance_number, p2, sql_opname, session_state, event,     count(*)

       2    from dba_hist_active_sess_history
       3    where sample_time >= to_timestamp( '2014-09-15 14:19',
    'YYYY-MM-DD HH24:MI' )
       4      and sample_time <  to_timestamp( '2014-09-15 14:28',
    'YYYY-MM-DD HH24:MI' )
       5      and p2 in (select distinct p2 from
    dba_hist_active_sess_history
       6          where event = 'library cache lock'
       7          and   sample_time >= to_timestamp( '2014-09-15 14:19',
    'YYYY-MM-DD HH24:MI' )
       8          and   sample_time <  to_timestamp( '2014-09-15 14:28',
    'YYYY-MM-DD HH24:MI' )
       9          )
      10      and event <> 'library cache lock'
      11     group by instance_number, p2, sql_opname, session_state, event;


...or something like that, for P1, P2 (shown), and P3?

Hope this helps...

Thanks!

-Tim

On 9/18/14, 13:27, Adric Norris wrote:
> I'm trying to track down a recent issue on an 11.2.0.3 Solaris SPARC64
> system, where the majority of app-related database sessions
> unexpectedly hung for about 10 minutes. The condition cleared just as
> we were getting onto the system for troubleshooting, unfortunately, so
> we weren't able to capture much detail while the problem was
> ongoing... I'm therefore trying to identify the cause primarily via
> ASH data.
>
> At this point, we're fairly confident that it was due to contention
> for a library cache lock.
>
> SQL> select instance_number, session_state, event, count(*)
> 2 from dba_hist_active_sess_history
> 3 where sample_time >= to_timestamp( '2014-09-15 14:19',
> 'YYYY-MM-DD HH24:MI' )
> 4 and sample_time < to_timestamp( '2014-09-15 14:28',
> 'YYYY-MM-DD HH24:MI' )
> 5 group by instance_number, session_state, event
> 6 having count(*) >= 40
> 7 order by 4 desc, 2, 3, 1;
>
> INSTANCE_NUMBER SESSION EVENT COUNT(*)
> --------------- ------- ------------------------------ ----------
> 1 WAITING library cache lock 6264
> 3 WAITING library cache lock 6216
> 2 WAITING library cache lock 6031
> 1 WAITING db file sequential read 50
> 2 WAITING db file scattered read 41
>
> Unfortunately, the blocking-session info isn't captured for this type
> of wait. ASH does grab the P1/P2/P3 details, but everything I'm
> finding so far indicates that you have to match this information
> against various X$ tables while the issue is actually occurring. So
> the question, of course, is whether or not it's possible to
> definitively identify the offending session using just the ASH data.
>
> Any suggestions on how to accomplish this?
>
> Thanx!
>
> --
> "In the beginning the Universe was created. This made a lot of people
> very angry and has been widely regarded as a bad move." -Douglas Adams

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Sep 18 2014 - 22:51:24 CEST

Original text of this message