Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Interesting Locking Problem

Interesting Locking Problem

From: <Jared.Still_at_radisys.com>
Date: Thu, 31 Jul 2003 15:59:25 -0800
Message-ID: <F001.005C80E7.20030731155925@fatcity.com>


List,

Below is the description of a locking problem I ran into today. Any help to figure out what was happening here would be appreciated. Maybe I was overlooking something obvious, but I hope not. :)

An MS Word version is available at:

http://www.cybcon.com/~jkstill/CIM_enqueue_wait_2003_07_31.doc

as it may be a little easier to read.


I ran into a rather interesting problem today, and I am hoping that someone can help me determine what was happening here.

The app owner informed me that there were some problems with our CIM app this morning.

As this app sometimes has issues with locking, I first checked the locks:

                      OS                        OS      OS         OS 
Database                  Oracle  Lock Mode            Mode
   SID WAITER BLOCKER Program                   Usernam Machine    Process 
Object                    Usernam Type Held            Requested
------ ------ ------- ------------------------- ------- ---------- ------- 
------------------------- ------- ---- --------------- ----------
    18                                          SYSTEM CIM_MV.MLOG$  
CIM_MV  TM   Row-S (SS)      None
    18                                          SYSTEM CIM_MV.    TX 
Exclusive       None

    42     73         dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.TAB$              CIMUSER TM   Row-X (SX)      None
    42     73         dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.GENEALOGY                 TM   Row-X (SX)      None
    42     73         dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.GENEALOGY_LEGACY          TM   Row-X (SX)      None
    42     73         dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.MLOG$_GENEALOGY_L         TM   Row-X (SX)      None
    42     73         dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.MLOG$_GENEALOGY3          TM   Row-X (SX)      None
    42     73         dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.                          TX   Exclusive       None
    73             42 dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.TAB$                      TM   Row-X (SX)      None
    73             42 dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.                          TX   Exclusive       None
    73             42 dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.                          TX   None            Share
    73             42 dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.MLOG$_GENEALOGY3          TM   Row-X (SX)      None
    73             42 dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.GENEALOGY                 TM   Row-X (SX)      None
    73             42 dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.MLOG$_GENEALOGY_L         TM   Row-X (SX)      None
    73             42 dllhost.exe               cimvisi DC\CIMPROD 2312:15 
CIMUSER.GENEALOGY_LEGACY          TM   Row-X (SX)      None

    80                sqlplus_at_rsysdevdb.radisys jkstill rsysdevdb. 29339 
JKSTILL.DBA_LOCKS_TAB     JKSTILL TM   Row-X (SX)      None
    80                sqlplus_at_rsysdevdb.radisys jkstill rsysdevdb. 29339 
JKSTILL.DBA_WAITERS_TAB           TM   Row-X (SX)      None
    80                sqlplus_at_rsysdevdb.radisys jkstill rsysdevdb. 29339 
JKSTILL.DBA_LOCKS_HELD_TA         TM   Row-X (SX)      None
    80                sqlplus_at_rsysdevdb.radisys jkstill rsysdevdb. 29339 
JKSTILL.                          TX   Exclusive       None


19 rows selected.

Then I ran a check on v$session_wait:

                        WAIT      IN
USERNAME   SADDR    EVENT                             SID           P1   
P2         P3    TIME    WAIT STATE
---------- -------- ------------------------------ ------ ------------ 
------------ ---------- ------- ------- --------------------
CIMUSER    4027CF78 enqueue                            73   1415053316  
983086      18466       0   10580 WAITING
           40264C10 library cache lock                 27   1026859436 
902088748         21       0     372 WAITING
           4026CA64                                    42   1026859436 
928969196         21       0    2292 WAITING
           4027B634                                    70   1026859436 
482864888         21       0     372 WAITING
           4027BEA0                                    71   1026859436 
902084848         21       0    1008 WAITING
           4027C70C                                    72   1026859436 
330358188         21       0     152 WAITING
           4025A3A0 SQL*Net message from client         7   1413697536   1 
         0       0      89 WAITING
           4025AC0C                                     8   1413697536   1 
         0       0       0 WAITING
           402772D4                                    62   1413697536   1 
         0       0    1365 WAITING
..
CIM_MV     40260044 library cache lock                 18   1027075040 
1083691876         31       0     372 WAITING

JKSTILL    4026FCEC SQL*Net message from client        48   1413697536   1 
         0       0      20 WAITING
           40261988 SQL*Net message to client          21   1413697536   1 
         0      -1       0 WAITED SHORT TIME


60 rows selected.

Interesting to find several sessions waiting on library cache locks.

I then used the following query from MetaLink to see which session was responsible for causing all the library cache locks.

SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION WHERE SADDR in
(SELECT KGLLKSES FROM X$KGLLK LOCK_A

WHERE KGLLKREQ = 0
AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B WHERE KGLLKSES = '4026CA64' /* BLOCKED SESSION */ AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
AND KGLLKREQ > 0)
);

.. where '4026CA64' is the v$session.saddr value for a blocked session.

Running this query for all sessions waiting on a library cache lock indicated
that they were waiting on session 73.

As seen in the lock output however, session 73 is waiting on an a lock held
by session 42: obviously a deadly embrace, but not the kind I'm accustomed to see.

A system state dump was done via

    ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL 8'; and something interesting appeared in the trace file.

There was an 'update tab$' statement by session 62, also the CIMUSER account.

Apparently an 'alter table' was being done, but I could not determine which
table was being altered. I don't believe that this is something an app should
be doing in the normal course of events.

As you can see, session 62 was waiting on 'SQL*Net message from client'.

Here is the 'update tab$' statement as seen in the state dump:

update tab$ set ts#=:2,file#=:3,block#=:4,bobj#=decode(:5,0,null,:5), tab#=decode(:6,0,null,:6),intcols=:7,kernelcols=:8,clucols=decode(:9,0,null,:9), audit$=:10,flags=:11,pctfree$=:12,pctused$=:13,initrans=:14,maxtrans=:15,rowcnt=:16, blkcnt=:17,empcnt=:18,avgspc=:19,chncnt=:20,avgrln=:21,analyzetime=:22, samplesize=:23,cols=:24,property=:25,degree=decode(:26,1,null,:26), instances=decode(:27,1,null,:27),dataobj#=:28,avgspc_flb=:29,flbcnt=:30, trigflag=:31,spare1=:32,spare2=decode(:33,0,null,:33),spare4=:34,spar

It was apparently truncated in the trace file.

Why session 62 would be doing an 'alter table' is a bit of a mystery.

It was definitely not a user session, as can be seen here. The FlexNetKernel.exe is invoked as a service.

     CLIENT                         SRVR
USERNAME    SID SERIAL#  PID STATUS     MACHINE    OSUSER  CLIENT PROGRAM  
    PID       SERVER PROGRAM       PID   LOGON TIME        IDLE TIME
---------- ---- ------- ---- ---------- ---------- ------- 
-------------------- --------- -------------------- ----- 
----------------- -----------
CIMUSER      62   23972   46 INACTIVE   DC\CIMPROD cimvisi 
FlexNetKernel.exe    1952:2344 ORACLE.EXE           1696  07/31/03 
09:59:17 00:00:00:03
                                                   on

             63   29809   29 ACTIVE     DC\CIMPROD cimvisi 
FlexNetKernel.exe    1952:748  ORACLE.EXE           1652  07/31/03 
09:59:17 00:00:06:05
                                                   on


After a time, it became necessary to bounce the system, as I couldn't determine what was actually causing the problem, nor how to correct it. We started by stopping the application.

After doing so, there were a few remaining sessions from the application, these should have gone away when the application was stopped.
( SID 62 was not one of them, it was gone )

After bouncing the database and restarting the app, everything was fine.

Any ideas on what may have been happening here are certainly welcome.

Speculations gladly considered. :)

Jared

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: 
  INET: Jared.Still_at_radisys.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L

(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).
Received on Thu Jul 31 2003 - 18:59:25 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US