Interesting problem

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Wed, 25 May 2011 13:04:59 +0000 (UTC)
Message-ID: <pan.2011.05.25.13.04.58_at_gmail.com>



Yesterday, on one of my databases, DBMS_SCHEDULER job started burning CPU like crazy. I contacted the developer and he told me that the culprit job was supposed to regularly purge one fairly large table. The job was scheduled ad 03:00 AM and should have finished in half an hour, yet it was still running like crazy at 08:45 AM. Not only was it burning CPU, the most prevalent wait event was the wait for cache buffer chain latch, which is the real surprise here. Here is the source code:

CREATE OR REPLACE PROCEDURE REPORT_CLEANUP_PURGE   (
    p_rowlimit NUMBER DEFAULT 1000)
AS
TYPE report_tab
IS
  TABLE OF reports.report#%TYPE;
  rep# report_tab;
  CURSOR del_cur
  IS

     SELECT report#
       FROM reports
      WHERE status='RECENT'

    AND created < TRUNC(sysdate);
BEGIN
  LOOP
    OPEN del_cur ;
    LOOP
      FETCH del_cur BULK COLLECT INTO rep# LIMIT p_rowlimit;
      EXIT
    WHEN rep#.count=0;
      FORALL i    IN rep#.FIRST..rep#.LAST
       DELETE FROM reports WHERE report# = rep#(i);
     
      FORALL i IN rep#.FIRST..rep#.LAST
       DELETE FROM report_agency_Sessions WHERE report# = rep#(i);
      COMMIT;

    END LOOP;
    CLOSE del_cur;
  END LOOP;
END;
/

The problem is really simple. The "exit" statement terminates the innermost loop. The outer loop wasn't terminated at all. When the job was done, there was nothing more to mop up. The procedure would open the cursor, do bulk collect, figure out that the count is zero, exit loop, close cursor, re-open it and do the whole thing again. And again. And again.
That was easy to fix. After reloading my Glock and exchanging some kind words with the QA people, I am still confused about the buffer cache chains. Why was a simple infinite loop causing "latch: buffer chains" wait events? It was using much, much more CPU than I expected it to and affecting the system in a really adverse way. When I come to think of it, I would expect it to use library cache latch, because of the opening and closing the cursor literally thousands of times per second, but not buffer chains latch. Does anyone have an explanation?

-- 
http://mgogala.byethost5.com
Received on Wed May 25 2011 - 08:04:59 CDT

Original text of this message