Re: The Oracle WTF of the day.....sigh....

From: Thomas Roach <troach_at_gmail.com>
Date: Mon, 21 Nov 2016 22:37:05 -0500
Message-Id: <517E9262-81F4-43F2-ADC0-C3D2E7DFEE2A_at_gmail.com>


At least you found it before it went into production.

Sent from my iPhone

> On Nov 21, 2016, at 6:55 PM, Mark J. Bobak <mark_at_bobak.net> wrote:
>
> So, I'm approached by a developer, earlier today. He needs help tuning his PL/SQL Packaged procedure. It's been running for hours and hours, with no end in sight.
>
> Ok, I'll have a look. First thing, a few quick snapper.sql snapshots, see what's going on. (This is a small developmeent DB, and trhis process is literally the only thing running.) snapper.sql shows the process at 100% ON CPU. Ok....I'm thinking bad/poorly tuned SQL, perhaps an indexed access path that's not very selective and causing lots o buffer gets. So, grab SQL_ID from snapper.sql output, and do dbms_xplan.display_cursor. Let's have a look at the SQL an the execution plan. I get the top-level PL/SQL block (the anonymous block that's calling the stored procedure), and of course, no plan. That's weird, maybe it was some type of weird timing thing:? Surely if I run snapper again, I'll get a SQL_ID of one (or more) of the problem SQLs, that I can sink my teeth into and get to SQL tuning.....No, again, it returns only the SQL_ID of the top-level PL/SQL block. No matter how much I run, it refuses to return any other SQL_ID.
>
> I'm a bit stumped. The PL/SQL doesn't have any intricate logic, it's a just a simple procedural framework to call the key SQL statements. Why would it be spending all it's time in PL/SQL, with no apparent calls to SQL?
>
> I try tracing w/ DBMS_MONITOR, but because it spins indefinitely, I kill it after some time, and the trace file is a bit messy. Analyzing w/ OraSRP, I get 99.6% unaccounted-for time. That's fairly useless.
>
> Oh, BTW, this is 11.2.0.3 SE1 on Linux x86-64.
>
> Since it's SE1, I go old school. I install StatsPack, take a quick 5 minute snapshot. This shows there's *nothing* going on. In spite of that, I see the session is spinning on CPU at 100%.
>
> Now I'm annoyed. I've been looking at this, on and off, all day, while juggling several other things. Finally, everything else is done, and I need to concentrate on this.
>
> Ok, grab the developer, we're going to go through the code line by line, and execute it in a SQL*Plus session, to see what's going on.
>
> The overall process is pretty simple, and straightforward. There's a table that has data that needs to be processed. Define a cursor on that table, with the appropriate predicates, and that cursor will drive a PL/SQL loop that does various updates and deletes and inserts, etc. Simple, right?
>
> So, let's execute the query that makes up that driving cursor. Strip out the INTO, hardcode in the appropriate values, I just want to see how many rows ae returned...get a rough idea of how much data we have to process. Query returns immediately, "no rows selected". WTF? No rows selected? I double check my predicates, run it again. "no rows selected". I tell developer to take a closer look, make sure I got the predicates right. He looks, says it looks good. Run again, 'no rows selected". W-T-F??
>
> Long story short, the developer forgot to run a previous process, that populates the table, and as a result, the driving query is returning no rows. To compound the matter, the PL/SQL has a bug, between the way the LOOPs are defined, and exit conditions for the loop, that if valid data is in the table, everything runs great. But, if there are no rows returned, the PL/SQL spins on CPU....forever.....ARGH!
>
> I'm going home now.....WTF???
>
> -Mark
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 22 2016 - 04:37:05 CET

Original text of this message