Re: PL/SQL: cursors and loops, hang issue

From: DA Morgan <damorgan_at_psoug.org>
Date: Tue, 21 Oct 2008 15:35:34 -0700
Message-ID: <1224628529.55930@bubbleator.drizzle.com>


Aya the Vampire Slayer wrote:
> I apologize in advance for being excessively wordy. :/
>
> DB: Oracle Database 10.2.0.1.0, Enterprise Edition
> OS: MS Windows Server 2003, Enterprise Edition (5.2.3790), SP1
> App: Visual Studio 2003 (7.1.6030) C#, ODP.Net 10.2.0.1.0
>
> I had been tasked with fixing a pretty severe bug a few weeks ago in a
> couple of stored procedures in our application schema. The bug is fixed
> now and the procs functional, but in the process of implementing the fix
> I ran into an unexpected and (what I consider) strange speed/hang issue.
> I essentially want to know if anyone has some insight into why the proc
> hung based on their knowledge of how oracle handles cursors+loops+etc
> under the hood. Details below.
>
> This particular database schema design has the concept of an anchor or
> parent table that is the container for multiple "versions" of a dataset.
> The 2 procedures in question take in an old version id and a new version
> id and each copy a different subset of that given version's data into
> the new "version" (e.g., selecting from and inserting into the same
> tables). All stored procs and tables etc are in the same SCHEMA. Also,
> this is a development database, and I'm the only one logged into it
> while this dev/testing is going on.
>
> Headers (NOTE, names have been changed throughout):
> create or replace procedure PR_Copy_Groups (P_To_Version_ID in RAW,
> P_From_Version_ID in RAW)
>
> create or replace procedure PR_Copy_Other_Groups (P_To_Version_ID in RAW,
> P_From_Version_ID in RAW)
>
> The "PR_Copy_Groups" copies one table's data, and the
> "PR_Copy_Other_Groups" copies 3 tables' deep worth of data, but the
> top-level table (first table) is defined exactly the same as the one
> table that "PR_Copy_Groups" handles. The bug was in the initial select
> statement only, for both procedures (the initial select is exactly the
> same in both procs, only the "from TABLE" is different between the two).
>
> The old version of the procedures used "Open Cursor --> loop and insert
> row by row" to do the copying. Because "Other_Groups" is copying 3 tables
> deep worth of data, that does in fact mean it had 3 nested loops (yuck).
> Initially, I just changed the first select statement because that's all
> the bug was -- it was selecting the wrong data and then later inserting
> it -- and left the loop(s) alone (for now).
>
> The select statement change for "PR_Copy_Groups" worked just fine in
> testing. After applying the fix to the "PR_Copy_Other_Groups" procedure
> and successfully compiling it, in testing the copying of "Other_Groups"
> started hanging and simply never finished, when it used to run/finish
> just fine before, even with the 3 loops and the old (incorrect) select
> statement. (I let it run for 30min and it never did finish)
>
> So I tested the copying from the application by doing the following in
> the "PR_Copy_Other_Groups" proc:
> I commented out the entire body of the proc -- it finishes
> I commented out the two inner loops -- it finishes
> I commented out only the innermost loop -- it finishes
> I uncommented all three loops with the new SELECT -- it does NOT finish
> I go back to the old bad SELECT with all 3 loops -- it finishes
>
> I am baffled about how that initial select can change a stored procedure
> that does easily finish running into one that does not at all, when the
> initial select is only run once, isn't really slow, and does in fact
> work properly (as is shown by PR_Copy_Groups and the commenting out
> tests). The dataset is also not very big (see below the pseudocode).
>
> Here's an overview of how the proc initially was written, the fix that
> caused the problem, and the final solution, using pseudo-PL/SQL (note
> the PSEUDO here, to keep it from becoming unbearably long).
>
> ------------------ ( OLD ) -------------------
> PROCEDURE SCHEMA.PR_Copy_Other_Groups ( ... )
> BEGIN
> open Cur1 for
> select *
> from SCHEMA.TB_OTHER_GROUP
> where VERSION_ID = P_From_Version_ID
> ;
>
> loop until Cur1 done
> -- IN param, out param
> PR_Copy_CLOB_Tbl_Entry( Old_CLOB_ID, New_CLOB_ID );
>
> INSERT INTO SCHEMA.TB_OTHER_GROUP ( ..., MY_CLOB_ID )
> VALUES ( ..., from Cur1, New_CLOB_ID )
>
> Open Cur2 for SELECT (...) from SCHEMA.TB_INNER1 [...];
> loop until Cur2 done
> INSERT INTO SCHEMA.TB_INNER1 [...] VALUES (..., from Cur2);
>
> Open Cur3 for SELECT (...) from SCHEMA.TB_INNER2 [...];
> loop until Cur3 done
> INSERT INTO SCHEMA.TB_INNER2 [...] VALUES (..., from Cur3);
> end loop;
> end loop;
> end loop;
> END;
> ------------------------------------------------
> --------------- ( FIX w/hang ) ---------------
> PROCEDURE SCHEMA.PR_Copy_Other_Groups ( ... )
> BEGIN
> open Cur1 for
> select (some stuff),
> e1.ID ENTITY1_ID, e2.ID ENTITY2_ID, e3.ID ENTITY3_ID, e4.ID ENTITY4_ID
> from SCHEMA.TB_OTHER_GROUP g
> LEFT OUTER JOIN SCHEMA.TB_ENTITY oe1 on oe1.ID = g.ENTITY1_ID
> LEFT OUTER JOIN SCHEMA.TB_ENTITY e1 on e1.NAME = oe1.NAME
> and e1.VERSION_ID = P_To_Version_ID
> LEFT OUTER JOIN SCHEMA.TB_ENTITY oe2 on oe2.ID = g.ENTITY2_ID
> LEFT OUTER JOIN SCHEMA.TB_ENTITY e2 on e2.NAME = oe2.NAME
> and e2.VERSION_ID = P_To_Version_ID
> LEFT OUTER JOIN SCHEMA.TB_ENTITY oe3 on oe3.ID = g.ENTITY3_ID
> LEFT OUTER JOIN SCHEMA.TB_ENTITY e3 on e3.NAME = oe3.NAME
> and e3.VERSION_ID = P_To_Version_ID
> LEFT OUTER JOIN SCHEMA.TB_ENTITY oe4 on oe4.ID = g.ENTITY4_ID
> LEFT OUTER JOIN SCHEMA.TB_ENTITY e4 on e4.NAME = oe4.NAME
> and e4.VERSION_ID = P_To_Version_ID
> where g.VERSION_ID = P_From_Version_ID
> ;
>
> loop until Cur1 done
> [ same as above, nothing changed here ]
> end loop;
>
> END;
> ------------------------------------------------
> --------------- ( FINAL FIX ) ----------------
> PROCEDURE SCHEMA.PR_Copy_Other_Groups ( ... )
> BEGIN
> INSERT INTO SCHEMA.TB_OTHER_GROUP ( ... )
> SELECT ( ... ) from [ that big new select above ]
> ;
>
> INSERT INTO SCHEMA.TB_INNER1 ( ... )
> SELECT ( ... ) from SCHEMA.TB_INNER1
> where VERSION_ID = P_From_Version_ID
> ;
>
> INSERT INTO SCHEMA.TB_INNER2 ( ... )
> SELECT ( ... ) from SCHEMA.TB_INNER2
> whre VERSION_ID = P_From_Version_ID
> ;
> END;
> ------------------------------------------------
> Row counts for the above tables:
> TB_GROUP: 2500
> TB_OTHER_GROUP: 1500
> TB_ENTITY: 200
> TB_INNER1: 2000
> TB_INNER2: 8000
> ------------------------------------------------
>
> I can provide some actual code snippets with changed names where needed
> but all the code (tables + stored procs) for this whole thing is really
> a ton of code and and it would take hours to desensitize it all if I
> were to post everything relevant (the old Copy Other Groups proc is 200
> lines by itself). Just ask if you'd like to see some particular snippet
> or xplans or init params or whatever.
>
> Anyway, like I said earlier, I am hoping, for my own curiosity, someone
> here can provide some under-the-hood insight into how oracle processes
> cursors+loops (or whatever else) that would cause a previously-working
> procedure to hang just by changing an exterior (to the loops) SELECT
> statement that is only run once per procedure invocation.
>
> Thanks for reading, if you managed to make it this far. :)

Sybrand is correct: Cursor loops are hopelessly obsolete and inefficient. For demos using array processing go to Morgan's Library at www.psoug.org and look up "Array Processing."

-- 
Daniel A. Morgan
Oracle Ace Director & Instructor
University of Washington
damorgan_at_x.washington.edu (replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org
Received on Tue Oct 21 2008 - 17:35:34 CDT

Original text of this message