RE: Enforcing SQL execution plan

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Fri, 2 Sep 2016 20:45:33 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D9015033AC72_at_exmbx05.thus.corp>



You could enable optimizer tracing for the instance for this statement. You'll have to test it, but in theory you can do the following:

alter system set events 'trace [rdbms.SQL_Optimizer.*][SQL:ck5smk1nw5u0g]';

(Obviously you have to change the SQL id to match the one for your statement).

Don't try tracing the SQL_Compiler - it seems to crash the instance.

This command should dump the 10053 trace every time the statement is hard parsed on production and may let you see why the optimizer is able to pick the wrong concatenation and index. Alternatively just do the sql_trace on every execution so that you can be sure you got the actual binds when it followed the unlucky subquery:

alter system set events

        '
        sql_trace[SQL:5s4pt15sk27x7]
        plan_stat=all_executions,
        wait=false,
        bind=true
        '

;

Again - do some testing to make sure that your version of Oracle is happy with this before firing it at production - in theory it should dump the trace for just that SQL_id whoever executes it.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: Hameed, Amir [Amir.Hameed_at_xerox.com] Sent: 02 September 2016 20:10
To: Jonathan Lewis; oracle-l_at_freelists.org Subject: RE: Enforcing SQL execution plan

I tried to set the bind for column WDD.DATE_SCHEDULED with out-of-bound values and it from SQL*Plus but the result came out to be the same, that is, good plan. I also put some spaces in the SQL to force it to reparse:



| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time |


| 0 | SELECT STATEMENT | | | | 12 (100)| |
| 1 | CONCATENATION | | | | | |
|*  2 |   FILTER                       |                             |        |       |            |          |

| 3 | NESTED LOOPS | | 1 | 186 | 6 (0)| 00:00:01 |
|* 4 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS | 1 | 173 | 3 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | WSH_DELIVERY_DETAILS_U1 | 1 | | 2 (0)| 00:00:01 | |* 6 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_ASSIGNMENTS | 1 | 13 | 3 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | WSH_DELIVERY_ASSIGNMENTS_N3 | 1 | | 2 (0)| 00:00:01 | |* 8 | FILTER | | | | | |
| 9 | NESTED LOOPS | | 1 | 186 | 6 (0)| 00:00:01 |
|* 10 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS | 1 | 173 | 3 (0)| 00:00:01 | |* 11 | INDEX UNIQUE SCAN | WSH_DELIVERY_DETAILS_U1 | 1 | | 2 (0)| 00:00:01 | |* 12 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_ASSIGNMENTS | 1 | 13 | 3 (0)| 00:00:01 | |* 13 | INDEX RANGE SCAN | WSH_DELIVERY_ASSIGNMENTS_N3 | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Jonathan Lewis Sent: Friday, September 02, 2016 2:29 PM To: oracle-l_at_freelists.org
Subject: RE: Enforcing SQL execution plan

There shouldn't really be any need to ask that question since the tkprof summary of the trace file from a bad run is a few lines further down the email.

The problem appears to be an awful choice of index for the second half of a concatenation that is (presumably) only triggered occasionally, and perhaps requires an out of range (above high value) bind variable to cause it to be considered a good choice.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org> [oracle-l-bounce_at_freelists.org] on behalf of fmhabash_at_gmail.com<mailto:fmhabash_at_gmail.com> [fmhabash_at_gmail.com] Sent: 02 September 2016 18:49
To: Hameed, Amir; oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org> Subject: RE: Enforcing SQL execution plan So, how do the 2 runs compare when broken down by wait event class? What class is contributing the most to the bad run?

select st.SNAP_ID, st.sql_id, st.plan_hash_value, ss.begin_interval_time ,EXECUTIONS_DELTA execs

, IOWAIT_DELTA , CLWAIT_TOTAL , CCWAIT_DELTA , APWAIT_DELTA , CPU_TIME_DELTA , ELAPSED_TIME_DELTA , round((IOWAIT_DELTA/ELAPSED_TIME_DELTA)*100,2) io_wt_perc

, round((CLWAIT_TOTAL/ELAPSED_TIME_DELTA)*100,2) clust_wt_perc

, round((CCWAIT_DELTA/ELAPSED_TIME_DELTA)*100,2) concur_wt_perc

, round((APWAIT_DELTA/ELAPSED_TIME_DELTA)*100,2) app_wt_perc

, round((CPU_TIME_DELTA/ELAPSED_TIME_DELTA)*100) cpu_wt_perc

from dba_hist_sqlstat st, dba_hist_snapshot ss

where

ss.snap_id = st.snap_id

and st.instance_number = (select instance_number from v$instance )

and EXECUTIONS_DELTA > 0

--and sql_id = '&sql_id'

and rownum < 1000

and begin_interval_time >= sysdate - 7

order by snap_id

;



Thank you

From: Hameed, Amir<mailto:Amir.Hameed_at_xerox.com> Sent: Friday, September 2, 2016 1:17 PM
To: fmhabash_at_gmail.com<mailto:fmhabash_at_gmail.com>; oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org> Subject: RE: Enforcing SQL execution plan

Thanks
The problem is that the plan is the same for the good run and the bad run, which might mean that because of bind-peeking, the skewing of data could potentially be causing the issue.

From: fmhabash_at_gmail.com<mailto:fmhabash_at_gmail.com> [mailto:fmhabash_at_gmail.com] Sent: Friday, September 02, 2016 1:12 PM To: Hameed, Amir <Amir.Hameed_at_xerox.com<mailto:Amir.Hameed_at_xerox.com>>; oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org> Subject: RE: Enforcing SQL execution plan

Hameed …

From your description, it looks like this job did run acceptably in the past. If this is the case, why not grab one of the known good xplans and use it to create a baseline.

We stabilized most our EBS critical tasks using plan baselines. However, for some SQL, the baselines are becoming UNREPRODUCIBLE. Seeking MOS support, we are advised to apply 21463894.

Hope this helps as well.



Thank you

From: Hameed, Amir<mailto:Amir.Hameed_at_xerox.com> Sent: Thursday, September 1, 2016 9:13 PM To: oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org> Subject: Enforcing SQL execution plan

Hi,
We run a certain job in the Oracle E-Business Suite (EBS) through a concurrent program. The environment information is listed below: DB version – 11.2.0.4
EBS version – 11.5.10.2
OS – Solaris

The statement uses bind variables that are supplied as part of running the concurrent program. As far as I can tell, the statement always produces the following execution plan:



| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |


| 0 | SELECT STATEMENT | | 17 | | | 12 (100)| | 17 |00:00:00.01 | 145 |
| 1 | CONCATENATION | | 17 | | | | | 17 |00:00:00.01 | 145 |
|*  2 |   FILTER                        |                             |     17 |        |       |            |          |     17 |00:00:00.01 |     145 |

| 3 | NESTED LOOPS | | 17 | 1 | 197 | 6 (0)| 00:00:01 | 17 |00:00:00.01 | 145 |
|* 4 | TABLE ACCESS BY INDEX ROWID | WSH_DELIVERY_DETAILS | 17 | 1 | 184 | 3 (0)| 00:00:01 | 17 |00:00:00.01 | 73 | |* 5 | INDEX UNIQUE SCAN | WSH_DELIVERY_DETAILS_U1 | 17 | 1 | | 2 (0)| 00:00:01 | 17 |00:00:00.01 | 51 | |* 6 | TABLE ACCESS BY INDEX ROWID | WSH_DELIVERY_ASSIGNMENTS | 17 | 1 | 13 | 3 (0)| 00:00:01 | 17 |00:00:00.01 | 72 | |* 7 | INDEX RANGE SCAN | WSH_DELIVERY_ASSIGNMENTS_N3 | 17 | 1 | | 2 (0)| 00:00:01 | 17 |00:00:00.01 | 53 | |* 8 | FILTER | | 0 | | | | | 0 |00:00:00.01 | 0 |
| 9 | NESTED LOOPS | | 0 | 1 | 197 | 6 (0)| 00:00:01 | 0 |00:00:00.01 | 0 |
| 10 | NESTED LOOPS | | 0 | 1 | 197 | 6 (0)| 00:00:01 | 0 |00:00:00.01 | 0 |
|* 11 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS | 0 | 1 | 184 | 3 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 12 | INDEX RANGE SCAN | WSH_DELIVERY_DETAILS_N1 | 0 | 1 | | 3 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 13 | INDEX RANGE SCAN | WSH_DELIVERY_ASSIGNMENTS_N3 | 0 | 1 | | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 14 | TABLE ACCESS BY INDEX ROWID | WSH_DELIVERY_ASSIGNMENTS | 0 | 1 | 13 | 3 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | --------------------------------------------------------------------------------------------------------------------------------------------------------- For the past few weeks, the concurrent program has been intermittently taking time to complete. When traced, one of the “bad” runs produced the following information: call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 614.23 11147.25 941317 18000438 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 3 614.23 11147.25 941317 18000438 0 1 Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ db file sequential read 941317 1.00 3563.29 Disk file operations I/O 74 0.00 0.00 gc current grant busy 3 0.00 0.00 gc cr grant 2-way 44 0.08 0.09 read by other session 1706610 1.49 6482.34 gc buffer busy acquire 83710 1.00 431.39 latch: cache buffers chains 173199 0.01 9.25

(only the main wait events are shown above)

Rows (1st) Rows (avg) Rows (max) Row Source Operation

---------- ---------- ----------  ---------------------------------------------------
         1          1          1  CONCATENATION  (cr=18000438 pr=941317 pw=0 time=2557323227 us)
         0          0          0   FILTER  (cr=0 pr=0 pw=0 time=2 us)
         0          0          0    NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us cost=6 size=197 card=1)
         0          0          0     TABLE ACCESS BY INDEX ROWID WSH_DELIVERY_DETAILS (cr=0 pr=0 pw=0 time=0 us cost=3 size=184 card=1)
         0          0          0      INDEX UNIQUE SCAN WSH_DELIVERY_DETAILS_U1 (cr=0 pr=0 pw=0 time=0 us cost=2 size=0 card=1)(object id 165859)
         0          0          0     TABLE ACCESS BY INDEX ROWID WSH_DELIVERY_ASSIGNMENTS (cr=0 pr=0 pw=0 time=0 us cost=3 size=13 card=1)
         0          0          0      INDEX RANGE SCAN WSH_DELIVERY_ASSIGNMENTS_N3 (cr=0 pr=0 pw=0 time=0 us cost=2 size=0 card=1)(object id 165871)
         1          1          1   FILTER  (cr=18000438 pr=941317 pw=0 time=2557323217 us)
         1          1          1    NESTED LOOPS  (cr=18000438 pr=941317 pw=0 time=2557323212 us cost=6 size=197 card=1)
         1          1          1     NESTED LOOPS  (cr=18000361 pr=941296 pw=0 time=2557119556 us cost=6 size=197 card=1)
         1          1          1      TABLE ACCESS BY INDEX ROWID WSH_DELIVERY_DETAILS (cr=18000336 pr=941295 pw=0 time=2557112429 us cost=3 size=184 card=1)
  42599164   42599164   42599164       INDEX RANGE SCAN WSH_DELIVERY_DETAILS_N1 (cr=1798095 pr=99233 pw=0 time=600382285 us cost=3 size=0 card=1)(object id 917318)
         1          1          1      INDEX RANGE SCAN WSH_DELIVERY_ASSIGNMENTS_N3 (cr=25 pr=1 pw=0 time=7116 us cost=2 size=0 card=1)(object id 165871)
         1          1          1     TABLE ACCESS BY INDEX ROWID WSH_DELIVERY_ASSIGNMENTS (cr=77 pr=21 pw=0 time=203643 us cost=3 size=13 card=1)


It seems that when it runs longer, it scans a lot of rows from index WSH_DELIVERY_DETAILS_N1. This index is defined on a date column and is not very selective.

As part of the troubleshooting exercise, I captured bind variables of the bad run from DBA_HIST_SQLBIND, defined them as variables with appropriate data type and then ran them through SQL*Plus. The problem that I am facing is that from SQL*Plus, by using this technique, it always produces the following plan, regardless of whether I use values from the good run or the bad run:



| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |


| 0 | SELECT STATEMENT | | 1 | | | 12 (100)| | 0 |00:00:00.01 | 4 |
| 1 | CONCATENATION | | 1 | | | | | 0 |00:00:00.01 | 4 |
|*  2 |   FILTER                       |                             |      1 |        |       |            |          |      0 |00:00:00.01 |       0 |

| 3 | NESTED LOOPS | | 0 | 1 | 197 | 6 (0)| 00:00:01 | 0 |00:00:00.01 | 0 |
|* 4 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS | 0 | 1 | 184 | 3 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 5 | INDEX UNIQUE SCAN | WSH_DELIVERY_DETAILS_U1 | 0 | 1 | | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 6 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_ASSIGNMENTS | 0 | 1 | 13 | 3 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 7 | INDEX RANGE SCAN | WSH_DELIVERY_ASSIGNMENTS_N3 | 0 | 1 | | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 8 | FILTER | | 1 | | | | | 0 |00:00:00.01 | 4 |
| 9 | NESTED LOOPS | | 1 | 1 | 197 | 6 (0)| 00:00:01 | 0 |00:00:00.01 | 4 |
|* 10 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS | 1 | 1 | 184 | 3 (0)| 00:00:01 | 0 |00:00:00.01 | 4 | |* 11 | INDEX UNIQUE SCAN | WSH_DELIVERY_DETAILS_U1 | 1 | 1 | | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 3 | |* 12 | TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_ASSIGNMENTS | 0 | 1 | 13 | 3 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | |* 13 | INDEX RANGE SCAN | WSH_DELIVERY_ASSIGNMENTS_N3 | 0 | 1 | | 2 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | --------------------------------------------------------------------------------------------------------------------------------------------------------

I have tried to capture hints through outline from the bad run and added them as hints to the statement that I am running from SQL*Plus but I am still not able to enforce the “bad” plan. The outline hints are shown below:

  /*+

      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
      DB_VERSION('11.2.0.4')
      OPT_PARAM('_b_tree_bitmap_plans' 'false')
      OPT_PARAM('_fast_full_scan_enabled' 'false')
      OPT_PARAM('_index_join_enabled' 'false')
      ALL_ROWS
      OUTLINE_LEAF(_at_"SEL$1")
      OUTLINE_LEAF(_at_"SEL$1_1")
      USE_CONCAT(_at_"SEL$1" 8 OR_PREDICATES(6) PREDICATE_REORDERS((23 22) (24 23) (26 24) (22 26)))
      OUTLINE_LEAF(_at_"SEL$1_2")
      OUTLINE(_at_"SEL$1")
      INDEX_RS_ASC(_at_"SEL$1_1" "WDD"_at_"SEL$1" ("WSH_DELIVERY_DETAILS"."DELIVERY_DETAIL_ID"))
      INDEX_RS_ASC(_at_"SEL$1_1" "WDA"_at_"SEL$1" ("WSH_DELIVERY_ASSIGNMENTS"."DELIVERY_DETAIL_ID"))
      INDEX_RS_ASC(_at_"SEL$1_2" "WDD"_at_"SEL$1_2" ("WSH_DELIVERY_DETAILS"."DATE_SCHEDULED"))
      INDEX(_at_"SEL$1_2" "WDA"_at_"SEL$1_2" ("WSH_DELIVERY_ASSIGNMENTS"."DELIVERY_DETAIL_ID"))
      LEADING(_at_"SEL$1_1" "WDD"_at_"SEL$1" "WDA"@"SEL$1")
      LEADING(_at_"SEL$1_2" "WDD"_at_"SEL$1_2" "WDA"@"SEL$1_2")
      USE_NL(_at_"SEL$1_1" "WDA"_at_"SEL$1")
      USE_NL(_at_"SEL$1_2" "WDA"_at_"SEL$1_2")
      NLJ_BATCHING(_at_"SEL$1_2" "WDA"_at_"SEL$1_2")
      END_OUTLINE_DATA

  */

The SQL statement is also pasted below:
SELECT WDD.SOURCE_LINE_ID, WDD.SOURCE_HEADER_ID, WDD.INVENTORY_ITEM_ID,

  WDD.SHIPMENT_PRIORITY_CODE, WDD.SOURCE_CODE SOURCE_CODE,
  WDD.SOURCE_HEADER_NUMBER SOURCE_HEADER_NUMBER, WDD.ORGANIZATION_ID,
  WDD.MOVE_ORDER_LINE_ID, WDD.SHIP_FROM_LOCATION_ID, WDD.SHIP_METHOD_CODE,
  WDD.SHIPMENT_PRIORITY_CODE, WDD.DATE_SCHEDULED, WDD.REQUESTED_QUANTITY,
  WDD.REQUESTED_QUANTITY_UOM, WDD.PREFERRED_GRADE, WDD.REQUESTED_QUANTITY2,
  WDD.REQUESTED_QUANTITY_UOM2, WDD.PROJECT_ID, WDD.TASK_ID, WDD.SUBINVENTORY,
  WDD.SUBINVENTORY, WDD.RELEASED_STATUS, WDD.SHIP_MODEL_COMPLETE_FLAG,
  WDD.TOP_MODEL_LINE_ID, WDD.SHIP_SET_ID, WDA.DELIVERY_ID,
  WDD.LAST_UPDATE_DATE

FROM
WSH_DELIVERY_DETAILS WDD, WSH_DELIVERY_ASSIGNMENTS WDA WHERE
  WDD.DELIVERY_DETAIL_ID = :B15 AND WDA.DELIVERY_DETAIL_ID =
  WDD.DELIVERY_DETAIL_ID AND WDD.DATE_SCHEDULED IS NOT NULL AND
  WDD.DATE_REQUESTED >= NVL(:B14 , WDD.DATE_REQUESTED) AND WDD.DATE_REQUESTED
  <= NVL(:B13 , WDD.DATE_REQUESTED) AND WDD.DATE_SCHEDULED >= NVL(:B12 ,   WDD.DATE_SCHEDULED) AND WDD.DATE_SCHEDULED <= NVL(:B11 , WDD.DATE_SCHEDULED)    AND NVL(WDD.REQUESTED_QUANTITY,0) > 0 AND WDD.RELEASED_STATUS IN ('R','B',   'X') AND NVL(WDD.SUBINVENTORY, -99) = DECODE(:B10 , NULL,   NVL(WDD.SUBINVENTORY, -99), :B10 ) AND NVL(WDD.PROJECT_ID,0) = DECODE(:B9 ,   0, NVL(WDD.PROJECT_ID,0), :B9 ) AND NVL(WDD.TASK_ID,0) = DECODE(:B8 , 0,   NVL(WDD.TASK_ID,0), :B8 ) AND NVL(WDD.SHIP_SET_ID,0) = DECODE(:B7 , 0,   NVL(WDD.SHIP_SET_ID,0), :B7 ) AND NVL(WDD.SHIPMENT_PRIORITY_CODE,-99) =   DECODE(:B6 , NULL, NVL(WDD.SHIPMENT_PRIORITY_CODE,-99), :B6 ) AND   WDD.ORGANIZATION_ID = NVL(:B5 , WDD.ORGANIZATION_ID) AND   WDD.SHIP_FROM_LOCATION_ID = DECODE(:B4 , -1, WDD.SHIP_FROM_LOCATION_ID, :B4   ) AND (( WDA.DELIVERY_ID IS NOT NULL AND ( :B3 <> 'N' OR WDA.DELIVERY_ID =   :B2 OR :B1 <> 0 ) ) OR ( WDA.DELIVERY_ID IS NULL AND :B2 = 0 AND :B1 = 0 ) )

How can I enforce the bad execution plan from SQL*Plus? If I am able to enforce it then I already have bind values from that run and I will be able to reproduce the issue.

Thanks,
Amir

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Sep 02 2016 - 22:45:33 CEST

Original text of this message