Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Optimizer Instability on 10.2.0.2

Re: Optimizer Instability on 10.2.0.2

From: Vladimir M. Zakharychev <vladimir.zakharychev_at_gmail.com>
Date: 20 Apr 2007 04:10:04 -0700
Message-ID: <1177067404.797223.54530@n59g2000hsh.googlegroups.com>


On Apr 20, 12:39 pm, m..._at_hotmail.com wrote:
> Oracle 10.2.0.2 on Windows 2003 (EM64T)
>
> One of our batch processes that runs every 30 minutes is fluctuating
> between fast and slow execution times. 50% of the time it runs in 1
> minute and the other 50% it runs for 10 minutes. The behaviour is
> erratic and there doesn't appear to be a pattern in the change in
> execution time. (There is a gap between 23:00 and 06:00 when it
> doesn't run).
>
> I have captured 10046 traces for a handful of the executions so that I
> can compare the SQLs. It appears that one particular piece of SQL
> switches from doing an efficient Index lookup on a large table
> (PS_JOB) to a full table scan intermittently between runs (see 10046
> output below).
>
> Any ideas where I can start looking for reasons why this is occurring.
>
> The stats havent been gathered for over a week (we use STALE stats
> gathering). Interestingly the "Misses in library cache" is 1 for the
> fast query and 0 for the slow query.
>
> Fast Run:
>
> SELECT DISTINCT I01.EMPLID, I01.EMPL_RCD, I02.PER_ORG,
> TO_CHAR(I02.EFFDT,
> 'YYYY-MM-DD'), I02.EFFSEQ
> from
> PS_AUD_JOB I01, PS_JOB I02 where I01.AUDIT_STAMP > :1 and
> I01.EMPLID
> = I02.EMPLID and I01.EMPL_RCD = I02.EMPL_RCD and
> I02.JOB_INDICATOR = 'P' and (I02.PER_ORG = 'EMP' or :2 =
> 'TMMFCHRO')
> and I02.COMPANY IN (SELECT MST.COMPANY FROM PS_HRMSSI_COMPANY
> MST WHERE
> MST.HRMSSI_ID = :2) and ( I02.EFFDT = (SELECT MAX(A_ED1.EFFDT)
> FROM
> PS_JOB A_ED1 WHERE I02.EMPLID = A_ED1.EMPLID AND I02.EMPL_RCD =
> A_ED1.EMPL_RCD) AND I02.EFFSEQ = (SELECT MAX(A_ES1.EFFSEQ) FROM
> PS_JOB
> A_ES1 WHERE I02.EMPLID = A_ES1.EMPLID AND I02.EMPL_RCD =
> A_ES1.EMPL_RCD AND
> I02.EFFDT = A_ES1.EFFDT) ) ORDER BY I01.EMPLID
>
> call count cpu elapsed disk query
> current rows
> ------- ------ -------- ---------- ---------- ---------- ----------
> ----------
> Parse 6 0.00 0.00 0 0
> 0 0
> Execute 7 0.14 0.08 0 918
> 0 0
> Fetch 8 0.01 0.01 0 2200
> 0 3
> ------- ------ -------- ---------- ---------- ---------- ----------
> ----------
> total 21 0.15 0.11 0 3118
> 0 3
>
> Misses in library cache during parse: 1
> Misses in library cache during execute: 1
> Optimizer mode: ALL_ROWS
> Parsing user id: 21
>
> Rows Row Source Operation
> ------- ---------------------------------------------------
> 0 SORT UNIQUE (cr=284 pr=0 pw=0 time=1966 us)
> 0 FILTER (cr=284 pr=0 pw=0 time=1949 us)
> 0 SORT GROUP BY (cr=284 pr=0 pw=0 time=1948 us)
> 0 NESTED LOOPS (cr=284 pr=0 pw=0 time=1923 us)
> 0 NESTED LOOPS (cr=284 pr=0 pw=0 time=1921 us)
> 93 NESTED LOOPS (cr=282 pr=0 pw=0 time=2324 us)
> 6 TABLE ACCESS BY INDEX ROWID PS_AUD_JOB (cr=3 pr=0 pw=0
> time=46 us)
> 6 INDEX RANGE SCAN PSBAUD_JOB (cr=2 pr=0 pw=0 time=28 us)
> (object id 69663008)
> 93 TABLE ACCESS BY INDEX ROWID PS_JOB (cr=279 pr=0 pw=0
> time=1787 us)
> 93 INDEX RANGE SCAN PSAJOB (cr=203 pr=0 pw=0 time=161 us)
> (object id 69663591)
> 93 SORT AGGREGATE (cr=188 pr=0 pw=0 time=966 us)
> 93 FIRST ROW (cr=188 pr=0 pw=0 time=689 us)
> 93 INDEX RANGE SCAN (MIN/MAX) PSAJOB (cr=188 pr=0 pw=0
> time=502 us)(object id 69663591)
> 0 INDEX UNIQUE SCAN PS_HRMSSI_COMPANY (cr=2 pr=0 pw=0
> time=177 us)(object id 52907198)
> 0 INDEX RANGE SCAN PSAJOB (cr=0 pr=0 pw=0 time=0 us)(object
> id 69663591)
>
> Elapsed times include waiting on following events:
> Event waited on Times Max. Wait Total
> Waited
> ---------------------------------------- Waited ----------
> ------------
> SQL*Net message to client 14
> 0.00 0.00
> SQL*Net message from client 14
> 0.00 0.00
>
> Slow Run:
>
> SELECT DISTINCT I01.EMPLID, I01.EMPL_RCD, I02.PER_ORG,
> TO_CHAR(I02.EFFDT,
> 'YYYY-MM-DD'), I02.EFFSEQ
> from
> PS_AUD_JOB I01, PS_JOB I02 where I01.AUDIT_STAMP > :1 and
> I01.EMPLID
> = I02.EMPLID and I01.EMPL_RCD = I02.EMPL_RCD and
> I02.JOB_INDICATOR = 'P' and (I02.PER_ORG = 'EMP' or :2 =
> 'TMMFCHRO')
> and I02.COMPANY IN (SELECT MST.COMPANY FROM PS_HRMSSI_COMPANY
> MST WHERE
> MST.HRMSSI_ID = :2) and ( I02.EFFDT = (SELECT MAX(A_ED1.EFFDT)
> FROM
> PS_JOB A_ED1 WHERE I02.EMPLID = A_ED1.EMPLID AND I02.EMPL_RCD =
> A_ED1.EMPL_RCD) AND I02.EFFSEQ = (SELECT MAX(A_ES1.EFFSEQ) FROM
> PS_JOB
> A_ES1 WHERE I02.EMPLID = A_ES1.EMPLID AND I02.EMPL_RCD =
> A_ES1.EMPL_RCD AND
> I02.EFFDT = A_ES1.EFFDT) ) ORDER BY I01.EMPLID
>
> call count cpu elapsed disk query
> current rows
> ------- ------ -------- ---------- ---------- ---------- ----------
> ----------
> Parse 6 0.00 0.00 0 0
> 0 0
> Execute 7 0.00 0.00 0 0
> 0 0
> Fetch 9 6.53 182.13 118791 194200
> 0 4
> ------- ------ -------- ---------- ---------- ---------- ----------
> ----------
> total 22 6.53 182.13 118791 194200
> 0 4
>
> Misses in library cache during parse: 0
> Optimizer mode: ALL_ROWS
> Parsing user id: 21
>
> Rows Row Source Operation
> ------- ---------------------------------------------------
> 1 SORT UNIQUE (cr=27768 pr=19559 pw=0 time=19634419 us)
> 1 FILTER (cr=27768 pr=19559 pw=0 time=19634377 us)
> 15 SORT GROUP BY (cr=27768 pr=19559 pw=0 time=19634385 us)
> 225 FILTER (cr=27768 pr=19559 pw=0 time=2971623 us)
> 225 NESTED LOOPS (cr=27736 pr=19559 pw=0 time=2971080 us)
> 15 NESTED LOOPS (cr=27704 pr=19559 pw=0 time=41958542 us)
> 67 HASH JOIN (cr=27702 pr=19559 pw=0 time=186768875 us)
> 4 TABLE ACCESS BY INDEX ROWID PS_AUD_JOB (cr=4 pr=1 pw=0
> time=44 us)
> 4 INDEX RANGE SCAN PSBAUD_JOB (cr=2 pr=0 pw=0 time=13
> us)(object id 69663008)
> 340719 TABLE ACCESS FULL PS_JOB (cr=27698 pr=19558 pw=0
> time=18413028 us)
> 15 INDEX UNIQUE SCAN PS_HRMSSI_COMPANY (cr=2 pr=0 pw=0
> time=499 us)(object id 52907198)
> 225 INDEX RANGE SCAN PSAJOB (cr=32 pr=0 pw=0 time=194 us)
> (object id 69663591)
> 15 SORT AGGREGATE (cr=32 pr=0 pw=0 time=248 us)
> 15 FIRST ROW (cr=32 pr=0 pw=0 time=160 us)
> 15 INDEX RANGE SCAN (MIN/MAX) PSAJOB (cr=32 pr=0 pw=0
> time=121 us)(object id 69663591)
>
> Elapsed times include waiting on following events:
> Event waited on Times Max. Wait Total
> Waited
> ---------------------------------------- Waited ----------
> ------------
> SQL*Net message to client 15
> 0.00 0.00
> SQL*Net message from client 15
> 0.00 0.00
> db file sequential read 11213
> 0.17 46.12
> db file scattered read 22862 0.17
> 131.08
> latch: shared pool 2
> 0.00 0.00

0 misses in library cache means that the slow run picks up cached execution plan from the shared pool. Is this query used anywhere else, or only in the batch? The problem seems to be in the join of PS_AUD_JOB and PS_JOB. Enter bind variable peeking: different values of :1 for PS_AUD_JOB.AUDIT_STAMP > :1 predicate might be the cause the CBO chooses nested loops join in one case and hash join in another. Since the CBO only peeks bind variable values on the first execution (in other words, when hard-parsing,) "unfortunate" first execution may produce a suboptimal plan, which will then be reused until the statement ages out of the cache. Probably AUDIT_STAMP for the slow run had rather high selectivity and the CBO went for hash join, which is generally faster when joining high volumes of data.

Regards,

   Vladimir M. Zakharychev
   N-Networks, makers of Dynamic PSP(tm)    http://www.dynamicpsp.com Received on Fri Apr 20 2007 - 06:10:04 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US