Re: CUBE temp table transformation gives bad access plan?

From: Mohamed Houri <mohamed.houri_at_gmail.com>
Date: Thu, 30 Oct 2014 13:04:02 +0100
Message-ID: <CAJu8R6ho0sT5M-MXR8N+snsh3PR_U-dmnJepxEZ3U-_gWmHBGQ_at_mail.gmail.com>



Martin

I was thinking about the same blog I read yesterday.

On of the table that serves for the temp materialization is LAGERKART which has been started 446K

TABLE ACCESS BY INDEX ROWID| LAGERKART                  |    446K|      1 |
  3162 |00:00:01.60

Because it is the inner table of a Nested Loop operation that is driven by an underestimate operation

ABLE ACCESS BY INDEX ROWID| DEBFAKPOST                 |      1 |     58 |

   446K

Trying to anihilate this materialization with the /*+ inline */ hint my give a CBO an extra path that might avoid that INDEX SKIP SCAN access

Mohamed Houri

2014-10-30 12:50 GMT+01:00 Martin Preiss <mtnpreiss_at_gmx.de>:

> Hi,
>
> maybe that's more of a free assoziation - but Randolf Geist just wrote
> about some strange effects with temp table transformation - resulting from
> the heuristic (i.e. not cost based) character of this transformation:
> http://oracle-randolf.blogspot.de/2014/10/heuristic-temp-table-transformation.html.
> So maybe an inline hint could improve the quality of the cardinalities for
> the CUBE variant.
>
> To understand why the CUBE version uses the temp table transformation at
> all, I would create a CBO trace (10053).
>
> Regards
>
> Martin Preiss
>
> Am 30.10.2014 11:29, schrieb Kim Berg Hansen:
>
> Hi, List
>
> I have stumbled upon something I think is weird behaviour when using
> CUBE and I was wondering if anyone has seen this before and maybe know of a
> workaround?
>
> I can't seem to reproduce a simple case, but I'll try explaining with a
> somewhat shortened version of my query that still exhibits this odd
> behaviour.
>
> First I have a statement that uses ROLLUP:
>
>
> select /*+ all_rows gather_plan_statistics */
> case grouping(mxned)
> when 1 then 'Total'
> else to_char(mxned,'YYYY-MM')
> end "MÃ¥ned"
> , case grouping(debitorgruppe)
> when 1 then 'Total'
> else debitorgruppe
> end "Gruppe"
> , count(distinct fakturakonto) "Ialt Kunder"
> , count(*) "Ialt Ordrer"
> , sum(samantal) "Ialt Styk total"
> , avg(samantal) "Ialt Styk gns"
> from (
> select s1.*
> , case when antal >= 5 and rabat = 1 then antal end samantal
> from (
> select
> trunc(fj.fakturadato,'MM') mxned
> , fj.debitorgruppe
> , fj.fakturakonto
> , fj.lxbenummer
> , sum(fp.antal) antal
> , case when sum(fp.liniebelxb * fp.momsfaktor) < sum(fp.antal *
> lk.salgspris * 0.9) then 1 else 0 end rabat
> , count(distinct fp.varenummer) blandvarer
> from lagerkart lk -- <<<
> join debfakpost fp -- <<<
> on fp.dataset = 'DAT' -- <<<
> and fp.varenummer = lk.varenummer -- <<<
> and fp.fakturadato >= date '2013-01-01' -- <<<
> and fp.fakturadato < date '2013-02-01' -- <<<
> join debfakjour fj
> on fj.dataset = fp.dataset
> and fj.fakturanummer = fp.fakturanummer
> and fj.fakturadato = fp.fakturadato
> and fj.ordrenummer = fp.ordrenummer
> where lk.dataset = 'DAT'
> and lk.samkxbsrabat = 'SLANGER'
> group by
> trunc(fj.fakturadato,'MM')
> , fj.debitorgruppe
> , fj.fakturakonto
> , fj.lxbenummer
> ) s1
> )
> group by
> rollup(mxned, debitorgruppe)
> order by
> mxned, debitorgruppe
> ;
>
>
> The join between LAGERKART (LK) and DEBFAKPOST (FP) and the predicates
> on FP (marked -- <<<) is the interesting bit.
> That's step 10-12 in the plan below, which is the good plan that I like.
>
>
>
> -----------------------------------------------------------------------------------------------------------------------------------------------
> | Id | Operation | Name | Starts
> | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
>
> -----------------------------------------------------------------------------------------------------------------------------------------------
> | 0 | SELECT STATEMENT | | 1
> | | 8 |00:00:00.35 | 63669 | | | |
> | 1 | SORT ORDER BY | | 1
> | 2 | 8 |00:00:00.35 | 63669 | 2048 | 2048 | 2048 (0)|
> | 2 | SORT GROUP BY ROLLUP | | 1
> | 2 | 8 |00:00:00.35 | 63669 | 38912 | 38912 |34816 (0)|
> | 3 | VIEW | | 1
> | 4 | 2943 |00:00:00.35 | 63669 | | | |
> | 4 | HASH GROUP BY | | 1
> | 4 | 2943 |00:00:00.34 | 63669 | 1003K| 1003K| 1358K (0)|
> | 5 | VIEW | VW_DAG_0 | 1
> | 10 | 3139 |00:00:00.34 | 63669 | | | |
> | 6 | HASH GROUP BY | | 1
> | 10 | 3139 |00:00:00.34 | 63669 | 1063K| 1000K| 1368K (0)|
> | 7 | NESTED LOOPS | | 1
> | | 3162 |00:00:00.04 | 63669 | | | |
> | 8 | NESTED LOOPS | | 1
> | 10 | 3162 |00:00:00.03 | 60555 | | | |
> | 9 | NESTED LOOPS | | 1
> | 14 | 3162 |00:00:00.01 | 51044 | | | |
> |* 10 | TABLE ACCESS FULL | LAGERKART | 1
> | 23 | 23 |00:00:00.16 | 47850 | | | |
> | 11 | TABLE ACCESS BY INDEX ROWID| DEBFAKPOST | 23
> | 1 | 3162 |00:00:00.01 | 3194 | | | |
> |* 12 | INDEX RANGE SCAN | I_102VARENRIDX | 23
> | 1 | 3162 |00:00:00.01 | 90 | | | |
> |* 13 | INDEX RANGE SCAN | I_041FAKTURANRIDX | 3162
> | 1 | 3162 |00:00:00.02 | 9511 | | | |
> |* 14 | TABLE ACCESS BY INDEX ROWID | DEBFAKJOUR | 3162
> | 1 | 3162 |00:00:00.01 | 3114 | | | |
>
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 10 - filter(("LK"."SAMKXBSRABAT"='SLANGER' AND "LK"."DATASET"='DAT'))
> 12 - access("FP"."DATASET"='DAT' AND "FP"."VARENUMMER"="LK"."VARENUMMER"
> AND "FP"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00',
> 'syyyy-mm-dd hh24:mi:ss') AND "FP"."FAKTURADATO"<TO_DATE('
> 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
> 13 - access("FJ"."DATASET"='DAT' AND
> "FJ"."FAKTURANUMMER"="FP"."FAKTURANUMMER" AND
> "FJ"."FAKTURADATO"="FP"."FAKTURADATO")
> filter(("FJ"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00',
> 'syyyy-mm-dd hh24:mi:ss') AND "FJ"."FAKTURADATO"<TO_DATE('
> 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
> 14 - filter("FJ"."ORDRENUMMER"="FP"."ORDRENUMMER")
>
> Note
> -----
> - cardinality feedback used for this statement
>
>
> There is no decent index for the WHERE clause on LAGERKART - the full
> table access in step 10 is fine by me - it gets 23 rows.
> Nested loop for each of those 23 rows using index I_102VARENRIDX in step
> 12 is pretty much optimal getting a total of 3.162 rows.
> First component of the index is DATASET which has a constant predicate =
> 'DAT'.
> Second component of the index is VARENUMMER which is the join predicate
> from LAGERKART.
> Third component of the index is FAKTURADATO which has a range predicate of
> one month.
>
> (This is sales data - query finds the invoice lines for one month for
> the 23 products that belongs to a group 'SLANGER'.)
>
>
> Now I change ROLLUP in the statement to CUBE - that's the only change I
> make. Now I get this terrible plan:
>
>
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> | Id | Operation | Name |
> Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem |
> 1Mem | Used-Mem |
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> | 0 | SELECT STATEMENT | |
> 1 | | 14 |00:00:03.31 | 1599K| 15 | 15 | |
> | |
> | 1 | TEMP TABLE TRANSFORMATION | |
> 1 | | 14 |00:00:03.31 | 1599K| 15 | 15 | |
> | |
> | 2 | LOAD AS SELECT | |
> 1 | | 0 |00:00:03.28 | 1598K| 0 | 13 | 270K|
> 270K| 270K (0)|
> | 3 | HASH GROUP BY | |
> 1 | 1 | 2943 |00:00:03.27 | 1598K| 0 | 0 | 1009K|
> 1009K| 1363K (0)|
> | 4 | VIEW | |
> 1 | 1 | 3139 |00:00:03.26 | 1598K| 0 | 0 | |
> | |
> | 5 | SORT GROUP BY | |
> 1 | 1 | 3139 |00:00:03.26 | 1598K| 0 | 0 | 478K|
> 448K| 424K (0)|
> | 6 | NESTED LOOPS | |
> 1 | | 3162 |00:00:00.73 | 1598K| 0 | 0 | |
> | |
> | 7 | NESTED LOOPS | |
> 1 | 1 | 3162 |00:00:00.72 | 1596K| 0 | 0 | |
> | |
> | 8 | NESTED LOOPS | |
> 1 | 2 | 3162 |00:00:00.70 | 1586K| 0 | 0 | |
> | |
> | 9 | TABLE ACCESS BY INDEX ROWID| DEBFAKPOST |
> 1 | 58 | 446K|00:00:01.66 | 377K| 0 | 0 | |
> | |
> |* 10 | INDEX SKIP SCAN | I_102VAREGRPDEBGRPIDX |
> 1 | 1 | 446K|00:00:00.64 | 7451 | 0 | 0 | |
> | |
> |* 11 | TABLE ACCESS BY INDEX ROWID| LAGERKART |
> 446K| 1 | 3162 |00:00:01.60 | 1209K| 0 | 0 | |
> | |
> |* 12 | INDEX UNIQUE SCAN | I_011VAREIDX |
> 446K| 1 | 446K|00:00:00.94 | 762K| 0 | 0 | |
> | |
> |* 13 | INDEX RANGE SCAN | I_041FAKTURANRIDX |
> 3162 | 1 | 3162 |00:00:00.02 | 9436 | 0 | 0 | |
> | |
> |* 14 | TABLE ACCESS BY INDEX ROWID | DEBFAKJOUR |
> 3162 | 1 | 3162 |00:00:00.01 | 2916 | 0 | 0 | |
> | |
> | 15 | LOAD AS SELECT | |
> 1 | | 0 |00:00:00.01 | 43 | 13 | 1 | 270K|
> 270K| 270K (0)|
> | 16 | SORT GROUP BY ROLLUP | |
> 1 | 1 | 7 |00:00:00.01 | 18 | 13 | 0 | 36864 |
> 36864 |32768 (0)|
> | 17 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE201_36B54E9 |
> 1 | 1 | 2943 |00:00:00.01 | 18 | 13 | 0 | |
> | |
> | 18 | LOAD AS SELECT | |
> 1 | | 0 |00:00:00.01 | 20 | 0 | 1 | 270K|
> 270K| 270K (0)|
> | 19 | SORT GROUP BY ROLLUP | |
> 1 | 1 | 7 |00:00:00.01 | 15 | 0 | 0 | 33792 |
> 33792 |30720 (0)|
> | 20 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE201_36B54E9 |
> 1 | 1 | 2943 |00:00:00.16 | 15 | 0 | 0 | |
> | |
> | 21 | SORT ORDER BY | |
> 1 | 20 | 14 |00:00:00.01 | 7 | 2 | 0 | 2048 |
> 2048 | 2048 (0)|
> | 22 | VIEW | |
> 1 | 1 | 14 |00:00:00.01 | 7 | 2 | 0 | |
> | |
> | 23 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE203_36B54E9 |
> 1 | 1 | 14 |00:00:00.01 | 7 | 2 | 0 | |
> | |
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 10 - access("SYS_TBL_$4$"."DATASET"='DAT' AND
> "SYS_TBL_$4$"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd
> hh24:mi:ss') AND
> "SYS_TBL_$4$"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00',
> 'syyyy-mm-dd hh24:mi:ss'))
> filter(("SYS_TBL_$4$"."FAKTURADATO">=TO_DATE(' 2013-01-01
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
> SYS_OP_DESCEND("FAKTURADATO")>HEXTORAW('878EFDFEF8FEF8FF')
> AND "SYS_TBL_$4$"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00',
> 'syyyy-mm-dd hh24:mi:ss') AND
>
> SYS_OP_DESCEND("FAKTURADATO")<=HEXTORAW('878EFEF8FEF8FEFAFF') ))
> 11 - filter("SYS_TBL_$3$"."SAMKXBSRABAT"='SLANGER')
> 12 - access("SYS_TBL_$3$"."DATASET"='DAT' AND
> "SYS_TBL_$4$"."VARENUMMER"="SYS_TBL_$3$"."VARENUMMER")
> 13 - access("SYS_TBL_$5$"."DATASET"='DAT' AND
> "SYS_TBL_$5$"."FAKTURANUMMER"="SYS_TBL_$4$"."FAKTURANUMMER" AND
> "SYS_TBL_$5$"."FAKTURADATO"="SYS_TBL_$4$"."FAKTURADATO")
> filter(("SYS_TBL_$5$"."FAKTURADATO">=TO_DATE(' 2013-01-01
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
> "SYS_TBL_$5$"."FAKTURADATO"<TO_DATE(' 2013-02-01
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
> SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO")<=HEXTORAW('878EFEF8FEF8FEFAFF')
> AND
>
> SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO")>HEXTORAW('878EFDFEF8FEF8FF')
> AND "SYS_TBL_$5$"."DATASET"="SYS_TBL_$4$"."DATASET" AND
>
> SYS_OP_DESCEND("FAKTURADATO")=SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO")))
> 14 - filter("SYS_TBL_$5$"."ORDRENUMMER"="SYS_TBL_$4$"."ORDRENUMMER")
>
> Note
> -----
> - cardinality feedback used for this statement
>
>
> Step 10 - the INDEX SKIP SCAN - now finds sales data for that month for
> ALL products - 446 thousand rows. For each of those near half million rows
> an INDEX UNIQUE SCAN on LAGERKART is used to filter away almost all of
> those leaving the 3.162 rows that are interesting.
>
>
> Now I notice that the ROLLUP plan step 4 to 14 gathers the desired rows,
> and then step 2 is a SORT GROUP BY ROLLUP.
> Fine, ROLLUP can be done as part of the sorting operation.
>
> The CUBE plan step 3 to 14 also gathers the desired rows, but then does
> TEMP TABLE TRANSFORMATION.
> And then the results of the CUBE is created by a couple of ROLLUP
> operations on the SYS TEMP TABLE.
> Fine, I can understand that as well ;-)
>
> What I cannot understand is why step 3-14 in the CUBE plan is not the
> same as step 4-14 in the ROLLUP plan?
> Both are doing the same job - gathering the desired rows - difference
> being that those steps in one case feeds rows to SORT GROUP BY ROLLUP, in
> the other case feeds rows to TEMP TABLE.
>
> Why is the same query part with same predicates picking so different
> access plans in those two cases?
>
>
> I can see that the cardinality estimate for DEBFAKPOST is wildly off. An
> estimate of 1 when actual is 446 thousand - that's bad, of course ;-)
> But I can't see that this should be the cause of the two plans being so
> different? Or can it?
>
> What can I look for to determine why that cardinality estimate is so
> wrong?
> I think my stats for the table and indexes should be OK, but probably they
> are not.
> How can I identify what's wrong with my stats?
>
>
> And another slightly weird thing. I tried hinting the inner query in the
> CUBE query with:
>
> /*+ index(fp I_102VARENRIDX) */
>
> That gives me the exact same plan as without the hint. Even the same
> plan hash value!
> Other hints as well - seems like the TEMP TABLE TRANSFORMATION causes the
> hints to be ignored?
> (Well, I know Jonathan Lewis state that hints aren't ignored, they are
> just not always applicable to the access path chosen, but it *looks* like
> ignored ;-)
>
> I have tried using hints in the inner query of the ROLLUP query, and
> hinting in that case works like normal.
>
>
> Version info:
> Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit
> Production
> PL/SQL Release 11.2.0.3.0 - Production
> CORE 11.2.0.3.0 Production
> TNS for Linux: Version 11.2.0.3.0 - Production
> NLSRTL Version 11.2.0.3.0 - Production
>
>
>
> Any ideas what to look for in my stats would be appreciated.
>
> That there is a difference in the way the optimizer works on the TEMP
> TABLE TRANSFORMATION when doing CUBE I probably can't change.
> (Well, I can upgrade and hope 12.1.0.2.0 behaves differently - but that
> won't happen until springtime ;-)
>
> But presumably if I can fix my cardinality estimates, hopefully I can
> get a good plan both for ROLLUP and CUBE?
>
>
> Thanks
>
>
> Regards
>
>
> Kim Berg Hansen
>
> http://dspsd.blogspot.com
> kibeha_at_gmail.com
> _at_kibeha
>
>
>
>

-- 

Houri Mohamed

Oracle DBA-Developer-Performance & Tuning

Member of Oraworld-team <http://www.oraworld-team.com/>

Visit My         - Blog <http://www.hourim.wordpress.com/>

Let's Connect -
<http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>*Linkedin
Profile <http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>*

My Twitter <https://twitter.com/MohamedHouri>      - MohamedHouri
<https://twitter.com/MohamedHouri>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Oct 30 2014 - 13:04:02 CET

Original text of this message