CUBE temp table transformation gives bad access plan?
Date: Thu, 30 Oct 2014 11:29:47 +0100
Message-ID: <CA+S=qd10mAb=G+ri1JuygFfe-4-SJYCT4Q=9UFEz-kAxa6JnAw_at_mail.gmail.com>
Hi, List
I have stumbled upon something I think is weird behaviour when using CUBE and I was wondering if anyone has seen this before and maybe know of a workaround?
I can't seem to reproduce a simple case, but I'll try explaining with a somewhat shortened version of my query that still exhibits this odd behaviour.
First I have a statement that uses ROLLUP:
select /*+ all_rows gather_plan_statistics */
case grouping(mxned)
when 1 then 'Total' else to_char(mxned,'YYYY-MM')
end "MÃ¥ned"
, case grouping(debitorgruppe)
when 1 then 'Total' else debitorgruppe
end "Gruppe"
, count(distinct fakturakonto) "Ialt Kunder" , count(*) "Ialt Ordrer" , sum(samantal) "Ialt Styk total" , avg(samantal) "Ialt Styk gns"
from (
select s1.*
, case when antal >= 5 and rabat = 1 then antal end samantal from ( select trunc(fj.fakturadato,'MM') mxned
, fj.debitorgruppe
, fj.fakturakonto
, fj.lxbenummer
, sum(fp.antal) antal
, case when sum(fp.liniebelxb * fp.momsfaktor) < sum(fp.antal *
lk.salgspris * 0.9) then 1 else 0 end rabat
, count(distinct fp.varenummer) blandvarer
from lagerkart lk -- <<< join debfakpost fp -- <<< on fp.dataset = 'DAT' -- <<< and fp.varenummer = lk.varenummer -- <<< and fp.fakturadato >= date '2013-01-01' -- <<< and fp.fakturadato < date '2013-02-01' -- <<< join debfakjour fj on fj.dataset = fp.dataset and fj.fakturanummer = fp.fakturanummer and fj.fakturadato = fp.fakturadato and fj.ordrenummer = fp.ordrenummer where lk.dataset = 'DAT' and lk.samkxbsrabat = 'SLANGER' group by trunc(fj.fakturadato,'MM')
, fj.debitorgruppe
, fj.fakturakonto
, fj.lxbenummer
) s1
)
group by
rollup(mxned, debitorgruppe)
order by
mxned, debitorgruppe
;
The join between LAGERKART (LK) and DEBFAKPOST (FP) and the predicates on FP (marked -- <<<) is the interesting bit. That's step 10-12 in the plan below, which is the good plan that I like.
| Id | Operation | Name | Starts |E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
| 0 | SELECT STATEMENT | | 1 | | 8 |00:00:00.35 | 63669 | | | | | 1 | SORT ORDER BY | | 1 | 2 | 8 |00:00:00.35 | 63669 | 2048 | 2048 | 2048 (0)| | 2 | SORT GROUP BY ROLLUP | | 1 | 2 | 8 |00:00:00.35 | 63669 | 38912 | 38912 |34816 (0)| | 3 | VIEW | | 1 | 4 | 2943 |00:00:00.35 | 63669 | | | | | 4 | HASH GROUP BY | | 1 | 4 | 2943 |00:00:00.34 | 63669 | 1003K| 1003K| 1358K (0)| | 5 | VIEW | VW_DAG_0 | 1 | 10 | 3139 |00:00:00.34 | 63669 | | | | | 6 | HASH GROUP BY | | 1 | 10 | 3139 |00:00:00.34 | 63669 | 1063K| 1000K| 1368K (0)| | 7 | NESTED LOOPS | | 1 | | 3162 |00:00:00.04 | 63669 | | | | | 8 | NESTED LOOPS | | 1 | 10 | 3162 |00:00:00.03 | 60555 | | | | | 9 | NESTED LOOPS | | 1 | 14 | 3162 |00:00:00.01 | 51044 | | | | |* 10 | TABLE ACCESS FULL | LAGERKART | 1 | 23 | 23 |00:00:00.16 | 47850 | | | | | 11 | TABLE ACCESS BY INDEX ROWID| DEBFAKPOST | 23 | 1 | 3162 |00:00:00.01 | 3194 | | | | |* 12 | INDEX RANGE SCAN | I_102VARENRIDX | 23 | 1 | 3162 |00:00:00.01 | 90 | | | | |* 13 | INDEX RANGE SCAN | I_041FAKTURANRIDX | 3162 | 1 | 3162 |00:00:00.02 | 9511 | | | | |* 14 | TABLE ACCESS BY INDEX ROWID | DEBFAKJOUR | 3162 | 1 | 3162 |00:00:00.01 | 3114 | | | | -----------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
10 - filter(("LK"."SAMKXBSRABAT"='SLANGER' AND "LK"."DATASET"='DAT')) 12 - access("FP"."DATASET"='DAT' AND "FP"."VARENUMMER"="LK"."VARENUMMER" AND "FP"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "FP"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
13 - access("FJ"."DATASET"='DAT' AND "FJ"."FAKTURANUMMER"="FP"."FAKTURANUMMER" AND "FJ"."FAKTURADATO"="FP"."FAKTURADATO") filter(("FJ"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "FJ"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))14 - filter("FJ"."ORDRENUMMER"="FP"."ORDRENUMMER")
Note
- cardinality feedback used for this statement
There is no decent index for the WHERE clause on LAGERKART - the full table
access in step 10 is fine by me - it gets 23 rows.
Nested loop for each of those 23 rows using index I_102VARENRIDX in step 12
is pretty much optimal getting a total of 3.162 rows.
First component of the index is DATASET which has a constant predicate =
'DAT'.
Second component of the index is VARENUMMER which is the join predicate
from LAGERKART.
Third component of the index is FAKTURADATO which has a range predicate of
one month.
(This is sales data - query finds the invoice lines for one month for the 23 products that belongs to a group 'SLANGER'.)
Now I change ROLLUP in the statement to CUBE - that's the only change I make. Now I get this terrible plan:
| Id | Operation | Name |Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:03.31 | 1599K| 15 | 15 | | | | | 1 | TEMP TABLE TRANSFORMATION | | 1 | | 14 |00:00:03.31 | 1599K| 15 | 15 | | | | | 2 | LOAD AS SELECT | | 1 | | 0 |00:00:03.28 | 1598K| 0 | 13 | 270K| 270K| 270K (0)| | 3 | HASH GROUP BY | | 1 | 1 | 2943 |00:00:03.27 | 1598K| 0 | 0 | 1009K| 1009K| 1363K (0)| | 4 | VIEW | | 1 | 1 | 3139 |00:00:03.26 | 1598K| 0 | 0 | | | | | 5 | SORT GROUP BY | | 1 | 1 | 3139 |00:00:03.26 | 1598K| 0 | 0 | 478K| 448K| 424K (0)| | 6 | NESTED LOOPS | | 1 | | 3162 |00:00:00.73 | 1598K| 0 | 0 | | | | | 7 | NESTED LOOPS | | 1 | 1 | 3162 |00:00:00.72 | 1596K| 0 | 0 | | | | | 8 | NESTED LOOPS | | 1 | 2 | 3162 |00:00:00.70 | 1586K| 0 | 0 | | | | | 9 | TABLE ACCESS BY INDEX ROWID| DEBFAKPOST | 1 | 58 | 446K|00:00:01.66 | 377K| 0 | 0 | | | | |* 10 | INDEX SKIP SCAN | I_102VAREGRPDEBGRPIDX | 1 | 1 | 446K|00:00:00.64 | 7451 | 0 | 0 | | | | |* 11 | TABLE ACCESS BY INDEX ROWID| LAGERKART | 446K| 1 | 3162 |00:00:01.60 | 1209K| 0 | 0 | | | | |* 12 | INDEX UNIQUE SCAN | I_011VAREIDX | 446K| 1 | 446K|00:00:00.94 | 762K| 0 | 0 | | | | |* 13 | INDEX RANGE SCAN | I_041FAKTURANRIDX | 3162 | 1 | 3162 |00:00:00.02 | 9436 | 0 | 0 | | | | |* 14 | TABLE ACCESS BY INDEX ROWID | DEBFAKJOUR | 3162 | 1 | 3162 |00:00:00.01 | 2916 | 0 | 0 | | | | | 15 | LOAD AS SELECT | | 1 | | 0 |00:00:00.01 | 43 | 13 | 1 | 270K| 270K| 270K (0)| | 16 | SORT GROUP BY ROLLUP | | 1 | 1 | 7 |00:00:00.01 | 18 | 13 | 0 | 36864 | 36864 |32768 (0)| | 17 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE201_36B54E9 | 1 | 1 | 2943 |00:00:00.01 | 18 | 13 | 0 | | | | | 18 | LOAD AS SELECT | | 1 | | 0 |00:00:00.01 | 20 | 0 | 1 | 270K| 270K| 270K (0)| | 19 | SORT GROUP BY ROLLUP | | 1 | 1 | 7 |00:00:00.01 | 15 | 0 | 0 | 33792 | 33792 |30720 (0)| | 20 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE201_36B54E9 | 1 | 1 | 2943 |00:00:00.16 | 15 | 0 | 0 | | | | | 21 | SORT ORDER BY | | 1 | 20 | 14 |00:00:00.01 | 7 | 2 | 0 | 2048 | 2048 | 2048 (0)| | 22 | VIEW | | 1 | 1 | 14 |00:00:00.01 | 7 | 2 | 0 | | | | | 23 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE203_36B54E9 | 1 | 1 | 14 |00:00:00.01 | 7 | 2 | 0 | | | | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
10 - access("SYS_TBL_$4$"."DATASET"='DAT' AND "SYS_TBL_$4$"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"SYS_TBL_$4$"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
filter(("SYS_TBL_$4$"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
SYS_OP_DESCEND("FAKTURADATO")>HEXTORAW('878EFDFEF8FEF8FF') AND "SYS_TBL_$4$"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
SYS_OP_DESCEND("FAKTURADATO")<=HEXTORAW('878EFEF8FEF8FEFAFF') ))
11 - filter("SYS_TBL_$3$"."SAMKXBSRABAT"='SLANGER') 12 - access("SYS_TBL_$3$"."DATASET"='DAT' AND "SYS_TBL_$4$"."VARENUMMER"="SYS_TBL_$3$"."VARENUMMER") 13 - access("SYS_TBL_$5$"."DATASET"='DAT' AND "SYS_TBL_$5$"."FAKTURANUMMER"="SYS_TBL_$4$"."FAKTURANUMMER" AND "SYS_TBL_$5$"."FAKTURADATO"="SYS_TBL_$4$"."FAKTURADATO") filter(("SYS_TBL_$5$"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00','syyyy-mm-dd hh24:mi:ss') AND "SYS_TBL_$5$"."FAKTURADATO"<TO_DATE(' 2013-02-01
00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO")<=HEXTORAW('878EFEF8FEF8FEFAFF') AND SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO")>HEXTORAW('878EFDFEF8FEF8FF') AND "SYS_TBL_$5$"."DATASET"="SYS_TBL_$4$"."DATASET" AND SYS_OP_DESCEND("FAKTURADATO")=SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO"))) 14 - filter("SYS_TBL_$5$"."ORDRENUMMER"="SYS_TBL_$4$"."ORDRENUMMER")
Note
- cardinality feedback used for this statement
Step 10 - the INDEX SKIP SCAN - now finds sales data for that month for ALL products - 446 thousand rows. For each of those near half million rows an INDEX UNIQUE SCAN on LAGERKART is used to filter away almost all of those leaving the 3.162 rows that are interesting.
Now I notice that the ROLLUP plan step 4 to 14 gathers the desired rows, and then step 2 is a SORT GROUP BY ROLLUP. Fine, ROLLUP can be done as part of the sorting operation.
The CUBE plan step 3 to 14 also gathers the desired rows, but then does
TEMP TABLE TRANSFORMATION.
And then the results of the CUBE is created by a couple of ROLLUP
operations on the SYS TEMP TABLE.
Fine, I can understand that as well ;-)
What I cannot understand is why step 3-14 in the CUBE plan is not the same
as step 4-14 in the ROLLUP plan?
Both are doing the same job - gathering the desired rows - difference being
that those steps in one case feeds rows to SORT GROUP BY ROLLUP, in the
other case feeds rows to TEMP TABLE.
Why is the same query part with same predicates picking so different access plans in those two cases?
I can see that the cardinality estimate for DEBFAKPOST is wildly off. An estimate of 1 when actual is 446 thousand - that's bad, of course ;-) But I can't see that this should be the cause of the two plans being so different? Or can it?
What can I look for to determine why that cardinality estimate is so wrong?
I think my stats for the table and indexes should be OK, but probably they
are not.
How can I identify what's wrong with my stats?
And another slightly weird thing. I tried hinting the inner query in the CUBE query with:
/*+ index(fp I_102VARENRIDX) */
That gives me the exact same plan as without the hint. Even the same plan
hash value!
Other hints as well - seems like the TEMP TABLE TRANSFORMATION causes the
hints to be ignored?
(Well, I know Jonathan Lewis state that hints aren't ignored, they are just
not always applicable to the access path chosen, but it *looks* like
ignored ;-)
I have tried using hints in the inner query of the ROLLUP query, and hinting in that case works like normal.
Version info:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
Any ideas what to look for in my stats would be appreciated.
That there is a difference in the way the optimizer works on the TEMP TABLE TRANSFORMATION when doing CUBE I probably can't change. (Well, I can upgrade and hope 12.1.0.2.0 behaves differently - but that won't happen until springtime ;-)
But presumably if I can fix my cardinality estimates, hopefully I can get a good plan both for ROLLUP and CUBE?
Thanks
Regards
Kim Berg Hansen
http://dspsd.blogspot.com
kibeha_at_gmail.com
_at_kibeha
-- http://www.freelists.org/webpage/oracle-lReceived on Thu Oct 30 2014 - 11:29:47 CET