Re: CUBE temp table transformation gives bad access plan?
Date: Thu, 30 Oct 2014 12:36:04 +0100
Message-ID: <CA+S=qd2EzS52N0ynTOhvCda9k=xgTw34ZwmNSU1Pdp7pZbwtRQ_at_mail.gmail.com>
*(Dang - didn't mean to click "Send" yet - sorry... - here's the correct one)*
Addendum:
I have a snapshot standby copy of the database, where I have tested that I
get the same result.
Then I have tried gathering stats on the DEBFAKPOST table just to be sure
that it wasn't stale stats:
begin
dbms_stats.gather_table_stats(USER, 'DEBFAKPOST');
end;
/
select table_name, last_analyzed, num_rows, sample_size
from user_tables
where table_name = 'DEBFAKPOST'
/
TABLE_NAME LAST_ANALYZED NUM_ROWS SAMPLE_SIZE ------------------------------ ------------------- ---------- ----------- DEBFAKPOST 2014-10-30 12:12:47 81952641 81952641
select index_name, uniqueness, distinct_keys, num_rows, sample_size,
last_analyzed
from user_indexes
where table_name = 'DEBFAKPOST'
order by index_name
/
INDEX_NAME UNIQUENES DISTINCT_KEYS NUM_ROWSSAMPLE_SIZE LAST_ANALYZED
------------------------------ --------- ------------- ---------- ----------- ------------------- DEBFAKPOST_SIDSTRETTET NONUNIQUE 5244 78393202 294164 2014-10-30 12:13:04 I_102DEBGRPVAREGRPIDX UNIQUE 77214519 77214519 160141 2014-10-30 12:13:01 I_102FAKIDX UNIQUE 79044150 79044150 146206 2014-10-30 12:12:57 I_102KONTOVAREDATOID2 UNIQUE 80426437 80426437 138597 2014-10-30 12:12:54 I_102KONTOVAREDATOIDX UNIQUE 78044700 78044700 131529 2014-10-30 12:12:51 I_102POSTIDIDX UNIQUE 85013065 85013065 263743 2014-10-30 12:13:00 I_102SEQNO UNIQUE 83323022 83323022 300666 2014-10-30 12:12:56 I_102VAREGRPDEBGRPIDX UNIQUE 82251500 82251500 170391 2014-10-30 12:13:03 I_102VAREGRPDEBIDX UNIQUE 80831917 80831917 156951 2014-10-30 12:12:52 I_102VARENRIDX UNIQUE 81516590 81516590181090 2014-10-30 12:12:58
select column_name, data_type, data_length, last_analyzed, num_distinct,
num_buckets, sample_size, histogram
from user_tab_columns
where table_name = 'DEBFAKPOST'
and column_name in ('DATASET','FAKTURADATO','VARENUMMER')
order by column_id
/
COLUMN_NAME DATA_TYPE DATA_LENGTH LAST_ANALYZEDNUM_DISTINCT NUM_BUCKETS SAMPLE_SIZE HISTOGRAM
------------------------------ ---------- ----------- ------------------- ------------ ----------- ----------- --------------- DATASET VARCHAR2 3 2014-10-30 12:11:24 12 10 5484 FREQUENCY FAKTURADATO DATE 7 2014-10-30 12:11:24 6425 1 81952641 NONE VARENUMMER VARCHAR2 20 2014-10-30 12:11:24 256976 254 5484 HEIGHT BALANCED
So now I'm pretty sure the stats are recent and valid.
It did not change a single thing on the access plans of my statements - exactly same access plans as on my first mail :-(
Regards
Kim Berg Hansen
http://dspsd.blogspot.com
kibeha_at_gmail.com
_at_kibeha
On Thu, Oct 30, 2014 at 11:29 AM, Kim Berg Hansen <kibeha_at_gmail.com> wrote:
> Hi, List > > I have stumbled upon something I think is weird behaviour when using CUBE > and I was wondering if anyone has seen this before and maybe know of a > workaround? > > I can't seem to reproduce a simple case, but I'll try explaining with a > somewhat shortened version of my query that still exhibits this odd > behaviour. > > First I have a statement that uses ROLLUP: > > > select /*+ all_rows gather_plan_statistics */ > case grouping(mxned) > when 1 then 'Total' > else to_char(mxned,'YYYY-MM') > end "MÃ¥ned" > , case grouping(debitorgruppe) > when 1 then 'Total' > else debitorgruppe > end "Gruppe" > , count(distinct fakturakonto) "Ialt Kunder" > , count(*) "Ialt Ordrer" > , sum(samantal) "Ialt Styk total" > , avg(samantal) "Ialt Styk gns" > from ( > select s1.* > , case when antal >= 5 and rabat = 1 then antal end samantal > from ( > select > trunc(fj.fakturadato,'MM') mxned > , fj.debitorgruppe > , fj.fakturakonto > , fj.lxbenummer > , sum(fp.antal) antal > , case when sum(fp.liniebelxb * fp.momsfaktor) < sum(fp.antal * > lk.salgspris * 0.9) then 1 else 0 end rabat > , count(distinct fp.varenummer) blandvarer > from lagerkart lk -- <<< > join debfakpost fp -- <<< > on fp.dataset = 'DAT' -- <<< > and fp.varenummer = lk.varenummer -- <<< > and fp.fakturadato >= date '2013-01-01' -- <<< > and fp.fakturadato < date '2013-02-01' -- <<< > join debfakjour fj > on fj.dataset = fp.dataset > and fj.fakturanummer = fp.fakturanummer > and fj.fakturadato = fp.fakturadato > and fj.ordrenummer = fp.ordrenummer > where lk.dataset = 'DAT' > and lk.samkxbsrabat = 'SLANGER' > group by > trunc(fj.fakturadato,'MM') > , fj.debitorgruppe > , fj.fakturakonto > , fj.lxbenummer > ) s1 > ) > group by > rollup(mxned, debitorgruppe) > order by > mxned, debitorgruppe > ; > > > The join between LAGERKART (LK) and DEBFAKPOST (FP) and the predicates on > FP (marked -- <<<) is the interesting bit. > That's step 10-12 in the plan below, which is the good plan that I like. > > > > ----------------------------------------------------------------------------------------------------------------------------------------------- > | Id | Operation | Name | Starts > | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | > > ----------------------------------------------------------------------------------------------------------------------------------------------- > | 0 | SELECT STATEMENT | | 1 > | | 8 |00:00:00.35 | 63669 | | | | > | 1 | SORT ORDER BY | | 1 > | 2 | 8 |00:00:00.35 | 63669 | 2048 | 2048 | 2048 (0)| > | 2 | SORT GROUP BY ROLLUP | | 1 > | 2 | 8 |00:00:00.35 | 63669 | 38912 | 38912 |34816 (0)| > | 3 | VIEW | | 1 > | 4 | 2943 |00:00:00.35 | 63669 | | | | > | 4 | HASH GROUP BY | | 1 > | 4 | 2943 |00:00:00.34 | 63669 | 1003K| 1003K| 1358K (0)| > | 5 | VIEW | VW_DAG_0 | 1 > | 10 | 3139 |00:00:00.34 | 63669 | | | | > | 6 | HASH GROUP BY | | 1 > | 10 | 3139 |00:00:00.34 | 63669 | 1063K| 1000K| 1368K (0)| > | 7 | NESTED LOOPS | | 1 > | | 3162 |00:00:00.04 | 63669 | | | | > | 8 | NESTED LOOPS | | 1 > | 10 | 3162 |00:00:00.03 | 60555 | | | | > | 9 | NESTED LOOPS | | 1 > | 14 | 3162 |00:00:00.01 | 51044 | | | | > |* 10 | TABLE ACCESS FULL | LAGERKART | 1 > | 23 | 23 |00:00:00.16 | 47850 | | | | > | 11 | TABLE ACCESS BY INDEX ROWID| DEBFAKPOST | 23 > | 1 | 3162 |00:00:00.01 | 3194 | | | | > |* 12 | INDEX RANGE SCAN | I_102VARENRIDX | 23 > | 1 | 3162 |00:00:00.01 | 90 | | | | > |* 13 | INDEX RANGE SCAN | I_041FAKTURANRIDX | 3162 > | 1 | 3162 |00:00:00.02 | 9511 | | | | > |* 14 | TABLE ACCESS BY INDEX ROWID | DEBFAKJOUR | 3162 > | 1 | 3162 |00:00:00.01 | 3114 | | | | > > ----------------------------------------------------------------------------------------------------------------------------------------------- > > Predicate Information (identified by operation id): > --------------------------------------------------- > > 10 - filter(("LK"."SAMKXBSRABAT"='SLANGER' AND "LK"."DATASET"='DAT')) > 12 - access("FP"."DATASET"='DAT' AND "FP"."VARENUMMER"="LK"."VARENUMMER" > AND "FP"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', > 'syyyy-mm-dd hh24:mi:ss') AND "FP"."FAKTURADATO"<TO_DATE(' > 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) > 13 - access("FJ"."DATASET"='DAT' AND > "FJ"."FAKTURANUMMER"="FP"."FAKTURANUMMER" AND > "FJ"."FAKTURADATO"="FP"."FAKTURADATO") > filter(("FJ"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', > 'syyyy-mm-dd hh24:mi:ss') AND "FJ"."FAKTURADATO"<TO_DATE(' > 2013-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))) > 14 - filter("FJ"."ORDRENUMMER"="FP"."ORDRENUMMER") > > Note > ----- > - cardinality feedback used for this statement > > > There is no decent index for the WHERE clause on LAGERKART - the full > table access in step 10 is fine by me - it gets 23 rows. > Nested loop for each of those 23 rows using index I_102VARENRIDX in step > 12 is pretty much optimal getting a total of 3.162 rows. > First component of the index is DATASET which has a constant predicate = > 'DAT'. > Second component of the index is VARENUMMER which is the join predicate > from LAGERKART. > Third component of the index is FAKTURADATO which has a range predicate of > one month. > > (This is sales data - query finds the invoice lines for one month for the > 23 products that belongs to a group 'SLANGER'.) > > > Now I change ROLLUP in the statement to CUBE - that's the only change I > make. Now I get this terrible plan: > > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > | Id | Operation | Name | > Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | > 1Mem | Used-Mem | > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > | 0 | SELECT STATEMENT | | > 1 | | 14 |00:00:03.31 | 1599K| 15 | 15 | | > | | > | 1 | TEMP TABLE TRANSFORMATION | | > 1 | | 14 |00:00:03.31 | 1599K| 15 | 15 | | > | | > | 2 | LOAD AS SELECT | | > 1 | | 0 |00:00:03.28 | 1598K| 0 | 13 | 270K| > 270K| 270K (0)| > | 3 | HASH GROUP BY | | > 1 | 1 | 2943 |00:00:03.27 | 1598K| 0 | 0 | 1009K| > 1009K| 1363K (0)| > | 4 | VIEW | | > 1 | 1 | 3139 |00:00:03.26 | 1598K| 0 | 0 | | > | | > | 5 | SORT GROUP BY | | > 1 | 1 | 3139 |00:00:03.26 | 1598K| 0 | 0 | 478K| > 448K| 424K (0)| > | 6 | NESTED LOOPS | | > 1 | | 3162 |00:00:00.73 | 1598K| 0 | 0 | | > | | > | 7 | NESTED LOOPS | | > 1 | 1 | 3162 |00:00:00.72 | 1596K| 0 | 0 | | > | | > | 8 | NESTED LOOPS | | > 1 | 2 | 3162 |00:00:00.70 | 1586K| 0 | 0 | | > | | > | 9 | TABLE ACCESS BY INDEX ROWID| DEBFAKPOST | > 1 | 58 | 446K|00:00:01.66 | 377K| 0 | 0 | | > | | > |* 10 | INDEX SKIP SCAN | I_102VAREGRPDEBGRPIDX | > 1 | 1 | 446K|00:00:00.64 | 7451 | 0 | 0 | | > | | > |* 11 | TABLE ACCESS BY INDEX ROWID| LAGERKART | > 446K| 1 | 3162 |00:00:01.60 | 1209K| 0 | 0 | | > | | > |* 12 | INDEX UNIQUE SCAN | I_011VAREIDX | > 446K| 1 | 446K|00:00:00.94 | 762K| 0 | 0 | | > | | > |* 13 | INDEX RANGE SCAN | I_041FAKTURANRIDX | > 3162 | 1 | 3162 |00:00:00.02 | 9436 | 0 | 0 | | > | | > |* 14 | TABLE ACCESS BY INDEX ROWID | DEBFAKJOUR | > 3162 | 1 | 3162 |00:00:00.01 | 2916 | 0 | 0 | | > | | > | 15 | LOAD AS SELECT | | > 1 | | 0 |00:00:00.01 | 43 | 13 | 1 | 270K| > 270K| 270K (0)| > | 16 | SORT GROUP BY ROLLUP | | > 1 | 1 | 7 |00:00:00.01 | 18 | 13 | 0 | 36864 | > 36864 |32768 (0)| > | 17 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE201_36B54E9 | > 1 | 1 | 2943 |00:00:00.01 | 18 | 13 | 0 | | > | | > | 18 | LOAD AS SELECT | | > 1 | | 0 |00:00:00.01 | 20 | 0 | 1 | 270K| > 270K| 270K (0)| > | 19 | SORT GROUP BY ROLLUP | | > 1 | 1 | 7 |00:00:00.01 | 15 | 0 | 0 | 33792 | > 33792 |30720 (0)| > | 20 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE201_36B54E9 | > 1 | 1 | 2943 |00:00:00.16 | 15 | 0 | 0 | | > | | > | 21 | SORT ORDER BY | | > 1 | 20 | 14 |00:00:00.01 | 7 | 2 | 0 | 2048 | > 2048 | 2048 (0)| > | 22 | VIEW | | > 1 | 1 | 14 |00:00:00.01 | 7 | 2 | 0 | | > | | > | 23 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE203_36B54E9 | > 1 | 1 | 14 |00:00:00.01 | 7 | 2 | 0 | | > | | > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Predicate Information (identified by operation id): > --------------------------------------------------- > > 10 - access("SYS_TBL_$4$"."DATASET"='DAT' AND > "SYS_TBL_$4$"."FAKTURADATO">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd > hh24:mi:ss') AND > "SYS_TBL_$4$"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00', > 'syyyy-mm-dd hh24:mi:ss')) > filter(("SYS_TBL_$4$"."FAKTURADATO">=TO_DATE(' 2013-01-01 > 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND > SYS_OP_DESCEND("FAKTURADATO")>HEXTORAW('878EFDFEF8FEF8FF') > AND "SYS_TBL_$4$"."FAKTURADATO"<TO_DATE(' 2013-02-01 00:00:00', > 'syyyy-mm-dd hh24:mi:ss') AND > > SYS_OP_DESCEND("FAKTURADATO")<=HEXTORAW('878EFEF8FEF8FEFAFF') )) > 11 - filter("SYS_TBL_$3$"."SAMKXBSRABAT"='SLANGER') > 12 - access("SYS_TBL_$3$"."DATASET"='DAT' AND > "SYS_TBL_$4$"."VARENUMMER"="SYS_TBL_$3$"."VARENUMMER") > 13 - access("SYS_TBL_$5$"."DATASET"='DAT' AND > "SYS_TBL_$5$"."FAKTURANUMMER"="SYS_TBL_$4$"."FAKTURANUMMER" AND > "SYS_TBL_$5$"."FAKTURADATO"="SYS_TBL_$4$"."FAKTURADATO") > filter(("SYS_TBL_$5$"."FAKTURADATO">=TO_DATE(' 2013-01-01 > 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND > "SYS_TBL_$5$"."FAKTURADATO"<TO_DATE(' 2013-02-01 > 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND > SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO")<=HEXTORAW('878EFEF8FEF8FEFAFF') > AND > > SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO")>HEXTORAW('878EFDFEF8FEF8FF') > AND "SYS_TBL_$5$"."DATASET"="SYS_TBL_$4$"."DATASET" AND > > SYS_OP_DESCEND("FAKTURADATO")=SYS_OP_DESCEND("SYS_TBL_$5$"."FAKTURADATO"))) > 14 - filter("SYS_TBL_$5$"."ORDRENUMMER"="SYS_TBL_$4$"."ORDRENUMMER") > > Note > ----- > - cardinality feedback used for this statement > > > Step 10 - the INDEX SKIP SCAN - now finds sales data for that month for > ALL products - 446 thousand rows. For each of those near half million rows > an INDEX UNIQUE SCAN on LAGERKART is used to filter away almost all of > those leaving the 3.162 rows that are interesting. > > > Now I notice that the ROLLUP plan step 4 to 14 gathers the desired rows, > and then step 2 is a SORT GROUP BY ROLLUP. > Fine, ROLLUP can be done as part of the sorting operation. > > The CUBE plan step 3 to 14 also gathers the desired rows, but then does > TEMP TABLE TRANSFORMATION. > And then the results of the CUBE is created by a couple of ROLLUP > operations on the SYS TEMP TABLE. > Fine, I can understand that as well ;-) > > What I cannot understand is why step 3-14 in the CUBE plan is not the same > as step 4-14 in the ROLLUP plan? > Both are doing the same job - gathering the desired rows - difference > being that those steps in one case feeds rows to SORT GROUP BY ROLLUP, in > the other case feeds rows to TEMP TABLE. > > Why is the same query part with same predicates picking so different > access plans in those two cases? > > > I can see that the cardinality estimate for DEBFAKPOST is wildly off. An > estimate of 1 when actual is 446 thousand - that's bad, of course ;-) > But I can't see that this should be the cause of the two plans being so > different? Or can it? > > What can I look for to determine why that cardinality estimate is so wrong? > I think my stats for the table and indexes should be OK, but probably they > are not. > How can I identify what's wrong with my stats? > > > And another slightly weird thing. I tried hinting the inner query in the > CUBE query with: > > /*+ index(fp I_102VARENRIDX) */ > > That gives me the exact same plan as without the hint. Even the same plan > hash value! > Other hints as well - seems like the TEMP TABLE TRANSFORMATION causes the > hints to be ignored? > (Well, I know Jonathan Lewis state that hints aren't ignored, they are > just not always applicable to the access path chosen, but it *looks* like > ignored ;-) > > I have tried using hints in the inner query of the ROLLUP query, and > hinting in that case works like normal. > > > Version info: > Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit > Production > PL/SQL Release 11.2.0.3.0 - Production > CORE 11.2.0.3.0 Production > TNS for Linux: Version 11.2.0.3.0 - Production > NLSRTL Version 11.2.0.3.0 - Production > > > > Any ideas what to look for in my stats would be appreciated. > > That there is a difference in the way the optimizer works on the TEMP > TABLE TRANSFORMATION when doing CUBE I probably can't change. > (Well, I can upgrade and hope 12.1.0.2.0 behaves differently - but that > won't happen until springtime ;-) > > But presumably if I can fix my cardinality estimates, hopefully I can get > a good plan both for ROLLUP and CUBE? > > > Thanks > > > Regards > > > Kim Berg Hansen > > http://dspsd.blogspot.com > kibeha_at_gmail.com > _at_kibeha > > >
-- http://www.freelists.org/webpage/oracle-lReceived on Thu Oct 30 2014 - 12:36:04 CET