AW: Outer join -> Union

From: Petr Novak <Petr.Novak_at_trivadis.com>
Date: Fri, 5 Nov 2010 09:34:54 +0000
Message-ID: <09011014EB621E4CBC2536B62A1B64079B04_at_SMXC001.trivadis.com>



Hallo Dion,

I have found two Notes in Metalink:

Bug 7345484 - Left outer join with filters are slow [ID 7345484.8] CBO Chooses Poor Plan In 10.2.0.4/11.1.0.6 When Outer Joining [ID 727604.1]

but

ALTER SESSION SET "_optimizer_join_elimination_enabled"=false;

was without success

I tried the query on 11.2.0.2 DB , ( also with optimizer_features_enabled=9.2.0 and optimizer_features_enabled=10.2.0.1) and on 9.2.0.6 DB, but without success.

The queries

select t1.id,t1.aname,t2.id2,t2.bname
from t t1 left outer join t t2 on t1.id=t2.id2 where t2.aname='a12345';

select t1.id,t1.aname,t2.id2,t2.bname
from t t1 left outer join t t2 on t1.id=t2.id2 where t1.id=120;

peform well and use indexes. But in the whole query with USE_CONCAT Hint one part use indexes but the second part applies the filter AFTER the outer join. I will open SR next week.

Best Regards,
Petr



Von: Dion Cho [ukja.dion_at_gmail.com]
Gesendet: Freitag, 5. November 2010 07:38 Bis: Petr Novak
Cc: Ric Van Dyke; oracle-l_at_freelists.org Betreff: Re: Outer join -> Union

This is a concatenated version of your original query whose execution plan is far more inefficient than that of your original query. (My local database is 11.2.0.1. Not sure how lower version works)

TPACK_at_ukja1120_localhost> explain plan for   2 select /*+ use_concat */ t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname   3 from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2   4 where t1.id<http://t1.id>=120 or t2.aname='a12345';

Explained.

TPACK_at_ukja1120_localhost> select * from table(dbms_xplan.display(null, null, 'basic'));



| Id | Operation | Name |


| 0 | SELECT STATEMENT | |
| 1 | CONCATENATION | |
| 2 | FILTER | |
| 3 | HASH JOIN OUTER | |
| 4 | TABLE ACCESS FULL | T |
| 5 | TABLE ACCESS FULL | T |
| 6 | FILTER | |
| 7 | NESTED LOOPS OUTER | |
| 8 | TABLE ACCESS BY INDEX ROWID| T |
| 9 | INDEX RANGE SCAN | T_ID_IDX |
| 10 | TABLE ACCESS BY INDEX ROWID| T |
| 11 | INDEX RANGE SCAN | T_ID2_IDX |

Predicate Information (identified by operation id):


   2 - filter("T2"."ANAME"='a12345')
   3 - access("T1"."ID"="T2"."ID2"(+))
   6 - filter(LNNVL("T2"."ANAME"='a12345'))
   9 - access("T1"."ID"=120)
  11 - access("T1"."ID"="T2"."ID2"(+))



Compare this with your rewritten query.

TPACK_at_ukja1120_localhost> explain plan for   2 select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname   3 from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2   4 where t1.id<http://t1.id>=120
  5 union
  6 select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname   7 from t t1 left outer join
  8 t t2 on t1.id<http://t1.id>=t2.id2 where t2.aname='a12345';

Explained.

TPACK_at_ukja1120_localhost>
TPACK_at_ukja1120_localhost> select * from table(dbms_xplan.display(null, null, 'basic'));



| Id | Operation | Name |


| 0 | SELECT STATEMENT | |
| 1 | SORT UNIQUE | |
| 2 | UNION-ALL | |
| 3 | MERGE JOIN OUTER | |
| 4 | TABLE ACCESS BY INDEX ROWID | T |
| 5 | INDEX RANGE SCAN | T_ID_IDX |
| 6 | BUFFER SORT | |
| 7 | TABLE ACCESS BY INDEX ROWID| T |
| 8 | INDEX RANGE SCAN | T_ID2_IDX |
| 9 | NESTED LOOPS | |
| 10 | NESTED LOOPS | |
| 11 | TABLE ACCESS BY INDEX ROWID| T |
| 12 | INDEX RANGE SCAN | T_ANAME_IDX |
| 13 | INDEX RANGE SCAN | T_ID_IDX |
| 14 | TABLE ACCESS BY INDEX ROWID | T |

Predicate Information (identified by operation id):


   5 - access("T1"."ID"=120)
   8 - access("T2"."ID2"(+)=120)
  12 - access("T2"."ANAME"='a12345')
  13 - access("T1"."ID"="T2"."ID2")


The biggest difference is that the union-ed query has a good success on converting unnecessary outer join(step 9 is nested loops, not nested loops outer) to inner join. By virtue of it, t2 was chosen as the driving table with the predicate of (t2.name<http://t2.name> = 'a12345' }, which is impossible for the outer join. This is called "outer join elimination".

With USE_CONCAT hint applied, the optimizer converts the query to union-ed one, but seems that dose not eliminate the outer join. This would be a implementation restriction which could be solved in the future version.

Without USE_CONCAT hint(which was your original query), the optimizer considers the concatenation, but just denies to adopt it because the concatenated version is far more inefficient than the non-concatenated version.



Dion Cho - Oracle Performance Storyteller
http://dioncho.wordpress.com (english)
http://ukja.tistory.com (korean)
http://sites.google.com/site/otpack (tpack)
================================


On Thu, Nov 4, 2010 at 9:39 PM, Petr Novak <Petr.Novak_at_trivadis.com<mailto:Petr.Novak_at_trivadis.com>> wrote:

Hallo Ric,

I forgot to mention the DB version - it is 10.2.0.4 on Linux, statistics were computed (without histograms). Original query takes 1.08 seconds, union 0.01

Original statement:

select * from table(dbms_xplan.display_cursor('a9cbapx2jx6v2',0,'ALL'));

PLAN_TABLE_OUTPUT



SQL_ID a9cbapx2jx6v2, child number 0

select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2 where t1.id<http://t1.id>=120 or t2.aname='a12345'

Plan hash value: 2853649387



| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |


| 0 | SELECT STATEMENT | | | | | 1978 (100)| |
|*  1 |  FILTER             |      |       |       |       |            |          |
|*  2 |   HASH JOIN OUTER   |      |     2 |    58 |    17M|  1978  (11)| 00:00:06 |

| 3 | TABLE ACCESS FULL| T | 800K| 8593K| | 408 (12)| 00:00:02 |
| 4 | TABLE ACCESS FULL| T | 800K| 13M| | 418 (14)| 00:00:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


  1 - filter(("T1"."ID"=120 OR "T2"."ANAME"='a12345'))   2 - access("T1"."ID"="T2"."ID2")

 Child     CPU/Ex    Elaps/Ex     Buffer/Ex     DReads/Ex  Executions  Opt. Cost      Sorts      Rows Username                  PlanHash Last Load
------ ---------- ----------- ------------- ------------- ----------- ---------- ---------- --------- -------------------- ------------- --------------
    0       1.08        1.08        6771.0            .0           1       1978          0        51 SYS                     2853649387 11-04/13:13:09



call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        5      1.93       1.89          0       6656          0          51
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        7      1.94       1.89          0       6656          0          51

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS

Rows Row Source Operation

-------  ---------------------------------------------------

    51 FILTER (cr=6656 pr=0 pw=0 time=720778 us)

1583976   HASH JOIN OUTER (cr=6656 pr=0 pw=0 time=2300778 us)
 800000    TABLE ACCESS FULL T (cr=3326 pr=0 pw=0 time=800131 us)
 800000    TABLE ACCESS FULL T (cr=3330 pr=0 pw=0 time=800093 us)



###############################################################

Second statement , with union

select * from table(dbms_xplan.display_cursor('3h8h7yw5yzr7p',0,'ALL'));

PLAN_TABLE_OUTPUT



SQL_ID 3h8h7yw5yzr7p, child number 0

select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2 where t1.id<http://t1.id>=120 union select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2 where t2.aname='a12345'

Plan hash value: 4270335634



| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |


| 0 | SELECT STATEMENT | | | | 16 (100)| |
| 1 | SORT UNIQUE | | 51 | 1129 | 16 (57)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
| 3 | MERGE JOIN OUTER | | 50 | 1100 | 7 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID | T | 1 | 11 | 4 (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN           | T_ID_IDX    |     1 |       |     3   (0)| 00:00:01 |

| 6 | BUFFER SORT | | 50 | 550 | 3 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 50 | 550 | 3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | T_ID2_IDX | 50 | | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID | T | 1 | 11 | 3 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | 1 | 29 | 7 (0)| 00:00:01 |
| 11 | TABLE ACCESS BY INDEX ROWID| T | 1 | 18 | 4 (0)| 00:00:01 |
|* 12 | INDEX RANGE SCAN | T_ANAME_IDX | 1 | | 3 (0)| 00:00:01 | |* 13 | INDEX RANGE SCAN | T_ID_IDX | 1 | | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("T1"."ID"=120) 8 - access("T2"."ID2"=120)

 12 - access("T2"."ANAME"='a12345')
 13 - access("T1"."ID"="T2"."ID2")
Child Nr    CPU/Ex  Elaps/Ex     Buffer/Ex     DReads/Ex  Executions Opt. Cost      Sorts      Rows Username                  PlanHash Last Load
-------- --------- --------- ------------- ------------- ----------- --------- ---------- --------- -------------------- ------------- --------------
      0       .01       .01          16.0            .0           1        16          2        51 SYS                     4270335634 11-04/13:15:06



call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        5      0.00       0.00          0         16          0          51
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        7      0.00       0.00          0         16          0          51

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS

Rows Row Source Operation

-------  ---------------------------------------------------

    51 SORT UNIQUE (cr=16 pr=0 pw=0 time=570 us)     51 UNION-ALL (cr=16 pr=0 pw=0 time=580 us)     50 MERGE JOIN OUTER (cr=8 pr=0 pw=0 time=418 us)

     1     TABLE ACCESS BY INDEX ROWID T (cr=4 pr=0 pw=0 time=72 us)
     1      INDEX RANGE SCAN T_ID_IDX (cr=3 pr=0 pw=0 time=51 us)(object id 63295)
    50     BUFFER SORT (cr=4 pr=0 pw=0 time=291 us)
    50      TABLE ACCESS BY INDEX ROWID T (cr=4 pr=0 pw=0 time=336 us)
    50       INDEX RANGE SCAN T_ID2_IDX (cr=3 pr=0 pw=0 time=126 us)(object id 63297)
     1    TABLE ACCESS BY INDEX ROWID T (cr=8 pr=0 pw=0 time=107 us)
     3     NESTED LOOPS  (cr=7 pr=0 pw=0 time=131 us)
     1      TABLE ACCESS BY INDEX ROWID T (cr=4 pr=0 pw=0 time=51 us)
     1       INDEX RANGE SCAN T_ANAME_IDX (cr=3 pr=0 pw=0 time=36 us)(object id 63296)
     1      INDEX RANGE SCAN T_ID_IDX (cr=3 pr=0 pw=0 time=18 us)(object id 63295)






Best Regards,
Petr

Betreff: RE: Outer join -> Union

What are the run time stats of running the outer join and the union? What do you mean by "significantly better"? Can you trace (10046) the runs and post the stat lines of each plan?



Ric Van Dyke
Hotsos Enterprises

Hotsos Symposium
March 6 - 10, 2011
You have to be there, yea I'm talking to you.

-----Original Message-----
From: oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org> [mailto:oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org>] On Behalf Of Petr Novak Sent: Thursday, November 04, 2010 5:34 AM To: oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org> Subject: Outer join -> Union

Hallo ,

I try to replace Left outer join (with Full Scans) to replace with Union.

create table t (id number,aname varchar2(32),bname varchar2(32),id2 number);

insert into t
select rownum ,
'a'||to_char(rownum),'b'||to_char(rownum),round(rownum/50) from dual connect by rownum<=800000;

create index t_id_idx on t(id);
create index t_aname_idx on t(aname);
create index t_id2_idx on t(id2);



select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2 where t1.id<http://t1.id>=120 or t2.aname='a12345';

Execution Plan



Plan hash value: 2853649387


| Id | Operation | Name | Rows | Bytes |TempSpc| Cost
(%CPU)| Time |


| 0 | SELECT STATEMENT | | 2 | 58 | | 1978
(11)| 00:00:06 |
|*  1 |  FILTER             |      |       |       |       |

| |
|* 2 | HASH JOIN OUTER | | 2 | 58 | 17M| 1978 (11)| 00:00:06 |
| 3 | TABLE ACCESS FULL| T | 800K| 8593K| | 408
(12)| 00:00:02 |
| 4 | TABLE ACCESS FULL| T | 800K| 13M| | 418
(14)| 00:00:02 |

Predicate Information (identified by operation id):


  1 - filter("T1"."ID"=120 OR "T2"."ANAME"='a12345')   2 - access("T1"."ID"="T2"."ID2"(+))

Following query returns the same results, but is significantly better.

select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2 where t1.id<http://t1.id>=120
union
select t1.id<http://t1.id>,t1.aname,t2.id2,t2.bname from t t1 left outer join t t2 on t1.id<http://t1.id>=t2.id2 where t2.aname='a12345';




| Id | Operation | Name | Rows | Bytes |
Cost (%CPU)| Time |


| 0 | SELECT STATEMENT | | 51 | 1129 |
16 (57)| 00:00:01 |
| 1 | SORT UNIQUE | | 51 | 1129 |
16 (57)| 00:00:01 |
| 2 | UNION-ALL | | | |
| |
| 3 | MERGE JOIN OUTER | | 50 | 1100 |
7 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID | T | 1 | 11 |
4 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | T_ID_IDX | 1 | | 3 (0)| 00:00:01 |
| 6 | BUFFER SORT | | 50 | 550 |
3 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| T | 50 | 550 |
3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | T_ID2_IDX | 50 | | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID | T | 1 | 11 |
3 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | 1 | 29 |
7 (0)| 00:00:01 |
| 11 | TABLE ACCESS BY INDEX ROWID| T | 1 | 18 |
4 (0)| 00:00:01 |
|* 12 | INDEX RANGE SCAN | T_ANAME_IDX | 1 | | 3 (0)| 00:00:01 |
|* 13 | INDEX RANGE SCAN | T_ID_IDX | 1 | | 2 (0)| 00:00:01 |

Predicate Information (identified by operation id):


  5 - access("T1"."ID"=120)
  8 - access("T2"."ID2"(+)=120)
 12 - access("T2"."ANAME"='a12345')
 13 - access("T1"."ID"="T2"."ID2")


It is possible to force the original query not to use the FULL Scans ? I tried to play with CONCAT and INDEX Hints, but with no success.

Best Regards,
Petr--
http://www.freelists.org/webpage/oracle-l

--
http://www.freelists.org/webpage/oracle-l



--
http://www.freelists.org/webpage/oracle-l
Received on Fri Nov 05 2010 - 04:34:54 CDT

Original text of this message