| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
Home -> Community -> Usenet -> comp.databases.theory -> Re: union all vs. left outer join
henrik.sorensen_at_balcab.ch wrote:
"Just to summarize the problem as I see it.
The query 1) should be the fastest, and is also much easier to
maintain.
Now, the problem OR join, has so far resulted in a table scan of A-B-C,
and
not as I expect a indexed lookup on B.id and C.bid respectively.
Ideally the optimizer should do something like this:
1) use index E.XM1
2) use index D.XDE
3.1) use index B.XB 3.2) use index B.XBA 4.1) use index C.XC 4.2) use index C.XCA
do you agree with this ?
looking forward to hear your results.
"
Using the data you've sent, Oracle produces the following execution plans:
SELECT a.*,d.*,e.*
from A
left outer join B on A.id = B.aid
left outer join C on A.id = C.aid
inner join D on
d.bid = B.id
or d.bid = C.bid
inner join E on E.id = D.eid
where E.m1 = 'A'
25824 rows selected.
Elapsed: 00:00:20.03
Execution Plan
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=9285 Card=99697 Bytes=8374548)
1 0 CONCATENATION
2 1 HASH JOIN (OUTER) (Cost=4832 Card=52143 Bytes=4380012) 3 2 HASH JOIN (Cost=4081 Card=52143 Bytes=3806439) 4 3 TABLE ACCESS (FULL) OF 'E' (TABLE) (Cost=201 Card=15630 Bytes=422010) 5 3 HASH JOIN (Cost=2709 Card=358780 Bytes=16503880) 6 5 TABLE ACCESS (FULL) OF 'D' (TABLE) (Cost=142 Card=263420 Bytes=2897620) 7 5 HASH JOIN (RIGHT OUTER) (Cost=956 Card=358780 Bytes=12557300) 8 7 TABLE ACCESS (FULL) OF 'C' (TABLE) (Cost=84 Card=81910 Bytes=1802020) 9 7 TABLE ACCESS (FULL) OF 'A' (TABLE) (Cost=225 Card=358780 Bytes=4664140) 10 2 TABLE ACCESS (FULL) OF 'B' (TABLE) (Cost=142 Card=288840 Bytes=3177240) 11 1 FILTER 12 11 HASH JOIN (RIGHT OUTER) (Cost=4453 Card=47554 Bytes=3994536) 13 12 TABLE ACCESS (FULL) OF 'C' (TABLE) (Cost=84 Card=81910 Bytes=1802020) 14 12 HASH JOIN (Cost=4023 Card=47554 Bytes=2948348) 15 14 TABLE ACCESS (FULL) OF 'E' (TABLE) (Cost=201 Card=15630 Bytes=422010) 16 14 HASH JOIN (Cost=2946 Card=327205 Bytes=11452175) 17 16 TABLE ACCESS (FULL) OF 'D' (TABLE) (Cost=142 Card=263420 Bytes=2897620) 18 16 HASH JOIN (RIGHT OUTER) (Cost=1469 Card=358780 Bytes=8610720) 19 18 TABLE ACCESS (FULL) OF 'B' (TABLE) (Cost=142 Card=288840 Bytes=3177240) 20 18 TABLE ACCESS (FULL) OF 'A' (TABLE) (Cost=225Card=358780 Bytes=4664140)
Statistics
6872 consistent gets
12035 physical reads
and :
SELECT a.*,d.*,e.*
from A
left outer join B on A.id = B.aid
inner join D on
d.bid = B.id
inner join E on E.id = D.eid
where E.m1 = 'A'
union all
SELECT a.*,d.*,e.*
from A
left outer join C on A.id = C.aid
inner join D on
d.bid = C.bid
inner join E on E.id = D.eid
where E.m1 = 'A'
25824 rows selected.
Elapsed: 00:00:18.08
Execution Plan
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7724 Card=99697 Bytes=6185803)
1 0 UNION-ALL
2 1 HASH JOIN (Cost=3961 Card=47554 Bytes=2900794) 3 2 TABLE ACCESS (FULL) OF 'E' (TABLE) (Cost=201 Card=15630 Bytes=422010) 4 2 HASH JOIN (Cost=2902 Card=327205 Bytes=11124970) 5 4 TABLE ACCESS (FULL) OF 'D' (TABLE) (Cost=142 Card=263420 Bytes=2897620) 6 4 HASH JOIN (RIGHT OUTER) (Cost=1449 Card=358780 Bytes=8251940) 7 6 TABLE ACCESS (FULL) OF 'B' (TABLE) (Cost=142 Card=288840 Bytes=2888400) 8 6 TABLE ACCESS (FULL) OF 'A' (TABLE) (Cost=225 Card=358780 Bytes=4664140) 9 1 HASH JOIN (Cost=3763 Card=52143 Bytes=3285009) 10 9 TABLE ACCESS (FULL) OF 'E' (TABLE) (Cost=201 Card=15630 Bytes=422010) 11 9 HASH JOIN (Cost=2585 Card=358780 Bytes=12916080) 12 11 TABLE ACCESS (FULL) OF 'D' (TABLE) (Cost=142 Card=263420 Bytes=2897620) 13 11 HASH JOIN (RIGHT OUTER) (Cost=1082 Card=358780 Bytes=8969500) 14 13 TABLE ACCESS (FULL) OF 'C' (TABLE) (Cost=84 Card=8910 Bytes=982920) 15 13 TABLE ACCESS (FULL) OF 'A' (TABLE) (Cost=225Card=358780 Bytes=4664140)
Statistics
5901 consistent gets
10669 physical reads
The plans are pretty similar with the 'union all' being slightly more efficient (by about 15% judging by the execution times). Both perform a number of hash [outer] joins without any index access. The second plan is slightly better because with your specific data you knew that you could transform (AB+)C+DE into (AC+)DE union all (AB+)DE and save one join operation. In general, such transformation would produce different results.
Now, as to you questions:
Having indexes does not necessarily mean that the optimizer will use them with the queries like that. E.g. a simple query like
select * from a join b on a.id=b.aid
will result in two full table scans even though there indexes on a.id and b.aid because there's no point of going to the index and then to the respective row if we need to scan all the rows anyway. Similarly, it's faster to have a full scan of A because the index selectivity is low: with m1='A', you'll get 15K rows out of 400K. If the index selectivity were better the optimizer would use it. So, with your queries, the optimizer evaluated various access paths and chose ful table scans as less expensive in terms of IO and CPU. Received on Thu Nov 10 2005 - 14:02:18 CST
![]() |
![]() |