RE: Bitmap index not used when joining tables
Date: Sat, 31 May 2014 17:11:11 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D901DF822F_at_exmbx05.thus.corp>
It's not surprising for the example you've given.
The basic principle is that the cost of a single table access via a bitmap index is likely to be different from the cost of access via the equivalent btree index because the bitmap index has no information about data clustering (i.e no clustering_factor) so it uses a guess. Sometimes this means the cost of the bitmap will be higher, sometimes lower, sometimes the same as for a btree. In your case the btree clustering factor will be very low (because of the order by in the CTAS) while the guess basically assumes that 20% of the data will be very widely scattered - in your case that probably means 400 blocks (20% of 2000 rows => 400 blocks).
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle
From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of Chris Saxon [chris.saxon_at_gmail.com] Sent: 31 May 2014 16:21
To: oracle-l_at_freelists.org
Subject: Bitmap index not used when joining tables
Hi,
I've been testing using bitmap indexes on 11.2.0.2 EE. When joining two tables on a column with a BTree index, the index is used in the execution plan. If this index is changed to a bitmap index, Oracle no longer uses the index when executing the query! It assigns a higher cost to using the bitmap index when joining, despite this being a cheaper approach (in terms of consistent gets).
To see this, I created a date dimension table with 515 days (rows) and a sales fact table with 1,000 rows for each day:
create table date_d (
date_id integer not null primary key,
calendar_date date not null unique
);
create table sales_fact (
date_id integer not null
references date_d (date_id),
quantity number not null,
total_value number not null
);
insert into date_d
select rownum, date'2013-01-01'-1+rownum
from dual
connect by level <= sysdate - date'2013-01-01';
insert into sales_fact
with rws as (select * from dual connect by level <= 1000)
select d.date_id, round(dbms_random.value(1, 20)), round(dbms_random.value(10, 100), 2)
from date_d d
cross join rws
order by d.date_id;
begin
dbms_stats.gather_table_stats(user, 'sales_fact');
dbms_stats.gather_table_stats(user, 'date_d');
end;
/
If I create a BTree index on SALES_FACT.DATE_ID, then join from the date dim to the fact table, restricting to two days, Oracle uses the index on the fact table as I would expect (as we're fetching 2,000 of 515,000 rows):
create index safa_date_id on sales_fact (date_id);
set autotrace trace
select s.* from sales_fact s join date_d d
on d.date_id = s.date_id
where calendar_date between date'2013-12-31' and date'2014-01-01';
set autotrace off
Execution Plan
Plan hash value: 2189554905
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 3002 | 69046 | 21 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 3002 | 69046 | 21 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | 2 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | SAFA_DATE_ID | 1000 | | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
5 - access("D"."DATE_ID"="S"."DATE_ID")
Statistics
1 recursive calls 0 db block gets 290 consistent gets 6 physical reads 0 redo size 36195 bytes sent via SQL*Net to client 1839 bytes received via SQL*Net from client 135 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2000 rows processed
However, if I drop the normal index and re-create it as a bitmap index the query above changes to a FTS on the SALES_FACT table:
set autotrace off
drop index safa_date_id;
create bitmap index safa_date_id on sales_fact (date_id);
set autotrace trace
select s.* from sales_fact s join date_d d
on d.date_id = s.date_id
where calendar_date between date'2013-12-31' and date'2014-01-01';
Execution Plan
Plan hash value: 525754326
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 3002 | 69046 | 315 (2)| 00:00:04 |
|* 1 | HASH JOIN | | 3002 | 69046 | 315 (2)| 00:00:04 |
| 2 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | SALES_FACT | 515K| 5532K| 310 (1)| 00:00:04 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
1 - access("D"."DATE_ID"="S"."DATE_ID") 3 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Statistics
1 recursive calls 0 db block gets 1267 consistent gets 0 physical reads 0 redo size 36195 bytes sent via SQL*Net to client 1839 bytes received via SQL*Net from client 135 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2000 rows processed
If we hint the query to use the bitmap index, we can see it has a higher cost. The autotrace stats report significantly fewer consistent gets though:
Execution Plan
Plan hash value: 1520624055
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 3002 | 69046 | 416 (0)| 00:00:05 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 3002 | 69046 | 416 (0)| 00:00:05 |
| 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | 2 (0)| 00:00:01 |
| 5 | BITMAP CONVERSION TO ROWIDS| | | | | |
|* 6 | BITMAP INDEX SINGLE VALUE | SAFA_DATE_ID | | | | |
| 7 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 | 416 (0)| 00:00:05 |
Predicate Information (identified by operation id):
4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
6 - access("D"."DATE_ID"="S"."DATE_ID")
Statistics
1 recursive calls 0 db block gets 152 consistent gets 1 physical reads 0 redo size 36195 bytes sent via SQL*Net to client 1839 bytes received via SQL*Net from client 135 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2000 rows processed
If you remove the join to the date dim and just use the date ids, Oracle uses the index as expected:
select * from sales_fact
where date_id between 365 and 366;
Execution Plan
Plan hash value: 2749560877
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 3002 | 33022 | 221 (0)| 00:00:03 |
| 1 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 3002 | 33022 | 221 (0)| 00:00:03 |
| 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
|* 3 | BITMAP INDEX RANGE SCAN | SAFA_DATE_ID | | | | |
Predicate Information (identified by operation id):
3 - access("DATE_ID">=365 AND "DATE_ID"<=366)
Statistics
1 recursive calls 0 db block gets 144 consistent gets 0 physical reads 0 redo size 36195 bytes sent via SQL*Net to client 1839 bytes received via SQL*Net from client 135 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2000 rows processed
Looking at the 10053 trace, I can see this is because the bm index join cost is calculated as higher than the FTS of SALES_FACT:
NL Join
Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12
Access path analysis for SALES_FACT
Inner table: SALES_FACT Alias: S
Access Path: TableScan
NL Join: Cost: 929.82 Resp: 929.82 Degree: 1
Cost_io: 920.00 Cost_cpu: 317629069 Resp_io: 920.00 Resp_cpu: 317629069
- trying bitmap/domain indexes ****** Access Path: index (AllEqJoinGuess) Index: SAFA_DATE_ID resc_io: 1.00 resc_cpu: 8171 ix_sel: 0.001942 ix_sel_with_filters: 0.001942 NL Join : Cost: 6.00 Resp: 6.00 Degree: 1 Cost_io: 6.00 Cost_cpu: 47359 Resp_io: 6.00 Resp_cpu: 47359 Bitmap nodes: Used SAFA_DATE_ID Cost = 6.001464, sel = 0.001942 Access path: Bitmap index - accepted Cost: 416.121488 Cost_io: 415.999594 Cost_cpu: 3943626.145192 Sel: 0.001942 Not Believed to be index-only
- finished trying bitmap/domain indexes ******
Best NL cost: 416.12 resc: 416.12 resc_io: 416.00 resc_cpu: 3943626 resp: 416.12 resp_io: 416.00 resc_cpu: 3943626
However, same section of the 10053 trace shows the following when using a BTree index on the SALES_FACT.DATE_ID column:
NL Join
Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12
Access path analysis for SALES_FACT
Inner table: SALES_FACT Alias: S
Access Path: TableScan
NL Join: Cost: 929.82 Resp: 929.82 Degree: 1
Cost_io: 920.00 Cost_cpu: 317629069 Resp_io: 920.00 Resp_cpu: 317629069 Access Path: index (AllEqJoinGuess)
Index: SAFA_DATE_ID
resc_io: 6.00 resc_cpu: 433579
ix_sel: 0.001942 ix_sel_with_filters: 0.001942 NL Join : Cost: 21.04 Resp: 21.04 Degree: 1
Cost_io: 21.00 Cost_cpu: 1323580 Resp_io: 21.00 Resp_cpu: 1323580 Best NL cost: 21.04 resc: 21.04 resc_io: 21.00 resc_cpu: 1323580 resp: 21.04 resp_io: 21.00 resc_cpu: 1323580
Why does this happen? Is this a bug or expected behaviour?
Thanks,
Chris
www.sqlfail.com<http://www.sqlfail.com>
-- http://www.freelists.org/webpage/oracle-lReceived on Sat May 31 2014 - 19:11:11 CEST