Re: FW: Bitmap index not used when joining tables

From: Chris Saxon <chris.saxon_at_gmail.com>
Date: Sun, 1 Jun 2014 10:32:18 +0100
Message-ID: <CAJ7OfbOK4bGk-tHV-qLt8RPavV=5STUdWzsgperNa4ACQJYjZw_at_mail.gmail.com>



Thanks for the explanation Jonathan.

Could you explain where the 20% value comes from please? Is this a default for bitmap indexes, or something you've inferred from the info posted?

On 31 May 2014 20:24, Mark W. Farnham <mwf_at_rsiz.com> wrote:

> hotel email bounce; trying again
>
>
>
> *From:* Mark W. Farnham [mailto:mwf_at_rsiz.com]
> *Sent:* Saturday, May 31, 2014 3:14 PM
> *To:* 'jonathan_at_jlcomp.demon.co.uk'; 'oracle-l_at_freelists.org'
> *Subject:* RE: Bitmap index not used when joining tables
>
>
>
> I do wonder about what your results might be for these two queries if you
> modified your query slightly as:
>
>
>
> select
>
> --+ gather_plan_statistics
>
> s.* from sales_fact s
>
> where s.rowid in (
>
> select
>
> --+ no_merge
>
> s2.rowid from sales_fact s2
>
> where s2.date_id in (
>
> select
>
> --+ no_merge
>
> d.date_id from date_d d
>
> where calendar_date between date’2013-12-31’ and
> date’2014-01-01’
>
> )
>
> );
>
>
>
> The idea being that if the only column it needs in the unmerged subquery
> (rowid) can be sourced from the index, this should be less total work.
>
> Possibly this is also optimized if the subquery is ordered by rowid, but
> that’s not the curiosity at this point. Of course even if this works you
> would only want to form it up this way when you knew it was to your benefit
> (knowledge the CBO can only guess by way of the thumbrule quoted by JL, and
> some other thumbrule for a literal range.)
>
>
>
> I just typed this in and didn’t test it, and working is release dependent,
> so you and I got lucky if this works via cut and paste. Once debugged I
> would sort of expect this to use the index in both cases to fetch back the
> rowid list.
>
>
>
> mwf
>
> *From:* oracle-l-bounce_at_freelists.org [
> mailto:oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>] *On
> Behalf Of *Jonathan Lewis
> *Sent:* Saturday, May 31, 2014 1:11 PM
> *To:* oracle-l_at_freelists.org
> *Subject:* RE: Bitmap index not used when joining tables
>
>
>
>
>
> It's not surprising for the example you've given.
>
>
>
> The basic principle is that the cost of a single table access via a bitmap
> index is likely to be different from the cost of access via the equivalent
> btree index because the bitmap index has no information about data
> clustering (i.e no clustering_factor) so it uses a guess. Sometimes this
> means the cost of the bitmap will be higher, sometimes lower, sometimes the
> same as for a btree. In your case the btree clustering factor will be very
> low (because of the order by in the CTAS) while the guess basically assumes
> that 20% of the data will be very widely scattered - in your case that
> probably means 400 blocks (20% of 2000 rows => 400 blocks).
>
>
>
>
>
>
>
>
> Regards
> Jonathan Lewis
> http://jonathanlewis.wordpress.com
> _at_jloracle
> ------------------------------
>
> *From:* oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on
> behalf of Chris Saxon [chris.saxon_at_gmail.com]
> *Sent:* 31 May 2014 16:21
> *To:* oracle-l_at_freelists.org
> *Subject:* Bitmap index not used when joining tables
>
> Hi,
>
>
>
> I've been testing using bitmap indexes on 11.2.0.2 EE. When joining two
> tables on a column with a BTree index, the index is used in the execution
> plan. If this index is changed to a bitmap index, Oracle no longer uses the
> index when executing the query! It assigns a higher cost to using the
> bitmap index when joining, despite this being a cheaper approach (in terms
> of consistent gets).
>
>
>
> To see this, I created a date dimension table with 515 days (rows) and a
> sales fact table with 1,000 rows for each day:
>
>
>
> create table date_d (
>
> date_id integer not null primary key,
>
> calendar_date date not null unique
>
> );
>
>
>
> create table sales_fact (
>
> date_id integer not null
>
> references date_d (date_id),
>
> quantity number not null,
>
> total_value number not null
>
> );
>
>
>
> insert into date_d
>
> select rownum, date'2013-01-01'-1+rownum
>
> from dual
>
> connect by level <= sysdate - date'2013-01-01';
>
>
>
> insert into sales_fact
>
> with rws as (select * from dual connect by level <= 1000)
>
> select d.date_id, round(dbms_random.value(1, 20)),
> round(dbms_random.value(10, 100), 2)
>
> from date_d d
>
> cross join rws
>
> order by d.date_id;
>
>
>
> begin
>
> dbms_stats.gather_table_stats(user, 'sales_fact');
>
> dbms_stats.gather_table_stats(user, 'date_d');
>
> end;
>
> /
>
>
>
> If I create a BTree index on SALES_FACT.DATE_ID, then join from the date
> dim to the fact table, restricting to two days, Oracle uses the index on
> the fact table as I would expect (as we're fetching 2,000 of 515,000 rows):
>
>
>
> create index safa_date_id on sales_fact (date_id);
>
>
>
> set autotrace trace
>
> select s.* from sales_fact s join date_d d
>
> on d.date_id = s.date_id
>
> where calendar_date between date'2013-12-31' and date'2014-01-01';
>
>
>
> set autotrace off
>
>
>
> Execution Plan
>
> ----------------------------------------------------------
>
> Plan hash value: 2189554905
>
>
>
>
> ----------------------------------------------------------------------------------------------
>
> | Id | Operation | Name | Rows | Bytes |
> Cost (%CPU)| Time |
>
>
> ----------------------------------------------------------------------------------------------
>
> | 0 | SELECT STATEMENT | | 3002 | 69046 |
> 21 (0)| 00:00:01 |
>
> | 1 | NESTED LOOPS | | | |
> | |
>
> | 2 | NESTED LOOPS | | 3002 | 69046 |
> 21 (0)| 00:00:01 |
>
> | 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 |
> 3 (0)| 00:00:01 |
>
> |* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | |
> 2 (0)| 00:00:01 |
>
> |* 5 | INDEX RANGE SCAN | SAFA_DATE_ID | 1000 | |
> 3 (0)| 00:00:01 |
>
> | 6 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 |
> 6 (0)| 00:00:01 |
>
>
> ----------------------------------------------------------------------------------------------
>
>
>
> Predicate Information (identified by operation id):
>
> ---------------------------------------------------
>
>
>
> 4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00',
> 'syyyy-mm-dd
>
> hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01
> 00:00:00', 'syyyy-mm-dd
>
> hh24:mi:ss'))
>
> 5 - access("D"."DATE_ID"="S"."DATE_ID")
>
>
>
>
>
> Statistics
>
> ----------------------------------------------------------
>
> 1 recursive calls
>
> 0 db block gets
>
> 290 consistent gets
>
> 6 physical reads
>
> 0 redo size
>
> 36195 bytes sent via SQL*Net to client
>
> 1839 bytes received via SQL*Net from client
>
> 135 SQL*Net roundtrips to/from client
>
> 0 sorts (memory)
>
> 0 sorts (disk)
>
> 2000 rows processed
>
>
>
> However, if I drop the normal index and re-create it as a bitmap index the
> query above changes to a FTS on the SALES_FACT table:
>
>
>
> set autotrace off
>
> drop index safa_date_id;
>
> create bitmap index safa_date_id on sales_fact (date_id);
>
>
>
> set autotrace trace
>
> select s.* from sales_fact s join date_d d
>
> on d.date_id = s.date_id
>
> where calendar_date between date'2013-12-31' and date'2014-01-01';
>
>
>
> Execution Plan
>
> ----------------------------------------------------------
>
> Plan hash value: 525754326
>
>
>
>
> ---------------------------------------------------------------------------------------------
>
> | Id | Operation | Name | Rows | Bytes | Cost
> (%CPU)| Time |
>
>
> ---------------------------------------------------------------------------------------------
>
> | 0 | SELECT STATEMENT | | 3002 | 69046 |
> 315 (2)| 00:00:04 |
>
> |* 1 | HASH JOIN | | 3002 | 69046 |
> 315 (2)| 00:00:04 |
>
> | 2 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 |
> 3 (0)| 00:00:01 |
>
> |* 3 | INDEX RANGE SCAN | SYS_C0037151 | 3 | |
> 2 (0)| 00:00:01 |
>
> | 4 | TABLE ACCESS FULL | SALES_FACT | 515K| 5532K|
> 310 (1)| 00:00:04 |
>
>
> ---------------------------------------------------------------------------------------------
>
>
>
> Predicate Information (identified by operation id):
>
> ---------------------------------------------------
>
>
>
> 1 - access("D"."DATE_ID"="S"."DATE_ID")
>
> 3 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00',
> 'syyyy-mm-dd
>
> hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01
> 00:00:00', 'syyyy-mm-dd
>
> hh24:mi:ss'))
>
>
>
>
>
> Statistics
>
> ----------------------------------------------------------
>
> 1 recursive calls
>
> 0 db block gets
>
> 1267 consistent gets
>
> 0 physical reads
>
> 0 redo size
>
> 36195 bytes sent via SQL*Net to client
>
> 1839 bytes received via SQL*Net from client
>
> 135 SQL*Net roundtrips to/from client
>
> 0 sorts (memory)
>
> 0 sorts (disk)
>
> 2000 rows processed
>
>
>
> If we hint the query to use the bitmap index, we can see it has a higher
> cost. The autotrace stats report significantly fewer consistent gets though:
>
>
>
> Execution Plan
>
> ----------------------------------------------------------
>
> Plan hash value: 1520624055
>
>
>
>
> ----------------------------------------------------------------------------------------------
>
> | Id | Operation | Name | Rows | Bytes |
> Cost (%CPU)| Time |
>
>
> ----------------------------------------------------------------------------------------------
>
> | 0 | SELECT STATEMENT | | 3002 | 69046 |
> 416 (0)| 00:00:05 |
>
> | 1 | NESTED LOOPS | | | |
> | |
>
> | 2 | NESTED LOOPS | | 3002 | 69046 |
> 416 (0)| 00:00:05 |
>
> | 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 |
> 3 (0)| 00:00:01 |
>
> |* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | |
> 2 (0)| 00:00:01 |
>
> | 5 | BITMAP CONVERSION TO ROWIDS| | | |
> | |
>
> |* 6 | BITMAP INDEX SINGLE VALUE | SAFA_DATE_ID | | |
> | |
>
> | 7 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 |
> 416 (0)| 00:00:05 |
>
>
> ----------------------------------------------------------------------------------------------
>
>
>
> Predicate Information (identified by operation id):
>
> ---------------------------------------------------
>
>
>
> 4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00',
> 'syyyy-mm-dd
>
> hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01
> 00:00:00', 'syyyy-mm-dd
>
> hh24:mi:ss'))
>
> 6 - access("D"."DATE_ID"="S"."DATE_ID")
>
>
>
>
>
> Statistics
>
> ----------------------------------------------------------
>
> 1 recursive calls
>
> 0 db block gets
>
> 152 consistent gets
>
> 1 physical reads
>
> 0 redo size
>
> 36195 bytes sent via SQL*Net to client
>
> 1839 bytes received via SQL*Net from client
>
> 135 SQL*Net roundtrips to/from client
>
> 0 sorts (memory)
>
> 0 sorts (disk)
>
> 2000 rows processed
>
>
>
> If you remove the join to the date dim and just use the date ids, Oracle
> uses the index as expected:
>
>
>
> select * from sales_fact
>
> where date_id between 365 and 366;
>
>
>
> Execution Plan
>
> ----------------------------------------------------------
>
> Plan hash value: 2749560877
>
>
>
>
> ---------------------------------------------------------------------------------------------
>
> | Id | Operation | Name | Rows | Bytes | Cost
> (%CPU)| Time |
>
>
> ---------------------------------------------------------------------------------------------
>
> | 0 | SELECT STATEMENT | | 3002 | 33022 |
> 221 (0)| 00:00:03 |
>
> | 1 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 3002 | 33022 |
> 221 (0)| 00:00:03 |
>
> | 2 | BITMAP CONVERSION TO ROWIDS| | | |
> | |
>
> |* 3 | BITMAP INDEX RANGE SCAN | SAFA_DATE_ID | | |
> | |
>
>
> ---------------------------------------------------------------------------------------------
>
>
>
> Predicate Information (identified by operation id):
>
> ---------------------------------------------------
>
>
>
> 3 - access("DATE_ID">=365 AND "DATE_ID"<=366)
>
>
>
>
>
> Statistics
>
> ----------------------------------------------------------
>
> 1 recursive calls
>
> 0 db block gets
>
> 144 consistent gets
>
> 0 physical reads
>
> 0 redo size
>
> 36195 bytes sent via SQL*Net to client
>
> 1839 bytes received via SQL*Net from client
>
> 135 SQL*Net roundtrips to/from client
>
> 0 sorts (memory)
>
> 0 sorts (disk)
>
> 2000 rows processed
>
>
>
> Looking at the 10053 trace, I can see this is because the bm index join
> cost is calculated as higher than the FTS of SALES_FACT:
>
>
>
> NL Join
>
> Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12
>
> Access path analysis for SALES_FACT
>
> Inner table: SALES_FACT Alias: S
>
> Access Path: TableScan
>
> NL Join: Cost: 929.82 Resp: 929.82 Degree: 1
>
> Cost_io: 920.00 Cost_cpu: 317629069
>
> Resp_io: 920.00 Resp_cpu: 317629069
>
> ****** trying bitmap/domain indexes ******
>
> Access Path: index (AllEqJoinGuess)
>
> Index: SAFA_DATE_ID
>
> resc_io: 1.00 resc_cpu: 8171
>
> ix_sel: 0.001942 ix_sel_with_filters: 0.001942
>
> NL Join : Cost: 6.00 Resp: 6.00 Degree: 1
>
> Cost_io: 6.00 Cost_cpu: 47359
>
> Resp_io: 6.00 Resp_cpu: 47359
>
> Bitmap nodes:
>
> Used SAFA_DATE_ID
>
> Cost = 6.001464, sel = 0.001942
>
> Access path: Bitmap index - accepted
>
> Cost: 416.121488 Cost_io: 415.999594 Cost_cpu: 3943626.145192 Sel:
> 0.001942
>
> Not Believed to be index-only
>
> ****** finished trying bitmap/domain indexes ******
>
>
>
> Best NL cost: 416.12
>
> resc: 416.12 resc_io: 416.00 resc_cpu: 3943626
>
> resp: 416.12 resp_io: 416.00 resc_cpu: 3943626
>
>
>
> However, same section of the 10053 trace shows the following when using a
> BTree index on the SALES_FACT.DATE_ID column:
>
>
>
> NL Join
>
> Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12
>
> Access path analysis for SALES_FACT
>
> Inner table: SALES_FACT Alias: S
>
> Access Path: TableScan
>
> NL Join: Cost: 929.82 Resp: 929.82 Degree: 1
>
> Cost_io: 920.00 Cost_cpu: 317629069
>
> Resp_io: 920.00 Resp_cpu: 317629069
>
> Access Path: index (AllEqJoinGuess)
>
> Index: SAFA_DATE_ID
>
> resc_io: 6.00 resc_cpu: 433579
>
> ix_sel: 0.001942 ix_sel_with_filters: 0.001942
>
> NL Join : Cost: 21.04 Resp: 21.04 Degree: 1
>
> Cost_io: 21.00 Cost_cpu: 1323580
>
> Resp_io: 21.00 Resp_cpu: 1323580
>
>
>
> Best NL cost: 21.04
>
> resc: 21.04 resc_io: 21.00 resc_cpu: 1323580
>
> resp: 21.04 resp_io: 21.00 resc_cpu: 1323580
>
>
>
> Why does this happen? Is this a bug or expected behaviour?
>
>
>
> Thanks,
>
> Chris
>
> www.sqlfail.com
>

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Jun 01 2014 - 11:32:18 CEST

Original text of this message