Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Composite index and data distribution

Re: Composite index and data distribution

From: Richard Foote <richard.foote_at_bigpond.nospam.com>
Date: Fri, 11 Aug 2006 10:55:19 GMT
Message-ID: <rmZCg.10505$rP1.3762@news-server.bigpond.net.au>


"Michal Kuratczyk" <kura_at_lj.pl> wrote in message news:ebhmcl$nja$1_at_mx1.internetia.pl...
> Hi,
>
> I've observed a behaviour that I cannot explain by myself
> and I would be glad for help. Here it goes (10gXE
> but tested on 9.2.0.5 EE too):
>
>
> SIMPLE TABLE WITH TWO COLUMNS:
>
> SQL> create table foobar(foo number not null, bar number not null);
>
> Table created.
>
>
> DATA DISTRIBUTION: IN 1% OF THE ROWS FOO=1, IN 99% FOO=99:
>
> SQL> begin
> 2 for i in 1..1000000 loop
> 3 insert into foobar(foo, bar) values (case when i <= 10000 then 1 else
> 99 end, i);
> 4 end loop;
> 5 end;
> 6 /
>
> PL/SQL procedure successfully completed.
>
>
> COMPOSITE INDEX (BEST I CAN THINK OF FOR THE QUERY TO BE EXECUTED):
>
> SQL> create index foobar_idx on foobar(foo, bar);
>
> Index created.
>
>
> DETAILED STATISTICS (SO IT'S NOT STATISTICS-RELATED ISSUE):
>
> SQL> exec dbms_stats.gather_table_stats(user, 'FOOBAR', cascade=>true,
> estimate_percent=>100, method_opt=>'for all columns size auto');
>
> PL/SQL procedure successfully completed.
>
>
> JUST TO BE SURE - DATA DISTRIBUTION:
>
> SQL> select foo, count(*) from foobar group by foo;
>
> FOO COUNT(*)
> ---------- ----------
> 1 10000
> 99 990000
>
>
> TOOLS:
>
> SQL> set timing on
> SQL> set autotrace traceonly
> SQL> alter session set sql_trace=true;
>
> Session altered.
>
> Elapsed: 00:00:00.08
>
>
> AND HERE COMES THE QUERY FOR FOO=1 (1% OF THE ROWS):
>
> SQL> select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
> 2 from foobar where foo=1 order by bar) where rn <= 25;
>
> 25 rows selected.
>
> Elapsed: 00:00:00.03
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 2407035716
>
> ---------------------------------------------------------------------------------
> | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
> |
> ---------------------------------------------------------------------------------
> | 0 | SELECT STATEMENT | | 25 | 650 | 3 (0)| 00:00:01
> |
> |* 1 | VIEW | | 25 | 650 | 3 (0)| 00:00:01
> |
> | 2 | COUNT | | | | |
> |
> |* 3 | INDEX RANGE SCAN| FOOBAR_IDX | 25 | 175 | 3 (0)| 00:00:01
> |
> ---------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 1 - filter("RN"<=25)
> 3 - access("FOO"=1)
>
>
> Statistics
> ----------------------------------------------------------
> 1 recursive calls
> 0 db block gets
> 30 consistent gets
> 0 physical reads
> 0 redo size
> 811 bytes sent via SQL*Net to client
> 395 bytes received via SQL*Net from client
> 3 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 25 rows processed
>
>
> AND HERE IS THE SAME QUERY WITH FOO=99 (99% OF THE ROWS):
>
> SQL> select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
> 2 from foobar where foo=99 order by bar) where rn <= 25;
>
> 25 rows selected.
>
> Elapsed: 00:00:00.78
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 2407035716
>
> ---------------------------------------------------------------------------------
> | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
> |
> ---------------------------------------------------------------------------------
> | 0 | SELECT STATEMENT | | 25 | 650 | 3 (0)| 00:00:01
> |
> |* 1 | VIEW | | 25 | 650 | 3 (0)| 00:00:01
> |
> | 2 | COUNT | | | | |
> |
> |* 3 | INDEX RANGE SCAN| FOOBAR_IDX | 25 | 175 | 3 (0)| 00:00:01
> |
> ---------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 1 - filter("RN"<=25)
> 3 - access("FOO"=99)
>
>
> Statistics
> ----------------------------------------------------------
> 1 recursive calls
> 0 db block gets
> 2624 consistent gets
> 0 physical reads
> 0 redo size
> 811 bytes sent via SQL*Net to client
> 395 bytes received via SQL*Net from client
> 3 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 25 rows processed
>
>
> In the first case there were 30 consistent gets and in the second - 2624!
> Why?! Plan is the same. The index contains exactly the answer to the query
> (both columns, sorted as requested), so I would expect it to just grab the
> first 25 rows from the index and return them. I would ignore a few blocks
> difference but it's not the case - Oracle is obviously doing something
> different with those two queries.
>
> Here is the output from tkprof:
>
> ********************************************************************************
>
> select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
> from foobar where foo=1 order by bar) where rn <= 25
>
> call count cpu elapsed disk query current
> rows
> ------- ------ ------ -------- ---------- ---------- ---------- ----------
> Parse 1 0.00 0.00 0 0 0
> 0
> Execute 1 0.00 0.00 0 0 0
> 0
> Fetch 3 0.00 0.00 0 30 0
> 25
> ------- ------ ------ -------- ---------- ---------- ---------- ----------
> total 5 0.00 0.00 0 30 0
> 25
>
> Misses in library cache during parse: 1
> Optimizer mode: FIRST_ROWS
> Parsing user id: 36
>
> Rows Row Source Operation
> ------- ---------------------------------------------------
> 25 VIEW (cr=30 pr=0 pw=0 time=70 us)
> 10000 COUNT (cr=30 pr=0 pw=0 time=60049 us)
> 10000 INDEX RANGE SCAN FOOBAR_IDX (cr=30 pr=0 pw=0 time=30038 us
> (object id 14207)
>
> ********************************************************************************
>
> select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
> from foobar where foo=99 order by bar) where rn <= 25
>
> call count cpu elapsed disk query current
> rows
> ------- ------ ------ -------- ---------- ---------- ---------- ----------
> Parse 1 0.00 0.00 0 0 0
> 0
> Execute 1 0.00 0.00 0 0 0
> 0
> Fetch 3 0.75 0.75 0 2624 0
> 25
> ------- ------ ------ -------- ---------- ---------- ---------- ----------
> total 5 0.75 0.75 0 2624 0
> 25
>
> Misses in library cache during parse: 1
> Optimizer mode: FIRST_ROWS
> Parsing user id: 36
>
> Rows Row Source Operation
> ------- ---------------------------------------------------
> 25 VIEW (cr=2624 pr=0 pw=0 time=73 us)
> 990000 COUNT (cr=2624 pr=0 pw=0 time=5940050 us)
> 990000 INDEX RANGE SCAN FOOBAR_IDX (cr=2624 pr=0 pw=0 time=1980043 us
> (object id 14207)
>
>
> Seems like Oracle cannot optimize the query and really reads all those
> rows
> instead of stopping after the first 25. Is it true or am I missing
> something? I can't believe that with all those tricks that CBO is capable
> of, it can't do such an obvious one. Any suggestions?
>

Hi Michal

In the first query, the inner query fetches *all* the 10000 rows that have a foo value of 1, keeping track of the bottom 25 values of bar and returns the selected 25 rows.

In the second query, the inner query fetches *all* 990000 rows that have a foo value of 99, keeping track of the bottom 25 values of bar and returns the selected 25 rows.

As the second query fetches 99 times the number of rows than the first query, one would expect it to access a similar magnitude greater of data blocks.

The key point you're missing is that the inner queries need to retrieve *all* the matching rows of foo in order to obtain the bottom matching values of bar.

Hope this makes some kinda sense.

Cheers

Richard Received on Fri Aug 11 2006 - 05:55:19 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US