Re: Query tuning experts: Any hope for tuning query with 5+ joins and a sort?

From: Richard Foote <richard.foote_at_bigpond.com>
Date: Mon, 4 Nov 2002 23:18:38 +1000
Message-ID: <Ootx9.69092$g9.195946@newsfeeds.bigpond.com>


"Nuno Souto" <nsouto_at_optushome.com.au.nospam> wrote in message news:3dc51ac7$0$18871$afc38c87_at_news.optusnet.com.au...
> Sun, 3 Nov 2002 23:32:55 +1000, Richard Foote said (and I quote):
> >
> > Therefore using an index to simply avoid a sort is not always going to cut
> > it for the CBO. Despite warm fuzzy feelings to the contrary, using the full
> > table scan and performing a somewhat expensive sort could very well be a
> > cheaper option than using the index *if* a largish portion of the table is
> > being accessed (and largish may not be as large as some may suggest).
> >
>
> That, I doubt. I need proof to believe it. No matter how good Oracle's
> sort mechanism may be (and it isn't very good), it is impossible for a
> sort after a full scan to take less time than a retrieval in final order
> by index.
>

Hi Nuno,

I think you've missed the point. It's not that the sort is particularly efficient, it's that using an index to read a large portion of a table is so terribly inefficient.

Look closely for proof you shall see !!

Using 9.2 on XP.

This is a simple example where using an index simply to avoid a full table scan and a sort is a no no. To make it simple, I'm selecting all rows from a 1.2M row table. To make it fair, I've ensured the sort is a disk sort. These are the results:
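
For reference, something along these lines would set up the test. I haven't included my exact DDL, so the column list, the row source and the tiny sort area used to force the disk sort below are just a sketch:

-- Hypothetical setup (column names, row source and sort area size are
-- assumptions; only the table and index names come from the example).
create table big_bowie (id number, name varchar2(10));

insert into big_bowie
select rownum, 'BOWIE'
from   all_objects a, all_objects b
where  rownum <= 1279872;
commit;

create index big_bowie_id_idx on big_bowie (id);

analyze table big_bowie compute statistics;

-- Keep the session sort area tiny so the ORDER BY spills to disk
-- (shows up as sorts (disk) = 1 in autotrace).
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 65536;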

SQL> set timing on
SQL> set autotrace traceonly
SQL> select * from big_bowie
  2 where id > 0
  3 order by id;

1279872 rows selected.

Elapsed: 00:00:56.08

Execution Plan


   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=21552 Card=1279872 Bytes=19198080)
   1    0   SORT (ORDER BY) (Cost=21552 Card=1279872 Bytes=19198080)
   2    1     TABLE ACCESS (FULL) OF 'BIG_BOWIE' (Cost=395 Card=1279872 Bytes=19198080)





Statistics


          0  recursive calls
        120  db block gets
       4093  consistent gets
      14405  physical reads
          0  redo size
   16763907  bytes sent via SQL*Net to client
     939063  bytes received via SQL*Net from client
      85326  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
    1279872  rows processed

SQL> select /*+ first_rows */ * from big_bowie
  2 where id > 0
  3 order by id;

1279872 rows selected.

Elapsed: 00:05:03.06

Execution Plan


   0      SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=1281278 Card=1279872 Bytes=19198080)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'BIG_BOWIE' (Cost=1281278 Card=1279872 Bytes=19198080)
   2    1     INDEX (RANGE SCAN) OF 'BIG_BOWIE_ID_IDX' (NON-UNIQUE) (Cost=2675 Card=1279872)





Statistics


          0  recursive calls
          0  db block gets
    1366515  consistent gets
      11143  physical reads
          0  redo size
   16763907  bytes sent via SQL*Net to client
     939063  bytes received via SQL*Net from client
      85326  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1279872  rows processed

SQL>

Initially, the CBO got it right and selected the full table scan.

I used the FIRST_ROWS hint to prod the CBO to use the index.

Note that the full table scan and disk sort were a good 5x faster than the index access. Also note that my poor little PC has only the one CPU. On a multi-CPU box with a well striped layout, using parallelism would be N times faster still. If I were to set a large sort area and perform a memory sort, it would be faster again.
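
If you want to see the memory sort for yourself, something like this would do it (the 100M sort area is an arbitrary figure, simply big enough for this data set):

-- Give the session a large manual sort area so the ORDER BY completes
-- in memory (sorts (memory) = 1, sorts (disk) = 0 in autotrace).
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 104857600;

select * from big_bowie
where id > 0
order by id;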

Now that was a select on the whole table. Let's be even fairer and select just a teeny weeny 10% of the data. Surely the index would be better now !!

SQL> select * from big_bowie
  2 where id between 1 and 1000
  3 order by id;

128000 rows selected.

Elapsed: 00:00:09.03

Execution Plan


   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1561 Card=128013 Bytes=1920195)
   1    0   SORT (ORDER BY) (Cost=1561 Card=128013 Bytes=1920195)
   2    1     TABLE ACCESS (FULL) OF 'BIG_BOWIE' (Cost=395 Card=128013 Bytes=1920195)





Statistics


          0  recursive calls
         12  db block gets
       4093  consistent gets
       2830  physical reads
          0  redo size
    1676929  bytes sent via SQL*Net to client
      94362  bytes received via SQL*Net from client
       8535  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
     128000  rows processed

SQL> select /*+ first_rows */ * from big_bowie
  2 where id between 1 and 1000
  3 order by id;

128000 rows selected.

Elapsed: 00:00:26.03

Execution Plan


   0      SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=128156 Card=128013 Bytes=1920195)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'BIG_BOWIE' (Cost=128156 Card=128013 Bytes=1920195)
   2    1     INDEX (RANGE SCAN) OF 'BIG_BOWIE_ID_IDX' (NON-UNIQUE) (Cost=270 Card=128013)





Statistics


          0  recursive calls
          0  db block gets
     136783  consistent gets
       1085  physical reads
          0  redo size
    1676929  bytes sent via SQL*Net to client
      94362  bytes received via SQL*Net from client
       8535  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     128000  rows processed

SQL>

Then again, maybe not !!

This time, when reading just *10%* of the data, the full table scan and disk sort were a mere 2.9x faster.

Like I said, using an index to read a large chunk of a table, even to avoid a sort, is tuning madness. Yes, the index looks more appealing, but it's not *that* appealing. In the best case scenario the index has to deal with a massive number of consistent reads (one for each and every row accessed, not to mention reading the index blocks themselves). In the worst case scenario, these could very well translate into physical reads.
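
A rough back-of-the-envelope from the first pair of runs makes the point. Treating it as roughly one consistent get per row plus the index blocks is an approximation, not an exact accounting:

-- Index path: ~1 consistent get per row fetched, plus the index blocks
-- themselves. Full scan: a few thousand multiblock-read gets in total.
select 1366515           index_path_consistent_gets,
       1279872           rows_fetched,
       1366515 - 1279872 approx_gets_for_index_blocks,
       4093              full_scan_consistent_gets
from   dual;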

> Yeah, I know about all the benchmarks. Those are for simple retrieval of
> more than 25% of THE ROWS. Not a sort, as well.

I don't like to speak in terms of % of rows, but generally speaking it's a lot less than 25%. A lot, lot less.

>
> And in most cases one may well find that a retrieval by index is actually
> a lot faster than the benchmarks "prove", if the rows are actually stored
> in the blocks in the sequence of the index keys. It happens a lot more
> often than we think.
>

True, the clustering factor is relevant, but note that the CBO takes it into consideration.
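
You can see exactly what the CBO sees with a quick look at the dictionary (standard views, nothing exotic):

-- The closer CLUSTERING_FACTOR is to the table's BLOCKS, the cheaper
-- the CBO costs an index range scan; the closer it is to NUM_ROWS,
-- the more the full scan + sort wins.
select i.index_name, i.clustering_factor, t.blocks, t.num_rows
from   user_indexes i, user_tables t
where  i.table_name = t.table_name
and    t.table_name = 'BIG_BOWIE';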

> There is such a thing as sort/merge overhead and that is not being taken
> into account in the case you mention. The sorted-by-index retrieval is
> hardly more expensive than a final merge.

I'll let the above example speak for itself.

Give it a go and see for yourself ;)

Cheers

Richard

>
> --
> Cheers
> Nuno Souto
> nsouto_at_optusnet.com.au.nospam
Received on Mon Nov 04 2002 - 07:18:38 CST
