Re: query performance -- another try

From: Doug <dfult_at_econs.umass.edu>
Date: 4 Jun 2003 16:03:39 -0700
Message-ID: <358a7114.0306041503.521ecdb3@posting.google.com>

I used alter session to set optimizer_index_caching=99 and optimizer_index_cost_adj=10 (based on some other posts).

The simple query now uses the index in both cases (42 OR clauses or 43 OR clauses), but the time difference is still present (for one additional OR clause) -- .31 to 3.31. If I ramp the number of OR clauses up (in practice, we can have over a thousand -- this is a postalCode proximity thing), the time continues to degrade pretty badly, e.g. 27 seconds for 325 OR clauses, using the index. Anyway, the crux of it seems to lie in the big difference between the two threshold cases (threshold in the sense that 43 OR clauses was where it used to switch to a table scan before I altered the session parameters as described above).

One more wrinkle: I dropped the "AND country = 'USA'" from the query and the time difference vanished. It suggests that I should add Country to the index => (postalCode, country), but it still doesn't explain why the huge time difference for one additional OR clause.

Any more insights?

501 rows selected.

Elapsed: 00:00:00.31

Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=3 Bytes=39) 1 0 INLIST ITERATOR

   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'ADDRESS' (Cost=16 Card
          =3 Bytes=39)

   3    2       INDEX (RANGE SCAN) OF 'ADDRESS_POSTALCODE' (NON-UNIQUE
          ) (Cost=1 Card=3)

Statistics

          0  recursive calls
          0  db block gets
        339  consistent gets
          0  physical reads
          0  redo size
       8895  bytes sent via SQL*Net to client
       3030  bytes received via SQL*Net from client
         35  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        501  rows processed

502 rows selected.

Elapsed: 00:00:03.31

Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=3 Bytes=39) 1 0 INLIST ITERATOR

   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'ADDRESS' (Cost=16 Card
          =3 Bytes=39)

   3    2       INDEX (RANGE SCAN) OF 'ADDRESS_POSTALCODE' (NON-UNIQUE
          ) (Cost=1 Card=3)

Statistics

          0  recursive calls
          4  db block gets
       1048  consistent gets
          0  physical reads
          0  redo size
       8905  bytes sent via SQL*Net to client
       3030  bytes received via SQL*Net from client
         35  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        502  rows processed

"Mikito Harakiri" <mikharakiri_at_ywho.com> wrote in message news:<dd8Da.6$354.192_at_news.oracle.com>...
> The buffer gets count increases from 282 to 1015, while the optimizer thinks
> the 2 statements have almost the same cost. The estimation error is,
> therefore 4 times -- not *that* bad.
>
> Now, the elapsed time differs much more dramatically: 0.02 sec for index
> range scan vs. 3.12 sec for full table scan. Is this index caching effect
> that you are experiencing?
>
> I'm taking back my statement that histogram might help in your case. If you
> collect histogram on country, for example, then optimizer would think that
> predicates are even less selective!
>
> I'm not sure what effect collecting the histogram on ZIP code would be. You
> can try
>
> begin
> dbms_stats.gather_table_stats(<schema>,<table>, cascade=>true,
> method_opt=>'for columns ZIP size skewonly');
> end;
> /
>
> But adjusting optimizer_index_caching or optimizer_index_cost_adj would
> certainly work in your case. (I don't remember which one affects indexed
> nested loops only, and which one index range scan -- I can't figure that out
> just from the name only).
>
> "Doug" <dfult_at_econs.umass.edu> wrote in message
> news:358a7114.0306031057.1054ca9c_at_posting.google.com...
> > Here is the tkprof output (first, where the index is used -- 42 OR
> > clauses; second, where it isn't -- 43 OR clauses):
> >
> > call count cpu elapsed disk query current
> > rows
> > ------- ------ -------- ---------- ---------- ---------- ----------
> > ----------
> > Parse 1 0.01 0.01 0 0 0
> > 0
> > Execute 1 0.01 0.01 0 0 0
> > 0
> > Fetch 2 0.01 0.02 0 282 0
> > 1
> > ------- ------ -------- ---------- ---------- ---------- ----------
> > ----------
> > total 4 0.03 0.04 0 282 0
> > 1
> >
> > Misses in library cache during parse: 1
> > Optimizer goal: CHOOSE
> > Parsing user id: 20 (ADMIN)
> >
> > Rows Row Source Operation
> > ------- ---------------------------------------------------
> > 1 SORT AGGREGATE
> > 501 INLIST ITERATOR
> > 501 TABLE ACCESS BY INDEX ROWID ADDRESS
> > 544 INDEX RANGE SCAN (object id 3860)
> >
> >
> > Rows Execution Plan
> > ------- ---------------------------------------------------
> > 0 SELECT STATEMENT GOAL: CHOOSE
> > 1 SORT (AGGREGATE)
> > 501 INLIST ITERATOR
> > 501 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
> > 'ADDRESS'
> > 544 INDEX GOAL: ANALYZED (RANGE SCAN) OF
> > 'ADDRESS_POSTALCODE' (NON-UNIQUE)
> >
> > ***************
> >
> > call count cpu elapsed disk query current
> > rows
> > ------- ------ -------- ---------- ---------- ---------- ----------
> > ----------
> > Parse 1 0.06 0.06 0 0 0
> > 0
> > Execute 1 0.00 0.00 0 0 0
> > 0
> > Fetch 2 3.06 3.12 0 1015 4
> > 1
> > ------- ------ -------- ---------- ---------- ---------- ----------
> > ----------
> > total 4 3.12 3.18 0 1015 4
> > 1
> >
> > Misses in library cache during parse: 1
> > Optimizer goal: CHOOSE
> > Parsing user id: 20 (ADMIN)
> >
> > Rows Row Source Operation
> > ------- ---------------------------------------------------
> > 1 SORT AGGREGATE
> > 502 TABLE ACCESS FULL ADDRESS
> >
> >
> > Rows Execution Plan
> > ------- ---------------------------------------------------
> > 0 SELECT STATEMENT GOAL: CHOOSE
> > 1 SORT (AGGREGATE)
> > 502 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'ADDRESS'
> >
> > I did the following to interrogate the presence of a histogram on
> > Country (from the Tuning doc):
> >
> > 1 select ENDPOINT_NUMBER, ENDPOINT_VALUE FROM DBA_HISTOGRAMS
> > 2* WHERE TABLE_NAME = 'ADDRESS' AND COLUMN_NAME = 'COUNTRY'
> > SQL> /
> >
> > ENDPOINT_NUMBER ENDPOINT_VALUE
> > --------------- --------------
> > 0 3.3917E+35
> > 1 6.0984E+35
> >
> > It doesn't look like a meaningful histogram exists, here (even to my
> > untrained eye). I tried the same thing with postalCode and got:
> >
> > ENDPOINT_NUMBER ENDPOINT_VALUE
> > --------------- --------------
> > 0 2.3469E+35
> > 1 6.5525E+35
> >
> > This also looks unhelpful. Perhaps I am doing the wrong thing to
> > verify the histograms.
> >
> > FYI: I created statistics with the following: analyze table Address
> > compute statistics;
> >
> > By the way, there is no index on country (the index on postalCode does
> > not include country). Almost everything we do is for US, and selects
> > either include postalCode (which I assumed to be selective enough --
> > certainly not aided by country except in rare cases) or on the primary
> > key.
Received on Wed Jun 04 2003 - 18:03:39 CDT