Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: SQL going after two date ranges and outer join

Re: SQL going after two date ranges and outer join

From: Sybrand Bakker <postbus_at_sybrandb.demon.nl>
Date: Wed, 10 Apr 2002 23:37:49 +0200
Message-ID: <0ur8bu01rthqo5nsmo9blnrrp82gogii5o@4ax.com>


On Wed, 10 Apr 2002 15:28:25 GMT, "Steve Mitchell" <stevem_at_hdcsi.com> wrote:

>
>Agreed that the correct driving table is being chosen. The problem is it's
>not using the available
>date index.
>
>where
> (arh.shipdatetime between '01-JAN-02' and '31-JAN-02' or
> arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and
> arh.artrxhdridx = ash.artrxhdridx (+)
>
>..and..
>
>where
> arh.shipdatetime between '01-JAN-01' and '31-JAN-01' and
> arh.artrxhdridx = ash.artrxhdridx (+)
>
>..generate two different plans. (the later using the index on shipdatetime,
>the former does not).
>
>This also confuses me. My AR_TRX_HEADER table need not be directly involved
>in the outer
>join for the plan to change. Consider the following query without an outer
>join, and the
>next query, which contains an outer join on from ar_trx_detail to
>ar_trx_header_ship. In the
>second query, the optimizer abandons use of the date index on the
>shipdatetime column of
>the ar_trx_header table and does a full table access instead.
>
>select
> count(*)
>from
> ar_trx_header_ship ash,
> ar_trx_detail ard,
> ar_trx_header arh
>where
> (arh.shipdatetime between '01-JAN-02' and '31-JAN-02' or
> arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and
> arh.artrxhdridx = ard.artrxhdridx and
> ard.shiphdridx = ash.shiphdridx
>
>..plan is:
>
>SELECT STATEMENT Optimizer=RULE
> SORT (AGGREGATE)
> CONCATENATION
> NESTED LOOPS
> NESTED LOOPS
> TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_HEADER
> INDEX (RANGE SCAN) OF XIE4AR_TRX_HEADER (NON-UNIQUE)
> TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_DETAIL
> INDEX (RANGE SCAN) OF XIF6429AR_TRX_DETAIL (NON-UNIQUE)
> INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
> NESTED LOOPS
> NESTED LOOPS
> TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_HEADER
> INDEX (RANGE SCAN) OF XIE4AR_TRX_HEADER (NON-UNIQUE)
> TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_DETAIL
> INDEX (RANGE SCAN) OF XIF6429AR_TRX_DETAIL (NON-UNIQUE)
> INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
>
>..whereas the same query with an outer join produces:
>
>select
> count(*)
>from
> ar_trx_header_ship ash,
> ar_trx_detail ard,
> ar_trx_header arh
>where
> (arh.shipdatetime between '01-JAN-02' and '31-JAN-02' or
> arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and
> arh.artrxhdridx = ard.artrxhdridx and
> ard.shiphdridx = ash.shiphdridx (+)
>
>
>SELECT STATEMENT Optimizer=RULE
> SORT (AGGREGATE)
> NESTED LOOPS (OUTER)
> NESTED LOOPS
> TABLE ACCESS (FULL) OF AR_TRX_HEADER
> TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_DETAIL
> INDEX (RANGE SCAN) OF XIF6429AR_TRX_DETAIL (NON-UNIQUE)
> INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
>
>
>
>..as for why we don't use the cost based optimizer, the answer is that I'm
>finding that most of
>our queries run slower with the cost based optimizer than without. For
>example, in this case,
>the above query without the outer join results in a plan such as:
>
>SELECT STATEMENT Optimizer=HINT: CHOOSE (Cost=403 Card=1 Bytes=21)
> SORT (AGGREGATE)
> HASH JOIN (Cost=403 Card=97424 Bytes=2045904)
> TABLE ACCESS (FULL) OF AR_TRX_HEADER (Cost=17 Card=2730 Bytes=30030)
> NESTED LOOPS (Cost=174 Card=173151 Bytes=1731510)
> TABLE ACCESS (FULL) OF AR_TRX_DETAIL (Cost=174 Card=633027
>Bytes=3798162)
> INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
>
>
>Which is another question I have: why is the cost based optimizer making
>such bad decisions? All of
>the tables involved are analyzed. There are indexes on all columns
>involved.
>
>I'm trying to learn more about how the (respective) optimizers make their
>decisions about these queries.
>
>Thanks in advance.
>
>--steve
>
>
>

The issue with the outer join is most likely being caused your parentheses
Try changing them in

      (arh.shipdatetime between '01-JAN-02' and '31-JAN-02') or
      ( arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and

I really can't explain why it does work with one range, but I have seen the RBO gone wrong in similar cases.

CBO using wrong plans/indexes:
some basic reasons
- not having adequate recent statistics. - not having proper histograms on indexed columns, so the selectivity of any column will be incorrectly determined - not using optimizer_index_cost_adj (from the top of my head) - having parallel query enabled, which will often result in full table scans being cheaper
- an inappropiate db_file_multi_block_read_count some common solutions
- use optimizer_goal = first_rows (favors using indexes) - make sure your from list has the driving table *leftmost* (using RBO, this should be *right*most)
- use hints
- in 8i, use the stored outline concept to capture plans you consider to be correct.
IMO, there is no reason anymore not to us the CBO

Oracle has a note on Metalink detailing common CBO misconceptions Jonathan Lewis has a paragraph in Practical Oracle 8i demonstrating how you can have trace dump the 'stupid' decisions of CBO.

Hth a bit

Sybrand Bakker, Senior Oracle DBA

To reply remove -verwijderdit from my e-mail address Received on Wed Apr 10 2002 - 16:37:49 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US