Re: SQL going after two date ranges and outer join

From: Steve Mitchell <stevem_at_hdcsi.com>
Date: Thu, 11 Apr 2002 14:55:43 GMT
Message-ID: <Ptht8.462$bq.396420129@newssvr21.news.prodigy.com>

Thanks for the tips on the CBO. Worth looking into for us. For what it's worth, many of our
report queries join up many tables--up to 30. From what I've read, joining too many tables can
confuse both optimizers. I'd guess that RBO often runs faster here simply because it usually uses
nested loops--which also happens to usually be the right thing to do.

--steve

"Sybrand Bakker" <postbus_at_sybrandb.demon.nl> wrote in message news:0ur8bu01rthqo5nsmo9blnrrp82gogii5o_at_4ax.com...
> On Wed, 10 Apr 2002 15:28:25 GMT, "Steve Mitchell" <stevem_at_hdcsi.com>
> wrote:
>
> >
> >Agreed that the correct driving table is being chosen. The problem is
it's
> >not using the available
> >date index.
> >
> >where
> > (arh.shipdatetime between '01-JAN-02' and '31-JAN-02' or
> > arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and
> > arh.artrxhdridx = ash.artrxhdridx (+)
> >
> >..and..
> >
> >where
> > arh.shipdatetime between '01-JAN-01' and '31-JAN-01' and
> > arh.artrxhdridx = ash.artrxhdridx (+)
> >
> >..generate two different plans. (the later using the index on
shipdatetime,
> >the former does not).
> >
> >This also confuses me. My AR_TRX_HEADER table need not be directly
involved
> >in the outer
> >join for the plan to change. Consider the following query without an
outer
> >join, and the
> >next query, which contains an outer join on from ar_trx_detail to
> >ar_trx_header_ship. In the
> >second query, the optimizer abandons use of the date index on the
> >shipdatetime column of
> >the ar_trx_header table and does a full table access instead.
> >
> >select
> > count(*)
> >from
> > ar_trx_header_ship ash,
> > ar_trx_detail ard,
> > ar_trx_header arh
> >where
> > (arh.shipdatetime between '01-JAN-02' and '31-JAN-02' or
> > arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and
> > arh.artrxhdridx = ard.artrxhdridx and
> > ard.shiphdridx = ash.shiphdridx
> >
> >..plan is:
> >
> >SELECT STATEMENT Optimizer=RULE
> > SORT (AGGREGATE)
> > CONCATENATION
> > NESTED LOOPS
> > NESTED LOOPS
> > TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_HEADER
> > INDEX (RANGE SCAN) OF XIE4AR_TRX_HEADER (NON-UNIQUE)
> > TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_DETAIL
> > INDEX (RANGE SCAN) OF XIF6429AR_TRX_DETAIL (NON-UNIQUE)
> > INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
> > NESTED LOOPS
> > NESTED LOOPS
> > TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_HEADER
> > INDEX (RANGE SCAN) OF XIE4AR_TRX_HEADER (NON-UNIQUE)
> > TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_DETAIL
> > INDEX (RANGE SCAN) OF XIF6429AR_TRX_DETAIL (NON-UNIQUE)
> > INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
> >
> >..whereas the same query with an outer join produces:
> >
> >select
> > count(*)
> >from
> > ar_trx_header_ship ash,
> > ar_trx_detail ard,
> > ar_trx_header arh
> >where
> > (arh.shipdatetime between '01-JAN-02' and '31-JAN-02' or
> > arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and
> > arh.artrxhdridx = ard.artrxhdridx and
> > ard.shiphdridx = ash.shiphdridx (+)
> >
> >
> >SELECT STATEMENT Optimizer=RULE
> > SORT (AGGREGATE)
> > NESTED LOOPS (OUTER)
> > NESTED LOOPS
> > TABLE ACCESS (FULL) OF AR_TRX_HEADER
> > TABLE ACCESS (BY INDEX ROWID) OF AR_TRX_DETAIL
> > INDEX (RANGE SCAN) OF XIF6429AR_TRX_DETAIL (NON-UNIQUE)
> > INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
> >
> >
> >
> >..as for why we don't use the cost based optimizer, the answer is that
I'm
> >finding that most of
> >our queries run slower with the cost based optimizer than without. For
> >example, in this case,
> >the above query without the outer join results in a plan such as:
> >
> >SELECT STATEMENT Optimizer=HINT: CHOOSE (Cost=403 Card=1 Bytes=21)
> > SORT (AGGREGATE)
> > HASH JOIN (Cost=403 Card=97424 Bytes=2045904)
> > TABLE ACCESS (FULL) OF AR_TRX_HEADER (Cost=17 Card=2730
Bytes=30030)
> > NESTED LOOPS (Cost=174 Card=173151 Bytes=1731510)
> > TABLE ACCESS (FULL) OF AR_TRX_DETAIL (Cost=174 Card=633027
> >Bytes=3798162)
> > INDEX (UNIQUE SCAN) OF XPKAR_TRX_HEADER_SHIP (UNIQUE)
> >
> >
> >Which is another question I have: why is the cost based optimizer making
> >such bad decisions? All of
> >the tables involved are analyzed. There are indexes on all columns
> >involved.
> >
> >I'm trying to learn more about how the (respective) optimizers make their
> >decisions about these queries.
> >
> >Thanks in advance.
> >
> >--steve
> >
> >
> >
>
> The issue with the outer join is most likely being caused your
> parentheses
> Try changing them in
> (arh.shipdatetime between '01-JAN-02' and '31-JAN-02') or
> ( arh.shipdatetime between '01-JAN-01' and '31-JAN-01') and
>
> I really can't explain why it does work with one range, but I have
> seen the RBO gone wrong in similar cases.
>
> CBO using wrong plans/indexes:
> some basic reasons
> - not having adequate recent statistics.
> - not having proper histograms on indexed columns, so the selectivity
> of any column will be incorrectly determined
> - not using optimizer_index_cost_adj (from the top of my head)
> - having parallel query enabled, which will often result in full table
> scans being cheaper
> - an inappropiate db_file_multi_block_read_count
> some common solutions
> - use optimizer_goal = first_rows (favors using indexes)
> - make sure your from list has the driving table *leftmost* (using
> RBO, this should be *right*most)
> - use hints
> - in 8i, use the stored outline concept to capture plans you consider
> to be correct.
> IMO, there is no reason anymore not to us the CBO
>
> Oracle has a note on Metalink detailing common CBO misconceptions
> Jonathan Lewis has a paragraph in Practical Oracle 8i demonstrating
> how you can have trace dump the 'stupid' decisions of CBO.
>
> Hth a bit
>
>
> Sybrand Bakker, Senior Oracle DBA
>
> To reply remove -verwijderdit from my e-mail address
Received on Thu Apr 11 2002 - 09:55:43 CDT