
Re: When does performance worsen when going from RULE to COST

From: <markp7832_at_my-deja.com>
Date: Sat, 05 Feb 2000 20:31:44 GMT
Message-ID: <87i1bg$gms$1@nnrp1.deja.com>


In article <uJIkTZ9b$GA.275_at_net025s>,
  "Jaap W. van Dijk" <uw.naam_at_hetnet.nl> wrote:
> Hi g's,
>
> When you want to change a database from rule based to cost based,
> you're naturally going to be somewhat worried about the general and
> query-specific effects.
>
> My question: are there general causes of degradation people stumble
> upon when making this change? This is of course under the assumption
> that every table has been analyzed (ANALYZE TABLE <name> COMPUTE
> STATISTICS).
>
> For example (one I've seen often): the rule based optimizer uses an
> index, and the cost based optimizer doesn't, under the false
> assumption that a literal in the WHERE clause is one of the evenly
> distributed values of a column, where in fact it is a very rare value
> (e.g. the literal 'F' for the column GENDER of a personnel table in a
> database used by the army). Of course this can be gotten rid of by
> gathering histogram statistics for this column, but solving this
> particular example is not the issue here.
>
> Jaap.
>

To begin with, the CBO over-estimates the cost of using an index, because it assumes that every index block retrieval is a separate physical I/O. In practice, buffer cache management means the root and many of the branch blocks will probably be held in memory for the life of the query, even if none of the index was cached when the query started.
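
Roughly speaking (the exact arithmetic varies by version, so treat this only as a sketch), the cost it comes up with for an index range scan is something like:

  cost ~= blevel                                  -- root/branch block visits
        + ceil(selectivity * leaf_blocks)         -- leaf blocks walked
        + ceil(selectivity * clustering_factor)   -- table blocks visited

and every one of those block visits is costed as if it were a separate physical read, which is exactly the over-estimate described above.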

Then the CBO makes certain assumptions about how data is distributed, and when these are applied to unevenly distributed data values the result can be plans that are out to lunch. Worse, estimated statistics often produce very poor numbers for the distinctness of columns, which further fool the optimizer.
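
A quick sanity check (the PERSONNEL table and GENDER column here are just borrowed from Jaap's example) is to compare what the dictionary believes with what is actually in the table, and to gather a histogram where the skew is bad:

  -- what the optimizer believes about column distinctness
  SELECT column_name, num_distinct
    FROM user_tab_columns
   WHERE table_name = 'PERSONNEL';

  -- what is actually in the table
  SELECT COUNT(DISTINCT gender) FROM personnel;

  -- gather a histogram on the badly skewed column
  ANALYZE TABLE personnel COMPUTE STATISTICS
    FOR COLUMNS gender SIZE 75;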

I know of no easy way to identify these problem indexes and columns in advance, short of having detailed knowledge of the data.

Watch out for work and report tables, particularly ones that are truncated or, worse, have every row deleted before being re-used. If you re-analyze and happen to catch one of these tables when it is nearly empty, the queries written against it may well go to lunch.
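
One way around it (WORK_ORDERS is just a made-up table name) is to analyze such a table once, while it holds a representative volume of data, and then keep it out of the routine re-analyze job:

  -- run once, while the table is populated with a typical volume
  ANALYZE TABLE work_orders COMPUTE STATISTICS;

  -- then exclude WORK_ORDERS from the regular ANALYZE scripts so a
  -- truncate/delete cycle can never leave "empty table" statistics behind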

With a little work running explain plans against production, or against a full-sized test system if you are lucky enough to have one, you can identify reasonable plans and use hints to lock them into place. I think writing every FROM clause so that the tables appear in the desired join order is good practice. This allows the use of the ORDERED hint, often with a USE_NL, to lock in the join order and join method, as in the example below. Once the join order and join method (nested loop, sort/merge, hash) are locked in, the CBO is unlikely to change the plan when the statistics change. For tables and indexes whose statistics vary a lot this is a blessing.
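
Something along these lines (EMP and DEPT are just the usual demo tables) is what I mean:

  SELECT /*+ ORDERED USE_NL(e) */
         d.dname, e.ename
    FROM dept d,    -- driving table listed first; ORDERED follows the FROM order
         emp  e     -- USE_NL(e) makes E the inner table of a nested loop join
   WHERE e.deptno = d.deptno
     AND d.dname  = 'SALES';

With both hints in place, re-analyzing the tables should not flip this statement over to a hash or sort/merge join.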

It's my opinion that in a truly well designed system every SQL statement would be hinted to run the way it should always run, and only ad-hoc queries would come under the control of the CBO. But who has that much time, and the CBO is right a fair amount of the time.

--
Mark D. Powell -- The only advice that counts is the advice that you follow, so follow your own advice --

Sent via Deja.com http://www.deja.com/
Before you buy.

