Re: full-scan vs index for "small" tables

From: Mladen Gogala <gogala_at_sbcglobal.net>
Date: Wed, 28 Jun 2006 17:44:18 -0400
Message-Id: <1151531058l.2942l.0l@medo.noip.com>

On 06/28/2006 12:00:17 PM, Cary Millsap wrote:
> > ...need stable sql plans.
>
> But the whole point of the CBO is that execution plan stability is
> inferior to execution plan adaptation to changing circumstances. As
> Jonathan Lewis points out very well, all it takes is the insertion of a
> single row to make the True Best Plan change from one execution to
> another.

Cary, while True Best Plan(TM) can change from one execution to the next, there are also some problems that are introduced by this:

One cannot develop on a smaller machine and expect to have the same execution path on another machine. That makes explain plan statement practically useless. The same applies to performance testing. This, in turn, complicates development process immensely.
Gathering statistics introduces an element of randomness into production environment. Some data is loaded into a big table in a disorderly fashion, it pushes the clustering factor over the limit and the index is no longer used. Tuning is complex, requires a cooperation from development and is generally regarded as an exercise in futility, as some future data change will send us all into an overdrive yet again.

>
> If you truly want plan stability, then you want stored outlines, do you
> not?

Not necessarily. I advise my clients not to gather statistics too frequently. Usually, I recommend bi-monthly period. I also recommend cheating and locking statistics. I love setting clustering factor to a low value and then locking the statistics for the index.

>
> Certainly, there are two distinct categories where CBO messes up:
>
> I. Where it has been misinformed by the data it uses to make decisions.
> II. Where it makes poor decisions based upon truly representative data.
>
> My experience is that most problems that people think are category II
> problems are really category I problems in disguise. The difference can
> be revealed by inspection of 10053 data.
>
> I do recognize the existence of category II problems as well. It's just
> that I think they're considerably rarer than most people believe.

I do believe that RBO should have been left as a choice. It is very nice to have a smart optimizer, as long as I am smarter still and I am able to predict what will the optimizer do. In other words, I believe that something like this should be built into the optimizer:

You will not harm the user or allow them to be harmed by inadequate performance because of your inactivity.
You will always follow orders of a DBA or developers, except when that contradicts the 1st law.
You will always protect the existence of statistics, except when that contradicts the 1st or the 2n law.

Unfortunately, I'm getting old and CBO is getting too complex, a real black box, practically impossible to predict. Maybe it does sound like Matt Lauer's doomsday comedy on the Sci-Fi channel, but I do believe that the optimizer is out to get us. Run for the hills!

-- 
Mladen Gogala
http://www.mgogala.com

--
http://www.freelists.org/webpage/oracle-l

Received on Wed Jun 28 2006 - 16:44:18 CDT