Re: why does CBO not use available indices

From: Iggy Fernandez <iggy_fernandez_at_hotmail.com>
Date: 6 Oct 2003 16:00:11 -0700
Message-ID: <a8cdfc59.0310061500.7ca02fd4@posting.google.com>

re: You have skewed data and need a histogram

if i may be bold enough to add my two pence to jonathan's analysis. you have already computed statistics but i believe that in the absence of the "size" clause, it does not really generate a histogram. it only computes the min and max values. validate this by "select end_value from dba_histograms where table_name='<table_name>' and column_name='column_name'". to create "real" histograms use the "size" clause e.g. "compute statistics on all columns size 100" will compute a histogram with 100 bands.

you need a "real" histogram for oracle to realize that FOREIGN_ID is a very selective column

an interesting piece of trivia is that oracle uses "height balanced histograms" (where each band contains the same number of datapoints but the width of each band is different) instead of "width balanced histograms" (where the width of each band is the same but the number of datapoints in each band is the same) as is common with other vendors. the graphical representation of "width balanced histograms" looks like graphs in a calculus book (e.g. the normal distribution). "height balanced histograms" are quite unusual but i guess the oracle developers believed they were more effective.

cheers,
iggy

caveat: i could be very wrong :-)

"Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote in message news:<blsdqs$kgr$1$8302bc10_at_news.demon.co.uk>...
> If Oracle is making a big mistake with the plan,
> then there are a couple of GUESSES I could
> make:
> a) You have skewed data and need a histogram
> b) Your sort_area_size is unsuitable
> c) Your db_file_multiblock_read_count is unsuitable
>
> Since you think a nested loop is the correct
> strategy you might try modifying the parameter
> optimizer_index_caching
> - values around 80 are often 'realistic'
> for OLTP systems.
>
> You don't need to fiddle with
> optimizer_max_permutations to 'fix'
> a two-table join.
>
>
> --
> Regards
>
> Jonathan Lewis
> http://www.jlcomp.demon.co.uk
>
> The educated person is not the person
> who can answer the questions, but the
> person who can question the answers -- T. Schick Jr
>
>
> One-day tutorials:
> http://www.jlcomp.demon.co.uk/tutorial.html
> ____Iceland__November (tbc)
> ____Belgium__November (EOUG event)
> ____UK_______December (UKOUG conference)
>
>
> Three-day seminar:
> see http://www.jlcomp.demon.co.uk/seminar.html
> ____USA__October
> ____UK___November
>
>
> The Co-operative Oracle Users' FAQ
> http://www.jlcomp.demon.co.uk/faq/ind_faq.html
>
>
> "Anke Heinrich" <anke.heinrich_at_marconi.com> wrote in message
> news:9042145d.0310060903.4ed4e354_at_posting.google.com...
> > Hi,
> >
> > I have a problem getting the cost based optimizer to use the
> available
> > indices instead of performing a full table scan. Using hints is not
> > possible as I can't change the source code at the moment. I've heard
> > about outlines, but I'm not sure if this works for us.
> >
> > I don't have much experience with query optimization or
> export/import
> > of outlines, so any help from you is greatly appreciated. The facts
> > follow:
> >
> > SQL statement:
> >
> > select * from TP, ETP
> > where ( TP.FK_RESOURCE_ETP = ETP.ID ) and ( ETP.FOREIGN_ID = ? )
> > order by ETP.FK_CONTAINER_EMLSN, ETP.FOREIGN_TYPE, ETP.FOREIGN_LABEL
> >
> > best select strategy:
> > 1) search for entries in ETP with given FOREIGN_ID
> > using index ETP_FOREIGN_ID on ETP (FOREIGN_ID) which is not
> unique,
> > but very selective (usually only one match)
> > 2) join the results from ETP with TP.FK_RESOURCE_ETP = ETP.ID
> > using unique index TP_KEY_RESOURCE on TP (FK_RESOURCE_ETP)
> > 3) sort results
> >
> > table statistics:
> > have been calculated using
> > analyze table TP compute statistics
> > for all indexes for all indexed columns;
> > analyze table ETP compute statistics
> > for all indexes for all indexed columns;
> >
> > optimizer parameter:
> > optimizer_index_cost_adj=10
> > optimizer_max_permutations=1000
> >
> > we've had a good execution plan when both tables had around 400
> > entries:
> > 0 SELECT STATEMENT Optimizer=CHOOSE
> > 1 0 SORT (ORDER BY)
> > 2 1 NESTED LOOPS
> > 3 2 TABLE ACCESS (BY INDEX ROWID) OF 'ETP'
> > 4 3 INDEX (RANGE SCAN) OF 'ETP_FOREIGN_ID'
> (NON-UNIQUE)
> > 5 2 TABLE ACCESS (BY INDEX ROWID) OF 'TP'
> > 6 5 INDEX (UNIQUE SCAN) OF 'TP_KEY_RESOURCE' (UNIQUE)
> >
> > with around 35000 entries (and updated statistics) the execution
> plan
> > now is:
> > 0 SELECT STATEMENT Optimizer=CHOOSE
> > 1 0 SORT (ORDER BY)
> > 2 1 MERGE JOIN
> > 3 2 TABLE ACCESS (BY INDEX ROWID) OF 'ETP'
> > 4 3 INDEX (FULL SCAN) OF 'ETP_ID' (UNIQUE)
> > 5 2 SORT (JOIN)
> > 6 5 TABLE ACCESS (FULL) OF 'TP'
> >
> > we tried out the hint which produced the preferred execution plan,
> but
> > as I said we can't change the source code, so this is not a real
> > option:
> >
> > /*+ USE_NL(ETP TP) INDEX(ETP ETP_FOREIGN_ID) INDEX(TP
> > TP_KEY_RESOURCE) */
> >
> > performance difference is 0.04 with good old plan, 0.46 with new one
> > :-(
> >
> > Is there a way to force the optimizer to use a specific plan for
> > dynamically generated statements without changing source code?
> >
> > Are there any other options left?
> >
> > Thanks in advance,
> > Anke
Received on Mon Oct 06 2003 - 18:00:11 CDT