Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 


Re: CBO histograms & density

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Fri, 15 Nov 2002 10:26:34 -0000
Message-ID: <ar2i63$mif$1$8302bc10@news.demon.co.uk>

Notes in-line

--
Regards

Jonathan Lewis
http://www.jlcomp.demon.co.uk

Next Seminar dates:
(see http://www.jlcomp.demon.co.uk/seminar.html )

____USA__________November 7/9   (Detroit)
____USA__________November 19/21 (Dallas)
____England______November 12/14

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html





Stephan Bressler wrote in message ...

>Hi Gurus,
>
>I've got a tricky problem on 9iR2 / HPUX.
>
>The statement is:
>select * from table where object_id=1234
>
>The table contains 50,000 rows in 900 blocks. There's an index on object_id
>with ~26,000 distinct keys. object_id is a number column.
>
So this is a single column index with an average of 2 rows per value - are null values allowed, and are there any very popular values?
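Both questions can be answered with a couple of simple queries - a sketch only, since the post never names the table (I've assumed t1):

```sql
-- How many rows per value, and are any values unusually popular?
-- (t1 is an assumed table name; the post just says "table")
select object_id, count(*)
from   t1
group by object_id
order by count(*) desc;

-- Are there any nulls in the indexed column?
select count(*)
from   t1
where  object_id is null;
```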
>We analyze our schema using
> dbms_stats.gather_table_stats( .., estimate=>false,
> method_opt=>'for all indexes for all indexed columns size 16',
> cascade=>true)
>(BTW: the same happens if using "analyze schema ...")
>
In most cases, you do not need histograms to get a good optimisation path. If you do need histograms, you probably need more than 16 buckets.
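If a histogram really is wanted on this one column, a sketch of collecting it with the maximum bucket count (254) rather than 16 - the table name t1 is an assumption, and I've used the real estimate_percent parameter (null = compute) rather than the "estimate=>false" in the post:

```sql
begin
  dbms_stats.gather_table_stats(
    ownname          => user,
    tabname          => 'T1',   -- assumed table name
    estimate_percent => null,   -- compute, as the original call intended
    method_opt       => 'for columns object_id size 254',
    cascade          => true
  );
end;
/
```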
>After that, the query is done by a full table scan (FTS). I believe the
>problem to be the information dba_tab_cols.density, which contains 0.011. In
>turn the optimizer believes it will get 0.011*50000 = 600 rows for a given
>object_id by an index range scan.
The density is derived by building the histogram, estimating the number of distinct values, the number of nulls, identifying popular values, factoring popular values out of the total data set, and then dividing the estimated number of remaining rows by the number of distinct values, allowing for nulls. I'm having trouble working out how the data you have described (with 16 buckets) could end up setting this value to 0.011
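The inputs to that derivation can be inspected directly in the dictionary views (again assuming the table is called t1):

```sql
-- Summary statistics for the column, including the derived density
select column_name, num_distinct, num_nulls, density, num_buckets
from   user_tab_columns
where  table_name  = 'T1'
and    column_name = 'OBJECT_ID';

-- The histogram endpoints themselves
select endpoint_number, endpoint_value
from   user_tab_histograms
where  table_name  = 'T1'
and    column_name = 'OBJECT_ID'
order by endpoint_number;
```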
>This seems to be 3 times more expensive than the FTS (which, of course, is
>wrong!).
>
>The value of density is dependent on the number of buckets named in the
>analyze statement. The more buckets, the better the value. A value of 200
>buckets produces a density value of 2.1e-4, while the accurate value of
>1/26,000 = 3.8e-5 is produced only if the "for all indexed columns" is omitted
>or if a bucket size of 1 is given.
>If density is close to the correct value, the optimizer chooses a "good"
>(=index) plan.

Generally to be expected - although sometimes luck plays a hand. The more buckets you have, the more detail Oracle acquires about popular, very popular and quite popular values, and the better the detail about the spread of non-popular values.
>
>My questions are:
>1. If there is histogram information available, why is the column overall
>density used at all?
It isn't - your query should be using the endpoint_values that bound the input literal. The density is there for the benefit of bind-variable queries. (This may explain the tablescan, but not the density.)
>2. What does dba_tab_cols.density mean in the presence of histograms?
>
It is an indicator of the selectivity of non-popular values.
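One way to see which figure the optimizer actually used is the estimated Rows column of the execution plan; with a literal predicate it should come from the histogram endpoints rather than the density. A sketch using dbms_xplan (available in 9iR2; t1 is still an assumed table name):

```sql
explain plan for
  select * from t1 where object_id = 1234;

select * from table(dbms_xplan.display);
```

If the Rows estimate drops to ~2 (the average rows per key) after regathering with more buckets, the index plan should follow.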
>TIA
>Stephan
Received on Fri Nov 15 2002 - 04:26:34 CST
