Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: CBO histograms & density

Re: CBO histograms & density

From: Stephan Bressler <stephan.bressler_at_siemens.com>
Date: Fri, 15 Nov 2002 12:07:34 +0100
Message-ID: <ar2kgk$ode$1@news.mch.sbs.de>


Hi Jonathan,

thanx for the answer. Find my comments below.

There's a even more confusing fact.
There are 2 more indexes on this table. A primary key and a non-unique not-null index on column ean (with ~45000 distinct keys). The values of dba_tab_cols.density of these columns are always 1/num_distinct regardless of the number of buckets. Apart the difference to the primary key, why the difference to ean???

Stephan

> >The table contains 50.000 rows in 900 blocks. There's an index on
> object_id
> >with ~26.000 distict keys. object_id is a number column.
> >
>
> So this is a single column index with an average of 2 rows per value -
> are null values allowed, and are there any very popular values ?
Nulls are not allowed. object_id ranges from 1 to 20.000.000, few object_ids have 500-1000 rows.
So there are no popular values by means of histograms. BTW: The execution plan is independent of the literal value I use in my query (even if I used -1000)

>
> >We analyze our schema using
> > dbms_stasts.gather_table_stats( .., estimate=>false,
> > method_opt=>'for all indexes for all indexed columns size
> 16',
> >cascade=>true)
> >(BTW: the same happens if using "analyze schema ...")
> >
>
>
> In most cases, you do not need histograms to get a good
> optimisation path. If you do need histograms, you probably
> need more that 16 buckets.

I'm aware of this. The statistics are maintained by the customer who analyzed all
tables in the same way (ease of use, too few knowledge...)

>
> >After that, the query is done by a full table scan (FTS). I believe
> the
> >problem to be the information dba_tab_cols.density, which contains
> 0,011. In
> >turn the optimizer believes to get 0,011*50000 = 600 rows for a
> given
> >object_id by an index range scan.
>
>
> The density is derived by building the histogram, estimating the
> number of
> distinct values, the number of nulls, identifying popular values,
> factoring popular
> values out of the total data set, and then dividing the estimated
> number of remaining
> rows by the number of distinct values, allowing for nulls. I'm having
> trouble working
> out how the data you have described (with 16 buckets) could end up
> setting this
> value to 0.011

Same to me.
There are no popular values, all histogram endpoints show different numbers. 0.011 is what dba_tab_columns.density displays for this columns... (dba_tab_columns.num_distinct contains 26253 rows regardless of the number of buckets).

>
>
> >This seems to be 3 times more expensive than the FTS (which, of
> course, is
> >wrong!).
> >
> >The value of density is dependent on the number of bucket named in
> the
>
> Generally to be expected - although sometimes luck plays a hand. The
> more
> buckets you have, the more detail Oracle acquires about popular, very
> popular
> and quite popular values, and the better the detail about the spread
> of non-popular
> values.

Correct. But if the values of a columns are distributed almost evenly over the value range,
the same should be true if the entire range is divided into bucket. Received on Fri Nov 15 2002 - 05:07:34 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US