Oracle FAQ Your Portal to the Oracle Knowledge Grid
 

Home -> Community -> Usenet -> c.d.o.server -> Re: CBO histograms & density

Re: CBO histograms & density

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Fri, 15 Nov 2002 15:06:33 -0000
Message-ID: <ar32is$jso$1$8300dec7@news.demon.co.uk>

Notes inline.

--
Regards

Jonathan Lewis
http://www.jlcomp.demon.co.uk

Next Seminar dates:
(see http://www.jlcomp.demon.co.uk/seminar.html )

____USA__________November 7/9   (Detroit)
____USA__________November 19/21 (Dallas)
____England______November 12/14

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html





Stephan Bressler wrote in message ...

>
>I did some further investigations. It seems that the density
>information changes its meaning in the presence of histograms on a
>given table.
>
That's what I said in my original posting:

<quote>
The density is derived by building the histogram, estimating the
number of distinct values, the number of nulls, identifying popular
values, factoring popular values out of the total data set, and then
dividing the estimated number of remaining rows by the number of
distinct values, allowing for nulls.
</quote>

I guess the point that I didn't make clearly is that Oracle's
understanding of 'popular values' is based on the histogram results.
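The quoted derivation can be sketched in a few lines of Python. This
is only my reading of the description above - the height-balanced
bucketing, the "appears as more than one bucket endpoint = popular"
test, and the final normalisation are assumptions of mine, not
Oracle's actual code:

```python
from collections import Counter

def estimate_density(values, num_buckets=16):
    """Hedged sketch of the density derivation described above.
    The function name and the popularity test are assumptions."""
    non_null = [v for v in values if v is not None]
    n = len(non_null)
    # Height-balanced histogram: sort the data and take the value
    # found every 1/num_buckets of the way down the sorted list.
    s = sorted(non_null)
    endpoints = [s[min(n - 1, (i * n) // num_buckets)]
                 for i in range(1, num_buckets + 1)]
    # Treat a value as 'popular' if it spans more than one bucket
    # endpoint (assumption).
    ep_counts = Counter(endpoints)
    popular = {v for v, c in ep_counts.items() if c > 1}
    counts = Counter(non_null)
    # Factor the popular values out of the data set.
    popular_rows = sum(counts[v] for v in popular)
    remaining_rows = n - popular_rows
    remaining_distinct = len(counts) - len(popular)
    if remaining_distinct == 0:
        return 1.0 / len(counts)
    # Divide the remaining rows by the remaining distinct values,
    # then normalise by the non-null row count to get a
    # selectivity-like density.
    return (remaining_rows / remaining_distinct) / n
```

With uniform data (no popular values) this collapses to roughly
1/num_distinct; a single heavily repeated value is factored out and
drags the density down for everything else.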
>Back to my original question:
>The column ean with a "good" density value in any case has max. 2
>rows per key, while object_id has up to 3000 per key. This seems to
>make the difference.
Should do - this is a "popular value", and will therefore have some
effect on the calculated density. Nevertheless, you have 50,000 rows
and 25,000 distinct values. Call this 48,000 and 24,000 for
convenience. Even if half the rows fell into 8 values at 3,000 rows
per value, I don't think the density should be as high as it is.

Note - to build your histogram of 16 buckets, Oracle takes a data
sample, sorts it into order, and then checks the value every
one-sixteenth of the way down the sorted list (i.e. roughly every
3,000 rows - that's an interesting number).

Your stats have just come through - I would certainly expect to see a
different cost if you gave an index hint to

    id = 5000000

SB XXX object_id  4   4110683
SB XXX object_id  5   6196801
    (gap of 2M)

rather than

    id = 19897000
    (gap of 800)

SB XXX object_id 12  19896336
SB XXX object_id 13  19907124

I can't see any clue why the density should be so high! I am now very
curious - I thought I had this bit of the optimiser cracked, but
obviously I've missed something.

If you want to experiment, you could clone the table and just try:

    analyze table copytab compute statistics for columns xxx size 16;

to see if the stats look very different.

If you are allowed to, I would be interested in playing with the data
set - if you want to 'create table x as select /*+ full */
guilty_column' then export it and email me direct. Let me know the
avg_row_len and I'll import the data and hack it to the right size
and see what I can make of it.
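The one-sixteenth sampling in the note above can be mimicked to see
where the bucket endpoints (and the gaps between them) come from. A
minimal sketch - the function name is my own, and it mirrors the
description in the text, not Oracle's internal code:

```python
def histogram_endpoints(values, num_buckets=16):
    """Sketch of a height-balanced histogram: sort the data and
    record the value found every 1/num_buckets of the way down the
    sorted list, as (bucket_number, endpoint_value) pairs."""
    s = sorted(v for v in values if v is not None)
    n = len(s)
    return [(i, s[min(n - 1, (i * n) // num_buckets)])
            for i in range(1, num_buckets + 1)]
```

A large jump between consecutive endpoints (like bucket 4 to bucket 5
in the posted stats) means one-sixteenth of the rows is spread over a
wide range of values - i.e. the data is sparse in that range.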
Received on Fri Nov 15 2002 - 09:06:33 CST
