Re: Re: Density statistic calculation in case of histogram

From: Alberto Dell'Era <alberto.dellera_at_gmail.com>
Date: Thu, 9 Apr 2009 10:06:24 +0200
Message-ID: <4ef2fbf50904090106r767d0da7o698355ef6c2ee468_at_mail.gmail.com>



On Thu, Apr 9, 2009 at 04:25, Neeraj Bhatia <neeraj.dba_at_gmail.com> wrote:
> One question: You have mentioned that CBO classifies at run-time the
> histogram type as it's stored in DD using a heuristic. As per my knowledge,
> based on NDV and #Buckets requested, CBO decides whether to create HB or
> Freq histogram and save the type in data dictionary, which is visible
> through dba_tab_columns.histogram column.

Actually the histogram type is NOT stored in the data dictionary, since the CBO does not need (very surprisingly at first sight) to know the histogram type,
since frequency and height-based histograms are used the same by the internal formulae ... see the following paper for further details:

http://www.adellera.it/investigations/join_over_histograms/JoinOverHistograms.pdf

dba_tab_columns.histogram is a view on the data dictionary, and the column "histogram" is calculated, not retrieved (as you can easily check yourself).
Here is an excerpt from the same paper:
"10g dba_tab_columns.histogram is an heuristic based on num_rows, num_buckets, num_distinct and density. It is also frequently wrong for FHs, since it contains the factor density*num_buckets <= 0.5 that is very prone to rounding errors (e.g. density*num_buckets = 0.5000001 instead of 0.5). Also, the type of histogram reported in the 10053 event is for information only, probably based on the same formula used for dba_tab_columns.histogram
in 10g, and with the same weaknesses."

hth
Alberto

-- 
Alberto Dell'Era
"the more you know, the faster you go"
--
http://www.freelists.org/webpage/oracle-l
Received on Thu Apr 09 2009 - 03:06:24 CDT

Original text of this message