Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Mailing Lists -> Oracle-L -> Re: histogram oddity: Height-balanced histograms where NDV < num_buckets
Further insights:
The system view (dba_tab_col_statistics) is built upon a couple other nested system views. After digging in, I came across this piece of code from the base view (dba_tab_cols):
case when nvl(h.row_cnt,0) = 0 then 'NONE' when (h.bucket_cnt > 255 or (h.bucket_cnt > h.distcnt and h.row_cnt = h.distcnt and h.density*h.bucket_cnt*2 <= 1)) then 'FREQUENCY' else 'HEIGHT BALANCED' end,
In order by this column to be "HEIGHT BALANCED", it has to meet the following criteria:
After checking off the first two items for a particular example (table
INDPART$, column TS#, bucket_cnt = 217), I looked at the 3rd. I found out
that bucket_cnt > h.distcnt. So that explains why the view reported this
information, but why is bucket_cnt > h.distcnt? In working through this
example, I also found cases where bucket_cnt > 254, which encouraged me to
check for the largest:
SQL > select max(bucket_cnt) from sys.hist_head$;
MAX(BUCKET_CNT)
2631537
Wow. I guess I really do not know what bucket_cnt means - I thought it was a count of histogram buckets, which I understaood to be limited to 254.
But back to the example (INDPART$.TS#) for a second. SQL > select count(*), count(distinct TS#) from sys.INDPART$;
COUNT(*) COUNT(DISTINCTTS#)
---------- -------------------- 236 1
To make matters even more confusing
SQL > select ENDPOINT_NUMBER,ENDPOINT_VALUE from dba_histograms where
table_name = 'INDPART$' and column_name = 'TS#';
ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
217 79
Color me confused. In some ways, I hope I am missing something obvious. *grin*
PS - some day I need to get Wolfgang or Jonathan to further explain the HEIGHT BALANCED Density formula. I am having trouble with "the sum of the squared frequencies fo all non-popular values divided by the sum of the frequencies of all non-popular values times the count of rows with not null values". I would love to see those complex examples that you mention in writing, and I regret missing the presentations to see in person.
On 8/11/06, Charles Schultz < sacrophyte_at_gmail.com> wrote:
>
> Oracle 10.2.0.2, Solaris 8
>
> Wondering if anyone could explain this. We have 105 SYS-owned tables where
> num_distinct < num_buckets according to dba_tab_col_statistics (see
> following query). Even weirder, some columns have 217 buckets for only 1
> distinct value (verified in dba_tab_histograms). What kind of sense does
> that make? Similarly, we have one application partition that exhibits the
> same symptom.
>
> select distinct table_name, column_name, num_distinct, num_buckets
> from dba_tab_col_statistics where histogram = 'HEIGHT BALANCED' and
> num_distinct < num_buckets
> order by num_distinct,num_buckets
> /
>
-- Charles Schultz -- http://www.freelists.org/webpage/oracle-lReceived on Fri Aug 11 2006 - 12:07:40 CDT