Oracle FAQ Your Portal to the Oracle Knowledge Grid

Home -> Community -> Mailing Lists -> Oracle-L -> RE: "Multi-Column" Histogram / Histogram for Concatenated Index

RE: "Multi-Column" Histogram / Histogram for Concatenated Index

From: Mark W. Farnham <>
Date: Thu, 13 Dec 2007 12:01:17 -0500
Message-ID: <01ef01c83da9$c7cb83b0$>

Is this index required for relational integrity?

IF *NOT*, have you considered redefining the index without column C_5. That might result in an overall improvement, depending on possible extra cost to visit the table when the index would have been enough on some queries and possibly less good plans when there is an equijoin or in-list for *NOT* "881".

IF *REQUIRED*, second caveat above, consider putting C_5 last instead of third. Paradoxically, putting it first might result in a much poorer cost estimate for input to the optimizer.

I am not aware of way to raise the cost the optimizer puts on a skip scan.

For the 55% combination it seems to me if any C_2 results in 55% of the rows
(or presumably more, since even with the other columns pinned to particular
values you get 55%, then C_2 is pretty non-selective for that value. So probably you need a histogram on that column as well. So ditto on moving C_2 lower in the food chain in the order of the columns in the index.

Am I missing something.

-----Original Message-----
From: [] On Behalf Of Hemant K Chitale
Sent: Thursday, December 13, 2007 10:36 AM To:
Subject: "Multi-Column" Histogram / Histogram for Concatenated Index

I have a table with an Index on 4 columns. One particular column
(the third in the index)

is highly skewed with one value accounting for 70% of the rows. Queries with the Index columns as predicates are perfectly fine for all values
other than that one value.
Even if I gather a histogram on the skewed column, queries for the "bad" value
still use the index. The bad value alone accounts for 70% of the rows. In combination with a particular set of values for the other columns, the query
fetches about 55% of the rows. I do not want to use the index.

Say :
Table TAB_1
Columns C_1, C_2, C_3, C_4, C_5, C_6, C7 Index on C_2,C_4,C_5,C6

Value "881" in column C_5 accounts for 70% of the rows. A particular combination of values for C_2, C_4, C_5 ("881") and C_6 account for 55% of the rows.

I cannot change the SQL code. It is "generated" by the application. Even if I were to change the code, I find that this SQL query runs in a loop for different combinations of these columns. It is only one combination that is my problem. The combinations of C_2, C_4, C_5, C_6 are driven from three other tables Besides these columns C_1 and C_7 are also linked to the other tables. Some joins are equi-joins, some are ranges. {Yes, this sounds like a fact table and dimension tables, doesn't it ?}

This is, 64-bit.
Statistics are gathered nightly. I am allowed to modify the gather statistics
job to collect specific histograms.

I find that gathering histogram on C_5 alone doesn't help. {There are 6- distinct values, and I can specify upto 254 buckets}. Queries still use the index

Questions :
1. If I rebuild the index to use C_5 as the leading column would Oracle be using the Histogram to convert queries for C_5='881' into FullTableScans ?
2. Should I gather histograms on all the columns of this index ?

This is not exactly like the "correlated" columns that is used as an example for 11g MultiColumn Statistics ("Extended Statistics") as there is no relationship between the column values. What I am looking for is something like a Histogram on the Index Key
(rather than just "num_rows" and "distinct_keys", I want the density
for different combinations of key values -- a Histogram on the Index !)

Hemant K Chitale

"There is more to life than increasing its speed." Mohandas Gandhi Quotes


Received on Thu Dec 13 2007 - 11:01:17 CST

Original text of this message