Return-Path: <oracle-l-bounce@freelists.org>
Delivered-To: 2-oracle-l@orafaq.com
Received: (qmail 32051 invoked from network); 13 Dec 2007 10:22:41 -0600
Received: from freelists-180.iquest.net (HELO turing.freelists.org) (206.53.239.180)
  by static-ip-69-64-49-119.inaddr.intergenia.de with SMTP; 13 Dec 2007 10:22:41 -0600
Received: from localhost (localhost [127.0.0.1])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id A1D847D8AD5;
 Thu, 13 Dec 2007 11:22:41 -0500 (EST)
Received: from turing.freelists.org ([127.0.0.1])
 by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id 09376-03; Thu, 13 Dec 2007 11:22:41 -0500 (EST)
Received: from turing (localhost [127.0.0.1])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id F13797D8275;
 Thu, 13 Dec 2007 11:22:40 -0500 (EST)
Received: with ECARTIS (v1.0.0; list oracle-l); Thu, 13 Dec 2007 10:35:44 -0500 (EST)
Received: from localhost (localhost [127.0.0.1])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id B72D37D8C6A
 for <oracle-l@freelists.org>; Thu, 13 Dec 2007 10:35:44 -0500 (EST)
Received: from turing.freelists.org ([127.0.0.1])
 by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id 32081-03 for <oracle-l@freelists.org>;
 Thu, 13 Dec 2007 10:35:44 -0500 (EST)
Received: from smtp41.singnet.com.sg (smtp41.singnet.com.sg [165.21.103.142])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 0D5D97D8BB5
 for <oracle-l@freelists.org>; Thu, 13 Dec 2007 10:35:43 -0500 (EST)
Received: from chitale.singnet.com.sg (bb220-255-249-112.singnet.com.sg [220.255.249.112])
 by smtp41.singnet.com.sg (8.14.1/8.14.1) with ESMTP id lBDFZgh1029368
 for <oracle-l@freelists.org>; Thu, 13 Dec 2007 23:35:42 +0800
Message-Id: <200712131535.lBDFZgh1029368@smtp41.singnet.com.sg>
Date: Thu, 13 Dec 2007 23:35:38 +0800
To: oracle-l@freelists.org
From: Hemant K Chitale <hkchital@singnet.com.sg>
Subject: "Multi-Column" Histogram /  Histogram for Concatenated Index
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
X-archive-position: 3919
X-ecartis-version: Ecartis v1.0.0
Sender: oracle-l-bounce@freelists.org
Errors-to: oracle-l-bounce@freelists.org
X-original-sender: hkchital@singnet.com.sg
Precedence: normal
Reply-to: hkchital@singnet.com.sg
List-help: <mailto:ecartis@freelists.org?Subject=help>
List-unsubscribe: <oracle-l-request@freelists.org?Subject=unsubscribe>
List-software: Ecartis version 1.0.0
List-Id: oracle-l <oracle-l.freelists.org>
X-List-ID: oracle-l <oracle-l.freelists.org>
List-subscribe: <oracle-l-request@freelists.org?Subject=subscribe>
List-owner: <mailto:steve.adams@ixora.com.au>
List-post: <mailto:oracle-l@freelists.org>
List-archive: <http://www.freelists.org/archives/oracle-l>
X-list: oracle-l
X-Virus-Scanned: Debian amavisd-new at localhost.localdomain



I have a table with an Index on 4 columns.  One particular column 
(the third in the index)
is highly skewed with one value accounting for 70% of the rows.
Queries with the Index columns as predicates are perfectly fine for all values
other than that one value.
Even if I gather a histogram on the skewed column, queries for the "bad" value
still use the index.  The bad value alone accounts for 70% of the rows.
In combination with a particular set of values for the other columns, the query
fetches about 55% of the rows.  I do not want to use the index.

Say :
Table TAB_1
Columns  C_1, C_2, C_3, C_4, C_5, C_6, C7
Index on C_2,C_4,C_5,C6

Value "881" in column C_5 accounts for 70% of the rows.
A particular combination of values for C_2, C_4, C_5 ("881") and C_6
account for 55% of the rows.

I cannot change the SQL code.  It is "generated" by the application.
Even if I were to change the code,  I find that this SQL query runs
in a loop for different combinations  of these columns.  It is only
one combination that is my problem.  The combinations of
C_2, C_4, C_5, C_6   are driven from three other tables
Besides these columns C_1 and  C_7 are also linked to the other tables.
Some joins are equi-joins, some are ranges.
{Yes, this sounds like a fact table and dimension tables, doesn't it ?}

This is 9.2.0.4, 64-bit.
Statistics are gathered nightly.  I am allowed to modify the gather statistics
job to collect specific histograms.

I find that gathering histogram on C_5 alone doesn't help.
{There are 6- distinct values, and I can specify upto 254 buckets}.
Queries still use the index

Questions :
1.  If I rebuild the index to use C_5 as the leading column would
Oracle be using the Histogram to convert queries for C_5='881'
into FullTableScans ?
2.  Should I gather histograms on all the columns of this index ?

This is not exactly like the "correlated" columns that is used as an
example for 11g MultiColumn Statistics ("Extended Statistics")
as there is no relationship between the column values.
What I am looking for is something like a Histogram on the Index Key
(rather than just "num_rows" and "distinct_keys", I want the density
for different combinations of key values -- a Histogram on the Index !)




Hemant K Chitale
http://web.singnet.com.sg/~hkchital
and
http://hemantscribbles.blogspot.com
and
http://hemantoracledba.blogspot.com

"There is more to life than increasing its speed."
Mohandas Gandhi Quotes 
:  http://www.brainyquote.com/quotes/authors/m/mohandas_gandhi.html

--
http://www.freelists.org/webpage/oracle-l


