Re: multi-column indexes, statistics and selectivity
Date: Sat, 5 Dec 2009 07:26:21 -0800 (PST)
On Dec 4, 11:48 am, bugbear <bugbear_at_trim_papermule.co.uk_trim> wrote:
> Can anyone tell me (or point me at information) about
> how statistics are generated/used for multi-column indexes?
> We have a 2 column status model (major and minor status, if you like).
> This is key to our application.
> "Naturally" we have put a 2 column index on these two columns.
> But we have "doubts" about whether the CBO is always
> doing as good as job as we'd like.
> The key question seems to be:
> Are the columns "treated as a pair" so that the
> frequency estimate for a pair of statuses depends
> on both of them or...
> Are the columns treated in isolation
> so that the frequency estimate
> is simply obtained by getting the estimate
> for each column, and combining them mathematically.
> This would make a big difference in our app,
> since our major/minor status (in practice) have interesting
> and complex correlations.
To add to what others have said, how the CBO treats the two columns is going to in part depend on how the sql statement is written. If the query uses two bind variables, one for each indexed column, in the where clause conditions then the CBO (ignoring bind variable peeking) uses the default selectivity (.05) for the first column times the default selectivity of the second column in its calculations. If the data is not skewed then the presence or absence of histograms will probably not have a major mpact on the plan.
My question to you though is major/minor status of what? Will you be seeking all occurrences of whatever with a specific major status or all rows with a specific minor status? What I am trying to bring into consideration is if the status columns need to be associated with another key. That is you might need the index to be major/minor/ part_no or something like this depending on it major/minor is the PK of one of your tables or jsut really an attribute set of some of your tables. The three part key would allow more selective determination of the target rows before the table has to be accessed.
HTH -- Mark D Powell -- Received on Sat Dec 05 2009 - 09:26:21 CST