Re: Separate Index or Composite Indice ?

From: Jan Krueger <jk_at_stud.uni-hannover.de>
Date: Sat, 01 Sep 2007 09:05:17 +0200
Message-ID: <46d90ea6$0$29377$4c56b896@news-read1.lambdanet.net>

xhoster_at_gmail.com wrote:
> Jan Krueger <jk_at_stud.uni-hannover.de> wrote:

>> xhoster_at_gmail.com wrote:
>>> "userjohn" <userjohn_at_mailinator.com> wrote:
>>>> 10gR2
>>>>
>>>> If you have :
>>>> 2 columns on a hi-traffic transactional table
>>>> - one ("some_id")  high cardinality
>>>> - and one ("yes_no") low cardinality
>>>>
>>>> You use both columns in SQL - "some_id" is used in JOIN and "yes_no"
>>>> as a predicate
>>> In the same SQL, or in separate SQLs?
>>>
>>>> Would you create normal index on BOTH , just "some_id" or a composite
>>>> indice on "some_id" + "yes_no" ?
>>> I generally wouldn't bother with a single column index on "yes_no".
>>> I'd make one on (some_id,yes_no) and forget it.  If it forces itself
>>> back into your attention due to performance issues, then I'd look
>>> at explain plans.
>>>
>> I would consider creating it the other way around (yes_no, some_id)
>> because it
>> allows an index skip scan to be performed if you need to look only for
>> some_id.

>
> Maybe. But if you look only for some_id, then a regular range scan
> on (some_id,yes_no) would be better than the skip scan on (yes_no,some_id),
> wouldn't it?

True, skip scan would double the effort - one range scan for yes, one for no. But you have only one index to maintain. It depends...

> And if you specify just yes_no, the cluster factor would probably dictate
> against using the index at all, no matter which way around it is.
It depends on the value distribution of the yes_no column. I think it's more a matter of the histogram than the clustering factor. If you have only a few yes values i.e. the CBO should use the index in col = 'Y' conditions. The clustering factor could be very good even in a two value column if you sort the table by the value. Imagine a "to be processed flag" where new lines to be processed are always appended and afterwards processed. This gives a perfect low clustering factor, but using the index is only usefull for "to be processed" = yes.

Jan Received on Sat Sep 01 2007 - 02:05:17 CDT