Re: Index management

From: Howard J. Rogers <hjr_at_dizwell.com>
Date: Tue, 11 May 2004 10:15:31 +1000
Message-ID: <40a01b1d$0$4548$afc38c87@news.optusnet.com.au>

Mike Ault wrote:
> Well,
>
> After extensive testing (a couple of hours worth) using various
> combinations of insert, update and delete I was not able to achieve
> differing values for clustering factor with index rebuilds unless the
> base table is also rebuilt (which would be expected due to the
> redistribution of rows as pctfree is applied to rows that have been
> updated.) However, I was not able to fully simulate the effect of
> possibly hundreds of simultaneous IUD activities, however, that is a
> moot point.

It's moot because you really haven't understood what the clustering factor is, and if you had, you would never have made the claim that a rebuild would have changed it.

But it would be churlish of me not to acknowledge that, however late in the day, and however reluctantly, and however much padding you hope to obscure it with, you have finally admitted you were wrong.

That is good news for people who simply want kosher advice on how indexes work and whether they should use a particular number as evidence of an index's health or lack of it.

> Note that rebuild of the base table after thousands of IUD actions
> resulted in a net increase in clustering factor. To test this create a
> simple CTAS of the dba_objects view, create a couple of indexes on it,
> analyze the table using compute and check the CF of the indexes, then
> create a PL/SQL or other script that does insert, update and delete of
> various combinations of indexed values (I like to use a modulus of
> various constants against the object_id so I get a spread across the
> table with my statements). Once you complete your changes, do another
> table analysis, then recheck the clustering factors. Then rebuild the
> indexes and recheck (afer re-analysis). If you have done a number of
> updates, then rebuild the base table (alter table x move;), you will
> also have to rebuild any indexes on the table, then analyze the table
> again and check CF.

Yes, thank you. I think we all know how to measure the thing, because I posted an example of exactly how to do it about three days ago.

> The problem lays with the fact that this was not a complex environment
> involving millions of rows and many users, but for the time being I
> will concede the point that a simple index rebuild, in a simple
> environment, will not result in a change to clustering factor for a
> standard b-tree type index. I will change my lectures and update the
> tuning materials I have published to reflect this finding (I may even
> give a word of mention to the fellows Foote and Rogers...) when I am
> wrong, I admit it.

Well that is good news (and no I'm not being sarcastic about that). You just need to go one tiny step further and realise that no matter HOW complex the environment, the clustering factor is never going to change for a btree index for a simple rebuild. Ever.

Because a clustering factor indicates the alignment of an index's sort order with the order of rows in the table. After a rebuild, the index's sort order does not change (Z still comes after Y which still comes after X). And the rebuild does not alter the table in any way, shape or form. Therefore the number which describes the correlation between the two orders cannot change, either.

If you can grasp that simple truth, you won't need to qualify your admission with any caveats about the complexity of the system being measured.

> However, the CF to used baased table block ratio still tells you what
> indexes are becoming totally randomized and can still be used as a
> diagnostic pointer should the cf to used blocks ratio change
> substantially (this of course means tracking it for major tables.)

Yes, a deteriorating CF has diagnostic merit.

Only slight problem with your paragraph is that a deteriorating CF doesn't indicate an INDEX is "becoming totally randomised" (since an index is, clearly, always stored in ascending numeric or alphabetic order and is therefore never random). It is the order of rows in the TABLE which is becoming more disorganised vis-a-vis the index.

Therefore a deteriorating CF has diagnostic merit for TABLE reorganisation. You know, the table re-organisation process you had so much fun laughing at when Richard suggested it, roughly 35 posts ago.

> Also, since it is a direct factor in index cost analysis for the CBO
> steps such as re-ordering concatenated index columns or rebuild of the
> base table in index order to reduce clustering factor are still valid.

No-one ever argued about the importance of index design. I was making precisely that point about 30 posts ago, too.

With a concatenated index, the order of columns is incredibly important. That's as obvious as breathing. And in choosing between two equally plausible orderings, yes you might want to keep the CF in mind (though I think a handier rule of thumb might simply be to stick the column you query by most as the leading edge; the column which is least selective as the leading edge if that's a tie (because then you can take advantage of index compression and skip-scanning of indexes)). But yes, CF has a role to play in that decision-making process. No-one would deny that, or has ever denied it.

> I still advocate rebuild of indexes that are sparse or that
> demonstrate improper levels for their complexity.

And that's still a mistake (because sparse indexes have space that can be re-used). An index which is sparse *and stays that way over time despite subsequent DML* might be a candidate.

You might want to review Jonathan Lewis' prior posts here (a long time ago now I think) about index heights. Namely that they are seldom ever a problem, and very difficult to make grow dramatically.

> The decreases in clustering factor I noticed at sites may have been
> due to stale statistics, use of estimate rather than compute or other
> factors.

Mmmm. A bit of a fatal flaw in technique, don't you think? If one is going to develop advice from some statistics, it is rather a prerequisite that those statistics should be accurate, no?

> However, I will endeavor to check this in a more complex IUD situation
> than I can simulate on an XP laptop and if anomalous behavior (i.e.
> reduction in CF) does occur, I will report back to the list with the
> scenario and the statistics.

Looking forward to it.

Thank you for correcting your earlier statements, anyway.

HJR Received on Mon May 10 2004 - 19:15:31 CDT