Re: Index management

From: Howard J. Rogers <hjr_at_dizwell.com>
Date: Sun, 09 May 2004 10:24:02 +1000
Message-ID: <409d7a18$0$442$afc38c87@news.optusnet.com.au>

Mike Ault wrote:
> Ok.
>
> I don't mind when folks disagree, however, when you attach names to
> quotes and then say in effect this person is an idiot and this is
> rubbish, it does appear to be a personal attack rather than a
> professional one.

Well, it's actually appreciated by me at least that you have stuck around long enough this time to actually discuss the issue. Don's last post kind of indicated that you weren't going to.

So, thanks for continuing to discuss the issue, and fair enough if we don't agree, so long as neither one nor other of us actually misleads, right?

>
> To distill my advice on index rebuilds, I say to only rebuild indexes
> which show, through proper analysis, to have problems. Examples of
> possible probelms are: a large clustering factor to dirty base table
> block ratio,

Well, it appears we are never going to get an explanation of what a dirty base table block is (I've only asked three times, and Daniel's asked too).

But whatever this mysterious beastie happens to be, what possible difference can it make to such a ratio to rebuild an index??

The ratio is based on an index's clustering factor. An index's clustering factor DOES NOT CHANGE WHEN AN INDEX IS REBUILT. Ever. So if the ratio is bad *before* a rebuild, it will be *exactly as bad* after the rebuild.

The ratio must therefore be considered meaningless, and this cannot therefore be an example of "proper analysis".

In fact what it is, is an example of the "ratio snake-oil" that has been peddled for ages (not always by you, Mike) which seeks to make easy decisions about things like 'when do I add memory to my buffer cache'; 'when is an index useful'; 'when should I increase my shared pool size'; and 'when should I rebuild an index'.

It would be lovely if there was a nice simple couple of figures you could capture, divide, and thus get your answer. But any such figure that involves the clustering factor of an index had better understand what the clustering factor actually is, and deal with the fact that it never, ever, changes as a result of a simple index rebuild.

>a large number of levels (of course large is relative,
> some folks say 4 other 2 is where "large" starts), or a large amount
> of white space (I call it browned nodes, following with the leaf and
> tree metaphore.) I do not advocate rebuilding all indexes.
>
> In the case of a truely random index (such as some text indexes and
> concatenated indexes) rebuild of the index will achieve nothing.
> Unless you order the table inaccordance with the index, the clustering
> factor will remain nearer to the number of rows than to the number of
> dirty blocks. However, reordering the index columns to better align it
> with the underlying table order can reduce clustering factor
> significantly.

That is NOT an index rebuild. That is an index REDESIGN.

To suggest otherwise is to either be (a) extremely imprecise in your use of language or (b) intellectually dishonest.

> In special indexes such as bitmaps, the clustering factor can increase
> dramatically with IUD activity, and a rebuild will reduce it just as
> dramatically.
>
> Reduction in the size of the clustering factor improves the chance the
> index will be selected by the CBO since the clustering factor is a
> multiplier in the index cost.

ALTER INDEX X REBUILD never, ever, ever, ever, ever alters an index's clustering factor. And it is *that* command that everyone has in mind when the words "should I rebuild my indexes" comes up in polite conversation. Not 'drop index X...create index newx'.

> When this advice has been followed we (the folks at TUSC and other
> consultants) have seen batch times reduced from 5 hours to 2 hours,
> significant reduction in index space usage and significant reduction
> in required IO against the index tablespace.

Translation: when we completely redesign our indexes, so that they are built on different columns and with different column orders, we get improvements in their efficiency.

Why am I not surprised at that? Why is *no-one* surprised at that?

Because index redesign is obviously and powerfully capable of yielding performance benefits. Index rebuilding, however, isn't.

> As to whether an index rebuild will affect clustering factor, yes it
> will if the index has been broadened due to node splits and other
> activities.

Rebuilding an index never, ever, ever, ever, alters its clustering factor, no matter how much fresh air is in the index:

SQL> create table T1 as select * from dba_objects order by owner;
SQL> create index I1 on T1(object_id);
SQL> analyze table T1 compute statistics;
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

4529

SQL> alter index I1 rebuild pctfree 99;
SQL> analyze table T1 compute statistics;
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

4529

SQL> alter index I1 rebuild pctfree 0;
SQL> analyze table T1 compute statistics;
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

4529

Strangely, my clustering factor never changes as a result of an index rebuild. The only thing that fixes it is to rebuild my *table*:

SQL> create table Ttemp as select * from T1;
SQL> truncate table T1;
SQL> insert into T1 select * from ttemp order by object_id;
SQL> analyze table T1 compute statistics;
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

You CANNOT alter your clustering factor by an index rebuild, and I don't care how many node splits or "other activities" you've been suffering from. That's because clustering factor is an indication of how well the *table* rows are ordered vis-a-vis an index ordering, not how much fresh air your index may or may not contain.

>Remember that as old data is deleted and new data is
> inserted into the base table the order of rows in the base table
> changes as blocks are moved on and off of the free/used extent areas.

Absolutely true.

> By definition the order of rows in a relational table is random.

Also absolutely true.

>So a
> preloaded table that is perfectly ordered by key eventually falls into
> Oracle entropy if it undergoes delete and insert activities (the
> property which forces and ordered table into disorder.)

Yet again, absolutely true.

> This means
> that you may start with a perfect clustering factor but with no
> changes to the index, table level changes will alter the clustering
> factor and a rebuild will reduce it.

Absolute nonsense. 'The table order of rows has degenerated, so an index rebuild will fix it'?? An index rebuild will not affect the order of rows in the table at all (quite obviously). An index rebuild will not affect the order of leaf entries in the index (quite obviously). And the clustering factor is a measure of the similarity of the two orderings, so an index rebuild cannot, ever, cause the clustering factor to change.

See the above test. Except that I know you will claim that's not a valid test, because no DML is going on. So let's do some DML:

SQL> update T1 set object_id=object_Id+10; 6230 rows updated.
SQL> commit;
Commit complete.
SQL> analyze table T1 compute statistics; Table analyzed.
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

4541

Yup. DML sure does change the Clustering Factor (it's 4541 now, not 4529). DML changing the order of rows in a table, and thus affecting the clustering factor, is not at issue. It's whether you can fix the thing up with an index rebuild:

SQL> alter index I1 rebuild;
Index altered.
SQL> analyze table T1 compute statistics; Table analyzed.
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

4541

And no you can't. The clustering factor was affected by DML, and unless you REBUILD THE TABLE, the clustering factor is not going to change. Try again. Let's delete every other row in the table:

SQL> delete from T1 where mod(object_id,2)=1; 3109 rows deleted.
SQL> commit;
Commit complete.
SQL> analyze table T1 compute statistics; Table analyzed.
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

2049

Clustering factor has come down a fair bit as a result of DML. It's true. But you're saying, Mike, that an INDEX REBUILD will fix it up:

SQL> alter index I1 rebuild;
Index altered.

SQL> analyze table T1 compute statistics; Table analyzed.

SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

2049

Oh no it won't. And one last test:

SQL> insert into T1 select * from dba_objects order by secondary; 6230 rows created.
SQL> commit;
Commit complete.
SQL> analyze table T1 compute statistics; Table analyzed.
SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

8471

Horror of horrors, my clustering factor has shot up as a result of the insert. But no worries, Mike's advice is that the rebuild will fix it:

SQL> alter index I1 rebuild;
Index altered.

SQL> analyze table T1 compute statistics; Table analyzed.

SQL> select clustering_factor from user_indexes

2 where index_name='I1';

CLUSTERING_FACTOR

8471

Ooops. There are, in other words, NO CIRCUMSTANCES where an index rebuild changes the clustering factor. If DML over time has caused your clustering factor to go sky-high, sure that's a problem and needs looking at. But the one thing that is guaranteed NOT to fix the problem is an INDEX rebuild.

> I agree that if the base table is held constant and only inserted into
> then there is little likelyhood that the indexes will need rebuilds.
> However, we all don't live in the town Perfect and we have to deal
> with the real world where both tables and indexes get messy and must
> periodically be cleaned up. Most of the arguments against index
> rebuilds seem to ignore the fact that both the base table and the
> index are in a state of flux in a high IUD environment and over time
> get misaligned, hence causing the clustering factor to not truly
> represent the ordering of the table against the the index. A rebuild
> realigns the index nodes to better reflect the structure of the
> underlying base table.

Jeez. How in God's name does an index rebuild "realign the index nodes to better reflect the structure of underlying base table"? There is only one way for index nodes to be ordered or aligned, and that's in ascending alphabetic or numeric order, depending on what column(s) you've built the index on, of course. An index cannot "reflect the structure of the table". And a rebuild doesn't change that fact at all, ever.

As is evidenced by the fact that I can rebuild my index until the cows come how and the clustering factor NEVER, ever changes.

>
> I completely understand the concept of clustering factor, both in an
> ideal world and in the real one. I must deal with the real one.

You don't, Mike. Clearly you don't. Otherwise you wouldn't make such silly statements to the effect that a rebuild will alter it; the fluffiness of an index will affect it; or that a rebuild somehow makes the index realign itself to match the table structure.

You are trying to offer quick solutions when none exist, and what is worse, the solutions exhibit a lack of understanding about the very materials we are dealing with. And worst of all, your advice achieves nothing. Nothing at all. Rebuild with gay abandon as you may, and your clustering factors will remain unchanged, every time.

Which means, in turn, that any ratio or measure that you can possibly devise that uses the clustering factor as a factor in determining when to rebuild the index is just stupid. Because it's going to measure as much after the rebuild as before.

HJR Received on Sat May 08 2004 - 19:24:02 CDT