Re: unbalanced indexes -- common wisdom?

From: Connor McDonald <connor_mcdonald_at_yahoo.com>
Date: Fri, 24 May 2002 22:17:02 +0100
Message-ID: <3CEEADCE.24AD@yahoo.com>

Mike Ault wrote:
>
> Hate to play the devils advocate (well not really...) but if rebuilds
> on the whole are a waste of time since Oracle is so efficient then why
> do index rebuilds when:
> 1. Levels are too high (very rare, I have never seen an index higher
> than 2 that I can remember and this is with giga to terabyte
> databases)
> 2. Index is too broad (excessive IUD resulting in sparse nodes)
> 3. Clustering Factor is too high in comparison to dirty table blocks
>
> result in sometimes dramatic performance improvements? Again, if the
> index gets too broad thus driving up clustering factor the CBO will
> not use it unless forced by a hint.
>
> Essentially the only reasons to rebuild are 2 and 3 since 1 rarely
> occurs. And, regardless of what all the experts are saying, it
> provides a verifiable, quantifiable increase in performance to do so.
> You can armchair quarterback DBA work all you wish, but until you show
> me the numbers and show that it applies across the board, I wish you
> would refrain form making generalized suggestions such as indexes
> rarely have to be rebuilt.
>
> It is a case-by-case scenario and each application is different.
> Analyze the SQL, look at how it performs, apply the fixes and go on.
> Don't accept anyones blanket generalizations (me included!).
>
> Mike
> "Richard Foote" <richard.foote_at_bigpond.com> wrote in message news:<zifH8.16748$b5.58356_at_newsfeeds.bigpond.com>...
> > Hi Jonathan
> >
> > The concept of "wastage" in an index is always interesting and as you
> > clearly discuss is often mistaken. Storage is only really wasted if it's
> > unlikely to be ever reused and in most scenarios with indexes this is
> > unlikely and most space is effectively reused. It depends on the life cycle
> > and DML characteristics of the table.
> >
> > One thing you mention that has me curious is how the splitting of one 3rd
> > level node block produces an imbalance. It results in an additional 3rd node
> > being introduced containing half the contents of the split node and it's
> > parent node on the 2nd level being updated with appropriate pointers to the
> > 2 affected child nodes. But everything is still "balanced".
> >
> > I'm obviously missing something (nothing new there) so any clarification
> > would be great (but finish your tea first :)
> >
> > Richard
> > "Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote in message
> > news:1022173912.11142.0.nnrp-01.9e984b29_at_news.demon.co.uk...
> > >
> > > The following statement appears in the article you quote:
> > >
> > > <quote>
> > > When a row is deleted from a table, the Oracle database will not
> > > reuse the corresponding index space until you rebuild the index.
> > > That's the way indexes are implemented in Oracle
> > > <end quote>
> > >
> > > The statement is untrue - but commonly quoted as fact.
> > >
> > > There are a very few special cases where the nature of the
> > > application, and the method of implementation result in
> > > space being wasted unduly. This wastage is a consequence
> > > of the fact that Oracle does not merge sparsely used blocks
> > > according to the full B-tree algorithm.
> > >
> > >
> > > The following statement also appears
> > > <quote>
> > > index blocks will not be put on the free list for reuse.
> > > <end quote>
> > >
> > > This is also untrue, and it is easy to prove (left as exercise)
> > > that it is untrue.
> > >
> > >
> > >
> > > The following statement also appears
> > > <quote>
> > > For example, a three-level index might have a node that experiences
> > > heavy INSERT activity. This node could spawn a fourth level without
> > > the other level-three nodes spawning new levels.
> > > That makes the index unbalanced.
> > > <end quote>
> > >
> > > This first part of this statement is TRUE - and in fact ideal. The second
> > > part is very misleading. No matter how perfectly an index is maintained,
> > > it will always be possible to ensure that you can get it to a state where
> > > either just one 3rd level node has split to produce an "imbalance" or
> > > (the only possible alternative) every 3rd level node has to split to
> > > leave the index half-populated if you want "balance" - i.e. every leaf
> > > has to be at the fourth level.
> > >
> > > The important point about balance b-trees is that no leaf is MORE THAN
> > > ONE level deeper that every other leaf.
> > >
> > >
> > > No more time for reading the second page, tea's up.
> > >
> > >
> > > --
> > > Jonathan Lewis
> > > http://www.jlcomp.demon.co.uk
> > >
> > > Author of:
> > > Practical Oracle 8i: Building Efficient Databases
> > >
> > > Next Seminar - Australia - July/August
> > > http://www.jlcomp.demon.co.uk/seminar.html
> > >
> > > Host to The Co-Operative Oracle Users' FAQ
> > > http://www.jlcomp.demon.co.uk/faq/ind_faq.html
> > >
> > >
> > >
> > > Mikito Harakiri wrote in message ...
> > > >Sybrand Bakker <postbus_at_sybrandb.demon.nl> wrote in message
> > news:<rupneuc1hcf9rviejkd6kmhtdm4vbstrav_at_4ax.com>...
> > > >> On 22 May 2002 11:09:35 -0700, mikharakiri_at_yahoo.com (Mikito Harakiri)
> > > >> wrote:
> > > >>
> > > >> >There are so many useless options for creating an index, and the most
> > > >> >important one -- "balanced" -- seems to be missing. Would the standard
> > > >> >B-Tree implemented in any time in the future?
> > > >>
> > > >> Just FYI: ordinary Oracle indexes *are* balanced.
> > > >
> > > >I'm sorry, I was distracted by the title:
> > > >http://gethelp.devx.com/techtips/oracle_pro/10min/10min0601/10min0601.asp
> > > >
> > > >I really meant fragmented. I know, this ugly index maintenance is
> > > >unneccessary, as it doesn't improve performance -- but there is still
> > > >a waisted storage. Since oracle created every possible option to
> > > >manage storage, how did they forget about this one?
> > >
> > >

Looking at the cases you've provided (sparse blocks, clustering factor), I would say that index rebuilds are still rarely required. For sparse blocks, presumably there are two scenarios: i) typical day to day usage has resulted in the sparse blocks ii) a single adhoc maintenance function has resulted in sparse blocks In (i), a rebuild simply delays the inevitable - after the rebuild, further day to day usage is simply going to take the index back up to a similar level of sparse blocks. It takes quite a convoluted set of circumstances to create sparse blocks that (with day to day usage) will a) not be reused
b) are not adjacent to a whole bunch of other sparse blocks thus reducing the effectiveness of a coalesce In (ii), then this is typically the only time I see the need for rebuilds - because you've done something to the index that is not regular. Excepting DSS, this would (should?) be quite rare as well.

For clustering factor, I'm not sure how the rebuild is going to dramatically improve the clustering factor - after all, the table row ordering is not going to be changed by a rebuild - the index statistics may get dated, but recalculating the stats, not a rebuild is the solution to that.

Here is a set of examples, all of which show an index that probably would not get any significant from an index rebuild:

SQL> @c:\idxtest
SQL> REM
SQL> REM Example 1: reinsert same rows
SQL> REM
SQL> set echo on
SQL> @@prelim.sql
SQL> drop table emp;

Table dropped.

SQL>
SQL> create table emp
  2 ( empno , padding ) nologging as
  3 select rownum*2, rpad(rownum,10)
  4 from sys.source$
  5 where rownum < 50000;

Table created.

SQL>
SQL> create index emp_ix
2 on emp ( empno) nologging;

Index created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics before test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               145

SQL>
SQL> delete from emp
2 where empno < 15000;

7499 rows deleted.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> insert into emp
  2 select rownum*2, rpad(rownum,10)
  3 from all_objects
  4 where rownum < 7500;

7499 rows created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics after test
SQL> REM
SQL> 
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               146

SQL> 
SQL> 
SQL> REM
SQL> REM Example 2: Reinsert different rows
SQL> REM

SQL> @@prelim.sql
SQL> drop table emp;

Table dropped.

SQL>
SQL> create table emp
  2 ( empno , padding ) nologging as
  3 select rownum*2, rpad(rownum,10)
  4 from sys.source$
  5 where rownum < 50000;

Table created.

SQL>
SQL> create index emp_ix
2 on emp ( empno) nologging;

Index created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics before test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               145

SQL>
SQL> begin
  2 for i in 1 .. 6000 loop
  3 delete from emp
  4 where empno = i*2;
  5 commit;
  6 insert into emp
  7 values (i*2+1,'padding');
  8 end loop;
  9 end;
10 /

PL/SQL procedure successfully completed.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> REM
SQL> REM Statistics after test
SQL> REM
SQL> 
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               144

SQL> 
SQL> 
SQL> REM
SQL> REM Example 3: Reinsert rows at other end of the index
SQL> REM

SQL> @@prelim.sql
SQL> drop table emp;

Table dropped.

SQL>
SQL> create table emp
  2 ( empno , padding ) nologging as
  3 select rownum*2, rpad(rownum,10)
  4 from sys.source$
  5 where rownum < 50000;

Table created.

SQL>
SQL> create index emp_ix
2 on emp ( empno) nologging;

Index created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics before test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               145

SQL>
SQL> delete from emp
2 where empno < 15000;

7499 rows deleted.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> insert into emp
2 select rownum*2+999999, rpad(rownum,10) 3 from all_objects
4 where rownum < 7500;

7499 rows created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics after test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        112          1           0               147

SQL> 
SQL> 
SQL> REM
SQL> REM Example 4: Leave some blocks partially filled
SQL> REM

SQL> @@prelim.sql
SQL> drop table emp;

Table dropped.

SQL>
SQL> create table emp
  2 ( empno , padding ) nologging as
  3 select rownum*2, rpad(rownum,10)
  4 from sys.source$
  5 where rownum < 50000;

Table created.

SQL>
SQL> create index emp_ix
2 on emp ( empno) nologging;

Index created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics before test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               145

SQL>
SQL> delete from emp
2 where mod(empno,4) = 0;

24999 rows deleted.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> insert into emp
2 select rownum*2+999999, rpad(rownum,10) 3 from all_objects
4 where rownum < 16282;

16281 rows created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics after test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        145          1       24853               194

SQL> 
SQL> REM
SQL> REM But in this case, a coalesce is more than adequate
SQL> REM

SQL>
SQL> alter index emp_ix coalesce;

Index altered.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics after test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2         90          1           0               194

SQL> 
SQL> 
SQL> REM
SQL> REM Example 5: Random inserts/deletes - Normally distributed
SQL> REM

SQL> @@prelim.sql
SQL> drop table emp;

Table dropped.

SQL>
SQL> create table emp
  2 ( empno , padding ) nologging as
  3 select rownum*2, rpad(rownum,10)
  4 from sys.source$
  5 where rownum < 50000;

Table created.

SQL>
SQL> create index emp_ix
2 on emp ( empno) nologging;

Index created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics before test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               145

SQL>
SQL> declare
  2 v number;
  3 begin
  4 for i in 1 .. 5000 loop
  5 v := mod((dbms_random.normal+5)*100000,50000)*2;   6 delete from emp
  7 where empno = v;
  8 commit;
  9 v := mod((dbms_random.normal+5)*100000,50000)*2; 10 insert into emp
11 values (v,'padding');
12 end loop;
13 end;
14 /

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from emp;

COUNT(*)

54999

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics after test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        221          1           0              9870

SQL> 
SQL> REM
SQL> REM Example 6: Random inserts/deletes, random distribution
SQL> REM

SQL> @@prelim.sql
SQL> drop table emp;

Table dropped.

SQL>
SQL> create table emp
  2 ( empno , padding ) nologging as
  3 select rownum*2, rpad(rownum,10)
  4 from sys.source$
  5 where rownum < 50000;

Table created.

SQL>
SQL> create index emp_ix
2 on emp ( empno) nologging;

Index created.

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics before test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        111          1           0               145

SQL>
SQL> declare
  2 v number;
  3 begin
  4 for i in 1 .. 5000 loop
  5 v := mod(dbms_random.random,50000)*2;   6 delete from emp
  7 where empno = v;
  8 commit;
  9 v := mod(dbms_random.random,50000)*2; 10 insert into emp
11 values (v,'padding');
12 end loop;
13 end;
14 /

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from emp;

COUNT(*)

52437

SQL>
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics after test
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        120          1          57              7185

SQL>
SQL> alter index emp_ix rebuild nologging;

Index altered.

SQL> 
SQL> 
SQL> analyze index emp_ix compute statistics;

Index analyzed.

SQL> analyze index emp_ix validate structure;

Index analyzed.

SQL> 
SQL> REM
SQL> REM Statistics after rebuild
SQL> REM
SQL> select HEIGHT, LF_BLKS, BR_BLKS, DEL_LF_ROWS, CLUSTERING_FACTOR

2 from index_stats, user_indexes;

HEIGHT LF_BLKS BR_BLKS DEL_LF_ROWS CLUSTERING_FACTOR

---------- ---------- ---------- ----------- -----------------
         2        117          1           0              7185

SQL> spool off

-- 
==============================
Connor McDonald

http://www.oracledba.co.uk

"Some days you're the pigeon, some days you're the statue..."

Received on Fri May 24 2002 - 16:17:02 CDT