
Re: Index compression vs. table compression

From: Howard J. Rogers <hjr_at_dizwell.com>
Date: Wed, 29 Dec 2004 09:52:21 +1100
Message-ID: <41d1e3ad$0$1084$afc38c87@news.optusnet.com.au>


Rick Denoire wrote:
> Hello
>
> I run Oracle 9.2.0.5 on a RedHat AS 2.1 box with 8 Xeon CPUs.
> There is a clear mismatch between the overwhelming computer power and
> the poor disks attached. I am contemplating setting up a tablespace with
> compression and a blocksize of 16K (queries are of Data Warehouse
> type), and the corresponding 16k db cache (there is enough RAM), in
> order to take better advantage of this situation (the standard
> blocksize is 8K).

Unless you twiddled things when you were creating your file systems, that's wrong: it should be 4K (assuming you're using bog-standard ext2, for example). You'll have I/O problems as a result of having an Oracle block size that's bigger than your file system buffer size. Adopting 16K blocks is going to make matters worse. And bear in mind that there are no recycle or keep caches for your 16K buffer cache... meaning physical I/O is likely to be higher than it would have been if you'd just left things alone.
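
If you want to check the two sizes for yourself, something along these lines will do it (a sketch only: /dev/sda1 is just an example device, and tune2fs assumes an ext2/ext3 file system):

$ tune2fs -l /dev/sda1 | grep 'Block size'

SQL> show parameter db_block_size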

Remember always that Oracle invented multiple block size support in 9i for one reason, and one reason only: to make transportable tablespaces work properly between databases with different db_block_sizes. Certain well-known authors claiming you should 'bung everything into 32K blocks' have just muddied the waters... but don't be taken in.
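
For what it's worth, the mechanics of going down that road look like this; the non-standard cache has to exist before the tablespace can (a sketch only: the cache size and datafile path are purely illustrative):

SQL> alter system set db_16k_cache_size=64M;

SQL> create tablespace DWH_16C
  2  datafile '/u02/oradata/dwh16c01.dbf' size 1G
  3  blocksize 16K
  4  default compress;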

(Incidentally, there are articles concerning block sizes at www.dizwell.com).

> That is quite simple to do. A simple "alter table xxx move compress
> tablespace DWH_16C" should do it.

But not necessarily particularly well. Table compression works because rows with duplicate data co-reside in the same block. Therefore, if you have a random ordering of data in the table, simply switching on compression in the way you describe is unlikely to achieve anything very much. If you're going to do this, take the opportunity to physically order your table rows such that low-cardinality (and hence high-duplicating) columns are sorted together. And try not to use ASSM when you do this, otherwise it won't work (because ASSM scatters the rows regardless of your ordering request)!
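
You can check whether a tablespace uses ASSM before moving anything into it; DBA_TABLESPACES tells you (AUTO means ASSM, MANUAL is what you want for this trick):

SQL> select tablespace_name, segment_space_management
  2  from dba_tablespaces;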

In other words, a better approach is:

create table X as select * from Y;
truncate table X;
alter table X move compress;
insert into X select * from Y order by col4,col2,col7;
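
If you're building the table from scratch anyway, a single CREATE TABLE ... AS SELECT gets you the same result in one pass (and it's exactly the pattern the demonstration below uses):

create table X compress as select * from Y order by col4,col2,col7;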

Just as a quick example, and to show you the importance of correct row ordering to achieve decent compression:

SQL> create table T1 as select * from dba_objects;

Table created.

SQL> analyze table T1 compute statistics;

Table analyzed.

SQL> select blocks from dba_tables where table_name='T1';

    BLOCKS
----------
       403

[The uncompressed 'control' for this experiment is therefore 400-odd blocks in size]

SQL> create table T2 compress as select * from dba_objects;

Table created.

SQL> analyze table T2 compute statistics;

Table analyzed.

SQL> select blocks from dba_tables where table_name='T2';

    BLOCKS
----------
       209

[Simply switching on compression approximately halves that. Pretty good going for a 'brainless' application of compression]

SQL> create table T3 compress as select * from dba_objects order by owner;

Table created.

SQL> analyze table T3 compute statistics;

Table analyzed.

SQL> select blocks from dba_tables where table_name='T3';

    BLOCKS
----------
       209

[Oddly enough, ordering the rows in the table by OWNER doesn't improve matters at all]

SQL> create table T4 compress as select * from dba_objects order by object_id;

Table created.

SQL> analyze table T4 compute statistics;

Table analyzed.

SQL> select blocks from dba_tables where table_name='T4';

    BLOCKS
----------
       129

[But ordering the same rows by OBJECT_ID cuts the table size roughly in half yet again, which is clearly a much better result. Careful choice of ordering columns can therefore make or break table compression, though sometimes, as this demo shows, you get lucky straight off the bat without much thought.]
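
For convenience, here are the four results side by side (the numbers are simply those reported above):

SQL> select table_name, blocks from dba_tables
  2  where table_name in ('T1','T2','T3','T4') order by table_name;

TABLE_NAME                         BLOCKS
------------------------------ ----------
T1                                    403
T2                                    209
T3                                    209
T4                                    129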

> But with indexes, there is a
> difference I don't quite understand. Doing "ALTER INDEX xxx REBUILD
> COMPRESS y TABLESPACE DWH_16_IDX" would work, but I don't understand
> the meaning of "y" in this case.

If you have multi-column indexes, how many columns do you wish to be considered for duplicate suppression? For example, if you had an index on department, name and salary, would you want:

10 (four entries)    Bob 750
                     Sue 600
                     Bob 350
                     Bob 275
20 (two entries)     Sue 500
                     Tim 300

In which case 'alter index X rebuild compress 1' would do the trick (note that a plain 'compress' with no number defaults, for a non-unique index, to compressing on all the key columns, which is not what's wanted here). Or do you want:

10BOB (three entries)    750
                         350
                         275
10SUE (one entry)        600
20SUE (one entry)        500
20TIM (one entry)        300

In which case, 'alter index X rebuild compress 2' would be called for.
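
Incidentally, if you're not sure which prefix length suits a particular index, Oracle will suggest one. Validate the index, then query INDEX_STATS: its OPT_CMPR_COUNT and OPT_CMPR_PCTSAVED columns report the optimal prefix length and the space it would save. A sketch, with a made-up index name (note that INDEX_STATS holds a single row, for the last index validated in your session, and that VALIDATE STRUCTURE locks the table while it runs):

SQL> analyze index dept_name_sal_idx validate structure;

SQL> select opt_cmpr_count, opt_cmpr_pctsaved from index_stats;

SQL> alter index dept_name_sal_idx rebuild compress 2;  -- if opt_cmpr_count came back as 2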

> It seems that index compression requires special attention for every
> single index to be compressed.

As does table compression if you're going to do it properly and achieve the maximum effect.
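
The per-table homework largely comes down to choosing the ORDER BY columns: low-cardinality columns are the ones that repeat, and repeats are what compression feeds on. A quick look at distinct counts (here using the dba_objects demo from earlier) gives you a shortlist:

SQL> select count(distinct owner)       owners,
  2         count(distinct object_type) types,
  3         count(distinct object_id)   ids
  4  from dba_objects;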

Regards
HJR
> Could someone enlighten me on this matter?
>
> Thanks
> Rick Denoire
>
Received on Tue Dec 28 2004 - 16:52:21 CST
