Re: Index compression vs. table compression

From: Mark Bole <makbo_at_pacbell.net>
Date: Wed, 29 Dec 2004 01:46:25 GMT
Message-ID: <R%nAd.4309$yV1.4010@newssvr14.news.prodigy.com>

Howard J. Rogers wrote:

> Rick Denoire wrote:
>

>> Hello
>>
>> I run Oracle 9.2.0.5 on a RedHat AS 2.1 box with 8 Xeon CPUs.
>> There is a clear mismatch between the overwhelming computer power and
>> the poor disks attached. I am contemplating to setup a tablespace with
>> compression and a blocksize of 16K (queries are of Data Warehouse
>> type), and the corresponding 16k db cache (there is enough RAM), in
>> order to take better advantage of this situation (the standard
>> blocksize is 8K).

>
>
> Unless you twiddled things when you were creating your file systems,
> then that's wrong: it should be 4K (assuming you're using bog-standard
> ext2, for example). You'll have I/O problems as a result of having an
> Oracle block size that's bigger than your file system buffer size.
> Adopting 16K blocks is going to make matters worse. And bear in mind
> that there is no recycle or keep caches for your 16K buffer cache...
> meaning physical I/O is likely to be higher than it would have been if
> you'd just left things alone.
>
> Remember always that Oracle invented multiple block size support in 9i
> for one reason, and one reason only: to make transportable tablespaces
> work properly between databases with different db_block_sizes. Certain
> well-known authors claiming 'bung everything into 32K blocks' has just
> muddied the waters... but don't be taken in.
>
> (Incidentally, there are articles concerning block sizes at
> www.dizwell.com).
>

>> That is quite simple to do. A simple "alter table xxx move compress
>> tablespace DWH_16C" should do it.

>
>
> But not necessarily particularly well. Table compression works because
> rows with duplicate data co-reside in the same block. Therefore, if you
> have a random ordering of data in the table, simply switching on
> compression in the way you describe is unlikely to achieve anything very
> much. If you're going to do this, take the opportunity to physically
> order your table rows such that low-cardinality (and hence
> high-duplicating) columns are sorted together. And try not to use ASSM
> when you do this, otherwise it won't work (because ASSM scatters the
> rows regardless of your ordering request)!
>
> In other words, a better approach is:
>
> create table X as select * from Y
> truncate table X;
> alter table X move compress;
> insert into X select * from Y order by col4,col2,col7;
>
> Just as a quick example, and to show you the importance of correct row
> ordering to achieve decent compression:
>
> SQL> create table T1 as select * from dba_objects;
> Table created.
>
> SQL> analyze table T1 compute statistics;
> Table analyzed.
>
> SQL> select blocks from dba_tables where table_name='T1';
>
> BLOCKS
> ----------
> 403
>
> [The uncompressed 'control' for this experiment is therefore 400-odd
> blocks in size]
>
> SQL> create table T2 compress as select * from dba_objects;
> Table created.
>
> SQL> analyze table T2 compute statistics;
> Table analyzed.
>
> SQL> select blocks from dba_tables where table_name='T2';
>
> BLOCKS
> ----------
> 209
>
> [Simply switching on compression approximately halves that. Pretty good
> going for a 'brainless' application of compression]
>
> SQL> create table T3 compress as select * from dba_objects order by owner;
> Table created.
>
> SQL> analyze table T3 compute statistics;
> Table analyzed.
>
> SQL> select blocks from dba_tables where table_name='T3';
>
> BLOCKS
> ----------
> 209
>
> [Oddly enough, ordering the rows in the table by OWNER doesn't improve
> matters at all]
>
> SQL> create table T4 compress as select * from dba_objects order by
> object_id;
> Table created.
>
> SQL> analyze table T4 compute statistics;
> Table analyzed.
>
> SQL> select blocks from dba_tables where table_name='T4';
>
> BLOCKS
> ----------
> 129
>
> [But ordering the same rows by OBJECT_ID reduces the table size by about
> another half again, which has clearly got to be much better than before.
> Careful choice of column ordering can therefore make or break table
> compression, though sometimes (as here) you get lucky straight off the
> bat without much thought].
>

>> But with indexes, there is a
>> difference I don't quite understand. Doing "ALTER INDEX xxx REBUILD
>> COMPRESS y TABLESPACE DWH_16_IDX" would work, but I don't understand
>> the meaning of "y" in this case.

>
>
> If you have multi-column indexes, how many columns do you wish to be
> considered for duplicate suppression? For example, if you had an index
> on department, name and salary, would you want:
>
> 10 (four entries)
> 20 (two entries)
> --------------
> Bob 750
> Sue 600
> Bob 350
> Bob 275
> Sue 500
> Tim 300
>
> In which case 'alter index X rebuild compress' would be fine. Or do you
> want:
>
> 10BOB (three entries)
> 10SUE (one entry)
> 20SUE (one entry)
> 20TIM (one entry)
> ---------------
> 750
> 600
> 350
> 275
> 500
> 300
>
> In which case, 'alter index X rebuild compress 2' would be called for.
>

>> It seems that index compression requires a special attention for every
>> single index to be compressed.

>
>
> As does table compression if you're going to do it properly and achieve
> the maximum effect.
>
> Regards
> HJR
>
>
>

>> Could someone illuminate me in this matter?
>>
>> Thanks
>> Rick Denoire
>>

As a side note to HJR's thorough and enlightening reply, I was curious about the much narrower issue of consistent syntax raised by the OP.

The documentation at tahiti.oracle.com for version 9.2 is somewhat sparse in this regard. There is one brief mention in the data warehousing guide that a table partition will inherit the "compress" attribute of the tablespace it is located in.

The following terms appear: "data_segment_compression" for tables and tablespaces, plus "key_compression" for indexes -- but there is not an explanation why a tablespace attribute setting can cause table partitions (and, presumably, non-partitioned tables) to default to a given "compress" attribute, but not indexes.

The syntax as quoted from the OP: "ALTER INDEX xxx REBUILD >> COMPRESS y TABLESPACE DWH_16_IDX" is not documented anywhere, in other words the "compress y" clause does not appear to have any meaning (but the "compress [integer]" clause does, of course).

-Mark Bole Received on Tue Dec 28 2004 - 19:46:25 CST