Re: Convert SAP Oracle Database to IBM DB2 Database??

From: Serge Rielau <srielau_at_ca.ibm.com>
Date: Sat, 09 Feb 2008 11:10:05 -0500
Message-ID: <6161j9F1u7j0pU1@mid.individual.net>


Mark Townsend wrote:

>> The compression dictionary is part of the table's meta-data, thus it 
>> is cached.

>
> At some stage this is going to hit a memory limit, as more data and more
> tables are compressed.

In DB2 the size of the dictionary for a table is fixed. It does not grow with the size of the data. Data that is repetitive for a 1GB table is still repetitive for a 1TB table.
The ranges of data values in a column or row (DB2's compression is independent of column boundaries) do not increase as the amount of data increases once a sufficient size is reached. Aside. The savings in buffer pool footprint make up for the dictionary many times over.
> > If data values are highly localized, then having local
>> dictionaries can be helpful.

> That IS the crux of the matter. We think most data is indeed localized,
> I would be surprised, and willing to discuss over a beer, if IBM thought
> differently.

I can't speak for IBM, but I think differently. Data in a page is typically a random collection. Lets take an orders table. Addresses and names of customers will be random. You may have the occasional spike in order items over time (Plywood in hurricane season? But Overall most data in a page is spread independently of its physical location.
You will need to go to multidimensional clustering (nested partitions in Oracle speak) before you see significant localization of values. But even then only the clustering data will be localized (e.g. MONTH, REGION). What we do see is skew over time. For that reason dictionaries are local to range partitions.

> BTW - that is why random data generation does not show compression of
> well in Oracle.

Globally random is artificial. But random within a range, not so sure... To link to teh TPC comment in another post, We have tested TPC-H with compression and found DB2 to do quite well. TPC-C is too artificial (old). TPC-E will be much better as it is based on real market data.

> Remember that necessity is often the mother of invention, and what is a
> feature in one product is often a bug/design fix in another.
You are correct in general.
Reminds me on the discussion whether package variables can be public or not. Necessity turns to virtue turns to differentiator.

Wars (Flame here and real elsewhere) could be avoided if people would got more perspective.
You learn a lot about yourself, your culture, your country, your race, your DBMS of choice, ... by trying to immerse in another instead of merely yelling over the fence.

There - we've gone all philosophical now... Serge

-- 
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Received on Sat Feb 09 2008 - 10:10:05 CST

Original text of this message