Oracle-L: partitioning star schema

From: <becker.bill_at_marshfieldclinic.org>
Date: Fri, 10 Jan 2003 06:33:50 -0800
Message-ID: <F001.0052C43B.20030110063350@fatcity.com>

Hello,

We are still struggling with partitioning of star schema fact tables. As of yet, we haven't been able to test/compare any of the following scenarios (because we're not yet legal with the partitioning option), so I am posting in the hope that someone with more partitioning experience will comment. This will eventually be implemented on 9.2 on Solaris.

By star schema fact tables, I am referring to tables that consist mostly of surrogate key id fields (used for joining to dimension tables), and numeric fields containing a quantity measure.

The id fields are never directly referenced in WHERE clauses of queries as *filter* conditions, but are frequently referenced in join conditions. The filter conditions usually reference fields in one of the dimension tables joined to be the fact table.

We have developed some (untested) practical guidelines for partitioning. They are listed from best to worst. These are intended to optimize querying
(not the incremental loading), and they apply to tables rather than
indexes. (We are creating a similar list for indexes) Here they are:

partition by a field most frequently referenced in the WHERE clause as a filter condition; subpartition by a field less frequently referenced as a filter condition. This enables a double partition-pruning.
partition by a field most frequently referenced in the WHERE clause as a filter condition; subpartition by a field frequently referenced as a join condition, where the joined-to table is partitioned exactly the same way. This enables partition-pruning and partition-wise joins.
partition by a field in the table that is often referenced in WHERE clauses as a filter condition; this enables partition pruning.
partition by a frequently-used join field where the joined-to table is partitioned exactly the same; this enables partition-wise joins.
partition by a frequently-used join field.
partition by something is usually better than not partitioning at all.

In many cases, we have to go all the way to #5 before this applies. As I said, the fact table id fields are never referenced in WHERE clauses, the dimension tables are rarely large enough to be partitioned, and the WHERE filter conditions usually apply to a dimension table, so we wind up partitioning by an id field frequently used in a join clause. This id field is often a date_id field, which is used to join to a dates dimension table, because a date range is frequently used as a filter condition in queries.

Questions:
1) Do you agree with the ranking above?
2) Is there any substantial benefit to partitioning a fact table by an id

field, when the id field is used to join to a non-partitioned dimension table which is referenced in a filter condition?

Thanks to all who made it this far.
More thanks to any responders.
Most thanks to those with helpful comments. All-thanked-out, Bill.

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: 
  INET: becker.bill_at_marshfieldclinic.org

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L


(or the name of mailing list you want to be removed from).  You may

also send the HELP command for other information (like subscribing).

Received on Fri Jan 10 2003 - 08:33:50 CST