Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: partitioning star schema

Re: partitioning star schema

From: Arup Nanda <arupnanda_at_hotmail.com>
Date: Fri, 10 Jan 2003 08:54:15 -0800
Message-ID: <F001.0052C7BE.20030110085415@fatcity.com>


Bill,

I'm impressed! Most of us follow something similar to the rankings you mentioned - but a very few actually spell it out and put it for the data architects. Now that you have done most of the work, I have a few comments.

First, you don't have to worry about legal issues for testing these out. Oracle licensing enables you to try out all (yes, all, including RAC option) in development as long as you promise not to deploy in production. It is perfect for your situation - try out all 9.2 partitioning schemes in development.

Second, you are under impression that partitioning keys should be numeric; they don't have to be. Dates are most frequently used and 9.2 has another superior option called "list partitioning" where you specify discrete values, very useful in situations like, say, partitioning based on business units. Say you have 30 business units called 'TOYS', 'DOLLS', 'GAMES', etc.. and each one sends data infrequently. Your scheme will be to partition based on the business units and then sub partition based on the date.

Third, the one thing you have left out in the consideration for partitioning keys is the needs to store and archive. What is your archival strategy? If you archive off every quarter, then the date should be part of the key. Similarly in the previous point I mentioned the arrival of data in different intervals from different sources. So, in order to minimize the downtime you will need to partition based on source.

HTH. Arup Nanda
www.proligence.com

>
> Hello,
>
> We are still struggling with partitioning of star schema fact tables.
> As of yet, we haven't been able to test/compare any of the following
> scenarios (because we're not yet legal with the partitioning option),
> so I am posting in the hope that someone with more partitioning
> experience will comment. This will eventually be implemented on 9.2
> on Solaris.
>
> By star schema fact tables, I am referring to tables that consist mostly
> of surrogate key id fields (used for joining to dimension tables),
> and numeric fields containing a quantity measure.
>
> The id fields are never directly referenced in WHERE clauses of queries as
> *filter* conditions, but are frequently referenced in join conditions. The
> filter conditions usually reference fields in one of the dimension tables
> joined to be the fact table.
>
> We have developed some (untested) practical guidelines for partitioning.
> They are listed from best to worst. These are intended to optimize
querying
> (not the incremental loading), and they apply to tables rather than
> indexes. (We are creating a similar list for indexes) Here they are:
>
> 1) partition by a field most frequently referenced in the WHERE clause
> as a filter condition; subpartition by a field less frequently
> referenced as a filter condition. This enables a double
partition-pruning.
> 2) partition by a field most frequently referenced in the WHERE clause
> as a filter condition; subpartition by a field frequently
> referenced as a join condition, where the joined-to table is
> partitioned exactly the same way. This enables partition-pruning
> and partition-wise joins.
> 3) partition by a field in the table that is often referenced
> in WHERE clauses as a filter condition; this enables partition
> pruning.
> 4) partition by a frequently-used join field where the joined-to table
> is partitioned exactly the same; this enables partition-wise joins.
> 5) partition by a frequently-used join field.
> 6) partition by something is usually better than not partitioning at all.
>
> In many cases, we have to go all the way to #5 before this applies. As I
said,
> the fact table id fields are never referenced in WHERE clauses, the
dimension
> tables are rarely large enough to be partitioned, and the WHERE filter
conditions
> usually apply to a dimension table, so we wind up partitioning by an id
field
> frequently used in a join clause. This id field is often a date_id field,
> which is used to join to a dates dimension table, because a date range is
> frequently used as a filter condition in queries.
>
> Questions:
> 1) Do you agree with the ranking above?
> 2) Is there any substantial benefit to partitioning a fact table by an id
> field, when the id field is used to join to a non-partitioned dimension
> table which is referenced in a filter condition?
>
> Thanks to all who made it this far.
> More thanks to any responders.
> Most thanks to those with helpful comments.
> All-thanked-out, Bill.
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author:
> INET: becker.bill_at_marshfieldclinic.org
>
> Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> San Diego, California -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from). You may
> also send the HELP command for other information (like subscribing).
>

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Arup Nanda
  INET: arupnanda_at_hotmail.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Fri Jan 10 2003 - 10:54:15 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US