Home » RDBMS Server » Performance Tuning » Performance Pruning on fact table (10.2.0.4)
Performance Pruning on fact table [message #442725] Wed, 10 February 2010 04:29 Go to next message
summoner
Messages: 44
Registered: March 2009
Member
Dear all,

Our data warehouse has a Range-hash partition table (My_Fact) that stored 1M records. The data is partitioned by the year and then the event ID. We have master tables Event_Master and Year

We tried to use a query (generated from software "Business Objects") to retreive the data My_Fact table that the event ID is 999999 or 1111111.
And the query is similiar as follow
Quote:

select *
from My_Fact, Event_Master, Year
where My_Fact.event_id = Event_Master.event_id
and My_Fact.year_id = Year.year_id
and (Event_Master.event_id = 999999 or Event_Master.event_id = 1111111);


When we check on the execution plan, the plan simply tried to retrieve all rows from the My_Fact table as we find the following statement in the plan

Quote:

PARTITION RANGE ALL
PARTITION HASH ALL
TABLE ACCESS FULL


What we think is that the system will calculate the hash value of the event ID and the check the correct partition. But this is not correct. We are not sure why pruning does not work. We also check the parameter "star_transformation_enabled" and it is set as true

Thanks for help
Re: Partition Pruning for a [message #442731 is a reply to message #442725] Wed, 10 February 2010 04:55 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Well, the quick response would be that your query isn't specifying a single year or a range of years that it wants to retrieve data from.

You're specifying that the Event_Id has to be 111111 or 999999, but there is no restriction on the year_id
Re: Performance Pruning on fact table [message #442732 is a reply to message #442725] Wed, 10 February 2010 04:56 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
You aren't specifying a single year or a range of years.
Re: Performance Pruning on fact table [message #442734 is a reply to message #442732] Wed, 10 February 2010 05:03 Go to previous messageGo to next message
John Watson
Messages: 6396
Registered: January 2010
Location: Global Village
Senior Member
Hi - firstly, with regard to star transformation, do you have the correct indexes? B*tree indexes will not work for this. Secondly, it is possible that the optimizer is making the correct decision: you do not have very many rows. Thirdly, are you sure that your partitioning strategy is appropriate? You aren't using the range predicate at all in your query. Lastly, do you, in fact, have a problem at all? How long does the query take?
Re: Partition Pruning for a [message #442737 is a reply to message #442725] Wed, 10 February 2010 05:42 Go to previous messageGo to next message
Kevin Meade
Messages: 2098
Registered: December 1999
Location: Connecticut USA
Senior Member
I am no expert with partitioning. However I am with the OP. I would have expected Oracle to prune but this all depends upon if I understand how partitioning works or not. Let us walk through an example.

Assume we have years 2008,2009 and events 1,2,3. If we partition by year/event then I would have expected oracle to create six physical partitions as follows:

2008,1
2008,2
2008,3
2009,1
2009,2
2009,3


The data above (the six pairs of year,event) is metadata. It tells us the partitions that we expect to be created given our partitioning scheme and the data we load. In my understanding of partitioning, Oracle does indeed create and use this kind of metadata. Here now is the expectation...

If I want event 1, then I would have expected Oracle to scan the partition metadata and figure out that it only needs to look at partitions ((2008,1),(2009,1)). This requires a scan of the metadata because we are not supplying years, but this is trivial and the result is the pruning away of two thirds of the table before we start our query and so such a metadata scan is well worth it.

Maybe the issue is with the fact that you have used HASH partitioning for your secondary partitioning scheme. HASHING msy have changed the set of partitions created and/or how Oracle uses the partition metadata to figure things out. Try using list partitioning as your secondary partitioning strategy (assuming this is allowed in your version or Oracle).

Good luck, Kevin
Re: Partition Pruning for a [message #442746 is a reply to message #442737] Wed, 10 February 2010 06:11 Go to previous messageGo to next message
rleishman
Messages: 3724
Registered: October 2005
Location: Melbourne, Australia
Senior Member
If I understand you correctly, you have MY_FACT range partitioned on YEAR_ID and hash subpartitioned on EVENT_ID.

Theoretically, Oracle should be able to apply transitivity to the predicates:
where My_Fact.event_id = Event_Master.event_id
and (Event_Master.event_id = 999999 or Event_Master.event_id = 1111111);

to come up with:
where My_Fact.event_id = 999999 or My_Fact.event_id = 1111111);

Now once that transitive rule is applied, I would expect a HASH PARTITION ITERATOR to pick up the max 2 subpartitions per range partition.

But it's not doing that. Maybe it is the transitivity. Maybe it is the OR.

You need to run some tests. But first make sure of one thing: Are the data types of MY_FACT.EVENT_ID and EVENT_MASTER.EVENT_ID both NUMERIC? If not, your integer comparison or the join will cause casting, which could mess up the transitivity.

Lets try a few things. Tell us if these Prune the subpartitions.

select * 
from My_Fact
where My_Fact.event_id = 999999;

select * 
from My_Fact
where My_Fact.event_id = 999999 or My_Fact.event_id = 1111111);

select * 
from My_Fact, Event_Master
where My_Fact.event_id = Event_Master.event_id
and My_Fact.event_id = 999999;

select * 
from My_Fact, Event_Master
where My_Fact.event_id = Event_Master.event_id
and My_Fact.event_id = 999999 or My_Fact.event_id = 1111111;

select * 
from My_Fact, Event_Master
where My_Fact.event_id = Event_Master.event_id
and Event_Master.event_id = 999999;


Ross Leishman
Re: Partition Pruning for fact table [message #442833 is a reply to message #442737] Wed, 10 February 2010 21:41 Go to previous messageGo to next message
summoner
Messages: 44
Registered: March 2009
Member
Thank you for everyone. Razz
We rebuild the Event Master table and correct data type
Now the query can use correct partition pruning for all the queries shown above

Now our system needs to face the actual environment. Our DW is used for the tool "Business Objects" which allow user to drag and drop any fields from the DW.

The Event_Master table has two important fields "event_id" and "previous_event_id" and user always want to compare the revenue between current event and previous event. Somehow the query may be shown as follows

SELECT *
FROM   my_fact,
       event_master,
       YEAR
WHERE  my_fact.event_id = event_master.event_id
       AND my_fact.year_id = YEAR.year_id
       AND (event_master.event_id = 999999
             OR event_master.event_id IN (SELECT event_master.previous_event_id
                                          FROM   event_master
                                          WHERE  ((event_master.event_id) = 999999))); 



The system failed to do partition pruning in this case. However, we could solve it by rewriting the query as follows

SELECT *
FROM   my_fact,
       event_master,
       YEAR
WHERE  my_fact.event_id = event_master.event_id
       AND my_fact.year_id = YEAR.year_id
       AND (event_master.event_id = 999999)
UNION ALL
SELECT *
FROM   my_fact,
       event_master,
       YEAR
WHERE  my_fact.event_id = event_master.event_id
       AND my_fact.year_id = YEAR.year_id
       AND (event_master.event_id IN (SELECT event_master.previous_event_id
                                      FROM   event_master
                                      WHERE  ((event_master.event_id) = 999999))); 


Edited by BlackSwan to provide formatted "code" with correct tags

But the fact is that user could create any query they want. So rewriting a query is not a good solution in this case. Could anyone help us to solve the problem in more generic way? Thanks all


===========================
Edited
===========================
Thank you BlackSwan for formatting the code.

Now I am checking whether the "Query Rewrite" feature provided by Oracle can help me to solve the case. As I check some documents, they suggest us to create materialize view with "enable query rewrite" option. Therefore, I tried the following materialized view

CREATE MATERIALIZED VIEW "EVENT_PREV_MV"
ENABLE QUERY REWRITE
AS SELECT event_id AS src_evt_id,
          event_id AS evt_id
   FROM   event_master
   UNION ALL
   SELECT event_id,
          previous_event_id
   FROM   event_master
   WHERE  previous_event_id IS NOT NULL; 


When I tried the above query again, the query failed to rewrite and I think it is because the query for the MV are different.

[Updated on: Thu, 11 February 2010 02:14]

Report message to a moderator

Re: Performance Pruning on fact table [message #442914 is a reply to message #442725] Thu, 11 February 2010 05:14 Go to previous messageGo to next message
Kevin Meade
Messages: 2098
Registered: December 1999
Location: Connecticut USA
Senior Member
I could hazard a guess or two but there are just too many variables not accounted for.

For example, is event_id unique on the event_master table?

If so then can't you just code:

SELECT *
FROM   my_fact,
       event_master,
       YEAR
WHERE  my_fact.event_id = event_master.event_id
       AND my_fact.year_id = YEAR.year_id
       AND (event_master.event_id = 999999 or event_master.previous_event_id = 999999)


and not bother with the nested select?


Kevin
Re: Performance Pruning on fact table [message #442982 is a reply to message #442914] Thu, 11 February 2010 19:17 Go to previous messageGo to next message
summoner
Messages: 44
Registered: March 2009
Member
Kevin Meade wrote on Thu, 11 February 2010 05:14
I could hazard a guess or two but there are just too many variables not accounted for.

For example, is event_id unique on the event_master table?

If so then can't you just code:

SELECT *
FROM   my_fact,
       event_master,
       YEAR
WHERE  my_fact.event_id = event_master.event_id
       AND my_fact.year_id = YEAR.year_id
       AND (event_master.event_id = 999999 or event_master.previous_event_id = 999999)


and not bother with the nested select?


Kevin


The event_id is unique on the Event_Master table.
We cannot retrieve correct result from your query. Let's say the previous event of event #999999 is 77777.

Then what we need to do is to create a subquery to retrieve the previous event id (i.e 77777) for event 999999 from the event_master table. In this case, the query should be as follows

SELECT *
FROM   my_fact,
       event_master,
       YEAR
WHERE  my_fact.event_id = event_master.event_id
       AND my_fact.year_id = YEAR.year_id
       AND (event_master.event_id = 999999 or event_master.event_id = 77777)
Re: Performance Pruning on fact table [message #442984 is a reply to message #442982] Thu, 11 February 2010 20:13 Go to previous messageGo to next message
BlackSwan
Messages: 24904
Registered: January 2009
Senior Member
It would be helpful if you provided DDL (CREATE TABLE ...) for tables involved.
It would be helpful if you provided DML (INSERT INTO ...) for test data.
It would be helpful if you provided expected/desired results & a detailed explanation how & why the test data gets transformed or organized.
Re: Performance Pruning on fact table [message #443074 is a reply to message #442725] Fri, 12 February 2010 07:48 Go to previous message
Kevin Meade
Messages: 2098
Registered: December 1999
Location: Connecticut USA
Senior Member
So you are basically doing a hierarchical query one level deep.

Previous_event_id is on the table too. Thus the row where event_id = 999999 has the previous_event_id on it already.

Maybe you should try a join to a second copy of the table.

SELECT *
FROM   my_fact,
       event_master,
       event_master previous_event_master,
       YEAR
WHERE  my_fact.event_id = event_master.event_id
       AND my_fact.year_id = YEAR.year_id
       AND event_master.event_id = 999999
       AND event_master.previous_event_id = previous_event_master.event_id(+)
/


You will have both current and previous events on the same result row. If you must have them on two rows, you can do a pivot query.

select   my_fact.*
       , decode(rowduper.column_value,1,a[event_master].event_id,2,a.[previous_event_master].event_id) event_id
       , ...
       , year.*
from (
       SELECT *
       FROM   my_fact,
              event_master,
              event_master previous_event_master,
              YEAR
       WHERE  my_fact.event_id = event_master.event_id
              AND my_fact.year_id = YEAR.year_id
              AND event_master.event_id = 999999
              AND event_master.previous_event_id = previous_event_master.event_id(+)
     ) a
    ,(
       select level column_value
       from dual
       connect by level <= 2
     ) rowduper
/


The above code won't work as is because of obvious syntax errors. You will have to pay attention to your columns names etc. in your inner query by naming them well so you can seperate them in the outer query. But you can work those details out.

Maybe I am not understanding your need but this looks workable to me. You of course will also have to figure out if it will still do the partition pruning you need.

Kevin
Previous Topic: Index is not used after running dbms_stats.gather_table_stats
Next Topic: High enqueue waits in statspack - lock contention on Primary key index showing 98% of db wait time
Goto Forum:
  


Current Time: Sun Sep 25 11:10:20 CDT 2016

Total time taken to generate the page: 0.29991 seconds