Home » RDBMS Server » Performance Tuning » performance imapct on stats collection (Oracle 11gR2)
performance imapct on stats collection [message #612795] Thu, 24 April 2014 17:27 Go to next message
ravikanth_b
Messages: 25
Registered: November 2007
Location: Bay Area, CA
Junior Member
Hello experts!

Need your suggestion on this one.

We have several "big" tables in our data warehouse. After each ETL we gather stats on the tables. Each of the table has partition and sub-partition. The complete process of gathering stats is taking hours to finish.

currently, this is what we do for each table.

EXEC DBMS_STATS.GATHER_TABLE_STATS(<SCHEMA_NAME>,<TABLE_NAME>);


Do you recommend us doing the following as recommended at

https://blogs.oracle.com/optimizer/entry/maintaining_statistics_on_large_partitioned_tables



EXEC DBMS_STATS.SET_TABLE_PREFS(<SCHEMA_NAME>,<TABLE_NAME>,'INCREMENTAL','TRUE');

EXEC DBMS_STATS.GATHER_TABLE_STATS(<SCHEMA_NAME>,<TABLE_NAME>);
Re: performance imapct on stats collection [message #612796 is a reply to message #612795] Thu, 24 April 2014 17:53 Go to previous messageGo to next message
BlackSwan
Messages: 22911
Registered: January 2009
Senior Member
>We have several "big" tables in our data warehouse.
quantify "big".
at what size does table go from being normal size to being big?

>After each ETL we gather stats on the tables.
How many partitions & subpartitions existed before & after the ETL?
How many partitions & subpartitions are impacted by the ETL?
Re: performance imapct on stats collection [message #612798 is a reply to message #612796] Thu, 24 April 2014 18:10 Go to previous messageGo to next message
ravikanth_b
Messages: 25
Registered: November 2007
Location: Bay Area, CA
Junior Member
In reply to BlackSwan:

The application is part of a product offering.
The number of partitions depends on the clients installation and their data variance.

Typically, there could be 20 to 25 partitions/about 5 subpartitions and rows running into several millions in each partition with considerable column width.

For smaller customers it could be 5 to 10 partitions/about 2-3 subpartitons.
Re: performance imapct on stats collection [message #612799 is a reply to message #612798] Thu, 24 April 2014 18:18 Go to previous messageGo to next message
BlackSwan
Messages: 22911
Registered: January 2009
Senior Member
The action to be taken depends how much of the table data changes.
Re: performance imapct on stats collection [message #612800 is a reply to message #612799] Thu, 24 April 2014 18:32 Go to previous messageGo to next message
ravikanth_b
Messages: 25
Registered: November 2007
Location: Bay Area, CA
Junior Member
That is the hard part. We don't know customers data. Meaning, the product should be able to cater to least common denominator. I see no issues with smaller data sets. But some of our customers started complaining about longer processing times. Building stats is taking as much time as ETL, which is not acceptable for them.
Re: performance imapct on stats collection [message #612801 is a reply to message #612800] Thu, 24 April 2014 18:53 Go to previous messageGo to next message
rleishman
Messages: 3700
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Gathering stats immediately after ETL - every time - is a bit of a v8i approach. If you add whole new partitions in the ETL, then it wouldn't hurt to gather stats on just those partitions, but otherwise (in 10g and later) I would let Oracle's automatic stats gathering mechanisms decide when it is appropriate to gather.

By letting Oracle gather stats when it thinks it is necessary, you will minimise the objects being analyzed and the frequency at which they are analyzed.

http://docs.oracle.com/cd/E11882_01/server.112/e41573/stats.htm#PFGRF94716

http://docs.oracle.com/cd/E11882_01/server.112/e41573/stats.htm#i41282


Ross Leishman
Re: performance imapct on stats collection [message #612802 is a reply to message #612800] Thu, 24 April 2014 18:56 Go to previous messageGo to next message
BlackSwan
Messages: 22911
Registered: January 2009
Senior Member
That is the hard part. We only know what you post here which has consisted of big tables, millions of rows with considerable column width.

Who decided to partition the tables to start with and how did they decide the columns and partitioning strategy?
Do you realize that partition can actually make some SQL slower?

Until you can identify why any particular query is slow, you stand little chance to make the specific change to make it faster.
Where is time being spent when the query takes a long time to complete?
Re: performance imapct on stats collection [message #612803 is a reply to message #612801] Thu, 24 April 2014 19:37 Go to previous messageGo to next message
ravikanth_b
Messages: 25
Registered: November 2007
Location: Bay Area, CA
Junior Member
The reason we analyze after each ETL because, the aggregates and other reporting applications start querying this data as soon as ETL is done. We have see issue with these queries because of stale statistics. For example, partition A would have about 10k records for the first ETL, 2nd ETL might have a million records or more. This would cause huge impact on reporting queries.
Re: performance imapct on stats collection [message #612804 is a reply to message #612803] Thu, 24 April 2014 19:43 Go to previous message
BlackSwan
Messages: 22911
Registered: January 2009
Senior Member
Trade-offs

http://www.oracle.com/pls/db121/search?word=dynamic+sample
Previous Topic: Need to Understand Explain plan
Next Topic: ODI Performance issue with flat file
Goto Forum:
  


Current Time: Fri Oct 24 14:05:51 CDT 2014

Total time taken to generate the page: 0.11175 seconds