RE: Statistics Problem on partitioned table

From: TJ Kiernan <tkiernan_at_pti-nps.com>
Date: Tue, 1 Oct 2013 14:45:10 +0000
Message-ID: <196DB2D4BDE5804EAF3158CCC1C698BC09B1A72A_at_lopez.pti-nps.com>



This depends on your version, but you need more than just global stats given the wide range of record counts. If you're on or past 11.2, then incremental statistics would be a good option (particularly if most partitions don't change on a day-to-day basis). https://blogs.oracle.com/optimizer/entry/incremental_statistics_maintenance_what_statistics

The initial gather stats job takes a long time (as well as any subsequent table changes - add a field, re-gathers stats for all partitions), but I've been quite happy with the ongoing operations. I create a new partition every day, and then consolidate older partitions into monthly & quarterly partitions to reduce the overall time the full gather stats job takes (the time is multiplicative by the number of partitions and fields).

If you're on an older version, incremental stats may still be your best option, but Doug Burns' cautionary tale is recommended reading. http://oracledoug.com/serendipity/index.php?/archives/1590-Statistics-on-Partitioned-Tables-Contents.html

HTH,
T. J.
 

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mohamed Houri Sent: Tuesday, October 01, 2013 9:07 AM
To: ORACLE-L
Subject: Statistics Problem on partitioned table

Dear list
I have a performance problem that I have narrowed to a statistics problem where the CBO is not doing good estimations on a partitioned table

The culprit select looks like

select

    a

   ,b

   ,c

   ,per_ind

from XXX_PER_YYY

where per_ind = 0;

XXX_PER_YYY is a table range partitioned by a date. There are 1493 partitions.

AND there is an index XXX_PER_IND on (per_ind) local (note that it is local non prefixed)

The execution plan with Estimations and Actuals looks like


| Id | Operation                        | Name        | Starts | E-Rows |
A-Rows |
| 0  | SELECT STATEMENT                 |             | 1      |
|550K |

| 1 | PARTITION RANGE ALL | | 1 | 69 |550K |

| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| XXX_PER_YYY | 111 | 69 |550K |

|* 3 |  INDEX RANGE SCAN                | XXX_PER_IND | 111    | 69
|550K |

Predicate Information (identified by operation id):


    3 - access("PER_IND"=0)

And the execution plan showing the number of partition looks like


| Id  | Operation                          | Name        | Rows  | Bytes |
Pstart| Pstop |
|   0 | SELECT STATEMENT                   |             |    69 |  1173
|        |       |

|   1 |  PARTITION RANGE ALL               |             |    69 |  1173
| 1 | 1493 |

| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| XXX_PER_YYY | 69 | 1173 | 1 | 1493 |

|*  3 |    INDEX RANGE SCAN                | XXX_PER_IND |    69 |
| 1 | 1493 |

 Predicate Information (identified by operation id):


    3 - access("PER_IND"=0)

There is only one distinct value of per_ind (per_ind =0)

select per_ind, count(1) from XXX_PER_YYY group by per_ind;

*per_ind cout(1)*

0 6,118,184

And I have the following partitions num_rows distribution

758 partitions with num_rows = 0;

60 partitions with num_rows <= 5000

295 partitions with num_rows between 5000 and 10000

315 partitions with num_rows > 10,000 and num_rows <50,000

65 partitions with num_rows > 100,000;

Statistics are calculated at a global level

How to make the CBO having correct estimations and hence an optimal execution plan?

Thanks in advance

--
Bien Respectueusement
Mohamed Houri
www.hourim.wordpress.com


--
http://www.freelists.org/webpage/oracle-l


--
http://www.freelists.org/webpage/oracle-l
Received on Tue Oct 01 2013 - 16:45:10 CEST

Original text of this message