RE: Statistics Problem on partitioned table

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Wed, 2 Oct 2013 13:13:02 -0400
Message-ID: <032801cebf92$a7f16a70$f7d43f50$_at_rsiz.com>



This version survived the list formatting sufficiently that I can see that it is an execution plan with estimates and actuals. That would be good enough if it is a complete execution. I do not understand how it can be a complete execution since the actuals should be 6.1 million rows (not 550K). The count I suggested will not have to go to the table to get a result, so it should be cheaper to test.  

With no histogram then, you've got about 49 million rows, and a little over 700,000 distinct values. Since 4900/700 is 70, the estimate of 69 seems pretty doggone good.  

The confusing thing about your distinct cnt query that generated 56 rows is that it was distinct cnt, not distinct values.  

The number of rows in the test table matches the distinct values, which seems correct.  

I'm curious what the largest few values other than the 6 million count for the distinct value of zero was. That would tell us the maximum popularity level apart from value zero, but that is just a curiosity.  

It seems to me that about one eighth of your rows have the value zero, the rest of your distinct values have a pretty flat distribution (although without seeing the high missing distinct cnt below the 6 million and change we can't be sure.) So that histogram probably would help the optimizer report better estimates, but I doubt it would generate a more effective plan in this case since there is no other reasonable filter or access method.  

I suppose depending on hardware and row width is it possible that full scans of partitions would be faster than the indirect fetch via the local index. I suppose if you plugged a sufficiently bad cluster factor in for you index you might be able to test that plan. Possibly the histogram would tip the plan in favor of scanning.  

I suppose you could generate a union all for only the partitions having some zeros. In theory it could use the local index when that was effective and scan when it is not, per partition. I have not tested that.  

Is that sort of what you're trying to figure out, or were you just curious at the difference between estimates and actuals? I think that is just based on the flat distribution presumption of distinct values.  

mwf  

From: Mohamed Houri [mailto:mohamed.houri_at_gmail.com] Sent: Wednesday, October 02, 2013 8:19 AM To: Mark W. Farnham
Cc: Jonathan Lewis; ORACLE-L
Subject: Re: Statistics Problem on partitioned table  

Ok Mark I will request what you have asked.  

I am sorry  

Mohamed Houri  

2013/10/2 Mohamed Houri <mohamed.houri_at_gmail.com>

Mark  

I haven't posted the execution plan you've requested because I did it in my first e-mail  

select

    a

   ,b

   ,c

   ,per_ind

from XXX_PER_YYY

where per_ind = 0;  



| Id | Operation | Name | Starts | E-Rows |
A-Rows |



| 0 | SELECT STATEMENT | | 1 |
|550K |

| 1 | PARTITION RANGE ALL | | 1 | 69
|550K |

| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| XXX_PER_YYY | 111 | 69
|550K |

|* 3 | INDEX RANGE SCAN | XXX_PER_IND | 111 | 69
|550K |



Predicate Information (identified by operation id):


 

   3 - access("PER_IND"=0)  

It is not exactly what you've requested but was this not sufficient?    

Best regards  

Mohamed    

2013/10/2 Mark W. Farnham <mwf_at_rsiz.com>  

I fail to understand why the --+ gather_plan_statistics execution of the simple count I requested is not forthcoming.

Now it does leave out the table access for the non-index columns, but it should tell us a great deal about what the optimizer thinks is the cheapest way to get the relevant rowids from all the partitions and the plan should show what the CBO has estimated along with the actuals.

Fragmentary revelation of the underlying facts (obscuring, for example, that a view was involved) does not make it easier for folks to help you explain what you're asking about.

Please try to make it easier.

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mohamed Houri

Sent: Wednesday, October 02, 2013 7:39 AM To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Statistics Problem on partitioned table

Jonathan,
I am sorry I may have not been clear but there are in fact 721,699 and that is what I showed above

 *select per_ind, count(1) cnt from XXX_PER_YYY group by per_ind;*

721,699 rows .
<snip>  

-- 
Bien Respectueusement
Mohamed Houri





 

-- 
Bien Respectueusement
Mohamed Houri



--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 02 2013 - 19:13:02 CEST

Original text of this message