Re: Optimizer estimated cardinality very low
Date: Tue, 13 Oct 2020 17:29:57 -0400
Message-ID: <CADX5pQXYELH6WfRrihAaZU0gFc=VA6+pTEvgqcJPvV6XSroCGA_at_mail.gmail.com>
On Fri, Oct 9, 2020 at 7:03 AM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:
>
> Daniel,
>
> I point I'd completely overlooked when I made the comment about matching
> column groups: column group stats are not used if any of the underlying
> columns has a histogram and the column group itself does not. In your July
> stats ordr_rqst_date has a histogram and the column groups don't, so their
> stats won't be used.
>
> A few other points about the stats:
> The table has '01-01-0001' recorded as the low value for the
> ordr_rqst_date - that extreme value MIGHT have an impact on some queries.
> The high value is 15th July 2020 for table and July partition.
>
> The ordr_rqst_number is stored as a varchar2 (or char), not as a number,
> although the values stored do look like numbers. This can have a serious
> impact on the optimizer's arithmetic, but since you're only using equality
> on this column in this query it's probably irrelevant to this case.
>
> It looks a little odd that the high value ordr_rqst_number for the table
> ( 9936510407132880 ) is much bigger than the value for July (
> 2020000000740002 ) - again that may be irrelevant in this case, and it may
> be something that you expect to see anyway (i.e. no correlation between
> data and number).
>
> Finally, you've said that there's one date per week in each partition -
> are the loads/dates supposed to be on the same day of each week or can they
> move around. I ask this because the July data has a low of 7th July and a
> high of 15th July, and 3 distinct values. This doesn't look consistent with
> 3 weeks on the same day of the week, so I was wondering if this was normal,
> or whether it was an indication that something a little unusual had
> happened during the data loads.
>
> ---------------
>
> It would be useful to have the june stats to compare with the July stats
> in case any significant difference shows up.
>
> ----------------------
>
> Sorry about the error in the previous post which basically said:
> if they match then is probably X but if they match then it's probably Y -
> one of those should have had a negative.
>
> My thinking was as follows:
>
> if specifying the partitions (with no date predicate) gives you the same
> estimates as supplying date predicates then the optimizer has *probably *recognized
> perfectly correctly that your dates are all about one specific partition
> and will be doing the same arithmetic in both cases -- and that would mean
> there's something about the statistics that causing a problem.
>
> if the two variations give you different estimates that would *suggest*
> that the optimizer has *NOT* (emphasised just to make sure one of my
> statements had the appropriate negative) used a code path that is 100%
> about just one known partition, and therefore there may be an error in the
> optimizer code.
>
> As it is you got the same estimates in both cases - which means it's
> probably something about the stats.
>
> Having said that there is a quirky little difference between the two sets
> of predicate information you posted when you compared "<= 31st July" with
> "< 1st Aug". They both have the same ACCESS and FILTER predicates (apart
> from the change in date value, of course) at exactly the same operation -
> but the query using 1st Aug didn't offload the filter on the date on the
> ordr_rqst_detail table (viz: the plan didn't report a STORAGE predicate.)
>
> Regards
> Jonathan Lewis
>
>
>
-- Daniel Coello Villacis -- http://www.freelists.org/webpage/oracle-lReceived on Tue Oct 13 2020 - 23:29:57 CEST