Re: Optimizer estimated cardinality very low
Date: Wed, 7 Oct 2020 20:03:11 -0400
Message-ID: <CADX5pQVH7VRz7xJmB+TrR5rtFCWwqQvAeVU0shrNY1pq1fYFpA_at_mail.gmail.com>
Hello,
I have added the column groups to match both sides of the join. The plans
and estimations did not change, july/2020 still show the same issue and
other months remain with accurate join estimates.
I kept other column groups in place, below is the details of the matching
ones:
("ORDR_RQST_NUMBER","ORDR_RQST_DATE") => SYS_STU4S$4GK1UDMC2M8$MI#W6QS_
Table level:
*TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE HISTOGRAM* ORDER_REQST ORDR_RQST_DATE 2696 64650101010101 78780715010101 HYBRID ORDER_DETL ORDR_RQST_DATE 2696 64650101010101 78780715010101 HYBRID ORDER_REQST SYS_STU4S$4GK1UDMC2M8$MI#W6QS_ 2717362176 C51C4B230743 CA132D442D076309543753 NONE ORDER_DETL SYS_STU4S$4GK1UDMC2M8$MI#W6QS_ 2717302784 C51C4B230743 CA132D442D076309543753 NONE ORDER_REQST ORDR_RQST_NUMBER 271971328030303030313130393039343230303232 39393336353130343037313332383830 NONE
ORDER_DETL ORDR_RQST_NUMBER 271966208030303030313130393039343230303232 39393336353130343037313332383830 NONE At partition level (july/2020):
*TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE HISTOGRAM* ORDER_REQST ORDR_RQST_DATE 3 78780707010101 78780715010101 FREQUENCY ORDER_DETL ORDR_RQST_DATE 3 78780707010101 78780715010101 FREQUENCY ORDER_REQST SYS_STU4S$4GK1UDMC2M8$MI#W6QS_ 26636288 C70326053D3B303E CA132D442C492532540249 NONE ORDER_DETL SYS_STU4S$4GK1UDMC2M8$MI#W6QS_ 26636288 C70326053D3B303E CA132D442C492532540249 NONE ORDER_REQST ORDR_RQST_NUMBER 26895392 31393137353034343332303930303030 32303230303030303030373430303032 NONE ORDER_DETL ORDR_RQST_NUMBER 2698444831393137353034343332303930303030 32303230303030303030373430303032 NONE The data in ORDR_RQST_DATE is 1 distinct value per week, yearly is about 50 distinct dates, there are a couple of weeks there is no data loaded (loads are done once a week as well).
Thanks in advance for any additional feedback you could provide.
On Wed, Oct 7, 2020 at 12:50 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:
> Daniel,
>
> Thanks for the response.
> My hypothesis (which may only be correct for some versions of Oracle) is
> the with the >= and < setup the optimizer will recognize that it must use
> partition-level stats.
> Then, with the matching column groups, it should use the column group
> stats to do the arithmetic.
>
> At present I think it might be using table-level stats, and comparing the
> table-level column group of two-columns on one table with the product of
> the two separate columns on the other table. (I have to admit I don't know
> why this would get you a reasonable estimate in the June partitions/
>
> Regards
> Jonathan Lewis
>
>
>
> <http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> Virus-free.
> www.avg.com
> <http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>
> <#m_-3128375331552044632_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>
> On Wed, 7 Oct 2020 at 17:26, Daniel Coello <coello.daniel_at_gmail.com>
> wrote:
>
>> Thank you Jonathan and Lothar.
>>
>> I have verified recommendation* a) *since values are date only. For this
>> scenario it didn't change the estimation (I'm making a note about the
>> "strictly less than" approach as it will be beneficial for standard
>> practice with this type of data).
>>
>> For suggestion *b *I am creating the recommended column group to match
>> both sides. There are no column groups or indexes in the detail table and
>> statistics at the partition/subpartition level are reasonably accurate for
>> the July month.
>>
>> I have checked DBA_PART_COL_STATISTICS and DBA_TAB_COL_STATISTICS for
>> low_value and high value in both joining columns and they match the values
>> for corresponding columns in the join.
>>
>> Thanks again for the feedback, I will post results once the column group
>> with the 2 columns is created and stats collected.
>>
>>
>> On Wed, Oct 7, 2020 at 5:00 AM Jonathan Lewis <jlewisoracle_at_gmail.com>
>> wrote:
>>
>>>
>>> I should have paid more attention to the last section of the email.
>>>
>>> Oracle cannot be using the column groups "properly" to estimate the join
>>> selectivity because the column groups don't match at both ends of the join.
>>> However it can use the 2-column column group at one end of the join, which
>>> would then tend to mean the product of the two column selectivities would
>>> be used at the other end, with a sanity check against the total number of
>>> rows in the table/partition/subpartition (depending on the level at which
>>> the query was running).
>>>
>>> Is there already a two-column column group (or index) on the detail
>>> table as well as the three-column one that you showed us?
>>> Are the partition-level stats for the column groups on the July
>>> partitions/subpartitions reasonable accurate?
>>>
>>> Although I'm quite interested in what the optimizer is doing with this
>>> query, if I were in your position I would make the changes I've suggested
>>> because those are the correct "generic" changes to give the optimizer the
>>> best possible chance of doing the right thing (and then I'd worry about why
>>> it was going wrong if that didn't help).
>>>
>>> Regards
>>> Jonathan Lewis
>>>
>>>
>>>
>>>
>>
>> --
>> Daniel Coello Villacis
>>
>>
-- Daniel Coello Villacis -- http://www.freelists.org/webpage/oracle-lReceived on Thu Oct 08 2020 - 02:03:11 CEST