Re: Optimizer estimated cardinality very low

From: Daniel Coello <coello.daniel_at_gmail.com>
Date: Wed, 7 Oct 2020 20:03:11 -0400
Message-ID: <CADX5pQVH7VRz7xJmB+TrR5rtFCWwqQvAeVU0shrNY1pq1fYFpA_at_mail.gmail.com>

Hello,

I have added the column groups to match both sides of the join. The plans and estimations did not change, july/2020 still show the same issue and other months remain with accurate join estimates. I kept other column groups in place, below is the details of the matching ones:
("ORDR_RQST_NUMBER","ORDR_RQST_DATE") => SYS_STU4S$4GK1UDMC2M8$MI#W6QS_ Table level:

*TABLE_NAME      COLUMN_NAME                      NUM_DISTINCT    LOW_VALUE
                        HIGH_VALUE                     HISTOGRAM*
ORDER_REQST    ORDR_RQST_DATE                    2696
64650101010101                      78780715010101                    HYBRID
ORDER_DETL     ORDR_RQST_DATE                    2696
64650101010101                      78780715010101                    HYBRID
ORDER_REQST    SYS_STU4S$4GK1UDMC2M8$MI#W6QS_    2717362176
 C51C4B230743                        CA132D442D076309543753
NONE
ORDER_DETL     SYS_STU4S$4GK1UDMC2M8$MI#W6QS_    2717302784
 C51C4B230743                        CA132D442D076309543753
NONE
ORDER_REQST    ORDR_RQST_NUMBER                  2719713280

30303030313130393039343230303232 39393336353130343037313332383830 NONE

ORDER_DETL     ORDR_RQST_NUMBER                  2719662080

30303030313130393039343230303232 39393336353130343037313332383830 NONE At partition level (july/2020):

*TABLE_NAME      COLUMN_NAME                       NUM_DISTINCT
 LOW_VALUE                           HIGH_VALUE
HISTOGRAM*
ORDER_REQST     ORDR_RQST_DATE                    3
78780707010101                      78780715010101                 FREQUENCY
ORDER_DETL      ORDR_RQST_DATE                    3
78780707010101                      78780715010101                 FREQUENCY
ORDER_REQST     SYS_STU4S$4GK1UDMC2M8$MI#W6QS_    26636288
C70326053D3B303E                    CA132D442C492532540249              NONE
ORDER_DETL      SYS_STU4S$4GK1UDMC2M8$MI#W6QS_    26636288
C70326053D3B303E                    CA132D442C492532540249              NONE
ORDER_REQST     ORDR_RQST_NUMBER                  26895392
31393137353034343332303930303030    32303230303030303030373430303032    NONE
ORDER_DETL      ORDR_RQST_NUMBER                  26984448

31393137353034343332303930303030 32303230303030303030373430303032 NONE The data in ORDR_RQST_DATE is 1 distinct value per week, yearly is about 50 distinct dates, there are a couple of weeks there is no data loaded (loads are done once a week as well).

Thanks in advance for any additional feedback you could provide.

On Wed, Oct 7, 2020 at 12:50 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

> Daniel,
>
> Thanks for the response.
> My hypothesis (which may only be correct for some versions of Oracle) is
> the with the >= and < setup the optimizer will recognize that it must use
> partition-level stats.
> Then, with the matching column groups, it should use the column group
> stats to do the arithmetic.
>
> At present I think it might be using table-level stats, and comparing the
> table-level column group of two-columns on one table with the product of
> the two separate columns on the other table. (I have to admit I don't know
> why this would get you a reasonable estimate in the June partitions/
>
> Regards
> Jonathan Lewis
>
>
>
> <http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> Virus-free.
> www.avg.com
> <http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>
> <#m_-3128375331552044632_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>
> On Wed, 7 Oct 2020 at 17:26, Daniel Coello <coello.daniel_at_gmail.com>
> wrote:
>
>> Thank you Jonathan and Lothar.
>>
>> I have verified recommendation* a) *since values are date only. For this
>> scenario it didn't change the estimation (I'm making a note about the
>> "strictly less than" approach as it will be beneficial for standard
>> practice with this type of data).
>>
>> For suggestion *b *I am creating the recommended column group to match
>> both sides. There are no column groups or indexes in the detail table and
>> statistics at the partition/subpartition level are reasonably accurate for
>> the July month.
>>
>> I have checked DBA_PART_COL_STATISTICS and DBA_TAB_COL_STATISTICS for
>> low_value and high value in both joining columns and they match the values
>> for corresponding columns in the join.
>>
>> Thanks again for the feedback, I will post results once the column group
>> with the 2 columns is created and stats collected.
>>
>>
>> On Wed, Oct 7, 2020 at 5:00 AM Jonathan Lewis <jlewisoracle_at_gmail.com>
>> wrote:
>>
>>>
>>> I should have paid more attention to the last section of the email.
>>>
>>> Oracle cannot be using the column groups "properly" to estimate the join
>>> selectivity because the column groups don't match at both ends of the join.
>>> However it can use the 2-column column group at one end of the join, which
>>> would then tend to mean the product of the two column selectivities would
>>> be used at the other end, with a sanity check against the total number of
>>> rows in the table/partition/subpartition (depending on the level at which
>>> the query was running).
>>>
>>> Is there already a two-column column group (or index) on the detail
>>> table as well as the three-column one that you showed us?
>>> Are the partition-level stats for the column groups on the July
>>> partitions/subpartitions reasonable accurate?
>>>
>>> Although I'm quite interested in what the optimizer is doing with this
>>> query, if I were in your position I would make the changes I've suggested
>>> because those are the correct "generic" changes to give the optimizer the
>>> best possible chance of doing the right thing (and then I'd worry about why
>>> it was going wrong if that didn't help).
>>>
>>> Regards
>>> Jonathan Lewis
>>>
>>>
>>>
>>>
>>
>> --
>> Daniel Coello Villacis
>>
>>

-- 
Daniel Coello Villacis

--
http://www.freelists.org/webpage/oracle-l

Received on Thu Oct 08 2020 - 02:03:11 CEST

This message: [ Message body ]
Next message: dimensional.dba_at_comcast.net: "RE: How to find out when a redo log file was added"
Previous message: Moovarkku Mudhalvan: "RE: How to find out when a redo log file was added"
In reply to: Jonathan Lewis: "Re: Optimizer estimated cardinality very low"
In reply to Jonathan Lewis: "Re: Optimizer estimated cardinality very low"
Next in thread: Jonathan Lewis: "Re: Optimizer estimated cardinality very low"
Reply: Jonathan Lewis: "Re: Optimizer estimated cardinality very low"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message