Re: Nested loop cost looks too high on 19c

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Tue, 31 May 2022 23:47:03 +0100
Message-ID: <CAGtsp8kTYHujgpUYTxyZF2u6zizW5VR-MYfCdS810NwD68YSUQ_at_mail.gmail.com>



One other thing to check on gathering stats - what was the setting for the granularity parameter?
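
For example, a quick way to check the current setting (the owner name here is a placeholder for your schema):

select dbms_stats.get_prefs('GRANULARITY', '<owner>', 'TAB_PART') granularity   -- '<owner>' is a placeholder
from   dual;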

Regards
Jonathan Lewis

On Tue, 31 May 2022 at 23:33, Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

>
> If the partition stats for 11g and 19c look very similar (and you're not
> using incremental stats with synopses) then all it takes to get a
> significant change in the global num_distinct is a change in the algorithm
> that Oracle uses to derive table num_distinct from partition num_distinct.
> I don't know if this has happened, but you could create a small model and
> test the effects. Having 20% of your partitions empty may also change the
> algorithm, so include that in the model.
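>
> A minimal sketch of such a model (all names and row counts here are just
> placeholders; vary the degree of tx_id overlap between partitions, and leave
> one or more partitions empty, to mimic your data - gathering with
> granularity => 'PARTITION' matches my understanding of your setup):
>
> create table t_model (part_key number, tx_id number)
> partition by range (part_key) (
>     partition p1 values less than (2),
>     partition p2 values less than (3),
>     partition p3 values less than (4),
>     partition p4 values less than (5)   -- deliberately left empty
> );
>
> insert into t_model select 1, rownum         from dual connect by level <= 10000;
> insert into t_model select 2, rownum + 5000  from dual connect by level <= 10000;  -- 50% overlap with p1
> insert into t_model select 3, rownum + 20000 from dual connect by level <= 10000;  -- no overlap
> commit;
>
> exec dbms_stats.gather_table_stats(user, 'T_MODEL', granularity => 'PARTITION')
>
> select num_distinct, global_stats
> from   user_tab_col_statistics
> where  table_name = 'T_MODEL' and column_name = 'TX_ID';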
>
> Another detail to check before you build a model is the variation in
> low_value and high_value between the versions, and between partitions.
> Oracle tended to over-estimate the number of distinct values when combining
> partitions, which is why I wasn't too surprised to see a drop in the
> num_distinct for 19c; but the way in which the estimate is derived may depend
> heavily on how much the tx_id ranges overlap between partitions.
>
> (Obviously, if the low_value is pretty much the same and the high_value is
> pretty much the same in all partitions then the number of distinct values
> in the table is quite likely to be similar to the number of distinct values
> in a single partition; on the other hand, if the low_value of each partition
> is higher than the high_value of the previous partition then the global
> num_distinct is going to be more like the sum of the num_distinct across
> all the partitions. Your case may be somewhere between those two extremes,
> and it may be handled differently in the two versions.)
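>
> For a numeric column like tx_id you can decode low_value/high_value per
> partition with something like this (owner filter omitted for brevity):
>
> select dtp.partition_position, dpcs.partition_name,
>        utl_raw.cast_to_number(dpcs.low_value)  low_val,
>        utl_raw.cast_to_number(dpcs.high_value) high_val,
>        dpcs.num_distinct
> from   dba_part_col_statistics dpcs, dba_tab_partitions dtp
> where  dpcs.table_name    = 'TAB_PART'
> and    dpcs.column_name   = 'TX_ID'
> and    dtp.table_name     = dpcs.table_name
> and    dtp.partition_name = dpcs.partition_name
> order by dtp.partition_position;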
>
> Regards
> Jonathan Lewis
>
>
>
>
>
> On Tue, 31 May 2022 at 21:21, Pap <oracle.developer35_at_gmail.com> wrote:
>
>> I do not see any entry in sys.wri$_optstat_synopsis$ for that object,
>> which means the stats gathering is not happening incrementally. I have also
>> checked dba_tab_stat_prefs; the preferences are exactly the same in both
>> databases, with granularity set to PARTITION and estimate_percent set to
>> DBMS_STATS.AUTO_SAMPLE_SIZE. I see stale_percent set to '10' as a
>> table-level preference, but I believe that won't play a role here because
>> the global default is also 10%.
>>
>> I tried fetching sample_size, num_rows and num_distinct for column TX_ID
>> for each partition, side by side from both environments, using the query
>> below, and I am not seeing any obvious anomaly. I have pasted the results
>> at the link below. On 11.2, out of a total of ~267 partitions, ~210 were
>> non-empty, and on 19c, out of a total of ~266 partitions, ~211 were
>> non-empty. So in both databases ~55 partitions hold zero rows because they
>> are future partitions, and that should not cause this big a deviation
>> between the two databases. Am I missing anything here?
>>
>> https://gist.github.com/oracle9999/28e991d0069befcf7d4d77f5b8e0420a
>>
>> select dtp.table_name, dtp.partition_name, dtp.sample_size, dtp.num_rows,
>>        dpcs.num_distinct
>> from   dba_tab_partitions dtp, dba_part_col_statistics dpcs
>> where  dtp.table_name = 'TAB_PART'
>> and    dtp.table_name = dpcs.table_name
>> and    dtp.partition_name = dpcs.partition_name
>> and    dpcs.column_name = 'TX_ID'
>> and    dtp.table_owner = dpcs.owner
>> order by dtp.sample_size asc
>>
>>
>> On Wed, Jun 1, 2022 at 12:53 AM Jonathan Lewis <jlewisoracle_at_gmail.com>
>> wrote:
>>
>>>
>>> You are correct - that global num_distinct is driving the difference in
>>> the arithmetic.
>>>
>>> Your next step would be to check the sample_size reported by the two
>>> systems. One may have been using a small sample while the other was using
>>> the approximate NDV mechanism with a 100% sample. Then check whether you're
>>> using incremental statistics (and how it's configured) in both systems. If
>>> you're using incremental in 19c but not in 11g then the num_distinct in 11g
>>> would almost inevitably be larger than that in 19c, which would be using
>>> synopses to get a better global num_distinct. Then check num_rows and
>>> num_distinct partition by partition.
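>>>
>>> A minimal sketch of the first two checks (the owner name is a placeholder):
>>>
>>> select sample_size, num_distinct, density, last_analyzed
>>> from   dba_tab_col_statistics
>>> where  owner = '<owner>' and table_name = 'TAB_PART' and column_name = 'TX_ID';
>>>
>>> select dbms_stats.get_prefs('INCREMENTAL', '<owner>', 'TAB_PART') incremental
>>> from   dual;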
>>>
>>> Regards
>>> Jonathan Lewis
>>>
>>>
>>> On Tue, 31 May 2022 at 19:37, Pap <oracle.developer35_at_gmail.com> wrote:
>>>
>>>> Thank You Jonathan.
>>>>
>>>> Line 3 in the plan is the join between tables TAB_PART and TSFS.
>>>>
>>>> On 11.2 the estimate comes out as 3 (cardinality of TAB_PART) * 154
>>>> (cardinality of TSFS) = 462, which is close to the reported ~455.
>>>>
>>>> On 19c it is 176 (cardinality of TAB_PART) * 209 (cardinality of
>>>> TSFS) = 36784, which is close to the reported ~36692.
>>>>
>>>> So the key factor influencing the figure so drastically is the cardinality
>>>> estimate for TAB_PART, i.e. 3 in 11.2 vs 176 in 19c, which is determined by
>>>> the predicate "TAB_PART.TX_ID=TO_NUMBER(:B2)". TAB_PART is a range
>>>> partitioned table; stats are gathered on its individual partitions and
>>>> rolled up to the table level by Oracle itself. We don't use incremental
>>>> stats collection on this table yet, which is why the global_stats column
>>>> shows 'NO'.
>>>>
>>>> Comparing the table and column statistics between the 11.2 and 19c
>>>> databases below, cardinality = density * num_rows, so the difference in
>>>> num_distinct/density for this column at the global level is playing the
>>>> key role. I am sure the data pattern of column TX_ID is much the same in
>>>> both, so I was wondering why there is such a big difference in the
>>>> num_distinct/density calculated by the stats gathering process in the two
>>>> versions. Or is there really a difference in how num_distinct/density gets
>>>> rolled up to the global level from the individual partitions in these two
>>>> versions, so that moving to 19c may impact other queries too?
>>>>
>>>> ON 11.2:
>>>>
>>>> from dba_tables:
>>>>
>>>> TABLE_NAME   NUM_ROWS       BLOCKS
>>>> TAB_PART     68889040371    1132560197
>>>>
>>>> from dba_tab_col_statistics:
>>>>
>>>> TABLE_NAME  COLUMN_NAME  NUM_DISTINCT  DENSITY              NUM_NULLS  HISTOGRAM  GLOBAL_STATS  DENSITY*68889040371
>>>> TAB_PART    TX_ID        23151767844   4.3193245834968E-11  0          NONE       NO            2.97554125607964
>>>>
>>>> ON 19c:
>>>>
>>>> from dba_tables:
>>>>
>>>> TABLE_NAME   NUM_ROWS       BLOCKS
>>>> TAB_PART     64791566202    1055609782
>>>>
>>>> from dba_tab_col_statistics:
>>>>
>>>> TABLE_NAME  COLUMN_NAME  NUM_DISTINCT  DENSITY               NUM_NULLS  HISTOGRAM  GLOBAL_STATS  DENSITY*64791566202
>>>> TAB_PART    TX_ID        368368793     2.71467078374362E-9   0          NONE       NO            175.88777180156
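>>>>
>>>> For reference, the last column in each listing is just num_rows * density,
>>>> which can be pulled straight from the dictionary with something like this
>>>> (owner filter omitted for brevity):
>>>>
>>>> select t.num_rows, c.num_distinct, c.density,
>>>>        t.num_rows * c.density estimated_rows
>>>> from   dba_tables t, dba_tab_col_statistics c
>>>> where  t.owner = c.owner
>>>> and    t.table_name = c.table_name
>>>> and    t.table_name = 'TAB_PART'
>>>> and    c.column_name = 'TX_ID';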
>>>>
>>>>
>>>> On Tue, May 31, 2022 at 10:23 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>>>> wrote:
>>>>
>>>>> In the 11g plan:
>>>>> Operation 2 has a cost of 768K because operation 3 is telling it that
>>>>> it will call operation 10 455 times at a cost of 1683 each time.
>>>>> 455 * 1683 = 765,765, then add a bit for the line itself using CPU.
>>>>>
>>>>> In the 19c hinted plan:
>>>>> Operation 2 has a cost of 61M because operation 3 is telling it that
>>>>> it will call operation 10 36693 times at a cost of 1677 each time.
>>>>> 36693 * 1677 = 61,534,161
>>>>>
>>>>> Thanks to self-caching and other details the multiplication isn't
>>>>> always this close - but it's a good guideline in many cases.
>>>>>
>>>>> Your first step should be to work out why the cardinality estimate of
>>>>> line 3 is so much higher.
>>>>>
>>>>> Regards
>>>>> Jonathan Lewis
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 30 May 2022 at 16:11, Pap <oracle.developer35_at_gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello. While testing the behavior of queries on a 19.11.0.0.0 database,
>>>>>> I am seeing that a query which runs quickly on version 11.2.0.4 runs
>>>>>> longer on 19c with a different path. When I forced the outline of the
>>>>>> 11.2 path, the cost came out much higher in the 19c database. I was
>>>>>> expecting an increase, but not this much: it's 760K vs 61 million. When
>>>>>> I checked the plan, the cost of the nested loop operation at plan line
>>>>>> id 2 is very high on 19c. So I wanted to understand whether this is
>>>>>> expected behavior or whether I am hitting a bug here.
>>>>>>
>>>>>> Below is the SQL; I have added its execution plan for version 11.2, for
>>>>>> 19c, and for the forced 11.2 path on the 19c database, along with the
>>>>>> SQL monitor report for each of them, at the location below.
>>>>>>
>>>>>> https://gist.github.com/oracle9999/69651125d9a3942e8fb261669611e7aa
>>>>>>
>>>>>>
>>>>>> select *
>>>>>> FROM   TAB_PART TAB_PART, FEES fees
>>>>>> WHERE  fees.SB_ID IN (SELECT DISTINCT RS_ID FROM TSFS
>>>>>>                       WHERE OS_ID = TAB_PART.SB_ID)
>>>>>> AND    fees.B_STS = 'XXX'
>>>>>> AND    TAB_PART.SM_ID = fees.OB_ID
>>>>>> AND    TAB_PART.TX_ID = :b2;
>>>>>>
>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Jun 01 2022 - 00:47:03 CEST
