Re: Parallel Query Performance Issues

From: Goti <aryan.goti_at_gmail.com>
Date: Tue, 14 Jun 2022 14:49:43 +0530
Message-ID: <CAOzfMupHQfgL6oS9yppaZ_xAOP5bwXD3ieNSco8Brap_n_0Upw_at_mail.gmail.com>



Thanks Jonathan , Laurentiu and Mark for your inputs.

_at_Laurentiu Oprea <laurentiu.oprea06_at_gmail.com> : Please find the SQL plan post adding the hints what you have suggested below.

https://gist.github.com/aryangoti/888adb4095d4bcea5b68a9c2f82f9c22

As Jonathan suggested, adding no_merge hint did the trick and now the SQL is just executing fine.
_at_Jonathan Lewis <jlewisoracle_at_gmail.com> : Please find the requested details.

2018 Bad: https://gist.github.com/aryangoti/4546973ec9f7b3b1d1d60a39fe784165 2018 Good:
https://gist.github.com/aryangoti/03bd03b67969a13f159a4473cf653b1c 2019 Bad: https://gist.github.com/aryangoti/dadf3866633f0d150d8b37961b735b1f 2019 Good:
https://gist.github.com/aryangoti/91b39d4826e63e056c8865593e2754ec

Thanks,

Goti

On Tue, Jun 14, 2022 at 12:20 AM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

>
> Could you also post (gist) a copy of the "normal" execution plans (2018,
> 2019, new 2019) pulled from memort so that we can see the PQ distribution
> columns (P->P, S->P, etc.) Including the outline information
> (format=>'+outline') would also be helpful.
>
> Regards
> Jonathan Lewis
>
>
> On Mon, 13 Jun 2022 at 15:59, Goti <aryan.goti_at_gmail.com> wrote:
>
>> Hi Laurentiu,
>>
>> Thanks again!.
>>
>> The SQL is still running slow and consumes more than 100GB of temp..
>>
>> https://gist.github.com/aryangoti/f2d0f6b8408a924fbf25673eb3929732
>>
>>
>> Thanks,
>>
>> Goti
>>
>>
>> On Mon, Jun 13, 2022 at 7:49 PM Laurentiu Oprea <
>> laurentiu.oprea06_at_gmail.com> wrote:
>>
>>> at this stage I would say let's just add them as hints something like:
>>>
>>> SELECT
>>> /*+ PARALLEL(8) PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH HASH) PQ_DISTRIBUTE(@"SEL$B62753A3"
>>> "OPF"_at_"SEL$5" HASH HASH) */
>>> DISTINCT ....
>>> from (..)
>>>
>>> and then update the outcome similar with previous run
>>>
>>> Thanks
>>>
>>> În lun., 13 iun. 2022 la 17:13, Goti <aryan.goti_at_gmail.com> a scris:
>>>
>>>> HI Laurentiu,
>>>>
>>>> Thanks for the response. DO you want me to add the below hints to the
>>>> existing outline and execute the query?
>>>>
>>>> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH HASH)
>>>> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OPF"_at_"SEL$5" HASH HASH)
>>>>
>>>> Thanks,
>>>>
>>>> Goti
>>>>
>>>>
>>>> On Mon, Jun 13, 2022 at 6:54 PM Laurentiu Oprea <
>>>> laurentiu.oprea06_at_gmail.com> wrote:
>>>>
>>>>> what is the outcome if you add the next hints:
>>>>>
>>>>> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH HASH)
>>>>> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OPF"_at_"SEL$5" HASH HASH)
>>>>>
>>>>> În lun., 13 iun. 2022 la 13:02, Goti <aryan.goti_at_gmail.com> a scris:
>>>>>
>>>>>> Thanks Andy and Jonathan.
>>>>>>
>>>>>> I did change _parallel_broadcast_enabled to TRUE to have "PX
>>>>>> BROADCAST in the plan. But still it doesn't improve the response time of
>>>>>> the SQL. Can you please help me to identify why the step 38 actual rows
>>>>>> shows 495M whereas Oracle estimates it to be 1 row. Below are the gist
>>>>>> details.
>>>>>>
>>>>>>
>>>>>> https://gist.github.com/aryangoti/ec2804a7b832a7fe606ec0bf6a0681b7
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Goti
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 9, 2022 at 8:15 PM Andy Sayer <andysayer_at_gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Just quick thoughts - replace the distincts with group by, this
>>>>>>> might allow group by placement to happen for you.
>>>>>>>
>>>>>>> The inner distinct doesn’t seem to be executed as a distinct, there
>>>>>>> might be clues in the outline if it’s decided that it only need wants to do
>>>>>>> a sort.
>>>>>>>
>>>>>>> I’ll have a closer look when I can
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Andy
>>>>>>>
>>>>>>> On Thu, 9 Jun 2022 at 15:39, Goti <aryan.goti_at_gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks Jonathan for the quick response!
>>>>>>>>
>>>>>>>> I tried for the first 2 workarounds and that didn't work as
>>>>>>>> expected. I will work on the 3rd and 4th action plan and update here.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Goti
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 9, 2022 at 5:41 PM Jonathan Lewis <
>>>>>>>> jlewisoracle_at_gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> The two queries may return the same size result, but the 2019
>>>>>>>>> report generates and aggregates roughly 12 times as much data as the 2018
>>>>>>>>> report. Check the "Actual Rows" figures - the 2018 report hits 3M rows (and
>>>>>>>>> 3M execs of the subsequent table probes) while the 2919 report hits 39M
>>>>>>>>> rows/execs - and that's where a lot of time goes on CPU.
>>>>>>>>>
>>>>>>>>> Strangely (almost) all the data is passed to one PX server (at
>>>>>>>>> operation 13/14, I think) that blows it up through segement NL joins to get
>>>>>>>>> most of the 39M rows that have to be "buffer sorted" (i.e. buffered, but
>>>>>>>>> not actually sorted) which is where the temp space and I/O time goes.
>>>>>>>>>
>>>>>>>>> Possible workarounds
>>>>>>>>> - MAYBE if you tried parallel 7 rather than 8 the hash
>>>>>>>>> disrtibution at operation MIGHT be better balanced;
>>>>>>>>> - MAYBE if you set "_gby_hash_aggregation_enabled" to false and
>>>>>>>>> got a SORT UNIQUE instead of a hash unique the distribution would work
>>>>>>>>> better.
>>>>>>>>> - if you get the outline information for the plan you should be
>>>>>>>>> able to find the pq_distribute hint controls the distribution at operation
>>>>>>>>> 14 and change it from a hash distribution to a round-robin - this will
>>>>>>>>> probably introduce a 2nd layer of aggregation/uniqueness, but two small,
>>>>>>>>> shared stages may well do better than one very large operation.
>>>>>>>>> - can you rewrite the query to eliminate duplication earlier. This
>>>>>>>>> may require you to include inline non-mergeable views: ideally you want to
>>>>>>>>> avoid generating 39M rows at any point and then executing 39M join steps as
>>>>>>>>> that will still account for a lot of your time.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Jonathan Lewis
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, 9 Jun 2022 at 12:15, Goti <aryan.goti_at_gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Environment : 11.2.0.4 database running on Linux.
>>>>>>>>>>
>>>>>>>>>> Need help to understand parallel query performance issues. Below
>>>>>>>>>> are the query details and its associated plans. The 2018_query does execute
>>>>>>>>>> in 24 seconds and returns about 2.5K rows. The 2019_query is also expected
>>>>>>>>>> to process almost the same number of rows however it consumes a lot of TEMP
>>>>>>>>>> space and finally fails. The 2019_query without parallel completes in 45
>>>>>>>>>> minutes (Just by removing the parallel hint). The only difference between
>>>>>>>>>> both the queries is related to the predicate "opclf.year_number =
>>>>>>>>>> to_number('YYYY')". The stats are up to date for the tables are partitions.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2019_query:
>>>>>>>>>> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>>>>>>>>>>
>>>>>>>>>> 2018_query:
>>>>>>>>>> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>>>>>>>>>>
>>>>>>>>>> Stats and other details:
>>>>>>>>>> https://gist.github.com/aryangoti/a3797424ce0cb4fd87e194c05ad099b6
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Goti
>>>>>>>>>>
>>>>>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jun 14 2022 - 11:19:43 CEST

Original text of this message