Re: Parallel Query Performance Issues

From: Goti <aryan.goti_at_gmail.com>
Date: Mon, 13 Jun 2022 19:43:27 +0530
Message-ID: <CAOzfMuqgW6eA6P_hPDycSgCNGA0xt0HVxV6HuO4GszBGu2AgnQ_at_mail.gmail.com>



HI Laurentiu,

Thanks for the response. DO you want me to add the below hints to the existing outline and execute the query?

 PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH HASH)

      PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OPF"_at_"SEL$5" HASH HASH)

Thanks,

Goti

On Mon, Jun 13, 2022 at 6:54 PM Laurentiu Oprea <laurentiu.oprea06_at_gmail.com> wrote:

> what is the outcome if you add the next hints:
>
> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH HASH)
> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OPF"_at_"SEL$5" HASH HASH)
>
> În lun., 13 iun. 2022 la 13:02, Goti <aryan.goti_at_gmail.com> a scris:
>
>> Thanks Andy and Jonathan.
>>
>> I did change _parallel_broadcast_enabled to TRUE to have "PX BROADCAST in
>> the plan. But still it doesn't improve the response time of the SQL. Can
>> you please help me to identify why the step 38 actual rows shows 495M
>> whereas Oracle estimates it to be 1 row. Below are the gist details.
>>
>>
>> https://gist.github.com/aryangoti/ec2804a7b832a7fe606ec0bf6a0681b7
>>
>> Thanks,
>>
>> Goti
>>
>>
>> On Thu, Jun 9, 2022 at 8:15 PM Andy Sayer <andysayer_at_gmail.com> wrote:
>>
>>> Just quick thoughts - replace the distincts with group by, this might
>>> allow group by placement to happen for you.
>>>
>>> The inner distinct doesn’t seem to be executed as a distinct, there
>>> might be clues in the outline if it’s decided that it only need wants to do
>>> a sort.
>>>
>>> I’ll have a closer look when I can
>>>
>>> Thanks,
>>> Andy
>>>
>>> On Thu, 9 Jun 2022 at 15:39, Goti <aryan.goti_at_gmail.com> wrote:
>>>
>>>> Thanks Jonathan for the quick response!
>>>>
>>>> I tried for the first 2 workarounds and that didn't work as expected. I
>>>> will work on the 3rd and 4th action plan and update here.
>>>>
>>>> Thanks,
>>>>
>>>> Goti
>>>>
>>>>
>>>> On Thu, Jun 9, 2022 at 5:41 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> The two queries may return the same size result, but the 2019 report
>>>>> generates and aggregates roughly 12 times as much data as the 2018 report.
>>>>> Check the "Actual Rows" figures - the 2018 report hits 3M rows (and 3M
>>>>> execs of the subsequent table probes) while the 2919 report hits 39M
>>>>> rows/execs - and that's where a lot of time goes on CPU.
>>>>>
>>>>> Strangely (almost) all the data is passed to one PX server (at
>>>>> operation 13/14, I think) that blows it up through segement NL joins to get
>>>>> most of the 39M rows that have to be "buffer sorted" (i.e. buffered, but
>>>>> not actually sorted) which is where the temp space and I/O time goes.
>>>>>
>>>>> Possible workarounds
>>>>> - MAYBE if you tried parallel 7 rather than 8 the hash disrtibution
>>>>> at operation MIGHT be better balanced;
>>>>> - MAYBE if you set "_gby_hash_aggregation_enabled" to false and got a
>>>>> SORT UNIQUE instead of a hash unique the distribution would work better.
>>>>> - if you get the outline information for the plan you should be able
>>>>> to find the pq_distribute hint controls the distribution at operation 14
>>>>> and change it from a hash distribution to a round-robin - this will
>>>>> probably introduce a 2nd layer of aggregation/uniqueness, but two small,
>>>>> shared stages may well do better than one very large operation.
>>>>> - can you rewrite the query to eliminate duplication earlier. This may
>>>>> require you to include inline non-mergeable views: ideally you want to
>>>>> avoid generating 39M rows at any point and then executing 39M join steps as
>>>>> that will still account for a lot of your time.
>>>>>
>>>>>
>>>>> Regards
>>>>> Jonathan Lewis
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, 9 Jun 2022 at 12:15, Goti <aryan.goti_at_gmail.com> wrote:
>>>>>
>>>>>> Environment : 11.2.0.4 database running on Linux.
>>>>>>
>>>>>> Need help to understand parallel query performance issues. Below are
>>>>>> the query details and its associated plans. The 2018_query does execute in
>>>>>> 24 seconds and returns about 2.5K rows. The 2019_query is also expected to
>>>>>> process almost the same number of rows however it consumes a lot of TEMP
>>>>>> space and finally fails. The 2019_query without parallel completes in 45
>>>>>> minutes (Just by removing the parallel hint). The only difference between
>>>>>> both the queries is related to the predicate "opclf.year_number =
>>>>>> to_number('YYYY')". The stats are up to date for the tables are partitions.
>>>>>>
>>>>>>
>>>>>> 2019_query:
>>>>>> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>>>>>>
>>>>>> 2018_query:
>>>>>> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>>>>>>
>>>>>> Stats and other details:
>>>>>> https://gist.github.com/aryangoti/a3797424ce0cb4fd87e194c05ad099b6
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Goti
>>>>>>
>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jun 13 2022 - 16:13:27 CEST

Original text of this message