Re: Parallel Query Performance Issues

From: Andy Sayer <andysayer_at_gmail.com>
Date: Thu, 9 Jun 2022 15:45:42 +0100
Message-ID: <CACj1VR6jEp6dY9-T=Pc5=nX8C84YXo4Uou8_HyeF_yJYy+ke=Q_at_mail.gmail.com>



Just quick thoughts - replace the distincts with group by, this might allow group by placement to happen for you.

The inner distinct doesn’t seem to be executed as a distinct, there might be clues in the outline if it’s decided that it only need wants to do a sort.

I’ll have a closer look when I can

Thanks,
Andy

On Thu, 9 Jun 2022 at 15:39, Goti <aryan.goti_at_gmail.com> wrote:

> Thanks Jonathan for the quick response!
>
> I tried for the first 2 workarounds and that didn't work as expected. I
> will work on the 3rd and 4th action plan and update here.
>
> Thanks,
>
> Goti
>
>
> On Thu, Jun 9, 2022 at 5:41 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>>
>> The two queries may return the same size result, but the 2019 report
>> generates and aggregates roughly 12 times as much data as the 2018 report.
>> Check the "Actual Rows" figures - the 2018 report hits 3M rows (and 3M
>> execs of the subsequent table probes) while the 2919 report hits 39M
>> rows/execs - and that's where a lot of time goes on CPU.
>>
>> Strangely (almost) all the data is passed to one PX server (at operation
>> 13/14, I think) that blows it up through segement NL joins to get most of
>> the 39M rows that have to be "buffer sorted" (i.e. buffered, but not
>> actually sorted) which is where the temp space and I/O time goes.
>>
>> Possible workarounds
>> - MAYBE if you tried parallel 7 rather than 8 the hash disrtibution at
>> operation MIGHT be better balanced;
>> - MAYBE if you set "_gby_hash_aggregation_enabled" to false and got a
>> SORT UNIQUE instead of a hash unique the distribution would work better.
>> - if you get the outline information for the plan you should be able to
>> find the pq_distribute hint controls the distribution at operation 14 and
>> change it from a hash distribution to a round-robin - this will probably
>> introduce a 2nd layer of aggregation/uniqueness, but two small, shared
>> stages may well do better than one very large operation.
>> - can you rewrite the query to eliminate duplication earlier. This may
>> require you to include inline non-mergeable views: ideally you want to
>> avoid generating 39M rows at any point and then executing 39M join steps as
>> that will still account for a lot of your time.
>>
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>>
>> On Thu, 9 Jun 2022 at 12:15, Goti <aryan.goti_at_gmail.com> wrote:
>>
>>> Environment : 11.2.0.4 database running on Linux.
>>>
>>> Need help to understand parallel query performance issues. Below are the
>>> query details and its associated plans. The 2018_query does execute in 24
>>> seconds and returns about 2.5K rows. The 2019_query is also expected to
>>> process almost the same number of rows however it consumes a lot of TEMP
>>> space and finally fails. The 2019_query without parallel completes in 45
>>> minutes (Just by removing the parallel hint). The only difference between
>>> both the queries is related to the predicate "opclf.year_number =
>>> to_number('YYYY')". The stats are up to date for the tables are partitions.
>>>
>>>
>>> 2019_query:
>>> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>>>
>>> 2018_query:
>>> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>>>
>>> Stats and other details:
>>> https://gist.github.com/aryangoti/a3797424ce0cb4fd87e194c05ad099b6
>>>
>>> Thanks,
>>>
>>> Goti
>>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jun 09 2022 - 16:45:42 CEST

Original text of this message