Re: Parallel Query Performance Issues

From: Goti <aryan.goti_at_gmail.com>
Date: Mon, 13 Jun 2022 21:55:01 +0530
Message-ID: <CAOzfMuoPZQr_sXX2+U6OgKSEqHf=2wWOKpvZJ+-=_5r+F+Mx0g_at_mail.gmail.com>

Hi Mark,

SQL> SQL> SQL> SQL> SQL> select count(distinct FIRM.firm_id) from report_engine.firm FIRM where FIRM.ultimate_parent_firm_id = to_number('252094');

COUNT(DISTINCTFIRM.FIRM_ID)

Thanks,

Goti

On Mon, Jun 13, 2022 at 9:53 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:

> Just curious, what does:
>
>
>
> select count(distinct FIRM.firm_id) from report_engine.firm FIRM where
> FIRM.ultimate_parent_firm_id = to_number('252094');
>
>
>
> give you. (I hope I didn’t make a typo, and you could probably just use
> the number, but I was just cutting and pasting).
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Goti
> *Sent:* Monday, June 13, 2022 11:00 AM
> *To:* Laurentiu Oprea
> *Cc:* ORACLE-L
> *Subject:* Re: Parallel Query Performance Issues
>
>
>
> Hi Laurentiu,
>
>
>
> Thanks again!.
>
>
>
> The SQL is still running slow and consumes more than 100GB of temp..
>
>
>
> https://gist.github.com/aryangoti/f2d0f6b8408a924fbf25673eb3929732
>
>
>
>
>
> Thanks,
>
>
>
> Goti
>
>
>
>
>
> On Mon, Jun 13, 2022 at 7:49 PM Laurentiu Oprea <
> laurentiu.oprea06_at_gmail.com> wrote:
>
> at this stage I would say let's just add them as hints something like:
>
>
>
> SELECT
> /*+ PARALLEL(8) PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH
> HASH) PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OPF"_at_"SEL$5" HASH HASH) */
>
> DISTINCT ....
>
> from (..)
>
>
>
> and then update the outcome similar with previous run
>
>
>
> Thanks
>
>
>
> În lun., 13 iun. 2022 la 17:13, Goti <aryan.goti_at_gmail.com> a scris:
>
> HI Laurentiu,
>
>
>
> Thanks for the response. DO you want me to add the below hints to the
> existing outline and execute the query?
>
>
>
> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH HASH)
> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OPF"_at_"SEL$5" HASH HASH)
>
>
>
> Thanks,
>
>
>
> Goti
>
>
>
>
>
> On Mon, Jun 13, 2022 at 6:54 PM Laurentiu Oprea <
> laurentiu.oprea06_at_gmail.com> wrote:
>
> what is the outcome if you add the next hints:
>
>
>
> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OP"_at_"SEL$1" HASH HASH)
> PQ_DISTRIBUTE(_at_"SEL$B62753A3" "OPF"_at_"SEL$5" HASH HASH)
>
>
>
> În lun., 13 iun. 2022 la 13:02, Goti <aryan.goti_at_gmail.com> a scris:
>
> Thanks Andy and Jonathan.
>
>
>
> I did change _parallel_broadcast_enabled to TRUE to have "PX BROADCAST in
> the plan. But still it doesn't improve the response time of the SQL. Can
> you please help me to identify why the step 38 actual rows shows 495M
> whereas Oracle estimates it to be 1 row. Below are the gist details.
>
>
>
>
>
> https://gist.github.com/aryangoti/ec2804a7b832a7fe606ec0bf6a0681b7
>
>
>
> Thanks,
>
>
>
> Goti
>
>
>
>
>
> On Thu, Jun 9, 2022 at 8:15 PM Andy Sayer <andysayer_at_gmail.com> wrote:
>
> Just quick thoughts - replace the distincts with group by, this might
> allow group by placement to happen for you.
>
>
>
> The inner distinct doesn’t seem to be executed as a distinct, there might
> be clues in the outline if it’s decided that it only need wants to do a
> sort.
>
>
>
> I’ll have a closer look when I can
>
>
>
> Thanks,
>
> Andy
>
>
>
> On Thu, 9 Jun 2022 at 15:39, Goti <aryan.goti_at_gmail.com> wrote:
>
> Thanks Jonathan for the quick response!
>
>
>
> I tried for the first 2 workarounds and that didn't work as expected. I
> will work on the 3rd and 4th action plan and update here.
>
>
>
> Thanks,
>
>
>
> Goti
>
>
>
>
>
> On Thu, Jun 9, 2022 at 5:41 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>
>
> The two queries may return the same size result, but the 2019 report
> generates and aggregates roughly 12 times as much data as the 2018 report.
> Check the "Actual Rows" figures - the 2018 report hits 3M rows (and 3M
> execs of the subsequent table probes) while the 2919 report hits 39M
> rows/execs - and that's where a lot of time goes on CPU.
>
>
>
> Strangely (almost) all the data is passed to one PX server (at operation
> 13/14, I think) that blows it up through segement NL joins to get most of
> the 39M rows that have to be "buffer sorted" (i.e. buffered, but not
> actually sorted) which is where the temp space and I/O time goes.
>
>
>
> Possible workarounds
>
> - MAYBE if you tried parallel 7 rather than 8 the hash disrtibution at
> operation MIGHT be better balanced;
>
> - MAYBE if you set "_gby_hash_aggregation_enabled" to false and got a SORT
> UNIQUE instead of a hash unique the distribution would work better.
>
> - if you get the outline information for the plan you should be able to
> find the pq_distribute hint controls the distribution at operation 14 and
> change it from a hash distribution to a round-robin - this will probably
> introduce a 2nd layer of aggregation/uniqueness, but two small, shared
> stages may well do better than one very large operation.
>
> - can you rewrite the query to eliminate duplication earlier. This may
> require you to include inline non-mergeable views: ideally you want to
> avoid generating 39M rows at any point and then executing 39M join steps as
> that will still account for a lot of your time.
>
>
>
>
>
> Regards
>
> Jonathan Lewis
>
>
>
>
>
>
>
>
>
> On Thu, 9 Jun 2022 at 12:15, Goti <aryan.goti_at_gmail.com> wrote:
>
> Environment : 11.2.0.4 database running on Linux.
>
>
>
> Need help to understand parallel query performance issues. Below are the
> query details and its associated plans. The 2018_query does execute in 24
> seconds and returns about 2.5K rows. The 2019_query is also expected to
> process almost the same number of rows however it consumes a lot of TEMP
> space and finally fails. The 2019_query without parallel completes in 45
> minutes (Just by removing the parallel hint). The only difference between
> both the queries is related to the predicate "opclf.year_number =
> to_number('YYYY')". The stats are up to date for the tables are partitions.
>
>
>
>
>
> 2019_query:
> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>
>
>
> 2018_query:
> https://gist.github.com/aryangoti/a7704a8075f118f7d942e49acee1900d
>
>
>
> Stats and other details:
> https://gist.github.com/aryangoti/a3797424ce0cb4fd87e194c05ad099b6
>
>
>
> Thanks,
>
>
>
> Goti
>
>

--
http://www.freelists.org/webpage/oracle-l

Received on Mon Jun 13 2022 - 18:25:01 CEST

This message: [ Message body ]
Next message: Mark W. Farnham: "RE: Parallel Query Performance Issues"
In reply to Mark W. Farnham: "RE: Parallel Query Performance Issues"
Next in thread: Mark W. Farnham: "RE: Parallel Query Performance Issues"
Reply: Mark W. Farnham: "RE: Parallel Query Performance Issues"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message