Re: Query Performance issue

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Wed, 30 Dec 2020 13:04:57 +0000
Message-ID: <CAGtsp8m51bV9SUE6Lo8d_GE5Tc086No8DS+V2SboewiwG8LSVA_at_mail.gmail.com>



I've changed my mind about needing to rewrite this as a two-pass query, collecting rowids as a minimal first pass and then joining back. I've tried to model the query, but obviously, since you haven't given us the information that identifies exactly which columns in which table map to which column aliases in your predicates, I may have done it wrong. Also, knowing which columns are declared NOT NULL might make a difference. Having given that warning: following Sayan's comments, the query can be rewritten as

select ...
from cox
left join bos on ...
left join cs on ...

(Obviously you could do this with two right joins, as you tried to do in another post - the SQL Monitor output you produced clearly gave the wrong answer because it returned more rows, and even if it hadn't, the predicate section lost at least one predicate (the ct3='YYY' ...), though perhaps that means there's a NOT NULL constraint on the BID1 column.)

If you can do that rewrite then the "cox left join bos" could be rewritten as a UNION ALL of two joins and could run efficiently IF there are suitable indexes on the COX and BOS tables, specifically on the columns identified in your first SQL sample by the predicates "A1.BI0 = :b1" and "A1.COl1 = :b2".
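
A minimal sketch of what "suitable indexes" might look like - assuming (and it is only an assumption) that the A1 alias maps to the COX table:

  create index cox_bi0_ix  on cox(bi0);   -- supports A1.BI0 = :b1
  create index cox_col1_ix on cox(col1);  -- supports A1.COl1 = :b2

plus, on BOS, an index on whichever column(s) it joins to COX on.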

In fact, as a test, if you take your original query and change the FULL OUTER JOIN to LEFT JOIN in both cases, then run two different versions: one with

WHERE A1.BI0 = :b1

and one with

WHERE (     A1.COl1 = :b2
        AND A1.I_DT IS NULL
        AND (    (A1.BI0 IS NOT NULL AND A1.CT1 = 'XXX')
              OR (A1.BI0 IS NULL     AND A1.CT1 = 'YYY')
            )
      )

You may be able to show that the main OR is the real performance problem, and prove that it's worth the effort of rewriting as a UNION ALL. (You have to add a predicate to the second part to eliminate rows already returned by the first - this would be *and lnnvl(a1.bi0 = :b1)*, unless a1.bi0 is declared NOT NULL, in which case *and a1.bi0 != :b1* would be sufficient.)
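
To make the target shape concrete, here is a minimal sketch of that UNION ALL - the select lists and join conditions are abbreviated, since the real column mappings haven't been posted:

  select ...
  from   cox A1
  left join bos on ...
  left join cs  on ...
  where  A1.BI0 = :b1
  union all
  select ...
  from   cox A1
  left join bos on ...
  left join cs  on ...
  where  A1.COl1 = :b2
  and    A1.I_DT is null
  and    (    (A1.BI0 is not null and A1.CT1 = 'XXX')
           or (A1.BI0 is null     and A1.CT1 = 'YYY')
         )
  and    lnnvl(A1.BI0 = :b1)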

Regards
Jonathan Lewis

On Tue, 29 Dec 2020 at 14:17, Lok P <loknath.73_at_gmail.com> wrote:

> Thank You so much Jonathan for the awesome details. It really helped.
>
> Regarding the cause of the issue, we found from dba_hist_pgastat that the
> "total PGA allocated" stat was hovering around the same range before/after
> the issue, i.e. ~4-5GB. But we found from dba_hist_active_sess_history that
> for the sql_id/session the max pga_allocated dropped significantly, i.e.
> from .4GB to .005GB, at the same time as the tempspace_allocated increased
> from zero to ~500MB+. And considering that in gv$parameter we have the
> value of pga_aggregate_target set to ~15GB, workarea_size_policy AUTO, and
> hash_area_size set to ~1.5GB, we suspect that when the sessions were being
> spawned from the application the workarea_size_policy was somehow set to
> MANUAL (maybe through an alter session statement), which is why it was able
> to use more memory earlier for building the HASH table; but recently, due
> to some change, that is no longer happening, and the default setup being in
> play is causing more temp spill and more time for the query execution. And
> I hope no other changes (say, a memory-related change at the server level,
> etc.) could make it behave like this in this scenario.
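>
> (For reference - a minimal sketch of the kind of session-level override we
> suspect; the hash_area_size value is purely illustrative:)
>
>   alter session set workarea_size_policy = manual;
>   alter session set hash_area_size = 1610612736;  -- ~1.5GB, specified in bytes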
>
> With respect to rewriting the query, I think I tried rewriting it through
> a right outer join as Sayan suggested and shared the results in this email
> trail, but in that attempt I saw the query still took ~5 minutes+, because
> the filter was still getting applied after the HASH JOIN rather than before
> the join, so the temp spill was still happening. I will try to see what is
> going wrong over there.
>
> Else I think the last option would be to project only the required ROWIDs
> from the full outer join and then fetch the columns from those in the final
> outer query, as Lothar suggested.
>
> Thanks and Regards
> Lok
>
> On Tue, Dec 29, 2020 at 6:29 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> I'm just catching up on some email from the last week, so a few comments.
>>
>> A full outer join between A and B means Oracle HAS to examine every row
>> in A to see whether or not it has a match in B, and every row in B to see
>> whether or not it has a match in A; that's why the historical plan for FOJ
>> was the equivalent of (A left join B) union all (B left join A where
>> A.join_column is null). I don't think anyone has written up a detailed
>> description of how the "native full outer join" works, but it's probably
>> something notionally simple like: flag every build row that has seen a
>> match, cache every B row that doesn't match, then report the (inner) join
>> rows as they appear, followed by the unflagged A rows, followed by the
>> cached B rows - which means it generally shouldn't do much more damage
>> than a standard hash join.
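>>
>> (Schematically, with hypothetical tables A and B joined on a single
>> column jc, that historical expansion is:)
>>
>>   select ... from A left join B on A.jc = B.jc
>>   union all
>>   select ... from B left join A on A.jc = B.jc
>>   where  A.jc is null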
>>
>> More important, then, is the impact of the second full outer join on the
>> optimizer. You have a WHERE clause that seems to eliminate virtually all
>> the rows from the join, but the optimizer hasn't been able to push those
>> predicates inside the FOJ to eliminate data early, and this may be a
>> current limitation of the optimizer. I agree with Sayan's analysis, by the
>> way, that if you've correctly associated the predicate columns with the
>> base table columns then the query could be rewritten as "cox left join cs
>> left join bos". The query you've shown, though, looks like the "unparsed
>> query" that you might get from a 10053 trace (all the An aliases and
>> double quotes), and when Oracle produces the unparsed query it doesn't
>> always show a completely legal, or exactly equivalent, statement.
>>
>> Another important point about a "small change in data" turning into a
>> "massive change in performance": you said that up to 50 concurrent
>> sessions might run this query. Maybe your big change in performance is
>> more down to other sessions exhausting the PGA, leaving sessions with a
>> much smaller hash area size allocation. Check v$sql_workarea_histogram
>> data from the AWR (or even the v$) for signs that even fairly small
>> workarea allocations are going one-pass or multi-pass. Check
>> v$sql_workarea to see which (other) statements are using very large
>> memory to go optimal, or have used a very large tempseg to go one-pass or
>> multi-pass. The latter view also has a column last_execution which can
>> capture the number of passes taken on the last execution of a statement
>> if it ran multi-pass.
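>>
>> (For example - a minimal first look, where the filter and ordering are
>> just illustrative:)
>>
>>   select low_optimal_size/1024 low_kb, (high_optimal_size + 1)/1024 high_kb,
>>          optimal_executions, onepass_executions, multipasses_executions
>>   from   v$sql_workarea_histogram
>>   where  onepass_executions + multipasses_executions > 0;
>>
>>   select sql_id, operation_type, last_execution,
>>          last_memory_used, last_tempseg_size, max_tempseg_size
>>   from   v$sql_workarea
>>   order by max_tempseg_size desc nulls last;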
>>
>>
>> Looking back at the SQL_Monitor output from the serial and parallel
>> execution, the thing to note is the difference between the writes and reads.
>>
>> You've got 3 tables with rowcounts in the order of 1M to 2M rows, so it's
>> not surprising that the number of bytes for the tablescans and joins is in
>> the order of several hundred million. And we can see that the hash joins
>> in the serial path have a max temp of 373M and 560M. However, the READ
>> bytes are 8GB and 147GB respectively, which tells you that both hash joins
>> must have gone multi-pass - and in a big way, perhaps 25 passes for one
>> and 250 passes for the other. The fact that the max mem is recorded as 2M
>> and 1M is also instructive: the optimizer seems to have allocated only a
>> very small memory for the hash table, which has resulted in the spill and
>> the large number of passes.
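>>
>> (As a rough sanity check on those pass counts: 8GB of reads against 373M
>> of temp is roughly 22 rescans of the spilled data, and 147GB against 560M
>> is roughly 270 - the same order of magnitude as the estimates above.)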
>>
>> Compare with the parallel stats - the PX detail and plan Max Temp are
>> consistent, showing much the same figures (386M and 578M) as the serial
>> path; but the READ bytes are only 2GB and 4GB - which suggests a much
>> smaller number of passes on a multi-pass read. In part this may be because
>> the memory reported is larger (9M), but that may be deceptive since it
>> might be 9M shared across 4 PX servers rather than per server.
>>
>>
>> Bottom line.
>>
>> a) If you've got lots of processes running then any brute-force path
>> could result in a massive variation in performance - it's possible that a
>> recent small increase in data could have tipped you into the critical
>> collision point (but it probably won't get much worse than this).
>>
>> b) Running parallel 4 is possibly a safe strategy for reducing the
>> run-time, provided you limit the number of sessions that can run the query
>> concurrently. Oracle allows for parallel query queueing to deal with this
>> scenario. What you don't want is 50 sessions running this query in parallel.
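>>
>> (One way to get that queueing behaviour, assuming 11gR2 or later - the
>> target value here is purely illustrative:)
>>
>>   alter system set parallel_degree_policy = auto;  -- enables parallel statement queueing
>>   alter system set parallel_servers_target = 16;   -- queue statements once 16 PX servers are busy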
>>
>> c) You need to check the logic of the query and rewrite it. There seems
>> to be a simple rewrite that should allow the later WHERE predicates to be
>> pushed once you've got rid of the (apparently) redundant full outer joins.
>>
>> Regards
>> Jonathan Lewis

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Dec 30 2020 - 14:04:57 CET
