Re: Query Performance issue
Date: Tue, 29 Dec 2020 12:58:26 +0000
Message-ID: <CAGtsp8kZ0Teo068o8js_bvF+248ZE_77u_Cr73y5-jwvLK14RQ_at_mail.gmail.com>
I'm just catching up on some email from the last week, so a few comments.
A full outer join between A and B means Oracle HAS to examine every row in
A to see whether or not it has a match in B, and every row in B to see
whether or not it has a match in A, that's why the historical plan for FOJ
was the equivalent of (A left join B) union all (B left join A where
A.join_column is null). I don't think anyone has written up a detailed
description of how the "native full outer join" works, but it's probably
something notionally simple like: flag every build row that has seen a
match, cache every B row that doesn't match; then report the (inner) join
rows as they appear followed by the unflagged A rows followed by the cached
B rows - which means it generally shouldn't do much more damage than a
standard hash join.
More important, then is the impact of the second full outer join on the
optimizer. You have a WHERE clause that seems to eliminate viirtually all
the rows from the join, but the optimizer hasn't been able to push those
predicates inside the FOJ to eliminate data early, and this may be a
current limitation of the optimizer. I agree with Sayan's analysis, by the
way, that if you've correctly associated the predicate columns with the
base table columns then the query could be rewritten as "cox left join cs
left join box". The query you've shown, though looks like the "Unparsed
query" that you might have got from a 10053 trace, (all the An aliases and
double quotes) and when Oracle produces the unparsed query it doesn't
always show a completely legal, or exactly equivalent statement.
Another important point about "small change in data" turns into "massive
change in performance" - you said that up to 50 concurrent sessions might
run this query. Maybe your big change in performance is more down to other
sessions exhausting the PGA, leaving sessions with a much smaller hash area
size allocation. Check v$sql_workarea_histogram data from the AWR (or even
v$) for signs the even fairly small workarea allocations are going one-pass
or multipass. Check v$sql_workarea to see which (other) statements are
using very large memory to go optimal, or have used very large tempseg to
go onepass or multipass. The latter view also has a column last_execution
which can capture the number of passes taken on the last execution of a
statement if it ran multi-pass.
Looking back at the SQL_Monitor output from the serial and parallel
execution, the thing to note is the difference between the writes and reads.
You've got 3 tables with rowcounts in the order of 1M to 2M rows, so it's
not surprising that the number of bytes for the tablescans and joins is in
the order of several hundred million. And we can see that the hash joins
in the serial path have a max temp of 373M and 560M. however the READ bytes
are 8GB and 147GB respectively, which tells you that both hash joins must
have gone multi-pass - and in a big way, perhaps 25 passes for one and 250
passes for the other. The fact that the max mem is recorded as 2M and 1M
is also instructive, the optimizer seems to have allocated only a very
small memory for the hash table, which has resulted in the spill and large
number of passes.
Compare with the parallel stats - the PX detail and plan Max Temp are
consistent, also showing the same (386M and 578M) as the serial path; but
the READ bytes is only 2GB and 4GB - which suggests a much smaller number
of passes on a multi-pass read. In part this may be because the memory
report is larger (9M), but that may be deceptive since it might be 9M
shared across 4 PX servers rather than per server.
Bottom line.
Regards
Jonathsan Lewis
--
http://www.freelists.org/webpage/oracle-l
Received on Tue Dec 29 2020 - 13:58:26 CET