Re: Query Performance issue
Date: Thu, 31 Dec 2020 00:57:06 +0530
Message-ID: <CAKna9Va6dvM9m5qTThBusNLL9zu0wp=ucfdCz2fe0DZuh-hv6w_at_mail.gmail.com>
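
These figures come from v$sql_workarea, the view Jonathan mentions below; a
query along these lines reproduces the columns shown (a sketch, not
necessarily the exact statement used - the sizes are reported in MB by
dividing the byte values by 1024 twice):

select child_number, operation_type, operation_id, policy,
       estimated_optimal_size/1024/1024, estimated_onepass_size/1024/1024,
       last_memory_used/1024/1024, last_execution, total_executions,
       optimal_executions, onepass_executions, multipasses_executions,
       max_tempseg_size/1024/1024, last_tempseg_size/1024/1024
from   v$sql_workarea
where  sql_id = :sql_id;
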
With workarea_size_policy = AUTO (headers abbreviated; sizes in MB):

CHILD  OPERATION_TYPE  OP_ID  POLICY  EST_OPTIMAL  EST_ONEPASS  LAST_MEM     LAST_EXECUTION  TOTAL  OPTIMAL  ONEPASS  MULTIPASS  MAX_TEMP  LAST_TEMP
0      HASH-JOIN       2      AUTO    195.7695313  18.17675781  1.349609375  497 PASSES      1      0        0        1          534       534
0      SEGMENT SCAN    3      AUTO    1.000976563  1.000976563  0            OPTIMAL         0      0        0        0
0      HASH-JOIN       6      AUTO    182.4746094  17.73925781  2.169921875  26 PASSES       1      0        0        1          356       356
0      SEGMENT SCAN    7      AUTO    1.000976563  1.000976563  0            OPTIMAL         0      0        0        0
0      SEGMENT SCAN    8      AUTO    1.000976563  1.000976563  0            OPTIMAL         0      0        0        0

With workarea_size_policy = MANUAL:

CHILD  OPERATION_TYPE  OP_ID  POLICY  EST_OPTIMAL  EST_ONEPASS  LAST_MEM     LAST_EXECUTION  TOTAL  OPTIMAL  ONEPASS  MULTIPASS  MAX_TEMP  LAST_TEMP
0      HASH-JOIN       2      MANUAL  195.7695313  18.17675781  257.7373047  OPTIMAL         1      1        0        0
0      SEGMENT SCAN    3      MANUAL  1.000976563  1.000976563  0            OPTIMAL         0      0        0        0
0      HASH-JOIN       6      MANUAL  182.4746094  17.73925781  245.9316406  OPTIMAL         1      1        0        0
0      SEGMENT SCAN    7      MANUAL  1.000976563  1.000976563  0            OPTIMAL         0      0        0        0
0      SEGMENT SCAN    8      MANUAL  1.000976563  1.000976563  0            OPTIMAL         0      0        0        0

On Wed, Dec 30, 2020 at 9:52 PM Lok P <loknath.73_at_gmail.com> wrote:
> Thank you Jonathan.
>
> I think you are correct. I tried modifying the query as below, i.e. with
> a left outer join, but simply doing that, I saw the optimizer was still
> going for the FTS + hash join on the two tables BOS and COX. Then, as you
> suggested, I tried breaking it up using UNION ALL, with LNNVL added to the
> later part as below, and as I tested for one set of binds I am seeing the
> results match the original query (hopefully it will match for all
> types of binds without breaking the existing business logic). And again,
> though the table COX still goes for an FTS, the others were able to use the
> available indexes and finish in a second. Attached is the SQL Monitor for
> the same.
>
> Below I have replaced the aliases with the actual inline table names, and
> aligned the column names a bit, to make it clearer.
>
> SELECT ....
> FROM (SELECT ....
> FROM
> (SELECT ....
> FROM "USER1"."COX" COX --
> LEFT JOIN "USER1"."BOS" "BOS"
> ON ( "BOS"."BID" = "COX"."BID"
> AND "BOS"."OID" = TO_NUMBER ("COX"."OID"))
> LEFT JOIN "USER1"."CS" CS
> ON "COX"."ECID" = "CS"."ECID"
> )
> ) "A1"
> WHERE "COX"."BID" = :b1
> UNION ALL
> SELECT.......
> FROM (SELECT .....
> FROM
> (SELECT .....
> FROM "USER1"."COX" COX --
> LEFT JOIN "USER1"."BOS" "BOS"
> ON ( "BOS"."BID" = "COX"."BID"
> AND "BOS"."OID" = TO_NUMBER ("COX"."OID"))
> LEFT JOIN "USER1"."CS" CS
> ON "COX"."ECID" = "CS"."ECID"
> )
> ) "A1"
> WHERE
> "BOS"."BID" = :b2 and lnnvl("COX"."BID" = :b1) --BOS
> AND "A1"."I_DT" IS NULL --BOS
> AND ( "COX"."BID" IS NOT NULL
> AND "COX"."CT3" = 'XXX' --COX
> OR "COX"."BID" IS NULL AND "COX"."CT3" = 'YYY')
>
>
> On Wed, Dec 30, 2020 at 6:35 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> I've changed my mind about needing to rewrite this as a two-pass query,
>> collecting rowids as a minimal first pass and then joining back. I've
>> tried to model the query, but obviously, since you haven't given us the
>> information that identifies exactly which columns in which table map to
>> which column aliases in your predicates, I may have done it wrong. Also,
>> knowing which columns are declared NOT NULL might make a difference.
>> Having given that warning: following Sayan's comments, the query can be
>> rewritten as
>>
>> select
>> from cox
>> left join bos
>> left join cs
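>>
>> A fuller sketch of that shape, using the join columns from the original
>> query (the ON clauses here are my inference, not definitive):
>>
>> select ...
>> from cox
>> left join bos
>>   on bos.bid = cox.bid and bos.oid = to_number(cox.oid)
>> left join cs
>>   on cs.ecid = cox.ecid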
>>
>> (Obviously you could do this with two right joins, as you tried to do in
>> another post - but the SQL Monitor you produced clearly gave the wrong
>> answer, because it returned more rows; and even if it hadn't, the predicate
>> section lost at least one predicate - the ct3 = 'YYY' one - though perhaps
>> that means there's a NOT NULL constraint on the BID1 column.)
>>
>> If you can do that rewrite then the "cox left join bos" could be
>> rewritten as a UNION ALL of two joins, and could run efficiently IF there
>> are suitable indexes on the COX and BOS tables - specifically on the
>> columns identified in your first SQL sample by the predicates
>> "A1.BI0 = :b1" and "A1.COl1 = :b2".
>>
>> In fact, as a test, you could take your original query, change the FULL
>> OUTER JOIN to LEFT JOIN in both cases, then run two different versions,
>> one with
>> WHERE A1.BI0 = :b1
>> and one with
>> WHERE ( ( A1.COl1 = :b2 AND A1.I_DT IS NULL)
>> AND ( A1.BI0 IS NOT NULL AND A1.CT1 = 'XXX'
>> OR A1.BI0 IS NULL AND A1.CT1 = 'YYY'
>> )
>> )
>>
>> You may be able to show that the main OR is the real performance problem,
>> and prove that it's worth the effort of rewriting as a UNION ALL. (You have
>> to add a predicate to the second part to eliminate rows already returned by
>> the first - this would be *and lnnvl(a1.b10 = :b1)*, unless a1.b10 is
>> declared not null, in which case *"and a1.b10 != :b1"* would be
>> sufficient.)
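>>
>> (A minimal illustration of why - LNNVL returns TRUE when its argument
>> condition is FALSE or UNKNOWN, so, treating a1 as queryable and b10 as
>> nullable:
>>
>> select count(*) from a1 where lnnvl(a1.b10 = :b1);
>> -- counts rows where b10 != :b1, plus rows where b10 is null
>>
>> select count(*) from a1 where a1.b10 != :b1;
>> -- silently loses the b10-is-null rows, because the comparison is unknown
>> )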
>>
>> Regards
>> Jonathan Lewis
>>
>> On Tue, 29 Dec 2020 at 14:17, Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Thank You so much Jonathan for the awesome details. It really helped.
>>>
>>> Regarding the cause of the issue: we found from dba_hist_pgastat that the
>>> "total PGA allocated" stat was hovering around the same range before and
>>> after the issue, i.e. ~4-5GB. But we found from
>>> dba_hist_active_sess_history that for the sql_id/session the max
>>> pga_allocated dropped significantly, i.e. from ~.4GB to ~.005GB, at the
>>> same time that tempspace_allocated increased from zero to ~500MB+. And
>>> considering that in gv$parameter we have pga_aggregate_target set to
>>> ~15GB, workarea_size_policy AUTO, and hash_area_size set to ~1.5GB, we
>>> suspect that when the sessions were being spawned from the application,
>>> workarea_size_policy was somehow being set to MANUAL (maybe through an
>>> alter session statement), which is why the query was able to use more
>>> memory earlier for building the hash table; recently, due to some change,
>>> that is no longer happening, and the default setup being in play is
>>> causing more temp spill and longer query execution. And I hope no other
>>> change (say, a memory-related change at the server level) could make it
>>> behave like this in this scenario.
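>>>
>>> For illustration, the sort of statement we suspect the application may
>>> have been issuing at session spawn (a guess on our part, not something
>>> we have confirmed):
>>>
>>> alter session set workarea_size_policy = manual;
>>> -- with hash_area_size already at ~1.5GB, this would let each hash join
>>> -- build its table in memory instead of spilling under the AUTO policy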
>>>
>>> With respect to rewriting the query: I think I tried rewriting it with a
>>> right outer join as Sayan suggested and shared the results in this email
>>> trail, but in that case I saw the query still took 5+ minutes, because the
>>> filter was still getting applied after the hash join rather than before
>>> it, so the temp spill was still happening. I will try to see what is
>>> going wrong over there.
>>>
>>> Else I think the last option would be to project only the required ROWIDs
>>> from the full outer join and then fetch the columns for those rows in the
>>> final outer query, as Lothar suggested.
>>>
>>> Thanks and Regards
>>> Lok
>>>
>>> On Tue, Dec 29, 2020 at 6:29 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>>> wrote:
>>>
>>>> I'm just catching up on some email from the last week, so a few
>>>> comments.
>>>>
>>>> A full outer join between A and B means Oracle HAS to examine every row
>>>> in A to see whether or not it has a match in B, and every row in B to see
>>>> whether or not it has a match in A, that's why the historical plan for FOJ
>>>> was the equivalent of (A left join B) union all (B left join A where
>>>> A.join_column is null). I don't think anyone has written up a detailed
>>>> description of how the "native full outer join" works, but it's probably
>>>> something notionally simple like: flag every build row that has seen a
>>>> match, cache every B row that doesn't match; then report the (inner) join
>>>> rows as they appear followed by the unflagged A rows followed by the cached
>>>> B rows - which means it generally shouldn't do much more damage than a
>>>> standard hash join.
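>>>>
>>>> (A minimal sketch of that historical expansion, assuming generic tables
>>>> a and b with a single join column "id":
>>>>
>>>> select a.col1, b.col2 from a left join b on b.id = a.id
>>>> union all
>>>> select a.col1, b.col2 from b left join a on a.id = b.id
>>>> where a.id is null;
>>>> )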
>>>>
>>>> More important, then, is the impact of the second full outer join on the
>>>> optimizer. You have a WHERE clause that seems to eliminate virtually all
>>>> the rows from the join, but the optimizer hasn't been able to push those
>>>> predicates inside the FOJ to eliminate data early, and this may be a
>>>> current limitation of the optimizer. I agree with Sayan's analysis, by the
>>>> way, that if you've correctly associated the predicate columns with the
>>>> base table columns then the query could be rewritten as "cox left join cs
>>>> left join bos". The query you've shown, though, looks like the "unparsed
>>>> query" that you might get from a 10053 trace (all the An aliases and
>>>> double quotes), and when Oracle produces the unparsed query it doesn't
>>>> always show a completely legal, or exactly equivalent, statement.
>>>>
>>>> Another important point, about a "small change in data" turning into a
>>>> "massive change in performance": you said that up to 50 concurrent
>>>> sessions might run this query. Maybe your big change in performance is
>>>> more down to other sessions exhausting the PGA, leaving sessions with a
>>>> much smaller hash area size allocation. Check v$sql_workarea_histogram
>>>> data from the AWR (or even v$) for signs that even fairly small workarea
>>>> allocations are going one-pass or multipass. Check v$sql_workarea to see
>>>> which (other) statements are using very large memory to go optimal, or
>>>> have used very large tempsegs to go onepass or multipass. The latter view
>>>> also has a column, last_execution, which can capture the number of passes
>>>> taken on the last execution of a statement if it ran multi-pass.
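>>>>
>>>> Sketches of the sort of checks I mean (both are standard dynamic
>>>> performance views):
>>>>
>>>> select low_optimal_size/1024 low_kb,
>>>>        (high_optimal_size + 1)/1024 high_kb,
>>>>        optimal_executions, onepass_executions, multipasses_executions
>>>> from   v$sql_workarea_histogram
>>>> order  by low_optimal_size;
>>>>
>>>> select sql_id, operation_type, policy,
>>>>        last_memory_used/1024/1024 last_mem_mb,
>>>>        max_tempseg_size/1024/1024 max_temp_mb,
>>>>        last_execution
>>>> from   v$sql_workarea
>>>> order  by max_tempseg_size desc nulls last;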
>>>>
>>>>
>>>> Looking back at the SQL_Monitor output from the serial and parallel
>>>> execution, the thing to note is the difference between the writes and reads.
>>>>
>>>> You've got 3 tables with rowcounts in the order of 1M to 2M rows, so
>>>> it's not surprising that the number of bytes for the tablescans and joins
>>>> is in the order of several hundred million. And we can see that the hash
>>>> joins in the serial path have a max temp of 373M and 560M. However, the
>>>> READ bytes are 8GB and 147GB respectively, which tells you that both hash
>>>> joins must have gone multi-pass - and in a big way, perhaps 25 passes for
>>>> one and 250 passes for the other. The fact that the max mem is recorded
>>>> as 2M and 1M is also instructive: the optimizer seems to have allocated
>>>> only a very small memory for the hash table, which has resulted in the
>>>> spill and the large number of passes.
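>>>>
>>>> (Back-of-envelope: 8GB / 373M is roughly 21, and 147GB / 560M is roughly
>>>> 262 - assuming each pass re-reads about one temp-segment's worth of
>>>> data, which is where the "perhaps 25 and 250 passes" figures come from.)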
>>>>
>>>> Compare with the parallel stats - the PX detail and plan Max Temp are
>>>> consistent, showing much the same values (386M and 578M) as the serial
>>>> path; but the READ bytes are only 2GB and 4GB - which suggests a much
>>>> smaller number of passes on a multi-pass read. In part this may be
>>>> because the reported memory is larger (9M), but that may be deceptive
>>>> since it might be 9M shared across 4 PX servers rather than per server.
>>>>
>>>>
>>>> Bottom line.
>>>>
>>>> a) If you've got lots of processes running then any brute-force path
>>>> could result in a massive variation in performance - it's possible that a
>>>> recent small increase in data has tipped you into the critical collision
>>>> point (but it probably won't get much worse than this).
>>>>
>>>> b) Running parallel 4 is possibly a safe strategy for reducing the
>>>> run-time, provided you limit the number of sessions that can run the
>>>> query concurrently. Oracle allows for parallel query queueing to deal
>>>> with this scenario. What you don't want is 50 sessions running this query
>>>> in parallel.
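>>>>
>>>> (One way to get that queueing - assuming, on my part, that you're on
>>>> 11.2 or later, where statement queueing is enabled by:
>>>>
>>>> alter system set parallel_degree_policy = auto;
>>>> -- statements queue once active PX servers reach parallel_servers_target
>>>> )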
>>>>
>>>> c) You need to check the logic of the query and rewrite it. There
>>>> seems to be a simple rewrite that should allow the later WHERE predicates
>>>> to be pushed once you've got rid of the (apparently) redundant full outer
>>>> joins.
>>>>
>>>> Regards
>>>> Jonathan Lewis
>>>>
-- http://www.freelists.org/webpage/oracle-l
Received on Wed Dec 30 2020 - 20:27:06 CET