Re: Execution path having full scan in a nested loop

From: Mohamed Houri <mohamed.houri_at_gmail.com>
Date: Tue, 12 Oct 2021 14:01:32 +0200
Message-ID: <CAJu8R6iE+E-OPK3tP1H-_qZ2w+UO39SsOGyHeNQvrCrP3fnzSA_at_mail.gmail.com>



Hello,

Almost one year ago I got a similar issue in a query where most of its execution time was spent on TABLE ACCESS STORAGE FULL FIRST ROWS

It turned out that this was because the column in the NOT IN subquery is nullable from both sides.

This looks similar to your case as we can see in your first execution plan that Oracle has applied the LNNVL function to take into account the always threatening null values when full scanning the DETAIL table

9 - storage("DTL"."FL_TYP"='DP')

       filter("DTL"."FL_TYP"='DP' AND LNNVL("DTL"."FL_NM"<>:B1) AND LNNVL(NVL(:B2,'-')<>NVL("DTL"."SID",'-')))

 Applying the *LNNVL* function at predicate n°9 resulted in canceling the smart scan and predicate offloading.

When you applied Jonathan Lewis suggesting of unnesting the subquery Oracle came up with a transformation that is aware of the presence of NULL in the join condition: JOIN ANTI Null AWARE

 MERGE JOIN ANTI NA https://hourim.wordpress.com/2020/11/15/null-aware-anti-join-parsing-and-_optimizer_squ_bottomup/

And the smart scan on DETAIL table becomes at least possible since there is no LNNVL function applied in the corresponding predicate part for the hinted subquery

 14 - storage("DTL"."FL_TYP"='DP' AND SYS_OP_BLOOM_FILTER(:BF0000," DTL"."RID"))        filter("DTL"."FL_TYP"='DP' AND
SYS_OP_BLOOM_FILTER(:BF0000,"DTL"."RID"))  Best regards

Mohamed Houri

PS: In passing, for my client case, we agreed to change the NOT IN into a NON EXISTS (the client was okay with this despite my warning about the difference between NOT IN and NON EXISTS)

Le mar. 12 oct. 2021 à 11:28, Jonathan Lewis <jlewisoracle_at_gmail.com> a écrit :

>
> >
> > So I want to understand if we can modify this query to make it go for
> one time full table scan, maybe with a hash join kind of operation, so that
> this can complete in a faster time?
> >
>
> I knew that if I answered the question you'd ask you'd only ask a load
> more. ;)
>
> I suggested the hints because that would demonstrate whether or not the
> transformation was possible (it almost always is in 19c) and purely
> cost-based (which is not always the case - sometimes the optimizer will
> pick a higher cost path if a lower cost path is based in guesswork that
> may introduce a lot of variability at run-time).
>
> If the hinted code does what you expect and the cost of the query is
> HIGHER as a result this almost always means that there's a statistics issue
> involved. (If the cost is LOWER this tends to mean that Oracle has a rule
> that has rejected the path.) This means you may be able to find a suitable
> way to fix the statistics to get the plan you want without hinting.
>
> I suggested the NO_MERGE hint because you may find that hinted with only
> UNNEST a change in the data results in the optimizer deciding to do complex
> view merging - which means it will join all the tables and aggregate late,
> and that could be a disaster. (Here's a link to a very old blog note of
> mine demonstrating the concept:
> https://jonathanlewis.wordpress.com/2007/03/08/transformation-and-optimisation/
> )
>
> You started with a comment about an index on RID, but that's irrelevant,
> it could help the join from MASTER to DETAIL, but that doesn't help with
> the correlation between the subquery and PBRF. This becomes cleared when
> Oracle rewrites the NOT IN to NOT EXISTS and you can see the necessary
> correlation in the FILTER In the original plan:
> 3 - filter( NOT EXISTS
> (SELECT 0 FROM "USER1"."DETAIL" "DTL","USER1"."MASTER" "REQ"
> WHERE "REQ"."RID"="DTL"."RID" AND "REQ"."RTYP"=:B4
> AND ("REQ"."RSTS"='YY' OR "REQ"."RSTS"='XX')
> AND "DTL"."FL_TYP"='DP'
> AND LNNVL("DTL"."FL_NM"<>:B1) AND
> LNNVL(NVL(:B2,'-')<>NVL("DTL"."SID",'-'))))
>
> and compare it with the original NOT IN predicate
>
> AND (rf.fattr3, NVL (rf.fattr, '-')) NOT IN(
> SELECT DISTINCT dtl.FL_NM, NVL (dtl.SID, '-')
>
> The connection between RF and DTL is based on fattr3/fl_nm and fattr/sid -
> and because the new plan report a "null aware" antijoin, you can see that
> the optimizer has embedded the correlation predicates inside then lnnvl()
> function. There might be a way to add some NOT NULL declarations and
> function-based indexes to make this more efficient, but don't ask me to
> guess about things that might or might not be valid for your data. It's
> also possible that with the right indexing and stats the subquery could
> drive a nested loop from MASTER to DETAIL, but that would at the least
> require an index on (rtyp, rsts) and maybe some statistics that made it
> look as if that matched only a few rows.
>
> Another benefit of seeing the results of the unnested subquery is that we
> can see why the optimizer didn't choose it.
> The cost of the single SORT JOIN for the subquery is over 8,000; the cost
> of the tablescan of DETAIL is 607, which means Oracle could do the
> tablescan about 13 times before it should choose to unnest. However its
> estimate of rows from the driving table is 2, (although the actual is
> 9.626). If you can get the optimizer to get a better estimate of
> cardinality of the driving table it would unnest automatically. It's
> possible that creating a column group on (F_GRP, rf.F_CATG) would be
> sufficient if the number of combinations is significantly less than the
> product of the individual num_distinct.
>
> Regards
> Jonathan Lewis
>
>
>
>
>
>
> On Tue, 12 Oct 2021 at 07:05, Lok P <loknath.73_at_gmail.com> wrote:
>
>> Thank you Jonathan.
>>
>> Your suggested hints(even with only UNNEST hints) to the subquery making
>> the query finish in ~2seconds as opposed to ~143 seconds before and so it
>> seems it's possible as per optimizer transformations is concerned. So I
>> wanted to understand why optimizers by default are not doing this
>> because of any restriction or we should tweak the query someway Or any
>> issue with stats itself?
>>
>> Initially I was thinking if we are hitting the below restriction as it
>> is in the blog due to which the optimizer is not able to go for a hash
>> join, so tried putting that subquery in a case statement , but it
>> didn't work.
>>
>>
>> https://ctandrewsayer.wordpress.com/2018/06/06/conditional-outer-joins-forcing-nested-loops/
>>
>>
>> Global Information
>> ------------------------------
>> Status : DONE (ALL ROWS)
>> Instance ID : 1
>> SQL ID : 0v7fsttsqubh5
>> SQL Execution ID : 16777216
>> Execution Started : 10/12/2021 01:30:02
>> First Refresh Time : 10/12/2021 01:30:02
>> Last Refresh Time : 10/12/2021 01:30:04
>> Duration : 2s
>> Module/Action : SQL*Plus/-
>> Program : sqlplus.exe
>> Fetch Calls : 2
>>
>>
>> Global Stats
>>
>> ======================================================================================================
>> | Elapsed | Cpu | IO | Application | Concurrency | Fetch |
>> Buffer | Read | Read | Cell |
>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls |
>> Gets | Reqs | Bytes | Offload |
>>
>> ======================================================================================================
>> | 1.02 | 0.98 | 0.04 | 0.00 | 0.00 | 2 |
>> 31621 | 228 | 206MB | 81.41% |
>>
>> ======================================================================================================
>>
>> SQL Plan Monitoring Details (Plan Hash Value=3386919479)
>>
>> =============================================================================================================================================================================================
>> | Id | Operation | Name |
>> Rows | Cost | Time | Start | Execs | Rows | Read | Read |
>> Cell | Mem | Activity | Activity Detail |
>> | | | |
>> (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes |
>> Offload | (Max) | (%) | (# samples) |
>>
>> =============================================================================================================================================================================================
>> | 0 | SELECT STATEMENT | |
>> | | 1 | +2 | 1 | 421 | | |
>> | | | |
>> | 1 | WINDOW BUFFER | |
>> 1 | 8150 | 1 | +2 | 1 | 421 | | |
>> | 32768 | | |
>> | 2 | HASH GROUP BY | |
>> 1 | 8150 | 1 | +2 | 1 | 421 | | |
>> | 1M | | |
>> | 3 | FILTER | |
>> | | 1 | +2 | 1 | 6985 | | |
>> | | | |
>> | 4 | MERGE JOIN ANTI NA | |
>> 1 | 8149 | 1 | +2 | 1 | 6985 | | |
>> | | | |
>> | 5 | SORT JOIN | |
>> 2 | 120 | 1 | +2 | 1 | 9626 | | |
>> | 886K | | |
>> | 6 | TABLE ACCESS BY INDEX ROWID | PBRF |
>> 2 | 119 | 1 | +2 | 1 | 9626 | | |
>> | | | |
>> | 7 | INDEX SKIP SCAN | PBRF_IX1 |
>> 4 | 118 | 1 | +2 | 1 | 9626 | | |
>> | | | |
>> | 8 | SORT UNIQUE | |
>> 868K | 8029 | 2 | +1 | 9627 | 2641 | | |
>> | | 100.00 | Cpu (1) |
>> | 9 | VIEW | VW_NSO_1 |
>> 868K | 635 | 1 | +2 | 1 | 1M | | |
>> | | | |
>> | 10 | HASH JOIN | |
>> 868K | 635 | 1 | +2 | 1 | 1M | | |
>> | 2M | | |
>> | 11 | JOIN FILTER CREATE | :BF0000 |
>> 19381 | 18 | 1 | +2 | 1 | 18647 | | |
>> | | | |
>> | 12 | TABLE ACCESS STORAGE FULL | MASTER |
>> 19381 | 18 | 1 | +2 | 1 | 18647 | | |
>> | | | |
>> | 13 | JOIN FILTER USE | :BF0000 |
>> 1M | 607 | 1 | +2 | 1 | 1M | | |
>> | | | |
>> | 14 | TABLE ACCESS STORAGE FULL | DETAIL |
>> 1M | 607 | 1 | +2 | 1 | 1M | 228 | 206MB |
>> 81.41% | 7M | | |
>>
>> =============================================================================================================================================================================================
>>
>> Predicate Information (identified by operation id):
>> ---------------------------------------------------
>>
>> 3 - filter(TO_DATE(:B6,'MM/DD/YYYY')>=TO_DATE(:B5,'MM/DD/YYYY'))
>> 7 - access("RF"."W_DATE">=TO_DATE(:B5,'MM/DD/YYYY') AND
>> "RF"."F_GRP"=:B1 AND "RF"."W_DATE"<=TO_DATE(:B6,'MM/DD/YYYY'))
>> filter("RF"."F_GRP"=:B1 AND ("RF"."F_CATG"=:B2 OR
>> "RF"."F_CATG"=:B3))
>> 8 - access(INTERNAL_FUNCTION("RF"."fattr3")=INTERNAL_FUNCTION("FL_NM")
>> AND INTERNAL_FUNCTION("NVL(DTL.SID,'-')")=NVL("RF"."fattr2",'-'))
>> filter(INTERNAL_FUNCTION("NVL(DTL.SID,'-')")=NVL("RF"."fattr2",'-')
>> AND INTERNAL_FUNCTION("RF"."fattr3")=INTERNAL_FUNCTION("FL_NM"))
>> 10 - access("REQ"."RID"="DTL"."RID")
>> 12 - storage("REQ"."RTYP"=:B4 AND ("REQ"."RSTS"='FP' OR
>> "REQ"."RSTS"='IP'))
>> filter("REQ"."RTYP"=:B4 AND ("REQ"."RSTS"='FP' OR
>> "REQ"."RSTS"='IP'))
>> 14 - storage("DTL"."FL_TYP"='DP' AND
>> SYS_OP_BLOOM_FILTER(:BF0000,"DTL"."RID"))
>> filter("DTL"."FL_TYP"='DP' AND
>> SYS_OP_BLOOM_FILTER(:BF0000,"DTL"."RID"))
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Oct 12, 2021 at 2:31 AM Jonathan Lewis <jlewisoracle_at_gmail.com>
>> wrote:
>>
>>>
>>> The shape of the plan you're describing is one where the subquery is
>>> unnested to a non-mergeable aggregate view.
>>> To see if this is legal as far as the optimizer is concerned you could
>>> add the hints /*+ unnest no_merge */ to the subquery.
>>>
>>> Regards
>>> Jonathan Lewis
>>>
>>>
>>> On Mon, 11 Oct 2021 at 20:50, Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Hello Listers, We have one database on version 11.2.0.4 of oracle. And
>>>> below query is spending quite a lot of time while scanning table 'DETAIL'
>>>> in nested loop path i.e. step-9 below. This table does have an index with
>>>> the leading column as RID(which is joined column) but still it's going for
>>>> a 'TABLE ACCESS STORAGE FULL FIRST ROWS' within a nested loop. Not sure if
>>>> it's just because we are reading a lot of rows from that table or if we are
>>>> hitting any optimizer restriction. So I want to understand if we can modify
>>>> this query to make it go for one time full table scan, maybe with a hash
>>>> join kind of operation, so that this can complete in a faster time?
>>>>
>>>> Table DETAIL having ~1.7million rows in it. And column RID having 31K
>>>> distinct values in it. And table MASTER having ~34k in it.
>>>>
>>>> SELECT TRIM (rf.fattr3) ,TRIM (rf.fattr) , rf.W_DATE ,rf.CODE ,NVL (SUM
>>>> (rf.txn_cnt), 0) AS cnt,NVL (SUM (DECODE (TTYP, 'S', 1, -1) * rf.amt),0)
>>>> amt,count(*) over () count1
>>>> FROM PBRF rf
>>>> WHERE rf.F_GRP = :b1
>>>> AND rf.F_CATG IN ( :b2, :b3)
>>>> AND (rf.fattr3, NVL (rf.fattr, '-')) NOT IN
>>>> (SELECT DISTINCT dtl.FL_NM, NVL (dtl.SID, '-')
>>>> FROM MASTER req, DETAIL dtl
>>>> WHERE req.RID = dtl.RID
>>>> AND req.RSTS IN ('XX', 'YY')
>>>> AND req.RTYP = :b4
>>>> AND dtl.FL_TYP = 'DP')
>>>> AND rf.W_DATE BETWEEN TO_DATE ( :b5,'MM/DD/YYYY') AND TO_DATE (
>>>> :b6,'MM/DD/YYYY')
>>>> GROUP BY rf.CODE,TRIM (rf.fattr),TRIM (rf.fattr3), rf.W_DATE;
>>>>
>>>> Global Information
>>>> ------------------------------
>>>> Status : DONE (ALL ROWS)
>>>> Instance ID : 1
>>>> SQL ID : 7yc5x3uzx7vzm
>>>> SQL Execution ID : 16777216
>>>> Execution Started : 10/11/2021 15:18:40
>>>> First Refresh Time : 10/11/2021 15:18:40
>>>> Last Refresh Time : 10/11/2021 15:21:03
>>>> Duration : 143s
>>>> Module/Action : SQL*Plus/-
>>>> Program : sqlplus.exe
>>>> Fetch Calls : 2
>>>>
>>>> Global Stats
>>>>
>>>> ===================================================================================================
>>>> | Elapsed | Cpu | IO | Application | Other | Fetch |
>>>> Buffer | Read | Read | Cell |
>>>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls | Gets
>>>> | Reqs | Bytes | Offload |
>>>>
>>>> ===================================================================================================
>>>> | 144 | 131 | 11 | 1.13 | 0.18 | 2 |
>>>> 38M | 348K | 293GB | 80.58% |
>>>>
>>>> ===================================================================================================
>>>>
>>>> SQL Plan Monitoring Details (Plan Hash Value=3512223473)
>>>>
>>>> ==========================================================================================================================================================================================================================
>>>> | Id | Operation | Name
>>>> | Rows | Cost | Time | Start | Execs | Rows | Read |
>>>> Read | Cell | Mem | Activity | Activity Detail |
>>>> | | |
>>>> | (Estim) | | Active(s) | Active | | (Actual) | Reqs |
>>>> Bytes | Offload | (Max) | (%) | (# samples) |
>>>>
>>>> ==========================================================================================================================================================================================================================
>>>> | 0 | SELECT STATEMENT |
>>>> | | | 1 | +143 | 1 | 421 | |
>>>> | | | | |
>>>> | 1 | WINDOW BUFFER |
>>>> | 1 | 125 | 1 | +143 | 1 | 421 | |
>>>> | | 32768 | | |
>>>> | 2 | HASH GROUP BY |
>>>> | 1 | 125 | 142 | +2 | 1 | 421 | |
>>>> | | 1M | | |
>>>> | 3 | FILTER |
>>>> | | | 142 | +2 | 1 | 6985 | |
>>>> | | | | |
>>>> | 4 | FILTER |
>>>> | | | 142 | +2 | 1 | 9626 | |
>>>> | | | | |
>>>> | 5 | TABLE ACCESS BY INDEX ROWID | PBRF
>>>> | 2 | 119 | 142 | +2 | 1 | 9626 | |
>>>> | | | | |
>>>> | 6 | INDEX SKIP SCAN | PBRF_IX1
>>>> | 4 | 118 | 142 | +2 | 1 | 9626 | 1 |
>>>> 8192 | | | | |
>>>> | 7 | NESTED LOOPS |
>>>> | 3 | 5 | 142 | +2 | 1712 | 532 | |
>>>> | | | | |
>>>> | 8 | NESTED LOOPS |
>>>> | 3 | 5 | 142 | +2 | 1712 | 1417 | |
>>>> | | | | |
>>>> | 9 | TABLE ACCESS STORAGE FULL FIRST ROWS | DETAIL
>>>> | 6436 | 2 | 144 | +1 | 1712 | 1417 | 348K |
>>>> 293GB | 80.58% | 17M | 100.00 | enq: KO - fast object checkpoint (2) |
>>>> | | |
>>>> | | | | | | | |
>>>> | | | | Cpu (124) |
>>>> | | |
>>>> | | | | | | | |
>>>> | | | | reliable message (4) |
>>>> | | |
>>>> | | | | | | | |
>>>> | | | | cell smart table scan (14) |
>>>> | 10 | INDEX UNIQUE SCAN | MASTER_PK
>>>> | 1 | | 142 | +2 | 1417 | 1417 | |
>>>> | | | | |
>>>> | 11 | TABLE ACCESS BY INDEX ROWID | MASTER
>>>> | 1 | 1 | 142 | +2 | 1417 | 532 | |
>>>> | | | | |
>>>> ==========================================================================================================================================================================================================================
>>>>
>>>>
>>>>
>>>> Predicate Information (identified by operation id):
>>>> ---------------------------------------------------
>>>> 3 - filter( NOT EXISTS (SELECT 0 FROM "USER1"."DETAIL"
>>>> "DTL","USER1"."MASTER" "REQ" WHERE "REQ"."RID"="DTL"."RID" AND
>>>> "REQ"."RTYP"=:B4 AND
>>>> ("REQ"."RSTS"='YY' OR "REQ"."RSTS"='XX') AND
>>>> "DTL"."FL_TYP"='DP' AND LNNVL("DTL"."FL_NM"<>:B1) AND
>>>> LNNVL(NVL(:B2,'-')<>NVL("DTL"."SID",'-'))))
>>>> 4 - filter(TO_DATE(:B6,'MM/DD/YYYY')>=TO_DATE(:B5,'MM/DD/YYYY'))
>>>> 6 - access("RF"."W_DATE">=TO_DATE(:B5,'MM/DD/YYYY') AND
>>>> "RF"."F_GRP"=:B1 AND "RF"."W_DATE"<=TO_DATE(:B6,'MM/DD/YYYY'))
>>>> filter("RF"."F_GRP"=:B1 AND ("RF"."F_CATG"=:B2 OR
>>>> "RF"."F_CATG"=:B3))
>>>> 9 - storage("DTL"."FL_TYP"='DP')
>>>> filter("DTL"."FL_TYP"='DP' AND LNNVL("DTL"."FL_NM"<>:B1) AND
>>>> LNNVL(NVL(:B2,'-')<>NVL("DTL"."SID",'-')))
>>>> 10 - access("REQ"."RID"="DTL"."RID")
>>>> 11 - filter("REQ"."RTYP"=:B4 AND ("REQ"."RSTS"='YY' OR
>>>> "REQ"."RSTS"='XX'))
>>>>
>>>> Statistics
>>>> ----------------------------------------------------------
>>>> 1 recursive calls
>>>> 0 db block gets
>>>> 38469369 consistent gets
>>>> 38459185 physical reads
>>>> 52 redo size
>>>> 14543 bytes sent via SQL*Net to client
>>>> 472 bytes received via SQL*Net from client
>>>> 2 SQL*Net roundtrips to/from client
>>>> 1 sorts (memory)
>>>> 0 sorts (disk)
>>>> 421 rows processed
>>>>
>>>

-- 

Houri Mohamed

Oracle DBA-Developer-Performance & Tuning

Visit My         - Blog <http://www.hourim.wordpress.com/>

Let's Connect -
<http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>*Linkedin
Profile <http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>*

My Twitter <https://twitter.com/MohamedHouri>      - MohamedHouri
<https://twitter.com/MohamedHouri>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Oct 12 2021 - 14:01:32 CEST

Original text of this message