Re: Strange cell offload behavior

From: Sayan Malakshinov <xt.and.r_at_gmail.com>
Date: Tue, 25 May 2021 22:32:35 +0300
Message-ID: <CAOVevU6L4xx=PMPNgrt+LCTQ54ssk8ZTW3M77D8AkNu_ZbH2_A_at_mail.gmail.com>



Hi Lok,
 + Roger MacNicol in BCC,

Yes, you are absolutely correct, from your trace we can see: kcbstt 9836 - small table threshold (blocks) keep_nb 0 - no keep pool
kcbnbh 255727 - buffer cache size (blocks) nblks 541438 - object size in blocks
vlot 500 - very large object threshold, 255727 * 500% = VLOT=1278635 kcbimd: ...kcbnbh 25572 - object is bigger than 10% of buffer cache (255727*10%=25572), so it "is medium"
ntcache 0 - object is not cached (0 blocks in the buffer cache)

Hi _at_Roger, we read your
> NSMTIO shows the decision (but* we don't give any internal details of the
cost model used*)

But could you give any ideas, please, why in this simple case (segment size = 212% of buffer, no blocks in the buffer cache, no prewarm, no prefetch, serial read) it chooses buffer reads?
Logically, we can make an assumption that it's related to the estimated returning "Bytes" from the plan:
more bytes to return - less effective smart scan, but when E-Rows=1 ?!

PS. don't want to speculate but interesting switching point: more or less 1kB :D

On Tue, May 25, 2021 at 9:33 PM Lok P <loknath.73_at_gmail.com> wrote:

> We have sga_target set as 6GB and no specific value for dba_cache_size has
> been set.
>
> Below two lines suggest that the object was > medium table threshold and <
> very large table threshold. But the immediate next line states it's going
> with buffer cache reads. But nothing mentioned about why it decided to do
> so?
>
> NSMTIO: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read from
> caches(local/remote) and checking storage reduction factors (OLTP/EHCC Comp)
>
> NSMTIO: kcbdpc:NoDirectRead:[CACHE_READ]: tsn: 53, objd: 1063361, objn:
> 1063361
>
> Should i also trace the one with less number of columns which is going for
> smartscan and compare? But still if this trace won't give details on how
> it decided that. No meaning in tracing that. And also the odd part is
> increasing the number of columns in the SELECT list should actually make it
> more favorable for smartscan because of more block number but somehow it's
> working the other way here.
>
>
>
>
> On Tue, May 25, 2021 at 9:50 PM Lok P <loknath.73_at_gmail.com> wrote:
>
>>
>> Thank you so much Sayan.
>>
>> I tried tracing ,and below is the key section of the trace (10358+NSMTIO)
>> for the original query(i.e. will all the columns). I was trying to
>> understand it , but found it a bit confusing. I do see it has taken the
>> final decision on doing cache scan but not cell offload but not able to
>> understand the cause. Can you guide me here on what exactly it's pointing
>> to? Why are we failing to get the benefit of SMART SCAN?
>>
>>
>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>> 1, isQesSageEnabled: TRUE
>>
>>
>> *** 2021-05-25T07:15:15.503247-04:00
>>
>> kcbism: islarge 1 next 0 nblks 914171 type 3, bpid 65535, kcbisdbfc 0
>> kcbnhl 8192 kcbstt 9836 keep_nb 0 kcbnbh 255727 kcbnwp 4
>>
>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>> 1, isQesSageEnabled: TRUE
>>
>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>> 1, isQesSageEnabled: TRUE
>>
>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>> 1, isQesSageEnabled: TRUE
>>
>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>> 1, isQesSageEnabled: TRUE
>>
>> kcbism: islarge 0 next 0 nblks 36 type 3, bpid 65535, kcbisdbfc 0 kcbnhl
>> 8192 kcbstt 9836 keep_nb 0 kcbnbh 255727 kcbnwp 4
>>
>> kcbism: islarge 1 next 0 nblks 541438 type 2, bpid 3, kcbisdbfc 0 kcbnhl
>> 8192 kcbstt 9836 keep_nb 0 kcbnbh 255727 kcbnwp 4
>>
>> kcbimd: nblks 541438 kcbstt 9836 kcbnbh 25572 bpid 3 kcbisdbfc 0
>> is_medium 0
>>
>> kcbivlo: nblks 541438 vlot 500 pnb 255727 kcbisdbfc 0 is_large 0
>>
>> NSMTIO: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read from
>> caches(local/remote) and checking storage reduction factors (OLTP/EHCC Comp)
>>
>> NSMTIO: kcbdpc:NoDirectRead:[CACHE_READ]: tsn: 53, objd: 1063361, objn:
>> 1063361
>>
>> ckpt: 0, nblks: 541438, ntcache: 0, ntdist:0
>>
>> Direct Path for pdb 0 tsn 53 objd 1063361 objn 1063361
>>
>> Direct Path 0 ckpt 0, nblks 541438 ntcache 0 ntdist 0
>>
>> Direct Path mndb 0 tdiob 859 txiob 0 tciob 133
>>
>> Direct path diomrc 8 dios 2 kcbisdbfc 0
>>
>> kcbdpc: kx 8 kc 8 lhs 4 rhs NSMTIO: Additional Info: VLOT=1278635
>>
>> Object# = 1063361, Object_Size = 541438 blocks
>>
>> SqlId = XXXXXXX, plan_hash_value = 529217307, Partition# = 0
>>
>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>> 1, isQesSageEnabled: TRUE
>>
>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c04404, fn =
>> 159, kobjd = 1063361, block cnt = 124, noncontig = FALSE pre-warm = FALSE,
>> prefetch = FALSE
>>
>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c0440c, fn =
>> 159, kobjd = 1063361, block cnt = 116, noncontig = FALSE pre-warm = FALSE,
>> prefetch = FALSE
>>
>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c04414, fn =
>> 159, kobjd = 1063361, block cnt = 108, noncontig = FALSE pre-warm = FALSE,
>> prefetch = FALSE
>>
>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c0441c, fn =
>> 159, kobjd = 1063361, block cnt = 100, noncontig = FALSE pre-warm = FALSE,
>> prefetch = FALSE
>>
>>
>>
>>
>> On Mon, May 24, 2021 at 4:39 AM Sayan Malakshinov <xt.and.r_at_gmail.com>
>> wrote:
>>
>>> Hi Lok,
>>>
>>>
>>> https://blogs.oracle.com/smartscan-deep-dive/when-bloggers-get-it-wrong-part-1
>>>
>>> https://blogs.oracle.com/smartscan-deep-dive/when-bloggers-get-it-wrong-part-2
>>>
>>> https://blogs.oracle.com/smartscan-deep-dive/correct-syntax-for-the-table_stats-hint
>>> Have you read this series of articles by Roger MacNicol?
>>> You just need to analyze or provide the following trace events for your
>>> queries:
>>> https://github.com/xtender/xt_scripts/blob/master/trace_nsmtio.sql
>>> -- Buffer cache decision making:
>>> alter session set events '10358 trace name context forever, level 2';
>>> -- Direct I/O decision making:
>>> alter session set events 'trace [NSMTIO] disk highest';
>>>
>>>
>>>
>>> On Sun, May 23, 2021 at 7:43 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Hello Listers,
>>>>
>>>> Its version 19C of oracle and optimizer_features_enable 19.1.0.
>>>>
>>>> While working on a separate issue, we came across a situation which is
>>>> a bit odd. A simple SELECT query fetching ~13 columns from a table - TAB1
>>>> showing odd behaviour. Sometimes within seconds and sometimes more than a
>>>> minute. When , It was running for more than a minute , it happened to be
>>>> clear that it's not doing cell offloading from its sql monitor. And then by
>>>> just removing one column from its SELECT list , the smartscan started
>>>> happening and the query finished in a few seconds.
>>>>
>>>> Then I started running the same query for all the ~13 columns in the
>>>> SELECT list but this time by setting session level "serial_direct_read"=
>>>> always. And here we see the smart scan again started happening and the
>>>> query finished in quick time. Is this behaviour because of any bug ?
>>>> Because I have not seen any such criteria in which the smartscan is
>>>> restricted by the number of columns in the SELECT list.
>>>>
>>>> Attached is the sample table script with the test case which produces
>>>> the above scenario. Table TAB1 is a non partition table with size ~7GB and
>>>> there are ~64million rows in it.
>>>>
>>>> Note- Its a third party database and we see few of the optimizer
>>>> parameters set to non default as its clear from the outline section. But at
>>>> least none of these should be impacting the choice of smartscan.
>>>>
>>>>
>>>> SELECT A_ID, A_ACTN, RNM, FNM, OVL,
>>>> NWVL, COL1, COL2, COL3, COL4, COL5,
>>>> COL6, COL7
>>>> FROM USER2.TAB1
>>>> WHERE OVL <> NWVL
>>>> AND ( (RNM = 'XXXX' AND COL1 = '1')
>>>> OR (RNM = 'YYYY' AND COL1 = '1'))
>>>>
>>>> Regards
>>>> Lok
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Sayan Malakshinov
>>> Oracle performance tuning engineer
>>> Oracle ACE Associate
>>> http://orasql.org
>>>
>>

-- 
Best regards,
Sayan Malakshinov
Oracle performance tuning engineer
Oracle ACE Associate
http://orasql.org

--
http://www.freelists.org/webpage/oracle-l
Received on Tue May 25 2021 - 21:32:35 CEST

Original text of this message