Re: Strange cell offload behavior

From: Fairlie Rego <fairlie.rego_at_gmail.com>
Date: Wed, 26 May 2021 07:31:42 +1000
Message-ID: <CAJEEaCwA20jro5fe8=GXD0SuNXhG7Yp3xCi0DGRjgVabwGnA1A_at_mail.gmail.com>



Well I have environments where we have applied the below fix on top of 19.10 (since we can't afford the risk)

*Bug 31626438 LARGE BUFFER CACHE ON EXADATA DISABLES SERIAL SMART SCAN OFFLOAD* REDISCOVERY INFORMATION:
 If finding it difficult to get serial scan to offload when a large buffer  cache is configured, this bug might be rediscovered.

 Large is a subjective and relative term and the published notes do not indicate
 what "large" means

Ta
Fairlie
On Wed, May 26, 2021 at 5:32 AM Sayan Malakshinov <xt.and.r_at_gmail.com> wrote:

> Hi Lok,
> + Roger MacNicol in BCC,
>
> Yes, you are absolutely correct, from your trace we can see:
> kcbstt 9836 - small table threshold (blocks)
> keep_nb 0 - no keep pool
> kcbnbh 255727 - buffer cache size (blocks)
> nblks 541438 - object size in blocks
> vlot 500 - very large object threshold, 255727 * 500% = VLOT=1278635
> kcbimd: ...kcbnbh 25572 - object is bigger than 10% of buffer cache
> (255727*10%=25572), so it "is medium"
> ntcache 0 - object is not cached (0 blocks in the buffer cache)
>
> Hi _at_Roger, we read your
> > NSMTIO shows the decision (but* we don't give any internal details of
> the cost model used*)
>
> But could you give any ideas, please, why in this simple case (segment
> size = 212% of buffer, no blocks in the buffer cache, no prewarm, no
> prefetch, serial read) it chooses buffer reads?
> Logically, we can make an assumption that it's related to the estimated
> returning "Bytes" from the plan:
> more bytes to return - less effective smart scan, but when E-Rows=1 ?!
>
> PS. don't want to speculate but interesting switching point: more or less
> 1kB :D
>
>
> On Tue, May 25, 2021 at 9:33 PM Lok P <loknath.73_at_gmail.com> wrote:
>
>> We have sga_target set as 6GB and no specific value for dba_cache_size
>> has been set.
>>
>> Below two lines suggest that the object was > medium table threshold and
>> < very large table threshold. But the immediate next line states it's going
>> with buffer cache reads. But nothing mentioned about why it decided to do
>> so?
>>
>> NSMTIO: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read from
>> caches(local/remote) and checking storage reduction factors (OLTP/EHCC Comp)
>>
>> NSMTIO: kcbdpc:NoDirectRead:[CACHE_READ]: tsn: 53, objd: 1063361, objn:
>> 1063361
>>
>> Should i also trace the one with less number of columns which is going
>> for smartscan and compare? But still if this trace won't give details on
>> how it decided that. No meaning in tracing that. And also the odd part is
>> increasing the number of columns in the SELECT list should actually make it
>> more favorable for smartscan because of more block number but somehow it's
>> working the other way here.
>>
>>
>>
>>
>> On Tue, May 25, 2021 at 9:50 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>>>
>>> Thank you so much Sayan.
>>>
>>> I tried tracing ,and below is the key section of the trace
>>> (10358+NSMTIO) for the original query(i.e. will all the columns). I was
>>> trying to understand it , but found it a bit confusing. I do see it has
>>> taken the final decision on doing cache scan but not cell offload but not
>>> able to understand the cause. Can you guide me here on what exactly it's
>>> pointing to? Why are we failing to get the benefit of SMART SCAN?
>>>
>>>
>>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>>> 1, isQesSageEnabled: TRUE
>>>
>>>
>>> *** 2021-05-25T07:15:15.503247-04:00
>>>
>>> kcbism: islarge 1 next 0 nblks 914171 type 3, bpid 65535, kcbisdbfc 0
>>> kcbnhl 8192 kcbstt 9836 keep_nb 0 kcbnbh 255727 kcbnwp 4
>>>
>>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>>> 1, isQesSageEnabled: TRUE
>>>
>>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>>> 1, isQesSageEnabled: TRUE
>>>
>>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>>> 1, isQesSageEnabled: TRUE
>>>
>>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>>> 1, isQesSageEnabled: TRUE
>>>
>>> kcbism: islarge 0 next 0 nblks 36 type 3, bpid 65535, kcbisdbfc 0 kcbnhl
>>> 8192 kcbstt 9836 keep_nb 0 kcbnbh 255727 kcbnwp 4
>>>
>>> kcbism: islarge 1 next 0 nblks 541438 type 2, bpid 3, kcbisdbfc 0 kcbnhl
>>> 8192 kcbstt 9836 keep_nb 0 kcbnbh 255727 kcbnwp 4
>>>
>>> kcbimd: nblks 541438 kcbstt 9836 kcbnbh 25572 bpid 3 kcbisdbfc 0
>>> is_medium 0
>>>
>>> kcbivlo: nblks 541438 vlot 500 pnb 255727 kcbisdbfc 0 is_large 0
>>>
>>> NSMTIO: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read
>>> from caches(local/remote) and checking storage reduction factors (OLTP/EHCC
>>> Comp)
>>>
>>> NSMTIO: kcbdpc:NoDirectRead:[CACHE_READ]: tsn: 53, objd: 1063361, objn:
>>> 1063361
>>>
>>> ckpt: 0, nblks: 541438, ntcache: 0, ntdist:0
>>>
>>> Direct Path for pdb 0 tsn 53 objd 1063361 objn 1063361
>>>
>>> Direct Path 0 ckpt 0, nblks 541438 ntcache 0 ntdist 0
>>>
>>> Direct Path mndb 0 tdiob 859 txiob 0 tciob 133
>>>
>>> Direct path diomrc 8 dios 2 kcbisdbfc 0
>>>
>>> kcbdpc: kx 8 kc 8 lhs 4 rhs NSMTIO: Additional Info: VLOT=1278635
>>>
>>> Object# = 1063361, Object_Size = 541438 blocks
>>>
>>> SqlId = XXXXXXX, plan_hash_value = 529217307, Partition# = 0
>>>
>>> NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table
>>> sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted:
>>> -1,is enc_ktid encrypted: -1, is sage enabled based on data -layer checks:
>>> 1, isQesSageEnabled: TRUE
>>>
>>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c04404, fn =
>>> 159, kobjd = 1063361, block cnt = 124, noncontig = FALSE pre-warm = FALSE,
>>> prefetch = FALSE
>>>
>>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c0440c, fn =
>>> 159, kobjd = 1063361, block cnt = 116, noncontig = FALSE pre-warm = FALSE,
>>> prefetch = FALSE
>>>
>>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c04414, fn =
>>> 159, kobjd = 1063361, block cnt = 108, noncontig = FALSE pre-warm = FALSE,
>>> prefetch = FALSE
>>>
>>> NSMTIO: kcbzib: Cache Scan triggered for tsn = 53, rdba=0x27c0441c, fn =
>>> 159, kobjd = 1063361, block cnt = 100, noncontig = FALSE pre-warm = FALSE,
>>> prefetch = FALSE
>>>
>>>
>>>
>>>
>>> On Mon, May 24, 2021 at 4:39 AM Sayan Malakshinov <xt.and.r_at_gmail.com>
>>> wrote:
>>>
>>>> Hi Lok,
>>>>
>>>>
>>>> https://blogs.oracle.com/smartscan-deep-dive/when-bloggers-get-it-wrong-part-1
>>>>
>>>> https://blogs.oracle.com/smartscan-deep-dive/when-bloggers-get-it-wrong-part-2
>>>>
>>>> https://blogs.oracle.com/smartscan-deep-dive/correct-syntax-for-the-table_stats-hint
>>>> Have you read this series of articles by Roger MacNicol?
>>>> You just need to analyze or provide the following trace events for your
>>>> queries:
>>>> https://github.com/xtender/xt_scripts/blob/master/trace_nsmtio.sql
>>>> -- Buffer cache decision making:
>>>> alter session set events '10358 trace name context forever, level 2';
>>>> -- Direct I/O decision making:
>>>> alter session set events 'trace [NSMTIO] disk highest';
>>>>
>>>>
>>>>
>>>> On Sun, May 23, 2021 at 7:43 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>>
>>>>> Hello Listers,
>>>>>
>>>>> Its version 19C of oracle and optimizer_features_enable 19.1.0.
>>>>>
>>>>> While working on a separate issue, we came across a situation which is
>>>>> a bit odd. A simple SELECT query fetching ~13 columns from a table - TAB1
>>>>> showing odd behaviour. Sometimes within seconds and sometimes more than a
>>>>> minute. When , It was running for more than a minute , it happened to be
>>>>> clear that it's not doing cell offloading from its sql monitor. And then by
>>>>> just removing one column from its SELECT list , the smartscan started
>>>>> happening and the query finished in a few seconds.
>>>>>
>>>>> Then I started running the same query for all the ~13 columns in the
>>>>> SELECT list but this time by setting session level "serial_direct_read"=
>>>>> always. And here we see the smart scan again started happening and the
>>>>> query finished in quick time. Is this behaviour because of any bug ?
>>>>> Because I have not seen any such criteria in which the smartscan is
>>>>> restricted by the number of columns in the SELECT list.
>>>>>
>>>>> Attached is the sample table script with the test case which produces
>>>>> the above scenario. Table TAB1 is a non partition table with size ~7GB and
>>>>> there are ~64million rows in it.
>>>>>
>>>>> Note- Its a third party database and we see few of the optimizer
>>>>> parameters set to non default as its clear from the outline section. But at
>>>>> least none of these should be impacting the choice of smartscan.
>>>>>
>>>>>
>>>>> SELECT A_ID, A_ACTN, RNM, FNM, OVL,
>>>>> NWVL, COL1, COL2, COL3, COL4, COL5,
>>>>> COL6, COL7
>>>>> FROM USER2.TAB1
>>>>> WHERE OVL <> NWVL
>>>>> AND ( (RNM = 'XXXX' AND COL1 = '1')
>>>>> OR (RNM = 'YYYY' AND COL1 = '1'))
>>>>>
>>>>> Regards
>>>>> Lok
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Sayan Malakshinov
>>>> Oracle performance tuning engineer
>>>> Oracle ACE Associate
>>>> http://orasql.org
>>>>
>>>
>
> --
> Best regards,
> Sayan Malakshinov
> Oracle performance tuning engineer
> Oracle ACE Associate
> http://orasql.org
>

-- 
Fairlie Rego
Executive Database Architect
www.technoconsulting.com.au
http://www.linkedin.com/in/fairlierego
https://fairlierego.wordpress.com/

Twitter _at_fairlierego

--
http://www.freelists.org/webpage/oracle-l
Received on Tue May 25 2021 - 23:31:42 CEST

Original text of this message