Re: Parallel Query change from smart scan to cell single block access
Date: Sun, 19 Jun 2022 01:47:23 -0400
Message-ID: <CAMHX9JLBUyZ=gGqEpApjuKQWh0Zaj+eR9pWQV7gsppqgxe=P0Q_at_mail.gmail.com>
That 60-second snapshot indicates that the full table scan / smart scan was
not making any progress during that time: there were no increases in "table
scan blocks gotten", "cell physical IO bytes eligible for predicate
offload", or any other "cell blocks%" metrics.

The "consistent gets", "consistent changes" and "transaction tables
consistent reads - undo records applied" have all increased by the same
amount (68471) without any table/index block access metrics. You might be
stuck processing the same block and looks like you might be hitting
something like a delayed block cleanout issue (no metric saying undo blocks
applied to a datablock, just "transaction tables consistent reads - undo
records applied").
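
To confirm where the time goes, a minimal sketch along these lines (the
INST_ID/SID literals are the ones from your snapshot) pulls the relevant
counters for that slave; run it twice and diff the values yourself:

    -- Minimal sketch: offload progress vs undo-related counters for one
    -- session; compare two executions ~60 seconds apart to get deltas.
    SELECT sn.name, ss.value
    FROM   gv$sesstat ss
    JOIN   gv$statname sn
      ON   sn.inst_id = ss.inst_id
     AND   sn.statistic# = ss.statistic#
    WHERE  ss.inst_id = 7
    AND    ss.sid = 1221
    AND   (sn.name LIKE 'cell blocks%'
       OR  sn.name LIKE 'table scan%'
       OR  sn.name LIKE 'transaction tables consistent%')
    ORDER  BY sn.name;
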
Smart scan can't process such a block in the storage layer, as it doesn't
know whether some previous changes to that block were committed before or
after your smart scan started (its snapshot SCN). So it sends the block back
in passthrough mode, and the database layer then starts examining (and
rolling back) a CR copy of the undo segment header block to get to the
truth.
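
If you want to reproduce the delayed block cleanout effect in a sandbox, a
hypothetical sketch follows (big_table is a made-up name); flushing the
buffer cache before the commit guarantees that the on-disk block copies
miss the commit-time cleanout:

    -- Hypothetical repro sketch for delayed block cleanout:
    UPDATE big_table SET some_col = some_col;  -- dirty many blocks
    ALTER SYSTEM FLUSH BUFFER_CACHE;           -- force them to disk pre-commit
    COMMIT;                                    -- on-disk copies keep open ITLs
    -- A later scan has to visit the undo segment header for each such
    -- block, showing up as "transaction tables consistent reads - undo
    -- records applied":
    SELECT /*+ PARALLEL(4) */ COUNT(*) FROM big_table;
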
Thoughts:

This is somewhat related to the topic (long-running transactions causing
some optimizations to not be usable):
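
A rough sketch for spotting such long-running open transactions (joining
via the session's TADDR; oldest first):

    -- Rough sketch: open transactions by age; an old one holding lots of
    -- undo is a likely suspect for the passthrough/cleanout overhead.
    SELECT s.inst_id, s.sid, s.username,
           t.start_time, t.used_ublk, t.used_urec
    FROM   gv$transaction t
    JOIN   gv$session s
      ON   s.inst_id = t.inst_id
     AND   s.taddr = t.addr
    ORDER  BY t.start_time;
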
--
Tanel Poder
https://learn.tanelpoder.com
On Sat, Jun 18, 2022 at 8:10 PM Ls Cheng <exriscer_at_gmail.com> wrote:
> Hi
>
> Sorry for the late reply.
>
> I did take snaps of gv$sesstat for the PQ slave that was doing cell
> single block reads, but saw nothing strange.
>
> The following is the gv$sesstat delta between two snapshots taken 60
> seconds apart; only statistics where value - prev_value > 0 are shown
> (a sketch of how such a delta can be computed follows the listing).
>
> SNAP_DATE       NAME                                                         VALUE-PREV_VALUE INST_ID  SID
> --------------- ------------------------------------------------------------ ---------------- ------- ----
> 20220614 221231 messages sent                                                               2       7 1221
> 20220614 221231 cluster wait time                                                         110       7 1221
> 20220614 221231 blocks decrypted                                                          364       7 1221
> 20220614 221231 gc hash slot removed                                                      374       7 1221
> 20220614 221231 prefetched blocks aged out before use                                     461       7 1221
> 20220614 221231 pinned buffers inspected                                                  536       7 1221
> 20220614 221231 dirty buffers inspected                                                  2156       7 1221
> 20220614 221231 shared hash latch upgrades - no wait                                     4236       7 1221
> 20220614 221231 user I/O wait time                                                       5439       7 1221
> 20220614 221231 non-idle wait time                                                       5549       7 1221
> 20220614 221231 hot buffers moved to head of LRU                                         6684       7 1221
> 20220614 221231 calls to kcmgas                                                         40158       7 1221
> 20220614 221231 free buffer inspected                                                   66597       7 1221
> 20220614 221231 cell flash cache read hits                                              66681       7 1221
> 20220614 221231 physical read requests optimized                                        66681       7 1221
> 20220614 221231 physical read total IO requests                                         66681       7 1221
> 20220614 221231 physical reads                                                          66681       7 1221
> 20220614 221231 physical read IO requests                                               66681       7 1221
> 20220614 221231 gc local grants                                                         66681       7 1221
> 20220614 221231 physical reads cache                                                    66681       7 1221
> 20220614 221231 free buffer requested                                                   66681       7 1221
> 20220614 221231 gc remote disk read                                                     66681       7 1221
> 20220614 221231 consistent gets from cache                                              68471       7 1221
> 20220614 221231 consistent gets examination                                             68471       7 1221
> 20220614 221231 consistent gets                                                         68471       7 1221
> 20220614 221231 consistent changes                                                      68471       7 1221
> 20220614 221231 transaction tables consistent reads - undo records applied              68471       7 1221
> 20220614 221231 session logical reads                                                   68471       7 1221
> 20220614 221231 non-idle wait count                                                    161928       7 1221
> 20220614 221231 file io wait time                                                    54386966       7 1221
> 20220614 221231 physical read bytes                                                 546250752       7 1221
> 20220614 221231 cell physical IO interconnect bytes                                 546250752       7 1221
> 20220614 221231 physical read total bytes                                           546250752       7 1221
> 20220614 221231 physical read total bytes optimized                                 546250752       7 1221
> 20220614 221231 logical read bytes from cache                                       560914432       7 1221
>
> 35 rows selected.
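>
> For reference, roughly how such a delta can be computed (a sketch;
> sesstat_snap is a made-up scratch table, populated twice ~60 s apart):
>
>     CREATE TABLE sesstat_snap AS
>     SELECT SYSDATE snap_date, sn.name, ss.value, ss.inst_id, ss.sid
>     FROM   gv$sesstat ss
>     JOIN   gv$statname sn
>       ON   sn.inst_id = ss.inst_id
>      AND   sn.statistic# = ss.statistic#
>     WHERE  ss.inst_id = 7 AND ss.sid = 1221;
>
>     -- ~60 seconds later, run the same SELECT as an INSERT INTO
>     -- sesstat_snap, then diff consecutive snapshots per statistic:
>     SELECT snap_date, name,
>            value - LAG(value) OVER (PARTITION BY name ORDER BY snap_date)
>              AS value_minus_prev,
>            inst_id, sid
>     FROM   sesstat_snap;
>     -- keep only rows where value_minus_prev > 0, as in the listing above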
>
> Thanks
>
>
> On Wed, Jun 15, 2022 at 10:10 PM Tanel Poder <tanel_at_tanelpoder.com> wrote:
>
>> Check the v$sesstat metrics, there's nowadays plenty of feedback from
>> storage cells telling how "deep" the offload processing actually was or if
>> there was pushback/passthrough caused
>>
>> Common causes for smart scan "hiccups" with single block reads:
>>
>>    1. chained rows
>>    2. migrated rows for some operations (update, maybe select for
>>       update, etc.)
>>    3. uncommitted transactions in blocks (or blocks that are not
>>       trivially cleanable in the storage layer)
>>
>>
>> But v$sesstat metrics will tell you more.
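>>
>> For example, my Snapper script (https://tanelpoder.com/snapper) computes
>> such v$sesstat deltas for you; a typical one-off invocation for one
>> session (a 60-second snapshot of SID 1221) looks something like:
>>
>>     @snapper all 60 1 1221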
>>
>> Here's an article from 12 years ago, where chained rows caused problems
>> for a smart scan.
>>
>> -
>> http://tech.e2sn.com/oracle/exadata/performance-troubleshooting/exadata-smart-scan-performance
>>
>> Things have somewhat improved by now, but back then the "cell chained
>> rows%" metrics didn't even exist, so I had to reason from other metrics,
>> like "table fetch continued row" and the difference in "processing depth"
>> at the cache & txn layers vs the data layer in the storage cells:
>>
>>     869, DWH , STAT, table fetch continued row            ,  3660,   732,
>>     869, DWH , STAT, cell blocks processed by cache layer ,  5428,  1.09k,
>>     869, DWH , STAT, cell blocks processed by txn layer   ,  5428,  1.09k,
>>     869, DWH , STAT, cell blocks processed by data layer  ,  3625,   725,
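>>
>> Back-of-the-envelope from those figures: 5428 blocks passed the cache
>> and txn layers but only 3625 completed the data layer, so roughly 1800
>> blocks per snapshot fell back to the database layer, consistent with
>> the chained-row fetches showing up as "table fetch continued row".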
>>
>> --
>> Tanel Poder
>> https://learn.tanelpoder.com
>>
>>
>> On Tue, Jun 14, 2022 at 6:26 PM Ls Cheng <exriscer_at_gmail.com> wrote:
>>
>>> Hi all
>>>
>>> I have a strange situation where a very simple query (no joins, a
>>> single table in the FROM clause) is sometimes fast (seconds) and
>>> sometimes slow (hours). After digging a bit, it seems that one of the
>>> PQ slaves, instead of accessing the table via cell smart table scan, is
>>> accessing it via cell single block reads.
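>>>
>>> For what it's worth, the affected slaves show up with something like
>>> this (a sketch; :sql_id is the slow query's SQL_ID):
>>>
>>>     SELECT inst_id, sid, event, state, seconds_in_wait
>>>     FROM   gv$session
>>>     WHERE  sql_id = :sql_id
>>>     AND    event LIKE 'cell single block%';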
>>>
>>> This is 19.10; has anyone observed or faced such a problem?
>>>
>>> Thanks
>>>
>>>