Re: intermittent long "log file sync" waits
Date: Sat, 1 Feb 2020 22:34:46 -0500
Message-ID: <CAMHX9JLchyEEvh4mHH8hHXEa53r5FP4Ck-kpCMLUiTgOJzYuEw_at_mail.gmail.com>
Hi Chris,
As 12c+ has log writer slaves and Oracle can adaptavely switch from single
LGWR to LGWR + LGxx slave (_use_single_log_writer=adaptive), can you check
ASH data for the LGxx slaves at this time too. Not all process-to-process
interactions/waits are instrumented with a blocking_session in Oracle...
Should be this command:
_at_ashtop username,event2,program2,blocking_session,blocking_inst_id "program
like '%(LG%)'" "TIMESTAMP'2020-01-29 13:04:00'" "TIMESTAMP'2020-01-29
13:05:00'"
--
Tanel Poder
https://blog.tanelpoder.com/seminar
On Thu, Jan 30, 2020 at 3:44 PM Chris Stephens <cstephens16_at_gmail.com>
wrote:
> Here's some additional ash output from the trace data posted in last
> message including event_hist.sql:
>
> SQL> _at_ashtop username,event2,program2,blocking_session,blocking_inst_id
> 1=1 "TIMESTAMP'2020-01-29 13:04:20'" "TIMESTAMP'2020-01-29 13:04:27'"
> TOTALSECONDS AAS %This USERNAME EVENT2
> PROGRAM2 BLOCKING_SESSION BLOCKING_INST_ID FIRST_SEEN
> LAST_SEEN DIST_SQLEXEC_SEEN
> _______________ ______ __________ ________________ ________________
> _____________ ___________________ ___________________
> ______________________ ______________________ ____________________
> 2 0.3 50% | SYS ON CPU
> (sqlplus) 2020-01-29 13:04:24
> 2020-01-29 13:04:25 1
> 1 0.1 25% | GEN3_MGOWER_3 ON CPU
> (python) 2020-01-29 13:04:22
> 2020-01-29 13:04:22 1
> 1 0.1 25% | GEN3_MGOWER_3 log file sync
> (python) 1710 2 2020-01-29 13:04:20
> 2020-01-29 13:04:20 1
>
>
> SQL> _at_bg lgwr
> NAME DESCRIPTION SID OPID SPID PADDR
> SADDR
> _______ ______________ _______ _______ ________ ___________________
> ___________________
> LGWR Redo etc. 1710 34 26552 00000001E8718860
> 00000001D8BFF4A0
>
> SQL> _at_ashtop username,event2,program2,blocking_session,blocking_inst_id
> session_id=1710 "TIMESTAMP'2020-01-29 13:04:20'" "TIMESTAMP'2020-01-29
> 13:04:27'"
>
> no rows selected
>
> SQL> _at_ashtop username,event2,program2,blocking_session,blocking_inst_id
> session_id=1710 "TIMESTAMP'2020-01-29 13:04:00'" "TIMESTAMP'2020-01-29
> 13:05:00'"
>
> no rows selected
>
> SQL> _at_ash/event_hist "log file" 1=1 "TIMESTAMP'2020-01-29 13:04:00'"
> "TIMESTAMP'2020-01-29 13:05:00'"
> EVH_EVENT EVH_MILLISEC EVH_SAMPLE_COUNT
> EVH_EST_EVENT_COUNT EVH_EST_TOTAL_SEC PCT_EVT_TIME EVH_GRAPH
> FIRST_SEEN LAST_SEEN
> ________________ __________________ ___________________
> ______________________ ____________________ _______________ _______________
> ______________________ ______________________
> log file sync < 1024 1
> 1.8 1.8 12.8 |# | 2020-01-29
> 13:04:13 2020-01-29 13:04:13
> log file sync < 4096 1
> 1 4.1 29.1 |### | 2020-01-29
> 13:04:12 2020-01-29 13:04:12
> log file sync < 8192 1
> 1 8.2 58.2 |###### | 2020-01-29
> 13:04:20 2020-01-29 13:04:20
>
> SQL> _at_ash/event_hist "write" 1=1 "TIMESTAMP'2020-01-29 13:04:00'"
> "TIMESTAMP'2020-01-29 13:05:00'"
>
> no rows selected
>
> SQL> _at_ash/event_hist "*" 1=1 "TIMESTAMP'2020-01-29 13:04:00'"
> "TIMESTAMP'2020-01-29 13:05:00'"
> EVH_EVENT EVH_MILLISEC EVH_SAMPLE_COUNT
> EVH_EST_EVENT_COUNT EVH_EST_TOTAL_SEC PCT_EVT_TIME EVH_GRAPH
> FIRST_SEEN LAST_SEEN
> _________________ __________________ ___________________
> ______________________ ____________________ _______________ _______________
> ______________________ ______________________
> log file sync < 1024 1
> 1.8 1.8 12.8 |# | 2020-01-29
> 13:04:13 2020-01-29 13:04:13
> log file sync < 4096 1
> 1 4.1 29.1 |### | 2020-01-29
> 13:04:12 2020-01-29 13:04:12
> log file sync < 8192 1
> 1 8.2 58.2 |###### | 2020-01-29
> 13:04:20 2020-01-29 13:04:20
> row cache lock < 1 1
> 2469.1 2.5 100 |##########| 2020-01-29
> 13:04:01 2020-01-29 13:04:01
>
> SQL> _at_ashtop username,event2,program2,blocking_session,blocking_inst_id
> "lower(program) like '%lg%'" "TIMESTAMP'2020-01-29 13:04:00'"
> "TIMESTAMP'2020-01-29 13:05:00'"
>
> no rows selected
>
> i'm going to copy over gv$active_session_history to a single instance
> database and take a look at ash_wait_chains but i don't expect that to
> reveal anything either.
>
> i need to confirm whether adding redo resolved issue (will have developer
> fire up workload) but I think next step is to investigate x$trace and
> possible process communication issues?
>
> On Thu, Jan 30, 2020 at 10:27 AM Chris Stephens <cstephens16_at_gmail.com>
> wrote:
>
>> Hey Tanel,
>>
>> Thanks for taking time to write this up! Will definitely be attending
>> that webinar.
>>
>> comments inline
>>
>> On Wed, Jan 29, 2020 at 10:32 PM Tanel Poder <tanel_at_tanelpoder.com>
>> wrote:
>>
>>> Comments:
>>>
>>> 1) First, looks like you're using all the right tools for systematic
>>> troubleshooting ;-)
>>>
>>
>> ***i try!
>>
>>>
>>> 2) As "log file sync" is not directly an I/O wait event (IO happens
>>> indirectly under the hood), but "wait for LGWR to ACK/increment redo
>>> persisting position" then there may be multiple reasons for delays. I'm
>>> repeating some of what others have said here too:
>>> - LGWR could be waiting for remote ACK (synchronous data guard) - that
>>> should show up as a LNS* wait or something like that though
>>> - Process slowness due to CPU overload/scheduling latency or swapping
>>> (in both cases, LGWR may do its work fast, but it takes a while for the
>>> foreground process to wake up and end the wait event - you could check if
>>> there are any reports regarding switching LGWR mode in alert.log around
>>> that time)
>>> - Bug/missed post due to switching between LGWR progress polling vs.
>>> waiting for LGWR post
>>> - ASM instance communication issues (ASM communication can be on the
>>> critical path when creating/changing files on ASM and even for reading,
>>> when the ASM metadata cache in your local instance doesn't have what it
>>> needs)
>>> - So, the physical redo log file disk write is not necessarily the
>>> biggest contributor to the log file sync wait
>>>
>>> *** no data guard in the env.
>> *** we have excess CPU. (20 cores / instance and usually only 1-5 active
>> foreground processes but i will confirm from TFA collections)
>> *** yesterday i was ready to blame bug. today, not so sure. i checked log
>> file switches with a script google found me a few days ago. now thinking
>> that was inaccurate as a script from metastink shows tones of switching so
>> that could very well be related.
>> ***asm communication is something i need to look into. TFA has reported
>> some ASM issues but there's no details and i never saw anything in asm
>> alert logs indicating an issue.
>>
>>
>>> 3) You'll need to measure what LGWR itself was doing during the problem
>>> time
>>> - Since you're on 19c (12c or later) the LGWR has slaves too and they
>>> may be doing most of the work
>>> - You'll need to look into both the LGWR process and any other LG*
>>> ones too
>>> - Since you have (Oracle) Snapper running already and writing to trace
>>> you should have the data already
>>> - The *lgwr* in the following _at_snapper ash,stats,trace 10 999999 lgwr
>>> syntax includes LGWR slaves too
>>> - As LGWR slaves have their own sessions (and SIDs), you should run
>>> ashtop etc and report if there's any BLOCKING_SESSION value listed for LGWR
>>> and slaves
>>> - If LGWR and the slaves were listed *idle* during the problem time
>>> (while others waited for log file sync event at the same time) *and if *there
>>> was no huge OS level CPU/memory shortage at the time, this looks like a
>>> background <-> foreground communication problem.
>>> - This (and any potential ASM issues) could probably be
>>> troubleshooted via KST tracing, KSR channels and X$TRACE that could help
>>> here (some of it is enabled by default ( Like I showed in a previous
>>> hacking session: https://www.youtube.com/watch?v=mkmvZv58W6w )
>>>
>>>
>> here is all the output i have corresponding to a 6 sec "log file sync"
>> wait:
>>
>> Experience duration: *R* = *t*1 − *t*0 = 41.336137 seconds
>> <#m_7542945560652427325_m_-5509691095923674063_profile-by-subroutine>
>> Experience begin time: *t*0 = 2020-01-29T13:04:07.560478
>> Experience end time: *t*1 = 2020-01-29T13:04:48.896615
>> CALL-NAME DURATION % CALLS MEAN
>> MIN MAX
>> ------------------------------ --------- ------ ------ --------
>> -------- --------
>> SQL*Net message from client 26.017105 66.5% 9,886 0.002632
>> 0.000209 7.858051
>> log file sync 10.582409 27.1% 150 0.070549
>> 0.000295 6.909711
>> EXEC 2.088757 5.3% 9,702 0.000215
>> 0.000000 0.209199
>> DLM cross inst call completion 0.109617 0.3% 212 0.000517
>> 0.000005 0.080575
>> row cache lock 0.070118 0.2% 125 0.000561
>> 0.000208 0.000972
>> FETCH 0.047586 0.1% 1,029 0.000046
>> 0.000000 0.003056
>> PARSE 0.045293 0.1% 4,778 0.000009
>> 0.000000 0.002692
>> CLOSE 0.033453 0.1% 4,784 0.000007
>> 0.000000 0.000237
>> library cache lock 0.023614 0.1% 55 0.000429
>> 0.000152 0.000901
>> library cache pin 0.022311 0.1% 55 0.000406
>> 0.000153 0.000689
>> 17 others 0.071430 0.2% 10,772 0.000007
>> 0.000000 0.001487
>> ------------------------------ --------- ------ ------ --------
>> -------- --------
>> TOTAL (27) 39.111693 100.0% 41,548 0.000941
>> 0.000000 7.858051
>>
>> 6 sec wait from raw data:
>>
>> *** 2020-01-29T13:04:20.724666-06:00
>> WAIT #0: nam='log file sync' ela= 6909711 buffer#=58996 sync
>> scn=976089279 p3=0 obj#=-1 tim=1025228531958
>> WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536
>> #bytes=1 p3=0 obj#=-1 tim=1025228532118
>> WAIT #0: nam='SQL*Net message from client' ela= 1711 driver id=1413697536
>> #bytes=1 p3=0 obj#=-1 tim=1025228533852
>> BINDS #140204587734864:
>>
>> i enabled 10046 on lgwr:
>>
>> *** 2020-01-29T13:04:20.725994-06:00
>> WAIT #0: nam='rdbms ipc message' ela= 2918443 timeout=300 p2=0 p3=0
>> obj#=-1 tim=1025228533313
>> WAIT #0: nam='log file parallel write' ela= 279 files=1 blocks=17
>> requests=2 obj#=-1 tim=1025228533777
>> WAIT #0: nam='log file parallel write' ela= 171 files=1 blocks=3
>> requests=2 obj#=-1 tim=1025228534104
>> WAIT #0: nam='LGWR wait for redo copy' ela= 22 copy latch #=28 p2=0 p3=0
>> obj#=-1 tim=1025228534182
>> WAIT #0: nam='log file parallel write' ela= 183 files=1 blocks=2
>> requests=2 obj#=-1 tim=1025228534437
>> WAIT #0: nam='log file parallel write' ela= 174 files=1 blocks=2
>> requests=2 obj#=-1 tim=1025228534721
>> WAIT #0: nam='rdbms ipc message' ela= 31 timeout=8 p2=0 p3=0 obj#=-1
>> tim=1025228534786
>> WAIT #0: nam='rdbms ipc message' ela= 3233 timeout=8 p2=0 p3=0 obj#=-1
>> tim=1025228538038
>> WAIT #0: nam='log file parallel write' ela= 170 files=1 blocks=2
>> requests=1 obj#=-1 tim=1025228538297
>> WAIT #0: nam='rdbms ipc message' ela= 25 timeout=8 p2=0 p3=0 obj#=-1
>> tim=1025228538358
>> WAIT #0: nam='rdbms ipc message' ela= 11829 timeout=8 p2=0 p3=0 obj#=-1
>> tim=1025228550204
>> WAIT #0: nam='LGWR wait for redo copy' ela= 52 copy latch #=8 p2=0 p3=0
>> obj#=-1 tim=1025228550294
>> WAIT #0: nam='log file parallel write' ela= 239 files=1 blocks=7
>> requests=1 obj#=-1 tim=1025228550624
>> WAIT #0: nam='rdbms ipc message' ela= 23 timeout=6 p2=0 p3=0 obj#=-1
>> tim=1025228550699
>> WAIT #0: nam='rdbms ipc message' ela= 16750 timeout=6 p2=0 p3=0 obj#=-1
>> tim=1025228567467
>> WAIT #0: nam='log file parallel write' ela= 207 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228567762
>> WAIT #0: nam='rdbms ipc message' ela= 26 timeout=5 p2=0 p3=0 obj#=-1
>> tim=1025228567825
>> WAIT #0: nam='rdbms ipc message' ela= 16249 timeout=5 p2=0 p3=0 obj#=-1
>> tim=1025228584091
>> WAIT #0: nam='log file parallel write' ela= 213 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228584400
>> WAIT #0: nam='rdbms ipc message' ela= 48 timeout=3 p2=0 p3=0 obj#=-1
>> tim=1025228584502
>> WAIT #0: nam='rdbms ipc message' ela= 16466 timeout=3 p2=0 p3=0 obj#=-1
>> tim=1025228600993
>> WAIT #0: nam='log file parallel write' ela= 228 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228601393
>> WAIT #0: nam='rdbms ipc message' ela= 29 timeout=1 p2=0 p3=0 obj#=-1
>> tim=1025228601482
>> WAIT #0: nam='rdbms ipc message' ela= 9548 timeout=1 p2=0 p3=0 obj#=-1
>> tim=1025228611053
>> WAIT #0: nam='rdbms ipc message' ela= 7467 timeout=300 p2=0 p3=0 obj#=-1
>> tim=1025228618574
>> WAIT #0: nam='LGWR wait for redo copy' ela= 13 copy latch #=8 p2=0 p3=0
>> obj#=-1 tim=1025228618636
>> WAIT #0: nam='log file parallel write' ela= 222 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228618960
>> WAIT #0: nam='rdbms ipc message' ela= 31 timeout=300 p2=0 p3=0 obj#=-1
>> tim=1025228619066
>> WAIT #0: nam='rdbms ipc message' ela= 15946 timeout=300 p2=0 p3=0 obj#=-1
>> tim=1025228635045
>> WAIT #0: nam='log file parallel write' ela= 240 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228635427
>> WAIT #0: nam='rdbms ipc message' ela= 33 timeout=298 p2=0 p3=0 obj#=-1
>> tim=1025228635525
>> WAIT #0: nam='rdbms ipc message' ela= 15987 timeout=298 p2=0 p3=0 obj#=-1
>> tim=1025228651537
>> WAIT #0: nam='log file parallel write' ela= 255 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228651912
>> WAIT #0: nam='rdbms ipc message' ela= 51 timeout=296 p2=0 p3=0 obj#=-1
>> tim=1025228652034
>> WAIT #0: nam='rdbms ipc message' ela= 16332 timeout=296 p2=0 p3=0 obj#=-1
>> tim=1025228668405
>> WAIT #0: nam='log file parallel write' ela= 256 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228668804
>> WAIT #0: nam='rdbms ipc message' ela= 37 timeout=295 p2=0 p3=0 obj#=-1
>> tim=1025228668911
>> WAIT #0: nam='rdbms ipc message' ela= 16969 timeout=295 p2=0 p3=0 obj#=-1
>> tim=1025228685907
>> WAIT #0: nam='log file parallel write' ela= 273 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025228686327
>> WAIT #0: nam='rdbms ipc message' ela= 38 timeout=293 p2=0 p3=0 obj#=-1
>> tim=1025228686453
>> WAIT #0: nam='rdbms ipc message' ela= 6908 timeout=293 p2=0 p3=0 obj#=-1
>> tim=1025228693390
>> WAIT #0: nam='log file parallel write' ela= 247 files=1 blocks=2
>> requests=1 obj#=-1 tim=1025228693768
>> WAIT #0: nam='rdbms ipc message' ela= 38 timeout=292 p2=0 p3=0 obj#=-1
>> tim=1025228693862j
>> ...more of the same
>> 264396 WAIT #0: nam='rdbms ipc message' ela= 32 timeout=52 p2=0 p3=0
>> obj#=-1 tim=1025231090742
>> 264397 WAIT #0: nam='rdbms ipc message' ela= 19460 timeout=52 p2=0 p3=0
>> obj#=-1 tim=1025231110227
>> 264398 WAIT #0: nam='log file parallel write' ela= 272 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025231110633
>> 264399 WAIT #0: nam='rdbms ipc message' ela= 37 timeout=50 p2=0 p3=0
>> obj#=-1 tim=1025231110748
>> 264400 WAIT #0: nam='rdbms ipc message' ela= 15484 timeout=50 p2=0 p3=0
>> obj#=-1 tim=1025231126261
>> 264401 WAIT #0: nam='log file parallel write' ela= 247 files=1 blocks=4
>> requests=1 obj#=-1 tim=1025231126641
>> 264402 WAIT #0: nam='rdbms ipc message' ela= 36 timeout=49 p2=0 p3=0
>> obj#=-1 tim=1025231126753
>> 264403 WAIT #0: nam='rdbms ipc message' ela= 16732 timeout=49 p2=0 p3=0
>> obj#=-1 tim=1025231143514
>> 264404 WAIT #0: nam='LGWR wait for redo copy' ela= 31 copy latch #=8
>> p2=0 p3=0 obj#=-1 tim=1025231143603
>> 264405 WAIT #0: nam='log file parallel write' ela= 279 files=1 blocks=9
>> requests=1 obj#=-1 tim=1025231143976
>> 264406 WAIT #0: nam='rdbms ipc message' ela= 35 timeout=47 p2=0 p3=0
>> obj#=-1 tim=1025231144093
>> 264407 WAIT #0: nam='rdbms ipc message' ela= 5384 timeout=47 p2=0 p3=0
>> obj#=-1 tim=1025231149515
>> 264408 WAIT #0: nam='log file parallel write' ela= 255 files=1 blocks=4
>> requests=1 obj#=-1 tim=1025231149889
>> 264409 WAIT #0: nam='rdbms ipc message' ela= 32 timeout=47 p2=0 p3=0
>> obj#=-1 tim=1025231149989
>> 264410
>> 264411 *** 2020-01-29T13:04:23.802774-06:00
>> 264412 WAIT #0: nam='rdbms ipc message' ela= 460056 timeout=46 p2=0 p3=0
>> obj#=-1 tim=1025231610086
>> 264413
>> 264414 *** 2020-01-29T13:04:26.797925-06:00
>> 264415 WAIT #0: nam='rdbms ipc message' ela= 2995027 timeout=300 p2=0
>> p3=0 obj#=-1 tim=1025234605238
>> 264416 WAIT #0: nam='rdbms ipc message' ela= 10649 timeout=1 p2=0 p3=0
>> obj#=-1 tim=1025234616054
>> 264417
>>
>> i will get you snapper lgwr output associated w/ this interval as soon as
>> i can find it (can't remember where i stashed it and lost trace file name
>> .. fml)
>>
>> 4) Paste the full *Snapper* output (of one time snapshot from the
>>> problem time) here if you can - we'll be able to see what the slaves were
>>> doing
>>> - You'd need to include not only LGWR SID but also any LG* slave SIDs
>>> too
>>> - Also we'll see better LGWR delay metrics from V$SESSTAT that go
>>> beyond just wait events
>>> - Thinking of metrics like these - snapper already handles them (if
>>> you look into the rightmost column)
>>> - *redo synch time overhead (usec)* - "FG wakeup overhead per log
>>> file sync"
>>> - *redo write time* - accounting end to end redo write time (not
>>> just the log file parallel write syscall) "per redo write"
>>>
>>> 5) Regarding the *ASH wait chains* error on 19c - I'm aware of this but
>>> forgot to look into it
>>> - It has (probably) something to do with GV$ views (or PX slaves used
>>> for PDB V$ queries) and possibly the connect by I'm using
>>> - Does it work in the CDB (if multitenant)?
>>> - Does it work with lower optimizer features enabled?
>>> - As we are already manually looking into what LGWR and its slaves
>>> are doing, there's no need for the wait chains script here
>>>
>>> ***oracle has filed a bug. altering optimizer_features_enabled doesn't
>> help.
>>
>>
>>> 6) Linux Process Snapper - pSnapper
>>> - If you want to see the full "wait" states of Linux
>>> processes/threads, use the* -a *option.
>>> - It's somewhat like always showing "idle" threads too
>>> - But for linux apps it's harder to determine what's idle and what's
>>> some network RPC wait, so I currently just do what Linux "load" metric does
>>> - I just shows threads in Runnable and Disk/Uninterruptible sleep
>>> states by default (-a shows all states, including Sleep and kernel-Idle
>>> threads etc)
>>> - So the Linux pSnapper "idle" means a somewhat different thing than
>>> Oracle "idle" - say LGWR was waiting for a Data Guard ACK, that should not
>>> be shown as Idle in Oracle, but at the Linux process level it's "just some
>>> network socket wait" and you'd need to use *-a* option to show all of
>>> them
>>>
>>
>> *** will add that option to future collections (and watch the video/read
>> help)
>>
>>>
>>> 7) Were LGWR/LG* processes mostly idle during the problem time?
>>> - As I kept reading the thread, it looks like ASH reported LGWR being
>>> mostly idle during the problem time
>>> - This is not exactly the same pattern, but I've seen cases in the
>>> past where everyone's waiting for LGWR and LGWR doesn't seem to be busy and
>>> doesn't show any significant waits
>>> - It turned out that LGWR was actually waiting for CKPT to make
>>> progress, but it wasn't synchronously waiting, but just going to sleep and
>>> "checking again later" in a loop
>>> - I wrote about this here:
>>> https://blog.tanelpoder.com/posts/log-file-switch-checkpoint-incomplete-and-lgwr-waiting-for-checkpoint-progress/
>>> - So it may still make sense to check what LGWR/LG* processes are
>>> doing even if they don't seem to be active much (like do you see an
>>> occasional controlfile read wait by LGWR etc)
>>> - SQL*Trace would capture all activity of LGWR, but may slow things
>>> down further, so you could also sample & spool V$SESSION_WAIT_HISTORY (that
>>> keeps a log of last 10 waits of every session in memory)
>>>
>>> *** i think i've addressed most of this.
>>
>>
>>> Incidentally I'm doing another hacking session tomorrow (Thursday 30 Jan
>>> _at_ 12pm ET) about complex Oracle performance problems that span multiple
>>> layers and have multiple moving parts. I can add some LGWR troubleshooting
>>> magic in the end of that (here goes my plan to fit the session in 1.5h
>>> again!). I'll post the details in my blog in a few minutes.
>>>
>>> *** can't wait!!
>>
>>> --
>>> Tanel Poder
>>> https://blog.tanelpoder.com/seminar
>>>
>>>
>>> On Tue, Jan 28, 2020 at 2:22 PM Chris Stephens <cstephens16_at_gmail.com>
>>> wrote:
>>>
>>>> doesn't look like any ash data for lgwr:
>>>>
>>>> SQL> _at_ashtop session_id,blocking_session,event2 1=1
>>>> "TIMESTAMP'2020-01-27 15:50:48'" "TIMESTAMP'2020-01-27 15:52:40'"
>>>> TOTALSECONDS AAS %This SESSION_ID BLOCKING_SESSION
>>>> EVENT2 FIRST_SEEN LAST_SEEN
>>>> DIST_SQLEXEC_SEEN
>>>> _______________ ______ __________ _____________ ___________________
>>>> _____________________ ______________________ ______________________
>>>> ____________________
>>>> 81 0.7 82% | 1713 1710 log
>>>> file sync 2020-01-27 15:50:54 2020-01-27 15:52:14
>>>> 1
>>>> 5 0 5% | 1221 ON
>>>> CPU 2020-01-27 15:50:59 2020-01-27 15:52:14
>>>> 1
>>>> 3 0 3% | 246 ON
>>>> CPU 2020-01-27 15:51:00 2020-01-27 15:52:00
>>>> 1
>>>> 3 0 3% | 1713 ON
>>>> CPU 2020-01-27 15:50:51 2020-01-27 15:52:27
>>>> 2
>>>> 1 0 1% | 123 ON
>>>> CPU 2020-01-27 15:51:35 2020-01-27 15:51:35
>>>> 1
>>>> 1 0 1% | 127 ON
>>>> CPU 2020-01-27 15:51:50 2020-01-27 15:51:50
>>>> 1
>>>> 1 0 1% | 252 2321
>>>> latch free 2020-01-27 15:52:28 2020-01-27 15:52:28
>>>> 1
>>>> 1 0 1% | 978 ges
>>>> remote message 2020-01-27 15:50:48 2020-01-27 15:50:48
>>>> 1
>>>> 1 0 1% | 983
>>>> latch free 2020-01-27 15:52:28 2020-01-27 15:52:28
>>>> 1
>>>> 1 0 1% | 1713
>>>> library cache lock 2020-01-27 15:50:48 2020-01-27 15:50:48
>>>> 1
>>>> 1 0 1% | 1831 ON
>>>> CPU 2020-01-27 15:50:49 2020-01-27 15:50:49
>>>> 1
>>>>
>>>>
>>>> 11 rows selected.
>>>>
>>>>
>>>> SQL> _at_ashtop session_id,blocking_session,event2 session_id=1710
>>>> "TIMESTAMP'2020-01-27 15:50:48'" "TIMESTAMP'2020-01-27 15:52:40'"
>>>>
>>>> no rows selected
>>>>
>>>> On Tue, Jan 28, 2020 at 1:09 PM Noveljic Nenad <
>>>> nenad.noveljic_at_vontobel.com> wrote:
>>>>
>>>>> You’re using v0.2, but v0.6 should be the latest version. 0.2
>>>>> wouldn’t show idle blockers.
>>>>>
>>>>> I looked at the ashtop output once again. Strange that lgwr processes
>>>>> nor any other bg processes show up significantly there. The question is,
>>>>> what was lgwr doing while the fg processes were waiting on him.
>>>>>
>>>>>
--
http://www.freelists.org/webpage/oracle-l
Received on Sun Feb 02 2020 - 04:34:46 CET
