Return-Path: <oracle-l-bounce@freelists.org>
X-Original-To: oracle-l@orafaq.com
Delivered-To: oracle-l@orafaq.com
Received: from turing.freelists.org (turing.freelists.org [206.53.239.180])
 by malta2546.startdedicated.com (Postfix) with ESMTPS id 55DAB1002D136B
 for <oracle-l@orafaq.com>; Sun,  2 Feb 2020 04:52:34 +0100 (CET)
Received: from localhost (localhost [127.0.0.1])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 78DB925F7C;
 Sat,  1 Feb 2020 22:52:32 -0500 (EST)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=freelists.org;
 s=turing; t=1580615552;
 bh=NlenKc5rLXvVZrMMygqaolKnEqvk4itt+p7X4doIXVk=;
 h=From:Sender:Sender:From;
 b=niASEymNyEPdep6bG72EcX9gl9vScd6hZBW/AeHt+EkP3T2A3EASlNvDLEQdhtQmG
	 DiyCVohYpwYdBWFhW4uG5MhAmLRYhzPFjzuxVg8H+isq3mVNLhnEmR48qu8Iztg00Q
	 o6nhH41nEjeBslbm/YlApW+Kx2e6y0KNKQNJh8do=
X-Virus-Scanned: Debian amavisd-new at turing.freelists.org
Received: from turing.freelists.org ([127.0.0.1])
 by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id TobzpcBVHcjE; Sat,  1 Feb 2020 22:52:31 -0500 (EST)
Received: from turing.freelists.org (localhost [127.0.0.1])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 79C9825FAA;
 Sat,  1 Feb 2020 22:51:40 -0500 (EST)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=freelists.org;
 s=turing; t=1580615545;
 bh=NlenKc5rLXvVZrMMygqaolKnEqvk4itt+p7X4doIXVk=;
 h=From:Sender:Sender:From;
 b=nn+I3dP9ywSKDv7M+UpwPNWXqaWlgn0E4x+fH4RW0PZ4fQ2fTI7eQDLtO11OagNnM
	 fh+ViU3eAvJ69kPABrd7m+FIpSsTkC/XHhAd4Kvd8jzmUUZBtZBMV7KPf8UdBx/FOx
	 49i1g53qtMcMGC2eGTz9A8H6e90qbyOKumagvKLg=
Received: with ECARTIS (v1.0.0; list oracle-l); Sat, 01 Feb 2020 22:50:55 -0500 (EST)
Received: from localhost (localhost [127.0.0.1])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 9C04925F53
 for <Oracle-L@freelists.org>; Sat,  1 Feb 2020 22:50:54 -0500 (EST)
Authentication-Results: turing.freelists.org;
 dkim=pass (2048-bit key; unprotected) header.d=tanelpoder-com.20150623.gappssmtp.com header.i=@tanelpoder-com.20150623.gappssmtp.com header.b="vHhQn6V2";
 dkim-atps=neutral
Received: from turing.freelists.org ([127.0.0.1])
 by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id mlHo9uiutCs8 for <Oracle-L@freelists.org>;
 Sat,  1 Feb 2020 22:50:54 -0500 (EST)
Received: from mail-lj1-f179.google.com (mail-lj1-f179.google.com [209.85.208.179])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id BD13825F50
 for <Oracle-L@freelists.org>; Sat,  1 Feb 2020 22:50:53 -0500 (EST)
Received: by mail-lj1-f179.google.com with SMTP id x14so11098506ljd.13
        for <Oracle-L@freelists.org>; Sat, 01 Feb 2020 19:50:53 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:mime-version:references:in-reply-to:from:date
         :message-id:subject:to:cc;
        bh=OAhpOydo/2cx1npvj74bqFE9CUEe7HYpWjn1FkwKhwU=;
        b=Ru7O6bJ7ZMinTO8e511U4IWw8T9XJobY06UqHyFe3LQegYXwy47xpIX6TrZCa8XfEo
         Qa6RJKJ+kt8DDDQdptfUfkrEKtxdBSZDvDdFvvOY5kCoXuL/lOigBcSIH+meUoNENHdh
         pBGSqTJZf5HQPxwo/ysnPR13jaTRFMiL7Rb6eQahFYQ/thuOthwvjrF5XGq75yhH/Dg0
         vpcWmfKH5R2VCFBlyRaT/NtoMnogj2lhnhnCcS3+Fycqn16ftXXNK8F6cnhmu/wRjbqr
         oQoSC7Ab56P8MYKyB9sOKBxRfeGwphl6298k+pMV2RBmnJ15+1JKflYG1K1q8uCJBHiz
         Z1Dw==
X-Gm-Message-State: APjAAAXvKpKR5JVqftvHbIyO2y9tOCHqIdqGHnsbUUdLWux7nMwyhnAL
 neVoaLpaihJS7S6ZJGXfuHSIWnsIjZNMH3imaobmCg==
X-Google-Smtp-Source: APXvYqxVIb9Mn2HCp6umPRAbRoAvsPv3Zex9n8oQl+3tL6ytLByYdFn1FKWwsFSHL+d61nF4K8IG2tAWqsHsTzlsf8U=
X-Received: by 2002:a2e:86d6:: with SMTP id n22mr9424819ljj.77.1580615452228;
 Sat, 01 Feb 2020 19:50:52 -0800 (PST)
MIME-Version: 1.0
References: <CAEFL0szEBH0P-hqb39Pmjw-pdJ60OZ+RUuFoqCXsC7+=iPEFzQ@mail.gmail.com>
 <38763_1580228753_5E306091_38763_12713_1_cf89e1868e2e40949f6177bf1c604696@vontobel.com>
 <CAEFL0swS2CGeDF2pGWHDBQkO8J1cH7cMbDCxBngx613UW52A+Q@mail.gmail.com>
 <39760_1580232621_5E306FAD_39760_6519_1_238f704e660247f78d5e2f6a5c98b27f@vontobel.com>
 <CAEFL0szP5+-ZUog8Q=P0TiusLw6MpDDM46hZM=7ps0v0o3p_OQ@mail.gmail.com>
 <38763_1580238546_5E3086D2_38763_18592_1_8d9cba2aa89d4aba82f0b60173245848@vontobel.com>
 <CAEFL0szn17WU9kSj7ipRhieXd4w_3hzhU5gSmxDxhx1=WnqDkw@mail.gmail.com>
 <CAMHX9J+FooJWWr896aFx7oqeMk=PKA__GaHjxjU_gsOyE1Tzjg@mail.gmail.com>
 <CAEFL0sx7fAF89Qike6PS2Jy9-N5S-w-r31XQ7kj8AEGP5FonTg@mail.gmail.com>
 <CAEFL0syR4Xy2fPiL4M3jtmmfj_02MmYgC_haCM=LDZ17ikuGgg@mail.gmail.com> <CAMHX9JLchyEEvh4mHH8hHXEa53r5FP4Ck-kpCMLUiTgOJzYuEw@mail.gmail.com>
In-Reply-To: <CAMHX9JLchyEEvh4mHH8hHXEa53r5FP4Ck-kpCMLUiTgOJzYuEw@mail.gmail.com>
From: Tanel Poder <tanel@tanelpoder.com>
Date: Sat, 1 Feb 2020 22:50:40 -0500
Message-ID: <CAMHX9JLW_i_c+7V0jMwXhWeAKNvUhEdVkf0sKEyHg62XyZP3aA@mail.gmail.com>
Subject: Re: intermittent long "log file sync" waits
To: Tanel Poder <tanel@tanelpoder.com>
Cc: Chris Stephens <cstephens16@gmail.com>, Noveljic Nenad <nenad.noveljic@vontobel.com>,
 oracle-l <Oracle-L@freelists.org>
Content-Type: multipart/alternative; boundary="0000000000008e8d78059d8fb69f"
X-archive-position: 76127
X-ecartis-version: Ecartis v1.0.0
Sender: oracle-l-bounce@freelists.org
Errors-to: oracle-l-bounce@freelists.org
X-original-sender: tanel@tanelpoder.com
Precedence: normal
Reply-To: tanel@tanelpoder.com
List-Help: <mailto:ecartis@freelists.org?Subject=help>
List-Unsubscribe: <mailto:oracle-l-request@freelists.org?Subject=unsubscribe>
List-software: Ecartis version 1.0.0
List-Id: oracle-l <oracle-l.freelists.org>
X-List-ID: oracle-l <oracle-l.freelists.org>
List-Subscribe: <mailto:oracle-l-request@freelists.org?Subject=subscribe>
List-Owner: <mailto:>
List-post: <mailto:oracle-l@freelists.org>
List-Archive: <http://www.freelists.org/archives/oracle-l>
X-list: oracle-l
--0000000000008e8d78059d8fb69f
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

But your trace file here already indicates that there's some glitch
regarding LGWR wakeups/posting:

*Foreground:*

*** 2020-01-29T*13:04:20*.724666-06:00
WAIT #0: nam=3D'log file sync' ela=3D 6909711 buffer#=3D58996 sync scn=3D97=
6089279
p3=3D0 obj#=3D-1 tim=3D1025228531958

*LGWR at the same time:*

*** 2020-01-29T*13:04:20*.725994-06:00
WAIT #0: nam=3D'rdbms ipc message' ela=3D 2918443 timeout=3D300 p2=3D0 p3=
=3D0 obj#=3D-1
tim=3D1025228533313

The WAIT lines are printed out when the wait ends (and the *** timestamp
line just before the WAIT line is printed out, if enough time has passed
since the last *** line). So this shows that LGWR had happily been idle,
waiting for its idle 3-second timeout to expire, or someone to wake it up
(that's what "rdbms ipc message" means) - while someone had issued a commit
and was waiting for LGWR acknowledgment. So this shows that LGWR didn't
even get the post from your session, nor any other sessions (if you had
concurrent transactional activity). Back in 11g, typically these problems
happened when adaptive log file syncing switch between poll and post/wait
wakeup method happened and you hit a bug. But in 12c there are more moving
parts.

--
Tanel Poder
https://blog.tanelpoder.com/seminar

On Sat, Feb 1, 2020 at 10:34 PM Tanel Poder <tanel@tanelpoder.com> wrote:

> Hi Chris,
>
> As 12c+ has log writer slaves and Oracle can adaptavely switch from singl=
e
> LGWR to LGWR + LGxx slave (_use_single_log_writer=3Dadaptive), can you ch=
eck
> ASH data for the LGxx slaves at this time too. Not all process-to-process
> interactions/waits are instrumented with a blocking_session in Oracle...
>
> Should be this command:
>
> @ashtop username,event2,program2,blocking_session,blocking_inst_id
> "program like '%(LG%)'" "TIMESTAMP'2020-01-29 13:04:00'"
> "TIMESTAMP'2020-01-29 13:05:00'"
>
> --
> Tanel Poder
> https://blog.tanelpoder.com/seminar
>
>
> On Thu, Jan 30, 2020 at 3:44 PM Chris Stephens <cstephens16@gmail.com>
> wrote:
>
>> Here's some additional ash output from the trace data posted in last
>> message including event_hist.sql:
>>
>> SQL> @ashtop username,event2,program2,blocking_session,blocking_inst_id
>> 1=3D1 "TIMESTAMP'2020-01-29 13:04:20'" "TIMESTAMP'2020-01-29 13:04:27'"
>>    TOTALSECONDS    AAS      %This         USERNAME           EVENT2
>>  PROGRAM2    BLOCKING_SESSION    BLOCKING_INST_ID             FIRST_SEEN
>>            LAST_SEEN    DIST_SQLEXEC_SEEN
>> _______________ ______ __________ ________________ ________________
>> _____________ ___________________ ___________________
>> ______________________ ______________________ ____________________
>>               2    0.3   50% |    SYS              ON CPU
>> (sqlplus)                                             2020-01-29 13:04:2=
4
>>  2020-01-29 13:04:25                       1
>>               1    0.1   25% |    GEN3_MGOWER_3    ON CPU
>> (python)                                              2020-01-29 13:04:2=
2
>>  2020-01-29 13:04:22                       1
>>               1    0.1   25% |    GEN3_MGOWER_3    log file sync
>>  (python)                     1710                   2 2020-01-29 13:04:=
20
>>    2020-01-29 13:04:20                       1
>>
>>
>> SQL> @bg lgwr
>>    NAME    DESCRIPTION     SID    OPID     SPID               PADDR
>>         SADDR
>> _______ ______________ _______ _______ ________ ___________________
>> ___________________
>> LGWR    Redo etc.         1710      34 26552    00000001E8718860
>>  00000001D8BFF4A0
>>
>> SQL> @ashtop username,event2,program2,blocking_session,blocking_inst_id
>> session_id=3D1710 "TIMESTAMP'2020-01-29 13:04:20'" "TIMESTAMP'2020-01-29
>> 13:04:27'"
>>
>> no rows selected
>>
>> SQL> @ashtop username,event2,program2,blocking_session,blocking_inst_id
>> session_id=3D1710 "TIMESTAMP'2020-01-29 13:04:00'" "TIMESTAMP'2020-01-29
>> 13:05:00'"
>>
>> no rows selected
>>
>> SQL> @ash/event_hist "log file" 1=3D1  "TIMESTAMP'2020-01-29 13:04:00'"
>> "TIMESTAMP'2020-01-29 13:05:00'"
>>        EVH_EVENT       EVH_MILLISEC    EVH_SAMPLE_COUNT
>>  EVH_EST_EVENT_COUNT    EVH_EST_TOTAL_SEC    PCT_EVT_TIME       EVH_GRAP=
H
>>           FIRST_SEEN              LAST_SEEN
>> ________________ __________________ ___________________
>> ______________________ ____________________ _______________ ____________=
___
>> ______________________ ______________________
>> log file sync             < 1024                      1
>>  1.8                  1.8            12.8 |#         |    2020-01-29
>> 13:04:13    2020-01-29 13:04:13
>> log file sync             < 4096                      1
>>    1                  4.1            29.1 |###       |    2020-01-29
>> 13:04:12    2020-01-29 13:04:12
>> log file sync             < 8192                      1
>>    1                  8.2            58.2 |######    |    2020-01-29
>> 13:04:20    2020-01-29 13:04:20
>>
>> SQL> @ash/event_hist "write" 1=3D1  "TIMESTAMP'2020-01-29 13:04:00'"
>> "TIMESTAMP'2020-01-29 13:05:00'"
>>
>> no rows selected
>>
>> SQL> @ash/event_hist "*" 1=3D1  "TIMESTAMP'2020-01-29 13:04:00'"
>> "TIMESTAMP'2020-01-29 13:05:00'"
>>         EVH_EVENT       EVH_MILLISEC    EVH_SAMPLE_COUNT
>>  EVH_EST_EVENT_COUNT    EVH_EST_TOTAL_SEC    PCT_EVT_TIME       EVH_GRAP=
H
>>           FIRST_SEEN              LAST_SEEN
>> _________________ __________________ ___________________
>> ______________________ ____________________ _______________ ____________=
___
>> ______________________ ______________________
>> log file sync              < 1024                      1
>>    1.8                  1.8            12.8 |#         |    2020-01-29
>> 13:04:13    2020-01-29 13:04:13
>> log file sync              < 4096                      1
>>      1                  4.1            29.1 |###       |    2020-01-29
>> 13:04:12    2020-01-29 13:04:12
>> log file sync              < 8192                      1
>>      1                  8.2            58.2 |######    |    2020-01-29
>> 13:04:20    2020-01-29 13:04:20
>> row cache lock                < 1                      1
>> 2469.1                  2.5             100 |##########|    2020-01-29
>> 13:04:01    2020-01-29 13:04:01
>>
>> SQL> @ashtop username,event2,program2,blocking_session,blocking_inst_id
>> "lower(program) like '%lg%'" "TIMESTAMP'2020-01-29 13:04:00'"
>> "TIMESTAMP'2020-01-29 13:05:00'"
>>
>> no rows selected
>>
>> i'm going to copy over gv$active_session_history to a single instance
>> database and take a look at ash_wait_chains but i don't expect that to
>> reveal anything either.
>>
>> i need to confirm whether adding redo resolved issue (will have develope=
r
>> fire up workload) but I think next step is to investigate x$trace and
>> possible process communication issues?
>>
>> On Thu, Jan 30, 2020 at 10:27 AM Chris Stephens <cstephens16@gmail.com>
>> wrote:
>>
>>> Hey Tanel,
>>>
>>> Thanks for taking time to write this up! Will definitely be attending
>>> that webinar.
>>>
>>> comments inline
>>>
>>> On Wed, Jan 29, 2020 at 10:32 PM Tanel Poder <tanel@tanelpoder.com>
>>> wrote:
>>>
>>>> Comments:
>>>>
>>>> 1) First, looks like you're using all the right tools for systematic
>>>> troubleshooting ;-)
>>>>
>>>
>>> ***i try!
>>>
>>>>
>>>> 2) As "log file sync" is not directly an I/O wait event (IO happens
>>>> indirectly under the hood), but "wait for LGWR to ACK/increment redo
>>>> persisting position" then there may be multiple reasons for delays. I'=
m
>>>> repeating some of what others have said here too:
>>>>   - LGWR could be waiting for remote ACK (synchronous data guard) -
>>>> that should show up as a LNS* wait or something like that though
>>>>   - Process slowness due to CPU overload/scheduling latency or swappin=
g
>>>> (in both cases, LGWR may do its work fast, but it takes a while for th=
e
>>>> foreground process to wake up and end the wait event - you could check=
 if
>>>> there are any reports regarding switching LGWR mode in alert.log aroun=
d
>>>> that time)
>>>>   - Bug/missed post due to switching between LGWR progress polling vs.
>>>> waiting for LGWR post
>>>>   - ASM instance communication issues (ASM communication can be on the
>>>> critical path when creating/changing files on ASM and even for reading=
,
>>>> when the ASM metadata cache in your local instance doesn't have what i=
t
>>>> needs)
>>>>   - So, the physical redo log file disk write is not necessarily the
>>>> biggest contributor to the log file sync wait
>>>>
>>>> *** no data guard in the env.
>>> *** we have excess CPU. (20 cores / instance and usually only 1-5 activ=
e
>>> foreground processes but i will confirm from TFA collections)
>>> *** yesterday i was ready to blame bug. today, not so sure. i checked
>>> log file switches with a script google found me a few days ago. now
>>> thinking that was inaccurate as a script from metastink shows tones of
>>> switching so that could very well be related.
>>> ***asm communication is something i need to look into. TFA has reported
>>> some ASM issues but there's no details and i never saw anything in asm
>>> alert logs indicating an issue.
>>>
>>>
>>>> 3) You'll need to measure what LGWR itself was doing during the proble=
m
>>>> time
>>>>   - Since you're on 19c (12c or later) the LGWR has slaves too and the=
y
>>>> may be doing most of the work
>>>>   - You'll need to look into both the LGWR process and any other LG*
>>>> ones too
>>>>   - Since you have (Oracle) Snapper running already and writing to
>>>> trace you should have the data already
>>>>   - The *lgwr* in the following @snapper ash,stats,trace 10 999999 lgw=
r
>>>> syntax includes LGWR slaves too
>>>>   - As LGWR slaves have their own sessions (and SIDs), you should run
>>>> ashtop etc and report if there's any BLOCKING_SESSION value listed for=
 LGWR
>>>> and slaves
>>>>   - If LGWR and the slaves were listed *idle* during the problem time
>>>> (while others waited for log file sync event at the same time) *and
>>>> if *there was no huge OS level CPU/memory shortage at the time, this
>>>> looks like a background <-> foreground communication problem.
>>>>      - This (and any potential ASM issues) could probably be
>>>> troubleshooted via KST tracing, KSR channels and X$TRACE that could he=
lp
>>>> here (some of it is enabled by default ( Like I showed in a previous
>>>> hacking session: https://www.youtube.com/watch?v=3DmkmvZv58W6w )
>>>>
>>>>
>>> here is all the output i have corresponding to a 6 sec "log file sync"
>>> wait:
>>>
>>> Experience duration: *R* =3D *t*1 =E2=88=92 *t*0 =3D 41.336137 seconds
>>> <#m_-8103031405908768729_m_7542945560652427325_m_-5509691095923674063_p=
rofile-by-subroutine>
>>> Experience begin time: *t*0 =3D 2020-01-29T13:04:07.560478
>>> Experience end time: *t*1 =3D 2020-01-29T13:04:48.896615
>>> CALL-NAME                        DURATION       %   CALLS      MEAN
>>>   MIN       MAX
>>> ------------------------------  ---------  ------  ------  --------
>>>  --------  --------
>>> SQL*Net message from client     26.017105   66.5%   9,886  0.002632
>>>  0.000209  7.858051
>>> log file sync                   10.582409   27.1%     150  0.070549
>>>  0.000295  6.909711
>>> EXEC                             2.088757    5.3%   9,702  0.000215
>>>  0.000000  0.209199
>>> DLM cross inst call completion   0.109617    0.3%     212  0.000517
>>>  0.000005  0.080575
>>> row cache lock                   0.070118    0.2%     125  0.000561
>>>  0.000208  0.000972
>>> FETCH                            0.047586    0.1%   1,029  0.000046
>>>  0.000000  0.003056
>>> PARSE                            0.045293    0.1%   4,778  0.000009
>>>  0.000000  0.002692
>>> CLOSE                            0.033453    0.1%   4,784  0.000007
>>>  0.000000  0.000237
>>> library cache lock               0.023614    0.1%      55  0.000429
>>>  0.000152  0.000901
>>> library cache pin                0.022311    0.1%      55  0.000406
>>>  0.000153  0.000689
>>> 17 others                        0.071430    0.2%  10,772  0.000007
>>>  0.000000  0.001487
>>> ------------------------------  ---------  ------  ------  --------
>>>  --------  --------
>>> TOTAL (27)                      39.111693  100.0%  41,548  0.000941
>>>  0.000000  7.858051
>>>
>>> 6 sec wait from raw data:
>>>
>>> *** 2020-01-29T13:04:20.724666-06:00
>>> WAIT #0: nam=3D'log file sync' ela=3D 6909711 buffer#=3D58996 sync
>>> scn=3D976089279 p3=3D0 obj#=3D-1 tim=3D1025228531958
>>> WAIT #0: nam=3D'SQL*Net message to client' ela=3D 2 driver id=3D1413697=
536
>>> #bytes=3D1 p3=3D0 obj#=3D-1 tim=3D1025228532118
>>> WAIT #0: nam=3D'SQL*Net message from client' ela=3D 1711 driver
>>> id=3D1413697536 #bytes=3D1 p3=3D0 obj#=3D-1 tim=3D1025228533852
>>> BINDS #140204587734864:
>>>
>>> i enabled 10046 on lgwr:
>>>
>>> *** 2020-01-29T13:04:20.725994-06:00
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 2918443 timeout=3D300 p2=3D0 =
p3=3D0
>>> obj#=3D-1 tim=3D1025228533313
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 279 files=3D1 blocks=3D=
17
>>> requests=3D2 obj#=3D-1 tim=3D1025228533777
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 171 files=3D1 blocks=3D=
3
>>> requests=3D2 obj#=3D-1 tim=3D1025228534104
>>> WAIT #0: nam=3D'LGWR wait for redo copy' ela=3D 22 copy latch #=3D28 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025228534182
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 183 files=3D1 blocks=3D=
2
>>> requests=3D2 obj#=3D-1 tim=3D1025228534437
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 174 files=3D1 blocks=3D=
2
>>> requests=3D2 obj#=3D-1 tim=3D1025228534721
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 31 timeout=3D8 p2=3D0 p3=3D0 =
obj#=3D-1
>>> tim=3D1025228534786
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 3233 timeout=3D8 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228538038
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 170 files=3D1 blocks=3D=
2
>>> requests=3D1 obj#=3D-1 tim=3D1025228538297
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 25 timeout=3D8 p2=3D0 p3=3D0 =
obj#=3D-1
>>> tim=3D1025228538358
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 11829 timeout=3D8 p2=3D0 p3=
=3D0 obj#=3D-1
>>> tim=3D1025228550204
>>> WAIT #0: nam=3D'LGWR wait for redo copy' ela=3D 52 copy latch #=3D8 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025228550294
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 239 files=3D1 blocks=3D=
7
>>> requests=3D1 obj#=3D-1 tim=3D1025228550624
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 23 timeout=3D6 p2=3D0 p3=3D0 =
obj#=3D-1
>>> tim=3D1025228550699
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 16750 timeout=3D6 p2=3D0 p3=
=3D0 obj#=3D-1
>>> tim=3D1025228567467
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 207 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228567762
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 26 timeout=3D5 p2=3D0 p3=3D0 =
obj#=3D-1
>>> tim=3D1025228567825
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 16249 timeout=3D5 p2=3D0 p3=
=3D0 obj#=3D-1
>>> tim=3D1025228584091
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 213 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228584400
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 48 timeout=3D3 p2=3D0 p3=3D0 =
obj#=3D-1
>>> tim=3D1025228584502
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 16466 timeout=3D3 p2=3D0 p3=
=3D0 obj#=3D-1
>>> tim=3D1025228600993
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 228 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228601393
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 29 timeout=3D1 p2=3D0 p3=3D0 =
obj#=3D-1
>>> tim=3D1025228601482
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 9548 timeout=3D1 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228611053
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 7467 timeout=3D300 p2=3D0 p3=
=3D0 obj#=3D-1
>>> tim=3D1025228618574
>>> WAIT #0: nam=3D'LGWR wait for redo copy' ela=3D 13 copy latch #=3D8 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025228618636
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 222 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228618960
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 31 timeout=3D300 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228619066
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 15946 timeout=3D300 p2=3D0 p3=
=3D0
>>> obj#=3D-1 tim=3D1025228635045
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 240 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228635427
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 33 timeout=3D298 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228635525
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 15987 timeout=3D298 p2=3D0 p3=
=3D0
>>> obj#=3D-1 tim=3D1025228651537
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 255 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228651912
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 51 timeout=3D296 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228652034
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 16332 timeout=3D296 p2=3D0 p3=
=3D0
>>> obj#=3D-1 tim=3D1025228668405
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 256 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228668804
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 37 timeout=3D295 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228668911
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 16969 timeout=3D295 p2=3D0 p3=
=3D0
>>> obj#=3D-1 tim=3D1025228685907
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 273 files=3D1 blocks=3D=
9
>>> requests=3D1 obj#=3D-1 tim=3D1025228686327
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 38 timeout=3D293 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228686453
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 6908 timeout=3D293 p2=3D0 p3=
=3D0 obj#=3D-1
>>> tim=3D1025228693390
>>> WAIT #0: nam=3D'log file parallel write' ela=3D 247 files=3D1 blocks=3D=
2
>>> requests=3D1 obj#=3D-1 tim=3D1025228693768
>>> WAIT #0: nam=3D'rdbms ipc message' ela=3D 38 timeout=3D292 p2=3D0 p3=3D=
0 obj#=3D-1
>>> tim=3D1025228693862j
>>>  ...more of the same
>>> 264396 WAIT #0: nam=3D'rdbms ipc message' ela=3D 32 timeout=3D52 p2=3D0=
 p3=3D0
>>> obj#=3D-1 tim=3D1025231090742
>>>  264397 WAIT #0: nam=3D'rdbms ipc message' ela=3D 19460 timeout=3D52 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025231110227
>>>  264398 WAIT #0: nam=3D'log file parallel write' ela=3D 272 files=3D1 b=
locks=3D9
>>> requests=3D1 obj#=3D-1 tim=3D1025231110633
>>>  264399 WAIT #0: nam=3D'rdbms ipc message' ela=3D 37 timeout=3D50 p2=3D=
0 p3=3D0
>>> obj#=3D-1 tim=3D1025231110748
>>>  264400 WAIT #0: nam=3D'rdbms ipc message' ela=3D 15484 timeout=3D50 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025231126261
>>>  264401 WAIT #0: nam=3D'log file parallel write' ela=3D 247 files=3D1 b=
locks=3D4
>>> requests=3D1 obj#=3D-1 tim=3D1025231126641
>>>  264402 WAIT #0: nam=3D'rdbms ipc message' ela=3D 36 timeout=3D49 p2=3D=
0 p3=3D0
>>> obj#=3D-1 tim=3D1025231126753
>>>  264403 WAIT #0: nam=3D'rdbms ipc message' ela=3D 16732 timeout=3D49 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025231143514
>>>  264404 WAIT #0: nam=3D'LGWR wait for redo copy' ela=3D 31 copy latch #=
=3D8
>>> p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025231143603
>>>  264405 WAIT #0: nam=3D'log file parallel write' ela=3D 279 files=3D1 b=
locks=3D9
>>> requests=3D1 obj#=3D-1 tim=3D1025231143976
>>>  264406 WAIT #0: nam=3D'rdbms ipc message' ela=3D 35 timeout=3D47 p2=3D=
0 p3=3D0
>>> obj#=3D-1 tim=3D1025231144093
>>>  264407 WAIT #0: nam=3D'rdbms ipc message' ela=3D 5384 timeout=3D47 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025231149515
>>>  264408 WAIT #0: nam=3D'log file parallel write' ela=3D 255 files=3D1 b=
locks=3D4
>>> requests=3D1 obj#=3D-1 tim=3D1025231149889
>>>  264409 WAIT #0: nam=3D'rdbms ipc message' ela=3D 32 timeout=3D47 p2=3D=
0 p3=3D0
>>> obj#=3D-1 tim=3D1025231149989
>>>  264410
>>>  264411 *** 2020-01-29T13:04:23.802774-06:00
>>>  264412 WAIT #0: nam=3D'rdbms ipc message' ela=3D 460056 timeout=3D46 p=
2=3D0
>>> p3=3D0 obj#=3D-1 tim=3D1025231610086
>>>  264413
>>>  264414 *** 2020-01-29T13:04:26.797925-06:00
>>>  264415 WAIT #0: nam=3D'rdbms ipc message' ela=3D 2995027 timeout=3D300=
 p2=3D0
>>> p3=3D0 obj#=3D-1 tim=3D1025234605238
>>>  264416 WAIT #0: nam=3D'rdbms ipc message' ela=3D 10649 timeout=3D1 p2=
=3D0 p3=3D0
>>> obj#=3D-1 tim=3D1025234616054
>>>  264417
>>>
>>> i will get you snapper lgwr output associated w/ this interval as soon
>>> as i can find it (can't remember where i stashed it and lost trace file
>>> name .. fml)
>>>
>>> 4) Paste the full *Snapper* output (of one time snapshot from the
>>>> problem time) here if you can - we'll be able to see what the slaves w=
ere
>>>> doing
>>>>   - You'd need to include not only LGWR SID but also any LG* slave SID=
s
>>>> too
>>>>   - Also we'll see better LGWR delay metrics from V$SESSTAT that go
>>>> beyond just wait events
>>>>   - Thinking of metrics like these - snapper already handles them (if
>>>> you look into the rightmost column)
>>>>      - *redo synch time overhead (usec)* - "FG wakeup overhead per log
>>>> file sync"
>>>>      - *redo write time* - accounting end to end redo write time (not
>>>> just the log file parallel write syscall) "per redo write"
>>>>
>>>> 5) Regarding the *ASH wait chains* error on 19c - I'm aware of this
>>>> but forgot to look into it
>>>>    - It has (probably) something to do with GV$ views (or PX slaves
>>>> used for PDB V$ queries) and possibly the connect by I'm using
>>>>    - Does it work in the CDB (if multitenant)?
>>>>    - Does it work with lower optimizer features enabled?
>>>>    - As we are already manually looking into what LGWR and its slaves
>>>> are doing, there's no need for the wait chains script here
>>>>
>>>> ***oracle has filed a bug. altering optimizer_features_enabled doesn't
>>> help.
>>>
>>>
>>>> 6) Linux Process Snapper - pSnapper
>>>>    - If you want to see the full "wait" states of Linux
>>>> processes/threads, use the* -a *option.
>>>>    - It's somewhat like always showing "idle" threads too
>>>>    - But for linux apps it's harder to determine what's idle and what'=
s
>>>> some network RPC wait, so I currently just do what Linux "load" metric=
 does
>>>>    - I just shows threads in Runnable and Disk/Uninterruptible sleep
>>>> states by default (-a shows all states, including Sleep and kernel-Idl=
e
>>>> threads etc)
>>>>    - So the Linux pSnapper "idle" means a somewhat different thing tha=
n
>>>> Oracle "idle" - say LGWR was waiting for a Data Guard ACK, that should=
 not
>>>> be shown as Idle in Oracle, but at the Linux process level it's "just =
some
>>>> network socket wait" and you'd need to use *-a* option to show all of
>>>> them
>>>>
>>>
>>> *** will add that option to future collections (and watch the video/rea=
d
>>> help)
>>>
>>>>
>>>> 7) Were LGWR/LG* processes mostly idle during the problem time?
>>>>    - As I kept reading the thread, it looks like ASH reported LGWR
>>>> being mostly idle during the problem time
>>>>    - This is not exactly the same pattern, but I've seen cases in the
>>>> past where everyone's waiting for LGWR and LGWR doesn't seem to be bus=
y and
>>>> doesn't show any significant waits
>>>>    - It turned out that LGWR was actually waiting for CKPT to make
>>>> progress, but it wasn't synchronously waiting, but just going to sleep=
 and
>>>> "checking again later" in a loop
>>>>    - I wrote about this here:
>>>> https://blog.tanelpoder.com/posts/log-file-switch-checkpoint-incomplet=
e-and-lgwr-waiting-for-checkpoint-progress/
>>>>    - So it may still make sense to check what LGWR/LG* processes are
>>>> doing even if they don't seem to be active much (like do you see an
>>>> occasional controlfile read wait by LGWR etc)
>>>>    - SQL*Trace would capture all activity of LGWR, but may slow things
>>>> down further, so you could also sample & spool V$SESSION_WAIT_HISTORY =
(that
>>>> keeps a log of last 10 waits of every session in memory)
>>>>
>>>> *** i think i've addressed most of this.
>>>
>>>
>>>> Incidentally I'm doing another hacking session tomorrow (Thursday 30
>>>> Jan @ 12pm ET) about complex Oracle performance problems that span mul=
tiple
>>>> layers and have multiple moving parts. I can add some LGWR troubleshoo=
ting
>>>> magic in the end of that (here goes my plan to fit the session in 1.5h
>>>> again!). I'll post the details in my blog in a few minutes.
>>>>
>>>> *** can't wait!!
>>>
>>>> --
>>>> Tanel Poder
>>>> https://blog.tanelpoder.com/seminar
>>>>
>>>>
>>>> On Tue, Jan 28, 2020 at 2:22 PM Chris Stephens <cstephens16@gmail.com>
>>>> wrote:
>>>>
>>>>> doesn't look like any ash data for lgwr:
>>>>>
>>>>> SQL> @ashtop session_id,blocking_session,event2 1=3D1
>>>>> "TIMESTAMP'2020-01-27 15:50:48'" "TIMESTAMP'2020-01-27 15:52:40'"
>>>>>    TOTALSECONDS    AAS      %This    SESSION_ID    BLOCKING_SESSION
>>>>>              EVENT2             FIRST_SEEN              LAST_SEEN
>>>>>  DIST_SQLEXEC_SEEN
>>>>> _______________ ______ __________ _____________ ___________________
>>>>> _____________________ ______________________ ______________________
>>>>> ____________________
>>>>>              81    0.7   82% |             1713                1710
>>>>> log file sync         2020-01-27 15:50:54    2020-01-27 15:52:14
>>>>>             1
>>>>>               5      0    5% |             1221                     O=
N
>>>>> CPU                2020-01-27 15:50:59    2020-01-27 15:52:14
>>>>>         1
>>>>>               3      0    3% |              246                     O=
N
>>>>> CPU                2020-01-27 15:51:00    2020-01-27 15:52:00
>>>>>         1
>>>>>               3      0    3% |             1713                     O=
N
>>>>> CPU                2020-01-27 15:50:51    2020-01-27 15:52:27
>>>>>         2
>>>>>               1      0    1% |              123                     O=
N
>>>>> CPU                2020-01-27 15:51:35    2020-01-27 15:51:35
>>>>>         1
>>>>>               1      0    1% |              127                     O=
N
>>>>> CPU                2020-01-27 15:51:50    2020-01-27 15:51:50
>>>>>         1
>>>>>               1      0    1% |              252                2321
>>>>> latch free            2020-01-27 15:52:28    2020-01-27 15:52:28
>>>>>             1
>>>>>               1      0    1% |              978
>>>>> ges remote message    2020-01-27 15:50:48    2020-01-27 15:50:48
>>>>>             1
>>>>>               1      0    1% |              983
>>>>> latch free            2020-01-27 15:52:28    2020-01-27 15:52:28
>>>>>             1
>>>>>               1      0    1% |             1713
>>>>> library cache lock    2020-01-27 15:50:48    2020-01-27 15:50:48
>>>>>             1
>>>>>               1      0    1% |             1831                     O=
N
>>>>> CPU                2020-01-27 15:50:49    2020-01-27 15:50:49
>>>>>         1
>>>>>
>>>>>
>>>>> 11 rows selected.
>>>>>
>>>>>
>>>>> SQL> @ashtop session_id,blocking_session,event2 session_id=3D1710
>>>>> "TIMESTAMP'2020-01-27 15:50:48'" "TIMESTAMP'2020-01-27 15:52:40'"
>>>>>
>>>>> no rows selected
>>>>>
>>>>> On Tue, Jan 28, 2020 at 1:09 PM Noveljic Nenad <
>>>>> nenad.noveljic@vontobel.com> wrote:
>>>>>
>>>>>> You=E2=80=99re using v0.2, but  v0.6 should be the latest version. 0=
.2
>>>>>> wouldn=E2=80=99t show idle blockers.
>>>>>>
>>>>>> I looked at the ashtop output once again. Strange that lgwr processe=
s
>>>>>> nor any other bg processes show up significantly there. The question=
 is,
>>>>>> what was lgwr doing while the fg processes were waiting on him.
>>>>>>
>>>>>>

--0000000000008e8d78059d8fb69f
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr">But you=
r trace file here already indicates that there&#39;s some glitch regarding =
LGWR wakeups/posting:<div><br></div><div><b>Foreground:</b></div><div><br><=
/div><div><div>*** 2020-01-29T<b><font color=3D"#0000ff">13:04:20</font></b=
>.724666-06:00<br>WAIT #0: nam=3D&#39;log file sync&#39; ela=3D 6909711 buf=
fer#=3D58996 sync scn=3D976089279 p3=3D0 obj#=3D-1 tim=3D1025228531958<br><=
/div><div><br></div><div><b>LGWR at the same time:</b></div><div><br></div>=
<div>*** 2020-01-29T<b><font color=3D"#0000ff">13:04:20</font></b>.725994-0=
6:00<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 2918443 timeout=
=3D300 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228533313</div></div><div><br></di=
v><div>The WAIT lines are printed out when the wait ends (and the *** times=
tamp line just before the WAIT line is printed out, if enough time has pass=
ed since the last *** line). So this shows that LGWR had happily been idle,=
 waiting for its idle 3-second timeout to expire, or someone to wake it up =
(that&#39;s what &quot;rdbms ipc message&quot; means) - while someone had i=
ssued a commit and was waiting for LGWR acknowledgment. So this shows that =
LGWR didn&#39;t even get the post from your session, nor any other sessions=
 (if you had concurrent transactional activity). Back in 11g, typically=C2=
=A0these problems happened when adaptive log file syncing switch between po=
ll and post/wait wakeup method happened and you hit a bug. But in 12c there=
 are more moving parts.</div><div><br></div><div>--</div><div>Tanel Poder</=
div><div><a href=3D"https://blog.tanelpoder.com/seminar">https://blog.tanel=
poder.com/seminar</a></div></div></div></div><br><div class=3D"gmail_quote"=
><div dir=3D"ltr" class=3D"gmail_attr">On Sat, Feb 1, 2020 at 10:34 PM Tane=
l Poder &lt;<a href=3D"mailto:tanel@tanelpoder.com">tanel@tanelpoder.com</a=
>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0px=
 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-co=
lor:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div dir=3D"ltr"><d=
iv dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr">Hi Chris,<div><br></div><d=
iv>As 12c+ has log writer slaves and Oracle can adaptavely switch=C2=A0from=
 single LGWR to LGWR=C2=A0+ LGxx slave (_use_single_log_writer=3Dadaptive),=
 can you check ASH data for the LGxx slaves at this time too. Not all proce=
ss-to-process interactions/waits are instrumented with a blocking_session i=
n Oracle...</div><div><br></div><div>Should be this command:</div><div><br>=
</div><div><div><font face=3D"monospace">@ashtop username,event2,program2,b=
locking_session,blocking_inst_id &quot;program like &#39;%(LG%)&#39;&quot; =
&quot;TIMESTAMP&#39;2020-01-29 13:04:00&#39;&quot; &quot;TIMESTAMP&#39;2020=
-01-29 13:05:00&#39;&quot;</font></div></div><div><br></div><div>--<br></di=
v><div>Tanel Poder</div><div><a href=3D"https://blog.tanelpoder.com/seminar=
" target=3D"_blank">https://blog.tanelpoder.com/seminar</a></div><div><br><=
/div></div></div></div></div><br><div class=3D"gmail_quote"><div dir=3D"ltr=
" class=3D"gmail_attr">On Thu, Jan 30, 2020 at 3:44 PM Chris Stephens &lt;<=
a href=3D"mailto:cstephens16@gmail.com" target=3D"_blank">cstephens16@gmail=
.com</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"mar=
gin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-=
left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><font face=
=3D"monospace">Here&#39;s some additional ash output from the trace data po=
sted in last message including event_hist.sql:</font><div><font face=3D"mon=
ospace"><br></font></div><div><font face=3D"monospace">SQL&gt; @ashtop user=
name,event2,program2,blocking_session,blocking_inst_id 1=3D1 &quot;TIMESTAM=
P&#39;2020-01-29 13:04:20&#39;&quot; &quot;TIMESTAMP&#39;2020-01-29 13:04:2=
7&#39;&quot;<br>=C2=A0 =C2=A0TOTALSECONDS =C2=A0 =C2=A0AAS =C2=A0 =C2=A0 =
=C2=A0%This =C2=A0 =C2=A0 =C2=A0 =C2=A0 USERNAME =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 EVENT2 =C2=A0 =C2=A0 =C2=A0PROGRAM2 =C2=A0 =C2=A0BLOCKING_SESSIO=
N =C2=A0 =C2=A0BLOCKING_INST_ID =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 F=
IRST_SEEN =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0LAST_SEEN =C2=A0 =
=C2=A0DIST_SQLEXEC_SEEN<br>_______________ ______ __________ ______________=
__ ________________ _____________ ___________________ ___________________ _=
_____________________ ______________________ ____________________<br>=C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 2 =C2=A0 =C2=A00.3 =C2=A0 50% | =
=C2=A0 =C2=A0SYS =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0ON CPU =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (sqlplus) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 2020-01-29 13:04:24 =C2=A0=
 =C2=A02020-01-29 13:04:25 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 1 =C2=A0 =C2=A00.1 =C2=A0 25% | =C2=A0 =C2=A0GEN3_MGOWER_3 =C2=A0 =
=C2=A0ON CPU =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 (python) =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02020-0=
1-29 13:04:22 =C2=A0 =C2=A02020-01-29 13:04:22 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 1 =C2=A0 =C2=A00.1 =C2=A0 25% | =C2=A0 =C2=A0GE=
N3_MGOWER_3 =C2=A0 =C2=A0log file sync =C2=A0 =C2=A0(python) =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1710 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 2 2020-01-29 13:04:20 =C2=
=A0 =C2=A02020-01-29 13:04:20 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br><br><br>SQL&gt; @bg lgwr<br>=C2=A0 =C2=
=A0NAME =C2=A0 =C2=A0DESCRIPTION =C2=A0 =C2=A0 SID =C2=A0 =C2=A0OPID =C2=A0=
 =C2=A0 SPID =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 PADDR =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 SADDR<br>_______ ______________ _=
______ _______ ________ ___________________ ___________________<br>LGWR =C2=
=A0 =C2=A0Redo etc. =C2=A0 =C2=A0 =C2=A0 =C2=A0 1710 =C2=A0 =C2=A0 =C2=A034=
 26552 =C2=A0 =C2=A000000001E8718860 =C2=A0 =C2=A000000001D8BFF4A0<br></fon=
t></div><div><font face=3D"monospace"><br></font></div><div><font face=3D"m=
onospace">SQL&gt; @ashtop username,event2,program2,blocking_session,blockin=
g_inst_id session_id=3D1710 &quot;TIMESTAMP&#39;2020-01-29 13:04:20&#39;&qu=
ot; &quot;TIMESTAMP&#39;2020-01-29 13:04:27&#39;&quot;<br><br>no rows selec=
ted<br><br>SQL&gt; @ashtop username,event2,program2,blocking_session,blocki=
ng_inst_id session_id=3D1710 &quot;TIMESTAMP&#39;2020-01-29 13:04:00&#39;&q=
uot; &quot;TIMESTAMP&#39;2020-01-29 13:05:00&#39;&quot;<br><br>no rows sele=
cted<br></font></div><div><font face=3D"monospace"><br></font></div><div><f=
ont face=3D"monospace">SQL&gt; @ash/event_hist &quot;log file&quot; 1=3D1 =
=C2=A0&quot;TIMESTAMP&#39;2020-01-29 13:04:00&#39;&quot; &quot;TIMESTAMP&#3=
9;2020-01-29 13:05:00&#39;&quot;<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0EVH_EVENT =
=C2=A0 =C2=A0 =C2=A0 EVH_MILLISEC =C2=A0 =C2=A0EVH_SAMPLE_COUNT =C2=A0 =C2=
=A0EVH_EST_EVENT_COUNT =C2=A0 =C2=A0EVH_EST_TOTAL_SEC =C2=A0 =C2=A0PCT_EVT_=
TIME =C2=A0 =C2=A0 =C2=A0 EVH_GRAPH =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 FIRST_SEEN =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0LAST_SEEN<br=
>________________ __________________ ___________________ __________________=
____ ____________________ _______________ _______________ _________________=
_____ ______________________<br>log file sync =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 &lt; 1024 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A01.8 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A01.8 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A012.8 |# =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A02020-01-29 13:04:13 =C2=A0 =C2=A020=
20-01-29 13:04:13<br>log file sync =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 &lt; 4096 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A04.1 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A029.1 |### =C2=A0 =C2=
=A0 =C2=A0 | =C2=A0 =C2=A02020-01-29 13:04:12 =C2=A0 =C2=A02020-01-29 13:04=
:12<br>log file sync =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 &lt; 8192 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A08.2 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A058.2 |###### =C2=A0 =C2=A0| =C2=A0=
 =C2=A02020-01-29 13:04:20 =C2=A0 =C2=A02020-01-29 13:04:20<br></font></div=
><div><font face=3D"monospace"><br></font></div><div><font face=3D"monospac=
e">SQL&gt; @ash/event_hist &quot;write&quot; 1=3D1 =C2=A0&quot;TIMESTAMP&#3=
9;2020-01-29 13:04:00&#39;&quot; &quot;TIMESTAMP&#39;2020-01-29 13:05:00&#3=
9;&quot;<br><br>no rows selected<br><br>SQL&gt; @ash/event_hist &quot;*&quo=
t; 1=3D1 =C2=A0&quot;TIMESTAMP&#39;2020-01-29 13:04:00&#39;&quot; &quot;TIM=
ESTAMP&#39;2020-01-29 13:05:00&#39;&quot;<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 EV=
H_EVENT =C2=A0 =C2=A0 =C2=A0 EVH_MILLISEC =C2=A0 =C2=A0EVH_SAMPLE_COUNT =C2=
=A0 =C2=A0EVH_EST_EVENT_COUNT =C2=A0 =C2=A0EVH_EST_TOTAL_SEC =C2=A0 =C2=A0P=
CT_EVT_TIME =C2=A0 =C2=A0 =C2=A0 EVH_GRAPH =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 FIRST_SEEN =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0LAST_=
SEEN<br>_________________ __________________ ___________________ __________=
____________ ____________________ _______________ _______________ _________=
_____________ ______________________<br>log file sync =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0&lt; 1024 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.8 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A01.8 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A012.=
8 |# =C2=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A02020-01-29 13:04:13 =C2=A0 =
=C2=A02020-01-29 13:04:13<br>log file sync =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0&lt; 4096 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A04.1 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A029.1 |### =
=C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A02020-01-29 13:04:12 =C2=A0 =C2=A02020-0=
1-29 13:04:12<br>log file sync =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0&lt; 8192 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A08.2 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A058.2 |###### =C2=A0 =
=C2=A0| =C2=A0 =C2=A02020-01-29 13:04:20 =C2=A0 =C2=A02020-01-29 13:04:20<b=
r>row cache lock =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0&lt=
; 1 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A01 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 2469.1 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02.5 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 100 |##########| =C2=A0 =C2=A02020-01-29 13=
:04:01 =C2=A0 =C2=A02020-01-29 13:04:01<br></font></div><div><font face=3D"=
monospace"><br></font></div><div><font face=3D"monospace">SQL&gt; @ashtop u=
sername,event2,program2,blocking_session,blocking_inst_id &quot;lower(progr=
am) like &#39;%lg%&#39;&quot; &quot;TIMESTAMP&#39;2020-01-29 13:04:00&#39;&=
quot; &quot;TIMESTAMP&#39;2020-01-29 13:05:00&#39;&quot;<br><br>no rows sel=
ected<br></font></div><div><font face=3D"monospace"><br></font></div><div><=
font face=3D"monospace">i&#39;m going to copy over gv$active_session_histor=
y to a single instance database and take a look at ash_wait_chains but i do=
n&#39;t expect that to reveal anything either.</font></div><div><font face=
=3D"monospace"><br></font></div><div><font face=3D"monospace">i need to con=
firm whether adding redo resolved issue (will have developer fire up worklo=
ad) but I think next step is to investigate x$trace and possible process co=
mmunication issues?</font></div></div><br><div class=3D"gmail_quote"><div d=
ir=3D"ltr" class=3D"gmail_attr">On Thu, Jan 30, 2020 at 10:27 AM Chris Step=
hens &lt;<a href=3D"mailto:cstephens16@gmail.com" target=3D"_blank">cstephe=
ns16@gmail.com</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" st=
yle=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:sol=
id;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><d=
iv dir=3D"ltr">Hey=C2=A0Tanel,<div><br></div><div>Thanks for taking time to=
 write this up! Will definitely=C2=A0be attending that webinar.</div><div><=
br></div><div>comments inline</div></div><br><div class=3D"gmail_quote"><di=
v dir=3D"ltr" class=3D"gmail_attr">On Wed, Jan 29, 2020 at 10:32 PM Tanel P=
oder &lt;<a href=3D"mailto:tanel@tanelpoder.com" target=3D"_blank">tanel@ta=
nelpoder.com</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" styl=
e=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid=
;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div=
 dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D=
"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div>Comments:</div><div><br></div>=
<div>1) First, looks like you&#39;re using all the right tools for systemat=
ic troubleshooting ;-)</div></div></div></div></div></div></div></div></div=
></blockquote><div><br></div><div>***i try!=C2=A0</div><blockquote class=3D=
"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;borde=
r-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><di=
v dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=
=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div><br></div>=
<div>2) As &quot;log file sync&quot; is not directly an I/O wait event (IO =
happens indirectly under the hood), but &quot;wait for LGWR to ACK/incremen=
t redo persisting position&quot; then there may be multiple reasons for del=
ays. I&#39;m repeating some of what others have said here too:</div><div>=
=C2=A0 - LGWR could be waiting for remote ACK (synchronous data guard) - th=
at should show up as a LNS* wait or something like that though</div><div>=
=C2=A0 - Process slowness due to CPU overload/scheduling latency=C2=A0or sw=
apping (in both cases, LGWR may do its work fast, but it takes a while for =
the foreground process to wake up and end the wait event - you could check =
if there are any reports regarding switching LGWR mode in alert.log around =
that time)</div><div>=C2=A0 - Bug/missed post due to switching between LGWR=
 progress polling vs. waiting for LGWR post</div><div>=C2=A0 - ASM instance=
 communication issues (ASM communication can be on the critical path when c=
reating/changing files on ASM and even for reading, when the ASM metadata c=
ache in your local instance doesn&#39;t have what it needs)</div><div>=C2=
=A0 - So, the physical redo log file disk write is not necessarily the bigg=
est contributor to the log file sync wait</div><div><br></div></div></div><=
/div></div></div></div></div></div></blockquote><div>*** no data guard in t=
he env.</div><div>*** we have excess CPU. (20 cores / instance and usually =
only 1-5 active foreground processes but i will confirm from TFA collection=
s)</div><div>*** yesterday i was ready to blame bug. today, not so sure. i =
checked log file switches with a script google found me a few days ago. now=
 thinking that was inaccurate as a script from metastink shows tones of swi=
tching so that could very well be related.</div><div>***asm communication i=
s something i need to look into. TFA has reported some ASM issues but there=
&#39;s no details and i never saw anything in asm alert logs indicating an =
issue.</div><div>=C2=A0</div><blockquote class=3D"gmail_quote" style=3D"mar=
gin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-=
left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div dir=3D"=
ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><d=
iv dir=3D"ltr"><div dir=3D"ltr"><div></div><div>3) You&#39;ll need to measu=
re what LGWR itself was doing during the problem time</div><div>=C2=A0 - Si=
nce you&#39;re on 19c (12c or later) the LGWR has slaves too and they may b=
e doing most of the work=C2=A0</div><div>=C2=A0 - You&#39;ll need to look i=
nto both the LGWR process and any other LG* ones too</div><div>=C2=A0 - Sin=
ce you have (Oracle) Snapper running already and writing to trace you shoul=
d have the data already</div><div dir=3D"ltr">=C2=A0 - The <b>lgwr</b>=C2=
=A0in the following <font face=3D"monospace">@snapper ash,stats,trace 10 99=
9999 lgwr</font> syntax includes LGWR slaves too<br></div><div dir=3D"ltr">=
=C2=A0 - As LGWR slaves have their own sessions (and SIDs), you should run =
ashtop etc and report if there&#39;s any=C2=A0BLOCKING_SESSION value listed=
 for LGWR and slaves<br></div><div>=C2=A0 - If LGWR and the slaves were lis=
ted <i>idle</i> during the problem time (while others waited for log file s=
ync event at the same time) <i>and if=C2=A0</i>there was no huge OS level C=
PU/memory shortage at the time, this looks like a background &lt;-&gt; fore=
ground communication problem.=C2=A0</div><div>=C2=A0 =C2=A0 =C2=A0- This (a=
nd any potential ASM issues) could probably be troubleshooted via KST traci=
ng, KSR channels and X$TRACE that could help here (some of it is enabled by=
 default ( Like I showed in a previous hacking session: <a href=3D"https://=
www.youtube.com/watch?v=3DmkmvZv58W6w" target=3D"_blank">https://www.youtub=
e.com/watch?v=3DmkmvZv58W6w</a> )</div><div><br></div></div></div></div></d=
iv></div></div></div></div></blockquote><div><br></div><div>here is all the=
 output i have corresponding to a 6 sec &quot;log file sync&quot; wait:</di=
v><div><br></div><div><table style=3D"vertical-align:baseline;font-size:12p=
x;line-height:13.5pt;font-family:Lato,verdana,&quot;bitstream vera sans&quo=
t;,&quot;dejavu sans&quot;,&quot;liberation sans&quot;,geneva,sans-serif;co=
lor:rgb(0,0,0);margin:2ex 0px 2ex 40px;border-collapse:collapse;border:0px"=
><tbody style=3D"vertical-align:baseline"><tr style=3D"vertical-align:basel=
ine"><td style=3D"vertical-align:baseline;padding:0.5ex 0px 0.5ex 1em;text-=
align:right;font-size:10.2px;border:0px;white-space:nowrap">Experience dura=
tion:</td><td style=3D"vertical-align:baseline;padding:0.5ex 1em;width:100e=
m"><em style=3D"vertical-align:baseline">R</em>=C2=A0=3D=C2=A0<em style=3D"=
vertical-align:baseline">t</em><span style=3D"vertical-align:0px;font-size:=
7.2px">1</span>=C2=A0=E2=88=92=C2=A0<em style=3D"vertical-align:baseline">t=
</em><span style=3D"vertical-align:0px;font-size:7.2px">0</span>=C2=A0=3D=
=C2=A0<span title=3D"This report is relevant only if the end-user experienc=
e you are analyzing has consumed the duration shown here. If this duration =
does not correspond to your actual end-user performance experience, then yo=
u should repair your trace data or collect trace data more carefully for an=
other experience. Time your data collection activation and deactivation to =
match precisely the begin and end times of the experience you are analyzing=
." style=3D"vertical-align:baseline;color:tomato;font-weight:bold"><a href=
=3D"#m_-8103031405908768729_m_7542945560652427325_m_-5509691095923674063_pr=
ofile-by-subroutine" style=3D"vertical-align:baseline;text-decoration-line:=
none;color:tomato">41.336137=C2=A0seconds</a></span></td></tr><tr style=3D"=
vertical-align:baseline"><td style=3D"vertical-align:baseline;padding:0.5ex=
 0px 0.5ex 1em;text-align:right;font-size:10.2px;border:0px;white-space:now=
rap">Experience begin time:</td><td style=3D"vertical-align:baseline;paddin=
g:0.5ex 1em;width:100em"><em style=3D"vertical-align:baseline">t</em><span =
style=3D"vertical-align:0px;font-size:7.2px">0</span>=C2=A0=3D 2020-01-29T1=
3:04:07.560478</td></tr><tr style=3D"vertical-align:baseline"><td style=3D"=
vertical-align:baseline;padding:0.5ex 0px 0.5ex 1em;text-align:right;font-s=
ize:10.2px;border:0px;white-space:nowrap">Experience end time:</td><td styl=
e=3D"vertical-align:baseline;padding:0.5ex 1em;width:100em"><em style=3D"ve=
rtical-align:baseline">t</em><span style=3D"vertical-align:0px;font-size:7.=
2px">1</span>=C2=A0=3D 2020-01-29T13:04:48.896615</td></tr></tbody></table>=
</div>CALL-NAME =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0DURATION =C2=A0 =C2=A0 =C2=A0 % =C2=A0 CALLS =C2=A0=
 =C2=A0 =C2=A0MEAN =C2=A0 =C2=A0 =C2=A0 MIN =C2=A0 =C2=A0 =C2=A0 MAX<br>---=
--------------------------- =C2=A0--------- =C2=A0------ =C2=A0------ =C2=
=A0-------- =C2=A0-------- =C2=A0--------<br>SQL*Net message from client =
=C2=A0 =C2=A0 26.017105 =C2=A0 66.5% =C2=A0 9,886 =C2=A00.002632 =C2=A00.00=
0209 =C2=A07.858051<br>log file sync =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 10.582409 =C2=A0 27.1% =C2=A0 =C2=A0 150 =C2=A00.0=
70549 =C2=A00.000295 =C2=A06.909711<br>EXEC =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 2.088757=
 =C2=A0 =C2=A05.3% =C2=A0 9,702 =C2=A00.000215 =C2=A00.000000 =C2=A00.20919=
9<br>DLM cross inst call completion =C2=A0 0.109617 =C2=A0 =C2=A00.3% =C2=
=A0 =C2=A0 212 =C2=A00.000517 =C2=A00.000005 =C2=A00.080575<br>row cache lo=
ck =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 0.070118 =
=C2=A0 =C2=A00.2% =C2=A0 =C2=A0 125 =C2=A00.000561 =C2=A00.000208 =C2=A00.0=
00972<br>FETCH =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A00.047586 =C2=A0 =C2=A00.1% =C2=A0 1,0=
29 =C2=A00.000046 =C2=A00.000000 =C2=A00.003056<br>PARSE =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A00.045293 =C2=A0 =C2=A00.1% =C2=A0 4,778 =C2=A00.000009 =C2=A00.000000=
 =C2=A00.002692<br>CLOSE =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A00.033453 =C2=A0 =C2=A00.1% =
=C2=A0 4,784 =C2=A00.000007 =C2=A00.000000 =C2=A00.000237<br>library cache =
lock =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 0.023614 =C2=A0 =C2=
=A00.1% =C2=A0 =C2=A0 =C2=A055 =C2=A00.000429 =C2=A00.000152 =C2=A00.000901=
<br>library cache pin =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A00.022311 =C2=A0 =C2=A00.1% =C2=A0 =C2=A0 =C2=A055 =C2=A00.000406 =C2=A00=
.000153 =C2=A00.000689<br>17 others =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A00.071430 =C2=A0 =C2=A00.2% =C2=
=A010,772 =C2=A00.000007 =C2=A00.000000 =C2=A00.001487<br>-----------------=
------------- =C2=A0--------- =C2=A0------ =C2=A0------ =C2=A0-------- =C2=
=A0-------- =C2=A0--------<br><div>TOTAL (27) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A039.111693 =C2=A0100.0% =C2=
=A041,548 =C2=A00.000941 =C2=A00.000000 =C2=A07.858051</div><div><br></div>=
<div>6 sec wait from raw data:</div><div><br></div><div>*** 2020-01-29T13:0=
4:20.724666-06:00<br>WAIT #0: nam=3D&#39;log file sync&#39; ela=3D 6909711 =
buffer#=3D58996 sync scn=3D976089279 p3=3D0 obj#=3D-1 tim=3D1025228531958<b=
r>WAIT #0: nam=3D&#39;SQL*Net message to client&#39; ela=3D 2 driver id=3D1=
413697536 #bytes=3D1 p3=3D0 obj#=3D-1 tim=3D1025228532118<br>WAIT #0: nam=
=3D&#39;SQL*Net message from client&#39; ela=3D 1711 driver id=3D1413697536=
 #bytes=3D1 p3=3D0 obj#=3D-1 tim=3D1025228533852<br>BINDS #140204587734864:=
<br></div><div><br></div><div>i enabled 10046 on lgwr:</div><div><br></div>=
<div>*** 2020-01-29T13:04:20.725994-06:00<br>WAIT #0: nam=3D&#39;rdbms ipc =
message&#39; ela=3D 2918443 timeout=3D300 p2=3D0 p3=3D0 obj#=3D-1 tim=3D102=
5228533313<br>WAIT #0: nam=3D&#39;log file parallel write&#39; ela=3D 279 f=
iles=3D1 blocks=3D17 requests=3D2 obj#=3D-1 tim=3D1025228533777<br>WAIT #0:=
 nam=3D&#39;log file parallel write&#39; ela=3D 171 files=3D1 blocks=3D3 re=
quests=3D2 obj#=3D-1 tim=3D1025228534104<br>WAIT #0: nam=3D&#39;LGWR wait f=
or redo copy&#39; ela=3D 22 copy latch #=3D28 p2=3D0 p3=3D0 obj#=3D-1 tim=
=3D1025228534182<br>WAIT #0: nam=3D&#39;log file parallel write&#39; ela=3D=
 183 files=3D1 blocks=3D2 requests=3D2 obj#=3D-1 tim=3D1025228534437<br>WAI=
T #0: nam=3D&#39;log file parallel write&#39; ela=3D 174 files=3D1 blocks=
=3D2 requests=3D2 obj#=3D-1 tim=3D1025228534721<br>WAIT #0: nam=3D&#39;rdbm=
s ipc message&#39; ela=3D 31 timeout=3D8 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025=
228534786<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 3233 timeout=
=3D8 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228538038<br>WAIT #0: nam=3D&#39;log=
 file parallel write&#39; ela=3D 170 files=3D1 blocks=3D2 requests=3D1 obj#=
=3D-1 tim=3D1025228538297<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=
=3D 25 timeout=3D8 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228538358<br>WAIT #0: =
nam=3D&#39;rdbms ipc message&#39; ela=3D 11829 timeout=3D8 p2=3D0 p3=3D0 ob=
j#=3D-1 tim=3D1025228550204<br>WAIT #0: nam=3D&#39;LGWR wait for redo copy&=
#39; ela=3D 52 copy latch #=3D8 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228550294=
<br>WAIT #0: nam=3D&#39;log file parallel write&#39; ela=3D 239 files=3D1 b=
locks=3D7 requests=3D1 obj#=3D-1 tim=3D1025228550624<br>WAIT #0: nam=3D&#39=
;rdbms ipc message&#39; ela=3D 23 timeout=3D6 p2=3D0 p3=3D0 obj#=3D-1 tim=
=3D1025228550699<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 16750=
 timeout=3D6 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228567467<br>WAIT #0: nam=3D=
&#39;log file parallel write&#39; ela=3D 207 files=3D1 blocks=3D9 requests=
=3D1 obj#=3D-1 tim=3D1025228567762<br>WAIT #0: nam=3D&#39;rdbms ipc message=
&#39; ela=3D 26 timeout=3D5 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228567825<br>=
WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 16249 timeout=3D5 p2=3D0 =
p3=3D0 obj#=3D-1 tim=3D1025228584091<br>WAIT #0: nam=3D&#39;log file parall=
el write&#39; ela=3D 213 files=3D1 blocks=3D9 requests=3D1 obj#=3D-1 tim=3D=
1025228584400<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 48 timeo=
ut=3D3 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228584502<br>WAIT #0: nam=3D&#39;r=
dbms ipc message&#39; ela=3D 16466 timeout=3D3 p2=3D0 p3=3D0 obj#=3D-1 tim=
=3D1025228600993<br>WAIT #0: nam=3D&#39;log file parallel write&#39; ela=3D=
 228 files=3D1 blocks=3D9 requests=3D1 obj#=3D-1 tim=3D1025228601393<br>WAI=
T #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 29 timeout=3D1 p2=3D0 p3=3D0=
 obj#=3D-1 tim=3D1025228601482<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39=
; ela=3D 9548 timeout=3D1 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228611053<br>WA=
IT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 7467 timeout=3D300 p2=3D0 p=
3=3D0 obj#=3D-1 tim=3D1025228618574<br>WAIT #0: nam=3D&#39;LGWR wait for re=
do copy&#39; ela=3D 13 copy latch #=3D8 p2=3D0 p3=3D0 obj#=3D-1 tim=3D10252=
28618636<br>WAIT #0: nam=3D&#39;log file parallel write&#39; ela=3D 222 fil=
es=3D1 blocks=3D9 requests=3D1 obj#=3D-1 tim=3D1025228618960<br>WAIT #0: na=
m=3D&#39;rdbms ipc message&#39; ela=3D 31 timeout=3D300 p2=3D0 p3=3D0 obj#=
=3D-1 tim=3D1025228619066<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=
=3D 15946 timeout=3D300 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228635045<br>WAIT=
 #0: nam=3D&#39;log file parallel write&#39; ela=3D 240 files=3D1 blocks=3D=
9 requests=3D1 obj#=3D-1 tim=3D1025228635427<br>WAIT #0: nam=3D&#39;rdbms i=
pc message&#39; ela=3D 33 timeout=3D298 p2=3D0 p3=3D0 obj#=3D-1 tim=3D10252=
28635525<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 15987 timeout=
=3D298 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228651537<br>WAIT #0: nam=3D&#39;l=
og file parallel write&#39; ela=3D 255 files=3D1 blocks=3D9 requests=3D1 ob=
j#=3D-1 tim=3D1025228651912<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; e=
la=3D 51 timeout=3D296 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228652034<br>WAIT =
#0: nam=3D&#39;rdbms ipc message&#39; ela=3D 16332 timeout=3D296 p2=3D0 p3=
=3D0 obj#=3D-1 tim=3D1025228668405<br>WAIT #0: nam=3D&#39;log file parallel=
 write&#39; ela=3D 256 files=3D1 blocks=3D9 requests=3D1 obj#=3D-1 tim=3D10=
25228668804<br>WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 37 timeout=
=3D295 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025228668911<br>WAIT #0: nam=3D&#39;r=
dbms ipc message&#39; ela=3D 16969 timeout=3D295 p2=3D0 p3=3D0 obj#=3D-1 ti=
m=3D1025228685907<br>WAIT #0: nam=3D&#39;log file parallel write&#39; ela=
=3D 273 files=3D1 blocks=3D9 requests=3D1 obj#=3D-1 tim=3D1025228686327<br>=
WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 38 timeout=3D293 p2=3D0 p=
3=3D0 obj#=3D-1 tim=3D1025228686453<br>WAIT #0: nam=3D&#39;rdbms ipc messag=
e&#39; ela=3D 6908 timeout=3D293 p2=3D0 p3=3D0 obj#=3D-1 tim=3D102522869339=
0<br>WAIT #0: nam=3D&#39;log file parallel write&#39; ela=3D 247 files=3D1 =
blocks=3D2 requests=3D1 obj#=3D-1 tim=3D1025228693768<br>WAIT #0: nam=3D&#3=
9;rdbms ipc message&#39; ela=3D 38 timeout=3D292 p2=3D0 p3=3D0 obj#=3D-1 ti=
m=3D1025228693862j<br></div><div>=C2=A0...more of the same</div><div>264396=
 WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 32 timeout=3D52 p2=3D0 p=
3=3D0 obj#=3D-1 tim=3D1025231090742<br>=C2=A0264397 WAIT #0: nam=3D&#39;rdb=
ms ipc message&#39; ela=3D 19460 timeout=3D52 p2=3D0 p3=3D0 obj#=3D-1 tim=
=3D1025231110227<br>=C2=A0264398 WAIT #0: nam=3D&#39;log file parallel writ=
e&#39; ela=3D 272 files=3D1 blocks=3D9 requests=3D1 obj#=3D-1 tim=3D1025231=
110633<br>=C2=A0264399 WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 37=
 timeout=3D50 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025231110748<br>=C2=A0264400 W=
AIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 15484 timeout=3D50 p2=3D0 =
p3=3D0 obj#=3D-1 tim=3D1025231126261<br>=C2=A0264401 WAIT #0: nam=3D&#39;lo=
g file parallel write&#39; ela=3D 247 files=3D1 blocks=3D4 requests=3D1 obj=
#=3D-1 tim=3D1025231126641<br>=C2=A0264402 WAIT #0: nam=3D&#39;rdbms ipc me=
ssage&#39; ela=3D 36 timeout=3D49 p2=3D0 p3=3D0 obj#=3D-1 tim=3D10252311267=
53<br>=C2=A0264403 WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 16732 =
timeout=3D49 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025231143514<br>=C2=A0264404 WA=
IT #0: nam=3D&#39;LGWR wait for redo copy&#39; ela=3D 31 copy latch #=3D8 p=
2=3D0 p3=3D0 obj#=3D-1 tim=3D1025231143603<br>=C2=A0264405 WAIT #0: nam=3D&=
#39;log file parallel write&#39; ela=3D 279 files=3D1 blocks=3D9 requests=
=3D1 obj#=3D-1 tim=3D1025231143976<br>=C2=A0264406 WAIT #0: nam=3D&#39;rdbm=
s ipc message&#39; ela=3D 35 timeout=3D47 p2=3D0 p3=3D0 obj#=3D-1 tim=3D102=
5231144093<br>=C2=A0264407 WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=
=3D 5384 timeout=3D47 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025231149515<br>=C2=A0=
264408 WAIT #0: nam=3D&#39;log file parallel write&#39; ela=3D 255 files=3D=
1 blocks=3D4 requests=3D1 obj#=3D-1 tim=3D1025231149889<br>=C2=A0264409 WAI=
T #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 32 timeout=3D47 p2=3D0 p3=3D=
0 obj#=3D-1 tim=3D1025231149989<br>=C2=A0264410<br>=C2=A0264411 *** 2020-01=
-29T13:04:23.802774-06:00<br>=C2=A0264412 WAIT #0: nam=3D&#39;rdbms ipc mes=
sage&#39; ela=3D 460056 timeout=3D46 p2=3D0 p3=3D0 obj#=3D-1 tim=3D10252316=
10086<br>=C2=A0264413<br>=C2=A0264414 *** 2020-01-29T13:04:26.797925-06:00<=
br>=C2=A0264415 WAIT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 2995027 t=
imeout=3D300 p2=3D0 p3=3D0 obj#=3D-1 tim=3D1025234605238<br>=C2=A0264416 WA=
IT #0: nam=3D&#39;rdbms ipc message&#39; ela=3D 10649 timeout=3D1 p2=3D0 p3=
=3D0 obj#=3D-1 tim=3D1025234616054<br>=C2=A0264417<br></div><div><br></div>=
<div>i will get you snapper lgwr output associated w/ this interval as soon=
 as i can find it (can&#39;t remember where i stashed it and lost trace fil=
e name .. fml)</div><div><br></div><blockquote class=3D"gmail_quote" style=
=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;=
border-left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div =
dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"=
ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div></div><div>4) Paste the full <b=
>Snapper</b> output (of one time snapshot from the problem time) here if yo=
u can - we&#39;ll be able to see what the slaves were doing</div><div>=C2=
=A0 - You&#39;d need to include not only LGWR SID but also any LG*=C2=A0sla=
ve SIDs too</div><div>=C2=A0 - Also we&#39;ll see better LGWR delay metrics=
 from V$SESSTAT that go beyond just wait events</div><div>=C2=A0 - Thinking=
 of metrics like these - snapper already handles them (if you look into the=
 rightmost column)</div><div><div>=C2=A0 =C2=A0 =C2=A0- <b>redo synch time =
overhead (usec)</b> - &quot;FG wakeup overhead per log file sync&quot;</div=
><div>=C2=A0 =C2=A0 =C2=A0- <b>redo write time</b> - accounting end to end =
redo write time (not just the log file parallel write syscall) &quot;per re=
do write&quot;</div></div><div><br></div><div>5) Regarding the <b>ASH wait =
chains</b> error on 19c - I&#39;m aware of this but forgot to look into it<=
/div><div>=C2=A0 =C2=A0- It has (probably) something to do with GV$ views (=
or PX slaves used for PDB V$ queries) and possibly the connect by I&#39;m u=
sing</div><div>=C2=A0 =C2=A0- Does it work in the CDB (if multitenant)?</di=
v><div>=C2=A0 =C2=A0- Does it work with lower optimizer features enabled?</=
div><div>=C2=A0 =C2=A0- As we are already manually looking into what LGWR a=
nd its slaves are doing, there&#39;s no need for the wait chains script her=
e</div><div><br></div></div></div></div></div></div></div></div></div></blo=
ckquote><div>***oracle has filed a bug. altering optimizer_features_enabled=
 doesn&#39;t help.</div><div>=C2=A0</div><blockquote class=3D"gmail_quote" =
style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:s=
olid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr">=
<div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div di=
r=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div></div><div>6) Linux Proces=
s Snapper - pSnapper</div><div>=C2=A0 =C2=A0- If you want to see the full &=
quot;wait&quot; states of Linux processes/threads, use the<b> -a </b>option=
.</div><div>=C2=A0 =C2=A0- It&#39;s somewhat like always showing &quot;idle=
&quot; threads too</div><div>=C2=A0 =C2=A0- But for linux apps it&#39;s har=
der to determine what&#39;s idle and what&#39;s some network RPC wait, so I=
 currently just do what Linux &quot;load&quot; metric does</div><div>=C2=A0=
 =C2=A0- I just shows threads in Runnable and Disk/Uninterruptible sleep st=
ates by default (-a shows=C2=A0all states, including Sleep and kernel-Idle =
threads etc)</div><div>=C2=A0 =C2=A0- So the Linux pSnapper &quot;idle&quot=
; means a somewhat different thing than Oracle &quot;idle&quot; - say LGWR =
was waiting for a Data Guard ACK, that should not be shown as Idle in Oracl=
e, but at the Linux process level it&#39;s &quot;just some network socket w=
ait&quot; and you&#39;d need to use <b>-a</b>=C2=A0option to show all of th=
em</div></div></div></div></div></div></div></div></div></blockquote><div><=
br></div><div>*** will add that option to future collections (and watch the=
 video/read help)=C2=A0</div><blockquote class=3D"gmail_quote" style=3D"mar=
gin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-=
left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div dir=3D"=
ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><d=
iv dir=3D"ltr"><div dir=3D"ltr"><div><br></div><div>7) Were LGWR/LG* proces=
ses mostly idle during the problem time?</div><div>=C2=A0 =C2=A0- As I kept=
 reading the thread, it looks like ASH reported LGWR being mostly idle duri=
ng the problem time</div><div>=C2=A0 =C2=A0- This is not exactly the same p=
attern, but I&#39;ve seen cases in the past where everyone&#39;s waiting fo=
r LGWR and LGWR doesn&#39;t seem to be busy and doesn&#39;t show any signif=
icant waits</div><div>=C2=A0 =C2=A0- It turned out that LGWR was actually w=
aiting for CKPT to make progress, but it wasn&#39;t synchronously waiting, =
but just going to sleep and &quot;checking again later&quot; in a loop</div=
><div>=C2=A0 =C2=A0- I wrote about this here: <a href=3D"https://blog.tanel=
poder.com/posts/log-file-switch-checkpoint-incomplete-and-lgwr-waiting-for-=
checkpoint-progress/" target=3D"_blank">https://blog.tanelpoder.com/posts/l=
og-file-switch-checkpoint-incomplete-and-lgwr-waiting-for-checkpoint-progre=
ss/</a></div><div>=C2=A0 =C2=A0- So it may still make sense to check what L=
GWR/LG* processes are doing even if they don&#39;t seem to be active much (=
like do you see an occasional controlfile read wait by LGWR etc)</div><div>=
=C2=A0 =C2=A0- SQL*Trace would capture all activity of LGWR, but may slow t=
hings down further, so you could also sample &amp; spool V$SESSION_WAIT_HIS=
TORY (that keeps a log of last 10 waits of every session in memory)</div><d=
iv><br></div></div></div></div></div></div></div></div></div></blockquote><=
div>*** i think i&#39;ve addressed most of this.</div><div>=C2=A0</div><blo=
ckquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left=
-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);paddi=
ng-left:1ex"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D=
"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><=
div></div><div>Incidentally I&#39;m doing another hacking session tomorrow =
(Thursday 30 Jan @ 12pm ET) about complex Oracle performance problems that =
span multiple layers and have multiple moving parts. I can add some LGWR tr=
oubleshooting magic in the end of that (here goes my plan to fit the sessio=
n in 1.5h again!). I&#39;ll post the details in my blog in a few minutes.</=
div><div><br></div></div></div></div></div></div></div></div></div></blockq=
uote><div>*** can&#39;t wait!!=C2=A0</div><blockquote class=3D"gmail_quote"=
 style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:=
solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"=
><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div d=
ir=3D"ltr"><div dir=3D"ltr"><div dir=3D"ltr"><div></div><div>--</div><div>T=
anel Poder</div><div><a href=3D"https://blog.tanelpoder.com/seminar" target=
=3D"_blank">https://blog.tanelpoder.com/seminar</a></div><div><br></div></d=
iv></div></div></div></div></div><br><div class=3D"gmail_quote"><div dir=3D=
"ltr" class=3D"gmail_attr">On Tue, Jan 28, 2020 at 2:22 PM Chris Stephens &=
lt;<a href=3D"mailto:cstephens16@gmail.com" target=3D"_blank">cstephens16@g=
mail.com</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" style=3D=
"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;bor=
der-left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div><fo=
nt face=3D"monospace">doesn&#39;t look like any ash data for lgwr:</font></=
div><div><font face=3D"monospace"><br></font></div><font face=3D"monospace"=
>SQL&gt; @ashtop session_id,blocking_session,event2 1=3D1 &quot;TIMESTAMP&#=
39;2020-01-27 15:50:48&#39;&quot; &quot;TIMESTAMP&#39;2020-01-27 15:52:40&#=
39;&quot;<br>=C2=A0 =C2=A0TOTALSECONDS =C2=A0 =C2=A0AAS =C2=A0 =C2=A0 =C2=
=A0%This =C2=A0 =C2=A0SESSION_ID =C2=A0 =C2=A0BLOCKING_SESSION =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0EVENT2 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 FIRST_SEEN =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0LAST_SEEN =C2=A0 =C2=A0DIST_SQLEXEC_SEEN<br>_______________ ______ __=
________ _____________ ___________________ _____________________ __________=
____________ ______________________ ____________________<br>=C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A081 =C2=A0 =C2=A00.7 =C2=A0 82% | =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1713 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A01710 log file sync =C2=A0 =C2=A0 =C2=A0 =C2=A0 2020-01-=
27 15:50:54 =C2=A0 =C2=A02020-01-27 15:52:14 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 5 =C2=A0 =C2=A0 =C2=A00 =C2=A0 =C2=A05% | =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1221 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ON CPU =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A02020-01-27 15:50:59 =C2=A0 =C2=A02020-01-27 15:5=
2:14 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 3 =C2=A0 =C2=
=A0 =C2=A00 =C2=A0 =C2=A03% | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0246 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 ON CPU =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02020-01-2=
7 15:51:00 =C2=A0 =C2=A02020-01-27 15:52:00 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 3 =C2=A0 =C2=A0 =C2=A00 =C2=A0 =C2=A03% | =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1713 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ON CPU =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A02020-01-27 15:50:51 =C2=A0 =C2=A02020-01-27 15:52:2=
7 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 2<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1 =C2=A0 =C2=A0 =
=C2=A00 =C2=A0 =C2=A01% | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01=
23 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ON=
 CPU =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02020-01-27 15:5=
1:35 =C2=A0 =C2=A02020-01-27 15:51:35 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 1 =C2=A0 =C2=A0 =C2=A00 =C2=A0 =C2=A01% | =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0127 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ON CPU =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A02020-01-27 15:51:50 =C2=A0 =C2=A02020-01-27 15:5=
1:50 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1 =C2=A0 =C2=
=A0 =C2=A00 =C2=A0 =C2=A01% | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0252 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02321 latch fr=
ee =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02020-01-27 15:52:28 =C2=A0 =C2=
=A02020-01-27 15:52:28 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 1 =C2=A0 =C2=A0 =C2=A00 =C2=A0 =C2=A01% | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0978 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 ges remote message =C2=A0 =C2=A02020-01-27 15:50:48 =
=C2=A0 =C2=A02020-01-27 15:50:48 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 1 =C2=A0 =C2=A0 =C2=A00 =C2=A0 =C2=A01% | =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0983 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 latch free =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A02020-01-27 15:52:28 =C2=A0 =C2=A02020-01-27 15:52:28 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1 =C2=A0 =C2=A0 =C2=A00 =C2=
=A0 =C2=A01% | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1713 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 library cache lock=
 =C2=A0 =C2=A02020-01-27 15:50:48 =C2=A0 =C2=A02020-01-27 15:50:48 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br>=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1 =C2=A0 =C2=A0 =C2=A00 =
=C2=A0 =C2=A01% | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1831 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ON CPU =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02020-01-27 15:50:49 =C2=A0 =
=C2=A02020-01-27 15:50:49 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 1<br><br><br>11 rows selected.<br><br><br>SQL&g=
t; @ashtop session_id,blocking_session,event2 session_id=3D1710 &quot;TIMES=
TAMP&#39;2020-01-27 15:50:48&#39;&quot; &quot;TIMESTAMP&#39;2020-01-27 15:5=
2:40&#39;&quot;<br><br>no rows selected</font><br></div><br><div class=3D"g=
mail_quote"><div dir=3D"ltr" class=3D"gmail_attr">On Tue, Jan 28, 2020 at 1=
:09 PM Noveljic Nenad &lt;<a href=3D"mailto:nenad.noveljic@vontobel.com" ta=
rget=3D"_blank">nenad.noveljic@vontobel.com</a>&gt; wrote:<br></div><blockq=
uote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-wi=
dth:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-=
left:1ex">




<div>
You=E2=80=99re using v0.2, but =C2=A0v0.6 should be the latest version. 0.2=
 wouldn=E2=80=99t show idle blockers.
<div><br>
</div>
<div>I looked at the ashtop output once again. Strange that lgwr processes =
nor any other bg processes show up significantly there. The question is, wh=
at was lgwr doing while the fg processes were waiting on him.<br>
<br></div>

</div>

</blockquote></div>
</blockquote></div></div></div>
</blockquote></div></div>
</blockquote></div>
</blockquote></div></div>
</blockquote></div></div>

--0000000000008e8d78059d8fb69f--
--
http://www.freelists.org/webpage/oracle-l


