Re: intermittent long "log file sync" waits
From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Sun, 2 Feb 2020 09:39:52 +0000
Message-ID: <LNXP265MB1562810EC75BB7A638891C97A5010_at_LNXP265MB1562.GBRP265.PROD.OUTLOOK.COM>
Date: Sun, 2 Feb 2020 09:39:52 +0000
Message-ID: <LNXP265MB1562810EC75BB7A638891C97A5010_at_LNXP265MB1562.GBRP265.PROD.OUTLOOK.COM>
It be interesting to see the previous 5 or 6 seconds of lgwr WAITS - using the "tim=" values as indicators to get to a point before that log file sync started. The foreground doesn't have to send a message to the log writer if the log writer is already writing because it knows that LGWR will check if there's more to write before going to sleep. Possibly there's a race condition where the foreground decides that it doesn't need to send a message but the log writer has decided there's no more to write - the log writer's activity in the few seconds before it's 2.9 second wait might give us a clue. I'm also a little puzzled that a foreground could wait 6.9 seconds. I thought there was a timeout for LFS (either 1 second or 1/10th second - possibly version dependent, and settable by hidden parameter). So, unless the foreground times out but doesn't start a new wait, it can't wait for 6.9 seconds. Purely for the sake of collecting information I think I'd take a look at smon_scn_time to to see what time the foregrounds "sync scn" corresponded to, and to check the SCNs for the start and end time of the LFS wait. Regards Jonathan Lewis ________________________________________ From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on behalf of Tanel Poder <tanel_at_tanelpoder.com> Sent: 02 February 2020 03:50 To: Tanel Poder Cc: Chris Stephens; Noveljic Nenad; oracle-l Subject: Re: intermittent long "log file sync" waits But your trace file here already indicates that there's some glitch regarding LGWR wakeups/posting: Foreground: *** 2020-01-29T13:04:20.724666-06:00 WAIT #0: nam='log file sync' ela= 6909711 buffer#=58996 sync scn=976089279 p3=0 obj#=-1 tim=1025228531958 LGWR at the same time: *** 2020-01-29T13:04:20.725994-06:00 WAIT #0: nam='rdbms ipc message' ela= 2918443 timeout=300 p2=0 p3=0 obj#=-1 tim=1025228533313 The WAIT lines are printed out when the wait ends (and the *** timestamp line just before the WAIT line is printed out, if enough time has passed since the last *** line). So this shows that LGWR had happily been idle, waiting for its idle 3-second timeout to expire, or someone to wake it up (that's what "rdbms ipc message" means) - while someone had issued a commit and was waiting for LGWR acknowledgment. So this shows that LGWR didn't even get the post from your session, nor any other sessions (if you had concurrent transactional activity). Back in 11g, typically these problems happened when adaptive log file syncing switch between poll and post/wait wakeup method happened and you hit a bug. But in 12c there are more moving parts. -- Tanel Poder https://blog.tanelpoder.com/seminar --http://www.freelists.org/webpage/oracle-l Received on Sun Feb 02 2020 - 10:39:52 CET