Re: log buffer size and log file syncs

From: Tanel Poder <tanel_at_tanelpoder.com>
Date: Sat, 28 Apr 2012 01:09:05 +0300
Message-ID: <CAMHX9JJCUms+YNX+2U3x1g3ZT8u1NzwJ5Y1qQ-oLGouzKjrdtQ_at_mail.gmail.com>



Yep, it's possible in theory and practice too - if you have some large (parallel) transactions generating redo on multiple CPUs - which flood the log buffer. This can drive the *log file sync* times up, as there's sometimes tens or 100+ MB of redo waiting to be flushed before the commit can return.
With lower log buffer size, your large transaction would run slower as it would wait for *log buffer space* wait much more. Your smaller transactions wait for log buffer space as well, but as there's less redo to be flushed in the "queue" then they'll get their redo in there faster and also get the write done sooner too.

*However* - I would first look into what the LGWR itself is doing (how much CPU it's consuming vs waiting for IO, what's the average IO size (and redo size per message received by LGWR etc). A common cause for such symptoms is that LGWR just doesn't get enough CPU.

No need to guess, you can run snapper on LGWR too (snapper just queries v$ views so it doesn't matter whether it's a background or foreground session):

    SID, USERNAME , TYPE, STATISTIC
, DELTA, HDELTA/SEC, %TIME, GRAPH


     98, (LGWR)    , STAT, non-idle wait count

, 2, .4,
98, (LGWR) , STAT, messages sent
, 2, .4,
98, (LGWR) , STAT, messages received
, 2, .4,
98, (LGWR) , STAT, background timeouts
, 2, .4,
98, (LGWR) , STAT, physical write total IO requests , 3, .6, 98, (LGWR) , STAT, physical write total multi block requests
, 1, .2,
98, (LGWR) , STAT, physical write total bytes , 232448, 46.49k, 98, (LGWR) , STAT, cell physical IO interconnect bytes , 232448, 46.49k, 98, (LGWR) , STAT, redo wastage , 28, 5.6, 98, (LGWR) , STAT, redo writes
, 1, .2,
98, (LGWR) , STAT, redo blocks written
, 274, 54.8,
98, (LGWR) , STAT, redo blocks checksummed by FG (exclusive)
, 180, 36,
98, (LGWR) , TIME, background elapsed time
, 861, 172.2us, .0%, | |
98, (LGWR) , WAIT, *rdbms ipc message* , 4981797, 996.36ms, 99.6%, |_at_@@@@@@@@@| 98, (LGWR) , WAIT, *log file parallel write* , 752, 150.4us, .0%, | |
-- End of Stats snap 1, end=2012-04-28 00:40:59, seconds=5

If you happen to run on Solaris, then you can also run *prstat -mLp* *PID*on LGWR's process to see how much time it waits in the CPU runqueue to get
onto CPU (the LAT column - scheduling latency).

Another thing to definitely check is to see whether async IO is properly enabled, especially in the case of multiplexed redo logs...

-- 
Tanel Poder
Enkitec Europe
http://www.enkitec.com/
Advanced Oracle Troubleshooting v2.0 Seminars in May/June 2012!
http://blog.tanelpoder.com/seminar/



On Fri, Apr 27, 2012 at 11:36 PM, CRISLER, JON A <JC1706_at_att.com> wrote:

> Has anybody run into a case where the log buffer size was too big, and
> negatively affected log file writes and log file sync waits ?
>
> I have a system that ignores my parameter for 32mb log file size and sets
> it to 128mb, due to the granule size.  The workaround is to lower the
> granule size so that I can manually set the redo log buffer lower.  But I
> am not sure if this will have any positive affect on log file sync times.
>  Normal I/O for log file writes is fast at 2-4 ms, but log file syncs are
> bad at 50 to 150 ms.
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>


--
http://www.freelists.org/webpage/oracle-l
Received on Fri Apr 27 2012 - 17:09:05 CDT

Original text of this message