Re: LOG_BUFFER

From: Ram Cheruvattath <ram.cheruvattath_at_gmail.com>
Date: Wed, 22 Jun 2016 10:10:02 -0400
Message-ID: <1C0C89F003A74D0BB3302415027D55D4_at_RAMPC>





Hi Jonathan

My apologies for not responding earlier. During peak redo generation we see a low number of concurrent transactions (150-200). These are large transactions, each generating 2-3MB of redo on average - mostly deletes, INSERT AS SELECT, etc.

About 2-3 weeks ago we increased _log_parallelism_max from 18 to 36. Since then, redo allocation latch waits are no longer at the top of the list. The change increased the number of public strands and, of course, the number of redo allocation latches. Interestingly, it also doubled the number of log writer workers.
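For anyone wanting to confirm the same thing after such a change, the strand count is quick to check (a sketch, run from a suitably privileged account; there should be one redo allocation latch child per public strand):

  select count(*)
  from   v$latch_children
  where  name = 'redo allocation';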

Thanks
Ram

From: Jonathan Lewis
Sent: Thursday, May 19, 2016 5:19 PM
To: 'oracle-l-freelists'
Subject: RE: LOG_BUFFER

I may have trouble modelling that on my laptop ;) If there are any boundary conditions causing odd effects in this area I think you're probably going to find most of them.

How many concurrent transactions are you seeing at peak times - and do these transactions all do a very small amount of work (say a couple of rows inserted or updated before commit), or are they doing work which is more batch-like (100+ rows to a transaction)? If you snapshot v$latch_children across a few minutes of stress, are the redo allocation latch sleeps evenly spread, or is there a suggestion of a pattern? Likewise, if you snapshot v$session_event across a few minutes of stress, are the waits for "latch: redo allocation" evenly spread, or is there a suggestion of a pattern?
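A crude way to take those snapshots (a sketch only - in practice use whatever snapshot mechanism you prefer, and work with deltas rather than totals since startup):

  create table redo_latch_snap as
  select child#, gets, misses, sleeps
  from   v$latch_children
  where  name = 'redo allocation';

  -- ... wait through a few minutes of peak load, then:
  select c.child#,
         c.gets   - s.gets   delta_gets,
         c.misses - s.misses delta_misses,
         c.sleeps - s.sleeps delta_sleeps
  from   v$latch_children c
  join   redo_latch_snap  s on s.child# = c.child#
  where  c.name = 'redo allocation'
  order by delta_sleeps desc;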

At this point I'm thinking about NUMA problems - along the lines of some of your redo allocation latches not being local to the CPUs that are trying to acquire them, resulting in being beaten by the CPUs where the latches are local. Wondering if this would show up as a non-uniform distribution pattern of misses and waits.

The granule size and log buffer sizes are loosely linked - the SGA is split into fixed memory granules of a uniform size, and at least one granule had to be reserved for the log buffer(s) - and, if I recall correctly, the "fixed SGA" (x$ksmfsv) shared the granule. In many cases a single granule was more than enough to hold all the log buffer space that the instance required, so (from 10g onwards) people often found that their log buffer was reported as much bigger than the size they had seen in earlier versions because Oracle basically expanded the log buffer to fill most of the granule. In large cases, though, the log buffer allocation could take more than a single granule; and since you've specified a very large log buffer you'll see multiple granules allocated.    
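If you want to see the two figures side by side, v$sgainfo reports both (a sketch - the row names below are the ones I've seen on recent versions):

  select name, bytes
  from   v$sgainfo
  where  name in ('Granule Size', 'Redo Buffers');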

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle


From: Ram Cheruvattath [ram.cheruvattath_at_gmail.com]
Sent: 19 May 2016 14:38
To: Jonathan Lewis; 'oracle-l-freelists'
Subject: Re: LOG_BUFFER

Hi Jonathan

Yes. This is an Oracle RAC 12c database with 4 instances. The database is approximately 500TB. There are 288 physical cores on each server, and multithreading is disabled. Each node has 5.5TB of memory; the SGA is approximately 4.5TB. Whenever we experience high waits for the redo allocation latch there is no indication of CPU shortage - I have verified from both AWR and OSW that overall CPU idle is around 30-35% when this happens, and that is very consistent. The LG* processes are running in the RT (real-time) scheduling class.

Yes. The default granule size would have been 512M, but it is currently 128M because _ksmg_granule_size is set explicitly. We plan on removing it from the spfile so it can default.

With the default no. of redo strands (18) I am seeing 10 log writer workers. However, on our test system, after changing _log_parallelism_max to 36, I am seeing 18 workers. Of course, the number of redo allocation latches has also gone up to 36.

Finally, I was under the impression that granule size does not apply to log_buffer. Is that not the case?

Thanks
Ram

From: Jonathan Lewis
Sent: Wednesday, May 18, 2016 3:48 PM
To: 'oracle-l-freelists'
Subject: RE: LOG_BUFFER

Looks like you have a very large SGA (> 64GB), with 512MB granules and two of them for the log buffer. Do you really have 288 CPUs, or do you have a smaller number of cores that are pretending to be multiple CPUs but not doing it very well?

Waits for the redo allocation latch are deceptive - they might be waits for the public strand latches as private strands are due to be copied into the public strands, or they may be waits for the private strand latches because you have too many processes trying to get private redo all the time.

Log file sync > 200ms is suspect, and I think it points to CPU starvation (which is why I asked about the cores/CPUs difference). The log file sync timeout is 10 centiseconds by default (i.e. 100 ms), so processes should have timed out and checked for log write completion. It's possible that if the write hasn't completed the session resumes the same wait rather than starting a new one - I don't know whether or not it does; I suppose I could check by suspending the log writer - but if it starts a new wait, then a 200-300 ms log file sync wait means the process is going onto the run queue after no more than 100ms but not running for another 100-200 ms, i.e. CPU starvation. (In fact, any time you see log file parallel write = quick but log file sync = much slower, you can suspect CPU starvation - otherwise start checking Frits Hoogland's blog for other reasons for log writer anomalies under pressure.)
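A first-pass check for that quick-write/slow-sync signature (a sketch using totals since startup - deltas across the busy period would be more honest):

  select event,
         total_waits,
         round(time_waited_micro / nullif(total_waits, 0) / 1000, 1) avg_ms
  from   v$system_event
  where  event in ('log file sync', 'log file parallel write');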

The thinking on cores vs. CPUs: if you have (say) 64 cores claiming to be 288 CPUs you might actually find that you are reporting 17% CPU utilisation (thanks to the arithmetic CPU count) while suffering CPU starvation (thanks to core count).

And I've just realised you said gc flush time - which is largely about the speed for log file syncs triggered by the RAC background processes - so you're running RAC, which means (even in 12c, I think) no private strands (check x$kcrfstrand to confirm) which means your latch waits are for the public redo strands.    
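Something like this should confirm it (a sketch, run as SYS - x$ structures are undocumented, and treating last_buf_kcrfa = '00' as the marker for private strands is an assumption based on the versions I've played with):

  select decode(last_buf_kcrfa, '00', 'private', 'public') strand_type,
         count(*)                                          strands,
         sum(strand_size_kcrfa)                            total_bytes
  from   x$kcrfstrand
  group by decode(last_buf_kcrfa, '00', 'private', 'public');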

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle


From: Ram Cheruvattath [ram.cheruvattath_at_gmail.com]
Sent: 18 May 2016 18:35
To: Jonathan Lewis; 'oracle-l-freelists'
Subject: Re: LOG_BUFFER

Thanks for the detailed explanation. On this database the total number of strands is 18 (ncpu/16) and each strand is about 56M. We see significant waits for the redo allocation latch when redo generation is at its peak, so we are looking at increasing the number of strands. We do see high averages for log file sync (200-300 ms), along with similar numbers for gc cr/current block flush time, during peak redo generation. Log file parallel writes are usually under 3-4ms, though they may go up to 20-25ms at times.

IMHO, since the database is in noarchivelog mode anyway, we should use nonlogged operations as much as possible. But...

Ram

From: Jonathan Lewis
Sent: Wednesday, May 18, 2016 11:54 AM
To: 'oracle-l-freelists'
Subject: RE: LOG_BUFFER

The documentation on the log_buffer parameter has hardly changed since 8i - although its degree of accuracy has changed significantly. On my 2 CPU 12c instance at present I see that log_buffer has defaulted to 1.3M, though checking x$kcrfstrand I see this is really 2 public strands of 6.5M each.

The 1MB trigger is correct even though it's not in the documentation - it's easy to demonstrate. The documentation also doesn't point out that the log writer will start a new write immediately after it finishes processing a write if it finds there is still some log waiting to be written; nor does it point out that processes can wake themselves up and detect that their bit of log buffer has been written BEFORE the log writer gets around to posting them - which is why the old threat of "log buffer too big" is not as important as it used to be.

The claim that more than 1MB is "not useful" pre-dates machines with 64 CPUs and a couple of hundred (actually) concurrent sessions, so you can (and ought to) ignore it. It's actually important in busy systems to have a large log buffer, otherwise lots of sessions can end up waiting on "log buffer space". The log writer can write asynchronously in multiple chunks of up to 1MB each - so a "single" write can take some time, during which there may be lots of sessions still working to fill the log buffer; that's what a buffer is for. (12c allows multiple log writer slaves - lg00 to lg99 - to help with very busy systems rather than depending solely on async I/O.)
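If you want to see how many of those workers you've got (a sketch - on the 12c systems I've looked at they appear as LGnn background processes alongside LGWR):

  select pname
  from   v$process
  where  pname like 'LG%'
  order by pname;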

1GB sounds big - but if you're not seeing much time lost on "log file sync" waits (and assuming that CPU overload is not contributing to those waits) and the typical duration of the log file sync waits is similar to the duration of the log file parallel writes then its size is not the cause of a problem.

(Standard caveat: I didn't write the specification or code for the Oracle program - so I may be missing something that I might notice if I had your system running in front of me.)

If you really wanted to get some idea of whether or not the buffer was much bigger than it needed to be you could monitor x$kcrfstrand (extremely frequently) to keep checking how much of the public buffer was in use at any instant - but I've never felt it necessary to do this so I have no idea whether trying to do this MIGHT introduce a performance problem.    
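A sketch of the sort of query I mean - with the caveat that the column names here are from memory, and treating bytes_in_buf_kcrfa_cln as the "currently used" figure is an assumption, so check against your version before trusting it:

  select indx,
         strand_size_kcrfa,
         bytes_in_buf_kcrfa_cln   -- assumed: bytes currently in use in this public strand
  from   x$kcrfstrand
  where  last_buf_kcrfa != '00';  -- public strands only (assumption as above)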

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle


From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of Ram Cheruvattath [ram.cheruvattath_at_gmail.com]
Sent: 18 May 2016 16:18
To: Mark W. Farnham; Hemant-K.Chitale_at_sc.com; 'oracle-l-freelists'
Subject: Re: LOG_BUFFER

Yes. I have not come across anything that says a large log_buffer is an issue (other than wasting memory if it is not being used). Even the Oracle documentation says that, although, per the documentation, a size of more than 1MB is generally not useful. In one of Tom's articles I read that LGWR writes redo out when it gets to 1MB regardless of other factors; I have not seen that in the documentation (12c) though. I wonder if that is why more than 1M is considered not useful, though I would think that the rest of the space can still be used by DML in progress while LGWR is writing.

Configuring the Redo Log Buffer
Server processes making changes to data blocks in the buffer cache generate redo data into the log buffer. The log writer process (LGWR) begins writing to copy entries from the redo log buffer to the online redo log if any of the following conditions are true:

  - The redo log buffer becomes at least one-third full

  - LGWR is posted by a server process performing a COMMIT or ROLLBACK

  - A database writer process (DBWR) posts LGWR to do so

Sizing the Redo Log Buffer
The default size of the redo log buffer is calculated as follows:

MAX(0.5M, (128K * number of cpus))
On most systems, sizing the log buffer larger than 1 MB does not provide any performance benefit. However, increasing the size of the redo log buffer does not have any negative implications on performance; it merely uses more memory. More importantly, a modestly-sized redo log buffer can significantly enhance throughput on systems that perform many updates. Applications that insert, modify, or delete large volumes of data may require a change to the default size of the redo log buffer.

To determine if the size of the redo log buffer is too small, monitor the redo log buffer statistics, as described in "Using Redo Log Buffer Statistics". You can also check if the log buffer space wait event is a significant factor in the wait time for the database instance. If it is not, then the log buffer size is most likely adequately-sized.

http://docs.oracle.com/database/121/TGDBA/tune_buffer_cache.htm#TGDBA556
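For what it's worth, plugging our CPU count into that documented formula is simple arithmetic:

  MAX(0.5M, 128K * 288) = MAX(0.5M, 36M) = 36M

so the calculated default here would be about 36MB, against our 1GB setting. The suggested check for the log buffer space event is also easy (a sketch using totals since startup; deltas over the busy period would be better):

  select total_waits, time_waited
  from   v$system_event
  where  event = 'log buffer space';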

From: Mark W. Farnham
Sent: Wednesday, May 18, 2016 9:43 AM
To: Hemant-K.Chitale_at_sc.com; ram.cheruvattath_at_gmail.com; 'oracle-l-freelists'
Subject: RE: LOG_BUFFER

If you never come close to filling the LOG_BUFFER before a flush event, then the downside is occupying memory you never use.

I have *not observed* Oracle performance degrade due to having a large LOG_BUFFER if there is no essential better use for the memory such as caching large frequently referenced look up tables.

I usually safely oversize the LOG_BUFFER so that hitting the buffer one-third full is not routinely the flushing event, but I’ve never needed as much as you are currently using.

Once your LOG_BUFFER is big enough so you cannot write one-third of it before the time triggered flush, I *believe* that constitutes a useful upper size limit, even if your jobs are all single commit monoliths. (There may be other useful upper size limits that are smaller, ergo the use of “a” not “the” in the phrase “a useful upper size limit.”)
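Putting rough numbers on that for this thread (my assumptions: the time-triggered flush is the usual 3-second LGWR idle write, and "write one-third of it" means the sessions filling one-third of the buffer):

  4TB/hour of redo                     ~= 1.14 GB/s at peak
  redo generated per 3-second cycle    ~= 3.4 GB
  buffer whose one-third exceeds that  ~= 10 GB

so by that criterion the useful upper limit at the stated peak rate would be somewhere around 10GB, and the current 1GB buffer should still be hitting the one-third trigger rather than the timer.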

*Someone let me know if that is not still at least theoretically correct.*

So while you *might* be “wasting” some memory, I’m not aware of any other harm. I don’t believe they will let you set it bigger than the program can handle.

mwf

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Chitale, Hemant K
Sent: Wednesday, May 18, 2016 2:02 AM
To: ram.cheruvattath_at_gmail.com; oracle-l-freelists
Subject: RE: LOG_BUFFER

What kind of storage do you write to at 4TB/hour? And how and when do you back up and purge/delete the archivelogs?

I’ve seen log buffer exceeding 1GB.

Hemant K Chitale

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Ram Cheruvattath
Sent: Wednesday, May 18, 2016 3:00 AM
To: oracle-l-freelists
Subject: LOG_BUFFER

Hi

We have a 12c database that has very high redo generation (4TB/hr). Our log_buffer is set to 1GB.

Is there a downside to setting log_buffer this big?

Thanks

Ram






