Re: Enterprise Manager warns of blocking on LGWR

From: Andy Sayer <andysayer_at_gmail.com>
Date: Thu, 7 Dec 2017 23:09:14 +0000
Message-ID: <CACj1VR4LKwmZ-LG9v=bg6BUQF1Ds8YW=6ymczMsmM8PqrNYbGA_at_mail.gmail.com>



When a session commits, it waits on LGWR to tell it that it's transaction is safely preserved in an online redo log file. If your blocking wait is on log file sync then this could be an indication you are writing redo too fast for LGWR to handle (although as of 11.2 there is a new mechanism whereby the committing session will poll LGWR rather than wait to be told by LGWR, this would wouldn't show up as a block AFAIK). It could be any number of other scenarios though and it would be silly to guess and make changes based on a guess.

If you have access to v$active_session_history (if you have purchased the diagnostics pack licence), then you can query to figure out what sessions were being blocked by LGWR and what they were trying to do. If you don't have v$ash, then you could write your own sampling routine against v$session to record the required information (sid, username, module, sql_id, event, blocking_session, program.. sysdate) then when you get the alert you can have a dig into what exactly was going on at that time.

If this is only happening since upgrading then something has changed, a couple of guesses:
perhaps your application is committing more frequently in smaller batches perhaps the redo groups are not the same size as they were previously or you have less of them
perhaps it is something to do with the new LGnn processes in 12c. https://docs.oracle.com/database/121/REFRN/GUID-86184690-5531-405F-AA05-BB935F57B76D.htm "On multiprocessor systems, LGWR creates worker processes to improve the performance of writing to the redo log. LGWR workers are not used when there is a SYNC standby destination. Possible processes include LG00-LG99."

If you do have a process that is frequently committing in small batches but shouldn't really need to confirm that their redo has been written to the online logs, then perhaps you could look into the commit_wait parameter or the commit nowait option
https://docs.oracle.com/database/121/SQLRF/statements_4011.htm#SQLRF01110 https://docs.oracle.com/database/121/REFRN/GUID-40E51330-96D0-4B98-BBF3-6A45D970AAB1.htm#REFRN10266 But be aware that it is sacrificing recoverability of your data, I wouldn't take that decision lightly.

Regards,
Andy

On 7 December 2017 at 22:23, Rich J <rjoralist3_at_society.servebeer.com> wrote:

> On 2017/12/07 16:19, Rich J wrote:
>
> I had your exact problem, in addition to EM complaining about "blocking"
> locks between PQ slaves. For LGWR, the problem happens when there is
> over-committing, like doing 1M inserts with a commit between each row. I
> could not force this problem using SQL/PLSQL, but I did create a short Perl
> program which inserts rows from a significantly large table into a new
> one. I used sys.wrm$_snapshot_details, that had ~1.5M rows in it. Running
> the program triggered our alert.
>
>
> For clarification, the problem isn't necessarily EM per se, but in the
> mechanism that populates the DBA_OUTSTANDING_ALERTS view in the target
> database, which has been properly circumvented in EM13r2.
>
> Rich
>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Dec 08 2017 - 00:09:14 CET

Original text of this message