Re: lgwr performance

From: Riyaj Shamsudeen <riyaj.shamsudeen_at_gmail.com>
Date: Thu, 12 Jan 2012 09:57:43 -0600
Message-ID: <CAA2DszxYqMQD21C+MLb8nRP10XeYCmQ18XBXknA+aR1-KouBxQ_at_mail.gmail.com>



Hello Purav,

So, your log file parallel write is averaging 69 ms. That number is high. By any chance, are you running your database in Maximum Availability mode? Can you post your values for the log_archive_dest* parameters?

I don't have an answer for you, but my thinking would be along these lines: if the database is not running in Maximum Availability mode, then you have to figure out whether the wait time is being spent on the OS side, on the SAN side, or in the path between them.

What type of LUNs do you have? Are you using iSCSI and, if so, is the network route optimal? Is the iostat output that you mentioned specific to the redo devices? Has the SAN cache been configured for write caching? Who else is using the same SAN? Is DBWR suffering from the same I/O write latency?

Cheers

Riyaj Shamsudeen
Principal DBA,
Ora!nternals - http://www.orainternals.com - Specialists in Performance, RAC and EBS
Blog: http://orainternals.wordpress.com
OakTable member http://www.oaktable.com and Oracle ACE Director

Co-author of the books: Expert Oracle Practices
<http://tinyurl.com/book-expert-oracle-practices/>, Pro Oracle SQL, Expert
PL/SQL Practices <http://tinyurl.com/book-expert-plsql-practices>

On Thu, Jan 12, 2012 at 8:40 AM, Purav Chovatia <puravc_at_gmail.com> wrote:

> Our applications connect to only 1 node of a 2-node 10gR2 (10.2.0.5)
> RAC system on Solaris 10. ASM is configured.
> The TPS is mostly around 5 and at times 11.
> There is hardly any load on the server. CPU is around 8% on a HP DL380
> box (it has 2 cpu with 6 cores each). Out of that 8%, 7.9% is consumed
> by lgwr. I don't understand why. prstat -Lw for the lgwr pid shows that
> there are 2 LWPs and for one of them CPU in SYS mode is around 50-60%.
>
> To troubleshoot, checked statspack and AWR reports and found that
> maximum time spent is on log file sync (avg is 75msec). For log file
> parallel write, avg is 69msec. I understand that is too high but
> iostat always shows 0.2msec. Why this mismatch?
> So enabled 10046 tracing for the lgwr and it showed only log file
> parallel write and ela was in line with what is reported by AWR i.e.
> 69msec.
> asmiostat shows 0msec and after every 6-8 seconds, it shows 135msec.
> But I think that the asmiostat script as available on metalink has a
> bug, because what it gathers is in centisec and what it displays is in
> millisec. Hence it should multiply by 10, but it multiplies by 1000.
> Please correct me if I am wrong here.
>
> As a result of the above, the system is very vulnerable to contention.
> At times, application sessions wait for as long as 10-15 seconds on
> log file sync and as a result, apps restart.
>
> Thanks.
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>
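
On the asmiostat units question in the quoted message: the arithmetic can be checked with a small sketch. It assumes the quoted description of the script is accurate (raw values gathered in centiseconds, multiplied by 1000 before display); that has not been verified against the script itself. The function names are hypothetical and only serve the illustration.

```python
# Assumption (from the quoted message, not verified against the script):
# asmiostat gathers times in centiseconds but scales them by 1000 for display.
CENTISEC_TO_MS = 10  # 1 centisecond = 10 milliseconds


def centisec_to_ms(value_cs):
    """Convert a time in centiseconds to milliseconds."""
    return value_cs * CENTISEC_TO_MS


def correct_displayed_ms(displayed, applied_factor=1000):
    """Undo the scaling factor the script applied and apply the right one.

    displayed      -- the value the script printed, labelled as milliseconds
    applied_factor -- the factor the script is assumed to have used
    """
    return displayed * CENTISEC_TO_MS / applied_factor


# The 135 reported every 6-8 seconds, rescaled under the assumption above.
print(correct_displayed_ms(135))  # 1.35
```

If that assumption holds, the displayed figures are 100 times too large, so the periodic 135 msec would really be about 1.35 msec. That would only explain the asmiostat spikes. The 69 msec for log file parallel write comes from AWR and the 10046 trace, so it is not affected by any bug in the script.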

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jan 12 2012 - 09:57:43 CST
