RE: LGWR, EMC or app cursors?

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Fri, 11 Oct 2019 10:55:17 -0400
Message-ID: <0aad01d58043$e548df20$afda9d60$_at_rsiz.com>





Apart from any i/o issue, you CAN at least put the rows into different blocks and avoid the gcc block contention.  

You won’t miss the extra 2 blocks.    

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Herring, David (Redacted sender "HerringD" for DMARC) Sent: Thursday, October 10, 2019 1:03 PM To: exriscer_at_gmail.com; dmarc-noreply_at_freelists.org Cc: oracle-l_at_freelists.org
Subject: RE: LGWR, EMC or app cursors?  

The code in question is probably going to raise some eyebrows but here goes since I knew I'd have to explain at some point (and then give the excuse it's a 3rd party app):  

I guess that's a VERY long answer. Just confirming that COMMITs, although quite frequent, are no more than 3 every 5 sec per app design.  

Regards,  

Dave  

cid:image001.png_at_01D05044.5C2AEE60  

Dave Herring

DBA 103 JFK Parkway

Short Hills, New Jersey 07078

Mobile 630.441.4404  

 <http://www.dnb.com/> dnb.com  

 <http://www.facebook.com/DunBradstreet> cid:image002.png_at_01D05044.5C2AEE60 <http://twitter.com/dnbus> cid:image003.png_at_01D05044.5C2AEE60 <http://www.linkedin.com/company/dun-&-bradstreet> cid:image004.png_at_01D05044.5C2AEE60 <http://www.youtube.com/user/DunandBrad> cid:image005.png_at_01D05044.5C2AEE60  

From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Ls Cheng Sent: Thursday, October 10, 2019 7:24 AM To: dmarc-noreply_at_freelists.org
Cc: oracle-l_at_freelists.org
Subject: Re: LGWR, EMC or app cursors?  

CAUTION: This email originated from outside of D&B. Please do not click links or open attachments unless you recognize the sender and know the content is safe.  

Hi  

Can you try to identify the number of commits when that happens plus the IOPS?  

I had a similar problems years ago, it turns our in the code a loop ran 2500 commits per sec  

 <https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.avast.com%2Fsig-email%3Futm_medium%3Demail%26utm_source%3Dlink%26utm_campaign%3Dsig-email%26utm_content%3Dwebmail&data=02%7C01%7Cherringd%40dnb.com%7Ce06948109f034f7f372008d74d7cd85b%7C19e2b708bf12437597198dec42771b3e%7C0%7C0%7C637063070921709817&sdata=8Hfty12ixkrOqg9EfpPgu0kP9Kt7wrj6Fx4B2OM%2F8F0%3D&reserved=0>

Virus-free. <https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.avast.com%2Fsig-email%3Futm_medium%3Demail%26utm_source%3Dlink%26utm_campaign%3Dsig-email%26utm_content%3Dwebmail&data=02%7C01%7Cherringd%40dnb.com%7Ce06948109f034f7f372008d74d7cd85b%7C19e2b708bf12437597198dec42771b3e%7C0%7C0%7C637063070921709817&sdata=8Hfty12ixkrOqg9EfpPgu0kP9Kt7wrj6Fx4B2OM%2F8F0%3D&reserved=0> www.avast.com  

On Mon, Oct 7, 2019 at 5:20 PM Herring, David <dmarc-noreply_at_freelists.org> wrote:

Folks, I've got a bit of a mystery with a particular db where we're getting a periodic 25-30 pause between user sessions and LGWR processes and can't clearly identify what's the cause.  

· The database is 11.2.0.4, RHEL 7.5, running ASM on EMC.

· Sometimes once a day, sometimes more (never more than 5) times a day we see user processes start waiting on "log file sync". LGWR is waiting on "log file parallel write".

· At the same time one of the emcpower* devices shows 100% busy and service time 200+ (from iostat via osw). mpstat shows 1 CPU at 100% on iowait. It's not always the same disk (emcpowere1, a1, h1, …), not always the same CPU. EMC and sysadmins have confirmed there are no disk errors and from their standpoint the disks are waiting on Oracle.

· During this time LGWR stats in ASH are all 0 - TIME_WAITED, DELTA* columns. Only after the problem goes away (about 25 secs) these columns are populated again, obviously the DELTA* columns 1 row later. LGWR's session state is WAITING so I assume the column value observations are due to LGWR waiting, as it won't write stats until it can do something.  

I am stuck trying to find out, really prove who is the culprit or what exactly the wait is on. Is LGWR waiting on user sessions and user sessions are waiting on LGWR and all that causes the disk to be 100%? Can I enable some sort of tracing on LGWR and would that point to exactly what he's waiting on to prove where the problem is?  

Regards,  

Dave











--
http://www.freelists.org/webpage/oracle-l



Received on Fri Oct 11 2019 - 16:55:17 CEST

Original text of this message