RE: RAC interconnect packet size??
Date: Wed, 22 Apr 2009 10:04:50 -0400
I hear what you're saying, but, because the LMS processes were by far the biggest CPU hogs, I was thinking that the overhead of breaking down and reassembling packets was the primary cause of CPU starvation.
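For a rough sense of the fragmentation overhead in question, here is a back-of-the-envelope sketch. It assumes an 8 KB database block shipped as a single UDP datagram over IPv4, with a 20-byte IP header (no options) and an 8-byte UDP header. The block size, the function name, and the choice to ignore any Oracle message headers are assumptions for illustration, not measurements from this system.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8   # bytes

def ip_fragments(payload_bytes: int, mtu: int) -> int:
    """Number of IPv4 packets needed to carry one UDP datagram at a given MTU."""
    datagram = payload_bytes + UDP_HEADER      # the UDP header travels in the first fragment
    max_data = mtu - IP_HEADER                 # most data any single IP packet can carry
    if datagram <= max_data:
        return 1                               # fits in one packet, no fragmentation
    per_fragment = max_data // 8 * 8           # non-final fragments carry a multiple of 8 bytes
    # One packet of up to max_data bytes, the rest in per_fragment-sized pieces.
    return 1 + math.ceil((datagram - max_data) / per_fragment)

if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {ip_fragments(8192, mtu)} packet(s) per 8 KB block")
```

Under those assumptions, each block is split into six fragments at an MTU of 1500 and travels as a single packet at an MTU of 9000. Every extra fragment has to be built on the sending node and reassembled on the receiving one.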
As I said, we're currently in "wait and see" mode, hoping that we've seen the last of these events. Obviously, if I see more CPU starvation, I'll have to re-think the root cause. But, as I mentioned before, enabling jumbo frames is the "right" thing to do, and there's really no downside, so....
Anyhow, we'll see what happens.
From: Tanel Poder [mailto:tanel_at_poderc.com]
Sent: Wednesday, April 22, 2009 7:33 AM
To: Bobak, Mark; 'Greg Rahn'
Cc: TESTAJ3_at_nationwide.com; oracle-l_at_freelists.org
Subject: RE: RAC interconnect packet size??
I haven't read the whole thread, but if you're troubleshooting high waits during CPU starvation, there's one thing to remember. When you have serious CPU starvation with long CPU run queues (and therefore long scheduling latency), whatever increased waits you see may just be symptoms of the CPU starvation: because of the scheduling latency, it takes longer for an Oracle process to get onto CPU to execute the "wait end" function, so the recorded wait is longer than the event itself.
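A toy model of that effect, as a minimal sketch: the recorded wait is treated as the real duration of the event plus the time the process sits in the run queue before it can mark the wait as ended. The function name and all the millisecond values are hypothetical and only illustrate the arithmetic.

```python
def recorded_wait_ms(event_ms: float, run_queue_delay_ms: float) -> float:
    """Wait as the process itself records it: the event's real duration plus
    the time spent runnable, but not yet on CPU, before it can end the wait."""
    return event_ms + run_queue_delay_ms

if __name__ == "__main__":
    event_ms = 1.0  # hypothetical: the event really completes in 1 ms
    for delay_ms in (0.0, 5.0, 20.0):  # hypothetical scheduling delays
        total = recorded_wait_ms(event_ms, delay_ms)
        print(f"run queue delay {delay_ms} ms -> recorded wait {total} ms")
```

The event itself is no slower in any of these cases; only the time to get back onto CPU changes, which is why inflated waits alone do not identify the root cause.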
Of course, if your LMS processes normally don't use as much CPU, you might have something there. Otherwise, just look at what all the sessions that are trying to get onto CPU are doing (from ASH or some other form of v$session history) and how that differs from a normal situation: for example, whether the execution plan of the prevalent SQL_ID at the time is the same as usual.
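One way to make that comparison concrete is sketched below. It assumes the ASH samples for a normal period and for the problem period have already been fetched into lists of dicts keyed by the v$active_session_history column names sql_id, sql_plan_hash_value and session_state. The function names and the 10% threshold are hypothetical choices for illustration.

```python
from collections import Counter

def on_cpu_profile(samples):
    """Share of ON CPU samples per (sql_id, sql_plan_hash_value) pair."""
    on_cpu = [
        (s["sql_id"], s["sql_plan_hash_value"])
        for s in samples
        if s["session_state"] == "ON CPU"
    ]
    total = len(on_cpu)
    if total == 0:
        return {}
    return {key: n / total for key, n in Counter(on_cpu).items()}

def new_or_grown(baseline, incident, threshold=0.10):
    """Pairs whose ON CPU share rose by more than `threshold` versus baseline."""
    return sorted(
        key for key, share in incident.items()
        if share - baseline.get(key, 0.0) > threshold
    )
```

A SQL_ID that shows up with a plan hash value not seen in the baseline is the "same statement, different plan" case; a pair that was already present but now takes a much larger share of CPU samples points at a change in workload instead.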
> -----Original Message-----
> From: oracle-l-bounce_at_freelists.org
> [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Bobak, Mark
> Sent: 22 April 2009 14:26
> To: Greg Rahn
> Cc: TESTAJ3_at_nationwide.com; oracle-l_at_freelists.org
> Subject: RE: RAC interconnect packet size??
> Hi Greg,
> I agree. Allow me to describe what we were seeing:
> - CPU spikes w/ run queues going into the 20s, very low or
> no %wait for I/O, 0% idle
> - Looking at V$SESSION_WAIT, lots of waits on gc wait events
> - up to four LMS processes, burning CPU like crazy
> (all this on a three node RAC of DL-585s, 4 dual core CPUs per node)
> The above seemed to be consistent with a system w/ a busy
> interconnect and no jumbo frames configured.
> Only time will tell whether enabling jumbo frames actually
> solved the problem.
> One other thing, assuming that all your hardware (all NICs
> and interconnect switches) supports a jumbo frame
> configuration, there should really be no downside to enabling them.