Re: Oracle RAC and IRQ Balance
Date: Sun, 9 Oct 2011 17:35:34 -0700
Message-ID: <CAGXkmivUtN9Re6KkZihuopfufGBVHBxWe2Dcdq6sncJVQWOEBQ_at_mail.gmail.com>
A few things:
- Just for clarity - this isn't RAC-specific. The issue of "burning" an entire core/thread on interrupts can happen on any system sending enough packets. I've seen it plenty of times on network interfaces to chatty application tiers.
- Even though on said core/thread there is 41.33% user, the %idle is only 4.08% - so this little guy is almost out of gas.
As long as 1 core/thread doesn't run out of gas, this shouldn't be an issue, but in this case, it's pretty darn close -- too close for my comfort. I'd recommend enabling irqbalance and monitoring the workload & sys metrics carefully. More details on this can be found at http://irqbalance.org/
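One quick way to see how interrupts are spread across CPUs, before and after enabling irqbalance, is to total the per-CPU columns of /proc/interrupts. The following is a minimal sketch, assuming the usual layout (a header row of CPUn labels, then one row per IRQ). The function name irq_totals is made up for illustration and is not part of irqbalance or any other tool.

```shell
#!/bin/sh
# irq_totals [FILE]: sum interrupt counts per CPU from a
# /proc/interrupts-style file (hypothetical helper, illustration only).
irq_totals() {
    awk '
        NR == 1 { ncpu = NF; next }        # header row: CPU0 CPU1 ...
        NF > ncpu {                        # skip short rows such as ERR: and MIS:
            for (i = 2; i <= ncpu + 1; i++)
                if ($i ~ /^[0-9]+$/) total[i - 1] += $i
        }
        END {
            for (c = 1; c <= ncpu; c++) grand += total[c]
            for (c = 1; c <= ncpu; c++)
                printf "CPU%d %.0f %.1f%%\n", c - 1, total[c],
                       (grand ? 100 * total[c] / grand : 0)
        }
    ' "${1:-/proc/interrupts}"
}

# Print the distribution on a live Linux system, if the file is readable.
if [ -r /proc/interrupts ]; then
    irq_totals
fi
```

The counts in /proc/interrupts are cumulative since boot, so after irqbalance is started the percentages shift only gradually; comparing two runs taken some minutes apart gives a better picture than a single snapshot.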
You may find that collectl [http://collectl.sourceforge.net/] comes in handy for gathering & monitoring this sys metric (and others!). My mantra on collectl is: "If your OS is Linux and you are not using collectl, you probably should be." (I'm a big fan.)
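If collectl is not installed yet, a rough cross-check of per-CPU %idle can be derived from two snapshots of /proc/stat. This is only a sketch under stated assumptions: the cpuN rows carry the counters user, nice, system, idle, iowait, irq, softirq, steal in that order, and idle is counted without iowait, which should make the figure roughly comparable to the %idle column of mpstat. The function name cpu_idle is hypothetical.

```shell
#!/bin/sh
# cpu_idle BEFORE AFTER: per-CPU idle percentage between two
# /proc/stat snapshots (hypothetical helper, illustration only).
cpu_idle() {
    awk '
        /^cpu[0-9]+ / {                    # per-CPU rows only, not the aggregate "cpu" row
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i   # user..steal
            if (FNR == NR) {               # first file: remember the baseline
                t0[$1] = total; idle0[$1] = $5; next
            }
            dt = total - t0[$1]            # second file: report the delta
            printf "%s %.2f\n", $1, (dt > 0 ? 100 * ($5 - idle0[$1]) / dt : 0)
        }
    ' "$1" "$2"
}

# Sample one second of activity on a live Linux system, if possible.
if [ -r /proc/stat ]; then
    before=$(mktemp) && after=$(mktemp)
    cat /proc/stat > "$before"; sleep 1; cat /proc/stat > "$after"
    cpu_idle "$before" "$after"
    rm -f "$before" "$after"
fi
```

A CPU that stays in the low single digits here over repeated samples is the one to watch; a proper tool such as collectl or mpstat remains the better choice for ongoing monitoring.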
Cheers,
On Sun, Oct 9, 2011 at 10:00 AM, Riyaj Shamsudeen
<riyaj.shamsudeen_at_gmail.com> wrote:
> Hello Jed
> NICs interrupt the CPU to deliver packets. Of course, in a busy RAC
> database, a huge number of network packets can be transferred, leading
> to high IRQ rates. If IRQs are pinned to one CPU, then latency on that
> CPU can cause issues, because the kernel threads that service the IRQs
> can be scheduled only on that CPU.
> If you want IRQs to be pinned to one CPU, then you should make sure that
> no other process is scheduled to execute on that CPU. But I see about 40%
> usage in USER mode on that CPU, which indicates that this is probably not
> happening in your case.
> But why is this important for you? Do you see network delays causing RAC
> performance issues? If yes, then I don't see an issue with letting all
> CPUs service IRQs. Also, I am surprised that this is not the default.
>
>
> On Mon, Oct 3, 2011 at 4:29 PM, Walker, Jed S
> <Jed_Walker_at_cable.comcast.com> wrote:
>
>> Back to my learning of RAC. Today, it was suggested that we turn on
>> IRQBALANCE on our Oracle 11.2.0 RAC systems to help distribute the IRQ load,
>> to hopefully help with performance. I did a check and can see that just one
>> CPU appears to be handling all of these.
>> mpstat -P ALL 2
>> Linux 2.6.18-53.el5 (node-01) 10/03/2011
>>
>> 09:19:46 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>> 09:19:48 PM  all   14.30    0.00    3.04   23.54    0.25    1.27    0.00   57.59  10903.06
>> 09:19:48 PM    0   41.33    0.00    9.18   40.31    1.02    4.08    0.00    4.08  10902.55
>> 09:19:48 PM    1    2.55    0.00    0.51   14.29    0.00    0.00    0.00   82.65      0.00
>> 09:19:48 PM    2   12.24    0.00    2.04   34.18    0.00    0.00    0.00   52.04      0.00
>> 09:19:48 PM    3    1.02    0.00    0.51    6.63    0.00    0.00    0.00   92.35      0.00
>> (this is consistent over a period of time)
>>
>> I then read an article saying that in many cases this doesn't matter -
>> something to do with processes being pinned to a CPU (Sorry, I can't find
>> the article again!).
>>
>> Does anyone have any experience, or is there a good practice for this and
>> RAC?
>>
>> service irqbalance start
>> chkconfig irqbalance on
>>
--
Regards,
Greg Rahn
http://structureddata.org
--
http://www.freelists.org/webpage/oracle-l
Received on Sun Oct 09 2011 - 19:35:34 CDT