
Re: 99% IOWAIT with Oracle RAC 10g (10.1.0.4) on Linux

From: <mccmx_at_hotmail.com>
Date: 14 Jun 2005 08:29:28 -0700
Message-ID: <1118762968.427565.116080@g44g2000cwa.googlegroups.com>


After a few days of fault-finding I've managed to get to the root of the problem (thanks to the help of Jermey and Daniel):

The /var/log/messages file had the following entries in it every time the IOWAIT went through the roof:

Jun 13 12:20:30 linux1 kernel: ieee1394: sbp2: aborting sbp2 command
Jun 13 12:20:30 linux1 kernel: Read (10) 00 05 e7 d2 80 00 00 05 00
Jun 13 12:20:30 linux1 kernel: ieee1394: sbp2: aborting sbp2 command
Jun 13 12:20:30 linux1 kernel: Read (10) 00 00 15 9a 60 00 00 05 00

So the problem appeared to be either in the sbp2 driver or in the hard drive itself. The hard drive has the Oxford 911 chipset, so my investigation centered on the sbp2 driver.
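To get a rough idea of how often the aborts occur, the abort lines can be counted in the log. This is only a minimal sketch: the function name count_sbp2_aborts is made up for illustration, and the default path is simply the /var/log/messages file mentioned above (reading it normally requires root).

```shell
# Hypothetical helper (not part of any package): count sbp2 abort lines
# in a syslog-style file. Defaults to the log file named in this post.
count_sbp2_aborts() {
    log_file="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (prints 0 if there are none).
    grep -c 'sbp2: aborting sbp2 command' "$log_file"
}
```

Running count_sbp2_aborts with no argument checks the default log; pass a file name to check a rotated or copied log instead.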

A good dig around Google for the abort messages above led me to an optional parameter for loading the sbp2 module.

sbp2_serialize_io

By adding the following line to /etc/modules.conf and rebooting each node, I have solved the problem.

options sbp2 sbp2_serialize_io=1

This option is generally used to work around bugs in the sbp2 driver, or for debugging purposes, so I suspect that it may be slower than the default setting. But for my purposes stability is the major priority.

Thanks to everyone who contributed to the thread....

BTW - to confirm whether this option is in effect, check for the following string in the 'dmesg' output:

ieee1394: sbp2: Driver forced to serialize I/O (serialize_io = 1)
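The check can be scripted along these lines. It is a sketch only: the function name sbp2_serialized is invented for illustration, and it assumes the driver prints the message exactly as quoted above.

```shell
# Hypothetical helper: read kernel messages on stdin and succeed (exit 0)
# only if the sbp2 serialize-I/O message is present.
sbp2_serialized() {
    # -q: no output, just the exit status; -F: match the text literally.
    grep -q -F 'sbp2: Driver forced to serialize I/O'
}

# Usage on each node:
#   dmesg | sbp2_serialized && echo "sbp2 I/O is serialized"
```

Keep in mind that the kernel message buffer is a fixed size, so on a node that has been up for a long time the message may have scrolled out of the dmesg output.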

Cheers

Matt

Received on Tue Jun 14 2005 - 10:29:28 CDT
