99% IOWAIT with Oracle RAC 10g (10.1.0.4) on Linux
Oracle 10.1.0.4 EE running on a 2-node RHEL 3 cluster (Oracle Firewire kernel)
Shared Storage : Maxtor One Touch II
Periodically, I/O to the shared device appears to hang (i.e. 99% I/O wait in 'top') for exactly 1 minute when both instances are running.
At first I suspected that this was just a 'top' reporting anomaly, so I traced a SQL statement with only one instance started; it ran for approximately 30 seconds.
I then traced the same statement with both instances running and the execution time jumped to 90 seconds, which corresponds to the normal 30 seconds plus this strange 60-second timeout. When I ran tkprof on the trace file, I could see that of the 90 seconds of response time, a single 'db file scattered read' took 59.8 seconds. This is highly unusual for one multiblock read:
Elapsed times include waiting on following events:
  Event waited on                      Times    Max. Wait  Total Waited
  -------------------------------     Waited   ----------  ------------
  SQL*Net message to client                2         0.00          0.00
  db file scattered read                6954         59.8         82.42
  SQL*Net message from client              2       276.12        276.12
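To put the 59.8-second read in perspective, here is a rough sanity check in Python using only the 'db file scattered read' figures from the tkprof output above. The function and variable names are illustrative, and treating "everything except the longest wait" as typical is an assumption, not something tkprof reports.

```python
# Figures copied from the tkprof wait summary for 'db file scattered read'.
waits = 6954          # Times Waited
max_wait_s = 59.8     # Max. Wait (seconds)
total_wait_s = 82.42  # Total Waited (seconds)


def typical_wait_ms(count, max_wait, total):
    """Average wait in milliseconds once the single longest wait is removed."""
    return (total - max_wait) / (count - 1) * 1000.0


avg_ms = typical_wait_ms(waits, max_wait_s, total_wait_s)
ratio = max_wait_s * 1000.0 / avg_ms
print("typical wait: %.2f ms, longest wait is about %.0fx that" % (avg_ms, ratio))
```

The remaining 6953 reads average roughly 3.25 ms each, so the one long wait is several orders of magnitude outside the normal range rather than a slow but ordinary read.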
This issue is easily repeatable.
What makes me think this is an I/O problem with the shared disk is that we had to increase the CSS misscount to 120 seconds because of repeated "Voting Disk timeout" errors, which used to crash CRS on one of the nodes.
Does anyone have any idea how to diagnose the source of this I/O hang?
When I run iostat during this period of 99% IOWAIT, there is no activity to the shared disk at all. 0 bytes read, 0 bytes written.
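One way to cross-check the 'top' figure against the idle iostat output would be to compute the iowait share directly from two samples of the aggregate "cpu" line in /proc/stat. The sketch below is only that, a sketch: it assumes the field order user, nice, system, idle, iowait, irq, softirq, steal documented for current Linux kernels (I have not verified that the RHEL 3 kernel exposes the same layout), and the sample lines are made-up values, not measurements from this cluster.

```python
def cpu_fields(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    parts = stat_line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(p) for p in parts[1:]]


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples.

    Assumes the field order user, nice, system, idle, iowait, irq,
    softirq, steal. Only those first eight fields are summed, because
    later guest fields (where present) are already counted in user time.
    """
    a = cpu_fields(before)[:8]
    b = cpu_fields(after)[:8]
    if len(a) < 5 or len(b) < 5:
        raise ValueError("no iowait field in this /proc/stat layout")
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


# Hypothetical samples taken a short interval apart (illustrative values).
sample_1 = "cpu 1000 0 500 8000 500 0 0"
sample_2 = "cpu 1001 0 501 8000 698 0 0"
print("iowait: %.1f%%" % iowait_percent(sample_1, sample_2))
```

Logging this value with a timestamp next to the iostat output would show whether the iowait spike and the period of zero disk activity really coincide.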
Matt
Received on Fri Jun 10 2005 - 09:25:49 CDT