Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Strange IO problem on T3

Strange IO problem on T3

From: jfixsen <jfixsen_at_virtumundo.com>
Date: 22 Jan 2004 12:15:57 -0800
Message-ID: <26bc4d13.0401221215.7e7d065a@posting.google.com>


Hello!

Oracle 9.2.0.4
SunOS pchi-db01 5.8 Generic_108528-19 sun4u sparc SUNW,Ultra-EnterpriseSystem = SunOS
Node = pchi-db01
Release = 5.8
KernelID = Generic_108528-19
Machine = sun4u
BusType = <unknown>
Serial = <unknown>
Users = <unknown>
OEM# = 0
Origin# = 1
NumCPU = 8

History: we had been using a SAN disk array for storage and then switched over to a Sun T3. About a week after moving to the T3, I saw the following message in my alert log: WARNING: aiowait timed out 1 times. No performance problems happened until about a month after that.

Problem: We started seeing huge performance problems from out of nowhere on December 16 on everything from big batch jobs (heavy FTS) to simply logging in, and I started seeing several of these aiowait messages each day, sometimes up to 20. No application changes were made at any time during any of this, and after the performance problems started, I even cut the load, mostly big FTS jobs, way back.

I had been running statspack everyday the entire time, and on the day the performance problems hit hard (about 5 weeks after going to the T3), the noticeable difference I saw in the statspack reports were all related to writes from what I could tell.

BEFORE: Top 5 Timed Events



% Total
Event                                               Waits    Time (s)
Ela Time
-------------------------------------------- ------------ -----------

CPU time                                                   43,689,174 
  92.51
db file scattered read                        131,668,468     949,948 
   2.01
PX Deq: Execute Reply                             931,750     496,692 
   1.05
direct path read                               73,177,620     489,356 
   1.04
PX Deq Credit: send blkd                       24,148,414     425,685 

    .90

AFTER: Top 5 Timed Events



% Total
Event                                               Waits    Time (s)
Ela Time
-------------------------------------------- ------------ -----------

log file sync                                     874,418     604,164 
  32.92
direct path write                                 121,724     233,840 
  12.74
PX Deq Credit: send blkd                        1,103,485     212,285 
  11.57
db file scattered read                         12,039,568     165,860 
   9.04
log buffer space                                  158,742     127,009 

   6.92

BEFORE:

                                                                  Avg
                                                     Total Wait   wait
   Waits
Event                               Waits   Timeouts   Time (s)   (ms)

    /txn
---------------------------- ------------ ---------- ---------- ------



db file scattered read 131,668,468 0 949,948 7

    60.4

PX Deq: Execute Reply             931,750    211,623    496,692    533
     0.4
direct path read               73,177,620          0    489,356      7
    33.5
PX Deq Credit: send blkd       24,148,414    122,149    425,685     18
    11.1
db file sequential read        31,794,392          0    349,188     11
    14.6
direct path write               2,652,105          0    309,880    117
     1.2
log file sync                   3,021,375    104,267    201,582     67
     1.4
db file parallel write            547,546    254,564     68,136    124
     0.3
enqueue                            27,670     14,655     50,246   1816
     0.0
log buffer space                  110,172     32,943     47,180    428
     0.1

AFTER:
                                                                   
Avg                                                    Total Wait  
wait    Waits
Event                               Waits   Timeouts   Time (s)   (ms)

    /txn
---------------------------- ------------ ---------- ---------- ------


log file sync                     874,418    534,000    604,164    691
     3.4
direct path write                 121,724          0    233,840   1921
     0.5
PX Deq Credit: send blkd        1,103,485     94,421    212,285    192
     4.3
db file scattered read         12,039,568          0    165,860     14
    46.5
log buffer space                  158,742    120,911    127,009    800
     0.6
PX Deq: Execute Reply             742,346     61,708    126,984    171
     2.9


I am waiting on a response from Sun, but I want to be sure it's not something I'm overlooking. We've got all the latest sun patches, and one post I saw
that had to do with the aiowait message said to change /etc/system to include the following parm after installing the patch (which we already had):

I haven't seen the aiowait message in the alert log since the parm had been added and the db restarted, but performance still sucks and statspack numbers are still the same/bad. One thing consistently noticable is when I try logging into sqlplus, it hangs from 10-60 seconds. If I look at the wait events from another session, it always waits on log buffer space initially, and then log file sync. Or if I try dropping a small simple table, it gives me the enqueue wait.

To me, this all smells like a SUN IO problem since one day it "just started happening", but I would love everyone's opinions. Thanks

Jason
jfixsen_at_nospam_virtumundo.com Received on Thu Jan 22 2004 - 14:15:57 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US