Oracle FAQ Your Portal to the Oracle Knowledge Grid

Home -> Community -> Usenet -> c.d.o.server -> Re: Strange IO problem on T3

Re: Strange IO problem on T3

From: James Williams <willjamu_at_mindspring.com>
Date: Fri, 23 Jan 2004 01:30:09 GMT
Message-ID: <401078d5.9083801@news.east.earthlink.net>


On 22 Jan 2004 12:15:57 -0800, jfixsen_at_virtumundo.com (jfixsen) wrote:

I once had to rebuild a disk array to fix this problem.

Also, older releases of Veritas could cause this.

>Hello!
>
>Oracle 9.2.0.4
>SunOS pchi-db01 5.8 Generic_108528-19 sun4u sparc SUNW,Ultra-Enterprise
>System = SunOS
>Node = pchi-db01
>Release = 5.8
>KernelID = Generic_108528-19
>Machine = sun4u
>BusType = <unknown>
>Serial = <unknown>
>Users = <unknown>
>OEM# = 0
>Origin# = 1
>NumCPU = 8
>
>History: we had been using a SAN disk array for storage and then
>switched over to a Sun T3. About a week after moving to the T3, I saw
>the following message in my alert log: WARNING: aiowait timed out 1
>times. No performance problems happened until about a month after
>that.
>
>Problem: We started seeing huge performance problems from out of
>nowhere on December 16 on everything from big batch jobs (heavy FTS)
>to simply logging in, and I started seeing several of these aiowait
>messages each day, sometimes up to 20. No application changes were
>made at any time during any of this, and after the performance
>problems started, I even cut the load, mostly big FTS jobs, way back.
>
>I had been running statspack every day the entire time, and on the day
>the performance problems hit hard (about 5 weeks after going to the
>T3), the noticeable differences I saw in the statspack reports were
>all related to writes, from what I could tell.
>
>BEFORE:
>
>Top 5 Timed Events
>~~~~~~~~~~~~~~~~~~                                                     % Total
>Event                                               Waits    Time (s) Ela Time
>-------------------------------------------- ------------ ----------- --------
>CPU time                                                   43,689,174    92.51
>db file scattered read                        131,668,468     949,948     2.01
>PX Deq: Execute Reply                             931,750     496,692     1.05
>direct path read                               73,177,620     489,356     1.04
>PX Deq Credit: send blkd                       24,148,414     425,685      .90
>
>AFTER:
>
>Top 5 Timed Events
>~~~~~~~~~~~~~~~~~~                                                     % Total
>Event                                               Waits    Time (s) Ela Time
>-------------------------------------------- ------------ ----------- --------
>log file sync                                     874,418     604,164    32.92
>direct path write                                 121,724     233,840    12.74
>PX Deq Credit: send blkd                        1,103,485     212,285    11.57
>db file scattered read                         12,039,568     165,860     9.04
>log buffer space                                  158,742     127,009     6.92
>
>BEFORE:
>
>                                                                   Avg
>                                                     Total Wait   wait    Waits
>Event                               Waits   Timeouts   Time (s)   (ms)     /txn
>---------------------------- ------------ ---------- ---------- ------ --------
>db file scattered read        131,668,468          0    949,948      7     60.4
>PX Deq: Execute Reply             931,750    211,623    496,692    533      0.4
>direct path read               73,177,620          0    489,356      7     33.5
>PX Deq Credit: send blkd       24,148,414    122,149    425,685     18     11.1
>db file sequential read        31,794,392          0    349,188     11     14.6
>direct path write               2,652,105          0    309,880    117      1.2
>log file sync                   3,021,375    104,267    201,582     67      1.4
>db file parallel write            547,546    254,564     68,136    124      0.3
>enqueue                            27,670     14,655     50,246   1816      0.0
>log buffer space                  110,172     32,943     47,180    428      0.1
>
>AFTER:
>
>                                                                   Avg
>                                                     Total Wait   wait    Waits
>Event                               Waits   Timeouts   Time (s)   (ms)     /txn
>---------------------------- ------------ ---------- ---------- ------ --------
>log file sync                     874,418    534,000    604,164    691      3.4
>direct path write                 121,724          0    233,840   1921      0.5
>PX Deq Credit: send blkd        1,103,485     94,421    212,285    192      4.3
>db file scattered read         12,039,568          0    165,860     14     46.5
>log buffer space                  158,742    120,911    127,009    800      0.6
>PX Deq: Execute Reply             742,346     61,708    126,984    171      2.9
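
To make the before/after comparison easier to read, here is a rough Python sketch that recomputes the average wait per event from the Waits and Total Wait Time columns and shows how much each one grew. The numbers are copied from the statspack excerpts above; the helper names (avg_wait_ms, slowdown) are made up for illustration and are not part of statspack.

```python
# (waits, total wait time in seconds), copied from the statspack excerpts.
before = {
    "log file sync": (3_021_375, 201_582),
    "direct path write": (2_652_105, 309_880),
    "log buffer space": (110_172, 47_180),
    "db file scattered read": (131_668_468, 949_948),
}
after = {
    "log file sync": (874_418, 604_164),
    "direct path write": (121_724, 233_840),
    "log buffer space": (158_742, 127_009),
    "db file scattered read": (12_039_568, 165_860),
}


def avg_wait_ms(waits, total_seconds):
    """Average wait in milliseconds: total wait time divided by wait count."""
    return total_seconds * 1000.0 / waits


def slowdown(event):
    """Ratio of the 'after' average wait to the 'before' average wait."""
    return avg_wait_ms(*after[event]) / avg_wait_ms(*before[event])


for event in before:
    b = avg_wait_ms(*before[event])
    a = avg_wait_ms(*after[event])
    print(f"{event:<24} {b:7.0f} ms -> {a:7.0f} ms  ({slowdown(event):.1f}x)")
```

On these figures, log file sync and direct path write get slower by an order of magnitude per wait, while db file scattered read only roughly doubles, which fits the observation that the change is mostly on the write side.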
>
>
>I am waiting on a response from Sun, but I want to be sure it's not
>something I'm overlooking. We've got all the latest Sun patches, and
>one post I saw about the aiowait message said to add the following
>parameter to /etc/system after installing the patch (which we already
>had):
>
>* this parm is associated with the aiowait errors that was corrected
>* in patch 112255-01 solaris v8
>set TS:ts_sleep_promote=1
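
As a quick sanity check that the setting is really present (and not swallowed by a wrapped comment line), something like the following Python sketch could be run against a copy of /etc/system. It assumes only the usual /etc/system conventions: lines starting with `*` are comments and tunables are written as `set module:variable=value`. The function name parse_system_settings and the sample text are hypothetical, for illustration only.

```python
def parse_system_settings(text):
    """Return {name: value} for every `set name=value` line.

    Blank lines and comment lines (starting with `*`) are skipped.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue
        if line.startswith("set "):
            name, sep, value = line[4:].partition("=")
            if sep:  # ignore malformed `set` lines with no '='
                settings[name.strip()] = value.strip()
    return settings


# Sample mirroring the fragment quoted in the post.
sample = """\
* this parm is associated with the aiowait errors that was corrected
* in patch 112255-01 solaris v8
set TS:ts_sleep_promote=1
"""

settings = parse_system_settings(sample)
print(settings.get("TS:ts_sleep_promote", "not set"))
```

To check a real system, read the file contents (for example with open("/etc/system").read()) and pass them to the same function.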
>
>I haven't seen the aiowait message in the alert log since the parm was
>added and the db restarted, but performance still sucks and the
>statspack numbers are still just as bad. One thing that is
>consistently noticeable: when I try logging into sqlplus, it hangs for
>10-60 seconds. If I look at the wait events from another session, it
>always waits on log buffer space initially, and then log file sync. If
>I try dropping a small, simple table, I get the enqueue wait.
>
>To me, this all smells like a SUN IO problem since one day it "just
>started happening", but I would love everyone's opinions.
>Thanks
>
>Jason
>jfixsen_at_nospam_virtumundo.com
Received on Thu Jan 22 2004 - 19:30:09 CST
