Oracle FAQ Your Portal to the Oracle Knowledge Grid


Re: Strange IO problem on T3

From: bards <bards1888_at_yahoo.com.au.au>
Date: Fri, 23 Jan 2004 07:00:39 GMT
Message-ID: <rE3Qb.259$KW.13664@news.optus.net.au>


jfixsen wrote:
> Hello!
>
> Oracle 9.2.0.4
> SunOS pchi-db01 5.8 Generic_108528-19 sun4u sparc SUNW,Ultra-Enterprise
> System = SunOS
> Node = pchi-db01
> Release = 5.8
> KernelID = Generic_108528-19
> Machine = sun4u
> BusType = <unknown>
> Serial = <unknown>
> Users = <unknown>
> OEM# = 0
> Origin# = 1
> NumCPU = 8
>
> History: we had been using a SAN disk array for storage and then
> switched over to a Sun T3. About a week after moving to the T3, I saw
> the following message in my alert log: WARNING: aiowait timed out 1
> times. No performance problems happened until about a month after
> that.
>
> Problem: We started seeing huge performance problems from out of
> nowhere on December 16 on everything from big batch jobs (heavy FTS)
> to simply logging in, and I started seeing several of these aiowait
> messages each day, sometimes up to 20. No application changes were
> made at any time during any of this, and after the performance
> problems started, I even cut the load, mostly big FTS jobs, way back.
>
> I had been running statspack every day the entire time, and on the day
> the performance problems hit hard (about 5 weeks after going to the
> T3), the noticeable differences I saw in the statspack reports were all
> related to writes from what I could tell.
>
> BEFORE:
>
> Top 5 Timed Events
> ~~~~~~~~~~~~~~~~~~                                        % Total
> Event                               Waits     Time (s)   Ela Time
> ---------------------------- ------------ ------------ ----------
> CPU time                                    43,689,174      92.51
> db file scattered read        131,668,468      949,948       2.01
> PX Deq: Execute Reply             931,750      496,692       1.05
> direct path read               73,177,620      489,356       1.04
> PX Deq Credit: send blkd       24,148,414      425,685        .90
>
> AFTER:
>
> Top 5 Timed Events
> ~~~~~~~~~~~~~~~~~~                                        % Total
> Event                               Waits     Time (s)   Ela Time
> ---------------------------- ------------ ------------ ----------
> log file sync                     874,418      604,164      32.92
> direct path write                 121,724      233,840      12.74
> PX Deq Credit: send blkd        1,103,485      212,285      11.57
> db file scattered read         12,039,568      165,860       9.04
> log buffer space                  158,742      127,009       6.92
>
> BEFORE:
>
>                                                                    Avg
>                                                      Total Wait   wait    Waits
> Event                               Waits   Timeouts   Time (s)   (ms)     /txn
> ---------------------------- ------------ ---------- ---------- ------ --------
> db file scattered read        131,668,468          0    949,948      7     60.4
> PX Deq: Execute Reply             931,750    211,623    496,692    533      0.4
> direct path read               73,177,620          0    489,356      7     33.5
> PX Deq Credit: send blkd       24,148,414    122,149    425,685     18     11.1
> db file sequential read        31,794,392          0    349,188     11     14.6
> direct path write               2,652,105          0    309,880    117      1.2
> log file sync                   3,021,375    104,267    201,582     67      1.4
> db file parallel write            547,546    254,564     68,136    124      0.3
> enqueue                            27,670     14,655     50,246   1816      0.0
> log buffer space                  110,172     32,943     47,180    428      0.1
>
> AFTER:
>
>                                                                    Avg
>                                                      Total Wait   wait    Waits
> Event                               Waits   Timeouts   Time (s)   (ms)     /txn
> ---------------------------- ------------ ---------- ---------- ------ --------
> log file sync                     874,418    534,000    604,164    691      3.4
> direct path write                 121,724          0    233,840   1921      0.5
> PX Deq Credit: send blkd        1,103,485     94,421    212,285    192      4.3
> db file scattered read         12,039,568          0    165,860     14     46.5
> log buffer space                  158,742    120,911    127,009    800      0.6
> PX Deq: Execute Reply             742,346     61,708    126,984    171      2.9
>
>
> I am waiting on a response from Sun, but I want to be sure it's not
> something I'm overlooking. We've got all the latest sun patches, and
> one post I saw
> that had to do with the aiowait message said to change /etc/system to
> include the following parm after installing the patch (which we
> already had):
>
> * this parm is associated with the aiowait errors that was corrected
> * in patch 112255-01 solaris v8
> set TS:ts_sleep_promote=1
>
> I haven't seen the aiowait message in the alert log since the parm had
> been added and the db restarted, but performance still sucks and
> statspack numbers are still the same/bad. One thing consistently
> noticeable is when I try logging into sqlplus, it hangs from 10-60
> seconds. If I look at the wait events from another session, it always
> waits on log buffer space initially, and then log file sync. Or if I
> try dropping a small simple table, it gives me the enqueue wait.
>
> To me, this all smells like a SUN IO problem since one day it "just
> started happening", but I would love everyone's opinions.
> Thanks
>
> Jason
> jfixsen_at_nospam_virtumundo.com

Also, check that the batteries on the T3 have not expired. If they have, the T3 will use write-through caching (instead of write-back) until they are replaced. Write performance is very bad in this mode, as you will no doubt appreciate.

I'm not saying that this is in any way related to your error message. I'm just posting here my experience of badly performing T3s.
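
For what it's worth, the Statspack figures you quoted are at least consistent with writes getting slower. Below is a rough Python sketch that recomputes the average wait per event from the quoted wait counts and total wait times, then compares before with after. The numbers are copied from your post; the dictionary and function names are my own invention, and this is only arithmetic on the report, not a diagnosis.

```python
# Figures copied from the quoted Statspack wait-event tables:
# event name -> (number of waits, total wait time in seconds)
before = {
    "log file sync": (3_021_375, 201_582),
    "direct path write": (2_652_105, 309_880),
    "log buffer space": (110_172, 47_180),
}
after = {
    "log file sync": (874_418, 604_164),
    "direct path write": (121_724, 233_840),
    "log buffer space": (158_742, 127_009),
}


def avg_wait_ms(waits, total_seconds):
    """Average wait per event in milliseconds, rounded to a whole number."""
    return round(total_seconds * 1000 / waits)


def slowdown(event):
    """Ratio of the 'after' average wait to the 'before' average wait."""
    b_waits, b_secs = before[event]
    a_waits, a_secs = after[event]
    return (a_secs / a_waits) / (b_secs / b_waits)


for event in before:
    print(f"{event:20s} {avg_wait_ms(*before[event]):5d} ms -> "
          f"{avg_wait_ms(*after[event]):5d} ms  (x{slowdown(event):.1f})")
```

On these figures, log file sync goes from about 67 ms to about 691 ms per wait, roughly a tenfold increase, even though the number of waits dropped. That fits a storage device that has become slow at acknowledging writes, but it does not prove the batteries are the cause.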

HTH.

Received on Fri Jan 23 2004 - 01:00:39 CST
