Newsgroup: comp.databases.oracle.server

Re: Strange IO problem on T3

From: Brian Peasland <dba_at_remove_spam.peasland.com>
Date: Thu, 22 Jan 2004 21:03:09 GMT
Message-ID: <40103A8D.15BF15C6@remove_spam.peasland.com>


Check out Note 222989.1 on Metalink!

Cheers,
Brian

jfixsen wrote:
>
> Hello!
>
> Oracle 9.2.0.4
> SunOS pchi-db01 5.8 Generic_108528-19 sun4u sparc SUNW,Ultra-Enterprise
> System = SunOS
> Node = pchi-db01
> Release = 5.8
> KernelID = Generic_108528-19
> Machine = sun4u
> BusType = <unknown>
> Serial = <unknown>
> Users = <unknown>
> OEM# = 0
> Origin# = 1
> NumCPU = 8
>
> History: we had been using a SAN disk array for storage and then
> switched over to a Sun T3. About a week after moving to the T3, I saw
> the following message in my alert log: WARNING: aiowait timed out 1
> times. No performance problems happened until about a month after
> that.
>
> Problem: We started seeing huge performance problems from out of
> nowhere on December 16 on everything from big batch jobs (heavy FTS)
> to simply logging in, and I started seeing several of these aiowait
> messages each day, sometimes up to 20. No application changes were
> made at any time during any of this, and after the performance
> problems started, I even cut the load, mostly big FTS jobs, way back.
>
> I had been running Statspack every day the entire time, and on the day
> the performance problems hit hard (about 5 weeks after going to the
> T3), the noticeable differences I saw in the Statspack reports were
> all related to writes, as far as I could tell.
>
> BEFORE:
>
> Top 5 Timed Events
> ~~~~~~~~~~~~~~~~~~
>                                                         % Total
> Event                               Waits     Time (s) Ela Time
> ---------------------------- ------------ ------------ --------
> CPU time                                    43,689,174    92.51
> db file scattered read        131,668,468      949,948     2.01
> PX Deq: Execute Reply             931,750      496,692     1.05
> direct path read               73,177,620      489,356     1.04
> PX Deq Credit: send blkd       24,148,414      425,685      .90
>
> AFTER:
>
> Top 5 Timed Events
> ~~~~~~~~~~~~~~~~~~
>                                                         % Total
> Event                               Waits     Time (s) Ela Time
> ---------------------------- ------------ ------------ --------
> log file sync                     874,418      604,164    32.92
> direct path write                 121,724      233,840    12.74
> PX Deq Credit: send blkd        1,103,485      212,285    11.57
> db file scattered read         12,039,568      165,860     9.04
> log buffer space                  158,742      127,009     6.92
>
> BEFORE:
>
>                                                             Avg
>                                               Total Wait   wait  Waits
> Event                          Waits Timeouts   Time (s)   (ms)   /txn
> ------------------------ ----------- -------- ---------- ------ ------
> db file scattered read   131,668,468        0    949,948      7   60.4
> PX Deq: Execute Reply        931,750  211,623    496,692    533    0.4
> direct path read          73,177,620        0    489,356      7   33.5
> PX Deq Credit: send blkd  24,148,414  122,149    425,685     18   11.1
> db file sequential read   31,794,392        0    349,188     11   14.6
> direct path write          2,652,105        0    309,880    117    1.2
> log file sync              3,021,375  104,267    201,582     67    1.4
> db file parallel write       547,546  254,564     68,136    124    0.3
> enqueue                       27,670   14,655     50,246   1816    0.0
> log buffer space             110,172   32,943     47,180    428    0.1
>
> AFTER:
>
>                                                             Avg
>                                               Total Wait   wait  Waits
> Event                          Waits Timeouts   Time (s)   (ms)   /txn
> ------------------------ ----------- -------- ---------- ------ ------
> log file sync                874,418  534,000    604,164    691    3.4
> direct path write            121,724        0    233,840   1921    0.5
> PX Deq Credit: send blkd   1,103,485   94,421    212,285    192    4.3
> db file scattered read    12,039,568        0    165,860     14   46.5
> log buffer space             158,742  120,911    127,009    800    0.6
> PX Deq: Execute Reply        742,346   61,708    126,984    171    2.9
>
> I am waiting on a response from Sun, but I want to be sure it's not
> something I'm overlooking. We've got all the latest Sun patches, and
> one post I saw about the aiowait message said to add the following
> parm to /etc/system after installing the patch (which we already
> had):
>
> * This parm is associated with the aiowait errors that were
> * corrected in patch 112255-01, Solaris v8.
> set TS:ts_sleep_promote=1
>
> I haven't seen the aiowait message in the alert log since the parm was
> added and the db restarted, but performance still sucks and the
> Statspack numbers are still just as bad. One thing that is
> consistently noticeable is that when I try logging into sqlplus, it
> hangs for 10-60 seconds. If I look at the wait events from another
> session, it always waits on log buffer space initially, and then on
> log file sync. And if I try dropping a small, simple table, I get the
> enqueue wait.
>
> To me, this all smells like a SUN IO problem since one day it "just
> started happening", but I would love everyone's opinions.
> Thanks
>
> Jason
> jfixsen_at_nospam_virtumundo.com
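
The "Avg wait (ms)" column in the quoted Statspack tables is simply total wait time divided by the number of waits. As a rough sketch (the figures are copied from the BEFORE and AFTER tables above; the dictionary layout and function names are only illustrative and are not part of Statspack), the comparison can be reproduced like this:

```python
# Figures copied from the quoted Statspack wait-event tables.
# Layout (hypothetical, for illustration): event -> (waits, total wait time in s)
before = {
    "log file sync": (3_021_375, 201_582),
    "direct path write": (2_652_105, 309_880),
    "log buffer space": (110_172, 47_180),
    "db file scattered read": (131_668_468, 949_948),
}
after = {
    "log file sync": (874_418, 604_164),
    "direct path write": (121_724, 233_840),
    "log buffer space": (158_742, 127_009),
    "db file scattered read": (12_039_568, 165_860),
}


def avg_wait_ms(waits, seconds):
    """Average time per wait, in milliseconds."""
    return seconds * 1000.0 / waits


def slowdown(event):
    """Ratio of the AFTER average wait to the BEFORE average wait."""
    return avg_wait_ms(*after[event]) / avg_wait_ms(*before[event])


for event in before:
    b = avg_wait_ms(*before[event])
    a = avg_wait_ms(*after[event])
    print(f"{event:<24} {b:8.0f} ms -> {a:8.0f} ms  ({slowdown(event):.1f}x)")
```

On these numbers the write-side waits (log file sync, direct path write) get more than ten times slower per wait, while db file scattered read only about doubles, which fits the observation in the post that the differences are mostly on the write side.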

-- 
===================================================================

Brian Peasland
dba_at_remove_spam.peasland.com

Remove the "remove_spam." from the email address to email me.


"I can give it to you cheap, quick, and good. Now pick two out of
 the three"
Received on Thu Jan 22 2004 - 15:03:09 CST
