Prashant, Folks
From Statspack (as asked), SQL ordered by Gets:
  Buffer Gets  Executions  Gets per Exec  % Total  Hash Value
  -----------  ----------  -------------  -------  ----------
   56,896,369           1   56,896,369.0     22.8  1466052601
select ocp.set_num setid, gam.schm_code schm, gam.foracid acct, gam.acct_name acct_name, oci.instrmnt_id Instr_No, ocp.tran_amt inst_amt, mpt.mp_topic topic from mpt, gam, ocp, oci where ocp.sol_id = '212' and ocp.clg_zone_date = '15-10-2005' and ocp.clg_zone_code = 'OW-MICR' and ocp.c
   56,874,454           1   56,874,454.0     22.8  1143844609
select ocp.set_num setid, gam.schm_code schm, gam.foracid acct, gam.acct_name acct_name, oci.instrmnt_id Instr_No, ocp.tran_amt inst_amt, mpt.mp_topic topic from mpt, gam, ocp, oci where ocp.sol_id = '212' and ocp.clg_zone_date = '15.10.2005' and ocp.clg_zone_code = 'OW-MICR' and ocp.c
   28,324,295      38,167          742.1     11.4  2332700722
SELECT TABLE_ABBR, TABLE_KEY, TABLE_FUNC, CONTEXT_SOLID from SOD where ROWNUM <= 250
    7,442,129   1,019,697            7.3      3.0   696473406
SELECT table_func FROM SOD WHERE TABLE_ABBR = :1 and TABLE_KEY = :2 and CONTEXT_SOLID = :3 FOR UPDATE NOWAIT
Another basic question: if the init.ora parameter disk_asynch_io is set to FALSE, is usage of the AIO servers (of AIX) automatically disabled?
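For reference, the current AIO setup on the box can be dumped like this (aio0 is the default AIO device name on AIX 5.2; happy to post the output if useful):

# Show the AIO subsystem attributes (minservers, maxservers, maxreqs, autoconfig)
lsattr -El aio0
# Count the aioserver kernel processes currently active
pstat -a | grep aios | wc -l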
Will provide any info needed
Thanks indeed
-----Original Message-----
From: Mark Brinsmead [mailto:mark.brinsmead_at_shaw.ca]
Sent: Saturday, November 12, 2005 6:25 AM
To: ade.turner_at_gmail.com
Cc: VIVEK_SHARMA; 'Oracle-L'
Subject: Re: High wio on New Hitachi SAN Storage of DB Server
Vivek and Adrian,
I'm afraid I can't offer any "answers", but I do have a "dumb" question that might be helpful. (Or not.)
Are you using Async-I/O, or Concurrent I/O? How many AIO servers have you configured?
If you are truly using Async-I/O against "cooked" JFS2 storage, there are a couple of fairly obvious options available to you.
One would be to try Async I/O with raw partitions instead. As I recall, AIX 5.2 (and later) uses a kernel-based AIO driver for raw devices, as opposed to AIO servers. If you see a pronounced performance difference by moving to raw partitions (which aren't *too* awful with the AIX volume manager), this would probably suggest that you have misconfigured the AIO servers, either by not permitting enough of them or perhaps by not allowing them to queue enough I/Os per (logical) disk...
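If it helps, the per-disk queue setting I mean can be checked and raised like so (hdisk4 and the value 16 are placeholders only; -P defers the change until the next reboot if the disk is busy):

# Show the current queue depth for one logical disk
lsattr -El hdisk4 -a queue_depth
# Raise it; -P applies the change at the next reboot
chdev -l hdisk4 -a queue_depth=16 -P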
Another option, at least for Adrian, is to mount your filesystems with the "cio" option and enable Concurrent I/O in the database; that way you will be able to bypass the AIO servers completely.
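A rough sketch of the mount side (the logical volume and mount point names here are made up; note that 8.1.7 predates the filesystemio_options parameter, so the mount option does the work):

umount /u02/oradata
# Remount the JFS2 filesystem with Concurrent I/O enabled
mount -v jfs2 -o cio /dev/oradatalv /u02/oradata

One caution: cio disables filesystem caching for everything on that filesystem, so keep Oracle binaries and other non-database files on separate mounts.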
I'm afraid I'm out of the office right now, so I can't easily offer more details. There are several whitepapers on CIO for Oracle (or databases in general) on IBM's website. You'll also find a considerable amount of discussion (now maybe a year or two old) about AIX and CIO on MetaLink.
At a (very) wild guess, it sounds like maybe you (or your sysadmins) have substantially under-configured one or both of the AIO "minservers" and "maxservers" settings (I think those are the terms used by SMIT), and/or maybe need to "lie" to AIX about the number of I/Os that can be queued for each of your logical disks. (I've never bothered to tweak the latter; I've never had to.)
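For what it's worth, those SMIT fields map to attributes of the aio0 device, so the same change can be scripted (the numbers are placeholders, not recommendations; on AIX 5.2, maxservers is, I believe, interpreted per CPU):

# Raise the AIO server limits; -P defers the change to the next reboot
chdev -l aio0 -a minservers=20 -a maxservers=40 -P
# Or change them interactively via the SMIT fastpath
smitty aio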
For the record, my configuration is P610s going against an IBM ESS disk array (yeah, I know, small potatoes) and, at least indirectly, P570s going against IBM ESS disk arrays and EMC CX-700 disk arrays (medium-sized potatoes). I've never seen indications of your kind of problems with either.
One other silly question: is there any chance that somebody has "goofed" with the RAID configuration? 70% WIO is *extremely* high -- high enough to maybe suggest the possibility that somebody has configured hardware or software RAID-1 between two "logical" disks that are actually located on the same "physical" disk. I *know* this one is a long shot, but I *have* crossed paths with people who have actually done this, with software RAID at least. Does 'sar' suggest consistently poor performance for all devices?
Adrian wrote:
>Hi Vivek,
>
>Your note interested me; we have exactly the same configuration and symptoms,
>but we are still running async_io = TRUE. I concur with your findings on the
>extreme performance issues - short-term server freezes are occurring. You
>mention a bug; is this hearsay or do you have details?
>
>If you've turned off async_io (on by default on AIX), then you may want to
>use DBWR_IO_SLAVES. I've not tried this yet.
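For reference, the untried change would amount to these init.ora lines (the slave count of 4 is illustrative only):

disk_asynch_io = FALSE
dbwr_io_slaves = 4

With asynchronous I/O disabled, the I/O slave processes simulate asynchronous writes for DBWR using ordinary synchronous calls.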
>
>Anyway, ours is an HDS9990 with McData switches attached to a p690 running
>AIX 5.2 ML4. The filesystems are jfs2.
>
>I can reproduce the problem just by creating a 10GB datafile; multiple users
>doing random I/O can also hit the same issue, as can a parallel RMAN backup
>in progress. However, concurrent cp's of multiple files do not reproduce the
>issue (hence I believe it is likely an async I/O problem).
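A sketch of that repro step (the tablespace name and datafile path are made up; on 8.1.7 it can be run from sqlplus):

sqlplus /nolog <<'EOF'
connect internal
-- Formatting a 10 GB datafile drives a heavy write burst through
-- Oracle's own I/O path, which a plain cp does not exercise
alter tablespace scratch add datafile '/u02/oradata/scratch02.dbf' size 10000M;
exit
EOF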
>
>Under load, topas shows high wait for I/O, and sar -q shows that %swpocc = 100
>and swpq-sz > 20.
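For reference, those figures come from runs like the following (interval and sample count are arbitrary):

# CPU breakdown including %wio: 12 samples at 60-second intervals
sar -u 60 12
# Run/swap queue sizes and occupancy over the same window
sar -q 60 12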
>
>My unix admins are currently looking at the async I/O settings as per
>MetaLink note 271444.1, but they are heavily loaded and this is not prod
>(yet), so the urgency is low.
>
>If you or anyone else has any pointers with this configuration, please let
>me know.
>
>Kind Regards
>Adrian
>
>-----Original Message-----
>From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org]
>On Behalf Of VIVEK_SHARMA
>Sent: 11 November 2005 18:56
>To: Oracle-L
>Subject: High wio on New Hitachi SAN Storage of DB Server
>
>
>Folks,
>
>While doing application testing of a hybrid transaction load (mostly OLTP)
>with approximately 200 users on a NEWLY configured HITACHI SAN storage, high
>wait for I/O (wio = 70%) is observed on the DB server (AIX) until 1400 hours.
>
>NOTE - wio reduced gradually to about 10% from 1400 hours to 2000 hours.
>
>Average sar output from morning to 1400 hours:
>
>13:38:00  %usr  %sys  %wio  %idle
>13:43:00     6     5    67     22
>13:48:00    10     6    74     10
>13:53:00    10     5    66     19
>13:58:00     7     5    61     27
>14:03:00     5     5    67     22
>14:08:00     7     5    74     15
>14:13:00     9     6    69     15
>
>CONFIG
>======
>DB Server = 7 CPUs
>OS AIX 5.2
>Oracle 8.1.7
>Hitachi Storage for DB Server = 40 underlying Disks
>Hardware Raid
>Init.ora disk_asynch_io = FALSE (when disk_asynch_io is set TRUE there is
>extreme performance degradation with AIX 5.2 [seems a bug])
>
>
>Comments by IBM
>===============
>The average number of processes waiting for IO to complete is 12. This is
>the reason why we are seeing an average iowait of 70%.
>
>The seek rate of 95.72% on the hdsdb9960lv LVs indicates a high degree of
>random IO, usually caused by the application or by a high degree of disk
>fragmentation.
>
>
>STATSPACK report (will provide any other sections as needed)
>================
>
>DB Name      DB Id      Instance  Inst Num  Release    OPS  Host
>------------ ---------- --------- --------- ---------- ---  ------
>ABNPROD      34298189   abnprod          1  8.1.7.4.0  NO   findb
>
> Snap Id Snap Time Sessions
> ------- ------------------ --------
> Begin Snap: 5 15-Oct-05 13:00:55 352
> End Snap: 6 15-Oct-05 14:00:37 352
> Elapsed: 59.70 (mins)
>
>Cache Sizes
>~~~~~~~~~~~
> db_block_buffers: 215000 log_buffer: 18874368
> db_block_size: 8192 shared_pool_size: 754288000
>
>Load Profile
>~~~~~~~~~~~~                       Per Second    Per Transaction
>                                  ------------   ---------------