Re: Balance I/O

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: 20 Jul 2006 05:23:24 -0700
Message-ID: <1153398204.566996.116890@75g2000cwc.googlegroups.com>

astalavista wrote:

> hi,
>
> I try  to balance I/O, I use sar -d ( AIX 5.1 , oracle 9.2.0.4 )
> Is it a good idea to decrease the disks which are near 100% busy ?
> ( 90% busy is it a low throuput disk ? )
> Is it the right direction to look at %busy ?
>
> Thanks in advance ...

100% busy is not desirable. Find out what files and events are contributing to the problem. Do you have too many different types of files on the same disk spindle? Do you have too many inefficient SQL statements being processed? Did you not implement a large/fast enough disk storage system? Making guesses based on apparent disk usage is not sufficient. Start taking a look at what is causing the disk usage. When you notice a performance problem, record the output of the following SQL statements, executed once a minute for 10 minutes:

Redo log files and their size:
SELECT
LF.GROUP#,
LF.TYPE,

  L.STATUS,
  L.FIRST_TIME,
  L.BYTES,

  LF.MEMBER
FROM
  V$LOGFILE LF,
  V$LOG L
WHERE
  L.GROUP#=LF.GROUP#
ORDER BY
  LF.GROUP#; Redo log file switch frequency (only need to execute this the last time):
SELECT

  LH2.RECID,
  LH2.STAMP,
  LH2.THREAD#,
  LH2.SEQUENCE#,
  LH2.FIRST_CHANGE#,
  LH2.NEXT_CHANGE#,
  LH1.FIRST_TIME PREV_FIRST_TIME,
  LH2.FIRST_TIME,

  ROUND((LH2.FIRST_TIME-LH1.FIRST_TIME)*24,2) HOURS FROM
  V$LOG_HISTORY LH1,
  V$LOG_HISTORY LH2
WHERE
  LH2.RECID=LH1.RECID+1
  AND LH1.FIRST_TIME>TRUNC(SYSDATE-180)
ORDER BY
  LH2.RECID DESC; Activity of the data files and temp files: SELECT
  'DATA' FILE_TYPE,

  FS.FILE#,
  FS.PHYRDS,
  FS.PHYWRTS,
  FS.PHYBLKRD,
  FS.PHYBLKWRT,
  FS.READTIM,
  FS.WRITETIM,
  FS.AVGIOTIM,
  FS.LSTIOTIM,
  FS.MINIOTIM,
  FS.MAXIORTM,
  FS.MAXIOWTM,
  DF.NAME

FROM
  V$FILESTAT FS,
  V$DATAFILE DF
WHERE
  FS.FILE#=DF.FILE#
UNION ALL
SELECT
  'TEMP' FILE_TYPE,

  TS.FILE#,
  TS.PHYRDS,
  TS.PHYWRTS,
  TS.PHYBLKRD,
  TS.PHYBLKWRT,
  TS.READTIM,
  TS.WRITETIM,
  TS.AVGIOTIM,
  TS.LSTIOTIM,
  TS.MINIOTIM,
  TS.MAXIORTM,
  TS.MAXIOWTM,
  TF.NAME

FROM
  V$TEMPSTAT TS,
  V$TEMPFILE TF
WHERE
  TS.FILE#=TF.FILE#
ORDER BY
  1,
  2;

Rollback segment usage:
SELECT

  RN.NAME,
  RN.USN,
  RS.EXTENTS,
  RS.RSSIZE,
  RS.WRITES,
  RS.XACTS,
  RS.GETS,
  RS.WAITS,
  RS.OPTSIZE,
  RS.HWMSIZE,
  RS.SHRINKS,
  RS.WRAPS,
  RS.EXTENDS,
  RS.AVEACTIVE,
  RS.CUREXT,
  RS.CURBLK

FROM
  V$ROLLNAME RN,
  V$ROLLSTAT RS
WHERE
  RN.USN=RS.USN
ORDER BY
  RN.NAME; Also, take a broad look at the wait events that may be contributing to disk utilization and other delays in the system. Once again, execute once a minute for 10 minutes:
SELECT
  0 SID,
  TO_DATE(TO_CHAR(SYSDATE,'YYYYMMDD HH24:MI'),'YYYYMMDD HH24:MI') TIME_STAMP,
  EVENT,
  TOTAL_WAITS,
  TOTAL_TIMEOUTS,
  TIME_WAITED,
  AVERAGE_WAIT
FROM
  V$SYSTEM_EVENT
WHERE
  EVENT IN (
'buffer busy waits',
'db file sequential read',
'db file scattered read',
'direct path read',
'direct path write',
'enqueue',
'free buffer waits',
'latch free',
'log file parallel write',
'log file sync')

UNION ALL
SELECT
  SID,
  TO_DATE(TO_CHAR(SYSDATE,'YYYYMMDD HH24:MI'),'YYYYMMDD HH24:MI') TIME_STAMP,
  EVENT,
  TOTAL_WAITS,
  TOTAL_TIMEOUTS,
  TIME_WAITED,
  AVERAGE_WAIT
FROM
  V$SESSION_EVENT
WHERE
  EVENT IN (
'buffer busy waits',
'db file sequential read',
'db file scattered read',
'direct path read',
'direct path write',
'enqueue',
'free buffer waits',
'latch free',
'log file parallel write',
'log file sync');

If you prefer, you can create a quick logging table by inserting this SQL code before the first run of the above SQL statement (remove prior to the second execution of the SQL statement): CREATE TABLE
SYSTEM_LOG
AS

With the above logging table in place, you can quickly check for problem areas:
SELECT

  SL2.SID,
  SL2.TIME_STAMP,
  SL2.EVENT,
  SL2.TOTAL_WAITS-NVL(SL1.TOTAL_WAITS,0) WAITS,
  SL2.TOTAL_TIMEOUTS-NVL(SL1.TOTAL_TIMEOUTS,0) TIME_OUTS,
  SL2.TIME_WAITED-NVL(SL1.TIME_WAITED,0) TIME_WAITED,
  SL2.AVERAGE_WAIT

FROM
  SYSTEM_LOG SL1,
  SYSTEM_LOG SL2
WHERE
  SL2.SID=SL1.SID(+)
  AND SL2.TIME_STAMP=SL1.TIME_STAMP(+) + (1/24/60)   AND SL2.EVENT=SL1.EVENT(+)
ORDER BY
  SID,
  SL2.TIME_STAMP,
  SL2.EVENT; If you find a problem area where the SID is listed as 0, look for the largest contributor in the same time period where the SID is greater than 0. For example, if the delta value of time waited for db file scattered read where SID=0 seems consistently high, that is an indication that full table scans are likely contributing to the system performance problems. You can then look at that value for the various sessions to determine which session is contributing the most to the performance problem. For there, you can enable a 10046 trace for the session to determine the SQL statements executed by the session, the wait events that result from the execution of the SQL statement, and the file and block number (or other information present with other types of waits) involved in the wait, which may be used to determine the table that is the greatest contributor. This may lead you to look at the statistics for the table and indexes on the table, where you may find that the statistics for the table and indexes are stale - statistics show that the table has 10 rows with an average size of 100 bytes per row, when there are actually 1,000,000 rows in the table.

Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc. Received on Thu Jul 20 2006 - 07:23:24 CDT