Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: SAN issues

Re: SAN issues

From: <Jared.Still_at_radisys.com>
Date: Mon, 12 Aug 2002 15:44:52 -0800
Message-ID: <F001.004B24A5.20020812154452@fatcity.com>


Babu,

Nice comprehensive list of things to consider with a SAN,

Just a couple of thoughts.

> ®     Oracle requests DBWR-1 for IDX1 and waits. DBWR-1 makes a Unix IO
> call and waits for Unix to return data. Unix talks to SAN and SAN starts
> reading from the disk. Assume that it takes 3 seconds to read the entire
> IDX1. SAN starts returning data in chunks to Unix and Unix gives it back 
to
> Oracle.

Data is read from Disk by server processes, not by DBWR.

> ® Now a slightly bigger picture. There are 6 processes trying to read
> the data from six different tables.

This occurs regardless of the type of storage system, so I'm not sure it really belongs in a list of SAN specific concerns.

> ® Lets forget all this buffering, caches etc. Assume we have 10 disks

> in two LUNs. Both the LUNs share the 10 disks. Each of this LUN is made
> visible to Unix as a mountpoint. The DBA uses one mountpoint for indexes
> and one mountpoint for tables.

You can have this same kind of configuration problem with any disk storage manager.

Don't forget the management issue with SANs. SA's love them because it greatly reduces the amount of work they must do to manage storage. They can be properly configured from a database point of view, at least as far as distribuing IO is concerned, you just need to make it known that you would like some input on it's configuration.

Jared

Babu.Nagarajan_at_Cummins.com
Sent by: root_at_fatcity.com
08/12/2002 01:38 PM
Please respond to ORACLE-L  

        To:     Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com>
        cc: 
        Subject:        SAN issues



All

I an trying to get our management understand the issues related to SAN. These are my thoughts. Let me know what you think about it... (PS : Apologies if you recv this twice. I posted it but I never saw it come
through the list and so I posted again)

Babu

SAN Issues

SAN and Oracle ? Conflicting IO behavior
® There are four types of IO in Oracle

1.    Random Reads (RR) ? DBWR - Using indexes
2.    Sequential Reads (SR) ? DBWR - Full table scans
3.    Random Writes (RW) ? DBWR ? Writing dirty blocks
4.    Sequential Writes (SW) ? LGWR, Arch ? Writing redo logs and Redo
Archival + Control files

® Bulk of any Oracle database's IO is done in RR, SR and RW. If SW is
very high it denotes configuration problems.

® SAN (or for that matter any RAID device) is configured for writing
or
reading large chunks at a time. The stripe size on most SANs and RAID devices are 256K or more. Compare this to the Oracle block size of 4k/8k in
most databases (going upto 32K in datawarehouses)
® SANs do "Read Ahead". If one block is requested, they read more than
one blocks "while at the disk" hoping that the same process will request the other blocks some time soon.

Here is the conflict.
® When ever Oracle does a RR, SR or RW it writes randomly and not
sequentially. It will read/write a particular block at a time in case of RR and RW and 'x' blocks (where x = dbfile_multi_block_read_count) in case of SR. Therefore only during SR will Oracle use the entire stripe width. In
all other cases, The difference in the stripe width and db_block_size will be excess IO.
® Why "read ahead" will cause a conflict :

      ®     The internal structure of a datafile could be as follows. The
      file consists of 10 blocks. These are occupied by 3 tables.  The
      blocks shown below are numbered using table_name.block_number
 
|---------+---------+---------+---------+---------+---------+---------+---------+---------+---------|
     |         |         |         |         |         |         | |   |   
   |         |
     | 1.1     | 1.2     | 2.1     | 3.1     | 3.2     | 3.3     | 2.2 | 
1.3     | 2.3     | 3.4     |
     |         |         |         |         |         |         | |   |   
   |         |
 

|---------+---------+---------+---------+---------+---------+---------+---------+---------+---------|

      ®     The first block on the datafile is the first block of table 1,
      second block is the second block of table 1, the third block is the
      first block of table 2 and so on.. (For simplicity sake, I am
      assuming Oracle will allocate space in blocks and not in extents)
      ®     Now assume Oracle requests the first block of table 1.  Assume
      read ahead is set to three blocks (three blocks will be read instead
      of 2 blocks). In this case the SAN will read 2.1, 3.1,3.2.
      ®     The blocks 3.1 and 3.2 will be entirely useless as Oracle is
      never going to read it. SAN cannot tell that the block 2.2 that
      Oracle might possible request next is the 7th block in the datafile
      and so it can never "read ahead" intelligently.

Why the buffer of SAN has very little impact w.r.t Oracle read performance?
® Oracle has its own buffering for all IO types
® DBWR reads and writes uses the DB Buffer Cache
® LGWR uses the Log buffer
® Db buffer Cache is managed by a LRU Algorithm (Touchcount from 9I).
® Bulk of the IO done by Oracle is Logical IO (LIO) and not Physical
IO
(PIO).
® Assume the buffer cache hit ratio is 80%. This means that only 20%
of
the IO calls are PIO. Only 20% of the calls ever hit the SAN's cache. Since
this 20% is probably the least requested/never requested data (going by Oracle's LRU algorithm) , its quite likely that the SAN's buffers don't have this either.
® Given that Oracle is going to cache even this 20% in its buffers,
the
next PIO call is going to be for something totally different ? which is not
there in the SAN's buffer.
® Couple this with the read-ahead (discussed earlier), Our SAN's
buffer
is now populated with lots of data that Oracle might never use a PIO to retrieve.
® Thus the SAN's buffer can never really provide to Oracle the data it
reads most ? Its already there in Oracle.
® To be fair, SAN's huge buffers will come as a boon to small
databases
? where the entire database can be cached in the SAN's buffers.

SAN or no SAN ? Why will performance be affected if we have indexes and tables on the same disk

® Lets forget all this buffering, caches etc. Assume we have 10 disks
in two LUNs. Both the LUNs share the 10 disks. Each of this LUN is made visible to Unix as a mountpoint. The DBA uses one mountpoint for indexes and one mountpoint for tables.
® Since we have used all the 10 disks for both the LUNs, the structure
of one disk can look like this. The first two blocks are a chunk of LUN1 where we have the table TAB1. The next three blocks are a chunk of LUN2 where we have indexes IDX1, IDX2 and IDX3(for tables TAB1, TAB2 and TAB3).

|------------------+------------------+------------------+------------------+-----------------------------------------------|
|                  |                  |                  |  |           |
| 1.TAB1           | 1.TAB1           | 2.IDX1           | 2.IDX2  | 
2.IDX3                                        |
|                  |                  |                  |  |           |
|------------------+------------------+------------------+------------------+-----------------------------------------------|

® Assume when Oracle is reading TAB1 using IDX1. Also assume there are no

   caches and all IO is physical.
® Oracle requests DBWR-1 for IDX1 and waits. DBWR-1 makes a Unix IO
call and waits for Unix to return data. Unix talks to SAN and SAN starts reading from the disk. Assume that it takes 3 seconds to read the entire IDX1. SAN starts returning data in chunks to Unix and Unix gives it back to
Oracle.
® Oracle uses that data to request TAB1 using DBWR-2. Now DBWR-2 goes
to Unix which in turn goes to the SAN. SAN now goes to the disk but the disk has only one head ? which can read or write. In this case that head is
already busy reading IDX1.
® Now two things can happen
® Either the disk head starts running between TAB1 and IDX1 increasing
the latency time
® Or TAB1's read waits till IDX1's reads complete.
® Either way the parallel operation is now serialized leading to poor
response time.

How Oracle performs an IO and Why can Oracle suffer from an IO problem while Unix/SAN statistics don't show any?

® DBWR performs nearly all of the IO for Oracle. LGWR, Arch etc do
some
IO but DBWR beats them all by sheer volume.
® There can be multiple DBWR processes. Lets assume that there are two
DBWR processes (or DBWR_IO_SLAVES if you prefer).
® Lets assume that it takes one second for each DBWR to perform a
single IO call. There are no waits on the Unix or IO Subsystem.
® Consider a routine scenario ? Oracle is doing a table read via a
index. Oracle sends DBWR1 to read the index block and it returns the data after 1 second. Oracle then uses that data to ask DBWR2 read the data from the table. In this case the process that requested the data had to wait 2 seconds before getting it ? a 2 sec IO wait.
® Now a slightly bigger picture. There are 6 processes trying to read
the data from six different tables. Since we have only 2 DBWRs they have to
service all this IO. Each process would post the DBWR to do their read and wait on "dbfile sequential read". However the DBWRs can service only one request at a time. So while DBWRs 1 and 2 service Processes 1 and 2, the Processes 3-7 wait for IO.
® This internal queue is never visible to the UNIX or to the SAN. So
the SAN/Unix never shows a IO queue but there is a IO issue that makes processes wait.

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: 
  INET: Jared.Still_at_radisys.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Mon Aug 12 2002 - 18:44:52 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US