Re: SAME and OFA

From: Michael Brown <mike_at_mlbrown.com>
Date: Fri, 11 May 2001 21:34:51 -0400
Message-ID: <hc3pftgrf5ofvv99ca817j1teo7oeabgpl@4ax.com>

The two are totally compatible. In spite of the common misconception, when the term SAME was originally used it did not (repeat, NOT) say to use the same RAID set for everything. Multiple white papers have taken SAME to mean using a single stripe set; in fact, SAME just says to put everything on disks that have been striped and mirrored. No matter how many disks are in a stripe set, you still have a physical spindle limitation: a drive's on-board logic lets it effectively service no more than 1 concurrent write and ~6 concurrent read requests (and obviously only 1 I/O actually occurs at a time on a normal single-actuator SCSI or FC drive). This means that if you mix write-intensive structures (redo, rollback, temp, archived redo) on the same stripe set, you almost certainly guarantee contention.
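
To make the contention argument concrete, here is a toy back-of-the-envelope calculation in Python. The per-drive limits come from the paragraph above; the stripe width and workload numbers are made up for illustration, so treat it as arithmetic, not a sizing tool:

# Illustrative only: assumed stripe width and workload, per-drive limits
# taken from the discussion above.
DRIVES_IN_STRIPE = 8
WRITES_PER_DRIVE = 1   # ~1 concurrent write per spindle
READS_PER_DRIVE = 6    # ~6 concurrent reads per spindle

def capacity(drives):
    """Concurrent requests a stripe set can absorb before queuing."""
    return {"writes": drives * WRITES_PER_DRIVE,
            "reads": drives * READS_PER_DRIVE}

# Hypothetical mixed workload landing on one stripe set:
workload = {
    "redo writes": 4,
    "rollback writes": 3,
    "temp writes": 3,
    "archive writes": 2,
    "datafile reads": 30,
}

cap = capacity(DRIVES_IN_STRIPE)
total_writes = sum(v for k, v in workload.items() if "write" in k)
total_reads = sum(v for k, v in workload.items() if "read" in k)

print(f"write demand {total_writes} vs capacity {cap['writes']}")
print(f"read  demand {total_reads} vs capacity {cap['reads']}")
# With everything on one stripe set, write demand (12) exceeds the ~8
# concurrent writes the spindles can absorb, so the writers queue behind
# each other; move the write-intensive structures to a separate stripe set
# and both workloads fit.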

Also, if you are concerned with disk performance, never put structures that are written to on RAID-5. Oracle does almost all of its writes with small I/Os. To write a small I/O to RAID-5, the array must read the parity block from the stripe, read the old data block, compute the new parity value from the new data block, and then write both the data block and the parity block back. Hardware RAID-5 usually uses some cache to minimize this by grouping multiple blocks of a single stripe together, reducing the writes to the parity block, but normal Oracle I/O activity makes this difficult: Oracle is already attempting to stripe the data over the datafiles in the tablespace, which reduces the odds of multiple writes landing in the same RAID-5 stripe. With RAID-10 you write the block to both mirror sides, but there are no extra read I/Os. Of course, RAID-5 performance is also significantly worse than mirroring while a disk has failed or is being rebuilt, since during that period every surviving block in the RAID-5 group has to be read to service I/O involving the failed drive (and during a rebuild of a 36 GB disk, that is a lot of I/O).
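
If it helps to see the write-penalty arithmetic side by side, here is a small Python sketch using only the operation counts described above (it ignores controller cache, so it is an illustration rather than a vendor formula):

def raid5_small_write():
    # read old data block + read old parity, then write new data + new parity
    reads, writes = 2, 2
    return reads + writes          # 4 physical I/Os per logical write

def raid10_small_write():
    # write the block to both mirror sides; no reads required
    reads, writes = 0, 2
    return reads + writes          # 2 physical I/Os per logical write

def raid5_degraded_read(drives_in_group):
    # with one drive failed, reading a block from the dead drive means
    # reading every surviving member and reconstructing from parity
    return drives_in_group - 1

print("RAID-5  small write:", raid5_small_write(), "I/Os")
print("RAID-10 small write:", raid10_small_write(), "I/Os")
print("RAID-5 degraded read (5+1 group):", raid5_degraded_read(6), "I/Os")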

I do use RAID-5 in my development and test environments since cost edges out performance in these areas. All of my production data (which includes the standby I maintain for HA) is on RAID-10 (0+1 if you prefer).

Michael Brown
Senior DBA
Glen Raven, Inc.

On Fri, 11 May 2001 17:37:47 GMT, Paul Drake <paled_at_home.com> wrote:

>Ed Stevens wrote: (removed some text for brevity)
>
>> Subject: SAME and OFA
>>
>> As I read more and more about SAME (Stripe And Mirror Everything) it
>> occurs to me that the performance rationale for OFA is breaking down.
>> Am I missing something here?
>>
>> The most detailed paper I've read on SAME is "Optimal Storage
>> Configuration Made Easy" by Juan Loaiza of Oracle.
>>
>> From our Day One on Oracle (running 7.3 on NT) we have been using
>> RAID-1 and/or RAID-5 on our various Oracle servers and simulating an
>> OFA architecture by chopping the RAID partitions up into multiple
>> logical drives. I have been preaching (to no avail) to those who make
>> the hardware purchase and configuration decisions that Oracle
>> databases need more drives, not bigger drives.
>>
>> If I'm not missing anything, my question is two-fold. First, has the
>> ascendancy of large RAID arrays made OFA obsolete? Second, given an
>> Oracle implementation on a RAID system, is there any rationale for
>> simulating OFA with multiple logical drives - or should we just
>> develop a manageable directory structure on a single logical drive and
>> quit worrying about OFA?
>>
>> As I read over the above, it sounds like I'm an old fart that doesn't
>> want to let go of old technology (OFA). The truth is that out of
>> necessity, we've always had everything on either a single stripe set
>> or a stripe set for data and a mirror set for rollbacks and archive
>> logs. Further, when some poor sizing choices caused us to run out of
>> space on some logical drives while barely utilizing others, we moved
>> to replacing multiple logical drives (to simulate an OFA architecture)
>> with multiple directories. So it seems that we have arrived at the
>> essentials of SAME anyway. Should I enjoy being ahead of my time, or
>> do I still have battles to fight? Is there anything I should be
>> looking out for in the position I find myself in?
>>
>> Comments and elucidation appreciated.
>>
>> --
>> Ed Stevens
>> (Opinions expressed do not necessarily represent those of my employer.)
>
>Ed,
>
>The assumptions stated in the paper were for relatively large OS I/O sizes
>- e.g. 1024 KB.
>NT uses a 64 KB I/O size. The assumptions in the SAME paper are invalid for
>deploying on NT/W2K. Period.
>They are aimed at larger shops running on *nix, with large external storage
>solutions in place.
>That's where the revenue is - not selling 8 drives for an NT Server
>internal cabinet.
>I believe that their smallest test configuration had around 64 drives.
>
>Now, some of the principles are still valid, just that the performance
>numbers won't match up.
>There is much to be said for using wide stripes - e.g. a 4-drive RAID 0
>array (or larger) or a 5-drive RAID 5 array -
>for increased throughput during sequential reads (full table scans, index
>range scans, loads, backups, exports).
>But for higher general concurrency, multiple (as in more than half a dozen)
>RAID volumes (separate drives on separate controllers) lead to less
>contention and higher concurrency. You'll have to examine the read/write
>characteristics of the apps that you run against your databases.
>
>But - if you can fit your entire database in the SGA or on the RAID
>controller cache - the storage volume configuration matters less - except
>during startup/shutdown and recovery.
>
>The idea of breaking up a RAID volume into separate partitions escapes me.
>This forces your disk heads to seek further than would otherwise be
>required.
>Yes, I'd have my C: no larger than 4 GB.
>Yes, I'd configure a dedicated swap partition with a block size of 4 KB.
>But for all other partitions, I'd have the entire RAID volume assigned to
>a single drive letter/partition.
>Size your datafiles correctly and don't use auto-extend - and they'll be
>nicely packed together in contiguous space.
>
>Paul
>