Re: ASM for single-instance 11g db server?

From: joel garry <joel-garry_at_home.com>
Date: Wed, 6 Apr 2011 10:21:16 -0700 (PDT)
Message-ID: <5e460107-c163-4375-be45-7b1d15574d75_at_y26g2000yqd.googlegroups.com>



On Apr 6, 4:47 am, Noons <wizofo..._at_yahoo.com.au> wrote:
> Mladen Gogala wrote,on my timestamp of 6/04/2011 6:21 AM:
>
> > On Tue, 05 Apr 2011 22:14:34 +1000, Noons wrote:
>
> >> Agreed 100%.  Very much so.  In fact, I am in the process of
> >> re-allocating all our databases to RAID5 in the new SAN, precisely
> >> because given two equal SAN/processor environments I can't prove to
> >> myself that RAID5 is inherently slower than RAID10.
>
> > Noons, that goes against the traditional lore. Do you have any numbers to
> > back such claim up?
>
> That's the problem: "traditional lore".  Not based in any reasonably recent
> facts.  Hence why I haven't joined BAARF.
>
> Look, without a shadow of a doubt: 10 years ago I'd think twice about putting
> *any* database in RAID5.
>
> Now?  With a SAN?  I don't even blink.  Note that I said "with a SAN"!
>
> All my development dbs are in RAID5.  And of the production ones, only one is
> not and it's going to RAID5 in the SAN refresh in 3 months time.
>
> Mostly because:
>
> 1- All the arguments about what happens to RAID5 when 2 disks fail
> simultaneously can always be matched by the equivalent in RAID10.  Hey, if it's
> a problem with both, exactly what was the point?

Parity. I'm no expert, so correct me if I'm wrong, but RAID5 rotates a single parity block per stripe across the disks, so losing any two disks means you can't reconstruct the stripe. RAID10 mirrors, so you'd have to lose the specific mirror partner of the disk that already failed - and the chance of losing two specific drives out of many is a lot smaller than the chance of losing any two arbitrary drives out of many. Rough sketch of what I mean below.

Of course, if one of the many controllers starts corrupting data, welcome to hell.
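
Back-of-the-envelope only, and the array sizes are just examples I made up, but a little Python toy shows the shape of it (assuming independent failures and ignoring rebuild windows and hot spares):

    # Given one disk already failed, what fraction of possible
    # second failures actually lose the array?

    def raid5_second_failure_fatal(n_disks):
        # RAID5 keeps a single parity block per stripe (rotated
        # across members), so ANY second failure is fatal.
        return 1.0

    def raid10_second_failure_fatal(n_disks):
        # RAID10 only dies if the second failure hits the mirror
        # partner of the disk that already failed: 1 of n_disks - 1.
        return 1.0 / (n_disks - 1)

    for n in (8, 16, 24):  # arbitrary even array sizes, illustration only
        print(f"{n:2d} disks  RAID5: {raid5_second_failure_fatal(n):.2f}  "
              f"RAID10: {raid10_second_failure_fatal(n):.3f}")

So "two disks failing" isn't the same event for the two layouts; the bigger the string, the more that matters.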

>
> 2- When was the last time any 2 disks in your (recent crop!) SAN failed
> simultaneously in the same RAID5 string?  I thought so.  And isn't that why we
> go to all the trouble of having a DR site *and* daily online backups *and*
> archived redo logs?  There *is* a limit to how much "tobesuretobesure" we need
> to follow!

Never underestimate the power of the damager:
http://groups.google.com/group/comp.databases.oracle.server/msg/1cedb062e0071ac0
http://groups.google.com/group/comp.databases.oracle/msg/29f1684073028648

>
> 3- Modern SANs use disk failure prediction technology that avoids most if not
> all ad-hoc MTBF failures.  In 4 years it's been used and abused, our Clarion has
> never had one single failure of a running lun: the box has always phoned home
> asking for a new one loooong before any disk failed. SANs cost a lot of moolah
> because they are designed to do that sort of thing from the word go.

I've been amazed many times at how often warnings are ignored. (I couldn't find it with a quick search, but I came across something yesterday on the HP forums - a guy posted his warning messages along with a RAID question, apparently not realizing that the "battery dead" messages meant there was no battery backup on the write cache [whether that disables the cache depends on the hardware, I believe].) Yet another use for duct tape: covering up annoying warning lights.
http://groups.google.com/group/comp.databases.oracle.server/msg/9be05efc4b33ac4c

>
> 4- I don't run OLTP dbs. Now. Most of my dbs are relatively small and low
> volume.  In the order of 100-500GB size, with maybe 1TB/day of total I/O.
> With one exception, which is DW.  That one is as far from OLTP as it can get:
> 3TB per instance (multiple for dev/test and uat) and around 8TB/day of total I/O
> (~= 100MB/s averaged for the whole day).  Most of the accesses there are
> sequential and BIG - ideal fit for a big thumping RAID5 string of disks,
> exceedingly expensive to duplicate with RAID10.
>
> We run Statspack everywhere every 4 hours and have done so for 4 years.  I can
> reconstruct and plot any period of I/O stats and waits in our dbs in that
> timespan.
>
> I've used that to compare loads between our prod system where we use RAID10 and
> our dr/dev environment where I've set things up with RAID5, 8+1 disk stripe.
>
> Then we ran tests.
>
> CPU and memory same, workload same, no difference whatsoever in I/O waits and
> throughput.
>
> One thing came out: rman backups and restores take slightly less with the RAID5
> setup.
>
> Given that I'm not gaining anything in performance and I'm losing in
> backup/restore performance and I'm paying twice more with RAID10, my reaction
> was simply: "er...whaaaat?"
>
> So it's now RAID5 all the way.
>
> Until we have to run a OLTP system, that is!  ;)

I'm running OLTP on RAID5. No one mentions what happens when the write cache fills up, or when the batteries die and the controller turns the cache off. Toy arithmetic below.
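
Toy numbers only - the per-spindle IOPS figure is made up, not a measurement - but assuming the classic RAID5 read-modify-write path, the arithmetic for random small writes once the cache can't absorb them looks like this:

    DISK_IOPS = 180          # hypothetical per-spindle random IOPS
    N_DISKS = 8

    raw_iops = DISK_IOPS * N_DISKS

    # RAID5 read-modify-write: read old data + read old parity +
    # write new data + write new parity = 4 physical I/Os per host write.
    raid5_write_iops = raw_iops / 4

    # RAID10: each host write becomes 2 physical writes (both mirrors).
    raid10_write_iops = raw_iops / 2

    print(f"raw: {raw_iops}  RAID5 writes: {raid5_write_iops:.0f}  "
          f"RAID10 writes: {raid10_write_iops:.0f}")

As long as the controller cache is soaking up the writes you never see that factor of two; lose the cache and you do.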

jg

--
_at_home.com is bogus.
http://www.techeye.net/hardware/oracle-makes-big-claims-about-its-x86-servers
