Re: asm on san

From: Michael Austin <maustin_at_firstdbasource.com>
Date: Thu, 11 Dec 2008 19:26:38 -0600
Message-ID: <p1j0l.8925$D32.524@flpi146.ffdc.sbc.com>


Mladen Gogala wrote:
> helter skelter wrote:
>

>> Hi,
>>
>> I'm preparing to test some configurations on a SAN storage array. I would
>> like to find out which will be better in my environment: RAID1+0 or
>> RAID5. The other question is what segment size should be configured on the
>> storage array. For example, I've got 5 disks in RAID5, so if the ASM default
>> stripe size for datafiles is 1MB, then the segment size on the storage should
>> be 256KB to hit all drives, am I right?
>> What tools do you recommend for tests like this: sequential reads, etc.?
>> Thanks
>>
>> oracle 10gr2, rhel5
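
Just on the arithmetic in that question: if you assume the 5-disk RAID5 set
is 4 data + 1 parity per stripe and that ASM is issuing 1MB I/Os (the 10gR2
default allocation unit), then yes, a 256KB segment lines one ASM I/O up with
one full stripe. A back-of-the-envelope check, nothing Oracle-specific about
it, just the numbers:

    # Stripe math only -- assumes 4 data + 1 parity per RAID5 stripe
    # and 1MB ASM I/Os; adjust for your own array/ASM settings.
    asm_io_kb   = 1024             # 1MB ASM coarse stripe / AU size
    raid5_disks = 5
    data_disks  = raid5_disks - 1  # one disk's worth of parity per stripe

    segment_kb = asm_io_kb / data_disks
    print("segment size per disk: %d KB" % segment_kb)                # 256 KB
    print("full stripe width:     %d KB" % (segment_kb * data_disks)) # 1024 KB

If the segment lines up that way, a 1MB write is a full-stripe write and the
array does not have to do the read-modify-write dance for parity.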

>
>
> RAID 5 is not good for databases, period. Not for data files, and especially
> not for the log files; you can only use a RAID 5 partition for
> log_archive_dest_25.
> If your boss told you to evaluate RAID 5, it is very likely that he wants to
> save some money in the wrong place and will buy you RAID 5, no matter what
> your findings are. Moan Isnogood has a great site at http://www.baarf.com.
> The abbreviation "BAARF" stands for "Battle Against Any Raid Five". Your
> company is probably doomed and will not make it through the recession. Have
> a nice day.

IMProfessionalO, baarf is so outdated when it comes to the modern-day SAN - especially on IBM Shark/whatever, EMC, Hitachi, NetApp, HP (rebranded Hitachi), HP Storage Works (formerly DEC) and a host of others, where there are xxxGB of cache on the front end. Reads and write acknowledgments come back in the 1-3ms range. I have had databases on all of the aforementioned arrays with both RAID1+0 and RAID5 at the device level, even on systems that require hundreds of thousands of transactions/sec, with no noticeable difference. When carving out 200 x 1.5TB LUNs to be presented to a host, RAID5 is the only reasonable and cost-effective way to manage something of that size while still having some reasonable assurance that even a double spindle failure will not cripple you.
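
If you want to see that for yourself rather than take anybody's word for it,
something as crude as timing O_DSYNC writes against a scratch file on the LUN
will tell you whether the array cache is acknowledging writes in a few
milliseconds. A minimal sketch, assuming a Linux host and a throwaway path
(the path below is made up - never point this at anything holding data):

    # Rough write-acknowledgment latency probe, not a benchmark.
    # O_DSYNC makes each write wait for the storage to acknowledge it, so on
    # a cached array the times should sit in the low milliseconds no matter
    # what RAID level is underneath.
    import os, time

    TEST_PATH = "/mnt/testlun/latency_probe.dat"   # hypothetical scratch path
    BLOCK = b"\0" * 8192                           # 8KB, a typical DB block size
    SAMPLES = 1000

    fd = os.open(TEST_PATH, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
    try:
        times = []
        for _ in range(SAMPLES):
            t0 = time.perf_counter()
            os.write(fd, BLOCK)
            times.append((time.perf_counter() - t0) * 1000.0)   # ms
        times.sort()
        print("median write ack: %.2f ms" % times[len(times) // 2])
        print("95th percentile:  %.2f ms" % times[int(len(times) * 0.95)])
    finally:
        os.close(fd)
        os.unlink(TEST_PATH)

Run it against the RAID1+0 LUN and the RAID5 LUN and compare the medians; with
a front-end cache of any size the difference tends to get lost in the noise.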

On ALL of these systems, the bottlenecks can be in different places. If your "network (NAS)" or SAN (Fibre) does not have sufficient bandwidth, then the spindle read/write rate really doesn't matter. If your CPU and system/PCI bus are slow, you are still never going to see the spindle read/write rate, because of the cache on the array. When baarf was originally "thought up", you had RAID controllers in the systems that connected directly to a SCSI bus that connected directly to the spindle. And yes, RAID5 was quite expensive then. Again, with the tons of cache on the array front end, the write acknowledgment comes back very fast - LONG before the data ever hits the spindle. Reads on RAID5 are fast because they come from many spindles. And in an ASM environment ALL bets are off, because now it reads from many LUNs which are carved out of many RAIDx devices, and if you are using IBM SVC you have another layer of obfuscation.
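
A quick way to find out which ceiling you are actually hitting is to push a
dumb sequential read through the whole stack and compare the number against
the link rate of your HBA/fabric. A rough sketch, again assuming Linux and a
multi-GB scratch file that already exists (the path is made up); if the MB/s
comes out near the theoretical fibre rate, the path is the ceiling, and if it
comes out far below that, look at the array or the host:

    # Crude sequential-read throughput check -- reads the file once in 1MB
    # chunks (roughly what ASM coarse striping issues) and reports MB/s.
    import os, time

    BIG_FILE = "/mnt/testlun/bigfile.dat"   # hypothetical: several GB, ideally larger than host RAM
    CHUNK = 1024 * 1024

    fd = os.open(BIG_FILE, os.O_RDONLY)
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)  # try to push it out of the page cache first
    total = 0
    t0 = time.perf_counter()
    while True:
        buf = os.read(fd, CHUNK)
        if not buf:
            break
        total += len(buf)
    elapsed = time.perf_counter() - t0
    os.close(fd)
    print("read %.0f MB in %.1fs -> %.0f MB/s" % (total / 2**20, elapsed, total / 2**20 / elapsed))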

You can listen to the "baarf" crowd - or you can actually use RAID5 on a day-to-day basis and see that even if the difference is somewhat measurable, in any moderately sized/busy system you will never notice it. Maybe on a pristine, unused server/system, since that is the only time you can truly measure those differences - and only as long as you know exactly where on the spindle the data is being written, how far the head has to travel to get to the cylinder you want, etc. etc... all I can say is BAARRRRRFFFFF!

The baarf crowd can throw numbers all day long, but the bottom line is that, like the TPS benchmarks, those numbers are generally gathered from pristine systems with no real work being done.