Re: Storage array advice anyone?

From: Giovanni Cuccu <gcuccu_at_dianoema.it>
Date: Tue, 14 Dec 2004 12:13:11 +0100
Message-ID: <41BECAC7.6030405@dianoema.it>


Hi all,

I'd like to share my experience with you. I have never been involved in
setting up such storage systems, but the company I work for has several
hundred customer sites in Italy, each with at least one server equipped with
RAID 1 or RAID 5.
During the last year we encountered problems only with RAID 5. I followed some
of them, and the most frequent error is the following: when a write fails due
to a parity check, the controller marks the wrong disk as bad, making the
whole system unusable. This happened more than once, and the fix (at the
hardware level) was usually a firmware/driver update. Now, when I can decide
the RAID type, I always choose RAID 10.
This is my real-world experience, but others may have had different ones.

Giovanni
> Stephen,
>
> This is a classic debate / argument that's been going on in one form or
> another for years. Anyway, for what it's worth, here's my "two penny worth",
> or 2 cents if you prefer.
>
> Last year I was involved in setting up an IBM ESS 8000 "Shark" which had 80 x
> 146 GB drives, so similar to, but not quite, the capacity of your array. We
> had to go through a number of decisions:
>
> 1. RAID 5 or 10 ?
> (I've heard some people say that on the Shark RAID 5 is actually RAID 4, but
> I don't want to go there now).
> There are the usual trade-offs:
> - RAID 5 gives more capacity.
> - RAID 10 gives better protection against disk failures, although with hot
>   spares etc. you'd have to be very, very unlucky to suffer data loss using
>   RAID 5.
>
> On performance, RAID 10 is generally better, but it depends on things such as
> the read/write ratio and whether the RAID 10 implementation reads from both
> plexes or only from the primary. RAID 5 suffers when there's been a disk
> failure, especially while it's rebuilding onto the hot spare.
>
> In our situation RAID 5 was chosen due to price and capacity requirements.
> Also, given the nature of our Oracle databases, the performance benefits of
> RAID 10 were likely to be marginal except when recovering from a disk
> failure.
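>
> As a rough illustration of the capacity side of that trade-off, here's a
> back-of-the-envelope sketch in Python (the 8-disk group and 146 GB drive
> size are just example figures; real arrays also lose space to hot spares
> and formatting, which this ignores):
>
>     # Usable capacity of one hypothetical 8-disk group of 146 GB drives.
>     DISKS_PER_GROUP = 8
>     DISK_GB = 146
>
>     def raid5_usable(disks, size_gb):
>         # one disk's worth of capacity goes to parity
>         return (disks - 1) * size_gb
>
>     def raid10_usable(disks, size_gb):
>         # every block is mirrored, so only half the raw capacity is usable
>         return (disks // 2) * size_gb
>
>     print("raw    :", DISKS_PER_GROUP * DISK_GB, "GB")                # 1168 GB
>     print("RAID 5 :", raid5_usable(DISKS_PER_GROUP, DISK_GB), "GB")   # 1022 GB
>     print("RAID 10:", raid10_usable(DISKS_PER_GROUP, DISK_GB), "GB")  #  584 GB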
>
> 2. Striping etc
>
> The first question you need to ask is:
> "Do I have different workloads, e.g. dev, live, performance-critical
> databases, etc.?"
> If you do, which is likely, then you need to decide whether to segregate the
> workloads / databases onto separate groups of disks to avoid performance
> contention between them (at the disk level; you can't avoid it at the cache
> level). James Morle has written an excellent paper on this, "Sane SAN"; it
> should be available on his website, scaleabilities.co.uk.
>
> In our case we effectively had a single critical workload (a group of
> databases and flat files). When this workload was running, nothing else would
> be. So to maximise performance we did the following:
>
> a) Divided each disk group (a set of 8 disks as a RAID 5 set) into 20 GB
> LUNs, i.e. "disks" / physical volumes (PVs) as seen by the OS.
>
> b) Created volume groups made up of an equal number of LUNs from each disk
> group, e.g. VG02 contained 2 LUNs from each of the 10 disk groups, so 400 GB.
>
> c) Created filesystems from these volumes, striped with a 4 MB stripe size
> across all disks in the VG. This was done using "extent-based striping"
> performed by the volume manager (on both HP-UX and AIX).
>
> This meant that our critical workload had access to all the physical disks
> all the time and the I/O was evenly spread across all disks, hence maximising
> performance.
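>
> To make the layout concrete, here's a small Python sketch of how a volume
> striped across 20 LUNs (2 from each of the 10 disk groups) spreads a
> sequential scan. The round-robin LUN ordering is an assumption; the real
> volume-manager layout may place the LUNs differently, but the effect of
> every 4 MB stripe landing on a different set of spindles is the same:
>
>     N_GROUPS       = 10   # RAID 5 disk groups in the array
>     LUNS_PER_GROUP = 2    # LUNs taken from each group for this VG
>     STRIPE_MB      = 4    # stripe size used by the volume manager
>
>     # order the LUNs so consecutive stripes hit different disk groups
>     luns = [(g, c) for c in range(LUNS_PER_GROUP) for g in range(N_GROUPS)]
>
>     def lun_for_offset(offset_mb):
>         stripe = offset_mb // STRIPE_MB
>         return luns[stripe % len(luns)]
>
>     # first 12 stripes of a sequential read round-robin across the groups
>     for off in range(0, 12 * STRIPE_MB, STRIPE_MB):
>         group, copy = lun_for_offset(off)
>         print("offset %3d MB -> disk group %d (LUN %d)" % (off, group, copy))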
>
> If you decide you want to segregate workloads then you need to allocate a
> physically separate group of disks to each workload. Then I'd suggest you
> stripe across each separate group of disks as shown above. So you might end
> up with 3 or 4 groups of disks, each with its own separate striping.
>
> When doing this you need to bear in mind what you are going to do when extra
> capacity is added in a year or two's time. This can be quite a challenge.
>
> Hope that all made sense.
>
> 3. Disk failures
>
> My experience is that with either RAID 5 or 10 you have to be unbelievably
> unlucky to lose data, provided disks are replaced when they fail and not left
> for a few days or even more. The probability is extremely remote. It might be
> an idea to get someone to do the maths and work out the probabilities.
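>
> For what it's worth, here's a crude sketch of that maths in Python. The
> MTBF and rebuild-time figures are made-up round numbers, and it ignores
> unrecoverable read errors during rebuild, so treat the output as
> order-of-magnitude only:
>
>     HOURS_PER_YEAR = 8760
>     MTBF_HOURS     = 500000   # assumed per-disk MTBF
>     REBUILD_HOURS  = 12       # assumed rebuild time onto a hot spare
>     N_DISKS        = 8        # disks in one group
>
>     fail_rate = 1.0 / MTBF_HOURS                  # failures per disk-hour
>     first_fails_per_year = N_DISKS * fail_rate * HOURS_PER_YEAR
>
>     # RAID 5: any of the other n-1 disks failing during rebuild loses data
>     p_raid5  = first_fails_per_year * (N_DISKS - 1) * fail_rate * REBUILD_HOURS
>     # RAID 10: only the failed disk's mirror partner matters
>     p_raid10 = first_fails_per_year * 1 * fail_rate * REBUILD_HOURS
>
>     print("approx. annual data-loss probability, RAID 5 : %.1e" % p_raid5)
>     print("approx. annual data-loss probability, RAID 10: %.1e" % p_raid10)
>
> With those assumptions both numbers come out well below one in ten thousand
> per group per year, which is why replacing failed disks promptly matters far
> more than the choice of RAID level for avoiding outright data loss.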
>
> Well I hope that helps.
>
> Chris
>
> PS: I know about BAARF, and in an ideal world we wouldn't use RAID 5, but
> sometimes, when managers are managers and bean counters are counting their
> beans, you can't justify RAID 10 over RAID 5. You just need to make
> management aware of the trade-offs and make sure they understand the
> implications of the decision.
>
> Quoting Stephen Lee <Stephen.Lee_at_DTAG.Com>:
>
>

>>There is a little debate going on here about how best to set up a new
>>system which will consist of IBM pSeries and a Hitachi TagmaStore 9990
>>array of 144 146-gig drives (approx. 20 terabytes).  One way is to go
>>with what I am interpreting as the "normal" way to operate, where the
>>drives are all aggregated as a big storage farm -- all reads/writes go
>>to all drives.  The other way is to manually allocate drives for
>>specific file systems.
>>
>>Some around here are inclined to believe the performance specs and
>>real-world experience of others that say the best way is to keep your
>>hands off and let the storage hardware do its thing.
>>
>>Others want to manually allocate drives for specific file systems.
>>Although they might be backing off (albeit reluctantly) on their claims
>>that it is required for performance reasons, they still insist that
>>segregation is required for fault tolerance.  Those opposed to that
>>claim insist that the only way (practically speaking) to lose a file
>>system is to lose the array hardware itself, in which case all is lost
>>anyway no matter how the drives were segregated, and that if they really
>>wanted fault tolerance they would have bought more than one array.  And
>>around and around the arguments go.
>>
>>Is there anyone on the list who would like to weigh in with some
>>real-world experience and knowledge on the subject of using what I
>>suppose is a rather beefy, high-performance array?
>>
>>--
>>http://www.freelists.org/webpage/oracle-l
>>

>
>
>
> Chris Dunscombe
>
> Christallize Ltd
> --
> http://www.freelists.org/webpage/oracle-l
>
>
-- 

----------------------------------------
Giovanni Cuccu
Sw Engineer_at_dianoema.it
Dianoema S.p.A.
Via de' Carracci 93 40131 Bologna
Tel: 051-7098211   051-4193911
e-mail:gcuccu_at_dianoema.it
----------------------------------------
No man does it all by himself,
I said young man,
put your pride on the shelf
----------------------------------------
--
http://www.freelists.org/webpage/oracle-l
Received on Tue Dec 14 2004 - 05:08:59 CST
