Re: Replicated File System Consistency

From: joel garry <joel-garry_at_home.com>
Date: Fri, 2 Jan 2009 13:27:53 -0800 (PST)
Message-ID: <8795cdd4-513f-494f-99eb-165be79578f2_at_o40g2000prn.googlegroups.com>



On Jan 1, 9:12 am, Pat <pat.ca..._at_service-now.com> wrote:
> On Dec 30 2008, 9:55 am, DA Morgan <damor..._at_psoug.org> wrote:
>
> > David's link is a good one but I'd like to address the fact that you
> > are not seeing any improvement from the NetApp 3040. Here are a couple
> > of questions you might explore:
>
> > 1. What is the limiting factor? CPU? Network bandwidth/latency? Storage?
>
>   Nature of our app is that the working set fits in memory most of the
> time, so we're largely CPU bound, and you wouldn't expect a SAN to
> change that at all. We have certain queries/operations that tend to be
> IO bound, but we didn't see a dramatic improvement on these (maybe 20%
> if I recall).
>
> > 2. How many LUNs? (one is almost never the right answer)
>
>    I think it's 70 disks, split into 3 aggregates of 10, 30, 30. Dunno
> how many LUNs the SAN has, but all the DB servers have three LUNs.
> Boot LUN is on the 10-disk aggregate, /u01 is on the second aggregate
> (30), and /u02 is on the third aggregate (30).
>
> > 3. What RAID level?
>
>    RAID-DP, which is NetApp's version of RAID-6.
>
> > 4. How is the cache configured? What percentage read? What percentage write?
>
>    8G cache on each head unit (2 head units), but I don't know how it's
> configured. Frankly, I didn't even realize you could configure different
> read/write percentages. I suspect the SAN guys know, though; is it worth
> asking? Is there a recommendation as to how the cache should be
> allocated? Our workload is very read heavy, so I'd naively assume a
> bigger read cache would be preferable.
>
> > 5. How many physical disks are you striped over for your hottest data files?
>
>    30 Fiber Channel 15k drives on each of the main data aggregates.
>
>    All the numbers point to the SAN being much faster, but I think
> what we basically proved is that we don't have an IO bound workload.
> To give a little history, we used to have serious problems with IO
> throughput, so maybe 2 years ago we went through a project to put
> everything on 64-bit Oracle (we used to run 32-bit) with uniform 24G
> SGAs across all our databases. With that much memory and a read-heavy
> workload, our IO problems were largely solved even before we brought
> the SAN into the picture.
>
>    My suspicion is that if we took the old (IO bound) configuration
> and moved it on top of the SAN we'd see a tremendous boost in
> throughput, but the current config is largely CPU bound so the SAN
> doesn't make a lot of difference.

Of course, it depends why it is CPU bound. With your 24G SGA, it could be Oracle deciding to do full table scans and then having to do lots of housekeeping to keep track of the latching and read consistency involved, when if it had decided to range scan indices it would have transferred the load to I/O - but both the load and the wall time would have been less. I've seen this with the simple loss of an index - simply recreating the index completely changed the bottleneck distribution of all the apps on the box. Of course, with all the things that can happen with the optimizer, and with bugs, the root cause can be a lot more subtle, with similarly gross consequences. I say bugs - I'm not going to research them for you, but there are some CPU-loading bugs, and it may be worth checking whether any apply to you - note that some things are only fixed in patches beyond 10.2.0.4, whatever that means. It's important to sort that kind of thing out, since the complex combination of Oracle, the OS file buffers (if any), and the cache on the SAN can obscure the real "nature of the app." Did you happen to mention your OS, and how asynchronous I/O is configured? That is severely platform dependent.
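If you want a quick sanity check on whether that's what's happening, something along these lines against v$sql and v$sql_plan (just a sketch off the top of my head - verify the columns on your release) will show your top CPU statements that include a full table scan in their plan:

  -- Top CPU consumers whose cached plan contains a full table scan.
  -- cpu_time is in microseconds in v$sql.
  select s.sql_id,
         s.executions,
         round(s.cpu_time/1000000) cpu_secs,
         round(s.buffer_gets/greatest(s.executions,1)) gets_per_exec,
         p.object_name
    from v$sql s, v$sql_plan p
   where p.sql_id = s.sql_id
     and p.child_number = s.child_number
     and p.operation = 'TABLE ACCESS'
     and p.options = 'FULL'
   order by s.cpu_time desc;

As for the async question, a quick "show parameter filesystemio_options" and "show parameter disk_asynch_io" in sqlplus will at least tell you what Oracle thinks it's doing; what the OS actually does with it is another story.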

>
>    FWIW the main reason we went SAN wasn't throughput, but crash
> recovery. If a DB server explodes, I can mount its 3 LUNs on a hot
> spare and have it back up in a matter of minutes. Since nobody wanted
> to pay for Data Guard, it's the best hot failover strategy we could come
> up with.

OK, as long as we don't hear anything stupid from you, like "the vendor guaranteed no data loss for 10,000 years..." :-) (You'd be amazed how far off reality some people are!)

jg

--
_at_home.com is bogus.
"Welcome to Carlsbad Village!" - electronic sign next to train
platform in Oceanside, CA transit station, the next one north of
Carlsbad Village.
Received on Fri Jan 02 2009 - 15:27:53 CST
