Re: 9i RAC or 10g RAC ?

From: Don Granaman <granaman_at_cox.net>
Date: Wed, 12 May 2004 02:41:59 -0400
Message-ID: <00cc01c43861$a358de70$6401a8c0@dilbert>


I'll chime in...

Cary, Zhu, and some others are correct. If you are looking for "performance", don't use RAC. The cost of everything, including just starting the instance, is greater. If you want basic availability, don't use RAC. It is more complex and has more "moving parts" that can go wrong. Please refer to Mogens' paper "You Probably Don't Need RAC" at http://www.miracleas.dk (via Writings From Mogens).

My experience with RAC on RedHat AS 2.1 (and now ES 3.0) is similar to Zhu's. The behavior of 9.0.x was almost comical. It seemed sometimes that a single stray neutrino could crash everything even near a cluster component. Even with 9.2.0.4 we had a few additional hardware (?) issues that were "interesting". One was a driver (for the EMC Clariion) that would occasionally cause a node to "lose its LUNs". Even though the LUNs were still accessible from the other node, that node's instance would die a horrible death too. Then after reboot, instance recovery would take ~45 minutes - even when there was little to do. There would be little or no I/O, very little CPU utilization, but the instance would sit there for 43 minutes or so, then realize that it was supposed to be doing recovery, suddenly come to life, and complete in 70 seconds or so. Support finally created an (as yet unpublished) bug on it. However, after several driver updates (my boss is a VP, the SA, and was a sr systems engineer at EMC for years - and has a few connections...), all this silliness finally disappeared, as did some other issues (e.g. 12170 two-task layer errors) that were evidently tied to the driver. I got to close three long-standing TARs! [GD: "What a long, strange trip it's been..."]

After six months, multiple OS patches, multiple QLogic driver updates, a few Oracle patches and workarounds, and some application process partitioning, the system is now fairly stable (with 9.2.0.4). Not as stable as on 8.1.7.4 exclusive though. The interconnect speed, as mentioned previously, is important, but beware that Linux (RedHat on Dell at least) has some limiting issues with the interconnect (multiple/redundancy/crashworthiness, speed, etc.) compared to more "mature" cluster implementations (e.g. Sun PDB, HP, etc.). I haven't really had any significant problems with it, but I don't have a "randomly splatter processes over an array of nodes/instances" implementation.

What I like best about RAC is that I can often exile ill-behaved processes (e.g. "What's a bind variable?", LIO pigs, cache-trashers, and their ilk) to node(s) not running more critical and well-behaved code. You do get multiple redo threads and some other things that may help in certain situations. To quote one true OPS/RAC expert (name withheld to protect an Oracle employee from "political incorrectness" charges): "I like RAC, I like Linux, but I don't like RAC on Linux - at least not yet".

I have no experience with 10g yet, but if you want "cheap", I find the concept of Oracle One RAC on a large cluster of 2-CPU Linux machines both intriguing and scary. Yep - with 10g, RAC is available for SE; EE is not required! At the serious risk of sounding like a dinosaur/heretic, if you can intelligently partition your application processes between nodes/instances, it might work well on a (Oraclely speaking) shoestring budget.
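
To make the partitioning idea concrete, here is a minimal sketch of routing each application workload class to a fixed instance, so that the ill-behaved ones land on a node that is not running the critical code. Everything named here is an assumption for illustration only: the workload classes, the connect aliases PROD1/PROD2, and the alias_for helper are all hypothetical, and a real setup would map these to whatever per-instance aliases the site defines.

```python
# Hypothetical mapping of workload classes to per-instance connect aliases.
# PROD1 is assumed to be the node reserved for critical, well-behaved code;
# PROD2 is assumed to be the node where everything else is exiled.
ROUTES = {
    "oltp": "PROD1",
    "batch": "PROD2",
    "adhoc_reports": "PROD2",
}

# Anything not explicitly classified goes to the non-critical node.
DEFAULT_ALIAS = "PROD2"


def alias_for(workload: str) -> str:
    """Return the connect alias a given workload class should use."""
    return ROUTES.get(workload, DEFAULT_ALIAS)


if __name__ == "__main__":
    for name in ("oltp", "batch", "unknown_tool"):
        print(name, "->", alias_for(name))
```

The point of defaulting unknown workloads to the non-critical node is that a new or unclassified process has to earn its way onto the critical instance rather than landing there by accident.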

One thing that has always been the bane of Oracle on Linux is ruinStaller. The compatibility matrix for Linux is like "where's Waldo?" If you have the right version of Linux, the right version of Oracle, the right glibc/compatibility, the right JRE/JDK ("M-O-U-S-E"), all the particular environment quirks for the combination (e.g. LD_ASSUME_KERNEL, ad nauseam), and have sacrificed enough chickens, it works - usually. The 10g installer *appears* to be a vast improvement in this respect. The 9.2.0.5 patchset even uses it. Even better, you can get an RPM for a 10g "lite" client install!

General advice: Get the fastest, baddest CPUs you can (at least for the 2-CPU nodes model) - RAC has additional overhead. Then there is "krefilld" (sp? -don't currently have a Linux shell session)... Stuff the boxes with memory - the first issue we encountered was memory, even with significantly more than on the old 8.1.7.4 exclusive server.

-Don Granaman
(verbose) OraSaurus

> Please advise if i am wrong, but I'm hoping that moving to RAC would
> help improve performance because :
>
> Our Solaris production box, which is 12 cpu 1200 mhz,24 gb ram is cpu
> bound at peak working hours, and its lease ends in 3 months.
> Management does not want to spend money on a bigger Solaris box. So we
> decided to purchase new linux boxes.
>
> Since 12+ cpu linux boxes are very expensive, we have decided to go for
> 4cpu linux boxes featuring 2.2ghz Opteron cpu's on 9i or 10g rac, and a
> new san with faster disks.
>
> rac is being used primarily because i want to link together these 3 or 4
> linux boxes.
> The combination of these factors, faster cpu's , faster disks and bigger
> ram should provide better performance, i hope ?
>
> any advice is appreciated.
>
> thanks & regards
> ratnesh

[...snipped...]
Received on Wed May 12 2004 - 01:39:03 CDT
