Re: RAC or Large SMP...?

From: DA Morgan <>
Date: Fri, 10 Oct 2008 06:17:57 -0700
Message-ID: <> wrote:

>> I'm also not convinced that the fewer servers are easier to administer
>> argument is as valid these days. This was certainly true in the past,
>> but modern package management has become quite sophisticated.
>> Managing larger numbers of servers dedicated to the same role isn't that
>> much of an overhead anymore. At least we haven't seen a substantial
>> increase in administration since moving to RAC. In fact, the added
>> fault tolerance has reduced impact and stress on staff when hardware
>> failures occur.
>> Tim

> It's exactly this area of RAC (i.e. administration) that concerns me.
> In your experience does the following scenario sound familiar:
> "Ah yes, troubleshooting. I've seen many clusters that just froze for
> no apparent reason in my time. It's always possible to make the OS or
> Cluster software dump a trace/log file when it happens.
> The resulting trace/log file from the cluster will normally be the
> size of Texas, and only one or two people in the entire vendor
> organisation can truly understand them, you will be told.
> Then the files (often with sizes measured in GB) are shipped to the
> vendor and some months later they will report back that it wasn't
> possible to pinpoint the exact reason for the complete cluster freeze
> or crash, but that this parameter was probably a bit low and this
> parameter was probably a bit high.
> That's what always happens. I have never – really: never – seen a
> vendor who could correctly diagnose and explain a hanging cluster or a
> cluster that kept crashing.
> As to Oracle trouble shooting I'm not so worried. Oracle will either
> have a performance problem, which is easy to diagnose using the Wait
> Interface or you'll get ora-600 errors that are fairly easy to
> diagnose, although you'll need to spend the required 42 hours logging
> and maintaining an iTAR or SR or whatever the name is these days.
> In other words: Finding out what's wrong (if anything) in Oracle is
> much easier than finding out what's wrong with a cluster."
> This quote was pulled from
> Has the Oracle Clusterware and RAC become mature enough so that the
> above is no longer a common problem..? The company I now work for
> deployed RAC 9i and went through 6 months of hell exactly like the
> scenario above, so they have been burned in the past.
> There is also the argument that RAC systems will require more
> scheduled downtime than single instance systems because there are more
> Oracle homes to patch (CRS, multiple database homes, ASM homes etc).
> Personally, I'd love to implement the RAC solution as I think that it
> is an excellent technology but somehow I think that I may regret it in
> the long run......

The first RAC implementation I thought stable enough for production consideration was Since then it has gotten substantially better.

Mogens' comments on RAC are accurate within their context, but that is not the only context there is.

If you are going to go into RAC, make sure your budget includes money for a test cluster, to be used for software testing, for DBA training and as a DBA sandbox, and also money for staff training. With real hands-on RAC training, unfortunately something Oracle itself does not provide, 10gR2 and 11g RAC are not that much more difficult to manage than stand-alone.

Daniel A. Morgan
Oracle Ace Director & Instructor
University of Washington (replace x with u to respond)
Puget Sound Oracle Users Group
Received on Fri Oct 10 2008 - 08:17:57 CDT