Re: RAC or Large SMP...?
Date: Fri, 10 Oct 2008 06:17:57 -0700
>> I'm also not convinced that the fewer servers are easier to administer
>> argument is as valid these days. This was certainly true in the past,
>> but modern package management has become quite sophisticated.
>> Managing larger numbers of servers dedicated to the same role isn't that
>> much of an overhead anymore. At least we haven't seen a substantial
>> increase in administration since moving to RAC. In fact, the added
>> fault tolerance has reduced impact and stress on staff when hardware
>> failures occur.
>>
>> Tim
> It's exactly this area of RAC (i.e. administration) that concerns me.
> In your experience does the following scenario sound familiar:
> "Ah yes, troubleshooting. I've seen many clusters that just froze for
> no apparent reason in my time. It's always possible to make the OS or
> Cluster software dump a trace/log file when it happens.
> The resulting trace/log file from the cluster will normally be the
> size of Texas, and only one or two people in the entire vendor
> organisation can truly understand them, you will be told.
> Then the files (often with sizes measured in GB) are shipped to the
> vendor and some months later they will report back that it wasn't
> possible to pinpoint the exact reason for the complete cluster freeze
> or crash, but that this parameter was probably a bit low and this
> parameter was probably a bit high.
> That's what always happens. I have never - really: never - seen a
> vendor who could correctly diagnose and explain a hanging cluster or a
> cluster that kept crashing.
> As to Oracle trouble shooting I'm not so worried. Oracle will either
> have a performance problem, which is easy to diagnose using the Wait
> Interface or you'll get ora-600 errors that are fairly easy to
> diagnose, although you'll need to spend the required 42 hours logging
> and maintaining an iTAR or SR or whatever the name is these days.
> In other words: Finding out what's wrong (if anything) in Oracle is
> much easier than finding out what's wrong with a cluster."
> This quote was pulled from http://www.miracleas.dk/WritingsFromMogens/YouProbablyDontNeedRACUSVersion.pdf.
> Has the Oracle clusterware and RAC become mature enough so that the
> above is no longer a common problem? The company I now work for
> deployed RAC 9i and went through 6 months of hell exactly like the
> scenario above, so they have been burned in the past.
> There is also the argument that RAC systems will require more
> scheduled downtime than single instance systems because there are more
> Oracle homes to patch (CRS, multiple database homes, ASM homes etc).
> Personally, I'd love to implement the RAC solution as I think that it
> is an excellent technology but somehow I think that I may regret it in
> the long run......
The first RAC implementation I thought stable enough for production consideration was 220.127.116.11. Since then it has gotten substantially better.
Mogens' comments on RAC are accurate within their context, but that is not the only context there is.
If you are going to go into RAC, make sure your budget includes money for a test cluster, to be used for software testing, for DBA training, and as a DBA sandbox, and also money for staff training. With real hands-on RAC training, which unfortunately Oracle itself does not provide, 10gR2 and 11g RAC are not that much more difficult to manage than stand-alone.
--
Daniel A. Morgan
Oracle Ace Director & Instructor
University of Washington
damorgan_at_x.washington.edu (replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org

Received on Fri Oct 10 2008 - 08:17:57 CDT