Re: RAC or Large SMP...?
Date: Thu, 9 Oct 2008 18:39:07 -0700 (PDT)
Message-ID: <951152d1-95c5-42eb-8bb4-8d91aa14d34c@s9g2000prg.googlegroups.com>
> I'm also not convinced that the fewer servers are easier to administer
> argument is as valid these days. This was certainly true in the past,
> but modern package management has become quite sophisticated.
> Managing larger numbers of servers dedicated to the same role isn't that
> much of an overhead anymore. At least we haven't seen a substantial
> increase in administration since moving to RAC. In fact, the added
> fault tolerance has reduced impact and stress on staff when hardware
> failures occur.
>
> Tim
It's exactly this area of RAC (i.e. administration) that concerns me. In your experience, does the following scenario sound familiar:
"Ah yes, troubleshooting. I’ve seen many clusters that just froze for no apparent reason in my time. It’s always possible to make the OS or Cluster software dump a trace/log file when it happens.
The resulting trace/log file from the cluster will normally be the size of Texas, and only one or two people in the entire vendor organisation can truly understand them, you will be told.
Then the files (often with sizes measured in GB) are shipped to the vendor and some months later they will report back that it wasn’t possible to pinpoint the exact reason for the complete cluster freeze or crash, but that this parameter was probably a bit low and this parameter was probably a bit high.
That’s what always happens. I have never – really: never – seen a vendor who could correctly diagnose and explain a hanging cluster or a cluster that kept crashing.
As to Oracle trouble shooting I’m not so worried. Oracle will either have a performance problem, which is easy to diagnose using the Wait Interface or you’ll get ora-600 errors that are fairly easy to diagnose, although you’ll need to spend the required 42 hours logging and maintaining an iTAR or SR or whatever the name is these days.
In other words: Finding out what’s wrong (if anything) in Oracle is much easier than finding out what’s wrong with a cluster."
This quote was pulled from http://www.miracleas.dk/WritingsFromMogens/YouProbablyDontNeedRACUSVersion.pdf.
Have Oracle Clusterware and RAC become mature enough that the above is no longer a common problem? The company I now work for deployed RAC 9i and went through 6 months of hell exactly like the scenario above, so they have been burned in the past.
There is also the argument that RAC systems will require more scheduled downtime than single-instance systems because there are more Oracle homes to patch (CRS, multiple database homes, ASM homes, etc.).
Personally, I'd love to implement the RAC solution, as I think it is an excellent technology, but somehow I think that I may regret it in the long run...
Received on Thu Oct 09 2008 - 20:39:07 CDT