Re: HA Clustering with RAC

From: Andreas Schulze <b79xan_at_gmx.de>
Date: Fri, 19 Mar 2004 17:36:44 +0100
Message-ID: <c3f611$rd32_at_news-1.bank.dresdner.net>


"Ulf" <Ulf.Pralle_at_web.de> wrote in message news:57f997b6.0403190720.746edd9_at_posting.google.com...
> > Hallo Ulf,
> >
> > you might start your thesis with getting the definitions of
> > failover/takeover/high availability etc. straight. As you are going to
> > consider clusters in general plus software of different vendors you
> > are likely to experience that some terms are used differently
> > depending on who uses them. High availability as defined by IBM does
> > not mean that there is no downtime. The IBM docs state expressly
> > that HACMP is not the right solution if you cannot afford any
> > downtime. I.e. you might use an HACMP cluster for credit card
> > processing (and there may be some minutes when no cards are being
> > processed) but it is not suitable for running a nuclear power plant.
> > Take a look at the IBM Redbooks site http://www.redbooks.ibm.com/. There
> > is a huge number of PDF docs available (at no cost) that will tell
> > you a lot of what you need to know about IBM HACMP.
> >
> > Just my 5 cents.
> > HTH,
> > Andreas
>
> Hi Andreas,
> thanks a lot. But I didn't get it. If you are creating a cluster, does
> the cluster always need special software for clustering? Or is
> Oracle able to create and handle a cluster without special software?
> I think there is no way to create a cluster without special
> software like HACMP or Veritas, is there?
>
> Bye

Hi Ulf,

well, no, Oracle cannot handle a cluster, but then again, no, you don't always need special software either. You could even build a (well, sort of) HACMP cluster with (Korn) shell scripts. The problem then is just how you define the conditions under which a takeover takes place, and how you make the nodes respond to failures so that the takeover is controlled. But you do need something that monitors the state of the cluster and controls the failover/takeover process. This will very likely be closely tied to the operating system and only loosely tied to the application, but that is not a must. If high availability software has a reaction time of several minutes, you might just as well have an operator run a manual takeover in the same time. (Maybe some chance for further outsourcing to India ;-))

HACMP uses the AIX System Resource Controller, e.g. the cluster manager is a daemon that runs with highest priority and basically just checks at short intervals whether devices are active. This is done by different means, e.g. heartbeats that are sent to/from the cluster's nodes. HACMP has another advantage over the script solution mentioned above: it comes with tools that can break SCSI locks and reservations. Thus it can take over disks without rebooting the other node(s) (this is rather difficult with just Korn shell).

The applications running on the nodes, however, are controlled by scripts (called application servers), at least for starting/stopping the apps. I once ran a cluster that took two to three hours for a complete takeover because the Oracle DBs took so long to stop/start via scripts (Oracle 8.1.6i with SAP R/3 on two S80s with 12 CPUs and 12 GB RAM). Oracle and SAP actually had nothing to do with HACMP itself. Oracle did come into play when it came to replaying the redo logs after failover (which then took even longer), but that's not part of the high availability.

HACMP is a commercial solution that integrates smoothly with AIX. It is quite expensive and, for that and some other reasons, is becoming more and more rare these days. You can use it either for convenience or because you need software backed by a big IT firm, but you could also use some scripts on some Linux boxes to build something similar. Why, even HACMP started as a script solution once, IIRC (for a big American newspaper). Therefore I would not claim that the only true high availability cluster solution needs to be commercial. As I suggested before: get your definition of high availability straight to make sure that you are not misunderstood.
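
To make the script idea a bit more concrete, here is a minimal sketch in Python of the two pieces such a home-grown solution needs: a heartbeat check that decides when a node counts as failed, and a takeover routine that runs its steps in order. Everything in it is an assumption for illustration (the names HeartbeatMonitor and run_takeover, the one-second interval, the threshold of three missed beats, the step names). It is not how HACMP is implemented, and the actions are placeholders where real scripts would take over disks and start the application.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (hypothetical helper)."""

    interval: float = 1.0      # assumed seconds between heartbeats
    missed_allowed: int = 3    # assumed number of missed beats tolerated
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        """Note that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        """Return nodes whose last heartbeat is older than the allowed gap."""
        deadline = self.interval * self.missed_allowed
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > deadline
        )


def run_takeover(
    node: str,
    steps: List[Tuple[str, Callable[[str], bool]]],
) -> List[str]:
    """Run (name, action) steps in order for a failed node.

    Each action receives the node name and returns True on success.
    Execution stops at the first failing step, so the returned list shows
    how far the takeover got.
    """
    completed = []
    for name, action in steps:
        if not action(node):
            break
        completed.append(name)
    return completed
```

A monitoring loop would call record() whenever a heartbeat arrives, poll failed_nodes() at short intervals, and pass each failed node to run_takeover() with steps such as acquiring the disks and starting the application. The hard parts stay exactly where I said above: choosing the failure condition and making the takeover controlled.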

Regards,
Andreas