Message-Id: <10570.113041@fatcity.com>
From: Chuck Hamilton
Date: Wed, 26 Jul 2000 12:25:00 -0700 (PDT)
Subject: Re: Oracle HA ..

Haven't used Veritas, but we are using SGI's FailSafe 2.0. For the most part it works well, but we have had a few situations where it doesn't. I would suspect other HA options would run into the same problems we did.

One problem is with the script (the monitor, in SGI's case) that checks to see if the instance is running. It basically just searches for the pmon process. Problem is, we have two instances with similar names: MC and MCTR. As long as MCTR is up, the software thinks MC is up too, because a scan for ora_pmon_MC finds the ora_pmon_MCTR process too!
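To make the name clash concrete, here is a minimal Python sketch of the difference between a substring scan and an exact match on the process name. This is not the FailSafe monitor itself; the function names are made up for illustration, and the process list is passed in as plain strings rather than read from ps.

```python
def pmon_running_naive(sid, process_names):
    """Substring scan, which is how the false positive arises."""
    target = "ora_pmon_" + sid
    return any(target in name for name in process_names)


def pmon_running_exact(sid, process_names):
    """Match the whole process name, so MC no longer matches MCTR."""
    target = "ora_pmon_" + sid
    return any(name.strip() == target for name in process_names)


# Only the MCTR instance is up; MC is down.
procs = ["ora_pmon_MCTR", "ora_smon_MCTR"]
print(pmon_running_naive("MC", procs))    # True  (wrong: MC is down)
print(pmon_running_exact("MC", procs))    # False (correct)
print(pmon_running_exact("MCTR", procs))  # True
```

In a shell-based monitor the same idea would mean anchoring the pattern at the end of the process name instead of doing a bare search for it.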

I've also had a situation where a rollback segment got corrupted, which caused the instance to crash. The HA software sprang into action and failed the instance over to the other node in the cluster, which of course couldn't start it with a corrupt rollback segment either, so it crashed again. That in turn caused it to fail back to the original node... and so on. The two nodes played ping-pong with the instance for 20 minutes before we were able to put it in "maintenance mode" and manually correct the problem.
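One generic way to stop that kind of ping-pong is to cap how many failovers are allowed within a time window and then leave the instance down for a human to look at. The sketch below is only an illustration of that idea in Python, not something FailSafe provides as far as I know; the class name, the limit of 2 and the 10-minute window are all assumptions.

```python
from collections import deque


class FailoverGuard:
    """Refuse further failovers once too many happen inside a time window."""

    def __init__(self, max_failovers=2, window_seconds=600):
        self.max_failovers = max_failovers
        self.window_seconds = window_seconds
        self._events = deque()  # timestamps of recent failovers

    def allow_failover(self, now):
        # Forget failovers that are older than the window.
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()
        if len(self._events) >= self.max_failovers:
            return False  # stop bouncing; leave it for manual repair
        self._events.append(now)
        return True


guard = FailoverGuard(max_failovers=2, window_seconds=600)
print(guard.allow_failover(0))    # True: first failover
print(guard.allow_failover(60))   # True: second failover
print(guard.allow_failover(120))  # False: third within 10 minutes is refused
```

Timestamps are passed in as plain seconds so the logic is easy to test; a real script would use the system clock and keep the history somewhere both nodes can see.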

Another problem we've encountered is when an instance crashes in hot backup mode, though this has little to do with HA. The instance fails over and can't be opened because of the datafiles in backup mode. But the HA software doesn't recognize that the instance is down, because the pmon process is still running when the instance is left in the mount stage. I guess that's not necessarily a bad thing though, because if it did detect that the instance was down, we'd start playing ping-pong again. We ended up modifying the HA Oracle startup script to always check for files left in backup mode and alter them out of backup mode before attempting to open the database.
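A rough sketch of that check, in Python rather than our actual startup script: given (file name, status) rows such as a join of V$BACKUP and V$DATAFILE would return while the database is mounted, build the END BACKUP statements to run before the open. The function name and the file paths are made up for illustration, and running the statements is left out.

```python
def end_backup_statements(rows):
    """Build ALTER DATABASE statements for datafiles still in backup mode.

    rows: iterable of (datafile_name, backup_status) pairs, e.g. from
          SELECT d.name, b.status
            FROM v$backup b, v$datafile d
           WHERE b.file# = d.file#
    """
    statements = []
    for name, status in rows:
        if status.upper() == "ACTIVE":  # file was left in hot backup mode
            quoted = name.replace("'", "''")  # escape quotes for SQL
            statements.append(
                "ALTER DATABASE DATAFILE '%s' END BACKUP;" % quoted
            )
    return statements


# Hypothetical datafile paths, for illustration only.
rows = [
    ("/u01/oradata/MC/system01.dbf", "NOT ACTIVE"),
    ("/u01/oradata/MC/users01.dbf", "ACTIVE"),
]
for stmt in end_backup_statements(rows):
    print(stmt)
# ALTER DATABASE DATAFILE '/u01/oradata/MC/users01.dbf' END BACKUP;
```

The startup script would run these after STARTUP MOUNT and only then issue ALTER DATABASE OPEN.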

  Adam Turner <ATurner@concretemedia.com> wrote:

Anybody have any good stories - or horror stories - about Oracle HA (or Veritas's
high availability software) for Oracle 8.1.6 (64-bit Sun)?

Anything would be helpful. I have 100+ pages of white papers and notes to
read, but real-world stories work so much better.


thanks!

adam
--
Author: Adam Turner
INET: ATurner@concretemedia.com
