RE: HP Open view

From: <oracle_at_dunbar-it.co.uk>
Date: Wed, 22 Aug 2007 15:18:02 +0200
Message-Id: <21244535.208751187788682834.JavaMail.servlet@kundenserver>

Hi Jack,

>> ... does anyone have experience with HP Open view
>> and alerts/monitoring of oracle database environments?

Yes, and I hate it.

>> Any horror stories, gotchas etc etc?

How long have you got? :o)

Seriously, it works - as long as things are running ok. We have a 6 node HP cluster with agents running on each node. There are 'many' oracle databases running on the 6 nodes and OV monitors them all.

From what I've been told by the OV admins, and seen, when we fail over a node manually, OV writes to a config file on that node to say that 'database x is no longer running here' and on the node we have failed it over onto, the agent there writes to its config file to say that 'database x is now running here'. Monitoring continues ok.

BUT, if a node crashes, OV cannot write its config file on the failed node - because it has failed and the agent is long gone. The new node that receives the failed package(s) seems to be able to update its config and monitors the newly added databases.

However, the failed node is no longer communicating with the 'management' server and so we get alerted to say that the databases that were running on the failed node are all down when they are actually up and running on some of the other nodes.
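
To make the difference between the two failover cases concrete, here is a toy model of the behaviour described above. It is only a sketch of what I have been told and seen, not how OV is actually implemented: the per-node config is modelled as a set of database names, and fail_over, alerts and the node and database names are all hypothetical.

```python
# Toy model of per-node agent config files. All names are hypothetical;
# this illustrates the described behaviour, not OV's real implementation.

def fail_over(configs, db, source, target, source_alive):
    """Move db from source to target, updating each node's agent config."""
    configs[target].add(db)          # the receiving agent records the new home
    if source_alive:
        configs[source].discard(db)  # manual failover: old node is updated too
    # Crashed node: its config cannot be written, so the stale entry stays.

def alerts(configs, reachable):
    """Management-server view: report every database that is listed
    on a node which is no longer communicating."""
    return sorted(db for node, dbs in configs.items()
                  if node not in reachable
                  for db in dbs)

# Manual failover: both configs are updated and both nodes still talk.
manual = {"node1": {"dbx"}, "node2": set()}
fail_over(manual, "dbx", "node1", "node2", source_alive=True)
manual_alerts = alerts(manual, reachable={"node1", "node2"})

# Crash: node1 keeps its stale entry and stops communicating.
crash = {"node1": {"dbx"}, "node2": set()}
fail_over(crash, "dbx", "node1", "node2", source_alive=False)
crash_alerts = alerts(crash, reachable={"node2"})
```

In the crash case dbx is listed on node2 and is being monitored there, yet it is still reported down because of the stale entry on node1.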

OV doesn't seem to be able to use any form of TNS to track databases. The agent config tells it where a database is, and heaven forbid that anyone or anything changes the TNS information without first updating OV.

When the Oracle Names service stops on one of our cluster nodes, OV doesn't notice because it doesn't check that the service is actually *working*, only that it is *running*.

Now, all of this may be config problems, but I'm seriously a non-believer there. If it was config, I suspect that it would have been fixed ages ago because I continually rant about it alerting when it should not be alerting.

I have a backup system, Nagios, which uses TNS/LDAP/whatever to follow databases around and it never has problems when a node goes down. It also knows when an Oracle Names service is no longer working properly because it *checks*. :o)
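
The *running* versus *working* distinction can be sketched in a few lines of Python. This is not the actual Nagios plugin or anything OV does, just an illustration under some assumptions: the service listens on a TCP port, the request bytes are a placeholder rather than the real Oracle Names protocol, and is_running and is_working are hypothetical names.

```python
import socket

def is_running(host, port, timeout=2.0):
    """Liveness only: can a TCP connection to the listener be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def is_working(host, port, request, timeout=2.0):
    """Functional check: send a request and require a reply within timeout.

    The request is a placeholder; a real check would send something the
    service is supposed to answer, such as a name lookup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.settimeout(timeout)
            s.sendall(request)
            return bool(s.recv(1))   # empty read means the peer hung up
    except OSError:                  # refused, unreachable or timed out
        return False
```

A service that accepts connections but never answers passes the first check and fails the second, which is exactly the case that goes unnoticed when only the running state is checked.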

I have had it explained to me that the OV information is only asserting the running, or otherwise, of the OV agent on the remote servers and not the actual state of the packages, databases, WebLogic, Apache and so on.

Good luck!

Cheers,
Norman.

--
http://www.freelists.org/webpage/oracle-l

Received on Wed Aug 22 2007 - 08:18:02 CDT