RE: EM12c incidents when restarting OMS

From: Garry Chen <gc92_at_cornell.edu>
Date: Fri, 20 Jun 2014 18:14:06 +0000
Message-ID: <144bdad763e542d9a792e2281cce4000_at_BLUPR04MB738.namprd04.prod.outlook.com>



I see the same behavior that Mike posted. When my OMS (12.1.0.3) restarts, we get the "Agent is unable to communicate with the OMS" critical event, which should be the correct reaction. I think the solution is to start a blackout before the shutdown and end it after the start. You should be able to write a script to do that.
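A minimal sketch of such a script, in Python, assuming the agent-side `emctl start blackout` / `emctl stop blackout` commands and the OMS-side `emctl stop oms` / `emctl start oms` commands are used. The install paths, the blackout name, and the choice of a node-level blackout on the OMS host are all assumptions for illustration; the thread does not say which targets the blackout should cover, so the scope may need to be wider in practice.

```python
import subprocess
from typing import Callable, List, Optional

# Hypothetical locations and name; adjust to the local installation.
OMS_EMCTL = "/u01/app/oracle/middleware/oms/bin/emctl"
AGENT_EMCTL = "/u01/app/oracle/agent/agent_inst/bin/emctl"
BLACKOUT_NAME = "oms_bounce"


def blackout_bounce_plan(
    oms_emctl: str = OMS_EMCTL,
    agent_emctl: str = AGENT_EMCTL,
    name: str = BLACKOUT_NAME,
) -> List[List[str]]:
    """Return the commands, in order, for an OMS restart wrapped in a blackout."""
    return [
        # Assumption: a node-level blackout started from the local agent.
        [agent_emctl, "start", "blackout", name, "-nodeLevel"],
        [oms_emctl, "stop", "oms"],
        [oms_emctl, "start", "oms"],
        [agent_emctl, "stop", "blackout", name],
    ]


def run_plan(
    plan: List[List[str]],
    runner: Optional[Callable[[List[str]], object]] = None,
) -> None:
    """Run each command in order; the default runner stops at the first failure."""
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd, check=True)
    for cmd in plan:
        runner(cmd)


if __name__ == "__main__":
    # Print the plan instead of executing it, so it can be reviewed first.
    for cmd in blackout_bounce_plan():
        print(" ".join(cmd))
```

Calling `run_plan(blackout_bounce_plan())` would execute the four steps for real. Passing a custom `runner` (for example `print`) allows a rehearsal without touching the OMS.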

Garry

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Brian Pardy
Sent: Friday, June 20, 2014 1:56 PM
To: Michael Schmitt; ORACLE-L
Subject: RE: EM12c incidents when restarting OMS

Very interesting. I'm running 12.1.0.4, but I don't think the version explains the different behavior I see.

I've just tried several tests and I can't replicate this behavior. I only receive alerts for targets on the OMS host.

By subscribing to every metric alert event and every incident create/change and every out-of-the-box rule, I received about 150 email alerts from one OMS bounce, but every single alert was for a target running on the OMS host or monitored by the central agent. Even with all that enabled, I did not receive even one alert for a target on a separate managed host. This bothers me because I remember having the same problem you described and now I don't know what fixed it.

If there's a path to solving this, it probably involves taking a look at one of your notifications to identify the exact incident rule triggering the alert, then studying the exact criteria that you have on that rule. I wonder if there's a bug on your side causing too many notifications or a bug on mine keeping them from going out.

From: Michael Schmitt [mailto:mschmitt_at_uchicago.edu]
Sent: Friday, June 20, 2014 12:13 PM
To: Brian Pardy; ORACLE-L
Subject: RE: EM12c incidents when restarting OMS

Hi Brian,

We are on 12.1.0.3, which I think is the latest version, but I could be wrong about that. I had already completed steps 1 and 2. It is step 2 that is firing off all the incidents for us.

The alert messages all seem to be tied to the agents (not the agent on the OMS). They fire "Agent is unable to communicate with the OMS. (REASON = Agent is Unreachable)". That fires for all the servers we have agents on. Then we get a flood of target incidents (agent, host, listener, instance), which we have set to trigger on an Agent Unreachable availability check. Once we start the OMS back up, everything clears out.

I will check that MOS note you referenced.

Thanks,
Mike

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Brian Pardy
Sent: Friday, June 20, 2014 10:22 AM
To: ORACLE-L
Subject: RE: EM12c incidents when restarting OMS

Hi Mike,

I used to have a similar issue. Could you please confirm which version/release of EM12c you use?

My fix for this was as follows:

  1. Unsubscribe from the out-of-the-box incident rules in EM12c. The out-of-the-box rules (at least on EM12c R1; I have not checked recently) include notifications for EM12c components (WebLogic domains, servers and so on) that I do not wish to receive.
  2. Create custom incident rules for all the targets, target types and incident categories for which you DO wish to be notified (e.g. Database System, Listener, MySQL Instance, Agent) and subscribe to these instead.
  3. Configure OOB monitoring for the oracle_emrep OMS and Repository target, per MOS note 1472854.1.

I haven't seen an actual crash in my EM12c environment for which I would have needed notification since R1+BP1. Using this setup, and a script that shuts down my central agent along with the OMS when I need to bounce EM12c, I do not receive any notifications at all from a planned bounce. I still monitor the repository database target just like my other non-OEM databases, but I do not monitor any of the internal EM12c components. This may be more complicated if your environment includes other WebLogic targets (mine doesn't, so I ignore them all), but you can create target groups to include/exclude specific targets from notifications as needed.
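For illustration only, here is a rough Python sketch of a bounce script of the kind described above: one that stops the central agent together with the OMS and brings both back afterwards. It is not the actual script. The ordering (agent down first, agent up last), the function names, and the `emctl` paths in the example call are assumptions. It defaults to a dry run that only returns the command lines.

```python
import shlex
import subprocess
from typing import Callable, List


def central_bounce_commands(oms_emctl: str, agent_emctl: str) -> List[List[str]]:
    """Commands for a planned bounce of the OMS and the central agent.

    Assumed order: stop the central agent before the OMS so it is not
    running while the OMS is down, and start it again once the OMS is back.
    """
    return [
        [agent_emctl, "stop", "agent"],
        [oms_emctl, "stop", "oms"],
        [oms_emctl, "start", "oms"],
        [agent_emctl, "start", "agent"],
    ]


def execute(
    commands: List[List[str]],
    dry_run: bool = True,
    runner: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Return the shell-quoted command lines; also run them unless dry_run."""
    lines = []
    for cmd in commands:
        lines.append(shlex.join(cmd))
        if not dry_run:
            # check=True aborts the sequence at the first failing command.
            runner(cmd, check=True)
    return lines


if __name__ == "__main__":
    # Hypothetical paths; replace with the real OMS and agent homes.
    for line in execute(
        central_bounce_commands(
            "/u01/app/oracle/middleware/oms/bin/emctl",
            "/u01/app/oracle/agent/agent_inst/bin/emctl",
        )
    ):
        print(line)
```

Running with `dry_run=False` executes the commands. Keeping the dry run as the default makes it harder to bounce a production OMS by accident.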

Original message:

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Michael Schmitt

Hello,

Does anyone else who uses 12c Cloud Control have the issue where you get a ton of incidents firing off if you shut down the OMS?

We have incident notifications set up for the availability of targets, and whenever we try to shut down the OMS we get a ton of pages and emails as a result. I was thinking there has to be a way to prevent them from kicking off, but I haven't found a good answer yet.

Does anyone know a command or workaround for this?

Thanks in advance for the help.
Mike

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Jun 20 2014 - 20:14:06 CEST
