Re: help me find out why instance died

From: chao_ping <member_at_dbforums.com>
Date: Sun, 22 Dec 2002 16:35:41 +0000
Message-ID: <2310633.1040574941_at_dbforums.com>


Hi, Tim:
  Thanks for your suggestion. I checked that article, but I am still unable to solve the problem.
  At the same time the next day, another instance in the cluster died for the same reason: ORA-29740, again with reason 2. The cluster had run quite stably over the past month (the patchset was installed only about 30 days ago).
When I checked the Linux /var/log/messages, I found that syslogd restarted on both nodes at exactly the same time on both days when a RAC instance died. Could there be some relation between the two? The operating system did not reboot; I checked the uptime value.
According to the trace files, the dead instance failed to send its heartbeat.
First day, from the surviving instance rac1:

>
>
> *** 2002-12-21 04:01:54.227
> kjxgrnbrisalive: (1, 2) not beating, HB: 479418910, 479418910
> *** 2002-12-21 04:01:54.239
> kjxgrnbrdead: Detected death of 1, initiating reconfig
> kjxgrrcfgchk: Initiating reconfig, reason 2
> *** 2002-12-21 04:01:59.256
> kjxgmrcfg: Reconfiguration started, reason 2
> kjxgmcs: Setting state to 6 0.
> *** 2002-12-21 04:01:59.258
> Name Service frozen
> kjxgmcs: Setting state to 6 1.
>

Second day, from the trace file of the surviving instance rac2:

>
>
> *** 2002-12-22 04:01:56.457
> kjxgrnbrisalive: (0, 1) not beating, HB: 479438832, 479438832
> *** 2002-12-22 04:01:56.457
> kjxgrnbrdead: Detected death of 0, initiating reconfig
> kjxgrrcfgchk: Initiating reconfig, reason 2
> *** 2002-12-22 04:02:01.486
> kjxgmrcfg: Reconfiguration started, reason 2
> kjxgmcs: Setting state to 9 0.
> *** 2002-12-22 04:02:01.495
> Name Service frozen
>
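
To line the trace up against the syslogd restart times in /var/log/messages, the "not beating" entries can be pulled out together with their timestamps. This is only a rough sketch in Python: the function name `missed_heartbeats` and both regular expressions are my own assumptions, based on nothing more than the two excerpts above, and other trace formats may differ.

```python
import re
from datetime import datetime

# Timestamp lines in the excerpts look like: *** 2002-12-21 04:01:54.227
TS_RE = re.compile(r"\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")

# Missed-heartbeat lines in the excerpts look like:
# kjxgrnbrisalive: (1, 2) not beating, HB: 479418910, 479418910
HB_RE = re.compile(
    r"kjxgrnbrisalive: \((\d+), (\d+)\) not beating, HB: (\d+), (\d+)"
)


def missed_heartbeats(trace_text):
    """Return one (timestamp, id_a, id_b, hb_a, hb_b) tuple per 'not beating' line.

    The timestamp is taken from the most recent '***' line above the entry,
    or None if no timestamp line has been seen yet.
    """
    events = []
    current_ts = None
    for line in trace_text.splitlines():
        ts = TS_RE.search(line)
        if ts:
            current_ts = datetime.strptime(ts.group(1), "%Y-%m-%d %H:%M:%S.%f")
            continue
        hb = HB_RE.search(line)
        if hb:
            id_a, id_b, hb_a, hb_b = (int(g) for g in hb.groups())
            events.append((current_ts, id_a, id_b, hb_a, hb_b))
    return events
```

Run over both trace files, this gives the exact times (04:01:54 and 04:01:56) to compare with the syslogd restart entries. In both excerpts the two HB values on the line are identical.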

I wonder if anyone here has experience dealing with RAC systems. What should I check to find out why the RAC instance failed to update the controlfile? I have already enabled this event in one instance:
event="29740 trace name errorstack level 3"
Should I also set the undocumented parameter _imr_active=false in the system?

--
Posted via http://dbforums.com
Received on Sun Dec 22 2002 - 17:35:41 CET
