Re: watchdog failure at same time every week

From: Matthew D. Bennett <mbennett_at_utah-inter.net>
Date: Sat, 03 Nov 2001 23:16:49 -0700
Message-ID: <3BE4DD3F.7A3C4B7C_at_utah-inter.net>


Mike,

We have been running 9i RAC since July and have never seen this problem. It sounds like something is misconfigured. The whole point of 9i RAC is that a node can shut down while the system keeps functioning, without loss of transactions or downtime.

Are you running a certified hardware configuration? How is your disk storage connected?
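
It may also help to measure how late the ping actually was in the wdd.log entry quoted below. Here is a minimal Python sketch, assuming the expiry and now fields are epoch seconds followed by a comma and milliseconds. That reading is a guess from the log text, not documented watchdog behavior, and parse_late_ping is a made-up helper name.

```python
import re
from datetime import datetime, timezone

# ASSUMPTION: "expiry=S,F, now=S,F" is epoch seconds S plus a
# millisecond fraction F. This is inferred from the log text only.
LATE_PING = re.compile(r"expiry=(\d+),(\d+),\s*now=(\d+),(\d+)")

def parse_late_ping(text):
    """Return (lateness in ms, time of the check in UTC), or None."""
    m = LATE_PING.search(text)
    if m is None:
        return None
    exp_s, exp_ms, now_s, now_ms = (int(g) for g in m.groups())
    # Difference in whole seconds plus difference in the fractions.
    late_ms = (now_s - exp_s) * 1000 + (now_ms - exp_ms)
    when = datetime.fromtimestamp(now_s, tz=timezone.utc)
    return late_ms, when

# The two wrapped log lines from wdd.log, joined into one string.
line = ("wddScanClients: fatal: client (name=cl_sock_714_5125) "
        "ping came too late (expiry=1004245303,764, now=1004245304,208).")
late_ms, when = parse_late_ping(line)
print(late_ms, "ms late at", when.isoformat())
```

Under that assumption the ping missed its deadline by 444 ms, and the seconds value decodes to 05:01:44 UTC on Oct 28 2001, which matches the UTC line the watchdog printed. A miss that small looks more like the node stalling briefly than the client dying outright, but that is only a reading of one log entry.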

Good luck.

Matt.

Mike F wrote:
>
> We are testing Oracle 9i RAC. Every Sunday morning at 0102 - 0104 the
> watchdog fails to communicate and shuts down one node and then the
> whole database. The machines then reboot. This happens without fail.
> It happened on RedHat 7.1 and it happens on SUSE 7.2. Our SAs are not
> aware of any crontab jobs running at that time. Any help would be
> appreciated.
>
> NODE 2
> wdd.log
> UTC: Wed Oct 24 19:00:07 GMT 2001 (746103)
> wddProcRegisterPacket: info: registered client
> name = /tmp/.watchdog/cl_sock_788_15373,
> pid = 788,
> tid = 15373,
> margin = 5000,
> level = 1,
> option = 0,
> description = ClientProcListen.
> Time: Wed Oct 24 15:00:07 EDT 2001 (746170)
> UTC: Wed Oct 24 19:00:07 GMT 2001 (746170)
> wddSendRegisterReply: info: sent register ack to client.
> Time: Sun Oct 28 01:01:44 EDT 2001 (207767)
> UTC: Sun Oct 28 05:01:44 GMT 2001 (207767)
> wddScanClients: fatal: client (name=cl_sock_714_5125) ping came too late
> (expiry=1004245303,764, now=1004245304,208).
> wddPerformWatch: fatal: at least one client is late in checking in.
> Time: Sun Oct 28 01:01:44 EDT 2001 (208071)
> UTC: Sun Oct 28 05:01:44 GMT 2001 (208071)
> Shutting down the entire node...
>
> nm.log
>
> | WARNING | ClusterListener (pid=708, tid=1026): WatchdogPing failed
> (rc=12).
> Sun Oct 28 01:02:36 2001
> | WARNING | ClusterListener (pid=708, tid=1026): WatchdogPing failed
> (rc=12).
> Sun Oct 28 01:02:37 2001
> | WARNING | ClusterListener (pid=708, tid=1026): WatchdogPing failed
> (rc=12).
> Sun Oct 28 01:02:37 2001
> | WARNING | ClusterListener (pid=708, tid=1026): WatchdogPing failed
> (rc=12).
> Sun Oct 28 01:02:38 2001
> | WARNING | ClusterListener (pid=708, tid=1026): WatchdogPing failed
> (rc=12).
> Sun Oct 28 01:02:38 2001
> | WARNING | ClusterListener (pid=708, tid=1026): WatchdogPing failed
> (rc=12).
> Sun Oct 28 01:02:39 2001
> | WARNING | ClusterListener (pid=708, tid=1026): WatchdogPing failed
> (rc=12).
>
> cm.log
>
> | WARNING | 340b | ClientProcListener (pid=784, tid=13323):
> WatchdogPing failed (rc=12).
> Sun Oct 28 01:02:39 2001
> | WARNING | 2006 | ClientProcListener (pid=779, tid=8198):
> WatchdogPing failed (rc=12).
> Sun Oct 28 01:02:39 2001
> | WARNING | 2c09 | ClientProcListener (pid=782, tid=11273):
> WatchdogPing failed (rc=12).
> Sun Oct 28 01:02:39 2001
>
>
>
> --
> Sent by dbadba62 from hotmail within area com
Received on Sun Nov 04 2001 - 07:16:49 CET
