RAC Full cluster outage (almost)
Date: Wed, 11 Mar 2009 16:35:41 +0100
A couple of days ago, one of my customers faced an almost full cluster outage in a 2-node 10.2.0.4 RAC on Sun Solaris 10 SPARC (full Oracle stack).
The sequence was as follows:
- node 2 lost private network, interface went down
- node 1 evicted node 2 (as expected)
- node 1 then evicted itself
- after node 1 rejoined and the cluster reformed from one node to two nodes, node 2 lost its private network again, and this time the eviction occurred on node 2
So it was not really a full cluster outage, but the evictions occurred one after another, so it looked like a full outage to the users.
What puzzles me is that in a 2-node cluster node 1 should always survive, which was not the case here. My only theory is that node 2 was so ill that it could not reboot the server, so node 1 then evicted itself to avoid corruption.
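
To make that expectation concrete, here is a minimal Python sketch of the split-brain tie-break as I understand it: the largest sub-cluster survives, and when the sizes are equal the sub-cluster holding the lowest node number wins. The function name and the data layout are invented for illustration; this is not Clusterware code, only the rule of thumb I am assuming.

```python
def surviving_subcluster(subclusters):
    """Return the sub-cluster expected to survive a private-network split.

    Assumed rule of thumb (not taken from Clusterware source):
      1. the sub-cluster with the most nodes wins;
      2. on a tie, the sub-cluster containing the lowest node number wins.
    Each sub-cluster is a set of integer node numbers.
    """
    return max(subclusters, key=lambda nodes: (len(nodes), -min(nodes)))


# Two-node cluster whose interconnect is cut: each node is alone.
split = [{1}, {2}]
print(surviving_subcluster(split))  # node 1 is the expected survivor
```

Under this rule node 1 should have stayed up after the first interconnect failure, which is exactly what did not happen.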
Any more ideas?
http://www.freelists.org/webpage/oracle-l Received on Wed Mar 11 2009 - 10:35:41 CDT