
Re: Oracle CRS and Split Brain

From: Naqi Mirza <naqimirza_at_yahoo.com>
Date: Wed, 28 Mar 2007 03:24:48 -0700 (PDT)
Message-ID: <486461.65059.qm@web32412.mail.mud.yahoo.com>


My basic understanding:

When the master node loses its private network, the surviving node becomes the master and the cluster is reconfigured: the old master is ejected from the cluster configuration and rebooted. The following can be seen in the new master's crsd.log file (this is for the CRS component of Oracle Clusterware, which is responsible for managing Oracle resources):

I AM THE NEW OCR MASTER at incar 6. Node Number = 1

Followed by:

Processing member leave for node1, incarnation: 7
Do failover for: node1

I've excluded the time portion that is also displayed on the same lines.

In addition, if you look at the alert log for the surviving instance, you will see a reconfiguration taking place:

Reconfiguration started (old inc 24, new inc 26)
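
If you want to pull the incarnation numbers out of the alert log rather than eyeball them, something like the following would do. This is only a rough sketch: the pattern is built from the single line quoted above, the exact wording may differ between Oracle versions, and the function name parse_reconfig is my own.

```python
import re

# Pattern based on the single alert-log line quoted in this message.
# Assumption: other Oracle versions may word this line differently.
RECONFIG_RE = re.compile(
    r"Reconfiguration started \(old inc (\d+), new inc (\d+)\)"
)


def parse_reconfig(line):
    """Return (old_inc, new_inc) as ints, or None if the line does not match."""
    m = RECONFIG_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


sample = "Reconfiguration started (old inc 24, new inc 26)"
print(parse_reconfig(sample))
```

For the sample line this gives the old and new incarnation as a pair of integers, which makes it easy to spot each reconfiguration when scanning a long alert log.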

You will also see the following in the new master node's alert<hostname>.log file, found under $ORA_CRS_HOME/log (let's assume this node is called node0):

CSSD evicting node node11. Details in /ORACLE_HOME/log/node0/cssd/ocssd.log.
CRS-1601:CSSD Reconfiguration complete. Active nodes are node0 .
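
To pick the evicted node and the remaining members out of these two messages, a small parser along these lines might help. It is a sketch built only from the two messages quoted above. I am assuming that several active nodes would be listed separated by spaces and that node names contain no dots; the function names are hypothetical.

```python
import re

# Both patterns are derived from the two messages quoted in this post.
# Assumptions: node names contain no dots, and multiple active nodes
# are separated by whitespace.
EVICT_RE = re.compile(r"CSSD evicting node (\S+?)\.")
ACTIVE_RE = re.compile(
    r"CRS-1601:\s*CSSD Reconfiguration complete\. "
    r"Active nodes are (.*?)\s*\.\s*$"
)


def evicted_node(line):
    """Return the evicted node name, or None if this is not an eviction line."""
    m = EVICT_RE.search(line)
    return m.group(1) if m else None


def active_nodes(line):
    """Return the list of active node names, or None if the line does not match."""
    m = ACTIVE_RE.search(line)
    return m.group(1).split() if m else None


evict_line = "CSSD evicting node node11. Details in /ORACLE_HOME/log/node0/cssd/ocssd.log."
active_line = "CRS-1601:CSSD Reconfiguration complete. Active nodes are node0 ."
print(evicted_node(evict_line), active_nodes(active_line))
```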

The cssd is the Cluster Synchronization Services daemon, responsible for node membership. You'll also find that the VIP of the ejected node is relocated to the surviving node, node0 in our case. Running the appropriate network command for your interfaces, you will find that the VIP has been relocated onto the same interface as your public interface. In the crsd.log file for the new master node you will see the start of this Oracle resource, the VIP.

Forgot to mention this earlier, but prior to the eviction, you will see messages such as:

clssnmPollingThread: node node1 (1) missed(590) checkin(s)
clssnmPollingThread: node node1 (1) missed(591) checkin(s)

These appear in the ocssd.log file for the node that has a problem with its private network (node1). The missed count (590 above) runs up until the misscount parameter is reached. In this particular scenario, since vendor clusterware is also being used, misscount is 600. Once this value is reached, a cluster reconfiguration commences to evict the node.
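
To make the countdown concrete, here is a rough Python sketch that reads one of these polling-thread lines and works out how many check-ins are left before misscount is hit. The default of 600 is simply the value from this particular setup; check your own misscount before relying on it. The function names are made up for illustration.

```python
import re

# Pattern based on the clssnmPollingThread lines quoted in this message.
MISSED_RE = re.compile(
    r"clssnmPollingThread: node (\S+) \((\d+)\) missed\((\d+)\) checkin\(s\)"
)


def missed_checkins(line):
    """Return (node_name, node_number, missed) or None if the line does not match."""
    m = MISSED_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))


def checkins_left(missed, misscount=600):
    """Check-ins remaining before eviction starts.

    Assumption: misscount=600 is the value from this thread's
    vendor-clusterware setup; yours may differ.
    """
    return max(misscount - missed, 0)


sample = "clssnmPollingThread: node node1 (1) missed(590) checkin(s)"
node, number, missed = missed_checkins(sample)
print(node, number, missed, checkins_left(missed))
```

With the 590 line above and a misscount of 600, that leaves 10 check-ins before the reconfiguration starts.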

Naqi

I wonder if anyone has expertise with clusters in general and has worked with Oracle Clusterware?

Does anyone know how CRS behaves in a two-node split-brain situation? For example, the master node loses its private network connection. I know voting disks are used, but what happens underneath?

Thanks

Alex                 



--
http://www.freelists.org/webpage/oracle-l
Received on Wed Mar 28 2007 - 05:24:48 CDT
