
Re: Oracle CRS and Split Brin

From: Naqi Mirza <naqimirza_at_yahoo.com>
Date: Wed, 28 Mar 2007 17:25:58 -0700 (PDT)
Message-ID: <426533.70644.qm@web32403.mail.mud.yahoo.com>

Here's something from Metalink document ID 294430.1:

The table below explains the conditions under which eviction will occur.

Network Ping                        Disk Ping                        Reboot
Takes more than misscount seconds   Completes in misscount seconds   Y

So the node that has lost its interconnect would have a network ping that eventually takes more than misscount seconds (this is logged in cssd.log), and it should therefore be evicted.
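
To make the quoted row concrete, here is a minimal Python sketch of that single condition. The function name and the 60-second value in the usage note are illustrative assumptions, not taken from the Metalink note; the sketch deliberately returns None for any combination the quoted excerpt does not cover.

```python
from typing import Optional


def should_reboot(network_ping_s: float, disk_ping_s: float,
                  misscount_s: float) -> Optional[bool]:
    """Encode the one table row quoted above.

    Returns True when the network ping takes more than misscount seconds
    while the disk ping completes within misscount seconds. Returns None
    for every other combination, because the quoted excerpt does not say
    what happens in those cases.
    """
    network_late = network_ping_s > misscount_s
    disk_on_time = disk_ping_s <= misscount_s
    if network_late and disk_on_time:
        return True
    return None  # not covered by the quoted row
```

For example, with an arbitrary misscount of 60 seconds, should_reboot(61, 1, 60) gives True, while should_reboot(1, 1, 60) gives None.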

naqi

----- Original Message ----
From: Naqi Mirza <naqimirza@yahoo.com>
To: amonte <ax.mount@gmail.com>; Kevin Closson <kevinc@polyserve.com>
Cc: oracle-l@freelists.org
Sent: Thursday, 29 March, 2007 4:55:15 AM
Subject: Re: Oracle CRS and Split Brin

node1 loses its interconnect - this is where I understand the misscount parameter comes into play. The Oracle cssd process checks the network and disk heartbeats. Misscount represents the maximum time that a heartbeat can be missed before a cluster reconfiguration is started to evict a node. So if node1 were to lose its interconnect, it should be evicted regardless of being the master, shouldn't it? This would leave one of the others to become the new master. As for which one becomes the master, I guess Kevin has already answered that.
When the evicted node is back in business (I assume you mean its interconnect is now fixed), a cluster reconfiguration should take place, adding that node back into the cluster.
I'm sure once Kevin has a glance over this he'll correct me where I'm wrong.
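
The misscount idea described above can be sketched as a simple timeout check. This is a generic illustration only: the class and method names are hypothetical, and nothing here reflects how cssd is actually implemented.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Track the last heartbeat time per node and flag overdue nodes."""

    def __init__(self, misscount_s: float) -> None:
        # Maximum time a heartbeat may be missed (hypothetical units: seconds).
        self.misscount_s = misscount_s
        self.last_seen: Dict[str, float] = {}

    def record(self, node: str, now: float) -> None:
        """Note that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def overdue(self, now: float) -> List[str]:
        """Return nodes whose last heartbeat is older than misscount."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.misscount_s
        )
```

The caller passes the current time in explicitly, which keeps the sketch easy to reason about: a node that last reported at t=0 is overdue at t=40 when misscount is 30, and a node that reported at t=25 is not.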

Naqi

----- Original Message ----
From: amonte <ax.mount@gmail.com>
To: Kevin Closson <kevinc@polyserve.com>
Cc: naqimirza@yahoo.com; oracle-l@freelists.org
Sent: Thursday, 29 March, 2007 4:32:02 AM
Subject: Re: Oracle CRS and Split Brin

That is what I mean, Kevin: how does a node know which one will be evicted?

For instance, if node 1 (the lower node) loses its interconnect, what happens? Will the other(s) be evicted? What happens if the evicted node(s) come back into service? Since they cannot contact node 1 over the network (node 1 lost its private network), what will happen?

How does the voting disk help to determine split brain?

Thanks

Alex

On 3/28/07, Kevin Closson <kevinc@polyserve.com> wrote:

When the master node loses its private network, the surviving node becomes the master and the cluster is reconfigured - the old master is ejected from the cluster configuration and rebooted. The following can be seen in its crsd.log file (this is for the CRS component of Oracle Clusterware, which is responsible for managing Oracle resources):
.
I AM THE NEW OCR MASTER at incar 6. Node Number = 1
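
If you want to pick such a line out of a log programmatically, a small Python sketch follows. The pattern is inferred from the single line shown above, so treat the format as an assumption; the function name is hypothetical, and other CRS versions may log this differently.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the one sample line in this thread (an assumption).
MASTER_LINE = re.compile(
    r"I AM THE NEW OCR MASTER at incar (\d+)\. Node Number = (\d+)"
)


def parse_new_master(line: str) -> Optional[Tuple[int, int]]:
    """Return (incarnation, node_number) if the line matches, else None."""
    match = MASTER_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Applied to the line above, this returns (6, 1); any line that does not match yields None.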


..[…lots good CRS stuff deleted…]

…This was a very good follow-up, but the question was about split brain. Split brain is when there is an equal number of "survivors" and both sides "think" they are the sole survivor. I think the original post was asking how Oracle determines who gets to anoint themselves the new master in a split-brain scenario. I have not seen the full algorithm Oracle uses documented anywhere on the net, so if someone has, please let us know.

There are a lot of cluster implementations out there. One common approach is to maintain knowledge of the IP addresses of members and use the lowest-IP node as one of the factors in choosing the winner in a split-brain scenario. That is not how CRS does it, though, as has become evident in a thread I've had with a reader of my blog. In his 2-node case his CRS master was also the lowest IP, and in a meltdown scenario the other node was chosen as the sole survivor. That really surprised me.
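
For reference, the generic lowest-IP tie-break mentioned above could look like the Python sketch below. As noted, this is not how CRS picks a survivor; the function name is hypothetical, and a real implementation would combine this with other factors.

```python
import ipaddress
from typing import Iterable


def lowest_ip_survivor(member_ips: Iterable[str]) -> str:
    """Pick the member with the numerically lowest IP address.

    Addresses are compared as numbers rather than strings, so
    10.0.0.3 sorts below 10.0.0.12. All addresses must be of the
    same IP version, since mixing IPv4 and IPv6 raises TypeError.
    """
    return str(min(ipaddress.ip_address(ip) for ip in member_ips))
```

With members 10.0.0.12 and 10.0.0.3, the function returns 10.0.0.3, whereas a plain string comparison would have picked 10.0.0.12.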


I think all I've said is Oracle is not telling us what the full algorithm is for survivorship in a true split-brain scenario.


There are some clusterware topics here: 
http://kevinclosson.wordpress.com/kevin-closson-index/real-application-clusters-related-topics/ 

Such as 
http://kevinclosson.wordpress.com/2007/01/10/comparing-10201-and-10203-linux-rac-fencing-also-fencing-failures-split-brain/

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Mar 28 2007 - 19:25:58 CDT
