Re: data guard fast start failover

From: <Laimutis.Nedzinskas_at_seb.lt>
Date: Mon, 19 Jan 2009 09:40:36 +0200
Message-ID: <OF0D2BEFF1.071DD009-ONC2257543.0029EBF5-C2257543.002A2BFB_at_seb.lt>



>I think the reason for this is that if the Observer detects that it can't communicate with the primary but can still communicate with the standby, it will initiate a failover (assuming it gets confirmation from the standby that it is synchronised and also cannot communicate with the primary).

In my case the primary killed itself because it lost communication with BOTH the observer and the standby. The primary then assumes that, since FSF is enabled, the observer and/or the standby will attempt a failover. That makes sense, but it's kind of dangerous.

brgds, Laimis

                                                                           
From: Ian Cary <ian.cary_at_ons.gsi.gov.uk>
Date: 2009.01.16 16:24
To: Laimutis.Nedzinskas_at_seb.lt
cc: oracle-l_at_freelists.org, oracle-l-bounce_at_freelists.org
Subject: Re: data guard fast start failover

I think the reason for this is that if the Observer detects that it can't communicate with the primary but can still communicate with the standby, it will initiate a failover (assuming it gets confirmation from the standby that it is synchronised and also cannot communicate with the primary).

Normally this will have occurred because the primary has already died for one reason or another, so there wouldn't be anything to worry about. However, if the original primary does happen to still be alive, you would want it to kill itself. It is fair to assume that the observer would be in the process of failing over the old standby to be the new primary, which may cause a split brain if the old primary doesn't abort itself.
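
The two rules discussed in this thread can be sketched as a small truth-table model. This is only an illustration of the behaviour as described in these messages, not Oracle's implementation: the `Links` class and both function names are hypothetical, and real fast-start failover also involves timeouts and thresholds that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Links:
    """Which network paths are up (hypothetical model, not an Oracle API)."""
    observer_primary: bool
    observer_standby: bool
    primary_standby: bool


def observer_should_fail_over(links: Links, standby_synchronised: bool) -> bool:
    """Observer-side rule as described in the thread."""
    return (
        not links.observer_primary      # observer cannot reach the primary
        and links.observer_standby      # but can still reach the standby
        and standby_synchronised        # standby confirms it is synchronised
        and not links.primary_standby   # standby cannot reach the primary either
    )


def primary_should_abort(links: Links) -> bool:
    """Primary-side rule: the primary is cut off from BOTH observer and standby."""
    return not links.observer_primary and not links.primary_standby


# The case reported at the top of the thread: every link is down.
isolated = Links(observer_primary=False, observer_standby=False, primary_standby=False)
```

In this model, every situation in which the observer fails over is also one in which the primary aborts itself, so two primaries never coexist. The reverse does not hold: for `isolated`, `primary_should_abort` is true while `observer_should_fail_over` is false, which is the "primary down, nobody fails over" outcome that the top message calls dangerous.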

Cheers,

Ian

From: Laimutis.Nedzinskas_at_seb.lt
Sent by: oracle-l-bounce_at_freelists.org
Date: 16/01/2009 13:08
Please respond to: Laimutis.Nedzinskas
To: Ian Cary/ONS_at_ONS
cc: oracle-l_at_freelists.org
Subject: Re: data guard fast start failover

Thank you for your comments.

What is interesting to me is that the primary kills itself if it loses contact with both the observer and the standby. Apparently this is done to avoid a split brain.
I am just not sure this is desired behaviour....

From: Ian Cary <ian.cary_at_ons.gsi.gov.uk>
Date: 2009.01.16 14:40
To: Laimutis.Nedzinskas_at_seb.lt
cc: oracle-l_at_freelists.org, oracle-l-bounce_at_freelists.org
Subject: Re: data guard fast start failover

Hi Laimis,

I've just implemented this on three 10.2.0.3 systems here and it all seems fairly straightforward. It's early days, so it's probably too soon to say whether there are any issues or not, but everything seems to be working OK so far.

Testing the failover by aborting the original primary worked smoothly and took around 4 seconds. The observer also reinstates the original primary to be the new secondary quite happily when it is remounted. End user connections are also seamlessly transitioned to the new primary without any need for manual intervention.

When you say split brain I assume you are thinking of a circumstance where both instances believe they are the primary and also have active services allowing users to connect. The documentation states that automatic fast-start failover never allows there to be more than one primary and my testing seemed to bear this out.

The synchronous nature of the log shipping can have an impact on primary performance: a primary commit won't complete until the redo information has been accepted by the standby. It is therefore pretty important to ensure that the network speed between the servers is good, and also that I/O speed on the secondary doesn't cause a delay in copying the log files. Other than that there should be no impact on normal activities.
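
As a rough way to reason about that commit cost, here is a back-of-the-envelope sketch. It assumes the local redo write and the shipping to the standby overlap, so the commit waits for the slower of the two paths. That overlap is an assumption of this sketch (if the two steps were serial you would add them instead), and the function name and example figures are made up for illustration.

```python
def sync_commit_latency_ms(local_write_ms: float,
                           network_rtt_ms: float,
                           standby_write_ms: float) -> float:
    """Estimate commit latency with synchronous redo shipping.

    Assumption: the local write and the remote path (network round trip
    plus the standby's redo write) run in parallel, so the commit
    completes when the slower of the two has finished.
    """
    remote_path_ms = network_rtt_ms + standby_write_ms
    return max(local_write_ms, remote_path_ms)


# Made-up figures: 2 ms local write, 1 ms round trip, 3 ms standby write.
estimate = sync_commit_latency_ms(2.0, 1.0, 3.0)  # 4.0
```

With these figures the remote path dominates, so a slow network or slow standby I/O shows up directly in commit time, which is the point made above.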

Hope this helps,

Cheers,

Ian

From: Laimutis.Nedzinskas_at_seb.lt
Sent by: oracle-l-bounce_at_freelists.org
Date: 16/01/2009 07:27
Please respond to: Laimutis.Nedzinskas
To: oracle-l_at_freelists.org
cc:
Subject: data guard fast start failover

Hi all

Is anyone using Data Guard fast-start failover? What are your experiences?
What about split brain?
Does it interfere heavily with normal database activities? Any other comments?

Thank you in advance,

Laimis N

--
http://www.freelists.org/webpage/oracle-l



This email was received from the INTERNET and scanned by the Government
Secure Intranet anti-virus service supplied by Cable&Wireless in
partnership with MessageLabs. (CCTM Certificate Number 2007/11/0032.) In
case of problems, please call your organisation’s IT Helpdesk.
Communications via the GSi may be automatically logged, monitored and/or
recorded for legal purposes.


For the latest data on the economy and society consult National Statistics
at http://www.statistics.gov.uk

*********************************************************************************





Please Note:  Incoming and outgoing email messages are routinely monitored
for compliance with our policy on the use of electronic communications
*********************************************************************************





Legal Disclaimer  :  Any views expressed by the sender of this message are
not necessarily those of the Office for National Statistics
*********************************************************************************





The original of this email was scanned for viruses by the Government Secure
Intranet virus scanning service supplied by Cable&Wireless in partnership
with MessageLabs. (CCTM Certificate Number 2007/11/0032.) On leaving the
GSi this email was certified virus free.
Communications via the GSi may be automatically logged, monitored and/or
recorded for legal purposes.


Received on Mon Jan 19 2009 - 01:40:36 CST
