Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 


RE: Data Mirroring on two data centers -- How to use ASM ?

From: Kevin Closson <kevinc_at_polyserve.com>
Date: Fri, 19 May 2006 10:23:06 -0700
Message-ID: <5D2570CAFC98974F9B6A759D1C74BAD0E5A494@ex2.ms.polyserve.com>


>>>> when network failures occur no "third party" can choose which node
>>>> should survive. So a manual failover is the only solution. Only a
>>>> third site will give you enough "quorum" to provide an

This represents a very rudimentary understanding of clusters, or more likely a very deep understanding of very rudimentary clusters.

Two-node clusters can work out proper membership and split-brain resolution, but it requires sophisticated membership and fencing mechanisms.
The simple "who's got more" sort of quorum stuff is just not robust enough. In fact, it is for this reason that SuSE has said that 2 node clusters with OCFS2 are not possible. You must have a minimum of 3 nodes...as was the case for quite some time with GPFS on AIX. In case anyone thinks I'm making up this bit about quorum and fencing:

http://lists.suse.com/archive/suse-oracle/2006-Apr/0061.html
http://lists.suse.com/archive/suse-oracle/2006-Apr/0071.html
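
As a rough illustration of why the "who's got more" rule breaks down with two nodes, here is a minimal Python sketch. The function names (has_quorum, survivors) and the node names are made up for the example and are not taken from OCFS2, GPFS, or any other product. It assumes a strict-majority rule: a partition keeps running only if it can see more than half of the configured members.

```python
# Hypothetical sketch of simple majority quorum; not any product's code.

def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """Strict majority: the partition must see more than half of the
    configured cluster members to be allowed to keep running."""
    return 2 * visible_nodes > cluster_size


def survivors(partitions, cluster_size):
    """Return the partitions (lists of node names) that hold quorum."""
    return [p for p in partitions if has_quorum(len(p), cluster_size)]


# Interconnect failure in a 2-node cluster: each side sees 1 of 2 members.
two_node = survivors([["a"], ["b"]], cluster_size=2)

# Same failure in a 3-node cluster: one side sees 2 of 3 members.
three_node = survivors([["a", "b"], ["c"]], cluster_size=3)

print(two_node)    # []
print(three_node)  # [['a', 'b']]
```

With two nodes neither side holds a majority, so the rule cannot pick a survivor on its own. With three nodes the two-node side wins. That is the gap that a more sophisticated membership and fencing mechanism has to fill.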

It is a fact that most cluster membership schemes available out there are architected poorly for the sake of first-to-market needs, or they carry age-old legacy implementation choices. The OCFS2 problems cited in these suse-oracle email archives do indeed reflect bugs. However, the architecture itself will continue to breed bugs. Architecture choices can be "Bug Factories". Consider shared-nothing cluster database approaches. They are bug factories too, really.
All of the following clustering approaches are bug factories. They are bug factories because the architecture is not solid enough to "just work", so layers and layers and layers of workarounds in the form of bug fixes ensue. I won't name the products that implement the following Achilles' heel cluster architectures, but they are out there:

  1. Persistent reservation schemes
  2. Self fencing (e.g., node has been informed it's supposed to die so it tries to execute the reboot command)
  3. Simple majority quorum
  4. Central lock managers/metadata managers (SPOF, bottleneck)

--
http://www.freelists.org/webpage/oracle-l
Received on Fri May 19 2006 - 12:23:06 CDT
