RE: Oracle High Availability Question(s)

From: Mark W. Farnham
Date: Fri, 16 Feb 2018 17:40:45 -0500
Message-ID: <0e9201d3a777$4720e000$d562a000$>

On the original post, my friend and fellow IOUG director Arjen Visser was awaiting moderation. Mr. Bobak can spank me later if he takes exception. This is about a product (DBvisit) that may match the original poster’s sweet spot:  

On 15/02/2018, at 4:50 PM, Mark W. Farnham wrote:

Mark Bobak is the list administrator.  

Usually a brief note that is responsive to a request is considered okay. Asking in advance is an A+.  

I should know your product better than I do. It fills the gap between too many bells and whistles and too much effort rolling your own scripts and I’ve only heard good things about it.  

So I think your proposed post is okay. If Mark takes exception I would be surprised.  


From: Arjen Visser Sent: Wednesday, February 14, 2018 10:28 PM To: Mark W. Farnham
Subject: Re: Oracle High Availability Question(s)  

Hi Mark,  

I want to reply to the list with the following message, but I am not sure if it breaches any rules since I am promoting a product. I do not want to upset the list. What do you think? Can I send it to the list?

Thanks Arjen

Here is the message:  

There is another alternative to Data Guard and manual scripting. It is called Dbvisit Standby and is specifically for Oracle Standard Edition, where Data Guard does not work. But it can also be used for Oracle Enterprise Edition. There are customers that use it for both Enterprise Edition and Standard Edition because it is so easy to use and install. It has a GUI console and a command line interface. It has numerous features, such as graceful switchover to test your DR. It works on premises and in the cloud.

Full disclosure: I am the founder of Dbvisit      

From: Reen, Elizabeth (Redacted sender "elizabeth.reen" for DMARC) Sent: Friday, February 16, 2018 12:01 PM Cc: Chris Taylor; Tim Gorman; Scott Canaan Subject: RE: Oracle High Availability Question(s)

It’s not high availability if both nodes are on the same frame. I have had frames die. It is also not HA if it is in the same building, on the same electric grid, or even in the same area. Hurricanes can knock out a wide area. The last blackout I remember took out the Northeast. To be HA there can be no single point of failure; everything must have a backup. Since the disks are shared, if they go, so goes your RAC. Data Guard can be HA. Failover can be instantaneous if you set up your network correctly and have hot standby equipment at the other site. To do it right is very expensive.


GG is replication. I am using it for DR because the business does not want to pay for a fast enough network connection between the Bahamas and Singapore. It’s not ideal, but it is better than dump and load. We use GG several different ways: for availability, for DR, and for replication of data between regions. We go to swing when Prod and DR are not available due to software releases. We deal with the fallback when the two DBs are no longer the same. We use it to sync account information between regions. Here we are just transferring a bit of data, not the whole DB. It is here we ran into the PK issue.


I was talking about the issue we had where the app had sequences but was not using them. It would select the highest value from the table and then add 1. As they were using that value as a PK, we had loads of fun. The different PKs came from alterations made by the other region for their purposes.
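The max-plus-one pattern described above can be illustrated with a minimal sketch. This is not the application's actual code; the two in-memory "regions", the account names, and the helper `next_id_max_plus_one` are all hypothetical, and the only point is to show why two sites that each read the local maximum pick the same key before replication catches up.

```python
# Hypothetical two-region simulation; all names here are illustrative only.

def next_id_max_plus_one(table):
    """Mimic the app's pattern: read the highest local id and add 1."""
    return max(table, default=0) + 1

# Both regions start from the same replicated rows (id -> account name).
region_a = {1: "alice", 2: "bob"}
region_b = dict(region_a)

# Each region inserts a new account before the other's change arrives.
id_a = next_id_max_plus_one(region_a)
region_a[id_a] = "carol"
id_b = next_id_max_plus_one(region_b)
region_b[id_b] = "dave"

# Same key, different rows: replication now has a PK conflict to resolve.
collision = id_a == id_b and region_a[id_a] != region_b[id_b]
print(id_a, id_b, collision)
```

A database sequence avoids the race within one database, but across two replicated databases the ranges still have to be kept apart.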


                GG is very strong and flexible.  You just need to understand it.  I would not give it to a team without training.



From: Bobby Curtis Sent: Friday, February 16, 2018 10:47 AM
Cc: Chris Taylor; Tim Gorman; Scott Canaan Subject: Re: Oracle High Availability Question(s)

I’ve been watching this topic come in for a day or two. I’m surprised by the frustration with GoldenGate, but I also think there is some confusion about definitions. The subject line is talking about Oracle High Availability options, so here is my 2 cents for people to digest.

High Availability is the approach of ensuring that you keep a system up and running as long as possible in the event of a failure (hardware, etc.).

Now there are a couple of ways to achieve this.  

  1. In a local data center/office, Oracle Real Application Clusters (RAC) provides localized HA for a single database across connected nodes, normally within the same rack. Oracle RAC is a good technology, but it can be challenging when you are first learning it. People in the community have banded together and formed the Oracle RAC SIG, which produced RACAttack and provided a flexible way to learn how RAC works.
  2. Data Guard (non-active/active) provides a physical failover between two databases (non-RAC or RAC). This is a good approach to disaster recovery, not high availability.
  3. GoldenGate provides a logical failover between two or more databases. Additionally, GoldenGate is a great tool to have in your toolbox for many different architectures and migrations. Yet it is not 100% valid for high availability alone.

If you combine all three technologies, you get the approach that Oracle recommends as its Maximum Availability Architecture (MAA).

To address the GoldenGate concerns that people are raising about primary keys and sequences: Oracle GoldenGate uses PKs to identify the records to update/delete when those SQL operations are read from the trail file. If you see an issue with sequences, depending on the architecture you either need to run the sequence.sql file in the $OGG_HOME (classic architecture) or ensure that you are using either even/odd values (active/active) or site numbers (multi-master). This ensures that the sequences do not conflict or cause any issues with the PK, if used for that purpose.
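As a rough illustration of the even/odd idea for active/active, the sketch below models two sequences that share an increment of 2 but start at different values. It is a plain Python model, not GoldenGate or Oracle syntax; the helper `site_sequence` and the site names are assumptions made for the example.

```python
from itertools import count, islice

def site_sequence(start, increment):
    """Model a sequence defined by a starting value and a fixed increment."""
    return count(start, increment)

# Hypothetical active/active pair: one site takes odd values, the other even.
site_a = site_sequence(1, 2)
site_b = site_sequence(2, 2)

ids_a = list(islice(site_a, 5))
ids_b = list(islice(site_b, 5))

# The two ranges never intersect, so generated PKs cannot collide.
overlap = set(ids_a) & set(ids_b)
print(ids_a, ids_b, overlap)
```

The same arithmetic extends to more sites by starting each site at its own number and incrementing by the number of sites, which is one possible reading of the site-number approach mentioned for multi-master.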

Looking at bug 26553124, this appears to be related to Integrated Replicat (IR) on release The bug has nothing to do with sequences, but with PKs when using IR. The workaround for this bug is to either use the “same” PK column for the source and target tables or switch to Classic Replicat (CR). As I mentioned in the previous paragraph, the PK has to match on both sides of the replication environment. If you are replicating tables with no PKs, then you need to define a key using KEYCOLS. Although this bug appears to be valid, I would have to ask: how did you set up your environments if you were having problems with updates and deletes during replication?



On Feb 16, 2018, at 9:57 AM, Reen, Elizabeth (Redacted sender "elizabeth.reen" for DMARC) wrote:

GG requires that you understand the app. It also requires that you make the changes in both DBs. It’s not unpredictable, you just need to stay on top of it and the changes. We ran into all sorts of sequence issues. It turns out that the app was not using sequences but reading the max and adding 1. GG will find all of the weaknesses in your code.  


From: Ls Cheng Sent: Thursday, February 15, 2018 5:50 PM To: Chris Taylor
Cc: Tim Gorman; Scott Canaan Subject: Re: Oracle High Availability Question(s)


Just some warning with GoldenGate.

Recently we had a big issue with GoldenGate in the most critical database of one of our customers. GoldenGate ignored all updates on the target because the target and source had different PKs (the target had the same table structure as the source but was partitioned, so its PK had an additional column, the partitioning key). Even though we knew about the different PKs and specified KEYCOLS on the target, the updates were ignored until, one month later, a data analyst noticed some data divergence and advised our support team. We had to restore 4 backups (each 4TB) to recover the data. It turned out to be bug 26553124, and there wasn't even an alert in MOS explaining such behaviour.

Lesson learnt. GoldenGate is unpredictable; this is the second time in 2 years I have seen such data divergence due to a GoldenGate bug, and the impact is huge, huge and huge because data divergence is soooo difficult to detect.
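One way to catch this kind of divergence earlier is a periodic comparison of source and target rows by key. The following is only a sketch under stated assumptions: rows are plain dictionaries already fetched from each side, and `row_digest` and `find_divergence` are hypothetical helpers, not part of GoldenGate or any Oracle tool.

```python
import hashlib

def row_digest(row):
    """Hash a row's columns in a stable order so equal rows hash equally."""
    payload = "|".join(f"{col}={row[col]!r}" for col in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def find_divergence(source_rows, target_rows, key_cols):
    """Compare two row sets by key and report missing, extra and changed keys."""
    def index(rows):
        return {tuple(r[c] for c in key_cols): row_digest(r) for r in rows}

    src, tgt = index(source_rows), index(target_rows)
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "extra_in_target": sorted(tgt.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }

# Illustrative data only: the update to id 2 never reached the target.
source = [{"id": 1, "balance": 100}, {"id": 2, "balance": 250}]
target = [{"id": 1, "balance": 100}, {"id": 2, "balance": 200}]
report = find_divergence(source, target, key_cols=["id"])
print(report)
```

On multi-terabyte tables a full comparison like this would be expensive, so in practice it would likely be run per partition or on a sample.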

So for DR, stick with Data Guard (Physical Standby). I consider RAC to be HA because you have several nodes available for a single copy of the data (I consider DR to be more than one copy of the data), and the death of one node still leaves the application available: only 100%/(number of nodes) is impacted, and the recovery is fast.


On Thu, Feb 15, 2018 at 8:47 PM, Chris Taylor wrote:

What about GoldenGate Tim??  

(Since I find myself trying to support this with no training/prior experience and learning in the deep end of the pool.... :)  


On Wed, Feb 14, 2018 at 3:23 PM, Tim Gorman wrote:

Going into Data Guard without training is uncomfortable, but going into RAC without training is untenable. You can try it, but it is going to hurt a lot, and you'll end up with something you'll regret.



Received on Fri Feb 16 2018 - 23:40:45 CET
