RE: Zero DataLoss

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Wed, 9 Nov 2022 13:10:42 -0500
Message-ID: <1b1c01d8f466$94e3aca0$beab05e0$_at_rsiz.com>



I absolutely love this instance of “can’t be unseen once you see it” and I heartily endorse Tim’s statement on this as well as Clay’s kind words and his point.  

mwf  

From: Tim Gorman [mailto:tim.evdbt_at_gmail.com]
Sent: Wednesday, November 09, 2022 11:26 AM
To: Clay.Jackson_at_quest.com; mwf_at_rsiz.com; chrishna0007_at_gmail.com; 'Oracle L'
Subject: Re: Zero DataLoss

Friends,

I'd like to point out something that is perhaps subtle, but that can't be unseen once you see it.

The DataGuard product group at Oracle has been nothing less than brilliant in creating and naming the three modes of DataGuard...

  1. MAX_PROTECTION
  2. MAX_AVAILABILITY
  3. MAX_PERFORMANCE

The simple fact is that it is not possible to prioritize data protection, service availability, and service performance simultaneously. Only one of these can be the top priority, and the other two must be subordinate. Period. End of sentence.

If you are going to prioritize data protection (a.k.a. true ZERO data loss), then you must do so regardless of the impact on availability or performance of the database service. If zero data loss is really the goal, then the other considerations must be subordinate.

That is why MAX_PROTECTION is rarely used in real life. Very, very few organizations are willing to subordinate service availability. MAX_PROTECTION requires that if the standby is down, then the primary must be down too, which is exactly what maximum data protection demands. If a pending transaction cannot be protected, then it cannot be permitted to commit. This is not a limitation or a compromise; it is simply purity of vision.

It's almost like the old saying about the three choices of good, fast, and cheap, except that in this situation you can choose only one. There is always a trade-off.

The original poster's opening sentence about "zero data loss availability" shows fundamental confusion, because there is no such thing as data protection and availability with equal priority. Either "zero data loss" is the goal, or "availability" is the goal.

The original poster's organization has clearly chosen against "zero data loss" by prioritizing either availability or performance above it. They should listen to their own decision, and realize that the organization is implicitly prioritizing performance over both data protection and availability.

There is no judgement here; it is simply a fact. Zero data loss is not possible unless it is the priority.

All that being said, MAX_AVAILABILITY does substantially reduce the possibility of data loss and, with the inclusion of a Far Sync instance, can also greatly reduce the impact on performance. MAX_AVAILABILITY is the mode that most closely approaches the ideal of meeting all three priorities, but it still requires dedication to service availability as the top priority, making data protection and performance subordinate. So when choosing MAX_AVAILABILITY, be clear that there must be a negative impact on data protection (i.e. RPO > 0) and on application performance, albeit minimized. Likewise, choosing MAX_PERFORMANCE must be accompanied by acceptance of a significant negative impact on data protection as well as service availability.
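As a rough illustration of the trade-off described above, here is a toy Python model of what happens to a commit on the primary in each mode. It is not Oracle code: the names `Mode` and `commit_outcome` and the result keys are invented for this sketch, and it deliberately ignores timeouts, multiple standbys, and resynchronization.

```python
from enum import Enum


class Mode(Enum):
    MAX_PROTECTION = "MAX_PROTECTION"
    MAX_AVAILABILITY = "MAX_AVAILABILITY"
    MAX_PERFORMANCE = "MAX_PERFORMANCE"


def commit_outcome(mode: Mode, standby_reachable: bool) -> dict:
    """Toy model of a commit on the primary (hypothetical, simplified).

    commits               -- does the commit succeed on the primary?
    waits_for_standby     -- does the commit wait for the standby to acknowledge?
    guaranteed_on_standby -- is the change known to be safe on the standby?
    """
    if mode is Mode.MAX_PERFORMANCE:
        # Redo is shipped asynchronously, so the commit never waits
        # and nothing is guaranteed to have arrived yet.
        return {"commits": True, "waits_for_standby": False,
                "guaranteed_on_standby": False}
    if standby_reachable:
        # Both synchronous modes wait for the standby's acknowledgement.
        return {"commits": True, "waits_for_standby": True,
                "guaranteed_on_standby": True}
    if mode is Mode.MAX_AVAILABILITY:
        # Standby unreachable: keep the service up and accept exposure (RPO > 0).
        return {"commits": True, "waits_for_standby": False,
                "guaranteed_on_standby": False}
    # MAX_PROTECTION with no standby: refuse to commit unprotected work.
    return {"commits": False, "waits_for_standby": False,
            "guaranteed_on_standby": False}
```

Calling `commit_outcome(Mode.MAX_PROTECTION, standby_reachable=False)` reports that the commit does not go through, which is the "primary must be down too" behavior. The same call with `Mode.MAX_AVAILABILITY` commits, but without the guarantee.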

Choose one of the three modes, and understand all the implications of that choice. Also, understand what must be improved infrastructurally in order to adhere to the chosen mode. Nobody uses MAX_AVAILABILITY or MAX_PROTECTION on the "cheap".

Once you see it, you can't unsee it.

Hope this helps...

-Tim

On 11/9/2022 7:04 AM, Clay Jackson (Clay.Jackson) wrote:

As usual, MWF hit the relevant points, except perhaps not standing in the predicted meteor impact zone😊. Also consider these points:

For ANY zero data loss system it’s possible to come up with a scenario where (committed) data will be lost.

Attempting to achieve zero data loss can be an infinite resource (time and money) sink, so one should ALWAYS consider things like the cost and probability of that “last bit” of data actually being “lost”.  

And something to think about with MAA – all MAA or any “two-phase commit” system does is prevent a transaction from committing until there are multiple copies of said transaction (presumably at least one of which would be “secure”). In a failure scenario, what really happens is that the “last transaction”, which in fact MAY have been “committable” in a non-MAA environment, doesn’t get committed and, like any other “in-flight” transaction, is “lost”. All you’re really doing is changing the timing.

Good luck!  

Clay Jackson        

From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Mark W. Farnham
Sent: Wednesday, November 9, 2022 5:27 AM
To: chrishna0007_at_gmail.com; 'Oracle L' <oracle-l_at_freelists.org>
Subject: RE: Zero DataLoss


Presuming you’re already doing some sort of log application to a recovery system, a radio-accessible way to pull the redo logs from a “dead” data center so they can be taken to your remote recovery site is probably the best you can do. Axxana Inc had this sort of hardware, but I’m getting a problem trying to visit their website to copy a link for you.

IF a given “disaster” comes with a little warning, you can update a custom table (say, insert a row with the current SCN and timestamp), commit, and then issue ALTER SYSTEM SWITCH LOGFILE and ALTER SYSTEM ARCHIVE LOG ALL to accelerate shoving all the transactions committed so far to your recovery system. You can also have a policy that switches the database to restricted mode in the event of a disaster “early warning”, but notice that in our mostly hacked world that is a slippery slope under time pressure for analysis. In combination, those steps maximize the chances of shoving the required redo logs to your remote recovery systems in time. In lieu of the overhead of MAA, or a piece of hardware that has a plex of your logfiles and archived logfiles and can transmit them from a burned-up building that is buried and flooded in the crevasse of an earthquake, that’s about as close to zero as you are going to get. (I didn’t mention nuclear, because if you put a radio transmit unit in an EMP cage, you probably interfere with its ability to transmit. I leave that as an exercise for the community.)
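The early-warning steps above could be scripted roughly as follows. This is only a sketch: the function name and the marker table `dr_marker` (with columns `scn` and `marked_at`) are hypothetical. The restricted-mode step assumes that "restricted" means `ALTER SYSTEM ENABLE RESTRICTED SESSION`. The function only builds the ordered list of statements and leaves execution to whatever client you already use.

```python
def early_warning_statements(marker_table: str = "dr_marker",
                             restrict: bool = False) -> list[str]:
    """Return, in order, the SQL a DBA script might run on a disaster early warning.

    marker_table is a hypothetical custom table with columns (scn, marked_at).
    """
    statements = [
        # Record a marker row so the recovery site can tell how far the redo got.
        f"INSERT INTO {marker_table} (scn, marked_at) "
        "SELECT current_scn, SYSTIMESTAMP FROM v$database",
        "COMMIT",
        # Close out the current online redo log, then archive whatever is pending.
        "ALTER SYSTEM SWITCH LOGFILE",
        "ALTER SYSTEM ARCHIVE LOG ALL",
    ]
    if restrict:
        # Optional policy step (assumed meaning of "switch to restricted").
        statements.append("ALTER SYSTEM ENABLE RESTRICTED SESSION")
    return statements


if __name__ == "__main__":
    for stmt in early_warning_statements(restrict=True):
        print(stmt + ";")
```

Depending on the archiving configuration, the final archive step may simply report that nothing is left to archive. The marker row is what lets you check at the recovery site that the redo up to that SCN arrived.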

mwf  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Krishnaprasad Yadav
Sent: Tuesday, November 08, 2022 10:55 PM
To: Oracle L
Subject: Zero DataLoss

Dear Experts,  

We want to achieve zero data loss availability in our environment. For this we were planning to put MAA on the database, but we saw redo overhead causing LGWR events, so we put it back to maximum performance.

Apart from using ZDLRA and MAA, is there any other solution we can use to achieve this?

Regards,

Krishna          

--

http://www.freelists.org/webpage/oracle-l Received on Wed Nov 09 2022 - 19:10:42 CET
