RE: Some Dataguard is good, lots more must be better?(specifically, when do most actual failovers really occur?)

From: Carel-Jan Engel <cjpengel.dbalert_at_xs4all.nl>
Date: Thu, 21 Sep 2006 11:51:27 +0200
Message-Id: <1158832287.10717.38.camel@dbalert199.dbalert.nl>


On Thu, 2006-09-21 at 08:46 +0000, Laimutis Nedzinskas wrote:

> No, I do not confuse. I just was not 100% sure if Oracle can do it
> because I've never tested it myself.

Your phrase 'Well, it is not a good option for maximum data protection(as Oracle defines it.)' is misplaced then. You don't know, you haven't tested whether it is a good option. Stating an untested assumption as a fact is not right. I did test this a lot. It actually works.

> The point is I've never used this option is that together with Data
> Protection one wants High Availability which means that time lag is
> contradicting this requirement. In numbers if 15 minutes downtime is
> allowed then recovery must be 15 minutes.

No. I have never seen (which doesn't mean it isn't possible) recovery lasting as long as the timeframe spanned by the redo to be applied. In general, 15 minutes' worth of redo does not take 15 minutes to apply.

> I am not sure how to calculate maximum lag allowed as it depends on
> machine speed and redo size and probably redo contents.

The maximum lag allowed should be business driven. How much time does the business allow itself to discover a logical error? How much time do they allow you to do the same? The time it takes to apply the redo for that timeframe can only be determined by testing. How much redo is generated at most during such a timeframe? How much time does it take to apply that amount of redo? That depends mainly on your CPU and storage capabilities. Frequently I see 8 hours' worth of redo being applied in a handful of minutes. This is not a very idle system, BTW. Your Mileage May Vary. TEST!
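The arithmetic behind that test can be sketched roughly as follows. This is only an illustration in Python, not something from the original thread: the function names and all figures (peak redo rate, apply rate, downtime allowance) are made-up assumptions, and the apply rate in particular has to come from your own measurements on the standby.

```python
def apply_minutes(redo_mb: float, apply_rate_mb_per_s: float) -> float:
    """Minutes needed to apply redo_mb of redo at a measured apply rate."""
    if apply_rate_mb_per_s <= 0:
        raise ValueError("apply rate must be positive")
    return redo_mb / apply_rate_mb_per_s / 60.0


def lag_fits_downtime(peak_redo_mb_per_hour: float,
                      lag_hours: float,
                      apply_rate_mb_per_s: float,
                      allowed_downtime_min: float,
                      other_failover_min: float = 0.0) -> bool:
    """True if the worst-case redo backlog can be applied within the allowance.

    The backlog is sized at the peak redo rate for the whole lag window,
    which is deliberately pessimistic. other_failover_min stands for
    everything else a failover takes (decision time, reconnecting clients).
    """
    backlog_mb = peak_redo_mb_per_hour * lag_hours
    total_min = apply_minutes(backlog_mb, apply_rate_mb_per_s) + other_failover_min
    return total_min <= allowed_downtime_min


# Hypothetical figures: 6000 MB/hour peak redo, 8 hour lag, 100 MB/s measured apply rate.
print(apply_minutes(6000 * 8, 100))                 # minutes to apply the backlog
print(lag_fits_downtime(6000, 8, 100, 15, 5))       # 15 minute allowance, 5 minutes overhead
```

With these invented numbers the 8 hour backlog is applied in well under the 15 minute allowance, but the result is only as good as the measured apply rate that goes in.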

Again, this is why many organisations tend to install two standbys, once the decision for installing a standby is made.

If your business really cannot afford an outage of, say, more than 4-6 hours, 2 standbys are required IMHO. Think of the situation when a real disaster strikes. Then you are running at your DR center, and that is your last resort. If failover is not tested on a regular basis, this is an extraordinary situation for all admins involved. It leads to an even more error-prone situation, with all the admins working for real on systems they have never really worked with. Then there is the distraction of the destroyed primary DC or system. New hardware needs to be ordered, or selected first? This needs a setup? Co-workers are in hospital or have even died? How do you concentrate on your daily work then, which additionally isn't routine at all in the strange environment? And then you're running just one system, no standbys (I've even seen 'no backup hardware at the DR site'), and so on. How vulnerable do you want to be? If a second disaster strikes (more likely to be originated by humanoid carbon objects (Thank you Casey Dyke) under the circumstances described), it's over. How much time will it take then to get new hardware, restore, and so on? 4-6 hours is rather optimistic, I guess.

If you take countermeasures for HA, investigate the risks first. And not just the risks at normal operation, but also the risks when running at the DR site with a missing main DC. Calculate the costs of disasters in terms of business interruption. Find out how much 'insurance premium' it is worth to the business to cover the risks. Calculate the costs of the various solutions to cover the risks. Make clear what the leftover risks are when a certain (combination of) countermeasure(s) is chosen. Then let the business decide what the game plan will be. It's their data, it's their budget.
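To make the 'insurance premium' comparison concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the option names, the yearly costs, the outage hours per disaster, the disaster frequency and the cost per hour of interruption are invented, and the simple expected-loss formula is just one way to put numbers in front of the business.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    yearly_cost: float    # the 'insurance premium' for this countermeasure
    outage_hours: float   # assumed outage per disaster with this option in place


def expected_yearly_loss(disasters_per_year: float,
                         outage_hours: float,
                         cost_per_hour: float) -> float:
    """Leftover risk expressed as expected business interruption cost per year."""
    return disasters_per_year * outage_hours * cost_per_hour


def compare(options, disasters_per_year, cost_per_hour):
    """Return (name, premium, leftover risk, total) for each option."""
    rows = []
    for o in options:
        residual = expected_yearly_loss(disasters_per_year, o.outage_hours, cost_per_hour)
        rows.append((o.name, o.yearly_cost, residual, o.yearly_cost + residual))
    return rows


# Hypothetical figures only.
options = [
    Option("no standby", 0.0, 72.0),
    Option("one standby", 50_000.0, 6.0),
    Option("two standbys", 90_000.0, 1.0),
]
for row in compare(options, disasters_per_year=0.25, cost_per_hour=100_000.0):
    print(row)
```

The output is input for the decision, not the decision itself: the business still chooses which leftover risk it is willing to carry.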

And that 2 standby thing brings me back to the point the OP had in mind when he started with this thread. Storage replication versus Data Guard. How easy is it to replicate storage to two standbys, let's say from A to B and C? And how easy is it to switch to the situation B to A and C, and then to C to B and A, and then back to A to B and C? I can do this easily with standby databases. Is it as easy for storage? With Data Guard I can do this on a per database basis. Can storage replication handle that? Or is it a 'box' granularity in role switching between primary and standby?
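The role rotation described above can be written down as a toy model. This Python sketch is not Data Guard syntax and does not call any Oracle API; the class and the database names ("orders", "billing") are hypothetical and only illustrate what a switchover does to the roles, one database at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Role assignment for ONE database: a primary site and its standby sites."""
    primary: str
    standbys: frozenset

    def switchover(self, new_primary: str) -> "Config":
        """Swap roles: the chosen standby becomes primary, the old primary a standby."""
        if new_primary not in self.standbys:
            raise ValueError(f"{new_primary} is not a standby of {self.primary}")
        new_standbys = (self.standbys - {new_primary}) | {self.primary}
        return Config(new_primary, frozenset(new_standbys))


# A -> B, C   then   B -> A, C   then   C -> B, A   then back to   A -> B, C
start = Config("A", frozenset({"B", "C"}))
step1 = start.switchover("B")
step2 = step1.switchover("C")
step3 = step2.switchover("A")
print(step1, step2, step3, sep="\n")

# Per-database granularity: each database keeps its own role assignment,
# so one can move to site B while another stays on site A.
databases = {"orders": start, "billing": start}
databases["orders"] = databases["orders"].switchover("B")
print(databases)
```

The point of the model is the last part: roles are tracked per database, so there is no need to switch a whole 'box' at once.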

Storage replication versus Data Guard. We have skipped the bandwidth discussion so far. I've disappointed the OP on that part. He was so much expecting me to start discussing that. Maybe more about bandwidth later. I have to figure out how to test that for a fair comparison first. Much later would be a better term. All suggestions are most welcome.

Best regards,

Carel-Jan Engel

===
If you think education is expensive, try ignorance. (Derek Bok)
===

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Sep 21 2006 - 04:51:27 CDT
