Home -> Community -> Mailing Lists -> Oracle-L -> Re: Standby Database

Re: Standby Database

From: Don Granaman <granaman_at_home.com>
Date: Thu, 29 Nov 2001 01:10:01 -0800
Message-ID: <F001.003CFFEF.20011129000518@fatcity.com>

Quick question, long answer...

You are not only making sense, you have hit the primary issue with the Oracle standby database directly on the head. With DataGuard in 9i (or 8i on HP-UX or Solaris only), you can *try* to retrieve redo log files. You could also do it manually, with any version. In either case, there are no guarantees. If the primary site goes away in a tornado, the online redo logs, and possibly one or more unfinished (with respect to being sent or archived) archive logs, go with it - and the standby does not have all the transactions. The 9i marketing rhetoric says that this is not an issue with 9i DataGuard, since it allows synchronous logfile writes to a remote site, among other enhancements. I haven't tried it yet, but I'm not drinking all their Kool-Aid. I'm sure that even Oracle9i didn't change the laws of physics (186,000 miles/second isn't just a good idea - it's the law! And light doesn't travel in anything vaguely resembling a straight line inside a fiber optic cable.) Synchronous writes, especially of redo logs, to a geographically remote standby have to be a significant performance hit on any non-trivial primary. Even locally, synchronous host-based writes usually have a very significant performance impact.
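To make the exposure concrete: the standby can only recover through the last archive log that actually reached it, so everything still in the primary's online redo logs (plus any unshipped archives) is lost with the site. Here is a toy Python model of that reasoning - the function names and log-sequence scheme are my own illustration, not anything Oracle-specific:

```python
# Toy model of standby-database loss exposure: the standby recovers
# only through the last *contiguous* archived log sequence that was
# shipped; redo still on the primary is lost if the site is destroyed.
# All names here are illustrative, not Oracle APIs.

def standby_recovery_point(archived_seqs, shipped_seqs, current_seq):
    """Return the highest log sequence the standby can recover through.

    archived_seqs: sequences archived on the primary
    shipped_seqs:  sequences that actually reached the standby
    current_seq:   sequence of the online redo log being written
    """
    usable = sorted(set(archived_seqs) & set(shipped_seqs))
    # Recovery stops at the first gap in the shipped sequence.
    point = 0
    for seq in usable:
        if seq == point + 1:
            point = seq
        else:
            break
    return point

def lost_sequences(recovery_point, current_seq):
    """Sequences whose redo is lost if the primary site is destroyed."""
    return list(range(recovery_point + 1, current_seq + 1))

# Example: the primary archived logs 1-5, but only 1-3 were shipped
# before the tornado; log 6 is the current online redo log.
rp = standby_recovery_point([1, 2, 3, 4, 5], [1, 2, 3], 6)
print(rp)                     # 3
print(lost_sequences(rp, 6))  # [4, 5, 6] - unshipped archives plus online redo
```

The point of the toy is the gap: no matter how aggressively you ship archives, the current online redo log is always in the loss window unless writes to the remote site are synchronous - which is exactly the performance trade-off described above.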

There are options for geo-mirroring both the online redo logs and archive logs with something like EMC SRDF to create a true "no loss" standby database. There is a white paper on it somewhere on EMC's site, and I've seen a more generic white paper / presentation on it from Oracle (from Wei Hu or Ron Weiss, probably). I've designed something like this with a long-haul multi-hop EMC implementation (local synchronous SRDF with R2 in a bunkered Symmetrix, BCV split in the bunkered Sym, adaptive copy of the BCV to the remote standby). It works well, but doesn't look much like an automagically managed "normal" standby database. This required custom scripts - to enforce a delay, manage recovery, and such. The idea is to synchronously mirror to a "safe" location (in the EMC scenario, a bunkered Symmetrix a short distance away) and then asynchronously/periodically update the more remote standby system from there. It is an expensive solution, but if you truly can't afford any data loss there are no cheap ones. This one in particular has the added advantage that the "heavy lifting" grunge work is done in the Symmetrix, so there is no noticeable host load on the primary. Some other storage vendors - Hitachi, etc. - have similar capabilities. You could do it with host-based software (e.g. Veritas) also, but then you have host load, potential/probable OS write performance degradation, and perhaps some other issues (e.g. multi-hop capability?).
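The "enforce a delay" piece of those custom scripts is conceptually simple: only apply logs that have aged past some window, so a logical disaster on the primary (bad batch job, dropped table) doesn't reach the standby before an operator can stop the apply. A minimal Python sketch - the directory layout, delay policy, and function name are my assumptions, not the actual scripts:

```python
import os
import time

APPLY_DELAY_SECS = 4 * 3600  # hold logs back four hours (a site policy choice)

def logs_ready_to_apply(staging_dir, now=None):
    """Return archive-log filenames old enough to apply to the standby.

    Holding logs back gives operators a window to halt the apply before
    a logical corruption on the primary propagates to the standby.
    """
    now = now if now is not None else time.time()
    ready = []
    for name in sorted(os.listdir(staging_dir)):
        path = os.path.join(staging_dir, name)
        age = now - os.path.getmtime(path)
        if age >= APPLY_DELAY_SECS:
            ready.append(name)
    return ready
```

A real version would also verify checksums and apply order, but the design choice is the same: the delay is deliberate lag, traded against how far behind the standby is allowed to fall.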

I don't even know what DoubleTake is. However, local clustering is an entirely different critter compared to a standby database. It provides a "standby instance" for fast failover in the event of a system/instance failure, but doesn't provide any intrinsic media protection or a disaster recovery solution. A standby database is typically a disaster recovery (DR) solution, but a poor high availability (HA) solution - but, as Bill Clinton might say, "that depends on the meaning of 'high'." ;-) Local clustering (either model: OPS/RAC or HA/takeover) typically provides excellent HA, but no DR at all. The best business continuity solutions for extremely critical 24xForever, "no data loss is ever acceptable" systems demand hybrid solutions. I've built a few for a major brokerage using clustered Sun E10Ks, 8i OPS, Net8 TAF, EMC Symms, TimeFinder, SRDF, and (delayed) standby databases at a backup site 200 miles away. Extreme HA, extreme DR, and extreme expense!

There are some interesting HA, DR, & scalability blueprints at www.eECOstructure.com - in multiple phases. Phase I is the "Resilient" Blueprint - a hardened single site with HA. Phase II is the "Recovery" blueprint - adding multi-site and DR. Phase III is the "Accelerated" blueprint - higher scalability, security, etc. Each phase builds upon the previous. Remember, they are blueprints, not commandments. Nobody ever builds a house without modifications to the standard model, and nobody is likely to build an infrastructure that way either. The concepts can be adapted to other "components" (e.g. WebLogic and/or Tuxedo instead of iAS).

-Don Granaman
[OraSaurus]

> Quick question. Is it a fair statement to say that using Oracle's hot
> standby database allows you recoverability up to the last archive log, but
> would NOT recover to the latest redo log (prior to a log switch). In other
> words, the potential to lose transactions is very high if you depend on this
> for failover (not good for e-commerce type databases). Would it be possible
> to somehow mirror redo logs across to the failover server and apply them
> when activating the standby database, or is the only real solution
> clustering or something like DoubleTake?
>
> Am I making sense?
>
> Thanks,
>
> Ed

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Don Granaman
  INET: granaman_at_home.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
Received on Thu Nov 29 2001 - 03:10:01 CST