Re: What is the best backup method, using RMAN, for backing up the physical and standby database?
Date: Thu, 31 Jan 2008 10:52:00 -0800 (PST)
On Jan 31, 3:59 am, Steve Howard <stevedhow..._at_gmail.com> wrote:
> On Jan 30, 9:24 pm, DG problem <skatef..._at_gmail.com> wrote:
> > 22.214.171.124 on HP-UX.
> > Primary DB on HOST1 (150Gb in size) and physical standby DB on HOST2.
> > Both hosts have their own tape library backup unit.
> > A low bandwidth link connects both hosts. Both hosts are in different
> > states.
> > It is important to keep backups of both databases on their respective
> > tape library.
> > Backups should go to disk first and then be backed up to tape.
> > What would be the best backup method?
> Out of curiosity, why would you back it up in both places? We have
> several physical standby databases and backup just the standby, as it
> is a block for block copy of the primary.
> Our storage is robust (EMC DMX), so we made the decision to just
> failover to the primary if we have a media failure (which are *very*
> rare for us, given redundant storage, controllers, switches, etc.),
> and rebuild the primary once the media failure has been repaired.
> That eliminated the need for duplicate backup media, and offloaded the
> RMAN backup CPU cycles to the standby node.
> I'm assuming the low bandwidth connection makes direct transfer a pain
> in the rear? How did you build the standby originally (timing,
> logistic issues, etc.)? If that was acceptable, rebuilding the
> primary from a backup of the new primary after failover should be
> feasible, and then switch back over to it at an agreed upon time.
> If you ever have complete catastrophe, the backups of the standby can
> be used to rebuild it in either physical location (block for block
> copy of either database).
Not disagreeing with Steve, for Steve's kind of situation.
But I'm also in the situation of having a weak link between primary and standby. We're in the process of upgrading to new hardware, so some of the procedures are also being reevaluated. When the standby needs to be rebuilt for whatever reason (which has usually been hardware problems), it takes several days over the network, or a commercial jet flight with a tape or disk drive. Our hardware is not robust, and the manager is a hardware-oriented person, which means all sorts of manual work gets done rather than just buying new things. And his manager is an accountant, so he's overly cost-sensitive.
How well a standby works is largely predicated on the network. If that's the weak link in the chain, you don't want backups to be primarily dependent on it.
Of course, you want backups to be redundant. In the past there was a cold backup to tape, with someone taking the tape home after it ran, in addition to disk backups that would propagate off the host. When I finally talked them into not bothering with the cold backup anymore, I suggested still having a hot backup to tape, to at least be able to cover a computer-room-only type of disaster without having to go through an entire switchover (switchback isn't there yet, so remaking the primary would be a bigger deal than just getting another server going from bare metal, and the standby is smaller, dying hardware with nothing nearby to copy to). This was met with resistance, the feeling being that if this site is nuked, the standby would be the only thing left anyway. I wrote the hot backup regardless; it only took a few minutes to write. But of course, no tape, since no one wants to clean the dang thing.
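For what it's worth, a disk-first, then-to-tape hot backup is just a couple of RMAN run blocks along these lines (a sketch only, not my actual script; the channel names, the disk FORMAT path, and the 'sbt_tape' media-manager configuration are placeholders you'd swap for your own):

```
# Hedged sketch: back up to disk first, then sweep the backupsets to tape.
RUN {
  ALLOCATE CHANNEL d1 DEVICE TYPE DISK FORMAT '/backup/ora/%U';
  BACKUP DATABASE PLUS ARCHIVELOG;
  RELEASE CHANNEL d1;
}
RUN {
  # Requires a working media-manager (SBT) layer for the local tape library.
  ALLOCATE CHANNEL t1 DEVICE TYPE 'sbt_tape';
  BACKUP BACKUPSET ALL;
  RELEASE CHANNEL t1;
}
```

That way each host's tape library only ever sees its own local disk backups, and nothing in the backup path depends on the weak inter-site link.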
This morning I came in and couldn't get mail to my .pst files which are on a server.
Yes, I'm not making this up. Last night one of the 220V plugs on the dual UPSes melted, blowing the building breakers, though it didn't actually short out. Since there's only about 45 minutes' worth of juice, everything died. One breaker now needs replacement. The manager will be implementing SMS notifications for when the UPS starts draining its batteries, and we will be using a removable disk on the new hardware. We are just plain lucky there wasn't a fire in the computer room. This is called management by crisis.
Of course, Oracle on the primary came up no problem, and the standby patiently never missed a beat. But the backups were hosed, since the power died right in the middle of the RMAN run and never got to the logical backup. Way too close for comfort.
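When a run dies mid-flight like that, the quickest sanity check I know is to ask the controlfile what actually completed (a sketch; V$RMAN_BACKUP_JOB_DETAILS exists in 10g and later, so on an older release you'd look at V$BACKUP_SET and V$BACKUP_PIECE instead):

```
-- Hedged sketch: list recent RMAN jobs and whether they finished.
SELECT start_time, end_time, status, input_type
  FROM v$rman_backup_job_details
 ORDER BY start_time DESC;
```

Anything with a status other than COMPLETED tells you that night's backup can't be trusted.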
--
@home.com is bogus. Talk about single point of failure: http://www.guardian.co.uk/technology/2008/jan/31/internet.blackout.asia?gusrc=rss&feed=networkfront
Received on Thu Jan 31 2008 - 12:52:00 CST