Re: alter system checkpoint in hot backup script. Why

From: Burt Peltier <burttemp1ReMoVeThIs_at_bellsouth.net>
Date: Fri, 21 Nov 2003 20:08:32 -0600
Message-ID: <suzvb.21735$ow5.21074@bignews2.bellsouth.net>

Thanks for the great reply. Also, sorry this post is getting a lot longer than I wanted, but I think it is necessary.

I think I understand the issue that apparently I was not aware of before (needing a checkpoint in our particular case).

Please correct me if I summarize incorrectly, for my sake and for anyone else reading this (I also have 1 more question at the end of my post).

Note: I don't think I am being paranoid. I am trying to cover the particular case of losing EVERYTHING, including the online REDO logs and any archived redo logs generated after last night's hot/online backup.

Summarizing: We need to do the "alter system checkpoint" after a hot/online backup if we want to ensure that REDO entries still sitting in the buffers (including redo for uncommitted transactions) are written to the REDO log files on disk. This would then allow the "alter system archive log current" to copy the REDO log off to the archived redo log destination. This archived redo log file will NEED to be included in our hot/online backups IF we want to recover the database to a consistent state from last night's backup and only lose 1 day's worth of data (data since the last backup). Also, if we had the previous night's backup and the redo log archived during THAT backup, then we could restore from that backup for a consistent state. But then, of course, we would lose 2 days' worth of data.
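
To make sure I have the order right, here is a minimal sketch of the sequence as I understand it. USERS is just a hypothetical tablespace name, and the OS-level copy step is left as a placeholder because it depends on the site:

```sql
-- Sketch only. USERS is a hypothetical tablespace name; repeat the
-- begin/copy/end steps for each tablespace being backed up.
ALTER TABLESPACE users BEGIN BACKUP;
-- ... copy the tablespace's datafiles with an OS command here ...
ALTER TABLESPACE users END BACKUP;

-- (1) Resynchronise the datafile headers and force buffered redo
--     into the online redo logs.
ALTER SYSTEM CHECKPOINT;

-- (2) Switch out of the current online log and archive it.
ALTER SYSTEM ARCHIVE LOG CURRENT;

-- (3) Now back up the archived redo logs, including the one just written.
```

Step (2) is the one that matters for the "lost the online logs" case, since the checkpoint on its own only gets the redo as far as the online logs.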

Note: I don't want to go off on another issue, but this particular database is completely stored on 1 RAID partition (not my choice and out of my control). So, if we lose the RAID partition, we lose ALL Oracle files and the Oracle software too. Also, no one is questioning performance implications (because of 1 RAID partition) or the loss of 1 day's data.

Also, there is one slight complication for me. The hot/online backups are using RMAN full backups on weekends and then incremental nightly backups. Note that this is an Oracle8i 8.1.7.4 database on W2K.

I am not as familiar with RMAN backups (especially incremental). Anyway, it does NOT seem that this should make any difference (still need the checkpoint and archive log current commands).
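
Whichever tool takes the backup, I assume a check along these lines would show whether the log that was current at the end of the backup has actually been archived. It is only a sketch using the V$LOG and V$ARCHIVED_LOG views, and I have not run it against this particular database:

```sql
-- Online log groups: the group that was CURRENT at the end of the backup
-- should show ARCHIVED = 'YES' once the log switch and archiving complete.
SELECT group#, sequence#, archived, status
  FROM v$log
 ORDER BY sequence#;

-- Archived logs, newest first, with the file each one was written to.
SELECT sequence#, name, completion_time
  FROM v$archived_log
 ORDER BY sequence# DESC;
```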

But, if you have any comments on this extra bit of RMAN/version information, please post a reply.

Thanks for all the very useful information.

-- 
"Howard J. Rogers" <hjr_at_dizwell.com> wrote in message
news:3fbe71ac$0$14031$afc38c87_at_news.optusnet.com.au...


> Sorry... I didn't see this before.

>

> "Burt Peltier" <burttemp1ReMoVeThIs_at_bellsouth.net> wrote in message

> news:Br3sb.93707$un.92753_at_bignews6.bellsouth.net...

> > Just double checking, but would the following commands be enough after the

> > "end backup"s ?

> >

> > (1) alter system checkpoint;

>

> Depends on what bit of the scary stuff you're worried about. If you are

> concerned that the datafile headers won't be synchronised, that would do the

> job. It would also ensure the online logs contain all the redo needed to

> make sure that the just-backed-up datafile can be restored and recovered

> fully.

>

> But it wouldn't address the 'what happens if I lose my online logs' scare.

> That's only addressed by making sure the last bit of redo needed to make

> your backup consistent is safely in the archive files. Forcing a checkpoint

> would only get it into the online logs. Since you should never back up the

> online logs, there's still a risk (minute, but real) that you'd lose that

> necessary redo.

>

> > - Or else take a chance some REDO buffer info is not written to the REDO log

> > on disk !?

> > (2) alter system archive log current;

>

> That will do. Causes a log switch. This is, in fact, what RMAN does

> automatically for you at the end of every backup in 9i.

>

> > (3) Now backup the archived redo logs.

>

> Precisely.

>

> > I hadn't thought about the buffer'ed REDO log area not being written to the

> > REDO log on disk, so I see where a checkpoint would be needed (not an

> > option?).

>

> Well, remember that redo for committed transactions is by definition safely

> written to the online logs. So that's not the concern. The concern is that

> even uncommitted transactions' redo has to be applied to a hot backed-up

> datafile to make it consistent and roll-forwardable. So if your instance was

> to crash and in doing so cause a media failure, the backup of your last

> datafile might be unusable, but you'd presumably have other backups you

> could use to do the media recovery.

>

> For the checkpoint to be absolutely required after each datafile is backed

> up, you would have to imagine a scenario in which your instance failed

> immediately after the backup, and you lost the 'live' version of the file

> you just backed up, and you lost your current log, and you had no prior

> backups of that datafile (or that you did have a prior backup of that

> datafile, but without the corresponding archives). It's a long list of

> tragic circumstances that I can't see coming all to fruition at precisely

> the same time very often... so I'd still say the checkpoint at the 'end

> backup' time is extremely optional.

>

> >

> > Doesn't seem like just an option for the paranoid, although I probably fall

> > into that category.

> >

>

> The real paranoia comes merely from the fact that the worry is that the bit

> of redo left in your online logs, but not yet archived, which will be needed

> to make your hot backup consistent, might be lost. But if you multiplex your

> online redo logs, that is unlikely. Left to their own devices, the online

> logs will switch themselves eventually anyway, and thus that last bit of

> needed redo will get archived by its own efforts, too.

>

> So: there are two issues. Is the checkpoint at the end of each datafile

> backup really required? Not really. Is the log switch at the end of the

> entire backup really needed? Not really. But are there circumstances where

> the failure to do either could cause trouble? Yes... but they require a

> really cacked-up database to materialise. Are there costs associated with

> being "super safe"? Absolutely... checkpoints and archiving are not free

> exercises. So where do you strike the balance? Up to you, really.

>

> It's a bit like the paranoia surrounding the "shutdown abort" command.

> That's only dangerous if, immediately after it, and before your next

> startup, you were to lose your current redo log. Now, I suppose it *is*

> possible to lose your current log, but I never have. And it would be an

> extremely rare occurrence for any database running on a half-decent O/S

> with properly multiplexed and then mirrored online logs.

>

> I suspect that anyone who has ever issued a shutdown abort has already

> indicated their faith in the survival capabilities of their online logs

> (quite reasonably, in my opinion) and thus should not need the final log

> switch at the end of a backup either.

>

> Regards

> HJR

>

>

>

>

> > Thanks!

> >

> > -- 

> > "Howard J. Rogers" <hjr_at_dizwell.com> wrote in message

> > news:3fb01fb3$0$3499$afc38c87_at_news.optusnet.com.au...

> > >

> > > "Guy Dallaire" <gd-newsgroups_at_spamex.com> wrote in message

> > > news:ulUrb.1104$IK2.109764_at_news20.bellglobal.com...

> > > > Hello,

> > > >

> > > > While visiting http://www.geocities.com/lydian_third/ (Which by the way is

> > > > a REALLY GOOD resource) I noticed in a hot backup article, that the author

> > > > is doing:

> > > >

> > > > alter tablespace ... begin backup;

> > > > host copy ...

> > > > alter tablespace ... end backup;

> > > > alter system checkpoint;            <- This puzzles me

> > > >

> > > > Why are we issuing a checkpoint ? In the article, it says that it is to

> > > > force its header SCN back into synchronisation with the rest of the

> > > > database. I don't see the point (no pun intended) of the checkpoint here.

> > > > The author really knows what he's doing, I'm sure, but I'd like to know why

> > > > this is necessary/beneficial.

> > > >

> > > > Isn't oracle supposed to know that the datafile is out of backup mode and

> > > > automatically start updating the header when needed ?

> > >

> > > As the author, I take it this means I didn't write very clearly :-(

> > > It's a while since I wrote that article, and the site's not mine, so I don't

> > > read things on it as closely as I should.

> > >

> > > Anyways: the point is this. Yes, if you wait long enough, then at the next

> > > checkpoint, the header of the just-backed-up file will be brought into

> > > synchronisation with the rest of the database all on its own and without any

> > > effort on your part. But one assumes that you are not checkpointing like

> > > crazy (otherwise performance would be none too good). Therefore, there is a

> > > window of, say, 5 minutes or more, where it is possible that your instance

> > > will crash, before the headers got resynchronised. At which point, when you

> > > start up the instance, it will moan about the data files not being in synch

> > > and the database therefore needing recovery... which it actually doesn't,

> > > but you'll probably panic and start restoring and recovering anyway (and the

> > > last person to do pretty much exactly this unnecessary recovery of a

> > > database posted here recently: he botched it up and lost data as a result).

> > >

> > > So a forced synchronisation with a spurious checkpoint is just a safety

> > > measure.

> > >

> > > There is another aspect to the safety that I didn't mention in that paper.

> > > You've just taken a hot copy of a datafile. It is possible that whilst it

> > > was being copied, some transactions affected that file (thus generating

> > > redo). Were you to lose all your online redo logs, that backup copy of the

> > > datafile might well not be useful, because the redo required to get it

> > > consistent is now missing. By forcing a checkpoint, you force DBWR to flush

> > > buffers to disk. Anytime DBWR wants to write to disk, we kick LGWR off

> > > first. Therefore, the redo for the transactions that affected the data file

> > > being copied is now in the redo logs. Doesn't help much, I suppose, if you

> > > now lose all your online logs. But hopefully there'll be a log switch in

> > > there at some point, and then the redo needed to make your recent backup

> > > actually useful is safe.

> > >

> > > In other words, by forcing a checkpoint, you go some way to trying to ensure

> > > that the redo needed to make the recent backup useable is safe on disk, and

> > > not just floating around the instance. And once it's on disk, it's likely to

> > > be archived shortly (lots of people force a log switch at the end of the

> > > entire backup for precisely that same reason).

> > >

> > > Personally, I tend not to bother with the checkpoint myself. There'll be

> > > another one along shortly anyway. But for the terminally paranoid, it makes

> > > some sense.

> > >

> > > In short, yes, Oracle will update the header "when needed" all of its own

> > > accord. But your backup might be useless if a failure intervenes before it

> > > has a chance to do so.

> > >

> > > Regards

> > > HJR

> > >

> > >

> > >

> > >

> > >

> >

> >

>

>

Received on Fri Nov 21 2003 - 20:08:32 CST