DISASTER RECOVERY: Our experience and questions for Gurus!

From: Deja User <dejacom_at_my-deja.com>
Date: Mon, 06 Dec 1999 20:07:28 GMT
Message-ID: <82h51r$sjk$1@nnrp1.deja.com>

We had to do a point-in-time recovery and ran into some unexpected scenerio. Your help will be appreciated. (FYI, the most important question is at the very end).

First on What triggered a recovery:
A table space was accidently dropped. But since users were logged in and actively working, the "drop tablespace" command did not complete when it hit a "locked" record (we use "select for update nowait" in our applications). So we had a tablespace with half of the tables gone. According to alert log/trace file, it indicated that it encountered an resource busy error during "tablespace drop". So, the "drop tablespace" didn't have a "completed" entry in the log.

First attempt at recovery:
Wishfull thinking led to believe that since the "drop tablespace" didn't complete and abnormally ended due to an error, simply shutting down the db and restarting it would fix the error. For some reason it was not attempted. FIRST QUESTION: do you think it would have worked? i.e., would restarting the DB would have "fixed" it? Your guess would do!

Second attemp at recovery:
We took the tablespace's datafile from an old backup, copied it over the existing datafile and issued the command "recover tablespace". Again the wishfull thinking was at work that since the "drop tablespace" did not complete due to an exception, somewhere internally it would have been flagged as an "in-doubt" command/operation and "recover tablespace" will bring back the tablespace just prior to the "drop" (i.e. abended operation) command. Recover command did initiate the recover process where it was re-applying all operations from re-do logs. But when it ended, we had the same *incomplete* tablespace. Just like it was right after the drop command--the last operation executed! So much for that. Oracle, dutifully, had re-applied the drop command as well. SECOND QUESTION: From the behavoiur of "drop tablespace" even after re-applying re-do logs, it seems that "drop tablesapce" command, internally, is sort of like a macro. It first issues individual drop commands against all object (tables?) it contains, and then when they all succeed, it marks itself empty. So, in simplest form, a drop tablespace is a very long "drop table(s)" command. And in the re-do logs, it doesn't record the "drop tablespace" as the operation, but each individual "drop table" is logged. Is that true? That is the only explanation I can think offor an incomplete tablespace after recovery even though the original "drop tablespace" command abended.

Third attemp at recovery:
It was the plain old point-in-time recovery with datafiles copied from an old backup. The command used was "recover database until time "1999- 12-03:11:00:00". Notice that we don't have milliseconds in the command. Should we? Anyway, we didn't have the change number prior to the "drop tablespace" so we only had the "time" option. THIRD QUESTION: How can a change-number be obtained for an operation which you are trying to recover upto? For example, in our case, how could we get the change- number for the last commit prior to the "drop tablespace" command. Continuing on...when the recover ended, we started the database with "alter database open resetlogs". FOURTH QUESTION: Why doesn't oracle dis-allow 'noresetlogs' as an option after an incomplete recovery? After a point-in-time recovery, I can't think of *ANY* reason to first open the database, after point-in-time recovery, with "noresetlogs" option. Doing so just rolls the DB forward to the final state and all your point-in-time recovery was for nothing! Moving on... When we looked at the data, three of the tables reported following errors when we tried to view their data: ORA:01578 Oracle datablock corrupt, file #7, block # 285699. ORA:01110 datafile 7 <...blah blah blah...> MOST IMPORTANT QUESTION: How do we fix this error. How can we fix such a corrupt datablock after a point-in-time recovery without rolling forward "dangrously" (I hope you know what I mean by dangrously!). Our concern is thatrecover tablespace and recover datafile commands don't have UNTIL as an option. So a simple recover will only roll them foward. Or not? Or, is it that at this point in time, we should have opened the db with "resetlogs" option (that we did) and then issued the recover tablespace/datafile command to only fix that corrupt block? WHAT IS THE WAY TO DEAL WITH IT? Thanks very much for your help!
Regards,
Mike.

Sent via Deja.com http://www.deja.com/
Before you buy. Received on Mon Dec 06 1999 - 14:07:28 CST