Re: The Idiot’s Guide to Recovering from Truncating a Production Table

From: Tim Gorman <tim_at_evdbt.com>
Date: Wed, 19 Mar 2008 19:10:46 -0600
Message-ID: <47E1B996.9090407@evdbt.com>


David,

It is not always easy to determine the importance of a single table in a complex application, never mind the downstream impact of a unilateral decision to SHUTDOWN in the middle of a work day. In many circumstances, step #2, not step #1, could be a a true career-limiting move. Personally, I'd reverse the order of steps #2 and #3...

Also, step #13 is missing an important step between the SHUTDOWN and STARTUP command -- BACKUP DATABASE (full or level-0). Before you open that newly-recovered database for users, after a prolonged and unexpected outage, you need to ensure that you can recover that mutilated database from any further failures. And, you may very well have to do that full database backup "cold", with the users shut out. Think about it -- if some failure should occur prior to the completion of a "hot" backup? So, step #13 can be maddeningly prolonged, but it is necessary, unless one wishes to start from step #1 all over again...

But all this can be avoided if you instead perform a point-in-time recovery to a separate "clone" database, which consists of just all the tablespaces that contain undo/rollback segments (including SYSTEM) and the set of datafiles containing the table that was truncated. Once this offline clone database is restored, you can then import the recovered table back into the production database. The step involving some form of import of the table may seem inefficient, but the advantages are that you can retain any records added after the truncate, you don't lose the data in the other tables, and you do not have to perform the full/level0 database backup in order to maintain recoverability.

The last time I had to do this in production was prior to the advent of RMAN. We did a PITR recovery to a clone database, imported the table, brought the applications back online within 90 mins of the DROP TABLE, then went out and got smashed the rest of the afternoon. We returned to the office after dark to get our cars, and found a new production outage in progress. As I was smashed and this happened back in the 20th century, 13 years ago, I don't recall what that second outage was, but I remember our happy buzz turning to mean hangover in an instant, followed by a late and painful night. Funny the things you remember... ;-)

Hope this helps...

-Tim

David Taft wrote:
> All,
>
> After reading some of the posts on the "DBA Future" thread, it
> reminded me of the worse mistake I ever made as a DBA. Immediately
> after the incident I documented the actual recovery procedures partly
> in jest. Thankfully I haven't had a job interview in 15 years and
> don't anticipate having one anytime soon, so I feel relatively safe
> sharing this here for anyones possible enjoyment. :-)
>
> Cheers,
>
> David
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> The Idiot's Guide to Recovering from Truncating a Production Table
>
> In this scenario you just truncated a table in production that you
> meant to truncate in development. Now what do you do?
>
> 1) Shout OHHH SH**! (fill in the '*' with whatever letters you think
> are appropriate)
>
> 2) After you've gotten over the initial shock of this monumental mistake, do a
>
> Shutdown immediate;
>
> 3) Stop, think and immediately tell your manager or coworkers what you did.
>
> 4) Curse a coworker who had nothing to do with YOUR mistake. It won't
> help the situation, but now you can feel bad for two stupid mistakes.
>
> 5) Apologize for cursing your coworker and accept FULL responsibility
> for YOUR mistake.
>
> 6) Now that you are rational, calmly discuss with your
> manager/coworkers the best course of action. If it is decided to
> proceed with a point-in-time recovery, then continue with the
> following steps.
>
> 7) Backup the controlfile in case you have to restart the recovery:
>
> Alter database backup controlfile to
> '$ORACLE_HOME/dbs/$ORACLE_SID_backup.ctl';
>
> 8) Check the alert log for the exact time you started the "shutdown immediate".
>
> 9) Think about how much time may have passed between step 1 and 2,
> then decide on a safe point-in-time before the truncate occurred.
>
> 10) Run 'rman'.
>
> RMAN> connect target /
> RMAN> restore database;
> -- After restore completes
> RMAN> exit
>
> 11) Run 'sqlplus "/ as sysdba'
>
> SQL> recover database until time 'YYYY-MM-DD:HH24:MI:SS';
> SQL> alter database open resetlogs;
> --shutdown may take a while to apply undo from recovery.
> SQL> shutdown immediate;
> SQL> Startup restrict;
>
> 12) Check to make sure everything looks OK.
>
> 13) If everything looks good, then open the database to the users:
>
> SQL> shutdown immediate;
> SQL> startup;
>
> 14) Thank your coworkers for all their help and be grateful you still
> have a job.
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Mar 19 2008 - 20:10:46 CDT

Original text of this message