Preparing to recover
The following steps should be followed in the event of a media or system failure. In the event of a disaster (the physical system or computer room is destroyed), these steps should not be used and the Database Administrator should refer to the Business Continuity manual for the Organization.
- Remain calm and do not panic.
- Determine the severity of the failure.
- Assess the recovery strategies. If necessary, refer to an Oracle Backup and Recovery Manual or call Oracle World Wide Support for assistance (See the organization's Site Guide for account information).
- If the disk containing the online backup is accessible, use that data to restore the database, otherwise work with the UNIX or Windows administrators to restore necessary files from tape.
- Recover the database.
- If an incomplete recovery was necessary, get a cold backup of the database immediately to protect the company against another failure in the near future.
 Why is Recovery Essential
To improve performance, Oracle keeps many changes in memory, even after they are committed. It may also write data to the datafiles to free up memory, even though the changes have not been committed. At the time of a failure, all data in memory is lost. In order to ensure no committed changes are lost, Oracle records all operations in an online redo logfile. The information in the log file allows Oracle to redo any operations that may be lost in a failure. Writing to the logfile does not hurt performance, because these writes are sequential and very fast. Writing to datafiles on the other hand is random and can be very slow because the disk block to be modified on disk must be located, and the disk head properly positioned for every write.
To Understand Recovery more clearly, we need to Understand two "Buzz Words", Roll Forward and Rollback.
 Roll forward
During the roll forward phase, Oracle replays transactions in the online redo log beginning with the checkpoint position. The checkpoint position in the place in the redo log where changes associated with previous redo entries had been saved to the datafiles before the failure (Each data file, in its header, has a checkpoint structure the contents gets incremented every time LGWR issues a checkpoint to the DBWR. The checkpoint structure has two structures checkpoint counter and SCN). As Oracle replays the redo operations, it applies both committed and uncommitted changes to the datafiles. At the conclusion of the roll forward phase, the data files contain all committed changes, as well as new uncommitted changes (applied during roll forward) and old uncommitted changes (saved to the datafiles to free up space in buffer cache prior to the failure). Some questions which arises here are:
- Committed changes - Due to the way DBWR writes it is possible that, data changed and committed in the Database Block Buffers, make entry in the Online RedoLog files, but may not necessarily be written to the Database Files by DBWR.
- Uncommitted Changes - DBWR may write uncommitted changes to the Database File (During High Transaction Activities), to make room in the Database Block Buffers, for new transactions.
During the rollback phase, Oracle searches out changes associated with dead transactions that had not committed before the failure occurred. It then uses transaction tables in rollback segment to rollback the change to its previous value. At the completion of this phase, the data returns to a transactionally consistent state. Oracle can be open during the rollback phase. Any new transaction that tries to modify a row locked by a dead transaction will need to rollback the transaction blocking its path to release the lock.
When the database is opened a start SCN is recorded in the control file for every data file associated in the database and a stop SCN is set to Infinity.
During normal database operation's, The SCN and the checkpoint counter, information in the data file header is incremented every time a checkpoint is done.
When the database is shutdown with the normal or immediate option, an end SCN is recorded in the data file header, This information is also recorded in the control files, i.e. end SCN of the datafile is equal to the stop SCN of the control file.
When database is opened the next time, Oracle makes two checks:
- If end SCN in the data file header matches its corresponding stop SCN in the control file.
- If checkpoint in the data file header matches its corresponding checkpoint counter in the control file.
Let us say you shut down the database in ABORT mode:
- checkpoint is not performed and the STOP SCN in the control file is left at infinity (the same state when you started or opened your data files).
- For example end SCN in the datafile header is "1000" and stop SCN in the control file is "Infinity"
In this case Oracle performs crash recovery and as a part of crash, Oracle reads the on-line redo log files and applies the changes to the database as a part of the roll forward and reads the rollback segment's transaction table to perform transaction recovery (roll backward).
You can often use the table V$RECOVER_FILE to determine which files to recover. This table lists all files that need to be recovered, and explains why they need to be recovered.NoteThe table is not useful if the control file currently in use is a restored backup or a new control file created since the media failure occurred. A restored or re-created control file does not contain the information Oracle needs to fill V$RECOVER_FILE accurately.
The following query displays the file ID numbers of datafiles that require recovery:
SELECT file#, online, error FROM v$recover_file;
 Recovery Scenarios
- Complete Media Recovery
- Incomplete Media Recovery
- Datafile belonging to UNDO tablespace
- Archivelog mode and datafile recovery (non-SYSTEM tablespace)
- Archivelog mode and redolog recovery
- Control file recovery
- One or more datafiles damaged
- Datafile damaged and backup not available
- Recovering from user errors