Re: Lightweight method for testing database backup processes

From: Ilmar Kerm <ilmar.kerm_at_gmail.com>
Date: Mon, 21 Aug 2017 19:36:01 +0200
Message-ID: <CAKnHwtfqmbHUQpTiYgvy4W-11VJhfhr+GW0V4Q+=LmcQWt3ZLA_at_mail.gmail.com>



Hi

We fully test all our database backups every day. Our scripts (both the backup and the autorestore scripts) are here: https://github.com/unibet/oracle-imagecopy-backup

Autorestore results are logged, and Jenkins and Nagios raise alerts if any database fails to restore. After restoring the database and opening it read only, the autorestore script can execute any query that returns a date value and compare the result with the restore target time. If the difference is larger than the tolerance interval, the restore is treated as failed.
The autorestore script runs from a separate VM that only has access to the backup NAS, not to the production storage.
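
To illustrate the date check described above, here is a minimal sketch in Python. It is not the code from the repository. The function name, the variable names, the sample values and the use of an absolute difference are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def restore_is_acceptable(restored_value, restore_target, tolerance):
    """Return True when the date read from the restored database is
    within `tolerance` of the restore target time.

    Assumption: the difference is compared as an absolute value, so a
    value slightly after the target is treated the same way as one
    slightly before it.
    """
    return abs(restore_target - restored_value) <= tolerance


# Hypothetical values: restore target and a one hour tolerance interval.
restore_target = datetime(2017, 8, 21, 12, 0, 0)
tolerance = timedelta(hours=1)

# Hypothetical results of the verification query on two restored databases.
recent_enough = datetime(2017, 8, 21, 11, 30, 0)
too_old = datetime(2017, 8, 21, 9, 0, 0)

for value in (recent_enough, too_old):
    status = "OK" if restore_is_acceptable(value, restore_target, tolerance) else "FAILED"
    print(value.isoformat(), status)
```

In practice the value being compared would come from the query run against the restored database that is open read only, for example the newest timestamp in a table that is known to be written to regularly.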

Ilmar

On Mon, Aug 21, 2017 at 3:39 PM, Chris Stephens <cstephens16_at_gmail.com> wrote:

> We are looking for an efficient way to regularly test RMAN backups across
> a large (and growing) Exadata database environment.
>
> After watching this video https://youtu.be/Ds1xrfdlZRc i thought about
> doing the following:
>
> create a dedicated, small tablespace in all databases to hold a single
> table with a single date/timestamp column. create a scheduler job to
> insert current sysdate/systimestamp value once per day and delete all rows
> older than recovery window setting for RMAN.
>
> write a script to 1) randomly pick a database on each Exadata system 2)
> randomly pick a day that falls within the recovery window requirement for
> that database 3) converts that day to a valid SCN 4) uses the new table
> PITR functionality to restore the table 5) confirms expected table content
> 6) sends success/failure summary email.
>
> execute the script with a frequency that makes us feel comfortable with
> our backups.
>
> we also intend to have a process that utilizes the "restore preview" RMAN
> command to get a list of backup pieces to run the RMAN "validate" command
> against for a randomly chosen SCN that falls within recovery window.
>
> Does anyone see any big issues with this process? Any other ideas for
> efficiently testing database backups? our databases will soon be large
> enough to make testing through full restores infeasible.
>
> any feedback is greatly appreciated!
>
> thanks,
> chris
>
>

-- 
Ilmar Kerm

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Aug 21 2017 - 19:36:01 CEST
