Re: Lightweight method for testing database backup processes

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Mon, 21 Aug 2017 14:37:16 -0400
Message-ID: <56f5dcba-5e9a-895c-c851-f72abdb70d6b_at_gmail.com>



I don't work with tapes much these days; mostly with things like Glacier, cheap remote storage like Isilon, or a combination of both. Backups are kept for less than a week on the primary site, around a month on the Isilon, and almost forever in Amazon Glacier. Every modern backup suite has a built-in verification mechanism that can check whether a backup is good. You can also run "restore validate" on a regular schedule. I don't see a big science here. I have restored a TB-sized database from Glacier with no problems at all.

There are also non-RMAN mechanisms like SRDF, HUR and SnapVault which can be used for backing up databases. At the last stage, a file backup of the snapshot is performed and stored in Glacier to meet regulatory obligations. How do you propose to validate those on a weekly basis?
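
For the RMAN case, a scheduled "restore validate" could look roughly like the Python sketch below. It is only an illustration: the function names, the temporary command file and the use of OS authentication ("target /") are assumptions, and whether RMAN's exit status is a sufficient success check should be confirmed against the RMAN log in your own environment.

```python
import subprocess
import tempfile


def build_validate_script(tablespaces=None):
    """Build RMAN commands that read and check backups without restoring files."""
    commands = []
    if tablespaces:
        # Check only the backups needed for the named tablespaces.
        for name in tablespaces:
            commands.append(f"RESTORE TABLESPACE {name} VALIDATE;")
    else:
        commands.append("RESTORE DATABASE VALIDATE;")
    # Archived logs are needed for recovery, so check those backups as well.
    commands.append("RESTORE ARCHIVELOG ALL VALIDATE;")
    return "\n".join(commands) + "\n"


def run_validate(script_text, rman_binary="rman"):
    """Write the commands to a temporary file and run them through RMAN."""
    with tempfile.NamedTemporaryFile("w", suffix=".rman", delete=False) as handle:
        handle.write(script_text)
        path = handle.name
    # Assumption: OS authentication ("target /") works for the invoking user.
    result = subprocess.run(
        [rman_binary, "target", "/", f"cmdfile={path}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout
```

A cron job or any other scheduler could call run_validate(build_validate_script()) once a week and alert when the first returned value is False. This covers only RMAN backups; it does nothing for the snapshot-based mechanisms.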

Regards

On 08/21/2017 02:25 PM, Matthew Parker wrote:
>
> I have to disagree with you. In most organizations it is not a DR
> test. A DR test serves a different purpose than ensuring that your
> backups and processes are good by testing them on a regular basis.
>
> It is not faith-based testing.
>
> There are a variety of tests that can be performed.
>
> First there are backups that are offsite besides just database backups.
>
> I have been in a variety of organizations that have quarterly SOX
> audits where we pull back a set of tapes based on a random selection
> of files by the auditor to verify that backups are statistically good.
>
> There are also some organizations I have worked for where the
> requirement was to test all backups yearly. It was not a single
> yearly test; we were testing backups throughout the year to verify
> that the system was working throughout the year, not just at one
> selected timepoint, and by the end of the year we had recovered each
> of the multi-thousand databases at least once. You normally set up
> automation to perform the onsite-based backups, but the selection of
> offsite backups to prove those processes too normally has some manual
> intervention.
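
The year-round rotation described above (every database recovered at least once, with the tests spread over the year rather than done at one point in time) can be sketched as a simple round-robin assignment. This is a hypothetical illustration, not the process actually used at those organizations; the function name, the weekly period and the database names are all assumptions.

```python
def plan_restore_tests(databases, periods=52):
    """Assign every database to exactly one test period, spreading the load evenly."""
    plan = {period: [] for period in range(1, periods + 1)}
    # Sorting keeps the plan stable from one run to the next.
    for index, name in enumerate(sorted(databases)):
        plan[index % periods + 1].append(name)
    return plan


# Usage: 2000 hypothetical databases spread over 52 weekly test windows.
weekly_plan = plan_restore_tests([f"db{i:04d}" for i in range(2000)])
```

Each weekly window then holds 38 or 39 restore tests, and every database appears in exactly one window per year.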
>
> Testing of a single tablespace is a viability test of the database if
> you fully recover it to open the database. This is how lots of
> organizations with Oracle databases in the 100 TB – 1 PB range test
> the viability of their backups. They don’t necessarily have enough
> space to restore every portion of the database, but can restore
> pieces at a time.
>
> I also restored system, sysaux, undo and 1 tablespace through multiple
> cycles so that in the end the complete database was restore tested.
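
The cycle described above (system, sysaux and undo restored every time, plus one further tablespace per cycle until the whole database has been covered) can be sketched as follows. The undo tablespace name UNDOTBS1 and the application tablespace names are assumptions for illustration only.

```python
# Tablespaces restored in every cycle so the test database can be opened.
# UNDOTBS1 is an assumed name; use the undo tablespace of the real database.
CORE_TABLESPACES = ("SYSTEM", "SYSAUX", "UNDOTBS1")


def restore_cycles(app_tablespaces, core=CORE_TABLESPACES):
    """Yield one restore set per cycle: the core tablespaces plus one other tablespace."""
    for name in app_tablespaces:
        yield list(core) + [name]


def is_fully_covered(cycles, all_tablespaces):
    """Return True once the completed cycles have touched every tablespace."""
    restored = {name for cycle in cycles for name in cycle}
    return set(all_tablespaces) <= restored
```

With two hypothetical application tablespaces, USERS and DATA01, this yields two cycles of four tablespaces each, and is_fully_covered only reports True after both cycles have run.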
>
> Having all your DBAs testing restores also keeps them practiced on the
> process and increases the interaction between the DBA and Backup team
> which is always a good thing. Yes, I have been at organizations where
> they do not test backups at all, and then when the on-call DBA is
> pinged to do a restore and something is wrong, they fumble through
> SOPs to try to figure out what needs to be done, and the recovery
> takes longer than it should or others have to become involved because
> the DBAs are not practicing their craft.
>
> It also helps your team capture changes in the process as sometimes the
> different teams don’t communicate well with each other and it is
> better to discover some change that could be detrimental to you during
> a test instead of when you really need it.
>
> When I first started out as a DBA, the Senior DBA in our org basically
> set up a test system and put me through 30 days of disaster recovery
> training. He would destroy the database and it was my job to restore
> it and explain how he had destroyed/broken it. It was invaluable training.
>
> *Matthew Parker*
>
> *Chief Technologist*
>
> *Dimensional DBA*
>
> *425-891-7934 (cell)*
>
> *D&B *047931344**
>
> *CAGE *7J5S7**
>
> *Dimensional.dba_at_comcast.net*<mailto:Dimensional.dba_at_comcast.net>**
>
> *View Matthew Parker's profile on
> LinkedIn*<http://www.linkedin.com/pub/matthew-parker/6/51b/944/>
>
> www.dimensionaldba.com<http://www.dimensionaldba.com/>
>
> *From:*Mladen Gogala [mailto:gogala.mladen_at_gmail.com]
> *Sent:* Monday, August 21, 2017 9:32 AM
> *To:* Matthew Parker <dimensional.dba_at_comcast.net>;
> nenad.noveljic_at_vontobel.ch; cstephens16_at_gmail.com; oracle-l_at_freelists.org
> *Subject:* Re: Lightweight method for testing database backup processes
>
> On 08/21/2017 11:04 AM, Matthew Parker wrote:
>
> Most organizations who have to participate in any type of
> compliance requirements such as SOX Compliance are required to
> test their backups.
>
>
> And most organizations do perform such testing. Such practice is
> called "DR test" and usually occurs once or twice per year. Testing on
> daily or weekly schedule is something unusual, even if it's only done
> using "restore validate".
> Furthermore, the OP proposed testing backup/restore of a specific
> tablespace. I don't see how correctness of the tablespace backup
> guarantees the correctness of the full or incremental database backup.
> That looks like a faith-based testing strategy.
> Regards
>
> --
> Mladen Gogala
> Oracle DBA
> Tel: (347) 321-1217

-- 
Mladen Gogala
Oracle DBA
Tel: (347) 321-1217


--
http://www.freelists.org/webpage/oracle-l
Received on Mon Aug 21 2017 - 20:37:16 CEST
