Re: Regular restore tests

From: Jeremy Schneider <schneider_at_ardentperf.com>
Date: Mon, 17 Apr 2017 14:10:00 +0300
Message-Id: <3BAF53D2-1C28-491D-B740-D1ED2BAE1B3B_at_ardentperf.com>



Nobody can define "enough" except you and your business. As with security, you can always do more. We all juggle many priorities with finite man-hours and cash. Business continuity or disaster recovery is one item on the list of priorities.

It's worth following the conversation about devops, though, where people study principles of modern manufacturing and agile development and look for revolutionary & counterintuitive ideas that could dramatically improve IT operations. Frankly, much of our field is still run the same way it was 30 years ago. The time is ripe for reinvention by bringing modern scientific thinking about systems and management to the table.

-J

--
http://about.me/jeremy_schneider

> On Apr 17, 2017, at 11:16 AM, Ls Cheng <exriscer_at_gmail.com> wrote:
> 
> Just wondering: I do occasional restore tests, but I mainly run restore validate against the tape and disk backups. Is that enough, or must we run a real physical restore?
> 
> Thanks
> 
> 
> 

>> On Thu, Apr 13, 2017 at 6:18 PM, Jeremy Schneider <schneider_at_ardentperf.com> wrote:
>> On Thu, 13 Apr 2017 09:46:37 +0100 John Hallas wrote:
>> > As an aside we have a server (well 2 actually - one HPUX and one
>> > Linux) that we use for backup proving.
>> > We have a couple of Oracle homes on each and we recover the backups
>> > from our Commvault repository onto these servers. If we have an
>> > 11.2.0.3 database we will recover into whichever OH is installed - it
>> > might be 11.2.0.2 or 11.2.0.4, and we do not run any
>> > upgrade/downgrade scripts. All we are testing is that the database can
>> > be restored and that the backup is fully functional.
>>
>> Nice, John!
>>
>> In my most recent position before this current freelancing spell, I
>> helped architect our restore testing process. Every week we randomly
>> picked one of our several hundred production web applications and did a
>> full out-of-place restore of both the database and application tiers. We
>> then logged into the restored application and pulled screenshots
>> showing dates of recent entries (but no sensitive data), proving the
>> date of the restore. We stored the screenshots as part of the evidence
>> on our ISO & SAS certifications. I found that auditors really liked
>> these screenshots too - proving the restore visually without lots of
>> explanation, unlike log or terminal captures.
>>
>> One of the counterintuitive principles in devops is that when something
>> is painful you should consider doing it more instead of less. Because
>> we did these restores weekly, we got very skilled and fast and reliable
>> in running them. The restores were also shared across the DBA team so
>> that we all acquired the skills. Eventually, every step was either
>> automated (by code in svn) or cut-and-pasted from a change-controlled
>> wiki.
>>
>> I was really proud of what we did on that operations team!
>>
>> -Jeremy
>>
>> --
>> http://about.me/jeremy_schneider
>>
>>
>> ##
>> Ok, I guess I just got so impressed with the size of a 64-bit value that
>> I was overwhelmed. Consider, for example:
>>
>> u64 i;
>> for (i = 1; i != 0; i++);
>>
>> Now in theory this will count each possible number, but in practice the
>> machine will die long before it ever finishes.
>>
>> - George Anzinger on linux-kernel
>> --
>> http://www.freelists.org/webpage/oracle-l
Received on Mon Apr 17 2017 - 13:10:00 CEST
