Re: Backups versus snapshots

From: Paresh Yadav <yparesh_at_gmail.com>
Date: Fri, 19 Sep 2014 03:05:55 -0400
Message-ID: <CAPXEL0KZFuRF8XWPx61HmTrB4nFsPRLXaWvv5sL37Zd7ZGEyhA_at_mail.gmail.com>



Thanks again, Matthew, for the detailed response. "... 1 manager and 3 engineers, responsible for tape systems and backup SW worldwide (6 PB/month)....", that is quite a scalable automation/arrangement, congrats!

Your non-profit example reminds me of when I worked for a small company a long time ago, where we took the backups to disk, copied them to RW DVDs, and deposited them in a nearby bank safety deposit box every day from Mon-Fri. I think we kept daily backup DVDs for 7 days, weekly for 3 months and yearly for 7 years (it never went beyond 3 years in practice as the company was acquired). We never had to recall old DVDs, but I am not sure the DVDs would have been readable after 3 or 7 years. We occasionally used the backups from "yesterday" to refresh dev/test environments, which kind of tested the backups. The catalogue was nothing more than what was written on the DVD jackets. So yes, whatever works given the resources!
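
For what it is worth, the daily/weekly/yearly retention scheme above can be sketched roughly as follows. This is only an illustration in Python, not what we actually ran: treating Friday's DVD as the weekly copy and the last Friday of the year as the yearly copy are my assumptions, as are the names (should_keep, DAILY_DAYS, etc.) and the 91-day approximation of "3 months".

```python
from datetime import date, timedelta

# Retention windows, in days. The 91 and 7 * 365 figures are approximations
# (assumptions) of "3 months" and "7 years".
DAILY_DAYS = 7
WEEKLY_DAYS = 91
YEARLY_DAYS = 7 * 365


def is_weekly(d: date) -> bool:
    """Assumption: the Friday backup doubles as the weekly copy."""
    return d.weekday() == 4  # Monday is 0, so Friday is 4


def is_yearly(d: date) -> bool:
    """Assumption: the last Friday of a calendar year is the yearly copy."""
    return is_weekly(d) and (d + timedelta(days=7)).year != d.year


def should_keep(backup_date: date, today: date) -> bool:
    """Return True if a DVD written on backup_date is still within retention."""
    age = (today - backup_date).days
    if age < 0:
        raise ValueError("backup_date is in the future")
    if age <= DAILY_DAYS:
        return True
    if is_weekly(backup_date) and age <= WEEKLY_DAYS:
        return True
    if is_yearly(backup_date) and age <= YEARLY_DAYS:
        return True
    return False


if __name__ == "__main__":
    today = date(2014, 9, 19)
    for d in (date(2014, 9, 15), date(2014, 9, 1), date(2013, 12, 27)):
        print(d, "keep" if should_keep(d, today) else "recycle")
```

Anything for which should_keep returns False could be pulled from the deposit box and reused. A real setup would also want a catalogue better than the DVD jackets.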

Thanks
Paresh
416-688-1003

On Fri, Sep 19, 2014 at 2:19 AM, Dimensional DBA < dimensional.dba_at_comcast.net> wrote:

> We didn’t follow the version route. The Global backups team at Amazon was
> 1 manager and 3 engineers, responsible for tape systems and backup SW
> worldwide (6 PB/month). The objective at Amazon was automation, not scaling
> the human work force, and solving the problems of today and the future with
> good design.
>
>
>
> The new process was to simply upgrade and uplift media in the 7 year
> cycle. The 7 year cycle was based on calculations of tape media failure
> given the more hostile 90 degree Fahrenheit temperature environment and
> the 3 week overwrite reuse of the tapes in most instances. The objective
> was to eliminate the equipment version problem: let newer versions of the
> LTO tape drives come into play and, by the second version down the road
> that could still read 2 versions back, start the retrofit of the old
> media to newer version media, then basically dump all the old equipment.
> You do what you have to do and then modify process, procedures and
> architecture to eliminate your problems. We also did a lot with process
> to eliminate humans interacting with the media, which has the greatest
> potential for tape loss, and with monitoring the equipment to ensure
> proper operation of the tape drives, plus rate detection to adjust
> backups to eliminate shoe shining of tapes.
>
>
>
> For the specific recovery I had external disk copies of Oracle back to
> Version 7. I sort of maintain my own version copies at home. (If anyone has
> versions older than 7.x I would love to have them.)
>
> I also maintain older copies of Linux, HPUX, Solaris and MS Windows. As
> for hardware, no planning had been done at the time for maintaining older
> versions of equipment or for what you would really need to do a restore,
> so I fell back on eBay for the actual drive, as I already had an old
> server tucked away.
>
>
>
> The 1997 copy was actually an export instead of a regular backup, as
> someone had the foresight at the time to think about the complexity of a
> full database restore if you were not storing all the other components,
> although later I found CDs of a backup too. The recovery was actually off
> of 8mm DAT. I had a few later ones from CDs.
>
> Realistically speaking, some of the recoveries came down to luck, given
> the lack of care of the media.
>
> The longer time that you may be thinking of comes from most people
> thinking linearly instead of performing multiple tasks in parallel where
> possible. I have seen a backup team not start any work until the tape is
> actually in their hands, when there is associated work that could be
> performed first, like prepping the server and installing the relevant SW,
> so the restore can start as soon as the media arrives. I have had to push
> some teams in certain companies because the media was sitting there
> waiting for the DBA team to formally request that the backup team start
> the restore.
>
>
>
> In this case the 8mm tape was actually in a desk drawer I took over from
> the previous manager, so retrieving the item in question was a simple
> check off the list as it was at my desk. If I had had to go through the
> 12K+ tapes that were stored in Iron Mountain prior to the year 2004, then
> it would have taken much longer, as the previous team had performed an
> upgrade in 2003 on Symantec SW and had simply installed net new, and the
> previous catalog was lost. Also in 2000 Iron Mountain had upgraded their
> systems and everything prior to 12/2000 was listed as ingested by Iron
> Mountain on that date. So if I really wanted a specific tape in Iron
> Mountain from before 2003, then we would have had to retrieve every tape
> stored in Iron Mountain prior to 2004 and read them all to rebuild a
> catalog to find anything (estimated time would have been 9 months). There
> are lots of things that can go wrong in the infrastructure if you are not
> thinking about the long term. That includes disk storage, if your vendor
> is not using some technology to counter bit rot and to verify data moved
> from point A to point B as they perform data moves to upgrade equipment.
>
>
>
> Since I had the media immediately available, it was just kick start the
> server and install the OS, then install Oracle: less than 6 hours. The
> longest wait was 2 days for the arrival of the drive. (It saved me time
> from having to dig through hundreds of boxes in my storage shed, where I
> know tape drives of all sorts remain buried to this day.)
>
>
>
> Once we got the processes down, we had tape backups of the OS kickstart
> servers with copies of all images of the OS used, along with all the
> installable database SW homes, so we could restore to any specific OS
> and version. You still have to deal with driver problems with the OS
> sometimes or relink problems with the Oracle homes. In some cases this is
> why a complete system image including database may be stored. (Every
> situation has some differences in what you are trying to do.)
>
>
>
> As to small business or large business, you have to have a process and
> understand the technology. Tape systems are not expensive for the small
> business if you use the smaller systems from the smaller tape system
> vendors. For example, you can buy a single tape drive desktop unit with 8
> slots (I had one of the first ones back in 1996) that currently costs
> only about $4K, whereas anyone who purchases from the large scale system
> vendors knows that the list price on a single, say, LTO6 drive is 5 times
> that. There are also replaceable disk units or, as I have seen at some
> small companies, they simply attach a USB drive to the back of the
> server, then send it off to storage at Iron Mountain. It is the longer
> term process where media needs to be pulled back and converted if
> necessary. If you are a business under regulatory compliance you will do
> what is necessary to accomplish the task. How well you accomplish the
> task varies, sadly, by the humans involved.
>
>
>
> I remember writing backups to tape systems with mt and tar and
> manipulating the library directly with shell scripts before all the nice
> backup SW existed. It is all doable for even small companies, but you must
> have process. Yes, it takes some extra human effort to perform the job. I
> have seen some really small businesses, such as a local community center
> (non-profit, which means spend no money if possible as it all should be
> spent to help the community), that had a single DAT drive on their Windows
> server and pulled the tape out every morning after the backup ran the night
> before as a safety measure for their data. Yes, the kindly old lady at the
> computer dutifully took the tape backup from the previous night home every
> night in her purse and kept it in a cabinet for a couple of months before
> bringing the old tapes back.
>
>
>
> You have a variety of choices to make relative to cost, simplicity, risk
> etc. (disk, tape, cloud, nothing). There is not one right answer for every
> business, as each aspect of your choices has a different priority to the
> business.
>
>
>
> I worked at a lot of companies working closely with the CFO on financial
> systems, and from the CFO's perspective the concept was to do what is
> necessary to ensure compliance and keep them out of jail. You present
> options with risk analysis and they will choose what level of risk versus
> cost they are willing to take. Your job is to implement the system with
> proper monitoring and processes to ensure reliability, including testing
> restores on a regular basis. Even the best system can be undermined by
> the humans or by neglect.
>
>
>
>
>
>
>
> *Matthew Parker*
>
> *Chief Technologist*
>
> *425-891-7934 (cell)*
>
> *Dimensional.dba_at_comcast.net*
>
> *View Matthew Parker's profile on LinkedIn*
> <http://www.linkedin.com/pub/matthew-parker/6/51b/944/>
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Paresh Yadav
> *Sent:* Thursday, September 18, 2014 9:15 PM
> *To:* iggy_fernandez_at_hotmail.com
> *Cc:* kmoore_at_zephyrus.com; Oracle-L_at_freelists.org
> *Subject:* Re: Backups versus snapshots
>
>
>
> Thanks Matthew for sharing your valuable experience at Amazon. As Hemant
> mentioned, you must have preserved all associated tech (tape library to
> read the old tapes, machine and OS version that can run the old db
> software version, db software at a version that can restore the backups
> etc.). And this needs to be done for all possible tech (hardware and
> software) and every version of it that gets used over a period of time.
> Amazon can afford the infrastructure and manpower required to maintain
> this, but how does an SMB meet a 7 year regulatory retention requirement?
>
>
>
> What was the typical time to recover a 1997 Oracle db backup (probably Oracle
> version 7.x) in 2010 after having to install Oracle 7.x software on a
> compatible OS and hardware? This will involve not only locating the backups
> but also the software install media and the hardware that can run the
> software.
>
>
> Thanks
>
> Paresh
>
> 416-688-1003
>
>
>
>
>
> On Fri, Sep 19, 2014 at 12:04 AM, Iggy Fernandez <
> iggy_fernandez_at_hotmail.com> wrote:
>
> snapshots or backups are just means to an end; that is, meeting the
> availability and regulatory requirements within the available budget. if,
> for example, you have regulatory requirements to store data for a certain
> number of years, then you could copy the contents of the snapshots to tape.
>
>
>
> re: if the database goes poof then the snapshot is gone as well
>
> if the database goes poof, then the snapshot remains
>
>
>
> iggy
>
>
>
> > To be clear, the snapshots are not physical copies of the database. They
> only track the differences between the database at the time of the snapshot
> and the current time. So if the database goes poof then the snapshot is
> gone as well.
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Sep 19 2014 - 09:05:55 CEST
