Re: creative use of storage snapshots.

From: kyle Hailey <kylelf_at_gmail.com>
Date: Mon, 20 Dec 2010 16:04:34 -0800
Message-ID: <AANLkTikLMPxLurHcTx3zKnPgpfemEc--LnLYJ+d8ov=j_at_mail.gmail.com>



>> it sounds to me like one of those great ideas
Yes, it's definitely a great idea whose time has come. It is being harnessed by more and more companies across different industries - we've implemented this at Shopping.com (part of eBay), Tivo (consumer tech), Corporate Express (retail), Tagged (social networking), Clearwire (telecom), KLA-Tencor (hardware), and Holland America (travel). With copy-on-write file systems and the right kind of code to manage the clones (virtual databases), virtual copies of a database can be made in minutes regardless of the size of the data, and all data blocks are shared except those that get modified. I can make multiple virtual copies of a TB-size database in minutes and basically only use the amount of space it takes for each copy to have its own private redo log files. After the virtual databases are provisioned and running, each keeps a private copy of any data block that it modifies.
Virtualizing databases saves tons of disk space and reduces the time to make copies from hours, sometimes days, to just minutes.
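
To put rough numbers on that claim, here's a back-of-the-envelope sketch in Python (the sizes are hypothetical, picked only to illustrate the ratio):

    # Ten full copies of a 1 TB database versus ten thin clones that share
    # all unmodified blocks and initially hold only their private redo logs.
    DB_SIZE_GB = 1024            # source database size (hypothetical)
    NUM_COPIES = 10              # one copy per developer
    REDO_GB_PER_COPY = 3         # e.g. three 1 GB redo log groups per clone

    full_copies_gb = NUM_COPIES * DB_SIZE_GB        # 10240 GB
    thin_clones_gb = NUM_COPIES * REDO_GB_PER_COPY  #    30 GB up front
    print(f"full copies: {full_copies_gb} GB, thin clones: {thin_clones_gb} GB")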

I cringe when I talk to customers who don't have development or QA copies of production. Usually developers work on subsets of production, and query performance on these subsets often doesn't reflect performance in production. Even when development has a full copy of production, there is often just one copy shared by all the developers and QA. Any change to this copy has to go through a number of hoops and signoffs, which slows down the development process and demotivates developers. Imagine having a copy of production per developer, where each developer can make and test as many changes as they want, as fast as they want, on an exact copy of production - it's awesome and it inspires the developers. Once the developers are confident in their modifications, they can submit the final change request for review.

Delphix makes all of this easy, with no changes on production and creation of virtual databases in just a few clicks of a mouse in a GUI.

It can, of course, be done by hand, as a few people have explained. Oracle described how to do it in a white paper: http://www.oracle.com/technetwork/database/features/availability/maa-db-clone-szfssa-172997.pdf

Three ways that come to mind to create these virtual copies:

  1. snapshotting the production file system with something like EMC's SnapView, NetApp's FlexClone, or ZFS (a ZFS sketch follows this list)
  2. doing the same thing on a Data Guard copy
  3. using Delphix
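
Here's what option 1 looks like with ZFS, as a minimal sketch (the dataset and developer names are hypothetical; it assumes a ZFS pool, root privileges, and a crash-consistent copy of the datafiles - e.g. taken during hot backup mode or on a standby - rather than a casual snap of live production):

    import subprocess

    def sh(cmd):
        print("+", cmd)
        subprocess.run(cmd, shell=True, check=True)

    SOURCE = "tank/oradata/prod"           # dataset holding the datafiles

    sh(f"zfs snapshot {SOURCE}@refresh1")  # point-in-time, copy-on-write

    # Each clone is a writable, independently mountable view of the
    # snapshot; it consumes space only as its blocks diverge from it.
    for dev in ("alice", "bob", "carol"):
        sh(f"zfs clone {SOURCE}@refresh1 tank/oradata/vdb_{dev}")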

Snaps on production could be problematic if production, QA, and dev all share the same file system. A better approach would be copying the snapshot clones off to another file system, or mirroring the file system and breaking the mirrors.

With Data Guard, the solution, as proposed by Oracle, is to have the production database, then a secondary Data Guard database, then copy the Data Guard database onto Oracle 7000 series storage with ZFS, and then use ZFS snapshotting to provision databases. Of course, requiring a special storage device and an extra Oracle license as well as a Data Guard license is not cheap.
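
The snapshot step on the standby might look something like this - a hedged outline, not the white paper's exact procedure (the dataset name is hypothetical; it assumes the standby's datafiles sit on ZFS and sqlplus is on the PATH). The idea is to pause redo apply so the datafiles are consistent, snap, then resume:

    import subprocess

    def sqlplus(stmt):
        subprocess.run('sqlplus -s "/ as sysdba"', shell=True,
                       input=f"{stmt};\nexit\n", text=True, check=True)

    def sh(cmd):
        subprocess.run(cmd, shell=True, check=True)

    # Quiesce redo apply so the standby datafiles stop changing.
    sqlplus("ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL")

    sh("zfs snapshot tank/oradata/standby@clone_src")  # instant, copy-on-write

    # Resume redo apply; clones cut from @clone_src can be opened elsewhere.
    sqlplus("ALTER DATABASE RECOVER MANAGED STANDBY DATABASE "
            "USING CURRENT LOGFILE DISCONNECT")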

For Delphix (where I work), we don't modify production at all. We copy the production database onto a Delphix appliance using an optimized RMAN API; the appliance typically achieves a compression ratio of 2:1 or better (i.e. the copy takes 50% of the space or less). The Delphix box then exports the database files over NFS so they can be mounted on other machines. Each machine gets a private view of the datafiles, i.e. a virtual database. All data is shared unless modifications are made.
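
On the consuming side there's very little to it - roughly this, with hypothetical hostnames and paths (the appliance exports each virtual database's files; the target host just mounts them where the instance expects its datafiles):

    import subprocess

    EXPORT = "delphix01:/vdb/dev_alice"      # hypothetical NFS export
    MOUNTPOINT = "/u02/oradata/dev_alice"

    subprocess.run(f"mkdir -p {MOUNTPOINT}", shell=True, check=True)
    # The mount options matter: all datafile I/O rides on this mount.
    subprocess.run(f"mount -t nfs -o rw,hard,tcp,rsize=32768,wsize=32768 "
                   f"{EXPORT} {MOUNTPOINT}", shell=True, check=True)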

Caveats

Space
Since modified blocks are copied and kept separate, the more data blocks each virtual database changes, and the longer it lives before being refreshed from a newer snapshot of production, the more space it will take. It's unlikely that the VDBs will ever reach the size of the base database, but they could get close with enough widely distributed changes. With Delphix, both the original copy of the database and all changed blocks are compressed, so a virtual database is always much smaller than the original.
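
A rough model of that growth (all numbers hypothetical; it also ignores that re-modifying the same block costs nothing extra, so it's closer to an upper bound):

    DB_SIZE_GB = 1024
    CHANGED_FRACTION_PER_WEEK = 0.02   # distinct blocks rewritten each week
    WEEKS_BEFORE_REFRESH = 4
    COMPRESSION = 0.5                  # 2:1 compression on changed blocks

    vdb_gb = (DB_SIZE_GB * CHANGED_FRACTION_PER_WEEK
              * WEEKS_BEFORE_REFRESH * COMPRESSION)
    print(f"per-VDB footprint after 4 weeks: ~{vdb_gb:.0f} GB "
          f"of a {DB_SIZE_GB} GB source")   # ~41 GB

Refreshing from a newer snapshot resets that footprint back toward zero.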

Performance
In the Delphix case, performance is generally on par with production given the same hardware. The biggest issue I see is people skimping on the network connection: since the datafiles are accessed over NFS/TCP, any skimping on the network will hurt "physical" read performance. Often what Oracle thinks is a physical read is really a read from the read cache on the Delphix box. Read performance on Delphix often shows "super scalability", since multiple virtual databases use the same read cache on the Delphix box, and since most blocks are shared, the cache gets loaded with blocks that end up benefiting the other databases sharing that same underlying database.
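
A toy illustration of why that shows up as super scalability (the service times and hit ratios are made up; the point is just the shape of the curve - more VDBs sharing blocks means a warmer shared cache and a lower average "physical" read time):

    CACHE_MS, DISK_MS = 0.3, 6.0        # assumed service times

    def avg_read_ms(hit_ratio):
        return hit_ratio * CACHE_MS + (1 - hit_ratio) * DISK_MS

    for vdbs, hit in [(1, 0.40), (5, 0.75), (10, 0.90)]:
        print(f"{vdbs:>2} VDBs, cache hit {hit:.0%}: "
              f"{avg_read_ms(hit):.2f} ms/read")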

Management
As with anything, the more manual steps, the more time-consuming, brittle, and error-prone the process. Eliminating errors usually takes some time and iteration. It's completely doable by hand, but it's nice to have it all packaged in an easy UI and to have someone else streamline, automate, and stabilize the process.

--
http://www.freelists.org/webpage/oracle-l