Re: Moving DR site from 30miles to 1600miles

From: Craig I. Hagan <hagan_at_cih.com>
Date: Wed, 9 Apr 2008 22:36:54 -0400 (EDT)
Message-ID: <Pine.LNX.4.64.0804092228140.19160@colo.cih.com>


>
> We are planning to move our DR site which is currently about 30 miles from
> production site to ~1600 miles away. We currently have a 4-node RAC setup on
> our production site that houses 3 production instances (all 10.2.0.3 on
> Solaris 10). The SAN is Storagetek and we use ASM for volume management.
> In our testing, we are hitting issues in network transfer rates to the
> 1600-miles site -- a simple "scp" of 1GB file takes about 21 minutes. We
> generate archives at the rate of approx 1GB/8minutes. The network folks tell
> me that the TCP setting is a constraint here (currently set to 64k
> window-size which Sysadmins here say is the max setting). We have an Oc3
> link that can transfer @ 150Mbps (that is what the networking team tells
> me).
>
> I've an SR open w/ Oracle and have also gone thru few Metalink notes that
> talk about optimizing the network from dataguard perspective. One of the
> notes I came across also talks about cascaded standby dataguard setup (one
> standby local pushes logs to the remote site).
>
> I'm trying to collect ideas how others are doing it under similar scenarios
> and if there is something we can do to utilize the entire network bandwidth
> that we have available to us.

I've done this and similar moves a few times.

scp is a poor test of WAN bandwidth. It just can't put enough bytes on the wire when you have a long, fast pipe. Seriously, even with a well-tuned stack.

Also, I'm surprised at a max window size of 64k. Which OS/version? That is surprisingly low; perhaps your sysadmins aren't aware of the RFC1323 extensions to TCP? Or perhaps your OS uses a separate parameter set to control "window scaling," which is how TCP got around the 16-bit window field in the header maxing out at 64k. Suggest that they work with the OS vendor to sort out how to properly deploy/tune the RFC1323 extensions.
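To see why a 64k window chokes this link, work the bandwidth-delay product. A quick sketch (the ~40 ms round-trip time for ~1600 miles is an assumption for illustration; measure yours with ping):

```python
# Bandwidth-delay product: how many bytes must be "in flight" to fill the pipe.
link_bps = 150e6   # OC-3 payload rate quoted by the network team (bits/s)
rtt_s = 0.040      # assumed round-trip time for ~1600 miles; measure with ping

bdp_bytes = (link_bps / 8) * rtt_s
print(f"BDP: {bdp_bytes / 1024:.0f} KiB needed in flight")           # ~732 KiB

# Throughput ceiling imposed by a fixed 64 KiB window:
window_bytes = 64 * 1024
ceiling_bps = window_bytes * 8 / rtt_s
print(f"64 KiB window caps you at {ceiling_bps / 1e6:.1f} Mbit/s")   # ~13 Mbit/s
```

With those assumed numbers the window, not the OC-3, is the bottleneck at roughly 13 Mbit/s, which is at least in the same ballpark as the ~6.5 Mbit/s the scp test actually achieved.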

Some things I've learned:

  1. I used bbcp to transfer my datafiles. scp just can't copy fast enough to saturate a fast link. Note: bbcp compression (-c) maxes out at about 100 Mbit/s raw transfer rate (you may see more due to compression). Without compression I can easily saturate WAN connections of several hundred megabits given sufficiently large TCP windows. Since you are on ASM, using bbcp to transfer your datafiles is likely right out, but you CAN use it to test your performance by moving real-world data (e.g. archived logs).
  2. The MAX_CONNECTIONS parameter for dataguard is key for moving redo. I believe the maximum number of connections is five; experiment.
  3. You will want enough archiver processes to handle everything on all hosts: that would be the sum of the MAX_CONNECTIONS for each destination, plus two for local archiving.
  4. For tuning your TCP stack, you will want to maximize the allowed TCP windows and enable window scaling. This is critical.
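For item 1, a bbcp run along these lines is where I'd start (hostnames, paths, and the -s/-w values are illustrative; check bbcp's help output on your build for the exact flags):

```shell
# Push archived logs with multiple streams and a large window (values illustrative).
#   -s 8   : 8 parallel TCP streams
#   -w 2m  : 2 MB window per stream (needs RFC1323 window scaling on both ends)
#   -P 10  : progress report every 10 seconds
bbcp -s 8 -w 2m -P 10 /arch/dest1/*.arc oracle@drhost:/arch/staging/

# Same transfer with compression; remember -c tops out around 100 Mbit/s raw.
bbcp -c -s 8 -w 2m /arch/dest1/*.arc oracle@drhost:/arch/staging/
```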
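Items 2 and 3 sketch out as init parameters roughly like this (the "drdb" service name and the exact process count are assumptions for illustration; on 10.2, MAX_CONNECTIONS caps at 5):

```sql
-- Remote destination using up to 5 archiver connections to parallelize redo shipping.
ALTER SYSTEM SET log_archive_dest_2 =
  'SERVICE=drdb ARCH MAX_CONNECTIONS=5 DB_UNIQUE_NAME=drdb
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)';

-- Archiver processes: sum of MAX_CONNECTIONS across destinations plus two for
-- local archiving, e.g. 5 (remote) + 1 (local dest) + 2 = 8.
ALTER SYSTEM SET log_archive_max_processes = 8;
```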
Received on Wed Apr 09 2008 - 21:36:54 CDT
