Re: Moving DR site from 30miles to 1600miles

From: Jakub Wartak <vnulllists_at_pcnet.com.pl>
Date: Wed, 9 Apr 2008 21:35:07 +0200
Message-Id: <200804092135.08009.vnulllists@pcnet.com.pl>


On Wednesday, 9 April 2008 at 17:39, Ravi Gaur wrote:
> Hello all,
>
> We are planning to move our DR site, which is currently about 30 miles from
> the production site, to ~1600 miles away. We currently have a 4-node RAC setup
> on our production site that houses 3 production instances (all 10.2.0.3 on
> Solaris 10). The SAN is Storagetek and we use ASM for volume management. In
> our testing, we are hitting issues in network transfer rates to the
> 1600-mile site -- a simple "scp" of a 1GB file takes about 21 minutes. We
> generate archives at a rate of approx 1GB per 8 minutes. The network folks
> tell me that the TCP setting is a constraint here (currently set to 64k
> window-size, which Sysadmins here say is the max setting). We have an OC3
> link that can transfer at 150Mbps (that is what the networking team tells
> me).
>
> I have an SR open w/ Oracle and have also gone through a few Metalink notes
> that talk about optimizing the network from a dataguard perspective. One of
> the notes I came across also talks about a cascaded standby dataguard setup
> (a local standby pushes logs to the remote site).
>
> I'm trying to collect ideas on how others are doing it in similar scenarios,
> and whether there is something we can do to utilize the entire network bandwidth
> that we have available to us.
>

You must tune your TCP/IP network stack for the WAN (increase the TCP window size and so on). What's your latency to that DR site? This is very important for TCP.
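
If you don't know the latency, a plain ICMP round trip is usually a good-enough first estimate (dr-host below is just a placeholder for your DR box):

# Solaris ping: send 10 probes of 56 bytes and print a round-trip time for each
ping -s dr-host 56 10

With that number in hand, you could try playing with these parameters: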

# bursty WAN traffic
ndd -set /dev/tcp tcp_deferred_acks_max 8
ndd -set /dev/tcp tcp_deferred_ack_interval 500

# max buffer which an app can request using setsockopt(), use with care!
ndd -set /dev/tcp tcp_max_buf 83886080
ndd -set /dev/tcp tcp_cwnd_max 83886080

AFAIK 64k is not the limit (!!) with Solaris 10 - with the TCP window scale option (RFC 1323) you can go much higher than that; the protocol limit is 1GB.
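
You can check what the box currently allows before changing anything (both ndd parameters exist on Solaris 10, and the -get form is read-only, so it is safe):

# current ceiling for socket buffer sizes
ndd -get /dev/tcp tcp_max_buf
# RFC 1323 window scaling behaviour (1 = always offer the wscale option)
ndd -get /dev/tcp tcp_wscale_always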

# window size
ndd -set /dev/tcp tcp_xmit_hiwat <ws>
ndd -set /dev/tcp tcp_recv_hiwat <ws>

You can estimate the needed window size as the bandwidth-delay product: bandwidth (in bytes/s!) * round-trip latency (in seconds) = <winsize>

So for 155 Mbps (OC3) and 100ms round-trip latency (you have to measure that yourself or ask the network/sysadmin guys!) it's:
(155*1024*1024/8) bytes/sec * 0.1 second = ~2MB; round up to the next power of two, so a computed 5MB would become 8MB, and so on.
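
If you want to script the arithmetic, plain awk does it (nothing Solaris-specific here):

# bandwidth-delay product for a 155 Mbit/s link with a 0.1 s round trip
awk 'BEGIN { printf "%.0f bytes\n", 155*1024*1024/8*0.1 }'
# prints: 2031616 bytes (~2MB)

As a sanity check: if your RTT really is around 100ms, a 64k window caps a single TCP stream at roughly 64KB / 0.1s = ~5 Mbit/s, which is in the same ballpark as your observed 1GB-in-21-minutes scp (~7 Mbit/s). So the window size is very likely your bottleneck.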

Also, you can set the TCP window size per IP route using route(1M) (IMHO a very good way to do this). I'm not sure which nodes in a RAC+DG combo actually do the sending to the other side, but you should tune both ends: in a 1->1 DG case that would be trivial, while in RAC you would have to tune each node, I think - you need to double-check that.
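
Per route it would look something along these lines (the addresses are made up, and you should verify the -sendpipe/-recvpipe modifiers in route(1M) on your release before relying on them):

# pin a 4MB send/receive pipe only on the route towards the DR subnet
route add -net 10.20.0.0 -netmask 255.255.0.0 10.1.1.1 -sendpipe 4194304 -recvpipe 4194304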

p.s. Be sure to check what Oracle is actually requesting via setsockopt() for SO_SNDBUF and SO_RCVBUF. Possible ways are DTrace or pfiles (easier) on the Oracle process sending or receiving the data, depending on the side (it will show you the fd/socket with its parameters).
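
For example (the PID is made up - pick the Oracle process that actually ships the redo):

# pfiles lists each fd; for sockets it prints the buffer options inline
pfiles 12345 | egrep 'SO_SNDBUF|SO_RCVBUF'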

p.s.#2 Test and benchmark before making any changes on production!
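
For benchmarking the raw path, something like iperf (not bundled with Solaris, so this assumes you can install it; dr-host is a placeholder) takes the ssh cipher overhead out of your scp numbers:

# on the DR side:
iperf -s -w 4M
# on the production side, 60-second run with a 4MB window:
iperf -c dr-host -w 4M -t 60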

p.s.#3 If in doubt, use snoop or, better, tcpdump ;)
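
e.g. to capture just the redo transport traffic for offline analysis (bge0 and dr-host are placeholders, and 1521 assumes the default listener port):

# write the capture to a file; read it back later with snoop -i /tmp/dg.cap
snoop -d bge0 -o /tmp/dg.cap host dr-host and port 1521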

-- 
Jakub Wartak
http://vnull.pcnet.com.pl
--
http://www.freelists.org/webpage/oracle-l
