Re: Active Dataguard -- Cascade Standby or not

From: Gaja Krishna Vaidyanatha <gajav_at_yahoo.com>
Date: Tue, 6 Mar 2012 15:21:38 -0800 (PST)
Message-ID: <1331076098.85703.YahooMailNeo_at_web83605.mail.sp1.yahoo.com>



Hi Stalin,
Late last year you posed this question on the list, and I responded that I would share our experience on the subject as soon as we completed a customer PoC. Here are some of the high-level observations:
  1. The most important driving factor in this customer's environment was AUTOMATION. We needed to ensure that whenever the PRIMARY database failed, the HA database kicked into action instantly (here I am referring to HA across data centers in a geographically dispersed Cloud environment). So yes, when we refer to PRIMARY and HA, we are talking about 2 different non-RAC databases in 2 different data centers.
  2. For #1 we needed to run the ADG Broker/Observer on a server separate from the database servers, which gave us the necessary automation without depending on either database server. And given that automation was paramount, we also needed the "Fast Start Failover" feature of ADG. On this aspect alone, we could NOT set up ADG in a Cascaded Configuration, as the Broker currently does not support that configuration with the automation "bells and whistles" that we needed. A cascaded ADG configuration needs to be managed manually (some level of automation can be achieved with scripting and job scheduling, but it is tricky), and due to time and other constraints we had to opt out of it.
  3. Log transport was configured in SYNC mode between the PRIMARY and HA databases and in ASYNC mode between the PRIMARY and DR databases.
  4. The next aspect, which follows from the decisions made in #1 - #3, is the additional overhead on the PRIMARY database server from shipping redo to multiple destinations. We took the capacity planning angle to this issue: we measured the resource consumption during redo transport and propagation, and ensured that the PRIMARY database was configured with enough hardware resources to guarantee its performance and SLA even with all of the redo propagation going on to the various HA and DR databases. Where relevant, we implemented Database Resource Management (resource profiles, consumer groups etc.) to ensure that "bread and butter" transactions/jobs were processed without any performance/elapsed-time blips. This ensured that dynamic workloads such as ad-hoc reports did not eat up resources and create an artificial resource starvation problem.
  5. The network pipe between the PRIMARY and HA data centers was "dark fiber", and the inter-data center latency, throughput and physical routing were as good as they could be. Even though light travels at a speed of 299,792,458 m/sec, that speed is measured in a vacuum, not when it passes through physical devices such as routers and switches. In real-life network configurations, "dark fiber networks" (like other networks) have to traverse many physical devices to get from A to B. This impedes the fantastic theoretical speed of light, but it is still pretty darn good. The data centers in question here were 35 km apart, implying light could travel "in a vacuum" between the 2 data centers in 0.00011674743332 secs (~0.12 ms). In reality, the network latency between the data centers was between 1-5 ms. In this configuration, the SYNC mode provided "near instantaneous" log transport for the HA database, implying that the HA database never had to "catch up". The same is not true for the DR databases, but rightfully so, as their log transport is ASYNC.
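
For anyone who wants to redo the arithmetic in #5, here is a minimal Python sketch. The distance, the speed of light and the 1-5 ms observed latency come from the observations above; the fiber refractive index of 1.47 is my assumption (a typical textbook value for silica fiber, not something measured in the PoC), and the function name is purely illustrative.

```python
# Back-of-the-envelope propagation delay for the 35 km PRIMARY-to-HA link.
C_VACUUM_M_PER_S = 299_792_458   # speed of light in a vacuum, in m/sec
DISTANCE_M = 35_000              # distance between the two data centers
FIBER_REFRACTIVE_INDEX = 1.47    # ASSUMPTION: typical silica fiber, not from the PoC


def one_way_delay_ms(distance_m: float, refractive_index: float = 1.0) -> float:
    """One-way propagation delay in milliseconds over a straight path."""
    speed_m_per_s = C_VACUUM_M_PER_S / refractive_index
    return distance_m / speed_m_per_s * 1000.0


vacuum_ms = one_way_delay_ms(DISTANCE_M)
fiber_ms = one_way_delay_ms(DISTANCE_M, FIBER_REFRACTIVE_INDEX)

print(f"vacuum, one way: {vacuum_ms:.3f} ms")
print(f"fiber, one way (assumed index): {fiber_ms:.3f} ms")

# Compare the observed 1-5 ms latency with the theoretical vacuum figure.
for observed_ms in (1.0, 5.0):
    ratio = observed_ms / vacuum_ms
    print(f"observed {observed_ms:.0f} ms is roughly {ratio:.0f}x the vacuum delay")
```

The gap between the theoretical figure and the observed latency is the cost of the fiber itself, the actual cable route and the devices along the way. This sketch only covers straight-line propagation and does not model any of those.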

Hope this provides some additional insight into this. Please let me know if you have any further questions.

Cheers,

Gaja

Gaja Krishna Vaidyanatha,
CEO & Founder, DBPerfMan LLC
http://www.dbperfman.com
http://www.dbcloudman.com

Phone - +1-650-743-6060
http://www.linkedin.com/in/gajakrishnavaidyanatha
Co-author: Oracle Insights: Tales of the Oak Table - http://www.apress.com/book/bookDisplay.html?bID14
Co-author: Oracle Performance Tuning 101 - http://www.amazon.com/gp/reader/0072131454/ref=sib_dp_pt/102-6130796-4625766
Enabling Cloud Deployment & Management for Oracle Databases



From: Stalin <stalinsk_at_gmail.com>
To: oracle-l <oracle-l_at_freelists.org>
Sent: Wednesday, November 30, 2011 5:24 PM
Subject: Active Dataguard -- Cascade Standby or not

We have a requirement from one of our customers to have up to 15 read-only DB sites, all replicating data from the so-called primary site. Active Data Guard seems to be a perfect fit, but I was wondering about the impact on the primary site of replicating data to all 15 read-only active physical standbys. Only one standby site will be the failover target, configured in SYNC Availability mode, with the rest in ASYNC Performance mode. Also, I was wondering whether cascaded standbys, instead of having the primary site replicate to all standbys, are a viable option to reduce load on the primary, with the trade-off of additional lag on the cascaded standbys.
If anyone could share their experiences or things to watch for in a similar setup, it would be greatly appreciated.

--

Thanks,

Stalin
PS. 11.2.0.2 RAC (Primary), 11.2.0.2 (Standby, Single Instance), Linux

--

http://www.freelists.org/webpage/oracle-l

Received on Tue Mar 06 2012 - 17:21:38 CST
