RE: Troubleshooting log shipping - Solved
From: Upendra nerilla <nupendra_at_hotmail.com>
Date: Fri, 22 Jan 2016 11:35:21 -0500
Message-ID: <BLU181-W490FE3811621877CA1C70D8C40_at_phx.gbl>
I had a SR open for this issue as well. Oracle support suggested to measure the frequency of log switches in primary and recommended to size the redo logs in such a way the log switches are spread between 20 mins.. Based on their suggestion we increased the redolog size to 4GB, which seems to have helped the issue. Here are the scripts which might be helpful for anyone else in similar situation:
Date: Fri, 22 Jan 2016 11:35:21 -0500
Message-ID: <BLU181-W490FE3811621877CA1C70D8C40_at_phx.gbl>
I had a SR open for this issue as well. Oracle support suggested to measure the frequency of log switches in primary and recommended to size the redo logs in such a way the log switches are spread between 20 mins.. Based on their suggestion we increased the redolog size to 4GB, which seems to have helped the issue. Here are the scripts which might be helpful for anyone else in similar situation:
L>
SQL> -- Check how often logs are switching. Log switches should not regularly be occuring in < 20 mins. SQL> -- Excessive log switching is a performance overhead. Whilst rapid log switching is not in itself a Data Guard issue it can affect Data guard. SQL> -- It may also indicate a problem with log shipping. Use redo log size >= peak redo rate x 20 minutes. SQL> SQL> SELECT fs.log_switches_under_20_mins, ss.log_switches_over_20_mins FROM (SELECT SUM(COUNT (ROUND((b.first_time - a.first_time) * 1440) )) "LOG_SWITCHES_UNDER_20_MINS" FROM v$archived_log a, v$archived_log b WHERE a.sequence# + 1 = b.sequence# AND a.dest_id = 1 AND a.thread# = b.thread# AND a.dest_id = b.dest_id AND a.dest_id = (SELECT MIN(dest_id) FROM gv$archive_dest WHERE target='PRIMARY' AND destination IS NOT NULL) AND ROUND((b.first_time - a.first_time) * 1440) < 20 GROUP BY ROUND((b.first_time - a.first_time) * 1440)) fs, (SELECT SUM(COUNT (ROUND((b.first_time - a.first_time) * 1440) )) "LOG_SWITCHES_OVER_20_MINS" FROM v$archived_log a, v$archived_log b WHERE a.sequence# + 1 = b.sequence# AND a.dest_id = 1 AND a.thread# = b.thread# AND a.dest_id = b.dest_id AND a.dest_id = (SELECT MIN(dest_id) FROM gv$archive_dest WHERE target='PRIMARY' AND destination IS NOT NULL) AND ROUND((b.first_time - a.first_time) * 1440) > 19 GROUP BY ROUND((b.first_time - a.first_time) * 1440)) ss;
LOG_SWITCHES_UNDER_20_MINS LOG_SWITCHES_OVER_20_MINS 11446 619
1 row selected.
SQL> SQL> column minutes format a12 SQL> SQL> SELECT (CASE WHEN bucket = 1 THEN '<= ' || TO_CHAR(bucket * 5) WHEN (bucket >1 AND bucket < 9) THEN TO_CHAR(bucket * 5 - 4) || ' TO ' || TO_CHAR(bucket * 5) WHEN bucket > 8 THEN '>= ' || TO_CHAR(bucket * 5 - 4) END) "MINUTES", switches "LOG_SWITCHES" FROM (SELECT bucket , COUNT(b.bucket) SWITCHES FROM (SELECT WIDTH_BUCKET(ROUND((b.first_time - a.first_time) * 1440), 0, 40, 8) bucket FROM v$archived_log a, v$archived_log b WHERE a.sequence# + 1 = b.sequence# AND a.dest_id = b.dest_id AND a.thread# = b.thread# AND a.dest_id = (SELECT MIN(dest_id) FROM gv$archive_dest WHERE target = 'PRIMARY' AND destination IS NOT NULL)) b GROUP BY bucket ORDER BY bucket);
MINUTES LOG_SWITCHES
<= 5 9545
6 TO 10 1273
11 TO 15 387
16 TO 20 241
21 TO 25 132
26 TO 30 146
31 TO 35 341
On Fri, Jan 8, 2016 at 1:22 PM, Upendra nerilla <nupendra_at_hotmail.com> wrote: Hello everyone - Happy new year! Oracle RAC 11.2.0.3 OEL 5.x I have an environment in which the primary is shipping logs to a local standby and a remote dataguard database. What I am seeing is that, somehow not all the archivelogs are being shipped to the remote site. At the time of log apply these logs are being transmitted over. Here is an example on how the redologs appear in the remote site: 512 1860501 952576512 954204160 thread_1_seq_434986.5778.900455621 512 1883168 964182016 965738496 thread_1_seq_434991.8962.900456295 512 1850346 947377152 948961280 thread_1_seq_434997.9060.900456899 512 1865067 954914304 956301312 thread_1_seq_435003.3202.900457539 512 134582 68905984 70254592 thread_1_seq_435008.1471.900457605 512 1848325 946342400 947912704 thread_1_seq_435009.3990.900458259 512 1863147 953931264 955252736 thread_1_seq_435017.6484.900458793 512 1834310 939166720 940572672 thread_1_seq_435023.8233.900459079 512 1438234 736375808 738197504 thread_1_seq_435024.11250.900460605 Similar gaps for thread_2 as well.. Here is the configuration of the log_archive_dest_3 on primary for the remote site: service="SID_DR", LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=15 db_unique_name="SID_DR" net_timeout=30, valid_for=(all_logfiles,primary_role) I do not see any errors for the redo transport in the alert log on the primary site or on the remote DG site. Any thoughts on how should I troubleshoot this? Are there any known bugs around this? Your help is appreciated. Thanks -Upendra
-- http://www.freelists.org/webpage/oracle-lReceived on Fri Jan 22 2016 - 17:35:21 CET