Re: One primary with two physical standbys exhibiting different behavior with regard to lag

From: Sandra Becker <sbecker6925_at_gmail.com>
Date: Fri, 4 Jan 2019 08:41:17 -0700
Message-ID: <CAJzM94CB1H6S1j8u8=3WZbwP5krv-xoJu6D8QrvYzGZXemEfHA_at_mail.gmail.com>

I had an error pop up that indicated a problem with a size difference between the standby redo logs and the online redo logs. Standby on the primary was 10x the size of online redo logs. All the logs on the standby were set up the same as the online redo. I've corrected that just to be consistent and follow best practices. While checking correcting the sizes, I noticed that the standby that lags never assigns a standby redo log. Will be researching that next.

Sandy

On Fri, Jan 4, 2019 at 7:35 AM Sandra Becker <sbecker6925_at_gmail.com> wrote:

> Yes, archiving the current log allows the standby to catch up.
>
> DGMGRL> show configuration verbose
>
> Configuration - itsdb_DG_CONFIG
>
> Protection Mode: MaxPerformance
> Members:
> itsdb1 - Primary database
> itsdb2 - Physical standby database
> fitsdb1 - Physical standby database
>
> Properties:
> FastStartFailoverThreshold = '30'
> OperationTimeout = '30'
> TraceLevel = 'USER'
> FastStartFailoverLagLimit = '30'
> CommunicationTimeout = '180'
> ObserverReconnect = '0'
> FastStartFailoverAutoReinstate = 'TRUE'
> FastStartFailoverPmyShutdown = 'TRUE'
> BystandersFollowRoleChange = 'ALL'
> ObserverOverride = 'FALSE'
> ExternalDestination1 = ''
> ExternalDestination2 = ''
> PrimaryLostWriteAction = 'CONTINUE'
>
> Fast-Start Failover: DISABLED
>
> Configuration Status:
> SUCCESS
>
> DGMGRL> show database verbose itsdb1
>
> Database - itsdb1
>
> Role: PRIMARY
> Intended State: TRANSPORT-ON
> Instance(s):
> its1
>
> Properties:
> DGConnectIdentifier = 'itsdb1'
> ObserverConnectIdentifier = ''
> LogXptMode = 'ASYNC'
> RedoRoutes = ''
> DelayMins = '0'
> Binding = 'optional'
> MaxFailure = '0'
> MaxConnections = '1'
> ReopenSecs = '300'
> NetTimeout = '30'
> RedoCompression = 'DISABLE'
> LogShipping = 'ON'
> PreferredApplyInstance = ''
> ApplyInstanceTimeout = '0'
> ApplyLagThreshold = '0'
> TransportLagThreshold = '0'
> TransportDisconnectedThreshold = '30'
> ApplyParallel = 'AUTO'
> StandbyFileManagement = 'AUTO'
> ArchiveLagTarget = '0'
> LogArchiveMaxProcesses = '30'
> LogArchiveMinSucceedDest = '1'
> DbFileNameConvert = '+DATA/ITSDB2/, +DATA/ITSDB1/'
> LogFileNameConvert = '+DATA/ITSDB2/, +DATA/ITSDB1/'
> FastStartFailoverTarget = ''
> InconsistentProperties = '(monitor)'
> InconsistentLogXptProps = '(monitor)'
> SendQEntries = '(monitor)'
> LogXptStatus = '(monitor)'
> RecvQEntries = '(monitor)'
> StaticConnectIdentifier =
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=pitsdbs01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=itsdb1_DGMGRL)(INSTANCE_NAME=its1)(SERVER=DEDICATED)))'
> StandbyArchiveLocation = '/backup/itsdb/archives'
> AlternateLocation = ''
> LogArchiveTrace = '0'
> LogArchiveFormat = '%t_%s_%r.arc'
> TopWaitEvents = '(monitor)'
>
> Database Status:
> SUCCESS
>
> DGMGRL> show database verbose itsdb2
>
> Database - itsdb2
>
> Role: PHYSICAL STANDBY
> Intended State: APPLY-ON
> Transport Lag: 0 seconds (computed 0 seconds ago)
> Apply Lag: 0 seconds (computed 0 seconds ago)
> Average Apply Rate: 71.00 KByte/s
> Active Apply Rate: 450.00 KByte/s
> Maximum Apply Rate: 15.18 MByte/s
> Real Time Query: OFF
> Instance(s):
> its2
>
> Properties:
> DGConnectIdentifier = 'itsdb2'
> ObserverConnectIdentifier = ''
> LogXptMode = 'ASYNC'
> RedoRoutes = ''
> DelayMins = '0'
> Binding = 'optional'
> MaxFailure = '0'
> MaxConnections = '1'
> ReopenSecs = '300'
> NetTimeout = '30'
> RedoCompression = 'DISABLE'
> LogShipping = 'ON'
> PreferredApplyInstance = ''
> ApplyInstanceTimeout = '0'
> ApplyLagThreshold = '0'
> TransportLagThreshold = '0'
> TransportDisconnectedThreshold = '30'
> ApplyParallel = 'AUTO'
> StandbyFileManagement = 'AUTO'
> ArchiveLagTarget = '0'
> LogArchiveMaxProcesses = '30'
> LogArchiveMinSucceedDest = '1'
> DbFileNameConvert = '+DATA/ITSDB1/, +DATA/ITSDB2/'
> LogFileNameConvert = '+DATA/ITSDB1/, +DATA/ITSDB2/'
> FastStartFailoverTarget = ''
> InconsistentProperties = '(monitor)'
> InconsistentLogXptProps = '(monitor)'
> SendQEntries = '(monitor)'
> LogXptStatus = '(monitor)'
> RecvQEntries = '(monitor)'
> StaticConnectIdentifier =
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=pitsdbs02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=itsdb2_DGMGRL)(INSTANCE_NAME=its2)(SERVER=DEDICATED)))'
> StandbyArchiveLocation = '/backup/itsdb/archives'
> AlternateLocation = ''
> LogArchiveTrace = '0'
> LogArchiveFormat = '%t_%s_%r.arc'
> TopWaitEvents = '(monitor)'
>
> Database Status:
> SUCCESS
>
> *The standby below is the one that lags.*
> DGMGRL> show database verbose fitsdb1
>
> Database - fitsdb1
>
> Role: PHYSICAL STANDBY
> Intended State: APPLY-ON
> Transport Lag: 12 minutes 45 seconds (computed 12 seconds ago)
> Apply Lag: 12 minutes 45 seconds (computed 12 seconds ago)
> Average Apply Rate: 79.00 KByte/s
> Active Apply Rate: 60.79 MByte/s
> Maximum Apply Rate: 61.21 MByte/s
> Real Time Query: OFF
> Instance(s):
> fits1
>
> Properties:
> DGConnectIdentifier = 'fitsdb1'
> ObserverConnectIdentifier = ''
> LogXptMode = 'ASYNC'
> RedoRoutes = ''
> DelayMins = '0'
> Binding = 'optional'
> MaxFailure = '0'
> MaxConnections = '1'
> ReopenSecs = '300'
> NetTimeout = '30'
> RedoCompression = 'DISABLE'
> LogShipping = 'ON'
> PreferredApplyInstance = ''
> ApplyInstanceTimeout = '0'
> ApplyLagThreshold = '0'
> TransportLagThreshold = '0'
> TransportDisconnectedThreshold = '30'
> ApplyParallel = 'AUTO'
> StandbyFileManagement = 'AUTO'
> ArchiveLagTarget = '0'
> LogArchiveMaxProcesses = '30'
> LogArchiveMinSucceedDest = '1'
> DbFileNameConvert = '+DATA/ITSDB2/, +DATA/FITSDB1/'
> LogFileNameConvert = '+DATA/ITSDB2/, +DATA/FITSDB1/'
> FastStartFailoverTarget = ''
> InconsistentProperties = '(monitor)'
> InconsistentLogXptProps = '(monitor)'
> SendQEntries = '(monitor)'
> LogXptStatus = '(monitor)'
> RecvQEntries = '(monitor)'
> StaticConnectIdentifier =
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=fitsdbs1)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=fitsdb1_DGMGRL)(INSTANCE_NAME=fits1)(SERVER=DEDICATED)))'
> StandbyArchiveLocation = '/backup/itsdb/archives'
> AlternateLocation = ''
> LogArchiveTrace = '0'
> LogArchiveFormat = '%t_%s_%r.arc'
> TopWaitEvents = '(monitor)'
>
> Database Status:
> SUCCESS
>
>
> I may be missing something, but I didn't see any differences in the config
> between the two standbys.
>
> Sandy
>
> On Fri, Jan 4, 2019 at 3:11 AM Neil Chandler <neil_chandler_at_hotmail.com>
> wrote:
>
>> Sandy,
>>
>> So if you do a manual "alter system archive log current" on the Primary,
>> it catches up?
>>
>> I would look at the config parameters to compare the configuration. Could
>> you share the configuration info (db/server names suitably redacted):
>>
>> dgmgrl> show configuration verbose
>> dgmgrl> show database verbose <primary|standby2|standby2>
>>
>>
>> regards
>>
>> Neil Chandler
>> Database Guy, Knows Things.
>> ------------------------------
>> *From:* Sandra Becker <sbecker6925_at_gmail.com>
>> *Sent:* 03 January 2019 22:49
>> *To:* John Thomas
>> *Cc:* Andrew Kerber; Neil Chandler; oracle-l
>> *Subject:* Re: One primary with two physical standbys exhibiting
>> different behavior with regard to lag
>>
>> My understanding is the network setup is the same between the standbys.
>> I didn't look at the network right away. :-) I made the changes
>> suggested by Neil, but I'm still seeing the delay. Before I did a manual
>> log switch, the delay was over 30 minutes. Not good for this critical
>> production standby.
>>
>> Sandy
>>
>> On Thu, Jan 3, 2019 at 3:36 PM John Thomas <jt2354_at_gmail.com> wrote:
>>
>> There's no excessive delay between your primary and the lagging standby
>> is there? Smaller pipe? Lots of network retries?
>>
>> Probably the second thing you checked...
>>
>> Regards,
>>
>> John
>>
>> On Thu, 3 Jan 2019 at 22:10, Sandra Becker <sbecker6925_at_gmail.com> wrote:
>>
>> Thanks, Andrew. That's one of the first things I checked. It's the same
>> on both standbys.
>>
>> Sandy
>>
>> On Thu, Jan 3, 2019 at 2:47 PM Andrew Kerber <andrew.kerber_at_gmail.com>
>> wrote:
>>
>> Neil most likely spotted the problem. But you should also check to make
>> sure that the protection mode is the same on both standbys.If the instance
>> that is behind is using the maximum performance (async) mode it can run a
>> ways behind the primary.
>>
>> On Thu, Jan 3, 2019 at 3:36 PM Neil Chandler <neil_chandler_at_hotmail.com>
>> wrote:
>>
>> Sandy,
>>
>> Have you checked the Standby Redo logs? There's a slight (annoying)
>> change in Oracle 12.1 onwards which means that Standby Redo logs get
>> created with Thread 0 instead of Thread 1 by default (for a single instance
>> database). Redo can only use Standby Redo when the threads are the same. If
>> this is RAC you need Standby Redo for each thread - and you must have 1
>> more Standby Redo than Online Redo for each thread.
>>
>> By coincidence, I wrote a blog post about this 10 minutes ago.
>>
>> https://chandlerdba.com/2019/01/03/data-guard-unexpected-lag/
>>
>> regards
>>
>> Neil Chandler
>> Database Guy. Knows Things.
>>
>> ------------------------------
>> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on
>> behalf of Sandra Becker <sbecker6925_at_gmail.com>
>> *Sent:* 03 January 2019 20:29
>> *To:* oracle-l
>> *Subject:* One primary with two physical standbys exhibiting different
>> behavior with regard to lag
>>
>> Oracle 12.1.0.2
>> RHEL7
>>
>> To begin with, I have not worked much at all with standby databases, so
>> my knowledge is somewhat lacking.
>>
>> For business reasons, we have a primary database with two physical
>> standbys. Everything is configured in dgmgrl and enabled. Monitoring with
>> EM13c is reporting the lag times, so all looks good for basic setup and
>> monitoring. We seem to have significant lag at times on one of the
>> standbys, as much as 20 minutes. When looking at v$managed_standby, we see
>> the status as "WAIT_FOR_LOG". The other standby never seems to be more
>> that a few seconds behind, if at all, and the status is "APPLYING_LOG".
>>
>> Is this normal? I've been researching, but haven't found an answer yet.
>> I didn't create or start the standby databases, so I don't have any idea
>> what was actually done that could be causing this behavior. Any
>> suggestions would be appreciated.
>>
>> Thank you,
>>
>> --
>> Sandy B.
>>
>>
>>
>> --
>> Andrew W. Kerber
>>
>> 'If at first you dont succeed, dont take up skydiving.'
>>
>>
>>
>> --
>> Sandy B.
>>
>>
>>
>> --
>> Sandy B.
>>
>>
>
> --
> Sandy B.
>
>

-- 
Sandy B.


--
http://www.freelists.org/webpage/oracle-l

Received on Fri Jan 04 2019 - 16:41:17 CET

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message