Re: One primary with two physical standbys exhibiting different behavior with regard to lag

From: Sandra Becker <sbecker6925_at_gmail.com>
Date: Mon, 7 Jan 2019 11:41:00 -0700
Message-ID: <CAJzM94BS0UZ+xB8K=2yxumqLq5ij-W-5YCPac9082Z2ZR9NPOA_at_mail.gmail.com>



Just discovered that the DBA who created the two standbys started real time apply on one and not the other. She didn't think it made a difference. Team lead does not want to use real time apply to avoid accidentally turning on Active Dataguard, which we are not licensed to do.

On Mon, Jan 7, 2019 at 3:44 AM Rakesh Ra <rakeshra.tr_at_gmail.com> wrote:

> Just to add starting 12.1 real time apply is the default mode..
>
> Regards,
> Rakesh RA
>
> On Mon, Jan 7, 2019, 14:39 Hemant K Chitale <hemantkchitale_at_gmail.com>
> wrote:
>
>> How is the
>> "ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
>> USING CURRENT LOGFILE"
>> command issued on the 2nd standby ? Is it "USING CURRENT LOGFILE ?
>> That would be "Real Time Apply"
>> Else, "Real Time Apply" wouldn't be happening and the Standby would wait
>> for the Archive Log.
>>
>> Hemant K Chitale
>>
>>
>>
>>
>> On Fri, Jan 4, 2019 at 10:37 PM Sandra Becker <sbecker6925_at_gmail.com>
>> wrote:
>>
>>> Yes, archiving the current log allows the standby to catch up.
>>>
>>> DGMGRL> show configuration verbose
>>>
>>> Configuration - itsdb_DG_CONFIG
>>>
>>> Protection Mode: MaxPerformance
>>> Members:
>>> itsdb1 - Primary database
>>> itsdb2 - Physical standby database
>>> fitsdb1 - Physical standby database
>>>
>>> Properties:
>>> FastStartFailoverThreshold = '30'
>>> OperationTimeout = '30'
>>> TraceLevel = 'USER'
>>> FastStartFailoverLagLimit = '30'
>>> CommunicationTimeout = '180'
>>> ObserverReconnect = '0'
>>> FastStartFailoverAutoReinstate = 'TRUE'
>>> FastStartFailoverPmyShutdown = 'TRUE'
>>> BystandersFollowRoleChange = 'ALL'
>>> ObserverOverride = 'FALSE'
>>> ExternalDestination1 = ''
>>> ExternalDestination2 = ''
>>> PrimaryLostWriteAction = 'CONTINUE'
>>>
>>> Fast-Start Failover: DISABLED
>>>
>>> Configuration Status:
>>> SUCCESS
>>>
>>> DGMGRL> show database verbose itsdb1
>>>
>>> Database - itsdb1
>>>
>>> Role: PRIMARY
>>> Intended State: TRANSPORT-ON
>>> Instance(s):
>>> its1
>>>
>>> Properties:
>>> DGConnectIdentifier = 'itsdb1'
>>> ObserverConnectIdentifier = ''
>>> LogXptMode = 'ASYNC'
>>> RedoRoutes = ''
>>> DelayMins = '0'
>>> Binding = 'optional'
>>> MaxFailure = '0'
>>> MaxConnections = '1'
>>> ReopenSecs = '300'
>>> NetTimeout = '30'
>>> RedoCompression = 'DISABLE'
>>> LogShipping = 'ON'
>>> PreferredApplyInstance = ''
>>> ApplyInstanceTimeout = '0'
>>> ApplyLagThreshold = '0'
>>> TransportLagThreshold = '0'
>>> TransportDisconnectedThreshold = '30'
>>> ApplyParallel = 'AUTO'
>>> StandbyFileManagement = 'AUTO'
>>> ArchiveLagTarget = '0'
>>> LogArchiveMaxProcesses = '30'
>>> LogArchiveMinSucceedDest = '1'
>>> DbFileNameConvert = '+DATA/ITSDB2/, +DATA/ITSDB1/'
>>> LogFileNameConvert = '+DATA/ITSDB2/, +DATA/ITSDB1/'
>>> FastStartFailoverTarget = ''
>>> InconsistentProperties = '(monitor)'
>>> InconsistentLogXptProps = '(monitor)'
>>> SendQEntries = '(monitor)'
>>> LogXptStatus = '(monitor)'
>>> RecvQEntries = '(monitor)'
>>> StaticConnectIdentifier =
>>> '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=pitsdbs01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=itsdb1_DGMGRL)(INSTANCE_NAME=its1)(SERVER=DEDICATED)))'
>>> StandbyArchiveLocation = '/backup/itsdb/archives'
>>> AlternateLocation = ''
>>> LogArchiveTrace = '0'
>>> LogArchiveFormat = '%t_%s_%r.arc'
>>> TopWaitEvents = '(monitor)'
>>>
>>> Database Status:
>>> SUCCESS
>>>
>>> DGMGRL> show database verbose itsdb2
>>>
>>> Database - itsdb2
>>>
>>> Role: PHYSICAL STANDBY
>>> Intended State: APPLY-ON
>>> Transport Lag: 0 seconds (computed 0 seconds ago)
>>> Apply Lag: 0 seconds (computed 0 seconds ago)
>>> Average Apply Rate: 71.00 KByte/s
>>> Active Apply Rate: 450.00 KByte/s
>>> Maximum Apply Rate: 15.18 MByte/s
>>> Real Time Query: OFF
>>> Instance(s):
>>> its2
>>>
>>> Properties:
>>> DGConnectIdentifier = 'itsdb2'
>>> ObserverConnectIdentifier = ''
>>> LogXptMode = 'ASYNC'
>>> RedoRoutes = ''
>>> DelayMins = '0'
>>> Binding = 'optional'
>>> MaxFailure = '0'
>>> MaxConnections = '1'
>>> ReopenSecs = '300'
>>> NetTimeout = '30'
>>> RedoCompression = 'DISABLE'
>>> LogShipping = 'ON'
>>> PreferredApplyInstance = ''
>>> ApplyInstanceTimeout = '0'
>>> ApplyLagThreshold = '0'
>>> TransportLagThreshold = '0'
>>> TransportDisconnectedThreshold = '30'
>>> ApplyParallel = 'AUTO'
>>> StandbyFileManagement = 'AUTO'
>>> ArchiveLagTarget = '0'
>>> LogArchiveMaxProcesses = '30'
>>> LogArchiveMinSucceedDest = '1'
>>> DbFileNameConvert = '+DATA/ITSDB1/, +DATA/ITSDB2/'
>>> LogFileNameConvert = '+DATA/ITSDB1/, +DATA/ITSDB2/'
>>> FastStartFailoverTarget = ''
>>> InconsistentProperties = '(monitor)'
>>> InconsistentLogXptProps = '(monitor)'
>>> SendQEntries = '(monitor)'
>>> LogXptStatus = '(monitor)'
>>> RecvQEntries = '(monitor)'
>>> StaticConnectIdentifier =
>>> '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=pitsdbs02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=itsdb2_DGMGRL)(INSTANCE_NAME=its2)(SERVER=DEDICATED)))'
>>> StandbyArchiveLocation = '/backup/itsdb/archives'
>>> AlternateLocation = ''
>>> LogArchiveTrace = '0'
>>> LogArchiveFormat = '%t_%s_%r.arc'
>>> TopWaitEvents = '(monitor)'
>>>
>>> Database Status:
>>> SUCCESS
>>>
>>> *The standby below is the one that lags.*
>>> DGMGRL> show database verbose fitsdb1
>>>
>>> Database - fitsdb1
>>>
>>> Role: PHYSICAL STANDBY
>>> Intended State: APPLY-ON
>>> Transport Lag: 12 minutes 45 seconds (computed 12 seconds ago)
>>> Apply Lag: 12 minutes 45 seconds (computed 12 seconds ago)
>>> Average Apply Rate: 79.00 KByte/s
>>> Active Apply Rate: 60.79 MByte/s
>>> Maximum Apply Rate: 61.21 MByte/s
>>> Real Time Query: OFF
>>> Instance(s):
>>> fits1
>>>
>>> Properties:
>>> DGConnectIdentifier = 'fitsdb1'
>>> ObserverConnectIdentifier = ''
>>> LogXptMode = 'ASYNC'
>>> RedoRoutes = ''
>>> DelayMins = '0'
>>> Binding = 'optional'
>>> MaxFailure = '0'
>>> MaxConnections = '1'
>>> ReopenSecs = '300'
>>> NetTimeout = '30'
>>> RedoCompression = 'DISABLE'
>>> LogShipping = 'ON'
>>> PreferredApplyInstance = ''
>>> ApplyInstanceTimeout = '0'
>>> ApplyLagThreshold = '0'
>>> TransportLagThreshold = '0'
>>> TransportDisconnectedThreshold = '30'
>>> ApplyParallel = 'AUTO'
>>> StandbyFileManagement = 'AUTO'
>>> ArchiveLagTarget = '0'
>>> LogArchiveMaxProcesses = '30'
>>> LogArchiveMinSucceedDest = '1'
>>> DbFileNameConvert = '+DATA/ITSDB2/, +DATA/FITSDB1/'
>>> LogFileNameConvert = '+DATA/ITSDB2/, +DATA/FITSDB1/'
>>> FastStartFailoverTarget = ''
>>> InconsistentProperties = '(monitor)'
>>> InconsistentLogXptProps = '(monitor)'
>>> SendQEntries = '(monitor)'
>>> LogXptStatus = '(monitor)'
>>> RecvQEntries = '(monitor)'
>>> StaticConnectIdentifier =
>>> '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=fitsdbs1)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=fitsdb1_DGMGRL)(INSTANCE_NAME=fits1)(SERVER=DEDICATED)))'
>>> StandbyArchiveLocation = '/backup/itsdb/archives'
>>> AlternateLocation = ''
>>> LogArchiveTrace = '0'
>>> LogArchiveFormat = '%t_%s_%r.arc'
>>> TopWaitEvents = '(monitor)'
>>>
>>> Database Status:
>>> SUCCESS
>>>
>>>
>>> I may be missing something, but I didn't see any differences in the
>>> config between the two standbys.
>>>
>>> Sandy
>>>
>>> On Fri, Jan 4, 2019 at 3:11 AM Neil Chandler <neil_chandler_at_hotmail.com>
>>> wrote:
>>>
>>>> Sandy,
>>>>
>>>> So if you do a manual "alter system archive log current" on the
>>>> Primary, it catches up?
>>>>
>>>> I would look at the config parameters to compare the configuration.
>>>> Could you share the configuration info (db/server names suitably redacted):
>>>>
>>>> dgmgrl> show configuration verbose
>>>> dgmgrl> show database verbose <primary|standby2|standby2>
>>>>
>>>>
>>>> regards
>>>>
>>>> Neil Chandler
>>>> Database Guy, Knows Things.
>>>> ------------------------------
>>>> *From:* Sandra Becker <sbecker6925_at_gmail.com>
>>>> *Sent:* 03 January 2019 22:49
>>>> *To:* John Thomas
>>>> *Cc:* Andrew Kerber; Neil Chandler; oracle-l
>>>> *Subject:* Re: One primary with two physical standbys exhibiting
>>>> different behavior with regard to lag
>>>>
>>>> My understanding is the network setup is the same between the
>>>> standbys. I didn't look at the network right away. :-) I made the
>>>> changes suggested by Neil, but I'm still seeing the delay. Before I did a
>>>> manual log switch, the delay was over 30 minutes. Not good for this
>>>> critical production standby.
>>>>
>>>> Sandy
>>>>
>>>> On Thu, Jan 3, 2019 at 3:36 PM John Thomas <jt2354_at_gmail.com> wrote:
>>>>
>>>> There's no excessive delay between your primary and the lagging standby
>>>> is there? Smaller pipe? Lots of network retries?
>>>>
>>>> Probably the second thing you checked...
>>>>
>>>> Regards,
>>>>
>>>> John
>>>>
>>>> On Thu, 3 Jan 2019 at 22:10, Sandra Becker <sbecker6925_at_gmail.com>
>>>> wrote:
>>>>
>>>> Thanks, Andrew. That's one of the first things I checked. It's the
>>>> same on both standbys.
>>>>
>>>> Sandy
>>>>
>>>> On Thu, Jan 3, 2019 at 2:47 PM Andrew Kerber <andrew.kerber_at_gmail.com>
>>>> wrote:
>>>>
>>>> Neil most likely spotted the problem. But you should also check to
>>>> make sure that the protection mode is the same on both standbys.If the
>>>> instance that is behind is using the maximum performance (async) mode it
>>>> can run a ways behind the primary.
>>>>
>>>> On Thu, Jan 3, 2019 at 3:36 PM Neil Chandler <neil_chandler_at_hotmail.com>
>>>> wrote:
>>>>
>>>> Sandy,
>>>>
>>>> Have you checked the Standby Redo logs? There's a slight (annoying)
>>>> change in Oracle 12.1 onwards which means that Standby Redo logs get
>>>> created with Thread 0 instead of Thread 1 by default (for a single instance
>>>> database). Redo can only use Standby Redo when the threads are the same. If
>>>> this is RAC you need Standby Redo for each thread - and you must have 1
>>>> more Standby Redo than Online Redo for each thread.
>>>>
>>>> By coincidence, I wrote a blog post about this 10 minutes ago.
>>>>
>>>> https://chandlerdba.com/2019/01/03/data-guard-unexpected-lag/
>>>>
>>>> regards
>>>>
>>>> Neil Chandler
>>>> Database Guy. Knows Things.
>>>>
>>>> ------------------------------
>>>> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>
>>>> on behalf of Sandra Becker <sbecker6925_at_gmail.com>
>>>> *Sent:* 03 January 2019 20:29
>>>> *To:* oracle-l
>>>> *Subject:* One primary with two physical standbys exhibiting different
>>>> behavior with regard to lag
>>>>
>>>> Oracle 12.1.0.2
>>>> RHEL7
>>>>
>>>> To begin with, I have not worked much at all with standby databases, so
>>>> my knowledge is somewhat lacking.
>>>>
>>>> For business reasons, we have a primary database with two physical
>>>> standbys. Everything is configured in dgmgrl and enabled. Monitoring with
>>>> EM13c is reporting the lag times, so all looks good for basic setup and
>>>> monitoring. We seem to have significant lag at times on one of the
>>>> standbys, as much as 20 minutes. When looking at v$managed_standby, we see
>>>> the status as "WAIT_FOR_LOG". The other standby never seems to be more
>>>> that a few seconds behind, if at all, and the status is "APPLYING_LOG".
>>>>
>>>> Is this normal? I've been researching, but haven't found an answer
>>>> yet. I didn't create or start the standby databases, so I don't have any
>>>> idea what was actually done that could be causing this behavior. Any
>>>> suggestions would be appreciated.
>>>>
>>>> Thank you,
>>>>
>>>> --
>>>> Sandy B.
>>>>
>>>>
>>>>
>>>> --
>>>> Andrew W. Kerber
>>>>
>>>> 'If at first you dont succeed, dont take up skydiving.'
>>>>
>>>>
>>>>
>>>> --
>>>> Sandy B.
>>>>
>>>>
>>>>
>>>> --
>>>> Sandy B.
>>>>
>>>>
>>>
>>> --
>>> Sandy B.
>>>
>>>

-- 
Sandy B.

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jan 07 2019 - 19:41:00 CET

Original text of this message