Re: Q: far-sync gap

From: amihay gonen <agonenil_at_gmail.com>
Date: Wed, 29 Oct 2014 11:14:23 +0200
Message-ID: <CAKb+SBVDqonj4ZZAy=ua5s9sn2-TNmm5Os3vRiEuosvaDmt-SA_at_mail.gmail.com>



*Ok , I think I found the reason for the GAP.*

the FS won't push the last gap , since the LNS won't start if there is no connection to primary .

Now the question - what is the command to "force" far sync to push this gap (since primary is "lost" ).

see my test :

test :
step 1:
1 .primary -->fs x standby (down)

ldb1 LNS 124 61773

ldb1fs           RFS              124      61773
ldb1fs           RFS                0          0
ldb1fs           RFS                0          0
ldb1fs           LNS              117      26626
select INSTANCE_NAME,process,SEQUENCE#,BLOCK# from v$instance ,v$managed_standby
 where process in ('LNS','RFS')
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0

step 2 : shutdown abort primary

select INSTANCE_NAME,process,SEQUENCE#,BLOCK# from v$instance ,v$managed_standby
 where process in ('LNS','RFS')
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0

ldb1fs LNS 117 26626 select INSTANCE_NAME,process,SEQUENCE#,BLOCK# from v$instance ,v$managed_standby
 where process in ('LNS','RFS')
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0

step 3: restart FS (03:06:40 [ 29-OCT-14 ] SYS_at_ldb1fs>startup force mount)

no process LNS or RFS on far_Sync (since both primary and standby are down)

step 4 - start standby (no process on far_Sync at all) : Every 2.0s: ./doquery.sh

             Wed Oct 29 03:09:12 2014

select INSTANCE_NAME,process,SEQUENCE#,BLOCK# from v$instance ,v$managed_standby where process in ('LNS','RFS') *
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0

ldb1sb           RFS                0          0
ldb1sb           RFS                0          0
ldb1sb           RFS                0          0


step 5 - starting primary (everying got sync:

Every 2.0s: ./doquery.sh

             Wed Oct 29 03:20:17 2014

ldb1 LNS 125 2065

ldb1fs           RFS              125       2065
ldb1fs           LNS              125       2065
ldb1fs           RFS                0          0
ldb1fs           RFS                0          0

ldb1sb           RFS                0          0
ldb1sb           RFS                0          0
ldb1sb           RFS                0          0
ldb1sb           RFS              125       2065





On Wed, Oct 29, 2014 at 10:33 AM, amihay gonen <agonenil_at_gmail.com> wrote:

> Thanks for the input . you are correct , as you can bellow :
>
> although I still don't understand ,why in the following scenario , I get
> data gap:
>
> 1. stop standby.
> 2. put data into primary
> 3. stop primary abort
> 4. restart FS
> 5. start standby .
> 6. compare count on some test table - i see difference .
>
> anyhow , I'll repeat my test using the new query bellow and report to the
> oracle-l list
>
>
> Every 2.0s: ./doquery.sh
>
> Wed Oct 29 02:31:57 2014
>
>
> ldb1 LNS 114 356
>
> ldb1fs RFS 0 0
> ldb1fs RFS 0 0
> ldb1fs RFS 114 356
> ldb1fs RFS 0 0
> ldb1fs LNS 114 356
>
> ldb1sb RFS 0 0
> ldb1sb RFS 114 356
> ldb1sb RFS 0 0
>
>
> set timing off feedback off head off
> select INSTANCE_NAME,process,SEQUENCE#,BLOCK# from v$instance
> ,v$managed_standby where process in ('LNS','RFS');
> exit
>
>
> On Wed, Oct 29, 2014 at 9:40 AM, Ludovico Caldara <
> ludovico.caldara_at_gmail.com> wrote:
>
>> Hi Amihay, I put back the list as recipient, yesterday I've done a reply
>> instead a reply to all using my smartphone...
>>
>> No, indeed, SYNC should be better.
>> The problem is that you're checking the RECOVERY LAG and not the
>> TRANSPORT LAG,
>> SYNC means that the transport is synchronous but you can still have a
>> small recovery lag. A commit on the primary forces a write of the standby
>> log on the standby site but doesn't force the recovery process (MRP0) to
>> apply it on the standby. So this is the problem.
>>
>> You should check instead the sequence# and block# of the RFS processes in
>> the v$managed_standby view. There you should notice no lag if the transport
>> is sync.
>>
>> Best regards
>> --
>> Ludo
>>
>> 2014-10-29 2:08 GMT+01:00 amihay gonen <agonenil_at_gmail.com>:
>>
>>> setting to fastsync doesn't seems to do any difference ...
>>>
>>> DGMGRL> show database verbose ldb1
>>>
>>> Database - ldb1
>>>
>>> Role: PRIMARY
>>> Intended State: TRANSPORT-ON
>>> Instance(s):
>>> ldb1
>>>
>>> Properties:
>>> DGConnectIdentifier = 'ldb1'
>>> ObserverConnectIdentifier = ''
>>> LogXptMode = 'SYNC'
>>> RedoRoutes = '(LOCAL : ldb1fs FASTSYNC)'
>>> DelayMins = '0'
>>> Binding = 'optional'
>>>
>>>
>>> On Wed, Oct 29, 2014 at 2:44 AM, amihay gonen <agonenil_at_gmail.com>
>>> wrote:
>>>
>>>> hi , I'm not sure , but I think I'm using SYNC .
>>>>
>>>> My main problem is that it seems the FS is not sync.
>>>>
>>>> For example :
>>>> the following test:
>>>> 1. stop standby db.
>>>> 2. put data into primary.(commit)
>>>> 3. shutdown abort primary.
>>>> 4. restart FS
>>>> 5. start standby db.
>>>>
>>>> after step 5 - it seems that the standby doesn't completely close the
>>>> GAP .
>>>>
>>>> when I start the Primary again the gap is closed .
>>>>
>>>>
>>>>
>>>>
>>>> here the output of dgmlgr command and oracle configuration :
>>>>
>>>> DGMGRL> show database verbose ldb1
>>>>
>>>> Database - ldb1
>>>>
>>>> Role: PRIMARY
>>>> Intended State: TRANSPORT-ON
>>>> Instance(s):
>>>> ldb1
>>>>
>>>> Properties:
>>>> DGConnectIdentifier = 'ldb1'
>>>> ObserverConnectIdentifier = ''
>>>> LogXptMode = 'SYNC'
>>>> RedoRoutes = '(LOCAL : ldb1fs SYNC)'
>>>> DelayMins = '0'
>>>> Binding = 'optional'
>>>> MaxFailure = '0'
>>>> MaxConnections = '1'
>>>> ReopenSecs = '300'
>>>> NetTimeout = '30'
>>>> RedoCompression = 'DISABLE'
>>>> LogShipping = 'ON'
>>>> PreferredApplyInstance = ''
>>>> ApplyInstanceTimeout = '0'
>>>> ApplyLagThreshold = '0'
>>>> TransportLagThreshold = '0'
>>>> TransportDisconnectedThreshold = '30'
>>>> ApplyParallel = 'AUTO'
>>>> StandbyFileManagement = 'AUTO'
>>>> ArchiveLagTarget = '0'
>>>> LogArchiveMaxProcesses = '4'
>>>> LogArchiveMinSucceedDest = '1'
>>>> DbFileNameConvert = 'ldb1, ldb1sb'
>>>> LogFileNameConvert = 'ldb1, ldb1sb'
>>>> FastStartFailoverTarget = ''
>>>> InconsistentProperties = '(monitor)'
>>>> InconsistentLogXptProps = '(monitor)'
>>>> SendQEntries = '(monitor)'
>>>> LogXptStatus = '(monitor)'
>>>> RecvQEntries = '(monitor)'
>>>> StaticConnectIdentifier =
>>>> '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.3.11)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=ldb1_DGMGRL)(INSTANCE_NAME=ldb1)(SERVER=DEDICATED)))'
>>>> StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
>>>> AlternateLocation = ''
>>>> LogArchiveTrace = '0'
>>>> LogArchiveFormat = '%t_%s_%r.arc'
>>>> TopWaitEvents = '(monitor)'
>>>>
>>>> Database Status:
>>>> SUCCESS
>>>>
>>>>
>>>> Far Sync Instance - ldb1fs
>>>>
>>>> Transport Lag: 0 seconds (computed 1 second ago)
>>>> Instance(s):
>>>> ldb1fs
>>>>
>>>> Properties:
>>>> DGConnectIdentifier = 'ldb1fs'
>>>> LogXptMode = 'ASYNC'
>>>> RedoRoutes = '(ldb1 : ldb1sb ASYNC)'
>>>> Binding = 'optional'
>>>> MaxFailure = '0'
>>>> MaxConnections = '1'
>>>> ReopenSecs = '300'
>>>> NetTimeout = '30'
>>>> RedoCompression = 'DISABLE'
>>>> LogShipping = 'ON'
>>>> TransportLagThreshold = '0'
>>>> TransportDisconnectedThreshold = '30'
>>>> LogArchiveMaxProcesses = '4'
>>>> LogArchiveMinSucceedDest = '1'
>>>> LogFileNameConvert = 'ldb1, ldb1fs'
>>>> InconsistentProperties = '(monitor)'
>>>> InconsistentLogXptProps = '(monitor)'
>>>> LogXptStatus = '(monitor)'
>>>> StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
>>>> AlternateLocation = ''
>>>> LogArchiveTrace = '0'
>>>> LogArchiveFormat = '%t_%s_%r.arc'
>>>> TopWaitEvents = '(monitor)'
>>>>
>>>> Far Sync Instance Status:
>>>> SUCCESS
>>>>
>>>>
>>>> DGMGRL> show database verbose ldb1sb
>>>>
>>>> Database - ldb1sb
>>>>
>>>> Role: PHYSICAL STANDBY
>>>> Intended State: APPLY-ON
>>>> Transport Lag: 0 seconds (computed 1 second ago)
>>>> Apply Lag: 0 seconds (computed 1 second ago)
>>>> Average Apply Rate: 3.00 KByte/s
>>>> Active Apply Rate: 1.38 MByte/s
>>>> Maximum Apply Rate: 5.11 MByte/s
>>>> Real Time Query: ON
>>>> Instance(s):
>>>> ldb1sb
>>>>
>>>> Properties:
>>>> DGConnectIdentifier = 'ldb1sb'
>>>> ObserverConnectIdentifier = ''
>>>> LogXptMode = 'ASYNC'
>>>> RedoRoutes = ''
>>>> DelayMins = '0'
>>>> Binding = 'optional'
>>>> MaxFailure = '0'
>>>> MaxConnections = '1'
>>>> ReopenSecs = '300'
>>>> NetTimeout = '30'
>>>> RedoCompression = 'DISABLE'
>>>> LogShipping = 'ON'
>>>> PreferredApplyInstance = ''
>>>> ApplyInstanceTimeout = '0'
>>>> ApplyLagThreshold = '0'
>>>> TransportLagThreshold = '0'
>>>> TransportDisconnectedThreshold = '30'
>>>> ApplyParallel = 'AUTO'
>>>> StandbyFileManagement = 'AUTO'
>>>> ArchiveLagTarget = '0'
>>>> LogArchiveMaxProcesses = '4'
>>>> LogArchiveMinSucceedDest = '1'
>>>> DbFileNameConvert = 'ldb1, ldb1sb'
>>>> LogFileNameConvert = 'ldb1, ldb1sb'
>>>> FastStartFailoverTarget = ''
>>>> InconsistentProperties = '(monitor)'
>>>> InconsistentLogXptProps = '(monitor)'
>>>> SendQEntries = '(monitor)'
>>>> LogXptStatus = '(monitor)'
>>>> RecvQEntries = '(monitor)'
>>>> StaticConnectIdentifier =
>>>> '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=ldb1sb_DGMGRL)(INSTANCE_NAME=ldb1sb)(SERVER=DEDICATED)))'
>>>> StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
>>>> AlternateLocation = ''
>>>> LogArchiveTrace = '0'
>>>> LogArchiveFormat = '%t_%s_%r.arc'
>>>> TopWaitEvents = '(monitor)'
>>>>
>>>> Database Status:
>>>> SUCCESS
>>>>
>>>>
>>>> ----------------------------------- -----------
>>>> ------------------------------
>>>> log_archive_dest string
>>>> log_archive_dest_1 string
>>>> LOCATION=USE_DB_RECOVERY_FILE_
>>>> DEST
>>>> VALID_FOR=(ALL_LOGFILES
>>>> ,ALL_ROLES)
>>>> log_archive_dest_10 string
>>>> log_archive_dest_11 string
>>>> log_archive_dest_12 string
>>>> log_archive_dest_13 string
>>>> log_archive_dest_14 string
>>>> log_archive_dest_15 string
>>>> log_archive_dest_16 string
>>>> log_archive_dest_17 string
>>>> log_archive_dest_18 string
>>>> log_archive_dest_19 string
>>>> log_archive_dest_2 string
>>>> log_archive_dest_20 string
>>>> log_archive_dest_21 string
>>>> log_archive_dest_22 string
>>>> log_archive_dest_23 string
>>>> log_archive_dest_24 string
>>>> log_archive_dest_25 string
>>>> log_archive_dest_26 string
>>>> log_archive_dest_27 string
>>>> log_archive_dest_28 string
>>>> log_archive_dest_29 string
>>>> log_archive_dest_3 string service="ldb1fs", SYNC
>>>> AFFIRM
>>>> delay=0 optional
>>>> compression=d
>>>> isable max_failure=0
>>>> max_conne
>>>> ctions=1 reopen=300
>>>> db_unique_
>>>> name="ldb1fs"
>>>> net_timeout=30,
>>>>
>>>> valid_for=(online_logfile,all_
>>>> roles)
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Oct 28, 2014 at 8:43 PM, Ludovico Caldara <
>>>> ludovico.caldara_at_gmail.com> wrote:
>>>>
>>>>> Hi, Are you using sync or fast sync for the redo transport?
>>>>> Il 28-ott-2014 15:21 "amihay gonen" <agonenil_at_gmail.com> ha scritto:
>>>>>
>>>>> Hi all ,
>>>>>> I'm testing Farsync and it seems that always a gap in current_scn
>>>>>> between primary and farsync and standby .
>>>>>>
>>>>>> I don't understand why...
>>>>>>
>>>>>>
>>>>>> I've the following configuration :
>>>>>>
>>>>>> DGMGRL> show configuration verbose
>>>>>>
>>>>>> Configuration - dr
>>>>>>
>>>>>> Protection Mode: MaxAvailability
>>>>>> Members:
>>>>>> ldb1 - Primary database
>>>>>> ldb1fs - Far sync instance
>>>>>> ldb1sb - Physical standby database
>>>>>>
>>>>>> Properties:
>>>>>> FastStartFailoverThreshold = '30'
>>>>>> OperationTimeout = '30'
>>>>>> TraceLevel = 'USER'
>>>>>> FastStartFailoverLagLimit = '30'
>>>>>> CommunicationTimeout = '180'
>>>>>> ObserverReconnect = '0'
>>>>>> FastStartFailoverAutoReinstate = 'TRUE'
>>>>>> FastStartFailoverPmyShutdown = 'TRUE'
>>>>>> BystandersFollowRoleChange = 'ALL'
>>>>>> ObserverOverride = 'FALSE'
>>>>>> ExternalDestination1 = ''
>>>>>> ExternalDestination2 = ''
>>>>>> PrimaryLostWriteAction = 'CONTINUE'
>>>>>>
>>>>>> Fast-Start Failover: DISABLED
>>>>>>
>>>>>> Configuration Status:
>>>>>> SUCCESS
>>>>>>
>>>>>> DGMGRL> enable configuration
>>>>>> Enabled.
>>>>>> DGMGRL> show configuration verbose
>>>>>>
>>>>>> Configuration - dr
>>>>>>
>>>>>> Protection Mode: MaxAvailability
>>>>>> Members:
>>>>>> ldb1 - Primary database
>>>>>> ldb1fs - Far sync instance
>>>>>> ldb1sb - Physical standby database
>>>>>>
>>>>>> Properties:
>>>>>> FastStartFailoverThreshold = '30'
>>>>>> OperationTimeout = '30'
>>>>>> TraceLevel = 'USER'
>>>>>> FastStartFailoverLagLimit = '30'
>>>>>> CommunicationTimeout = '180'
>>>>>> ObserverReconnect = '0'
>>>>>> FastStartFailoverAutoReinstate = 'TRUE'
>>>>>> FastStartFailoverPmyShutdown = 'TRUE'
>>>>>> BystandersFollowRoleChange = 'ALL'
>>>>>> ObserverOverride = 'FALSE'
>>>>>> ExternalDestination1 = ''
>>>>>> ExternalDestination2 = ''
>>>>>> PrimaryLostWriteAction = 'CONTINUE'
>>>>>>
>>>>>> Fast-Start Failover: DISABLED
>>>>>>
>>>>>> Configuration Status:
>>>>>> SUCCESS
>>>>>>
>>>>>> DGMGRL>
>>>>>>
>>>>>>
>>>>>> when doing the following queries :
>>>>>>
>>>>>> sqlplus -l -s a/a_at_ldb1 <<EOF
>>>>>> set head off
>>>>>> select 'ldb1:'||current_scn||','||CONTROLFILE_CHANGE# from
>>>>>> v\$database;
>>>>>> EOF
>>>>>> sqlplus -l -s sys/a_at_ldb1fs as sysdba<<EOF
>>>>>> set head off
>>>>>> select 'ldb1fs:'||current_scn||','||CONTROLFILE_CHANGE# from
>>>>>> v\$database;
>>>>>> EOF
>>>>>> sqlplus -l -s a/a_at_ldb1sb <<EOF
>>>>>> set head off
>>>>>> select 'ldb1fs:'||current_scn||','||CONTROLFILE_CHANGE# from
>>>>>> v\$database;
>>>>>> EOF
>>>>>>
>>>>>>
>>>>>> I got the following results
>>>>>>
>>>>>> ldb1:*2275269*,2274999
>>>>>>
>>>>>> Elapsed: 00:00:00.00
>>>>>>
>>>>>> ldb1fs*:2275073*,1800180
>>>>>>
>>>>>> Elapsed: 00:00:00.00
>>>>>>
>>>>>> ldb1fs:*2275267*,2250255
>>>>>>
>>>>>> Elapsed: 00:00:00.01
>>>>>>
>>>>>>
>>>>
>>>
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 29 2014 - 10:14:23 CET

Original text of this message