Re: Data movement logic in data guard

From: Lok P <loknath.73_at_gmail.com>
Date: Thu, 31 Dec 2020 17:18:24 +0530
Message-ID: <CAKna9VZxD2d4-RPDkMe6LefPfHbB5vc5z3-2GBv2TSNBOtywcA_at_mail.gmail.com>



Thank you.

I normally check using the query below, i.e. comparing the sequence# from gv$archived_log, and it shows a difference of zero, which means we have zero lag.

But coming to the logic of evaluating or relying on a date column to confirm that replication is complete up to a point in time: can we use min(last_time) from gv$standby_log? What I mean is, can we safely say that if min(last_time) from gv$standby_log > 10:01:01 AM, then all the data from the primary has been moved to DR up to 10:01:01 AM, and so we can go ahead and safely load data into the target/remote DB using a query like select * from transaction_table where dt_created < 10:01:01 AM?

SELECT ARCH.THREAD# "Thread",
       ARCH.SEQUENCE# "Last Sequence Received",
       APPL.SEQUENCE# "Last Sequence Applied",
       (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
FROM   (SELECT THREAD#, SEQUENCE#
        FROM   gV$ARCHIVED_LOG
        WHERE  (THREAD#, FIRST_TIME) IN (SELECT THREAD#, MAX(FIRST_TIME)
                                         FROM   V$ARCHIVED_LOG
                                         GROUP  BY THREAD#)) ARCH,
       (SELECT THREAD#, SEQUENCE#
        FROM   gV$LOG_HISTORY
        WHERE  (THREAD#, FIRST_TIME) IN (SELECT THREAD#, MAX(FIRST_TIME)
                                         FROM   V$LOG_HISTORY
                                         GROUP  BY THREAD#)) APPL
WHERE  ARCH.THREAD# = APPL.THREAD#
ORDER  BY 1;

On Thu, Dec 31, 2020 at 4:47 PM <niall.litchfield_at_gmail.com> wrote:

> I intended to reply to this yesterday so thanks for the response!
>
> I think the explanation is simply that up until six months or so ago
> Instance 1 was the apply instance, since then instance 2 has been applying.
>
> On Thu, Dec 31, 2020 at 11:09 AM Hemant K Chitale <
> hemantkchitale_at_gmail.com> wrote:
>
>>
>> Don't you find the Apply Lag Value for INST_ID=1 very strange ??
>> Why is it so many days ?
>>
>> Hemant K Chitale
>>
>>
>>
>>
>> On Wed, Dec 30, 2020 at 9:04 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Thank you.
>>>
>>> Data from v$dataguard_stats on the standby side looks like the below. Not
>>> sure why that difference is so big.
>>>
>>> The requirement is to move data from a few transaction tables incrementally
>>> to the target (say a remote DB) at regular intervals (say once every
>>> ~15 minutes) as and when the rows reach the standby database, without
>>> missing or duplicating any rows in the target. For this we want to leverage
>>> the standby database, considering the primary has some resource constraints.
>>>
>>> For this we are taking the date_created column of the transaction tables as
>>> the reference column; it is populated with sysdate when the row is
>>> created/inserted in the primary database table. Now, before initiating the
>>> data move from the standby at, say, exactly sysdate+N seconds, i.e.
>>> 10:01:01 AM, we want to ensure that all the data up to that point in time
>>> has been moved to the standby with nothing left in flight. So before each
>>> data-move procedure fires, we will run a pre-validation to ensure all the
>>> data up to 10:01:01 AM has reached the standby and nothing is left behind.
>>> This is to safely ensure we are consistent on the target/remote database
>>> with no chance of a missing record, because the criterion for the data
>>> fetch is going to be select .. from transaction_table where
>>> date_created <= 10:01:01 AM, and the next pick/fetch would be from
>>> 10:01:02 AM onwards after the next ~15 minutes. If some data with
>>> date_created of 10:01:01 AM arrives at the standby a little late, it won't
>>> get picked up and will be missed.
>>>
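>>> As an aside, a minimal sketch of how the windowed fetch could be written
>>> (table and column names are only placeholders), using a half-open window so
>>> a boundary row is neither missed nor picked twice:
>>>
>>> -- :last_cutoff = cutoff used by the previous run, :new_cutoff = current cutoff
>>> SELECT *
>>>   FROM transaction_table
>>>  WHERE date_created >= :last_cutoff
>>>    AND date_created <  :new_cutoff;
>>>
>>> The next run would then start exactly from :new_cutoff, so the two windows
>>> meet with no gap and no overlap.
>>>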
>>> So, to implement a foolproof pre-data-load validation logic, I was looking
>>> for a date column I can compare against to ensure the data movement from
>>> the standby to the remote DB is safe up to that point in time.
>>>
>>> INST_ID  NAME           VALUE          UNIT                          TIME_COMPUTED        DATUM_TIME
>>> 1        apply lag      +171 16:34:19  day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>>> 2        apply lag      +00 00:00:59   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>>> 3        apply lag                     day(2) to second(0) interval  12/30/2020 07:06:09
>>> 4        apply lag                     day(2) to second(0) interval  12/30/2020 07:06:09
>>> 1        transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>>> 2        transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>>> 3        transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>>> 4        transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>>>
>>>
>>>
>>>
>>> On Wed, Dec 30, 2020 at 4:09 PM <niall.litchfield_at_gmail.com> wrote:
>>>
>>>> I must confess to not having followed all of this but it does sound a
>>>> little bit like you are trying to reinvent some wheels here.
>>>>
>>>> The standard (in my view anyway) way to see how far "behind" a Data Guard
>>>> standby is, is to query v$dataguard_stats like the below:
>>>>
>>>> SQL> select name,value,time_computed
>>>>   2  from v$dataguard_stats
>>>>   3  where name in ('apply lag','transport lag')
>>>>   4  order by name;
>>>>
>>>> NAME                             VALUE                          TIME_COMPUTED
>>>> -------------------------------- ------------------------------ ------------------------------
>>>> apply lag                        +00 00:00:00                   12/30/2020 10:26:20
>>>> transport lag                    +00 00:00:00                   12/30/2020 10:26:20
>>>>
>>>> Similarly, for your "how do I ensure that a low activity database
>>>> transports redo regularly?" question earlier, there's a parameter
>>>> ARCHIVE_LAG_TARGET for that.
>>>>
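>>>> For example, a minimal setting (this is set on the primary and forces a log
>>>> switch at least every 900 seconds, i.e. 15 minutes; pick a value that
>>>> matches your load cadence, and it assumes you are running with an spfile):
>>>>
>>>> ALTER SYSTEM SET ARCHIVE_LAG_TARGET = 900 SCOPE=BOTH;
>>>>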
>>>> As for the data movement from your standby to a (remote?) new table or
>>>> tables in a new database, if this is a regular thing this sounds like a
>>>> tailor-made use case for MERGE logic, especially if there is a possibility
>>>> that source data will be updated/deleted after initial load and you also
>>>> need to replicate this.
>>>>
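>>>> A minimal sketch of that shape (all object and column names here are made
>>>> up; it assumes a key column txn_id and that date_created, or some
>>>> last-modified timestamp, brings changed rows into the extract window):
>>>>
>>>> MERGE INTO target_txn t
>>>> USING (SELECT txn_id, amount, status, date_created
>>>>          FROM transaction_table
>>>>         WHERE date_created >= :last_cutoff
>>>>           AND date_created <  :new_cutoff) s
>>>>    ON (t.txn_id = s.txn_id)
>>>>  WHEN MATCHED THEN
>>>>    UPDATE SET t.amount = s.amount, t.status = s.status
>>>>  WHEN NOT MATCHED THEN
>>>>    INSERT (txn_id, amount, status, date_created)
>>>>    VALUES (s.txn_id, s.amount, s.status, s.date_created);
>>>>
>>>> Note that MERGE alone won't propagate deletes; those would need a separate
>>>> mechanism (a soft-delete flag, or a periodic anti-join reconciliation).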
>>>>
>>>> On Tue, Dec 29, 2020 at 6:01 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>>
>>>>> While we were planning the strategy we spotted one issue. To check/confirm
>>>>> whether the data has been moved to the standby/DR up to a certain point in
>>>>> time (say the current time, i.e. sysdate), we were initially thinking of
>>>>> relying on max(last_time) from gv$standby_log (held in the variable
>>>>> last_replication_dt in the block below); only if it is > sysdate will we
>>>>> assume that all data has been replicated to DR up to that point, so it is
>>>>> safe to pick/move data from DR/standby up to that point/sysdate. If
>>>>> last_replication_dt < sysdate, the block keeps sleeping for a few seconds
>>>>> and does not trigger the data move.
>>>>>
>>>>> But the issue with the above strategy is that if there are no frequent
>>>>> changes on the primary, say a dev instance where hardly any change happens,
>>>>> then last_replication_dt will never get close/equal to sysdate, and the
>>>>> block will keep looping in that SLEEP state. Is there a better strategy or
>>>>> logic to handle this?
>>>>>
>>>>> DECLARE
>>>>>   current_dt          DATE;
>>>>>   last_replication_dt DATE;
>>>>> BEGIN
>>>>>   -- capture the cutoff time and the latest redo time seen on the standby
>>>>>   SELECT SYSDATE, (SELECT MAX(last_time) FROM gv$standby_log)
>>>>>     INTO current_dt, last_replication_dt
>>>>>     FROM dual;
>>>>>
>>>>>   -- keep polling until the standby has seen redo past the cutoff
>>>>>   WHILE current_dt > last_replication_dt
>>>>>   LOOP
>>>>>     DBMS_LOCK.SLEEP(30);
>>>>>
>>>>>     SELECT MAX(last_time)
>>>>>       INTO last_replication_dt
>>>>>       FROM gv$standby_log;
>>>>>   END LOOP;
>>>>> END;
>>>>> /
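>>>>>
>>>>> One variant I am considering (just a sketch, assuming the standby is open
>>>>> for read and v$dataguard_stats is populated there) is to wait on the
>>>>> reported transport/apply lag instead of gv$standby_log.last_time, since the
>>>>> lag should stay near zero even when the primary is idle:
>>>>>
>>>>> DECLARE
>>>>>   v_behind NUMBER;
>>>>> BEGIN
>>>>>   LOOP
>>>>>     -- rows whose lag is unknown or non-zero; 0 means transport and apply are caught up
>>>>>     SELECT COUNT(*)
>>>>>       INTO v_behind
>>>>>       FROM v$dataguard_stats
>>>>>      WHERE name IN ('transport lag', 'apply lag')
>>>>>        AND (value IS NULL OR value <> '+00 00:00:00');
>>>>>     EXIT WHEN v_behind = 0;
>>>>>     DBMS_LOCK.SLEEP(30);
>>>>>   END LOOP;
>>>>> END;
>>>>> /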
>>>>>
>>>>> On Sun, Dec 20, 2020 at 1:09 AM Lok P <loknath.73_at_gmail.com> wrote:
>>>>>
>>>>>> Thank You very much for the detailed explanation.
>>>>>>
>>>>>> My apologies, but I don't have much knowledge of the backup/recovery
>>>>>> area. Actually we had fallen into this wrong-ordering trap when GoldenGate
>>>>>> replication was used in one of our systems with multiple parallel
>>>>>> replication threads, and we ended up missing data in the target system, so
>>>>>> I wanted to check whether that can happen in a Data Guard setup as well,
>>>>>> as we are planning to use the Data Guard physical standby / DR database as
>>>>>> our data source.
>>>>>>
>>>>>> But if I get your point correctly, I think you are pointing to the fact
>>>>>> that the archived redo logs are applied on the DR database in the same
>>>>>> order as the logical transactions happened on the primary side. That means
>>>>>> my initial concern, that two different inserts could be applied in a
>>>>>> different order on the DR side compared to the primary, is unfounded. And
>>>>>> that ensures the data movement will never fall into this trap due to wrong
>>>>>> ordering.
>>>>>>
>>>>>> My other doubts were:
>>>>>>
>>>>>> 1) Is it good to rely on the LAST_TIME column of v$standby_log (as in the
>>>>>> logic I posted below) for checking the lag periodically, and thus keep
>>>>>> looping until the lag reduces to zero at a certain point in time? Only then
>>>>>> would the data fetch/load query be executed to pick data from the
>>>>>> transaction tables, and this block/loop would be called at the start of
>>>>>> each data load procedure call.
>>>>>>
>>>>>> 2) For insert-only tables it's easy to take MAX(date_created), store it in
>>>>>> some reference table and then use that as the reference for the incremental
>>>>>> data load to the target. But how do we achieve this data movement logic
>>>>>> accurately if UPDATEs/DELETEs also happen on the base transaction table?
>>>>>>
>>>>>> DECLARE
>>>>>>   current_dt          DATE;
>>>>>>   last_replication_dt DATE;
>>>>>> BEGIN
>>>>>>   -- capture the cutoff time and the latest redo time seen on the standby
>>>>>>   SELECT SYSDATE, (SELECT MAX(last_time) FROM gv$standby_log)
>>>>>>     INTO current_dt, last_replication_dt
>>>>>>     FROM dual;
>>>>>>
>>>>>>   -- keep polling until the standby has seen redo past the cutoff
>>>>>>   WHILE current_dt > last_replication_dt
>>>>>>   LOOP
>>>>>>     DBMS_LOCK.SLEEP(30);
>>>>>>
>>>>>>     SELECT MAX(last_time)
>>>>>>       INTO last_replication_dt
>>>>>>       FROM gv$standby_log;
>>>>>>   END LOOP;
>>>>>> END;
>>>>>> /
>>>>>>
>>>>>>
>>>>>> On Sat, Dec 19, 2020 at 8:56 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>>>>>>
>>>>>>> dataguard is fundamentally a recovery operation.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> dataguard with the “secondary open for read” is NOT logical
>>>>>>> replication, it is physical replication. (Two or more separate Dataguard
>>>>>>> destinations with different less than everything sets of tablespaces are
>>>>>>> beyond the scope of this email.) Receipt of each transaction COMMIT or
>>>>>>> ROLLBACK and the “secondary” processing completion of same is when
>>>>>>> transaction results are viewable in the “open for read” dataguard secondary.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> If they are ever out of order, that’s a major league bug in recovery
>>>>>>> (not just dataguard.)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> IF memory serves there is guidance documentation on using the open
>>>>>>> for read secondary dataguard as a source of information. Dataguard for
>>>>>>> switchover and failover existed for YEARS before the open active dataguard
>>>>>>> for read, so the documentation is probably skewed a bit toward that. I
>>>>>>> suggest first completely understanding just the notion for being complete
>>>>>>> to a point in time with respect to commit points before moving on to the
>>>>>>> exact functionality of active open for read. Then the context documents
>>>>>>> about active open for read might make sense.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Myself, I prefer the scheduled suspend recovery on business event,
>>>>>>> cold clone rename, open renamed clone frozen (allowing for aggregation for
>>>>>>> that freeze point), and resume recovery to actively open dataguard, which
>>>>>>> relegates real time queries to the primary but which allows dedicated
>>>>>>> configuration of a completely relationally integral database instance
>>>>>>> configured for read-only advantages like aggregation and result cache and
>>>>>>> minimized query traffic on undo for your selected cold clone rename
>>>>>>> versions. Often quarterly, monthly, weekly, and daily for a week are good
>>>>>>> choices, depending on your operational needs. For finance and manufacturing
>>>>>>> many of my clients found monthly and daily for a week were useful, but that
>>>>>>> was before the recovery product was built, literally using continuous
>>>>>>> recovery as described for the remote physical backup. Now don’t be confused
>>>>>>> by that. That has come to be called “roll your own dataguard” but it
>>>>>>> preceded Oracle’s product by a long time. We called it the “standby
>>>>>>> recovery database.” If you search the IOUW and OAUG archives I think you
>>>>>>> can find my detailed papers.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Several folks nearly simultaneously and independently “invented”
>>>>>>> this after Sequent presented a sales information demo of recovering a six
>>>>>>> month old vintage backup to current just from redo logs. That was early V6,
>>>>>>> so probably 1989 or so. Johnny Chan and Brien Christiansen of CISCO are the
>>>>>>> first folks who I worked with that routinely did the cold clone rename open
>>>>>>> as a reporting database frozen in time to relieve the overloaded primary
>>>>>>> systems of queries where “yesterday” and “end of last fiscal month” were
>>>>>>> actually as good as (or better than) current.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Good luck. Whether the redo logs are being shipped close to real
>>>>>>> time via the log writer or from the archived redo log after it is switched
>>>>>>> out, they are processed in order on the destination. If you are thinking of
>>>>>>> this in the context of replication of logical transactions rather than
>>>>>>> physically ordered commits and rollbacks, that way lies madness. Deltas are
>>>>>>> being applied to datablocks.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (There does exist a temporary logical phase for upgrades, but that
>>>>>>> is also beyond the scope of this email.)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> mwf
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* oracle-l-bounce_at_freelists.org [mailto:
>>>>>>> oracle-l-bounce_at_freelists.org] *On Behalf Of *Lok P
>>>>>>> *Sent:* Saturday, December 19, 2020 6:42 AM
>>>>>>> *To:* Clay Jackson (cjackson)
>>>>>>> *Cc:* Oracle L
>>>>>>> *Subject:* Re: Data movement logic in data guard
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks for the guidance here.
>>>>>>>
>>>>>>> I will try to see how I can test the scenario to ensure the
>>>>>>> replication order will be exactly the same on DR as on primary, even while
>>>>>>> catching up a large lag.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Two questions with related to same
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 1) We don't want to keep hitting the data-read queries on those
>>>>>>> transaction tables if the lag is very high; we should wait for the lag to
>>>>>>> catch up. For that reason, is it okay to refer to v$standby_log and keep
>>>>>>> looping with a sleep, something like the below?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> DECLARE
>>>>>>>   current_dt          DATE;
>>>>>>>   last_replication_dt DATE;
>>>>>>> BEGIN
>>>>>>>   -- capture the cutoff time and the latest redo time seen on the standby
>>>>>>>   SELECT SYSDATE, (SELECT MAX(last_time) FROM gv$standby_log)
>>>>>>>     INTO current_dt, last_replication_dt
>>>>>>>     FROM dual;
>>>>>>>
>>>>>>>   -- keep polling until the standby has seen redo past the cutoff
>>>>>>>   WHILE current_dt > last_replication_dt
>>>>>>>   LOOP
>>>>>>>     DBMS_LOCK.SLEEP(30);
>>>>>>>
>>>>>>>     SELECT MAX(last_time)
>>>>>>>       INTO last_replication_dt
>>>>>>>       FROM gv$standby_log;
>>>>>>>   END LOOP;
>>>>>>> END;
>>>>>>> /
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2) For insert-only tables it's easy to take MAX(date_created), store it
>>>>>>> in some reference table and then use that to do the incremental data load
>>>>>>> to the target. But how do we achieve this data movement/replication if
>>>>>>> UPDATEs/DELETEs also happen on the base transaction table? Is there any
>>>>>>> recommended way for that?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Dec 19, 2020 at 1:04 AM Clay Jackson (cjackson) <
>>>>>>> Clay.Jackson_at_quest.com> wrote:
>>>>>>>
>>>>>>> While I admit I haven’t specifically tested this case, which I would
>>>>>>> recommend you do, I think you’re “overthinking” this.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The whole point behind DataGuard (or, at least one of the major “use
>>>>>>> cases”) is to create a “consistent” COPY of your primary database, so that
>>>>>>> in the event of a failure (loss of access), users can be moved to the
>>>>>>> secondary database without loss of data (or consistency). The mechanism
>>>>>>> by which this happens is through physical copying of the redo log(s), or
>>>>>>> records within the logs from the primary to the secondary, and then
>>>>>>> “applying” those logs (or records) to the secondary, IN THE SAME order as
>>>>>>> they were “applied” on the primary. So, while the actual writing of the
>>>>>>> transactions to the database on the secondary will ALWAYS happen at a point
>>>>>>> in time AFTER transaction was written to the primary, I don’t think the
>>>>>>> scenario you outlined (records being written “out of order” is possible.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Clay Jackson
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>
>>>>>>> *On Behalf Of *Lok P
>>>>>>> *Sent:* Friday, December 18, 2020 11:06 AM
>>>>>>> *To:* Oracle L <oracle-l_at_freelists.org>
>>>>>>> *Subject:* Re: Data movement logic in data guard
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Moving data from one system to another looks to be very common, but I
>>>>>>> have never come across building such logic manually in code. And I don't
>>>>>>> have much idea about how the archived log apply happens on the standby.
>>>>>>> But it seems like, to maintain constraints, it has to happen in exactly
>>>>>>> the same order as the data load happens on the primary, else things will
>>>>>>> break.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Can somebody guide me here on whether the logic we are going to rely on
>>>>>>> to decide the reference date/time for data movement in our case is
>>>>>>> guaranteed never to fail?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, 18 Dec 2020, 4:27 pm Lok P, <loknath.73_at_gmail.com> wrote:
>>>>>>>
>>>>>>> Actually, in a GoldenGate setup with multiple parallel replication
>>>>>>> threads, we have encountered scenarios where two transactions generated on
>>>>>>> the source can reach the target in a different order, thus causing the
>>>>>>> data pickup from the target to miss some rows.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Dec 18, 2020 at 9:30 AM Lok P <loknath.73_at_gmail.com> wrote:
>>>>>>>
>>>>>>> It's version 11.2.0.4 of Oracle on Exadata. We have a requirement in
>>>>>>> which we need to move data to another system, and for that we want to
>>>>>>> utilize the DR database (which is a physical standby with Data Guard
>>>>>>> configured) rather than the primary, as we do not want to affect the key
>>>>>>> applications running on the primary. We are almost saturating the DB
>>>>>>> resources on the primary during peak hours.
>>>>>>>
>>>>>>> For copying/moving data without a miss at a ~15-minute interval, we are
>>>>>>> relying on the "date_created" column of the transaction table as the
>>>>>>> reference column. So if we have some additional lag from primary to DR,
>>>>>>> is it possible that records are created on primary and DR in the sequence
>>>>>>> below, such that Row-3 is created on DR before Row-2? In that case we may
>>>>>>> miss that row when we take MAX(date_created) from our transaction table to
>>>>>>> move the data.
>>>>>>>
>>>>>>> In such a scenario, how should we make our logic foolproof when picking
>>>>>>> the max(date_created) on the source so that we won't miss any data? Or
>>>>>>> should we somehow utilize the last_time column of v$standby_log to make
>>>>>>> our logic foolproof?
>>>>>>>
>>>>>>> Also I am wondering if there is any way we can handle UPDATE/DELETE of
>>>>>>> records in the source?
>>>>>>>
>>>>>>> On Primary:-
>>>>>>>
>>>>>>> Row-1 created at 18th DEC 10:00AM with date_created as 18th DEC
>>>>>>> 10:00AM
>>>>>>>
>>>>>>> Row-2 created at 18th DEC 10:05 AM with date_created as 18th DEC
>>>>>>> 10:05 AM
>>>>>>>
>>>>>>> Row-3 created at 18th DEC 10:06 AM with date_created as 18th DEC
>>>>>>> 10:06 AM
>>>>>>>
>>>>>>> On DR when we have lag of say ~5minutes:-
>>>>>>>
>>>>>>> Row-1 created at 18th DEC 10:05AM with date_created as 18th DEC
>>>>>>> 10:00AM
>>>>>>>
>>>>>>> Row-3 created at 18th DEC 10:11:01 AM with date_created as 18th DEC
>>>>>>> 10:06AM
>>>>>>>
>>>>>>> Row-2 created at 18th DEC 10:11:02 AM with date_created as 18th DEC
>>>>>>> 10:05AM
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>> --
>>>> Niall Litchfield
>>>> Oracle DBA
>>>> http://www.orawin.info
>>>>
>>>
>
> --
> Niall Litchfield
> Oracle DBA
> http://www.orawin.info
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Dec 31 2020 - 12:48:24 CET
