Re: Primary shows no gap, but standby is missing an archivelog

From: MARK BRINSMEAD <mark.brinsmead_at_gmail.com>
Date: Thu, 26 Feb 2015 01:31:25 -0500
Message-ID: <CAAaXtLBDRjq11eNHAyjDa4F2aj35g_CfXcQSysZikSWp0QkU6A_at_mail.gmail.com>



Yes. In the past, I have found the most useful thing to monitor on a standby is the lag. And for that, the standby does not need to be open, just mounted and applying redo as normal.

If your monitoring alerts as soon as the standby lags the primary by, for example, 30 minutes more than expected (remember, some sites inject an intentional lag), you'll also get pretty good alerts about "gaps" as they arise. Of course, monitoring the alert log on the standby should also do a pretty good job of notifying you about gaps, provided you are using Data Guard.
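
A minimal sketch of that kind of lag check, under some assumptions that are not from this thread: the lag is read from v$dataguard_stats on the mounted standby, the VALUE column comes back as an interval string like '+00 00:31:05', and the names parse_lag and lag_alerts are hypothetical helpers. The 30-minute tolerance and the optional intentional lag are just parameters.

```python
import re
from datetime import timedelta

# Assumed query, run on the standby. v$dataguard_stats is assumed to report
# lag as interval strings such as '+00 00:31:05' (days hours:minutes:seconds).
LAG_QUERY = """
SELECT name, value
  FROM v$dataguard_stats
 WHERE name IN ('transport lag', 'apply lag')
"""

_INTERVAL = re.compile(r"^\+(\d+) (\d{2}):(\d{2}):(\d{2})(?:\.\d+)?$")


def parse_lag(value):
    """Convert '+DD HH:MI:SS' to a timedelta; return None if it cannot be parsed."""
    if value is None:
        return None
    match = _INTERVAL.match(value.strip())
    if match is None:
        return None
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def lag_alerts(rows, expected=timedelta(0), tolerance=timedelta(minutes=30)):
    """Return one message per statistic that is unknown or beyond expected + tolerance.

    rows      -- iterable of (name, value) pairs, e.g. the result of LAG_QUERY
    expected  -- intentional lag configured at the site, if any
    tolerance -- how far past the expected lag is acceptable before alerting
    """
    limit = expected + tolerance
    alerts = []
    for name, value in rows:
        lag = parse_lag(value)
        if lag is None:
            # An unknown lag is itself worth an alert: the standby may be stuck.
            alerts.append(f"{name}: unknown (value={value!r})")
        elif lag > limit:
            alerts.append(f"{name}: {lag} exceeds {limit}")
    return alerts
```

The rows would come from whatever database driver you already use (for example, the result of executing LAG_QUERY through a DB-API cursor); that part is left out here. A site with a deliberate four-hour delay would call lag_alerts(rows, expected=timedelta(hours=4)) so that only lag beyond the intended delay raises an alert.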

On Wed, Feb 25, 2015 at 3:55 PM, Maureen English <maureen.english_at_alaska.edu> wrote:

> Thank you Kenny and Mark!
>
> Yes, I agree, it is all very good advice :-)
>
> Oddly enough, we ran into a similar situation with our production standby
> last summer. If I remember correctly, the problem was discovered when I
> tried to open the standby read-only...which suggests that Mark's guess is
> correct. At that time, though, we hadn't checked v$archive_dest_status
> until after we opened the standby, at which time I'm pretty sure the
> gap_status was 'RESOLVABLE GAP'.
>
> Luckily, this is a database that gets cloned from our production database,
> so it's not a big problem.
> I really like the idea of using RMAN to manage the archivelogs...I can
> easily write the archivelog backups to another filesystem with lots of
> space, thus keeping them for as long as is needed until the next time we
> clone this database.
> Another person on the list also shared some queries that will give us more
> information than what we've been relying on from v$archive_dest_status.
> One of those queries monitors the lag status!
>
> I certainly cannot say that my job is boring here....
>
> Thanks again!
>
> - Maureen
>
> On Wed, Feb 25, 2015 at 6:18 AM, MARK BRINSMEAD <mark.brinsmead_at_gmail.com>
> wrote:
>
>> Very good advice, Maureen.
>>
>> Just at a guess, I would think that there is no "gap" reported until such
>> time as the standby actually TRIES to apply a redo log and finds that it
>> can't. If you haven't deleted the logs yet -- or the standby does not yet
>> know that you have -- why would a gap be reported?
>>
>> Usually -- and especially when you have a standby database -- you should
>> use RMAN to manage deletion of archivelogs (and back them up first, of
>> course). If you find yourself in a "crisis", it is often better to change
>> your LOG_ARCHIVE_DEST locations to point to a different filesystem with
>> more free capacity, even if only for an hour or two. RMAN and Data Guard
>> will both know how to find the archivelogs in the place they were actually
>> written, so you will have less disruption to your backups and your standby
>> that way.
>>
>> Finally, you really ought to monitor the lag on your standby databases.
>> In past environments I worked in, alarm bells would start ringing as soon
>> as a standby fell 15 or 30 minutes behind the primary. Doing this will
>> avoid embarrassing situations where your standby has been stuck on a "gap"
>> for two months and nobody has noticed. (Sometimes "bad things" happen to
>> DBAs in situations like this when they are not lucky enough to be the first
>> to notice.)
>>
>> On Tue, Feb 24, 2015 at 9:26 PM, Kenny Payton <k3nnyp_at_gmail.com> wrote:
>>
>>> I’d have to give your question a little more thought for a good answer
>>> but hopefully you have a backup of that log or are prepared to rebuild the
>>> standby or ship an incremental backup over and apply it to the standby.
>>>
>>> I have a suggestion moving forward. If you’re not already using RMAN,
>>> use it at least to manage your archivelog deletions. You can set the
>>> ARCHIVELOG DELETION POLICY to SHIPPED TO STANDBY or APPLIED ON STANDBY,
>>> and when you run a delete archivelog statement it will not delete
>>> archivelogs that don’t meet your requirements unless you provide the
>>> force option.
>>>
>>> On Feb 24, 2015, at 9:19 PM, Maureen English <maureen.english_at_alaska.edu>
>>> wrote:
>>>
>>> I have a situation where I know what happened and how to prevent it
>>> from happening in the future, but I don't know why it looks like the
>>> problem shouldn't exist.
>>>
>>> A while back, I was doing something in our primary database that
>>> resulted in a lot of archivelogs being generated. We don't have that
>>> much space for archivelogs, so I delete some of them when we start to
>>> run out of space. I always run the following query in the primary
>>> database, before and after deleting archivelogs, and never delete all
>>> of them at the same time:
>>>
>>> SELECT STATUS, GAP_STATUS FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID = 3;
>>>
>>> where 3 is the standby. This query consistently returns 'NO GAP', so I figured
>>> that everything was fine. However, when restarting the standby today, it now
>>> appears that there is an archivelog that has been missing for a very long time.
>>>
>>> Querying v$archived_log shows when the problem started, but why does v$archive_gap
>>> return no rows and v$archive_dest_status return 'NO GAP'? Our standby database is
>>> in real-time apply mode.
>>>
>>> - Maureen
>>>
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Feb 26 2015 - 07:31:25 CET
