RE: Archive Log Size

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Fri, 12 Mar 2021 13:38:18 -0500
Message-ID: <088b01d7176e$df820300$9e860900$_at_rsiz.com>



I am rusty digging in on these details and they have almost certainly changed a few things, but if I recall correctly then once upon a time the reason for a switch appeared in one or more of the alert logs from the set alert.log, the lgwr log, the ckpt log, and the arch log.  

Now I always tried to keep quite a few log groups in the rotation (on the order of ten) because there are several things that need to be complete for a log that has been switched out of before it can be used again.  

Finally mean time to recover of course cares whether the redo data has made it to the standby, but it is the accumulated unapplied redo maximum that seems likely to be applicable within that time that still might drive a checkpoint and switch. I am also rusty on those mechanics and they have probably also changed since last I dug into the precise blow by blow order of mandatory events to ensure reliability in all edge cases.  

So, are there any hints for the switch reasons in the logs? IF not, can ARCH, CKPT, and LGWR be traced at a higher level to cause them to puke switch reasons?  

mwf  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Henry Poras Sent: Friday, March 12, 2021 1:04 PM
To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Archive Log Size  

Wow. I just reread my last response and it reads just like what it was, free form thoughts. Let me try and clarify what I meant.  

Initially, Jonathan's supposition that somehow the MTTR setting gets the standby to force these short log switches seems reasonable. However, upon further thought, I have some questions. First, I typically associate fast_start_mttr_target with checkpointing, and that isn't a standby thing. Second, the Standby gets its data from LGWR (the creator for dest_id=2), so the data should be in the SRL without a log switch  

This also got me wondering if LNS (LGWR) falling behind in writing from the log buffer to the standby SRL might play a role. So I looked at x$logbuf_readhist, but the HITRATE on all 3 nodes was 98%, so it looks like no backlog there.  

  1 select indx, inst_id, bufsize, rdmemblks, rddiskblks, hitrate, bufinfo   2* from x$logbuf_readhist
primary:sys_at_ohcops1> /

      INDX    INST_ID BUFSIZE              RDMEMBLKS                RDDISKBLKS                  HITRATE BUFINFO

---------- ---------- -------------------- ------------------------ ------------------------ ---------- --------------------
0 1 107812K 53747612 1869757119 2 1 1 129374K 361691660 1561813071 18 2 1 150937K 727333789 1196170942 37 3 1 172499K 1303643599 619861132 67 4 1 194062K 1834188083 89316648 95 TARGET-90 5 1 215624K 1896707348 26797383 98 CURRENT 6 1 237186K 1897284975 26219756 98 7 1 258749K 1897284975 26219756 98 8 1 280311K 1897284975 26219756 98 9 1 301874K 1897284975 26219756 98 10 1 323436K 1897284975 26219756 98 11 1 344998K 1897284975 26219756 98 12 1 366561K 1897284975 26219756 98 13 1 388123K 1897284975 26219756 98 14 1 409686K 1897284975 26219756 98 15 1 431248K 1897284975 26219756 98

16 rows selected.  

I'm back to wondering if there is a way to distinguish between a local vs. standby effect here.  

Henry    

On Thu, Mar 11, 2021 at 10:43 AM Henry Poras <henry.poras_at_gmail.com> wrote:

Your supposition makes sense.I associate fast_start_mttr with checkpointing, which isn't happening on the standby. Just reading directly from the SRL. However, on the other hand, it would still be strange for this to be the cause of the small archive. We also know from earlier that the Standby (log_archive_dest_2) is getting its data from LGWR so the log switches shouldn't be necessary to obtain the same mttr on the standby. I also tend to get confused by some of the interrelations between fsat_start_mttr_target, archive_lag_target, ...  

If it is mttr related, I would guess that doubling my fast_start_mttr_target should double the size of my archive logs (seeing that the current archive sizing is pretty consistent). Not sure if I'll be able to do that, but it's worth a try. I think it would be pretty harmless.  

Henry    

On Thu, Mar 11, 2021 at 10:07 AM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

I wonder if the MTTR gets reflected by something in the Standby.

If you're mean time to recover has to be 15 seconds then maybe Oracle does something to ensure that the standby could recover inside the MTTR as well - and archiving log files prematurely might be a mechanism to allow that to happen.  

Regards

Jonathan Lewis      

On Wed, 10 Mar 2021 at 15:29, Henry Poras <henry.poras_at_gmail.com> wrote:

fast_start_mttr_target integer 15  

gv$instance_recovery

INST_ID                        : 1
RECOVERY_ESTIMATED_IOS         : 18594
ACTUAL_REDO_BLKS               : 369677
TARGET_REDO_BLKS               : 1002837
LOG_FILE_SIZE_REDO_BLKS        : 3821985
LOG_CHKPT_TIMEOUT_REDO_BLKS    : 1002837
LOG_CHKPT_INTERVAL_REDO_BLKS   :
FAST_START_IO_TARGET_REDO_BLKS :
TARGET_MTTR                    : 15
ESTIMATED_MTTR                 : 12
CKPT_BLOCK_WRITES              : 91402225
OPTIMAL_LOGFILE_SIZE           : 457
ESTD_CLUSTER_AVAILABLE_TIME    : 3
WRITES_MTTR                    : 211343278
WRITES_LOGFILE_SIZE            : 0
WRITES_LOG_CHECKPOINT_SETTINGS : 0
WRITES_OTHER_SETTINGS          : 0
WRITES_AUTOTUNE                : 24832295
WRITES_FULL_THREAD_CKPT        : 97904


-----------------
INST_ID : 2 RECOVERY_ESTIMATED_IOS : 14247 ACTUAL_REDO_BLKS : 268501 TARGET_REDO_BLKS : 1047425 LOG_FILE_SIZE_REDO_BLKS : 3821985 LOG_CHKPT_TIMEOUT_REDO_BLKS : 1047425 LOG_CHKPT_INTERVAL_REDO_BLKS : FAST_START_IO_TARGET_REDO_BLKS : TARGET_MTTR : 15 ESTIMATED_MTTR : 7 CKPT_BLOCK_WRITES : 8211142 OPTIMAL_LOGFILE_SIZE : 596 ESTD_CLUSTER_AVAILABLE_TIME : 3 WRITES_MTTR : 6818539 WRITES_LOGFILE_SIZE : 0 WRITES_LOG_CHECKPOINT_SETTINGS : 0 WRITES_OTHER_SETTINGS : 0 WRITES_AUTOTUNE : 37153280 WRITES_FULL_THREAD_CKPT : 134238
-----------------
INST_ID : 3 RECOVERY_ESTIMATED_IOS : 7800 ACTUAL_REDO_BLKS : 262743 TARGET_REDO_BLKS : 1035890 LOG_FILE_SIZE_REDO_BLKS : 3821985
LOG_CHKPT_TIMEOUT_REDO_BLKS : 1035890 LOG_CHKPT_INTERVAL_REDO_BLKS : FAST_START_IO_TARGET_REDO_BLKS : TARGET_MTTR : 15 ESTIMATED_MTTR : 14 CKPT_BLOCK_WRITES : 6467159 OPTIMAL_LOGFILE_SIZE : 285 ESTD_CLUSTER_AVAILABLE_TIME : 8 WRITES_MTTR : 18266031 WRITES_LOGFILE_SIZE : 0 WRITES_LOG_CHECKPOINT_SETTINGS : 0 WRITES_OTHER_SETTINGS : 0 WRITES_AUTOTUNE : 16983159 WRITES_FULL_THREAD_CKPT : 81827
-----------------

That's a table I almost never look at. I need to go back and review it a bit.

Henry  

On Wed, Mar 10, 2021 at 4:59 AM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:  

I don't think fast_start_mttr_target has been mentioned explicitly yet, nor (g)v$instance_recovery.

Any information there.  

Regards

Jonathan Lewis    

On Tue, 9 Mar 2021 at 22:34, Henry Poras <henry.poras_at_gmail.com> wrote:

Jonathan,

According to the docs, "A 0 value disables the time-based thread advance feature" so I am going under the assumption that 900 means 900. Of course, sometimes docs lie.  

Tim, what you suggest makes sense, but it doesn't match up with what I am seeing in this environment. There is some data in the first few entries in this thread. I can summarize over a larger time frame if you like. Basically, each thread is switching every couple of minutes or less, and the archive logs are consistently 40-43M (redo logs are 256M).  

Henry  

--

http://www.freelists.org/webpage/oracle-l Received on Fri Mar 12 2021 - 19:38:18 CET

Original text of this message