RE: Archive Log Size

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Fri, 12 Mar 2021 13:38:18 -0500
Message-ID: <088b01d7176e$df820300$9e860900$_at_rsiz.com>

I am rusty digging in on these details and they have almost certainly changed a few things, but if I recall correctly then once upon a time the reason for a switch appeared in one or more of the alert logs from the set alert.log, the lgwr log, the ckpt log, and the arch log.

Now I always tried to keep quite a few log groups in the rotation (on the order of ten) because there are several things that need to be complete for a log that has been switched out of before it can be used again.

Finally mean time to recover of course cares whether the redo data has made it to the standby, but it is the accumulated unapplied redo maximum that seems likely to be applicable within that time that still might drive a checkpoint and switch. I am also rusty on those mechanics and they have probably also changed since last I dug into the precise blow by blow order of mandatory events to ensure reliability in all edge cases.

So, are there any hints for the switch reasons in the logs? IF not, can ARCH, CKPT, and LGWR be traced at a higher level to cause them to puke switch reasons?

mwf

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Henry Poras Sent: Friday, March 12, 2021 1:04 PM
To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Archive Log Size

Wow. I just reread my last response and it reads just like what it was, free form thoughts. Let me try and clarify what I meant.

Initially, Jonathan's supposition that somehow the MTTR setting gets the standby to force these short log switches seems reasonable. However, upon further thought, I have some questions. First, I typically associate fast_start_mttr_target with checkpointing, and that isn't a standby thing. Second, the Standby gets its data from LGWR (the creator for dest_id=2), so the data should be in the SRL without a log switch

This also got me wondering if LNS (LGWR) falling behind in writing from the log buffer to the standby SRL might play a role. So I looked at x$logbuf_readhist, but the HITRATE on all 3 nodes was 98%, so it looks like no backlog there.

1 select indx, inst_id, bufsize, rdmemblks, rddiskblks, hitrate, bufinfo 2* from x$logbuf_readhist
primary:sys_at_ohcops1> /

      INDX    INST_ID BUFSIZE              RDMEMBLKS                RDDISKBLKS                  HITRATE BUFINFO


---------- ---------- -------------------- ------------------------ ------------------------ ---------- --------------------


         0          1 107812K              53747612                 1869757119                        2
         1          1 129374K              361691660                1561813071                       18
         2          1 150937K              727333789                1196170942                       37
         3          1 172499K              1303643599               619861132                        67
         4          1 194062K              1834188083               89316648                         95 TARGET-90
         5          1 215624K              1896707348               26797383                         98 CURRENT
         6          1 237186K              1897284975               26219756                         98
         7          1 258749K              1897284975               26219756                         98
         8          1 280311K              1897284975               26219756                         98
         9          1 301874K              1897284975               26219756                         98
        10          1 323436K              1897284975               26219756                         98
        11          1 344998K              1897284975               26219756                         98
        12          1 366561K              1897284975               26219756                         98
        13          1 388123K              1897284975               26219756                         98
        14          1 409686K              1897284975               26219756                         98
        15          1 431248K              1897284975               26219756                         98

16 rows selected.

I'm back to wondering if there is a way to distinguish between a local vs. standby effect here.

Henry

On Thu, Mar 11, 2021 at 10:43 AM Henry Poras <henry.poras_at_gmail.com> wrote:

Your supposition makes sense.I associate fast_start_mttr with checkpointing, which isn't happening on the standby. Just reading directly from the SRL. However, on the other hand, it would still be strange for this to be the cause of the small archive. We also know from earlier that the Standby (log_archive_dest_2) is getting its data from LGWR so the log switches shouldn't be necessary to obtain the same mttr on the standby. I also tend to get confused by some of the interrelations between fsat_start_mttr_target, archive_lag_target, ...

If it is mttr related, I would guess that doubling my fast_start_mttr_target should double the size of my archive logs (seeing that the current archive sizing is pretty consistent). Not sure if I'll be able to do that, but it's worth a try. I think it would be pretty harmless.

Henry

On Thu, Mar 11, 2021 at 10:07 AM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

I wonder if the MTTR gets reflected by something in the Standby.

If you're mean time to recover has to be 15 seconds then maybe Oracle does something to ensure that the standby could recover inside the MTTR as well - and archiving log files prematurely might be a mechanism to allow that to happen.

Regards

Jonathan Lewis

On Wed, 10 Mar 2021 at 15:29, Henry Poras <henry.poras_at_gmail.com> wrote:

fast_start_mttr_target integer 15

gv$instance_recovery

INST_ID                        : 1
RECOVERY_ESTIMATED_IOS         : 18594
ACTUAL_REDO_BLKS               : 369677
TARGET_REDO_BLKS               : 1002837
LOG_FILE_SIZE_REDO_BLKS        : 3821985
LOG_CHKPT_TIMEOUT_REDO_BLKS    : 1002837
LOG_CHKPT_INTERVAL_REDO_BLKS   :
FAST_START_IO_TARGET_REDO_BLKS :
TARGET_MTTR                    : 15
ESTIMATED_MTTR                 : 12
CKPT_BLOCK_WRITES              : 91402225
OPTIMAL_LOGFILE_SIZE           : 457
ESTD_CLUSTER_AVAILABLE_TIME    : 3
WRITES_MTTR                    : 211343278
WRITES_LOGFILE_SIZE            : 0
WRITES_LOG_CHECKPOINT_SETTINGS : 0
WRITES_OTHER_SETTINGS          : 0
WRITES_AUTOTUNE                : 24832295
WRITES_FULL_THREAD_CKPT        : 97904



-----------------



INST_ID                        : 2
RECOVERY_ESTIMATED_IOS         : 14247
ACTUAL_REDO_BLKS               : 268501
TARGET_REDO_BLKS               : 1047425
LOG_FILE_SIZE_REDO_BLKS        : 3821985
LOG_CHKPT_TIMEOUT_REDO_BLKS    : 1047425
LOG_CHKPT_INTERVAL_REDO_BLKS   :
FAST_START_IO_TARGET_REDO_BLKS :
TARGET_MTTR                    : 15
ESTIMATED_MTTR                 : 7
CKPT_BLOCK_WRITES              : 8211142
OPTIMAL_LOGFILE_SIZE           : 596
ESTD_CLUSTER_AVAILABLE_TIME    : 3
WRITES_MTTR                    : 6818539
WRITES_LOGFILE_SIZE            : 0
WRITES_LOG_CHECKPOINT_SETTINGS : 0
WRITES_OTHER_SETTINGS          : 0
WRITES_AUTOTUNE                : 37153280
WRITES_FULL_THREAD_CKPT        : 134238



-----------------



INST_ID                        : 3
RECOVERY_ESTIMATED_IOS         : 7800
ACTUAL_REDO_BLKS               : 262743
TARGET_REDO_BLKS               : 1035890
LOG_FILE_SIZE_REDO_BLKS        : 3821985
LOG_CHKPT_TIMEOUT_REDO_BLKS    : 1035890
LOG_CHKPT_INTERVAL_REDO_BLKS   :
FAST_START_IO_TARGET_REDO_BLKS :
TARGET_MTTR                    : 15
ESTIMATED_MTTR                 : 14
CKPT_BLOCK_WRITES              : 6467159
OPTIMAL_LOGFILE_SIZE           : 285
ESTD_CLUSTER_AVAILABLE_TIME    : 8
WRITES_MTTR                    : 18266031
WRITES_LOGFILE_SIZE            : 0
WRITES_LOG_CHECKPOINT_SETTINGS : 0
WRITES_OTHER_SETTINGS          : 0
WRITES_AUTOTUNE                : 16983159
WRITES_FULL_THREAD_CKPT        : 81827



-----------------

That's a table I almost never look at. I need to go back and review it a bit.

Henry

On Wed, Mar 10, 2021 at 4:59 AM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

I don't think fast_start_mttr_target has been mentioned explicitly yet, nor (g)v$instance_recovery.

Any information there.

Regards

Jonathan Lewis

On Tue, 9 Mar 2021 at 22:34, Henry Poras <henry.poras_at_gmail.com> wrote:

Jonathan,

According to the docs, "A 0 value disables the time-based thread advance feature" so I am going under the assumption that 900 means 900. Of course, sometimes docs lie.

Tim, what you suggest makes sense, but it doesn't match up with what I am seeing in this environment. There is some data in the first few entries in this thread. I can summarize over a larger time frame if you like. Basically, each thread is switching every couple of minutes or less, and the archive logs are consistently 40-43M (redo logs are 256M).

Henry

--

http://www.freelists.org/webpage/oracle-l Received on Fri Mar 12 2021 - 19:38:18 CET

This message: [ Message body ]
Next message: nilesh kumar: "Oracle ODA suggestions."
Previous message: Henry Poras: "Re: Archive Log Size"
In reply to: Henry Poras: "Re: Archive Log Size"
Next in thread: Henry Poras: "Re: Archive Log Size"
Reply: Henry Poras: "Re: Archive Log Size"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message