Re: Lag Time Between Log File Switch and Archive Log File

From: joel garry <joel-garry_at_home.com>
Date: 14 Jun 2006 14:58:34 -0700
Message-ID: <1150322314.634445.176750@r2g2000cwb.googlegroups.com>

dhoffman0917_at_yahoo.com wrote:
> Colleagues:
>
> Has anyone experienced lag times after an automatic log switch happens
> and the time the archive log file appears in the archive log directory?
> I am running Oracle9i (9.2) on Solaris/UNIX.

You are likely experiencing I/O contention during times of high transaction activity. You are generating a gigabyte of logs every 4 minutes. Those logs are being copied by the archiver and taking up to an hour to do so. Oracle is probably starting up additional archiving processes to handle the load. Each log switch requires a checkpoint - perhaps they are colliding with each other as one starts before the previous finishes. So I would start by looking at these things:

How many archiver processes are there? How active are they? Run top and see what it thinks. Is the checkpoint process going bonkers? cpu's maxed out? How many cpu's and what sort of disk do you have, anyways?
What is the system waiting on? Look at the wait tables for a clue. If you have OEM, there are a couple of very interesting displays. In particular, see if your log buffers are having fits. How big are your log buffers?
Have you considered having more redo logs? Some might think a switch every few minutes to be excessive.
How big is your SGA? While it is less of a big deal than in times past, at your transaction volume you may be dealing with the checkpoint trying to clean up hot blocks and just not being able to do it in time. Do you have hot blocks?
What sort of file system/controller are your archives going through? A gross configuration error would be if they are on RAID-5 and I/O is colliding with, say, undo files. Archiving logs is a serial write-intensive process and the device they are going to should reflect that. If you have your redo and archiving on the same controller that could make the archiver feel like it is going up the crowded down escalator.
Are there any messages in the alert log? Checkpoint not complete, perhaps?
Are you using a journalled file system? SAME? Is your code stupid? Sometimes huge redo can be generated for no good reason and some simple fixes make all the difference. What kind of backups do you use? Tablespaces left in hot backup (not RMAN) can generate "excessive" redo.
Are you loading data during these heavy times? Steps can be taken to reduce that, for example, you can load the data unrecoverably then take a backup afterwards. I mean, 1-3AM and after 4:30PM what are you really doing... if you are loading data while taking hot backups, ooooops.

As you can see, such a symptom can range from simple not-enough-redo to advanced dba work to sysadmin issues. Just throwing hardware at it wouldn't be smart.

It is also important to give exact versions of everything.

>
> For additional information, below are SQL results of time differences
> between when the log switch happened and when the archive log file was
> written to disk.
>
> I welcome all ideas at your earliest convenience on this and thanks in
> advance.
>
> Donna
>
>
>
> BYTES NEXT_TIME COMPLETION_TIME LAG
> ---------------- -------------------- -------------------- ----------
> 1,048,568,320 JUN-13-2006 00:28:20 JUN-13-2006 01:45:54 77
> 1,048,573,440 JUN-13-2006 01:11:11 JUN-13-2006 01:46:27 35
> 1,048,574,976 JUN-13-2006 01:26:43 JUN-13-2006 01:47:06 20
> 1,048,574,976 JUN-13-2006 01:29:18 JUN-13-2006 01:47:40 18
> 1,048,574,976 JUN-13-2006 01:41:32 JUN-13-2006 01:48:12 6
> 1,048,574,976 JUN-13-2006 01:45:54 JUN-13-2006 01:48:28 2
> 1,048,572,928 JUN-13-2006 02:08:46 JUN-13-2006 03:07:04 58
> 1,048,574,976 JUN-13-2006 02:23:27 JUN-13-2006 03:07:50 44
> 1,048,569,856 JUN-13-2006 02:38:10 JUN-13-2006 03:08:19 30
> 1,048,566,784 JUN-13-2006 02:42:02 JUN-13-2006 03:09:01 26
> 1,048,571,392 JUN-13-2006 03:04:52 JUN-13-2006 03:09:33 4
> 1,048,570,368 JUN-13-2006 03:44:33 JUN-13-2006 07:48:41 244
> 347,355,136 JUN-13-2006 04:44:29 JUN-13-2006 07:47:54 183
> 2,733,056 JUN-13-2006 05:44:28 JUN-13-2006 07:47:54 123
> 2,704,896 JUN-13-2006 06:44:28 JUN-13-2006 07:47:54 63
> 45,496,832 JUN-13-2006 07:44:30 JUN-13-2006 07:47:55 3
> 6,064,640 JUN-13-2006 08:44:35 JUN-13-2006 11:34:38 170
> 3,440,128 JUN-13-2006 09:44:33 JUN-13-2006 11:34:38 110
> 7,715,840 JUN-13-2006 10:44:34 JUN-13-2006 11:34:38 50
> 2,077,696 JUN-13-2006 11:32:58 JUN-13-2006 11:34:38 1
> 512 JUN-13-2006 11:32:58 JUN-13-2006 11:34:38 1
> 88,776,192 JUN-13-2006 12:32:58 JUN-13-2006 13:39:56 66
> 25,733,632 JUN-13-2006 13:32:58 JUN-13-2006 13:39:57 6
> 687,235,584 JUN-13-2006 14:32:58 JUN-13-2006 16:41:49 128
> 10,010,112 JUN-13-2006 15:32:58 JUN-13-2006 16:41:05 68
> 1,048,574,976 JUN-13-2006 16:30:14 JUN-13-2006 16:42:46 12
> 1,048,572,416 JUN-13-2006 16:33:58 JUN-13-2006 16:42:59 9
> 1,048,572,928 JUN-13-2006 16:37:33 JUN-13-2006 16:43:59 6
> 1,048,574,976 JUN-13-2006 16:41:21 JUN-13-2006 16:44:15 2
> 1,048,572,416 JUN-13-2006 16:46:43 JUN-13-2006 17:07:35 20
> 1,048,574,464 JUN-13-2006 16:50:30 JUN-13-2006 17:07:32 17
> 1,048,574,464 JUN-13-2006 16:54:28 JUN-13-2006 17:09:16 14
> 1,048,574,976 JUN-13-2006 16:58:48 JUN-13-2006 17:08:53 10
> 1,048,573,440 JUN-13-2006 17:03:01 JUN-13-2006 17:10:07 7
> 1,048,574,976 JUN-13-2006 17:07:40 JUN-13-2006 17:10:16 2
jg

--
@home.com is bogus.
"You guys f--- it up.  Stop covering your a-- and figure it out."
US Representative Bob Filner to Veterans Administration officials in
public news conference.

Received on Wed Jun 14 2006 - 16:58:34 CDT