RE: Redo Log improvement - how to analyze and attack?

From: Bobak, Mark <Mark.Bobak_at_proquest.com>
Date: Thu, 15 Nov 2012 15:06:23 -0500
Message-ID: <6AFC12B9BFCDEA45B7274C534738067F017A302312_at_AAPQMAILBX02V.proque.st>

I would look for log file switch completion wait events among the top 5 timed events in an AWR report and/or significant waits on specific sessions being traced for performance.

If there are no log file switch wait events, I wouldn't expect reducing the frequency of log file switches to have any impact.

But, with multiple switches per minute, I find it hard to believe that there's no performance impact on your system.

Finally, if you're having a hard time seeing the problem in AWR, I'd recommend taking some samples, at the session level, using Tanel's snapper.sql script, or, enabling 10046 trace on the problem session, and doing some profiling. The data is there, you just need to find it. :-)

-Mark

-----Original Message-----

From: Christopher.Taylor2_at_parallon.net [mailto:Christopher.Taylor2_at_parallon.net] Sent: Thursday, November 15, 2012 3:00 PM To: Bobak, Mark; oracle-l_at_freelists.org Subject: RE: Redo Log improvement - how to analyze and attack?

Mark - how can a person tell what the impact numbers are for having too many log switches. I know Oracle recommends 3-4 times an hour but how can you identify that a lot of log switches are impacting performance - what events from AWR would reflect a negative impact?

I did bump my redo logs up already to 5GB and set archive_lag_Target and only had 3-4 switches last night but everything finished just like normal - no improvements which is what I thought was going to happen based on what I was seeing in the AWR but I thought perhaps I wasn't seeing the issue in the AWR.

Chris

-----Original Message-----

From: Bobak, Mark [mailto:Mark.Bobak_at_proquest.com] Sent: Thursday, November 15, 2012 1:33 PM To: Taylor Christopher - Nashville; oracle-l_at_freelists.org Subject: RE: Redo Log improvement - how to analyze and attack?

Hi Chris,

If you're switching logs 300 times per hour, that's 5 log switches per minute, or about one log switch every 12 seconds. I like to target one log switch every 15-20 minutes. So, if we say we want a log switch every 15 minutes, and you're switching every 12 seconds, you should look at increasing your log file size by a factor of 75x (15x5). Actually, if this is a 3 node cluster, and the 300 log switches per hour is aggregate across all 3 nodes, then you probably are looking at a factor of 25x.

So, that's the first problem I think you should fix, to eliminate your log file switch waits.

Once you've done that, if you have lulls in redo generation, to avoid long periods without a log switch, you could consider setting archive_lag_target, to perhaps 1800 or so.

Hope that helps,

-Mark

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Christopher.Taylor2_at_parallon.net Sent: Wednesday, November 14, 2012 4:49 PM To: oracle-l_at_freelists.org
Subject: Redo Log improvement - how to analyze and attack?

I have this 3-Node RAC (10.2.0.4) that I've inherited (as I've mentioned) and after putting out a few issues I'm ready to move onto the redo logs (and perhaps I should have started here earlier). First question:
Are high numbers of log file switches (during certain hours) enough validation for that an improvement is needed?

In my case, during heavy batch processing, this RAC cluster may experience 300 log switches during an hour (I know, I know - sounds bad doesn't it)

(Caveat: The log file switches were generated from within Toad - I'm assuming they are valid)

My first step was to enable FAST_START_MTTR_TARGET and it is set for 3 minutes (180 seconds).

Here's some of my numbers for the time interval on Node1:

Redo Size:
1,451,121.63 per second

AWR redo/log stats (formatted) here:
https://gist.github.com/4075050

Second question:
V$INSTANCE_RECOVERY OPTIMAL_LOGFILE_SIZE = 5421 (~5 GB) Is OPTIMAL_LOGFILE_SIZE calculated by the *EXISTING LOGFILE GROUPS*? So if I had more groups, would this number go down?

Third Question:
I've seen various mentions of " archive_lag_target" - what do I need to think about regarding this parameter in relation to check points?

Any suggestions/thoughts are appreciated.

Regards,

Chris Taylor
Oracle DBA
Parallon IT&S
christopher.taylor2_at_parallon.net<mailto:christopher.taylor2_at_parallon.net> www.parallon.net

--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l Received on Thu Nov 15 2012 - 21:06:23 CET

This message: [ Message body ]
Next message: Sweetser, Joe: "RE: v$rowcache outstanding_alerts"
Previous message: Christopher.Taylor2_at_parallon.net: "RE: Redo Log improvement - how to analyze and attack?"
In reply to: Christopher.Taylor2_at_parallon.net: "RE: Redo Log improvement - how to analyze and attack?"
Next in thread: Christopher.Taylor2_at_parallon.net: "RE: Redo Log improvement - how to analyze and attack?"
Reply: Christopher.Taylor2_at_parallon.net: "RE: Redo Log improvement - how to analyze and attack?"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message