Re: Redo logs. Please explain.

From: Brent Tucker <bxtucke_at_sp5-18>
Date: 1995/12/11
Message-ID: <4ai6lg$dth_at_cpcnews.cp.nts.uswest.com>


Chuck Hamilton writes:

> I'm trying to learn the DBA job by reading the Oracle manuals. Can
> someone please explain how the redo log files relate to one another?
> If I create a database with the following redo log groups....
>
> CREATE DATABASE "tardis"
> ....
> LOGFILE GROUP 1 ('a.log', 'b.log') size 1m,
> LOGFILE GROUP 2 ('c.log', 'd.log') size 1m;
>
> Which files are mirrors of which? And which are used in the circular
> fashion as described in the concepts manual?
>
> Do 'a' and 'c' mirror each other, or do 'a' and 'b'? I need to know
> so I know which drives to place the files on to protect against media
> failure.
>
> The manuals seem to cover everything about redo logs except this.
> --
> Chuck Hamilton
> chuckh_at_ix.netcom.com
>
> Never share a foxhole with anyone braver than yourself!
>
Chuck,

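Logfile 'b' is a mirror of 'a'. Members of the same group mirror each other; the groups themselves are what get used in circular fashion, so 'a' and 'c' are not mirrors.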

Consider it this way:

                Group 1     Group 2     Group 3
    Member 1    a.log       c.log       e.log
    Member 2    b.log       d.log       f.log

Group 1 consists of 2 members, a.log and b.log. In this scenario, a.log and b.log should be on separate disks and, if possible, separate controllers as well. This eliminates the potential single point of failure. Oracle recommends at least 2 groups of two log members each. Three groups are actually better due to the circular switching that you mentioned. In general, I place the first member of every group on one volume, the second member of every group on a different controller / volume, and so on.
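
A quick way to sanity-check a layout like this is sketched below in Python. The volume names /u01 and /u02 and the function name are made up for illustration; this is not an Oracle tool, just a check that no group keeps two members on the same volume.

```python
from collections import defaultdict

def shared_volume_groups(layout):
    """Return the groups that have two or more members on the same volume.

    layout maps each log file name to (group number, volume name).
    """
    volumes_by_group = defaultdict(list)
    for member, (group, volume) in layout.items():
        volumes_by_group[group].append(volume)
    # A group is at risk if any volume appears more than once among its members.
    return sorted(g for g, vols in volumes_by_group.items()
                  if len(set(vols)) < len(vols))

# Hypothetical layout: first members on /u01, second members on /u02.
layout = {
    "a.log": (1, "/u01"), "b.log": (1, "/u02"),
    "c.log": (2, "/u01"), "d.log": (2, "/u02"),
    "e.log": (3, "/u01"), "f.log": (3, "/u02"),
}

# Same layout, but d.log mistakenly placed next to c.log.
bad_layout = {**layout, "d.log": (2, "/u01")}

print(shared_volume_groups(layout))
print(shared_volume_groups(bad_layout))
```

With the first layout, losing either volume still leaves one intact member in every group; the second layout leaves group 2 exposed.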

As to the switching process, when the members of group 1 fill (since they are mirrored, you can think of them as one member for switching), the LGWR process switches and begins writing to the next group. Several things occur at this point:

  1. If the next group is unavailable (no member of it can be written), the whole thing fails. This is the reason for the mirror members.
  2. If archive log mode is on (it should be), the 'old' group is written to an archive log.
  3. A checkpoint occurs.

Thus, when Group 1 fills, writes occur to Group 2. When Group 2 fills, writes occur to Group 3. When Group 3 fills, writes occur to Group 1, thereby OVERLAYING ANY INFORMATION HELD IN GROUP 1!!! This is very important. If you want to be able to recover your database to a point in time prior to the last time a logfile filled, you had better have the archiver turned on.
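
To make the overwrite concrete, here is a small Python simulation of the cycle. It is only a sketch: the function name is invented, and it assumes the archiver finishes copying a filled group instantly at the switch, which a real system does not guarantee.

```python
def simulate_switches(num_groups, num_switches, archiving):
    """Simulate LGWR cycling through redo log groups.

    Returns (archived, lost): the log sequence numbers copied to archive
    logs, and those overwritten without ever having been archived.
    """
    groups = [None] * num_groups    # sequence number currently held by each group
    archived, lost = [], []
    current = 0
    groups[current] = 1             # LGWR starts by writing sequence 1 to group 1
    for seq in range(2, num_switches + 2):
        # Log switch: the group that just filled is archived if archiving is on.
        if archiving:
            archived.append(groups[current])
        # Move to the next group, wrapping around after the last one.
        current = (current + 1) % num_groups
        old = groups[current]
        if old is not None and old not in archived:
            lost.append(old)        # overwritten with no archived copy
        groups[current] = seq
    return archived, lost

print(simulate_switches(3, 5, archiving=False))
print(simulate_switches(3, 5, archiving=True))
```

Without archiving, the first three logs are gone after five switches; with archiving, every filled log has a copy before its group is reused.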

I have a cron job that writes archive logs to tape every few hours. You can write them to tape directly, but this carries an inherent danger (media failure) and can take a long time depending on the size of your log files.

You will notice in my example that I show three groups. The reason is simple: we run archiver and I want to be sure that the archive log can be written before the switch back to the current 'old' file occurs. Three groups should be ample for most applications.

For best recoverability, size your log files to accommodate 15 minutes or so of data. If your users want less lost work, a smaller log size may be in order. You can also force log file switches at particular times through cron, etc.
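
As a rough worked example of that sizing rule, in Python: the 20 KB/s redo rate below is a made-up figure, so measure your own system before trusting any number that comes out of it.

```python
import math

def log_size_mb(redo_kb_per_sec, minutes_per_switch=15):
    """Log file size (in whole MB) that holds the given minutes of redo."""
    total_kb = redo_kb_per_sec * minutes_per_switch * 60
    return math.ceil(total_kb / 1024)

# Hypothetical system generating 20 KB of redo per second.
print(log_size_mb(20))       # size for a switch roughly every 15 minutes
print(log_size_mb(20, 5))    # size if users want at most about 5 minutes at risk
```

The second call shows the trade-off in the paragraph above: a shorter interval means smaller logs and more frequent switches.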

Another layer of protection is hardware mirroring. The multiple member log groups are a software mirror. You can also mirror to other disks through hardware (RAID 1; RAID 0 is striping only). People argue about the merits of hardware mirroring for redo logs. Granted, it does slow writes down if your logs are mirrored with hardware as well as with software, and hardware mirroring doesn't protect against stupidity (if you delete a file, it is deleted on the mirror too). However, hardware mirroring does help to protect against media failure. I use both just to be doubly safe. Both probably aren't necessary, and too much safety can slow your system. If performance is a problem, pick one (software is probably safer due to the higher chance of eliminating human error) and stick with it.

Your example also shows the size of your logs as 1m. They should probably be larger to prevent frequent switching. Since a switch forces a lot of physical I/O, it is one of the slowest operations that can occur on your DBMS.
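
To see how often a 1m log would switch, here is a back-of-the-envelope calculation in Python. The 20 KB/s redo rate is an assumed figure for illustration only, and the 16 MB comparison size is arbitrary.

```python
def switches_per_hour(redo_kb_per_sec, log_size_kb):
    """Approximate number of log switches per hour at a steady redo rate."""
    return redo_kb_per_sec * 3600 / log_size_kb

# Hypothetical redo rate of 20 KB/s.
print(switches_per_hour(20, 1024))        # 1 MB logs, as in the question
print(switches_per_hour(20, 16 * 1024))   # 16 MB logs for comparison
```

At that assumed rate the 1 MB logs switch more than once a minute, and each of those switches brings a checkpoint with it.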

Most of what I typed here is in the Concepts manual. It is a must read. Get one from Oracle if you don't have one already.

Good luck,

Brent
bxtucke_at_cp.mnet.uswest.com

Received on Mon Dec 11 1995 - 00:00:00 CET
