RE: split block (torn page) problem

From: Mark W. Farnham <>
Date: Thu, 8 Dec 2011 08:19:20 -0500
Message-ID: <002c01ccb5ab$ffb663c0$ff232b40$>

The problem is at a more fundamental level. Disk sector sizes are mostly 512 usable bytes, with some at 4096.

Whenever the logical block size in Oracle is bigger than the sector size (almost always now), it is possible for the OS to have promised Oracle's write without the write in fact being truly complete yet.
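To make the size mismatch concrete, here is a minimal sketch, assuming an 8KB logical block on 512-byte sectors. The function name torn_write and the byte patterns are invented for illustration; this models only the arithmetic, not any real I/O path.

```python
SECTOR = 512   # assumed physical sector size in bytes
BLOCK = 8192   # assumed Oracle logical block size in bytes


def torn_write(old: bytes, new: bytes, sectors_done: int) -> bytes:
    """Return the on-disk image if only the first `sectors_done` sectors landed."""
    cut = sectors_done * SECTOR
    return new[:cut] + old[cut:]


old_block = b"A" * BLOCK              # older vintage already on disk
new_block = b"B" * BLOCK              # image the database asked to write
sectors_per_block = BLOCK // SECTOR   # one logical write spans this many sectors

# Hypothetical failure: only 5 of the sectors reached the platter.
on_disk = torn_write(old_block, new_block, sectors_done=5)
is_torn = on_disk != old_block and on_disk != new_block
print(sectors_per_block, is_torn)
```

With these assumed sizes one logical write covers 16 sectors, so there are 15 intermediate states in which the block on disk is neither the old image nor the new one.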

Whether you call that an OS or hardware failure, it is indeed a failure that can happen, ergo the need for backups combined with a valid redo stream to ensure recoverability.

This is qualitatively different from "physical hot backups", where Oracle's dbwr process might issue an 8KB block write, be waiting for confirmation, and at the same time an OS utility operating at the native disk sizes and boundaries sweeps up physical sectors containing part of the logical Oracle block already written and other physical sectors still containing an older vintage. No hardware error has happened, but unless you had properly put the tablespace containing that file or raw component into "backup mode", the redo stream might not have the information to reconstruct a valid version of the block. No media error or OS error is required in this case, but it is an opportunity for pilot error.
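The race described above can be sketched as a toy simulation. Everything here is an assumption for illustration: the block layout (a 4-byte version stamp at the head and again at the tail) is invented and is not Oracle's block format, and the interleaving is fixed by hand rather than produced by real concurrent processes.

```python
import struct

SECTOR = 512
BLOCK = 8192
NSECT = BLOCK // SECTOR


def make_block(version: int) -> bytes:
    """Hypothetical layout: 4-byte version stamp at the head and at the tail."""
    stamp = struct.pack(">I", version)
    return stamp + bytes(BLOCK - 8) + stamp


def is_fractured(block: bytes) -> bool:
    """Head and tail stamps disagree when the image mixes two versions."""
    return block[:4] != block[-4:]


disk = bytearray(make_block(1))   # older vintage currently on disk
new = make_block(2)               # image the writer is flushing
backup = bytearray(BLOCK)

# The writer has updated the first half of the sectors...
for s in range(NSECT // 2):
    disk[s * SECTOR:(s + 1) * SECTOR] = new[s * SECTOR:(s + 1) * SECTOR]

# ...when the OS copy utility sweeps up whatever is on disk right now...
backup[:] = disk

# ...and only afterwards does the writer finish the remaining sectors.
for s in range(NSECT // 2, NSECT):
    disk[s * SECTOR:(s + 1) * SECTOR] = new[s * SECTOR:(s + 1) * SECTOR]

print(is_fractured(bytes(disk)), is_fractured(bytes(backup)))
```

In this sketch the data file ends up consistent while the backup copy holds a new head and an old tail, which is the situation "backup mode" exists to make recoverable.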

RMAN avoids this particular problem by using Oracle logical writes instead of an OS utility.

But in the case of an OS ordered physical write being partially complete, that is an OS or hardware failure, whether it originates from a race condition with power loss or failure of the storage substrate. So even if the pilot has done everything correctly and there is no bug in Oracle, you may need to do a media recovery.

The reason I'm beating on this is that the commentary can be read to suggest that battery backed storage or ASM might remove the need for backups. That is most certainly NOT what Riyaj was suggesting by his remark on probabilities. It is probably not what you mean either, so please treat this as clarification for newbies rather than some sort of flame.

You need backups to recover from media errors. Partial writes from the OS layer are rightly considered media errors and can sometimes result from an abrupt power outage or other crash.

Normally an Oracle warm restart applies the proper deltas from the redo stream's online redo logs to bring current the blocks that were lost dirty in the buffer cache at the time of the crash. Unless there is already a corrupt block in the file that instance recovery is operating on, that should just work (or else you've found a bug).
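As a rough illustration of "applying deltas to bring blocks current", here is a simplified model. The Block and RedoRecord types, the roll_forward function, and the record fields are all hypothetical; real redo records and recovery logic are far more involved.

```python
from dataclasses import dataclass


@dataclass
class Block:
    scn: int          # change number of the last change present in this block
    data: bytearray


@dataclass
class RedoRecord:
    block_no: int
    scn: int
    offset: int
    payload: bytes


def roll_forward(datafile: dict, redo: list) -> int:
    """Apply, in SCN order, every redo record newer than the block on disk."""
    applied = 0
    for rec in sorted(redo, key=lambda r: r.scn):
        blk = datafile[rec.block_no]
        if rec.scn > blk.scn:   # this change was lost with the buffer cache
            blk.data[rec.offset:rec.offset + len(rec.payload)] = rec.payload
            blk.scn = rec.scn
            applied += 1
    return applied


datafile = {7: Block(scn=100, data=bytearray(b"old-row-"))}
redo = [
    RedoRecord(7, 90, 0, b"xxx"),    # already reflected on disk, skipped
    RedoRecord(7, 101, 0, b"new"),   # was dirty in the cache at crash time
    RedoRecord(7, 102, 4, b"ROW"),
]
applied = roll_forward(datafile, redo)
print(applied, bytes(datafile[7].data), datafile[7].scn)
```

Note that the model assumes the starting block is a valid older version. A delta applied on top of a torn block would simply produce another invalid block, which is why a fractured block needs media recovery from a backup rather than instance recovery.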

-----Original Message-----
From: [] On Behalf Of
Sent: Thursday, December 08, 2011 6:52 AM
To: ORACLE-L
Subject: Re: split block (torn page) problem

> With ASM, file system caching is not involved and write calls operate on
> devices. So, probability is much less with ASM.

It can be asked the other way around. It is very common for personal computers to crash again and again. Yet disk corruptions resulting from those crashes either go undetected or ... happen very rarely. How big is the probability of disk devices corrupting blocks on writes?

My understanding would be that:

If we take battery-backed disk devices, then the disk can pretty much guarantee that a block write is atomic (it either fails or succeeds). Messaging from the server to the disk device is no problem, provided some minimal care is taken to check (or even checksum) that the disk receives a whole block.
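The "check (or even checksum)" idea could look something like the following sketch. The framing (payload followed by a 4-byte CRC32) and the names seal and verify are assumptions made up for illustration, not any real storage protocol.

```python
import zlib

BLOCK = 8192   # assumed block size in bytes


def seal(payload: bytes) -> bytes:
    """Append a 4-byte CRC32 of the payload (hypothetical framing)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify(frame: bytes) -> bool:
    """True only if a whole block arrived and it matches its checksum."""
    if len(frame) != BLOCK + 4:
        return False
    payload, crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc


frame = seal(b"B" * BLOCK)
truncated = frame[:4096]               # only half of the block arrived
damaged = b"A" * 512 + frame[512:]     # one stale sector mixed in

print(verify(frame), verify(truncated), verify(damaged))
```

A receiver that rejects anything failing verify can treat the block as all-or-nothing, which is the property being asked of the device here.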

It would be interesting to know how big the probability of disk block corruption is for non-battery-backed disks.

None of that prevents a torn (fractured) database page consisting of many blocks, of course, unless the path from the server process to the disk device is protected so that it delivers the whole db page or nothing.

brgds, Laimis N

From: Riyaj Shamsudeen <>
Cc: ORACLE-L <>
Date: 2011.12.08 07:24
Subject: Re: split block (torn page) problem

Crash recovery applies redo records to roll forward the changes. If a modified block is not written yet to the disk, that is okay, as redo records from the log files can be used to replay the changes. But if the block is not written properly or is fractured, then crash recovery will raise corruption errors and can't correct the corruption.

In enterprise servers, server reboots do not cause any corruption (usually). From Oracle's point of view, a buffer is filled with the block image and the I/O is submitted to the OS. If the OS splits the call into smaller chunks (say 4K) and writes with two atomic calls underneath the write system call (and the first 4K chunk write succeeded but the second 4K chunk write did not), it is possible for the corruption to occur, but it is a corner case and you must be very unfortunate :-)

With ASM, file system caching is not involved and write calls operate on devices. So, probability is much less with ASM.

HTH
Cheers

Riyaj Shamsudeen
Principal DBA,
Ora!nternals - Specialists in Performance, RAC and EBS
OakTable member and Oracle ACE Director

Co-author of the books: Expert Oracle Practices, Pro Oracle SQL, Expert PL/SQL



Received on Thu Dec 08 2011 - 07:19:20 CST
