Re: RMAN innocent bystanders killed on linux

From: Rajeev Prabhakar <>
Date: Thu, 28 Feb 2008 15:03:19 -0500
Message-ID: <>


Given our recent experience, I am not 100% sure this issue is confined to 2.4
kernels. To share what we saw:

A few weeks back we were facing instance crashes on a RAC cluster (, linux 2.6.9- encountered only during the RMAN runtime window.
Subsequent troubleshooting and research led us to reduce the parallelism and
filesperset settings in the RMAN configuration. That has so far avoided the
zero memory/swap scenario we saw in some Oracle trace files, and we haven't had
any instance crashes during the RMAN backup window since then. Throughout those
problematic backup windows, however, the OS utilities continued to show a
relatively "normal" system from a memory/swap standpoint. So, given what we have
seen, I would agree with Christo that this is an issue associated with
large/heavy I/O operations and the filesystem cache.
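
One way to watch for this during a backup window is to compare free memory
against memory held in the filesystem cache. The following is only a minimal
Python sketch, not something used in the troubleshooting described above. It
assumes the standard /proc/meminfo field names (MemTotal, MemFree, Buffers,
Cached, SwapTotal, SwapFree); the function names and the sample values are
hypothetical and chosen purely for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines whose first token after the colon is a number.
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def cache_summary(info):
    """Summarise free memory, filesystem cache, and swap in use (all in kB)."""
    cache_kb = info.get("Buffers", 0) + info.get("Cached", 0)
    return {
        "free_kb": info.get("MemFree", 0),
        "cache_kb": cache_kb,
        "cache_pct": round(100.0 * cache_kb / info["MemTotal"], 1),
        "swap_used_kb": info.get("SwapTotal", 0) - info.get("SwapFree", 0),
    }


# Hypothetical sample values, NOT taken from the affected cluster.
SAMPLE = """\
MemTotal:     16000000 kB
MemFree:        200000 kB
Buffers:        300000 kB
Cached:       12500000 kB
SwapTotal:     8000000 kB
SwapFree:      7900000 kB
"""

if __name__ == "__main__":
    print(cache_summary(parse_meminfo(SAMPLE)))
```

On a live host the same functions could be fed the contents of /proc/meminfo
at intervals during the backup. A cache share that climbs while swap usage
stays low would be consistent with the pattern described above, though it
would not by itself prove the cause.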


On Thu, Feb 28, 2008 at 1:38 PM, Christo Kutrovsky <> wrote:
> Hello,
> This is a known issue with 2.4 kernels. It has less to do with low
> memory than with incorrect memory counting by the OOM module.
> It is related to large file I/O operations, which use a lot of file
> system cache.
> Enable DIRECTIO (filesystemio_options=directio). In the 2.4 kernel you have
> either DIRECTIO or ASYNCH for ext3 (I am assuming you are using ext3),
> not both; if you use "setall", async will take precedence.
> Note that this will only help you with your duplicate. If you start a
> "cp", someone will still get killed. I believe there's a bugfix for the 2.4
> kernel. Make sure you are using the latest 2.4 kernel.
> If you really need more info, I can try to look up the kernel that had
> this issue, and the kernel that did not.
> --
> Christo Kutrovsky
> DBA Team Lead
> The Pythian Group -
> I blog at

Received on Thu Feb 28 2008 - 14:03:19 CST
