Swapping causing RMAN controlfile snapshot to fail?

From: Rich Jesse <rjoralist3_at_society.servebeer.com>
Date: Wed, 4 Sep 2013 11:22:40 -0500 (CDT)
Message-ID: <d1c3158548006e19060453007a7dfc69.squirrel_at_society.servebeer.com>



Hey all,

In 11.2.0.3 under AIX 5.3 TL12, I had a one-time RMAN error where it failed to snapshot the controlfile after a scheduled archive log backup:

RMAN-03009: failure of Control File and SPFILE Autobackup command on ORA_DISK_1 channel at 08/03/2013 02:10:10
ORA-01580: error creating control backup file /u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_xxxxx.f
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 22: Invalid argument
Additional information: 4

I opened an SR and the tech saw this in the alert.log:

WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [6.25%] pct of memory swapped out [11.01%].
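
For anyone wanting to see how often that warning appears in an alert.log, and with what percentages, here is a minimal Python sketch. It assumes the message wording is exactly as quoted above (other Oracle versions may phrase it differently), and the function name find_swap_warnings is made up for illustration:

```python
import re

# Assumes the wording quoted from the alert.log above; the two percentages
# may sit on the same line as the WARNING text or on the line after it.
SWAP_RE = re.compile(
    r"pct of memory swapped in \[(?P<swapped_in>[\d.]+)%\]\s+"
    r"pct of memory swapped out \[(?P<swapped_out>[\d.]+)%\]"
)


def find_swap_warnings(alert_log_text):
    """Return (swapped_in_pct, swapped_out_pct) pairs found in the text."""
    return [
        (float(m.group("swapped_in")), float(m.group("swapped_out")))
        for m in SWAP_RE.finditer(alert_log_text)
    ]


# The sample below is the single warning quoted in this post.
sample = (
    "WARNING: Heavy swapping observed on system in last 5 mins.\n"
    "pct of memory swapped in [6.25%] pct of memory swapped out [11.01%].\n"
)
print(find_swap_warnings(sample))
```

In practice the text would come from reading the real alert.log file rather than the inline sample.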

So, the tech surmised that the RMAN failure was due to swapping ("paging" in AIX land). Huh? That seems to be the opposite of the intent of paging, which is to keep programs running during memory pressure. Here's some more AIX info:

minperm%=5
maxperm%=90
maxclient%=90
lru_file_repage=0

nmon reports the FileSystemCache usually around 12%, PageSpace at about 13% used (1.7GB of 12.8GB).

When this failure occurred, a mksysb root VG backup kicked off. That is apparently the cause of the paging spike, since the spike happens every time mksysb runs. And it so happens that the controlfile snapshot is on the root VG (on purpose!). So my theory is that RMAN just happened to hit the controlfile snapshot at the exact same time that mksysb had a hold of the old one, although I can find no documentation to back up that behavior or to discount it.
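
One way to test that theory would be to compare the mksysb run window against the timestamp in the RMAN error. A rough Python sketch follows. Only the 08/03/2013 02:10:10 failure time comes from the error above; the mksysb start and end times are hypothetical placeholders, not values from the actual system, and the month/day/year reading of the date is also an assumption:

```python
from datetime import datetime

# Assumes the RMAN timestamp is month/day/year; adjust if NLS settings differ.
FMT = "%m/%d/%Y %H:%M:%S"


def within_window(event, start, end):
    """True if the event time falls inside the [start, end] window."""
    return start <= event <= end


# Failure time taken from the RMAN-03009 message.
rman_failure = datetime.strptime("08/03/2013 02:10:10", FMT)

# HYPOTHETICAL mksysb window, for illustration only; the real start and end
# times would have to come from cron or the mksysb job's own log.
mksysb_start = datetime.strptime("08/03/2013 02:00:00", FMT)
mksysb_end = datetime.strptime("08/03/2013 02:45:00", FMT)

print(within_window(rman_failure, mksysb_start, mksysb_end))
```

An overlap would only show that the two jobs ran at the same time. It would not show that mksysb actually held the snapshot file.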

Thoughts?

TIA!
Rich

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Sep 04 2013 - 18:22:40 CEST
