RE: Help with database corruption issue

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Sun, 5 Aug 2012 12:14:24 -0400
Message-ID: <019701cd7325$61d9a490$258cedb0$_at_rsiz.com>



It is possible that the rebuilt header was still in deferred write. A cp operation forces a complete actual write, even on ext4. So it *seems* possible to me that whatever corruption was there was always a recoverable error and that it was simply not on disk yet when Oracle tried to read it.

I don't know how you would prove what happened without access to a time machine to do something like an every instruction strace on all the processes involved. It is *possible* that enough is logged somewhere.
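
For what it's worth, the deferred-write idea can be illustrated with a minimal Python sketch. This is not anything that was run on the system in question, and the names flush_to_disk and sha256_of are made up for illustration. It asks the kernel to push a file's dirty pages to stable storage with fsync, then checksums the file so that an original and a copy can be compared. Note that a normal read may be served from the page cache, so a matching checksum shows what the kernel believes the file contains, not necessarily what was on disk at the moment of the failure.

```python
import hashlib
import os


def flush_to_disk(path):
    """Ask the kernel to write any dirty pages of `path` to stable storage."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)  # blocks until the kernel reports the data as written
    finally:
        os.close(fd)


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling flush_to_disk on the data file and on its copy and then comparing sha256_of for both would at least show whether the two files are byte-for-byte identical now. It cannot show what was on disk earlier.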

Good luck,

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Steve Montgomerie
Sent: Saturday, August 04, 2012 11:24 PM
To: pjhoraclel_at_gmail.com
Cc: oracle-l
Subject: Re: Help with database corruption issue

Thanks List!

Dennis and Peter,

We could start 19 of 20 databases. When we tried to start database X, it would lock up the mount point, would not open, and would hang all of the other 19 databases.

The actual error points to software corruption, something like running fsck against a mounted file system.
The SA swears he did not do that, and we believe him.

In regard to the error, it points to a system utility that detects a bad block and then tries to fix it, which ends up with the header information being zeroed out of some blocks.

The only thing that makes sense to me is that the cp command somehow rebuilt the header information of the bad blocks. Is that possible?

On Fri, Aug 3, 2012 at 6:49 AM, Peter Hitchman <pjhoraclel_at_gmail.com> wrote:
> Hi,
> Well for some reason the ext4 file system had errors, leading to lost
> data. That impacts the undo tablespace data file and Oracle could not
> recover. All I can think is that at some point in time the ext4 file
> system was not 100% OK and then when you made the data file copy it
> had been fixed. What sort of disk layout do you have, maybe the error
> was corrected by way of a disk mirror or some other RAID set-up
> protection?
>
> Regards
> Pete
> --
> http://www.freelists.org/webpage/oracle-l
>
>

--
http://www.freelists.org/webpage/oracle-l


Received on Sun Aug 05 2012 - 11:14:24 CDT
