Re: Corrupt redo logs and datafiles

From: SG <ab_at_cd.com>
Date: Fri, 18 Mar 2005 16:46:06 GMT
Message-ID: <iBD_d.1862$dj.319836@news1.epix.net>

I'm using ext3 on IDE hardware RAID, 3Ware brand. I've never had a problem with these cards before. There is no RAM module on the RAID card, and I've never tinkered with the IDE parameters. Even if we are running an unpatched kernel, could that be causing corrupt redo log files? It doesn't seem likely, but I guess it's possible.

Another thing of note: I get the following messages in /var/log/messages all the time:

"kernel: application bug: sqlplus(26572) has SIGCHLD set to SIG_IGN but calls wait()."
"kernel: (see the NOTES section of 'man 2 wait'). Workaround activated."

Can anyone explain this message, and could it plausibly be a source of the corrupt redo logs?
I guess swapping out the memory modules would be an easy way to check whether the memory is the problem.
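
For reference, here is my understanding of the pattern the kernel message describes, going by the NOTES section of 'man 2 wait': a process sets SIGCHLD to SIG_IGN and then still calls wait() on a child. The sketch below is only an illustration of that pattern, not what sqlplus actually does internally. The function name is made up, and the outcome depends on the kernel: POSIX.1-2001 says wait() should fail with ECHILD, while the man page says older kernels return the child as if SIGCHLD were not ignored.

```python
import errno
import os
import signal


def wait_with_sigchld_ignored():
    """Hypothetical helper: ignore SIGCHLD, fork a child, then call wait()."""
    # Ignore SIGCHLD, remembering the previous disposition so we can restore it.
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately without running any cleanup handlers.
            os._exit(0)
        try:
            # Parent: this is the call the kernel message complains about.
            reaped_pid, _status = os.wait()
            # Older behaviour: the child is returned as usual.
            return ("reaped", reaped_pid == pid)
        except ChildProcessError as exc:
            # POSIX.1-2001 behaviour: no child left to wait for.
            return ("ECHILD", exc.errno == errno.ECHILD)
    finally:
        signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print(wait_with_sigchld_ignored())
```

Either way, this looks to me like it is about how a process collects its child processes rather than about disk I/O, but I may be missing something.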

Thanks.

"Mark Bole" <makbo_at_pacbell.net> wrote in message news:PVJZd.10991$C47.4249_at_newssvr14.news.prodigy.com...
> Frank van Bortel wrote:
>
>> SG wrote:
>>
>>> Hi all.
>>>
>>> I am new to Oracle so would appreciate any insight as to what are main
>>> reasons based on your experience that cause corrupt redo log files and
>>> data files? We had a corrupt sysaux.dbf file and constant corrupt redo
>>> logs that would stop our application since it was in archive mode and
>>> we'd get archiver errors. We tried creating new log groups and that
>>> didn't help. We had to constantly clear unarchived log groups, etc. to
>>> get it working. We rebuilt the database and used a dmp from the
>>> "suspect" dbase to import our custom tables in our newly created db and
>>> tablespace. All had been fine for a month, but now it's happening again.
>>> I looked in the alert logs and see that now we have a corrupt system.dbf
>>> file. Has anyone had this type of experience? We are running Oracle 10g
>>> on Redhat ES 3.0. The kernel version on the system is 2.4.21.4. An
>>> identical system with no problems, same hardware, is running kernel
>>> version 2.4.21-15.0.3. Seems to point to a hardware issue maybe? Any
>>> ideas would be greatly appreciated. TIA.
>>>
>>> SG
>>>
>>>
>>>
>> And the filesystem(s) you use?
>> Ext2, ext3, Reiserfs, hardware RAID, software RAID?
>> Hardware: SCSI, IDE (tinkered with params?)
>>
>> Why are you running unpatched kernels anyway?
>> See: http://rhn.redhat.com/errata/RHSA-2005-043.html
>> Linux csdb01.cs.nl 2.4.21-27.0.2.EL
>
> Check values of DB_BLOCK_CHECKING and DB_BLOCK_CHECKSUM and adjust for
> testing if desired.
>
> In my experience, what you describe is similar to problems I've seen with
> faulty disk controllers or even bad memory modules. What makes it so
> difficult to trouble-shoot is the intermittent and unpredictable nature --
> runs fine for hours, days, even weeks and then errors start cropping up.
>
> -Mark Bole
>
>
>
Received on Fri Mar 18 2005 - 10:46:06 CST