Re: Investigating ORA-01552 / ORA-00376 errors

From: Renee Dekenah <renee.dekenah_at_googlemail.com>
Date: Mon, 10 Jul 2006 18:32:38 +0100
Message-ID: <44B28F36.5020903@gmail.com>

Hi,

Followup: as Paul suggested, the root cause was in fact a misconfigured Symantec backup job - incorrect setting to do with locking of open files. After the error described below, we had ongoing issues with various Oracle files becoming locked, in some cases causing the instance to terminate (e.g. control file locks). Finding and then rectifying the faulty backup job configuration has now resolved the problem.

Regards,
Renee Dekenah

Paul Drake wrote:
> Renee,
>
> The root cause of this error is likely misconfigured backups (at the
> OS level).
> a rogue process locked the file.
> too bad it wasn't the system tablespace's datafile.
> The larger the datafile, the higher the probability that oracle will
> attempt to write to the file while it is locked by a useless backup
> process job, creating a backup set that has never been tested.
>
> Paul
>
>
> On 6/21/06, *Renee Dekenah* <renee.dekenah_at_googlemail.com
> <mailto:renee.dekenah_at_googlemail.com>> wrote:
>
> Hi,
>
> I am investigating the cause of a problem we had with an undo
> tablespace
> and would appreciate hearing your thoughts on this. While I have fixed
> the problem by creating a new replacement undo tablespace, and
> everything now seems fine, I am interested in why the problem
> occurred,
> and whether there is a better solution than the one I employed.
>
> Problem Description:
>
> We are running IBM Rational ClearQuest on Oracle 10g R2. On Monday
> ClearQuest users started getting the error ORA-01552: cannot use
> system
> rollback segment for non-system tablespace 'CQ_USER' when trying
> to add
> a new entry within ClearQuest.
>
> Full error message from ClearQuest
> ERROR! SQLExecute: RETCODE=-1, State=HY000, Native Error=1552
> SQL statement="update dbglobal set next_request_id=906 where site_id=2
> and next_request_id=905"
> [DataDirect][ODBC Oracle Wire Protocol driver][Oracle]ORA-01552:
> cannot
> use system rollback segment for non-system tablespace 'CQ_USER'.
>
> Further Details:
>
> Upon investigation using Enterprise Manager, it appeared that the
> CQ_USER tablespace had no space left - a critical error for this was
> logged on the preceding Friday - and it was not set to autoextend. We
> rectified this by changing it to autoextend in Enterprise Manager and
> applying the changes, thus clearing the tablespace critical errors.
> (please don't kill me for the tablespace being full, I'm just helping
> out an ex-employer, monitoring the database was not my responsibility)
>
> However, this did not resolve the initial Oracle error shown in
> ClearQuest, so we investigated further.
>
> On Thursday/Friday, the alert error log in BDUMP logged a few ORA-1653
> errors relating to tables that could not be extended in the (then)
> almost full CQ_USER tablespace.
>
> On the Saturday, there was the following:
> KCF: write/open error block 0x29 online=1
> file=2 D:\ORACLE\ORADATA\...\UNDOTBS01.DBF
> error=27072 text: 'OSD-04008: WriteFile() failure, unable to write
> to file
> O/S-Error: (OS 33) The process cannot access the file because another
> process has locked a portion of the file.'
> Automatic datafile offline due to write error on
> file 2: D:\ORACLE\ORADATA\...\UNDOTBS01.DBF
>
> Since then - even after extending the CQ_USER tablespace on Monday
> - the
> system logged many of these:
> Errors in file c:\oracle\admin\...\..._pmon_2604.trc:
> ORA-00376: file 2 cannot be read at this time
> ORA-01110: data file 2: 'D:\ORACLE\ORADATA\...\UNDOTBS01.DBF'
>
> Enterprise Manager claimed the undo tablespace (UNDOTBS1) was
> ReadWrite
> status, and 0MB in size. When I used the SQLPlus command line to
> confirm
> the status of this tablespace it was listed as AVAILABLE. Dbverify on
> the file UNDOTBS01.DBF showed nothing failing or corrupt, but
> Enterprise
> Manager did list this file as "Offline (Needs Recovery)". Based on
> this,
> and some information I found online, I decided to create a new undo
> tablespace and switched the database to use that instead. The
> ClearQuest
> error no longer occurs.
>
> Questions:
>
> 1. Was the undo datafile being locked/going offline a result of the
> tablespace maxing out, or could it have been an unrelated error? (note
> that we have Symantec backups running daily on the database server
> and
> database, I seem to remember reading somewhere that backup
> software can
> sometimes cause datafile problems?)
>
> 2. Was recreating the undo tablespace the best solution, or should I
> have done something else?
>
> Many thanks for any assistance you can offer,
>
> Renee
>
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>
>
>
> --
> #/etc/init.d/init.cssd stop
> If you find yourself using the word "telnet" somewhere, be it in an
> email or a conversation, please stop and say "I'm sorry, I meant to
> use ssh instead."

--
http://www.freelists.org/webpage/oracle-l

Received on Mon Jul 10 2006 - 12:32:38 CDT

Re: Investigating ORA-01552 / ORA-00376 errors - followup