Re: ORA-00470: LGWR process terminated with error

From: vsevolod afanassiev <vsevolod.afanassiev_at_gmail.com>
Date: Wed, 10 Mar 2010 16:43:16 -0800 (PST)
Message-ID: <ea9dcc90-6484-4b3d-a6af-31b466eaa693_at_q2g2000pre.googlegroups.com>



Please confirm that both instances are running from the same ORACLE_HOME.
If they are not, then the issue is version-related or installation-related.
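One quick way to check is to look at oratab (/var/opt/oracle/oratab on Solaris), where each line has the form SID:ORACLE_HOME:Y|N. A rough Python sketch of that check is below. The function names and the sample SIDs and paths are made up for illustration. Note that oratab only shows what was registered, so it is still worth confirming the ORACLE_HOME in the environment each instance was actually started from.

```python
def parse_oratab(text):
    """Return a {SID: ORACLE_HOME} dict from oratab-style text."""
    homes = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 2:
            continue  # not a SID:ORACLE_HOME:... entry
        homes[fields[0]] = fields[1]
    return homes


def same_oracle_home(text, sid_a, sid_b):
    """True if both SIDs are registered with the same ORACLE_HOME."""
    homes = parse_oratab(text)
    return homes[sid_a] == homes[sid_b]


if __name__ == "__main__":
    # Hypothetical oratab content, for illustration only.
    sample = (
        "# sample oratab\n"
        "PRODA:/u01/app/oracle/product/10.2.0/db_1:Y\n"
        "PRODB:/u01/app/oracle/product/10.2.0/db_1:Y\n"
    )
    print(same_oracle_home(sample, "PRODA", "PRODB"))
```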

Assuming that both instances are run from the same ORACLE_HOME:

I see two possibilities:
1. Instance is killed by something external, similar to someone doing 'kill -9 <pid of LGWR>'
2. Instance dies

1. Instance is killed by something external

There are two instances on the server, but only one experiences this problem, correct? Are the instances different in any way? For example, does instance A have a 3 GB SGA while instance B has a 10 GB SGA? If they are different, try to make them identical: make sure that all init.ora parameters are the same (with obvious exceptions, such as control_files and background_dump_dest). If possible, achieve this by REDUCING values, not increasing them. Once this is done we could expect one of two outcomes:
- The crashes will stop. It is possible that "something" was killing the instance because it was too big. Once it is made smaller it will no longer get killed.
- The crashes will affect both instances. This indicates that the killing wasn't based on size.
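
To find the parameters that differ, something like the following Python sketch could compare the two init.ora files. It assumes simple one-line name=value entries with # comments, so it will not cope with multi-line values or a # inside a quoted value. The function names and sample values are hypothetical. The ignore list holds only the two exceptions mentioned above, and in practice you would extend it with other parameters that legitimately differ (db_name, instance_name and so on).

```python
# Parameters that are expected to differ between instances.
EXPECTED_TO_DIFFER = {"control_files", "background_dump_dest"}


def parse_init_ora(text):
    """Parse simple name=value lines into a dict (names lower-cased)."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip().lower()] = value.strip()
    return params


def diff_params(text_a, text_b, ignore=EXPECTED_TO_DIFFER):
    """Return {name: (value_a, value_b)} for parameters that differ.

    A parameter present in only one file is reported with None
    on the side where it is missing.
    """
    a, b = parse_init_ora(text_a), parse_init_ora(text_b)
    diffs = {}
    for name in sorted(set(a) | set(b)):
        if name in ignore:
            continue
        if a.get(name) != b.get(name):
            diffs[name] = (a.get(name), b.get(name))
    return diffs


if __name__ == "__main__":
    # Hypothetical parameter files, for illustration only.
    init_a = "sga_max_size=3G\nprocesses=300\ncontrol_files=/u02/a/ctl01.ctl\n"
    init_b = "sga_max_size=10G\nprocesses=300\ncontrol_files=/u03/b/ctl01.ctl\n"
    for name, (va, vb) in diff_params(init_a, init_b).items():
        print(name, va, vb)
```

Whatever shows up in the output is a candidate for being brought into line, preferably by lowering the larger value.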

2. Instance dies

Oracle uses the following facilities provided by the OS:
- CPU
- memory
- disk
- IPC facilities (shared memory and semaphores on Solaris)

CPU is unlikely to disappear, and IPC facilities are allocated at startup, so most likely the issue is either
memory-related or disk-related. As this is LGWR, a disk issue seems more likely.
Do you have dedicated filesystems for each instance, or are the filesystems shared?
Which filesystems are you using: UFS, ZFS, Veritas? Do you use something fancy like
ODM (Oracle Disk Manager)?
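
If you are not sure whether the instances share a filesystem, a small Python sketch like the one below can compare the device IDs that the OS reports for their files (for example the redo log files of each instance). The function names are made up, and you would pass in your own file paths. It only compares mounted filesystems: two separate filesystems can still sit on the same physical disks, so a "not shared" result does not rule out contention lower down.

```python
import os
from collections import defaultdict


def group_by_device(paths):
    """Group paths by st_dev, the ID of the filesystem holding each file."""
    groups = defaultdict(list)
    for p in paths:
        groups[os.stat(p).st_dev].append(p)
    return dict(groups)


def share_filesystem(paths_a, paths_b):
    """True if any file of instance A is on the same mounted
    filesystem as any file of instance B."""
    dev_a = {os.stat(p).st_dev for p in paths_a}
    dev_b = {os.stat(p).st_dev for p in paths_b}
    return bool(dev_a & dev_b)


if __name__ == "__main__":
    import sys
    # Usage: python fs_check.py <file of instance A> <file of instance B>
    if len(sys.argv) == 3:
        print(share_filesystem([sys.argv[1]], [sys.argv[2]]))
```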

Finally: do you use any fancy Solaris 10 stuff like zones/containers?

- - - - - - - - -

I had a similar issue on Tru64 several years ago, very frustrating. The database had a nightly cold backup. Once or twice a month the instance would start in a corrupted state: it was possible to connect to it but not to run any SQL. Oracle Support pointed to a bug where the instance gets corrupted on startup if something tries to connect to it in the brief moment between the 'startup' command and the 'Oracle instance started' message (just a second or two). It had to be a SYSDBA connection, and it was happening at 6am. What could possibly do that? Eventually it was traced to a UNIX script provided by DEC. It took several months to locate.

Received on Wed Mar 10 2010 - 18:43:16 CST
