Oracle FAQ Your Portal to the Oracle Knowledge Grid

Home -> Community -> Usenet -> c.d.o.server -> Re: Causes of Oracle Database Failures

Re: Causes of Oracle Database Failures

From: MarkP28665 <markp28665_at_aol.com>
Date: 1998/03/25
Message-ID: <1998032523101301.SAA28403@ladder01.news.aol.com>#1/1

From: geburton_at_aol.com (Geburton) >>
I'm looking for statistics on the causes of Oracle database failures, as well as information regarding typical times to restore the database. <<

I do not have any statistics, but almost all of our Oracle system crashes fall into two categories: OS-related problems and Oracle bugs.

The OS-related crashes are generally Oracle hitting a Unix resource limit: running out of semaphores, hitting the user file limit, running out of memory for the OS-level lock manager used with OPS, and so on. Changing these limits usually requires a Unix kernel rebuild. So even though you can generally restart the instance in a matter of minutes (5 to 20, depending on how much uncommitted work has to be rolled back), you are likely to keep encountering the problem until you are given the downtime to rebuild the kernel and reboot the OS.
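
As a rough illustration of watching for this kind of limit before it bites, here is a minimal Python sketch. It is not something we ran at the time. The function names and the 80% warning threshold are my own assumptions, and it only reads the open-file limit of the process running the script, not the limits of the Oracle processes themselves. Semaphore limits are not covered.

```python
import resource


def limit_status(used, limit, warn_fraction=0.8):
    """Classify a resource count against its limit as 'ok', 'warn', or 'exhausted'.

    warn_fraction is an arbitrary illustrative threshold, not a recommended value.
    """
    if limit == resource.RLIM_INFINITY:
        return "ok"  # no limit configured, so nothing to run out of
    if used >= limit:
        return "exhausted"
    if used >= warn_fraction * limit:
        return "warn"
    return "ok"


def open_file_limit():
    """Return the (soft, hard) open-file limit for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limit()
    print("open-file limit: soft=%s hard=%s" % (soft, hard))
    # 900 is a made-up count of open files, used only for illustration.
    print(limit_status(900, 1024))
```

The point of checking against a fraction of the limit is that you get a warning while there is still time to schedule the downtime, rather than finding out when the instance goes down.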

The Oracle bugs normally result in the instance locking up rather than crashing, but we did encounter a bug where pmon would just vanish, bringing the rest of the system down. The instance would start right back up. We also encountered a bug related to archiving where Oracle would fail to resume archiving if the archive destination directory filled up. This would lock up the database until you shut it down and restarted it. Again, the restart was quick.
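
The archive destination filling up is the sort of thing a simple scheduled check can catch early. The following Python sketch is only an illustration of that idea. The function names, the 90% threshold, and the path passed in are all hypothetical, and it checks the filesystem that holds the directory rather than anything Oracle reports.

```python
import shutil


def fraction_used(path):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # avoid dividing by zero on unusual filesystems
    return usage.used / usage.total


def needs_attention(fraction, threshold=0.9):
    """True when the fill level has reached the (assumed) alert threshold."""
    return fraction >= threshold


if __name__ == "__main__":
    # "." stands in for the archive destination directory, which will differ per site.
    level = fraction_used(".")
    print("archive destination %.0f%% full" % (level * 100))
    if needs_attention(level):
        print("free up space before archiving stalls")
```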

In four years we have had to do recovery of the database for only two reasons. The first was a problem with 7.1.6 and OPS: if both instances crashed while hot backups were in progress, Oracle would require recovery. We had to apply the logs since the start of the backup, but did not need to actually restore any files. This would take from an hour to 24 hours. We did this at least a dozen times over one year due to hardware problems.

The second was the one time we really had to lay back down several of our datafiles. I do not remember the exact reason, other than that a hardware failure (not disk) caused some massive corruption of the data. We had the database back in 12 hours, but I believe it took us another 12 hours to re-create the index tablespaces, which we did not back up (to save time on the backup job).

I hope this is of some help.

Mark Powell -- Oracle 7 Certified DBA
- The only advice that counts is the advice that you follow so follow your own advice -

Received on Wed Mar 25 1998 - 00:00:00 CST
