Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: How to get notified for a frozen oracle database - NEED HELP

RE: How to get notified for a frozen oracle database - NEED HELP

From: Taft, David <TaftD_at_saic-dc.com>
Date: Fri, 13 Apr 2007 12:56:11 -0400
Message-ID: <DCE76463749C64499892A0DB3AF05AC60EF45F86@challenger.vta.saic-dc.com>


Raj,  

This is pretty much what we do in our shop, but I found that select against v$database would sometimes succeed even though users were calling the help desk saying everything was hung. I found doing a truncate against an empty dummy table to be more effective. Here is a snippet of code that gives a pretty good idea of how I handle it.  

Cheers,  

David Taft  

P.S. Our monitoring runs from cron on each server, which doesn't satisfy Keerthi's original request (paging via EM/Grid Control), but you'll notice at the bottom of the snippet, that I do make a call to Big Brother monitoring. I am guessing the same could be done with EM.

<SNIP>  

# Execute in background so that script doesn't hang if the database is hung.
# SqlPlus function writes output to a temporary file.
SqlPlus "SET TIMING ON

         TRUNCATE TABLE dbmon.status;" &  

# Give truncate a chance to complete before entering the loop (especially
for databases with heavy resource contention). sleep 5  

# Minute counter.

count=0  

# Comparison loop counter.

count2=`expr $timeout + 1`
#
# NOTE: count2 must be one greater, otherwise the '$count -eq $timeout'
# statement below will never execute.
 

# Initialize background job check to zero.
stat=0  

# Check results of above SQL*Plus session for database status.
while [ $count -lt $count2 ];
do

   # Check if background job is still running.    stat=`kill -0 $! 2>/dev/null;echo $?`  

   # If exit status of kill command is greater than zero, then job is finished.

   if [ "$stat" != "0" ]; then  

      case `cat $tmpfile` in
         *"Table truncated"* )
            # If truncate executed, then everything is OK.
            break
            ;;
         *"table or view does not exist"*)
            # Table never created or dropped?
            message="Unable to access the dbstatus table DBMON.STATUS."
            break
            ;;
         *"ORACLE not available"*)
            # Verify database didn't go down while the script was sleeping.
            message="ORACLE not available."
            break
            ;;
         *"ORACLE initialization or shutdown in progress"*)
            # Verify database is not in a CLOSED state.
            message="ORACLE initialization or shutdown in progress.
                     Database not open."
            break
            ;;
         *)
            # Query is not still running and $count has not exceeded
$timeout,
            # so there must be a problem connecting to the database.
            message="Problem connecting to ORACLE or the DBMON schema."
            break
            ;;
      esac

   elif [ $count -eq $timeout ]; then
      # Database is hung if query still running after specified timeout, so
      # kill the truncate statement running in the background.
 
      kill -9 $! >/dev/null 2>&1
      message="Database is hung."
      break
 
   else
      # Query is still running and the timeout has not been exceeded, so
      # increment counter and wait one minute before looping again.
 
      count=`expr $count + 1`
      sleep 60

   fi
done  

# If $message is not null, then there is a problem.
if [ "$message" != "" ]; then  

    PageDBA      "$message"
    MakeLogEntry "$message"
    WriteCSV     "$message"

    NotifyBigBrother "red"  

  # Exit with status of 1.
  Cleanup 1  

<SNIP>

From: Pal, Raj [mailto:Raj.Pal_at_echostar.com]

Have your script log in to extract a known statement (ie. Select name from v$database).

Run this in the background (ie. &) and send it to a dummy/temporary file (eg. /tmp/monitor.tmp).

Give your script a configurable number of seconds to find the file.

If it doesn't find it after 'x' seconds, then send an alert.

Just make sure you delete the potentially existing dummy file before repopulating it.  


From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Keerthi Kumar  

Is there a way to get notified for a frozen oracle database to a pager via EM 10g GRID. We have Database Up/Down event but could not find anything in the list for this type of problem.

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Apr 13 2007 - 11:56:11 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US