Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: How to get notified for a frozen oracle database - NEED HELP

Re: How to get notified for a frozen oracle database - NEED HELP

From: Keerthi Kumar <myneel_at_gmail.com>
Date: Sat, 14 Apr 2007 11:47:53 -0500
Message-ID: <3d9739bb0704140947t52b74fe8wf210214072bfddfe@mail.gmail.com>


David,
Thanks for your script. As cron is being replaced by other products, we cannot cron this job. But will use this as a job inside the EM and get notified.

Raj, Srini, Eric, and Jared,

Thanks for your input.

KK.

On 4/13/07, Taft, David <TaftD_at_saic-dc.com> wrote:
>
> Raj,
>
> This is pretty much what we do in our shop, but I found that select
> against v$database would sometimes succeed even though users were calling
> the help desk saying everything was hung. I found doing a truncate against
> an empty dummy table to be more effective. Here is a snippet of code that
> gives a pretty good idea of how I handle it.
>
> Cheers,
>
> David Taft
>
> P.S. Our monitoring runs from cron on each server, which doesn't
> satisfy Keerthi's original request (paging via EM/Grid Control), but you'll
> notice at the bottom of the snippet, that I do make a call to Big Brother
> monitoring. I am guessing the same could be done with EM.
>
> <SNIP>
>
> # Execute in background so that script doesn't hang if the database is
> hung.
> # SqlPlus function writes output to a temporary file.
> SqlPlus "SET TIMING ON
> TRUNCATE TABLE dbmon.status;" &
>
> # Give truncate a chance to complete before entering the loop (especially
> for databases with heavy resource contention).
> sleep 5
>
> # Minute counter.
> count=0
>
> # Comparison loop counter.
> count2=`expr $timeout + 1`
> #
> # NOTE: count2 must be one greater, otherwise the '$count -eq $timeout'
> # statement below will never execute.
>
> # Initialize background job check to zero.
> stat=0
>
> # Check results of above SQL*Plus session for database status.
> while [ $count -lt $count2 ];
> do
> # Check if background job is still running.
> stat=`kill -0 $! 2>/dev/null;echo $?`
>
> # If exit status of kill command is greater than zero, then job is
> finished.
> if [ "$stat" != "0" ]; then
>
> case `cat $tmpfile` in
> *"Table truncated"* )
> # If truncate executed, then everything is OK.
> break
> ;;
> *"table or view does not exist"*)
> # Table never created or dropped?
> message="Unable to access the dbstatus table DBMON.STATUS."
> break
> ;;
> *"ORACLE not available"*)
> # Verify database didn't go down while the script was
> sleeping.
> message="ORACLE not available."
> break
> ;;
> *"ORACLE initialization or shutdown in progress"*)
> # Verify database is not in a CLOSED state.
> message="ORACLE initialization or shutdown in progress.
> Database not open."
> break
> ;;
> *)
> # Query is not still running and $count has not exceeded
> $timeout,
> # so there must be a problem connecting to the database.
> message="Problem connecting to ORACLE or the DBMON schema."
> break
> ;;
> esac
>
> elif [ $count -eq $timeout ]; then
> # Database is hung if query still running after specified timeout,
> so
> # kill the truncate statement running in the background.
>
> kill -9 $! >/dev/null 2>&1
> message="Database is hung."
> break
>
> else
> # Query is still running and the timeout has not been exceeded, so
> # increment counter and wait one minute before looping again.
>
> count=`expr $count + 1`
> sleep 60
> fi
> done
>
> # If $message is not null, then there is a problem.
> if [ "$message" != "" ]; then
>
> PageDBA "$message"
> MakeLogEntry "$message"
> WriteCSV "$message"
> NotifyBigBrother "red"
>
> # Exit with status of 1.
> Cleanup 1
>
> <SNIP>
>
> *From:* Pal, Raj [mailto:Raj.Pal_at_echostar.com]
>
> Have your script log in to extract a known statement (ie. Select name from
> v$database).
>
> Run this in the background (ie. &) and send it to a dummy/temporary file
> (eg. /tmp/monitor.tmp).
>
> Give your script a configurable number of seconds to find the file.
>
> If it doesn't find it after 'x' seconds, then send an alert.
>
> Just make sure you delete the potentially existing dummy file before
> repopulating it.
>
>
>
> ------------------------------
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Keerthi Kumar
>
>
>
> Is there a way to get notified for a frozen oracle database to a pager via
> EM 10g GRID. We have Database Up/Down event but could not find anything in
> the list for this type of problem.
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Sat Apr 14 2007 - 11:47:53 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US