Re: Using Diagwait on Oracle Clusterware

From: LS Cheng <exriscer_at_gmail.com>
Date: Tue, 24 Nov 2009 09:57:49 +0100
Message-ID: <6e9345580911240057s569ca8d6r6406283078157707_at_mail.gmail.com>



I don't know how to do it. Someone from Sun did it once for a customer but didn't want to tell me how :-S

On Tue, Nov 24, 2009 at 9:52 AM, Martin Berger <martin.a.berger_at_gmail.com> wrote:

> Can you give me some hints on how to do this?
> (Even if my Solaris admins might not know, it's worth my knowing about it :))
>
> thank you
> Martin
>
> On Tue, Nov 24, 2009 at 08:08, LS Cheng <exriscer_at_gmail.com> wrote:
> > One of the reasons I use diagwait is that it makes oprocd less sensitive :-)
> >
> > The other reasons are those the note states, but when there are evictions
> > on Solaris, for example, it is still quite hard to find the root cause
> > (because CRSD sends some eviction messages to the system console, and
> > those are usually not written to files unless configured so, but many
> > Solaris admins do not know how to do it!)
> >
> >
> >
> > Thanks
> >
> > --
> > LSC
> >
> >
> > On Mon, Nov 23, 2009 at 6:08 PM, Vishal Gupta <vishal_at_vishalgupta.com>
> > wrote:
> >>
> >> Hello List,
> >>
> >> What is the general consensus among RAC users regarding the use of
> >> diagwait on Oracle Clusterware?
> >>
> >> Metalink Note - 559365.1
> >>
> >>
> >> Symptoms
> >>
> >> Oracle Clusterware evicts the node from the cluster when:
> >>
> >> - Node is not pinging via the network heartbeat
> >> - Node is not pinging the Voting disk
> >> - Node is hung/busy and is unable to perform either of the earlier tasks
> >>
> >> In most cases when the node is evicted, there is information written to
> >> the logs to analyze the cause of the node eviction. However, in certain
> >> cases this may be missing. The steps documented in this note are to be
> >> used for those cases where there is not enough information, or no
> >> information, to diagnose the cause of the eviction.
> >>
> >> Changes
> >>
> >> None
> >>
> >> Cause
> >>
> >> When the node is evicted and the node is extremely busy in terms of CPU
> >> (or lack of it), it is possible that the OS did not get time to flush the
> >> logs/traces to the file system. It may be useful to set the diagwait
> >> attribute to delay the node reboot, giving the OS additional time to
> >> write the traces. This setting will provide more time for diagnostic data
> >> to be collected safely and will NOT increase the probability of
> >> corruption. After setting diagwait, the Clusterware will wait an
> >> additional 10 seconds (Diagwait - reboottime). Customers can unset
> >> diagwait by following the steps documented below after fixing their OS
> >> scheduling issues.
> >>
> >>
> >>
> >>
> >>
> >> Regards,
> >> Vishal Gupta
> >> http://www.vishalgupta.com
> >
>
>
>
> --
> Martin Berger martin.a.berger_at_gmail.com
> Lederergasse 27/2/14 +43 660 660 83306
> 1080 Wien http://berx.at/
>
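
The quoted note gives the extra delay as "Diagwait - reboottime" and states the result is 10 seconds, but it does not give the two inputs. As a rough sketch of that arithmetic only: the values 13 and 3 below are assumptions chosen so the difference matches the 10 seconds in the note; neither number appears in this thread, and the function name is made up for illustration.

```python
# Sketch of the "Diagwait - reboottime" arithmetic from the quoted note.
# ASSUMPTION: 13 and 3 are illustrative values, not taken from this thread.
ASSUMED_DIAGWAIT_SECONDS = 13
ASSUMED_REBOOTTIME_SECONDS = 3


def extra_wait_seconds(diagwait: int, reboottime: int) -> int:
    """Return the additional delay before reboot: diagwait minus reboottime."""
    if diagwait < reboottime:
        # A negative delay would make no sense for this illustration.
        raise ValueError("diagwait must not be smaller than reboottime")
    return diagwait - reboottime


if __name__ == "__main__":
    delay = extra_wait_seconds(ASSUMED_DIAGWAIT_SECONDS, ASSUMED_REBOOTTIME_SECONDS)
    print(f"additional wait: {delay} seconds")
```

Check the actual settings on your own cluster before relying on these numbers.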

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 24 2009 - 02:57:49 CST