Re: ocssd.bin does not start after 10.2.0.1 clusterware install

From: Kumar Madduri <ksmadduri_at_gmail.com>
Date: Tue, 8 Sep 2009 17:15:06 -0700
Message-ID: <a2b1e7610909081715j6244c49ewb7449b2f6e69dc53_at_mail.gmail.com>



Hello Randy
We are not on Linux; we are trying this on a Solaris box. If we apply the 10.2.0.4 patchset, ocssd.bin comes up fine on the secondary node but loops on the primary node. Oracle is trying to point the finger at the Sun Cluster installation or at the QFS file system used for the OCR and voting disks. But if Sun Cluster is bad, or if QFS is the issue, how does root.sh run fine on node 2, and how do ocrcheck and ocrdump work fine from the primary node? Oracle is just going in circles without any solution so far.
Another question in this regard: does Oracle Clusterware expect more packages (SUNW*) on the primary node than on the secondary node? Does root.sh run some additional steps on the primary node that it does not run on the secondary? It does not seem logical, but when I compared with a working cluster I noticed that the primary node had more SUNW* packages than the secondary node.
Another thing Oracle brings up is that Sun Cluster may be starting a bit later than Oracle Clusterware, so Oracle Clusterware looks for the Sun Cluster process and loops. That does not make sense either, for a couple of reasons: (1) if that is the case, why does ocssd.bin start on the second node, and why does root.sh complete successfully on node 2? (2) Even if what Oracle says is true, this should happen only the first time Oracle Clusterware tries to start; subsequent tries should succeed (actually the second try, because Sun Cluster should be up by then even if there was a time difference) instead of looping round and round.
We can't take a pstack or ptree of any process, because the only thing I see is the css fatal process, and when we ran ptree on it, it just showed the sleep process.
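
For the package comparison mentioned above, a rough sketch like the following could make the difference between the two nodes explicit. It assumes the output of pkginfo has been saved from each node and that each line has the layout "category package description"; the function names and the sample package names are made up for illustration and do not come from the real nodes.

```python
def parse_pkginfo(text):
    """Return the set of package names found in pkginfo-style output.

    Assumes each line looks like: <category> <package> <description...>
    """
    packages = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            packages.add(fields[1])
    return packages


def diff_packages(primary_text, secondary_text, prefix="SUNW"):
    """Return (only_on_primary, only_on_secondary) for packages with the prefix."""
    primary = {p for p in parse_pkginfo(primary_text) if p.startswith(prefix)}
    secondary = {p for p in parse_pkginfo(secondary_text) if p.startswith(prefix)}
    return sorted(primary - secondary), sorted(secondary - primary)


if __name__ == "__main__":
    # Hypothetical listings; replace with the saved pkginfo output of each node.
    node1 = "system SUNWexample1 Example one\nsystem SUNWexample2 Example two\n"
    node2 = "system SUNWexample1 Example one\n"
    only_node1, only_node2 = diff_packages(node1, node2)
    print("Only on primary:", only_node1)
    print("Only on secondary:", only_node2)
```

A difference in this list would not prove that root.sh depends on those packages; it would only narrow down what to ask Oracle and Sun about.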

Thank you
Kumar

On Tue, Sep 8, 2009 at 7:45 AM, Randy Johnson <oraclelist_at_sbcglobal.net> wrote:

> Can you post the output from the root.sh script? I believe there is a bug
> that has to be addressed before this script will succeed for 10.2.0.1.
> I believe doc: 466673.1 covers it.
>
> ------------------------------
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Kumar Madduri
> *Sent:* Thursday, September 03, 2009 3:38 PM
> *To:* Oracle-L_at_freelists.org
> *Subject:* ocssd.bin does not start after 10.2.0.1 clusterware install
>
> Hi
> We have a Sun Cluster on top of which we are trying to install the 10.2.0.1
> clusterware. The installation is fine, but when you run root.sh it fails to
> bring up ocssd.bin. I think it is looping in /etc/init.d/init.cssd, trying to
> validate the conditions under the SunOS case statement, and it keeps looping
> (start check process start, and it does not go to the next stage).
> Prior to doing this install, we cleaned up the localconfig as per note
> 239998.1.
>
> cluvfy does not report any issues before the start of the installation.
>
> We have a TAR open with Oracle, but unfortunately there is not much
> progress with them.
>
> Any ideas?
>
> Thank you
> Kumar
>
>
>
> __________ Information from ESET NOD32 Antivirus, version of virus
> signature database 4406 (20090908) __________
>
> The message was checked by ESET NOD32 Antivirus.
>
> http://www.eset.com
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Sep 08 2009 - 19:15:06 CDT
