Re: Long connect time when one node in RAC goes down
Date: Thu, 04 Sep 2008 10:34:19 +0300
With the help of Oracle support we narrowed the problem down to name resolution. We shut down node 2 and started a session with client trace enabled. The trace shows sqlnet deciding to use server2-vip and then trying to convert that name to a TCP/IP address. That conversion of server2-vip to an address is where sqlnet gets stuck.
It seems that something in the network is not updated
when the VIP is moved to the other node, and it takes about 6 (or 6*2) seconds
until sqlnet gets an error from the network. It then tries to connect with the second entry, server-vip1, and this works.
Have you heard anything about this problem?
We are going to do a test using the ip itself instead of names in the
and also to use a sniffer to find out what happens during these 6 seconds.
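
For anyone who wants to measure the same thing without a full client trace, here is a rough sketch in Python that times the name lookup and the TCP connect separately. It is only an illustration under assumptions: the function names are made up for this example, port 1521 is assumed because it is the usual listener default, and the host names in the usage note should be replaced with the VIP names from your own setup.

```python
import socket
import time


def time_resolve_and_connect(host, port=1521, timeout=10.0):
    """Time name resolution and TCP connect separately for one host.

    Returns a dict with resolve_seconds, connect_seconds (None if the
    name did not resolve) and error (None on success).
    """
    result = {"host": host, "resolve_seconds": None,
              "connect_seconds": None, "error": None}

    # Step 1: name resolution only; no TCP connection is attempted yet.
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["resolve_seconds"] = time.perf_counter() - start
        result["error"] = "resolve failed: %s" % exc
        return result
    result["resolve_seconds"] = time.perf_counter() - start

    # Step 2: TCP connect to the already-resolved (ip, port) pair.
    address = infos[0][4]
    start = time.perf_counter()
    try:
        with socket.create_connection(address, timeout=timeout):
            pass
    except OSError as exc:
        # Covers refused connections and timeouts alike.
        result["error"] = "connect failed: %s" % exc
    result["connect_seconds"] = time.perf_counter() - start
    return result


def report(hosts, port=1521):
    """Print one timing line per host (hypothetical helper for this sketch)."""
    for host in hosts:
        r = time_resolve_and_connect(host, port)
        print(r["host"], r["resolve_seconds"], r["connect_seconds"], r["error"])
```

Calling report(["server1-vip", "server2-vip"]) with node 2 down should show whether the delay sits in the resolve step or in the connect step. Running it once with names and once with the raw IP addresses gives the same comparison as the test described above.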
Yechiel Adar wrote:
> We need some help.
> RAC, Oracle 10.2.0.3 on Windows 2003 servers, 64-bit.
> We did a fail over test. We disconnected one server from the network
> by pulling the network cable.
> The system worked fine, but once in a while a connection takes 6
> seconds instead of 20 ms.
> We understand that this happens because the VIP is moved to the second
> computer and there is
> nothing there to handle calls on that TCP address.
> I would like to know how to shorten the time from 6 seconds to almost