RE: Grid Networking Reliance on DHCP

From: <fmhabash_at_gmail.com>
Date: Fri, 9 Sep 2016 13:38:00 -0400
Message-ID: <57d2f379.0b16240a.560d5.6b7a_at_mx.google.com>



There are so many different implementations, infrastructure configurations, and bugs that anything is possible. Googling ‘Unable to obtain IPv4 DHCP’ reveals multiple scenarios. There are also known RH bugs that cause dhclient to fail to renew the IP, e.g. RHBA-2013:1572.

This is the failure we see in dmesg. ifup-eth invoked dhclient, and an IF condition in that script returned the error below because dhclient did not get its IP. As a result, the OSW logs show that the node lost all of its public IPs and VIPs at that time. The event lasted about 30 seconds. NTPD reacted by deleting the interfaces from its configuration.

We see no evidence of device errors, so we are letting these entries guide the investigation.

Aug 22 16:30:05 xxxxx dhclient[12319]: Please report for this software via the Oracle Bugzilla site:
Aug 22 16:30:05 xxxxx dhclient[12319]:     http://bugzilla.oracle.com
Aug 22 16:30:05 xxxxx dhclient[12319]:
Aug 22 16:30:05 xxxxx dhclient[12319]: exiting.
Aug 22 16:30:05 xxxxx /etc/sysconfig/network-scripts/ifup-eth: Unable to obtain IPv4 DHCP address eth0.
Aug 22 16:30:08 xxxxx ntpd[22275]: Deleting interface #8 eth0:4, 172.26.208.59#123, interface stats: received=0, sent=0, dropped=0, active_time=1754 secs
Aug 22 16:30:08 xxxxx ntpd[22275]: Deleting interface #7 eth0:3, 172.26.208.127#123, interface stats: received=0, sent=0, dropped=0, active_time=1756 secs
Aug 22 16:30:08 xxxxx ntpd[22275]: 

What do you have BOOTPROTO set to in your ifcfg-ethX, for example?

Thanks



Thank you

From: Seth Miller
Sent: Thursday, September 8, 2016 6:09 PM
To: fmhabash_at_gmail.com
Cc: 'oracle-l_at_freelists.org' (oracle-l_at_freelists.org)
Subject: Re: Grid Networking Reliance on DHCP

Over the last decade, I have never had or heard of a DHCP lease renewal failure causing a clusterware node failover. This seems like a pretty specific problem with your DNS.

Regardless, is there a reason you are not using what I have found to be the easiest to implement and manage, least error prone, and most scalable option - GNS?

Seth

On Thu, Sep 8, 2016 at 4:15 PM, <fmhabash_at_gmail.com> wrote:
True, but if they are left under the control of DHCP, I have seen issues when dhclient attempts to renew the lease on these IPs and fails for some reason. As a result, the VIPs are gone and a failover is triggered.
This can be resolved either by configuring these IPs to never expire or by removing DHCP altogether. In that case, I’m thinking the virtual interfaces need to be configured with BOOTPROTO="static".
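
For illustration only, a static setup for the physical public interface might look roughly like the sketch below. It assumes RHEL-style network-scripts. Every address is a placeholder from the documentation range, not a value from the cluster discussed here, and the file covers only the physical interface, since the VIPs are said elsewhere in this thread to be created and managed by clusterware.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0
# Illustrative sketch only; all values below are placeholders (assumptions).
DEVICE=eth0
ONBOOT=yes
# Any value other than dhcp or bootp keeps ifup-eth from invoking dhclient.
# "static" matches the wording used in this thread; "none" is the value
# usually shown in Red Hat documentation.
BOOTPROTO=static
IPADDR=192.0.2.10
NETMASK=255.255.255.0
GATEWAY=192.0.2.1
```

With a file like this there is no lease to renew on eth0, so the dhclient renewal failure described above should not arise for that interface.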
 
Feedback appreciated.



Thank you
 
From: Seth Miller
Sent: Thursday, September 8, 2016 5:02 PM
To: fmhabash_at_gmail.com
Cc: 'oracle-l_at_freelists.org' (oracle-l_at_freelists.org)
Subject: Re: Grid Networking Reliance on DHCP
The VIPs are created and managed by clusterware. You shouldn't need a configuration file at all for them.  
 
Seth
 
On Thu, Sep 8, 2016 at 3:54 PM, <fmhabash_at_gmail.com> wrote:
I know there are 3 options to configure the public network for a GI cluster: GNS, DHCP, or static. I have typically used static IPs for the public IP and VIPs.
However, Oracle's official documentation indicates that, as of 11.2, DHCP can be used for all VIPs, but not for the public IP.
So the physical interface eth0 and its virtual eth0:1 both have ‘BOOTPROTO="static"’.
If you are not using GNS, how do you have your IPs set up for an 11.2 GI cluster?

Thank you
 
 
 




--
http://www.freelists.org/webpage/oracle-l
Received on Fri Sep 09 2016 - 19:38:00 CEST
