Re: Interrupt fencing on Sun
Date: Tue, 29 Jan 2008 17:26:00 -0500
Interesting. It's been a while since I worked with Sun/Veritas, so I'd be interested in knowing the versions of the respective pieces of software being used.
I did have a similar situation with Red Hat Enterprise Linux 3 (in all its various kernel flavors and QUs). It seems there was a limit on device addressing: everything was identified as a SCSI device, and the maximum number of addressable devices was 128. This was documented, but not very well. I walked into a RAC installation where the SAN folks had divided a 1TB+ database into 8GB hdisks, each having two paths (hence two addresses) using PowerPath software. It was a six-node cluster, and although Linux 'saw' everything, it used a memory-stealing algorithm to keep track of it all. Every time it time-sliced the thread keeping track of the interconnect, the node would drop.
This got fixed in v4, or at least the limit on the number of devices went up considerably.
Don't really know if this has anything to do with CPU usage in Sun/Veritas, but it's been a boring afternoon.
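For anyone wanting to check the device arithmetic from the Red Hat story, here is a rough sketch. It assumes the database is exactly 1TB (1024GB, though it was really "1TB+"), that each 8GB hdisk consumes one address per path, and that no other devices on the host are counted. The function name and constant are made up for illustration.

```python
import math

# Limit on addressable devices quoted in the message (RHEL 3 era).
DEVICE_LIMIT = 128


def device_addresses(db_size_gb, lun_size_gb, paths_per_lun):
    """Return how many device addresses the layout would consume.

    Assumption: one address per path per LUN, nothing else counted.
    """
    luns = math.ceil(db_size_gb / lun_size_gb)
    return luns * paths_per_lun


# Assumed figures: 1TB taken as 1024GB, 8GB LUNs, two paths each.
addresses = device_addresses(1024, 8, 2)
print(addresses, "addresses needed; limit is", DEVICE_LIMIT)
print("over the limit:", addresses > DEVICE_LIMIT)
```

Under those assumptions the layout needs twice as many addresses as the limit allows, before counting boot disks or anything else on the host.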
On 1/21/08, Dan Norris <dannorris_at_dannorris.com> wrote:
> Sorry I don't have any experiences to share on this one. I have installed
> several VCS/Sun/RAC clusters, but none as large as you mention. I presume
> that the size of the cluster you're working on is what led Veritas and Sun
> to recommend the additional CPU reservations. On smaller (2-3 node)
> clusters, I haven't seen any issues like what you appear to have.
> I'd like to add to your query and/or branch off a new thread: I'd also be
> interested in knowing if anyone has heard of similar suggestions with
> respect to Oracle Clusterware. This would certainly be one additional item
> to add to the Oracle Clusterware (only) vs. Third Party Clusterware thread
> that made the rounds on this list recently.
> ----- Original Message ----
> From: fairlie rego <fairlie_r_at_yahoo.com>
> To: Oracle-L <oracle-l_at_freelists.org>
> Sent: Monday, January 21, 2008 4:34:04 AM
> Subject: Interrupt fencing on Sun
> On an 8-node 10.2.0.3 RAC with 128 cpus, Sun and Veritas have recommended
> that the customer set aside 2 cpus on each node for processing network
> interrupts exclusively.
> This was suggested after a situation where VCS was unable to process
> cluster heartbeat messages in a timely manner due to Sun attempting to
> process the network interrupt thread on the same CPU which was running the
> cluster communication thread.
> I would like to know if any other large customers have set aside cpus
> exclusively to handle network interrupts and if so what percentage of total
> Thanks much
> Fairlie Rego
> Senior Oracle Consultant
> M: +61 402 792 405