Re: Real Application Cluster

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Thu, 8 Dec 2011 23:18:25 +0000 (UTC)
Message-ID: <pan.2011.12.08.23.18.25_at_gmail.com>



On Thu, 08 Dec 2011 05:49:54 -0800, Mark D Powell wrote:

> We ran OPS on Sequent NUMA boxes without any more issues than we have
> faced on prior SMP hardware.

The question is not whether it's possible to run RAC on NUMA; the question is why you would do it. Modern NUMA machines behave like very large SMP boxes, for all intents and purposes, while providing the failure resistance and redundancy of a RAC configuration. In other words, RAC is not necessary with NUMA. NUMA machines are assembled in Lego-like fashion, just like clusters; only the nature of the connection is different.
This is also a "fashion problem", not just Oracle's direction. Other technologies, like Hadoop, have adopted the same cluster approach as Oracle. The early NUMA systems, like the one you have been working with, probably Sequent or Pyramid Nile, existed for a number of years alongside the early parallel "maspar" machines like nCube or the Connection Machine. Those early MP machines have evolved into modern cluster systems, while the Sequent and Nile machines have evolved into the second generation of NUMA, like the HP SuperDome or SGI Altix. These are still prohibitively expensive, but modern AMD chips are coming with NUMA primitives for accessing "remote" memory, the memory attached to another CPU, with a speed penalty of only around 10%. For some reason beyond my understanding, server manufacturers like Dell, HP or IBM are not mass producing and selling cheapo NUMA boxes, but I don't think we will have to wait very long before they start doing just that. When that happens, the old article by Moans Nogood will be even more relevant than it is today.
The things that RAC excels at are fault tolerance, redundancy and scalability, at a price of $10k/CPU thread. NUMA architecture can give you those same benefits, so my question is fairly logical. Unfortunately, the large NUMA boxes are much more expensive than $10k/CPU thread, so there isn't much competition yet, but I do believe that a clash is inevitable. Oracle must be aware of it, because the latest SUN "supercluster" machine is a NUMA machine which can accommodate up to 128 CPU sockets.

PS:

---
I was exaggerating when I said that I have no clue why cheap NUMA
systems are not being mass produced. At the present stage, those systems
still need a massive central "directory", a sort of global page table,
which must be extremely fast, partially associative and of rather large
capacity, and that is still expensive. So-called commodity NUMA
implementations, based on BIOS extensions and modern AMD/Intel chips,
cannot deliver fault tolerance and scalability; they are just a method
of making cheapo SMP boxes without an expensive crossbar switch bus
design.

-- 
http://mgogala.byethost5.com
Received on Thu Dec 08 2011 - 17:18:25 CST
