Oracle FAQ Your Portal to the Oracle Knowledge Grid

Re: DB2 Crushes Oracle RAC on TPC-C benchmark

From: Serge Rielau <srielau_at_ca.ibm.com>
Date: Thu, 03 Feb 2005 00:39:17 -0500
Message-ID: <36drn4F5040mpU1@individual.net>


DA Morgan wrote:
> Serge Rielau wrote:
>

>>> The published list price for RAC is $20K/proc AFAIK but I have never
>>> seen anyone actually pay that price. Here's the calculation I had
>>> the procurement folks at a Boeing division do for me last year for
>>> a project to the extent that I can divulge the numbers.
>>>
>>> 2 x 4CPU H/P-Compaq 1U servers $11K
>>> 8 x RAC licenses (using the published price) $160K (it was less)
>>>
>>> total cost $175K with rounding up for miscellaneous items.
>>
>>
>> But if one machine fails you risk thrashing the system because it will 
>> be 100% overloaded. How does Oracle react when you "exceed" 100% CPU?

>
>
> Let me suggest that you read the RAC concept books. The straw horse
> you have cobbled together is only possible in a badly designed system.
>
> But to answer your question it needs to be rephrased. How does the
> operating system handle hitting 100% CPU? I don't know. How does it
> handle that situation with any software?
>
>>> But the real advantage was that we knew we wouldn't need a
>>> forklift in the future. Here's the most important part of the
>>> RAC savings.
>>
>>
>> You think you don't need a forklift. Let's come back to this thread in
>> 2 years when Boeing is running on 5 nodes.

>
>
> 5 1U boxes sitting in a rack? Let's see ... I've personally put together
> 8 nodes in my lab. Oracle has published numbers on up to 128 nodes. I
> don't see a forklift in the future. Why do you?
Running what? SAP? How loaded?
When you put together 8 nodes in your university lab, what kind of proof is that? I can easily run DB2's QA cycle on "my" local development cluster of 12 p690s with 12 * 4 logical nodes after a 5-minute setup, but that proves absolutely nothing beyond correctness. Where is the URL with the details on the 128 nodes? What did it run? How long does it take to fail over when you have to remaster the locks in a TB-sized workload?

>

>>> Development ... one cluster with 2 nodes.
>>> Testing ... one cluster with 2 nodes.
>>> Production ... one cluster with 2 nodes initially.
>>
>>
>> See, with the Sun approach you could have shared the idle standby
>> for development and testing (kick the developers out in case of
>> failover).
>> leaves 750k (3 * 250k) facing 750k. Earlier it was stated that
>> Unix servers are /cpu more powerful...

>
>
> Not to be rude here but perhaps you'd like to borrow a book I have
> on best practices. You don't kick the developers out that are
> supporting a production release. You give them as close as possible
> a duplicate environment. Especially when the cost of doing so is
> so reasonable.

You still underestimated the cost by a factor of three. I'm no DBA, so I have to take your word for it that it's unacceptable to kick development off a box I need for a failover. But you find it acceptable to drop to half speed in case of a failover (assuming that both the OS and the DBMS handle the overload gracefully, which I dare to doubt)! You end up with denial-of-service problems until the node comes back. Where is the use of that? IMHO a scale-out solution aimed at handling failover makes no sense with fewer than 3 nodes.
Run each at 2/3 capacity; then you can handle the loss of a node and compete with an idle standby setup in which one machine runs close to full. That assumes perfect scaling. To stick with your example, you need 3 * 4 CPUs (with RAC) to compete against the 2 * 8 CPUs (8 fully licensed for Oracle + whatever Dataguard needs for the idle standby). As you scale the number of nodes, the hardware you need for a decent shared disk system also gets pricier. Isn't the same true for the high-speed interconnect (of which you should need 2 for RAC but only one for an idle standby, presuming Dataguard can offset the network failure by buffering the logs)?
I'm always skeptical when the "now" works and the "then" "has been done in my lab". Call it paranoia.
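
To put numbers on the 2/3 argument above, here is a rough sketch of the arithmetic. It assumes perfect scaling and that a failed node's load spreads evenly over the survivors, which is the best case. The function names are made up for illustration and are not part of any Oracle or DB2 tooling.

```python
def max_safe_utilization(nodes: int) -> float:
    """Highest per-node utilization at which the surviving nodes can
    absorb the load of one failed node without exceeding 100%.
    Assumes perfect scaling and even redistribution (best case)."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes to survive a failure")
    return (nodes - 1) / nodes


def post_failure_utilization(nodes: int, utilization: float) -> float:
    """Per-node utilization of the survivors after one node fails,
    under the same best-case assumptions."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes to survive a failure")
    return nodes * utilization / (nodes - 1)


def usable_cpus(nodes: int, cpus_per_node: int) -> int:
    """CPUs' worth of work the cluster can carry while still being
    able to lose one node: (nodes - 1) * cpus_per_node."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes to survive a failure")
    return (nodes - 1) * cpus_per_node


if __name__ == "__main__":
    # 2 nodes: each may only run at 50%, or the survivor is overloaded.
    print(max_safe_utilization(2))
    # 2 nodes both running flat out: the survivor faces 200% demand.
    print(post_failure_utilization(2, 1.0))
    # 3 nodes of 4 CPUs at 2/3 load carry 8 CPUs' worth of work,
    # the same as one fully loaded 8-CPU box with an idle standby.
    print(usable_cpus(3, 4))
```

Under these assumptions a 2-node cluster is only failure-safe at 50% load per node, while 3 nodes of 4 CPUs carry the same 8 CPUs' worth of work as the active 8-CPU machine in the standby pair. Real systems will do worse than this, since scaling is never perfect.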

Cheers
Serge

-- 
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Received on Wed Feb 02 2005 - 23:39:17 CST
