Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 


Re: CPU Usage % on 2-Node RAC Run versus Single Node RAC Run .... Benchmarking

From: Greg Rahn <greg_at_structureddata.org>
Date: Fri, 12 Oct 2007 11:01:24 -0700
Message-ID: <a9c093440710121101pde843e6ie051b8a7ccdcb639@mail.gmail.com>


I believe what is being observed is a *symptom* of the way the test is being run. The db doesn't need any "tuning"; nothing is wrong. It comes down to this: the longest operation (in time) dominates the overall transaction time. When a transaction (or portion of the workload) needs to do physical IO, that time will dominate the overall time, because physical IO takes an order of magnitude (or more) longer than memory accesses or remote buffer gets.
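
As a rough illustration of this point, the following minimal Python sketch recomputes the scenarios from the 9/26 message quoted below. The per-call latencies are the illustrative constants assumed in that message, not measurements from any system, and the helper name `elapsed_seconds` is made up for this example.

```python
# Assumed per-call latencies in seconds (illustrative constants, not measurements).
LOCAL_GET = 0.000001   # local buffer get: 1 microsecond
REMOTE_GET = 0.001     # remote buffer get: 1 millisecond
PHYSICAL_IO = 0.01     # physical IO: 10 milliseconds

TOTAL_GETS = 1_000_000  # size of the hypothetical workload


def elapsed_seconds(local=0.0, remote=0.0, physical=0.0, remote_get=REMOTE_GET):
    """Total time for TOTAL_GETS calls split by the given fractions."""
    if abs(local + remote + physical - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return TOTAL_GETS * (
        local * LOCAL_GET + remote * remote_get + physical * PHYSICAL_IO
    )


scenarios = {
    "100% local": elapsed_seconds(local=1.0),
    "50% local / 50% remote (1 ms)": elapsed_seconds(local=0.5, remote=0.5),
    "50% local / 50% remote (0.1 ms)": elapsed_seconds(
        local=0.5, remote=0.5, remote_get=0.0001
    ),
    "95% local / 5% physical IO": elapsed_seconds(local=0.95, physical=0.05),
}

for name, seconds in scenarios.items():
    print(f"{name:34s} {seconds:8.2f} s")
```

In every mixed scenario the slowest call type accounts for nearly all of the total, even when it is only a small share of the calls.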

Sure, one could shrink the buffer cache size to be very small and cause physical IO, but then the test is being formulated to get the desired results. Perhaps this is a valid learning exercise, but not something that would be necessary on a production-like workload.

I'm also going to guess that the one node test was run in exclusive mode (non-RAC, not just one node of a RAC cluster w/ the other node down), meaning there was *no* RAC overhead. On the scaling curve, the worst delta will be between a single exclusive node and 2 RAC nodes. Does this mean RAC doesn't scale? Of course not. This delta becomes smaller and smaller as more nodes are added to the cluster, because the "overhead" becomes a smaller and smaller percentage of the overall resource consumption.
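
One hedged way to picture why the first step is the worst one: the sketch below uses a deliberately simple model that is an assumption for illustration, not something measured in this thread. It assumes cached blocks are spread evenly across all nodes and that sessions request blocks uniformly, so the share of buffer gets that must come from another node is (N - 1) / N. The function name `remote_fraction` is hypothetical.

```python
def remote_fraction(nodes: int) -> float:
    """Share of buffer gets served by another node's cache.

    Assumes (for illustration only) that cached blocks are spread evenly
    across all nodes and that every session requests blocks uniformly.
    A single exclusive-mode node is modeled as having no remote gets.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return (nodes - 1) / nodes


previous = remote_fraction(1)
for n in range(2, 9):
    current = remote_fraction(n)
    step = current - previous
    print(f"{n - 1} -> {n} nodes: remote share {current:.3f}, step {step:+.3f}")
    previous = current
```

Under these assumptions the jump from 1 to 2 nodes is the largest, and each additional node adds a smaller increment than the one before it.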

On 10/12/07, ajeet ojha <oraclev28_at_gmail.com> wrote:
> Greg - so should he try to reduce the db cache size in case of 2 node
> runs? I have tried that but I didn't notice any improvement as such.
> Just to be a little more clear on this - what I am saying is that suppose he
> has 4 GB of db cache at each node. When he tries to run a 2 node RAC test,
> he should reduce the cache to 2 GB - but if the cache is oversized, statspack
> would show that...
>
> but great explanation from you !!
>
> regards
> Ajeet
>
> PS - vivek , you can send the statspack to me - will go in detail.
>
>
>
> On 9/26/07, Greg Rahn <greg_at_structureddata.org> wrote:
> >
> > Theory:
> >
> > The workload probably has nearly 100% of the data in cache and thus is
> > CPU bound - little to no IO is taking place. The 2 node RAC config
> > probably has 50% of the data in each cache. The "additional" CPU from
> > the sum of both nodes is due to the remote buffer get calls (extra
> > function calls are not free). Again, this is a symptom of an
> > in-memory database and probably would not be the case in a real-world
> > scenario. If there was physical IO taking place, it would be a closer
> > number. Why? Physical IO is an order of magnitude slower than remote
> > buffer calls and several orders of magnitude slower than local gets.
> > The physical IO times would dominate the overall transaction time
> > simply because of scale.
> >
> > For demonstration, let's play with some numbers.
> > Let's first declare some constants:
> > - local buffer get takes 1 microsecond (0.000001)
> > - remote buffer get takes 1 millisecond (0.001)
> > - physical IO takes 10 milliseconds (0.01)
> >
> > Let's say our workload has to do 1,000,000 buffer gets.
> >
> > If 100% are local buffer reads:
> > 1,000,000 gets * 0.000001 = 1 second
> >
> > If 50% are local buffer gets and 50% are remote buffer gets:
> > (500,000 * 0.000001) + (500,000 * 0.001) = 0.5 + 500 = 500.5 seconds
> >
> > Let's also consider the case where a remote buffer get takes 0.0001 seconds:
> > (500,000 * 0.000001) + (500,000 * 0.0001) = 0.5 + 50 = 50.5 seconds
> >
> > Depending on the remote buffer get times, this in-memory transaction
> > could get 50-500x slower if 50% of its buffer gets are remote gets.
> >
> > Are remote buffer gets a bad thing? Let's see.
> >
> > Let's introduce some physical IO now. Let's say 95% of the data is in
> > local memory, 5% physical IO.
> > ((0.95 * 1,000,000) * 0.000001) + ((0.05 * 1,000,000) * 0.01) = 0.95 + 500 = 500.95 seconds
> >
> > If we compare the 95% local, 5% physical case with the 50/50
> > local/remote (1 millisecond) case, we see that they take approximately
> > the same time (500.95 seconds vs. 500.5 seconds). With the given
> > constants, we see that if 100% of the data is spread across the RAC
> > cluster, it would be (slightly) faster to do the remote buffer gets
> > than to have 5% physical IO with 95% local buffer gets.
> >
> > Of course, there are an unlimited number of use cases here, one could
> > also have local gets, remote gets and physical IO, access times could
> > vary slightly, but I hope that the numbers help paint the picture.
> >
> > Bottom line: the slowest call will dominate the overall transaction
> > time when there are one to several orders of magnitude differences
> > between the call durations.
> >
> > On 9/24/07, VIVEK_SHARMA <VIVEK_SHARMA_at_infosys.com> wrote:
> > > CASE 1 - When Executing a FULL Set of Transactions on Node 1, with the
> > > 2nd Node's RAC instance in SHUTDOWN Condition
> > >
> > > CPU Usage of Node 1 = 20 %
> > > CASE 2 - When Executing approx Half the above Number of Transactions on
> > > Node 1, & the Other Half on Node 2 (by setting LOAD_BALANCE = yes in
> > > tnsnames.ora)
> > >
> > > CPU Usage of Node 1 = 18 %
> > >
> > > CPU Usage of Node 2 = 19 %
> >
> > --
> > Regards,
> >
> > Greg Rahn
> > http://structureddata.org
> > --
> > http://www.freelists.org/webpage/oracle-l
> >
> >
> >
>
>

-- 
Regards,

Greg Rahn
http://structureddata.org
--
http://www.freelists.org/webpage/oracle-l
Received on Fri Oct 12 2007 - 13:01:24 CDT
