RE: CPU Capacity Planning

From: Cary Millsap <cary.millsap_at_hotsos.com>
Date: Sun, 07 Dec 2003 08:24:25 -0800
Message-ID: <F001.005D91D9.20031207082425@fatcity.com>


My answers are in-line, preceded with “[Cary Millsap]”...

Cary Millsap
Hotsos Enterprises, Ltd.
http://www.hotsos.com



-----Original Message-----
Boris Dali
Sent: Sunday, December 07, 2003 9:54 AM
To: Multiple recipients of list ORACLE-L

Thanks a lot for the reply, Cary. Yes, your explanation makes all the sense in the world even though it is precisely the weighted average approach that I've seen on some capacity planning spreadsheets.

Two additional questions if I may, Cary. Would it be correct to say that when I throw additional users on a system it is only the queueing component of the response time that climbs, while the service time stays the same?

[Cary Millsap] “Sort of,” but not exactly. There are lots of scalability threats that begin to manifest in reality when you crank up the load. For example, you’ll see “latch free” waiting on applications that parse too much, but only at higher user volumes (never in unit test). You can consider the new appearance of “latch free” events to be a type of queueing if you want, but it’s really not queueing in the sense of a simple CPU queueing model.
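To illustrate the "simple CPU queueing model" referred to above, here is a minimal sketch that assumes an M/M/m queue (Poisson arrivals, exponential service times, m identical CPUs) and uses the Erlang C formula. The function names and the sample numbers are hypothetical and are not taken from the thread. Within this model the service time is a fixed input and only the queueing delay grows with utilization; out-of-model effects such as "latch free" waits are, by construction, not represented.

```python
from math import factorial


def erlang_c(m, rho):
    """Probability that an arriving request has to queue in an M/M/m system.

    m   -- number of CPUs (servers)
    rho -- average utilization per CPU, 0 <= rho < 1
    """
    a = m * rho  # offered load in Erlangs
    top = a ** m / factorial(m) / (1.0 - rho)
    bottom = sum(a ** k / factorial(k) for k in range(m)) + top
    return top / bottom


def response_time(m, service_time, rho):
    """Mean response time R = S + W for an M/M/m queue.

    The service time S is passed through unchanged; only the
    queueing delay W depends on the utilization rho.
    """
    wait = erlang_c(m, rho) * service_time / (m * (1.0 - rho))
    return service_time + wait


if __name__ == "__main__":
    # Hypothetical numbers: 2 CPUs, 0.5 s of service time per request.
    for rho in (0.1, 0.5, 0.7, 0.9):
        r = response_time(2, 0.5, rho)
        print(f"rho={rho:.1f}  service=0.500  queueing={r - 0.5:.3f}  total={r:.3f}")
```

Running the loop shows the total climbing with utilization while the service component stays at the value passed in, which is the behavior the question describes and the reason the model alone cannot reveal new serialization points.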

If that's true, then does it matter how I measure service time of my Bus.Tx1 - on a loaded system where hundreds of users run this operation or when nobody executes it at all? Also is it important to have the other two operations - Bus.Tx2 and Bus.Tx3 - running concurrently (as they would in real life) for the c measurements?

[Cary Millsap] You’ll put yourself at risk if you simply try to use a queueing model to extrapolate big-system performance from data collected in a unit testing environment. It’s because of the potentially out-of-model scalability threats.

In other words assuming I have an identical replica of a production environment where I am the only user - would service time/rate measured there be applicable for a loaded system with heterogeneous workload?

[Cary Millsap] ...Only if your production environment doesn’t trigger any new serialization issues that weren’t visible in your unit test environment.

And another stupid question.
Knowing individual business tx. characteristics (response time, number of CPUs required to comply with SLA requirements, average utilization per CPU, etc), how does one go about sizing the box in terms of the overall "system" required CPU capacity? Or put it another way - what do I tell a hardware vendor?

That is, if what comes out of a queueing exercise is:

           m      rho
         -----    ---
Bus.Tx1  2-way    70%
Bus.Tx2  3-way    50%
Bus.Tx3  4-way    80%

What should be the optimistic (let's assume perfect linear CPU scalability for now) recommendation to decision makers in terms of the horsepower required to run this "system" on?
After all, yes, individual business transactions have their own SLA requirements (e.g. worst tolerated response time), but they all use the same resources, don't they? So even though the service time of Bus.Tx1 might remain constant, the queueing delay (and hence the response time) would likely increase due to other concurrent activities on the system. Is there a way to account for this if capacity planning is done at the individual bus.tx level?

[Cary Millsap] The hardest part about capacity planning is that there’s no useful industry-wide standard unit of CPU work to use. You can’t use MHz, you can’t use MIPS, and you can’t use SPECints, or anything else like that. But you can use Oracle LIOs. It’s not hard to test a system to see how many LIOs/sec it can handle; this is your supply (capacity). It’s also not hard to see how many LIOs/sec an application needs; this is your demand (workload). With this realization, capacity planning is much simpler. The game is to ensure that supply exceeds demand at all times, and by a sufficient amount so that you don’t have unstable response times.

[Cary Millsap] ...And, of course, as I mentioned previously, you have to keep your peripheral vision open for the possibility that some new scalability threat will manifest and surprise you.
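The supply-versus-demand check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the per-transaction LIO rates, the measured capacity, and the 70% utilization ceiling are all hypothetical placeholders, not figures from the thread. Real numbers would have to come from load-testing the target system and measuring the application.

```python
def lio_headroom(supply_lio_per_sec, demand_lio_per_sec_by_tx, max_utilization=0.7):
    """Compare measured LIO capacity (supply) with application LIO demand.

    supply_lio_per_sec       -- LIOs/sec the system sustained under test
    demand_lio_per_sec_by_tx -- dict of business transaction -> peak LIOs/sec
    max_utilization          -- assumed ceiling that leaves room for stable
                                response times (an assumption, not a rule)

    Returns (total_demand, utilization, fits).
    """
    total_demand = sum(demand_lio_per_sec_by_tx.values())
    utilization = total_demand / supply_lio_per_sec
    return total_demand, utilization, utilization <= max_utilization


if __name__ == "__main__":
    # Hypothetical peak-hour demand for the three transactions in the question.
    demand = {"Bus.Tx1": 20_000, "Bus.Tx2": 15_000, "Bus.Tx3": 25_000}
    total, util, fits = lio_headroom(100_000, demand)
    print(f"demand={total} LIO/s  utilization={util:.0%}  fits={fits}")
```

Because demand is summed across all transactions before it is compared with supply, the check treats the box as one shared resource rather than sizing each transaction in isolation. It still says nothing about scalability threats that only appear under load.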

Thanks,
Boris Dali.


-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Boris Dali
  INET: boris_dali_at_yahoo.ca

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Author: Cary Millsap
  INET: cary.millsap_at_hotsos.com

Received on Sun Dec 07 2003 - 10:24:25 CST
