RE: cpu average load

From: Cary Millsap <>
Date: Sat, 4 Dec 2004 00:39:53 -0600
Message-ID: <009a01c4d9cc$103b1600$6601a8c0@CVMLAP02>


My answer to the final question in your note is, with due respect: No. I think you're looking at your system upside-down.

A system's value cannot be measured solely by how much capacity it consumes.
Value is a function of both cost and benefit. System resource consumption
statistics convey only cost, and system-wide resource consumption statistics
can convey only costs for which you have no hope of ever properly allocating
back to some tangible benefit.

You can measure the value your system provides to you only by measuring the
performance (both accuracy and speed) of the tasks your business requires
from it. The more important tasks deserve more attention (and more system
capacity) than the less important tasks. If your business wants only for
your system-wide statistics to be happy, then I submit that your business is
working for your system instead of the other way around.

I don't suggest that you start tracing 100s of processes. I do suggest that
it wastes time if you attempt to measure a system's efficiency in any way
that doesn't begin with prioritizing the tasks that your business requires
of your system.

Cary Millsap
Hotsos Enterprises, Ltd.
* Nullius in verba *

Upcoming events:

- Performance Diagnosis 101: 1/4 Calgary
- SQL Optimization 101: 12/13 Atlanta
- Hotsos Symposium 2005: March 6-10 Dallas
- Visit for schedule details...

-----Original Message-----
Sent: Friday, December 03, 2004 10:05 PM
Subject: RE: cpu average load


I am sorry it took a while for me to answer this question. I have been in implementations lately. I truly apologize.

Cary, what you find simple and implement frequently will take me a while to catch on to; I am still reading your book as I can.

The question started when my management asked me to deploy certain big brother monitors on the system. It appealed to them for various reasons, which is a completely different discussion.

Now we are getting warning messages regarding CPU average load. I am thinking of raising the thresholds on these warnings, since no one has complained about performance and, frankly, getting these messages interferes with our ability to monitor "true" problems on the system. Then again, users sometimes live with bad performance and the information never gets passed along, IMHO.

I don't mean to take anyone's time in answering an abstract question. I just wanted a general understanding of what this was measuring, and what impact it could have, before I went ahead and changed it. I was looking for a good place to start to gain a better understanding of what this measures. For example, I commonly look at TOP to see how much CPU a process is using. It is very easy to tell from that which processes are consuming the most CPU on the system, approximately how much CPU, and for how long. When I get error messages from OEM regarding CPU, I can often run TOP and trace them directly back to a particular process. Then I can proceed with more in-depth tracing. However, if I am getting warnings, errors and e-mails about average CPU load, then I am not completely clear what that is measuring.

In my simple mind, I think that looking at overall resource utilization on a box is a good place to start if you are seeing things slow down as a whole, and then drilling down from there. Also, proactively monitoring system resource utilization on a regular basis has proven useful to me when supporting a number of databases operationally. That is what these overall monitoring processes are for: just to show unusual activity. That is why I was asking: where can I start finding out what is usual or unusual average CPU load?
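One way to make "usual or unusual" concrete is sketched below in Python. It is only an illustration, not something proposed in this thread: it assumes the monitor's "average load" is the Unix load average (roughly, the average number of processes running or waiting to run), normalizes it by CPU count, and flags a sample that sits well above a history of earlier samples. The function names `load_per_cpu` and `is_unusual` and the threshold `k` are hypothetical, and a statistical outlier says nothing about whether any business task is actually slow.

```python
import os
import statistics


def load_per_cpu(load, ncpus):
    """Normalize a load average by the number of CPUs on the box."""
    if ncpus < 1:
        raise ValueError("ncpus must be >= 1")
    return load / ncpus


def is_unusual(sample, history, k=3.0):
    """Return True if sample exceeds the historical mean by more than
    k sample standard deviations. With fewer than two historical
    samples there is no baseline, so nothing is flagged."""
    if len(history) < 2:
        return False
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return sample > mean + k * stdev


if __name__ == "__main__" and hasattr(os, "getloadavg"):
    # os.getloadavg() is available on Unix-like systems only.
    one_min, five_min, fifteen_min = os.getloadavg()
    ncpus = os.cpu_count() or 1
    print("1-minute load per CPU:", load_per_cpu(one_min, ncpus))
```

Under these assumptions, the history would come from samples collected during periods when nobody was complaining, and the warning threshold would be set relative to that baseline rather than to a fixed vendor default.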

Cary, when you say:

"The amount of response time that process preemptions are costing your performance is measured as the amount of response time in an extended SQL trace file that is not accounted for by the sum of your file's c values at recursive depth zero, plus the sum of your file's ela values."

That does not seem to answer my question. Certainly, I shouldn't have to start by running extended SQL traces on everything running on my system when these warnings occur. For example, that might require an extended SQL trace of multiple OLTP systems with 100+ users. Shouldn't I be able to discern something from this information at a higher level?
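The accounting in the quoted sentence can be sketched as follows. This is a minimal Python illustration, not Cary's tooling: it assumes a simplified extended SQL trace layout in which database call lines begin with PARSE, EXEC or FETCH and carry `c=`, `e=` and `dep=` fields, and wait lines begin with WAIT and carry an `ela=` field. The function name `unaccounted_time` is hypothetical, the response time must be supplied by the caller, and all values are assumed to be in the same time unit.

```python
import re

# Simplified patterns; real trace files contain other line types too.
DBCALL = re.compile(r"^(?:PARSE|EXEC|FETCH) #\d+:c=(\d+),e=(\d+),.*?dep=(\d+)")
WAIT = re.compile(r"^WAIT #\d+: nam='[^']*' ela=\s*(\d+)")


def unaccounted_time(trace_lines, response_time):
    """Response time minus (sum of c at recursive depth zero
    plus sum of ela over all wait lines)."""
    cpu_depth_zero = 0
    wait_total = 0
    for line in trace_lines:
        call = DBCALL.match(line)
        if call:
            if int(call.group(3)) == 0:  # count c only at dep=0
                cpu_depth_zero += int(call.group(1))
            continue
        wait = WAIT.match(line)
        if wait:
            wait_total += int(wait.group(1))
    return response_time - (cpu_depth_zero + wait_total)


# Made-up sample lines for illustration only.
sample = [
    "PARSE #1:c=10000,e=12000,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=1000",
    "EXEC #2:c=5000,e=5000,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=4,tim=1100",
    "WAIT #1: nam='db file sequential read' ela= 3000 p1=1 p2=2 p3=1",
    "FETCH #1:c=20000,e=40000,p=1,cr=3,cu=0,mis=0,r=1,dep=0,og=4,tim=2000",
]
print(unaccounted_time(sample, 50000))  # 50000 - (30000 + 3000) = 17000
```

The sketch only needs the trace of one task whose response time matters, which is the point of prioritizing tasks first rather than tracing every session on the system.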

-----Original Message-----
On Behalf Of Niall Litchfield
Sent: Wednesday, December 01, 2004 8:28 AM
Subject: Re: cpu average load

On Tue, 30 Nov 2004 10:59:11 -0600, Cary Millsap <> wrote:
> I disagree that this advice is difficult to implement in practice,
> because I implement it in practice frequently.

I disagree with Mladen for a somewhat different reason (i.e., I don't care about ease of use here). It seems to me this discussion springs from a technical question that may or may not be worth answering. Paula's question was along the lines of 'how can I tell if my server is being utilized efficiently'. One possible answer to this is 'Who cares?'. Now, if the question is being asked because there is an ongoing discussion about buying new hardware, or the transactional capacity of the system is apparently not good enough for the business needs of the state, then there is a real business problem to investigate.

iff there is a real problem to be investigated, then it doesn't really matter how easy or hard it is to get the correct answer (unless the cost of obtaining the answer is higher than the cost of not answering), it is the correct answer that you require.

So I'd be taking a step back and asking Paula to define *why* she is investigating the amount of the CPU capacity of her machines that Oracle is using. If you can express that in clear business terms then you can go down the profiling route (or any other method you think appropriate).

BTW, in this particular case, my money would be on unaccounted-for time being a better measurement of time spent being preempted than the kernel mode time consumed by the whole system, but I'm willing to be proven wrong.

Niall Litchfield
Oracle DBA



Received on Sat Dec 04 2004 - 00:42:20 CST
