Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: The best CPU usage measurement in Oracle: BUFFER_GETS or CPU_TIME?

RE: The best CPU usage measurement in Oracle: BUFFER_GETS or CPU_TIME?

From: Cary Millsap <cary.millsap_at_hotsos.com>
Date: Sat, 19 Jun 2004 16:17:21 -0500
Message-ID: <023b01c45642$d2e8a9f0$6601a8c0@CVMLAP02>


Jurijs,

My comments are in-line below.

Cary Millsap
Hotsos Enterprises, Ltd.
http://www.hotsos.com
* Nullius in verba *

Upcoming events:
- Performance Diagnosis 101: 6/22 Pittsburgh, 7/20 Cleveland, 8/10 = Boston

- SQL Optimization 101: 5/24 San Diego, 6/14 Chicago, 6/28 Denver
- Hotsos Symposium 2005: March 6-10 Dallas
- Visit www.hotsos.com for schedule details...


-----Original Message-----
From: oracle-l-bounce_at_freelists.org =
[mailto:oracle-l-bounce_at_freelists.org]
On Behalf Of J.Velikanovs_at_alise.lv
Sent: Saturday, June 19, 2004 3:45 PM
To: oracle-l_at_freelists.org
Cc: vjv_at_alise.lv
Subject: RE: The best CPU usage measurement in Oracle: BUFFER_GETS or CPU_TIME? Cary,
I have bought your excellent book "Optimizing Oracle Performance" at the =

time of Paris Open World last year.
I read it twice and trying to hold it near for references. Even now it = is=20
in the range of my vision.

Thank you for sharing your experience with us Oracle users!

[Cary Millsap] You're very kind.

But ;) there some situations there the Method R is difficult to use for =

performance diagnostic. In case we have some process that running to = slow=20
from business perspective, we can use described method with big success.

Lets imagine the Business saying to me: "Well, at the middle of the=20 working day activity A have no appropriate response time, but out of=20

business ours we have no problems with mentioned activity". I have made=20
TRC for this activity at the problem time. You probably know that I can=20
see from trace file. There is CPU time starvation, because of the other=20
processes in the running queue.

What we can do in such situation? I imagine 3 possible solutions (2 for=20 hardware vendors, 1 for Oracle peoples DBA/Dev/Consultants)

[Cary Millsap] Conceptually, anyway, it's quite simple. You identify the competition for the resource that's being competed for, and you fix it, either by moving the competition into another time window, or you = optimize
that competing piece of application workload. ...In your example, of = course,
the competed-for resource is CPU.

Third one=20
(1 - add more CPUs, 2 - change CPU to modern ones, second can be more=20 popular because of Oracle licensing ;)=20 is to find TOP CPU consumers and try to optimise those.

[Cary Millsap] Yes, exactly.

The question is: What is the most effective way to find TOP consumers? Root of my initial question is what is the best way (BUFFER_GETS or=20 CPU_TIME) to accomplish this task.

[Cary Millsap] ...And now I see your point more clearly.

You know peoples often use statspack as well, as your web site to = diagnose=20
system overall performance. If bottleneck is CPU, then advice is to find =

TOP SQL from BUFFER_GETS perspective. My point is: may be now the better =

measure is CPU_TIME? I show in previous posting situation then CPU_TIME=20 figure is 10 times accurate then BUFFER_GETS. Can you show me = contrariwise=20
situation?

[Cary Millsap] No, I think you're actually on the best path to the = solution.

Best regards,
Jurijs

PS In the most live situation the Method R is the best way to work, but=20 there are exceptions. What is the best way to fight exceptions?

[Cary Millsap] I don't view this as an exception to Method R, it's just = one
of the steps you have to go through to implement your solution. Method R says to find out what's taking so long, and then go attack that. This is exactly what you're doing. What is NOT Method R is when people /assume/ = they
have a competition-for-CPU problem without measuring its impact, and = then go
on to work for days, weeks, or months trying to fix that problem, and = then
end up with nothing because they in fact were working on a flawed = assumption
to begin with. It seems that you've executed that correct first = diagnostic
step.

On 19.06.2004 18:32:39 oracle-l-bounce wrote:

>Jurijs,
>
>What you're saying is one of the key reasons that I pay attention =
almost
>always to trace data and almost never to V$ data.
>
>The problem with paying attention to sort counts and things like that =
=3D
>(even
>LIO counts!) is that some sorts are cheap, and some are really =3D
>expensive.
>You cannot tell how long they take by counting how many times they =3D
>happen.
>However, from your trace data, it's pretty simple to see when c has a =
=3D
>value
>that's larger than is normally explained by your cr+cu value, and you =
=3D
>can
>corroborate the time consumption by noticing SORT row source operations =
=3D
>in
>your STAT lines. In 9.2.0.2 and beyond, you can even see elapsed time
>consumption per row source operation in your STAT output (but watch for =
=3D
>bug
>3009359).
>
>With V$ data, you have to be lucky to find the problem, because of the =
=3D
>way
>the information gets aggregated before you can even collect it. For =3D
>example,
>it's entirely possible for a workload A with sorts=3D3D1 to take more =
time=20
=3D
>than
>a workload B with sorts=3D3D1003. All it takes is for the sort of A to =
be =3D
>1004
>times more expensive than the average sort duration of B. It is MUCH =
=3D
>more
>difficult to tell whether this is happening by looking at your V$ data =
=3D
>(if
>it's even possible at all)...
>
>The big problem with Statspack or ANY other tool that relies upon V$ =
=3D
>data is
>that, because of the aggregation done inside the Oracle kernel, it =3D
>presents
>very little solid evidence about response time. What you by looking at =
=3D
>event
>counts and aggregations of response time consumption qualifies =3D
>technically
>as only circumstantial evidence about the response time of an =
individual
>user action under inspection.
>
>It is also very simple with trace data to find the "time spent =3D
>preempted"
>problem that Dan brought up. It is considerably more difficult to do =
=3D
>with V$
>data. I'm not sure how you'd possibly go about it, actually, because =
=3D
>there's
>no place in the V$ data where you can pick up statistics about =3D
>individual
>executions of dbcalls (parse, exec, fetch, etc.).
>
>* There are certain problems (a lot of them in my experience) that you =
=3D
>will
>never be able to solve reliably with Statspack or ANY other tool that's
>based upon V$ data. *
>
>More generally, the problem is not V$ data in particular, it's ANY
>performance diagnostic data that is aggregated before you can collect =
=3D
>it. If
>you're interested in this phenomenon, there is a lot more detail =
present=20
=3D
>in
>the book "Optimizing Oracle Performance"--especially chapters 1 and 8.
>
>
>Cary Millsap
>Hotsos Enterprises, Ltd.
>http://www.hotsos.com
>* Nullius in verba *
>
>Upcoming events:
>- Performance Diagnosis 101: 6/22 Pittsburgh, 7/20 Cleveland, 8/10 =3D
>Boston
>- SQL Optimization 101: 5/24 San Diego, 6/14 Chicago, 6/28 Denver
>- Hotsos Symposium 2005: March 6-10 Dallas
>- Visit www.hotsos.com for schedule details...
>
>
>-----Original Message-----
>From: oracle-l-bounce_at_freelists.org =3D
>[mailto:oracle-l-bounce_at_freelists.org]
>On Behalf Of J.Velikanovs_at_alise.lv
>Sent: Saturday, June 19, 2004 8:01 AM
>To: oracle-l_at_freelists.org
>Cc: oracle-l_at_freelists.org; oracle-l-bounce_at_freelists.org
>Subject: Re: The best CPU usage measurement in Oracle: BUFFER_GETS or
>CPU_TIME?
>
>Thank you all for responses.
>I totally agree with Daniel regarding "CPU starvation" issues. But if =
we=20
=3D
>
>have one we cant see it ether from BUFFER_GETS or CPU_TIME any way ;)
>
>Lets imagine we have no one. And we would like find TOP SQL on =
which=3D20
>spending our time we will get most.
>
>As Jonathan Lewis note there are can be situations then some =
activates=3D20
>takes CPU resources, but doesn't reflect in the logical read =
statistics.=20
=3D
>
>As mentioned Jonathan latch contention can be the issue or buffer =
is=3D20
>pinned.
>
>But I would like to pay your attention to activities which, by my =3D
>option,=3D20
>takes place more often then others CPU consuming activates and can =
takes=20
=3D
>
>much significant CPU resources.
>
>It is !SORTING!
>
>I have gotten results bellow:
>1. Activity=3D20
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D =3D
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D
>PL/SQL procedure successfully completed.
>
>vvvvvvvvvvvvvvvvv
>Elapsed: 00:00:01.35
>^^^^^^^^^^^^^^^^^^
>NAME VALUE
>----------------------------------------------------------------=3D20
>----------
>vvvvvvvvvvvvvvvvv
>session logical reads 23002
>CPU used by this session 137
>^^^^^^^^^^^^^^^^^^
>
>2. Activity=3D20
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D =3D
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D
>PL/SQL procedure successfully completed.
>
>vvvvvvvvvvvvvvvvv
>Elapsed: 00:00:14.22
>^^^^^^^^^^^^^^^^^^
>
>NAME VALUE
>----------------------------------------------------------------=3D20
>----------
>vvvvvvvvvvvvvvvvv
>session logical reads 23006
>CPU used by this session 1422
>^^^^^^^^^^^^^^^^^^
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D =3D
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D
>
>As you can see both activities have taken equal amount of "logical =3D
>reads",=3D20
>but CPU consuming for second one is 10 times bigger.
>The key answer why I have gotten such unproportional results is =
SORTING.
>For fists activity
>sorts (memory) 1
>For second
>sorts (memory) 1001
>
>So. I would say, it is better from this perspective to take a look =
first=20
=3D
>
>on CPU utilization not on BUFFER_GETS, or if you want on both. But =
not=3D20
>just on BUFFER_GETS as in statspack.
>
>Please correct me if I am wrong.
>
>Jurijs
>9268222
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D =3D
>=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D=
3D=3D3D=3D3D=3D3D=3D3D
>http://otn.oracle.com/ocm/jvelikanovs.html
>
>PS Activities code
>
>1.
>declare
>vn number;
>begin
>for f in 1..1000 loop
>select count(*) into vn from (select * from testsort);
>end loop; end;
>/
>
>2.
>declare=3D20
>vn number;
>begin
>for f in 1..1000 loop
>select count(*) into vn from (select * from testsort order by 1);
>end loop; end;
>/
>
>



Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.
--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request_at_freelists.org
put 'unsubscribe' in the subject line.
--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
Received on Sat Jun 19 2004 - 16:15:47 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US