Re: High CPU usage

From: shiva <sshivayogi_at_gmail.com>
Date: 14 Oct 2006 09:46:12 -0700
Message-ID: <1160844372.911005.210080@h48g2000cwc.googlegroups.com>

Fellas,

Thanks for giving very good pointers.

> What is not stated is why the number of sessions has increased from 80
> to 1000, because that could explain a lot more.

There were about 4 app servers connecting to database in 9i. In 10G they are about 8. So the app server capacity is doubled resulting in more connections. Then you might ask, Why 1000 connections instead of 160 in the new database? it appears some problem with new jdbc drivers not closing the connections. However after we rebooted all the app servers overloaded connections problem is resolved! (I'm still at loss to explain what helped here)

> Why high CPU usage is an issue?

Because sometimes during business hours cpu usage goes up to 100% and results in jamming users.

So present update is:

Connections are back to what we expected after rebooting app servers which is around 160 but still we are facing high cpu usage.
I set up a test 10G environment on a standalone box and made few app servers connect to it to rule out RAC as an issue. a) 10G standalone instance consumed same amount of cpu as the RAC box b) execution plans for queries were same in 10GRAC and non-RAC c) no RAC related wait events in AWR reports. So I strongly rule out this being an issue from upgrade to standalone to RAC
Looking at the execution plans of 9i and 10G. 10G is faster consuming more resources. 9i is not so efficient but users never complained about performance. My opinion is 10G is CPU hogger!

Below is my observations on top used SQL. 10046 output for 9i is

call count cpu elapsed disk query current

rows
------- ------ -------- ---------- ---------- ---------- ----------

Parse        1      0.01       0.00          0          0          0
       0
Execute      1      0.00       0.00          0          0          0
       0
Fetch       14      0.04       0.03          0       3907          0
     195

------- ------ -------- ---------- ---------- ---------- ----------

total       16      0.05       0.04          0       3907          0
     195

10046 output for 10G is

call count cpu elapsed disk query current

rows
------- ------ -------- ---------- ---------- ---------- ----------

Parse        1      0.01       0.00          0          0          0
       0
Execute      1      0.01       0.01          0          0          0
       0
Fetch       14      0.06       0.05          0       7761          0
     195

------- ------ -------- ---------- ---------- ---------- ----------

total       16      0.08       0.07          0       7761          0
     195

Obviously 10G is consuming more data buffers.

So I spend some time tuning 10G query adding hints and come with a sql which consumes less resources. Below is tuned query's 10046 output.

call count cpu elapsed disk query current

rows
------- ------ -------- ---------- ---------- ---------- ----------

Parse        1      0.01       0.00          0          0          0
       0
Execute      1      0.01       0.01          0          2          0
       0
Fetch       14      0.02       0.02          0         35          0
     195

------- ------ -------- ---------- ---------- ---------- ----------

total       16      0.04       0.04          0         37          0
     195

Since it is difficult to add this in to application right away. I created an outline and made optimizer to use tuned query's execution plan with bind variable. With the hope to kill two birds at one stone. Since cursor_sharing is set to similar, I follow Jonathan Lewis special instructions and make it work with cursor_sharing=similar. Now whenever this query gets executed it uses my execution plan.

I have to wait and watch the database and how this effects during business hours.

One thing i started noticing after making this change is background process for parallel server "ora_p002_" is consuming about 1.9% of the cpu usage. Below are my parallel processor processing parameter settings:

parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean FALSE
parallel_execution_message_size integer 2152

parallel_instance_group              string
parallel_max_servers                 integer
80
parallel_min_percent                 integer                          0
parallel_min_servers                 integer                          0
parallel_server                      boolean
TRUE
parallel_server_instances            integer                          2
parallel_threads_per_cpu             integer                          2
recovery_parallelism                 integer                          0

Note: Along with tuning some SQL's we are also considering adding another node to this RAC. What do you feel about third node?

Thanks so much.

Valentin Minzatu wrote:
> I re-read the posts and my understanding of the "problem" is that by
> upgrading from 9i to 10g (supposedly " better and faster") and also
> increasing the HW capacity - practically doubling it - they experience
> ~ 8 (considering a growth from 20% to 80% for each node) times higher
> CPU usage.
>
> What is not stated is why the number of sessions has increased from 80
> to 1000, because that could explain a lot more.
>
> If the overall system response time is slower and the application
> requires now more connections to handle the same workload then it make
> sense to look into optimizing the database, *but* if the increase in
> connections is something due to increased demand/workload then we may
> be just wasting our time as the behavior may be perfectly normal.
>
> Another thing that baffles me is that CURSOR_SHARING change made *no
> difference*. For better or for worse it should have made one (again in
> the case when those literal queries had significance), so I am
> wondering if the change was made across the whole cluster or on a
> single node.
>
> I have also noted that there are slight differences between the
> configurations on each node. Not that it would solve the "problem", but
> I wonder why the nodes are configured differently.
>
> ... and we can keep going like this for days unless some relevant
> information is posted.
>
> Cheers,
> Valentin
Received on Sat Oct 14 2006 - 11:46:12 CDT