Oracle-L: RE: HIGH CPU WITH MULTIPLE CONCURRENT USERS (long)

From: Khedr, Waleed <Waleed.Khedr_at_FMR.COM>
Date: Fri, 19 Apr 2002 08:08:41 -0800
Message-ID: <F001.004497BB.20020419080841@fatcity.com>

The elapsed time for this stress test cannot change except by reducing the CPU time needed to execute a single task.

So to improve the numbers, some tuning is needed on the SQL.

The average elapsed time for (n concurrent jobs) = x * n / c sec

x = cpu time needed to execute a single task (alone on the system)
c = number of CPUs
n = number of concurrent tasks 
n = m * c  where m = 2,3,4,5,6, etc.
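
For illustration, the formula above can be worked through with a small Python sketch. The values of x and c below are hypothetical (the thread does not state the CPU count or a measured per-task CPU time), and the function name is just a label chosen for this example. It assumes a purely CPU-bound workload with n >= c, as in the formula.

```python
def avg_elapsed_seconds(x, n, c):
    """Average elapsed time for n concurrent CPU-bound jobs: x * n / c.

    x: CPU seconds one task needs when run alone on the system
    c: number of CPUs
    n: number of concurrent tasks (assumed to be at least c)
    """
    if x < 0 or c <= 0 or n < c:
        raise ValueError("expected x >= 0, c > 0 and n >= c")
    return x * n / c


# Hypothetical inputs, not taken from the thread.
x = 2.0  # CPU seconds per task (assumed)
c = 4    # number of CPUs (assumed)

for m in (2, 3, 4, 5, 6):
    n = m * c  # n = m * c, as in the formula above
    print(f"m={m} n={n} avg elapsed = {avg_elapsed_seconds(x, n, c):.1f} sec")
```

Since n = m * c, the result reduces to x * m, so under this model the only way to lower the elapsed time at a given load is to lower x.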

regards,

Waleed
-----Original Message-----

To: Multiple recipients of list ORACLE-L
Sent: 4/19/02 12:23 AM

Vivek's feedback: (on your questions)

Richard,

I agree that over time this incident has been tested under so many scenarios that it is getting confusing. However, the objective that we started with is still the same.

Query: I have a query that does a select from 1 table (uses first_rows and an index hint). This index is the one that gives us the best possible time with the fewest consistent gets. The IN clause contains 50 individual literals. For 1 user, the query takes 1.67 seconds to execute. This includes the time it takes to display the results on the client, which in our case is the SQL*Plus window on the database server. I generated the trace file and ran TKPROF on it. I am attaching the results of the trace file for your perusal.

I tried _spin_count at its default and at various values from 4 to 40000. The best response time was obtained at a _spin_count of 10000. This is the value currently set. Oracle also recommended this value, since the CPU appears to be busy doing something (I believe due to Oracle), and this becomes clearly visible as the user load is increased.

To provide more clarity, I am attaching a Word document that lists the trace status of parse, execute and fetch for 1 and 20 simultaneous users. Please note that while for 1 user the total elapsed time is very close to the fetch time, for 20 concurrent users the disparity is high. This disparity increases more than linearly as the stress is increased. I hope this helps.

You are correct in your observation that Oracle does not show a wait in v$session_wait and that the CPU idle time is 0%, with usage at 98% user and 2% kernel. This can be observed clearly with as few as 100 concurrent users. There are no functions or data conversions on any of the columns, either in the select list or in the where clause. I want to be careful here. As I keep reducing the number of literals in the IN clause, the query runs faster. However, the degradation factor (the ratio of the response time for 20 simultaneous queries to the response time for 1 query) stays the same, hovering around 1 to 3.6. This degradation factor becomes very large as the stress is increased.
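
To make the degradation factor concrete, here is a minimal Python sketch of the ratio as described above. The 1.67 second single-user time is from this post. The 6.0 second figure for 20 users is only an illustrative stand-in chosen to land near the reported factor, not a measured value, and the function name is made up for this example.

```python
def degradation_factor(t_concurrent, t_single):
    """Ratio of the response time under concurrent load to the
    response time of one query running alone."""
    if t_single <= 0:
        raise ValueError("single-query response time must be positive")
    return t_concurrent / t_single


t_single = 1.67  # seconds for 1 user (from this post)
t_20_users = 6.0  # seconds with 20 users (illustrative assumption)

print(f"degradation factor: {degradation_factor(t_20_users, t_single):.2f}")
```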

Our first scenario was an IN clause with 800 literals. We then reduced it to 200, then to 100, and we are now at 50. However, since our application still needs a response for 800, we now have that many more simultaneous queries accessing the database. This contributes to increased load, and the overall degradation factor is still at a high level.

I will try the truss and send you the observation soon.

Thanks in advance.

Vivek Vijayaraghavan

1 USER: