Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Excessive SQL*Net message from client waits

Excessive SQL*Net message from client waits

From: Karen Morton <oracledba_at_morton-consulting.com>
Date: Wed, 12 Mar 2003 18:18:54 -0800
Message-ID: <F001.0056870F.20030312181854@fatcity.com>


Hi All,

I've got a situation where I've collected trace data and am seeing 90% of total response time is accounted for with the SQL*Net Message From Client event. Individual queries within the trace show minimal CPU time used and no obvious indications of bad SQL being the culprit. I used the Hotsos Profiler (way cool) and here's an example of what it shows:

Response Time Component                       Duration         # Calls
Avg        Min      Max
---------------------------------- -------------------- --------------- ----
------- ---------- -------

(i) SQL*Net message from client 500.98s 85.1% 2,757
0.181712s 0.00s 5.91s
(i) unaccounted-for 23.03s 3.9%
(i) direct path write 22.38s 3.8% 1,373
0.016300s 0.00s 0.32s
(i) log file sync 20.70s 3.5% 685
0.030219s 0.00s 0.52s
(i) user-mode CPU 12.12s 2.1% 12,016
0.001009s 0.00s 1.50s
(i) direct path read 6.66s 1.1% 985
0.006761s 0.00s 0.09s
(i) db file sequential read 1.09s 0.2% 2,679
0.000407s 0.00s 0.14s
(i) db file scattered read 0.83s 0.1% 2,158
0.000385s 0.00s 0.17s
(i) SQL*Net more data to client 0.50s 0.1% 1,007
0.000497s 0.00s 0.13s
(i) SQL*Net more data from client 0.42s 0.1% 5
0.084000s 0.01s 0.19s
(i) db file parallel read 0.11s 0.0% 44
0.002500s 0.00s 0.01s
(i) latch free 0.10s 0.0% 30
0.003333s 0.00s 0.02s
(i) file open 0.01s 0.0% 8
0.001250s 0.00s 0.01s
(i) SQL*Net message to client 0.00s 0.0% 2,757
0.000000s 0.00s 0.00s ---------------------------------- -------------------- --------------- ---- ------- ---------- ------- Total 588.93s 100.0%

If you want to see the whole profile please check at www.morton-consulting.com/pdfs/sqlnetwaitstrace.pdf.

By the way, this single trace is one of 15 that was done and all show the same SQL*Net waits being, on average, 90% or above of the total time.

The "network guys" did some testing and came back saying that the network couldn't possibly be the problem (do they ever?). Here's what they said: "There are 0 Symptoms and Diagnoses that occur from Physical layer up to network layer. In running the trace file in loopback mode using the packet generator function - I noticed that the average % of utilization across the network was under 10% utilization. There were only few spikes that reached 10% and only two at 20%. All spikes seen lasted no more than two seconds. In analysis of your packet size distribution, it was a perfect bell curved. In fact, 64 byte sized packets (broadcasts and unicasts) were only 4th in the number of occurrences (very good!). Traffic captured showed the mac layer path clearly through cabinet K and out to the Oracle TNS server. At the network layer, you only had IP traffic between the client and the Oracle TNS server. You had 267 connection layer Symptoms between the Oracle TNS server and the client. 263 of which were Ack Too Long. All were because of the Oracle TNS Server which had very poor Ack times. The client's Ack's averaged 2ms. You had 1 Windows Frozen and 3 Retransmissions. At the session layer you had 2 Diagnoses and 4 Symptoms - all related to the Oracle TNS Server."

Another test I did was to do a continuous ping from the server to one of the connected sessions and saved the results to a file. I saw some some minor spikes over the period of 30 minutes or so but nothing significant.

The application is not the most efficient in the world (to put it mildly) as it doesn't use bind variables (at all) and changing the app at this point is not an option. The app is in part written in C++ and in part in Delphi. Every SQL statement is built as a concatenated string and then passed to the database. I know bind variable usage would help, particularly with all the hard parsing but if I can't make app changes, then what?

Another thing the app does is that for every individual user that logs in, multiple sessions will be spawned. For a given single user who is logged in when you look in v$session, you see as many as 10 sessions for that process. So, in effect, 20 users "look like" 200.

I know, I know....is anybody ill yet?

Can anyone suggest anything that could help pinpoint all that SQL*Net wait time? Do you see anything in the trace data that would lead you to suggest anything (init.ora parameter changes for example)? I've tried everything I can think of and am at wits end but still have a screaming client who's highly upset with the excessive response times.

And one last thing which is kinda related....we changed the optimizer_index_cost_adj and optimizer_index_caching parameters from their defaults to 50 (both of them) and saw in our testing that it did make significant improvements in some cases (the trace profile I've shown you doesn't really have examples of this). But...since the client didn't want to bounce the database, we attempted to make these changes temporarily in a logon trigger by using execute immediate to do alter session commands to set the parameters per session logon. It seemed to work wonderfully but one application (one out of four) started hanging on startup right after we put in the trigger. When the trigger was dropped, the application was OK again. What the heck is up with that? I've never seen anything like that and can't for the life of me figure out what the app could be doing that a logon trigger would cause to freak out. Any ideas on that one?

Many thanks in advance for your assistance, Karen

PS If you could send me direct replies to this post at karen_at_morton-consulting.com as well as replying to the list I'd appreciate it.

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Karen Morton
  INET: oracledba_at_morton-consulting.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L

(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).
Received on Wed Mar 12 2003 - 20:18:54 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US