Re: Resend: performance problem with fetch using 8.1.7 32-bit client to access a 64-bit server
"Sybrand Bakker" <gooiditweg_at_sybrandb.demon.nl> wrote in message
news:33pf20t0ks9si86ln3ee646s6cnkc5r95c_at_4ax.com...
> On Mon, 9 Feb 2004 15:36:25 +0700, <harronc_at_attglobal.net> wrote:
>
> >Please ignore previous message.
> >
> >We are running 8.1.7.4 64-bit Oracle on Solaris 8/sparc3. We have a client
> >which was compiled on a 32-bit sparcII platform. We find that when we run
> >multiple instances of the client it doesn't scale very well, even though
> >the product vendor believes that it should, and we have evidence from
> >different sites to prove this. We have already spent a lot of effort
> >tuning the server and we now believe that it is well tuned. The server
> >spends 90% of its time idle. We have found that the client also spends
> >about 90% of its time sleeping even though it has work to do, and there
> >are plenty of free resources (only 25% of the 24 CPUs are utilised, 40%
> >of memory is free, and IO is healthy).
> >
> >When we dig deeper into what the client is doing using pstack, we find
> >that it spends most of its time fetching from various cursors, and
> >specifically in the read function. An example stack trace is below:
> >
> >fea9e570 read (a, 8077ee, 810)
> >fefc1a88 sntpread (810, 80560c, 804aec, 8077ee, 804a9c, 7f2544) + 10
> >fefbf114 ntprd (804aa8, 805108, 7fb3d8, ff257314, 0, 7f9088) + 134
> >fee1f490 nsprecv (8048c8, ffbe875f, ff285ce8, 0, 7f9088, 7fb3d8) + 58c
> >fee24184 nsrdr (8048c8, 8080f0, ff285ce0, 0, 27, a) + 1dc
> >fee09bc8 nsdo (7fcd08, 1, 0, 0, 7f9088, 8048c8) + f14
> >fee91044 nioqrc (7f250c, ff257314, c27, 7fcd08, 0, 7f9088) + 6c0
> >fefca86c ttcdrv (7f5c78, 7f5fbc, ff257314, 7f5fac, 7f5fae, 7f5fac) + b7c
> >fee996ec nioqwa (7f250c, 0, fefc9cf0, 7f5c78, 7f2490, 0) + 48
> >fed19330 upirtrc (7f2468, 7f250c, 0, 0, 7f1df4, 7f2544) + 7f8
> >fed6ab60 kpurcsc (7f25e8, ffbe9cae, 7f242c, 7f5c78, 7f6804, 0) + 70
> >fedccd7c kpufch0 (1f5b2c0, 2, 7f2468, 7f5b74, 1f5b308, 6340) + a68
> >fedcdc28 kpufch (1f5b2c0, 7f1f0c, 0, ff257314, 7f242c, 2) + 92c
> >fecebec4 sqlcucFetch (1f5a5d0, ff2848b4, 1, 1f5a5d0, 0, 7f1f0c) + 20
> >fecda5b4 sqlall (ff2848b4, 1, 22a0c28, 16, 58, 16) + 1bf4
> >fecd17a8 sqlnst (382a62, ffbedb50, ffbedb50, 1, ff2848b4, a7f0f0) + 7b44
> >fecb3d3c sqlcmex (0, a7f0f0, ffbedb50, 37d918, ff22d13c, ff257314) + 488
> >fecb4180 sqlcxt (0, 7615ec, ffbedb50, 37d918, 37d800, d0) + 58
> >
> >We run pstack repeatedly and the client is invariably in the above
> >function.
> >
> >The client uses a local bequeath connection, so I assume that the read()
> >is reading from a pipe. I've noticed that the third parameter of read(),
> >the number of bytes, is always 810. It is possible that this is the
> >bottleneck. It is the SDU that controls the number of bytes read. We have
> >not set the SDU, and the Oracle documentation says that the default value
> >is 2048. But then why are only 810 bytes read? (We have run pstack many
> >times and it is always 810 bytes.)
> >
> >Even if the read size needs tuning, this does not explain why the client
> >processes do not scale. The performance increases reasonably linearly up
> >to eight parallel processes, but beyond this the increase in throughput
> >slopes off, and beyond 14 processes there is no increase in throughput at
> >all, even though the oracle server is idle and there are plenty of free
> >resources.
> >
> >We were told that there was a bug when using a 32-bit client to access a
> >64-bit server, but we have installed the relevant patch 2298968 and this
> >did not improve the situation.
> >
> >We do not have source code for the client application, but as it calls
> >sqlcxt I assume it uses Pro*C. Any similar experience, or any idea of
> >what could be causing the problem?
> >
> >Ciaran
> >
>
> Most likely your analysis is completely inadequate, and you are just
> barking up the wrong tree. The influence of sqlnet on total
> performance is completely negligible. Also, your assumption that sqlnet
> should always transmit 2048 bytes demonstrates you don't know anything
> about the interaction between client and server. A sqlnet packet will
> never ever cross statement boundaries. If you see 810 byte packets all
> the time, the database only has 810 bytes to send for that statement.
> For your sqlnet analysis you should also rather use the trcasst tool
> than pstack, and if you insist on using pstack you should trace the
> dedicated server process serving your client instead of the client.
> You should also check what is going on in terms of sql statements,
> and you should check disk I/O.
> My crystal ball tells me that if you perform an adequate analysis, you
> will find you are dealing with an unscalable application.
>
I would say our analysis is inadequate; it's more a last-ditch, desperate attempt to solve a problem that we have been unable to solve for the last six months. The pstack output always shows that the client is doing a fetch, although each time for different SQL.
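To make the repeated pstack snapshots more than anecdotal, the samples can be aggregated into a crude profile. This is an editor's sketch, not part of the original post; it assumes the Solaris pstack frame layout shown above (hex address, then function name), with concatenated samples fed in on stdin:

```shell
# Poor-man's profiler: count how often each function appears across many
# pstack samples; the most frequent frames are where the process spends
# its time. Field layout assumed from the trace above.
awk '/^[0-9a-f]+ /{counts[$2]++}
     END {for (f in counts) print counts[f], f}' |
sort -rn | head
```

Running a minute's worth of samples (e.g. a loop of `pstack $pid; sleep 1`) through this gives a rough ranking; if read dominates across many samples, the client really is blocked on the wire rather than burning CPU.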
After 6 months of tuning the oracle server and various SQL, we have the following typical statspack report for 1 hour:

                                                               Avg
                                         Total Wait    wait   Waits
Event                            Waits  Timeouts   Time (cs)   (ms)   /txn
---------------------------- ---------- --------- ----------- ----- ------
db file sequential read         709,534         0     485,112     7   33.2
latch free                       34,194    29,875      47,577    14    1.6
db file parallel write           13,493         0      24,937    18    0.6
log file sync                    22,394         0      21,367    10    1.0
SQL*Net more data to client     848,722         0      16,448     0   39.7
log file switch completion          181        37      13,305   735    0.0
enqueue                              75        23      13,091  1745    0.0
log file parallel write          21,532         0       9,684     4    1.0
log file sequential read         65,552         0       3,883     1    3.1
buffer busy waits                 2,499         0       1,422     6    0.1
direct path read                 15,594         0         704     0    0.7
file identify                        36         0         564   157    0.0
file open                        25,192         0         555     0    1.2
control file parallel write       1,270         0         218     2    0.1
db file scattered read              429         0         136     3    0.0
LGWR wait for redo copy             367         7          26     1    0.0
control file sequential read        697         0          18     0    0.0
row cache lock                       85         0           7     1    0.0
direct path write                 3,147         0           5     0    0.1
log file single write                16         0           4     3    0.0
refresh controlfile command          10         0           2     2    0.0
SQL*Net break/reset to clien          2         0           0     0    0.0
db file single write                  2         0           0     0    0.0
SQL*Net message from client  14,090,519         0  26,082,903    19  658.8
PX Idle Wait                     21,012    21,012   4,307,460  2050    1.0
SQL*Net message to client    14,090,518         0       7,631     0  658.8
SQL*Net more data from clien    442,221         0       2,047     0   20.7
Note that 75% of wait time is for db file sequential read, and with an average wait time of 7ms this looks reasonable. Most of the reads come from a single index cluster table with 300 million rows, so there is not much we can do to reduce the number of physical reads. Anyway, the volumes where the index cluster is located are never more than 15% busy (we're using multiple fibrechannel connections to an enterprise storage array with 24G of internal cache; it's fast).
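As a sanity check on that 7 ms figure (editor's arithmetic, not in the original report): statspack reports wait time in centiseconds, so the average in milliseconds is time_cs * 10 / waits:

```shell
# db file sequential read: 485,112 cs of wait over 709,534 waits.
# Centiseconds * 10 = milliseconds.
awk 'BEGIN { printf "%.1f ms avg\n", 485112 * 10 / 709534 }'
```

This comes out at about 6.8 ms, consistent with the 7 ms shown, and a plausible single-block read latency for spinning disk behind a cached array.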
We were worried about the latch free waits, which are consistently 7.5% of the total wait time. The latch stats are as follows:

                                    Get                      Spin &
Latch Name                     Requests    Misses   Sleeps   Sleeps 1->4
-------------------------- ------------ --------- -------- ---------------------
cache buffers chains         94,983,959    44,728    1,670  43094/1608/19/7/0
library cache                 8,691,838    34,134   10,298  27914/2268/3829/123/0
checkpoint queue latch        6,135,872    25,742   21,883  3868/21865/9/0/0
redo allocation               5,090,786    16,847      111  16736/111/0/0/0
session idle bit             28,284,866     6,229       79  6150/79/0/0/0
dml lock allocation           1,284,074     1,529       15  1514/15/0/0/0
active checkpoint queue la       25,554       336       14  322/14/0/0/0
enqueue hash chains           1,421,549       334       17  317/17/0/0/0
undo global data                271,350       164        9  155/9/0/0/0
redo writing                    135,560       130        8  122/8/0/0/0
messages                        201,923       115        2  113/2/0/0/0
cache buffers lru chain         676,086        69       45  24/45/0/0/0
shared pool                     105,216        50       40  29/5/13/3/0
session allocation              109,383        41        3  38/3/0/0/0
There is some library cache latch contention. Our application vendor has done a lot of work to reduce the amount of parsing done by our application to an absolute minimum. It always uses bind variables, and some SQL that unexpectedly caused high parsing has been rewritten. There's nothing more we can do in this area. The remaining contention on the library cache is now caused by different applications accessing the same database.
A big chunk of the latch free waits are on the checkpoint queue latch. I don't believe this directly impacts the user processes. We do, however, plan to reduce the number of database writers to see if we can reduce this.
So, given the above analysis, we are not so worried about the latch free waits.
Bear in mind that oracle spent a lot of its time idle during the statspack interval. See the number of SQL*Net message from client wait events.
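Those SQL*Net message from client numbers also put a figure on how chatty the workload is over the one-hour interval (editor's back-of-envelope, not from the report):

```shell
# 14,090,519 "SQL*Net message from client" waits in a 3,600 s statspack
# interval: the clients between them are making roughly 3,900
# client/server round trips per second.
awk 'BEGIN { printf "%.0f round trips/sec\n", 14090519 / 3600 }'
```

A fetch loop pulling a small number of rows per round trip would be consistent with exactly this pattern: the server mostly idle, the clients mostly asleep in read() waiting for the next reply.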
So it is following the above that I turned to using pstack to trace what the application is doing. As I said, oracle spends a lot of its time idle, and the client application spends a lot of its time sleeping, so I think it makes sense to try to find out what it is sleeping on, if it is not waiting for oracle.
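One detail worth re-checking in those pstack snapshots (an editor's observation, not raised in the thread): Solaris pstack prints frame arguments in hexadecimal, so the third argument to read() would be 0x810, not 810 decimal:

```shell
# 0x810 bytes = 2064 decimal -- almost exactly the documented default
# SDU of 2048 plus a small Net8 packet-header overhead, which would
# mean the reads are in fact full SDU-sized buffers.
printf '%d\n' 0x810
```

If that reading is right, the recurring "810" is not evidence of a mis-sized SDU at all.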
Ciaran
>
> --
> Sybrand Bakker, Senior Oracle DBA
Received on Mon Feb 09 2004 - 21:53:02 CST