Listener problems on 9.2/Linux
We have a serious problem on a production site that was recently upgraded to SLES8 (2.4.19-233 smp kernel) and to 9.2.0.2.
The database runs in dedicated server mode with about 150 connected users, plus many batch processes from an automated warehouse management system, which make very short connections to the database.
We are now facing the same problem we had on our previous production installation, but this time we have no workaround.
On the previous SUSE 7.3 installation with 9.2.0.1 we offloaded the listener by creating 3 additional listeners, and that helped.
Namely, the listener hangs from time to time under load. The system accepts about 10,000 connection requests per day, spread among 4 listeners. When a listener stops working, I can see nothing relevant in the trace or log files, but the command "ps -elf" gives the following output:
000 S oracle 7953 1 0 75 0 - 3379 pipe_w 05:00 ? 00:00:00 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER1 -inherit
000 S oracle 7959 1 0 76 0 - 3378 schedu 05:00 ? 00:00:04 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER3 -inherit
000 S oracle 20628 1 0 75 0 - 3385 schedu 10:53 pts/0 00:00:03 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER2 -inherit
000 S oracle 24149 1 0 75 0 - 3379 schedu 11:32 pts/2 00:00:00 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER -inherit
Note the WCHAN field: the normally running listeners show "schedu", while the dead one shows "pipe_w". It seems that the process is waiting indefinitely for something from Linux (or Oracle?).
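The WCHAN check described above could be automated roughly as follows. This is only a sketch: the function name find_stuck_listeners is made up for illustration, the field positions assume the "ps -elf" column layout shown in the output above, and treating "schedu" as the only healthy wait channel is just the observation from this system, not a documented rule.

```python
# Sample "ps -elf" lines copied from the output above.
SAMPLE_PS = """\
000 S oracle 7953 1 0 75 0 - 3379 pipe_w 05:00 ? 00:00:00 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER1 -inherit
000 S oracle 7959 1 0 76 0 - 3378 schedu 05:00 ? 00:00:04 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER3 -inherit
000 S oracle 20628 1 0 75 0 - 3385 schedu 10:53 pts/0 00:00:03 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER2 -inherit
000 S oracle 24149 1 0 75 0 - 3379 schedu 11:32 pts/2 00:00:00 /opt/oracle/product/9ir2/bin/tnslsnr LISTENER -inherit
"""


def find_stuck_listeners(ps_output, healthy_wchan=("schedu",)):
    """Return (pid, listener_name, wchan) for tnslsnr processes whose WCHAN
    is not in healthy_wchan. Assumed column order:
    F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD."""
    stuck = []
    for line in ps_output.splitlines():
        # Split into at most 15 fields so the command line stays in one piece.
        fields = line.split(None, 14)
        if len(fields) < 15 or "tnslsnr" not in fields[14]:
            continue
        pid, wchan = int(fields[3]), fields[10]
        cmd_parts = fields[14].split()
        name = cmd_parts[1] if len(cmd_parts) > 1 else "?"
        if wchan not in healthy_wchan:
            stuck.append((pid, name, wchan))
    return stuck


if __name__ == "__main__":
    for pid, name, wchan in find_stuck_listeners(SAMPLE_PS):
        print(f"{name} (pid {pid}) is waiting in {wchan}")
```

On a live system the input would come from the real "ps -elf" output instead of the pasted sample; the set of "healthy" wait channels would likely need adjusting for other kernels.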
Anyway, the last few lines of the trace file before the listener died may help:
[03-MAR-2003 11:50:22:900] sntpcall: hdl[IR]=15, hdl[IW]=14
[03-MAR-2003 11:50:22:900] nsbeqaddr: doing connect handshake...
[03-MAR-2003 11:50:22:900] nsbequeath: doing connect handshake...
[03-MAR-2003 11:50:22:902] nsbequeath: NSE=0
[03-MAR-2003 11:50:22:902] nsbeqaddr: connect handshake is complete
[03-MAR-2003 11:50:22:902] nstimarmed: no timer allocated
[03-MAR-2003 11:50:22:902] nsclose: closing transport
[03-MAR-2003 11:50:22:902] nsclose: global context check-out (from slot 3) complete
[03-MAR-2003 11:50:22:976] nsopen: opening transport...
[03-MAR-2003 11:50:22:976] nttcnp: Validnode Table IN use; err 0x0
[03-MAR-2003 11:50:22:976] nttcnp: getting sockname
[03-MAR-2003 11:50:22:976] nttcnr: waiting to accept a connection.
[03-MAR-2003 11:50:22:976] nttcnr: getting sockname
[03-MAR-2003 11:50:22:976] nttcnr: connected on ipaddr 192.168.254.2
[03-MAR-2003 11:50:22:976] nttvlser: valid node check on incoming node 192.168.254.1
[03-MAR-2003 11:50:22:976] nttvlser: Accepted Entry: 192.168.254.1
[03-MAR-2003 11:50:22:977] nttcon: set TCP_NODELAY on 10
[03-MAR-2003 11:50:22:977] nsopen: transport is open
[03-MAR-2003 11:50:22:977] nsnainit: inf->nsinfflg[0]: 0xd inf->nsinfflg[1]: 0xd
[03-MAR-2003 11:50:22:977] nsopen: global context check-in (to slot 3) complete
[03-MAR-2003 11:50:22:977] nsanswer: deferring connect attempt; at stage 5
[03-MAR-2003 11:50:22:977] nscon: doing connect handshake...
[03-MAR-2003 11:50:22:977] nscon: got NSPTCN packet
[03-MAR-2003 11:50:22:977] nsevdansw: exit
[03-MAR-2003 11:50:22:978] nsbeqaddr: connecting...
[03-MAR-2003 11:50:22:979] sntpcall: About to exec /opt/oracle/product/9ir2/bin/oracle
[03-MAR-2003 11:50:22:994] sntpcall: detaching from parent with additional fork
[03-MAR-2003 11:50:22:994] sntpcall: result string is NTP0 25272
[03-MAR-2003 11:50:22:994] sntpcall: hdl[IR]=15, hdl[IW]=14
[03-MAR-2003 11:50:22:994] nsbeqaddr: doing connect handshake...
[03-MAR-2003 11:50:22:994] nsbequeath: doing connect handshake...
and it dies (no log or trace output beyond this point).
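In the excerpt above, the earlier bequeath handshake is followed by "connect handshake is complete", while the last one is not. A small script could check a trace file for that pattern. The sketch below assumes the line layout seen in this excerpt ("[timestamp] function: message") and matches on the message strings that appear here; the function name pending_bequeath_handshake is hypothetical, and none of this is based on a documented trace format.

```python
import re

# "[timestamp] function: message", as seen in the trace excerpt above.
TRACE_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<func>\w+):\s+(?P<msg>.*)$")

# A few lines copied from the excerpt above.
SAMPLE_TRACE = """\
[03-MAR-2003 11:50:22:900] nsbeqaddr: doing connect handshake...
[03-MAR-2003 11:50:22:900] nsbequeath: doing connect handshake...
[03-MAR-2003 11:50:22:902] nsbequeath: NSE=0
[03-MAR-2003 11:50:22:902] nsbeqaddr: connect handshake is complete
[03-MAR-2003 11:50:22:994] nsbeqaddr: doing connect handshake...
[03-MAR-2003 11:50:22:994] nsbequeath: doing connect handshake...
"""


def pending_bequeath_handshake(trace_text):
    """Return the timestamp of an nsbeqaddr handshake that was started but
    never reported as complete, or None if every handshake completed."""
    pending = None
    for line in trace_text.splitlines():
        match = TRACE_LINE.match(line.strip())
        if not match or match["func"] != "nsbeqaddr":
            continue
        if match["msg"].startswith("doing connect handshake"):
            pending = match["ts"]
        elif match["msg"].startswith("connect handshake is complete"):
            pending = None
    return pending


if __name__ == "__main__":
    print(pending_bequeath_handshake(SAMPLE_TRACE))
```

A non-None result only says where the trace stops; it does not explain why the handshake never finished.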
The "dead" listener does not respond to LSNRCTL; only "kill -9" followed by a restart solves the problem.
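Since a hung listener simply never answers, any automated probe needs a timeout. The following is a minimal sketch of such a probe, not a tested monitoring solution: listener_responds is a made-up helper, the 10-second default is arbitrary, and relying on the exit code of a status command is an assumption that should be verified on the system in question.

```python
import subprocess


def listener_responds(command, timeout_s=10.0):
    """Run a status command and report whether it finished in time.

    Returns False if the command times out, cannot be started, or exits
    with a non-zero status (the exit-code check is an assumption)."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The command hung, as LSNRCTL does against the dead listener.
        return False
    except OSError:
        # The command could not be started at all.
        return False
    return result.returncode == 0


# Possible use against one of the listeners named above:
#   listener_responds(["lsnrctl", "status", "LISTENER1"])
```

A cron job or loop could call this for each listener name and raise an alert when it returns False; whether to kill and restart automatically is a separate decision.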
The problem seems to become more frequent the longer the system has been up. I am not sure, but it did not occur during the first 24 hours (Thursday), it happened once on Friday, and today it has happened about 6 times.
It is not easy to get this solved through Metalink, so if anyone has an idea, please help, since it is affecting users.
Thanks to all in advance,
Tom Sagrak
Received on Mon Mar 03 2003 - 10:52:42 CST