Re: Client sqlplus SELECT + SPOOL knocks down server network

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Mon, 10 Mar 2008 06:12:45 -0700 (PDT)
Message-ID: <de52ff80-2e50-4aff-ad75-3273ecfb19ce@o77g2000hsf.googlegroups.com>


On Mar 10, 7:51 am, Carlos <miotromailcar..._at_netscape.net> wrote:
> Hello all:
>
> I'm facing a strange problem on a 10.2.0.3 Oracle on Oracle Enterprise
> Linux 5 server.
>
> This morning someone was issuing a SELECT from a windows client via
> ODBC and suddenly all the server communications broke down: even no
> ping from or to that server worked.
>
> Tried stop&start network but no avail. The server appears to be dead
> at communications until reboot.
>
> So I tried to repeat the SELECT from sqlplus on a windows client to a
> OS file with SPOOL and all the communications on the server are gone
> once again. Ping only answers to 127.0.0.1 or its own IP address, and
> Linux is telling me that the network card is OK. Stop&start network
> didn't fix the problem either.
>
> Client sqlnet.log shows:
>
> Fatal NI connect error 12170.
>
>   VERSION INFORMATION:
>         TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
>         Windows NT TCP/IP NT Protocol Adapter for 32-bit Windows: Version
> 10.2.0.1.0 - Production
>   Time: 10-MAR-2008 12:11:59
>   Tracing not turned on.
>   Tns error struct:
>     ns main err code: 12535
>     TNS-12535: Message 12535 not found; No message file for
> product=NETWORK, facility=TNS
>     ns secondary err code: 12560
>     nt main err code: 505
>     TNS-00505: Message 505 not found; No message file for
> product=NETWORK, facility=TNS
>     nt secondary err code: 60
>     nt OS err code: 0
>   Client address: <unknown>
>
> The select is a two column (varchar2(9), char(1)) select of 11 million
> rows aprox.
>
> Could anybody tell me where to search next? (I didn't find anything at
> metalink).
>
> TIA.
>
> Cheers.
>
> Carlos.

If it does not cause too many issues dealing with another lock-up, you might try enabling a 10046 trace at level 12 and a simultaneous 10053 trace at level 1 for the SQL*Plus session, and then execute the SQL statememt. The 10053 trace should output the expected plan for the SQL statement, including the expected number of rows for each operation. If the session crashes, the data generated by 10046 trace portion of the file should provide information indicating why the session crashed.

It might be possible that the table/index stats are suggesting that Oracle should use a merge Cartesian join, when it should not, which causes excessive temp tablespace usage, which causes the server's hard drives to fill to capacity, which causes... It could also be a problem with having CURSOR_SHARING set to FORCE.

Have you looked for trace files in the udump, bdump, and cdump directories on the server? Also, have you looked at the alert log?

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc. Received on Mon Mar 10 2008 - 08:12:45 CDT

Original text of this message