Re: Client sqlplus SELECT + SPOOL knocks down server network
From: DA Morgan <damorgan_at_psoug.org>
Date: Mon, 10 Mar 2008 10:23:18 -0700
Message-ID: <1205169794.340978@bubbleator.drizzle.com>
>> Hello all:
>>
>> I'm facing a strange problem on a 10.2.0.3 Oracle on Oracle Enterprise
>> Linux 5 server.
>>
>> This morning someone was issuing a SELECT from a windows client via
>> ODBC and suddenly all the server communications broke down: even no
>> ping from or to that server worked.
>>
>> Tried stop&start network but no avail. The server appears to be dead
>> at communications until reboot.
>>
>> So I tried to repeat the SELECT from sqlplus on a windows client to a
>> OS file with SPOOL and all the communications on the server are gone
>> once again. Ping only answers to 127.0.0.1 or its own IP address, and
>> Linux is telling me that the network card is OK. Stop&start network
>> didn't fix the problem either.
>>
>> Client sqlnet.log shows:
>>
>> Fatal NI connect error 12170.
>>
>> VERSION INFORMATION:
>> TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
>> Windows NT TCP/IP NT Protocol Adapter for 32-bit Windows: Version
>> 10.2.0.1.0 - Production
>> Time: 10-MAR-2008 12:11:59
>> Tracing not turned on.
>> Tns error struct:
>> ns main err code: 12535
>> TNS-12535: Message 12535 not found; No message file for
>> product=NETWORK, facility=TNS
>> ns secondary err code: 12560
>> nt main err code: 505
>> TNS-00505: Message 505 not found; No message file for
>> product=NETWORK, facility=TNS
>> nt secondary err code: 60
>> nt OS err code: 0
>> Client address: <unknown>
>>
>> The select is a two column (varchar2(9), char(1)) select of 11 million
>> rows aprox.
>>
>> Could anybody tell me where to search next? (I didn't find anything at
>> metalink).
>>
>> TIA.
>>
>> Cheers.
>>
>> Carlos.
Date: Mon, 10 Mar 2008 10:23:18 -0700
Message-ID: <1205169794.340978@bubbleator.drizzle.com>
Charles Hooper wrote:
> On Mar 10, 7:51 am, Carlos <miotromailcar..._at_netscape.net> wrote:
>> Hello all:
>>
>> I'm facing a strange problem on a 10.2.0.3 Oracle on Oracle Enterprise
>> Linux 5 server.
>>
>> This morning someone was issuing a SELECT from a windows client via
>> ODBC and suddenly all the server communications broke down: even no
>> ping from or to that server worked.
>>
>> Tried stop&start network but no avail. The server appears to be dead
>> at communications until reboot.
>>
>> So I tried to repeat the SELECT from sqlplus on a windows client to a
>> OS file with SPOOL and all the communications on the server are gone
>> once again. Ping only answers to 127.0.0.1 or its own IP address, and
>> Linux is telling me that the network card is OK. Stop&start network
>> didn't fix the problem either.
>>
>> Client sqlnet.log shows:
>>
>> Fatal NI connect error 12170.
>>
>> VERSION INFORMATION:
>> TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
>> Windows NT TCP/IP NT Protocol Adapter for 32-bit Windows: Version
>> 10.2.0.1.0 - Production
>> Time: 10-MAR-2008 12:11:59
>> Tracing not turned on.
>> Tns error struct:
>> ns main err code: 12535
>> TNS-12535: Message 12535 not found; No message file for
>> product=NETWORK, facility=TNS
>> ns secondary err code: 12560
>> nt main err code: 505
>> TNS-00505: Message 505 not found; No message file for
>> product=NETWORK, facility=TNS
>> nt secondary err code: 60
>> nt OS err code: 0
>> Client address: <unknown>
>>
>> The select is a two column (varchar2(9), char(1)) select of 11 million
>> rows aprox.
>>
>> Could anybody tell me where to search next? (I didn't find anything at
>> metalink).
>>
>> TIA.
>>
>> Cheers.
>>
>> Carlos.
> > If it does not cause too many issues dealing with another lock-up, you > might try enabling a 10046 trace at level 12 and a simultaneous 10053 > trace at level 1 for the SQL*Plus session, and then execute the SQL > statememt. The 10053 trace should output the expected plan for the > SQL statement, including the expected number of rows for each > operation. If the session crashes, the data generated by 10046 trace > portion of the file should provide information indicating why the > session crashed. > > It might be possible that the table/index stats are suggesting that > Oracle should use a merge Cartesian join, when it should not, which > causes excessive temp tablespace usage, which causes the server's hard > drives to fill to capacity, which causes... It could also be a > problem with having CURSOR_SHARING set to FORCE. > > Have you looked for trace files in the udump, bdump, and cdump > directories on the server? Also, have you looked at the alert log? > > Charles Hooper > IT Manager/Oracle DBA > K&M Machine-Fabricating, Inc.
I can see everything you listed as a potential issue but I'm not sure how any of them could whack an IP address.
-- Daniel A. Morgan Oracle Ace Director & Instructor University of Washington damorgan_at_x.washington.edu (replace x with u to respond) Puget Sound Oracle Users Group www.psoug.orgReceived on Mon Mar 10 2008 - 12:23:18 CDT