
RE: Re[2]: Sun Boxes Crashing

From: Linda Wang <lwang_at_messagemedia.com>
Date: Thu, 7 Sep 2000 12:52:05 -0600
Message-Id: <10612.116442@fatcity.com>


Our production E4500s have been moved to a computer center room for cooling purposes, but one of them still reboots or gets a CPU panic several times a month.

-----Original Message-----

From: root_at_fatcity.com [mailto:root_at_fatcity.com] On Behalf Of Tommy Pham
Sent: Thursday, September 07, 2000 12:56 PM
To: Multiple recipients of list ORACLE-L
Subject: Re[2]: Sun Boxes Crashing

We have had the parameters set since day one. They didn't correct the problem. One
thing that has helped keep our system (E6500) up and running is cooling down our
computer room. prtdiag -v shows the memory/cpu boards are in the low 30's F.
We still don't have an answer from Sun.

TP
Sr. Apps DBA / Solaris Systems Admin.

        ,--o         ,--o         ,--o          ,--o
      _-\_<_        _-\_<_       _-\_<_       _-\_<_  let's bike!!!
     (*)/-'(*) ___ (*)/-'(*) __ (*)/-'(*) __ (*)/-(*)



____________________Reply Separator____________________
Author: ORACLE-L_at_fatcity.com
Date:       9/6/00 10:20 PM



Sounds exactly like the type of problem we experienced recently (same environment). Here is a response direct from Sun's techos:

I've had a look at the crash dump and the problem appears to be that a kernel thread's stack overflowed.

This, unfortunately, happens occasionally if you run with too many levels of different filesystems/device layers. (In your case vxfs/vxio/sd)

The only real way to fix/workaround the problem is to increase the size of the
kernel stacks.

I'd suggest you add

set rpcmod:svc_run_stksize=0x4000
set lwp_default_stksize=0x4000

to /etc/system, then reboot the domain for the change to take effect.

The only reason it happened is that the kernel thread was interrupted to service a disk request. This pushed the stack over the normal 0x2000 limit.
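To make the sizes concrete, here is a minimal Python sketch that reads `set` lines like the ones suggested above and compares each value with the 0x2000 limit mentioned in the reply. The function name `parse_set_lines` and the sample text are hypothetical, the parser handles only simple `set [module:]symbol=value` assignments, and it assumes the values are byte counts and that comment lines start with `*` or `#`.

```python
import re

DEFAULT_STACK_SIZE = 0x2000  # the "normal" limit quoted in the reply above

# Simplified pattern for lines of the form: set [module:]symbol=value
SET_LINE = re.compile(r"^set\s+(?:(\w+):)?(\w+)\s*=\s*(\S+)$")


def parse_set_lines(text):
    """Return {tunable_name: int_value} for each simple `set` assignment."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks and comment lines (assumed to start with '*' or '#').
        if not line or line[0] in "*#":
            continue
        match = SET_LINE.match(line)
        if match is None:
            continue  # not a simple assignment; ignored by this sketch
        module, symbol, value = match.groups()
        name = f"{module}:{symbol}" if module else symbol
        settings[name] = int(value, 0)  # base 0 accepts 0x... hex and decimal
    return settings


# Hypothetical sample mirroring the two lines suggested in the reply.
sample = """\
* kernel stack sizes suggested in the reply above
set rpcmod:svc_run_stksize=0x4000
set lwp_default_stksize=0x4000
"""

for name, size in parse_set_lines(sample).items():
    ratio = size // DEFAULT_STACK_SIZE
    print(f"{name}: {size} bytes ({ratio}x the 0x2000 limit)")
```

Under those assumptions, both suggested settings come out at 16384 bytes, twice the 8192-byte (0x2000) limit that the interrupted thread overflowed.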

The crash was sufficient to corrupt one of our largest datafiles and appeared to
occur after running a large insert batch job in parallel.

We only made these changes yesterday afternoon, so obviously it's too early to
say whether they have fixed the problem!

Regards
Grant

"Rama Malladi" <rmalladi_at_inteliant.com> on 07/09/2000 08:40:38

Please respond to ORACLE-L_at_fatcity.com

To: Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com>
cc: (bcc: GRANT G HOLYOAKE/NSO/CSDA)

We have several Sun boxes (Solaris 2.6) running Oracle 8 and 8i. One of the boxes (description given below) kept rebooting, and this machine happens to run one of the most critical billing systems (Murphy's law!).

Overall, this machine rebooted some 40 times in a period of 2 months, and some nights it rebooted as many as 10 times! Our SysAdmin contacted Sun engineers, but they never told us what exactly the problem was and kept replacing CPUs, memory boards, SCSI cards, etc. This happened several times, and last week there was an article in Computer Weekly magazine saying several customers were having this kind of problem on Sun boxes and Sun tried to hush up the matter ...!!

Has anybody else faced this kind of situation?

Just curious ...
Rama



System Configuration: Sun Microsystems sun4u 8-slot Sun Enterprise E4500/E5500
SunOS uscaelmux06 5.6 Generic_105181-21 sun4u sparc SUNW,Ultra-Enterprise

--

Author: Rama Malladi
  INET: rmalladi_at_inteliant.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists

--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).

--

Author:
  INET: grant.g.holyoake_at_centrelink.gov.au

Received on Thu Sep 07 2000 - 13:52:05 CDT
