
Re[2]: Sun Boxes Crashing

From: Tommy Pham <TPham_at_specialized.com>
Date: Thu, 07 Sep 2000 10:52:09 -0700
Message-Id: <10612.116430@fatcity.com>



We have had the parameters set since day one, and they didn't correct the problem. One thing that has helped keep our system (E6500) up and running is cooling down our computer room. prtdiag -v shows the memory/CPU boards are in the low 30's F. We still don't have an answer from Sun.

TP
Sr. Apps DBA / Solaris Systems Admin.

        ,--o         ,--o         ,--o          ,--o
      _-\_<_        _-\_<_       _-\_<_       _-\_<_  let's bike!!! 
     (*)/-'(*) ___ (*)/-'(*) __ (*)/-'(*) __ (*)/-(*)        



____________________Reply Separator____________________
Subject: Re: Sun Boxes Crashing
Author: ORACLE-L_at_fatcity.com
Date: 9/6/00 10:20 PM

Sounds exactly like the type of problem we experienced recently (same environment)...here is a response direct from Sun techo's:

I've had a look at the crash dump and the problem appears to be that a kernel thread's stack overflowed.

This, unfortunately, happens occasionally if you run with too many levels of different filesystems/device layers. (In your case vxfs/vxio/sd)

The only real way to fix/workaround the problem is to increase the size of the kernel stacks.

I'd suggest you add

set rpcmod:svc_run_stksize=0x4000
set lwp_default_stksize=0x4000

to /etc/system, then reboot the domain for the change to take effect.

The only reason it happened is that the kernel thread was interrupted to service a disk request, which pushed the stack over the normal 0x2000 limit.
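For anyone who wants to sanity-check the numbers, here is a rough Python sketch that reads /etc/system-style "set" lines and prints each stack size in bytes next to the 0x2000 limit mentioned above. The parse_tunables helper and the SAMPLE text are made up for illustration (they are not a Sun tool), and the sketch only parses text; it does not read or change anything on a live system.

```python
import re

# Matches /etc/system lines such as "set rpcmod:svc_run_stksize=0x4000".
# The optional "module:" prefix names the kernel module that owns the variable.
SET_LINE = re.compile(r"^\s*set\s+(?:(\w+):)?(\w+)\s*=\s*(\S+)")


def parse_tunables(text):
    """Return {(module_or_None, variable): int_value} for each 'set' line."""
    tunables = {}
    for line in text.splitlines():
        # Lines beginning with '*' are comments in /etc/system.
        if line.lstrip().startswith("*"):
            continue
        m = SET_LINE.match(line)
        if m:
            module, name, value = m.groups()
            tunables[(module, name)] = int(value, 0)  # base 0 accepts 0x...
    return tunables


# Hypothetical file contents: the two settings quoted in this thread.
SAMPLE = """\
* kernel stack sizes suggested in the thread
set rpcmod:svc_run_stksize=0x4000
set lwp_default_stksize=0x4000
"""

NORMAL_STKSIZE = 0x2000  # the "normal" limit quoted in the thread

for (module, name), size in parse_tunables(SAMPLE).items():
    label = f"{module}:{name}" if module else name
    print(f"{label}: {size} bytes ({size // 1024} KB), "
          f"{size // NORMAL_STKSIZE}x the 0x2000 limit")
```

In other words, 0x2000 is 8192 bytes (8 KB) and the suggested 0x4000 is 16384 bytes (16 KB), so the change doubles the kernel stack size.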

The crash was sufficient to corrupt one of our largest datafiles and appeared to occur after running a large insert batch job in parallel.

We only made these changes yesterday afternoon so obviously it's too early to qualify!

Regards
Grant

"Rama Malladi" <rmalladi_at_inteliant.com> on 07/09/2000 08:40:38

Please respond to ORACLE-L_at_fatcity.com

To: Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com> cc: (bcc: GRANT G HOLYOAKE/NSO/CSDA)

We have several Sun boxes (Solaris 2.6) running Oracle 8 and 8i. One of the boxes (description given below) kept rebooting, and this machine happens to run one of our most critical billing systems (Murphy's law!).

Overall, this machine rebooted some 40 times in a period of 2 months, and on some nights it rebooted as many as 10 times! Our SysAdmin contacted Sun engineers, who never told us exactly what the problem was and kept replacing CPUs, memory boards, SCSI cards, etc. This happened several times, and last week there was an article in Computer Weekly magazine saying several customers were having this kind of problem on Sun boxes and that Sun tried to hush up the matter ...!!

Has anybody else faced this kind of situation?

Just curious ...
Rama



System Configuration: Sun Microsystems sun4u 8-slot Sun Enterprise E4500/E5500
SunOS uscaelmux06 5.6 Generic_105181-21 sun4u sparc SUNW,Ultra-Enterprise

--

Author: Rama Malladi
  INET: rmalladi_at_inteliant.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists

--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).

--

Author:
  INET: grant.g.holyoake_at_centrelink.gov.au


Received on Thu Sep 07 2000 - 12:52:09 CDT
