Re: Server unexpect shutdown

From: Michael Austin <maustin_at_firstdbasource.com>
Date: Fri, 17 Oct 2008 21:22:56 -0500
Message-ID: <dMbKk.5758$YU2.5078@nlpi066.nbdc.sbc.com>


DA Morgan wrote:

> GLADtr wrote:

>> Hi all.
>> Help me please!
>> What could this be?
>>
>> SYSTEM REDHAT 4
>> uname -a
>> Linux edw1 2.6.9-67.0.0.0.1.ELsmp #1 SMP Sun Nov 18 01:06:10 EST 2007
>> x86_64 x86_64 x86_64 GNU/Linux
>> Oracle 10.2.0.4
>>
>> I have looked through all the log files.
>> The only suspicious entries are in dmesg:
>>
>> ACPI: Processor [CPU0] (supports C1)
>> ACPI: Processor [CPU1] (supports C1)
>> ACPI: Processor [CPU6] (supports C1)
>> ACPI: Processor [CPU7] (supports C1)
>> ACPI: Processor [CPU8] (supports C1)
>> ACPI: Processor [CPU9] (supports C1)
>> ACPI: Processor [CPUE] (supports C1)
>> ACPI: Processor [CPUF] (supports C1)
>> ________________________________________________
>> Losing some ticks... checking if CPU frequency changed.
>> ------------------------------------------------------------------------------------
>>
>> microcode: CPU0 already at revision 0x16 (current=0x16)
>> microcode: CPU1 already at revision 0x16 (current=0x16)
>> microcode: CPU2 already at revision 0x16 (current=0x16)
>> microcode: CPU3 already at revision 0x16 (current=0x16)
>> microcode: CPU4 already at revision 0x16 (current=0x16)
>> microcode: CPU5 already at revision 0x16 (current=0x16)
>> microcode: CPU6 already at revision 0x16 (current=0x16)
>> microcode: CPU7 already at revision 0x16 (current=0x16)
>> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> and
>>
>> the only errors I found are in the mcelog:
>>
>> MCE 0
>> CPU 7 BANK 4 TSC 4f58cf631fca4
>> MCG status:
>> MCi status:
>> Error enabled
>> MCA:BUS Generic Generic Generic Other-transaction Request-timeout
>> Error
>> Model:Pad address glitch
>>
>> STATUS 9000080110800e0f MCGSTATUS 0
>> MCE 1
>> CPU 6 BANK 4 TSC 4f58cf6321e0a
>> MCG status:
>> MCi status:
>> Error enabled
>> MCA:BUS Generic Generic Generic Other-transaction Request-timeout
>> Error
>> Model:Pad address glitch
>>
>> STATUS 9000080110800e0f MCGSTATUS 0
>> MCE 0
>> CPU 9 BANK 4 TSC 960e0d4f7b60
>> MCG status:
>> MCi status:
>> Error enabled
>> MCA:BUS Generic Generic Generic Other-transaction Request-timeout
>> Error
>> Model:Pad address glitch
>>
>> STATUS 9000080210800e0f MCGSTATUS 0
>> MCE 1
>> CPU 8 BANK 4 TSC 960e0d4f7e26
>> MCG status:
>> MCi status:
>> Error enabled
>> MCA:BUS Generic Generic Generic Other-transaction Request-timeout
>> Error
>> Model:Pad address glitch
>>
>> STATUS 9000080210800e0f MCGSTATUS 0
>> MCE 0
>> CPU 1 BANK 0 TSC 2105f9ef4c1fc
>> MISC 14000240002a4 ADDR 208c54080
>> MCG status:
>> MCi status:
>> Error overflow
>> MCi_MISC register valid
>> MCi_ADDR register valid
>> MCA:Generic CACHE Level-1 Snoop Error
>> STATUS cc00000120040189 MCGSTATUS 0
>> MCE 1
>> CPU 1 BANK 1 TSC 2105f9ef4ee2a
>> MCG status:
>> MCi status:
>> MCA:Data CACHE Level-1 Data-Read Error
>> STATUS 8000001800000135 MCGSTATUS 0
>> /var/log/messages
>> Sep 29 10:30:33 edw1 kernel: SMB connection re-established (-5)
>> Sep 29 10:32:54 edw1 sshd(pam_unix)[19592]: session closed for user
>> oracle
>> Sep 29 11:04:51 edw1 sshd(pam_unix)[2505]: session opened for user
>> root by (uid=0)
>> Sep 29 11:05:22 edw1 observiced: observiced shutdown succeeded
>> Sep 29 11:05:29 edw1 observiced: observiced startup succeeded
>> Sep 29 11:05:57 edw1 sshd(pam_unix)[2505]: session closed for user
>> root
>> Sep 29 11:42:11 edw1 sshd(pam_unix)[17458]: session opened for user
>> oracle by (uid=0)
>>
>> Sep 29 11:45:27 edw1 su(pam_unix)[18877]: session opened for user root
>> by oracle(uid=200)
>> Sep 29 11:45:28 edw1 observiced: observiced shutdown succeeded
>> Sep 29 11:53:44 edw1 su(pam_unix)[18877]: session closed for user root
>> Sep 29 11:53:49 edw1 kernel: SMB connection re-established (-5)
>>
>> Oct 5 04:02:03 edw1 kernel: SMB connection re-established (-5)
>> Oct 5 04:02:05 edw1 snmpd[7107]: Received TERM or STOP signal...
>> shutting down...
>> Oct 5 04:02:05 edw1 snmpd: snmpd shutdown succeeded
>> Oct 5 04:02:05 edw1 snmpd: snmpd startup succeeded
> 
> My goodness a whole lot of irrelevant information. <g>
> 
> Which Red Hat 4 update?
> 
> Is this a new issue or one that existed before applying the 10.2.0.4 patch?
> 
> Have you verified kernel parameters per the Oracle install docs?
> 
> When does it shut down?
> What else is installed?
> What is happening when it shuts down?
> Are the shutdowns random or predictable?
> Does the server have a remote management card?
> Will it shut itself down if Oracle is not started?
> Is there a core dump?
> The more information you provide, the better.

Did it come back up? Does it still reboot?

What about these two mcelog entries, both on CPU 1?

MCA:Generic CACHE Level-1 Snoop Error
MCA:Data CACHE Level-1 Data-Read Error
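
To see whether the machine checks cluster on one CPU or bank, the mcelog records quoted above could be tallied with something like the following. This is only a rough sketch in Python: the function name summarize_mce and the regular expression are my own assumptions, based solely on the record layout visible in the quoted output, not on any documented mcelog format.

```python
import re
from collections import Counter

# Assumed layout: each record has a "CPU <n> BANK <n> ..." line,
# followed later by one "MCA:<description>" line.
CPU_BANK = re.compile(r"CPU (\d+) BANK (\d+)")


def summarize_mce(text):
    """Count machine-check records by (cpu, bank, MCA description)."""
    counts = Counter()
    cpu_bank = None
    for raw in text.splitlines():
        # Drop mail quote markers (">", ">>") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        match = CPU_BANK.match(line)
        if match:
            cpu_bank = (int(match.group(1)), int(match.group(2)))
        elif line.startswith("MCA:") and cpu_bank is not None:
            counts[cpu_bank + (line[len("MCA:"):],)] += 1
            cpu_bank = None  # wait for the next CPU/BANK line
    return counts


if __name__ == "__main__":
    # Two records copied from the quoted mcelog output.
    sample = """\
>> CPU 1 BANK 0 TSC 2105f9ef4c1fc
>> MCA:Generic CACHE Level-1 Snoop Error
>> CPU 1 BANK 1 TSC 2105f9ef4ee2a
>> MCA:Data CACHE Level-1 Data-Read Error
"""
    for (cpu, bank, desc), n in sorted(summarize_mce(sample).items()):
        print(f"CPU {cpu} bank {bank}: {desc} x{n}")
```

Repeated entries for the same CPU and bank would be worth mentioning when answering the questions above.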
