RE: tmon

From: Noveljic Nenad <nenad.noveljic_at_vontobel.ch>
Date: Wed, 20 Sep 2017 08:40:55 +0000
Message-ID: <18777_1505896860_59C2299C_18777_1093_1_ECDEF0CC6716EC4596FCBC871F48292AB18FD14A_at_ZRH-S231>


I think I've figured out what happened here.

Once again, the message in tmon.trc was:

TMON (ospid: 14954): terminating the instance due to error 472

Error 472 actually means that PMON died (this information is the key; I should have looked up the error message earlier!):

oerr ora 472
00472, 00000, "PMON process terminated with error"
// *Cause: The process cleanup process died

TMON just happened to be the first process to notice that PMON had disappeared, and it requested the abnormal instance termination. The problem can be reproduced by killing PMON with kill -9. Even the cleanup stack looks identical.
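
To illustrate the lookup described above, here is a minimal Python sketch that splits a termination line like the one from tmon.trc into the reporting process, its OS pid and the error number, and maps the number to its message text. The function name, the regular expression and the one-entry lookup table are assumptions made for illustration; the only message text included is the one oerr printed, and oerr remains the authoritative source for anything else.

```python
import re

# Matches lines such as:
#   TMON (ospid: 14954): terminating the instance due to error 472
TERMINATION_RE = re.compile(
    r"^(?P<process>\w+) \(ospid: (?P<ospid>\d+)\): "
    r"terminating the instance due to error (?P<error>\d+)"
)

# Hypothetical lookup table; only the entry quoted from oerr is filled in.
KNOWN_ERRORS = {
    472: "PMON process terminated with error",
}


def parse_termination_line(line):
    """Return a dict describing the termination message, or None if no match."""
    match = TERMINATION_RE.match(line.strip())
    if match is None:
        return None
    error = int(match.group("error"))
    return {
        # The process that noticed the problem, not necessarily the one that died.
        "reporter": match.group("process"),
        "ospid": int(match.group("ospid")),
        "error": error,
        "ora_code": f"ORA-{error:05d}",
        "message": KNOWN_ERRORS.get(error, "unknown, check with oerr"),
    }


if __name__ == "__main__":
    print(parse_termination_line(
        "TMON (ospid: 14954): terminating the instance due to error 472"
    ))
```

The process named at the start of the line is only the reporter; it is the error number that says which process actually died.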

In the end, it turned out that PMON had been killed by a dodgy application (a problem similar to http://nenadnoveljic.com/blog/avaloq-database-crash/ ).

Thanks to all who provided useful pieces of information, challenged my reasoning and by doing so nudged me to revisit the problem over and over!

Nenad

Twitter: _at_NenadNoveljic
Home page: http://nenadnoveljic.com

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Noveljic Nenad
Sent: Tuesday, 19 September 2017 20:51
To: 'Yong Huang'; oracle-l_at_freelists.org
Subject: RE: tmon

Hey Yong,

Thank you for your much appreciated insight into the functions on the cleanup stack.

I've already opened an SR and will keep you posted.

Besides that, I keep kill.d running in the background so that, if the problem occurs again, any interference by the application (such as in http://nenadnoveljic.com/blog/avaloq-database-crash/ ) gets recorded.

Nenad

Twitter: _at_NenadNoveljic
Home page: http://nenadnoveljic.com

-----Original Message-----

From: Yong Huang [mailto:yong321_at_yahoo.com]
Sent: Tuesday, 19 September 2017 17:09
To: oracle-l_at_freelists.org
Cc: Noveljic Nenad
Subject: Re: tmon

Hi Nenad,

I searched MOS for your call stack but couldn't find a good match. There are a few hits for Oracle 11gR2, but they have an additional function, ksuinstalive, between ksuitm and ksumcl (e.g. Bugs 16426985 and 10179554). One hit is for Oracle 12.1 (Bug 18077020); it has no such intermediate function, but it has ktsj_smco_main further down the stack. Anyway, I suggest you open an SR and have Support take a look.

Your stack:
ksedsts: error handling, always ignore
kjzdicrshnfy: crash notification

ksuitm: some kind of timeout
ksumcl: not sure, process memory cleanup?
ksbcti: "call timeout/interrupts" according to various bug reports
ksbabs: "Background process: Action based server"
ksbrdp: "run a detached (background) process"

ksumcl may be interesting. But if it's already doing cleanup, then this stack is not helpful.

If Oracle Support finds anything interesting, let us know. Thanks.

Yong Huang



Please consider the environment before printing this e-mail.

Important Notice
This message is intended only for the individual named. It may contain confidential or privileged information. If you are not the named addressee you should in particular not disseminate, distribute, modify or copy this e-mail. Please notify the sender immediately by e-mail, if you have received this message by mistake and delete it from your system.  

E-mail transmission may not be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also processing of incoming e-mails cannot be guaranteed. All liability of the Vontobel Group and its affiliates for any damages resulting from e-mail use is excluded. You are advised that urgent and time sensitive messages should not be sent by e-mail and if verification is required please request a printed version.


Received on Wed Sep 20 2017 - 10:40:55 CEST
