RE: Solaris 10 dtrace command...

From: Johnson, William L (TEIS) <WLJohnson_at_te.com>
Date: Wed, 14 Dec 2011 06:51:47 -0500
Message-ID: <2F161F8A09B99B4ABF8AE832D546E7890E9911CFCB_at_us194mx002.tycoelectronics.net>



We have opened several SRs with Oracle for the database problems. They were never able to identify the source of the problem, always indicating that it was the OS or disk storage. The core files and GUDS output captured by our DBA and UNIX admin teams helped Oracle hardware/OS support determine that we are running into an OS bug. The bug number is 6876962.

This is an excerpt from the service request:

This looks like CR# 6876962 - degraded write performance with threads held up by space_map_load_wait(). This bug is fixed in patch 147440-05, -06 or -07, which is current and contains the fix.

I personally believe we have been running into this bug for more than a year, and we have been pushing Oracle for a resolution for well over a year. The trace and core files we provided finally allowed them to identify the possible source of our problems.



From: Jared Still [mailto:jkstill_at_gmail.com]
Sent: Tuesday, December 13, 2011 9:40 PM
To: Johnson, William L (TEIS)
Cc: ORACLE-L
Subject: Re: Solaris 10 dtrace command...

Any idea why it is taking so long to resolve paths?

And why it takes 7+ minutes to determine that a file is missing?

I would be concerned about my storage.

Or there could be some bug causing issues.

Have you taken this up with the storage admins, or Sun^H^H^HOracle support?

Jared Still
Certifiable Oracle DBA and Part Time Perl Evangelist
Oracle Blog: http://jkstill.blogspot.com
Home Page: http://jaredstill.com

On Mon, Dec 12, 2011 at 5:55 PM, Johnson, William L (TEIS) <WLJohnson_at_te.com<mailto:WLJohnson_at_te.com>> wrote:

I was able to catch the hang in production again. Really ugly this time - one command took 8-9 minutes to execute and a simple lsnrctl status took close to 15 minutes to execute in a production environment.

Here was my command...
truss -d -D -f -o bill2.out ls -lrt

This is a small section of the resulting output... A bad start right out of the gate...

Base time stamp: 1323721323.0651 [ Mon Dec 12 15:22:03 EST 2011 ]
1186: 102.3725 102.3725 resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12

Here was the second incident... (I replaced some of our file system names with variables to protect the innocent... :) ) The wait times are outrageous...

truss -d -D -f -o bill3.out lsnrctl status <listener_name>

Base time stamp: 1323722672.7062 [ Mon Dec 12 15:44:32 EST 2011 ]

9898: 0.0000     0.0000     execve("{ORACLE_HOME}/bin/lsnrctl", 0xFFFFFFFF7FFFF368, 0xFFFFFFFF7FFFF388)  argc = 3
9898: 80.3434     80.3434     resolvepath("/usr/lib/sparcv9/ld.so.1", "/lib/sparcv9/ld.so.1", 1023) = 20
9898: 116.6389    36.2955     resolvepath("{ORACLE_HOME}/bin/lsnrctl", "{ORACLE_HOME}/bin/lsnrctl", 1023) = 46
9898: 116.6403    0.0014     stat("{ORACLE_HOME}/bin/lsnrctl", 0xFFFFFFFF7FFFEF60) = 0
9898: 569.7705    453.1302    open("/var/ld/64/ld.config", O_RDONLY)          Err#2 ENOENT
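
With -d and -D, each truss line carries the elapsed time since the base time stamp followed by the delta since the previous event, so the slow calls can be picked out mechanically. What follows is only a rough sketch in Python, assuming the line layout shown above (pid, elapsed seconds, delta seconds, call); the function name slow_calls and the 1-second threshold are made up for illustration and are not part of truss.

```python
import re

# Assumed layout, taken from the quoted "truss -d -D -f" output:
#   <pid>: <elapsed seconds> <delta seconds> <syscall text>
LINE_RE = re.compile(r"^\s*(\d+):\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(.*\S)\s*$")


def slow_calls(lines, threshold=1.0):
    """Return (pid, elapsed, delta, call) tuples whose delta exceeds
    threshold seconds, largest delta first. Hypothetical helper."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # e.g. the "Base time stamp" line or blank lines
        pid, elapsed, delta, call = m.groups()
        if float(delta) > threshold:
            hits.append((int(pid), float(elapsed), float(delta), call))
    return sorted(hits, key=lambda h: h[2], reverse=True)


# The lines from the second incident, copied from the output above.
SAMPLE = """\
Base time stamp: 1323722672.7062 [ Mon Dec 12 15:44:32 EST 2011 ]
9898: 0.0000     0.0000     execve("{ORACLE_HOME}/bin/lsnrctl", 0xFFFFFFFF7FFFF368, 0xFFFFFFFF7FFFF388)  argc = 3
9898: 80.3434     80.3434     resolvepath("/usr/lib/sparcv9/ld.so.1", "/lib/sparcv9/ld.so.1", 1023) = 20
9898: 116.6389    36.2955     resolvepath("{ORACLE_HOME}/bin/lsnrctl", "{ORACLE_HOME}/bin/lsnrctl", 1023) = 46
9898: 116.6403    0.0014     stat("{ORACLE_HOME}/bin/lsnrctl", 0xFFFFFFFF7FFFEF60) = 0
9898: 569.7705    453.1302    open("/var/ld/64/ld.config", O_RDONLY)          Err#2 ENOENT
"""

if __name__ == "__main__":
    for pid, elapsed, delta, call in slow_calls(SAMPLE.splitlines()):
        print(f"{delta:10.4f}s  pid {pid}  {call}")
```

Run against a full output file such as bill3.out, the same function would list the worst offenders first; here the failed open() of /var/ld/64/ld.config alone accounts for about 453 seconds.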




--
http://www.freelists.org/webpage/oracle-l
Received on Wed Dec 14 2011 - 05:51:47 CST
