RE: Solaris 10 dtrace command...
Date: Mon, 12 Dec 2011 20:55:54 -0500
Message-ID: <2F161F8A09B99B4ABF8AE832D546E7890E9911CCA4_at_us194mx002.tycoelectronics.net>
I was able to catch the hang in production again. Really ugly this time - one command took 8-9 minutes to execute and a simple lsnrctl status took close to 15 minutes to execute in a production environment. Here was my command...
truss -d -D -f -o bill2.out ls -lrt
This is a small section of the resulting output...A bad start right out of the gates...
Base time stamp: 1323721323.0651 [ Mon Dec 12 15:22:03 EST 2011 ]
1186: 0.0000 0.0000 execve("/usr/bin/ls", 0xFFBFF514, 0xFFBFF520) argc = 2 1186: 102.3725 102.3725 resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12 1186: 107.4393 5.0668 resolvepath("/usr/bin/ls", "/usr/bin/ls", 1023) = 11 1186: 107.4420 0.0027 stat64("/usr/bin/ls", 0xFFBFF1D0) = 0
Here was the second incident...(I replaced some of our file system names with variables to protect the innocent... :) ) The wait times are outrageous... truss -d -D -f -o bill3.out lsnrctl status <listener_name> Base time stamp: 1323722672.7062 [ Mon Dec 12 15:44:32 EST 2011 ]
9898: 0.0000 0.0000 execve("{ORACLE_HOME}/bin/lsnrctl", 0xFFFFFFFF7FFFF368, 0xFFFFFFFF7FFFF388) argc = 3 9898: 80.3434 80.3434 resolvepath("/usr/lib/sparcv9/ld.so.1", "/lib/sparcv9/ld.so.1", 1023) = 20 9898: 116.6389 36.2955 resolvepath("{ORACLE_HOME}/bin/lsnrctl", "{ORACLE_HOME}/bin/lsnrctl", 1023) = 46 9898: 116.6403 0.0014 stat("{ORACLE_HOME}/bin/lsnrctl", 0xFFFFFFFF7FFFEF60) = 0 9898: 569.7705 453.1302 open("/var/ld/64/ld.config", O_RDONLY) Err#2 ENOENT
________________________________
From: Johnson, William L (TEIS)
Sent: Sunday, December 11, 2011 7:37 PM
To: 'Jared Still'
Cc: ORACLE-L
Subject: RE: Solaris 10 dtrace command...
Well, this is rather embarrassing. I just looked through my .sh_history file and found that I never included the "-d -D" option for my truss command. I was building a script to run and never ran the script when the problem surfaced...I ran the command by hand and forgot the options. Now I have to hope I can catch the system hanging again...
Here is what the output looks like when I run it properly... <server_name>:XXX02/xxx02/data00/oracle/local>more bill2.out Base time stamp: 1323649536.7395 [ Sun Dec 11 19:25:36 EST 2011 ]
1402: 0.0000 0.0000 execve("/usr/bin/ls", 0xFFBFF51C, 0xFFBFF528) argc = 2 1402: 0.0031 0.0031 resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12 1402: 0.0032 0.0001 resolvepath("/usr/bin/ls", "/usr/bin/ls", 1023) = 11 1402: 0.0034 0.0002 stat64("/usr/bin/ls", 0xFFBFF1D8) = 0 1402: 0.0035 0.0001 open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT 1402: 0.0037 0.0002 stat64("/usr/lib/X11R5/libsec.so.1", 0xFFBFE968) Err#2 ENOENT 1402: 0.0038 0.0001 stat64("/usr/lib/Motif1.2/libsec.so.1", 0xFFBFE968) Err#2 ENOENT
________________________________
From: Jared Still [mailto:jkstill_at_gmail.com] Sent: Friday, December 09, 2011 6:50 PM
To: Johnson, William L (TEIS)
Cc: ORACLE-L
Subject: Re: Solaris 10 dtrace command...
On Fri, Dec 9, 2011 at 10:46 AM, Johnson, William L (TEIS) <WLJohnson_at_te.com<mailto:WLJohnson_at_te.com>> wrote: I am having sporadic problems with a Solaris 10 database server running on a ZFS file system. Every once in a while, a simple OS command like "ls -al" in a directory with 10-20 files will hang for more than 1 minute. I was able to use the truss command to finally capture one of the incidents where the "ls -al" command took over 1 minute. The unfortunate thing is that the truss output wasn't able to capture where the wait occurred. I am now moving on to dtrace - but wow...I am really hoping that someone on the list has had prior
Please share the truss output.
You can use http://pastebin.com<http://pastebin.com/fmtGg5rM> to share it.
Just paste the text into the box, click submit, and share the resulting URL.
Such as: http://pastebin.com/fmtGg5rM
Jared Still
Certifiable Oracle DBA and Part Time Perl Evangelist
Oracle Blog: http://jkstill.blogspot.com
Home Page: http://jaredstill.com
-- http://www.freelists.org/webpage/oracle-lReceived on Mon Dec 12 2011 - 19:55:54 CST