Re: Solaris 10 dtrace command...

From: kyle Hailey <kylelf_at_gmail.com>
Date: Fri, 9 Dec 2011 16:58:41 -0800
Message-ID: <CADsdiQjnjmjpJQ=_cxC5ktTMtxfiEjDbCysAGtSQEzh+rLuu4w_at_mail.gmail.com>



Slowdowns like that sound like paging issues, though 1 minute is a heck of a long time.
Run "vmstat 1" and look at the page-outs column (po).
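
If the vmstat output scrolls by too fast to watch, something like the following could pick out only the samples that show page-outs. This is a rough sketch under assumptions, not a tested tool: po_filter is a made-up helper name, and it assumes the vmstat header row labels the column "po" (as on Solaris), so the column is found by name rather than by a hard-coded position.

```shell
# po_filter (hypothetical helper): read vmstat output on stdin and print
# only the sample rows whose page-out (po) count is greater than zero.
po_filter() {
  awk '
    # Header row: remember which field is labelled "po", then skip the row.
    {
      found = 0
      for (i = 1; i <= NF; i++) if ($i == "po") { col = i; found = 1 }
      if (found) next
    }
    # Data row: print it when the po field is a non-zero number.
    col && $col ~ /^[0-9]+$/ && $col + 0 > 0 { print }
  '
}

# Typical use (runs until interrupted):
#   vmstat 1 | po_filter
```

Sustained non-zero rows during one of the hangs would support the paging theory; an empty result would point elsewhere.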

Check out the "DTrace" book by Brendan Gregg and Jim Mauro. Tons of examples. Check out Brendan's blog as well, awesome stuff: http://dtrace.org/blogs/brendan/

Here is a quick sketch of a DTrace script that shows how long each ZFS read takes, on which file, and for which process (you could add a predicate to filter for a specific PID or execname):

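#!/usr/sbin/dtrace -s
#pragma D option quiet

BEGIN
{
        printf("Monitoring I/O on zfs filesystem, ctl-c to interrupt\n");
}

/*
 * On entry to zfs_read, remember the file, the request size and the start
 * time. To watch one process only, add a predicate such as
 * /execname == "dd"/ to this clause.
 */
fbt:zfs:zfs_read:entry
{
        this->vp = (vnode_t *)arg0;
        /* v_path can be NULL, so check it before converting to a string. */
        self->path = this->vp->v_path != NULL ?
            stringof(this->vp->v_path) : "<unknown>";
        self->size = ((uio_t *)arg1)->uio_resid;
        self->ts = timestamp;
}

/*
 * On return, print the elapsed time. Writes are not timed here; that would
 * need zfs_write probes on both the entry and the return clause.
 */
fbt:zfs:zfs_read:return
/self->ts/
{
        this->delta = timestamp - self->ts;
        printf("%u %s (%u) %s (%u)\n", this->delta, self->path,
            self->size, execname, pid);
        /* Zero the thread-local variables so DTrace can free them. */
        self->ts = 0;
        self->path = 0;
        self->size = 0;
}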

output looks like

delta in nanoseconds file (size read) executable (pid)

as in

11893 soe.dbf (512) dd (18986)
11912 soe.dbf (512) dd (18986)
11888 soe.dbf (512) dd (18986)
11934 soe.dbf (512) dd (18986)
11683 soe.dbf (512) dd (18986)
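
With one line per read the output gets long quickly, so it may help to roll it up per file. The following is a rough sketch that assumes the line format shown above (delta first, file name second) and file names without spaces; summarize_reads is a made-up helper name, not part of any tool.

```shell
# summarize_reads (hypothetical helper): read lines of the form
#   <delta_ns> <file> (<size>) <execname> (<pid>)
# on stdin and print read count, average and maximum latency per file.
summarize_reads() {
  awk '
    # Only consider lines that start with a numeric delta.
    $1 ~ /^[0-9]+$/ && NF >= 5 {
      n[$2]++
      sum[$2] += $1
      if ($1 + 0 > max[$2] + 0) max[$2] = $1 + 0
    }
    END {
      for (f in n)
        printf "%s reads=%d avg_ns=%d max_ns=%d\n", f, n[f], sum[f] / n[f], max[f]
    }
  '
}

# Typical use, assuming the script's output was saved to a file:
#   summarize_reads < zfs_reads.out
```

For the five sample lines above this gives "soe.dbf reads=5 avg_ns=11862 max_ns=11934". A read that stalled for a minute would stand out in max_ns.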

On Fri, Dec 9, 2011 at 3:50 PM, Jared Still <jkstill_at_gmail.com> wrote:

> On Fri, Dec 9, 2011 at 10:46 AM, Johnson, William L (TEIS)
> <WLJohnson_at_te.com> wrote:
> > I am having sporadic problems with a Solaris 10 database server running
> > on a ZFS file system. Every once in a while, a simple OS command like
> > "ls -al" in a directory with 10-20 files will hang for more than 1
> > minute. I was able to use the truss command to finally capture one of
> > the incidents where the "ls -al" command took over 1 minute. The
> > unfortunate thing is that the truss output wasn't able to capture where
> > the wait occurred. I am now moving on to dtrace - but wow...I am really
> > hoping that someone on the list has had prior
> >
>
> Please share the truss output.
>
> You can use http://pastebin.com to share it.
>
> Just paste the text into the box, click submit, and share the resulting
> URL.
>
> Such as: http://pastebin.com/fmtGg5rM
>
> Jared Still
> Certifiable Oracle DBA and Part Time Perl Evangelist
> Oracle Blog: http://jkstill.blogspot.com
> Home Page: http://jaredstill.com
>
> --
> http://www.freelists.org/webpage/oracle-l

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Dec 09 2011 - 18:58:41 CST
