RE: Oracle & pagefaults (solaris)

From: Johnson, William L (TEIS) <"Johnson,>
Date: Mon, 13 Aug 2012 10:35:17 -0400
Message-ID: <2F161F8A09B99B4ABF8AE832D546E7890EE505ED9A_at_us194mx002.tycoelectronics.net>



You may want to have a peak at the /tmp directory and make sure someone is not dumping large files into the /tmp directory and impacting memory. Applications (and sometimes Oracle) will place files under /tmp that can have a very negative impact on overall memory performance on a Solaris server.

Just a thought...
Bill

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of przemolicc_at_poczta.fm Sent: Monday, August 13, 2012 10:27 AM
To: oracle-l_at_freelists.org
Subject: Oracle & pagefaults (solaris)

Hi all,

we have been facing performance problems on our Solaris 10 Sparc based server with several Oracle databases running on it. It started 2 months ago and gradually is increasing. There is always high level of syscalls when we face these problems:

09:20:03      25       8       0      68
09:30:02      22       7       0      70
09:40:03      23       8       0      70
09:50:03      23       7       0      69
10:00:03      22       9       0      69
10:10:03      25      34       0      41
10:20:03      28      28       0      44
10:30:03      27      24       0      49
10:40:02      24       8       0      68
10:50:03      27      19       0      54
11:00:02      31      27       0      41
11:10:21      29      48       0      23
11:20:03      36      56       0       9
11:30:05      32      51       0      18
11:40:03      37      43       0      20
11:50:06      28      59       0      14
12:00:06      32      61       0       7
12:10:03      38      26       0      36
12:20:02      34       9       0      57
12:30:02      35       8       0      58
12:40:03      35       9       0      56

Normal level of syscalls is about 7-10%. But when something is happening it increases to over 50%. Using DTrace I can find that kernel is working in 'pagefault' function:

......

              unix`page_freelist_coalesce+0x894 [***]
              0x1fe0000
              unix`page_get_mnode_freelist+0x398
              unix`page_get_freelist+0x428
              unix`page_alloc_pages+0x110
              genunix`anon_map_getpages+0x348
              genunix`segvn_fault_anonpages+0x32c
              genunix`segvn_fault+0x530
              genunix`as_fault+0x4c8
              unix`pagefault+0x68
              unix`trap+0xd50
              unix`utl0+0x4c
            29062

              unix`cpu_halt+0x10c
              unix`cpu_halt+0x104
              unix`idle+0x128
              unix`thread_start+0x4
           719133

(this is just part of the image - there are much more pagefaults then listed above). So natural is to check on behalf of which application kernel is working in pagefaults:

[ High syscalls ]

CPU     ID                    FUNCTION:NAME
 96  85656                       :tick-5sec
  nscd                                                              1
  top                                                               1
  dtrace                                                            5
  perl                                                            122
  emagent                                                         124
  tnslsnr                                                        1569
  oracle                                                         5538

[ Low (normal) syscalls ]

CPU     ID                    FUNCTION:NAME
 20  85656                       :tick-5sec
  nscd                                                              1
  dtrace                                                            5
  emagent                                                          41
  sh                                                               91
  perl                                                             93
  emdctl                                                          243
  tnslsnr                                                         320
  oracle                                                         2466


When I watch which Oracle's PID are generating these high syscalls using 'ps' most of them don't exists (!). Are they short lived processes ? Our DBA claims that this is not a problem of Oracle but DTrace shows that system is working on behalf of Oracle. Can you shed some light where our DBAs should look for any solution ? When I analyzed number of connections in listener log they don't differ much between low and high system calls.

Can you help me please ? I don't feel comfortable because my feeling is that I am doing DBAs homework but they claim that AWR report is the same as before so there is nothing they could do.

Best regards
Przemek

--
http://www.freelists.org/webpage/oracle-l




--
http://www.freelists.org/webpage/oracle-l
Received on Mon Aug 13 2012 - 09:35:17 CDT

Original text of this message