Re: Oracle user process uses all memory and swap on server

From: Tom Dale <tom.dale_at_fivium.co.uk>
Date: Tue, 13 Nov 2012 14:24:55 +0000
Message-ID: <CAGDf7wSy1BCzrTLG-407QgR+ujLXVX3C3tovn5tSdqSLHxjXAQ_at_mail.gmail.com>



Thanks Iggy, now I know the strace shows that the OS really is running out of memory. Some more debugging, using Tanel's tools...

SQL> @snapper stats 5 1 13
Sampling SID 13 with interval 5 seconds, taking 1 snapshots...

    SID, USERNAME  , TYPE, STATISTIC                             ,     HDELTA, HDELTA/SEC, %TIME, GRAPH
     13, TESTING   , STAT, session logical reads                 ,      1.93k,      386.6,
     13, TESTING   , STAT, consistent gets                       ,      1.93k,      386.6,
     13, TESTING   , STAT, consistent gets from cache            ,      1.93k,      386.6,
     13, TESTING   , STAT, consistent gets from cache (fastpath) ,        277,       55.4,
     13, TESTING   , STAT, consistent gets - examination         ,       1.1k,      220.8,
     13, TESTING   , STAT, logical read bytes from cache         ,     15.84M,      3.17M,
     13, TESTING   , STAT, shared hash latch upgrades - no wait  ,        552,      110.4,
     13, TESTING   , STAT, calls to get snapshot scn: kcmgss     ,        276,       55.2,
     13, TESTING   , STAT, index crx upgrade (positioned)        ,        552,      110.4,
     13, TESTING   , STAT, lob reads                             ,        276,       55.2,
     13, TESTING   , STAT, index fetch by key                    ,        276,       55.2,
     13, TESTING   , STAT, index scans kdiixs1                   ,        552,      110.4,
-- End of Stats snap 1, end 12-11-13 10:54:06, seconds=5
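As a rough cross-check of the snapshot above, the figures can be tied together with a little arithmetic. This is only a sketch: it assumes snapper's k and M suffixes are decimal, and the 8 KiB block size is inferred from the numbers rather than stated anywhere in this thread.

```python
# Values copied from the snapper output above (SID 13, 5-second snapshot).
INTERVAL_SECONDS = 5
LOGICAL_READS_PER_SEC = 386.6   # "session logical reads", HDELTA/SEC
LOB_READS = 276                 # "lob reads", HDELTA
BYTES_READ_REPORTED_M = 15.84   # "logical read bytes from cache", HDELTA

# Assumption: the block size is 8 KiB (not stated in the post).
ASSUMED_BLOCK_SIZE = 8192

logical_reads = round(LOGICAL_READS_PER_SEC * INTERVAL_SECONDS)
bytes_if_8k_blocks = logical_reads * ASSUMED_BLOCK_SIZE
gets_per_lob_read = logical_reads / LOB_READS

print(f"logical reads in interval : {logical_reads}")
print(f"bytes at 8 KiB per read   : {bytes_if_8k_blocks / 1e6:.2f}M "
      f"(snapper reported {BYTES_READ_REPORTED_M}M)")
print(f"buffer gets per LOB read  : {gets_per_lob_read:.2f}")
```

If those assumptions hold, the session was doing about seven buffer gets for every LOB read while it was still responsive, which is a modest load and does not by itself explain the memory growth.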

SQL> @snapper stats 5 1 13

HANG! So snapper is not showing anything.
Over to ostackprof:

SQL> @ostackprof 1143 0 5

Sampling...

Below is the stack prefix common to all samples:


Frame->function()
------------------------------------------------------------------------

# 34 ->__libc_start_main()
# 33 ->main()
# 32 ->ssthrdmain()
# 31 ->opimai_real()
# 30 ->sou2o()
# 29 ->opidrv()
# 28 ->opiodr()
# 27 ->opiino()
# 26 ->opitsk()
# 25 ->ttcpip()
# 24 ->opiodr()
# 23 ->opifch()
# 22 ->opifch2()
# 21 ->qerstFetch()
# 20 ->qertbFetch()
# 19 ->qerstRowP()
# 18 ->kpofcr()
# 17 ->evaopn2()
# 16 ->evaopn2()
# 15 ->evaopn2()
# 14 ->kokle_rxsubstr()
# 13 ->kole_rxsubstr()
# 12 ->lxkRegexpSubstrLobNSub()
# 11 ->lxregexec()
# 10 ->lxregmatch()
# ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------

     2 ->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->__intel_new_memcpy()->__sighandler()->->
     2 ->__sighandler()->->
     1 ->lxregmatpush()->__sighandler()->->

OK, we can see the regex function running.
The server is starting to overload now.

SQL> @ostackprof 1143 0 5

Sampling...

Below is the stack prefix common to all samples:



Frame->function()

# ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------

     2 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->kdlrdb()->kcbgtcr()->__sighandler()->->
     1 ->__sighandler()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->__intel_new_memcpy()->__sighandler()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->__sighandler()->->

SQL> @ostackprof 1143 0 5

Sampling...

Below is the stack prefix common to all samples:



Frame->function()

# ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
     2 ->__sighandler()->->
     2 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->kdlrdb()->kcbgtcr()->__sighandler()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatpush()->__sighandler()->->
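The profile lines from these runs can be tallied by their deepest frame to see where the samples land. The following is a minimal sketch, not part of ostackprof itself: `parse_sample` and `leaf_frames` are hypothetical helper names, and it assumes the trailing `__sighandler()` frame comes from the stack-dump mechanism rather than from the query. The input is the three profile lines of the first run, copied from above.

```python
from collections import Counter

def parse_sample(line):
    """Split one profile line into (sample_count, [frames])."""
    count_text, _, stack_text = line.strip().partition(" ")
    frames = [frame for frame in stack_text.split("->") if frame]
    return int(count_text), frames

def leaf_frames(lines, ignore=("__sighandler()",)):
    """Tally samples by deepest frame, skipping frames listed in `ignore`."""
    tally = Counter()
    for line in lines:
        count, frames = parse_sample(line)
        real = [frame for frame in frames if frame not in ignore]
        tally[real[-1] if real else "(no frames captured)"] += count
    return tally

# Profile lines from the first ostackprof run in this post.
first_run = [
    "2 ->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()"
    "->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()"
    "->__intel_new_memcpy()->__sighandler()->->",
    "2 ->__sighandler()->->",
    "1 ->lxregmatpush()->__sighandler()->->",
]

for frame, samples in leaf_frames(first_run).most_common():
    print(f"{samples:3d}  {frame}")
```

Feeding the later runs through the same function works as well, since it only looks at the last frame before the signal handler; in all three runs the named leaves sit under `lxregmatch()`.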

SQL> @ostackprof 1143 0 5

Hit CTRL+C to cancel, ENTER to continue...

HANG! Now I am lost!!
Any help would be greatly appreciated.

Thanks,
Tom

On Mon, Nov 12, 2012 at 5:55 PM, Iggy Fernandez <iggy_fernandez_at_hotmail.com> wrote:

>  I noticed that the brk system calls are failing because the return value
> is less than the argument value.
>
> brk(0x60013000) = 0x5ffef000
>
> http://www.kernel.org/doc/man-pages/online/pages/man2/brk.2.html
> brk() sets the end of the data segment to the value specified by addr,
> when that value is reasonable, the system has enough memory, and the
> process does not exceed its maximum data size (see setrlimit(2)) ... On
> failure, the system call returns the current break.
>
> http://www.kernel.org/doc/man-pages/online/pages/man2/setrlimit.2.html
> RLIMIT_DATA The maximum size of the process's data segment (initialized
> data, uninitialized data, and heap).  This limit affects calls to brk(2)
>
> 0x60013000 is 1,610,690,560 in decimal
>
> 0x5ffef000 is 1,610,543,104 in decimal
>
> Best of luck in solving this
>
> Iggy
>
>
> > When I run an OS strace and a 10046 trace, after the last traced event in
> > the oracle session trace, the process strace continues with
> >
> > brk(0x60013000) = 0x5ffef000
> > mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b9e775b5000
> > brk(0x60013000) = 0x5ffef000
> > mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b9e776b5000
> > ...
> > Repeating
> > ...
>
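
The numbers in the quoted strace can be checked with a short script. This is only a sketch: it redoes the hex arithmetic from the quoted lines and then prints the data-segment limit of the Python process running it, which is not the Oracle server process. On Linux, reading /proc/<pid>/limits for the server process would be the closer check.

```python
import resource  # standard library, Unix-only

# Values copied from the quoted strace output.
requested_break = int("0x60013000", 16)   # argument passed to brk()
returned_break = int("0x5ffef000", 16)    # value brk() returned
shortfall = requested_break - returned_break

first_mmap = 0x2b9e775b5000               # first mmap() return value
second_mmap = 0x2b9e776b5000              # second mmap() return value
MMAP_LENGTH = 1048576                     # length argument of each mmap()

print(f"requested break : {requested_break:,}")
print(f"returned break  : {returned_break:,}")
print(f"shortfall       : {shortfall:,} bytes ({shortfall // 1024} KiB)")
print(f"mmap spacing    : {second_mmap - first_mmap:,} bytes")

# RLIMIT_DATA of *this* process; RLIM_INFINITY means no limit is set.
soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
for name, value in (("soft", soft), ("hard", hard)):
    shown = "unlimited" if value == resource.RLIM_INFINITY else f"{value:,}"
    print(f"RLIMIT_DATA {name}: {shown}")
```

The two mmap() results are exactly 1 MiB apart, matching the length argument, which is consistent with the process taking memory in 1 MiB anonymous mappings each time brk() fails to move the break.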

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 13 2012 - 15:24:55 CET
