Re: Oracle user process uses all memory and swap on server
Date: Tue, 13 Nov 2012 14:24:55 +0000
Message-ID: <CAGDf7wSy1BCzrTLG-407QgR+ujLXVX3C3tovn5tSdqSLHxjXAQ_at_mail.gmail.com>
Thanks Iggy, now I know the strace shows the OS really is out of memory. Some more debugging, using Tanel's tools...
SQL> @snapper stats 5 1 13
Sampling SID 13 with interval 5 seconds, taking 1 snapshots...
- Session Snapper v3.52 by Tanel Poder @ E2SN ( http://tech.e2sn.com )
SID, USERNAME , TYPE, STATISTIC , HDELTA, HDELTA/SEC, %TIME, GRAPH
13, TESTING , STAT, session logical reads , 1.93k, 386.6,
13, TESTING , STAT, consistent gets , 1.93k, 386.6,
13, TESTING , STAT, consistent gets from cache , 1.93k, 386.6,
13, TESTING , STAT, consistent gets from cache (fastpath) , 277, 55.4,
13, TESTING , STAT, consistent gets - examination , 1.1k, 220.8,
13, TESTING , STAT, logical read bytes from cache , 15.84M, 3.17M,
13, TESTING , STAT, shared hash latch upgrades - no wait , 552, 110.4,
13, TESTING , STAT, calls to get snapshot scn: kcmgss , 276, 55.2,
13, TESTING , STAT, index crx upgrade (positioned) , 552, 110.4,
13, TESTING , STAT, lob reads , 276, 55.2,
13, TESTING , STAT, index fetch by key , 276, 55.2,
13, TESTING , STAT, index scans kdiixs1 , 552, 110.4,
-- End of Stats snap 1, end 12-11-13 10:54:06, seconds=5
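To put the snapshot above in proportion, here is a rough Python sketch that expands the HDELTA values and derives two ratios from them. It is not part of snapper: parse_hdelta is a made-up helper, and it assumes the k/M suffixes are decimal multipliers (1,000 and 1,000,000), which I have not checked against the snapper source.

```python
# Sketch only: expand snapper's human-readable HDELTA values and derive ratios.
# parse_hdelta is a hypothetical helper, not something snapper provides.

def parse_hdelta(text):
    """Convert values such as '1.93k' or '15.84M' to a float.

    Assumption: suffixes are decimal multipliers (k = 1e3, M = 1e6, G = 1e9).
    """
    multipliers = {"k": 1e3, "M": 1e6, "G": 1e9}
    text = text.strip()
    if text and text[-1] in multipliers:
        return float(text[:-1]) * multipliers[text[-1]]
    return float(text)

# Values copied from the 5-second snapshot above.
deltas = {
    "session logical reads": "1.93k",
    "logical read bytes from cache": "15.84M",
    "lob reads": "276",
}
values = {name: parse_hdelta(raw) for name, raw in deltas.items()}

bytes_per_read = (
    values["logical read bytes from cache"] / values["session logical reads"]
)
reads_per_lob_read = values["session logical reads"] / values["lob reads"]

print(f"bytes per logical read: {bytes_per_read:.0f}")
print(f"logical reads per lob read: {reads_per_lob_read:.1f}")
```

Under those assumptions the session does about seven logical reads for every LOB read, so the buffer gets in this interval look modest.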
SQL> @snapper stats 5 1 13
HANG!
So snapper is not showing anything?
Over to ostackprof
SQL> @ostackprof 1143 0 5
Sampling...
Below is the stack prefix common to all samples:
Frame->function() ------------------------------------------------------------------------
# 34 ->__libc_start_main()
# 33 ->main()
# 32 ->ssthrdmain()
# 31 ->opimai_real()
# 30 ->sou2o()
# 29 ->opidrv()
# 28 ->opiodr()
# 27 ->opiino()
# 26 ->opitsk()
# 25 ->ttcpip()
# 24 ->opiodr()
# 23 ->opifch()
# 22 ->opifch2()
# 21 ->qerstFetch()
# 20 ->qertbFetch()
# 19 ->qerstRowP()
# 18 ->kpofcr()
# 17 ->evaopn2()
# 16 ->evaopn2()
# 15 ->evaopn2()
# 14 ->kokle_rxsubstr()
# 13 ->kole_rxsubstr()
# 12 ->lxkRegexpSubstrLobNSub()
# 11 ->lxregexec()
# 10 ->lxregmatch()
# ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
2 ->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->__intel_new_memcpy()->__sighandler()->->
2 ->__sighandler()->->
1 ->lxregmatpush()->__sighandler()->->
Ok we see the regex function running
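A quick way to see where the samples land is to tally the deepest function in each call-stack line. The Python below is only a sketch: top_frames is a made-up name, the input is the three profile lines from the first ostackprof run, and it assumes the trailing __sighandler() frame comes from the sampling itself rather than from the session's own work.

```python
# Sketch only: count which function was deepest in the stack for each sample.
from collections import Counter


def top_frames(profile_lines):
    """Return a Counter of the deepest real function per profile line.

    Each line is assumed to look like '<count> ->f()->g()->__sighandler()->->'.
    __sighandler() is dropped on the assumption that it belongs to the sampler.
    """
    counts = Counter()
    for line in profile_lines:
        count_text, _, stack = line.strip().partition(" ")
        frames = [f for f in stack.split("->") if f and f != "__sighandler()"]
        leaf = frames[-1] if frames else "(no frames captured)"
        counts[leaf] += int(count_text)
    return counts


# The three call-stack lines from the first profile above.
sample = [
    "2 ->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()"
    "->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()"
    "->__intel_new_memcpy()->__sighandler()->->",
    "2 ->__sighandler()->->",
    "1 ->lxregmatpush()->__sighandler()->->",
]

for func, n in top_frames(sample).most_common():
    print(n, func)
```

With only five samples per run the tally is coarse, but it separates time spent copying LOB data under the regex match from samples where no stack was captured at all.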
Server starting to overload now
SQL> @ostackprof 1143 0 5
Sampling...
Below is the stack prefix common to all samples:
Frame->function()
# ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
2 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->kdlrdb()->kcbgtcr()->__sighandler()->->
1 ->__sighandler()->->
1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->__intel_new_memcpy()->__sighandler()->->
1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->__sighandler()->->
SQL> @ostackprof 1143 0 5
Sampling...
Below is the stack prefix common to all samples:
Frame->function()
# ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
2 ->__sighandler()->->
2 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatgpt()->kole_rxrdcb()->koklc_read()->koklread()->koklOutlineRead1()->kdlf_read()->kdl_read1()->kdlprl()->kdlrdb()->kcbgtcr()->__sighandler()->->
1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->opifch()->opifch2()->qerstFetch()->qertbFetch()->qerstRowP()->kpofcr()->evaopn2()->evaopn2()->evaopn2()->kokle_rxsubstr()->kole_rxsubstr()->lxkRegexpSubstrLobNSub()->lxregexec()->lxregmatch()->lxregmatpush()->__sighandler()->->
SQL> @ostackprof 1143 0 5
Hit CTRL+C to cancel, ENTER to continue...
HANG!
Now I am lost!!
Any help would be greatly appreciated.
Thanks,
Tom
On Mon, Nov 12, 2012 at 5:55 PM, Iggy Fernandez <iggy_fernandez_at_hotmail.com> wrote:
> I noticed that the brk system calls are failing because the return value
> is less than the argument value.
>
> brk(0x60013000) = 0x5ffef000
>
> http://www.kernel.org/doc/man-pages/online/pages/man2/brk.2.html
> brk() sets the end of the data segment to the value specified by addr,
> when that value is reasonable, the system has enough memory, and the
> process does not exceed its maximum data size (see setrlimit(2)) ... On
> failure, the system call returns the current break.
>
> http://www.kernel.org/doc/man-pages/online/pages/man2/setrlimit.2.html
> RLIMIT_DATA The maximum size of the process's data segment (initialized
> data, uninitialized data, and heap). This limit affects calls to brk(2)
>
> 0x60013000 is 1,610,690,560 in decimal
>
> 0x5ffef000 is 1,610,543,104 in decimal
>
> Best of luck in solving this
>
> Iggy
>
> > When I run an OS strace and a 10046 trace, after the last traced event in
> > the oracle session trace, the process strace continues with
> >
> > brk(0x60013000) = 0x5ffef000
> > mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b9e775b5000
> > brk(0x60013000) = 0x5ffef000
> > mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b9e776b5000
> > ...
> > Repeating
> > ...
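The check Iggy describes (a brk return value lower than the requested address) can be applied to a whole strace file. The Python below is a minimal sketch, assuming each call appears on one line in the `brk(0x...) = 0x...` form shown in the trace; failed_brk_calls is a made-up name and the two sample lines are copied from the quoted strace output.

```python
# Sketch only: flag brk() calls whose return value is below the requested break.
import re

# Matches strace lines such as: brk(0x60013000) = 0x5ffef000
BRK_RE = re.compile(r"brk\((0x[0-9a-fA-F]+)\)\s*=\s*(0x[0-9a-fA-F]+)")


def failed_brk_calls(strace_lines):
    """Yield (requested, returned) for brk calls that did not grow the heap.

    Uses the criterion from the quoted reply: on failure the call returns
    the current break, which is lower than the address asked for.
    """
    for line in strace_lines:
        match = BRK_RE.search(line)
        if not match:
            continue  # not a brk call with an explicit address
        requested = int(match.group(1), 16)
        returned = int(match.group(2), 16)
        if returned < requested:
            yield requested, returned


lines = [
    "brk(0x60013000) = 0x5ffef000",
    "mmap(NULL, 1048576, PROT_READ|PROT_WRITE, "
    "MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b9e775b5000",
]

for requested, returned in failed_brk_calls(lines):
    shortfall = requested - returned
    print(f"brk wanted {requested:,}, got {returned:,}, short by {shortfall:,} bytes")
```

The same pass could also count the 1 MB mmap calls that follow each failed brk, which would show how fast the process is growing outside the data segment.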
--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 13 2012 - 15:24:55 CET