OEL - fork: Resource temporarily unavailable

From: Upendra N <nupendra_at_hotmail.com>
Date: Wed, 21 Sep 2011 23:32:35 -0400
Message-ID: <BLU129-W129964E4F8C9A57A67EBB1D80C0_at_phx.gbl>



We have a 2-node Oracle RAC cluster; both nodes are running OEL 5.6 and Oracle 11g R2.

::::::::::::::
/etc/enterprise-release
::::::::::::::
Enterprise Linux Enterprise Linux Server release 5.6 (Carthage)
::::::::::::::
/etc/oracle-release
::::::::::::::
Oracle Linux Server release 5.6
::::::::::::::
/etc/redhat-release
::::::::::::::
Red Hat Enterprise Linux Server release 5.6 (Tikanga)

On one of the nodes I am consistently seeing the following error:

-bash: fork: Resource temporarily unavailable

(22:55:17) root@proddb1: /var/tmp # ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1056768
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 100000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1056768
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited



proddb1 is the node where the problem occurs.

The number of database connections on the two boxes looks fine:

proddb1# ps -ef|grep -ic local=no
3800

proddb2# ps -ef|grep -ic local=no
4500

proddb1# ps -eLf|wc -l
32500

proddb2# ps -eLf|wc -l
6500
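
The thread count on proddb1 is about five times that of proddb2 even though it has fewer database connections. A minimal sketch for finding which processes own those threads, assuming the procps column layout of `ps -eLf` (UID in column 1, PID in column 2). The function name `threads_per_process` is hypothetical, and this was not run on these hosts:

```shell
# Read `ps -eLf` output on stdin and print "thread_count pid user"
# for the processes with the most threads (default: top 10).
threads_per_process() {
    awk 'NR > 1 { count[$2]++; user[$2] = $1 }
         END { for (pid in count) print count[pid], pid, user[pid] }' |
        sort -rn | head -n "${1:-10}"
}
```

It would be used as `ps -eLf | threads_per_process 10`. Grouping by PID rather than by user is a choice made here so that a single runaway multi-threaded process stands out.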

proddb1# strace lsof -o /tmp/lsof.out <--- Produces the following output .....

.....

close(99982)                            = -1 EBADF (Bad file descriptor)
close(99983)                            = -1 EBADF (Bad file descriptor)
close(99984)                            = -1 EBADF (Bad file descriptor)
close(99985)                            = -1 EBADF (Bad file descriptor)
close(99986)                            = -1 EBADF (Bad file descriptor)
close(99987)                            = -1 EBADF (Bad file descriptor)
close(99988)                            = -1 EBADF (Bad file descriptor)
close(99989)                            = -1 EBADF (Bad file descriptor)
close(99990)                            = -1 EBADF (Bad file descriptor)
close(99991)                            = -1 EBADF (Bad file descriptor)
close(99992)                            = -1 EBADF (Bad file descriptor)
close(99993)                            = -1 EBADF (Bad file descriptor)
close(99994)                            = -1 EBADF (Bad file descriptor)
close(99995)                            = -1 EBADF (Bad file descriptor)
close(99996)                            = -1 EBADF (Bad file descriptor)
close(99997)                            = -1 EBADF (Bad file descriptor)
close(99998)                            = -1 EBADF (Bad file descriptor)
close(99999)                            = -1 EBADF (Bad file descriptor)
open("/dev/null", O_RDWR)               = 3
close(3)                                = 0
umask(0)                                = 022
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=56442544, ...}) = 0
mmap(NULL, 56442544, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2b9c118e9000
close(3)                                = 0
getpid()                                = 760
getgid()                                = 0
getegid()                               = 0
geteuid()                               = 0
getuid()                                = 0
stat("/dev", {st_mode=S_IFDIR|0755, st_size=6180, ...}) = 0
open("/", O_RDONLY)                     = 3
lseek(3, 1, SEEK_SET)                   = 1
lstat("/proc/760/fd/3", {st_mode=S_IFLNK|0500, st_size=64, ...}) = 0
close(3)                                = 0
open("/proc/mounts", O_RDONLY)          = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b9c14ebd000
read(3, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 1228
pipe([4, 5])                            = 0
pipe([6, 7])                            = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2b9c118e8670) = -1 EAGAIN (Resource temporarily unavailable)
write(2, "lsof: can't fork: Resource tempo"..., 51) = 51
exit_group(1)                           = ?
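
A `clone()` failure with EAGAIN usually points at a limit on the number of tasks rather than on file descriptors. As a sketch of one check, assuming the standard procfs paths `/proc/sys/kernel/pid_max` and `/proc/sys/kernel/threads-max`, the helper below (hypothetical name `report_headroom`) prints how far a thread count is from a given limit. The default `pid_max` on many Linux systems is 32768, which would be close to the 32500 lines counted on proddb1, but the actual value on that node is not shown in this post.

```shell
# Print how much room is left between a task count and a limit.
# All three arguments are supplied by the caller: a label and two integers.
report_headroom() {
    name=$1
    used=$2
    limit=$3
    echo "$name: used=$used limit=$limit free=$((limit - used))"
}
```

For example, `report_headroom pid_max "$(ps -eLf | tail -n +2 | wc -l)" "$(cat /proc/sys/kernel/pid_max)"` compares the current thread count (header line skipped) with the PID limit; the same call with `/proc/sys/kernel/threads-max` covers the other system-wide limit.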


Any help is appreciated.

-Upendra                                                

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Sep 21 2011 - 22:32:35 CDT
