OEL - fork: Resource temporarily unavailable
Date: Wed, 21 Sep 2011 23:32:35 -0400
Message-ID: <BLU129-W129964E4F8C9A57A67EBB1D80C0_at_phx.gbl>
We have a 2-node Oracle RAC cluster; both nodes are running OEL 5.6 and Oracle 11g R2.
::::::::::::::
/etc/enterprise-release
::::::::::::::
Enterprise Linux Enterprise Linux Server release 5.6 (Carthage)
::::::::::::::
/etc/oracle-release
::::::::::::::
Oracle Linux Server release 5.6
::::::::::::::
/etc/redhat-release
::::::::::::::
Red Hat Enterprise Linux Server release 5.6 (Tikanga)
On one of the nodes I am consistently seeing the following error:
-bash: fork: Resource temporarily unavailable
(22:55:17) root@proddb1: /var/tmp # ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1056768
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 100000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1056768
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
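The limits above belong to root's shell, and "max user processes" is counted per user, with each thread counting as one. A minimal sketch for seeing which user owns the most threads, assuming a procps-style ps; tasks_per_user is a hypothetical helper name, not something from the original session:

```shell
#!/bin/sh
# Hypothetical helper: reads one user name per line on stdin (one line per
# thread) and prints "count user", busiest user first.
tasks_per_user() {
    sort | uniq -c | sort -rn
}

# One line per thread, user name only, no header; show the top five users.
ps -eL -o user= 2>/dev/null | tasks_per_user | head -5
```

The per-user totals can then be compared with the "max user processes" value that applies to that user's own sessions, which may differ from root's.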
proddb1 is the node with the problem.
The number of database connections on the two boxes looks fine:
proddb1# ps -ef|grep -ic local=no
3800
proddb2# ps -ef|grep -ic local=no
4500
proddb1# ps -eLf|wc -l
32500
proddb2# ps -eLf|wc -l
6500
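The two thread counts can be put in context by comparing them with the kernel's system-wide task limits. A minimal sketch, assuming a Linux /proc filesystem and a procps-style ps; pct_used is a hypothetical helper, and the actual limit values on proddb1 are not shown in this post:

```shell
#!/bin/sh
# Hypothetical helper: integer percentage of a limit that is in use.
# $1 = current count, $2 = limit
pct_used() {
    echo $(( $1 * 100 / $2 ))
}

# Total number of threads: one line per thread, no header.
threads=$(ps -eL -o lwp= 2>/dev/null | wc -l)

# Compare against each kernel limit that this system exposes.
for limit_file in /proc/sys/kernel/pid_max /proc/sys/kernel/threads-max; do
    if [ -r "$limit_file" ]; then
        limit=$(cat "$limit_file")
        echo "$limit_file: $threads of $limit ($(pct_used "$threads" "$limit")%)"
    fi
done

# Ten processes with the most threads: thread count, PID, user, command.
ps -eo nlwp=,pid=,user=,comm= 2>/dev/null | sort -rn | head -10
```

For illustration only: against a pid_max of 32768 (a common default), `pct_used 32500 32768` gives 99 and `pct_used 6500 32768` gives 19.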
proddb1# strace lsof -o /tmp/lsof.out <--- Produces the following output .....
.....
close(99982) = -1 EBADF (Bad file descriptor)
close(99983) = -1 EBADF (Bad file descriptor)
close(99984) = -1 EBADF (Bad file descriptor)
close(99985) = -1 EBADF (Bad file descriptor)
close(99986) = -1 EBADF (Bad file descriptor)
close(99987) = -1 EBADF (Bad file descriptor)
close(99988) = -1 EBADF (Bad file descriptor)
close(99989) = -1 EBADF (Bad file descriptor)
close(99990) = -1 EBADF (Bad file descriptor)
close(99991) = -1 EBADF (Bad file descriptor)
close(99992) = -1 EBADF (Bad file descriptor)
close(99993) = -1 EBADF (Bad file descriptor)
close(99994) = -1 EBADF (Bad file descriptor)
close(99995) = -1 EBADF (Bad file descriptor)
close(99996) = -1 EBADF (Bad file descriptor)
close(99997) = -1 EBADF (Bad file descriptor)
close(99998) = -1 EBADF (Bad file descriptor)
close(99999) = -1 EBADF (Bad file descriptor)
open("/dev/null", O_RDWR) = 3
close(3) = 0
umask(0) = 022
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=56442544, ...}) = 0
mmap(NULL, 56442544, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2b9c118e9000
close(3) = 0
getpid() = 760
getgid() = 0
getegid() = 0
geteuid() = 0
getuid() = 0
stat("/dev", {st_mode=S_IFDIR|0755, st_size=6180, ...}) = 0
open("/", O_RDONLY) = 3
lseek(3, 1, SEEK_SET) = 1
lstat("/proc/760/fd/3", {st_mode=S_IFLNK|0500, st_size=64, ...}) = 0
close(3) = 0
open("/proc/mounts", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b9c14ebd000
read(3, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 1228
pipe([4, 5]) = 0
pipe([6, 7]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2b9c118e8670) = -1 EAGAIN (Resource temporarily unavailable)
write(2, "lsof: can't fork: Resource tempo"..., 51) = 51
exit_group(1) = ?
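Most of the trace is close() calls returning EBADF on descriptors that were never open, which appears to be lsof sweeping every possible descriptor up to the open files limit. A minimal sketch for filtering that noise out so only the real failure remains; failed_calls is a hypothetical helper, and the sample lines are copied from the trace above:

```shell
#!/bin/sh
# Hypothetical helper: reads strace output on stdin and prints only the
# syscalls that returned -1, skipping the EBADF results of the close() sweep.
failed_calls() {
    grep -e ' = -1 ' | grep -v 'EBADF'
}

# Demo on a few lines taken from the trace in this post.
failed_calls <<'EOF'
close(99999) = -1 EBADF (Bad file descriptor)
open("/dev/null", O_RDWR) = 3
pipe([6, 7]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2b9c118e8670) = -1 EAGAIN (Resource temporarily unavailable)
EOF

# Against a live command (strace writes its trace to stderr):
#   strace lsof 2>&1 >/dev/null | failed_calls
```

On the sample, only the clone() line with EAGAIN is printed.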
Any help is appreciated.
-Upendra
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Sep 21 2011 - 22:32:35 CDT