RE: OEL - fork: Resource temporarily unavailable

From: Upendra N <nupendra_at_hotmail.com>
Date: Mon, 26 Sep 2011 01:07:03 -0400
Message-ID: <BLU129-W174B09C5B8E5FFA47D322CD8F30_at_phx.gbl>



> 4000 concurrent sessions per node? I assume you mean active sessions.

The database receives a huge number of sessions from the application. User sessions are handled by the application connection pool; although I see a large number of database connections, not all of them are in "active" status.

CPU load doesn't seem to be an issue; the CPUs are 60-80% idle. I have checked against the bugs you sent as well, and none of them seems to match my scenario. Thanks again for compiling them.

I found that, out of 32000 threads, 27000 belonged to the ONS process. We have DBConsole running on that box. After I stop ONS and restart DBConsole, all the threads get cleared. Oracle support couldn't narrow it down to a specific issue; they are still investigating. For now we restart DBConsole as a workaround to keep the threads under control. Thanks much for everyone's input.
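For anyone who wants to get the same kind of per-process thread breakdown, here is a minimal Python sketch. It assumes a Linux box where /proc/<pid>/status exposes the "Name" and "Threads" fields; the function names (parse_status, threads_by_name) are made up for this illustration and are not part of any Oracle tool.

```python
import glob
from collections import Counter


def parse_status(text):
    """Return (name, thread_count) from the text of a /proc/<pid>/status file."""
    name, threads = "?", 0
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "Threads":
            threads = int(value.strip())
    return name, threads


def threads_by_name(proc_root="/proc"):
    """Sum thread counts per process name across every PID under proc_root."""
    totals = Counter()
    for path in glob.glob(f"{proc_root}/[0-9]*/status"):
        try:
            with open(path) as f:
                name, threads = parse_status(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        totals[name] += threads
    return totals


if __name__ == "__main__":
    # Print the ten process names that own the most threads.
    for name, count in threads_by_name().most_common(10):
        print(f"{count:8d}  {name}")
```

Run on the affected node, the top of this list should make it obvious when a single process name owns most of the threads.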

-Upendra

Date: Thu, 22 Sep 2011 09:49:07 -0500
Subject: Re: OEL - fork: Resource temporarily unavailable
From: mihajlo.tekic_at_gmail.com
To: niall.litchfield_at_gmail.com; nupendra_at_hotmail.com
CC: sfaroult_at_roughsea.com; hemant.chitale_at_sc.com; oracle-l_at_freelists.org

4000 concurrent sessions per node? I assume you mean active sessions. This raises a question: how many resources do you have available on each of the nodes to support all these connections?

RAM 128G, you mentioned that. But is it enough? Are you also experiencing performance problems with the existing sessions? Anyway, “fork: Resource temporarily unavailable” is pretty much self-explanatory. In general, it indicates a resource problem.

From your strace output it looks like the clone call is failing to create a child process:

clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2b9c118e8670) = -1 EAGAIN (Resource temporarily unavailable)



According to the clone man page, EAGAIN means the call failed to create a child process because too many processes are already running. http://linux.die.net/man/2/clone

According to the fork man page, this call can fail with the following errors:

http://linux.die.net/man/2/fork

1. EAGAIN
   fork() cannot allocate sufficient memory to copy the parent's page tables and allocate a task structure for the child.

2. EAGAIN
   It was not possible to create a new process because the caller's RLIMIT_NPROC resource limit was encountered. To exceed this limit, the process must have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE capability.

3. ENOMEM
   fork() failed to allocate the necessary kernel structures because memory is tight.

Check your memory consumption and see if there is enough memory available.
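To check the limits themselves rather than just memory, something like the following rough Python sketch may help. It assumes Linux and should be run as the OS user that owns the failing processes, since RLIMIT_NPROC is per user and, on Linux, counts threads as well as processes. The helper names (read_int, fmt_limit, task_limits) are hypothetical. kernel.threads-max and kernel.pid_max are included because they are system-wide caps on tasks and on process/thread IDs; whether either one is the limit actually being hit here is only a guess.

```python
import resource


def read_int(path):
    """Read a single integer from a /proc/sys file, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def fmt_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def task_limits():
    """Collect the per-user and system-wide limits relevant to fork/clone."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "RLIMIT_NPROC soft": fmt_limit(soft),
        "RLIMIT_NPROC hard": fmt_limit(hard),
        "kernel.threads-max": read_int("/proc/sys/kernel/threads-max"),
        "kernel.pid_max": read_int("/proc/sys/kernel/pid_max"),
    }


if __name__ == "__main__":
    for name, value in task_limits().items():
        print(f"{name:20s} {value}")
```

Comparing these numbers with the total thread count on the node should show which ceiling, if any, is being approached.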

You also indicated there were 32K processes running on the server when the issue was happening. Have you checked for any defunct/zombie processes? Aside from what I’ve indicated above, you may also be hitting some of the known 11.2 bugs, such as 8841501, 9356344, 9398412, 9944177, 9234660, 9855476 (check MOS Note# 1062676.1).
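One way to check for defunct processes is sketched below in Python. It assumes Linux, where the state field in /proc/<pid>/stat is "Z" for a zombie; parse_state and count_zombies are illustrative names only.

```python
import glob


def parse_state(stat_text):
    """Return the one-letter state from the text of a /proc/<pid>/stat file."""
    # The command name sits in parentheses and may itself contain spaces or
    # ')', so split on the last ')' before reading the state field.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_zombies(proc_root="/proc"):
    """Count processes under proc_root whose state is 'Z' (zombie/defunct)."""
    zombies = 0
    for path in glob.glob(f"{proc_root}/[0-9]*/stat"):
        try:
            with open(path) as f:
                if parse_state(f.read()) == "Z":
                    zombies += 1
        except (OSError, IndexError):
            continue  # process vanished or the line was not parseable
    return zombies


if __name__ == "__main__":
    print("zombie processes:", count_zombies())
```

A large count here would point at a parent that is not reaping its children rather than at genuine load.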

Although CPU might not be a problem in this particular case, running 4K processes concurrently may also cause heavy CPU utilization. How many CPUs (cores) does each node have? Since this is a RAC environment, if CPU is 100% utilized (unless you use Resource Manager), the heavily utilized node may also be evicted. Has this happened?

Maybe you should think about using some connection pooling mechanism or, if you already use one, check whether it is used appropriately and efficiently. Stephane’s comment about shared servers is also valid.

Hope this helps.
~Mihajlo

On 09/22/2011 06:12 AM, Upendra N wrote:

> yeah. This is a very heavily used app/db, we see 4000 co...

http://www.freelists.org/webpage/oracle-l                                                

Received on Mon Sep 26 2011 - 00:07:03 CDT
