RE: Semaphore Tuning Reference

From: Joshi Rajanish <dbaoracle2000_at_yahoo.com>
Date: Fri, 1 Dec 2000 20:14:46 -0800 (PST)
Message-Id: <10697.123465@fatcity.com>

Dear Paul,
I found this reference on semaphores; I hope it helps you :).

Regards,
Rajanish Joshi
DBA Pune,India

UNDERSTANDING SHARED MEMORY/SEMAPHORES ON UNIX

                              Oracle Corporate Support
                                 Problem Repository

Prob# 1007974.6 UNDERSTANDING SHARED MEMORY/SEMAPHORES ON UNIX
Soln# 2055995.6 UNDERSTANDING SHARED MEMORY/SEMAPHORES ON UNIX

Problem ID          : 1007974.6
Affected Platforms  : Generic: not platform specific
Affected Products   : Oracle Server - Enterprise Edition V7
Affected Components : RDBMS Generic
Affected Oracle Vsn : Generic

Summary:
UNDERSTANDING SHARED MEMORY/SEMAPHORES ON UNIX
+=+

Understanding shared memory and semaphores on UNIX

This bulletin is designed to explain the proper usage of shared memory and semaphores on UNIX.

Search Words: ORA-07421 ORA-07429 ORA-07430 (Additional errors which this
               PRE addresses are included in the error listing associated with
               the entry.)

+==+

Diagnostics and References:

2. Soln# 2055995.6 UNDERSTANDING SHARED MEMORY/SEMAPHORES ON UNIX

Solution ID         : 2055995.6
For Problem         : 1007974.6
Affected Platforms  : Generic: not platform specific
Affected Products   : Oracle Server - Enterprise Edition V7
Affected Components : RDBMS Generic
Affected Oracle Vsn : Generic

Summary:
UNDERSTANDING SHARED MEMORY/SEMAPHORES ON UNIX
+=+

							      
            Understanding Oracle and Shared Memory/Semaphores

Shared memory and semaphores are two important resources for an Oracle instance on Unix. An instance cannot start if it is unable to allocate what it needs. This paper primarily discusses the process Oracle goes through to allocate shared memory and semaphores at instance startup. Other important points unrelated to startup, as well as some troubleshooting information, will also be touched on.

General

Shared memory is exactly that - a memory region that can be shared between different processes. Oracle uses shared memory for implementing the SGA, which needs to be visible to all database sessions. Shared memory is also used in the implementation of the SQL*Net V1 Fast driver as a means of communicating between the application and shadow process. On the RS/6000, each shadow process stores its PGA in a shared memory segment (however, only the shadow attaches this segment). In the latter two cases, Oracle allocates the shared memory dynamically as opposed to the allocation of the SGA, which occurs at instance startup. This allocation will not be discussed in this paper.

Semaphores can be thought of as flags (hence their name, semaphores). They are either on or off. A process can turn on the flag or turn it off. If the flag is already on, processes that try to turn on the flag will sleep until the flag is off. Upon awakening, the process will reattempt to turn the flag on, possibly succeeding or possibly sleeping again. Such behaviour allows semaphores to be used in implementing a post-wait driver - a system where processes can wait for events (i.e., wait on turning on a semaphore) and post events (i.e., turning off a semaphore). This mechanism is used by Oracle to maintain concurrency control over the SGA, since it is writeable by all processes attached. Also, for the same reasons, use of the Fast Driver requires additional semaphores. However, these semaphores will be allocated dynamically instead of at instance startup. This allocation will not be discussed in this paper.
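The post-wait idea described above can be illustrated with a small sketch. This is not Oracle's post-wait driver; it uses Python's threading.Semaphore, and the function name run_post_wait is made up for the example. The semaphore starts at zero, so the waiter sleeps until the poster releases it.

```python
import threading


def run_post_wait():
    """Show one process-like thread waiting for an event and another posting it."""
    events = []
    # A count of 0 means the first acquire() will sleep until someone posts.
    sem = threading.Semaphore(0)

    def waiter():
        sem.acquire()                 # wait: sleeps until the event is posted
        events.append("waiter woke up")

    thread = threading.Thread(target=waiter)
    thread.start()

    events.append("poster posts event")
    sem.release()                     # post: wakes the sleeping waiter
    thread.join()
    return events


if __name__ == "__main__":
    for line in run_post_wait():
        print(line)
```

The waiter can only record its message after the poster has released the semaphore, which is the ordering guarantee a post-wait mechanism relies on.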

Instance startup

On instance startup, the first things that are done are: read the init.ora, start the background processes, and allocate the shared memory and semaphores required. The size of the SGA will be calculated from various init.ora parameters; this will be the amount of shared memory required. The SGA is broken into 4 sections - the fixed portion, which is constant in size, the variable portion, which varies in size depending on init.ora parameters, the redo block buffer, which has its size controlled by log_buffer, and the db block buffer, which has its size controlled by db_block_buffers. The size of the SGA is the sum of the sizes of the 4 portions. There is unfortunately no simple formula for determining the size of the variable portion. Generally, the shared pool dominates all other parts of the variable portion, so as a rule of thumb, one can estimate the size as the value of shared_pool_size (in v6, one can ignore the size of the variable portion). The number of semaphores required is much simpler to determine. Oracle will need exactly as many semaphores as the value of the processes init.ora parameter. Note that the recommended kernel parameter values in the ICG are enough to support the default database (4M SGA, 50 processes), but may be insufficient to run a larger instance. With the above estimations and the information which follows, a DBA should be able to build a kernel with appropriate settings to support the instance.
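The estimate above can be written out as a small calculation. This is only a sketch of the rule of thumb, not the computation Oracle performs: the function and argument names are invented for the example, the fixed-portion size and all sample values are hypothetical, and it assumes the db block buffer occupies db_block_buffers multiplied by the database block size.

```python
def estimate_sga_bytes(fixed_bytes, shared_pool_size, redo_buffer_bytes,
                       db_block_buffers, db_block_size):
    """Rule-of-thumb SGA size: the sum of the four portions."""
    variable_bytes = shared_pool_size                 # shared pool dominates the variable portion
    block_buffer_bytes = db_block_buffers * db_block_size
    return fixed_bytes + variable_bytes + redo_buffer_bytes + block_buffer_bytes


def semaphores_needed(processes):
    """Oracle needs as many semaphores as the processes init.ora parameter."""
    return processes


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the arithmetic.
    size = estimate_sga_bytes(fixed_bytes=50_000,
                              shared_pool_size=3_500_000,
                              redo_buffer_bytes=8_192,
                              db_block_buffers=200,
                              db_block_size=2_048)
    print("estimated SGA bytes:", size)
    print("semaphores needed:", semaphores_needed(50))
```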

Shared memory allocation

Oracle has 3 different possible models for the SGA - one-segment, contiguous multi-segment, and non-contiguous multi-segment. When attempting to allocate and attach shared memory for the SGA, it will attempt each one, in the above order, until one succeeds or raises an ORA error. On other, non-fatal, errors, Oracle simply cleans up and tries again using the next memory model. The entire SGA must fit into shared memory, so the total amount of shared memory allocated under any model will be equal to the size of the SGA. This calculated value will be referred to below as SGASIZE.

NOTE: Models implemented are operating system dependent.

The one-segment model is the simplest and first model tried. In this model, the SGA resides in only one shared memory segment. Oracle attempts to allocate and attach one shared memory segment of size equal to the total size of the SGA. However, if the SGASIZE is larger than the configured SHMMAX, this will obviously fail (with EINVAL). In this case, the SGA will need to be placed in multiple shared memory segments, and Oracle proceeds to the next memory model for the SGA. If an error other than EINVAL occurs when allocating the shared memory with shmget(), Oracle will raise an ORA-7306. If the segment was gotten (i.e. if SHMMAX > SGASIZE), Oracle attempts to attach it at the start address defined in ksms.o. An error on the attach will raise an ORA-7307. Note: ksms.o is only used in SVR4-based Operating Systems such as Solaris and is not used in BSD-based Operating Systems such as Sun4 and Ultrix.

With multiple segments, there are two possibilities. The segments can be attached contiguously, so that the SGA appears to be one large shared memory segment, or non-contiguously, with gaps between the segments. The former wastes less space that could be used for the stack or heap, but depending on alignment requirements for shared memory (defined by SHMLBA in the kernel), it may not be possible. At this point, Oracle needs to determine SHMMAX so it can determine how many segments will be required. This is done via a binary search algorithm over the range [1...SGASIZE] (since Oracle is trying this model and not the one-segment model, it must be that SHMMAX is smaller than SGASIZE). [...]

[...] the sgadef.dbf file is used to get the necessary information. In version 7, the SGA itself contains the information about the shared memory and semaphores (how the bootstrap works will be explained later). In either case, the information stored is the same - the key, id, size, and attach address of each shared memory segment and the key, id, and size of each semaphore set. Note that we need not do anything special to initialize the semaphores. We can use them with the data structure we read in on connecting.
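The binary search for SHMMAX over the range [1...SGASIZE] mentioned above can be sketched as follows. This is an illustration under assumptions, not Oracle's code: find_shmmax and can_allocate are hypothetical names, and can_allocate stands in for a trial shmget() call that is assumed to succeed for every size up to SHMMAX and fail for anything larger.

```python
def find_shmmax(sga_size, can_allocate):
    """Return the largest size in [1, sga_size] that can_allocate accepts, or 0 if none."""
    low, high = 1, sga_size
    best = 0
    while low <= high:
        mid = (low + high) // 2
        if can_allocate(mid):
            best = mid          # mid works; look for something larger
            low = mid + 1
        else:
            high = mid - 1      # mid is too big; look for something smaller
    return best


if __name__ == "__main__":
    MB = 1024 * 1024
    # Pretend the kernel limit is 32 MB and the SGA is 100 MB (hypothetical values).
    limit = 32 * MB
    print(find_shmmax(100 * MB, lambda size: size <= limit))
```

The search needs only about log2(SGASIZE) trial allocations, which is why it is practical even for large SGAs.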

The version 6 approach is rather simple. It first tries to open the sgadef.dbf file. If it cannot, an ORA-7318 is raised. Once opened, the data written earlier on startup is read. If an error occurs for some reason on the read, an ORA-7319 occurs. Once all the data is read in, Oracle attaches each segment in turn. First, it generates what it believes the key for the segment should be. It then gets that segment, returning ORA-7429 if it fails. The key used and the key stored are then compared. They should be equal, but if not, an ORA-7430 occurs. Once the key is verified, the segment is attached. A failure to attach the segment raises an ORA-7320. If the segment is attached, but not at the address we requested, an ORA-7321 occurs. This process is repeated for all segments until the entire SGA is attached.

Version 7 differs only in the first part, when the shared memory and semaphore data is read. Once that data is read in, Oracle proceeds in the same manner. To fetch this data, Oracle generates what it thinks should be the key for the first segment of the SGA and attaches it, as if it were the only segment. Once it is attached, the data is copied from the SGA. With this data, Oracle attaches any remaining segments for the SGA. There is one possible problem. If somehow two instances have a key collision (i.e. they both generate the same key for their first segment), it is possible to have only one of the two instances up at a time! Connection attempts to either one will connect a user to whichever instance is up. This is rare, but can happen. Development is currently working on a better key generation algorithm.

Attaching shared memory

As seen in previous sections, shared memory must be gotten (this may mean allocating the shared memory, but not necessarily) and then attached, to be used. Attaching shared memory brings the shared memory into the process' memory space. There are some important things about attach addresses. For one thing, they may need to be aligned on some boundary (generally defined by SHMLBA). More importantly, shared memory must be mapped to pages in the process' memory space which are unaccounted for. Every process already has a text, a data, and a stack segment laid out as follows (in general):

               +---------+  high addresses
               |  stack  |
               |---------| -+
               |    |    |  |
               |    v    |  |
               |---------|  |
               | shm seg |  |- unused portion
               |---------|  |  These are valid pages for shared memory
               |    ^    |  |  Pages are allocated from this area
               |    |    |  |  as both the stack and heap(data) grow
               |---------| -+
               |   data  |
               |---------|
               |   text  |
               +---------+  low addresses

So valid attach addresses lie in the unused region between the stack and the data segments (a shared memory segment is drawn in the diagram to aid in visualization - not every process has shared memory attached!). Of course, the validity also depends on the size of the segment, since it cannot overlap another segment. Note that both the stack and data segments can grow during the life of a process. Because segments must be contiguous and overlapping is not allowed, this is of some importance. Attaching shared memory creates a limit on how much the stack or data segment can grow. Limiting the stack is typically not a problem, except when running deeply recursive code. Neither is limiting the data segment, but this does restrict the amount of memory that can be dynamically allocated by a program. It is possible (but rare) that some applications running against the database may hit this limit in the shadow (since the shadow has the SGA attached). This is the cause of ORA-7324 and ORA-7325 errors. How to deal with these is discussed in the troubleshooting section.

The SGA is attached, depending on the allocation model used, more or less contiguously (there may be gaps, but those can be treated as if they were part of the shared memory). So where the beginning of the SGA can be attached depends on the SGA's size. The default address which is chosen by Oracle is generally sufficient for most SGAs. However, it may be necessary to relocate the SGA for very large sizes. It may also need to be changed if ORA-7324 or ORA-7325 errors are occurring. The beginning attach address is defined in the file ksms.s. Changing the attach address requires recompilation of the Oracle kernel and should not be done without first consulting Oracle personnel. Unfortunately, there is no good way to determine what a good attach address will be. When changing the address to allow a larger SGA, a good rule of thumb is taking the default attach address in ksms.s and subtracting the size of the SGA. The validity of an attach address can be tested with the Oracle provided tstshm executable. Using tstshm -t -b will determine if the address is usable or not.
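The rule of thumb above, combined with the SHMLBA alignment requirement, amounts to a subtraction followed by rounding down. The sketch below only shows that arithmetic; the function names and every value in it (default address, SGA size, SHMLBA) are hypothetical, and a candidate computed this way would still need to be checked with tstshm.

```python
def align_down(address, shmlba):
    """Round an address down to the nearest multiple of SHMLBA."""
    return address - (address % shmlba)


def candidate_attach_address(default_address, sga_size, shmlba):
    """Default attach address minus the SGA size, aligned to SHMLBA."""
    return align_down(default_address - sga_size, shmlba)


if __name__ == "__main__":
    MB = 1024 * 1024
    # Hypothetical numbers: not taken from any real ksms.s.
    address = candidate_attach_address(default_address=0x80000000,
                                       sga_size=300 * MB + 100,
                                       shmlba=0x2000)
    print(hex(address))
```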

Troubleshooting

Errors which might have multiple causes are discussed in this section. Errors not mentioned here generally have only one cause which has a typically obvious solution.

ORA-7306, ORA-7336, ORA-7329
Oracle received a system error on a shmget() call. The system error should be reported. There are a few possibilities:

There is insufficient shared memory available. This is indicated by the operating system error ENOSPC. Most likely, SHMMNI is too small. Alternatively, there may be shared memory already allocated; if it is not attached, perhaps it can be freed. Maybe shared memory isn't configured in the kernel.
There is insufficient memory available. Remember, shared memory needs pages of virtual memory. The system error ENOMEM indicates there is insufficient virtual memory. Swap needs to be increased, either by adding more or by freeing currently used swap (i.e. free other shared memory, kill other processes).
The size of the shared memory segment requested is invalid. In this case, EINVAL is returned by the system. This should be very rare - however, it is possible. This can occur if SHMMAX is not a multiple of the page size and Oracle is trying a multi-segment model. Remember that Oracle rounds its calculation of SHMMAX to a page boundary, so it may have rounded it up past the real SHMMAX! (Whether this is a bug is debatable).
The shared memory segment does not exist. This would be indicated by the system error ENOENT. This would never happen on startup; it only would happen on connects. The shared memory most likely has been removed unexpectedly by someone or the instance is down.

ORA-7307, ORA-7337, ORA-7320
Oracle received a system error on a shmat() call. The system error should be reported. There are a few possibilities:

The attach address is bad. If this is the cause, EINVAL is returned by the system. Refer to the section on the attach address to see why the attach address might be bad. This may happen after enlarging the SGA.
The permissions on the segment do not allow the process to attach it. The operating system error will be EACCES. Generally the cause of this is either the setuid bit is not turned on for the oracle executable, or root started the database (and happens to own the shared memory). Normally, this would be seen only on connects.
The process cannot attach any more shared memory segments. This would be accompanied by the system error EMFILE. SHMSEG is too small. Note that as long as SHMSEG is greater than SS_SEG_MAX, you should never see this happen.

ORA-7329, ORA-7334
Oracle has determined the SGA needs too many shared memory segments. Since you can't change the limit on the number of segments, you should instead increase SHMMAX so that fewer segments are required.

ORA-7339
Oracle has determined it needs too many semaphore sets. Since you can't change the limit on the number of semaphore sets, you should increase SEMMSL so fewer sets are required.
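Whether these limits will be hit can be estimated ahead of time. The sketch below assumes that every segment is filled up to SHMMAX and every semaphore set up to SEMMSL, which is a simplification rather than a statement of how Oracle packs them. The limits 20 and 10 are the "normally set" values of SS_SEG_MAX and SS_SEM_MAX listed under Oracle parameters; the function names and sample sizes are hypothetical.

```python
SS_SEG_MAX = 20   # normal compile-time limit on SGA segments
SS_SEM_MAX = 10   # normal compile-time limit on semaphore sets


def ceil_div(value, unit):
    """Integer division rounded up."""
    return -(-value // unit)


def segments_required(sga_size, shmmax):
    """Segments needed if each one is as large as SHMMAX allows."""
    return ceil_div(sga_size, shmmax)


def semaphore_sets_required(processes, semmsl):
    """Semaphore sets needed if each set holds SEMMSL semaphores."""
    return ceil_div(processes, semmsl)


if __name__ == "__main__":
    MB = 1024 * 1024
    segments = segments_required(500 * MB, 16 * MB)
    sets = semaphore_sets_required(300, 25)
    print("segments:", segments, "(too many)" if segments > SS_SEG_MAX else "(ok)")
    print("semaphore sets:", sets, "(too many)" if sets > SS_SEM_MAX else "(ok)")
```

In this hypothetical case both limits are exceeded, so SHMMAX and SEMMSL would have to be raised.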

ORA-7250, ORA-7279, ORA-7252
Oracle received a system error on a semget() call. The system error should be reported. There should be only one system error ever returned with this, ENOSPC. This can mean one of two things. Either the system limit on semaphore sets has been reached or the system limit on the total number of semaphores has been reached. Raise SEMMNI or SEMMNS, as is appropriate, or perhaps there are some semaphore sets which can be released. In the case of ORA-7250, ORANSEMS may be set too high (>SEMMSL). If it is, raise SEMMSL or decrease ORANSEMS.

ORA-7251
Oracle failed to allocate even a semaphore set of only one semaphore. It is likely that semaphores are not configured in the kernel.

ORA-7318
Oracle could not open the sgadef file. The system error number will be returned. There are a few possible causes:

The file doesn't exist. In this case, the system error ENOENT is returned. Maybe ORACLE_SID or ORACLE_HOME is set wrong so that Oracle is looking in the wrong place. Possibly the file does not exist (in this case, a restart is necessary to allow connections again).
The file can't be accessed for reading. The operating system error returned with this is EACCES. The permissions on the file (or maybe directories) don't allow an open for reading of the sgadef file. It might not be owned by the oracle owner. The setuid bit might not be turned on for the oracle executable.

ORA-7319
Oracle did not find all the data it expected when reading the sgadef.dbf file. Most likely the file has been truncated. The only recovery is to restart the instance.

ORA-7430
Oracle expected a key to be used for the segment which does not match the key stored in the shared memory and semaphore data structure. This probably indicates a corruption of the sgadef file (in version 6) or the data in the first segment of the SGA (in version 7). A restart of the instance is probably necessary to recover in that case. It may also be a key collision problem, with Oracle attached to the wrong instance.

ORA-7321
Oracle was able to attach the segment, but not at the address it requested. In most cases, this would be caused by corrupted data in the sgadef file (in version 6) or the first segment of the SGA (in version 7). A restart of the database may be necessary to recover.

ORA-7324, ORA-7325
Oracle was unable to allocate memory. Most likely, the heap (data segment) has grown into the bottom of the SGA. Relocating the SGA to a higher attach address may help, but there may be other causes. Memory leaks can cause this error. The init.ora parameter sort_area_size may be too large; decreasing it may resolve the error. The init.ora parameter context_incr may also be too large; decreasing it may resolve this.

ORA-7264, ORA-7265
Oracle was unable to decrement/increment a semaphore. This generally is accompanied by the system error EINVAL and a number which is the identifier of the semaphore set. This is almost always because the semaphore set was removed, but the shadow process was not aware of it (generally due to a shutdown abort or instance crash). This error is usually ignorable.

System Parameters

SHMMAX - kernel parameter controlling maximum size of one shared memory
         segment
SHMMNI - kernel parameter controlling maximum number of shared memory
         segments in the system
SHMSEG - kernel parameter controlling maximum number of shared memory
         segments a process can attach
SEMMNS - kernel parameter controlling maximum number of semaphores in
         the system
SEMMNI - kernel parameter controlling maximum number of semaphore
         sets. Semaphores in Unix are allocated in sets of 1 to SEMMSL.
SEMMSL - kernel parameter controlling maximum number of semaphores in a
         semaphore set.
SHMLBA - kernel parameter controlling alignment of shared memory
         segments; all segments must be attached at multiples of this value.
         Typically, non-tunable.

System errors

ENOENT - No such file or directory, system error number 2
ENOMEM - Not enough core, system error number 12
EACCES - Permission denied, system error number 13
EINVAL - Invalid argument, system error number 22
EMFILE - Too many open files, system error number 24
ENOSPC - No space left on device, system error number 28
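When an ORA error reports only the system error number, the table above can be reproduced on the machine at hand. The sketch below uses Python's standard errno module and os.strerror; the helper name errno_table is made up for the example, and the message text printed depends on the operating system.

```python
import errno
import os

# The system errors discussed in this bulletin.
ERROR_NAMES = ["ENOENT", "ENOMEM", "EACCES", "EINVAL", "EMFILE", "ENOSPC"]


def errno_table():
    """Map each symbolic error name to its number on this system."""
    return {name: getattr(errno, name) for name in ERROR_NAMES}


if __name__ == "__main__":
    for name, number in errno_table().items():
        print(f"{name} = {number}: {os.strerror(number)}")
```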

Oracle parameters

SS_SEG_MAX - Oracle parameter specified at compile time (therefore,
             unmodifiable without an Oracle patch) which defines maximum
             number of segments the SGA can reside in.  Normally set to 20.
SS_SEM_MAX - Oracle parameter specified at compile time (therefore,
             unmodifiable without an Oracle patch) which defines maximum
             number of semaphore sets Oracle will allocate.  Normally set
             to 10.

+==+

References:

 ref: {4684.6}     PRE-1010332.6
 ref: {5296.6}     PRE-1006475.6

-------------------------------------------------------------------------
Doc ID              : Note:1016792.4
Subject             : ORA-00600 ORA-27700 WHEN STARTING SECOND INSTANCE
Type                : PROBLEM
Status              : PUBLISHED
Content Type        : TEXT/PLAIN
Creation Date       : 01-MAY-1998
Last Revision Date  : 01-MAR-2000
Language            : USAENG

Problem Description:

You want to bring up a second instance on the server. You have raised the kernel parameter SHMMAX to 500 MB to accommodate the increase in your SGA. When bringing up the second instance, you receive the following errors:

    ORA-00600:  [SKGMBUSY][1] [0]                                            
    ORA-27700:  Shared memory realm already exists.                         
    svr4 error 17:  file exists

Problem Explanation:

One or more of the kernel parameters is misspelled.

Search Words:

Sun, Solaris, 8.0.4

Solution: VERIFY KERNEL PARAMETERS ARE SPELLED CORRECTLY

Solution Description:

Ensure you have not misspelled any kernel parameters that are typically set for an Oracle installation, particularly shared memory. Check the configuration of your Operating System kernel and ensure the parameters for shared memory are set in accordance with the Installation and Configuration Guide.

The Installation and Configuration Guide for the version of the database you are running will list the recommended values for the kernel parameters for your system.

Run the following command to print the kernel settings to the screen:

% /etc/sysdef

Compare the values of the modified kernel parameters output by running this command to the values defined in /etc/system. Where the values differ, check the spelling of the parameter in /etc/system against the spelling defined in the Operating System documentation.

Correct the spelling in accordance with the Installation and Configuration Guide or Operating System documentation.
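A spelling check of this kind can be partly automated. The sketch below is a hedged illustration only: the function name is hypothetical, and the list of valid names is an assumption based on commonly documented Solaris shared memory and semaphore tunables, so it must be verified against the Operating System documentation for the release in use. It assumes entries of the form "set module:parameter=value" and comment lines beginning with "*".

```python
# ASSUMPTION: parameter names to be confirmed against the OS documentation.
KNOWN_PARAMETERS = {
    "shmsys": {"shminfo_shmmax", "shminfo_shmmin", "shminfo_shmmni",
               "shminfo_shmseg"},
    "semsys": {"seminfo_semmni", "seminfo_semmns", "seminfo_semmsl",
               "seminfo_semopm", "seminfo_semvmx"},
}


def find_unknown_parameters(lines):
    """Return (line number, name) for shmsys/semsys entries not in the known list."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("*"):      # blank line or comment
            continue
        if line.startswith("set "):
            line = line[4:].strip()
        name = line.split("=", 1)[0].strip()
        if ":" not in name:
            continue
        module, parameter = name.split(":", 1)
        if module in KNOWN_PARAMETERS and parameter not in KNOWN_PARAMETERS[module]:
            problems.append((number, name))
    return problems


if __name__ == "__main__":
    sample = [
        "* shared memory and semaphore settings",
        "set shmsys:shminfo_shmmax=524288000",
        "set semsys:seminfo_semnms=200",          # deliberately misspelled
    ]
    for number, name in find_unknown_parameters(sample):
        print(f"line {number}: unrecognized parameter {name}")
```

To check a real system, the lines of /etc/system could be read from the file and passed to the function in place of the sample list.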

To reconfigure the kernel, perform the following as root:

# reboot -- -r

This will reconfigure the kernel.

Confirm the modified kernel settings by re-running:

% /etc/sysdef

Center of Expertise Research Articles

Solaris ISM and Oracle, Frequently asked Questions

What is ISM?

ISM, or Intimate Shared Memory, is a way of handling the page table entries by the Sun Solaris operating system.

There is one memory structure in Sun Solaris that is used for keeping track of all the process page table information. This structure is essentially a series of hash chains, similar to Oracle's concept of LRU latch chains or cache buffers chains. During memory operations, this structure is traversed by the operating system as it makes decisions about how to handle memory. Each process on the system has information in this memory structure. There are only 256 entry points into this structure and this number cannot be increased.

Consequently, as more and more memory mappings and operations occur, the information stored behind each entry point grows (the "chains" lengthen). As the operating system responds to higher activity requiring memory mappings, it spends more and more time in kernel mode (shown as %sys in sar output) just walking around in this structure deciding what to do. Having a large number of users and a large Oracle SGA further aggravates this situation.

The use of ISM reduces this problem significantly because it allows the processes to share the page table entries. Essentially, complex operations on these memory chains reduce to pointer operations involving small amounts of data.

What are the issues related to ISM?

ISM reduces the number of system calls, thereby improving overall performance when there are large memory allocations and a large number of users on the system. In theory, this should cause no adverse effects and should increase the overall performance of the system. Unfortunately, due to various Solaris operating system bugs, enabling ISM can cause Oracle data corruptions or crashes. The relevant Sun bug numbers are:

4244523: Data corruption in ISM shared memory segs with heavy load/multi-threaded apps
4255955: With enable_grp_ism=1 on E10000, 5.6 -15 KJP, oracle 7.3.4 crashes

The Sun base bug number is 4228856, which is not published.

In addition to the above Sun bugs, there are many Oracle bugs and duplicate Sun bugs pointing to the same problem:

If ISM is enabled on Sun E10000 systems, Oracle data corruptions may occur. This is especially true when the Dynamic Reconfiguration (DR) feature is also enabled.

There are three very important points one should be aware of:

The problem is specific to Sun E10000 models.
The problem is more prominent when DR is enabled.
These problems are fixed in Sun Kernel patch level 16, a.k.a. Sun OS 5.6 Patch ID# 105181-16.

What causes the corruption?

Sun E10000 models have a new processor model featuring a new type of CPU register. When ISM is turned on, the shared memory image coherency across processes becomes inconsistent under some circumstances. This is caused by the CPU cache getting out-of-sync with the on-disk data and the register not being flushed. The effect of this problem is reflected as corrupt Oracle blocks, where the block header does not match the tail. In almost all cases, the block header has a pattern that exists in all of the ORA-1578 trace files.

It is very important to remember that data block corruption can be caused by many other factors, such as hardware failure, logical volume manager bugs or Oracle bugs. Encountering a corruption on a Sun E10000 does not automatically imply it is caused by ISM; any such corruption should be reported to the appropriate channels for detailed analysis.

How is ISM enabled?

To make full use of ISM, it must be enabled both at the Solaris and Oracle level. The default behavior of Solaris 5.6 and Oracle is to enable the use of ISM. However, due to the issues discussed above, support analysts used to recommend turning off ISM usage by setting the following parameters:

In /etc/system:

shmsys:ism_off=1
shmsys:share_page_table=1

In init.ora

use_ism=false

Note that the above parameters will turn OFF ISM. To enable ISM, simply remove those lines from the configuration files. However, Oracle will not use ISM if the entire SGA cannot fit in a contiguous shared memory area and will not report any error messages. If the system memory is fragmented and a contiguous shared area cannot be allocated for the Oracle SGA, the system needs to be rebooted.

How much performance gain does ISM give?

The performance gain one can expect after enabling ISM depends largely on the utilization of the system, especially the number of users and amount of shared memory used. On busy Oracle systems with many concurrent users (>200) and large SGA (>1G), we have observed as much as a 30% performance improvement. There are also cases where the system may seem like it's hung but the kernel CPU usage is so high that no other activity can take place. Simply enabling ISM reduced the kernel CPU usage, eliminating the hanging situations.

In general, the less memory the system allocates for process page tables, the less the overhead. Unfortunately, there is no direct interface to see the memory allocated for this purpose, other than the "crash" utility. The cache name for process page tables is "sfmmu8_cache" and can be checked by running the "crash" utility as the "root" user as follows:

# crash
dumpfile = /dev/mem, namelist = /dev/ksyms, outfile = stdout
> kmastat

                            buf   buf   buf   memory    #allocations 
cache name                 size avail total   in use    succeed fail 
----------                ----- ----- ----- --------    ------- ---- 
... 
sfmmu8_cache                232  8297 43925 10280960     124873    0 
... 
----------                ----- ----- ----- --------    ------- ---- 
permanent                     -     -     -   114688       1155    0 
oversize                      -     -     - 16433152     381109    0 
----------                ----- ----- ----- --------    ------- ---- 
Total                         -     -     - 122658816 1545645581    0

> quit
#

If ISM is enabled, this cache should not grow as more users connect; it should stay at the same level after reaching a stable value. We have observed less than 100M of memory usage on very busy systems, with more than 300 users and an SGA size of 2G, as opposed to 1G of memory usage without ISM.
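To track this cache over time, the sfmmu8_cache line of the kmastat output shown earlier can be picked apart with a few lines of script. The sketch below assumes the column layout of that sample output (cache name, buf size, buf avail, buf total, memory in use, successful allocations, failed allocations); the function name is hypothetical, and other Solaris releases may format the output differently.

```python
def parse_kmastat_line(line):
    """Split one kmastat cache line into named integer fields."""
    fields = line.split()
    numbers = [int(value) for value in fields[1:7]]
    return {
        "cache_name": fields[0],
        "buf_size": numbers[0],
        "buf_avail": numbers[1],
        "buf_total": numbers[2],
        "memory_in_use": numbers[3],
        "alloc_succeed": numbers[4],
        "alloc_fail": numbers[5],
    }


if __name__ == "__main__":
    # Line copied from the sample kmastat output in this note.
    sample = "sfmmu8_cache                232  8297 43925 10280960     124873    0"
    stats = parse_kmastat_line(sample)
    megabytes = stats["memory_in_use"] / (1024 * 1024)
    print(f"{stats['cache_name']}: {megabytes:.1f} MB in use")
```

Running the same check periodically while users connect shows whether the cache has levelled off.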

What is the bottom line?

With the introduction of Sun Kernel patch 16 (Sun OS 5.6 Patch ID# 105181-16), all known problems related to ISM have been fixed. We should encourage Oracle customers to make use of this facility, as it has a significant performance impact and is definitely worthwhile. Customers should make sure they have applied the latest Sun Kernel patch and remove the parameters disabling the use of ISM, if they were set to prevent corruption problems in the past.

Sun already published a note informing customers on how to make this change:

Infodoc ID: 20823
Synopsis : Information about hangs on E10k systems due to disabling of Intimate Shared Memory (ISM) in /etc/system or Oracle's init.ora file
Date : 7 Oct 1999

There was a kernel bug, 4244523, that required a temporary workaround which was to turn off ISM in /etc/system and in the database application, for example, Oracle's init.ora file. This was fixed in the Solaris 5.6 kernel update patch 105181-15.

Unfortunately some customers may forget to remove the modifications to /etc/system and init.ora after upgrading their kernel. ISM is enabled by default. The following should NOT be in /etc/system:

shmsys:ism_off=1
shmsys:share_page_table=1

In addition Oracle's init.ora should NOT have:

use_ism=false

Turning off ISM can cause severe performance degradation and what appears to be a hung state.

Acknowledgements

Most of the information in this document is compiled from internal mailing lists, Sun Microsystems' Support Web Page (sunsolve.sun.com) and the author's field experience at various customer sites.

I would also like to thank Vern E. Wagman for his case study on Solaris 2.6, Veritas 3.3.1 and ISM.

Doc ID              : Note:1010984.6
Subject             : "SPCRE: SEMGET ERROR, COULD NOT ALLOCATE SEMAPHORES." ON DATABASE STARTUP
Type                : PROBLEM
Status              : PUBLISHED
Content Type        : TEXT/PLAIN
Creation Date       : 07-JUL-1995
Last Revision Date  : 28-MAR-2000
Language            : USAENG

Problem Description:

UNIX kernel parameters in "/etc/system" are set to a value recommended in the Installation and Configuration Guide (or greater), but database startup returns the error:

ORA-07252 "spcre: semget error, could not allocate semaphores."

 // *Cause:  Semget system call returned an error. Possible resource limit 
 //          problem. 
 // *Action: Check errno. Verify that enough semaphores are available in system. 
 //          If additional errors occur in destroying the semaphore sets then 
 //          sercose[0] will be non-zero. If this occurs, remove the semaphore 
 //          sets using ip.

Problem Explanation:

If there are syntax errors or incorrectly spelled kernel parameters in the "/etc/system" file, then the operating system will use the default parameter values; if these default values are insufficient, the above error will occur.

Search Words:

SEMMNI, SEMMNS, SEMMSL, configure, configuration, value, UNIX, etc/system, regenerate, regenerating, reconfigure

Solution: BE SURE KERNEL PARAMETERS IN "/ETC/SYSTEM" ARE SET CORRECTLY

Solution Description:

You need to check the syntax of the UNIX system parameters listed in the "/etc/system" file.

         SEMMNS 
         SEMMNI 
         SEMMSL

Correct any misspelled parameters and regenerate the kernel.
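The syntax check described above can be partly automated. The following is a minimal sketch, assuming Solaris-style "set module:variable=value" lines; the function name is hypothetical and KNOWN_TUNABLES is only an illustrative subset of the semaphore and shared memory parameters, not a complete list for any release.

```python
import re

# Illustrative subset of Solaris IPC tunables, keyed by kernel module (assumption:
# extend this table from your own Installation and Configuration Guide).
KNOWN_TUNABLES = {
    "semsys": {"seminfo_semmns", "seminfo_semmni", "seminfo_semmsl",
               "seminfo_semopm", "seminfo_semmnu"},
    "shmsys": {"shminfo_shmmax", "shminfo_shmmin", "shminfo_shmmni",
               "shminfo_shmseg"},
}

SET_LINE = re.compile(r"^\s*set\s+(\w+)\s*:\s*(\w+)\s*=\s*(\S+)\s*$", re.IGNORECASE)


def check_etc_system(lines):
    """Return (line number, message) for suspicious semsys/shmsys settings."""
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("*"):
            continue  # blank line or comment
        match = SET_LINE.match(stripped)
        if match is None:
            # Only complain about lines that were clearly meant as IPC settings.
            if "semsys" in stripped.lower() or "shmsys" in stripped.lower():
                problems.append((number, "malformed set line"))
            continue
        module, name = match.group(1).lower(), match.group(2).lower()
        if module in KNOWN_TUNABLES and name not in KNOWN_TUNABLES[module]:
            problems.append((number, f"unknown {module} tunable: {name}"))
    return problems
```

A semaphore parameter placed under the shared memory module (for example "set shmsys:shminfo_semmsl=2048") is reported as unknown, which is exactly the kind of entry that silently falls back to the default value.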

Doc ID:             Note:1067184.6
Subject:            ORA-27146: STARTING THE DATABASE AFTER MODIFYING PARAMETER PROCESSES IN "INIT.ORA"
Type:               PROBLEM
Status:             PUBLISHED
Content Type:       TEXT/PLAIN
Creation Date:      12-MAR-1999
Last Revision Date: 25-MAY-2000
Language:           USAENG

Problem Description:

You have modified the PROCESSES parameter in the "init.ora" file. Upon database startup you receive the following error:

  ORA-27146: post/wait initialization failed 
      Cause:  OS system call failed 
     Action: check errno and contact Oracle Support
  
 NOTE: If you check the alert log, it may contain other messages that point to 
       a trace file generated in the "$ORACLE_HOME/rdbms/log" directory.

Solution Description:

You can either decrease the PROCESSES parameter in the init<SID>.ora file or increase the number of OS semaphores.

To decrease the PROCESSES parameter, edit the init<SID>.ora parameter file and change the value for PROCESSES.

To increase the number of OS semaphores, do the following:

1. Log in as root.
2. Edit the file "/etc/system".
3. Increase the value of "set semsys:seminfo_semmns" to approximately twice the value of the init<SID>.ora parameter PROCESSES. The install guide recommends starting with SEMMNS set to 200.
4. Save the change.
5. Reboot the machine.

Explanation:

You may have exceeded the number of available OS semaphores. You must make certain that there are sufficient semaphores for each Oracle process.
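The rule of thumb in this note (SEMMNS at roughly twice PROCESSES, starting from 200) can be written down as a small calculation. This is a sketch only: the function names are hypothetical, and both treating 200 as a lower bound and summing PROCESSES across several instances on one machine are assumptions, not statements from the install guide.

```python
def suggest_semmns(processes_per_instance, install_guide_minimum=200):
    """Suggest a SEMMNS value of about twice the total PROCESSES setting.

    processes_per_instance: list with the PROCESSES value of each instance
    that runs on the machine (summing them is an assumption of this sketch).
    """
    total_processes = sum(processes_per_instance)
    return max(install_guide_minimum, 2 * total_processes)


def etc_system_line(semmns):
    """Format the value in the "/etc/system" syntax used in this note."""
    return f"set semsys:seminfo_semmns={semmns}"
```

For two instances with PROCESSES set to 150 and 300, suggest_semmns([150, 300]) gives 900, and etc_system_line(900) yields the line to place in "/etc/system" before rebooting.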

Reference:

[NOTE:15654.1] Calculating Oracle's SEMAPHORE Requirements
BUG:714557 WHEN OUT OF SEMAPHORES GET ORA-27146: POST/WAIT INITIALIZATION FAILED
BUG:865217 ORA-27146: POST/WAIT INITIALIZATION FAILED

Search Words:

ORA-27146,processes,SEMMNS

From: Tom Felton 08-May-00 15:49
Subject: ORA-27146 error when trying to start 2nd instance

RDBMS Version: 8.1.5.0.0
Operating System and Version: Solaris 2.6
Error Number (if applicable): ORA-27146
Product (i.e. SQL*Loader, Import, etc.): SVRMGRL
Product Version:

ORA-27146 error when trying to start 2nd instance

I have read the information contained in Doc ID:Note:1067184.6 regarding this error. I still have to shut down one instance to start another. Below is the list of the semaphores from my system:

 set shmsys:shminfo_shmmax=4294967295 
 set shmsys:shminfo_shmmmin=1 
 set shmsys:shminfo_shmmni=1600 
 set shmsys:shminfo_shmseg=160 
 set shmsys:shminfo_semmns=4096 
 set shmsys:shminfo_semmni=2048 
 set shmsys:shminfo_semmsl=2048 
 set shmsys:shminfo_semopm=2048

Based on a formula that I have found on the site the semmns should be more than enough.

Any ideas on what might be occurring?

Document ID:         100750.382
Title:               Common Semaphore Problems
Creation Date:       23 August 1991
Last Revision Date:  23 August 1991
Revision Number:     01
Product:             RDBMS
Product Version:     v6.0 v7.0 v7.1
Platform:            UNIX
Information Type:    ADVISORY
Impact:              MEDIUM
Abstract:            This bulletin describes the common problems with the UNIX
                     Semaphore System that an Oracle DBA might encounter and
                     explains how they can be avoided.
Keywords:            IPC;SEMAPHORE;ORA-7279;ORA-7251;ORA-7252;ORA-9702;SEMNS;
________________________________________________________________________________
                           COMMON SEMAPHORE PROBLEMS

OVERVIEW

This bulletin describes how Oracle uses the Unix semaphore system, and the causes of Oracle errors involving Unix semaphores. These errors often (but not always) indicate a resource problem on your Unix machine that needs to be fixed before Oracle will be able to run properly.

EXPLAINING THE UNIX SEMAPHORE SYSTEM

One of the important features Unix provides for inter-process communication is the semaphore system. Semaphores are integer-valued objects set aside by the operating system that can be incremented or decremented atomically. They are designed to allow processes to synchronize execution, by only allowing one process to perform an operation on the semaphore at a time. The other process(es) sleep until the semaphore's value is either incremented or set to 0, depending on the options used.

Semaphores are generally not used one at a time, so Unix uses the concept of semaphore sets to make it easier to allocate and refer to semaphores. When your Unix kernel is configured, the maximum number of semaphores that will be available to the system is set. Also set are the maximum number of semaphores per set, and the maximum number of sets that can be allocated. These limits can only be changed by remaking the Unix kernel and rebooting the machine.

ORACLE'S NEED FOR SEMAPHORES

Oracle uses semaphores to control concurrency between all the background processes (pmon, smon, dbwr, lgwr, and oracle shadows). Semaphores are also used to control two-task communication between the user process and shadow process if the fast (shared memory) driver is used. And in the Unix ports based on MIPS RISC processors, Oracle uses a special semaphore to perform basic test & set functions that are not provided by the processor.

SEMAPHORES IN USE

Typing "ipcs -sb" will show you what semaphores are allocated to your system at the moment. This will display all the semaphore sets allocated, their identifying number, the owner, the number of semaphores in each set, and more.

Occasionally, unexpected termination of Oracle processes will leave semaphore resources locked. If your database is not running, but "ipcs -sb" shows that semaphore sets owned by oracle are still in use, then you need to de-allocate (free) them. If you don't do this, then you may not be able to allocate enough semaphores later to restart your database.

Freeing semaphore sets is done with the "ipcrm" command. For each set that oracle has allocated, type "ipcrm -s ID" where ID is the set number you see from the "ipcs" output. Semaphores can also be freed by rebooting the system.
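The clean-up described above can be scripted by turning "ipcs" output into a list of "ipcrm" commands. The sketch below only builds the command strings and does not run them. The function name is hypothetical, and the column layout (T, ID, KEY, MODE, OWNER, GROUP, NSEMS) is an assumption based on a Solaris-style listing; the SAMPLE text is made up for illustration, so check the layout on your own platform first.

```python
def ipcrm_commands(ipcs_output, owner="oracle"):
    """Build one "ipcrm -s ID" command per semaphore set owned by `owner`.

    Assumes each semaphore row looks like: s ID KEY MODE OWNER GROUP NSEMS.
    """
    commands = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Header lines and rows for other owners are skipped.
        if len(fields) >= 5 and fields[0] == "s" and fields[4] == owner:
            commands.append(f"ipcrm -s {fields[1]}")
    return commands


# Made-up sample in the assumed layout, not output from a real system.
SAMPLE = """\
T         ID      KEY        MODE        OWNER    GROUP NSEMS
Semaphores:
s          0   0x00000000 --ra-r-----   oracle      dba    60
s          1   0x00000000 --ra-r-----   oracle      dba    40
s          2   0x000187cf --ra-ra-ra-     root     root     3
"""

for command in ipcrm_commands(SAMPLE):
    print(command)
```

Review the generated commands before running them, and only run them when the database is confirmed to be down, since removing the semaphore sets of a running instance will bring it down.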

STARTING UP THE DATABASE

Oracle allocates all the semaphores it needs for the background processes at the time of database startup. The init.ora parameter "processes" is used to determine how many semaphores will be allocated for Oracle's use. If Oracle needs more semaphores than are allowed in one set, then more sets are grabbed.

A common error received while starting up the database is

ORA-7279: spcre: semget error, unable to get first semaphore set

Your system is trying to allocate the first set of semaphores, containing either the maximum number of semaphores per set, or the value specified by the "processes" parameter, whichever is less. Either your system doesn't have enough semaphores configured, or too many semaphores or semaphore sets are already allocated. In this case, the first thing to check is if unused sets are hogging up all the system's semaphores (see above SEMAPHORES IN USE section). If that isn't the problem, you'll need to configure more semaphores on your system. If you don't have any semaphores configured, or every single one is currently allocated, you may see this error:

ORA-7251: spcre: semget error, could not allocate any semaphores

If the first full set of semaphores was successfully allocated, but the second could not be taken, this error will come up:

ORA-7252: spcre: semget error, could not allocate semaphores

Again, the problem can be resolved by making sure that the semaphore resources of your machine are not still being held by long-dead Oracle processes, and otherwise that you have configured enough to begin with.
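The allocation at startup can be modelled roughly as follows. This is a simplified sketch of the description above, not Oracle's actual algorithm: the function names are hypothetical, and it assumes that every set after the first is also filled up to the per-set maximum and that no semaphores beyond the "processes" value are requested.

```python
def plan_semaphore_sets(processes, semmsl):
    """Split `processes` semaphores into sets of at most `semmsl` each."""
    if processes < 0 or semmsl <= 0:
        raise ValueError("processes must be >= 0 and semmsl must be > 0")
    sizes = []
    remaining = processes
    while remaining > 0:
        size = min(remaining, semmsl)  # first set: whichever is less
        sizes.append(size)
        remaining -= size
    return sizes


def startup_fits(processes, semmsl, free_sets, free_semaphores):
    """Check the plan against the sets and semaphores that are still free."""
    sizes = plan_semaphore_sets(processes, semmsl)
    return len(sizes) <= free_sets and sum(sizes) <= free_semaphores
```

With "processes" set to 200 and at most 60 semaphores per set, the plan is four sets of 60, 60, 60 and 20. If only three sets are free, startup_fits(200, 60, 3, 400) is false even though 400 semaphores are available, which corresponds to the case where a later set cannot be taken.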

SHUTDOWN ABORT

When a shutdown abort is done, the Oracle background processes are killed and the semaphore sets are freed, without waiting for the user processes to finish what they're doing. The users finally find out that the database is down when they send a request to the database by incrementing (or decrementing) a semaphore. The attempt to modify the semaphore fails, and the user's process (along with its Oracle shadow process) dies. One or both of the following errors will be displayed (sometimes only in the trace file):

ORA-7264: spwat: semop error, unable to decrement semaphore
ORA-7265: sppst: semop error, unable to increment semaphore

This is an effective (if ungraceful) way of letting the users know that the database has been shut down with the abort option.

USING A UNIX MACHINE BASED ON MIPS RISC TECHNOLOGY

On Unix machines with MIPS RISC processors, such as the DEC RISC Ultrix and MIPS machines, Oracle allocates an extra semaphore at startup time for latching purposes. Since MIPS RISC chips don't have test & set built into the hardware, this must be simulated in software with a semaphore.

Oracle uses this semaphore as a latch whenever anyone makes a connection to the database. But to make sure that no processes retain a latch even if they are accidentally killed, a semaphore undo structure is grabbed as well, so that any changes the process makes to the semaphore can be undone in case of unexpected process death. So each process that connects needs to be able to allocate an undo structure. If there aren't enough undo structures in the system, the following error will result:

ORA-9702: sem_acquire: cannot acquire latch semaphore

The solution is to either wait until undo structures become available, or reconfigure the system parameter controlling the maximum number of undo structures available (SEMMNU).

USING THE FAST DRIVER

Every connection made to the database with the fast driver requires semaphores to control who can read/write to the shared memory buffer that is used for communication between the user and Oracle shadow process. A semaphore set containing 3 semaphores is allocated for each connection made with the fast driver. If the system's semaphore resources (sets or total semaphores) are all currently allocated, then the connection will fail with

ORA-2721: osnseminit: cannot create semaphore set

and you will need to wait until semaphores are freed up to make a connection with the fast driver.

One word of caution: unexpected termination of processes using the fast driver will leave the semaphore set (and also a shared memory segment) allocated. Since it's hard to tell which semaphore set belongs to which process, you may have to wait until everyone logs off the database (and perhaps for the database to be shut down) before you can tell which semaphore set you need to free.
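Because each fast driver connection takes one semaphore set holding 3 semaphores, the number of further connections a system can accept is bounded by whichever resource runs out first. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the free set and free semaphore counts are assumed to have been worked out from the kernel limits and the "ipcs" listing.

```python
SEMAPHORES_PER_FAST_CONNECTION = 3  # one set of 3 semaphores per connection


def max_fast_driver_connections(free_sets, free_semaphores):
    """Upper bound on new fast driver connections given the free resources."""
    by_semaphores = free_semaphores // SEMAPHORES_PER_FAST_CONNECTION
    return max(0, min(free_sets, by_semaphores))
```

With 10 free sets but only 12 free semaphores, at most 4 more fast driver connections can be made; with 100 free semaphores but only 2 free sets, the limit is 2.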

PLANNING YOUR SYSTEM'S SEMAPHORE RESOURCES

Different Unix machines have different names for the semaphore kernel parameters that you need to adjust, and different methods to adjust them and rebuild your kernel. You'll need to consult your System Administrator's Guide to determine exactly what parameters to modify and how to modify them.

You can determine the total number of semaphores you'll need to have available with the following equation:

     proc  (value of the "processes" parameter in your
            initSID.ora file, controlling the total number of
            connections allowed to the database)

   + test  (test = 1 if your system uses a MIPS RISC processor,
                 = 0 otherwise)

   + 3*f   (f = # of fast driver connections that will be made)

Received on Fri Dec 01 2000 - 22:14:46 CST
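The planning equation proc + test + 3*f can be evaluated with a short function. This is a direct transcription of the three terms as given in the bulletin; the function and argument names are hypothetical.

```python
def total_semaphores_needed(processes, fast_driver_connections=0, mips_risc=False):
    """Evaluate proc + test + 3*f from the planning equation.

    processes: value of the "processes" parameter in initSID.ora (proc)
    fast_driver_connections: expected number of fast driver connections (f)
    mips_risc: True if the system uses a MIPS RISC processor (test = 1)
    """
    test = 1 if mips_risc else 0
    return processes + test + 3 * fast_driver_connections
```

For example, with "processes" set to 100 and 10 fast driver connections, a non-MIPS system needs 100 + 0 + 30 = 130 semaphores, and a MIPS RISC system needs 131.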