Home -> Community -> Usenet -> c.d.o.server -> Re: Shared Memory Per Process
"Bonjour" from Paris,
When I worked for Oracle support, I wrote a document about Oracle process memory consumption:
0 - Abstract :
There are several ways to examine individual processes in Solaris, and there is actually more information available than you may realize. This document looks at the commands provided in Solaris 2.5.1 and 2.6 for probing process memory utilization.
The examples below come from two hosts (Host1 and Host2):
1- Solaris memory system :
The first thing to observe on a system is where its memory has been allocated. Broadly, we are interested in how physical memory is divided between the kernel, applications, the buffer cache, and free memory.
The amount of total physical memory can be ascertained by looking at the output of the « prtconf » command.
Host2% prtconf
System Configuration: Sun Microsystems sun4u
Memory size: 512 Megabytes
System Peripherals (Software Nodes):
...
The buffer cache uses available free memory to buffer files on the filesystem. On most systems, the amount of free memory is almost zero as a direct result of this. To look at the amount of file buffer cache, you will need to use the MemTool package (See chapter 3-4). The MemTool « prtmem » command can be used to dump the contents of the buffer cache.
Host1% ./prtmem
Total memory:       123 Megabytes
Kernel Memory:       15 Megabytes
Application memory:  68 Megabytes
Buffercache memory:  37 Megabytes
Free memory:          2 Megabytes

Host2% ./prtmem
Total memory:       469 Megabytes
Kernel Memory:      -20 Megabytes
Application memory: 180 Megabytes
Buffercache memory: 227 Megabytes
Free memory:          7 Megabytes
The amount of kernel memory can be found by using the Solaris « sar » command and summing all the alloc columns. The output is in bytes.
Host1% sar -k 1 1
SunOS frsolmp 5.5.1 Generic_103640-21 sun4u 03/26/99
15:27:58 sml_mem  alloc    fail lg_mem   alloc    fail ovsz_alloc fail
15:27:59 5169152  4113720  0    6209536  5480208  0    3334144    0
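Summing the alloc columns of the sample « sar » line above can be sketched as follows (an illustrative Python snippet, not part of the original toolset):

```python
# Illustrative: estimate kernel memory by summing the alloc columns
# (in bytes) of the sar -k data line shown above.
sar_line = "15:27:59 5169152 4113720 0 6209536 5480208 0 3334144 0"
f = sar_line.split()
# Columns: time, sml_mem, alloc, fail, lg_mem, alloc, fail, ovsz_alloc, fail
kernel_bytes = int(f[2]) + int(f[5]) + int(f[7])
print(kernel_bytes, "bytes =", kernel_bytes // (1024 * 1024), "MB")
```

On this sample that gives roughly 12 MB of kernel memory.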
Free memory is almost always zero, because the buffer cache grows to consume free memory. Free memory can be measured with the « vmstat » command. The first line of output from « vmstat » is an average since boot, so the real memory figure is available on the 2nd line. The output is in kilobytes.
Host2% vmstat 3
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
0 0 0 13168 304 1 13 10 21 28 0 2 1 2 3 0 264 1934 1652 11 3 86
0 0 0 1413120 7536 3 0 0 26 26 0 0 0 1 1 0 239 6340 384 1 1 98
0 0 0 1413120 7536 3 0 0 24 24 0 0 0 1 1 0 263 6493 505 5 1 94
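Reading the free figure from the second line can be sketched like this (illustrative Python over the sample output above, with trailing columns omitted):

```python
# Illustrative: the first vmstat line is an average since boot, so the
# usable free-memory figure (in KB) is the "free" column of line two.
lines = [
    "0 0 0 13168 304",      # since boot -- ignore
    "0 0 0 1413120 7536",   # current sample
]
# Columns: r b w swap free (remaining columns omitted here)
free_kb = int(lines[1].split()[4])
print(free_kb)  # free memory in KB, about 7.4 MB on Host2
```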
2- How is a user process defined in memory ?
In simplified terms, a process' address space is made up of both kernel & user space.
The user space is then broken down into:
(1) program instructions
(2) initialized data
(3) uninitialized data
(4) stack
(5) the unused space between the stack and the uninitialized data, which shrinks as the stack grows and/or uninitialized data becomes initialized.
From the kernel point of view, it gets a little more complex. The address space is broken into pages (4 or 8k, depending on architecture) at the lowest level. Pages are then grouped into segments. A segment is a contiguous number of pages that share an "identity".
What is an "identity"? There are different kinds of segments, the most common of which is a "segvn", a segment tied to a vnode (i.e. a file). Segvns are tied to a specific vnode and an offset into that vnode. Therefore, a contiguous span of pages that all belong to one vnode, and are contiguous in that vnode, constitutes a single segment. Segvns also allow individual pages of the segment to be remapped to "anonymous" memory (i.e. swap space).
Pages have two other properties to note: read-only (RO) vs. read-write (RW) (which is self-explanatory) and shared vs. private. For a shared page, modifications made to the page are written back to the vnode the page came from, thus modifying the file. For a private page, the first modification to the page causes the page to be remapped to an anonymous page (swap), then the write happens. This preserves the original contents of the file. This process is commonly called "copy-on-write".
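The copy-on-write behaviour described above can be demonstrated with a small portable sketch: Python's « mmap » module with ACCESS_COPY gives a private mapping analogous to the private pages here (this is an illustration, not a Solaris-specific tool):

```python
import mmap
import os
import tempfile

# A private (copy-on-write) mapping: the first write remaps the page to
# an anonymous copy, so the underlying file is never modified.
fd, path = tempfile.mkstemp()
os.write(fd, b"original")
m = mmap.mmap(fd, 8, access=mmap.ACCESS_COPY)  # private mapping
m[:8] = b"modified"               # triggers the copy-on-write
mapping_view = bytes(m[:8])       # the mapping sees the new contents
with open(path, "rb") as fh:
    file_contents = fh.read()     # the file still holds the old contents
m.close()
os.close(fd)
os.remove(path)
print(mapping_view, file_contents)
```

The mapping reads back "modified" while the file keeps "original", which is exactly the private-page behaviour described above.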
An ELF executable can define any number of segments. Furthermore, it directly supports the runtime linker (ld.so), so that by the time the executable is actually started, the ld.so and its own data segments are mapped in too.
Also, while any executable is running, it will likely map in files (like shared object libraries (libc.so)). These mappings get placed just below the maximum size stack the process could have (based on the stacksize rlimit), thus filling in the "hole" between the stack and the top of uninitialized data.
Shared memory segments are their own segments, created using either mmap() or shmat(). As such, they usually get created at the top of the gap called (5).
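A shared segment, by contrast, is visible through every attachment. A minimal sketch using Python's « multiprocessing.shared_memory » as a portable stand-in for shmget()/shmat() (the segment name is generated by the module):

```python
from multiprocessing import shared_memory

# Create a shared segment, then attach to it a second time by name,
# as shmat() would: both attachments see the same pages.
seg = shared_memory.SharedMemory(create=True, size=4096)
other = shared_memory.SharedMemory(name=seg.name)  # second attachment
seg.buf[0] = 42
value = other.buf[0]   # the write is visible through the other attachment
print(value)
other.close()
seg.close()
seg.unlink()
```

Unlike the copy-on-write case, a write through one attachment is immediately visible through the other.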
3 - What’s the best way to probe processes ?
Most people know about the « ps » command. It gives a high-level summary of the status of processes on your system. It has many options, but it is a relatively crude way to figure out what your processes are doing.

3-1 System V ps command :
The « ps » command prints information about active processes. The « -l » option generates a long listing whose SZ column gives the total size of the process in virtual memory, in pages, including all mapped files and devices (i.e. pages of swap space, combined RAM and disk swap).
Host1% ps -efl
F S UID    PID   PPID C PRI NI ADDR     SZ   WCHAN    STIME    TTY TIME CMD
8 S ora805 21795 1    0 41  20 603c4cc0 4327 6067043c 17:34:00 ?   0:00 ora_smon_V805
8 S ora805 21787 1    0 40  20 60c1ecc8 4586 606703fc 17:34:00 ?   0:00 ora_pmon_V805
8 S ora805 21789 1    0 40  20 60d1b990 4342 6067040c 17:34:00 ?   0:00 ora_dbw0_V805
8 S ora805 21793 1    0 41  20 6092a010 4331 6067042c 17:34:00 ?   0:18 ora_ckpt_V805
8 S ora805 21797 1    0 41  20 60c1f328 4315 6067044c 17:34:00 ?   0:00 ora_reco_V805
8 S ora805 21791 1    0 40  20 60d1a670 4336 6067041c 17:34:00 ?   0:00 ora_lgwr_V805
%pagesize
8192
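Converting the SZ column to kilobytes is then a simple multiplication; for example, for ora_smon_V805 above (an illustrative calculation):

```python
# SZ from ps -l is in pages; multiply by the 8192-byte pagesize shown
# above to get bytes, then divide by 1024 for kilobytes.
pagesize = 8192          # bytes, from the pagesize command
sz_pages = 4327          # SZ column for ora_smon_V805
sz_kb = sz_pages * pagesize // 1024
print(sz_kb)             # total virtual size in KB
```

This gives 34616 KB, which agrees with the SIZE that « /usr/ucb/ps aux » reports for the same process in the next section.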
3-2 BSD ps command :
Of all the available options, the best performance related summary comes from the BSD version « /usr/ucb/ps aux », which collects all the process data in one go, sorts the output by CPU usage, then displays the result. The unsorted versions of « ps » loop through the processes, printing as they go. A quick look at the most active processes can be obtained easily using this command:
%/usr/ucb/ps aux
root   22759 0.1 0.6 824   736  pts/12 S 11:52:23 0:00 grep V805
ora805 21793 0.0 7.0 34648 8776 ?      S 17:34:00 0:18 ora_ckpt_V805
ora805 21787 0.0 7.0 36688 8736 ?      S 17:34:00 0:00 ora_pmon_V805
ora805 21789 0.0 7.0 34736 8728 ?      S 17:34:00 0:00 ora_dbw0_V805
ora805 21791 0.0 7.0 34688 8728 ?      S 17:34:00 0:00 ora_lgwr_V805
ora805 21795 0.0 7.4 34616 9304 ?      S 17:34:00 0:00 ora_smon_V805
ora805 21797 0.0 7.4 34520 9248 ?      S 17:34:00 0:00 ora_reco_V805
The columns relative to memory are :
SIZE The total size of the process in virtual memory, including all mapped files and devices, in kilobyte units. This column counts kilobytes of swap space (combined RAM and disk swap). Unused space (5) is not counted towards the size; everything else is, including things mapped into the area originally called (5), like libraries and shared memory segments.
SZ Same as SIZE.
RSS Real memory (resident set) size of the process, in kilobyte units (i.e. the amount of RAM, in kilobytes). RSS in the « /usr/ucb/ps » output is calculated by looking at each page of the process's address space; if that page has a "hardware translation" to memory, it is counted.
This is a fancy way of saying that any page of the process currently in RAM is counted. It is only an estimate, however, because some situations can both under- and over-count pages, and things can also change faster than the snapshot is taken.
%MEM This column prints the percentage of real memory (RAM) used by the process.
3-3 pmap command :
Using the « pmap(1) » command, which dumps a process's address space map, we can see what mapping exists for running processes. The column on the far left is the virtual address of the mapping, followed by the size in bytes, the permissions, and finally the object name.
If you have Solaris 2.6, then you can use the « -x » option of the « pmap » command (similar to « /opt/RMCmem/bin/pmem »). See 3-4 chapter.
3-4 pmem command :
This command is part of the MemTool utility. The latest version of MemTool can be obtained by sending a request to memtool-request_at_chessie.eng.sun.com. These tools are provided free, but are not covered by normal Sun support. In Solaris 2.6, « pmap -x » is similar to « pmem ». « pmem » is a utility for displaying the address space of a process.
Process address space is broken into several sections, some of which are shared and some of which are not.
A typical address space has an executable, a heap space, some shared libraries and its stack:
Addr Size Res Shared Priv Segment-Name
00010000 176K 88k  88k  0k  text /usr/bin/ksh
0004A000 8K   4k   4k   0k  data /usr/bin/ksh
0004C000 32K  16k  0k   16k
0004E000 24K                heap
EF580000 504K 224k 224k 0k  text /usr/lib/libc.so.1
EF60C000 40K  20k  4k   16k data /usr/lib/libc.so.1
EF7B0000 8K   4k   4k   0k  text /usr/lib/libdl.so.1
EF7D0000 88K  44k  44k  0k  text /usr/lib/ld.so.1
EF7F4000 8K   4k   4k   0k  data /usr/lib/ld.so.1
EFFFC000 16K                stack
----------------------------------------------------------------------------------------------
1528K 716k 648k 68k
Program text is the executable component of a binary, which is usually mapped into a process as read only.
Each binary also has a data section, which is mapped just above the text portion. The data section contains all of the initialized data of an application, such as i = 10 or char *str = "hello world";.
Because the data section can initially be shared with other processes to save memory, the segment starts out mapped shared; any write to it causes a COW (copy-on-write), which creates an anonymous private page for each page written to. This means that the data section of an application typically has a shared portion and a non-shared portion.
There are four basic measures for the amount of memory in each segment :
Size : The amount of virtual memory space assigned to this segment.
Res : The amount of a segment which is currently in memory, including the portions which are shared between other processes.
Shared : The amount of this segment which is shared with another process.
Priv : The amount of memory this segment has in memory which is not shared with another process.
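Given these four measures, the incremental cost of a single process is essentially its Priv total. Summing it can be sketched as follows (illustrative Python; the tuples transcribe a few rows of the ksh listing above, with approximate labels):

```python
# (name, Size, Res, Shared, Priv) in KB, transcribed from the ksh example
segments = [
    ("text /usr/bin/ksh",        176,  88,  88,  0),
    ("data /usr/bin/ksh",          8,   4,   4,  0),
    ("anon (heap area)",          32,  16,   0, 16),
    ("text /usr/lib/libc.so.1",  504, 224, 224,  0),
    ("data /usr/lib/libc.so.1",   40,  20,   4, 16),
]
# Priv is the only memory released if this single process exits.
priv_kb = sum(s[4] for s in segments)
print(priv_kb)
```

Only 32 KB of these segments is private: the bulk of a ksh process is text shared with every other ksh and libc user.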
4 - Oracle processes memory consumption :
Test 1 : Oracle 7.3.4.4, SGA = 21.3 MB, Solaris 2.5.1, pagesize = 8k
NAME    ps aux (k)      ps -efl
        SZ      RSS     SZ * pagesize (k)
-----   -----   -----   ---------------------
pmon    42976   24184   5372 * 8k = 42976
lgwr    43136   24224   5392 * 8k = 43136
smon    43000   25320   5375 * 8k = 43000
Test 2 :

Name    ps aux (k)      ps -efl
        SZ      RSS     SZ * pagesize (k)
-----   -----   -----   ------------------------
pmon    26592   9080    3324 * 8k = 26592
Test 3 : Oracle 7.3.4.4, SGA = 21.3 MB, Solaris 2.6, pagesize = 8k
Name pmap -x (k) ps aux (k) ps -efl
----- ----------------------------- ------------- --------------------
Size Resident Shared Private SZ RSS SZ * pagesize (k)
Summary :
The size reported for the background processes depends on which Unix memory report you use.
The Oracle SGA is resident in shared memory and is shared by the Oracle background processes. This memory is often reported as private memory, so you need to subtract it from the commands' results.
On many Unix platforms, and especially on Sun platforms, the text of the Oracle binary and the shared libraries is actually shared between background processes if the instances share the same ORACLE_HOME. So you also need to subtract the shared text of the oracle binary and the shared libraries from the results of the OS commands.
Even the pmap and pmem utilities can confuse these memory divisions, and the SGA and the executable text are sometimes added incorrectly.
However, the following columns from the commands are equivalent :
SZ (ps aux) = SZ (ps -efl) * pagesize = Size (pmap -x)
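This equivalence can be checked against the Test 1 figures (pagesize = 8 KB), e.g. for pmon:

```python
# SZ from ps -efl is in 8 KB pages; SZ/SIZE from ps aux is in KB.
pagesize_kb = 8
ps_efl_sz_pages = 5372    # pmon, Test 1
ps_aux_sz_kb = 42976      # pmon, Test 1
assert ps_efl_sz_pages * pagesize_kb == ps_aux_sz_kb
print("equivalent:", ps_efl_sz_pages * pagesize_kb, "KB")
```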
"Au revoir" from Paris
Hieraklion
Mick Rice wrote:
> Oracle 734 on Solaris 2.6
>
> I am a bit confused by what appears to be the large
> amount of shared memory used by oracle connections on
> the solaris box I'm working on. It looks like ordinary
> database connections are taking upwards of 400mb each.
> The server has 4GB virtual memory yet because of the
> seeming over use of shared memory by ordinary connections
> there's very little left spare on the box. Any insights
> would be appreciated. I've attached some details of memory
> use on the box but to be honest they don't mean a lot to me.
> I don't have access to a unix sysadmin at the moment, and
> perhaps that's where the explanation lies. The user concerned
> has been assigned the default Oracle profile.
>
> Mick Rice
>
> uname -a
>
> SunOS chan1 5.6 Generic_105181-23 sun4u sparc SUNW,Ultra-4
>
> load averages: 0.64, 0.86, 1.03 16:08:12
> 242 processes: 237 sleeping, 1 running, 4 on cpu
> CPU states: 79.3% idle, 13.0% user, 5.6% kernel, 2.2% iowait, 0.0% swap
> Memory: 4096M real, 263M free, 601M swap in use, 1447M swap free
>
> PID   USERNAME THR PRI NICE SIZE  RES   STATE TIME   CPU   COMMAND
> 17555 clarify  4   0   0    13M   6992K run   0:01   2.25% cbbatch
> 11946 ora734   1   38  0    404M  400M  sleep 1:06   1.90% oracle
> 6537  ora734   1   58  0    401M  397M  sleep 2:59   1.85% oracle
> 6932  ora734   1   0   0    2048K 1600K cpu0  1:21   1.03% top
> 643   root     1   25  10   18M   11M   cpu2  822:44 0.51% PatrolAgent
> 7675  ora734   1   28  0    399M  394M  sleep 3:18   0.40% oracle
> 1743  patroldu 1   3   10   4384K 1520K sleep 265:18 0.32% bgscollect
> 21584 root     1   48  0    23M   17M   sleep 164:03 0.20% esd
> 16875 ora734   1   58  0    398M  394M  sleep 0:03   0.15% oracle
> 18679 ora734   1   58  0    401M  397M  sleep 1:21   0.14% oracle
> 11984 clarify  5   48  0    10M   3952K sleep 8:01   0.14% rulemgr
> 19524 ora734   1   58  0    404M  399M  sleep 2:36   0.12% oracle
> 23760 ora734   1   48  0    403M  398M  sleep 1:03   0.09% oracle
> 18172 ora734   1   38  0    398M  394M  sleep 0:06   0.09% oracle
> 15062 root     193 100 -20  3736K 3088K sleep 146:37 0.09% rpc.pmfd
>
> sysdef |grep -i sem
> sys/semsys
> * IPC Semaphores
> 2100 entries in semaphore map (SEMMAP)
> 2100 semaphore identifiers (SEMMNI)
> 2200 semaphores in system (SEMMNS)
> 2200 undo structures in system (SEMMNU)
> 2000 max semaphores per id (SEMMSL)
> 10 max operations per semop call (SEMOPM)
> 1500 max undo entries per process (SEMUME)
> 32767 semaphore maximum value (SEMVMX)
> 16384 adjust on exit max value (SEMAEM)
>
> sysdef |grep -i shm
> sys/shmsys
> 471859200 max shared memory segment size (SHMMAX)
> 1 min shared memory segment size (SHMMIN)
> 1700 shared memory identifiers (SHMMNI)
> 1550 max attached shm segments per process (SHMSEG)
>
> _______________________________________________
> Submitted via WebNewsReader of http://www.interbulletin.com

Received on Wed Apr 25 2001 - 03:38:24 CDT