Sizing up memory in Solaris
How can you tell how much memory you need? Introducing Sun's memory guru, who has written a white paper and some very powerful tools to help solve this problem
How much memory do I need? This is a question we keep hearing over and over, and memory sizing has been the subject of several Performance Q&A columns. This month, we take another stab at explaining memory in Solaris by reviewing the different types of memory and introducing a white paper and a set of tools, the RMCmem package, by Sun Systems Engineer Richard McDougall. (2,200 words)
Q: How can I tell how much memory I need?
A: I've tried to answer this question a few times, but the real problem is that Solaris doesn't give you enough information on what memory is being used for. A few years ago a Sun systems engineer based in Adelaide, Australia, started looking for solutions to this problem. He came to work with me for a month in November 1995 and developed some new tools. In April 1997 we hired him to work full-time in our group, and now Richard McDougall is ready to share his knowledge with the world via a new white paper, The Solaris Memory System -- Sizing Tools and Architecture (see Resources section), and a new tool set, the RMCmem package.
The RMCmem tools
The white paper contains details on how to obtain the tools. The package includes a kernel module that provides extra instrumentation, and is not a supported Sun product. As with any unsupported package, you should not depend upon it in a production environment. Bugs, patches, or interactions with other kernel modules could crash your system, and there is some additional data-collection overhead. Part of the package was integrated into Solaris 2.6, and we're using the tools as a proof-of-concept prototype to guide future enhancements to Solaris. An earlier collection of these tools was known as the RMCbunyip package -- the Bunyip is a legendary monster in Australia.
The four main uses of memory
Solaris is a virtual memory system. The total amount of memory that you can use is increased by adding swap space to the system. If you ever see "out of memory" messages, adding swap space is the usual fix. Performance of the system is very dependent on how much physical memory (RAM) you have. If you don't have enough RAM to run your workload, performance degrades rapidly. In this discussion I'm mainly concerned with RAM usage, and knowing how much is enough.
Physical memory usage can be classified into four groups: kernel memory, application process memory, filesystem cache memory, and free memory.
RMCmem includes a simple command to summarize this:
% prtmem
Total memory:          123 Megabytes
Kernel Memory:          12 Megabytes
Application memory:     16 Megabytes
Buffercache memory:     34 Megabytes
Free memory:            59 Megabytes
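To compare the categories at a glance, the prtmem summary above can be turned into percentages of total memory. The following is a minimal sketch in Python, not part of RMCmem: the function names are made up for illustration, and the parsing assumes the "Label: N Megabytes" layout of the sample. Note that the four categories in the sample add up to 121 of the 123 megabytes; the article does not say where the difference goes.

```python
import re

# Sample text copied from the prtmem output shown above.
SAMPLE = """\
Total memory:          123 Megabytes
Kernel Memory:          12 Megabytes
Application memory:     16 Megabytes
Buffercache memory:     34 Megabytes
Free memory:            59 Megabytes
"""


def parse_prtmem(text):
    """Return {label: megabytes} for each 'Label: N Megabytes' line."""
    sizes = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s+(\d+)\s+Megabytes", line)
        if match:
            sizes[match.group(1)] = int(match.group(2))
    return sizes


def shares(sizes):
    """Return each category as a percentage of total memory."""
    total = sizes["Total memory"]
    return {name: round(100.0 * mb / total, 1)
            for name, mb in sizes.items() if name != "Total memory"}


if __name__ == "__main__":
    for name, pct in shares(parse_prtmem(SAMPLE)).items():
        print(f"{name:20s} {pct:5.1f}%")
```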
Total physical memory
The total physical memory can be seen using prtconf. Memory is allocated in units called pages, and you can use the pagesize command to see whether you have 4096 or 8192 bytes per page:
% /usr/sbin/prtconf | grep Memory
Memory size: 128 Megabytes
% /usr/bin/pagesize
8192
From an application you can call sysconf(_SC_PHYS_PAGES) to find out how many pages of memory you have, and getpagesize() to see the size of each page. In some cases the result you get is less than the total, because during boot the first few megabytes are grabbed by the kernel and only what remains is made available to the system as pages (the kernel keeps its own count of these pages).
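The same two values can be read from a script. This is a small sketch that uses Python's os.sysconf, which wraps the sysconf call mentioned above; it assumes the platform exposes the SC_PHYS_PAGES and SC_PAGE_SIZE names, which is not guaranteed everywhere. The function name is illustrative only.

```python
import os


def physical_memory():
    """Return (pages, page_size, total_bytes) as reported by sysconf."""
    pages = os.sysconf("SC_PHYS_PAGES")     # number of physical pages
    page_size = os.sysconf("SC_PAGE_SIZE")  # bytes per page
    return pages, page_size, pages * page_size


if __name__ == "__main__":
    pages, page_size, total = physical_memory()
    print(f"{pages} pages of {page_size} bytes = "
          f"{total / (1024 * 1024):.0f} megabytes")
```

As the text notes, the total computed this way can come out a little lower than the figure that prtconf reports.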
Kernel memory
Kernel memory is allocated to hold the initial kernel code at boot time, then grows dynamically as new device drivers and kernel modules are used. Kernel tables also grow dynamically, unlike in some older versions of Unix. As you add hardware and processes to a system, the kernel will grow. In particular, to keep track of all the memory in a system, the kernel allocates a page table structure. If you have several gigabytes of RAM this table gets quite large. The dynamic kernel memory allocator grabs memory in large "slabs," then allocates smaller blocks more efficiently. This means that the kernel tends to grab a bit more memory than it's really using. If there is a severe memory shortage, the kernel unloads unused kernel modules and devices and frees unused slabs. The simplest summary of kernel memory usage comes from sar. The old memory allocator architecture is still reflected in sar's output format, which lists a pool of small allocations (under 512 bytes each), large allocations (512 bytes to a page), and oversize allocations (a page or more). The new allocator is more sophisticated and is best seen by running the crash command as root and using its option that lists the individual memory pools. A sample of the sar summary:
% sar -k 1
...
12:44:23 sml_mem   alloc  fail  lg_mem   alloc  fail  ovsz_alloc  fail
12:44:24 2842624 2003764     0 6561792 5503480     0     2678784     0
Totaling up the small, large, and oversize allocations, the kernel has grabbed 12,083,200 bytes of memory and is actually using 10,186,028 at present.
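The arithmetic behind those two totals can be reproduced with a few lines of Python. This is only a sketch: it assumes the column order of the sample sar -k line above, and the function name is invented for the example.

```python
# One data line from `sar -k`, copied from the sample output above.
# Columns: time sml_mem alloc fail lg_mem alloc fail ovsz_alloc fail
SAMPLE = "12:44:24 2842624 2003764 0 6561792 5503480 0 2678784 0"


def kernel_memory_totals(line):
    """Return (bytes grabbed, bytes in use) from one sar -k data line."""
    fields = [int(value) for value in line.split()[1:]]  # drop the timestamp
    sml_mem, sml_alloc, _, lg_mem, lg_alloc, _, ovsz_alloc, _ = fields
    # sar reports a single figure for oversize allocations, so that
    # figure counts toward both totals.
    grabbed = sml_mem + lg_mem + ovsz_alloc
    in_use = sml_alloc + lg_alloc + ovsz_alloc
    return grabbed, in_use


if __name__ == "__main__":
    grabbed, in_use = kernel_memory_totals(SAMPLE)
    print(f"grabbed {grabbed:,} bytes, using {in_use:,} bytes")
```

The gap between the two figures (about 1.9 million bytes here) is memory sitting in the small and large pools that has not been handed out yet.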
Application process memory
Application processes consist of an address space divided into segments, where each segment maps either to a file, anonymous memory (the swap space), System V shared memory, or a memory-mapped device. The mapped files include the code and initialized data for the command and all its shared libraries. Files that the process has accessed via the mmap call may also be included, but not files that have been read and written in the normal manner. The segments can be viewed using /usr/proc/bin/pmap on any system running Solaris 2.5 or later:
% /usr/proc/bin/pmap 3687
3687:   cat misc/vmstat.run64
00010000     8K read/exec         dev: 32,0 ino: 108245
00020000     8K read/write/exec   dev: 32,0 ino: 108245
00022000     8K read/write/exec
EF680000   512K read/exec         /usr/lib/libc.so.1
EF70E000    32K read/write/exec   /usr/lib/libc.so.1
EF716000     8K read/write/exec
EF730000    48K read/shared       dev: 32,8 ino: 226
EF740000    16K read/exec         /usr/platform/SUNW,Ultra-2/lib/libc_psr.so.1
EF752000     8K read/write/exec   /usr/platform/SUNW,Ultra-2/lib/libc_psr.so.1
EF770000    32K read/exec         /usr/lib/libw.so.1
EF786000     8K read/write/exec   /usr/lib/libw.so.1
EF790000    16K read/exec         /usr/lib/libintl.so.1
EF7A2000     8K read/write/exec   /usr/lib/libintl.so.1
EF7B0000     8K read/exec/shared  /usr/lib/libdl.so.1
EF7C0000     8K read/write/exec
EF7D0000   104K read/exec         /usr/lib/ld.so.1
EF7F8000    16K read/write/exec   /usr/lib/ld.so.1
EFFFC000    16K read/write/exec   EFFFC000 16K [ stack ]
What is shown here is the base address and size of each segment, together with its mapping. Inode 108245 is the cat command itself. The cat command uses mmap for its input file, and inode 226 matches the file being read.
What we really want to know, however, is the amount of RAM used by each segment. On Solaris 2.5.1 this is shown by the pmem command in the RMCmem package. The new kernel measurement used by this command was added to Solaris 2.6, where it can be seen without the add-on kernel module. Here is pmem run against the same process:
% pmem 3687
3687:   cat misc/vmstat.dread.384run64
Addr       Size    Res Shared   Priv Prot              Segment-Name
-------- ------ ------ ------ ------ ----------------- -----------------------------
00010000     8K     8k     8k     0k read/exec         dev: 32,0 ino: 108245
00020000     8K     8k     8k     0k read/write/exec   dev: 32,0 ino: 108245
00022000     8K     8k     0k     8k read/write/exec
EF680000   512K   504k   480k    24k read/exec         /usr/lib/libc.so.1
EF70E000    32K    32k     8k    24k read/write/exec   /usr/lib/libc.so.1
EF716000     8K     8k     0k     8k read/write/exec
EF730000    48K    40k     8k    32k read/shared       dev: 32,8 ino: 226
EF740000    16K    16k    16k     0k read/exec         /usr/platform/SUNW,Ultra-2/lib/libc_psr.so.1
EF752000     8K     8k     8k     0k read/write/exec   /usr/platform/SUNW,Ultra-2/lib/libc_psr.so.1
EF770000    32K    32k    32k     0k read/exec         /usr/lib/libw.so.1
EF786000     8K     8k     8k     0k read/write/exec   /usr/lib/libw.so.1
EF790000    16K    16k    16k     0k read/exec         /usr/lib/libintl.so.1
EF7A2000     8K     8k     8k     0k read/write/exec   /usr/lib/libintl.so.1
EF7B0000     8K     8k     8k     0k read/exec/shared  /usr/lib/libdl.so.1
EF7C0000     8K     8k     0k     8k read/write/exec
EF7D0000   104K   104k   104k     0k read/exec         /usr/lib/ld.so.1
EF7F8000    16K    16k     8k     8k read/write/exec   /usr/lib/ld.so.1
EFFFC000    16K    16k     0k    16k read/write/exec   EFFFC000 16K [ stack ]
-------- ------ ------ ------ ------
           864K   848k   720k   128k
Now we can see that the process address space size is 864 kilobytes; 848 kilobytes of that are currently resident in main memory, of which 720 kilobytes are shared with other processes and 128 kilobytes are private. When this command started, only the 128 kilobytes of private memory were taken from the free list.
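The totals quoted above can be checked by summing the four numeric columns of the pmem listing. The sketch below does this in Python; the rows are the address and size columns copied from the sample output (protection and segment names are left out for brevity), and the function name is made up for the example.

```python
# Addr, Size, Res, Shared, Priv columns copied from the pmem sample above.
PMEM_ROWS = """\
00010000 8K 8k 8k 0k
00020000 8K 8k 8k 0k
00022000 8K 8k 0k 8k
EF680000 512K 504k 480k 24k
EF70E000 32K 32k 8k 24k
EF716000 8K 8k 0k 8k
EF730000 48K 40k 8k 32k
EF740000 16K 16k 16k 0k
EF752000 8K 8k 8k 0k
EF770000 32K 32k 32k 0k
EF786000 8K 8k 8k 0k
EF790000 16K 16k 16k 0k
EF7A2000 8K 8k 8k 0k
EF7B0000 8K 8k 8k 0k
EF7C0000 8K 8k 0k 8k
EF7D0000 104K 104k 104k 0k
EF7F8000 16K 16k 8k 8k
EFFFC000 16K 16k 0k 16k
"""


def column_totals(text):
    """Return (size, resident, shared, private) totals in kilobytes."""
    totals = [0, 0, 0, 0]
    for line in text.splitlines():
        fields = line.split()
        for index, value in enumerate(fields[1:5]):
            totals[index] += int(value.rstrip("Kk"))  # strip the unit suffix
    return tuple(totals)


if __name__ == "__main__":
    size, resident, shared, private = column_totals(PMEM_ROWS)
    print(f"size {size}K  resident {resident}k  "
          f"shared {shared}k  private {private}k")
```

In this sample the resident figure for every segment is the sum of its shared and private figures, so the resident total equals the shared total plus the private total.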
If we now go through all the processes on the system, add up how much private memory they use, and also add in the shared memory for each mapped file, we'll know how much application memory is in use. This summary is shown by prtmem, as we saw at the beginning, and the detail is listed by the memps command in RMCmem. Note that the X server maps over 100 megabytes of Creator3D framebuffer hardware address space in this example, and that processes that run only in the kernel address space do not have any address space of their own.
# memps
SunOS crun 5.5.1 Generic_103640-14 sun4u    02/21/98

14:12:03
   PID      Size  Resident    Shared   Private Process
  3349    11712k    10240k     1408k     8832k /usr/dist/pkgs/framemaker,v5.1/bin/
  1479   126080k     9160k     1088k     8072k /usr/openwin/bin/X :0 -dev /dev/fbs
   355     2992k     2552k     1632k      920k rpc.ttdbserverd
  1513     5144k     3664k     2800k      864k cm -Wp 769 40 -Ws 485 440 -WP 205 3
  1511     3864k     3152k     2296k      856k calctool -Wp 60 60 -Ws 438 264 -WP
  1498     3552k     2728k     1984k      744k ttsession -s
...and so on...
  3732      880k      872k      760k      112k more
   180      832k      728k      648k       80k /usr/lib/power/powerd
  1443     1128k      928k      864k       64k -csh
  1525     1104k      928k      864k       64k /bin/csh
     3        0k        0k        0k        0k fsflush
     2        0k        0k        0k        0k pageout
     0        0k        0k        0k        0k sched
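The bookkeeping described above (private memory summed over every process, shared memory counted once per mapped file) can be sketched as follows. This is an illustration only, not how RMCmem works internally: the segment records are hypothetical, and taking the largest shared figure reported for a file is an assumption made here to avoid counting the same pages twice.

```python
from collections import defaultdict

# Hypothetical records: (pid, mapped file or None, shared KB, private KB).
SEGMENTS = [
    (100, "/usr/lib/libc.so.1", 480, 24),
    (100, None, 0, 8),                      # anonymous memory, no file
    (200, "/usr/lib/libc.so.1", 472, 32),
    (200, "/usr/bin/csh", 120, 0),
]


def application_memory_kb(segments):
    """Sum private memory per process plus shared memory once per file."""
    private_total = 0
    shared_by_file = defaultdict(int)
    for _pid, name, shared_kb, private_kb in segments:
        private_total += private_kb
        if name is not None:
            # Assumption: count a file's shared pages once, using the
            # largest shared figure that any process reports for it.
            shared_by_file[name] = max(shared_by_file[name], shared_kb)
    return private_total + sum(shared_by_file.values())


if __name__ == "__main__":
    print(f"application memory: {application_memory_kb(SEGMENTS)}k")
```

Simply adding the Resident column of memps for every process would overstate the total, because shared libraries such as libc would be counted once per process.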
Filesystem cache memory
This is the part of memory that is most confusing, as it is invisible. You can only tell it's there if you access the same file twice and it's quicker the second time. The RMCmem package adds kernel instrumentation that counts up all the pages for each cached file. The memps -m command lists the cached files in order of the amount of memory they're consuming. One problem is that within the kernel a file is known only by its inode number and filesystem mount point; the directory pathname for the file may not be known. The RMCmem package tries to solve this problem by catching file names as files are opened (by interposing on the vnode open code) and building an inode-to-name lookup cache in the kernel. The size of this cache is limited (to 8192 entries by default), and a file may have been opened before the kernel module was loaded, so the name can't always be found. In that case you see the mount point (in a form such as /800008) and the inode number within that mount point:
# memps -m | more
SunOS crun 5.5.1 Generic_103640-14 sun4u    02/21/98

14:46:58
    Size Filename
   5336k /export/home5/framemaker,v5.1/bin/sunxm.s5.sparc/maker5X.exe
   1256k /usr/openwin/server/lib/libserverdps.so.1
    960k /usr/openwin/lib/libxview.so.3
    608k /usr/openwin/lib/libtt.so.2
    528k /usr/lib/libc.so.1
    448k /usr/openwin/bin/Xsun
    344k /usr/openwin/server/modules/ddxSUNWffb.so.1
    336k /usr/openwin/lib/libX11.so.4
    312k /usr/lib/libnsl.so.1
    232k /usr/lib/sendmail
    216k /800000: 586782
    200k /export/home5/framemaker,v5.1/bin/sunxm.s5.sparc/fa.htmllite
    192k /800000: 586694
    184k /800000: 586702
    168k /800000: 586758
    160k /usr/bin/csh
    160k /usr/openwin/server/lib/libcfb.so.1
    144k /usr/openwin/server/lib/libmi.so.1
    136k /800008: 27
    136k /usr/dt/bin/rpc.cmsd
    128k /export/home5/framemaker,v5.1/bin/sunxm.s5.sparc/fa.tooltalk
    128k /opt/RMCmem/bin/memps551
    112k /usr/openwin/server/lib/libcfb32.so.1
    112k /usr/lib/ld.so.1
    112k /usr/openwin/platform/sun4u/server/lib/libmpg_psr.so.1
... and so on down to lots of files with only 8KB each ...
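The idea of a size-limited inode-to-name cache with a fallback display can be illustrated in user space. The following Python sketch is not the RMCmem kernel code: the class and method names are invented, and dropping the oldest entry when the cache is full is an assumption, since the article does not say how entries are replaced.

```python
from collections import OrderedDict


class InodeNameCache:
    """Bounded (mount, inode) -> pathname cache."""

    def __init__(self, max_entries=8192):
        self.max_entries = max_entries
        self._names = OrderedDict()

    def record_open(self, mount, inode, path):
        """Remember the pathname seen when a file is opened."""
        key = (mount, inode)
        self._names.pop(key, None)           # re-inserting moves key to the end
        self._names[key] = path
        if len(self._names) > self.max_entries:
            self._names.popitem(last=False)  # assumed policy: drop the oldest

    def display_name(self, mount, inode):
        """Return the pathname, or 'mount: inode' when it is not cached."""
        path = self._names.get((mount, inode))
        return path if path is not None else f"/{mount}: {inode}"


if __name__ == "__main__":
    cache = InodeNameCache()
    cache.record_open("800000", 1234, "/usr/lib/libc.so.1")
    print(cache.display_name("800000", 1234))
    print(cache.display_name("800008", 27))
```

A file that was opened before the cache existed, or whose entry has been pushed out, falls through to the mount-and-inode form seen in the listing above.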
To add up all the memory taken up by shared library files another command is provided by RMCmem:
# prtlibs
Library (.so) Memory: 6496 K-Bytes
Free memory
Free memory is maintained so that when a new process starts up or an existing process needs to grow, the additional memory is immediately available. Just after boot, or if large processes have just exited, free memory can be quite large. As the filesystem cache grows, free memory settles down to a level set by the kernel variable lotsfree. When free memory falls below lotsfree, all the other memory in the system is scanned to find pages that have not been referenced recently. These pages are moved to the free list.
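The behavior described above can be modeled with a deliberately simplified sketch. This toy Python function is not the Solaris page scanner: it assumes each in-use page carries a single "referenced recently" flag, and the function and parameter names are made up for illustration.

```python
def reclaim(free_pages, lotsfree, in_use):
    """Toy model of the scan triggered when free memory is low.

    in_use maps a page id to True if the page was referenced recently.
    Returns (new free page count, pages still in use).
    """
    if free_pages >= lotsfree:
        return free_pages, dict(in_use)  # enough free memory, no scan
    kept = {}
    for page, referenced in in_use.items():
        if referenced:
            kept[page] = referenced      # recently used: leave it alone
        else:
            free_pages += 1              # idle page moves to the free list
    return free_pages, kept


if __name__ == "__main__":
    pages = {1: True, 2: False, 3: False}
    print(reclaim(2, 4, pages))
```

The point of the model is that nothing is reclaimed while free memory stays at or above lotsfree, which is why free memory tends to settle near that level rather than grow.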
Graphical user interface tools
There is now a lot of new data available. To view it more easily and interactively, Richard wrote a Motif-based GUI tool. The memtool program comes up with a list of the files that are in memory and also shows the pageins and pageouts that have occurred for each file in the last measurement interval.
Example display of filesystem cache using memtool
There are three displays available. The second one shows process memory usage, and clicking on a process gives the detailed segment list as shown here for the X server.
Example display of process memory usage using memtool
The third display is a matrix that combines the two views. It shows the processes across the screen and the files down the side, with each cell holding the amount of memory mapped from that file by that process.
Example memtool process and file matrix display
So, finally, we can tell where all the memory is going. Richard's white paper contains a lot more information, including typical memory sizes for many common commands and daemons, and a full explanation of how the memory system works, and how to size memory configurations for desktop and server systems.
Resources
The white paper is at http://www.sun.com/sun-on-net/performance/vmsizing.pdf.
The tools can be obtained by sending e-mail to: email@example.com.
New book update
On another recurring subject, my new book is now in the final stages of the production process. It should become available sometime in April. It ended up being more than 600 pages long (the old book was about 250 pages). The title is Sun Performance and Tuning -- Java and the Internet, by Adrian Cockcroft and Richard Pettit, Sun Press/PTR Prentice Hall, ISBN 0-13-095249-4.
About the author
Adrian Cockcroft joined Sun Microsystems in 1988, and currently works as a performance specialist for the Server Division of SMCC. He wrote Sun Performance and Tuning: SPARC and Solaris, published by SunSoft Press PTR Prentice Hall. Reach Adrian at firstname.lastname@example.org.