Click on our Sponsors to help Support SunWorld
Performance Q & A by Adrian Cockcroft

What are the tunable kernel parameters for Solaris 2?

I wanna take my system to Joe's Garage for a smog test and tune-up

SunWorld
January  1996

Abstract
Tuning the kernel isn't easy. Some tunables are well known and easy to explain. Others are more complex, or change from one release to the next. Administrators often choose settings based on out-of-date folklore. This column identifies and explains some of the variables that are safe to tune. It also compares Solaris tunables with those in other versions of Unix in order to identify the ones that size automatically in Solaris 2 and therefore don't need manual tuning. (3,000 words)



Q:
Coming from other versions of Unix, I'm used to having a long list of tunable parameters to set up when I rebuild the kernel. What is the equivalent list for Solaris 2?

-- Tuneless in Two Rivers

This is a common question that doesn't have an easy answer. Understanding the distinction between interfaces, implementations, and behaviors is fundamental and it's as good a starting point as any.

Interfaces
Interfaces are designed to stay the same over many releases of the product. This way users and programmers have time to figure out how to use the interface before it changes.

Consider the controls used to drive a car. The driver's interface remains relatively constant. The basic controls for stop, go, and steer stay in the same place. You don't need to know how many cylinders your engine has before you can drive your car. Likewise, a computer's interface must maintain some consistency at many levels to prevent leaving users and administrators in the dark.

Implementations
The implementation hides behind the interface and does the actual work. Bug fixes, performance enhancements, and underlying hardware differences are handled by changes in the implementation. To improve the product, designers often make changes from one release to the next, or even from one system to another running the same release.

If a car engine starts to misfire and you need to lift the hood and tinker with the ignition timing, you suddenly face a lot of implementation details. Outwardly identical cars may harbor major differences under the hood (such as completely different engines). Furthermore, many components change year by year as well.

Behaviors
The behavior of a system changes from one implementation to the next. For example, Solaris 2.5 on an Ultra 1 has the same set of interfaces as Solaris 2.4 on a SPARCstation 5. The behavior is quite different, however, as the latest OS has been tuned in several ways and the new hardware offers better performance.

To use the car analogy again, a BMW 518i and a BMW 540i look very similar, but one has a 1.8-liter four-cylinder engine and the other has a 4-liter eight-cylinder engine. They don't sound the same and they don't behave the same when you put your right foot down!

In normal use there is no need to tune the Solaris 2 kernel, since it dynamically adapts itself to the given hardware configuration and application workload.



Documented configuration tunables
The Solaris 2 AnswerBook Performance Section offers a list of tunable parameters that set the sizes of kernel data structures. These sizes have no effect on performance, but if they are set too low an application might not run at all. Configuring shared memory allocations for databases falls into this category.

Kernel configuration and tuning variables normally are edited into the /etc/system file by hand. Unfortunately, any kernel data that has a symbol can be set via this file at boot time, whether it is a documented tunable or not. The result of such fiddling can be less than ideal. The kernel is supplied as many separate modules (type ls /kernel/* to see some of them); to set a variable in a module or device driver when it is loaded, the variable name must be prefixed by the module name and a colon. For example:

set pt_cnt = 1000

set shmsys:shminfo_shmmax = 0x20000000
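
To make the module-prefix convention concrete, here is a minimal sketch that splits a set line into its module, variable, and value. The helper name parse_set_line and the regular expression are hypothetical illustrations, not part of Solaris, and the sketch only handles decimal and 0x-prefixed hexadecimal values.

```python
import re

# Hypothetical pattern for an /etc/system "set" line:
#   set [module:]variable = value
SET_LINE = re.compile(
    r"^\s*set\s+(?:(?P<module>\w+):)?(?P<name>\w+)\s*=\s*(?P<value>\S+)\s*$"
)

def parse_set_line(line):
    """Return (module, name, value) for a 'set' line, or None for anything else."""
    match = SET_LINE.match(line)
    if match is None:
        return None
    # int(text, 0) accepts decimal ("1000") and hex ("0x20000000") notation.
    value = int(match.group("value"), 0)
    return match.group("module"), match.group("name"), value

for text in ("set pt_cnt = 1000", "set shmsys:shminfo_shmmax = 0x20000000"):
    print(parse_set_line(text))
```

The first line has no module prefix, so the module comes back as None. The second sets shminfo_shmmax in the shmsys module to 0x20000000 bytes, which is 512 megabytes.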

The history of kernel tuning
Given today's self-adjusting kernels, why is there so much emphasis on tuning? And why do we expect kernel tweaks to deliver big performance boosts? I think the reasons are historical. For an explanation, let's revisit my car analogy.

Compare a car from the 1970s with a similar 1995 model. The older car has a carburetor, needs regular tune-ups, and is likely to be temperamental at best. The 1995 car has computerized fuel injection, self-adjusting engine components, and is easier to live with, consistent and reliable. If the old car won't start reliably, you get out the manual and tinker with a large number of fine adjustments. In contrast, the new car's computerized ignition and fuel injection systems have few if any user-serviceable components.

Unix started out in an environment where users had source code and did their own tuning and support. (If you like this way of working, you probably should run the free Unix clone Linux on your PC at home -- if you don't already.) As Unix became a commercial platform for running applications, the user profile changed. Today's users typically only want to run their applications, and consider tinkering with the operating system an unwelcome distraction.

SunSoft engineers put a lot of effort into automating the tuning for Solaris 2. It adaptively scales according to the hardware capabilities and the current workload. The self-tuning nature of modern cars is now a major selling point. Likewise, the self-configuring and self-tuning nature of Solaris contributes to its ease of use and greatly reduces the potential gains from tweaking it yourself.

With each successive version of Solaris 2, SunSoft has removed tuning variables by converting hand-adjusted values into adaptively managed limits.

If SunSoft can describe a tunable variable and offer detailed guidelines about when and how it should be tuned, it could have either documented this in the manual or implemented the tuning automatically. Rather than require manual tuning by users and administrators, SunSoft opted to employ automatic tuning in most cases.

The Solaris 2 tuning manual should really tell you which things don't need to be tuned any more, but it doesn't. This is one of my complaints about the manual, which, in my opinion, is in need of a complete rewrite. It is too closely based on the original Unix System V manual from many years ago, when tuning was needed and worthwhile.

Tuning to incorporate extra information
An adaptively managed kernel can react only to the workload it sees. If you know enough about the workload, you may be able to use the extra information to effectively pre-configure the algorithms. In most cases the gains are minor. Increasing the size of the name caches on NFS servers falls into this category. One problem is that the administrator often knows enough to be dangerous, but not enough to be useful.

Tuning during development
The primary reason so many obscure "folklore" kernel tunables exist is that tunables are often used to provide options and allow tuning during the development process. Kernel developers can read the source code and try things out under controlled conditions. When the final product ships, the tunables are often still there. Each bug fix and new version of a product potentially changes the meaning of the tunables. This is the biggest danger for an end user, who is guessing what a tunable does from its name or from knowledge of an older Unix implementation.

Tuning to solve problems
When a bug or performance problem needs fixing, the engineer typically tries to find an easy workaround that can be implemented immediately. It takes much longer to rewrite and test the code to eliminate the problem, so a proper fix likely won't exist until it appears in a patch or in the next release of the operating system. There may be a kernel tunable that can be changed to provide a partial workaround, and this information will be provided to users. Unfortunately, these "point-patch" fixes sometimes become part of the folklore and are propagated indiscriminately -- a short-term fix in one case may be a long-term problem in another.

In one real-life case a large SPARCcenter 2000 configuration was running very slowly. The problem turned out to be a setting in /etc/system that had been supplied to fix a problem on a small SPARCstation 2 several years before. The administrator had carefully added it during installation to every machine at his site. Instead of increasing the size of a dynamically configured kernel table on a SPARCstation 2 with 32 megabytes of RAM, the tweak was drastically reducing the table size on a machine with 1 gigabyte of RAM. The underlying problem did not even exist in the version of Solaris 2 that was currently being used at the site!

The lesson: Clean out your /etc/system when you upgrade.

The placebo effect
You may be convinced that setting a tunable has a profound effect on your system when it is truly doing nothing. In one case an administrator was adamant that a bogus setting could not be removed from /etc/system without causing serious performance problems. Although the "variable not found" error message that displayed during boot was pointed out, it took a while to convince him that this meant that the variable no longer existed in this release and thus the setting could not be having any effect.

Tunable kernel parameters
The kernel chapter of my book, Sun Performance and Tuning: SPARC and Solaris, explains how the main kernel algorithms work. The kernel tunable values listed in this section include the main tunables worth worrying about. A huge number of global values are defined in the kernel; if you hear of a tweak that is not listed here, think twice before using it. The algorithms, default values, and existence of many of these variables vary from one release to the next. Do not assume that an undocumented tweak that works well for one kernel will apply to other releases, other kernel architectures of the same release, or even a different patch level.

The tables that follow are taken from Appendix A of my book, and contain cross-references to the detailed descriptions in the book.

Primary Configuration Variables in Solaris 2.3, 2.4 and 2.5


Name      Default                     Min  Max   Reference
____      _______                     ___  ___   _________
maxusers  MB available RAM (physmem)  8    2048  "Autoconfiguration of maxusers in Solaris 2.3 and Solaris 2.4" on page 188
pt_cnt    48                          48   3000  "Changing maxusers and Pseudo-ttys in Solaris 2" on page 187

maxusers
I never set maxusers. It sizes itself based on the amount of RAM in the system. In some cases on configurations with gigabytes of RAM it needs to be reduced to avoid problems with lack of kernel address space. The kernel uses up a lot of space keeping track of all the RAM in a system. Several other kernel table sizes and limits are derived from maxusers. The name is historical, and has no real link to the number of users a system is expected to support.

pt_cnt
The variable that really limits the number of remote user logins on the system is pt_cnt. It may be necessary to set the number of pseudo-ttys higher than the default of 48, especially in a time-sharing system that uses telnet from Ethernet terminal servers to connect users to the system. Solaris 2.3 and later are tested up to about 3000 idle, pseudo-tty-based logins. The format of the utmp file entry imposes a practical limit of 62*62 = 3844 telnets and another 3844 rlogins; it is best to keep pt_cnt under 3000.

To actually create the /dev/pts entries, a boot -r is required after pt_cnt is set.
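
The utmp arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the 62 in 62*62 comes from a two-character id built from upper- and lower-case letters and digits, and the helper pt_cnt_is_reasonable is a hypothetical name that simply applies the default of 48 and the 3000 ceiling from the table.

```python
import string

# Assumption: the id characters are a-z, A-Z and 0-9, which gives 62 symbols.
ID_CHARS = string.ascii_letters + string.digits

# Two-character ids, counted separately for telnet and for rlogin.
ids_per_service = len(ID_CHARS) ** 2

DEFAULT_PT_CNT = 48             # default from the table
RECOMMENDED_PT_CNT_MAX = 3000   # ceiling suggested in the text

def pt_cnt_is_reasonable(pt_cnt):
    """True when pt_cnt lies between the default and the recommended ceiling."""
    return DEFAULT_PT_CNT <= pt_cnt <= RECOMMENDED_PT_CNT_MAX

print(ids_per_service)             # 3844
print(pt_cnt_is_reasonable(1000))  # True
print(pt_cnt_is_reasonable(5000))  # False
```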

File Name and Attribute Cache Sizes for Solaris 2


Name        Default                Min  Max    Reference
____        _______                ___  ___    _________
ncsize      (maxusers * 17) + 90   226  34906  "Directory Name Lookup Cache" on page 189
ufs_ninode  (maxusers * 17) + 90   226  34906  "The Inode Cache and File Data Caching" on page 191

ncsize
The directory name lookup cache (DNLC) is sized to a default value based on maxusers. A large cache size (ncsize) significantly helps NFS servers that have a lot of clients. On other systems the default is adequate.

The only limit to the size of the DNLC is available kernel memory. For NFS server benchmarks, the limit has been set as high as 16,000; for the maximum maxusers value of 2048, the limit would be set at 34,906. Each DNLC entry is quite small, since it basically just holds up to a 30-character name. Increase ncsize to at least 5000 on a busy NFS server that has 256 megabytes or less RAM by adding the following line to /etc/system:

set ncsize=5000

If you have more than 256 megabytes of RAM, ncsize will already be big enough.
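
A rough calculation shows where the 256-megabyte threshold comes from. The sketch below applies the default formula from the table, (maxusers * 17) + 90, and assumes maxusers is simply the RAM size in megabytes clamped to the range 8 to 2048. In practice the available memory is somewhat less than the installed RAM, so treat the results as approximate; the function names are hypothetical.

```python
MAXUSERS_MIN = 8
MAXUSERS_MAX = 2048

def default_maxusers(ram_mb):
    """Approximate maxusers as RAM in megabytes, clamped to the table limits."""
    return max(MAXUSERS_MIN, min(MAXUSERS_MAX, ram_mb))

def default_ncsize(ram_mb):
    """Default DNLC size from the table formula: (maxusers * 17) + 90."""
    return default_maxusers(ram_mb) * 17 + 90

for ram_mb in (32, 256, 1024):
    print(ram_mb, default_ncsize(ram_mb))
# 32 634
# 256 4442
# 1024 17498
```

Under these assumptions a 256-megabyte machine defaults to an ncsize of about 4442, just short of 5000, while larger configurations already exceed it.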

Hardware-Specific Configuration Tunables


Name               Default    Min  Max  Reference
____               _______    ___  ___  _________
use_mxcc_prefetch  0 (sun4d)  0    1    "The SuperSPARC with SuperCache Two-level Cache Architecture" on page 159
                   1 (sun4m)

use_mxcc_prefetch
This one falls in the category of knowing your workload and optimizing accordingly. The SuperSPARC's external cache controller can pre-fetch the next cache subblock before you need it. This tends to improve performance in floating-point-intensive applications that sweep through memory sequentially. Database applications have a random access pattern, so prefetching does not help, and will most likely get in the way. By default, prefetch is turned on for desktop systems like the SPARCstation 20, and turned off on servers like the SPARCserver 1000. You could try changing the setting for SPARCstation 20 database servers and/or SPARCserver 1000 compute servers.

System V shared memory and semaphores
Shared-memory parameters usually are set based on the needs of specific applications. Most of these parameters are limits, so setting them too high does not consume any extra resources. The shmsys:shminfo_shmni tunable is an exception, as it causes structures to be preallocated.

Shared Memory and Semaphore Tunables in Solaris 2


Name                    Default   Min       Max            Description
____                    _______   ___       ___            ___________
shmsys:shminfo_shmmax   1048576   1048576   Available RAM  Maximum shm segment size in bytes
shmsys:shminfo_shmmin   1         1         -              Minimum shm segment size in bytes
shmsys:shminfo_shmni    100       100       -              Number of shm identifiers to pre-allocate
shmsys:shminfo_shmseg   6         6         -              Maximum number of shm segments per process
semsys:seminfo_semmap   10        10        -              Number of entries in semaphore map
semsys:seminfo_semmni   10        10        65535          Number of semaphore identifiers
semsys:seminfo_semmns   60        -         -              Number of semaphores in system
semsys:seminfo_semmnu   30        -         -              Number of undo structures in system
semsys:seminfo_semmsl   25        -         -              Maximum number of semaphores per ID
semsys:seminfo_semopm   10        -         -              Maximum number of operations per semop call
semsys:seminfo_semume   10        -         -              Maximum number of undo entries per process
semsys:seminfo_semusz   96        -         -              Size in bytes of undo structure, derived from semume
semsys:seminfo_semvmx   32767     -         -              Semaphore maximum value
semsys:seminfo_semaem   16384     -         -              Adjust on exit maximum value
msgsys:msgmap           100       100       -              Number of entries in msg map
msgsys:msgma            2048      2048      -              Maximum message size
msgsys:msgnb            4096      4096      -              Maximum number of bytes on queue
msgsys:msgmni           50        50        -              Number of message queue identifiers
msgsys:msgssz           8         8         -              Msg segment size (should be word size multiple)
msgsys:msgtql           40        40        -              Number of system message headers
msgsys:msgseg           1024      1024      32767          Number of msg segments

The ones that went away
I looked at HP-UX 9.0 on an HP 9000 server. The sam utility provides an interface for kernel configuration. Like Solaris 1/SunOS 4, the HP-UX kernel must be recompiled and relinked to tune it and to add drivers and subsystems. In Solaris 2, filesystems, drivers, and modules are loaded into memory when they are used, and the memory is returned if the module is no longer needed. Rather than provide a GUI, the whole process is made transparent.

There are 50 or more tunable values listed in sam. Some of them are familiar or map to dynamically managed Solaris 2 parameters. There is a maxusers parameter that must be set manually, and several other parameters that are sized based upon maxusers in a similar way to Solaris 2. Of the tunables that I can identify, the Solaris 2 equivalents are either unnecessary or listed above.

Dynamic kernel tables in Solaris 2
Solaris 2 dynamically manages the memory used by the open file table, the lock table (in 2.5), the callout queue, the streams subsystem, the process table, and the inode cache. Unlike other Unix implementations, which statically allocate a full-size array of data structures and thus waste a lot of precious memory, Solaris 2 allocates memory as it goes along. Some of the old tunables used to size the statically allocated memory in other Unixes still exist in Solaris 2, but now they are used as limits to prevent too many data structures from being allocated. This dynamic allocation approach is one reason why it is safe to let maxusers scale automatically to very high levels. In Solaris 1 or HP-UX 9, setting maxusers to 1024 and rebuilding the kernel would result in a huge kernel (which might not be able to boot) and a huge waste of memory. In Solaris 2, however, the relatively small DNLC is the only statically sized table derived from maxusers.

Wrap-up
Take a look at your own /etc/system file. If there are things there that are not listed in this article and that you don't understand, you have a problem. There should be a large comment next to each setting that explains why it is there and how its setting was derived. You could even divide the file into these sections: configuration, extra information, development experiments, problem fixes, and placebos.
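
One way to start such a review is to flag every setting that has no comment directly above it. The following is a minimal sketch, not a supported tool: it assumes comment lines in /etc/system begin with an asterisk, and the function name uncommented_settings and the sample lines are made up for illustration.

```python
def uncommented_settings(lines):
    """Return 'set' lines that are not directly preceded by a comment line."""
    flagged = []
    previous_was_comment = False
    for raw in lines:
        line = raw.strip()
        # Assumption: a comment line starts with an asterisk.
        if line.startswith("*"):
            previous_was_comment = True
            continue
        if line.startswith("set ") and not previous_was_comment:
            flagged.append(line)
        previous_was_comment = False
    return flagged

# Hypothetical file contents; on a real system, read the lines from /etc/system.
sample = [
    "* Busy NFS server with 128 MB of RAM: enlarge the DNLC",
    "set ncsize=5000",
    "",
    "set pt_cnt = 1000",
]
print(uncommented_settings(sample))  # ['set pt_cnt = 1000']
```

Anything the check reports is a candidate for either a proper explanation or removal.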

I hope I have convinced you that this short collection is the full set of Solaris tunables that should be documented and supported for general use. Why worry about tweaking a cranky and out-of-date Unix system, when you can use one that takes care of itself?

(I must admit that I like tweaking things, and I don't mind messing with my 1980 Fiat Spider most weekends to keep it running. However, our other car is a Saturn SW2, well known as a reliable and trouble-free car.)

Next month
Advice on compiler flags and why dynamic linking is the safest option on Solaris 2.



