Adrian Cockcroft's frequently asked questions
Editor's note: Adrian Cockcroft's Performance Q&A column in SunWorld generates more reader e-mail than any other item in our monthly magazine. We're starting to see some repetition in your letters, so to lighten Adrian's e-mail load and offer you quicker access to Adrian's wisdom, we've compiled the more frequently asked questions here.
As time passes, we'll add popular questions to the top. We'll leave this file at this URL, so you may wish to add this FAQ to your bookmarks list.
Where can I find SymbEL
(aka "rule tool" and "SE")?
Where can I find in-depth documentation for
rule tool?
What's the highest value
ncsize can have?
How do I figure out target
drives for vmstat and iostat?
How does Solaris 2 use memory?
Is there any way to change the time slice on
Solaris?
What's a high process-switch value?
Static vs. dynamic linking.
What is a "Level Red" mutex stall?
Does the current edition of your book cover SunOS 5.5?
What flags in /etc/system are related to security?
I need to reboot my Solaris 2.4 machines every two weeks. Why?
I think swap size doesn't affect performance.
How do I improve my Web server's performance?
What's the latest recommended version of Solaris for
older SPARC computers?
Should I use the SSA NVSIMM and Presto NVSIMMs together?
Any tips for Solaris X86 administrators?
Is there a Solaris 2.4 kernel tuning parameter
that stops unfriendly programs from
taking over a system?
We see an "Allocation errors, kmap full?"
message on a SPARCstation 20 with 512 megabytes. Why?
Help me program asynchronous I/O.
How should I partition my hard disk?
Will I/O be faster on a 64-bit file system,
especially on a database application like Oracle?
What's better for a Web server: UltraSPARC or hyperSPARC?
Why shouldn't I run CacheFS on a read-write filesystem?
How do I interpret the w column in
vmstat?
How do I tune the Solaris kernel?
How can I time-out orphaned processes in Solaris?
What causes slow rlogin?
Any performance tuning hints for Solaris 2.5?
Why do some login IDs in SunOS 4.1 accounting files
change?
Why doesn't my virtual-memory monitoring program
add up?
Are kernel memory allocation errors worth worrying
about?
How can I improve my Web server's http performance?
Does Solaris offer a vmtune-like tool?
Why are my news spool disks overloaded?
Do you have a version of SymbEL for HP-UX?
What is better, SPARC or Pentium?
Why doesn't 32 megabytes
seem like enough?
Is there a way to measure
the amount of CPU used by AIO "waiting" methods?
How many syscalls are too many?
What can I do when the kernel memory button in ruletool goes black?
How large can a process be in Solaris?
Where may I find ruletool?
Q:
Where may I find ruletool?
A: I discuss a new version of the SymbEL toolkit (release 2.5) in my column "Description and Installation for SymbEL release 2.5.0.2". If you already have the SE Performance Toolkit Version 2.5.0.2 or have read Appendix A at the end of my book, you should be familiar with the ten basic rules that indicate the state of parts of the system and the action required to improve bad states. The toolkit makes it easy to include the rules, and they provide a high-level indication of the system's health.
Q: Could you please hint as to where I might find a discussion of how to interrogate the device drivers without using SE?
--Gal Bar-or, firm indeterminate
A: The SE toolkit provides direct access to many of the data sources in the kernel. The primary commands you can use are:
Where can I find in-depth
documentation for rule tool?
Q:
Can you please tell me where I can find in-depth documentation for rule tool?
--Kenny Henderson, (firm indeterminate)
A: I don't understand your request. Ruletool is a script, so you can read the source code. It was also described in depth in articles on www.sun.com that are linked from my column and the SE2.4 download page. That's all there is, apart from the rules in appendix A of my book.
What's the highest value
ncsize can have?
Q:
I'm running a very large Web server and I seem to have the directory
cache "go to amber" (under ruletool) for poor DNLC hit rates. I
increased ncsize to 34000 on a 512-MB, four-CPU SPARC 1000 running 2.4.
Is that the highest ncsize can go? Would tuning somewhere else help?
--(name and firm indeterminate)
A: It can go bigger, but there isn't much point. DNLC performance effects are not great unless you are on an NFS server with too little RAM. Amber is just a warning. You are probably accessing lots of new files once, so the cache will never be able to hit. You do need to make ufs_ninode 34000 as well; otherwise there are not enough inodes for the DNLC to cache references to.
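For reference, the corresponding /etc/system entries would look like this (values taken from the question above; a reboot is needed for them to take effect):

```
* /etc/system: enlarge the DNLC and the UFS inode cache together
set ncsize=34000
set ufs_ninode=34000
```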
How do I figure out target
drives for vmstat and iostat?
Q:
There is some confusion about which target sd0 in the iostat output
refers to. Some people here say it is target 3 (Internal drive) while
others say it is target 0. Which would be correct? The same quandary
exists with the vmstat disk fields, s0 and s3. Which target does each
field refer to? Thanks and I am going to buy your book this afternoon!
--Jeanne Brennan, brennan@hou.moc.com
A: On my system it's set up like this (note that the SE toolkit figures this out for you):
% /opt/RICHPse/examples/disks.se
sd0 -> c0t0d0
sd2 -> c0t2d0
sd3 -> c0t3d0
On older Sun systems it's set up with t3 and t0 swapped; from eeprom(1M):
sd-targets      Map SCSI disk units (OpenBoot PROM version 1.x only). Defaults to 31204567, which means that unit 0 maps to target 3, unit 1 maps to target 1, and so on.
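If you don't have the SE toolkit handy, the instance-to-target mapping can also be read out of /etc/path_to_inst. A minimal sketch, using a hypothetical two-line sample in that file's format (on a live system you would point awk at the real /etc/path_to_inst):

```shell
# Create a hypothetical sample in /etc/path_to_inst format.
cat > /tmp/path_to_inst.sample <<'EOF'
"/iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@3,0" 0 "sd"
"/iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@0,0" 2 "sd"
EOF
# For each sd entry, print "sdN -> target T": the instance number N is the
# field after the quoted path, and the SCSI target T is the part after the
# final "@" in the device path.
awk -F'"' '$4 == "sd" {
    n = split($2, p, "@"); split(p[n], t, ",")
    inst = $3; gsub(/ /, "", inst)
    printf "sd%s -> target %s\n", inst, t[1]
}' /tmp/path_to_inst.sample
```

On this sample the script prints sd0 -> target 3 and sd2 -> target 0, matching the swapped t3/t0 mapping the eeprom documentation describes.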
How does Solaris 2 use
memory?
Q:
We are busy porting a real-time application from Linux to Solaris x86,
and are experiencing problems with regard to memory.
We load about half of our physical memory with data (30 MB), and
even though there should be plenty of memory available, we experience lots
of paging to disk. I would appreciate it if you could enlighten us as
to what the problem is.
--Antony Jankelowitz, (firm indeterminate)
A: If you do any file I/O, it appears as paging; you may want to look at the plock and mlock man pages. Read my October SunWorld article, which talks about how memory is used in Solaris.
Is there any way to change
the time slice on Solaris?
Q:
We are using Solaris 2.3 on a SPARC 20 (four 125-MHz CPUs). We have 15 to 20
percent idle time left over when we have 18 application processes running with a
lot of messages being processed. There are two processes which take up
35% and 25% of the time, respectively.
My question is: Is there any way to change the time slice on Solaris?
I don't know what the default is. But for example, if it is 20 ms, we
could change it to 50 ms for the first process and 40 ms for the second
process. By doing this, there will be less swapping and better turnaround.
Do you think this will make any difference?
--Jay, (firm indeterminate)
A: I don't think the timeslice will make any difference. Upgrading to Solaris 2.5 will help; the kernel is more efficient, especially on MP systems. You have two busy processes and four CPUs; that is why there is idle time.
What's a high process-switch
value?
Q:
I'm trying to tune some SPARC machines. sar -w reports a lot of
process switches in pswch/s: 1000 on a two-CPU 60-MHz SS1000, and
more than 1500 on a four-CPU 85-MHz SS1000.
I ran se2.4 on both systems and it reported nothing wrong. They have
quite different profiles: one is an NFS server, one runs six Oracle
instances (DB server).
--Alexis Grandemange, (firm indeterminate)
A:
Q: Are these common values?
A: These are quite low.
Q: Is pswch an accurate indicator?
A: Yes, the metric itself is accurate, but it is not usually useful as a problem indicator.
Q: Is it possible to reduce the number of process switches without impacting throughput?
A: No; usually, as throughput increases, so does pswch.
Use mpstat to see how many are involuntary versus voluntary context
switches. If there are a large proportion of icsw, then increasing the
timeslice might help a little (see dispadmin). Don't expect any
dramatic improvements.
% mpstat 5
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   68  19    0  1046  826  568   57    0   27    0   285    7  13   0  80
Here there are 568 switches, but only 57 are involuntary.
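As a quick sketch, the involuntary share of the context switches can be pulled out of an mpstat data line with awk (the column numbers follow the mpstat header; the data line is the hypothetical sample shown):

```shell
# csw is column 7 and icsw is column 8 of an mpstat data line;
# report icsw as a percentage of all context switches.
mpstat_line="0 68 19 0 1046 826 568 57 0 27 0 285 7 13 0 80"
echo "$mpstat_line" | awk '{ printf "icsw is %d%% of csw\n", 100 * $8 / $7 }'
```

For this sample that works out to about 10 percent involuntary, low enough that tweaking the timeslice with dispadmin is unlikely to help much.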
Q: Both machines are paging heavily (about 100 p/s).
A: That's not very heavy. It's probably just filesystem I/O activity.
Q: I ran se2.4 on both systems and it reported nothing wrong.
A: In that case, they are probably both OK. Wait until a problem is reported before you start to worry.
Static vs. dynamic linking.
Q:
I would like to add a few points to your column
"Which is better, static or dynamic linking?":
In summary, dynamic linking may be better if you do not mind paying
good money for license fees, disks and networks. It is definitely
better if you are the person selling those licenses, disks and
networks. Otherwise, better take a second look at static linking.
--Hubert Meitz, (firm indeterminate)
A: You can certainly decide to statically link to third-party libraries or ones that would need to be installed on every system. There is no reason to statically link to the standard OS libraries. You can statically link to one set of libraries while dynamically linking to libc, etc., and the window system. Note that Sun's Fortran libraries (libF77.so etc.) are explicitly licensed such that you can install them anywhere they are needed to run a Fortran program that uses them.
The benefit of dynamic linking is that you can upgrade to an improved libF77.so without rebuilding your application. You may also get a platform-specific libF77.so that is optimized for the hardware that it is running on. This may make a big difference for high-end floating point applications.
What is a "Level Red" mutex stall?
Q: Following the guidelines indicated in Appendix A of your SP&T book
we have a four-processor SS1000 in a "Level Red" Mutex Stall (smtx
> 400). Is there a quick hardware fix for this problem (i.e., more
or faster CPUs)? Any advice would be appreciated.
...BTW, great book!
--Todd Resnick, Duke University Medical Center
A: I actually answer this question in "New Release of the SE Performance Toolkit". The main new feature is the performance collator for Web servers. See the SE Performance Toolkit Version 2.5.0.2 for more information.
Adrian:
OK, I'm trying to locate a copy of your book locally, I installed your
SE Tools (cool, big help), but here's one I can't figure out. Is this a
hardware problem or a configuration problem? This particular SPARC 5,
with 32 megabytes of RAM, the OS on a 1-gigabyte disk, and user directories on NFS mounts from an old
(cough, gag) Digital Ultrix 4.4 box, keeps crashing Framemaker sessions. I'm
trying to determine if it's Framemaker, or the box, or the
configuration. I have about 128 megabytes of swap. We have nine of these SPARC 5s
running Framemaker, and they are having problems with NFS. I am
trying to determine if there is any tuning I can do until I can
get them to get rid of the old Ultrix machine.
I'm running the calendar manager monitor, but can't get cm printing to quit truncating the output!! Any help on at least this kmem issue would be highly praised throughout my company.
KERNEL MEMORY INFORMATION
The current kmem state is: amber   Allocation errors have occurred
Total number of kernel memory allocation errors since boot: 1
Number of kernel memory allocation errors this interval: 1
Number of pages of memory on the free list: 274
KMEM RULE THRESHOLD  KMEM_FREEMEM_LOW  default= 64pg  getenv= 64pg
no pages for kernel to use
All I can suggest is to attach truss to the maker process, wait for it to die, then look at the system call sequence before it died to see if there are any clues:
truss -o /tmp/trusslog -p makerpid
then
tail /tmp/trusslog
Adrian:
I am looking for examples of se_2.4 scripts other than those installed in
/opt/RICHPse/examples, especially a class allowing monitoring of a remote
site using rstatd. I use Solaris 2.4 on many SPARCstation 5s.
Does an anonymous FTP site exist with such examples?
Thank you very much for your assistance.
--Robert Rivoir, (firm indeterminate)
Adrian:
I've just installed the latest version of the SE toolkit on
my Solaris 2.5 machine. It seems to work fine. But when
I DISPLAY it on my sunos 4.1.3 machine running Openwin 3,
it kills the openwin!
What can I do to make it work?
--(name and firm indeterminate)
Adrian Cockcroft replies: I don't have any 4.1.3 systems to test with. I have no idea why it might kill openwin. Do other Motif apps work? Do all the SE GUI applications kill it, or just one?
The reader replies: I do not have many Motif applications, but I noticed the same behaviour with Netscape for Solaris. The other SE apps seem to behave the same. Does it mean my 4.1.3 machine has no Motif libraries?
Adrian Cockcroft replies: A good possibility. Readers?
Adrian:
I have followed several articles of yours about tuning and found
the information to be most helpful. Is there a non-graphical/non-audio
based version of virtual_adrian.se? We need to run this at our customer
sites over a telnet connection.
--David A. Dempsey, Endeavor Information Systems
A: virtual_adrian.se 30 -t, to disable the audioplay command.
Does the current edition of your book cover SunOS 5.5?
Q:
Does the current edition of your book cover SunOS 5.5?
--(name and firm indeterminate)
A: The book was written before Solaris 2.5 was released but there were few changes and everything still applies. The performance changes in Solaris 2.5 are the subject of my "Sun Home Page" performance column this month.
What flags in /etc/system are related to security?
Q:
While the article covers kernel and /etc/system parameters, there are still
several things missing. In Solaris 2.4 there was a fix in /etc/system:
set nfs:nfs_portmon=1
which made nfsd only accept connections from low-numbered ports. This is the same as running rpc.mountd on SunOS 4.x with a -n flag. Unfortunately, this same parameter does not exist on Solaris 2.5. What I need is a list of parameters such as this one which have nothing to do with performance tuning but everything to do with securing my system. I'd like a reference source for these sorts of tuning parameters. They are not autoconfigured based on available resources.
A: I concentrate on performance-related parameters; I'm not a security expert. However, you will find that set nfssrv:nfs_portmon=1 works in Solaris 2.5. There were changes due to the integration of NFS3 into the system that seem to have rearranged the nfs modules slightly.
I found that it still existed by running /usr/ccs/bin/nm on /dev/ksyms, then locating the module that contains it in /kernel.
/usr/ccs/bin/nm /kernel/misc/nfssrv | grep portmon
[29]    | 1676|  4|OBJT |LOCL |0 |3 |nfs_portmon
The manual page for nfsd(1m) also documents this change.
If the NFS_PORTMON variable is set, then clients are required to use privileged ports (ports < IPPORT_RESERVED) in order to get NFS services. This variable is equal to zero by default. This variable has been moved from the "nfs" module to the "nfssrv" module. To set the variable, edit the /etc/system file and add this entry:
set nfssrv:nfs_portmon = 1
I need to reboot my Solaris 2.4 machines every two weeks. Why?
Q:
Hi, I have three or four SPARC 5 machines running Solaris 2.4 that have
to be rebooted every two weeks or even every week depending on what the
users are doing (e.g., every week if they continually run Matlab
simulations along with the usual programs such as text-editing, email;
longer for email and text-editing until they start running more
resource-demanding programs). What happens is that the machines slow
down for a while and then completely freeze because they are paging in
and out. The users cannot do anything, not even move the mouse
from one window to another.
I ran some performance statistics using sar, ps and vmstat. On one of the machines which is a SPARCprinter II server, I was convinced that there was a memory shortage. So I added another 32MB RAM expanding the total RAM to 64MB. It still slows down when it's printing, but the system is able to recover from the paging activity unlike before when I had to reboot the system. In this case, I did not think of a possible kernel memory leak, so I did not collect sufficient statistics for analysis (sar -k). Here are the statistics output and notes on the analysis:
(Editor's note: extensive tables and statistics deleted)
I believe that all machines displaying the above-mentioned behavior are experiencing some kind of memory leak somewhere. I tried to set maxusers=40 in /etc/system, but that did not work. I would try setting shared memory (shmsys), but I could not find any helpful hints on it. Please help.
I appreciate any comments or thoughts on this issue.
--Cindy Doehr, (firm indeterminate)
A: It sounds like a kernel memory leak bug. Have you tried loading the latest set of kernel patches for Solaris 2.4?
I think swap size doesn't affect performance.
Q:
Most of your article on performance tuning is just fine, but
I disagree when you state that so long as applications fit,
swap size doesn't affect performance. I used to think
that, too, but then I learned about fragmentation and
observed user complaints going away when I increased their
swap.
Also, you mention ncsize and ufs_ninode. These are
discussed in Sun's old performance tuning overview. I'd
be interested in an updated discussion of these as pertains
to SunOS 5.3-5.
--Anthony D'Atri, (firm indeterminate)
A: The only mechanism I can think of is that the pageout swapfs clustering could be limited if free swap space becomes fragmented into small blocks. This might delay the availability of free pages a bit, but if you are paging in and out at the same time to an overloaded disk, you are already in a slow performance mode. Adding swap space on a separate disk helps.
Regarding ncsize and ufs_ninode, this is discussed in the book, and I have touched on it a few times in my Web articles.
From further discussion it seems that your experience of performance improvement may have been on SunOS 4.x systems. Solaris 2 has a completely different swap space layout policy that avoids this problem as far as I can tell.
How do I improve my Web
server's performance?
Q:
Just wanted to drop you a quick email to add to your no doubt groaning
emailbox, to thank you for the articles on performance on the web.
I am currently a system admin at a small but fast-growing Internet provider in the UK, which means I do everything ;-)
Our Web server, a SPARC 20 running Solaris 2.5, has just died performance-wise over the last week (there seem to be loads of TIME_WAITs), and whilst doing everything else to keep the provider running, I've got to look at it urgently before all our customers leave :-(
Anyway, I asked for a list of /etc/system parms and was recommended to read your articles. I've only just started going through them, but already I can see I have to order your book ;-)
I can see this week will be spent reading everything you've written,
that I can get my hands on, to try and solve this problem, so thought I
should at least email you to say thanks for the info.
--Keith Pritchard, (firm indeterminate)
A: This is the subject of my March column. In this particular case it turned out that the workload had crept up until there was no more Internet bandwidth left to/from this system. Increasing the speed of the Internet link fixed it.
What's the latest
recommended version of Solaris for older SPARC
computers?
Q:
Currently we have an installed base of Sun SPARC 4/110, 4/330
and 4/75s. I was wondering whether running Solaris 2.4/5 is
compatible with these old hardware platforms, and what are the
performance implications by running just Solaris and/or any
simple X-based application.
--Ioannis M. Kyratzoglou, Mitre
A: Solaris 2.4 is the last release that works on a 4/110, 4/260, 4/330, or 4/490 -- we have upgraded our lab systems to 4/600 CPU boards with SuperSPARC modules that will run the latest OS.
All the others run 2.5; the latest releases are faster and smaller than older Solaris 2 versions. Solaris 2.5 uses more RAM than 4.x, and most things are faster than 4.x; a few operations are slower because extra functionality has been added.
If you have enough RAM, I'd upgrade them; if any are marginal, don't bother. 32 MB is probably the minimum: you can't afford to waste CPU cycles paging on a slow system, so you need more RAM to keep up with more recent hardware.
Should I use the SSA NVSIMM
and Presto NVSIMMs together?
Q:
I have heard and read conflicting views of using SSA NVSIMM and Presto
NVSIMM items together. While I understand the configuration problem
of putting one logically before the other in order to have an orderly
recovery after a system crash, I still am not clear that using a
Presto-NVSIMM with an SSA-NVSIMM gives you anything over just the
SSA-NVSIMM alone.
--Kerry P. Boomsliter, Knight-Ridder Information
A: The bandwidth to/from Presto NVRAM is perhaps 10 times that of SSA NVRAM. The CPU does more work with Presto, as the data is copied to and from the Presto NVRAM, but it doesn't bottleneck on the disk interface like the SSA NVRAM.
That is the main difference. Either one gets you most of the performance boost by changing disk accesses into memory accesses, but the higher bandwidth and capacity of Presto gets better response times, and the SSA NVRAM has higher throughput. Both together is the best option for maximum performance. The recent introduction of 16MB SSA NVRAMs (up from 4MB in older SPARCstorage Arrays) also helps performance for write-intensive applications.
Q:
In your
answer to the person whose NNTP server
had overworked disks, you said adding an NVRAM SIMM and
Legato's Prestoserve software was the best thing to do.
I had always associated Prestoserve with NFS service.
Does your advice imply these pieces of hardware and
software are for use in more than just NFS servers?
Should I consider putting them in any machine with a high
I/O load?
--(name and firm withheld)
A: Directory and inode updates are synchronous, and files are flushed when they are closed. The NVSIMM defers and coalesces all synchronous I/Os, and has no effect on regular writes to a local filesystem. NFS writes are also synchronous, which is why it helps NFS. News and mail do a lot of directory and inode updates, and create/move/delete small files, which is why the NVSIMM helps a lot.
Q:
You seem to be very free in recommending NVSIMMs, everywhere
filesystem performance is mentioned :-)
Having recently gotten my greedy little hands on ODS4.0,
I am playing with the "metatrans" device, known to the rest
of the world as journaling filesystems. It seems to speed
up filesystem access quite nicely, although I have never
had NVSIMMs to play with, to compare.
Could you provide some juicy details comparing the pros
and cons of both approaches?
--Philip Brown, (firm indeterminate)
A: Bandwidth to an NVSIMM is greater than 100MB/s, SBus Presto is perhaps 30MB/s, a log disk is 2-5MB/s, and random writes go at 400KB/s on a good day.
Converting random writes to any of the others is faster; the NVSIMM is the lowest-latency, lowest-overhead option.
Remember that the data must be written to the log (low latency as far as user is concerned) then must be read back and written to the filesystem (extra overhead and throughput needed, not latency sensitive).
Philip Brown replies: All the data has to be written there?? This seems strange to me. How is it, then, that it speeds up ufs throughput as much as it does? If it only does that, it would seem to me that it would only become equal to ufs speed, not exceed it.
Adrian answers: Only the synchronous writes go to the log: inode updates, and synchronous writes on NFS servers.
It speeds up allocation of blocks by forcing them to be sequential, regardless of how many inode and indirect block updates are needed on the way.
Any tips for Solaris X86 administrators?
Q:
(Your) performance tuning articles are excellent, but I wonder
if anyone has any performance tips or gotchas for the vagaries
of the Intel platforms running Solaris x86.
What special concerns come up on those hardware platforms?
--(name and firm indeterminate)
A: I don't have any Solaris x86 systems (I work in SMCC remember :-) but almost everything I say about Solaris on SPARC applies also to Solaris x86. The SE toolkit supports Solaris x86.
Is there a Solaris 2.4
kernel tuning parameter that stops unfriendly programs from taking over
a system?
Q:
Is there a Solaris 2.4 kernel tuning parameter (like maxuprc)
that would allow sysadmins to stop unfriendly programs from
taking over a system? The problem we have sometimes seen is
a poorly written program forking off infinite copies of itself
until the machine dies or hits its process limit. We want to
be able to limit a user's total to, say, 100 processes.
Is this possible under Solaris 2.4?
--Lance Nakata, Stanford University
A: The maxuprc variable does this for you in Solaris 2.x:
set maxuprc=100
in /etc/system, then reboot.
We see an "Allocation errors, kmap full?"
message on a SPARCstation 20 with 512 megabytes. Why?
Q:
Hello Adrian. I have read your book a dozen times and use your tools.
Excellent. I have a question about an "Allocation errors, kmap full?"
message we received last week on one of our production servers. It is
a SS20 with 512 megabytes of RAM. For some weird reason, it started
canceling telnet and rlogin connections, and I have a feeling this was
during the same time we received the kmap full error messages. Could
you explain? Every once in a while we would receive mutex contention
errors as well.
I know this is working in the dark without the specs on the system, processes running, system configuration, and things like that. But you are an expert, and I figured you could point me in the right direction.
Thanks!
--Neil Greene, Sr Oracle DBA / Unix Administrator,
SHL Systemhouse
A: If the kernel can't grab memory, it will cause a login or telnet to fail, and you will get allocation errors.
If the problem persists until the machine stops working and needs a reboot to fix, it means the kernel got too big. To fix this, reduce maxusers to 200 or so, set bufhwm to 4000, upgrade to 2.5 (which has more kmap on sun4m), or upgrade to SS1000 or UltraSPARC systems, which have much bigger kmap.
If it comes and goes, then the free list was empty, so there were no pages for the kernel to grab. Set lotsfree to 512 and desfree to 256, leaving minfree alone, and increase slowscan to 500.
This is a fairly common problem with 512MB SS20s.
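Gathered into /etc/system form, the suggestions above look like this (the values are the ones given in the answer; treat them as starting points, and reboot to apply):

```
* /etc/system: workarounds for kmap exhaustion on a 512MB sun4m
set maxusers=200
set bufhwm=4000
* keep more pages on the free list for the kernel to grab
set lotsfree=512
set desfree=256
set slowscan=500
```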
Help me program asynchronous I/O.
Q:
I was wondering if you could point me in the right direction on which
Solaris 2.4 patch will enable me to do asynchronous I/Os (aio_read and
aio_write). The man page says that async. support is a future release,
but in one of your articles you mentioned a patch that would allow
async. I/O. I just installed the latest jumbo patch (101945-34) but
the routines still return -1 (errno set to ENOSYS). Any help would be
greatly appreciated.
Thanks,
--Chuck Williams,
Senior Telecommunication Systems Engineer, Loral Test & Information Systems
A: You need to look on the second CD that comes with 2.4, or in the Patches directory on the main 2.4 CD. Kernel async I/O was shipped with the 2.4 release but was not installed by default. There is probably an updated release of that patch to look for once you know its number.
In the meantime, the aioread calls should work with no patches; the KAIO fast path in the patch is only really needed for Sybase on raw disks.
My guess is that you are not using the API correctly in some way. (See the feature story Programming asynchronous I/O.)
How should I partition my hard disk?
Q: I always face hot comments when I suggest bundling /, /usr,
and /var into one large partition... I do not think that
having separate partitions is needed anymore. Am I right?
Is there any good reason to split them these days? I know
that in the past it was needed because of small disks, but
now? It is an issue I would like to close once and for all.
What are your thoughts about this?
--Benoit Gendron, (firm indeterminate)
A: My book Sun Performance and Tuning: SPARC and Solaris contains my thoughts on this subject. I recommend one partition for desktops, and keeping /var separate on servers only -- so that /var/mail can have Prestoserve acceleration.
This also makes upgrades much easier.
Will I/O be faster on a
64-bit file system, especially on a database application like
Oracle?
Q:
You have not focused on 64-bit file systems and performance
in your article. Will I/O be faster on a 64-bit file system,
especially in a database application like Oracle?
--(name and firm indeterminate)
A: 64-bit file sizes and file systems can and will be implemented on any system. Solaris already supports 1 TB file systems, and 2 GB files.
Oracle runs best on a raw disk setup, and there are no 64-bit features that would speed up file system accesses.
What's better for a Web server: UltraSPARC or hyperSPARC?
Q:
We have a SS20/712 as an application-layer firewall that is
(at times) completely CPU bound with all the http traffic
going through it. We are in the process of enhancing the
http proxy with all the recommendations made on this Web page.
However, until that is done we want to increase the throughput
via hardware, i.e., faster processor. We are looking at
the 100-MHz HyperSPARC setup, but don't know what the optimal
cache size would be. We have a choice of 1M or 256K. Please
help. In a networked environment what would be the preferred
(fastest) for us?
--Mike McPherson, (firm indeterminate)
A: I would spend the money on an UltraServer 1 Model 170.
Solaris 2.5 is a bit more efficient than 2.4, and the faster CPU and system bandwidth will probably work better than a dual CPU SS20.
I don't think HyperSPARC systems run kernel code as well as SuperSPARC systems. I found the 125-MHz 256KB HyperSPARCs were about the same as 60-MHz SuperSPARCs for running commercial applications like database backends that do a lot of kernel work.
Is the http proxy forking for every request? If so, a preforked or threaded proxy would be much better -- i.e., Netscape or phttpd or Apache, but not CERN or NCSA.
Why shouldn't I run
CacheFS on a read-write filesystem?
Q: I attended a seminar you gave
at a Computer Literacy in San Jose a while back and remember you
mentioning a caveat about using CacheFS.
I remember you saying something like "it's not a good idea to use CacheFS on a r/w filesystem." What I can't remember is WHY. Is it because writes through CacheFS are slower, or is it because writes through CacheFS are unreliable? Or does having an r/w fs mounted through CacheFS cause performance of CacheFS to drop in general?
Also, do you have any suggestions for CFS option settings for read-mostly
filesystems?
--Jim Burwell, Systems/Network Admin., Broadvision
A: Read-mostly is fine. If you only read the data once, don't bother caching it; if you keep changing a lot of it, caching it is a waste of time.
If you have a few updates, but mostly read the data it should give a good speedup. /var/mail is a really bad choice, /home is usually OK, /export/local (or whatever you mount applications on) is a good idea.
How do I interpret the w column in vmstat?
Q:
We have a SPARCserver 1000E/Solaris 2.4 with four CPUs and 610 megabytes of RAM
as a dedicated Sybase server. vmstat 5 is used to monitor the
system at all times. Recently, the third column 'w' of
'procs' in vmstat's output started to report a value of around
20 and rarely changed. This value shows up again even after
rebooting the system many times during the past month. This
seems to indicate we have a memory shortage because
swapping occurred. But my questions are:
A: vmstat w reports the number of processes that are currently swapped out. Those 20 processes are all idle ones. This is not a performance problem. The Loukides book is rather out of date in places, and is not particularly relevant to Solaris 2.
If you run vmstat -S and see lots of si and so, you might have a problem. Here's a reminder of what vmstat -S looks like:
% vmstat -S 5
 procs     memory            page             disk         faults       cpu
 r b w   swap   free  si so pi po fr de sr f0 s0 s1 s2   in   sy  cs us sy id
 0 0 0 137392  15608   0  0  2  2  5  0 55  0  0  0  0  132  319  69  2  1 98
Swapping moves whole processes to the swap space, and paging is done a page at a time. Page-outs occur in large clusters, so the net effect is not all that different.
Swap space size is not a performance issue. If you have enough to run your apps reliably without running out at peak loads then you should be happy. If you want to collect crash dumps you might need more. That is one reason why SunService recommends setting swap equal to RAM.
How do I tune the Solaris kernel?
Q:
We are veteran Interactive users and used to tuning the kernel
using kconfig (mtune and stune files). We
are now porting to Solaris x86 (Base Server) and need to be able to
make equivalent tuning changes. In particular, we need to increase the
various values associated with IPC queue (MSGMAX, MSGTQL, etc.). We
have found one cryptic way to do this by hacking at system(4). Is there
a better way and is there any comprehensive documentation source on
tuning kernel parameters under Solaris x86? Thanks.
--Ken Robbins, firm indeterminate
A: The "better way" involves editing /etc/system.
The performance manual section offers little help, but does list some parameters. My book (Sun Performance and Tuning) contains more details, including the algorithms that are being tuned.
Your question is a common one. I will probably address the question "What is the list of tunables in Solaris?" in a future column. There is no easy answer, unfortunately.
How can I time-out orphaned processes in Solaris?
Q:
At Brown & Root, we run both Solaris and AIX servers.
On all servers, we have Oracle as our database. On
occasion, some clients' Oracle processes
remain active even after they have logged off. In AIX,
we have found two parameters, tcp_keepidle and tcp_keepalive,
that help us time out these orphaned processes. Is
there anything comparable in Solaris?
--Jacques Dejean, Brown & Root
A:
You're looking for the Solaris ndd command. You can find a description of it, and of the values it can be assigned, in Appendix E of TCP/IP Illustrated, Volume 1 by W. Richard Stevens; the book is also a complete reference to TCP/IP and how it works.
Make sure you understand the implications of any TCP tweaks; it is easy to break the standard algorithms if you set them up wrong.
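As a hedged sketch, the relevant Solaris ndd parameter is tcp_keepalive_interval (units are milliseconds; the default is 7200000, i.e. 2 hours). Lines you might add to a boot script such as /etc/init.d/inetinit -- check the defaults on your release first:

```shell
# Assumption: Solaris 2.x parameter names; units are milliseconds.
ndd /dev/tcp tcp_keepalive_interval              # read the current value
ndd -set /dev/tcp tcp_keepalive_interval 600000  # probe idle connections after 10 minutes
```

With a shorter keepalive interval, the TCP connections behind the orphaned Oracle processes are probed and torn down sooner, which lets the server-side processes exit.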
What causes slow rlogin?
Q:
What are likely causes of extremely slow rlogin both to and from a
machine? The machine in question is seldom busy. It takes about 60
seconds to do rlogin from or to the machine. Once rlogin is completed,
response is fine.
--Mike Kelly, firm indeterminate
A: Check for routing and name-service problems. I get this problem myself, normally due to routing foul-ups. It may also help to put "files" at the start of the name service lookup path for hosts and passwd. I use "files nisplus dns" for hosts in /etc/nsswitch.conf, as I find that the system boots much more quickly when it looks up its main routers and servers in /etc/hosts.
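A sketch of the corresponding /etc/nsswitch.conf entries (adjust the source list to your site; "nisplus" here assumes a NIS+ site like the one described):

```
hosts:      files nisplus dns
passwd:     files nisplus
```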
Q: This is in response to your December Performance Q&A column and the question about what causes slow rlogins.
The most common reason I see (and hear about) for a slow login is a remote site using daemons or protocol wrappers that use the ident protocol to look up who is trying to connect. I use a TCP wrapper that logs user names this way. If the client machine is not running an identd daemon, the lookup can delay the login by up to a configurable timeout (2 seconds in my case).
Another common cause on a busy machine is when the remote
site does not have enough physical memory and must swap to
get the login daemons or the shell loaded and running.
Hope this helps...
--Michael Johnson, CS Undergrad,
Oregon State University
Any performance tuning hints for Solaris 2.5?
Q:
I'm hoping that when Solaris 2.5 comes out you can dedicate
an article or a series of articles to the improvements and
kernel /etc/system parameters that should or should not be
set for 2.5. In reading your book, you gave different
hints for different types of systems (i.e., servers vs.
hosts), and the hints varied depending upon the version of
Solaris being used. I'm guessing that when 2.5 comes out,
it'll be different from 2.4, so it'd be nice to know what
changes have been made and what performance tuning hints
are applicable for 2.5.
--Blair Zajac, firm indeterminate
A: Since I'm writing this before Solaris 2.5 is officially released, I can't offer much guidance yet. I'll cover tunables soon. It takes a while to figure out how to tweak a new OS release. There are a few new NFS V3 variables. The rest is basically identical to 2.4.
Editor's Note: Solaris 2.5 was announced at the end of October. Shipment for the SPARC and Intel versions of the new OS has just begun; Solaris for PowerPC recently entered beta testing and is expected to ship early next year.
Why do some login IDs in SunOS 4.1 accounting files change?
Q:
I hope you can help me out. Lately, I had to look at the
/usr/adm/acct/fiscal/fiscrptxx files. I found that some login IDs
had two entries per file, while other login IDs had one.
Can you tell me what the problem is, or at least give me a hint? I need to use the files for performance evaluation purposes. Do I have to add up the entries corresponding to a given login ID per file?
--Halim M. Khelalfa, AI Division, CERIST
A: I'm not sure, and I haven't used accounting on 4.1.3 for many years. One guess: Perhaps some users changed their group ID, keeping the same user ID, during the month.
Why doesn't my virtual memory monitoring program
add up?
Q:
I now have a hard copy of your System Performance Monitoring
article and will read it soon. First, I am going to take advantage
of your offer to answer questions about this subject.
One of my personal monitoring programs presents physical memory utilization, which I calculate based on the following method. I believe it works and the assumptions are correct, but I'd like your opinion on its accuracy.
First I get some static facts:
V = total virtual memory size (everything is in kilobytes)
R = total real (physical) memory size
Next, I get 1-second snapshots of transient facts:
A = allocated (in-use) virtual memory
F = free (available) virtual memory, in the form of free resident memory pages (now in the physical memory)
IF ( A + F ) >= R THEN U = 100% ELSE U = (100% * A) / (R - F)
There are cases when A < R but I report 100 percent because of the free pages that inhabit physical memory, forcing some allocated pages to be swapped out. I am not concerned about this because there will be (at least a potential for) thrashing, and that's what 100-percent physical memory utilization is supposed to indicate. What I am concerned about is when ( A + F ) <= R yet there is a potential for thrashing -- and I don't know why -- because there is something missing from my equations.
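The formula above can be sketched as follows (the input values here are made up for illustration; units are kilobytes, as in the definitions):

```shell
# Sketch of the reader's utilization formula with illustrative values.
awk 'BEGIN {
    R = 65536            # total real (physical) memory, KB
    A = 50000            # allocated (in-use) virtual memory, KB
    F = 10000            # free resident memory pages, KB
    if (A + F >= R) U = 100
    else            U = 100 * A / (R - F)
    printf "U = %.1f%%\n", U
}'
# prints: U = 90.0%
```

As the answer explains, the quantities that vmstat and ps actually report do not map cleanly onto A, F, and R, so treat this as the model under discussion rather than a working monitor.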
--Alex Vrenios, EMTEK Health Care Systems
A: Look at my SunWorld column entitled "Help! I've lost my memory!" Then it may become clear why your calculation does not work. The VM system is far more complex than your simple equation presumes. I don't think the available data is sufficient to model memory use. In particular, the only data available on a per-process basis is the size of the address space for the process, and the amount that has valid memory mappings. These values can be seen (measured in kilobytes) via the old-style ps command, in the SZ (process size) and RSS (process resident set size) fields:
% /usr/ucb/ps uax
USER       PID %CPU %MEM    SZ  RSS TT      S    START  TIME COMMAND
root      2026  3.0  2.1  1424 1284 pts/6   O 23:14:29  0:01 /usr/ucb/ps uax
adrianc   2021  0.7  4.1  3444 2500 ??      S 23:14:26  0:00 /usr/openwin/bin/c
adrianc   1785  0.6 11.1 10048 6840 console S 20:50:55  1:12 /usr/openwin/bin/X
adrianc   2024  0.3  1.4   980  856 pts/6   S 23:14:27  0:00 /bin/csh
...
Unfortunately for your calculation, the RSS excludes pages that are in memory but do not have valid mappings, and it includes pages that are shared by other processes. Your calculation also doesn't consider the memory used by files that are cached. To obtain this data, kernel code would have to be written that traverses many data structures and tallies the pages. This is not available in the base release, or in any commercial performance tools that I am aware of.
I think it would be useful to have more information about memory usage, and it is on my list of things I'd like to see added to Solaris.
Are kernel memory allocation errors worth worrying
about?
Q:
While recently monitoring a SPARCcenter 2000E with RuleTool, I noted that
the system was regularly experiencing kernel memory allocation errors.
I tried to find some info on the seriousness of this, but wasn't
able to find much other than it possibly being caused by a
memory leak. A call to SunService seemed to indicate that as
long as the frequency was very low (it was, approx. 5 per day)
it wasn't a cause for concern.
I would like more info on this in a future article (or in your next book Performance Tuning: The Sequel). I'd like to congratulate you on your book and I assume it's doing well considering it fills a void that's been around for years. (I purchased two copies myself, and have influenced several others in purchasing the book.)
--Greg Wells, firm indeterminate
A: If the system can't grab memory when it needs it and can't wait, then you can get problems like a stream or login attempt failing. There are other reasons why allocation failures occur; in most cases, the system finds a way to retry the operation and succeed.
This problem happens mostly on Solaris 2.4 and 2.5 multiprocessor systems, not so often on Solaris 2.3 or uniprocessor systems. If you see kmem allocation errors (sar -k 1), then increase the free list so that it is less likely to hit the endstop.
Set lotsfree to 128 times the number of CPUs you have, or set up virtual_adrian.se to run every time you reboot and it will set this for you.
Adding more RAM doesn't help, as the free list size is not scaled. As you can see below, I've had a few errors on my Ultra 1, but I take this as a warning, not a serious worry. It is useful to track it in case something else fails at the same time as a new kmem error.
% sar -k 1

SunOS eccles 5.5 Generic sun4u    11/05/95

23:30:18 sml_mem   alloc fail  lg_mem   alloc fail ovsz_alloc fail
23:30:19 4046848 3611540    0 7536640 6492776    8    5373952    0
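The rule of thumb above (lotsfree = 128 per CPU) can be turned into an /etc/system line with a one-liner; NCPU here is a placeholder you would derive from psrinfo on the machine itself:

```shell
# Assumption: a 4-CPU machine; on Solaris you would get NCPU from psrinfo.
NCPU=4
echo "set lotsfree=$((128 * NCPU))"   # line to append to /etc/system
# prints: set lotsfree=512
```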
How can I improve my Web server's http performance?
Q:
My issue deals with a problem I have been seeing on more
and more Solaris 2.4 systems running as WWW servers.
First, I do realize that the http protocol was not designed to work with TCP/IP. In fact, it butchers it, but since it's a growing phenomenon, we need to tune the system for it!
Now, the problem I have been seeing. When dialup users connect to these WWW servers via SLIP/PPP, Solaris apparently drops a lot of packets, and a lot of retransmissions are occurring as shown from the results of the netstat -s command.
What I discovered is that the default setting of 200 for tcp_rexmit_interval_min is too low. Setting this up to 10000 finally gives good performance results. However, as you are well aware, this will increase the amount of time the system waits before a retransmission takes place after a packet is dropped. Catch-22! ;)
I also noted that the listen backlog parameter set by ndd: tcp_conn_req_max is set to 5 and allows a maximum value of 32.
How can I optimize Solaris 2.4 to perform well as a WWW server? More and more clients are asking me to improve WWW performance on Sun.
--Boni Bruno, Data Systems West
A: There is an excessive-retransmit bug that is fixed in Solaris 2.4 patch 101945-34 (Sun's recently released kernel jumbo patch) and in Solaris 2.5. You will still see retransmit levels of 10 to 30 percent on machines with direct Internet connections. You can reduce them by setting the initial retransmit interval to a second or so (1000, as the units are milliseconds). Most packets seem to take at least a second to get to their destination and get an acknowledgement back over the Internet! You should not set it to much more than a second.
The limit for tcp_conn_req_max should be set to 32 in 2.4, and can be set up to 1024 with ndd in 2.5 if you have enough memory to hold all those pending connections. A setting of 128 seems to work well on Solaris 2.5, and is being used on some big Internet sites.
Add these lines to /etc/init.d/inetinit:

ndd -set /dev/tcp tcp_rexmit_interval_initial 1000
ndd -set /dev/tcp tcp_conn_req_max 32
We also have fast name service caching in 2.5, so DNS (Domain Name System) lookups get cached (see the nscd man page). In general 2.5 is a much faster Internet server than 2.4, even though there are several areas where tuning work is still underway.
Does Solaris offer a vmtune-like tool?
Q:
I've recently started using Suns. With Sequent Dynix/ptx (based on AT&T
V3.2), a vmtune utility controls virtual memory (VM) management.
--MyungSuk Yoo, Bombardier Regional Aircraft
A: There are no controls on resident set size per process in any of the mainstream versions of Unix. Why not? Well, it's hard to get a default behavior that works any better than the current system over a wide range of system sizes and workloads. Also, implementing a working set pager requires a lot more overhead, in terms of both CPU use and kernel data storage.
In Solaris 2.4, swapouts of large idle processes occur if free memory stays well below its normal level for several seconds.
Why are my news spool disks overloaded?
Q:
I am running a news server and I am getting very poor
performance from it. It is running on a SPARCserver 1000 with
640 megabytes of RAM. The news software (INN) resides on /opt (sd1)
and Solaris 2.4 resides on sd0. iostat -x 30 indicates that at least
one of my bottlenecks can be attributed to my disks, primarily the spool.
I am striping 3 disks (sd15 sd37 sd7)
using Online DiskSuite. The stripe has an interlace value of 16 blocks.
Below is some of the output from iostat -x 30. As you can see, most of the load is caused by writes to the spool.
                         extended disk statistics
disk  r/s  w/s  Kr/s  Kw/s wait actv   svc_t  %w  %b
sd0   0.0  3.3   0.0  19.9  0.0  0.1    34.6   0   5
sd1   0.0 15.7   0.0  99.3  0.0  0.4    24.7   0  39
sd15  1.4 18.0   3.6  97.9 45.1 46.9  4737.6  44  79
sd37  0.7 16.5   3.2  99.9 10.5  7.1  1025.1   9  22
sd7   0.5 16.5   2.7  98.7  9.7  6.8   972.8   9  20
                         extended disk statistics
disk  r/s  w/s  Kr/s  Kw/s wait actv   svc_t  %w  %b
sd0   0.0  3.8   0.0  24.6  0.0  0.1    38.2   0   5
sd1   0.0 15.9   0.0 101.1  0.0  0.4    23.9   0  38
sd15  1.1 17.5   2.3 100.3 14.0 54.2  3656.0  36  73
sd37  0.8 15.5   3.3  96.9  9.0  6.6   961.8   8  21
sd7   0.6 15.5   3.2  97.0  8.9  6.1   929.2   8  18
--(name and firm indeterminate)
A: Those disks are dead meat! A slow service time is 50 milliseconds; 4737 ms is glacier-like speed. As you can see, there are 47 active commands inside the disk drive, and 45 commands waiting to be sent to the drive. Each new command you send to the drive has to wait for 92 other commands to finish first. Thus it takes almost 5 seconds to service each I/O. Dividing down, 4737 ms/92 commands = 51 ms for each I/O at the disk drive. This indicates a lot of long seeks -- probably random seeks between inodes and data in many parts of the disk drive.
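The arithmetic above can be checked directly: the total queue length is actv plus wait, and the per-command time is svc_t divided by that queue. Using the sd15 numbers from the first sample:

```shell
# sd15 figures from the first iostat -x interval above.
awk 'BEGIN {
    svc_t = 4737.6   # reported service time, ms
    actv  = 46.9     # commands active inside the drive
    wait  = 45.1     # commands waiting to be sent to the drive
    queue = actv + wait
    printf "%.0f ms per I/O\n", svc_t / queue
}'
# prints: 51 ms per I/O
```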
The problem is lots of files being created, touched, and destroyed; lots of inode updates; and Directory Name Lookup Cache (DNLC) activity (i.e., a busy NNTP [Network News Transfer Protocol] server).
The best fix: Add non-volatile (NV) SIMMs and Legato's Prestoserve software. This will help a lot more than anything else. If the disks are still too busy, you need more of them, and you need an NVRAM disk cache. A SPARCstorage Array (SSA) with 12 or so disks would give you a wider stripe. The SSA NVRAM is a reasonable substitute for the Prestoserve NVSIMMs, but both together is even better. Note that you do not need the storage capacity of 12 disk drives, but it looks as if you need their random I/O performance. Twelve disks may seem extreme, but so does a 4700-ms service time!
Increasing ncsize and ufs_ninode to 34000 in /etc/system may help a little. With 640 megabytes of RAM, maxusers should be at 640 already, and the caches will already be quite large. If you have set maxusers directly to some low value then you should remove it from /etc/system and let it size automatically.
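Collected as an /etc/system fragment (the 34000 value is the one suggested above; '*' is the comment character in that file):

```
* name lookup and inode cache sizes suggested in the answer above
set ncsize=34000
set ufs_ninode=34000
```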
Where can I find the HP-UX version of SymbEL?
Q:
I've printed out your article to read while working on my HP-UX system.
Of course, my question is do you have a version for 9.X on a 9000/735?
--dave, (firm indeterminate)
A: It is difficult to build a useful SE language on HP-UX, AIX, or other OSes. The trick on Solaris 2 is that the /dev/kstat interface is a read-only, nonprivileged interface; note that vmstat and friends are no longer setuid commands in Solaris 2. This means that you can write simple scripts that get at almost all the performance data without running as root and without making the binary setuid.
That said, Rich Pettit has recently been looking at the performance interfaces on other platforms. As far as we can tell they are effectively undocumented on HP-UX, and without source code it is hard to work out what to do.
Overall, one of my aims was to make Solaris 2 a better OS to work with than other OSes, so I'm not very motivated to spend time working on ports. I also have way too many unimplemented ideas for Solaris 2 to work on.
Which is better, SPARC or Pentium?
Q:
My organisation is arguing about SPARC servers versus Unix Pentium-based servers. What I really have to prove to them is that the SPARC motherboard is faster, more reliable and robust, and is a market leader.
Could you please mail me relevant information QUICKLY before
the management decides to scrap Sun?
Thanks
--Jean-Pierre, (firm indeterminate)
A: This is really a job for your local Sun sales team to take on. What I can say is that we have been benchmarking Solaris on 166-MHz Intel Pentiums and on Ultra 1s; for network server workloads the Ultras are two to three times faster. The PC motherboard and Ethernet hardware do not have the bandwidth to compete. We actually get SBus throughput of over 100 MB/s on an Ultra 1; the PCI bus on a Pentium PC is rated at about 100 MB/s in theory, but in practice you are lucky to get over 10 MB/s.
The PC hardware is designed down to a price, and most of the I/O adaptors are also not designed for demanding applications.
Some Web server benchmarks backing this up should be put up on www.sun.com soon, but they are not available yet.
The other issue is support: since the hardware and software are from the same company, we can give better integrated support, with less finger-pointing between hardware, OS, and I/O card suppliers when something isn't working.
Why doesn't 32 megabytes
seem like enough?
Q:
The short question is, "Why does it seem that a SPARC 4 running Solaris 2.4 with 32 megabytes of RAM has too little memory?"
I have such a system and it seems almost hopeless to have emacs, gdb, netscape, and g++ running -- unless you like to hear the disk spinning (paging).
I think your columns are great, so thanks!
--Dave, (firm indeterminate)
A: See Adrian's May column for an indirect answer to this question.
Is there a way to measure
the amount of CPU used by AIO "waiting" methods?
Q:
Based on the way kaio works in Solaris, the question has
come up regarding the possibility of being CPU bound because
you are i/o bound. The thought process on this is based on
the asumption that Solaris is asynchronously "waiting" for
the i/o to complete by doing a SIGIO with a polling timeout
versus the synchronous method of using the aiowait function.
Can you help to clear this up ? Is there a way to measure
the amount of CPU used by these "waiting" methods ?
--Marty Carangelo, Amdahl
A: Solaris doesn't poll. The user application might if it were using aio, but Solaris waits for the interrupt to wake up the thread that issued the I/O.
How many syscalls are too many?
Q:
I don't find a threshold mentioned for the three fault categories. Here is a sample of output from an SC2000:

 r b w   swap    free re mf pi po fr de sr s1 s1 s1 s3  in     sy  cs us sy id
 0 0 0 2502796 1136148  0  0  0  0  0  0  0  0  0  0  0 172 103473 498 61 29 10
 0 0 0 2502796 1136148  0  0  0  0  0  0  0  0  0  0  0 137 104002 487 62 29  9
 0 0 0 2502796 1136148  0  0  0  0  0  0  0  0  0  0  0 154 104144 463 62 29  9
 0 0 0 2502796 1136148  0  0  0  0  0  0  0  1  0  0  0 109 104189 467 63 28  9

It seems to me the fault/sy numbers are high. What do you consider high?
A: Syscalls are a byproduct of an application doing work. The more the better, as it means you are getting more work done more quickly.
In absolute terms, 100,000 per second is quite reasonable for an SC2000, especially since your usr:sys times are in a 2:1 ratio, which is quite healthy.
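The 2:1 ratio comes straight from the us and sy columns of the vmstat output; for example, one sample line shows us=62 and sy=29:

```shell
# usr:sys ratio from one vmstat sample line above.
awk 'BEGIN { us = 62; sy = 29; printf "usr:sys = %.1f:1\n", us / sy }'
# prints: usr:sys = 2.1:1
```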
What can I do when the kernel memory button in ruletool goes black?
Q:
We have a SPARCcenter 2000 with 8 cpus, 3 gb of RAM, 150 gb of disk.
We are running a lot of processes, a dbms, and have quite a few users.
I tuned the system file as near as I can tell according to the
guidelines in your book and using the output of ruletool. The box keeps
coming to its knees with kmem allocation errors - no kmem available
(from ruletool). Of course - the kernel mem button in ruletool goes
black and the box can't recover. I end up having to stop-a and reboot.
Any help you can provide would be greatly appreciated, as I am about to be run out of Dodge by the local townsfolk.
--Ted Regan, EDS
A: This is fixed in Solaris 2.5.1 with an algorithm change and a bigger free list. In the meantime, set slowscan=500, and keep doubling lotsfree and desfree until the allocation failures go away.
Try this first in /etc/system (assuming lots of RAM -- the 3 GB you have):

set slowscan=500
set lotsfree=4000
set desfree=2000
How large can a process be in Solaris?
Q:
You say a SPARCcenter 2000 can support 5 gigabytes of RAM; is there still a 4-gigabyte per-process limit on this system?
Does Solaris 2.5 support more than 4 gigabytes per process?
Does Solaris 2.5.1 support more than 4 gigabytes per process?
--Chris Krebs, (firm indeterminate)
A: Note that the Enterprise 6000 machine can support 30 GB of RAM, and both it and the SC2000 are limited by DRAM density, not address space.
The SC2000 has a 32-bit virtual address space that maps to a 36-bit physical address space; that is how many 4-GB processes can share a much larger amount of RAM.
About the author
Adrian
Cockcroft joined Sun Microsystems in 1988, and currently works as a
performance specialist for the Computer Systems Division of Sun. He
wrote Sun
Performance and Tuning: SPARC and Solaris and Sun
Performance and Tuning: Java and the Internet, both published
by Sun Microsystems Press
Books.
The answers to questions posed here are those of the author, and do not represent the views of Sun Microsystems Inc.
Resources
virtual_adrian.se
URL: http://www.sunworld.com/common/cockcroft.letters.html