Click on our Sponsors to help Support SunWorld

Processor partitioning

Swamped by several competing workloads on the same machine? Learn how you can divvy them up into processor sets

May 1998

Abstract

When several workloads compete for CPU time on a large system, you can divide the CPUs into sets and bind each workload to a different set to constrain it. This month Adrian looks at how this works and where it can be used effectively. (1,500 words)

Q: I want to run several very different workloads on the same machine. How can I make sure that one workload doesn't take over all the CPU power and crowd out another one?

A: In the past it was common to use several systems -- one to run each workload. Nowadays, systems are so powerful and scalable that it's simpler to use one machine and run everything on it at once. A new feature in Solaris 2.6 lets you partition a multiprocessor machine into processor sets and constrain each workload to use only the processors in a single set.

Processor sets
I mention processor sets in the second edition of my book (which, by the way, is finally available), but I don't go into any detail about how they work or how to use them. In this column I'll explain more about this new feature of Solaris 2.6 and how it can be used. Let's start with the manual page for the psrset command, which is all the documentation that is provided as standard.

Maintenance Commands 				 psrset(1M)

NAME
	psrset - creation and management of processor sets

SYNOPSIS
	psrset -c [ processor_id ... ]
	psrset -d processor_set_id
	psrset -a processor_set_id processor_id ...
	psrset -r processor_id ...
	psrset -p [ processor_id ... ]
	psrset -b processor_set_id pid ...
	psrset -u pid ...
	psrset -q [ pid ... ]
	psrset [ -i ] [ processor_set_id ... ]

DESCRIPTION

psrset controls the management of processor sets. Processor sets
allow the binding of processes to groups of processors, rather than
just a single processor. There are two types of processor sets,
those created by the user using the psrset command or the
pset_create(2) system call, and those automatically created by the
system. Processors assigned to user-created processor sets will run
only LWPs that have been bound to that processor set, but system
processor sets may run other LWPs as well.

A single systemwide processor set is created on a multiprocessor system by default. One reason for implementing processor sets is to provide support for NUMA architecture systems that have groups of processors with fast communications, connected by a slower interconnect. In that case a system processor set would be set up for each group. This feature is not needed on Sun's Enterprise server range, as all processors are on a single fast interconnect.

Initially, all CPUs belong to a single system processor set. The system administrator can create additional sets by taking CPUs away from the system set. The kernel uses only the system set for normal operations, although interrupts are handled by processors regardless of which set they belong to. NFS server services, for example, run only on the system processor set, so at least one CPU always remains in it.

If your workload mix includes some NFS service that needs to be constrained, limiting the size of the system set is one way to do that. In general, though, the system set should be as large as possible, perhaps shared with one of your regular workloads, so that you don't starve the kernel of CPU time.

Sun's published dual TPC-C and TPC-D result
Sun recently published a fully audited benchmark in which we ran an online transaction processing TPC-C workload on the same machine, at the same time, as a data warehouse TPC-D workload. This was managed using processor sets. A 16-CPU E6000 was divided into an eight-CPU system processor set and an additional eight-CPU user-created set. A single copy of IBM's DB2 Universal Server database code was used to create two database instances on separate parts of the disk subsystem. When the benchmark was run, the continuous small TPC-C transactions ran at a constant rate, providing good response times to the online users. The large and varied TPC-D transactions were constrained and did not affect the online user response times. The overall throughput was lower than it could have been if the idle time in each set could be used by the other workload, but consistent steady-state response times and throughput are a requirement for an audited TPC-C result, and that consistency could not be achieved without using processor sets in this way.

The TPC-C summary is at: http://www.tpc.org/results/individual_results/Sun/sun.ue6000.ibm.es.pdf
The TPC-D summary is at: http://www.tpc.org/results/individual_results/Sun/sun.ue6000.ibm.d.es.pdf

How does it work?
Solaris maintains a queue of jobs that are ready to run on a per-CPU basis. There is no single global run queue. Older versions of Solaris implement processor binding using the pbind(1M) command and underlying system calls. A process is bound to a CPU with pbind, but it isn't exclusive. Other work can also run on that CPU. With psrset, the binding is to a group of CPUs, but it is also an exclusive binding, and nothing else will be scheduled to run on that set. It's possible to use pbind within any set, to give a further level of control over resource usage.
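
The difference between the two kinds of binding can be modeled in a few lines. What follows is a toy model written in Python, not Solaris code: the names Proc and eligible_cpus are invented for illustration, and the only rules it encodes are the ones described above (a pbind-style binding restricts just the bound process, while a processor set also keeps everything else off its CPUs).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set


@dataclass
class Proc:
    """Toy process record; the field names are hypothetical."""
    name: str
    pset: Optional[int] = None   # user-created processor set ID, if any
    pbind: Optional[int] = None  # single CPU ID, if bound pbind-style


def eligible_cpus(proc: Proc, all_cpus: Set[int],
                  psets: Dict[int, FrozenSet[int]]) -> Set[int]:
    """Return the CPUs this process may run on in the toy model."""
    # CPUs claimed by any user-created set are closed to unbound work.
    in_user_sets = set().union(*psets.values())
    if proc.pset is not None:
        pool = set(psets[proc.pset])      # exclusive: only the set's CPUs
    else:
        pool = all_cpus - in_user_sets    # everything left in the system set
    if proc.pbind is not None:
        # A single-CPU binding narrows the pool further. An empty result
        # means the binding names a CPU outside the process's set.
        pool &= {proc.pbind}
    return pool
```

With CPUs 0, 1, 4, and 5 online and CPUs 4 and 5 placed in a set, an unbound process in this model is eligible only for CPUs 0 and 1, while a process bound to the set is eligible only for CPUs 4 and 5.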

The way psrset works is to create a kind of virtual machine for scheduling purposes. Once a process is bound to that set, all child processes are also bound to that set, so it is sufficient to bind a shell or startup script for an application. Bindings can only be made if you have root permissions.
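
As a rough sketch of what that looks like in practice, the Python helpers below build the two command lines involved, using only the -c and -b forms listed in the manual page synopsis above. The helper names, the CPU numbers, and the pid are hypothetical, and the set ID must be whatever the system assigns when the set is created. The commands are only assembled here, not executed; running them requires root on a Solaris 2.6 multiprocessor.

```python
from typing import List, Sequence


def psrset_create(cpu_ids: Sequence[int]) -> List[str]:
    """Build argv for: psrset -c [ processor_id ... ]"""
    return ["psrset", "-c", *[str(c) for c in cpu_ids]]


def psrset_bind(set_id: int, pids: Sequence[int]) -> List[str]:
    """Build argv for: psrset -b processor_set_id pid ..."""
    if not pids:
        raise ValueError("need at least one pid to bind")
    return ["psrset", "-b", str(set_id), *[str(p) for p in pids]]


if __name__ == "__main__":
    # Hypothetical values: CPUs 8 to 15 form the new set, 1234 is the pid
    # of the application's startup shell, and 1 stands in for the set ID
    # assigned by the system.
    print(" ".join(psrset_create(range(8, 16))))
    print(" ".join(psrset_bind(1, [1234])))
```

Because child processes inherit the binding, binding the one startup shell is enough to place the whole application in the set.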

The system normally keeps a linked list of the online processors. Each processor has its own run queue. When a kernel thread is to be placed on a run queue, the dispatcher goes through various machinations to decide where the thread should be placed. Normally this is the processor on which the thread last ran, but sometimes the thread changes processors (migrates) for various reasons (load balancing, etc.).

With processor sets, we can split up the list of processors into disjoint subsets. When you create a processor set, you create a new list with the processors that are in the set. The processors are taken out of the "normal" list of processors that run everything not in the set. Processes assigned to the set run on the processors in the set's list and can migrate between them. Other processes and normal (non-interrupt) kernel threads cannot run on those processors; they no longer have access to them. It's as if the processors have been taken offline. The exception is kernel threads that may be bound to a specific processor for one reason or another, but this is unusual.
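
The list splitting described above can be pictured with a small Python model. This is an illustrative sketch, not the kernel's data structure: the class name CpuPartition and its methods are invented, plain Python lists stand in for the kernel's linked lists, and the set IDs are simply counted up from 1.

```python
from typing import Dict, Iterable, List


class CpuPartition:
    """Toy model of the online-processor list split into disjoint sets."""

    def __init__(self, online: Iterable[int]) -> None:
        self.system: List[int] = sorted(online)   # the "normal" list
        self.user_sets: Dict[int, List[int]] = {}
        self._next_id = 1

    def create_set(self, cpus: Iterable[int]) -> int:
        """Move CPUs out of the normal list into a new set; return its ID."""
        wanted = sorted(set(cpus))
        missing = [c for c in wanted if c not in self.system]
        if missing:
            raise ValueError(f"CPUs not in the system set: {missing}")
        if len(wanted) >= len(self.system):
            # The system set must never be emptied.
            raise ValueError("at least one CPU must stay in the system set")
        self.system = [c for c in self.system if c not in wanted]
        set_id = self._next_id
        self._next_id += 1
        self.user_sets[set_id] = wanted
        return set_id

    def destroy_set(self, set_id: int) -> None:
        """Return a set's CPUs to the normal list."""
        self.system = sorted(self.system + self.user_sets.pop(set_id))
```

Starting from CPUs 0, 1, 4, and 5, creating a set from CPUs 4 and 5 leaves 0 and 1 in the normal list; an attempt to take those two as well is refused, because the system set would be left empty.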

Interrupts are taken on whichever CPU normally takes that interrupt, but any subsequent activity will take place in the system processor set. The mpstat command can be used to see the distribution of interrupts and load over all the CPUs.

% mpstat 5

CPU minf mjf xcal  intr  ithr   csw  icsw migr smtx  srw syscl usr sys wt idl
  0   58   8 1459   822   610  1306   171  242   96   30   609   6  67 27   0
  1   36   8 1750  1094   657  1100   151  238  104   28   717   6  76 18   0
  4   53   7 1518   951   759  1111   155  226   95   29   642   6  69 24   0
  5   25   7 1715  1067   765  1104   178  232  111   23   552   7  65 28   0
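
To compare the load on each processor set, output like this can be summarized per group of CPUs. The Python sketch below parses a single mpstat report (one header line followed by one line per CPU) and averages usr + sys over each group. The function names are invented, and the grouping of CPUs 0 and 1 versus CPUs 4 and 5 is a hypothetical example; nothing in the output above says which set each CPU belongs to.

```python
from typing import Dict, List, Set

SAMPLE = """\
CPU minf mjf xcal  intr  ithr   csw  icsw migr smtx  srw syscl usr sys wt idl
  0   58   8 1459   822   610  1306   171  242   96   30   609   6  67 27   0
  1   36   8 1750  1094   657  1100   151  238  104   28   717   6  76 18   0
  4   53   7 1518   951   759  1111   155  226   95   29   642   6  69 24   0
  5   25   7 1715  1067   765  1104   178  232  111   23   552   7  65 28   0
"""


def parse_mpstat(text: str) -> List[Dict[str, int]]:
    """Parse one mpstat report into a list of {column name: value} rows."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


def busy_by_group(rows: List[Dict[str, int]],
                  groups: Dict[str, Set[int]]) -> Dict[str, float]:
    """Average usr + sys percentage over the CPUs in each named group."""
    result = {}
    for name, cpus in groups.items():
        busy = [r["usr"] + r["sys"] for r in rows if r["CPU"] in cpus]
        if busy:  # skip groups with no matching CPUs
            result[name] = sum(busy) / len(busy)
    return result


if __name__ == "__main__":
    rows = parse_mpstat(SAMPLE)
    # Hypothetical grouping, for illustration only.
    print(busy_by_group(rows, {"system": {0, 1}, "user": {4, 5}}))
```

When mpstat runs with an interval it repeats the header before every report, so a longer capture would need to be split into reports before being passed to parse_mpstat.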

Wrap up
Thanks to Andy Tucker for implementing processor sets and for supplying some of the explanations above.

Resources

Sun Performance and Tuning -- Java and the Internet, by Adrian Cockcroft and Richard Pettit, Sun Press/PTR Prentice Hall, ISBN 0-13-095249-4 http://www.sun.com/books/catalog/Cockcroft/Cockcroft.html
The SE3.0 Toolkit page http://www.sun.com/sun-on-net/performance/se3
SE Toolkit FAQ, Adrian's January 1998 SunWorld Performance Q&A column http://www.sun.com/sunworldonline/swol-01-1998/swol-01-perf.html
See Adrian Cockcroft's frequently asked questions for answers to three dozen performance-related questions. Subjects covered include performance monitoring commands, tuning variables, logins and processes, how to interpret the output of performance measurements, and how to optimize Web servers and news servers. http://www.sun.com/sunworldonline/common/cockcroft.letters.html
virtual_adrian.se rule http://www.sun.com/951001/columns/adrian/column2.html
Interested in Web server performance? Go to SunWorld's Site Index http://www.sun.com/sunworldonline/common/swol-siteindex.html#webperf
If you want to build performance tools and utilities, get a copy of the SE Performance Toolkit Version 2.5.0.2 http://www.sun.com/960601/columns/adrian/se2.5.html
Adrian Cockcroft's profile (complete with low- and high-bandwidth bios) http://www.sun.com/950901/columns/adrian/adrian.html
A full listing of Adrian Cockcroft's other Performance Q&A columns in SunWorld http://www.sun.com/sunworldonline/common/swol-backissues-columns.html#perf

URL: http://www.sunworld.com/swol-05-1998/swol-05-perf.html