Sending signals
There's more than one way to kill a process. Which should you use?
This month we explain some basic Unix signals, specifically SIGTERM, SIGQUIT, SIGINT, and SIGHUP. These signals can be used in conjunction with the trap and kill commands to end processes in a variety of different ways. We show you how they work. (1,700 words)
The simplest example of a signal is the hang up signal, or SIGHUP in Unix-speak. When a user is logged on from a remote terminal, the line can hang up for a number of reasons: phone outages, modem problems, or a power loss at the remote terminal. All of these conditions cause a program to go "out of control." In other words, a program or process that was being run from a terminal is no longer under the control of that terminal. The program itself needs to know what to do in that case. The Unix operating system keeps track of which processes are being run by which terminal, and when the terminal hangs up (drops the connection), the operating system sends a SIGHUP signal to all the programs that were launched from that terminal.
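The process has three options when it receives the SIGHUP signal: it can ignore the signal, it can let the default action take place (which is to die), or it can trap the signal and run some code of its own.
All three options have their place depending on the type of program that is being run.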
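Here is a minimal sketch of those three choices, assuming they are ignoring the signal, accepting the default action, and trapping it. Each case runs in its own child shell that sends SIGHUP to itself. The variable names are mine, chosen only for illustration.

```shell
#!/bin/sh
# Each child shell sends SIGHUP to itself ($$ is the child's own PID).

# Option 1: ignore the signal. An empty trap action means "ignore".
ignore_out=$(sh -c 'trap "" HUP; kill -HUP $$; echo survived')

# Option 2: take the default action, which terminates the process
# before it can reach the echo.
default_out=$(sh -c 'kill -HUP $$; echo survived')
default_status=$?

# Option 3: trap the signal and run our own code before exiting.
catch_out=$(sh -c 'trap "echo caught; exit 0" HUP; kill -HUP $$; echo survived')

echo "ignore:  ${ignore_out}"
echo "default: ${default_out:-<no output>} (exit status ${default_status})"
echo "trap:    ${catch_out}"
```

An exit status above 128 is the shell's conventional way of reporting that a process was ended by a signal.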
The kill command
The kill command can be used to send a signal to a running process. You can use ps -ef to locate the process ID (referred to in this article as the job number), and then use kill to send a signal to that job. But before you start killing processes, it is necessary to understand a little about what signals do and how programs handle them.
In order to see some practical results, let's try a few simple examples. First switch into the Korn shell by typing ksh and pressing Enter. For these examples I will be using a short script and the kill command. The kill command lets you send a specific signal to a process. The syntax for kill is:
    kill -(SignalNumber) JobNumber
The hang up signal is 1 (one), so the command kill -1 1234 would send the hang up signal to job 1234 just as if the phone line had been hung up.
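The signal can also be given by name instead of by number. The sketch below sends the hang up signal three equivalent ways to three throwaway sleep processes; the sleep commands and variable names are stand-ins of mine, not part of the article's examples.

```shell
#!/bin/sh
# Start three throwaway background processes to practice on.
sleep 60 & pid1=$!
sleep 60 & pid2=$!
sleep 60 & pid3=$!

kill -1 "$pid1"        # signal given by number
kill -HUP "$pid2"      # the same signal given by name
kill -s HUP "$pid3"    # the same signal using the -s option

# wait reports how each process ended. By shell convention, a value
# above 128 means the process was terminated by a signal.
wait "$pid1"; status1=$?
wait "$pid2"; status2=$?
wait "$pid3"; status3=$?
echo "exit statuses: $status1 $status2 $status3"
```

All three processes should report the same status, because all three received the same signal.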
Listing 1 is a short script that sleeps for five seconds, wakes up, prints a message that it is awake, and then goes back to sleep. Type this script in and save it as waiter, and then change its mode so that it can be executed by typing:
    chmod a+x waiter
Listing 1: waiter
    while true
    do
        sleep 5
        echo "Awake"
    done
You can run Listing 1 in the background, which frees the terminal for other commands. To do this type waiter (or ./waiter if the current directory is not in your PATH) followed by an ampersand (&) on the command line. After you press Enter, something like the following will appear:
    waiter &
    [1] 4567
Make a note of the second number as it will vary. It is the process ID or job number for the "waiter" process. This job is now running as a background process, but because it is echoing the word "Awake" every five seconds, Awake will appear on your terminal at five-second intervals.
You can stop the job by typing the following command, but use the actual process ID number that appeared when you started the job:
kill -1 4567
Don't worry if the word "Awake" butts into the middle of your typing. It won't affect the command you are typing. Finish typing the command and press Enter. The kill command sends the hang up signal to waiter and waiter simply does the default action of dying. At this point Awake stops appearing at your terminal.
If Awake continues to appear on your terminal, you should make sure that you have noted the job number correctly and try the kill command again. If that doesn't work, repeat the kill command but change the -1 to -9. There will be more about -9 in just a moment.
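If you are working from a script rather than typing at the prompt, you don't have to copy the number by hand. The following sketch uses the shell's $! variable to capture the process ID, and kill -0 to check whether the process is really gone. The background loop is only a stand-in for the waiter script.

```shell
#!/bin/sh
# A stand-in for waiter: a background loop that never ends on its own.
while true; do sleep 1; done &
waiter_pid=$!            # $! holds the PID of the most recent background job

kill -1 "$waiter_pid"    # send the hang up signal
wait "$waiter_pid" 2>/dev/null   # collect the finished process

# kill -0 sends no signal at all; it only tests whether the PID exists.
if kill -0 "$waiter_pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "waiter still running: $still_running"
```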
In Listing 2, waiter has been modified to trap the hang up signal. First a function is created that prints the fact that a hang up signal has been received. Then the trap command is used to set up a trap for the hang up signal. Instead of quitting, the program executes the function echo01 and then continues.
Listing 2: Waiter with a trap
    function echo01 {
        echo "Received signal 1 (SIGHUP)"
    }
    trap echo01 1
    while true
    do
        sleep 5
        echo "Awake"
    done
Now if you start waiter with an ampersand and try to kill it with -1, your screen will look something like this.
    $ waiter &
    [1] 951
    $ Awake
    Awake
    kill -1 951
    $ Received signal 1 (SIGHUP)
    Awake
    Awake
The trap in the program now catches the signal 1 and simply displays a message and continues running. You can stop the program by sending a different signal such as kill -2 or kill -9.
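The same trap-and-continue behavior can be seen in a version that finishes by itself. This is only a sketch: the child shell sends the hang up signal to itself three times instead of waiting for someone to type kill, and the function name on_hup is my own.

```shell
#!/bin/sh
# Run the demonstration in a child shell so that $$ is the child's PID.
output=$(sh -c '
    on_hup() { echo "Received signal 1 (SIGHUP)"; }
    trap on_hup HUP
    for pass in 1 2 3; do
        kill -HUP $$     # stand-in for someone typing kill -1
        echo "Awake"
    done
')
echo "$output"
```

The loop completes all three passes, because the trap replaces the default action of dying.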
This technique is used in large applications, where the program is frequently in the middle of important or complicated actions, such as closing open files, that should not be cut off in "drop dead" fashion.
Listing 3 is a closer approximation of how a large application would handle a hang up signal.
Listing 3: Waiter with a bigger trap
    function echo01 {
        echo "Received signal 1 (SIGHUP)"
        echo "Now I would close files if I had any open."
        exit
    }
    trap echo01 1
    while true
    do
        sleep 5
        echo "Awake"
    done
Start this latest version of waiter with an ampersand and try to kill it with -1. Your screen will look something like this, and the program will stop executing:
    $ waiter &
    [1] 951
    $ Awake
    Awake
    kill -1 951
    $ Received signal 1 (SIGHUP)
    Now I would close files if I had any open.
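To make the cleanup step concrete, here is a sketch in which the trap actually removes a scratch file before exiting. It assumes the mktemp command is available (it is on most current systems, but it is not part of the article's original examples), and the names scratch and cleanup are hypothetical.

```shell
#!/bin/sh
# Run the demonstration in a child shell so that $$ is the child's PID.
output=$(sh -c '
    scratch=$(mktemp)        # stand-in for a file the program has open
    echo "$scratch"
    cleanup() {
        echo "Received signal 1 (SIGHUP)"
        rm -f "$scratch"     # the orderly shutdown step
        exit 0
    }
    trap cleanup HUP
    kill -HUP $$             # simulate the hang up
    echo "not reached"
')

# The first line of output is the path of the scratch file.
scratch_path=$(printf '%s\n' "$output" | head -n 1)
if [ -e "$scratch_path" ]; then
    echo "scratch file left behind"
else
    echo "scratch file removed"
fi
```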
SIGKILL: The signal that cannot be ignored
So what does kill -9 do? Signal 9 is SIGKILL, and it cannot be trapped. If you send a signal 9 to a process you are telling the operating system to cut it off at the knees -- drop dead now. The advantage of signal 9 is that the program cannot trap it and ignore it. The disadvantage of signal 9 is that the program cannot intercept it and perform an orderly shut down even if it needs to.
Listing 4 is waiter, modified with a trap for signal 9.
Listing 4: Waiter trying to trap signal 9
    function echo09 {
        echo "Received signal 9 (SIGKILL)"
        echo "Now I would close files if I had any open."
        exit
    }
    trap echo09 9
    while true
    do
        sleep 5
        echo "Awake"
    done
Start version 4 of waiter with an ampersand and try to kill it with -9. Your screen will look something like this, and the program will stop executing:
    $ waiter &
    [1] 1151
    $ Awake
    Awake
    kill -9 1151
    $
There is no friendly message, no "now I am trying to close files" information on the screen. The process died where it stood on receipt of a signal 9 even though a trap was prepared for it.
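The same experiment can be run without a second terminal. In this sketch a child shell sets a trap for signal 9 and then sends that signal to itself. Shells differ in whether they complain about the trap command, so any error message is discarded; that detail is an assumption on my part rather than something from the listings.

```shell
#!/bin/sh
# The child asks to trap SIGKILL, then sends SIGKILL to itself.
output=$(sh -c '
    trap "echo handler ran" KILL 2>/dev/null
    kill -KILL $$
    echo "not reached"
')
status=$?

echo "output from child: ${output:-<none>}"
echo "exit status: $status"
```

The child produces no output at all, and the exit status above 128 shows that it was ended by the signal rather than by its own code.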
Using kill -9 on a process that controls a database application or a program that updates files can be disastrous.
Most well behaved processes are written to allow an orderly shut down when some signal other than 9 is received. Signal 1, SIGHUP, is possibly the most common signal used for an orderly shut down. Many applications intercept and shut down correctly for most signals, so try 1 and others below before you try 9.
Signals you should know
Here are some of the other common signals and what causes them to be generated.
1 SIGHUP, hang up -- Caused by the phone line or terminal connection being dropped.
2 SIGINT, interrupt -- Generated from the user keyboard, usually by a Control-C, Backspace or Delete. To find out which, type stty -a and press Enter. In the listing you will find intr = DEL, or intr = ^C, or intr = ^H, or something similar.
3 SIGQUIT, quit -- Also generated from the keyboard, usually by Ctrl-\ or Ctrl-Y. To find out which, type stty -a and press Enter. In the listing you will find quit = ^\, or quit = ^Y, or something similar. A SIGQUIT often causes a core file to be created. This is a copy of the process's memory at the moment it was terminated.
15 SIGTERM, software terminate -- This is usually generated by another program. In fact the kill command uses 15 as the default. If you specify kill job, with no signal number, kill sends a signal 15 to the job. Using kill without a signal number is usually a good place to start on killing a process.
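If you forget which number goes with which name, the kill command itself can tell you. This short sketch asks kill -l for the name of each signal number discussed in this article; the variable name table is mine.

```shell
#!/bin/sh
# kill -l followed by a number prints that signal's name
# without the SIG prefix, so the prefix is added back here.
table=$(
    for num in 1 2 3 9 15; do
        echo "$num SIG$(kill -l "$num")"
    done
)
echo "$table"
```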
It requires extra time and coding to write a trap for a signal into a program. When a trap has been written into a program, it is usually for good reason. If the program can simply die without doing any cleanup, then why go to the trouble of including a trap? That's why it is a good idea to try 15 and 1 and 2 before ever resorting to 9.
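That advice can be wrapped up in a small function. What follows is a sketch only: polite_kill is a hypothetical helper of my own, not a standard command, and the one-second grace period is an arbitrary choice. It tries 15, 1, and 2 in turn and falls back on 9 only if the process is still there.

```shell
#!/bin/sh
# polite_kill PID
# Hypothetical helper: try the gentler signals first and use
# SIGKILL only as a last resort.
polite_kill() {
    target=$1
    stopped_by=
    for sig in TERM HUP INT KILL; do
        # A failure here usually means the process is already gone.
        kill -s "$sig" "$target" 2>/dev/null || return 0
        sleep 1     # grace period so the process can shut down
        # kill -0 sends nothing; it only tests whether the PID exists.
        if ! kill -0 "$target" 2>/dev/null; then
            stopped_by=$sig
            return 0
        fi
    done
    return 1
}

# Try it on a throwaway process.
sleep 60 &
polite_kill $!
echo "process stopped by SIG${stopped_by}"
```

A real program with a lot of cleanup to do might need a longer grace period before the next signal is tried.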
About the author
Mo Budlong, president of King Computer Services, Inc., specializes in UNIX and Client/Server consulting and training and currently publishes the COBOL Just In Time Course, a crash course for the Year 2000 problem, as well as COBOL Dates and the Year 2000, which offers date solutions. Reach Mo at mo.budlong@sunworld.com.
URL: http://www.sunworld.com/swol-04-1998/swol-04-unix101.html