Unix 101 by Mo Budlong

Sending signals

There's more than one way to kill a process. Which should you use?

SunWorld
April  1998

Abstract
This month we explain some basic Unix signals, specifically SIGTERM, SIGQUIT, SIGINT, and SIGHUP. These signals can be used in conjunction with the trap and kill commands to end processes in a variety of different ways. We show you how they work. (1,700 words)


In Unix, signals are sent to running processes to indicate that an event, exterior to the process, has occurred and that the process must respond.

The simplest example of this is the hang up signal, or SIGHUP in Unix-speak. When a user is logged on from a remote terminal, the line can hang up for a number of reasons: phone outages, modem problems, or a power loss at the remote terminal. All of these conditions leave a program "out of control." In other words, a program or process that was being run from a terminal is no longer under the control of that terminal. The program itself needs to know what to do in that case. The Unix operating system keeps track of which processes are being run from which terminal, and when the terminal hangs up (drops the connection) the operating system sends a SIGHUP signal to all the programs that were launched from that terminal.

The process has three options when it receives the SIGHUP signal.

  1. The process does the default action, which is to terminate immediately.
  2. The process can be programmed to catch the signal (called trapping the signal) and ignore it. The process continues running.
  3. The process can be programmed to catch (trap) the signal and do something sensible such as close all open files and exit.

All three options have their place depending on the type of program that is being run.
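
In a shell script, the three choices correspond to three uses of the trap command. The following is a minimal sketch and not part of the waiter listings: each option runs in its own child shell (started with sh -c) that sends SIGHUP to itself with kill -HUP $$, and the variable names are made up for illustration.

```shell
# Option 1: no trap is set, so SIGHUP terminates the child shell
# before it can print anything.
default_result=$(sh -c 'kill -HUP $$; echo "survived"' 2>/dev/null) || true

# Option 2: an empty trap action makes the child ignore SIGHUP.
ignore_result=$(sh -c 'trap "" HUP; kill -HUP $$; echo "survived"')

# Option 3: catch SIGHUP, do something sensible, then exit on purpose.
handle_result=$(sh -c 'trap "echo cleaning up; exit 0" HUP; kill -HUP $$; echo "survived"')

echo "default: ${default_result:-terminated}"
echo "ignore:  $ignore_result"
echo "handle:  $handle_result"
```

Running each case in a child shell keeps the experiment away from the shell you are typing in, so a mistake cannot log you out.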

The kill command
The kill command can be used to send a signal to a running process. You can use ps -ef to locate a process ID, and then use kill to send a signal to that process. But before you start killing processes, it is necessary to understand a little about what signals do and how programs handle them.

In order to see some practical results, let's try a few simple examples. First switch into the Korn shell by typing ksh and pressing Enter. For these examples I will be using a short script and the kill command. The syntax for kill is:

kill -SignalNumber ProcessID

The hang up signal is 1 (one), so the command kill -1 1234 would send the hang up signal to process 1234 just as if the phone line had been hung up.
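
Most shells also accept a signal name in place of the number. Here is a small sketch, assuming a POSIX-style shell and using sleep 30 as a stand-in for a real program; the variable names are illustrative only.

```shell
# Start a throwaway background process to act as the target.
sleep 30 &
target=$!

# These three forms all send SIGHUP; only one of them is needed:
#   kill -1 "$target"
#   kill -HUP "$target"
#   kill -s HUP "$target"
kill -s HUP "$target"

# wait reports a status above 128 when the process died from a signal,
# and kill -l translates that status back into a signal name.
status=0
wait "$target" || status=$?
signal_name=$(kill -l "$status")
echo "process $target was ended by SIG$signal_name"
```

Typing kill -l with no argument lists every signal name the system knows, which is handy when the numbers differ between Unix versions.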

Listing 1 is a short script that sleeps for five seconds, wakes up, prints a message that it is awake and then goes back to sleep. Type this script in and save it as waiter, and then change its mode so that it can be executed by typing:

chmod a+x waiter

Listing 1: waiter

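# Loop forever: sleep five seconds, then report in.
while true
do
    sleep 5
    echo "Awake"
done

You can run Listing 1 in the background, which leaves the terminal free for other commands. To do this, type waiter followed by an ampersand (&) on the command line (or ./waiter & if the current directory is not in your search path). After you press Enter, something like the following will appear: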

waiter &
[1]  4567

Make a note of the second number, as it will vary. The number in brackets is the shell's job number; the second number is the process ID of the "waiter" process, and it is the one that kill needs. This job is now running as a background process, but because it is echoing the word "Awake" every five seconds, Awake will appear on your terminal at five-second intervals.
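
The shell also keeps the process ID of the most recent background command in the special parameter $!, which saves copying the number by hand. A short sketch, with sleep 30 standing in for waiter and waiter_pid as an illustrative variable name:

```shell
# Launch a stand-in for waiter in the background.
sleep 30 &

# $! expands to the process ID of the most recent background command,
# the same number the shell prints after the bracketed job number.
waiter_pid=$!
echo "background process ID: $waiter_pid"

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$waiter_pid" 2>/dev/null; then
    echo "process $waiter_pid is running"
fi

# Tidy up the stand-in process and collect its exit status.
kill "$waiter_pid"
wait "$waiter_pid" 2>/dev/null || true
```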

You can stop the job by typing the following command, but use the actual process ID number that appeared when you started the job:

kill -1 4567

Don't worry if the word "Awake" butts into the middle of your typing. It won't affect the command you are typing. Finish typing the command and press Enter. The kill command sends the hang up signal to waiter and waiter simply does the default action of dying. At this point Awake stops appearing at your terminal.

If Awake continues to appear on your terminal, you should make sure that you have noted the process ID correctly and try the kill command again. If that doesn't work, repeat the kill command but change the -1 to -9. There will be more about -9 in just a moment.

In Listing 2, waiter has been modified to trap the hang up signal. First a function is created that prints the fact that a hang up signal has been received. Then the trap command is used to set up a trap for the hang up signal. Instead of quitting, the program executes the function echo01 and then continues.


Listing 2: Waiter with a trap

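# Handler: report the signal, then return so the loop keeps running.
function echo01 {
    echo "Received signal 1 (SIGHUP)"
}

# Run echo01 whenever signal 1 (SIGHUP) arrives.
trap echo01 1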

while true
do
    sleep 5
    echo "Awake"
done

Now if you start waiter with an ampersand and try to kill it with -1, your screen will look something like this:

$ waiter &
[1]   951
$ Awake
Awake
kill -1 951
$ Received signal 1 (SIGHUP)
Awake
Awake

The trap in the program now catches signal 1, displays a message, and continues running. You can stop the program by sending a different signal such as kill -2 or kill -9.

This technique is used in large applications. There the program is frequently in the middle of important or complicated actions, like updating open files, that should not be cut off in "drop dead" fashion.

Listing 3 is a closer approximation of how a large application would handle a hang up signal.

Listing 3: Waiter with a bigger trap

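# Handler: report the signal, clean up, and leave deliberately.
function echo01 {
    echo "Received signal 1 (SIGHUP)"
    echo "Now I would close files if I had any open."
    exit
}

# Run echo01 whenever signal 1 (SIGHUP) arrives.
trap echo01 1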

while true
do
    sleep 5
    echo "Awake"
done

Start this latest version of waiter with an ampersand and try to kill it with -1. Your screen will look something like this, and the program will stop executing:

$ waiter &
[1]   951
$ Awake
Awake
kill -1 951
$ Received signal 1 (SIGHUP)
Now I would close files if I had any open.
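
In a real program the handler would do actual cleanup work. The sketch below is one way that might look, not a pattern taken from any particular application. It assumes a scratch file created with mktemp (widely available, though not on every Unix), uses POSIX function syntax (which ksh also accepts), and runs in a child shell that signals itself so the experiment is self-contained. The names scratch, cleanup, and leftover are invented for the example.

```shell
# Run the demonstration in a child shell so that signalling $$ is safe.
leftover=$(sh -c '
    scratch=$(mktemp)          # hypothetical working file

    cleanup() {
        rm -f "$scratch"       # the "close files" step
        echo "$scratch"        # report the path so the caller can check it
        exit 1
    }
    # One handler can serve several signals; names work as well as numbers.
    trap cleanup HUP INT TERM

    echo "partial results" > "$scratch"
    kill -HUP $$               # simulate the terminal hanging up
    echo "not reached"
') || true

if [ ! -e "$leftover" ]; then
    echo "scratch file was removed before exit"
fi
```

Listing the same handler for several signals means the program tidies up in the same way however it is asked to stop.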

SIGKILL: The signal that cannot be ignored
So what does kill -9 do? Signal 9 is SIGKILL, and it cannot be trapped. If you send a signal 9 to a process you are telling the operating system to cut it off at the knees -- drop dead now. The advantage of signal 9 is that the program cannot trap it and ignore it. The disadvantage of signal 9 is that the program cannot intercept it and perform an orderly shut down even if it needs to.

Listing 4 is waiter, modified with a trap for signal 9.

Listing 4: Waiter trying to trap signal 9

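# Handler that will never run, because signal 9 cannot be trapped.
function echo09 {
    echo "Received signal 9 (SIGKILL)"
    echo "Now I would close files if I had any open."
    exit
}

# This trap has no effect; some shells may complain about it.
trap echo09 9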

while true
do
    sleep 5
    echo "Awake"
done

Start version 4 of waiter with an ampersand and try to kill it with -9. Your screen will look something like this, and the program will stop executing:

$ waiter &
[1]   1151
$ Awake
Awake
kill -9 1151
$

There is no friendly message, no "now I am trying to close files" information on the screen. The process died where it stood on receipt of a signal 9 even though a trap was prepared for it.
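
The same contrast can be reproduced without any typing at the prompt. In this sketch, which assumes a POSIX-style shell and uses invented variable names, two child shells each set a trap and then signal themselves: one with SIGTERM, the other with SIGKILL.

```shell
# Each child shell sets a trap and then signals itself ($$ is the child).
term_output=$(sh -c '
    trap "echo shutting down cleanly; exit 0" TERM
    kill -TERM $$
') || true

# The trap on KILL is accepted or rejected depending on the shell,
# but either way the handler never runs.
kill_output=$(sh -c '
    trap "echo caught SIGKILL; exit 0" KILL 2>/dev/null
    kill -KILL $$
' 2>/dev/null) || true

echo "after SIGTERM: ${term_output:-no output}"
echo "after SIGKILL: ${kill_output:-no output}"
```

The first child gets to print its message and exit on its own terms; the second is gone before its handler can run.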

Using kill -9 on a process that controls a database application or a program that updates files can be disastrous.

Most well-behaved processes are written to allow an orderly shutdown when some signal other than 9 is received. Signal 15, SIGTERM, is the conventional request for an orderly shutdown, and signal 1, SIGHUP, is also widely handled. Many applications intercept and shut down correctly for most signals, so try 15, 1, and the others below before you try 9.

Signals you should know
Here are some of the other common signals and what causes them to be generated.

1 SIGHUP, hang up -- Caused by the phone line or terminal connection being dropped.

2 SIGINT, interrupt -- Generated from the user keyboard, usually by Ctrl-C, Backspace, or Delete. To find out which, type stty -a and press Enter. In the listing you will find intr = DEL, or intr = ^C, or intr = ^H, or something similar.

3 SIGQUIT, quit -- Also generated from the keyboard, usually by Ctrl-\ or Ctrl-Y. To find out which, type stty -a and press Enter. In the listing you will find quit = ^\, or quit = ^Y, or something similar. A SIGQUIT often causes a core file to be created. This is a copy of the process's memory at the moment the signal arrived.

15 SIGTERM, software terminate -- This is usually generated by another program. In fact the kill command uses 15 as the default. If you type kill followed only by a process ID, with no signal number, kill sends signal 15 to that process. Using kill without a signal number is usually a good place to start when killing a process.

It requires extra time and coding to write a trap for a signal into a program. When a trap has been written into a program, it is usually for good reason. If the program can simply die without doing any cleanup, then why go to the trouble of including a trap? That's why it is a good idea to try 15 and 1 and 2 before ever resorting to 9.
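
That order of escalation can be captured in a small helper. The function below, gentle_kill, is a hypothetical name and only a sketch: it assumes a POSIX-style shell, waits a fixed number of seconds after each signal, and uses kill -0 to test whether the process still exists. Its variables are global because POSIX shells have no local declarations.

```shell
# gentle_kill PID [GRACE]: try 15, 1, and 2 before resorting to 9.
# Sets used_signal to the name of the signal that ended the process.
gentle_kill() {
    pid=$1
    grace=${2:-2}                      # seconds to wait after each signal
    used_signal=""

    for sig in TERM HUP INT KILL; do
        kill -s "$sig" "$pid" 2>/dev/null || return 0   # already gone
        waited=0
        while [ "$waited" -lt "$grace" ]; do
            sleep 1
            waited=$((waited + 1))
            # kill -0 sends nothing; it fails once the process is gone.
            if ! kill -0 "$pid" 2>/dev/null; then
                used_signal=$sig
                return 0
            fi
        done
    done
    return 1                           # still present even after SIGKILL
}

# Usage: stop a stand-in background process.
sleep 30 &
target=$!
gentle_kill "$target" 2
echo "stopped with SIG$used_signal"
```

A longer grace period suits programs that need time to flush files before they exit.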


About the author
Mo Budlong, president of King Computer Services, Inc., specializes in UNIX and Client/Server consulting and training and currently publishes the COBOL Just In Time Course, a crash course for the Year 2000 problem as well as COBOL Dates and the Year 2000 which offers date solutions. Reach Mo at mo.budlong@sunworld.com.


[(c) Copyright  Web Publishing Inc., and IDG Communication company]

URL: http://www.sunworld.com/swol-04-1998/swol-04-unix101.html