September 1999

Viewing your network in realtime

An examination of network monitoring protocols and tools

Summary
In July, Blair described the benefits of realtime monitoring systems for Solaris systems, and described several freely available tools that allow system administrators to manage and monitor short-term problems and long-term trends for capacity planning. This month, he takes the same concepts and applies them to your network by examining network monitoring protocols and tools. (3,000 words)

By Dr. Blair Zajac

Turning away customers because your Web site is too slow is tantamount to corporate suicide. What every site needs is capacity planning, which requires some form of network measuring, monitoring, event trapping, and traffic plotting.

This month, expanding on my previous article, I'll identify and describe some monitoring tools. Each is free and available to the public; each is similar to the Orca tool that monitors Sun Solaris systems; and each is designed to monitor Simple Network Management Protocol (SNMP) agents instead of servers.

The precepts surrounding system monitoring are also applicable here. For medium to large sites, vast amounts of data will be collected that must be collated for easy viewing. Requirements for such a system include the ability to:

Monitor many boxes
Measure and display short- and long-term data
Allow for the easy comparison of measurements between different systems
Allow for easy viewing of all system measurements on different time scales
Keep plots available and up to date

SNMP: A protocol for network monitoring
SNMP is a client/server protocol that manages, controls, and receives error messages and alert conditions from network hardware. The server/agent (i.e., managed network entity) is located on the network hardware being managed, and the client is, in fact, specialized software running on a network management station (NMS). To keep the agent on the network hardware small, simple, and easy to implement, agents gather data and let the NMS handle the collation and presentation of this data to the network administrator.

SNMP uses UDP port 161 to communicate. If a packet is lost, the NMS will resend its request. No sequencing is needed, because all requests and responses fit within a single datagram.
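The resend-on-loss behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from any NMS: the function name request_with_retry, the retry count, and the timeout are assumptions, and the payload is treated as opaque bytes rather than a real SNMP message.

```python
import socket

def request_with_retry(payload, address, retries=3, timeout=1.0):
    """Send one datagram and wait for one reply, resending on timeout.

    Hypothetical helper: because UDP gives no delivery guarantee, the
    sender simply repeats the whole request if nothing comes back.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for _attempt in range(retries):
            sock.sendto(payload, address)
            try:
                reply, _sender = sock.recvfrom(65535)
                return reply
            except socket.timeout:
                continue  # request or reply was lost; send it again
    raise TimeoutError(f"no reply from {address} after {retries} attempts")
```

A real manager would pass an encoded SNMP request as the payload and ("agent-host", 161) as the address; because each request and response fits in one datagram, no reassembly or ordering logic is needed.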

SNMP separates the data available on a particular agent from the method for receiving and setting that data. For example, SNMP does not know that an NFS server with an SNMP agent can report the disk usage on a particular volume. This information is supplied separately in a management information base (MIB) which is used by the NMS. Several standard MIBs exist, such as an MIB for TCP/IP statistics known as MIB-II. This MIB contains such statistics as the uptime of the SNMP agent, the number of TCP/IP packets received and sent, and the number of currently established TCP connections.

An MIB is a tree structure of globally unique object identifiers (OIDs). A separate list of rules -- the structure of management information (SMI), described in RFC 1155 -- defines and identifies OIDs. The SMI states that OIDs must be specified using ISO's Abstract Syntax Notation One (ASN.1), a formal language that allows for both a human-readable description of the data and a compact encoding for machines to read. ASN.1, together with its encoding rules, specifies exactly how to encode names and data into messages for network transport, and removes any ambiguity about the data representation. For example, instead of simply specifying an integer value, ASN.1 requires an exact form and range for the integer.
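To make the "exact form" of an integer concrete, here is a minimal Python sketch of how an integer is laid out under the Basic Encoding Rules (BER) that SNMP uses with ASN.1: a type tag (0x02 for INTEGER), a length byte, and the value in big-endian two's complement using the fewest bytes possible. The function name is an assumption for illustration, and the sketch handles only short-form lengths.

```python
def ber_encode_integer(value: int) -> bytes:
    """Encode an int as a BER INTEGER: tag 0x02, length, two's-complement bytes."""
    # Smallest number of bytes that holds the value including its sign bit.
    if value >= 0:
        size = value.bit_length() // 8 + 1
    else:
        size = (value + 1).bit_length() // 8 + 1
    if size > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    content = value.to_bytes(size, "big", signed=True)
    return bytes([0x02, size]) + content

for number in (0, 127, 128, -1):
    print(number, ber_encode_integer(number).hex())
```

Note that 128 needs two content bytes (00 80): a single byte 0x80 would be read back as -128, which is the kind of ambiguity the encoding rules exist to remove.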

An OID is a sequence of integers that traverses a global tree. OIDs in an MIB are managed by ISO and ITU, and define globally unique variables in a manner similar to the way in which DNS defines globally unique hosts. The tree consists of a root connected to a number of labeled nodes via edges. Each node may, in turn, have children of its own, which are also labeled. In this context, a label is the pairing of a brief textual description and an integer. Authority for portions of the namespace is assigned to other organizations, in much the same way that DNS delegates the authority for individual domains to either individuals or organizations.

OIDs are more general than the description of variables in network boxes; they can also be associated with standards documents. The root of the tree is unnamed and has three direct children named for their managing organizations: ccitt(0), iso(1), and a third, joint-iso-ccitt(2), managed by both groups. (The CCITT has since become part of the ITU.) The number following the name is the numeric identifier for a particular node. All OIDs of interest on the Internet are rooted under iso(1), under which is a subtree for national or international standards organizations, which is named org(3). The U.S. National Institute for Standards and Technology allocated a node under org for the Department of Defense that it named dod(6). The Internet Activities Board then petitioned the DOD for a node for the Internet community. The node, named internet(1), contains a node named mgmt(2). Under this node are the OIDs for network and system management.

At this point some examples of the OID naming scheme would be helpful. If you want to know the number of currently established TCP connections, the name would be:

iso.org.dod.internet.mgmt.mib.tcp.tcpCurrEstab

Numerically this would be 1.3.6.1.2.1.6.9 -- 1 from iso, 3 from org, 6 from dod, 1 from internet, 2 from mgmt, 1 from mib, 6 from tcp, and 9 from tcpCurrEstab. Since all of the standard MIB OIDs fall under the mib node beneath mgmt, they all begin with the prefix 1.3.6.1.2.1.
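The translation from the named form to the numeric form is a walk down the tree, collecting one number per label. The Python sketch below shows this with a hand-built tree that contains only the nodes named in this article; the MIB_TREE structure and the to_numeric function are illustrative assumptions, not part of any SNMP library.

```python
# Each node maps a label to (number, children). Only branches named in the
# article are included; a real MIB tree is far larger.
MIB_TREE = {
    "iso": (1, {
        "org": (3, {
            "dod": (6, {
                "internet": (1, {
                    "mgmt": (2, {
                        "mib": (1, {
                            "interfaces": (2, {"ifNumber": (1, {})}),
                            "tcp": (6, {"tcpCurrEstab": (9, {})}),
                        }),
                    }),
                }),
            }),
        }),
    }),
}

def to_numeric(name: str) -> str:
    """Translate a dotted label path into its dotted numeric OID."""
    children = MIB_TREE
    numbers = []
    for label in name.split("."):
        if label not in children:
            raise KeyError(f"unknown label {label!r} in {name!r}")
        number, children = children[label]
        numbers.append(str(number))
    return ".".join(numbers)

print(to_numeric("iso.org.dod.internet.mgmt.mib.tcp.tcpCurrEstab"))
# prints 1.3.6.1.2.1.6.9
```

The same label can carry different numbers in different places (internet is 1 under dod, while iso is 1 at the top), which is why the lookup has to follow the tree rather than use one flat table.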

Two MIBs, MIB-I and MIB-II, are standard and supported by virtually every agent. MIB-II is a superset of MIB-I and is the standard for monitoring TCP/IP. Vendors can provide their own MIBs for specific hardware. Under the internet(1) node is a private(4) node that contains an enterprises(1) node. There you'll find the OIDs for vendor-specific hardware, such as routers, switches, and hubs.

A useful tool for examining the MIB and getting specific values from a host is tkmib, which comes with the UCD-SNMP distribution described below. Notice that tkmib shows the MIB tree in the top window and that I've selected the iso.org.dod.internet.mgmt.mib-2.interfaces.ifNumber OID, which shows the numeric form as 1.3.6.1.2.1.2.1. It also displays some information about this OID farther down in the window. At the bottom it shows a walk of the iso.org.dod.internet.mgmt.mib-2.interfaces OID I did earlier. This tool is a great time-saver.

Figure 1. The tkmib tool

Instead of defining a large set of commands, SNMP implements a fetch-store paradigm for operations. In the original version of SNMP there are only five types of messages:

**Table 1. SNMP commands**
Command	Meaning
Get	Get a value from a specific OID
GetNext	Get the value of the next OID in the tree, without knowing its exact name
Response	Reply to a Get, GetNext, or Set operation
Set	Set a specific variable to a specific value
Trap	Unsolicited message sent by the agent to report a triggered event
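GetNext is what makes it possible to "walk" a subtree, as tools like snmpwalk do: the manager keeps asking for the OID that follows the last one it received until the answer falls outside the subtree. The Python sketch below imitates this against an in-memory stand-in for an agent. The ToyAgent class, the walk function, and the sample values for ifNumber and tcpCurrEstab are all hypothetical; no network traffic is involved.

```python
import bisect

class ToyAgent:
    """In-memory stand-in for an SNMP agent; OIDs are tuples of integers."""

    def __init__(self, variables):
        self._oids = sorted(variables)   # tuples sort in OID tree order
        self._values = dict(variables)

    def get(self, oid):
        return self._values[oid]

    def get_next(self, oid):
        """Return (oid, value) for the first OID after the given one, or None."""
        index = bisect.bisect_right(self._oids, oid)
        if index == len(self._oids):
            return None
        following = self._oids[index]
        return following, self._values[following]

def walk(agent, root):
    """Collect every (oid, value) under root using only get_next calls."""
    results = []
    current = root
    while True:
        answer = agent.get_next(current)
        if answer is None or answer[0][:len(root)] != root:
            break                        # ran off the end of the subtree
        results.append(answer)
        current = answer[0]
    return results

agent = ToyAgent({
    (1, 3, 6, 1, 2, 1, 1, 3, 0): 1216034184,  # sysUpTime.0
    (1, 3, 6, 1, 2, 1, 2, 1, 0): 2,           # ifNumber.0 (made-up value)
    (1, 3, 6, 1, 2, 1, 6, 9, 0): 17,          # tcpCurrEstab.0 (made-up value)
})
print(walk(agent, (1, 3, 6, 1, 2, 1, 2)))
```

The manager never needs to know which instances exist under a node; it only needs a starting point and a stopping rule.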

The NMS typically polls each agent at regular intervals. However, if a problem occurs between polls, the NMS may not pick up on it immediately. For this reason, the agent can be programmed to generate a trap upon a predefined event. The trap event is sent to the NMS on UDP port 162.

The last issue to discuss in communicating with an SNMP agent is security. Access to an SNMP agent is divided into groups called communities. Each community name is, in effect, a password, and if you know the community name, you can access the SNMP agent. The community string is transmitted as plain text in the SNMP packet, and most agents have two community names, one public and one private. The private name allows more access to the agent.

SNMP agents and clients
Let's look at what's available.

Sun's SNMP Server
Sun includes an SNMP agent in Solaris 2.6 and all subsequent versions. This product installs as the Solstice Enterprise Agents (SUNWCsea) cluster and contains the SUNWmibii, SUNWsacom, SUNWsadmi, and SUNWsasnm packages. In addition, SyMON contains a more comprehensive SNMP agent and client system for monitoring hosts.

UCD-SNMP
The UCD-SNMP package is a popular, freely available SNMP client/server combination for many hosts. This software builds on many different Unix flavors and provides an SNMP agent and clients for acquiring and setting variables. In addition, UCD-SNMP provides a tkmib program to view the tree structure of an MIB and retrieve OID values. Additional MIBs from vendors can be loaded into UCD-SNMP. For example, I loaded Network Appliance's Filer MIB to query the box on the disk usage for all of its volumes.

I'll quickly describe the steps to download and install UCD-SNMP with its associated tkmib program.

The UCD-SNMP's home page is at http://ucd-snmp.ucdavis.edu/, and the distribution can be downloaded from its anonymous FTP site, ftp://ucd-snmp.ucdavis.edu. Download the latest version, decompress and untar the file into a working directory, then cd into it.

Next run ./configure --help, view the different configuration options, and choose any that apply to your needs. If you're going to use a Perl SNMP module later on, you'll want to use the --enable-shared option to build a shared libsnmp.so library. If you want to install this someplace other than /usr/local, you'll need to use the --prefix=/path/to/install/dir option.

Now run ./configure with all the options you want. This will check the capabilities of your system and compiler and set up the code to compile and run properly. Finally, run make install to install it in its final location.

If you want the uptime of the SNMP agent, run the following command using the UCD-SNMP snmpwalk program:

% snmpwalk 10.1.2.3 community system.sysUpTime
system.sysUpTime.0 = Timeticks: (1216034184) 140 days, 17:52:21.84

The first argument to snmpwalk is the IP address or name of the SNMP agent. The next (optional) argument is the community name that grants access to the SNMP agent.
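The number in parentheses in the snmpwalk output is the raw value the agent returns: sysUpTime is counted in hundredths of a second (timeticks). The short Python sketch below converts that raw count into the days and hours form shown by the tool; the function name is an assumption for illustration.

```python
def format_timeticks(ticks: int) -> str:
    """Render a timeticks count (hundredths of a second) as days, h:mm:ss.cc."""
    seconds, hundredths = divmod(ticks, 100)
    days, seconds = divmod(seconds, 86400)
    hours, seconds = divmod(seconds, 3600)
    minutes, seconds = divmod(seconds, 60)
    return f"{days} days, {hours}:{minutes:02d}:{seconds:02d}.{hundredths:02d}"

print(format_timeticks(1216034184))
# prints 140 days, 17:52:21.84
```

When you record uptime for plotting, store the raw integer and do this conversion only for display.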

If you want to build and use tkmib, build and install the Perl SNMP and Tk modules. This is described below.

Perl SNMP Modules
There are two different SNMP modules that allow you to get/set SNMP variables from Perl.

SNMP.pm
The first, written by G.S. Marzot, is simply named SNMP.pm, and links against UCD-SNMP's libsnmp.so library. The current version, 1.8.1, is available from the CPAN archive (ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/GSM). Get the latest version and run the following commands. The installation will ask for the location of the UCD-SNMP. Use the include and lib directory from the prefix given to the ./configure step above. If you did not use a --prefix= command line option to ./configure, the location will be /usr/local/include/ucd-snmp and /usr/local/lib.
% gzcat SNMP-1.8.1.tar.gz | tar xf -
% cd SNMP-1.8.1
% perl Makefile.PL
Where are the libsnmp.a include files? [/usr/local/include/ucd-snmp] /usr/local/include/ucd-snmp
Where is libsnmp.a installed? [/usr/local/lib] /usr/local/lib
Checking if your kit is complete...
Looks good
Processing hints file hints/solaris.pl
Writing Makefile for SNMP
Enter host and community for SNMP tests: [localhost private]
The last line asks for the hostname and community name of a host to test SNMP against. If you don't have a box with an SNMP agent, don't worry; it's not crucial. Finish with the usual make, make test, and make install.
To get tkmib running, you'll need to download and install the Perl Tk module. The latest version, 800.015, is available at CPAN. Follow the same steps as above for the SNMP module:
% gzcat Tk800.015.tar.gz | tar xf -
% cd Tk800.015
% perl Makefile.PL
perl is installed in /home/bzajac/opt-i386-solaris/perl5/lib/5.00503/i86pc-solaris okay
PPM for perl5.00503
Test Compiling config/signedchar.c
Test Compiling config/Ksprintf.c
Test Compiling config/tod.c
Generic gettimeofday()
/usr/X/bin/xmkmf suggests /usr/openwin
Using -L/usr/openwin/lib to find /usr/openwin/lib/libX11.so.4
Using -I/usr/openwin/include to find /usr/openwin/include/X11/Xlib.h
Writing Tk/Config.pm
Writing pTk/tkConfig.h
. . .
% make
% make test
% make install
Make sure the Makefile.PL found the X include and library files you want. The installed tkmib should now run. You may need to fix the first line of tkmib to point to the correct version of Perl.

SNMP_Session
The second Perl SNMP module, written by Simon Leinen, differs from the previous one in that it's written completely in Perl and does not rely upon, or link with, any other libraries. This module is used by both MRTG and Cricket, the two network monitoring tools described below. Its main disadvantage is that it only understands numeric OIDs.

Monitoring solutions
Sun's SyMON does a great job of monitoring hosts for events using SNMP, but it doesn't record and plot data. For monitoring short- and long-term capacity issues, I'll examine the Multi Router Traffic Grapher (MRTG) and Cricket tools.

Both MRTG and Cricket generate HTML pages containing GIFs or PNGs (a new image format that does not have the patent issues GIF does) of recorded data. Plots are generated showing multiple timespans, from daily to yearly. The binary data files do not grow over time. Both are freely available on the Web, written in Perl, use the SNMP_Session Perl module described above, and use C code to store and graph data. Typically, a crontab entry is set up to run the data collection tool every five minutes.
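The reason the data files never grow is that they hold a fixed number of slots and overwrite the oldest entry once full, with longer timespans kept at coarser resolution by averaging several samples into each slot. The Python sketch below illustrates that round-robin idea only; it is not the on-disk format of MRTG or RRDtool, and the class and parameter names are assumptions.

```python
class RoundRobinArchive:
    """Fixed-size store: once full, each new entry overwrites the oldest."""

    def __init__(self, slots, samples_per_slot=1):
        self.slots = [None] * slots            # size never changes
        self.samples_per_slot = samples_per_slot
        self._next = 0                         # index of the slot to write next
        self._pending = []                     # samples awaiting consolidation

    def add(self, value):
        self._pending.append(value)
        if len(self._pending) < self.samples_per_slot:
            return
        # Consolidate the pending samples into one averaged entry.
        average = sum(self._pending) / len(self._pending)
        self._pending.clear()
        self.slots[self._next] = average
        self._next = (self._next + 1) % len(self.slots)

    def values(self):
        """Stored values, oldest first, skipping slots not yet written."""
        ordered = self.slots[self._next:] + self.slots[:self._next]
        return [v for v in ordered if v is not None]

# One day of five-minute samples, plus a week kept as 30-minute averages.
daily = RoundRobinArchive(slots=288)
weekly = RoundRobinArchive(slots=336, samples_per_slot=6)
for sample in range(1000):
    daily.add(sample)
    weekly.add(sample)
print(len(daily.values()), len(weekly.values()))
```

However long the collector runs, the daily archive in this sketch never holds more than 288 entries, which mirrors the constant file size described above.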

Cricket and MRTG are, however, installed and set up in completely different manners. MRTG is simpler to install and set up, while Cricket is faster and more flexible. MRTG forks a separate process for each image or data update, while Cricket dynamically loads the RRDtool library. Cricket does not generate the images until a user points his or her browser at a CGI script that generates the images on the fly.

Both tools are widely used in the network community for measuring everything from the backplane bandwidth usage on Cisco routers, to the amount of traffic passing through a particular port on a switch, to the CPU usage on routers.

Installing either of these packages requires some work. Because of patent issues surrounding GIF creation code, libraries that were used to create GIF images have been converted to generate PNG images. While PNG images are smaller and take less time to compress, installing the code requires the libpng and libz libraries. You can download these tools from the following places:

**Table 2. Network monitoring tools**
Tool	Location	Description
`zlib`	http://www.cdrom.com/pub/infozip/zlib/	Compression library used to make PNGs
`libpng`	http://www.cdrom.com/pub/png/	PNG creation library
`libgd`	http://www.boutell.com/gd/	Graphics library for creating images
SNMP_Session	http://www.switch.ch/misc/leinen/snmp/perl/	Perl SNMP library
MRTG	http://ee-staff.ethz.ch/~oetiker/webtools/mrtg/mrtg.html	Traffic measuring and bandwidth plotting tool
Cricket	http://www.munitions.com/~jra/cricket/	Traffic measuring and bandwidth plotting tool

MRTG

MRTG, written by Tobias Oetiker, generates Web pages such as the following:

Figure 2. Top-level MRTG example

Shown here is a portion of a Web page displaying network traffic, NFS operations per second, and CPU usage for a Network Appliance NFS Filer. Clicking on one of the images leads to a page showing the daily, weekly, monthly, and yearly plots. Below are the plots for the number of NFS operations per second:

Figure 3. Daily MRTG plot

Figure 4. Weekly MRTG plot

Figure 5. Monthly MRTG plot

Figure 6. Yearly MRTG plot

Once you've downloaded, configured, and compiled MRTG, it's a straightforward process to set up the monitoring of a new router or host. In this example, we will point MRTG at the SNMP agent running on a Solaris 2.6 host. Simply run the following commands:

% pwd
/home/blair/mrtg-2.8.6
% mkdir /home/blair/www/mrtg
% cp images/* /home/blair/www/mrtg/
% ./run/cfgmaker public@dagalas > dagalas.cfg
% vi dagalas.cfg

While editing dagalas.cfg, add the line WorkDir: /home/blair/www/mrtg as mentioned at the top of the file. Make sure all MaxBytes settings are large enough for the interface being monitored; cfgmaker sometimes sets this value too low, and all recorded data larger than this value will be ignored. Also add a new argument to each target so that the image plots the newest data on the right, not the left, side of the plot:

Options[XXX]: growright

% ./run/mrtg dagalas.cfg
Rateup WARNING: ./run//rateup could not read the primary log file for dagalas
Rateup WARNING: ./run//rateup The backup log file for dagalas was invalid as well
Rateup WARNING: ./run//rateup Can't remove dagalas.old updating log file
Rateup WARNING: ./run//rateup Can't rename dagalas.log to dagalas.old updating log file
% ./run/mrtg dagalas.cfg
Rateup WARNING: ./run//rateup Can't remove dagalas.old updating log file
% ./run/mrtg dagalas.cfg
% ./run/indexmaker dagalas.cfg > /home/blair/www/mrtg/index.html

Finish by setting the mrtg command in your crontab to run every five minutes; then just point your browser at the directory and you'll see the new results.
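The MaxBytes warning in the steps above is easier to appreciate once you see how a traffic rate comes out of two successive polls. The Python sketch below is an illustration of the idea and not MRTG's own rateup code: it assumes 32-bit interface counters that wrap at most once between polls, and it treats max_bytes as a ceiling in bytes per second above which a sample is discarded.

```python
COUNTER32_MODULUS = 2 ** 32

def bytes_per_second(previous, current, interval_seconds, max_bytes):
    """Turn two counter readings into a rate, or None if it is implausible.

    Assumptions (not taken from MRTG's source): the counter is 32 bits wide
    and has wrapped at most once since the previous reading.
    """
    delta = current - previous
    if delta < 0:
        delta += COUNTER32_MODULUS     # counter rolled over past 2**32
    rate = delta / interval_seconds
    if rate > max_bytes:
        return None                    # larger than the interface allows: ignore
    return rate

# A five-minute poll on a 10-Mbps interface (1,250,000 bytes per second).
print(bytes_per_second(1000, 301000, 300, 1250000))
print(bytes_per_second(0, 600000000, 300, 1250000))
```

If the ceiling is set below the true capacity of the interface, legitimate busy periods fall into the second case and vanish from the plots, which is why the value cfgmaker picks deserves a second look.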

The configuration file that cfgmaker creates contains lines like:

Target[XXX]: 1:public@dagalas

This will gather the traffic for port 1 of the machine named dagalas by using the community public for the SNMP query. You can also define the exact OID by using the syntax:

OID_1&OID_2:community@router

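The following example retrieves the input and output error rates on interface 1 (the ifInErrors and ifOutErrors OIDs). MRTG always graphs two values per target, so you must specify two OIDs; for data other than traffic, the pair might be something like temperature and humidity.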

Target[XXX]: 1.3.6.1.2.1.2.2.1.14.1&1.3.6.1.2.1.2.2.1.20.1:public@myrouter

This is where having tkmib available to look up numeric OID values is extremely useful.
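The two Target forms shown above share the same community@host tail and differ only in what precedes the colon. The Python sketch below splits a target specification into its parts. It covers only the two forms that appear in this article, and the function name and the shape of the returned dictionary are assumptions for illustration rather than anything MRTG itself exposes.

```python
def parse_target(spec: str) -> dict:
    """Split 'port:community@host' or 'OID_1&OID_2:community@host'."""
    what, colon, where = spec.partition(":")
    community, at, host = where.rpartition("@")
    if not colon or not at:
        raise ValueError(f"not a recognized target: {spec!r}")
    if "&" in what:
        first_oid, second_oid = what.split("&", 1)
        return {"kind": "oids", "oids": (first_oid, second_oid),
                "community": community, "host": host}
    return {"kind": "port", "port": int(what),
            "community": community, "host": host}

print(parse_target("1:public@dagalas"))
print(parse_target(
    "1.3.6.1.2.1.2.2.1.14.1&1.3.6.1.2.1.2.2.1.20.1:public@myrouter"))
```

Checking your hand-edited targets this way before handing them to MRTG can catch a misplaced colon or a missing second OID early.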

Cricket
Cricket, a relatively new tool compared to MRTG, was written by Jeff Allen and is based on Tobias Oetiker's new Round Robin Database (RRD) library.

Cricket is significantly faster than MRTG at gathering SNMP statistics and updating binary data files. It also leaves image creation to viewing time by having a CGI create the images. This saves CPU time for other purposes, though it does increase the user's wait for viewing. The other large improvement is the creation of an inheritance tree of configuration files. A top-level configuration file can set global parameters that may or may not be overridden in lower configuration files. Lower levels of the tree set more specific targets to monitor. This is extremely useful for large sites, as it lets different organizations handle different portions of the configuration tree.
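The inheritance idea can be pictured as merging settings from the top of the tree down to the target, with each lower level overriding what it inherits. The Python sketch below models that with plain dictionaries keyed by path. It does not use Cricket's actual configuration file syntax, and the setting names, paths, and function name are all hypothetical.

```python
def effective_config(tree: dict, path: str) -> dict:
    """Merge settings from the root down to path; deeper levels win."""
    settings = dict(tree.get("", {}))          # start with the global defaults
    prefix = ""
    for part in path.strip("/").split("/"):
        prefix = f"{prefix}/{part}" if prefix else part
        settings.update(tree.get(prefix, {}))  # override with this level
    return settings

# Hypothetical configuration tree: "" is the top level.
config_tree = {
    "": {"community": "public", "interval": 300},
    "routers": {"community": "secret"},
    "routers/core1": {"host": "core1"},
    "switches": {"interval": 60},
}

print(effective_config(config_tree, "routers/core1"))
print(effective_config(config_tree, "switches"))
```

A group that owns only the routers branch can change the community name there without touching the global defaults or anyone else's targets.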

A top-level page for viewing a Cricket installation, pulled directly from the Cricket author's demonstration Web site, is shown below.

Figure 7. Top-level example Cricket page

Clicking on the router link takes you to this page:

Figure 8. Second-level example Cricket page

Finally, clicking on this CPU link shows the actual statistics of the router's CPU usage:

Figure 9. Example Cricket page showing router CPU usage

More information on building a Cricket installation can be found on the Cricket Web page.

About the author
Blair Zajac is an IT analyst at Yahoo!/GeoCities, where he focuses on Web site architecture and performance issues, including networking hardware, content storage, international distribution, server operating systems, and Web server software. He is the author of the Orca monitoring system and was a key developer of the freely available Amanda backup software system. Before moving to Yahoo!/GeoCities, he was the systems manager for the Geological and Planetary Sciences Division at Caltech, where he also received a Ph.D. in geophysics.

URL: http://www.sunworld.com/swol-09-1999/swol-09-realtime2.html