Click on our Sponsors to help Support SunWorld
Sysadmin by Hal Stern

Time bombs

Setting, adjusting, and maintaining synchronized clocks on a network

SunWorld
April  1996
[Next story]
[Table of Contents]
[Search]
Subscribe to SunWorld, it's free!

Abstract
You can synchronize the clocks on your networked systems several ways. Some approach Rube Goldberg levels of efficiency, while others conjure chronological daemons to do your timekeeping bidding. Which method is best for your site? rdate? NTP? The answers lie below. (3,200 words)


Mail this
article to
a friend

About ten years ago, Dennis Ritchie posted the question "What time is it?" on net.general, then the catch-all for USENET traffic of interest to the masses. Dennis was poking fun at messages (and their authors) that arrived several days after bearing any sense of relevance. The "what time is it?" probe demonstrated that USENET isn't a good source for a watch-setting consensus. The underlying problem, however, is not setting just one clock on one machine, but instead keeping an entire network of machines synchronized with respect to a global, reliable time source and with respect to each other.

Accurate time-keeping affects many day-to-day functions as well as issues critical to system administrators:

Put simply: how do you keep the software watches of hundreds or thousands of systems synchronized, particularly when they are spread around the world, and how do you make sure that your sense of time matches that of the rest of the computing world? This month, we address the question of what to do when your sense of time bombs. We'll start with an overview of how Unix systems keep track of time, We'll look a simple method for synchronizing Unix system clocks, and take a peek at the Network Time Protocol, a small dose of Swiss perfection you can bring to most Unix and Windows NT machines.


Advertisements

A brief history of time, in 64 bits or less

Time is suddenly in vogue for technical discussion. Impending software meltdowns due to the two-digit year rollover in 2000, and estimates of the amount of work required to fix every picayune date reference have displaced Internet mania on the covers of our trade press. The hysteria has clouded the issue slightly, since the problem with date rollover only exists if your systems do math on two-digit year values. If you designed a database with the year ranging from 00-99, you will have saved a few bytes per row but created a software headache for someone else. Lotus 1-2-3, Microsoft Excel, and other products skirt the 1999-party-over problem by using 100 to refer to 2000, 101 for 2001, and so on, extending date fields to three digits when the millennium arrives.

On Unix systems, the current time is decoupled from any particular year or representation. The time is stored as a 64-bit value representing the number of seconds elapsed since January 1, 1970. The year 2000 isn't a problem, but sometime in January 2037 the Unix time counter will roll over and cause another trade press crisis. The time counter is relative to Greenwich Mean Time (or Universal Mean Time) and is converted into local time using the information in the timezone file /etc/timezone. Choosing UMT as a baseline means that (in theory) all Unix systems should have exactly the same 64-bit time counter at any point in time. We'll see how that simplifies time management protocols later on.

Accurate time keeping affects nearly all of the file-based operations on a Unix system. Every Unix file has its modification and access times recorded in the inode. The modification time is used by backup utilities, NFS, and the local virtual memory management system for maintaining consistency of in-memory file cache pages. Note that no creation times are not kept in the filesystem (see "The Unix Filesystem" SunWorld Online September 1995). If you want an audit trail of when a file was created and modified, you'll need to use a source code control system such as SCCS (see "A system administrator's introduction to SCCS and its wonders" SunWorld Online July 1995) that tracks each change.

Given that system clocks tend to drift apart, NFS client systems compensate for "impossible" clock values with slight modifications to basic commands like ls. Normally, ls shows you the day and modification time of a file if it is less than six months old, otherwise it shows you the month, day and year:

duey% ls -l 
total 38
-rw-r--r--   1 stern    user      18631 Apr 15  1995 time.txt

What if you create a file on a machine whose clock is slightly ahead? You'll confuse ls since it's subtraction of modification time from current time yields a negative number. The NFS-aware ls accepts a clock drift window of a few minutes, and does the visually right thing with such files.

If you're bothered by inconsistent file modification times, you can always explicitly set them using touch (or perhaps /usr/5bin/touch to access the System V version). To find all of the files created since a certain time, use find and a timestamp file. The following example creates a timestamp file dated April 15, 1996, 2:15 PM, and then prints out files in /home/stern that have modification times more recent than the timestamp:

duey% touch 04151445 /tmp/timestamp 
duey% find /home/stern -newer /tmp/timestamp -print

If you're using a Secure RPC service, such as Secure NFS or NIS+, you also need solid time management to ensure that timestamp verifiers generated by a client are accepted by the server. The Secure RPC client encrypts the current time and passes it to the server, where it is decrypted to look for out-of-order or old requests that might signify a replay of a previous request. If you get messages like "auth-def validator mismatch" or "NIS+ received an invalid time stamp," your client clock has drifted too far behind the server's, and the server is rejecting Secure RPC requests.

As an increasing number of network services depend on timestamps, verification, and sequencing, it's a good idea to get your house's clocks in order. We'll start by going all the way down to the interrupt level to see how Unix tracks time.


Advertisements

Dali would be proud: Driving the soft clocks

When you set the time using date, you're really setting an on-board, battery-backed hardware clock. You've probably noticed that your Sun systems retain their sense of time, even when powered off or between reboots, since the hardware clock keeps ticking even when you're not clicking. In an ideal world, you would set your system's clock once when you installed it and forget about it, letting the Unix kernel read its on-board watch to keep track of time.

The real world is less forgiving: your initial setting can only be as accurate as your own watch or time source, so with multiple administrators or multiple sources, you generally get slightly skewed system clocks. The hardware clock will exhibit some natural drift, but at a second or two per month it tends to be lost in the noise of variations in wristwatch settings used to set the time. What you need to worry about, however, is drift between the software clock and the hardware clock caused by system load. Despite this long wind-up on the virtues of a hardware clock, there's some software involved that keeps the user- and system-visible sense of time.

All Unix systems are driven by a hardware clock (hardclock) that interrupts at regular intervals. In the case of Sun's SPARC-based systems, the hardclock runs at 100-Hz, generating 100 interrupts a second. Hardclocks act as a system heartbeat, providing a steady drum beat to drive the scheduler and implement timeouts for system operations such as TCP transmissions and RPC requests. Note that the hardclock and the hardware clock are two different things: the hardclock is a source of constant interruption, and the hardware clock is a built-in I/O device that can be set and read.

On the surface, getting the system time should be as easy as reading the hardware clock. However, the current system time is used throughout the filesystem and virtual memory code for comparing timestamps, so it has to be accessible with minimum latency and overhead. Reading the hardware clock is much less efficient, and much slower, than reading a value out of memory. To avoid constantly referring to the hardware clock, the Unix kernel uses the hardclock to drive a software clock. That is, at boot time, the kernel reads the current version of the hardware clock into a 64-bit chunk of memory. In the interrupt handler for the hardclock, the softclock is incremented by 10 milliseconds. Voila -- parallel software and hardware clocks. Unfortunately, the software timepiece drifts quite frequently.

While your sense of time is important, other system events take precedence over the hardclock. Incoming characters on a serial line, for example, may not wait around while the system processes events on a timeout queue. The serial device needs to be read and reset as quickly as possible to avoid dropping input characters. The virtual memory system also masks hardclock interrupts while it updates memory management unit entries, or manipulates address space kernel structures. Put a high serial I/O load on the system, or thrash your address spaces, and you're going to miss hardclock interrupts. When you miss a hardclock, the software clock falls behind. Miss enough, and your Unix system clock appears to be running slowly.

The softclock has a built-in adjustment mechanism. Periodically, it checks the value of the hardware clock, and determines if it needs to catch up. If the software clock is behind, it will slowly adjust itself by adding two or more "ticks" per hardclock until hardware and software are synchronized again. The incremental adjustment is known as the clock slew rate. The software clock catches up slowly to avoid abrupt changes in the user-visible time. If you are consistently missing hardclocks, the slow-sync approach may never get you caught up; you'll fall behind faster than the slew rate makes up for lost ticks. If this is a problem, you'll see two obvious symptoms. First, your system's clock will be losing time, and second, the clock interrupt rate shown by vmstat -i will be less than 100 per second:

duey% vmstat -i
interrupt         total     rate
--------------------------------
clock          95738328       99
fdc0             510691        0
--------------------------------
Total          96249019       99

The clock interrupts are not counted in the generic output of vmstat, but appear in the interrupt breakdown shown above. If you see the interrupt rate drop to 99 per second or lower, it's time to lighten your system load or use some aggressive clock management. We'll look at a simple rdate-based scheme and then cover the Network Time Protocol, a finer-grain and gentler time management system.

"What Time Is It?" revisited

Included in the bevy of Berkeley r-commands is rdate, the equivalent of the date time-setting command that takes its input from another host on the network, in this case the machine timepiece:

duey# rdate timepiece

The result will be system clocks that are synchronized to a common source, and accurate within a few seconds. Many system administrators stick an rdate command in the crontab file, forcing clients to synchronize once a day or even once an hour with a well-known server. If you do go the rdate route, make sure you redirect all of your output streams using a crontab entry like the following:

0 * * * *  rdate timepiece > /dev/null 2>&1

If you don't catch stdout and stderr, rdate's confirmations will show up in root's mailbox, courtesy of cron. A simple approach to rdate synchronization is to have each NFS client talk to one of its NFS servers, and have the servers talk to a common, local time source. You can replicate this setup across multiple LANs, where cross-network time synchronization may not be as much of an issue if you aren't sharing PGP-signed messages, NFS mounted filesystems or NIS+ servers.

There are drawbacks to the rdate scheme. While it doesn't require root privileges between hosts over the network, it does introduce a domino effect in the case of a failure or error in setting the top-level server's time. Your clients' clocks will only be as accurate as the server singled out as a time source. Users may also see time discontinuities, which are amusing when you're in a role-playing game but frustrating when trying to build a library using make. What you want is a time-management protocol that adjusts the clock slew rate, creating gentle shifts in time, and uses a group of trusted servers to provide an accurate, stable time base. It's called the Network Time Protocol, or NTP.

The network Rolex

The Network Time Protocol (NTP) was developed by David Mills at the University of Delaware. It's freely available and runs on nearly every Unix variant as well as Windows NT clients and servers. NTP is in its third protocol revision, described by RFC 1305. Version 2 is covered in RFC 1119 and the original Version 1 is in RFC 1059. A Version 3 installation will talk to older NTP versions, so you can focus on the current release without worrying about compatibility issues. Given that anyone with root privileges can set the time, NTP provides a quiet, unobtrusive way of undoing whatever time warps were intentionally or inadvertently created on your networked hosts.

NTP organizes hosts into varying layers of accuracy known as strata. Strata 1 hosts are attached to reliable, accurate time keeping hardware such as radio receivers (simpler than a personal atomic clock). The NTP distribution includes drivers for a variety of time pieces, so you can build your own strata 1 host if you choose to, or if you don't have a reliable, permanent Internet connection. The strata below the top level are assumed to be less accurate and to exhibit more variability in their time keeping. The NTP server's job is to keep hosts in all strata synchronized with the well-known source of quality time.

Consider this crude time management system: You know there's an atomic clock at the university located 10 miles away. Each hour, you drive to the campus, read the clock and note the current time on a sheet of paper, and then drive back to your building and set the clock using your reference sheet. Of course, your clock is a solid ten or fifteen minutes behind the time reference, and the actual delay varies depending upon traffic and the number of vending machines that looked appealing on the way out of the building. NTP manages a similar problem with network hosts by measuring the offset, or actual time difference in their clocks, and the latency or delay required to send a message from one to another. If you can reliably drive from one campus to another in six minutes, with little or no variation, it's safe to set your clock to the time noted on your paper plus six minutes.

NTP chooses a reference server in a higher strata by looking for the smallest, least variable communication delay. Having the client and server measure latency for a unicast packet is nearly impossible, since it requires synchronized clocks, precisely the problem to be solved! NTP uses an average of round-trip transit times, so that offsets between client and server clocks are removed from the delay calculation. On a local area network, NTP can maintain the time within a few milliseconds, and on a wide-area network, the dispersion is usually within a few tens of milliseconds. In theory, NTP can adjust clocks to within 300 picoseconds of each other, but network latency, scheduling latency and other variables keep the realized skew in the millisecond range. As networks get faster, and CPU cycle times decrease into the few nanosecond arena, NTP will get more accurate as well.

NTP consists of several administrative utilities and a daemon that speaks the NTP V3 protocol. In its simplest installation, you run the NTP-enabled version of date, called ntpdate, out of cron the same way you'd use rdate. If the time offsets are more than half a second, ntpdate does a "step" to synchronize the clocks, otherwise it uses the gentle slew method normally used by the softclock to catch up. While date and friends use the stime() system call to explicitly set the time, NTP makes use of the adjtime() call to control the softclock slew rate.

If you want constant, highly reliable timekeeping, you also run the xntpd daemon that converses with hosts in other strata. After an initial date-check with ntpdate, xntpd will keep the time with minimum skew between hosts. NTP documentation is provided in HTML format, so you can use your favorite browser and search engine to become familiar with it. There are some items we'll call out as especially noteworthy:

A cardinal rule of system administration is that something is always broken. If you rely on a single system, with multiple administrators, for an accurate time accounting, you're likely to be surprised when license servers, security services, or software builds break in creative ways. Reducing random behavior is always a nice goal, and establishing a reliable time-keeping mechanism is a good step in that direction. And besides -- when was the last time you shocked someone with your punctuality?


Click on our Sponsors to help Support SunWorld


Resources


What did you think of this article?
-Very worth reading
-Worth reading
-Not worth reading
-Too long
-Just right
-Too short
-Too technical
-Just right
-Not technical enough
 
 
 
    

SunWorld
[Table of Contents]
Subscribe to SunWorld, it's free!
[Search]
Feedback
[Next story]
Sun's Site

[(c) Copyright  Web Publishing Inc., and IDG Communication company]

If you have technical problems with this magazine, contact webmaster@sunworld.com

URL: http://www.sunworld.com/swol-04-1996/swol-04-sysadmin.html
Last modified: