Time bombs
Setting, adjusting, and maintaining synchronized clocks on a network
You can synchronize the clocks on your networked systems several ways. Some approach Rube Goldberg levels of efficiency, while others conjure chronological daemons to do your timekeeping bidding. Which method is best for your site? rdate? NTP? The answers lie below.
(3,200 words)
About ten years ago, Dennis Ritchie posted the question "What time is it?" on net.general, then the catch-all for USENET traffic of interest to the masses. Dennis was poking fun at messages (and their authors) that arrived several days after bearing any sense of relevance. The "what time is it?" probe demonstrated that USENET isn't a good source for a watch-setting consensus. The underlying problem, however, is not setting just one clock on one machine, but instead keeping an entire network of machines synchronized with respect to a global, reliable time source and with respect to each other.
Accurate time-keeping affects many day-to-day functions as well as issues critical to system administrators:
make, so skewed desktop clocks may result in erratic actions.
Put simply: how do you keep the software watches of hundreds or thousands of systems synchronized, particularly when they are spread around the world, and how do you make sure that your sense of time matches that of the rest of the computing world? This month, we address the question of what to do when your sense of time bombs. We'll start with an overview of how Unix systems keep track of time, look at a simple method for synchronizing Unix system clocks, and take a peek at the Network Time Protocol, a small dose of Swiss perfection you can bring to most Unix and Windows NT machines.
A brief history of time, in 64 bits or less
Time is suddenly in vogue for technical discussion. Impending software meltdowns due to the two-digit year rollover in 2000, and estimates of the amount of work required to fix every picayune date reference have displaced Internet mania on the covers of our trade press. The hysteria has clouded the issue slightly, since the problem with date rollover only exists if your systems do math on two-digit year values. If you designed a database with the year ranging from 00-99, you will have saved a few bytes per row but created a software headache for someone else. Lotus 1-2-3, Microsoft Excel, and other products skirt the 1999-party-over problem by using 100 to refer to 2000, 101 for 2001, and so on, extending date fields to three digits when the millennium arrives.
On Unix systems, the current time is decoupled from any particular year or representation. The time is stored as a count of the seconds elapsed since midnight on January 1, 1970; on current systems the seconds counter is a signed 32-bit value. The year 2000 isn't a problem, but in January 2038 the Unix time counter will roll over and cause another trade press crisis. The time counter is relative to Greenwich Mean Time (more precisely, Coordinated Universal Time, or UTC) and is converted into local time using the information in the timezone file /etc/timezone. Choosing UTC as a baseline means that (in theory) all Unix systems should have exactly the same time counter value at any point in time. We'll see how that simplifies time management protocols later on.
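To make the rollover concrete, here is a minimal Python sketch that converts a seconds-since-1970 counter to a UTC date and finds the last moment a signed 32-bit counter can represent. The 32-bit width is an assumption about the counter, and the names counter_to_utc and MAX_INT32 are invented for this illustration.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch: midnight UTC, January 1, 1970.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value a signed 32-bit counter can hold (assumed counter width).
MAX_INT32 = 2**31 - 1


def counter_to_utc(seconds):
    """Convert a seconds-since-epoch counter to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


rollover = counter_to_utc(MAX_INT32)
print(rollover.isoformat())  # 2038-01-19T03:14:07+00:00
```

One second after that instant, a signed 32-bit counter wraps to a large negative number, which such a system would read as a date in December 1901.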
Accurate time keeping affects nearly all of the file-based operations on a Unix system. Every Unix file has its modification and access times recorded in the inode. The modification time is used by backup utilities, NFS, and the local virtual memory management system for maintaining consistency of in-memory file cache pages. Note that creation times are not kept in the filesystem (see "The Unix Filesystem," SunWorld Online, September 1995). If you want an audit trail of when a file was created and modified, you'll need to use a source code control system such as SCCS (see "A system administrator's introduction to SCCS and its wonders," SunWorld Online, July 1995) that tracks each change.
Given that system clocks tend to drift apart, NFS client systems compensate for "impossible" clock values with slight modifications to basic commands like ls. Normally, ls shows you the day and modification time of a file if it is less than six months old; otherwise it shows you the month, day, and year:
    duey% ls -l
    total 38
    -rw-r--r--  1 stern    user       18631 Apr 15  1995 time.txt
What if you create a file on a machine whose clock is slightly ahead? You'll confuse ls, since its subtraction of modification time from current time yields a negative number. The NFS-aware ls accepts a clock drift window of a few minutes, and does the visually right thing with such files.
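The date-format decision described above can be sketched in a few lines of Python. This is an illustration of the idea, not the actual ls source: the function name ls_date, the 182-day approximation of six months, and the five-minute drift window are all assumptions made for the example.

```python
import calendar
import time

SIX_MONTHS = 182 * 24 * 60 * 60   # roughly six months, in seconds
DRIFT_WINDOW = 5 * 60             # hypothetical tolerance for clock skew


def ls_date(mtime, now):
    """Return an ls-style date string for a file modification time."""
    age = now - mtime
    stamp = time.gmtime(mtime)  # UTC keeps the example reproducible
    # A slightly negative age (file "from the future") is still treated
    # as recent, as long as it falls inside the drift window.
    if -DRIFT_WINDOW <= age < SIX_MONTHS:
        return time.strftime("%b %d %H:%M", stamp)
    return time.strftime("%b %d  %Y", stamp)


now = calendar.timegm((1996, 4, 15, 14, 45, 0))
print(ls_date(now - 3600, now))         # modified an hour ago
print(ls_date(now + 60, now))           # one minute "in the future"
print(ls_date(now - 366 * 86400, now))  # a year old
```

Without the drift window, the second file would fall through to the year format even though it was modified moments ago.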
If you're bothered by inconsistent file modification times, you can always explicitly set them using touch (or perhaps /usr/5bin/touch to access the System V version). To find all of the files modified since a certain time, use find and a timestamp file. The following example creates a timestamp file dated April 15, 1996, 2:45 PM, and then prints out files in /home/stern that have modification times more recent than the timestamp:
    duey% touch 04151445 /tmp/timestamp
    duey% find /home/stern -newer /tmp/timestamp -print
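If you want the same comparison inside a script, the following Python sketch walks a directory tree and reports regular files whose modification time is later than that of a reference file. The function name newer_than is invented here, and unlike find the sketch skips directories themselves.

```python
import os


def newer_than(root, reference):
    """Yield paths of files under root modified after the reference file."""
    cutoff = os.stat(reference).st_mtime
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat examines a symbolic link itself, not its target.
            if os.lstat(path).st_mtime > cutoff:
                yield path
```

A call such as `list(newer_than("/home/stern", "/tmp/timestamp"))` mirrors the find command above. Like find, it trusts whatever modification times the filesystem reports, so skewed clocks skew the results.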
If you're using a Secure RPC service, such as Secure NFS or NIS+, you also need solid time management to ensure that timestamp verifiers generated by a client are accepted by the server. The Secure RPC client encrypts the current time and passes it to the server, where it is decrypted to look for out-of-order or old requests that might signify a replay of a previous request. If you get messages like "auth-def validator mismatch" or "NIS+ received an invalid time stamp," your client clock has drifted too far behind the server's, and the server is rejecting Secure RPC requests.
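The server-side check can be pictured with a short Python sketch. This is not the Secure RPC implementation; it only illustrates the general idea of rejecting timestamps that are too far from the server's clock or that do not advance. The function name accept_request and the 300-second window are hypothetical.

```python
def accept_request(client_time, server_time, last_seen, window=300):
    """Return True if a decrypted client timestamp looks fresh.

    client_time: timestamp recovered from the request (seconds)
    server_time: the server's current clock (seconds)
    last_seen:   newest timestamp already accepted from this client,
                 or None if this is the first request
    window:      hypothetical tolerance for clock skew, in seconds
    """
    if abs(server_time - client_time) > window:
        return False  # clocks too far apart, or a very old request
    if last_seen is not None and client_time <= last_seen:
        return False  # out of order, or a replay of an earlier request
    return True


print(accept_request(1000, 1010, None))   # fresh request
print(accept_request(1000, 2000, None))   # client clock far behind
print(accept_request(1000, 1010, 1000))   # same timestamp seen before
```

The first test is the one that bites when clocks drift: a perfectly legitimate request is rejected simply because the two machines disagree about the time.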
As an increasing number of network services depend on timestamps, verification, and sequencing, it's a good idea to get your house's clocks in order. We'll start by going all the way down to the interrupt level to see how Unix tracks time.
Dali would be proud: Driving the soft clocks
When you set the time using date, you're really setting an on-board, battery-backed hardware clock. You've probably noticed that your Sun systems retain their sense of time, even when powered off or between reboots, since the hardware clock keeps ticking even when you're not clicking. In an ideal world, you would set your system's clock once when you installed it and forget about it, letting the Unix kernel read its on-board watch to keep track of time.
The real world is less forgiving: your initial setting can only be as accurate as your own watch or time source, so with multiple administrators or multiple sources, you generally get slightly skewed system clocks. The hardware clock will exhibit some natural drift, but at a second or two per month it tends to be lost in the noise of variations in wristwatch settings used to set the time. What you need to worry about, however, is drift between the software clock and the hardware clock caused by system load. Despite this long wind-up on the virtues of a hardware clock, there's some software involved that keeps the user- and system-visible sense of time.
All Unix systems are driven by a hardware clock (hardclock) that interrupts at regular intervals. In the case of Sun's SPARC-based systems, the hardclock runs at 100-Hz, generating 100 interrupts a second. Hardclocks act as a system heartbeat, providing a steady drum beat to drive the scheduler and implement timeouts for system operations such as TCP transmissions and RPC requests. Note that the hardclock and the hardware clock are two different things: the hardclock is a source of constant interruption, and the hardware clock is a built-in I/O device that can be set and read.
On the surface, getting the system time should be as easy as reading the hardware clock. However, the current system time is used throughout the filesystem and virtual memory code for comparing timestamps, so it has to be accessible with minimum latency and overhead. Reading the hardware clock is much slower than reading a value out of memory. To avoid constantly referring to the hardware clock, the Unix kernel uses the hardclock to drive a software clock. That is, at boot time, the kernel reads the current value of the hardware clock into a 64-bit chunk of memory. In the interrupt handler for the hardclock, the softclock is incremented by 10 milliseconds. Voila -- parallel software and hardware clocks. Unfortunately, the software timepiece is prone to drift.
While your sense of time is important, other system events take precedence over the hardclock. Incoming characters on a serial line, for example, may not wait around while the system processes events on a timeout queue. The serial device needs to be read and reset as quickly as possible to avoid dropping input characters. The virtual memory system also masks hardclock interrupts while it updates memory management unit entries, or manipulates address space kernel structures. Put a high serial I/O load on the system, or thrash your address spaces, and you're going to miss hardclock interrupts. When you miss a hardclock, the software clock falls behind. Miss enough, and your Unix system clock appears to be running slowly.
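A little arithmetic shows how quickly missed interrupts add up. The Python sketch below uses the 100 Hz rate and 10-millisecond increment given above, and ignores any correction the kernel applies afterward; the function name lag_seconds_per_day is invented for this example.

```python
HZ = 100                # hardclock interrupts per second (from the text)
TICK_MS = 1000 // HZ    # softclock increment per interrupt: 10 ms


def lag_seconds_per_day(observed_rate):
    """Seconds lost per day if only observed_rate interrupts/sec are handled."""
    missed_per_second = HZ - observed_rate
    return missed_per_second * TICK_MS * 86400 / 1000


print(lag_seconds_per_day(100))  # 0.0
print(lag_seconds_per_day(99))   # 864.0
```

Missing just one interrupt in a hundred would cost more than 14 minutes a day if nothing made up the difference.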
The softclock has a built-in adjustment mechanism. Periodically, it checks the value of the hardware clock, and determines if it needs to catch up. If the software clock is behind, it will slowly adjust itself by adding two or more "ticks" per hardclock until hardware and software are synchronized again. The incremental adjustment is known as the clock slew rate. The software clock catches up slowly to avoid abrupt changes in the user-visible time. If you are consistently missing hardclocks, the slow-sync approach may never get you caught up; you'll fall behind faster than the slew rate makes up for lost ticks. If this is a problem, you'll see two obvious symptoms. First, your system's clock will be losing time, and second, the clock interrupt rate shown by vmstat -i will be less than 100 per second:
    duey% vmstat -i
    interrupt         total     rate
    --------------------------------
    clock          95738328       99
    fdc0             510691        0
    --------------------------------
    Total          96249019       99
The clock interrupts are not counted in the generic output of vmstat, but appear in the interrupt breakdown shown above. If you see the interrupt rate drop to 99 per second or lower, it's time to lighten your system load or use some aggressive clock management. We'll look at a simple rdate-based scheme and then cover the Network Time Protocol, a finer-grain and gentler time management system.
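The race between slewing and missed interrupts can be modeled with a simple Python sketch. It is a toy model, not kernel code: it assumes each handled interrupt adds one normal 10-millisecond tick plus a fixed extra amount, and that a fixed fraction of interrupts is lost. The function name and the parameter values are hypothetical.

```python
import math


def ticks_to_catch_up(lag_ms, extra_ms_per_tick, missed_fraction=0.0, tick_ms=10):
    """Hardclock periods needed to erase lag_ms, or None if it never happens.

    Each handled interrupt gains extra_ms_per_tick on real time;
    each missed interrupt loses a full tick_ms.
    """
    gain = (1 - missed_fraction) * extra_ms_per_tick
    loss = missed_fraction * tick_ms
    net = gain - loss
    if net <= 0:
        return None  # falling behind at least as fast as the slew recovers
    return math.ceil(lag_ms / net)


print(ticks_to_catch_up(500, 10))                       # 50
print(ticks_to_catch_up(500, 10, missed_fraction=0.5))  # None
```

In the first case, doubling up on ticks erases a half-second lag in half a second of real time. In the second, half the interrupts are lost, the gains and losses cancel, and the clock never recovers.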
"What Time Is It?" revisited
Included in the bevy of Berkeley r-commands is rdate, the equivalent of the date time-setting command that takes its input from another host on the network, in this case the machine timepiece:
duey# rdate timepiece
The result will be system clocks that are synchronized to a common source, and accurate within a few seconds. Many system administrators stick an rdate command in the crontab file, forcing clients to synchronize once a day or even once an hour with a well-known server. If you do go the rdate route, make sure you redirect all of your output streams using a crontab entry like the following:
0 * * * * rdate timepiece > /dev/null 2>&1
If you don't catch stdout and stderr, rdate's confirmations will show up in root's mailbox, courtesy of cron. A simple approach to rdate synchronization is to have each NFS client talk to one of its NFS servers, and have the servers talk to a common, local time source. You can replicate this setup across multiple LANs, where cross-network time synchronization may not be as much of an issue if you aren't sharing PGP-signed messages, NFS mounted filesystems or NIS+ servers.
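To see why rdate is only accurate to about a second, it helps to look at what comes over the wire. The Python sketch below assumes the server speaks the Time Protocol of RFC 868 on TCP port 37, which is what rdate implementations commonly use: the reply is a single 32-bit count of seconds since January 1, 1900. The function names are invented for this example, and fetch_time needs a reachable time server to do anything useful.

```python
import socket
import struct

# Seconds between 1900-01-01 (Time Protocol epoch) and 1970-01-01 (Unix epoch).
EPOCH_OFFSET = 2208988800


def time_protocol_to_unix(payload):
    """Decode a 4-byte Time Protocol reply into Unix seconds."""
    if len(payload) != 4:
        raise ValueError("expected exactly 4 bytes")
    (seconds_since_1900,) = struct.unpack("!I", payload)  # big-endian unsigned
    return seconds_since_1900 - EPOCH_OFFSET


def fetch_time(host, port=37, timeout=5.0):
    """Ask a Time Protocol server for the current time (Unix seconds)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = b""
        while len(data) < 4:
            chunk = conn.recv(4 - len(data))
            if not chunk:
                break  # server closed the connection early
            data += chunk
    return time_protocol_to_unix(data)
```

The reply carries whole seconds and no information about how long it spent in transit, so the client can do little more than copy the value into its own clock.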
There are drawbacks to the rdate scheme. While it doesn't require root privileges between hosts over the network, it does introduce a domino effect in the case of a failure or error in setting the top-level server's time. Your clients' clocks will only be as accurate as the server singled out as a time source. Users may also see time discontinuities, which are amusing when you're in a role-playing game but frustrating when trying to build a library using make. What you want is a time-management protocol that adjusts the clock slew rate, creating gentle shifts in time, and uses a group of trusted servers to provide an accurate, stable time base. It's called the Network Time Protocol, or NTP.
The network Rolex
The Network Time Protocol (NTP) was developed by David Mills at the University of Delaware. It's freely available and runs on nearly every Unix variant as well as Windows NT clients and servers. NTP is in its third protocol revision, described by RFC 1305. Version 2 is covered in RFC 1119 and the original Version 1 is in RFC 1059. A Version 3 installation will talk to older NTP versions, so you can focus on the current release without worrying about compatibility issues. Given that anyone with root privileges can set the time, NTP provides a quiet, unobtrusive way of undoing whatever time warps were intentionally or inadvertently created on your networked hosts.
NTP organizes hosts into varying layers of accuracy known as strata. Stratum 1 hosts are attached to reliable, accurate time keeping hardware such as radio receivers (simpler than a personal atomic clock). The NTP distribution includes drivers for a variety of time pieces, so you can build your own stratum 1 host if you choose to, or if you don't have a reliable, permanent Internet connection. The strata below the top level are assumed to be less accurate and to exhibit more variability in their time keeping. The NTP server's job is to keep hosts in all strata synchronized with the well-known source of quality time.
Consider this crude time management system: You know there's an atomic clock at the university located 10 miles away. Each hour, you drive to the campus, read the clock and note the current time on a sheet of paper, and then drive back to your building and set the clock using your reference sheet. Of course, your clock is a solid ten or fifteen minutes behind the time reference, and the actual delay varies depending upon traffic and the number of vending machines that looked appealing on the way out of the building. NTP manages a similar problem with network hosts by measuring the offset, or actual time difference in their clocks, and the latency or delay required to send a message from one to another. If you can reliably drive from one campus to another in six minutes, with little or no variation, it's safe to set your clock to the time noted on your paper plus six minutes.
NTP chooses a reference server in a higher stratum by looking for the smallest, least variable communication delay. Having the client and server measure latency for a unicast packet is nearly impossible, since it requires synchronized clocks, precisely the problem to be solved! NTP uses an average of round-trip transit times, so that offsets between client and server clocks are removed from the delay calculation. On a local area network, NTP can maintain the time within a few milliseconds, and on a wide-area network, the dispersion is usually within a few tens of milliseconds. In theory, NTP can adjust clocks to within 300 picoseconds of each other, but network latency, scheduling latency and other variables keep the realized skew in the millisecond range. As networks get faster, and CPU cycle times decrease into the few nanosecond arena, NTP will get more accurate as well.
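The round-trip trick is easier to see with numbers. The Python sketch below applies the standard four-timestamp calculation to one request and reply: t1 and t4 are read from the client's clock when the request leaves and the reply arrives, while t2 and t3 are read from the server's clock when the request arrives and the reply leaves. The function name and the sample values are invented for the example, and the result is exact only if the outbound and return trips take equal time.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one exchange.

    t1: request sent      (client clock)
    t2: request received  (server clock)
    t3: reply sent        (server clock)
    t4: reply received    (client clock)
    """
    # Time on the wire: total elapsed at the client minus server think time.
    delay = (t4 - t1) - (t3 - t2)
    # How far the server's clock is ahead of the client's.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Times in milliseconds. The server runs 5 seconds ahead, each network
# trip takes 100 ms, and the server spends 50 ms handling the request.
offset, delay = offset_and_delay(100000, 105100, 105150, 100250)
print(offset, delay)  # 5000.0 200
```

Each difference in the delay formula subtracts two readings of the same clock, so the unknown offset between the machines never enters into it.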
NTP consists of several administrative utilities and a daemon that speaks the NTP V3 protocol. In its simplest installation, you run the NTP-enabled version of date, called ntpdate, out of cron the same way you'd use rdate. If the time offsets are more than half a second, ntpdate does a "step" to synchronize the clocks; otherwise it uses the gentle slew method normally used by the softclock to catch up. While date and friends use the stime() system call to explicitly set the time, NTP makes use of the adjtime() call to control the softclock slew rate.
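The step-or-slew decision boils down to a single comparison, sketched here in Python. The half-second threshold comes from the description above; the function name correction_method is invented for this example, and the real ntpdate logic may take other factors into account.

```python
STEP_THRESHOLD = 0.5  # seconds; "more than half a second" per the text


def correction_method(offset_seconds):
    """Choose how to correct a clock that is off by offset_seconds."""
    if abs(offset_seconds) > STEP_THRESHOLD:
        return "step"  # set the clock outright
    return "slew"      # nudge the clock rate until the offset is gone


print(correction_method(0.020))  # slew
print(correction_method(-3.0))   # step
```

Slewing keeps time moving smoothly forward, which is what make and friends depend on; stepping is reserved for errors too large to correct gradually in reasonable time.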
If you want constant, highly reliable timekeeping, you also run the xntpd daemon that converses with hosts in other strata. After an initial date-check with ntpdate, xntpd will keep the time with minimum skew between hosts.
NTP documentation is provided in HTML format, so you can use your
favorite browser and search engine to become familiar with it. There
are some items we'll call out as especially noteworthy:
xntpd finds its own servers and is largely self-configuring.
adjtime() system call is accurate to better than 10 milliseconds.
set dosynctodr=0
The variable dosynctodr is for "doing synchronization of time of day register". Set it to zero and NTP will control the softclock slew rate.
xntpdc that lets you query an NTP server directly. Use it for debugging strange behavior such as systems that will not converge on an agreed-upon time.
A cardinal rule of system administration is that something is always broken. If you rely on a single system, with multiple administrators, for an accurate time accounting, you're likely to be surprised when license servers, security services, or software builds break in creative ways. Reducing random behavior is always a nice goal, and establishing a reliable time-keeping mechanism is a good step in that direction. And besides -- when was the last time you shocked someone with your punctuality?
URL: http://www.sunworld.com/swol-04-1996/swol-04-sysadmin.html