Programming asynchronous I/O in Solaris 2

Using AIO is not hard, and on MP machines it yields tremendous throughput advantages.

By Steve Leung

SunWorld
March  1996

Abstract
Programming asynchronous input/output (AIO) requires a lot of effort in design and coding. But with good design, software utilizing AIO can outperform its synchronous I/O counterpart by as much as 50 percent. We will examine the Solaris API for programming AIO. (5,100 words, including 2 sidebars)


Let's start with the basics. Today's computers have input/output hardware that functions asynchronously. The I/O subsystem sends I/O commands to the I/O devices (such as disk drives) and continues to perform other tasks without waiting for the completion of the I/O commands. When the I/O devices complete the execution of the commands, they interrupt the CPU. The Solaris kernel, however, will block user applications (processes) until their corresponding I/O commands are completed, even though the underlying hardware works asynchronously.

When an application (process) executes I/O system calls like read(2) or write(2), the Solaris kernel translates the system calls into I/O commands and sends them to the I/O device or devices. While the devices process the commands, Solaris puts the application to sleep and switches to serve another process. When the kernel receives an interrupt from the I/O device signaling the completion of the original I/O commands, it wakes the sleeping process and makes it eligible to run again. This is a simplified view of how a kernel performs I/Os. In reality, kernel and file systems also implement other facilities to improve performance, including read-ahead and buffered writes.

It is not good for performance-sensitive applications to go to sleep because of I/O requests! Asynchronous I/O is one way to work around this potential barrier to higher performance. AIO is suited for any I/O intensive application.

Solaris provides the library libaio for AIO programming. The library contains only four functions: aioread, aiowrite, aiocancel, and aiowait.

We'll cover each function in detail and provide examples. Then, we'll look at compiling AIO applications, SIGIO, signal handling, programming AIO, performance issues, benchmarking synchronous and asynchronous I/O, multi-threaded unsafe and AIO implementations, and the Posix 4 library.


aioread

The definition of aioread looks like this:

int aioread(
	int fd,			/* file descriptor */
	char *bufp,		/* buffer */
	int nbyte,		/* number of bytes */
	off_t offset,		/* file pointer offset */
	int whence,		/* seek option */
	aio_result_t *resultp)	/* result of aioread */

You can think of aioread as the asynchronous counterpart of read. aioread initiates an asynchronous read and returns control to the calling program without waiting for the completion of read. Let's consider each argument in detail.

The value of whence could be SEEK_SET, SEEK_CUR, or SEEK_END. SEEK_SET sets the file pointer to offset. SEEK_CUR sets the file pointer to the current file pointer plus offset. SEEK_END sets the file pointer to EOF plus offset. If the file object is not capable of seeking, the values of offset and whence are ignored. resultp points to the data structure aio_result_t, which stores the result of aioread. aio_result_t is defined as:

typedef struct aio_result_t {
	int aio_return;		/* return value of the corresponding read() */
	int aio_errno;		/* errno set by the corresponding read() */
} aio_result_t;	/* as defined in /usr/include/sys/aio.h */

Upon the completion of read, aio_return and aio_errno are set to indicate the result of the AIO request. The request is submitted by calling aioread. (Note: I wrote "the completion of read" and not "the completion of aioread.") The result of an AIO operation can only be determined after its corresponding asynchronous read is completed. If the operation is not completed, the buffer that bufp points to may not contain any valid data. aioread always returns right away, but its submitted AIO operation will not complete until the corresponding read finishes. Upon completion, aio_return contains the return value of read (or write), and aio_errno contains the errno set by read (or write).

There are three ways to detect the completion of an AIO operation.

aio_return
AIO_INPROGRESS is not a value that Solaris will use in aio_return; therefore, you can detect the completion of aioread or other changes in state by initializing aio_return to AIO_INPROGRESS before calling aioread. The value of AIO_INPROGRESS is -2 for Solaris 2.4.

For example,

	...
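	/* mark the result as pending before submitting the request */
	resultp->aio_return = AIO_INPROGRESS;
	/* read some data into the buffer */
	aioread(fd, bufp, nbyte, 0, SEEK_SET, resultp);
	/* loop until the asynchronous operation is completed */
	while (resultp->aio_return == AIO_INPROGRESS)
		;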
	/* now it is safe to access the buffer */
	access_the_buffer(bufp);	

signal

A completed AIO operation sends a SIGIO signal to the process.

function aiowait
Calling aiowait will block until an outstanding AIO operation is completed.

The AIO_INPROGRESS example implements a busy wait. It does not give us any performance advantage. We will discuss how to gain performance with signal and aiowait in detail later.

aiowrite

The definition of aiowrite is

int aiowrite(
	int fd,			/* file descriptor */
	char *bufp,		/* buffer */
	int nbyte,		/* number of bytes */
	off_t offset,		/* file pointer offset */
	int whence,		/* seek option */
	aio_result_t *resultp)	/* result of aiowrite */

Everything we discussed earlier about aioread works the same for aiowrite, except reading becomes writing. aiowrite initiates an asynchronous write and returns control to the calling function without waiting for the completion of the write. Completion notification of the asynchronous write operation follows the same scheme as aioread.

There is a limit to the number of outstanding asynchronous I/Os that Solaris allows for a single process. The limit is defined in MAXASYNCHIO. Solaris 2.4 allows 200 outstanding asynchronous I/Os.

#define MAXASYNCHIO		(200)	/* as defined in /usr/include/sys/aio.h */

aiocancel
aiocancel cancels an asynchronous operation.

int aiocancel(
	aio_result_t *resultp)

We use resultp to identify the asynchronous operation we want to cancel. resultp must point to a result structure that was previously used in an asynchronous I/O function call (either aioread or aiowrite). Do not reuse the result structure until you are sure its corresponding I/O operation is complete. Each unique result structure identifies a unique AIO request.

aiocancel returns 0 on success; otherwise, it returns -1 and sets errno to indicate the error. For example, if resultp does not match any outstanding AIO request, errno is set to EACCES (if there are some outstanding AIO requests for the process) or EINVAL (if there are no outstanding AIO requests). Depending on the error, applications can execute different programmer-defined error handling.

aiowait

aiowait provides a synchronous method of notification. It is defined as

aio_result_t *
aiowait(
	const struct timeval *timeval)

aiowait suspends the calling process until one of its outstanding AIO requests completes. timeval specifies the maximum time interval that the calling process will wait. If timeval is a NULL pointer, the calling process will wait indefinitely; otherwise timeval points to a structure which defines the time interval.

struct timeval {
	long	tv_sec;		/* seconds */
	long	tv_usec;		/* microseconds */
};	/* as defined in /usr/include/sys/time.h */

If you want to implement polling ("no wait"), set both tv_sec and tv_usec to zero. When an outstanding AIO operation completes, aiowait returns a pointer to the result structure of the completed operation. We can then check the result structure to see whether the operation completed successfully. If the time limit expires first, aiowait returns 0. If there's a failure, aiowait returns -1 and sets errno. For example, it sets errno to EINVAL when there are no outstanding AIO requests.
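The timeout argument is easy to get wrong when the application thinks in milliseconds. The following is a minimal sketch of two helpers for building and classifying the struct timeval passed to aiowait. The names ms_to_timeval and is_poll_timeout are hypothetical and are not part of libaio.

```c
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not part of libaio): convert a timeout given in
 * milliseconds into the struct timeval form that aiowait() expects.
 * Negative inputs are clamped to zero, which means "poll".
 */
struct timeval ms_to_timeval(long ms)
{
	struct timeval tv;

	if (ms < 0)
		ms = 0;
	tv.tv_sec = ms / 1000;			/* whole seconds */
	tv.tv_usec = (ms % 1000) * 1000;	/* remainder, in microseconds */
	return tv;
}

/*
 * Returns nonzero when tv requests a pure poll ("no wait"): a non-NULL
 * pointer whose fields are both zero. A NULL pointer means "wait
 * indefinitely", so it is not a poll.
 */
int is_poll_timeout(const struct timeval *tv)
{
	return tv != NULL && tv->tv_sec == 0 && tv->tv_usec == 0;
}
```

A caller would then write something like tv = ms_to_timeval(250); rp = aiowait(&tv); to wait at most a quarter of a second, or pass ms_to_timeval(0) to poll.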

Each result structure identifies an AIO request. When you submit an AIO request through aioread or aiowrite, the request becomes outstanding. Even though its corresponding asynchronous read or write is complete, the result is considered outstanding until you call aiowait to mark off its outstanding status. In Solaris terminology, this means you must dequeue its notification. If you use the result structure of an outstanding AIO request with another aioread or aiowrite, you will get the error EINVAL (indicating that the result structure is currently used by an outstanding AIO request).

Compiling AIO applications
There is only one header file <sys/asynch.h> required for using libaio functions. asynch.h contains all the needed AIO-related data structures for using AIO functions. To compile an AIO application, you must do the following:

% cc [flag ...] file ... -laio [library ...]

SIGIO
As I mentioned, AIO boosts performance by allowing the calling process to perform other critical tasks until it can no longer proceed without accessing the results of the outstanding AIO operations. There are situations when we may want to check the status of the outstanding AIO operations regularly. However, we want to minimize excess checking for performance reasons. Using polling (as in the AIO_INPROGRESS example earlier) to check the status also hurts performance. A better choice is to use asynchronous notification for the completion of AIO operations.

Keep in mind aiowait is the only way to dequeue a notification; therefore, the process must still call aiowait after it receives the SIGIO signal.

By default, the notification is done synchronously through the use of aiowait. In asynchronous notification, a SIGIO signal is sent to the calling process when I/O is completed. You turn on asynchronous notification by installing a signal handler to manage (catch) the SIGIO signal. You must understand Solaris signal handling to program AIO effectively. I will review the basic concepts related to programming AIO in the next section. Skip it if you are an expert.

Signal handling
Signals are software interrupts. SIGIO is one of Solaris's many signals. For AIO, if asynchronous notification is turned on, a SIGIO signal is sent to the calling process when I/O is completed. A process can react to the signal in one of three ways: ignore it, let the default action take place, or catch it with a signal handler.

For most signals (including SIGIO), the default action terminates the process. Turn on the asynchronous notification by installing a signal handler to catch the SIGIO signal through the system call signal, sigset, or sigaction. Their usages are similar. For example, both signal(SIGIO, sig_handler) and sigset(SIGIO, sig_handler) will install the user-defined function sig_handler as the signal handler to the SIGIO signal. When the calling process receives the SIGIO signal, control is transferred to the signal handler. When it completes, control is returned to where the calling process was previously interrupted.

The difference between these two system calls is simple. signal will reset the signal handler to the original default action when it enters the signal handler function. sigset will not. Programmers will usually write code to call signal again inside the signal handler to re-install the signal handler. There is a window of time, however, after the signal has occurred and before that call, during which the kernel will take the default action and terminate the calling process if another signal arrives. For this reason it is almost always better to use sigset than signal.

Some system calls are interruptible. When calling slow system calls such as read from files, the calling process is blocked to wait for the completion of the call. If a signal is received at that time, it will interrupt the system call, and the system call will return the error EINTR. The usage of sigaction is similar to sigset, but it adds support for restarting system calls. If the sigaction option SA_RESTART is set, the kernel will automatically restart system calls that are interrupted by the signal. If your program contains interruptible system calls, consider using sigaction instead of sigset. You can find out whether a system call is interruptible by looking up its manual page and searching for errno EINTR. If EINTR is listed, the call is interruptible.
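To make the sigaction variant concrete, here is a minimal sketch that installs a SIGIO handler with SA_RESTART set. The handler only counts signals. In a real AIO program it would also call aiowait to dequeue the notification, which is left out so the sketch does not depend on libaio. The names on_sigio, install_sigio_handler, and sigio_seen are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L	/* ask for the POSIX sigaction declarations */
#include <signal.h>
#include <string.h>

/* Number of SIGIO signals caught so far; sig_atomic_t is safe to update in a handler. */
static volatile sig_atomic_t sigio_count = 0;

/* Hypothetical handler: a real AIO program would call aiowait() here as well. */
static void on_sigio(int sig)
{
	(void)sig;
	sigio_count++;
}

/*
 * Install on_sigio for SIGIO. Unlike signal(), the handler stays installed
 * after it runs, and SA_RESTART asks the kernel to restart interrupted
 * system calls instead of failing them with EINTR.
 * Returns 0 on success, -1 on failure (errno set by sigaction).
 */
int install_sigio_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = on_sigio;
	sigemptyset(&sa.sa_mask);	/* block no extra signals while the handler runs */
	sa.sa_flags = SA_RESTART;
	return sigaction(SIGIO, &sa, NULL);
}

/* Accessor for the counter, for code outside the handler. */
int sigio_seen(void)
{
	return (int)sigio_count;
}
```

Because the disposition is not reset on entry to the handler, a second SIGIO is caught just like the first, which is the property that makes sigset and sigaction preferable to signal.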

In addition, each process has a signal mask that defines the set of signals currently blocked from delivery to the process. The signal mask is implemented like a bit map. Every defined signal is represented by one bit. For example, if the SIGIO signal is blocked, its corresponding bit in the signal mask will be set to one. By default, the signal mask contains all zeros. The system calls sighold, sigrelse, sigignore, and sigpause are used to manipulate the signal mask. sighold(int sig) adds a signal to the signal mask. sigrelse(int sig) removes the signal from the mask. sigignore(int sig) sets the signal disposition to SIG_IGN. sigpause(int sig) removes the signal from the mask and suspends the calling process until a signal is received. You must understand this: a blocked signal can be held by the kernel and delivered later. This allows applications to build critical regions without losing any signals.
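The critical-region idea can be sketched as follows. This version uses the POSIX call sigprocmask rather than sighold and sigrelse, an intentional substitution so that the sketch builds on systems that do not declare the older calls by default; SIG_BLOCK plays the role of sighold and SIG_UNBLOCK the role of sigrelse. The function and handler names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L	/* ask for the POSIX signal-mask declarations */
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t sigio_delivered = 0;

static void note_sigio(int sig)
{
	(void)sig;
	sigio_delivered = 1;
}

/*
 * Hypothetical demonstration: raise SIGIO inside a critical region and
 * check that it is held (pending, handler not yet run) until the region
 * ends, then delivered. Returns 1 if that is what happened, 0 if not,
 * and -1 if a setup call failed.
 */
int critical_region_holds_sigio(void)
{
	struct sigaction sa;
	sigset_t block, pending;
	int held;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = note_sigio;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGIO, &sa, NULL) == -1)
		return -1;

	sigemptyset(&block);
	sigaddset(&block, SIGIO);
	if (sigprocmask(SIG_BLOCK, &block, NULL) == -1)	/* like sighold(SIGIO) */
		return -1;

	/* ---- critical region: SIGIO cannot interrupt us here ---- */
	sigio_delivered = 0;
	raise(SIGIO);			/* arrives while blocked */
	sigpending(&pending);
	held = (sigio_delivered == 0) && (sigismember(&pending, SIGIO) == 1);
	/* ---- end of critical region ---- */

	if (sigprocmask(SIG_UNBLOCK, &block, NULL) == -1)	/* like sigrelse(SIGIO) */
		return -1;

	/* The held signal is delivered before sigprocmask() returns. */
	return held && sigio_delivered == 1;
}
```

In an AIO program the critical region would typically be the code that updates the shared result and buffer bookkeeping, so that a completion handler never sees it half-modified.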

Note: In Solaris 2.5, signal, sigset, sighold, sigrelse, sigignore and sigpause are implemented in libc. They are no longer system calls (as they were in earlier Solaris releases) and are implemented as user code by calling the sigaction family of system calls.

Programming AIO
Now we will tie everything we have discussed so far together using code segments as illustrations.

The first consideration in designing AIO applications is to organize the AIO-related data structures. For most AIO applications, these data structures are required to generate multiple AIO requests at a time. For example,

	for (...) {
		aioread(fd, bufp, ..., offset, whence, resultp);
	}

Applications must have a way to select the "correct" bufp, offset, whence and resultp for different iterations. Data structures like arrays, linked lists, or even hash tables can be used to keep track of the information. It is important that your data structures establish an association between the result structure and the buffer structure, because aiowait hands back only the result structure pointer when an AIO operation is complete. You then need to find the corresponding buffer structure. You can organize a simple data structure like

	typedef struct operation {
		aio_result_t	result;		/* AIO result structure */
		char		buffer[BUFFERSIZE];	/* buffer used in aioread/aiowrite */
		int		flag;		/* this flag indicates if this struct is in use */
	} operation;
	operation 		myoperations[NUM_AIO_OPERATIONS];

I defined the result structure as the first item in struct operation because I can use the address of the result structure to locate the buffer structure, like

	((operation *) (resultp))->buffer
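As a minimal sketch of that lookup, the block below declares a local stand-in for aio_result_t (mirroring the definition quoted earlier, so that it compiles without <sys/asynch.h>) and recovers the enclosing struct operation from a result pointer. It uses offsetof instead of a bare cast, a deliberate variation that keeps working even if result is later moved away from the first position. BUFFERSIZE and operation_from_result are assumed names.

```c
#include <stddef.h>

/* Local stand-in for the Solaris type, for illustration only. */
typedef struct aio_result_t {
	int aio_return;
	int aio_errno;
} aio_result_t;

#define BUFFERSIZE 4096		/* assumed buffer size */

typedef struct operation {
	aio_result_t	result;			/* AIO result structure */
	char		buffer[BUFFERSIZE];	/* buffer used in aioread/aiowrite */
	int		flag;			/* nonzero while this struct is in use */
} operation;

/*
 * Hypothetical helper: map the result pointer handed back on completion
 * to the operation that contains it. With result as the first member the
 * subtraction is zero and this is equivalent to a plain cast.
 */
operation *operation_from_result(aio_result_t *resultp)
{
	return (operation *)((char *)resultp - offsetof(operation, result));
}
```

With this helper, the buffer belonging to a completed request is operation_from_result(rp)->buffer.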

This cast is valid only because result is the first member: C guarantees that a pointer to a structure, suitably converted, points to its first member, so the result structure and the struct operation share the same starting address. If you move the result structure elsewhere in the struct, the cast no longer works. You must also manage the current file pointer yourself through offset and whence for sequential asynchronous operations. Asynchronous operations can finish in a different order than they were submitted, so libaio cannot identify the current file pointer in the functions aioread and aiowrite unless you explicitly specify it. For example,

	/* buf and resultp must be distinct for each outstanding request */
	for(offset=0; ... ; offset += 4096) 	/* this implements a sequential asynchronous read */
		aioread(fd, buf, 4096, offset, SEEK_SET, resultp);
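The point about explicit offsets can be tried without libaio. The sketch below uses pread(2), a synchronous call that, like aioread with SEEK_SET, carries its own offset and ignores the current file pointer. It reads the blocks of a temporary file in reverse order to imitate completions arriving out of order, and the data still lands in the right place. The function name and the tiny block size are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L	/* ask for pread() and fileno() declarations */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK 4	/* tiny block size, for illustration only */

/*
 * Hypothetical demonstration: write src (len bytes) to a temporary file,
 * then read it back one block at a time in reverse order. Every read
 * names its own offset, so the order of the reads does not matter.
 * dst must have room for len bytes. Returns 0 on success, -1 on error.
 */
int read_blocks_in_reverse(const char *src, size_t len, char *dst)
{
	FILE *fp = tmpfile();
	long nblk = (long)((len + BLK - 1) / BLK);
	long blk;
	int fd;

	if (fp == NULL)
		return -1;
	if (fwrite(src, 1, len, fp) != len || fflush(fp) != 0) {
		fclose(fp);
		return -1;
	}
	fd = fileno(fp);

	for (blk = nblk - 1; blk >= 0; blk--) {
		size_t start = (size_t)blk * BLK;	/* explicit offset of this block */
		size_t want = (len - start < BLK) ? len - start : BLK;

		if (pread(fd, dst + start, want, (off_t)start) != (ssize_t)want) {
			fclose(fp);
			return -1;
		}
	}
	fclose(fp);
	return 0;
}
```

The same bookkeeping applies to aioread: compute each request's offset up front and store it alongside the request's buffer and result structure.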

The flag defined in the struct operation indicates whether the result and buffer structure are in use. If they are in use, we cannot let other AIO requests use them. When AIO operations are complete, we must return the result and buffer structures to the free pool. These housekeeping tasks are the price we pay for using AIO! Now that we've learned how to organize AIO-related data structures, we will discuss how to handle AIO notifications.
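The free-pool housekeeping might look like the following minimal sketch. The slot type, the pool size, and the function names are all hypothetical. One design choice to note is that slot_acquire returns NULL when every slot is busy instead of spinning, which lets the caller decide to wait for a completion first.

```c
#include <stddef.h>

#define NUM_SLOTS 4	/* assumed pool size; a real program might use MAXASYNCHIO */

enum { SLOT_FREE = 0, SLOT_IN_USE = 1 };

typedef struct slot {
	int	flag;		/* SLOT_FREE or SLOT_IN_USE */
	char	buffer[512];	/* stands in for the result + buffer pair */
} slot;

static slot pool[NUM_SLOTS];	/* zero-initialized, so every slot starts free */

/* Hand out the first free slot and mark it in use, or return NULL if none is free. */
slot *slot_acquire(void)
{
	int i;

	for (i = 0; i < NUM_SLOTS; i++) {
		if (pool[i].flag == SLOT_FREE) {
			pool[i].flag = SLOT_IN_USE;
			return &pool[i];
		}
	}
	return NULL;
}

/* Return a slot to the free pool once its AIO operation has been dequeued. */
void slot_release(slot *s)
{
	if (s != NULL)
		s->flag = SLOT_FREE;
}

/* Count the slots currently handed out. */
int slots_in_use(void)
{
	int i, n = 0;

	for (i = 0; i < NUM_SLOTS; i++)
		if (pool[i].flag == SLOT_IN_USE)
			n++;
	return n;
}
```

A slot should be released only after aiowait has returned its result structure, since the result is considered outstanding until then.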

You have a choice in receiving notifications when an AIO operation is completed -- either synchronously or asynchronously. The decision is usually determined by how urgently your application needs to be informed of AIO operation completion. "Urgent" means your application cannot proceed without knowing the status of the completion. If it is urgent, you want to receive asynchronous notifications. But let's look at a non-urgent example first. Applications implementing database-like transactions can take synchronous notifications when the transaction is committed. By the definition of database transactions, it is not urgent because applications only need to guarantee that all writes are completed at transaction commit.

	while(...)				/* submit all AIO operations */
		aiowrite(...);
	/* user commits the transaction */
	while ((rp = aiowait(NULL)) != (aio_result_t *)-1)	/* block until all AIOs are completed */
		...

The above code is what you will typically write to handle synchronous notifications. Remember, each completion of an aiowrite will generate a notification; therefore, aiowait is placed in a while loop to dequeue all possible notifications in the example. Also, aiowait returns the pointer to the result structure. Good programmers should always check the aio_return to make sure the AIO operation completes successfully.

While synchronous notification is easy to understand, there are times when your application can not block to wait for notifications. There are other high priority tasks to perform. Imagine that a GUI-based application is writing a large file to disk. It displays the dialog box CANCEL and calls aiowrite. If the CANCEL button is pressed before the application receives the notification, it calls the aiocancel to cancel the I/O request. If the CANCEL button is not pressed and the asynchronous notification is received, the signal handler is called to cancel the dialog box.

	/* sigset installs the signal handler for the SIGIO signal and turns on asynchronous notification */
	sigset(SIGIO, cancel_dialog_box);
	...
	aiowrite(...);		/* submit AIO requests */
	...
	display_dialog_box();	/* displays dialog box and monitors the button press */

Even for asynchronous notifications, applications still need to call aiowait (usually in the signal handler) to dequeue the notifications and check for bad aio_return.

	void cancel_dialog_box( int sig)		/* signal handler */
	{	...
		aiowait(...);			/* dequeue notifications */
		destroy_dialog_box_cancel();	/* cancel the dialog box */
	}

On one hand, the AIO API appears simple because there are only four functions and a few related data structures. But on the other hand, there are many associated design and performance issues. Careful planning is required in organizing the buffer and result structures, handling the notifications, and checking for I/O errors and recovering from them. It is no small task. The reward, of course, is applications with improved performance!

Performance
In my AIO programming experience, I have found little or no improvement in processing sequential files under the UFS file system. The Solaris file system efficiently buffers and reads ahead sequential accesses.

For random file access to large files (a gigabyte or more), I commonly see improvements ranging from 20 to 30 percent for I/O intensive applications. Improvements of more than 40 percent are possible when the files are located on NFS-mounted filesystems. Keep in mind that any major improvement over standard read/write system calls requires the use of multi-processor machines, since AIOs are executed via lightweight processes (LWPs).

Since there are so many hardware and software factors that affect AIO performance, it is best to build a prototype to estimate your likely performance gain. At the end of this article, you will find two simple programs I use to benchmark AIO. They aren't perfect, but they offer a simple measure of AIO's benefits on your particular melange of hardware, network, and software.

The benchmarks run a workload to calculate the total checksum of data blocks. Data blocks can be read randomly from a file or raw device (through a special file). Since the calculation of the checksum depends on the content of the data blocks, the calculation can not proceed without the availability of the blocks.

The program aiorandom (the source for which is in the attached sidebar) implements the inputs with AIO, and syiorandom (see the second attached sidebar) implements it with the standard Solaris system call reads. Both programs take the same command options (however, syiorandom does not need as many).

Here's how to use aiorandom and syiorandom:

aiorandom | syiorandom <file name> <buffer size> <file size> <total # of read> <# of asynchronous readahead> <threshold> <difficulty of checksum>

<file name>
Specifies the input file.

<buffer size>
Specifies the size of the data block.

<file size>
Specifies the size of the file. If it is zero, the program will call fstat to find out the size.

<total # of read>
Specifies the number of random reads. If it is zero, the program will set the number to the total number of data blocks in the file using the <buffer size> for calculation.

<# of asynchronous readahead>
Specifies the number of AIO reads submitted (aiorandom only).

<threshold>
Specifies the low watermark for the number of outstanding AIOs. If the current number of outstanding AIO operations is under the water mark, the program will submit more AIO reads.

<difficulty of checksum>
Specifies the workload for implementing checksum. A larger number indicates a more CPU intensive checksum calculation. It is not important to understand how the checksum is calculated. It is simply used to create some CPU workload.

You can vary the options to create different workloads in both I/O and CPU usage.

Benchmark results
To give you an idea of AIO's capabilities, I ran the programs with the following options on a SPARCserver 1000 sporting four CPUs.

<file name=a_UFS_big_file> <buffer size=4,096> <file size=1G> <total # of read=262,144> <# of readahead=30> <threshold=15> <difficulty of checksum=1>

Both programs read 262,144 data blocks. Each block is 4,096 bytes. aiorandom submits a batch of 30 asynchronous reads at a time. Whenever the number of outstanding AIO operations falls below 15, it submits another batch. Results were recorded from the command timex.
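The batching policy is easier to reason about with a small simulation. The sketch below models only the counters, under the simplifying assumption that completions arrive one at a time: an initial batch is submitted, and after each completion a new batch is submitted whenever the outstanding count drops below the threshold. The type and function names are hypothetical, and no I/O is performed.

```c
/* Totals collected by the simulation. */
typedef struct sim_result {
	int batches;		/* number of non-empty batches submitted */
	int submitted;		/* total reads submitted */
	int max_outstanding;	/* peak number of outstanding reads */
} sim_result;

/* Submit up to `readahead` more reads, never exceeding `total` overall. */
static void submit_batch(sim_result *r, int *outstanding, int total, int readahead)
{
	int n = 0;

	while (n < readahead && r->submitted < total) {
		r->submitted++;
		(*outstanding)++;
		n++;
	}
	if (n > 0)
		r->batches++;
	if (*outstanding > r->max_outstanding)
		r->max_outstanding = *outstanding;
}

/*
 * Hypothetical model of the low-watermark policy: each loop iteration
 * stands for one completion being dequeued.
 */
sim_result simulate_watermark(int total, int readahead, int threshold)
{
	sim_result r = { 0, 0, 0 };
	int outstanding = 0;

	submit_batch(&r, &outstanding, total, readahead);
	while (outstanding > 0) {
		outstanding--;			/* one completion dequeued */
		if (outstanding < threshold)	/* fell below the low watermark */
			submit_batch(&r, &outstanding, total, readahead);
	}
	return r;
}
```

Under that one-at-a-time assumption, simulate_watermark(262144, 30, 15) submits 8,739 batches and never has more than 44 reads outstanding, comfortably below the MAXASYNCHIO limit of 200 quoted earlier.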

	aiorandom	syiorandom
real	42:43.92	1:02:48.68
user	 5:35.57	   4:39.38
sys	 5:24.02	   2:41.23

For aiorandom, this is a speed-up of close to 30 percent in real (elapsed) time. Note how aiorandom consumed more user CPU time due to the extra work associated with managing the AIO-related data structures. System CPU time doubled thanks to aiorandom running multiple LWPs. Keep this in mind on busy, multi-user systems since heavy AIO applications will depress the performance of other apps.

I ran the same test mounting NFS files over Ethernet (except <total # of read> was changed to 50,000). The NFS client was a SPARCserver 1000 with two CPUs, while the server was a SPARCserver 1000 with four CPUs. Both machines were connected to the same subnet.

	aiorandom	syiorandom
real	10:12.45	22:04.79
user	 1:16.07	 1:07.76
sys	 2:10.63	 1:24.27

This is a speed-up of more than 50 percent. It shows that AIO performs particularly well with NFS file systems, in comparison to synchronous I/Os. aiorandom took advantage of the Ethernet's bandwidth and the multi-threaded NFS server. The bandwidth was better used by the multiple AIO operations, and higher concurrency was obtained by having the multi-threaded server (the NFS server) serving the multi-threaded client (aiorandom).
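For readers who want to check the percentages, here is a minimal sketch that converts the elapsed times printed by timex into seconds and computes the reduction in real time. The function names are hypothetical; the input strings are the figures from the two tables above.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse an elapsed time of the form "h:mm:ss.ff" or
 * "mm:ss.ff" into seconds. Returns -1.0 if the string matches neither form.
 */
double elapsed_seconds(const char *s)
{
	int h, m;
	double sec;

	if (sscanf(s, "%d:%d:%lf", &h, &m, &sec) == 3)
		return h * 3600.0 + m * 60.0 + sec;
	if (sscanf(s, "%d:%lf", &m, &sec) == 2)
		return m * 60.0 + sec;
	return -1.0;
}

/* Percentage of elapsed time saved by the asynchronous run. */
double percent_saved(double sync_secs, double async_secs)
{
	return (sync_secs - async_secs) / sync_secs * 100.0;
}
```

Feeding in the UFS figures (1:02:48.68 synchronous, 42:43.92 asynchronous) gives a reduction of roughly 32 percent, and the NFS figures (22:04.79 and 10:12.45) give roughly 54 percent.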

MT unsafe and AIO implementation
AIO functions are MT unsafe currently, which means applications calling the functions should not be multi-threaded. An MT-safe version of AIO is in the works.

In the current implementation, AIO functions are implemented through the use of LWP system calls. Using LWPs directly is generally not recommended for applications; applications should use the multi-threaded library. However, libaio is different; it implements system functions. Depending on the number of submitted AIO requests, libaio creates so-called "workers" (implemented as LWPs) to process the I/O requests.

Solaris 2.5 implements kernel-supported asynchronous I/O. The kernel has implemented special, efficient AIO assists for the libaio.

Posix 4 library
The Posix real-time library libposix4 contains Posix functions for asynchronous I/O operations, including aio_read, aio_write, aio_cancel, and aio_return. However, with Solaris 2.X, libposix4 functions are NO-OP and always return ENOSYS. SunSoft says it will provide support for libposix4 in future releases.

Keep libposix4 in mind, but today libaio is the only practical choice.

And in the end
Asynchronous I/O improves performance by, in effect, multi-threading the application. AIO's benefit is that its API is easier to understand and use than the Solaris API for creating threaded apps. Most of the work for creating threads and managing synchronization is handled in the AIO library automatically. If you are thinking about threading an I/O-intensive application, you should first consider retrofitting it to use AIO.



[(c) Copyright  Web Publishing Inc., and IDG Communication company]

URL: http://www.sunworld.com/swol-03-1996/swol-03-aio.html

Source Listing of aiorandom

/* aiorandom.c - randomly perform asynchronous reads */
/* All right reserved by Steve Leung. COPYRIGHT 1995 */

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/asynch.h>
#include <sys/time.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>	/* exit, atoi, lrand48, srand48 */

#define NUM_RESULT	MAXASYNCHIO
#define BUFFERSIZE	(4096)

#define RESULT_OK	(0)
#define RESULT_IN_USE	(1)

#define TRUE	(1)
#define FALSE	(0)

typedef struct result_t {
	aio_result_t	result;
	int		status;
	char		buf[BUFFERSIZE];
} result_t;

result_t	results[NUM_RESULT];
int 		num_outstanding = 0;
int		filesize;
int		numfileblock;
int		numread = 0;
int		totnumread;
int		blksize;

void
init_results()
{
	int i;
	for (i = 0; i < MAXASYNCHIO; i++)
		results[i].status = RESULT_OK;
} /* init_results */

unsigned char
compute_checksum(char *bufp, const int blksz, unsigned char checksum)
{
	int i;
	unsigned char *cp;
	for (i = 0, cp = (unsigned char *)bufp; i < blksz; cp++, i++) 
		checksum ^= (unsigned int) *cp;	
	return(checksum);
} /* compute_checksum */

unsigned char 
compute_many_checksum(int howmany, char *buf, const int blksz)
{
	int i;
	unsigned char checksum = 0;
	/*
	 * This for-loop is the same as howmany times compute_checksum(...)
	 * but I intentionally want to use more CPU.
	 */
	for(i=0; i< howmany; i++) 
		checksum = compute_checksum(buf, blksz, checksum);
	return(checksum);
} /* compute_many_checksum */
		
result_t *
nextresult()
{
	static int idx = -1;
	
	while (TRUE) {
		idx = (idx + 1) % NUM_RESULT;	/* advance without modifying idx twice in one expression */
		if (results[idx].status == RESULT_OK) {
			results[idx].status = RESULT_IN_USE;
			return( &(results[idx]) );
		}
	}

} /* nextresult */

void
aio_read_many(int fd, int num_readahead)
{
	result_t *rp;
	int i;
	int offset;
	long r;

	for (i=0; i< num_readahead; i++) {
		if (numread >= totnumread)
			return;
		offset = ((r=lrand48()) % numfileblock) * blksize;
		if (offset > filesize) {
			fprintf(stderr, "#### fatal error \n");
			break;
		}
		rp = nextresult();
		if (aioread(fd, rp->buf, blksize, offset, SEEK_SET, &(rp->result)) ==
			-1) {
			perror("aioread");
		}
		else {
			numread++;
			num_outstanding++;
		}
	}

	return;

} /* aio_read_many */

int
main(int argc, char ** argv)
{
	int fd;
	aio_result_t *rp;
	int num_readahead;
	int threshold;
	int num_checksum;
	unsigned char total_checksum = 0;
	struct stat statbuf;

	if (argc != 8) {
		fprintf(stderr, "Usage: %s <file name> \
<buffer size> <file size> <total # of read> \
<# of read ahead> <threshold> <number of checksum>\n", argv[0]);
		exit(1);
	}
	blksize = atoi(argv[2]);
	filesize = atoi(argv[3]);
	totnumread = atoi(argv[4]);
	num_readahead = atoi(argv[5]);
	threshold = atoi(argv[6]);
	num_checksum = atoi(argv[7]);

	srand48(11);
	init_results();

	if ((fd = open(argv[1], O_RDWR)) == -1) {
		perror("open");
		exit(1);
	}

	if (filesize <= 0) {
		fstat(fd, &statbuf);
		filesize = statbuf.st_size;
	}
	numfileblock = filesize / blksize;
	if (filesize % blksize)
		numfileblock++;
	if (totnumread <= 0) {
		totnumread = numfileblock;
	}

	printf("block size=%d filesize= %d numfileblock= %d\n", 
		blksize, filesize, numfileblock);
	printf("total_#_of_read = %d read_ahead=%d threshold= %d #_of_checksum= %d\n",
		totnumread, num_readahead, threshold, num_checksum);

	aio_read_many(fd, num_readahead);
	while( (rp=aiowait(NULL)) != (aio_result_t *)-1) {	/* wait for aio to complete */
		if (--num_outstanding < threshold) {
			aio_read_many(fd, num_readahead);
		}	
		if (rp->aio_return <= 0) {
			/* ignore EOF and IO errors, but release the slot for reuse */
			((result_t *) rp)->status = RESULT_OK;
			continue;
		}
		
		total_checksum ^= compute_many_checksum(num_checksum, 
			((result_t *) rp)->buf, rp->aio_return);
		((result_t *) rp)->status = RESULT_OK;
	}

	printf("AIO: total numread= %d checksum = %d = 0x%x\n", numread,
		(int)total_checksum, (int) total_checksum);
	exit(0);

} /* main */

Source Listing of syiorandom

/* syiorandom.c - randomly perform synchronous read's */
/* All right reserved by Steve Leung - COPYRIGHT 1995 */

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/asynch.h>
#include <sys/time.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>	/* exit, atoi, lrand48, srand48 */

#define BUFFERSIZE	(4096)

#define TRUE	(1)
#define FALSE	(0)

int 		num_outstanding = 0;
int		filesize;
int		numfileblock;
int		numread = 0;
int		totnumread;
int		blksize;

unsigned char
compute_checksum(char *bufp, const int blksz, unsigned char checksum)
{
	int i;
	unsigned char *cp;
	for (i = 0, cp = (unsigned char *)bufp; i < blksz; cp++, i++) 
		checksum ^= (unsigned int) *cp;	
	return(checksum);
} /* compute_checksum */

unsigned char 
compute_many_checksum(int howmany, char *buf, const int blksz)
{
	int i;
	unsigned char checksum = 0;
	/*
	 * This for-loop is the same as howmany times compute_checksum(...)
	 * but I intentionally want to use more CPU.
	 */
	for(i=0; i< howmany; i++) 
		checksum = compute_checksum(buf, blksz, checksum);
	return(checksum);
} /* compute_many_checksum */
		
int
main(int argc, char ** argv)
{
	int fd;
	aio_result_t *rp;
	int num_readahead;
	int threshold;
	int num_checksum;
	unsigned char total_checksum = 0;
	struct stat statbuf;
	char buffer[BUFFERSIZE];
	int rc;
	int offset;

	if (argc != 8) {
		fprintf(stderr, "Usage: %s <file name> \
<buffer size> <file size> <total # of read> \
<# of read ahead> <threshold> <number of checksum>\n", argv[0]);
		exit(1);
	}
	blksize = atoi(argv[2]);
	filesize = atoi(argv[3]);
	totnumread = atoi(argv[4]);
	num_readahead = atoi(argv[5]);
	threshold = atoi(argv[6]);
	num_checksum = atoi(argv[7]);

	srand48(11);

	if ((fd = open(argv[1], O_RDWR)) == -1) {
		perror("open");
		exit(1);
	}

	if (filesize <= 0) {
		fstat(fd, &statbuf);
		filesize = statbuf.st_size;
	}
	numfileblock = filesize / blksize;
	if (filesize % blksize)
		numfileblock++;
	if (totnumread <= 0) {
		totnumread = numfileblock;
	}

	printf("block size=%d filesize= %d numfileblock= %d\n", 
		blksize, filesize, numfileblock);
	printf("total_#_of_read = %d read_ahead=%d threshold= %d #_of_checksum= %d\n",
		totnumread, num_readahead, threshold, num_checksum);

	while (numread < totnumread) {
		offset = (lrand48() % numfileblock) * blksize;
		if (offset > filesize) {
			fprintf(stderr, "#### fatal error \n");
			break;
		}
		if (lseek(fd, offset, SEEK_SET) == -1) {
			perror("lseek");
			continue;
		}
		if ((rc=read(fd, buffer, blksize)) > 0) {
			total_checksum ^= compute_many_checksum(num_checksum, 
				buffer, rc);
		} else if (rc == -1) {
			perror("read");
			continue;
		}
		numread++;
	}

	printf("SYIO: total numread = %d checksum = %d = 0x%x\n", numread,
		(int)total_checksum, (int) total_checksum);
	exit(0);

} /* main */

An Overview of Programming Asynchronous I/O in Solaris, January 10, 1996
All rights reserved by Steve Leung. COPYRIGHT 1996

About the author
Steve Leung (steve.leung@sunworld.com) is a senior software engineer at Amdahl Corp., Sunnyvale, CA. He has worked on the Amdahl A+ edition Solaris and UTS (Amdahl's Unix SVR4 on S/390 mainframes) projects. Recently he started developing software for Windows NT.