Originally published in the April 1995 issue of Advanced Systems.

Technology

Beyond client/server

Strategies for managing and distributing data transcend the underlying architecture.

By Hal Stern

Client/server computing is very popular these days; it's a concept familiar even to nontechnical people, who are often caught up in the excitement and apprehension of their companies' IT moving to client/server. What is often missing, particularly in all the media hype, is any indication of the real goals of that migration: Client/server isn't a place or a thing. Client/server is not a noun; it's an adjective. It describes an application design methodology and mechanism for data distribution.

Keep in mind, too, client/server is just another wave in a series of computing paradigms. Publish-and-subscribe data delivery, peer-to-peer computing, broadcast-oriented data distribution, and asynchronous messaging all may be worthy competition to the client/server model. Regardless of the computing paradigm, the higher goal that ultimately determines your computing systems' success is delivering data to users when and where they need it. If users aren't getting the right information in a timely fashion so they can do their jobs (let alone in faster and more-efficient ways), elegance in the underlying computing infrastructure just doesn't matter.

We'll look at computing architectures from the data-management perspective, focusing on meeting and maintaining response-time goals. Starting with a quick review of client/server designs, we outline some network capacity planning ideas to control latency. A discussion of instrumentation and management of applications will lead us into techniques for optimized I/O operations. We conclude with a closer look at the mechanics of data caching, completing a focus on the data, rather than the code-distribution, aspects of client/server computing. Our goal is to raise questions about where your data live on your network, how they get there, and how you might shuffle them to improve performance.

Middle-child syndrome
Most current client/server designs spread an application over two or, more recently, three tiers (see diagram Client/server architectures). The traditional two-tier design works well with the garden-variety client/server application that has a simple, forms-based front end talking to a relational database at the other end. Three-tier designs make sense when you need a middle layer to exert control over data flow between user desktops and back-end data servers, or when a local data store is needed to supplement a single, heavily loaded central server. In particular, from a data-management perspective, don't ignore that middle tier like a middle child caught between the lovable, familiar desktop on one side and the older, larger server on the other: Before repressing it into an adjunct role in the design, look at the ways in which data, not requests, flow in your application architecture.

Reducing demands on the infrastructure should be the key goal of your analysis. If your user load doubles, will your network load double as well? Will widespread use of this year's killer application mean your users will send 20 percent more requests to the server and retrieve 20 percent more data on each query? Will that make your users 20 percent more productive, or will they experience a 20 percent decrease in response time? Any of these events might send you past the knee of a response-time curve by hitting a bottleneck in server or network capacity. Middle-tier servers can ease the bottlenecks or at least provide probe points where you can monitor data transfers and proactively manage your way through them.

For instance, when developing a new application for distribution across your computing infrastructure, project how many distinct data items will be retrieved by users. How long will those data remain valid? Will desktops share portions of the information, either amongst themselves or through a central server, or will each desktop get an independent dataset? Once you've sliced your application and apportioned it to the various client/server hosts, slice your data up as well and assign them homes on the network. Various data-distribution paradigms will fall out of these analyses. For example, you may find that the middle tier acts merely as a front-end, data-caching server for the third tier, with no application code other than a cache manager and request forwarder. Even in this limited role, the middle tier reduces demand on the network, making it critical for a viable client/server architecture.

Perhaps the greatest benefit from taking a data-oriented view of applications is merging the time domain into the business logic and transaction-oriented domains. Unless they are writing a control system or a device driver where specific physical events determine the flow and pace of the software, developers usually don't think about the timing dynamics of applications. Rather, time is usually considered a dependent variable based on code efficiency and external conditions like I/O contention. Relegating performance below the major design considerations yields serious performance problems further down the development road.

The time domain is where system responsiveness is measured, making it the ultimate metric for gauging a system's success or failure -- client/server or not. The trick to making your systems and software responsive to the data needs of users is to stretch, fold, and overlay your data movement and code blocks so they fit the constraints imposed by response-time goals. Take those business logic and transaction models and pack them into bins that mark off reasonable time boundaries. If you get the feeling you're packing a lot of network traffic and local or server processing into a very short time, be assured your users, too, will notice the system's poor performance. A first step is to get a feel for the relative sizes of the boxes representing each processing and network delay in a typical transaction.

Path less traveled
Several industry consultants have issued reports claiming that distributed computing is more expensive than terminal/host computing. When the cost of data movement is ignored in the migration from monolithic to distributed application architectures, this prediction frequently proves true. We won't ignore that all-important movement of data; our analysis starts with a look at end-to-end data flow.

Each step in executing a transaction consists of some local processing, network latency, server processing, and the return-trip network latency. If you measure the start-to-finish time for each server and client code fragment, you can construct a rough timing picture of host processing and network delays. It's only fair that the client/server network should be a primary suspect when sleuthing missing performance, since "the network is the computer." When network latency exceeds what you'd expect just to move the data, look at the trade-offs between cost, bandwidth, and network complexity (see the sidebar Transaction history 101).

Spending more money on additional bandwidth may solve the problem. Then again, it may obliterate any processing cost savings first realized in a distributed-computing design. Distinguish tactical expenditures that rectify short-term problems from strategic investments that improve productivity.

Buying more bandwidth through additional networks, new routers, or faster network media may be the simplest, but most costly, solution. The major drawback to buying your way out of a performance problem is that another increase in demand requires another purchase. Eventually, you won't be able to add more wires or spend more money; you'll need to come up with a creative, scalable network architecture that eliminates some common client/server architectural bottlenecks. Reducing demands on the network will help, as will streamlining the data flow through the most congested points in the network topology.

One common misconception you need to disregard is that of symmetric network paths. Requests and responses usually trace the same network route between client and server, but this need not be the case if responses are clogging up the path for incoming requests. For example, the large output returned to the marketroids running decision-support queries will introduce network congestion (and probably increased latency), putting your time-sensitive on-line work at risk of missing performance goals; put the on-line transaction-processing workload on one network and the marketing department's decision-support application on another path to the server. You might even want separate request and reply networks, providing alternate "input channels" for workloads that are update- or write-intensive. Splitting your traffic across networks requires mind-numbing thought about IP addresses, host names, and routing, but don't put it off: The clean separation of traffic it produces can dramatically reduce network latency.

If network traffic segregation is appealing, how about streamlining the load by establishing classes of service? Despite its proprietary nature, IBM's SNA networking has robust class-of-service attributes that let you prioritize traffic, something that is notoriously hard to do in TCP/IP without application-level help. Fortunately, three-tier client/server applications can use the server in the middle as a data staging area or request router. Have the intermediate server sort queries and send them out of a priority queue, for example, or have it delay database activity that is expected to produce large quantities of output. The additional processing overhead must be small compared with the total transaction time, so it's most beneficial with large transaction volumes or long-lived transactions.

Consider the middle tier of a three-tiered client/server architecture as a transaction router; query prioritization then amounts to nothing more than making that router smarter. Some TCP/IP routers support class of service based on port number, and most transaction-processing monitors, such as Tuxedo, Encina, and TopEnd, support priority-based queues. Embed the streamlining features into your transaction forwarder on the middle tier so it becomes an electronic traffic cop.
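
Here's a minimal sketch of such a traffic cop, assuming the middle tier tags each incoming request with a priority class when it is parsed; the request structure, the class names, and the surrounding server loop are illustrative, not the API of any particular TP monitor:

    /* Priority-based request dispatcher sketch for a middle-tier router.
     * Lower class number means higher priority; one FIFO per class.
     * struct request and the class names are illustrative assumptions. */
    #include <stdlib.h>

    enum req_class { RQ_OLTP, RQ_REPORT, RQ_BULK, RQ_NCLASSES };

    struct request {
        enum req_class class;      /* assigned when the request is parsed */
        void *payload;             /* opaque query or transaction body    */
        struct request *next;
    };

    static struct request *head[RQ_NCLASSES], *tail[RQ_NCLASSES];

    void enqueue(struct request *r)
    {
        r->next = NULL;
        if (tail[r->class])
            tail[r->class]->next = r;
        else
            head[r->class] = r;
        tail[r->class] = r;
    }

    /* pull the highest-priority pending request, or NULL if none queued */
    struct request *dequeue(void)
    {
        int c;
        for (c = 0; c < RQ_NCLASSES; c++) {
            if (head[c] != NULL) {
                struct request *r = head[c];
                head[c] = r->next;
                if (head[c] == NULL)
                    tail[c] = NULL;
                return r;
            }
        }
        return NULL;
    }

A dispatcher this strict can starve the bulk queue under heavy on-line load, so a production router would add aging or weighted selection; the point is only that the forwarding loop, not the back end, decides what runs next.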

Stick a fork in it
Not all data distribution problems are solved by adjusting the supply side of the equation. Sometimes you need to reduce demand, particularly when you strain the limits of the physical network media or your budget. Before you assault your network management people, it's good practice to be sure response-time problems aren't buried in your application code. Establishing a service-level agreement with them regarding network bandwidth and latency doesn't help one bit if your code is stuck in a loop or if 95 percent of an application's processing time is spent locally. You need to watch your code running at full speed, interacting with the network and server components, to see where the biggest clumps of execution time get chewed up.

The obvious solution is to rebuild your application with code profiling enabled (normally via the -p or -pg option to your compiler), which gives you a detailed map of routine calls and elapsed times. Unfortunately, code profiling slows down your application, so it's more difficult to run it with the other networked components. You may do better to insert the timing equivalent of printf() statements, but the problem with that approach is identifying where in the application to stick your tuning forks.

Gather performance data from your application by noting elapsed times through the most tortuous code passages in both client and server components. Other key instrumentation points include the beginning and end of each transaction type, the interval between open() and close() operations on a file, or any other long-lived activity, such as drawing, parsing input from a file or output from a database query, and building data structures.
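
As a sketch of one such tuning fork -- the helper names and the log format here are our own, not a standard toolkit -- wrap a suspect passage in a pair of calls and record the elapsed wall-clock time:

    /* "Tuning fork" timing sketch: wrap a suspect code passage with
     * tf_start() and tf_stop() and log elapsed wall-clock time.
     * The helper names and log format are illustrative assumptions. */
    #include <stdio.h>
    #include <sys/time.h>

    static struct timeval tf_t0;

    void tf_start(void)
    {
        gettimeofday(&tf_t0, NULL);
    }

    void tf_stop(const char *label)
    {
        struct timeval t1;
        double elapsed_ms;

        gettimeofday(&t1, NULL);
        elapsed_ms = (t1.tv_sec  - tf_t0.tv_sec)  * 1000.0 +
                     (t1.tv_usec - tf_t0.tv_usec) / 1000.0;
        fprintf(stderr, "TIMING %s %.3f ms\n", label, elapsed_ms);
    }

    /* usage:
     *     tf_start();
     *     rows = lookup_customer(db, key);    (suspect passage)
     *     tf_stop("customer lookup");
     */

Logging to stderr keeps the instrumentation out of the application's normal output; the single static start time keeps the sketch short, though real code would want one per thread or per nesting level.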

Once you find the section of code that contributes the most to the response-time problem, refine your instrumentation points to examine the implementation cost of code that is supposed to be a "black box." When you're not supposed to see the inner workings of a library, it's frequently because the view would make you queasy. A database-access library may create a large temporary file, or what appears to be a simple database table lookup may actually be a multitable join that sends your back-end server into a tailspin. Bad system programming habits, such as excessive file locks or inefficient polling mechanisms, appear when you lift the veil of a commonly used library and question the prosaic beauty of the code beneath.

Multistop surf shop
Let's say one of your longest code paths takes you through six or seven network I/O operations, such as waiting for data from servers. How do you fold, bend, and overlap these operations so they don't have to be completed consecutively? Speed up I/O-intensive code paths with multithreading: use user-level threads, asynchronous I/O, or multiple processes when the total I/O service time is long enough to justify the overhead of coordinating I/O completion between processes via semaphores. Not all servers or devices respond at the same time, so multithreading lets the client start post-processing requests and drawing the screen as soon as the first network response returns. If your I/O requests must be serialized -- for example, the input for the second request is in the result of the first request -- multithreading won't help. But if you don't need all of the data items at once, scattering the I/O operations into parallel processes and gathering results on the fly reduces I/O wait time. Check out the Solaris aioread() function, or a POSIX-compliant threads library that lets each thread issue a blocking system call, to implement folded I/O strategies.
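
Here's a minimal sketch of the threads approach using POSIX threads; fetch_from_server() and the request structure are assumptions standing in for your own blocking network call, and error handling is omitted:

    /* Folded I/O sketch: each thread issues its own blocking fetch so the
     * network waits overlap instead of adding up end to end.
     * fetch_from_server() and struct fetch_req are illustrative assumptions. */
    #include <pthread.h>

    struct fetch_req {
        int  server_id;           /* which back-end server to ask */
        int  item_id;             /* which data item to retrieve  */
        char buf[8192];           /* result lands here            */
        int  nbytes;              /* how much came back           */
    };

    extern int fetch_from_server(int server_id, int item_id,
                                 char *buf, int buflen);   /* blocking call */

    static void *worker(void *arg)
    {
        struct fetch_req *r = arg;
        r->nbytes = fetch_from_server(r->server_id, r->item_id,
                                      r->buf, sizeof r->buf);
        return NULL;
    }

    void fetch_all(struct fetch_req req[], pthread_t tid[], int n)
    {
        int i;
        for (i = 0; i < n; i++)                /* scatter the requests */
            pthread_create(&tid[i], NULL, worker, &req[i]);
        for (i = 0; i < n; i++)                /* gather the results   */
            pthread_join(tid[i], NULL);
    }

The total wait is now roughly that of the slowest response rather than the sum of all six or seven; to begin drawing the screen as soon as the first reply arrives, have each worker post its result to a queue instead of waiting for all the joins.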

Another way to relieve data-distribution demand is to deliver only what the data users need to proceed to the next step, not the whole enchilada that might be required for extensive processing. Watching data-usage patterns, for instance, you might find users initially browse records, looking at only one or two fields (customer name and balance) representing 10 percent or less of the whole record. Cut down on your data-delivery problems and improve response times by giving them just enough data to make a selection, not the whole record.

Another tactic that can improve responsiveness is to prefetch data. The most common example is the way NFS and some other distributed-filesystem implementations retrieve pages of sequentially accessed files before the reading process actually asks for them. Prefetching lets you get more data in a single request, amortizing the cost of a server access over more bytes. If your network latency is high due to contention, try to get as much data as you can in each round trip to the server. Prefetch is ideal for imaging applications, ordered searches, and decision support applications where sequential access is the rule of the game.
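
A rough sketch of the idea at the application level, assuming fixed-size records and a fetch_records() call that retrieves a contiguous run of them in one round trip (both are illustrative assumptions):

    /* Read-ahead sketch: fetch PREFETCH records per server round trip and
     * satisfy later sequential requests from the local buffer.
     * fetch_records() and the fixed record size are illustrative assumptions. */
    #define RECSIZE   512
    #define PREFETCH  32                /* records fetched per round trip */

    extern int fetch_records(int first, int count, char *buf);  /* blocking */

    static char cache_buf[PREFETCH * RECSIZE];
    static int  cache_first = -1;       /* first record number in cache_buf */
    static int  cache_count = 0;

    /* return record 'recno', hitting the server only when the record falls
     * outside the currently buffered range */
    const char *get_record(int recno)
    {
        if (recno < cache_first || recno >= cache_first + cache_count) {
            cache_count = fetch_records(recno, PREFETCH, cache_buf);
            cache_first = recno;
            if (cache_count <= 0)
                return NULL;            /* past end of data or server error */
        }
        return cache_buf + (recno - cache_first) * RECSIZE;
    }

For sequential scans this turns 32 round trips into one; for random access it buys you nothing, which is why you measure the access pattern before choosing the prefetch depth.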

Caching out
Adding or improving the use of data caches throughout your architecture will also cut down on demand for server processing and network bandwidth at the expense of more complex code. Aggressive cache designs require that you know something about how your application uses data, which means more instrumentation and analysis.

If you think our emphasis on data caching and storage is a bit retro-minded, you've discovered our underlying theme. Let's get truly reactionary and think about the data-access problem in mainframe terms: Getting bytes off a disk (direct-access storage device, or DASD in mainframe-speak) means building a path over a channel to the device and copying the data into some processor-controlled storage. In a distributed model, the channel consists of one or more networks with a server host and operating system sitting in front of the storage devices. Because you can't control the population of devices on the virtual channel, nor can you get its full bandwidth whenever you want it, you need to worry about the location of your disk devices on the network.

To get more "bandwidth" for data, replication comes to the rescue, albeit with a high price tag. The classic example is replicating a commonly used tools directory, but replication works just as well for databases. Replicate all of your data, or just summary information for decision support, or use replicas to put different workloads like long-running queries and on-line updates onto different machines. You'll need hard facts on data consumption to determine the utility of replication, and you'll have to shoulder a large administrative cost making sure replicas are up to date, synchronized, and available. (For an excellent overview of replication strategies, see "Unix RDBMS: The next generation" by Bill Rosenblatt, February 1994.)

A main point with regard to user response times is to replicate only those items you need and cache them locally or on a mid-tier server. One of the best ways to work around a constrained network is to cache those items that are least likely to change or that change slowly with respect to the request rates. Client-side caching is at the heart of realistic NFS and DCE DFS performance, since it avoids having to complete a network transaction for each reference to a file or filesystem data structure.

And be very careful: Accuracy of data in your local cache and the penalty for using stale data are completely application dependent. Typical cache-consistency schemes -- time-based expiration of cached entries, explicit invalidation or callback messages from the server when data change, and validation checks against the server on each reference -- can help you live within that restriction.

There's a point of diminishing returns for larger caches and more aggressive consistency protocols, so you need to experiment to find out what works best for each application.
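
A minimal sketch of the simplest of those schemes -- time-based expiration, in the spirit of NFS attribute caching -- follows; the table shape, the 30-second lifetime, and the fetch_value() server call are illustrative assumptions:

    /* Time-to-live client cache sketch: serve repeat lookups locally until
     * the entry expires, then refresh from the server.  The table, key
     * space, and fetch_value() call are illustrative assumptions. */
    #include <time.h>

    #define NSLOTS  256
    #define TTL_SEC 30              /* how long a cached entry stays valid */

    struct cache_entry {
        int    key;
        char   value[256];
        time_t fetched;             /* 0 means the slot is empty */
    };

    extern int fetch_value(int key, char *buf, int buflen);   /* server call */

    static struct cache_entry table[NSLOTS];

    const char *cache_lookup(int key)
    {
        struct cache_entry *e = &table[key % NSLOTS];
        time_t now = time(NULL);

        /* refresh on a miss, a collision, or an expired entry */
        if (e->fetched == 0 || e->key != key || now - e->fetched > TTL_SEC) {
            if (fetch_value(key, e->value, sizeof e->value) < 0)
                return NULL;        /* server unreachable or key unknown */
            e->key = key;
            e->fetched = now;
        }
        return e->value;
    }

Shrinking TTL_SEC tightens consistency at the cost of more server traffic, which is exactly the diminishing-returns trade-off described above.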

Client/server endgame
The endgame for client/server computing might be what Scott McNealy, CEO of Sun Microsystems, calls "peer computing" wherein every host is both a server and a client. Data storage, location, and migration issues explode again when your prototypic clients and servers switch roles; data ownership gets about as distributed as it can. The Object Management Group's object-request broker specifications and strategies or message-based data-delivery schemes, such as Teknekron's Information Bus, may hold the key to the post-client/server design school.

However, one design requirement is universal: An enterprisewide evaluation of where data live, how they move from place to place, and how they are used is necessary for any advanced computing environment to become optimally functional. Such an analysis is a confluence of the centralized control of a data center and the local variations possible in a highly distributed world. Central control gives you corporate standards for quality and data durability, while local implementation dictates which caching and fetch mechanisms you might use to gain the performance edge. Local choice is handled in a purely federated fashion: There are no globally right or wrong decisions, only those that work for each application. Drawing the line that separates the federalist powers affecting your data is the first step in building a practical, distributed data center. The more widely distributed and closely managed your data are, the further you move beyond the client/server, code-oriented world.

About the author
Hal Stern is a Distinguished Systems Engineer with Sun Microsystems, where he specializes in networking, performance tuning, and kernel hacking. He can be reached at hal.stern@advanced.com.



Transaction history 101

Identifying time components in an end-to-end transaction is not a trivial task. Because it's difficult to get the whole enterprise on a synchronized time base, use differences (deltas) instead of absolute values to measure elapsed time for each component. Have each machine calculate its processing time and record it in the transaction reply. When the client subtracts server processing time from its total execution time, it's left with the round-trip network latency.
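
A minimal sketch of that subtraction follows; the reply structure carrying the server's self-reported processing time and the send_request() call are illustrative assumptions:

    /* Delta-timing sketch: the server reports its own processing time in
     * the reply; the client subtracts it from total elapsed time to get
     * the combined round-trip network latency.  struct reply and
     * send_request() are illustrative assumptions. */
    #include <stdio.h>
    #include <sys/time.h>

    struct reply {
        double server_ms;            /* server-measured processing time */
        /* ... application payload ... */
    };

    extern int send_request(const void *req, struct reply *rep);  /* blocking */

    double measure_latency(const void *req, struct reply *rep)
    {
        struct timeval t0, t1;
        double total_ms, network_ms;

        gettimeofday(&t0, NULL);
        send_request(req, rep);                  /* full round trip */
        gettimeofday(&t1, NULL);

        total_ms   = (t1.tv_sec  - t0.tv_sec)  * 1000.0 +
                     (t1.tv_usec - t0.tv_usec) / 1000.0;
        network_ms = total_ms - rep->server_ms;  /* both directions lumped */

        printf("total %.2f ms, server %.2f ms, network %.2f ms\n",
               total_ms, rep->server_ms, network_ms);
        return network_ms;
    }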

The only downside to this simple model is that it lumps the transmitting and receiving network latencies into a single measure, while they tend to be markedly different for short requests versus long replies on lightly loaded networks. Here's a simple formula for prorating network latencies:
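
One simple proration, offered here as an illustration rather than a universal rule, weights each direction by the number of bytes it carries:

    request-path latency = network latency x request bytes / (request bytes + reply bytes)
    reply-path latency   = network latency - request-path latency

A 200-byte query that returns a 20-kilobyte result would thus charge about 1 percent of the measured network latency to the outbound trip and the rest to the reply.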

Build up averages for each processing step over many transactions. Performance measurements can be recorded in a file, in memory (for quick access), or in an SNMP MIB for analysis with your favorite network management tool. Once you see how time flies while your users are having fun staring at hourglass cursors, it's time to reduce the largest components.



[Copyright 1995 Web Publishing Inc.]
