Web routing provides Net traffic relief
How do components of Web routing -- server redirection, load balancing, and caching -- work to speed network connections?
The Web is both the greatest wonder of the Internet and the biggest source of network congestion. But there's hope. By routing network traffic based on the needs of an application, we can greatly reduce bottlenecks. This month, Rawn describes how proper Web routing can help to alleviate bandwidth problems. (2,500 words)
What is the cause of all this traffic? A seemingly innocent network application called the World Wide Web. It is a golden, bejeweled snake of information that spans the globe, but a snake nonetheless. While users glorify its benefits, it's the information managers and administrators that have to suffer its wrath on their networks.
The job is to charm the snake and make it dance as you want it to. To learn this trick, you must first know the animal, how it behaves, what it likes, and what it is willing to do. Then, you must find the right instruments to mellow the monster into a more ordered, responsive creature. I describe this as Web routing.
Making the Web go
Routing for the Web is the process of directing Web traffic efficiently and appropriately to Web sites, be they your own or those your users visit. Web traffic now consumes most of the bandwidth on the Internet, and routing it well has become a major area of research and development. With the move towards Web-based applications, the launch of application service provider companies, and the growth of better Web technologies like XML, we can be assured that the Web will remain a leading consumer of Internet bandwidth.
Web routing is partially independent of IP packet routing, but still follows in its footsteps -- Web protocols can go only where IP routes do. The primary protocol for the Web, the Hypertext Transfer Protocol (HTTP), still exists in a somewhat primitive form, barely at version 1.1 even years after its introduction. HTTP, an application layer protocol, runs on top of the Transmission Control Protocol (TCP), a transport layer protocol, which in turn runs on top of the Internet Protocol. As the transport protocol, TCP provides a steady stream connection between a Web browser client and a Web server to transfer the contents of a Web page in HTML, DHTML, XML, or other forms. Thus, HTML content flows through an HTTP session running within a TCP session, which is itself delivered as IP packets.
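A minimal sketch makes the layering concrete: an HTTP request is just text written into a TCP connection, which IP delivers as packets underneath. The code below builds such a request by hand; the host name is a placeholder, and this is an illustration of the protocol stack rather than how a real browser is implemented.

```python
import socket

def build_request(host, path="/"):
    # HTTP (application layer): a plain-text request with a Host header
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )

def fetch_status_line(host, path="/"):
    # TCP (transport layer) stream connection; IP carries it below
    with socket.create_connection((host, 80), timeout=5) as sock:
        sock.sendall(build_request(host, path).encode("ascii"))
        response = b""
        while chunk := sock.recv(4096):
            response += chunk
    # The first line of the reply is the HTTP status line
    return response.split(b"\r\n", 1)[0].decode("ascii")
```

Everything a Web routing device does happens somewhere along this path: at the HTTP text, at the TCP connection, or at the IP packets.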
Technically speaking, Web routing is the process of directing HTTP traffic, from a locus on the network path between the browser and the Web server, so that the HTTP request can be fulfilled in a more efficient, predictable, or organized manner. Web routing is independent of the request generation by clients and the Web page processing at a server, but may decide upon a route based on the load on clients, servers, or their networks.
Here are the basic approaches to Web routing:

- Server redirection
- Localized load balancing
- Distributed load balancing
- Web caching
- Policy-based routing
Server redirection
Ignoring the lower protocols for now, HTTP provides a basic method to redirect traffic to another Web URL. When a client connects to a server and requests a page, the server may look at the URL request and determine that the page is either no longer on the server, or that it has been redirected to a different URL. This server-end configuration causes the Web server to send a redirect message back to the client, along with a new URL to seek. The Web browser interprets this message and goes to the new URL indicated.
This is the first form of Web routing. By redirecting traffic to other servers, Web content can be distributed across many different servers and sites. If the Web server were smart enough to make this decision based on its load and that of its peer Web servers, it could make intelligent choices about traffic redistribution. However, the primary role of a Web server is to serve pages, and not to determine how to distribute Web traffic. The latter is a role better left to a separate system that can be optimized for routing.
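A server-end redirect of this kind can be sketched with Python's standard http.server module. The old and new paths (/old, /new) are hypothetical placeholders for a page that has moved.

```python
from http.server import BaseHTTPRequestHandler

REDIRECTS = {"/old": "/new"}  # server-end table of moved pages

class RedirectingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in REDIRECTS:
            # 301 tells the browser the page has moved; the Location
            # header carries the new URL for it to seek.
            self.send_response(301)
            self.send_header("Location", REDIRECTS[self.path])
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b"<html><body>hello</body></html>")
```

The browser handles the rest: on seeing the 301 status and Location header, it issues a fresh request to the new URL without any user intervention.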
Localized load balancing
Another aspect of Web routing occurs when companies establish a cluster of Web servers -- a Web server farm -- to handle large amounts of requests. These servers share the workload of processing Web requests through different load balancing mechanisms.
Load balancing of Web servers was the subject of my October 1998 column, "Clustering, Part 3: On the node," and is also discussed in even greater detail in my chapter in High Performance Cluster Computing (Prentice Hall, 1999). To avoid redundancy (no pun intended), I'll refer you back to that article or the book for details on how Web load balancing works.
The job of the load balancing device is to determine which server has the least load and to send the next incoming request to that server. It does this by looking at either the transport layer (TCP) headers or the application layer (HTTP) headers and rewriting them appropriately to make it appear that the packets have been sent directly to the server. The load balancer needs to keep track of the different sessions associated with these packets so that a client continues to interact with the same server while its request is being processed. Because several separate HTTP sessions often make up a single Web session, it is important to keep each client paired with the same server.
In this situation the routing occurs when the load balancer determines which server should receive new requests. This is usually based on a combination of past history of communications from the client, the processing load on the server, and the network load on the server. A number of different algorithms are used to make this decision (see the October 1998 column for the details).
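The selection step can be sketched as follows. This is a simplified illustration, not any vendor's algorithm: it assumes the balancer can read a load estimate for each server, and the server names and load figures are invented.

```python
def pick_server(client_ip, servers, sessions):
    # Session affinity: a client already in a Web session keeps its
    # server, so the several HTTP sessions that make it up stay together.
    if client_ip in sessions:
        return sessions[client_ip]
    # A new client is routed to the currently least-loaded server.
    server = min(servers, key=lambda name: servers[name])
    sessions[client_ip] = server
    return server

servers = {"web1": 0.72, "web2": 0.31, "web3": 0.55}  # load estimates
sessions = {}                                          # client -> server
```

Real balancers weigh more factors (past history, network load) and refresh their load estimates continuously, but the shape of the decision is the same.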
Distributed load balancing
A second method of load balancing for Web servers routes Web traffic across networks. Whereas localized load balancing is normally put into effect for a single LAN or a small group of LANs directly connected to the load balancing device, distributed load balancing sends packets across networks, potentially located quite far away from the local server. The intention is for the client request to be processed by the closest Web server.
A client Web request normally goes to the primary Web server identified by the URL's hostname (for example, www.straypackets.com). Because this primary server will likely receive all initial requests, it must bear the brunt of new incoming traffic. However, the Web site might actually have another server that is topologically closer to the client than the primary server. Thus, to reduce the latency of the Web session and improve processing, the request should be directed to the closer server.
This is not as simple as it sounds. With the crisscross snaking of the Internet, it is difficult to determine the closest server to a client. Because IP packets do not necessarily flow along the same predictable path, this sense of closeness could even change depending on the network topology of a client's Internet service provider.
Distributed load balancing usually involves the use of another special protocol between the different servers to determine the closeness factor and the load of remote servers, and to redirect traffic appropriately. To date, there is no common standard for such a protocol, but several vendors do have their own proprietary standards.
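One crude way to estimate closeness, absent a standard protocol, is simply to time a TCP connection to each candidate mirror. The sketch below does exactly that; the mirror hostnames are hypothetical, and the probe function is injectable so the selection logic can be exercised without a live network.

```python
import socket
import time

def connect_rtt(host, port=80, timeout=2.0):
    # Time a TCP connect as a rough proxy for network closeness
    start = time.monotonic()
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return time.monotonic() - start
    except OSError:
        return float("inf")  # unreachable mirrors lose automatically

def closest_mirror(mirrors, probe=connect_rtt):
    # probe is a parameter so the policy can be tested with canned data
    return min(mirrors, key=probe)
```

A real distributed load balancer would combine such measurements with server load reports exchanged over its proprietary protocol, and would cache the results rather than probing on every request.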
Web caching
Web caching is the process of keeping temporary copies of popular Web pages on a local server so that local users can access them more quickly. With this approach, Web traffic from local clients that is intended for remote Web sites is rerouted to a cache server at the edge of the local network. The cache server can satisfy, on average, between 30 and 50 percent of this Web traffic -- enough to reduce the bandwidth consumption of the WAN link to the Internet and save money.
Web caching works in several ways. The basic system involves a single cache that acts as a proxy for all Web browser clients on the local network. Therefore, every Web request is directed to the proxy cache server before going out over the WAN link. The downside of this method is that it can take some time to configure every single browser to use the cache. Furthermore, this option on most browsers can be changed by the user, defeating the purpose.
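The same proxy setting each browser needs by hand can be applied programmatically, as this sketch using Python's standard urllib shows. The cache hostname and port are assumptions; 3128 is merely a commonly used proxy port.

```python
import urllib.request

# Point all HTTP requests made through this opener at the proxy cache
proxy = urllib.request.ProxyHandler(
    {"http": "http://cache.internal:3128"}  # hypothetical cache server
)
opener = urllib.request.build_opener(proxy)

# Every request through the opener now goes to the cache first, which
# contacts the origin Web site only on a miss, e.g.:
# opener.open("http://www.straypackets.com/")
```

This illustrates why the browser-configuration approach is fragile: the setting lives on every client, and any user can change or remove it.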
A second method of caching involves the cache actively monitoring outgoing Web traffic on the local network, and, with the help of a packet filter, capturing and rerouting the packets through the cache server instead. Such a system, called a transparent proxy cache, does not require any modifications to the client browsers. It is slower, however, than the first method, because the server has to scan every single IP packet to see which ones are intended for the Web.
A third method involves keeping a distributed network of cache servers, either in different departments or different office locations, but in communication with each other. This approach is useful for large networks accommodating thousands of users, or when the networks are geographically dispersed. You can even integrate national and global cache servers with a local cache of your own to get even greater benefits.
Web caching reroutes traffic at the client end of the Web session. In essence, it terminates multiple Web sessions from clients to the cache, and creates new sessions between the cache and the origin Web site that contains the actual data.
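This split-session behavior can be sketched in a few lines. The origin fetch is injected as a callable so the sketch needs no network; the time-to-live value is an arbitrary assumption, since real caches honor expiry headers from the origin server.

```python
import time

class WebCache:
    def __init__(self, fetch_origin, ttl=300):
        self.fetch_origin = fetch_origin  # callable: url -> page body
        self.ttl = ttl                    # seconds a copy stays fresh
        self.store = {}                   # url -> (expiry, body)

    def get(self, url):
        entry = self.store.get(url)
        if entry and entry[0] > time.monotonic():
            return entry[1]               # hit: served locally, no WAN trip
        body = self.fetch_origin(url)     # miss: new session to the origin
        self.store[url] = (time.monotonic() + self.ttl, body)
        return body
```

The client's session always terminates at the cache; only a miss opens the second session out to the origin site.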
Distributed Web caches have their own protocols to communicate between the servers. They use these protocols to synchronize and replicate cache contents, determine whether a document is stored on a different server, and redirect client requests to other caches. These protocols, in turn, affect how Web traffic is routed.
Policy-based routing
Policy-based routing differs from these other methods because it is not based on the traffic behavior of the Web or the load behavior of Web servers. Instead, sites can establish different policies that promote or manage security, server preferences, user behavior, or even corporate network use policies.
Although policy routing can occur at any level of the network protocol stack, we're going to focus on the application protocol layer.
These policies can be of any practical type as long as they can be defined in a programmable fashion. Some are specific to routing near the client end to enhance client security or control of behavior before the requests go out to the network. Others focus more on how to handle traffic that is coming into the Web server itself.
A selection of possibilities for policy-based Web routing includes:

- Blocking requests to sites prohibited by a corporate network use policy
- Directing requests for sensitive content through secure servers or gateways
- Sending requests from preferred users or applications to designated servers
- Restricting when or how users can reach outside Web sites
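Because policies need only be definable in a programmable fashion, they can be sketched as a list of rules, each a function that examines a request and returns a decision or passes. The hostnames and policies below are illustrative, not drawn from any product.

```python
def corporate_use_policy(req):
    banned = {"games.example.com"}          # sites barred by company policy
    return "block" if req["host"] in banned else None

def server_preference(req):
    # Steer traffic for a busy site to a designated mirror
    return "route:mirror1" if req["host"] == "www.busy.example" else None

POLICIES = [corporate_use_policy, server_preference]  # evaluated in order

def route(req):
    for policy in POLICIES:
        decision = policy(req)
        if decision is not None:
            return decision
    return "forward"  # default: pass the request through unchanged
```

Evaluating the rules in a fixed order keeps the outcome predictable when more than one policy could match a request.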
Web routing = application-specific routing
Web routing is application specific and introduces a whole new set of issues not anticipated by traditional network protocol routers. Most of these routers are not designed to handle application layer protocols and perform poorly in this role. The best ones have specialized hardware designed to process IP packet headers very quickly, but slow down dramatically once they have to go through the rest of the packet to check for Web traffic.
However, the role of these traditional routers will not be displaced by Web routing systems, but rather supported by them. By delivering Web traffic more efficiently, Web routing systems can reduce the load on traditional routers. Interestingly enough, proper Web routing benefits both the public good of the Internet and the mood of users waiting for their traffic to move. While Web routing does not kill the techno-snake of traffic, it certainly puts it on a diet -- and loosens its coils.
About the author
Rawn Shah is an independent consultant based in Tucson, AZ. He has written for years on the topic of Unix-to-PC connectivity and has watched many of today's existing systems come into being. He has worked as a system and network administrator in heterogeneous computing environments since 1990.