Connectivity by Rawn Shah

Network balancing act

Network clustering can save money and stress by evening out your server loads across a network. How does it work?

SunWorld
November  1998

Abstract
This month's look at network clustering options is the fourth installment in Rawn's series on clustering. An alternative to using specialized software on the server to balance loads, network clustering allows you to bring in external devices to monitor and optimize your server cluster traffic by directing it to the right node. (2,300 words)



The difference between network clustering and last month's topic, node clustering, is that network clustering doesn't usually require special dedicated interfaces between the nodes. In fact, sometimes the clustering is performed on a device altogether separate from the nodes. You don't need specialized hardware, software, or direct connects between the nodes to implement network clustering, and in many cases, you don't even need nodes of similar hardware platforms or operating systems. However, to simplify network management in a large cluster, most of the nodes ought to be similar.

You can implement network level clustering in two ways: (1) by sharing an identity across several machines; or (2) by sharing an application across several machines. In other words, network level clustering can work at either the network protocol level or at higher application levels. At the network protocol level, this balancing act can be based on domain names, network-geographic proximity, traffic optimization, or server-load optimization. On the application level, it can be based on similar strategies: load optimization, application or user names, application load optimization, or application-specific optimization.

The cluster nodes for network level clustering can either be uniform (identical in either hardware or software configuration, or both), or specialized (some nodes are more suited to specific tasks). Unlike node level clustering -- which is best when you have near-identical cluster nodes, or at least, near-identical software environments -- network level clustering can work with a variety of different servers as long as the traffic intended for the nodes is directed to the appropriate server node.

Protocol-based network clustering
This type of clustering is based primarily on the network protocol packets and the information they contain, possibly together with information from the individual processing nodes. Knowing the source, destination, number, and size of the packets helps identify factors that can be tweaked to optimize sessions between the remote user and the server. Since most IP sessions begin by resolving a hostname to an IP address, the Domain Name System (DNS) service is crucial to effective traffic direction.

Network protocol clustering systems rely on several techniques: round-robin DNS, connection counting, network and CPU load monitoring, topology-based redirection, and policy-based redirection.

A good protocol level network clustering product should implement most or all of these techniques. Take a look at the table below to see several products in this category. Usually they're divided into separate products for the workgroup, enterprise, and global level. CPU load- and topology-based redirection, in particular, are difficult to implement and usually require monitoring agents running on every node to report status continuously. This sometimes ends up limiting the system, since these agents have to run on varied server platforms, which may not all be supported by the vendor. Most vendors try to devise a product that is independent of the node OS platform.

Table 1. Network level clustering product vendors

Vendor      Product               Round-robin  Connection   Network/CPU  Topology     Policy
                                  DNS          counting     load         redirection  redirection
----------  --------------------  -----------  -----------  -----------  -----------  -----------
F5 Labs     3DNS                  Y            Y            Y            N            N
Cisco       Local Director        Y            Y            Y            N            Y
Cisco       Distributed Director  Y            Y            Y            Y            Y
Alteon      ACEDirector           Y            Y            Y            N            N
ArrowPoint  Context Smart Switch  N            Y            Y            Y            Y
Resonate    Global Dispatch       Y            Y            Y            Y            Y
Resonate    Central Dispatch      Y            Y            Y            N            Y
Application-based network clustering
Application-based clustering on the network level is simply a higher level abstraction of the protocol level techniques discussed earlier. However, many applications have their own specific requirements, which complicate the situation.

Take the Web, for example. Since the HyperText Transfer Protocol (HTTP) is a stateless protocol, there is no notion of a "Web session." If a user is going through a series of Web pages by following links in the pages, the server really doesn't know the difference between that user and someone going to the pages directly by typing in the URL. HTTP was designed this way for simplicity and efficiency. However, if you know the layout of the overall site, you can infer a logical Web session by following the trail of HTTP requests for its pages. Furthermore, Web cookies can help a server keep better track of the user's session. This kind of tracking is particularly important because many CGI scripts need to keep track of sequential execution.

A network clustering system for a Web server farm needs to keep track of these application level sessions so the Web servers and/or cluster nodes don't get confused and lose valuable data. A top level dispatcher needs to keep track of client requests and where they've been directed to so that it can continue to send those packets to the same node.
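
The dispatcher behavior described above can be sketched as a small lookup table in Python. This is a minimal sketch under stated assumptions: the session key stands in for whatever identifies a client (a cookie value or a source address, for example), and the class and method names are hypothetical.

```python
class StickyDispatcher:
    """Remember which node served a session and keep sending it there."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.assignments = {}  # session key -> node name
        self._next = 0         # index used to place new sessions in rotation

    def route(self, session_key):
        node = self.assignments.get(session_key)
        if node is None:
            # First request of this session: place it on the next node in
            # rotation and record the choice for later requests.
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            self.assignments[session_key] = node
        return node

    def end_session(self, session_key):
        # Forget the mapping once the session is over (logout or timeout).
        self.assignments.pop(session_key, None)
```

Without the `assignments` table, a second request from the same user could land on a node that has none of the state left behind by the first.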

A different kind of system, the global single login system, is based on security access, and may be implemented across an enterprise network. When the user logs in from anywhere on the network, he is directed only to servers to which he has access. This system is usually called authentication rather than clustering, although a company may utilize it specifically to balance loads across servers.

Parallel databases use another form of clustering based on server-application load. A parallel database is usually one or more pairs of tightly coupled nodes that run the same type of database engine. Across the nodes, you can either partition your data or keep identical copies of the data on each node. When the nodes hold identical copies, the databases may also monitor each other to see which node is least busy and direct requests accordingly. The idea here is to speed up database processing. Such databases have active monitoring processes that keep track of the data they hold and the server load at all times. A simpler form of this is the replicated database, which periodically synchronizes with a corresponding sibling database but doesn't actively transfer requests and data all the time. The replicated database is usually used in remote office or network situations where a slower WAN link between the two databases impedes continuous activity.
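
The two data layouts lead to different routing rules, which the Python sketch below tries to make concrete. It is only an illustration: the function names, the key format, and the load figures are assumptions, and real parallel databases use their own partitioning schemes.

```python
import zlib


def partition_node(key, nodes):
    """Partitioned data: each key lives on exactly one node."""
    # crc32 gives the same result on every run, unlike the built-in hash()
    # for strings, so a key always maps to the same node.
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def least_busy_node(load_by_node):
    """Identical copies: any node can answer, so pick the least loaded one."""
    return min(load_by_node, key=load_by_node.get)
```

With partitioned data the router has no choice about where a request goes; with identical copies it is free to follow the load.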

When it comes to application-based clustering, you're often limited to what is available from the application vendor, except for very common or popular applications. For example, there are many solutions for Web servers from third-party vendors. In fact, most of the products listed in the table above are marketed at Web server farms.

For systems such as the global single login system, products like SCO Tarantella create a method of organizing users across multiple servers of different platforms. The SCO system creates a Webtop and thus specializes the delivery system but still provides application access on any number of preconfigured servers. The load balancing in such a case is policy based and determined by the administrator. Such a product blurs the line between application and node clustering.

Choosing a network level clustering product
The idea behind network level clustering is to speed up access or processing by directing network traffic appropriately. Most of these solutions work best for farms of servers performing similar tasks such as Web service. The idea is to automate traffic direction, much as a valet parks cars at a good garage. If, on the other hand, what you really need is a traffic-light system, you should look into network traffic management systems like Ethernet switches, smarter routers, and ATM. The network cluster shares the technology used in traffic management but is specialized to the needs of the end nodes. A combination of both makes for the best environment.

The products listed in Table 1 provide similar features at different prices. It's hard to compare them, since these aren't really commodity products and they don't all accomplish the task in the same way. The table below will, however, give you a good idea of which features to look for.

How to take advantage of network level clustering
Clustering task                                                   Clustering method
----------------------------------------------------------------  ----------------------------------------------------------------
Spread requests out as evenly as possible                         Connection counting, round-robin DNS
Spread requests out but keep consecutive requests consistent      Network/CPU load, topology redirection
Send requests to the nearest server                               Topology redirection
Send requests to the least loaded server                          Network/CPU load monitoring
Use as little network bandwidth as possible                       Network load monitoring, topology redirection
Spread requests to specific locations                             Policy-based redirection, application-based clustering
Spread requests according to application                          Application-based clustering
Spread requests according to security rights of user or location  Topology redirection, policy redirection, application-based clustering
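
To make two rows of the table concrete, the Python sketch below selects the nearest server and the least loaded server, and then combines the two. The hop count, the load scale, the 0.8 threshold, and all names are illustrative assumptions; actual products measure distance and load in their own ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    hops: int        # network distance from the client (hypothetical metric)
    cpu_load: float  # 0.0 (idle) to 1.0 (saturated), as reported by an agent


def nearest(nodes):
    """Topology redirection: choose the node closest to the client."""
    return min(nodes, key=lambda n: n.hops)


def least_loaded(nodes):
    """Load monitoring: choose the node with the lowest reported load."""
    return min(nodes, key=lambda n: n.cpu_load)


def nearest_with_headroom(nodes, max_load=0.8):
    """Prefer the closest node that is not overloaded."""
    candidates = [n for n in nodes if n.cpu_load <= max_load]
    # If every node is over the threshold, fall back to the least loaded one.
    return nearest(candidates) if candidates else least_loaded(nodes)
```

Combining the two criteria is a design choice rather than a requirement: distance alone can pile traffic onto a busy nearby node, while load alone can send users across a slow link.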

One last thing to keep in mind when purchasing a solution is the management system. Unfortunately, many of these products are based on proprietary management methods. A top name vendor like Cisco certainly makes it easier to integrate products with common network management suites, but it comes at a price.

Network level clustering evens the load on your network segments and your servers. Network and server bottlenecks are the cause of many a network manager's ulcers, as well as recurring problems within the server itself. Often, too much traffic to a server is enough to choke and crash the system. Finding a solution through this type of clustering allows you to tackle the problem without necessarily having to change your existing server platform or operating systems, which is not the case with node clustering. In many cases, one external device can even help solve the situation for whole farms of servers. A simple, effective cure? Maybe, but you still have to get it to work right, and that takes time and effective monitoring.


About the author
Rawn Shah is an independent consultant based in Tucson, AZ. He has written for years on the topic of Unix-to-PC connectivity and has watched many of today's existing systems come into being. He has worked as a system and network administrator in heterogeneous computing environments since 1990. Reach Rawn at rawn.shah@sunworld.com.


[(c) Copyright  Web Publishing Inc., and IDG Communication company]


URL: http://www.sunworld.com/swol-11-1998/swol-11-connectivity.html