The golden gateway
It's time to dust off the idea of gateway servers and examine how they fit into client-server applications
Marketers and CIOs tout three-tier, client/server reengineering as the cure-all for enterprise ailments. Before you make that your mantra, consider using gateways. We outline the transactions and structures suited to gateways, provide examples and design rules for building them, and conclude with a view of how emerging technologies like Java and SafeTcl can gateway between vertical client-server applications. (3,600 words)
Not everything fits a three-tier architecture. Some applications belong on a mainframe and should stay there. Cost constraints, risk reduction, operational complexity, technology adaptation, and time-to-market often dictate mixed-bag solutions. In our rush to view the world through client-server glasses, we sometimes filter out opportunities to link new and legacy operational systems.
Peer forward a few years to the time when your off-the-shelf financials, human resources, and inventory applications fit neatly into isolated, three-tier holding tanks. Each best-of-breed product offers a client-server architecture for good performance, easy addition of new features, and minimal data movement.
But when you try to access personnel records from the financial system, you'll have to "go outside" to hand-crafted queries that reformat output, or even to paper, to extract data from one system and consume it in another. The growing sentiment among users will likely be that the applications aren't flexible, don't interoperate well (if at all), and that data is isolated. Those attributes are the de facto definition of a legacy application, where "legacy" won't be said with the respect accorded a 10-year-old, tried-and-true system.
How do you side-step both problems? It's time to dust off the idea of gateway servers, and examine how they fit into the re-engineering spectrum and the client-server application arena.
We'll discuss specific transaction types and sequences well-suited for gateways, along with the technologies required to handle them. Then we'll look at examples and design rules for building transaction processing systems around gateway servers.
We will conclude with a view of how emerging technologies, like Java and SafeTcl, can gateway between vertical client-server applications. Our goal is to raise the level of abstraction for application architecture to look at how individual solutions fit into the larger whole to become Enterprise -- with a capital E -- applications.
Twisty little passages, all different: Gateways defined
A gateway is merely a box that converts between two systems that don't speak each other's native tongue. Gateways convert data representation, syntax, and semantics to keep the bits flowing.
Gateways also link disparate applications. Perhaps the best-known are the database gateways from Unix relational DBMS servers such as Oracle and Sybase to MVS-hosted DB2 or CICS systems. Besides converting between database views and access protocols, gateways accommodate differences in database structures, linking relational databases to hierarchical systems such as IBM's IMS.
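At the lowest level, "converting data representation" can be as simple as translating character sets between the mainframe and Unix worlds. Here's a minimal sketch in Python (code page 500 is an assumption; real mainframe shops use several EBCDIC variants):

```python
def ebcdic_to_ascii(raw: bytes) -> str:
    """Translate an EBCDIC byte stream (code page 500) into text."""
    return raw.decode("cp500")

def ascii_to_ebcdic(text: str) -> bytes:
    """Translate text back into EBCDIC bytes for the mainframe side."""
    return text.encode("cp500")
```

A byte-level gateway like this keeps the bits flowing but knows nothing about what the bytes mean; the database and application gateways described above layer syntax and semantics on top.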
But when do you use a gateway, and when do you proceed full speed with mainframe migration? Gateways represent just one form of re-engineering legacy applications. The figure below shows the risk, cost and benefit relationships of various approaches:
For a real-life story, read the account of the California Division of Motor Vehicles' attempt to migrate from its current environment. (See the list of resources at the end of this story.)
When executed well, re-engineering provides the biggest benefit to the organization, but this approach also sits on the extreme of the risk axis.
If you are entering a period of cost freezes, don't consider re-engineering projects, because they require substantial capital commitments. If either funding or risk tolerance governs your approach, gateways and conversion are more attractive because they limit the investment and the potential impact on other systems. Gateways often provide the best mechanism to add new functionality or link operational systems together.
Gating factors: Gateway or reengineer?
How do you identify gateways waiting to be built? Here are some criteria to quantify the cost and risk reduction offered by a gateway as compared to a more extensive engineering effort:
For example, consider adding a system that analyzes asset deployment for sales tax abatements. If you buy a fleet of corporate vehicles in New York, but use some of them in New Hampshire, you are due a refund on the sales tax. Modifying the purchasing system to track where assets go is a good way to slow down future purchases, or to break the system in a non-obvious way. Pulling data out of the purchasing system and populating a tracking or geographical asset management system will have less impact. You don't want to spend $3 on development to recover $1 in costs.
When evaluating gateway projects, minimize their impact on fielded systems. Doing so solves a political and a technical problem at the same time; otherwise, you'll have to coordinate the new-wave client/server developers with the legacy/MIS stalwarts and justify changes to legacy systems that alter your cost-benefit picture. Once you've identified promising candidates, it's time to look more closely at the linkages between the gateway and legacy systems.
A sink or async: Eliminating synchronization problems
One overriding design rule is to look for gateways that encourage data flow between systems while eliminating dependencies or synchronization points between them. This simply reflects the nature of transactions in the business world -- they tend to be one-way and asynchronous. The telephone company, for example, doesn't shut off your service between the time it generates your bill and the time your check clears the bank. Instead, the bill-to-collect cycle is three distinct asynchronous transactions: bill generation, check transmittal, and funds transfer (clearing). Each transaction gets bunched up with similar operations (multiple checks sent to the bank for deposit at once), offering an economy of scale.
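The bill-to-collect cycle can be sketched as three stages that never wait on one another; each stage deposits work into a queue for the next. This is a toy illustration, not a billing system -- the stage and queue names are ours:

```python
from collections import deque

# Work items queue up between stages instead of blocking the stage
# that produced them; each stage drains its queue at its own pace.
bills_to_send = deque()
checks_to_deposit = deque()

def generate_bills(accounts):
    """Stage 1: bill generation -- runs on the billing system's schedule."""
    for acct, amount in accounts:
        bills_to_send.append((acct, amount))

def receive_check(acct, amount):
    """Stage 2: check transmittal -- arrives whenever the customer pays."""
    checks_to_deposit.append((acct, amount))

def deposit_batch():
    """Stage 3: clearing -- bunch every pending check into one bank run."""
    total = sum(amount for _, amount in checks_to_deposit)
    checks_to_deposit.clear()
    return total
```

Because nothing in stage 1 blocks on stage 3, the phone company keeps generating bills while last month's checks are still clearing.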
Put another way, asynchronous transaction flows are the building blocks for scalable systems. The more you require fine-grain locking and synchronization of systems, the harder it is to scale them to larger workloads. The best example is that of two people sharing a checkbook. You and your spouse could visit the ATM and teller windows synchronously (highly unlikely), or you can each access the checking account and periodically reconcile your transactions. While the asynchronous access requires some optimistic assumptions (more on those shortly), it's also the only viable mechanism for two- (or more) person access to a shared account.
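The shared-checkbook reconciliation above can be made concrete. The "optimistic assumption" is that neither partner overdraws the account between reconciliations; this sketch (function and field names are ours) merges both transaction lists and flags the points where that assumption failed:

```python
def reconcile(opening_balance, yours, spouses):
    """Merge two independently recorded transaction lists in date order.

    Each list holds (date, amount) tuples recorded without coordination.
    Returns the true balance plus the dates where optimistic spending
    would have overdrawn the shared account.
    """
    merged = sorted(yours + spouses)
    balance, overdrafts = opening_balance, []
    for date, amount in merged:
        balance += amount
        if balance < 0:
            overdrafts.append(date)
    return balance, overdrafts
```

The occasional overdraft fee is the price of scalability: the alternative -- locking the account for every ATM visit -- doesn't scale past one person.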
Bill Raduchel, Sun's CIO, describes the process of designing one-way transaction systems with the elegant phrase "decoupling organizational cadences." Each work-flow proceeds at its own pace, stopped to synchronize only when absolutely necessary. Thinking asynchronously is counter to the way in which databases and applications have historically been designed. Here's a case study that illustrates the point:
A new transaction processing engine has been deployed on a Unix system. The existing mainframe system needs to be updated with insert or modify requests so that its historical database remains complete and accurate. Initially, the system designer puts an IBM SNA gateway between the Unix and mainframe sides, using IBM's LU 6.2 program-to-program communications to mirror requests to the historical database. Unfortunately, the LU 6.2 gateway introduces a choke point in the Unix-side transaction flow. Every update or insert on the Unix side now waits for the LU 6.2 communication to finish the mirror transaction on the mainframe, gating the new system's performance by that of the legacy system.
A better solution is to use message-based middleware, such as a reliable queueing system, to stack up the transactions for the mainframe in an input "hopper." Once the Unix side delivers the transaction, the queue guarantees its delivery to the mainframe at some future point. The mainframe executes independently, catching up with the Unix side when it has the capacity to drain the input queue.
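The hopper pattern looks something like this sketch (the function names are ours). A real queueing product would persist each transaction to disk before acknowledging it, so queued work survives a crash; an in-memory queue only shows the control flow:

```python
import queue

# The "hopper": the Unix side enqueues and returns immediately; the
# mainframe feed drains at its own pace. Production middleware would
# make the queue persistent and transactional.
hopper = queue.Queue()

def unix_side_commit(txn):
    """Hand the mirror transaction to the hopper and return at once --
    no waiting on LU 6.2 or on mainframe capacity."""
    hopper.put(txn)

def mainframe_drain(apply):
    """Catch-up loop: apply every queued transaction to the historical
    database whenever the mainframe has spare cycles."""
    applied = 0
    while not hopper.empty():
        apply(hopper.get())
        applied += 1
    return applied
```

The Unix side's commit latency is now the cost of a queue insert, not the cost of a round trip to the mainframe.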
Decoupling the Unix transaction engine from the mainframe lets both of them proceed at full speed. If this sounds like a less than subtle endorsement of messaging, it is. Messaging is an architecture, and the gateway systems are the components that ensure the messages make sense to all parties sending or receiving them.
Safely decoupling systems requires adherence to some additional design rules:
So far, we've only looked at cases where transactions cross the border between Unix and legacy systems, giving you cleanly separated systems with little data movement. What do you do when you want access to most or all of the data on both sides of the gateway? The costs of mass data migration -- transfer latency, network bandwidth, and on-going management -- for large data sets quickly offset the benefits of the application migration through the gateway.
Immovable objects: Trade-offs in moving data through gateways
If the data accessed by your gateway fills hundreds of gigabytes of mainframe storage, the pure gateway approaches described above work best. You don't want to spend hours dragging data through a gateway, particularly if it is being updated frequently on the source side. Similarly, it doesn't make sense to migrate an application and its associated data to a Unix server if the same data sets are used by several remaining mainframe applications that are both mature and mission-critical. Moving the data impacts the middle of other work flows, violating the design criteria outlined for efficient gateway deployment.
Copying data through the gateway, however, gives you complete autonomy over your work flow. Applications modify local copies of data, with no dependencies on foreign systems. When does it make sense to turn your gateway into the Kinko's of the data center? The first guideline is to keep the data transfer time short and bounded. Token Ring connections to the mainframe and channel attach devices both deliver about 300 kilobytes per second of sustained bandwidth. At that rate, you move about a gigabyte in an hour. If you need to move a few hundred megabytes, you can slide the data through and update Unix databases in a short time. Larger data transfers may require a "staging" area on the Unix side of the gateway in which mainframe-extracted data awaits consumption by the local application.
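The bandwidth arithmetic above is worth keeping handy when sizing a transfer window. A one-liner captures it (300 kilobytes per second is the sustained rate cited for Token Ring and channel-attach links):

```python
def transfer_hours(megabytes, kb_per_sec=300):
    """Hours needed to drag a data set through a mainframe link at the
    given sustained rate (default: ~300 KB/s)."""
    return (megabytes * 1024) / kb_per_sec / 3600.0
```

At that rate a gigabyte takes about an hour, while a few hundred megabytes slides through in well under one -- which is the dividing line between a quick nightly copy and a transfer that needs a staging area.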
You also need to judge the trade-off between the granularity of data transfers and the accuracy of data on the Unix side. If you copy data every night, you can't make intra-day business decisions using the Unix application because it has no knowledge of activity that occurred after the previous day's close of business. Conversely, marketing systems that derive weekly, monthly, or longer-term trends won't suffer from a slight lag in obtaining operational data. If you need to reduce the size of the transfer window, consider replicating updates, deltas, or transaction logs instead of entire data sets.
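Replicating deltas instead of whole data sets can be as simple as filtering on a modification timestamp. A sketch, assuming each extracted record carries a "modified" field (the field and function names are ours):

```python
def extract_deltas(rows, last_sync):
    """Select only the records modified since the previous transfer
    window, instead of recopying the entire data set each night."""
    return [row for row in rows if row["modified"] > last_sync]
```

If only a fraction of the records change each day, the transfer window shrinks by the same fraction -- often the difference between an overnight job and one that fits between close of business and the first decision-support query.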
Consider a mutual fund company that uses cash flows in and out of funds as a marketing yardstick. Short-term effects, such as cash influx resulting from an advertisement in the Wall Street Journal or Money magazine need to be determined from operational data. While the code to analyze cash flows could be added to the cash management system, changing a mission-critical function is usually frowned upon. Duplicating the detail records out of the cash system into a separate Unix system minimizes the impact on existing systems while isolating the decision support from dependencies or synchronization points with the mainframe.
The figure above shows the increasing levels of abstraction offered by different gateway services, ranging from TCP/IP to 3270 protocol conversion to foreign database access. Migrating functionality to the Unix domain makes the gateway functions more complex, as they are no longer simple byte stream converters. When the gateway is operating at an application or transaction level, analyses of data movement and synchronization points are crucial to its success.
Thin is in: A case for thin servers
Client-server system architects have embraced the "thin client" as the ideal state in desktops. Application functionality is pushed back to a middle tier or back end server, with the desktop providing minimal user interface support. Along with those who hold up thin clients as the supermodels in the client-server glamor industry are those who question the complexity required to implement a three-tier architecture. It's time to ponder whether thin is in for gateways as well.
One example of a thin gateway is an object broker that is compliant with the Object Management Group's UNO standard for inter-object request broker (inter-ORB) communication. The thin gateway converts object requests from one implementation to another, without moving application functionality. Thin servers, or thin gateways, will also link client-server applications together. When vertical applications such as PeopleSoft and Oracle Financials talk to each other through thin servers, you have true interoperability at a functional level. When critics use "stovepipe" with derision, they're commenting on tall, non-integrated applications -- they don't meet, don't talk, and don't exchange data.
Why thin servers? It's not practical to think of carving off the user interface component or front-end application logic of a SAP or Baan to integrate it with other applications. Instead of tying the applications together at the desktop or data consumption layer, thin servers link them at the data management layer, extracting data from one application and presenting it to another. There's a great case study in progress on Common Gateway Interface (CGI) scripts on Web servers. Browsers with absolutely no knowledge of back-end applications can exchange data through HTML forms using thin gateway scripts.
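A thin CGI gateway is short enough to show whole. This sketch parses an HTML form submission and hands the request to a back end; the `lookup_balance` stand-in is ours -- a real gateway would query the legacy application at that point:

```python
#!/usr/bin/env python
import os
from urllib.parse import parse_qs

def lookup_balance(acct):
    """Stand-in for the real back-end call (hypothetical); a production
    gateway would reach the legacy system through a database gateway,
    a message queue, or a screen-scraping session here."""
    return {"1001": "42.00"}.get(acct, "unknown")

def handle(query_string):
    """Translate a browser's form submission into a back-end request
    and the back end's answer into HTML -- the whole thin gateway."""
    form = parse_qs(query_string)
    acct = form.get("acct", ["?"])[0]
    body = ("<html><body>Balance for %s: %s</body></html>"
            % (acct, lookup_balance(acct)))
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    print(handle(os.environ.get("QUERY_STRING", "")))
```

The browser knows nothing about the back end, and the back end knows nothing about HTML; all the conversion knowledge lives in the thin script between them.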
What are the emerging technologies for creating thin servers? Obviously perl is the language of choice for creating scripts, with the Tool Command Language (tcl) and the SafeTcl extensions to it gaining in popularity due to their simple string and text manipulation features. Layer the expect package on top of tcl, and you can gateway between interactive applications with command-line interfaces as well. Sun's Java programming language is also emerging as a glue for sticking disparate applications and user interfaces (browsers) to each other. Java introduces the notion of executable content -- the data envelope that contains small applications to convert, extract or interpret the data.
Wrapping the data with a context for its consumption by another application is conducive to joining application stovepipes together. You can try to hammer the stovepipes into a common shape and size, or you can add duct work. Going the gateway route by putting ducts between applications lets you link third-party or commercial applications as easily as you build custom client-server solutions.
Encouraging gateway-level interoperability lets each arm of the business run at its own pace, without dependencies on others. When you cast application transcoders into gateways, you create yet another level of abstraction for the second figure. That notion -- improving information flow from one self-contained, commercial application to another -- is at the heart of client-server computing.
About the author
Hal Stern is a Distinguished Systems Engineer at Sun Microsystems, where he specializes in information systems architecture, technology evangelism, and answering e-mail. Reach Hal at firstname.lastname@example.org.