Defining and achieving RAS step by step
How does a networked enterprise become fully reliable, available, and serviceable? We offer detailed pointers for identifying and then implementing RAS processes
We have mentioned the acronym RAS many times before, along with its importance in managing the Unix Enterprise. Systems that are reliable, available, and serviceable are required to meet business needs and, as we continually emphasize, become even more important in distributed, networked computing environments. One of the biggest issues most enterprises face in moving toward the new networked model is how to define and implement RAS. In this article we take a more in-depth look at these issues and at some approaches to solving them. (1,600 words)
Over the last few years many corporations have allowed business units to look for and implement client/server-based applications without involving IS. The general belief has been that a business unit could implement its own applications to support business requirements without the help and support of IS, or that there was no need for IS at all. This is very similar to the way business units were allowed to implement, manage, and support the PC and local area network (LAN) environments (meaning there was no RAS defined). Planning the implementation of the application gets all the emphasis; little or no heed is paid to how the application is going to be supported. In IS terminology, business units pay attention to day one issues (implementation) but pay little or no attention to day two issues (ongoing maintenance and support). Many business executives feel the day-to-day problems can be handled the same way they deal with PC/LAN issues (without change control, problem management, or backup/restore procedures). Well, how wrong they are! Most business apps need some RAS processes and procedures in place to be effective in supporting business requirements.
The biggest problem is that business units have no idea what RAS procedures or "mission critical" really mean, and rightly so, because most have never been part of the IS infrastructure. We in IS deal with RAS and mission-critical systems every day. We know what the processes are, how to implement them, and what the ramifications are if we do not. IS is paid to know RAS! And IS has been evolving these processes over the past thirty years. So why stop using them just because we change some of the technology and the structure of how applications are delivered? We emphatically believe that these processes must be moved into the networked environment.
But we also suggest that it's mostly IS's fault for not educating our internal "customers" on what mission critical and RAS mean. We never explained what change control, configuration management, problem management, and backup and restore processes are, or why they must be in place to effectively support a mission-critical environment. We never spent the time to educate our customers on the need and to explain that RAS really is required to support their business apps. We think a major effort for the New Enterprise IS now is to educate its customers on the importance of RAS and to "market" and "sell" them on the idea that IS should be the group to implement and support RAS for distributed applications in the enterprise. We also believe this could certainly apply to the PC/LAN environment as well. Once we sell them on the need, provide better service, communicate with them more, and hopefully improve their perception of how IS can support them, we can really take a look at what RAS means and how to implement it in the new networked enterprise.
Defining the scope of production
Now that we have agreed that the environment needs to be "RASed," we must first define what we call the "scope of production." This scope should be defined using a teamwork approach. The team should consist of key people from IS, including networking, data center, database administration, desktop support, and LAN support, plus some key users of IS services (preferably those who are planning to implement, or have already implemented, client/server-based applications). The first thing the team must do is agree on a very important assumption: anything that is part of the scope of production must be RASed. Agreement must be achieved before moving forward; otherwise the remaining activities are a waste of time. Trust us, we know from firsthand experience! In many enterprises this is a very difficult and political bridge to cross, but once over it, many pieces of the overall service puzzle begin to fall into place. Second, the team should look at the total enterprise network-computing environment and identify everything that could "potentially" be considered part of the scope of production. Figure 1 provides an example of the potential total scope for the enterprise-computing environment.
Figure 1. Example of the potential scope of production in the enterprise
The best way to get started is to present the team with a view of your enterprise environment like the one shown in Figure 1. Then, in a team meeting (this could take more than one), ask the following questions:
| Is the mainframe part of our scope of production? | Yes or No. |
| Is the network (WAN) part of our scope of production? | Yes or No. |
| Is the network router part of our scope of production? | Yes or No. |
| Is the LAN part of our scope of production? | Yes or No. |
| Is the server part of our scope of production? | Yes or No. |
| Is the PC/workstation part of our scope of production? | Yes or No. |
The team must answer all the questions for the identified total enterprise environment. And by the way, the whole team must agree on each answer. What you are really doing here is identifying which components of your total network (down to and including the desktop) should be, or eventually will be, considered mission critical. This list of answers should be reviewed on an ongoing basis by the team. The environment will certainly change, and so will the scope. Also, on the first pass, you may not be able to consider all components because of the daunting task that lies ahead. As you have probably guessed, the next step is to define the RAS for all the "yes" answers.
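The questionnaire above can be tracked with very little machinery. The following is only a minimal sketch, assuming each team member records a yes or no per component; the class name, function name, team roles, and sample votes are all hypothetical and are not taken from any tool. A component enters the scope of production only when every member has answered and all answers are "yes"; anything without full agreement is flagged for another meeting.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScopeQuestion:
    """One questionnaire row: a component and each member's yes/no answer."""
    component: str
    votes: dict = field(default_factory=dict)  # member name -> True (yes) / False (no)

    def decision(self, team: list) -> Optional[bool]:
        """Return the agreed answer, or None if the team has not yet agreed."""
        if any(member not in self.votes for member in team):
            return None  # someone has not answered yet
        answers = {self.votes[member] for member in team}
        return answers.pop() if len(answers) == 1 else None


def scope_of_production(questions: list, team: list):
    """Split components into agreed 'yes' answers and unresolved ones."""
    in_scope, unresolved = [], []
    for question in questions:
        result = question.decision(team)
        if result is None:
            unresolved.append(question.component)
        elif result:
            in_scope.append(question.component)
        # An agreed 'no' is simply left out of the scope.
    return in_scope, unresolved


# Hypothetical team and answers, for illustration only.
team = ["networking", "data center", "DBA", "desktop support", "key user"]
all_yes = {member: True for member in team}
questions = [
    ScopeQuestion("mainframe", dict(all_yes)),
    ScopeQuestion("network (WAN)", dict(all_yes)),
    ScopeQuestion("PC/workstation", {**all_yes, "key user": False}),
    ScopeQuestion("LAN", {"networking": True}),
]
in_scope, unresolved = scope_of_production(questions, team)
print("RAS required for:", in_scope)
print("Still to be agreed:", unresolved)
```

Keeping the unresolved components visible, instead of quietly dropping them, matches the point that the answers must be revisited as the environment changes.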
Now that your team has identified the scope of production for the enterprise, it is time to define and plan the implementation of the RAS processes to support it. This is a huge effort and can take a long time, but it must be done (you can pay now or pay later). The returns can be seen in improved service, availability, and productivity, and in reduced costs. However, it does take an up-front investment in staff time, outside consultants, or both. The first step is to define a priority list of the processes to implement.
Because there are many processes in the total list (a comprehensive list is provided below), it usually makes sense to start with the top five or ten that are currently causing the most "pain" in your environment. We normally do not provide a generally recommended top ten list because it depends on the enterprise, the scope of production, and, probably most importantly, the corporate culture. However, our company has been involved with several case studies from our workshops over the past two years, so we will provide a list you can use as a starting point (and there are only eight). They are:
To get started it may be easiest to identify a simple implementation. For example, a minimum and sufficient implementation of change management would be to:
That wasn't so hard! Now identify the implementation criteria for the next one on your list, and so on. Once the list is complete, make another list, and continue until all the processes have been defined and implemented. Along the way, you may want the team to review the completed processes, including how well they are working, how effective they are, whether they are making a difference, and whether improvements are required. Remember, this is an evolving procedure. It can take a lot of time and investment, but we cannot overemphasize the importance and the potential benefits. We put the process definition for metrics on our list because we always recommend developing internal IS measurements. These measurements should tell us how well we are doing and whether we are getting any better. They are not metrics intended for our customers. Remember our saying: you manage what you measure! Finally, once the processes are in place and effective, you should look for a tool to automate them. There are various tools available from many different vendors. The important factor here is that tools are only as good as the processes and standards that support them!
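As one illustration of an internal IS measurement, the sketch below computes availability for a reporting period and checks whether the latest period improved on the previous one. Which metrics to track is up to each enterprise; the choice of availability, the function names, and the sample outage figures are assumptions made for this example only.

```python
def availability_percent(scheduled_minutes: float, outage_minutes: float) -> float:
    """Percentage of scheduled time during which the service was up."""
    if scheduled_minutes <= 0:
        raise ValueError("scheduled_minutes must be positive")
    if not 0 <= outage_minutes <= scheduled_minutes:
        raise ValueError("outage_minutes must be between 0 and scheduled_minutes")
    return 100.0 * (scheduled_minutes - outage_minutes) / scheduled_minutes


def is_improving(history: list) -> bool:
    """True when the most recent period beats the one before it."""
    return len(history) >= 2 and history[-1] > history[-2]


# Hypothetical figures: two 30-day months of round-the-clock service.
minutes_per_month = 30 * 24 * 60  # 43,200 scheduled minutes
history = [
    availability_percent(minutes_per_month, 216),  # first month: 216 minutes of outage
    availability_percent(minutes_per_month, 108),  # second month: 108 minutes of outage
]
print(history, "improving" if is_improving(history) else "not improving")
```

Comparing one period with the next is the simplest way to answer "are we getting any better?"; a real measurement program would likely track several processes and a longer history.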
Another key consideration is to make sure you define an owner for these processes. No owner means no accountability, and to be effective there must be accountability. We usually recommend the data center organization (if there is one) as the owner, since it maintains ownership of the mainframe RAS processes.
The more extensive RAS processes list
Here is a fairly comprehensive list of all the processes that support RAS in the enterprise. Please note that they are not listed in any priority order.
Harris Kern (firstname.lastname@example.org) is Sun's Open Systems Migration Consultant for NAAFO Market Development. Randy Johnson (email@example.com) owns R&H Associates, a full-time rightsizing consultancy in Boulder Creek, CA. R&H Associates helps people worldwide in implementing and supporting client/server infrastructures based on their proven methodologies. © 1997 Harris Kern and Randy Johnson. All rights reserved.
Harris Kern and Randy Johnson are authors of Rightsizing The New Enterprise: The Proof, Not the Hype and coauthors of Managing The New Enterprise: The Proof, Not the Hype, and Networking The New Enterprise: The Proof, Not the Hype.