
Performance analysis of client/server applications

A step-by-step guide to troubleshooting
application response problems on Unix-based systems

By Uday O. Pabrai

SunWorld
July  1996

Abstract
When you encounter slow application response times, it is often difficult to pinpoint the problem. Troubleshooting in today's complex client/server environments requires comprehensive analysis. We take you through step-by-step solutions to performance problems in Unix-based computers, databases, networks, and applications.
(4,600 words)


Are you experiencing network, system, database, or application performance problems? Are users frustrated with application response times? In today's client/server, distributed, LAN/WAN computing environments, it is difficult to troubleshoot and correctly diagnose why applications or systems are performing "slower than usual." Further, an inadvertent parameter change on a communications device may affect how the network, system, or application performs in unexpected ways.

Client/server computing introduces both complexity and dependencies between core computing elements. No longer is the operating system or application stored locally. The client system accesses information over the network and from a multitude of systems on the LAN, the WAN, and yes, the Internet. Poor performance on any server the client depends on, or on any intermediate network, will lead to poor response time. How do you find the problems and improve performance with such a complicated configuration?

Consider a client system that is dependent on the network so the operating system can be downloaded at boot time. As more nodes are added to the network and as nodes exchange more information, congestion on the network increases, leading to increased collisions and re-transmissions. Or, if the Maximum Transmission Unit (MTU) of some communications device has been changed (reduced), it takes more packets to boot the client system than before. In both situations, what users will notice is that the system is taking longer to boot. Adding more memory, or a faster processor on the client side, will not lead to any significant change in performance. Being able to correctly select the parameter that has changed or a threshold that has been exceeded is critical in identifying bottlenecks.
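The effect of a reduced MTU on packet count can be estimated with simple arithmetic. The sketch below is illustrative only: the 4 MB boot image size and the 40 bytes of per-packet header overhead (a 20-byte IP header plus a 20-byte TCP header, without options) are assumptions, not figures from any particular system.

```python
import math

def packets_needed(payload_bytes, mtu, header_bytes=40):
    """Packets required to carry payload_bytes when each packet holds
    at most (mtu - header_bytes) bytes of data.
    header_bytes=40 is an assumed IP + TCP overhead."""
    data_per_packet = mtu - header_bytes
    if data_per_packet <= 0:
        raise ValueError("MTU must be larger than the header overhead")
    return math.ceil(payload_bytes / data_per_packet)

# Hypothetical 4 MB boot image, sent with two different MTU settings.
boot_image = 4 * 1024 * 1024
for mtu in (1500, 576):
    print(mtu, packets_needed(boot_image, mtu))
```

With these assumed numbers, dropping the MTU from 1500 to 576 bytes more than doubles the number of packets, and no client-side hardware upgrade changes that.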

The key challenge that many of us face as system administrators and application developers is figuring out what parameter or factor in which system or network is preventing applications from performing optimally and consistently. You must follow a step-by-step problem solving approach, which we outline here. There are four key areas you should examine to resolve any performance problem successfully:

  1. System environment
  2. Network
  3. Database
  4. Application

Methodology
Timely identification of bottlenecks is key to maintaining consistent application performance. Most users are not tolerant of an application that performs inconsistently -- fast one day and slow the next. Your first objective is to make the application perform consistently. An application that used to perform consistently but has recently been inconsistent typically points to one or more resources being used to their limits.

For example, the application may be waiting on such system resources as:

  1. The CPU on the client node.
  2. The CPU on the server node.
  3. The CPU on the database server node the application needs to access.
  4. Memory on the client.
  5. Memory on the server.
  6. Memory on the database server.



System environment
The system environment includes two areas:

The client/server application executes, at a minimum, on two systems. Typically, associated with any client/server application are three critical systems:

  1. The PC users work with (the front-end GUI).
  2. The system that functions as a file server for the end-user PC. The application executable resides on the file server system.
  3. The host that is configured as the database server.

The configuration of all systems must be clearly defined. The important configuration elements are:

System related parameters are summarized in the following table.

System Elements | Things to consider
1. Client System Configuration | What is the processor on the client system? Intel 80486, Pentium, Digital Alpha, MIPS? What is the clock speed? 66 MHz, 120 MHz?
2. Operating System Server Configuration | What is the processor on the operating system server? Intel 80486, Pentium, Digital Alpha, MIPS? What is the clock speed? 66 MHz, 120 MHz?
3. Database Server Configuration | What is the processor on the database server system? Intel 80486, Pentium, Digital Alpha, MIPS? What is the clock speed? 66 MHz, 120 MHz?
4. Installed Memory/Maximum Memory on Client System | To determine if the system is configured optimally for the client/server and other applications.
5. Installed Memory/Maximum Memory on Server System | To determine if the system is configured optimally for the client/server and other applications.
6. Installed Memory/Maximum Memory on Database System Server | To determine if the database system is configured optimally for the client/server and other applications.
7. Virtual Memory/Maximum Virtual Memory on Client System (if applicable) | To determine if the system is configured optimally for the client/server and other applications.
8. Virtual Memory/Maximum Virtual Memory on Server System (if applicable) | To determine if the system is configured optimally for the client/server and other applications.
9. Network Interface on Client System | Ethernet, Token Ring, and/or FDDI.
10. Network Interface on Operating System Server | Ethernet, Token Ring, and/or FDDI.
11. Network Interface on Database Server | Ethernet, Token Ring, and/or FDDI.

You need to determine accurately, and state, which components of the client/server application run on which systems. For example, in the case of a Powerbuilder application, the Powerbuilder client application runs on the end-user PC while the Powerbuilder (server) executable may run on a file server. Both the Powerbuilder client application and the Powerbuilder (server) executable run at the branch (remote) office, which supports its own LAN.

On Unix systems, you can use the following commands to provide information on how the system is performing:

The iostat command provides I/O statistics such as transfers per second, bytes per second, and milliseconds per seek. By default, information is averaged since the system was booted.

The vmstat (Virtual Memory Statistics) command provides information on virtual memory, disk access, and CPU utilization. The results provided by the command are averaged since the system was booted (initialized). To get information on peak system activity, which may indicate potential system bottlenecks, specify an interval as an argument to the vmstat command. For CPU statistics specify an interval of about two seconds, while for disk statistics specify an interval of 60 seconds. The vmstat command sleeps for the interval defined. Typically, if the CPU idle time is greater than 20 percent then it implies the system is either I/O bound or memory bound.

CPU idle time includes the time:

  1. The CPU is not in use because it has nothing to do.
  2. CPU is waiting for memory.
  3. CPU is waiting for I/O.

Examine the output under columns for:

r: Provides information on jobs that are currently runnable. If this number is high then it implies that the CPU is forced to switch between runnable jobs, which may indicate that the system is CPU-bound.

b: Provides information on jobs sleeping at negative priority typically because a process is waiting for disk, tape, or other resources. If this number is high and CPU idle time is high then it could indicate that the system is I/O bound.

w: This field specifies the number of jobs that executed in the last 20 seconds and have now been swapped out. If this field is nonzero then it implies that the system may not have sufficient memory.
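The rules of thumb for the r, b, and w columns can be written down as a small classifier. This is a minimal sketch, not a diagnostic tool: the function name and the threshold of four jobs for a "high" r or b value are assumptions, while the 20 percent idle figure is taken from the text above. Reading the values out of real vmstat output is left out because the column layout varies between Unix flavors.

```python
from typing import List

def classify_sample(r: int, b: int, w: int, idle_pct: float,
                    high_queue: int = 4, high_idle: float = 20.0) -> List[str]:
    """Apply the r/b/w rules of thumb to one vmstat sample.
    high_queue is an assumed cut-off for what counts as 'high'."""
    findings = []
    if r >= high_queue:
        # Many runnable jobs: the CPU is switching between them.
        findings.append("possibly CPU-bound")
    if b >= high_queue and idle_pct > high_idle:
        # Jobs blocked on resources while the CPU sits idle.
        findings.append("possibly I/O-bound")
    if w > 0:
        # Recently run jobs have been swapped out.
        findings.append("possibly short of memory")
    return findings

print(classify_sample(r=6, b=0, w=0, idle_pct=2.0))
print(classify_sample(r=0, b=5, w=2, idle_pct=60.0))
```

A single sample proves little; the classification is only meaningful when the same finding shows up across many intervals.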

The ps command, with options such as aux on a BSD Unix system or -ef on SVR4, provides useful information on processes running on the system. Check the information provided on:

  1. Per process CPU utilization -- look in the %CPU column.
  2. Per process memory utilization -- look in the %MEM column.
  3. The current state of the process -- look in the STAT column.

Identify the processes that are consistently the highest users of CPU and/or memory. Note that the memory information provided relates to physical memory and does not include the memory used by the kernel or the instruction segment for each process. Examine the STAT column, and if the process is ever in the RW state it implies that the system is experiencing memory problems -- specifically a shortage of memory. RW implies that the system swapped out a process that was either running or had run recently.
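As an illustration, the checks on %CPU, %MEM, and STAT can be scripted against captured ps output. The sample text below is made up for this sketch, and the parser assumes a BSD-style header that contains PID, %CPU, %MEM, and STAT columns; real output differs between Unix flavors, so treat the column handling as an assumption to verify locally.

```python
# Hypothetical BSD-style ps output; not captured from a real system.
SAMPLE_PS = """\
USER       PID %CPU %MEM   SZ  RSS TT STAT START  TIME COMMAND
root         1  0.0  0.1   52   28 ?  IW   08:01  0:02 init
sybase     412 61.3 22.4 9120 7044 ?  R    09:12 41:07 dataserver
user1      977  3.2  4.0  880  512 p0 RW   10:40  0:11 report
"""

def parse_ps(text):
    """Return one dict per process using the header to locate columns."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    col = {name: header.index(name) for name in ("PID", "%CPU", "%MEM", "STAT")}
    rows = []
    for line in lines[1:]:
        fields = line.split()
        rows.append({
            "pid": int(fields[col["PID"]]),
            "cpu": float(fields[col["%CPU"]]),
            "mem": float(fields[col["%MEM"]]),
            "stat": fields[col["STAT"]],
        })
    return rows

rows = parse_ps(SAMPLE_PS)
top_cpu = max(rows, key=lambda p: p["cpu"])
top_mem = max(rows, key=lambda p: p["mem"])
# Processes in the RW state hint at a memory shortage.
swapped_runnable = [p["pid"] for p in rows if p["stat"].startswith("RW")]
print(top_cpu["pid"], top_mem["pid"], swapped_runnable)
```

Running such a script at regular intervals and keeping the results makes it easier to see which processes are consistently at the top, rather than at the top in one snapshot.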

The uptime command provides useful information on:

  1. How long the system has been running (up).
  2. The number of users on the system.
  3. Load averages of active jobs in the system for the last one, five, and 15 minutes.

The kernel maintains information on the averages of the count of active jobs in the system for the last one, five, and 15 minutes. The first load number provides information on the current CPU load. If the number is greater than four the system may be CPU-bound.
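The load-average check is easy to automate. The following sketch parses a line of uptime output and applies the greater-than-four rule of thumb from the text; the sample line is invented, and the regular expression assumes the common "load average: a, b, c" layout, which should be checked against the uptime output on your own system.

```python
import re

def load_averages(uptime_line):
    """Extract the 1-, 5-, and 15-minute load averages from uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+)[,\s]+([\d.]+)[,\s]+([\d.]+)", uptime_line)
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())

def may_be_cpu_bound(uptime_line, threshold=4.0):
    """True when the 1-minute load average exceeds the threshold."""
    one_minute, _, _ = load_averages(uptime_line)
    return one_minute > threshold

# Hypothetical uptime output.
line = "  3:15pm  up 12 days,  4:02,  9 users,  load average: 5.12, 3.40, 2.01"
print(load_averages(line), may_be_cpu_bound(line))
```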

Execute the pstat -T command to display the amount of free swap space.

On Sun multiprocessor systems, note these two commands:

The psrinfo -v command provides information on the type of processor that you are using and its status.

The mpstat command provides information that is similar to the vmstat command. Note the smtx field, which counts the number of times a CPU tried to acquire a mutex (a lock that grants exclusive access) and did not get it on the first attempt. This happens if another CPU is holding on to the same data structure, thus indicating that there is contention for the same resource. A high number implies that a CPU was forced to wait for another CPU to release the lock.

Consult the man pages on your Unix system to determine how to interpret the data correctly in the various fields provided by the commands discussed in this article. Field names and output format vary from one flavor of Unix to another.

Also, consider using the public domain utility top, which displays and regularly updates information about the top 15 processes on the system.

Network
There are significant differences, from a network perspective, between running a client/server application over a Local Area Network (LAN) and running the same application over a Wide Area Network (WAN). The impact of network latency is more pronounced on WANs. This is primarily because typical WAN segments operate at 56 or 64 kilobits per second while most LAN segments operate at 10 megabits per second. Propagation delay, and the delays introduced as routers process packets, will also affect the performance of the client/server application. These delays may be more pronounced on WANs than LANs.
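To get a feel for the difference, the time needed just to push data onto the wire can be estimated from the link speed. The sketch below is a rough model under stated assumptions: the 100,000-byte result set, the 20 round trips, and the 60-millisecond round-trip time are hypothetical values, and protocol overhead, queuing, and retransmissions are ignored.

```python
def transfer_seconds(payload_bytes, link_bps, round_trips=0, rtt_seconds=0.0):
    """Serialization time for the payload plus time spent in round trips.
    Ignores protocol headers, queuing, and retransmissions."""
    return payload_bytes * 8 / link_bps + round_trips * rtt_seconds

result_set = 100_000  # hypothetical query result, in bytes

lan = transfer_seconds(result_set, 10_000_000)
wan = transfer_seconds(result_set, 56_000, round_trips=20, rtt_seconds=0.060)
print("LAN: %.2f s, WAN: %.2f s" % (lan, wan))
```

Even this crude estimate shows why a screen that feels instantaneous on the LAN can take many seconds from a branch office.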

In the network area, the performance of an application may be impacted by the following:

Network related parameters are summarized in the following table.

Network Elements | Things to consider
1. Protocol Stack -- Single or Multiple | Is all communication between the client and server applications over a single protocol stack such as TCP/IP or are multiple protocol stacks involved?
2. Protocol Stack on Client System Segment | Examples are: TCP/IP, Novell's NetWare (IPX/SPX), AppleTalk, SNA, and/or DECnet.
3. Protocol Stack on Server System Segment | Examples are: TCP/IP, Novell's NetWare (IPX/SPX), AppleTalk, SNA, and/or DECnet.
4. Data Rate -- Client System Segment | Examples are: Ethernet 10 Mbps, Token Ring 4 Mbps or 16 Mbps, or FDDI 100 Mbps.
5. Data Rate -- Server System Segment | Examples are: Ethernet 10 Mbps, Token Ring 4 Mbps or 16 Mbps, or FDDI 100 Mbps.
6. Data Rate on WAN (if applicable) | Examples are: Frame Relay 56 kbps, 256 kbps, T1 1.5 Mbps, T3 45 Mbps, or ATM. Are there other factors that impact performance? For example, Frame Relay CIR.
7. Average Utilization on Client Segment (business hours only) | To determine how the LAN segment, to which the client system is connected, is performing.
8. Average Utilization on Server Segment (business hours only) | To determine how the LAN segment, to which the server system is connected, is performing. If the load on the server segment is consistently high then that could be a factor that impacts the performance of the client/server application, even if the application itself does not place a significant load on the network.
9. Average Utilization on WAN Segment (business hours only) | To determine the load on the WAN.
10. Dominant Protocol on Client LAN Segment | Is it primarily IPX/SPX or TCP/IP? Within the protocol stack, which protocol is seen the most on the client LAN segment? For example, if TCP/IP is the dominant protocol stack then is it NFS, XWS, NIS, RIP, SNMP or some other protocol that generates the most packets on the network?
11. Dominant Protocol on Server LAN Segment | Is it primarily IPX/SPX or TCP/IP? Within the protocol stack, which protocol is seen the most on the server LAN segment? For example, if TCP/IP is the dominant protocol stack then is it NFS, XWS, NIS, RIP, SNMP or some other protocol that generates the most packets on the network?
12. Dominant Protocol on WAN Segment | Is it primarily IPX/SPX or TCP/IP? Within the protocol stack, which protocol is seen the most on the WAN? For example, if TCP/IP is the dominant protocol stack then is it NFS, XWS, NIS, RIP, SNMP or some other protocol that generates the most packets on the network?
13. Routing Protocol | Examples are: RIPv1, RIPv2, OSPF, IGRP.
14. Routing Tables Exchanged Dynamically? | Example: Is the routed daemon (or some other routing process) running on the system? Is the route command used to define static routes?
15. Number of links (hops) between client and server systems | To determine the impact of network latency on the performance of the application. Is there a way to reduce the number of hops between the client application and the server application? If yes, is that consistent with the network architecture? If it is not consistent with the network architecture then what are the network issues involved? Typically, what is the latency in the router to process a packet?
16. Client System Router CPU Utilization | To determine if the performance of the application is impacted by a busy router, that is, a router so busy that it is unable to keep up with the number of packets it needs to process.
17. Server System Router CPU Utilization | To determine if the performance of the application is impacted by a busy router, that is, a router so busy that it is unable to keep up with the number of packets it needs to process.
18. Client System Router -- LAN Segment MTU | Verify that the Maximum Transmission Unit (MTU) is set to the highest value defined by the LAN technology in use.
19. Client System Router -- WAN Segment MTU | Verify that the MTU is set to the highest value defined by the WAN technology in use.
20. Server System Router -- LAN Segment MTU | Verify that the MTU is set to the highest value defined by the LAN technology in use.
21. Server System Router -- WAN Segment MTU | Verify that the MTU is set to the highest value defined by the WAN technology in use.

To summarize, the following information is key to determining if the network is the bottleneck:

  1. Average utilization of LAN segments to which the client, operating system server, and database server are connected.
  2. Peak utilization of key systems, for example, client node, operating system server, and database server node.
  3. Frame size distribution on key LAN and WAN segments. Use this information to determine if the application or database server has any parameters or options that affect packet size.
  4. Protocol types on key LAN and WAN segments. Which protocols place the most load on the network? Is there a way to optimize network load by examining where systems are located on the network?

On Unix systems, you can execute the following commands to provide information on how the system is configured on the network and how the system is using the network:

The ifconfig command provides information on the IP address(es), subnet masks, and broadcast addresses used by the network interface(s) on the system. Verify the MTU value for each network interface in the output of the ifconfig command.

The netstat -a command lists the state of all network connections. netstat -s provides information on the number of IP, TCP, UDP, and, most importantly, ICMP packets processed by the system. Verify the types of ICMP messages, especially Source Quench, Redirect, and Time Exceeded -- these typically imply some type of network or communication device problem. The netstat -r command provides routing table information.

The nfsstat command provides information on NFS server and client performance. Examine the server portion of the output when you execute the command on the NFS server (or execute nfsstat -s); likewise, look at the client portion of the nfsstat output if the command is executed on the NFS client (or execute nfsstat -c). In general, if NFS is used significantly in the environment, consider using an NFS write accelerator product, such as the Legato Prestoserve. One of the major bottlenecks in NFS performance is synchronous writes. With a product such as Prestoserve, the write requests are written into a battery-backed RAM buffer, and an immediate ACK is sent to the client. The requests are then written to disk on the server at a later time.

Database
To determine how the database performs, you need to address the following areas:

Database related parameters are summarized in the following table.

Database Elements | Things to consider
1. Software | Describe the database software. For example, Sybase System 10.0.2 with three database engines running. Database engine 0 is responsible for all network I/O while engines 1 and 2 process all queries.
2. Data Characteristics | Is most of the data accessed by users read-only in nature? Are some screens (scripts) read-only (such as reports) while others are read-write (such as a new loan order)?
3. Database System Architecture | Is the database system architecture centralized or distributed? Why? Based on the requirements of the application, does it make sense to configure database replication servers? If a significant number of transactions require read-only access to data then configuring database replication servers may help improve application performance.
4. Single Processor vs. Multiple Processors | Is the database able to evenly distribute load between multiple processors? Is the processing of database requests symmetric or asymmetric? For example, if there are multiple database engines then can any engine service network I/O and database queries or are there limitations? Does the database administrator have control over the work executed by each database engine (process)? For example, database engine 0 may be reserved to only process network I/O, while two additional database engines process database queries. Does the database vendor recommend how to effectively utilize engines on a multi-processor system? What are the merits of using a multi-processor system with multiple database engines versus multiple database server systems (central server with replica servers)?
5. Processing -- front-end vs. back-end | Should some parts of the application be written as stored procedures and others as scripts generated by software such as Powerbuilder? How is that determination made?
6. Performance | Execute the SP_WHO command on a consistent basis to determine the state of the database engines. Typically, how many users are logged in? As far as the database is concerned, what are most users doing? For example, are there a large number of SELECTs? Do the SELECTs last for a long time? What event in the application is forcing the database to spend considerable time processing information? Can it be justified? Are there alternatives available, either in the application design and coding or in the way the database is configured, that will reduce the load on the database and the system? Execute the SP_LOCK command on a consistent basis to determine pages that are locked. Typically, how many pages are locked? Who are the users whose pages are locked? What is the lock type? How long do the locks exist?

Application
To determine how the applications perform, you need to address the following areas:

Application related parameters are summarized in the following table.

Application Elements | Description
1. Software | Describe the software that was used to develop the application. For example, Powerbuilder version 4.0.3 may be required on the Novell file server and the DLLs are downloaded to the client PC per Powerbuilder application login.
2. End User Application Interface | Describe how the end user invokes or gains access to the application. Identify the systems involved in this initialization process.
3. Number of different modules, routines or screens in the application | Describe the application in terms of each of its elements. Each element (routine) may use the network or system differently and it is imperative that each module be analyzed individually.
4. Processing -- front-end vs. back-end (business rules, computation) | If the developer has a choice to write a module that requires either significant utilization of system resources on the server host or significant utilization of system resources on the client host, then which one should the developer select? Is the preference to do most processing at the front-end or the back-end? It may be that the processing of all business rules is executed at the front-end while all computation is done at the back-end.
5. Module Execution and Systems | For each module we need to determine what part of the module executes on which system.
6. Version Control System | Name the version control system in use. Define the methodology to introduce version changes.
7. CPU Utilization | It's important to characterize the CPU utilization on each system (client and server) for each module. If a given module requires significant CPU resources on a given system then it needs to be determined if there is any change that may be made to the application to provide the functionality and to improve the utilization of CPU resources. For example, when does it make sense to use an SQL GROUP BY clause versus the SQL WHERE clause? In a situation where computation between fields is not required, a WHERE clause may be appropriate. This may reduce the CPU load on the database server, hence improving the response seen by the end user. The objective is to question the impact on system load of one set of SQL calls versus another, or of the system calls and functions used.
8. Network Utilization | Need to characterize the network utilization of all segments (client LAN, WAN, server LAN) for each module. Need to further determine: the protocol used by the module; the number of packets generated by each module; the number of acknowledgements per module; the average packet size for all data exchanged for a specific module; and the total number of bits exchanged for a specific module. It needs to be determined if there is any change that may be made to the application to provide the functionality but improve the utilization of network resources. For example, an SQL query may require the entire database to be searched instead of optimizing or limiting the search to a subset of records.
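The difference between searching an entire table and limiting the search to a subset of records can be made visible by asking the database how it plans to execute a query. The sketch below uses SQLite from the Python standard library purely as a stand-in; it is not Sybase, and the loans table, its columns, and the index name are hypothetical. Other database products expose their query plans through their own commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loans (id INTEGER PRIMARY KEY, branch TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO loans (branch, amount) VALUES (?, ?)",
    [("branch%d" % (i % 50), 1000.0 + i) for i in range(5000)])

def plan(sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " / ".join(row[-1] for row in rows)

query = "SELECT id, amount FROM loans WHERE branch = ?"

before = plan(query, ("branch7",))   # no index: every row is examined
conn.execute("CREATE INDEX idx_loans_branch ON loans (branch)")
after = plan(query, ("branch7",))    # index: only matching rows are visited

print(before)
print(after)
```

The exact wording of the plan depends on the SQLite version, but the first plan reports a scan of the whole table while the second reports a search through the index. The same question applies to any module that issues SQL: does the query touch only the records it needs?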

The last word
Before an application, a new version of an application, or any enhancement to it is moved to a production system, it is important to run tests that verify the impact of the change on the various system elements. Understanding how the application performs with the changes introduced is critical because the application competes with users and other applications for CPU and network resources. How the new release of the application functions will determine not just the performance of the client/server application but that of other applications on the network.

Performance testing must be a key element in the process of moving an application from development to a production environment. The tools to analyze system, network, database, and application-related entities must be in place to effectively determine bottlenecks and recommend solutions.
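One simple form such a test can take is a comparison of response times measured before and after a change. The sketch below is only an illustration: the timings are invented, and the choice of the median as the summary statistic and of 10 percent as the acceptable slowdown are assumptions to adjust for your own environment.

```python
from statistics import median

def slowdown_pct(baseline, candidate):
    """Percentage change in median response time, candidate vs. baseline."""
    base = median(baseline)
    new = median(candidate)
    return 100.0 * (new - base) / base

def acceptable(baseline, candidate, max_slowdown_pct=10.0):
    """True when the candidate is not slower than the assumed limit allows."""
    return slowdown_pct(baseline, candidate) <= max_slowdown_pct

# Hypothetical response times, in seconds, for the same screen.
current_release = [1.8, 2.0, 2.1, 1.9, 2.2]
new_release = [2.4, 2.6, 2.5, 2.3, 2.7]

print(slowdown_pct(current_release, new_release),
      acceptable(current_release, new_release))
```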






About the author
Uday O. Pabrai is an industry expert providing solutions in the areas of Internet, intranet and TCP/IP architecture, infrastructure and deployment. His clients include Fortune 1000 companies and U.S. Government agencies, among them Microsoft, AT&T, CBOE, Landis & Gyr, Norwest Mortgage, and Thomas J. Lipton. His articles have appeared in several publications. Reach Uday at uday.pabrai@sunworld.com.


[(c) Copyright Web Publishing Inc., an IDG Communications company]


URL: http://www.sunworld.com/swol-07-1996/swol-07-csapp.html