A Statistics Based Approach for Performance Management in Distributed Systems Sunil Santha Department of Computer Science Texas A&M University College Station, Texas 77843-3112.
[email protected] Tel: (409)845-3858 Fax: (409)847-8578 and Udo W. Pooch Department of Computer Science Texas A&M University College Station, Texas 77843-3112.
[email protected] Tel: (409)845-5498 Fax: (409)847-8578 May 1997
TR-97-016
Abstract
The present trend is to build large distributed systems comprising high speed networks and high performance heterogeneous workstations. As more complex distributed systems are built, the traditional tools used to manage computer systems become inadequate. In the search for effective tools for the management of current distributed systems, several different approaches have been proposed, and some have been implemented as prototypes. Here a general approach is presented in which a multivariate statistics based mechanism is used as the core strategy for reducing the large amount of data collected from the distributed system. The system presented comprises four basic components: Data Collection, Data Analysis, Performance Indication, and System Control. A distributed system with its global and local schedulers under the control of this tool will be able to maintain its performance in an optimum range.
1 Introduction
The proliferation of high performance workstations and the advent of high speed communication networks gave rise to a new form of computing that combines the two. The current trend towards grouping general purpose workstations using local and wide area networks provides many benefits. In addition to high performance, these distributed systems can provide improved reliability and availability, ease of modularity, incremental growth, configuration flexibility, and improved resource utilization. Current system development is based on the client-server concept. This, together with the popularity of intranets and the Internet, is another reason for the present interest in network based distributed computing.

Distributed system is a term used to refer to a variety of systems. Many authors have attempted to define and categorize distributed systems [1],[2]. In this paper we consider a distributed system consisting of a set of computing resources connected by a network. Any node in this system can act as a client, a server, or both. The system is transparent to its users: to a user, the whole system looks like one large machine. This is sometimes referred to as the single system image. The requests submitted by clients are executed by the servers, and a load leveler is used to optimize the throughput of the system.

As the size and complexity of a distributed system increase, the management of the network and its resources becomes a challenging problem. According to the distributed systems management model put forward by ISO (the International Organization for Standardization) [3], the major components of the functional scope of distributed systems management are as follows: 1. Fault management; 2. Configuration management; 3. Accounting management; 4. Performance management; 5. Security management. The methodology presented in this paper is intended as a tool to assist in performance management.
In order to take full advantage of the benefits of distributed systems, and to tailor a system to a given environment, system administrators need detailed information about the system's behavior. It is, however, difficult to gain insight into the interaction and performance characteristics, because there is no global or central control of activities in distributed systems. There is also what is referred to as the lack of common knowledge [4], or of an announcement facility. Common knowledge is similar to an "announcement" made simultaneously to all the processors. The processors share information only through the communication network, which may behave in uncertain ways, so in a distributed system it is very difficult, if not impossible, for the processors to achieve common knowledge.

Even though the factors mentioned above make it hard, performance measurements are essential to evaluating the effectiveness of a distributed system. Performance data collected from a distributed system can be used for several purposes, including understanding and improving the execution of a distributed program, dynamic allocation of computer resources, helping in load sharing, properly comparing different systems, and improving the design of future systems. If the performance monitor operates independently of and concurrently with the target system, it may be used to provide some degree of fault tolerance in addition to performance monitoring [5]. This may be achieved by using the output of the performance monitor to recognize tendencies towards faulty states and to take steps to avoid them. It may also be possible to recover from a faulty state without a total shutdown if the faulty state is recognized in time with the help of a performance monitor. The main problem in evaluating computer systems in general is their complexity and the fact that these systems usually have not been designed to be monitored.
In order to reduce the complexity of the problem, a locally distributed system with load sharing is considered. Figure 1 shows the configuration of such a system. Jobs enter the system via its nodes (clients) and are processed by those and other nodes (servers) in the system. The performance of the distributed system may be considered as a function of the performance of its individual processors and other resources. This function reflects the organization of the system, including the scheduling policies of the systemwide job scheduler and of the task schedulers, communication aspects, and workload characteristics, among other things. Load sharing can be considered part of the larger distributed scheduling problem [6]. As shown in Figure 1, the local scheduling subproblem (task scheduling) addresses the allocation of processing resources to jobs within one node. The global scheduling subproblem (process scheduling) addresses the determination of which jobs are processed by which node. A key function of the job or process scheduler is load sharing, the appropriate sharing and allocation of system processing resources.

[Figure 1: Distributed system with a load leveler and performance feedback. Diagram labels: Workload; C/S = Client/Server; Node Scheduler (Task Scheduler); Systemwide Scheduler (Process Scheduler); Performance Variables; Distributed System with Load Leveling.]

In this methodology, the statistically derived performance variables are coupled with a feedback to the systemwide scheduler (process scheduler) as well as to the node schedulers (the task scheduler in each node). These schedulers can be made to use the additional information provided by the feedback in their respective job allocation and task allocation algorithms to enhance the overall system performance. The performance variables produced by the proposed methodology are also used in a suitable form to display the performance status of the system in real time. This could be used to raise an alarm to the system operator when performance deteriorates beyond a preset threshold.

The rest of the paper is organized as follows. Section 2 provides the design details of the proposed methodology, and Section 3 contains a summary and concluding remarks.
2 Methodology for Distributed Systems Performance Management
In a distributed system, monitoring is much harder than in a centralized system. This is mainly due to the distributed nature of the monitor. If the monitor works in real time, there are additional requirements above and beyond those of traditional techniques. In almost all the known systems built so far, measuring and monitoring take place at each node in the distributed system, and the data is collected, processed, and displayed at a central station.

During the past few years, several systems have been built, and some proposed, for measuring the performance and observing the behavior of distributed systems. Several experimental systems were designed for online monitoring of distributed systems. Of these, INCAS-TMP [7], JEWEL [8], Spearmints [9], and MCC ES-Kit [5] are some of the notable hybrid monitoring systems (monitors consisting of hardware and software). A proposal for an expert system based online performance management tool is presented in [10]. Approaches similar to the one presented in this paper have been applied to complex systems in many fields; yield optimization in semiconductor manufacturing, finance, health science, and sociology are some of the fields where such statistics based approaches are used.
2.1 System Design
[Figure 2: Performance management system. Diagram labels: Distributed System (Clients, Applications, Application Servers, Local Area Networks); (1) Data collection; Raw data; (2) Analysis; (3) Graphical display; (4) Control; Performance log; Data base.]

The proposed system for performance management is designed to run in the background together with the distributed operating system. The main components of the performance management system are:
1. Data Collection (determination of what to collect and how to collect it).
2. Data Analysis (reduction using multivariate statistical methods).
3. Performance Indication (method of displaying the results).
4. System Control (establishing a feedback control mechanism).
Figure 2 shows the basic block diagram of an application of this methodology. Data collection is closely associated with the distributed system. Analysis, performance indication, and system control are shown as separate interacting boxes. The raw data store is intended as a FIFO buffer for input data. The methodology is intended to work as an online monitoring tool, but historic data is kept in a cyclic buffer so that the user can go back a few steps for a replay if necessary. The performance log is a file or some other mass storage where the results of the analysis are stored. This is useful when long unattended periods must be monitored. The log is long term storage, whereas the raw data buffer is short term storage. The volume of raw data will be large compared to the amount of data produced by the analysis. The data base is intended as reference data for the analysis. It may be devised so that as the system gets more "experienced" it stores characteristic parameters to suit various situations. The analysis and control components, or the operator, will be able to recall these when a situation similar to one that has already been stored arises.
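As an illustration only, the short term raw data store described above could be sketched as a fixed-capacity cyclic buffer. The class name `RawDataBuffer`, its methods, and the sample layout are hypothetical and are not part of the proposed system; the sketch only shows how old samples can be discarded automatically while the most recent ones stay available for replay.

```python
from collections import deque


class RawDataBuffer:
    """Cyclic buffer holding the most recent measurement vectors (hypothetical)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest entry once it is full.
        self._samples = deque(maxlen=capacity)

    def push(self, sample):
        """Append one measurement vector taken at a sampling instant."""
        self._samples.append(tuple(sample))

    def replay(self, steps):
        """Return up to `steps` of the latest samples, oldest first."""
        if steps <= 0:
            return []
        return list(self._samples)[-steps:]

    def __len__(self):
        return len(self._samples)


buf = RawDataBuffer(capacity=3)
for t in range(5):
    buf.push((t, 10 * t))  # made-up (time step, queue length) samples
last_two = buf.replay(2)
```

A long term performance log would instead append the (much smaller) analysis results to a file, which is not shown here.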
2.1.1 Data Collection
Selected parameters are measured periodically at each node of the distributed system. The parameters are selected according to the performance objective, so that the set of measurements represents the current performance status of the system.
2.1.2 Data Reduction
As mentioned earlier, multivariate statistical methods are used to reduce the large volume of data collected. The underlying principle is to obtain a limited number of orthogonal variables by combining the measured variables. This smaller set of orthogonal variables contains almost the same amount of performance information as the original set of variables. The reduction is possible because the original variables are combined to form the orthogonal set and, unlike the original variables, no two of the orthogonal variables carry the same information. Assume the number of measured variables to be n and the number of measurements made on each of them to be m. These can be represented as a matrix in which each column holds the observations of one measured variable. Each row of the matrix represents a system status, since a row contains the values of all n variables. This matrix therefore has n columns and m rows.
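\[
\begin{pmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1n} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & x_{m3} & \cdots & x_{mn}
\end{pmatrix}
\]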
where $x_{11}, x_{21}, \ldots, x_{m1}$ are the measured values of the variable $X_1$. The matrix consists of the measurements of the measured variables $X_1, X_2, \ldots, X_n$ (X). We wish to find a set of parameters $Y_1, Y_2, \ldots, Y_p$ such that the $Y_i$ provide the maximum discrimination among the components. These p parameters can be written in the following form:

\[
Y_i = \sum_{j=1}^{n} a_{ij} X_j = a_i' X \qquad (1)
\]

or in matrix form:

\[
Y = AX \qquad (2)
\]

where each row of A contains the coefficients $a_i'$ for one parameter $Y_i$ (i goes from 1 to p). These parameters $Y_i$ are called principal components. The principal components form an orthogonal set; in other words, no pair in the set is correlated. For $i \neq j$:

\[
(Y_i, Y_j) = \sum_{k} a_{ik} a_{jk} = 0 \qquad (3)
\]

The transformation from the variables $X_j$ (j = 1..n) to the principal components $Y_i$ (i = 1..n) can also be viewed as a rotation of the n-dimensional coordinate system, so that the resulting new variables $Y_i$ are uncorrelated.
The multivariate methods used will result in a data matrix with p columns and m rows. We wish to have p < n. This resulting matrix contains almost the same information as the original matrix which had a larger number of columns (n) [11].
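The reduction described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the principal components are taken from the eigenvectors of the sample covariance matrix of the mean-centered data, and the function name `principal_components` and the synthetic data are invented for the example.

```python
import numpy as np


def principal_components(X, p):
    """Reduce an m-by-n data matrix X to p principal components.

    Returns (A, Y, explained): A is p-by-n with one coefficient vector per
    row, Y is the m-by-p matrix of component scores, and `explained` is the
    fraction of the total variance retained by the p components.
    Assumption: covariance-based PCA on mean-centered data.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)      # n-by-n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:p]     # indices of the p largest
    A = eigvecs[:, order].T                   # rows are coefficient vectors
    Y = centered @ A.T                        # Y = A X applied to every row
    explained = eigvals[order].sum() / eigvals.sum()
    return A, Y, explained


# Synthetic data: m = 200 samples of n = 4 measures driven by 2 hidden factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.3, 0.7]])
X = factors @ mixing.T + 0.01 * rng.normal(size=(200, 4))

A, Y, explained = principal_components(X, p=2)
```

In this made-up example two components retain nearly all of the variance of the four measured variables, which is the situation p < n that the text aims for. Whether to use the covariance or the correlation matrix depends on how the measured variables are scaled.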
2.1.3 Performance Indication
The final outcome of the reduction algorithm is the transformation matrix. This p x n matrix converts a vector of performance measures X containing n variables into a vector of performance variables Y containing p variables with the desired properties. These final composite variables are used for the graphical display. The graphical display should show a profile of the system's response to its task environment. One way to indicate the system performance status is with Kiviat graphs. These visual devices help system managers to recognize performance problems quickly by spotting deviations of the graphs from the standard patterns. In these graphs each axis represents an almost independent phenomenon: each axis carries a variable obtained from the principal component analysis of the original measured variables. The most popular form of this graph has an even number of axes. Half of these axes display what are referred to as "higher is better" (HB) variables and the rest "lower is better" (LB) variables [12]. These two types are assigned to alternating axes in the Kiviat graph. As a result, the Kiviat graph for a good system has a star form (HB axes having high values and LB axes having low values).
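To make the star shape concrete, the following sketch computes the polygon vertices of a Kiviat graph and checks the alternating HB/LB pattern. It assumes the performance variables have already been normalized to the range 0..1 and that even-numbered axes are HB and odd-numbered axes LB; the function names and the thresholds `hb_min` and `lb_max` are hypothetical choices for illustration.

```python
import math


def kiviat_vertices(values):
    """Map normalized axis values (0..1) to (x, y) polygon vertices.

    Axis k points at angle 2*pi*k/len(values). Even k are assumed to be HB
    axes and odd k LB axes, following the alternating layout in the text.
    """
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of axes")
    step = 2 * math.pi / len(values)
    return [(v * math.cos(k * step), v * math.sin(k * step))
            for k, v in enumerate(values)]


def is_star_shaped(values, hb_min=0.7, lb_max=0.3):
    """True when every HB axis is high and every LB axis is low."""
    hb = values[0::2]
    lb = values[1::2]
    return all(v >= hb_min for v in hb) and all(v <= lb_max for v in lb)


good = [0.9, 0.1, 0.8, 0.2, 0.95, 0.15, 0.85, 0.1]
degraded = [0.9, 0.1, 0.4, 0.2, 0.95, 0.6, 0.85, 0.1]
vertices = kiviat_vertices(good)
```

The vertex list can be handed to any plotting tool; a check such as `is_star_shaped` is one simple way an alarm could be raised when the pattern departs from the expected form.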
2.1.4 Performance Optimizer
It is possible to express the original performance measures in terms of the final set of principal components [11],[13]. This is also a linear combination. For this inverse transformation, the same transformation matrix that was used for generating the principal components can be used; the matrix is simply read column-wise rather than row-wise, as given in [11]:

\[
X_j = a_{1j} Y_1 + a_{2j} Y_2 + \cdots + a_{pj} Y_p \qquad (4)
\]

This type of relationship shows that a change in the final performance variables can be achieved by causing a change in the performance measures obtained from the system. The performance measures are the results of the measurements made on the system. By changing the corresponding task scheduling and job scheduling parameters, it is possible to achieve a change in the displayed performance variable. The mapping between these two end parameters is system specific. A feedback mechanism such as the one shown in Figure 3 causes the system to return to the range of optimum performance, indicated by the "good" shape of the Kiviat graph. A deviation of a performance variable out of its acceptable range triggers changes in the system to correct it. A control mechanism such as this will work on a system with gradual state changes. The tolerable rate of state change depends on how fast the system can react to a negative change and correct it. It may also be possible to extend this concept to a system with faster state changes. In this case the trends of change in the performance variables could be detected and a corresponding amount of feedback applied to correct the anticipated deterioration in performance (this is somewhat similar to proportional-integral-derivative control).

[Figure 3: Feedback control. Diagram labels: Workload; Load Leveller; Task Scheduler (Node); Job Scheduler (Systemwide Scheduler); Performance; Y_i = a_1 X_1 + a_2 X_2 + ...; X_i = b_1 Y_1 + b_2 Y_2 + ...]
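The feedback idea can be illustrated with a toy control loop. Everything in this sketch is an assumption made for illustration: the function names, the tolerance band, the gains, and the simulated system in which a performance variable drifts downward and responds directly to the correction. In a real system the correction would have to be mapped to scheduler parameters, and that mapping is system specific, as noted above.

```python
def feedback_step(y, target, tolerance, gain, prev_error=None, trend_gain=0.0):
    """Return (correction, error) for one performance variable.

    No correction is issued while y stays within `tolerance` of `target`.
    Outside that band the correction is proportional to the error, plus an
    optional term proportional to the change in error since the last step.
    """
    error = target - y
    if abs(error) <= tolerance:
        return 0.0, error
    correction = gain * error
    if prev_error is not None:
        correction += trend_gain * (error - prev_error)
    return correction, error


def simulate(y0, target, steps, drift=-0.05, tolerance=0.1, gain=0.5):
    """Toy system: y drifts by `drift` each step and responds 1:1 to corrections."""
    y, prev_error, history = y0, None, []
    for _ in range(steps):
        correction, prev_error = feedback_step(y, target, tolerance, gain, prev_error)
        y = y + drift + correction
        history.append(y)
    return history


history = simulate(y0=0.5, target=1.0, steps=30)
```

Without feedback the simulated variable would keep drifting away from its target; with the proportional correction it stays near the edge of the acceptable band. Setting `trend_gain` above zero adds the trend-based term suggested for systems with faster state changes.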
3 Summary and Concluding Remarks
In this paper, a general methodology is presented by which the performance of a distributed system can be established in real time within an empirical framework. The performance is indicated in the form of Kiviat graphs with recognizable patterns for various performance states. In addition, a performance control mechanism is proposed with which the system can operate within an optimum range of performance. This mechanism would bring the system back to an acceptable state when it drifts to an unacceptable state (as indicated by the Kiviat graph). In a distributed system based on object concepts [14], since implementation is separate from the interface, performance measures can be used to compare and select between several implementations of the same object, tailoring the system to specific target systems. This allows specific characteristics of the underlying hardware (co-processors, high performance I/O interfaces, accelerator boards, cache, etc.) to be exploited. The system could be improved further by considering implementation issues such as minimizing the performance impact of the monitor on the distributed system, the effect of clock synchronization, and the scalability of the system.
References
[1] Mukesh Singhal and Niranjan G. Shivaratri, Advanced Concepts in Operating Systems, chapter 4, pp. 71-94, McGraw-Hill, Inc., New York, 1994.
[2] Amjad Umar, Distributed Computing, chapter 1, pp. 4-37, P T R Prentice-Hall, Inc., Englewood Cliffs, NJ, 1994.
[3] James Rankin and Michael Paulson, Distributed Computing, chapter 3, pp. 26-29, NCC Blackwell Limited, 108 Cowley Road, Oxford OX4 1JF, England, 1993.
[4] Gilbert Andrew Neiger, Techniques for Simplifying the Design of Distributed Systems, Ph.D. thesis, Cornell University, Ithaca, NY, 1988.
[5] Russell D. McLaren and William A. Rogers, "Instrumentation and performance monitoring of distributed systems," in Proceedings of the Fifth Distributed Memory Computing Conference, Charleston, SC, April 1990, pp. 1180-1186, IEEE Computer Society Press, Los Alamitos, CA.
[6] Y. T. Wang and J. T. Morris, "Load sharing in distributed systems," IEEE Transactions on Computers, vol. C-34, no. 3, pp. 204-217, March 1985.
[7] D. Haban and D. Wybranietz, "A hybrid monitor for behavior and performance analysis of distributed systems," IEEE Transactions on Software Engineering, vol. 16, no. 2, pp. 197-211, February 1990.
[8] F. Lange, R. Kroeger, and M. Gergeleit, "JEWEL: Design and implementation of a distributed measurement system," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 6, pp. 657-671, November 1992.
[9] Uwe Kleinhans, Joerg Kaiser, and Karol Czaja, "Spearmints: Hardware support for performance measurements in distributed systems," IEEE Micro, vol. 13, no. 5, pp. 69-78, October 1993.
[10] Manish Parashar, Salim Hariri, and Kamal Jabbour, "An expert system for performance management," in Proceedings of the 35th Midwest Symposium on Circuits and Systems (Cat. No.92CH30999), Washington, DC, August 1992, pp. 1052-1055, IEEE, New York, NY.
[11] Bernhard Flury and Hans Riedwyl, Multivariate Statistics: A Practical Approach, chapter 10, pp. 181-233, Chapman and Hall, London, 1988.
[12] Raj Jain, The Art of Computer Systems Performance Analysis, chapter 3, pp. 30-43, John Wiley and Sons, Inc., New York, 1991.
[13] Richard J. Harris, A Primer of Multivariate Statistics, chapter 6, pp. 155-205, Academic Press, New York, 1975.
[14] Silvano Maffeis, Run-Time Support for Object-Oriented Distributed Programming, Ph.D. thesis, Universitaet Zuerich, Zurich, Switzerland, 1995.