Automatic Detection of Parallel Program Performance Problems
Antonio Espinosa
Tomàs Margalef
Emilio Luque
Computer Science Department, Universitat Autònoma de Barcelona. 08193 Bellaterra, Barcelona. Phone: (+34) 93 581 19 90
e-mail: [email protected]
ABSTRACT¹
The actual behaviour of parallel programs is of capital importance for the development of an application. Programs are considered mature applications when their performance is within acceptable limits. Traditional parallel programming forces the programmer to understand the enormous amount of performance information obtained from the execution of a program. In this paper, we propose an automatic analysis tool that relieves application programmers of this difficult task.
Keywords
Performance analysis of parallel programs, parallel program design, automatic bottleneck detection.
1. INTRODUCTION
Performance is one of the main reasons for designing and building a parallel program [1]. When analysing the performance of a parallel program, programmers, designers or occasional users of parallel systems must acquire the knowledge needed to become performance analysis experts.
The amount of data to be visualised and analysed, together with the large number of sources of information (parallel processor and interconnection network states, messages between processes, etc.), makes it difficult to become a performance expert. Programmers need a high level of experience to derive any conclusions about program behaviour from visualisation tools. Moreover, they also need a deep knowledge of the parallel system, because the analysis of many performance features must consider architectural aspects such as the topology of the system and the interconnection network.
We propose a Knowledge-based Automatic Parallel Program Analyser for Performance Improvement (the KAPPA-PI tool) that eases the performance analysis of a parallel program. Analysis experts look for special configurations in the graphical representations of the execution that point to problems in the execution of the application. Our purpose is to substitute the expert with an automatic analysis tool which, based on knowledge of the most important performance problems of parallel applications, detects critical execution problems of the application and shows them to the application programmer, together with source code references for each problem found and indications on how to overcome it.
The main difference between the KAPPA-PI tool and existing automatic performance analysis tools [2] [3] [4] is that the code of the analysed application is checked in order to propose alternatives that change its behaviour. The analysis first studies the trace file to locate the most important performance problems occurring during the execution. Once the problematic execution intervals have been found, each is studied individually to determine its type of performance problem. When the problem has been classified under a specific category, the analysis tool scans the segment of application source code related to the execution trace data previously studied. This analysis of the code brings out any design problem that may have produced the performance problem. Finally, the analysis tool produces an explanation of the problems found at the application design level and recommends what should be changed in the application code to improve its execution behaviour.
The KAPPA-PI tool currently analyses C+PVM applications running on a cluster of workstations, and it is being ported to analyse problems in other message-passing environments such as MPI.
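The analysis stages described in this section (locate the problematic intervals in the trace, classify each one, relate it to the source code, and recommend a change) can be illustrated with a minimal Python sketch. It is not KAPPA-PI's implementation: the trace record layout, the state names, the 10% threshold, the rule table and every identifier below are assumptions made for illustration only. In particular, the sketch assumes each trace event already carries the source line of the call that produced it, whereas the tool described here scans the application source code itself.

```python
from dataclasses import dataclass

# Hypothetical sketch: the trace layout, state names, threshold and rule
# table are illustrative assumptions, not taken from KAPPA-PI.


@dataclass
class TraceEvent:
    process: int      # process (task) identifier
    start: float      # start time in seconds
    end: float        # end time in seconds
    state: str        # e.g. "compute", "blocked_recv", "barrier_wait"
    source_line: int  # source line of the call that produced the event


@dataclass
class Finding:
    process: int
    start: float
    end: float
    category: str
    source_line: int
    advice: str


# Knowledge base: idle state -> (problem category, illustrative advice).
RULES = {
    "blocked_recv": ("blocked receiver",
                     "check whether the matching send can be issued earlier"),
    "barrier_wait": ("barrier imbalance",
                     "check whether work is evenly distributed before the barrier"),
}


def find_problem_intervals(trace, min_fraction=0.1):
    """Stage 1: keep idle events lasting at least min_fraction of the run."""
    if not trace:
        return []
    run_time = max(e.end for e in trace) - min(e.start for e in trace)
    return [e for e in trace
            if e.state in RULES and (e.end - e.start) >= min_fraction * run_time]


def analyse(trace, min_fraction=0.1):
    """Stages 2 to 4: classify each interval, attach source line and advice."""
    findings = []
    for e in find_problem_intervals(trace, min_fraction):
        category, advice = RULES[e.state]
        findings.append(Finding(e.process, e.start, e.end,
                                category, e.source_line, advice))
    # Report the longest idle intervals first.
    findings.sort(key=lambda f: f.end - f.start, reverse=True)
    return findings


if __name__ == "__main__":
    sample = [
        TraceEvent(0, 0.0, 8.0, "compute", 40),
        TraceEvent(0, 8.0, 10.0, "blocked_recv", 52),
        TraceEvent(1, 0.0, 3.0, "compute", 40),
        TraceEvent(1, 3.0, 9.5, "blocked_recv", 52),
        TraceEvent(1, 9.5, 10.0, "barrier_wait", 60),
    ]
    for f in analyse(sample):
        print(f"process {f.process}, line {f.source_line}: "
              f"{f.category} for {f.end - f.start:.1f} s; {f.advice}")
```

Keeping the problem categories in a table rather than in the control flow mirrors the knowledge-based idea: under these assumptions, a new kind of problem is added by extending the table, not the analysis loop.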
2. REFERENCES
[1] Pancake, C. M., Simmons, M. L., Yan, J. C. Performance Evaluation Tools for Parallel and Distributed Systems. IEEE Computer, vol. 28, November 1995, p. 16-19.
[2] Fahringer, T. Automatic Performance Prediction of Parallel Programs. Kluwer Academic Publishers, 1996.
[3] Hollingsworth, J. K., Miller, B. P. Dynamic Control of Performance Monitoring on Large Scale Parallel Systems. International Conference on Supercomputing (Tokyo, July 19-23, 1993).
[4] Yan, J. C., Sarukkai, S. R. Analyzing Parallel Program Performance Using Normalized Performance Indices and Trace Transformation Techniques. Parallel Computing 22 (1996) 1215-1237.
¹ This work has been supported by the CICYT under contract TIC 95-0868.