Monitoring and Visualization in TOPSYS¹
Arndt Bode and Peter Braun
Lehrstuhl für Rechnertechnik und Rechnerorganisation, Institut für Informatik, Technische Universität München, Arcisstr. 21, 8000 München 2, FRG
Tel.: +49-89-2105-8240 or -2386, Fax: +49-89-2800-529
e-mail: fbode,
[email protected]
Abstract
It is widely acknowledged that writing parallel programs is much more complex than writing sequential programs. This is particularly true for distributed memory multiprocessor systems and message passing programming models. Graphical representations of a program execution can increase the understanding of parallel programs. The paper describes the on-line visualization and animation system VISTOP (VISualization TOol for Parallel systems) and its use of the TOPSYS distributed monitoring system. TOPSYS (TOols for Parallel SYStems) is an integrated tool environment which aims at simplifying the usage and programming of parallel systems. It consists of several interactive development tools for specification, debugging, performance analysis and visualization. VISTOP is presented after an evaluation and classification of existing visualization environments. It supports the interactive on-line visualization of message passing programs based on various views, in particular, a process graph based concurrency view for detecting synchronization and communication errors. The data acquisition of VISTOP is performed using the distributed monitoring system common to all tools of the TOPSYS environment.
Keywords:
Visualization of parallel programs; monitoring; distributed memory computers; communication and synchronization; debugging; integrated programming environment
1. Motivation
Currently a process and message passing based programming model is most suited for scalable distributed memory multiprocessors. There, the programmer has to handle various additional complexities compared to the development of sequential code. Apart from understanding the sequential parts of the parallel program, new degrees of freedom lead to additional difficulties. The most critical issue is the overall understanding of the dynamic behavior of parallel programs. In particular, the interaction between processes with respect to synchronization and communication raises new problems. Inadequate synchronization and communication very often result in deadlocks and races. In addition to correctness, performance issues are extremely critical in parallel programs. Inappropriate synchronization and communication primitives may lead to performance bottlenecks and unexpected behavior. Usually these types of errors cannot be found with conventional debugging systems. But the most complex problem to solve when writing parallel programs is the scalability problem [22]. Programs running on massively parallel architectures consist of hundreds or thousands of interacting processes. This massive amount of processes, or of information in general, has to be handled by the programmer. As the state of the art in parallel programming is still based on the approach of programming each component of a parallel system individually, visualization is an adequate technique for this type of computer system. Graphical representation can aid human comprehension of complex phenomena with large quantities of data. This is due to the fact that the human visual system is better suited to processing information in the form of graphical images than in textual form, i.e. graphical visualization techniques are one favorable way of dealing with all the problems stated above. Another way to portray the behavior of parallel programs is auralization. Studies have shown that human perception of sound can effectively enhance one's insight into complex interactions of concurrent applications [12]. Possibly several debugging techniques, such as text based systems, visualization systems and auralization systems, should be combined for the best possible support for the programmer. In general, program visualization systems can be used for many purposes, including understanding the dynamic behavior of parallel programs for educational purposes, debugging for correctness, performance monitoring and analysis (performance debugging), and visualization of scientific data.
¹ Funded by the German Science Foundation, Contract: SFB 0342
2. Visualization Systems for Parallel Programs
Usually the purpose of program visualization systems is to gain insight into the dynamic behavior of programs. Myers [21] has proposed to classify program visualization systems by code or data animation and by static or dynamic displays. Further criteria for visualization systems can be whether they are active or passive and whether the contents of the displays show the current state or a complete history of events [9]. In our opinion a different way of classifying visualization systems for parallel programs is more appropriate. Approaches to build visualization and animation systems to support debugging and understanding of parallel programs can be divided into three different categories according to the motivations given in the previous section:
- Data oriented visualization systems.
- Program or system oriented visualization systems.
- Problem oriented visualization systems.
These three classes can be used for different visualization tasks. Each approach has its specific strengths and can help a programmer in a different way to locate problems or errors in parallel programs or to gain insight into parallel algorithms. Some systems provide different components which belong to different categories of our classification.
2.1. Data Oriented Visualization
Data oriented visualization is often called scientific visualization. It refers to the animation of data calculated by an algorithm. This type of visualization can often be found in scientific computing areas, like simulation, aerospace applications, and fluid dynamics applied in fields like geology, meteorology, chemistry, and physics. Advanced techniques are used in these areas to produce static displays (2-D and 3-D) and animated movies. Although such systems can also help the programmer to detect erroneous behavior of a program, the main purpose of this kind of animation system is in general to visualize the results of a computation. Thus we do not consider this class of visualization systems in this paper. One approach presented in the literature uses data oriented visualization to prove the correctness of parallel programs. Roman and Cox [24] have developed a visualization methodology which is based on correctness proofs. They propose to visualize properties which are used in correctness proofs as data patterns. For example, invariants of an algorithm could be visualized as a stable pattern, and progress in an algorithm could be displayed as an evolving pattern. They argue that this approach is especially promising for systems with a large number of processing elements, where the traditional visualization techniques could not be used.
2.2. Program or System Oriented Visualization
These systems monitor the execution of a parallel program. A parallel program can be observed at different levels of abstraction, for example at the level of the source code, at the level of the underlying operating system, at the level of the process graph or at the level of the parallel programming model. Program oriented visualization systems gain their visualization from the source code, while in system oriented visualizers the properties of the underlying hardware (number of CPUs, topology of the communication network) or the objects of the operating system (processes) are most important. This group contains textual based tools like PECAN which animates the program code by highlighting the appropriate part as the code is executed [23]. PROVIDE extends the properties of a conventional debugger by the use of graphical displays for continuous process states [20]. Some of the tools use the process graph instead of the source code as a basis for visualization [1]. This is especially appropriate if the same process graph is part of the specification for the parallel program. In Belvedere, the message activity of a parallel program is displayed via the edges of the process graph [14]. However, all systems using this method only support a static process model which is only sufficient for regular types of applications and not general purpose.
Most visualization systems for parallel programs focus primarily on performance monitoring rather than algorithm animation. Examples for system oriented visualization systems are ParVis [18], Traceview [19] and ParaGraph [13]. These are general-purpose trace-visualization tools which can display trace data in different graphical forms. In ParaGraph, the user can select the appropriate visualization method from about 25 different diagrams, which include Gantt charts, Kiviat diagrams, communication traffic diagrams, and space-time diagrams. They display time dependent process activities of the program to be monitored.
2.3. Problem Visualization (Algorithm Animation)
The basic idea of an algorithm animation system is to represent a program's data, actions and semantics in the form of graphical views. Adequate abstractions can help to increase the understanding of the events occurring while an algorithm is executed. A classical example of an algorithm animation system is the BALSA system [10] and its descendant BALSA-II [9]. Although it has been developed for sequential algorithms, many of its ideas are also useful for animation systems for parallel programs. In the BALSA system, multiple graphical displays of an acting algorithm can be specified by so called "annotations" in the source code of the algorithm. The programmer creates the animation for an algorithm and the end-users execute the animated program. The animation system itself is domain independent. It does not know which events are interesting or what representation is best for a particular situation. The strength of BALSA is its flexibility, which makes it possible to create the best suited abstractions and graphical representations for a given algorithm. An example of an algorithm animation system which is adapted to parallel programs is the TANGO algorithm animation system [25]. An animation is composed of basic graphical objects and operations that drive the animation. Throughout an animation, objects can change their state (e.g. location, size, color). TANGO uses the "path-transition" animation paradigm to produce smooth color animations for an algorithm. Paths specify the magnitude of modification in image attributes between successive frames. A transition uses path parameters to change the state of an image. Simple transitions can be iterated, concatenated or composed to build more complex transitions. Another system is the AFIT Algorithm Animation Research Facility (AAARF). It is a program visualization system for the Intel iPSC/2 hypercube system, which displays animated data in real-time [26].
The drawback of problem-oriented approaches to supporting debugging is that displays cannot be generated automatically. The problem of having to create an adequate representation for algorithm visualization is widely recognized. So most of the systems provide basic graphical objects which can easily be adapted to a new algorithm. An example for this approach is Voyeur [2]. Its goal is to ease the creation of new views for a specific algorithm. A similar approach is adopted in PARADISE, which is meant as a meta tool to create visualizers [15]. PARADISE aims to make the creation of visualization tools a manageable task for the non-graphics expert and the experienced user. However, in all these systems, the algorithms to be animated should be correct, which makes it difficult to use such systems as a general purpose debugging aid, as it is difficult to figure out whether a bug is in the application or in the animation itself. The main application area for systems like BALSA is computer-aided instruction.
2.4. Structure of Visualization Systems
Although visualization systems described in the literature provide a great variety in functionality, purpose, flexibility and architecture, most of them share a common basic design concept. Following this principle, a visualization system can be divided into three layers,
the data acquisition layer, the model layer and the presentation layer.
The data acquisition is usually event-triggered, i.e. certain interesting events are collected during the execution of a program. The majority of the systems described in the literature use software instrumentation at the source code level. Other data acquisition techniques include object-code instrumentation, hardware monitoring or hybrid monitoring. Very often the visualization is purely based on simulations, without any execution on real target systems. The data collected is stored in a trace file and visualization is performed off-line. The model layer contains a model of the objects to be visualized. The model is based on the events which are collected from the program monitored. Usually a model can drive different views of a parallel program. In the presentation layer, the graphical views of the visualization are created. In most systems the user can control the animation in many ways. Multiple views are updated in a consistent way.
3. The TOPSYS Integrated Tool Environment
The major intention of the TOPSYS (TOols for Parallel SYStems) project [4] is to simplify the usage and programming of parallel systems in order to widen the range of applications for these types of machines. Only with the support of integrated tools and new programming methodologies will programmers of distributed memory machines be able to handle the complexities introduced with massive parallelism. In TOPSYS, the visualization system is not an isolated tool but is integrated in a tool environment and its programming method.
3.1. The MMK Message Passing Programming Model
The programming model of TOPSYS is based on a message passing process model implemented with the parallel programming library MMK (Multiprocessor Multitasking Kernel) [8]. MMK supports an object based style of programming and provides three predefined types of objects: tasks, mailboxes, and semaphores. A parallel program consists of multiple interacting instances of objects. Tasks communicate (synchronously or asynchronously) via mailboxes and can be synchronized via semaphores. All MMK objects reside in a global object name space and can be addressed regardless of their processor location (mapping; location transparency) and the underlying hardware configuration. MMK objects can be created and destroyed dynamically and thus offer great flexibility for different types of applications. The programming model is as suited for static regular applications as for applications with intensive need for dynamic process models. The implementation of the tasks can be performed in a sequential programming language. Currently C, C++, and FORTRAN are supported as language interfaces. Implementations of TOPSYS are available for the iPSC/2 and iPSC/860 multiprocessor as well as for networks of SUN SPARCstations.
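The object model described above can be illustrated with a small sketch. It is not the MMK API: the class names, the method names (`try_send`, `try_receive`, `try_request`, `release`, `lookup`) and the bounded mailbox capacity are assumptions made for illustration, written in Python rather than in one of the MMK language interfaces. It only shows the idea of three object types living in a global name space where objects are addressed by name, independent of the node they are mapped to.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    node: int                     # processor the task is mapped to


@dataclass
class Mailbox:
    name: str
    node: int
    capacity: int = 4             # assumed bound; a full mailbox makes senders wait
    messages: deque = field(default_factory=deque)

    def try_send(self, msg) -> bool:
        """Non-blocking send: returns False if the mailbox is full."""
        if len(self.messages) >= self.capacity:
            return False
        self.messages.append(msg)
        return True

    def try_receive(self):
        """Non-blocking receive: returns None if the mailbox is empty."""
        return self.messages.popleft() if self.messages else None


@dataclass
class Semaphore:
    name: str
    node: int
    units: int = 1

    def try_request(self, n: int = 1) -> bool:
        """Non-blocking request: returns False if too few units are free."""
        if self.units < n:
            return False
        self.units -= n
        return True

    def release(self, n: int = 1) -> None:
        self.units += n


class NameSpace:
    """Global object name space: objects are addressed by name only."""

    def __init__(self):
        self._objects = {}

    def create(self, obj):
        self._objects[obj.name] = obj
        return obj

    def destroy(self, name: str) -> None:
        del self._objects[name]

    def lookup(self, name: str):
        # The caller never passes a node number (location transparency).
        return self._objects[name]
```

A task would obtain a communication partner with `ns.lookup("results")` without knowing on which node the mailbox resides; `create` and `destroy` may be called at any time, which mirrors the dynamic process model.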
3.2. The Distributed Monitoring System
The monitoring system of TOPSYS is based on an event/action model [7]. In this model the monitoring system monitors the execution of events. Based on the detection of events, several conditional actions can be executed. Events are classified into three different categories. Control flow events describe the execution flow of a program. Data flow events monitor the state changes of program data (e.g. variables). The third category of events, concurrency events, deals with parallelism. More complex events can be formed by combining simple events. There are two operators available for event concatenation, a logical OR operator and a sequence operator. The sequence operator enables the user to specify events that satisfy a "happened before" relation. The detection of events may initiate different actions: breakpoints, traces, and measurements. Breakpoints can stop individual tasks or the whole system; traces store information on various events (e.g. communication, creation of objects, execution of a specific part of the program). Measurements include counters and timers. The definition of events is performed at the logical task level and is independent of the physical distribution of a program, i.e. the programmer does not have to know where tasks and objects are located. On each node of the multiprocessor system resides a local monitor (software or hybrid), which is responsible for all objects that are mapped to this node. In order to detect events that are distributed over multiple processing nodes, the local monitors communicate via so called crosstriggers. A crosstrigger is used to inform other local monitors of the detection of events. The hardware monitors have a communication network of their own. The software monitors use the communication network of the multiprocessor system for crosstrigger messages. Figure 1 shows the TOPSYS hierarchical tool model. All tools use information collected by the distributed monitoring system. [...] explains this interactive parallelization methodology.
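To make the event/action model more concrete, the following sketch shows one possible way to combine simple events with an OR operator and a sequence operator and to attach actions to them. All names (`Simple`, `Or`, `Seq`, `Monitor`, the record fields `kind` and `task`) are hypothetical and not taken from TOPSYS. The sketch also works on a single, already ordered stream of event records; the distributed detection via crosstriggers is not modeled.

```python
class Event:
    """Base class: feed() returns True when the event is detected."""

    def feed(self, record) -> bool:
        raise NotImplementedError


class Simple(Event):
    """A simple event, matched by kind and optionally by task name."""

    def __init__(self, kind, task=None):
        self.kind, self.task = kind, task

    def feed(self, record):
        return record["kind"] == self.kind and self.task in (None, record["task"])


class Or(Event):
    """Detected when at least one of the two sub-events is detected."""

    def __init__(self, a, b):
        self.a, self.b = a, b

    def feed(self, record):
        # Evaluate both sides so that nested sequence events stay up to date.
        hit_a = self.a.feed(record)
        hit_b = self.b.feed(record)
        return hit_a or hit_b


class Seq(Event):
    """Detected when `second` occurs after `first` has already occurred."""

    def __init__(self, first, second):
        self.first, self.second = first, second
        self.armed = False

    def feed(self, record):
        if self.armed and self.second.feed(record):
            self.armed = False
            return True
        if self.first.feed(record):
            self.armed = True
        return False


class Monitor:
    """Holds (event, action) pairs and runs the action on detection."""

    def __init__(self):
        self.rules = []

    def on(self, event, action):
        self.rules.append((event, action))

    def observe(self, record):
        for event, action in self.rules:
            if event.feed(record):
                action(record)
```

A trace action can simply be `trace.append` for a list `trace`, and a measurement can be a function that increments a counter; a breakpoint would correspond to an action that suspends the task named in the record.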
3.3. Common Design Aspects of TOPSYS
The tools which were developed in the TOPSYS project aim to support a parallelization method which we call interactive or semi-automatic parallelization. With this method, the programmer can very quickly experiment with different versions of his parallel program. Different parallelizations can be easily evaluated with interactive tools for specification, mapping, debugging, testing, performance tuning and visualization. In addition, a dynamic load balancing system for distributed memory multiprocessors has been implemented. A detailed description of the TOPSYS tools can be found in [4, 5]. We will therefore only describe the major design concepts of the tool environment as needed for the understanding of the visualization tool in the following section. The implementation of TOPSYS is based on an integrated hierarchical tool model with the following features:
- Implementation of all of the tools on top of a common distributed monitoring system which uses different instrumentation techniques (software, hardware, hybrid monitoring) with identical functionality.
- All tools are interactive tools which run in parallel to the application being monitored. This enables the user to control the execution and decide which information is required during execution. This is the major advantage compared to trace based off-line systems.
- All tools share a graphical and menu-driven user interface with the same look and feel, i.e. with the same philosophy of usage based on the X Window system.
- All tools can be used in an integrated way on the same program. For example, it is possible to specify a breakpoint in the program with the debugger and start visualization exactly in the routine where a deadlock is assumed.
Figure 1. TOPSYS hierarchical tool model (labels in the figure: graphical user interface; tools: debugger, performance analyzer, visualizer, ..., load balancer; HLTL; predicate/action/symbol management; UTL; simulation, software, hardware and hybrid monitors; Multiprocessor Multitasking Kernel (MMK); NX/2)
4. The Visualization and Animation Tool VISTOP
The visualization tool VISTOP (VISualization TOol for Parallel systems) animates message passing programs developed with the parallel programming library MMK.
VISTOP visualizes the structure of an application in terms of the basic MMK object types task, mailbox and semaphore, which represents the abstraction level of the programmer using the MMK programming model. Figure 2 shows the representation of the MMK objects in VISTOP.
Figure 2. Object classes in the MMK parallel programming library and their representation in VISTOP
4.1. Requirements of a Visualization Tool
The following section identifies some properties of a visualization tool which are very desirable in our opinion. Many of the points were identified by a questionnaire among the users of the TOPSYS environment.
- A visualization tool to support debugging must be quickly adaptable to the application. It is not acceptable to spend a day or a week to adapt the visualization tool to a new application program.
- For visualization it is preferable to use object code instrumentation or instrumented communication libraries instead of source code instrumentation. When using annotated programs, the user can never be sure that a bug discovered through visualization is really a bug in the program and not a bug in the annotation of the source code.
- A visualization tool should be used interactively and on-line. When an application runs for a longer period of time the programmer is not interested in visualizing the whole execution, if only a small fraction is really interesting.
- A visualization tool should be integrated with other tools of a parallel programming environment, e.g. the source level debugger.
- A visualization tool should support different levels of abstraction including the abstractions of the underlying programming model.
- The visualizer should be applicable to applications running on massively parallel systems.
- A visualization tool should provide different views and abstractions of an application. This helps a user to focus on different aspects of an application.
- It should support dynamic process models and should be able to visualize the behavior of system integrated automatic load balancing.
- A visualization tool should be extensible. If a user wants to have additional information about a program he should be able to get it with minimal modifications of the visualization tool.
Some of these demands on a visualization system are contradictory. In our opinion, the base functionality of a visualization system to assist debugging should follow a program and system oriented approach as described in section 2.2. VISTOP follows such an approach. However, not all of the properties identified above are already integrated in the current version of it.
Figure 3. The VISTOP Control Panel serves to select different views and controls the execution and animation of the program being visualized
4.2. The VISTOP control panel
In the current state of implementation, VISTOP supports three different views of the underlying application: the concurrency (communication, synchronization) view, the object creation view, and the system view. They can be displayed together or separately, which can be selected in the VISTOP control panel (see figure 3). The different views are synchronized and always show a consistent picture of the state of the objects of the application. Such a picture is called a "snapshot". The resulting sequence of snapshots is animated in such a way that transitions between successive snapshots are smooth and continuous, thus giving the user an idea of how one state changes to another. During the animation, each MMK object is represented by a small window containing information about the current state of this object. The user can replay the snapshots of an application back and forth, stepping through manually or automatically.
4.3. The Concurrency View
In order to gain insight into the dynamic behavior of a parallel program it is crucial to understand the communication and synchronization events that occur while an application is executing. This reflects the opinion that for parallel algorithms communication and synchronization events are the most important new feature added to the concepts of sequential programming. In addition to the fact that communication or synchronization takes place, the corresponding protocol, i.e. whether blocking or non-blocking communication or synchronization library calls are used by the application program, is also visualized.
Figure 4. VISTOP Concurrency View
For visualization of a distributed application, VISTOP gathers run-time information whenever a communication or synchronization event occurs. A snapshot of the state of the objects under supervision is taken whenever such an event occurs and stored in an internal data structure, the "animation queue". Figure 4 shows a screen dump of the display of the animation window. The communication and synchronization events collected by VISTOP at a conceptual level are:
- a task sends a message to a mailbox
- a task receives a message from a mailbox
- a task requests units from a semaphore
- a task releases units to a semaphore
For the visualization such a coarse distinction of events is insufficient. Not every communication or synchronization can be executed instantaneously. A target mailbox might be completely filled or a critical section guarded by a semaphore is blocked. In addition, errors or timeouts can occur during communication. In the visualization of our programs we therefore want to see the refined task states explained in table 1.
task state      description
running         task is running (or ready)
send            task sends a message to a mailbox
wait send       task is waiting at a full mailbox
receive         task receives a message of a mailbox
wait receive    task is waiting at an empty mailbox
request         task requests units of a semaphore
wait request    task is waiting at a semaphore
release         task releases units to a semaphore
cerror          communication error
timeout         time-out during communication
Table 1. Task states visualized by VISTOP
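The refinement from the four conceptual events to the task states of table 1 can be sketched as a small mapping. The function `refine` and its keyword arguments are hypothetical; in particular, the rule that a communication error takes precedence over a time-out is an assumption of this sketch and not stated for VISTOP.

```python
from enum import Enum


class TaskState(Enum):
    RUNNING = "running"
    SEND = "send"
    WAIT_SEND = "wait send"
    RECEIVE = "receive"
    WAIT_RECEIVE = "wait receive"
    REQUEST = "request"
    WAIT_REQUEST = "wait request"
    RELEASE = "release"
    CERROR = "cerror"
    TIMEOUT = "timeout"


_DIRECT = {
    "send": TaskState.SEND,
    "receive": TaskState.RECEIVE,
    "request": TaskState.REQUEST,
}
_WAITING = {
    "send": TaskState.WAIT_SEND,        # mailbox is full
    "receive": TaskState.WAIT_RECEIVE,  # mailbox is empty
    "request": TaskState.WAIT_REQUEST,  # semaphore has too few units
}


def refine(operation, *, must_wait=False, error=False, timed_out=False):
    """Map a conceptual event to a refined task state (illustrative only)."""
    if error:                 # assumed precedence: error before time-out
        return TaskState.CERROR
    if timed_out:
        return TaskState.TIMEOUT
    if operation == "release":
        return TaskState.RELEASE   # table 1 lists no waiting state for release
    table = _WAITING if must_wait else _DIRECT
    return table[operation]        # unknown operations raise KeyError
```

For example, a send to a full mailbox would be reported as `refine("send", must_wait=True)`, which yields the state "wait send".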
Animation in the Concurrency View
All interesting MMK objects of an application are displayed in the data display area of the Concurrency window and represented by little icons showing the class and the name of an object. The metaphor used for the animation in the Concurrency view is the following: If a task communicates with a mailbox or a semaphore or if a task has to wait at a mailbox or semaphore, the corresponding task window inserts itself into a queue of tasks waiting at the communication object. For this purpose the task moves from its current location to the last position in the queue. Queues are visualized as chains of arrows between objects. Thin arrows characterize communication while thick arrows indicate that a task really has to wait until its communication request can be satisfied. The direction of an arrow shows the direction of the requested data transfer. When a task has finished its communication, it leaves the queue and the window is moved back to its "home position". The "home position" of a task is either the location where the task was initially placed by VISTOP or the location the user assigned to it. The associated arrows are deleted. All windows representing MMK objects can be moved individually by the user. This feature can be used to rearrange the MMK objects to reflect the communication structure of the application. Arrows are rearranged automatically.
4.4. The Object Creation View
This view displays the dynamic creation chain of MMK objects, which may be created and/or deleted frequently, as happens for example in database systems or tree search algorithms. The MMK objects are represented in the form of a graph consisting of at least one tree. An example of an object tree can be seen in figure 5. Each of the tasks that are declared statically forms the root of a separate tree. The parent entry shows the task that created the child object dynamically. By selecting a node of the graph, additional information about the object is displayed (e.g. state of the object, module where the object was created). When a new object is created this is animated in the object creation view as follows. First, the position of the object in the tree is determined. All objects below this position move down one line. A "turtle" appears below the task which creates the new task and draws a line from the father to the new object. Now the turtle grows and finally the icon for the new object appears. The animation steps for an object creation are shown in figure 6. The deletion of an object is animated in an analogous way to the creation.
Figure 5. VISTOP Object Creation View
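The structure behind the object creation view can be sketched as a forest whose roots are the statically declared tasks. The class `CreationForest` and its methods are hypothetical and only illustrate how creation events determine the line at which a new object appears; deletion and the turtle animation are left out.

```python
class CreationForest:
    """Forest of MMK objects; statically declared tasks are the roots."""

    def __init__(self, static_tasks):
        self.children = {name: [] for name in static_tasks}
        self.parent = {name: None for name in static_tasks}

    def create(self, parent, child):
        """Record that task `parent` dynamically created object `child`."""
        if parent not in self.children:
            raise KeyError(f"unknown parent object: {parent}")
        self.children[parent].append(child)
        self.children[child] = []
        self.parent[child] = parent

    def roots(self):
        return [name for name, p in self.parent.items() if p is None]

    def lines(self):
        """Depth-first text rendering, one object per line, indented by depth."""
        out = []

        def walk(name, depth):
            out.append("  " * depth + name)
            for child in self.children[name]:
                walk(child, depth + 1)

        for root in self.roots():
            walk(root, 0)
        return out
```

Creating a new object under an existing parent inserts one line into the rendering and shifts all following lines down by one, which corresponds to the first animation step described above.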
4.5. The System View
This visualization component shows the distribution of the objects of an application onto different processing nodes. Different processing nodes may be located on different machines, when a program is distributed in a network of UNIX workstations. The display is organized as a matrix. Each column contains all objects of one node. An object may be displayed as an icon that shows its type, its name or its object identifier. By selecting an object with the mouse additional information about the object can be obtained. The VISTOP system view is displayed in figure 7. This view visualizes object migration in applications which facilitate migration through the automatic load balancing mechanism of the MMK. When MMK objects are created or deleted dynamically, this view shows the location of the objects and their overall distribution in the system. Again, the events displayed in this view are animated to show the dynamic behavior of the underlying application.
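As a rough illustration of the matrix layout, the following sketch groups objects into one column per node. The function `system_view` and the `placement` dictionary (object name to node number) are assumptions of this example; as a simplification, nodes without any object do not get a column.

```python
from itertools import zip_longest


def system_view(placement):
    """Return the matrix as a list of rows; the first row holds the node labels."""
    nodes = sorted(set(placement.values()))
    columns = [
        sorted(name for name, node in placement.items() if node == n)
        for n in nodes
    ]
    header = [f"node {n}" for n in nodes]
    # Pad shorter columns with empty cells so that every row has equal length.
    rows = [list(row) for row in zip_longest(*columns, fillvalue="")]
    return [header] + rows
```

Object migration then amounts to changing one entry of `placement` and recomputing the matrix; the difference between the two matrices is what an animation of the migration would have to show.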
5. Scalability
A major problem for visualization systems of parallel programs is scalability. With the advent of massively parallel systems consisting of many processing nodes, the amount of information provided by a visualization system must be adaptable to the needs of the user of the system in order to be effective.
Figure 6. Animation steps for an object creation
Figure 7. VISTOP System View
Selection of Objects
One mechanism in VISTOP to increase scalability is selection of "interesting" objects. When using a visualization system to support debugging and performance monitoring, a user is usually interested in very specific objects and phases during the execution of an application. The user can select the MMK objects which he is interested in via lists. Only those objects are displayed in the animation window. The selection of the appropriate MMK objects for each phase of a computation can be performed any time during the animation. This feature makes it more feasible to visualize an application running on a massively parallel architecture. Initially each MMK object appears as an icon showing the type of the object and the name of the object, thus requiring a minimal amount of space on the screen. However, the objects can also be deiconified when the user wants to get extra information, like the number of messages in a mailbox or the scheduling state of a task. In addition, the animation area of each view is much larger than the physical space on the screen. The user can select the appropriate section of the animation area with scroll-bars.
Selection of interesting phases
Parallel programs usually run for a long time because most often the extra effort to build a parallel application is only justified for time consuming problems. Often a user only wants to visualize a key scene of his program. In the TOPSYS environment this can be performed by using VISTOP together with our parallel debugging system DETOP (DEbugging TOol for Parallel Systems). The user starts the program under control of DETOP and specifies a breakpoint at the entry of the routine he wants to visualize. When the breakpoint is reached he starts the visualization tool and examines the interesting part of the application.
Figure 8. General design architecture of VISTOP (labels in the figure: Visualization, Model and Data Acquisition, connected by data flow; control flow from the user)
6. VISTOP Design Concepts
VISTOP follows a general design concept which is similar to most visualization systems. There is a data acquisition layer which gathers runtime information on a parallel application, a model layer which stores the data of subsequent snapshots and the presentation layer which visualizes the information of the model layer. Figure 8 shows the basic architecture of VISTOP. The model of the concurrent application is contained in a central data structure called "animation queue". It is filled by the data acquisition process. The visualization process reads the data from the animation queue. All action is controlled by the user.
The advantage of having two independent parts for acquisition and animation is obvious: one part can easily be replaced by another implementation without affecting the other.
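The role of the animation queue as the only coupling between acquisition and animation can be sketched as follows. The class and its methods are hypothetical and do not reproduce the VISTOP data structure; the sketch assumes that a snapshot is a dictionary from object names to states and that the presentation side steps through the stored snapshots with a cursor.

```python
import copy


class AnimationQueue:
    """Model layer: filled by data acquisition, read by the visualization."""

    def __init__(self):
        self._snapshots = []
        self._cursor = -1          # index of the snapshot currently shown

    def record(self, state):
        """Acquisition side: store an independent copy of the object states."""
        self._snapshots.append(copy.deepcopy(state))

    def current(self):
        """Snapshot currently shown, or None before the first step."""
        return self._snapshots[self._cursor] if self._cursor >= 0 else None

    def forward(self):
        """Presentation side: step to the next snapshot, if there is one."""
        if self._cursor + 1 < len(self._snapshots):
            self._cursor += 1
        return self.current()

    def backward(self):
        """Presentation side: step back to the previous snapshot, if any."""
        if self._cursor > 0:
            self._cursor -= 1
        return self.current()
```

Because the presentation side only calls `forward`, `backward` and `current`, the acquisition side could be replaced, for instance by a reader of recorded traces, without touching the animation code.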
6.1. The Visualization Layer
VISTOP's graphical layer is written in C++ and is based on the C libraries of the X Window System (X11R5) and the Motif widget set. VISTOP is independent of the underlying parallel hardware and thus portable in the UNIX world. While the X Window System and the widget set are very useful for building user interfaces, it is much more difficult to produce animated displays where objects appear and move. The basic problem is that the X Window System is single threaded, with a single event loop in which events are processed sequentially. While an event is processed, user interaction is impossible and, what is even worse, objects on the screen cannot be updated. We solved this problem by explicitly processing events within our animation classes.
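The idea of processing events explicitly inside the animation code can be sketched in a toolkit-independent way. The following is only an illustration under assumptions: the event source is simulated by a plain queue, and `animate`, `handle_event` and `draw_frame` are hypothetical names. In an Xt/Motif program this role could be played by calls such as `XtAppPending` and `XtAppProcessEvent`, although the paper does not state which calls VISTOP uses.

```python
from collections import deque


def animate(num_frames, pending_events, handle_event, draw_frame):
    """Draw frames one by one and drain pending events after each frame.

    Because the event queue is polled inside the animation loop, user input
    that arrives during a long animation is handled between frames instead
    of being delayed until the whole animation has finished.
    """
    for frame in range(num_frames):
        draw_frame(frame)
        while pending_events:
            handle_event(pending_events.popleft())


# Usage: a "scroll" event arrives while frame 1 is drawn and is handled
# before frame 2 is drawn.
log = []
events = deque(["click"])


def draw(frame):
    log.append(f"frame {frame}")
    if frame == 1:
        events.append("scroll")


animate(3, events, lambda e: log.append(f"event {e}"), draw)
```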
6.2. Data Acquisition
Data acquisition is a central problem for visualization systems of distributed memory multiprocessors. We will therefore explain this problem in more detail. Critical questions for the design of the data acquisition component in general are: Which events should be gathered? Where should the events be stored? Should data acquisition be on-line or off-line, i.e. should the events be filtered during the execution run or afterwards? When designing the data acquisition layer for a visualization system to be used for distributed memory multiprocessors, additional questions have to be answered: Since the target multiprocessor typically provides no physical global time, the time model to be used for ordering of events is one of the most critical issues. Does the data acquisition layer provide enough information for a total ordering of events, or is only a partial ordering of events based on a "happened before" relationship possible? Furthermore, the run-time overhead caused by monitoring a parallel program can change the parallel execution and cause unexpected behavior. Is the intrusion caused by the data acquisition component acceptable?
6.3. The Monitoring Interface
As already mentioned, all tools of the TOPSYS environment are based on a common distributed monitoring system. The monitors, which are located on the computation nodes of the target system, can be implemented either in software or in hardware. The TOPSYS hierarchical tool model offers a tool interface, the HLTL (High Level Language Tool Layer) interface, which makes it possible to control the execution of a parallel application and to collect runtime data on the execution of the parallel program. There are, for example, commands to start an application, to specify breakpoints, to specify traces and to inspect and modify the state of the objects of the application. A detailed description of this tool interface is given in [6]. Table 2 shows some of the commands of the HLTL layer.

Command         Description
dbreak s5()     Define Breakpoint
cbreak s5()     Clear Breakpoint
mbreak s5()     Masterbreak
dtrace s5()     Define Trace
go s5()         Start Application
view s5()       Show all MMK Objects
display s5()    Display Variables
modify s5()     Modify Variables
inspect s5()    Query State of an MMK Object

Table 2. Some of the TOPSYS monitoring (HLTL) commands

In VISTOP the tool interface of the TOPSYS environment was used in two different ways to implement two different data acquisition techniques. One version of the visualization tool is based on iterative global breakpoints and state inspection, whereas the second version uses traces for gathering runtime information. Both techniques are implemented by combining the functionalities offered by the tool interface, i.e. there is no need to change the program in order to visualize it. All instrumentation is performed automatically at the level of the object code.
Visualization using Breakpoints
The key idea of this method is to stop the application whenever an interesting event occurs. This is performed via the TOPSYS distributed monitoring system. Then the state of all interesting objects is inspected and stored in the animation queue, and the application is continued. The major advantage of this data acquisition method is that the application stops when an interesting event occurs. The state of the application running on the parallel machine is exactly the state shown by the visualization system, i.e. the visualization is on-line. When the user needs additional information on the application (for example values of variables), he can get the information with the debugger. The user may even modify a variable to fix a problem and continue the visualization. The first version of VISTOP used this method for data acquisition. The drawback of this approach is the heavy intrusion of the data acquisition process into the application to be visualized. This is especially true when using the software monitoring system. In addition to the time needed to stop the application, the transmission time between host and node processors for the data must also be considered.
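The stop, inspect, store, continue cycle described above can be sketched as a small loop. The sketch runs against a simulated monitor; `SimulatedMonitor` and its methods `run_to_next_event`, `current_event` and `inspect_object` are hypothetical stand-ins and do not reproduce the HLTL interface of TOPSYS.

```python
class SimulatedMonitor:
    """Stand-in for a distributed monitor (NOT the TOPSYS HLTL interface).

    `event_script` lists the interesting events the simulated application
    passes through, each with the object states valid at that moment.
    """

    def __init__(self, event_script):
        self._script = list(event_script)
        self._pos = -1
        self.stops = 0  # how often the application had to be halted

    def run_to_next_event(self):
        """Continue the application; return False once it has terminated."""
        if self._pos + 1 >= len(self._script):
            return False
        self._pos += 1
        self.stops += 1
        return True

    def current_event(self):
        return self._script[self._pos][0]

    def inspect_object(self, name):
        return self._script[self._pos][1][name]


def acquire_with_breakpoints(monitor, selected_objects):
    """Stop at every interesting event, inspect the selected objects,
    store the result in the animation queue, then continue."""
    animation_queue = []
    while monitor.run_to_next_event():
        state = {name: monitor.inspect_object(name) for name in selected_objects}
        animation_queue.append((monitor.current_event(), state))
    return animation_queue


# Usage: only the selected object T1 is inspected at each stop.
script = [
    ("T1 send", {"T1": "running", "M1": "1 message"}),
    ("T2 receive", {"T1": "ready", "M1": "empty"}),
]
monitor = SimulatedMonitor(script)
queue = acquire_with_breakpoints(monitor, ["T1"])
```

The counter `stops` makes the cost visible: the application is halted once per interesting event, which is the intrusion the text identifies as the drawback of this method.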
Visualization with Traces
The use of event traces is much more common in the visualization and performance analysis systems described in the literature than the method mentioned previously. Traces can be generated very easily, either by source code instrumentation or by using an instrumented communication library like the PICL library [11]. In most systems, data acquisition and visualization are completely independent. While the program is running, the trace data is stored in a file. The analysis of the program execution is performed after the program has finished, i.e. off-line. The user can neither modify the execution of the program nor get additional information which is not contained in the trace data. Also, the scalability of such an approach is a problem due to the huge amount of data which has to be collected and transferred. In our current version of VISTOP we try to combine the advantages of on-line visualization with those of data acquisition by traces. This is possible because our visualization system is integrated in the TOPSYS tool environment. The monitoring system collects traces of the events required for the visualization. These are visualized in parallel with the execution of the program to be analyzed. However, the user is not forced to trace the whole execution. The user can start the visualization at any time, and he can suspend the execution of the program when he needs additional information. Thus, the amount of information collected can be kept to a minimum, and the intrusion caused by the monitoring system and the data transfer is much lower than in the version of VISTOP that used breakpoints for data acquisition. Of course there remains the problem that any intrusion may introduce indeterminism in a parallel program. This could be partially solved by an execution replay mechanism [17].
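A minimal sketch of on-line trace collection, assuming a trace buffer that can be switched on and off and that is read incrementally while the program keeps running. The class `TraceBuffer` and its methods are hypothetical and only illustrate the two properties named above: tracing need not cover the whole execution, and the visualization consumes records in parallel with the program rather than after it has finished.

```python
class TraceBuffer:
    """Collects events only while tracing is enabled and hands them out
    incrementally (hypothetical helper, not part of TOPSYS)."""

    def __init__(self):
        self.enabled = False
        self._records = []
        self._read_pos = 0

    def record(self, event):
        """Called on the monitoring side for every observed event."""
        if self.enabled:
            self._records.append(event)

    def drain(self):
        """Called on the visualization side: return events not yet shown."""
        new_records = self._records[self._read_pos:]
        self._read_pos = len(self._records)
        return new_records


# Usage: the first event occurs before the user starts the visualization
# and is therefore never collected or transferred.
trace = TraceBuffer()
trace.record("T1 send")        # tracing still off
trace.enabled = True           # user starts the visualization
trace.record("M1 receive")
first_batch = trace.drain()
trace.record("T2 send")
second_batch = trace.drain()
```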
6.4. Time and Ordering of Events
A visualization system should give a picture consistent with reality with respect to time. In distributed memory computing environments there is usually no common global time base available. This means that the visualization of the execution of a parallel program does not necessarily show all events in the sequence in which they were actually executed. Within a distributed system without a global time base, a total order of events cannot be achieved. Figure 9 shows two send events with no causal relation. An ideal observer might recognize that both events actually happen at the same time in one execution run. VISTOP's sorting algorithm will show two events: task T1 sends a message to mailbox M1, and task T2 sends a message to mailbox M2. The order in which the two events are visualized is not deterministic. For the visualization, however, this is sufficient.

[Figure 9. Example for an indeterministic visualization sequence. Time lines of Processor 1 and Processor 2 with the events T1 send, M1 receive, T2 send and M2 receive.]

The sequence of the events in the example does not matter, as the two events have no causal relation, i.e. no influence on the computation of the program. The sequence shown in the visualization system only has to be consistent with causality. An event e1 that can affect an event e2 has to be visualized before the event e2. Events that can possibly be executed concurrently and cannot affect each other can be visualized in either order. This is a very informal description of the "happened before" relation introduced by Lamport [16]. Both data acquisition methods described in the previous section satisfy this weaker condition. The distributed monitoring system (software and hardware) guarantees that an application will be stopped in a consistent state [3]. In the second data acquisition method the traces are sorted in a way that is consistent with causality.
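One way to obtain an ordering that is consistent with causality is a topological sort over the "happened before" pairs. The sketch below uses this standard technique with Python's `graphlib` module; it is not VISTOP's sorting algorithm, which the paper does not describe, and the function name `causal_order` is an assumption. The example encodes only the two send/receive dependencies of the events named in Figure 9.

```python
from graphlib import TopologicalSorter


def causal_order(events, happened_before):
    """Return the events in an order consistent with causality.

    events:          iterable of event identifiers
    happened_before: iterable of (earlier, later) pairs, e.g. a send and
                     the matching receive, or two consecutive events of
                     the same task
    Events that are not related by any pair may appear in either order.
    """
    sorter = TopologicalSorter()
    for event in events:
        sorter.add(event)
    for earlier, later in happened_before:
        sorter.add(later, earlier)  # `earlier` must be emitted first
    return list(sorter.static_order())


events = ["T1 send", "M1 receive", "T2 send", "M2 receive"]
order = causal_order(
    events,
    [("T1 send", "M1 receive"), ("T2 send", "M2 receive")],
)
```

The result always places each send before its receive, while the relative position of `T1 send` and `T2 send` is left to the sorter, which mirrors the indeterminism discussed for Figure 9.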
7. Ongoing Work
Implementations of VISTOP as part of the TOPSYS tool environment have been distributed for about two years on a non-commercial basis. TOPSYS is available for the iPSC/2 and iPSC/860 multiprocessor systems as well as for networks of SUN SPARCstations. Approximately 30 installations are currently in use at 15 different institutions. Most users of VISTOP so far have used the tool to get an overview of the dynamic behavior of their parallel programs and a deeper understanding of them. The work on TOPSYS is funded by the German Science Foundation and will continue until December 1994. In the near future our research will address the following topics:
Graphical visualization techniques have to be integrated into VISTOP to deal with massively parallel systems incorporating hundreds or thousands of processors (objects). While properties like the selection of objects and program phases are already very helpful for increasing the scalability of VISTOP, the mechanisms available in VISTOP to visualize massively parallel applications are not yet sufficient. Currently we are implementing "composite objects" in VISTOP. The user can combine basic objects (i.e. tasks, mailboxes and semaphores) and composite objects into new objects. Visualization can be performed at any level. The user can switch between different levels by zooming. In the long run, such object hierarchies appropriate for visualization might already be created during the design process for a parallel application. This feature will be enhanced by a virtual desktop which shows the location of the current part of the screen within the whole communication graph.

Application-oriented visualization based on primitives offered by VISTOP. Integration of new visualization views better suited for abstractions used within the application program, e.g. visualization of transactions in parallel databases.

Improving the portability and adaptability of VISTOP with respect to different programming models. This means eliminating the restriction to applications based on message passing programming models and supporting new programming models which are appropriate for machines with virtual shared memory.
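The composite objects mentioned in the first topic can be pictured as a tree whose leaves are basic objects. The following sketch is purely illustrative, since the paper describes the feature as work in progress; `Node` and `visible_objects` are hypothetical names, and "zooming" is modeled simply as the number of hierarchy levels that are expanded.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A basic object (no children) or a composite object (with children)."""
    name: str
    children: list = field(default_factory=list)

    def is_basic(self):
        return not self.children


def visible_objects(root, depth):
    """Names displayed when `depth` levels below `root` are expanded."""
    if depth == 0 or root.is_basic():
        return [root.name]
    shown = []
    for child in root.children:
        shown.extend(visible_objects(child, depth - 1))
    return shown


# Usage: a composite "pipeline" built from a task and a mailbox, combined
# with a semaphore into the top-level object "application".
pipeline = Node("pipeline", [Node("T1"), Node("M1")])
application = Node("application", [pipeline, Node("S1")])
```

Zooming in by one level replaces a composite object by its components, so `visible_objects(application, 1)` shows the pipeline as a single object, while a depth of 2 shows its task and mailbox individually.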
References
[1] M. Aspnäs, R. J. R. Back, and T. Långbacka. Millipede - A Programming Environment Providing Visual Support for Parallel Programming. In Proc. of the European Workshops on Parallel Computing, March 1992.
[2] Mary L. Bailey, David Socha, and David Notkin. Debugging Parallel Programs using Graphical Views. In Proc. of the International Conference on Parallel Processing (Vol. II Software), pages 46-49, August 1990.
[3] J. Bauer, T. Bemmerl, and T. Treml. Leistungsanalyse von verteilten Beobachtungs- und Bewertungswerkzeugen. SFB-Bericht 342/14/90 A, Institut für Informatik, Technische Universität München, July 1990.
[4] T. Bemmerl. The TOPSYS Architecture. In H. Burkhart, editor, Proceedings of CONPAR90 VAPP IV, volume 457 of LNCS, pages 732-743, Zürich, Switzerland, 1990. Springer-Verlag.
[5] T. Bemmerl and A. Bode. An Integrated Environment for Programming Distributed Memory Multiprocessors. In A. Bode, editor, Distributed Memory Computing - 2nd European Conference, EDMCC2, volume 487 of LNCS, pages 130-142, München, April 1991. Springer-Verlag.
[6] T. Bemmerl, A. Bode, P. Braun, O. Hansen, T. Treml, and R. Wismüller. The Design and Implementation of TOPSYS. SFB-Bericht 342/16/91 A, Institut für Informatik, Technische Universität München, July 1991.
[7] T. Bemmerl, R. Lindhof, and T. Treml. The Distributed Monitor System of TOPSYS. In H. Burkhart, editor, Proceedings of CONPAR90 VAPP IV, volume 457 of LNCS, pages 756-765, Zürich, Switzerland, 1990. Springer-Verlag.
[8] T. Bemmerl and T. Ludwig. MMK - A Distributed Operating System Kernel with Integrated Dynamic Loadbalancing. In H. Burkhart, editor, Proceedings of CONPAR90 VAPP IV, volume 457 of LNCS, pages 744-755, Zürich, Switzerland, 1990. Springer-Verlag.
[9] Marc H. Brown. Perspectives on Algorithm Animation. In Elliot Soloway, Douglas Frye, and Sylvia B. Sheppard, editors, Human Factors in Computing Systems, Washington, May 1988. ACM.
[10] Marc H. Brown and Robert Sedgewick. A System for Algorithm Animation. Computer Graphics, 18(3):177-186, 1984.
[11] G. A. Geist et al. PICL: A Portable Instrumented Communication Library, C Reference Manual. Tech. Report ORNL/TM-11130, Oak Ridge Nat'l Lab., Oak Ridge, Tenn., 1990.
[12] Joan M. Francioni, Larry Albright, and Jay Alan Jackson. Debugging Parallel Programs Using Sound. In Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, pages 60-68, Santa Cruz, CA, May 1991. ACM/ONR.
[13] Michael T. Heath and Jennifer A. Etheridge. Visualizing the Performance of Parallel Programs. IEEE Software, 8(9):29-39, September 1991.
[14] Alfred A. Hough and Janice E. Cuny. Belvedere: Prototype of a Pattern-Oriented Debugger for Highly Parallel Computations. In Sartaj K. Sahni, editor, Proc. of the International Conference on Parallel Processing, pages 735-738, August 1987.
[15] James Arthur Kohl and Thomas L. Casavant. Use of PARADISE: A Meta-Tool for Visualizing Parallel Systems. In Proc. of the 5th International Parallel Processing Symposium, pages 561-567, Anaheim, CA, 1991. IEEE Computer Society Press.
[16] L. Lamport. Time, Clocks, and the Ordering of Events in a Distributed System. CACM, 21(7):558-565, July 1978.
[17] Eric Leu and André Schiper. Execution replay: a mechanism for integrating a visualization tool with a symbolic debugger. In L. Bougé, M. Cosnard, Y. Robert, and D. Trystram, editors, Parallel Processing: CONPAR 92 - VAPP V, pages 55-78, Lyon, France, September 1992. Springer-Verlag.
[18] Laura Bagnall Linden. Parallel Program Visualization Using ParVis. In Margaret L. Simmons and Rebecca Koskela, editors, Performance instrumentation and visualization, ACM Press Frontier Series, pages 157-187. ACM Press, 1990.
[19] Allen D. Malony, David H. Hammerslag, and David J. Jablonsky. Traceview: A Trace Visualization Tool. IEEE Software, 8(9):19-28, September 1991.
[20] Thomas G. Moher. PROVIDE: A Process Visualization and Debugging Environment. IEEE Transactions on Software Engineering, 14(6):849-857, June 1988.
[21] Brad A. Myers. Visual Programming, Programming by Example, and Program Visualization: A Taxonomy. In Proc. ACM SIGCHI '86 Conference on Human Factors in Computing Systems, pages 59-66. ACM, April 1986.
[22] Daniel A. Reed, Robert D. Olson, Ruth A. Aydt, Tara M. Madhyastha, Thomas Birkett, David W. Jensen, Bobby A. A. Nazief, and Brian K. Totty. Scalable Performance Environments for Parallel Systems. In Proc. of the Sixth Distributed Memory Computing Conference, pages 562-569, Portland, OR, April 28 - May 1, 1991. IEEE.
[23] Steven P. Reiss. PECAN: Program Development Systems that Support Multiple Views. IEEE TSE, 11(3):276-285, 1985.
[24] Gruia-Catalin Roman and Kenneth C. Cox. A Declarative Approach to Visualizing Concurrent Computations. IEEE Computer, 38(10):25-36, October 1989.
[25] John T. Stasko. Tango: A Framework and System for Algorithm Animation. IEEE Computer, 11(9):27-39, September 1990.
[26] Edward T. Williams and Gary B. Lamont. A Real-Time Parallel Algorithm System. In Proc. of the Sixth Distributed Memory Computing Conference, pages 551-561, Portland, OR, April 28 - May 1, 1991. IEEE.