A Debugging Environment for PVM
Kalyan S. Perumalla, Udaya B. Vemulapati
CS-TR-93-10
Department of Computer Science, University of Central Florida
P. O. Box 162362, Orlando, FL 32816-2362
Contents

1 Introduction
  1.1 Distributed Computing Systems
  1.2 Parallel Virtual Machine (PVM)
    1.2.1 Scope
    1.2.2 PVM Software Interface
    1.2.3 PVM Internals and Implementation Details
    1.2.4 PVM Programming Tools
2 The Debugging Environment
  2.1 Overview
  2.2 Task Debugging
  2.3 Log of PVM Calls
  2.4 Group Debugging Option
  2.5 Monitor: A Better Interface to PVM Console
  2.6 Implementation Details
    2.6.1 Internals of the Monitor
    2.6.2 Files Changed in PVM Software
3 Tests
4 Future Work
5 Summary
6 Bibliography
A PVM Monitor User's Guide
  A.1 PVM Monitor Control Window
    A.1.1 Console/Configuration/Task List
    A.1.2 Detach/Re-attach Monitor
    A.1.3 Quit
  A.2 PVM Console Window
  A.3 PVM Configuration Window
  A.4 PVM Task List
  A.5 Notification Window
  A.6 Task Debugging Window
  A.7 Group Specification Window
  A.8 Group Debugging Window
Abstract
PVM (Parallel Virtual Machine) is one of the most popular implementations of distributed systems. In this report, we present a debugging environment for programming in PVM. The environment provides four features: running PVM tasks under the control of debuggers, a run-time log of all the PVM calls made by the tasks, facilities for debugging tasks as groups, and an improved interface to the PVM Console. Its most important property is that it operates entirely at run time. The environment can also be attached to or detached from the PVM system at any point during the execution of applications. It is user-friendly, with a graphical user interface and simplified window management. The environment has been implemented and tested on different architectures. The software, which consists of a "Monitor" program and a set of modifications to the PVM software system, is available from the author on request.
1 Introduction
1.1 Distributed Computing Systems
A distributed computing system can be regarded as consisting of multiple autonomous, often heterogeneous, processors that do not share primary memory, but cooperate by sending messages over a common network [9]. Such concurrent computing environments based on networks of computers can be an effective, viable, and economically attractive complement to hardware multiprocessors/supercomputers [8]. Distributed computing systems are advantageous in several ways [6]. Some of them are:
- better computer performance at a lower cost
- utilization of idle cycles and resources
- capability to match hardware to components of applications
- partial failure and fault tolerance.

The number field sieve project of Lenstra and Manasse [2], which factored the ninth Fermat number using over 1000 computers worldwide, is an example of an experiment mainly aimed at and designed for demonstrating the viability of distributed computing. Some examples of applications of distributed computing are:
- the Global environment simulation project, requiring vector processing, distributed multiprocessors, high speed scalar computation, and real time graphics [11]
- fluid dynamics applications, like BF3D, using large amounts of memory, generating large amounts of data, and requiring 2D and 3D graphics terminals [11]
- calculation of the electronic structure of metallic alloys [8]
- solution of molecular dynamics [8].

Several of the applications are inherently heterogeneous, requiring a variety of hardware capabilities, like computing cycles, large amounts of memory, graphic visualization interfaces, and so on. Each of the requirements is satisfied by a particular type of architecture. Also, the different capabilities are required at different stages of the application, mostly as a pipeline.

There exist several distributed environments providing a combination of a variety of capabilities. Some such environments are Locus [6], ISIS (Cornell and Ohio State Universities), Linda [10], Cosmic Environment [7], Camelot Transaction Process facility [9], and PVM [11].
1.2 Parallel Virtual Machine (PVM)

1.2.1 Scope

PVM is currently one of the most popular distributed computing software systems. It allows heterogeneous computing elements interconnected by one or more networks to be used as a single computational resource. It is being used by scientists and educators in the US and abroad [4]. Several of the US supercomputer centers provide PVM as a program infrastructure for scientists who wish to spread their computations over machines available at the centers. Educators at several universities use PVM as a teaching tool in parallel programming courses. According to [1], "PVM is a software package that allows a heterogeneous network of parallel and serial computers to appear as a single concurrent computational resource... It permits a network of heterogeneous Unix computers to be used as a single large parallel computer; thus large computational problems can be solved by using the aggregate power of many computers...".

PVM is a large-granularity environment, primarily targeted at applications that are collections of relatively independent components of a program. It supports heterogeneity at the application, machine, and network level [1]. It handles all data conversions that may be required if two computers use different data representations. PVM is highly portable. It is available for a very wide range of architectures, and is being ported to more architectures. It is also known for its robustness.

Discussion on the background, applications, use, and scope of PVM can be found in [11, 8]. Critical analysis of the strengths and weaknesses of the PVM software system can be found in [6]. PVM is available free of charge through the Internet. Details on getting and installing PVM are given in [1].
1.2.2 PVM Software Interface
PVM provides a straightforward and general interface that permits the description of various types of algorithms and their interactions, while the underlying infrastructure permits the execution of applications on a virtual computing environment that supports multiple parallel computation models [8]. The applications view the PVM system as a general and flexible parallel computational resource that supports common parallel programming paradigms (see Figure 1). Application programs access these resources by invoking function calls from within common procedural languages such as C and FORTRAN. Routines are provided that perform process spawning, message transmission and reception, barrier synchronization, dynamic grouping, etc. [11, 3].

Figure 1: PVM Hardware Configuration (two applications and their component instances running on the PVM system, which spans Cray, MasPar, Sun, Cube, SMM, DEC, VAX, and Butterfly machines connected by LAN 1 and LAN 2)

PVM consists of two parts: a daemon process that any user can install on a machine, and a user library that contains routines for initiating processes on other machines, for communicating between processes, and for changing the configuration of machines. Applications using the PVM system must be linked with the PVM library.

There have been several releases of the PVM software system. The latest release of PVM (version 3.0) has been vastly improved over the previous versions [1]. Several features such as fault tolerance, dynamic process groups, signaling, inter-process communication, multiprocessor integration, and process control have been either added or improved in the latest release. A partial list of the library routines provided for these features is given below. For a complete list, [1] can be consulted.
- Process Control: pvm_mytid(), pvm_exit(), pvm_kill(), and pvm_spawn().
- Information: pvm_parent(), pvm_pstat(), pvm_mstat(), pvm_config(), and pvm_tasks().
- Message Sending: pvm_send(), pvm_mcast(), pvm_mkbuf(), pvm_freebuf(), pvm_initsend(), pvm_pkbyte(), pvm_pkint(), pvm_pkcplx(), pvm_pkstr(), pvm_setsbuf(), and pvm_getsbuf().
- Message Receiving: pvm_recv(), pvm_nrecv(), pvm_upkbyte(), pvm_upkint(), pvm_upkcplx(), pvm_upkstr(), pvm_setsbuf(), and pvm_getsbuf().
- Signaling: pvm_sendsig(), and pvm_notify().
- Dynamic Process Groups: pvm_joingroup(), pvm_barrier(), pvm_bcast(), and pvm_lvgroup().
1.2.3 PVM Internals and Implementation Details
A brief outline is presented here. For a more detailed discussion, [1] can be consulted.

A process called the pvm daemon is to be run on every host that the user intends to use in a PVM application. Multiple applications can be run with the same set or overlapping sets of daemons. Also, the daemons are on a per-user basis. The daemons do not need any special privileges: any user with a valid account can install a daemon on a host.

There are several ways by which the virtual machine (the group of hosts with the pvm daemons running on them) can be set up. One way is to use a configuration file that has entries specifying the hosts that form the virtual machine, and optionally some system dependent details of each host. To set up the virtual machine, the pvm daemon is started on a host, with the configuration file specified as an argument to the daemon. That daemon, also called the master daemon, starts up the daemons on the hosts specified in the configuration file. Another way is to use the PVM Console program that comes along with the PVM system. The PVM Console provides a relatively easy interface to various status information and to control of the virtual machine. The PVM Console, when started up, also starts up the pvm daemon on the same host. Later on, other hosts can be added to the virtual machine, using the commands of the PVM Console. Yet another way is to start up just the master daemon, and initiate the application on that host. The application can then use the PVM library calls to start up the daemons on other hosts.

After the virtual machine (at least the initial configuration) is set up, applications can be started that use the virtual machine. Again, PVM processes can be spawned using the PVM Console, or from the UNIX command prompt, or by other processes. If an ordinary UNIX process intends to become a PVM process, it calls the PVM library routine pvm_mytid(), which enrolls it into PVM and assigns it a task identifier (TID). From then on, the process can use other PVM calls. If a PVM process intends to create another PVM process, it contacts its local pvm daemon, and that daemon spawns the process and returns the TID of the new process to the caller.

The virtual machine initially consists of a single master pvm daemon. Either by reading the configuration file, or in response to requests from the PVM Console or PVM processes, or any combination of these possibilities, the master daemon starts up the daemons on other hosts. It accomplishes this by using rexec or rsh. The standard input, output and error streams of the slave daemons are directed to the master daemon. Some handshaking and authentication are performed. The most current state of the virtual machine is also passed to the new daemons. From then on all the pvm daemons are ready to cater to requests from the local processes and other pvm daemons.

In response to a request from either a local process or another pvm daemon to spawn a process, a pvm daemon forks and execs the given executable, and redirects the process's standard input, output and error to its own. When the spawned task tries to connect back to the daemon, some handshaking and authentication are performed, and a UNIX socket is opened between the process and the local daemon, through which the two communicate from then on. The process is then ready to use the other PVM library calls.
1.2.4 PVM Programming Tools
Presently there are a few tools that help the programmer to visualize, profile, trace or debug application programs in PVM. Some of the tools are:
- the PvmTaskDebug option, provided as a flag option while spawning a PVM process, which executes the shell script pvm3/lib/debugger [1]. The script starts an xterm window on the host PVM was started on and spawns the task under a debugger in this window. This is not entirely adequate: the script appears not to have been thoroughly tested as supplied, and appears to support a bare minimum of functionality.
- the HeNCE system, providing trace feedback specifically tailored to its programming paradigm, and suited for animated performance monitoring and measurement [8]
- ParaGraph, displaying events generated by parallel programs, using trace files; a post-mortem evaluation tool for profiling, load-balancing, and performance evaluation [8]
- BEE, providing dynamic event monitoring [5]
- Xab, providing run-time monitoring of PVM programs [4]

Though each of the above tools has its own realm of utility, none of them allows for run-time debugging of an arbitrary subset of components of an application program. Xpvm [8] is the one that comes closest to satisfying this, in that a separate debugging window is provided for each component, and it is the user's responsibility to manage the multiple windows [8]. Xpvm was part of PVM version 1.0, but was withdrawn from public release in later versions [6]. ParaGraph is a post-mortem analysis tool, as is HeNCE. ParaGraph uses PICL (Portable Instrumented Communication Library) for its tracing. Xab mostly deals with the communication/event displays, along with tracing using ParaGraph. Besides, there is no official word on whether or when Xab will be available. None of the existing tools provides complete run-time control of the processes. As mentioned above, the PvmTaskDebug option is inadequate for complete control of processes that are spawned on several hosts of different architectures.
2 The Debugging Environment

2.1 Overview

The debugging environment comes in two parts: an X-Window based program called the "Monitor" program, and a set of modifications to the PVM software that allow the Monitor to control the PVM processes.

The environment provides a complete interface to debug applications that use PVM. PVM applications can be run as usual, just as they are run with the unmodified PVM system. The Monitor program can be added to the PVM environment at any time before, during or after the time the applications are run. The Monitor can also be detached from the system at any time. The most significant feature of the environment is that the entire debugging interface operates at run time.

The term "user" is used from here on to denote a programmer writing applications using PVM.

The Monitor provides the user complete control of the processes of the PVM application. When the Monitor is attached to the PVM system, the PVM processes are spawned under the control of a debugger (a copy of the debugger is run for each process). The display and control of the debuggers are provided at a single central point, which is the host on which the Monitor is executed. The Monitor need not necessarily be run on the same host as the master pvm daemon. Also, for each process that is being controlled by the Monitor, a log of all the PVM library routines that the process calls is displayed at run time. Applications that are to be debugged with this environment have to be linked with the modified PVM library so that they can cooperate with the Monitor.
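The effect of linking against the modified library can be pictured as wrapping each library routine so that the call is reported before the routine itself runs. The Python sketch below is only a model of that idea and is not the authors' C code: `logged`, `fake_recv`, and the list standing in for the Monitor's log are hypothetical names introduced for illustration.

```python
import functools

def logged(log):
    """Return a decorator that records each call in `log` before running it."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            # Record the call first, so the entry exists even if the
            # routine blocks or fails afterwards.
            log.append(f"{func.__name__}({', '.join(map(repr, args))})")
            return func(*args)
        return wrapper
    return decorate

call_log = []  # hypothetical stand-in for the per-process log window

@logged(call_log)
def fake_recv(tid, msgtag):
    # Hypothetical stand-in for a library routine that misbehaves.
    raise RuntimeError("simulated failure inside the routine")

try:
    fake_recv(262146, 7)
except RuntimeError:
    pass
```

Because the entry is appended before the wrapped routine executes, the last line of `call_log` still names the failing call and its arguments after the failure.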
2.2 Task Debugging
The environment provides the maximum possible control over the processes, as they are run under the control of debuggers. Both the pace and order of execution of the processes can be controlled by the user. Using the debuggers, the local correctness of the processes can be verified. Using the log of PVM library calls for communication, the correctness of inter-relations among the processes can be verified.
2.3 Log of PVM Calls
A log of the PVM library routines that are called by a process is displayed in the process's window. This is done for every process. The significant features of the log are:
- it is produced at run time
- each call is detected even before the action associated with the call is carried out.

These features allow several types of errors to be detected:
- Detection of deadlocks is a direct outcome. For example, if there is a circular dependency of message reception among a subset of the processes, the processes simply remain in waiting states. Since the information about the PVM calls they make to receive messages is displayed at run time (for each of the processes) by the Monitor even before the waiting begins, the deadlock becomes clearly evident by looking at the last PVM calls made by each of the stalled processes. In fact, any invalid message, communication, or dependency is clearly reflected in the log of the PVM calls.
- An erroneous call to any PVM library routine is easily detected. For example, any invalid arguments to a PVM call that may be causing the process to act abnormally (like dumping core) can be detected by looking at the arguments to the last PVM call that the process made before the abnormal action.

The log of PVM calls is not destroyed even after the process exits. This allows a post-mortem analysis to be carried out.
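The deadlock check that a user performs by eye on the logs can be written down as a search for a cycle among the last receive calls. The following Python sketch assumes a simplified picture in which each stalled task waits on exactly one sender; the function name `find_wait_cycle` and the dictionary layout are illustrative assumptions, not part of the Monitor.

```python
def find_wait_cycle(waiting_on):
    """Return a list of task ids forming a circular wait, or None.

    `waiting_on` maps a stalled task id to the task id named in its
    last logged receive call.
    """
    for start in waiting_on:
        path = []
        seen = set()
        task = start
        # Follow the chain of "waits for" links until it ends or repeats.
        while task in waiting_on and task not in seen:
            seen.add(task)
            path.append(task)
            task = waiting_on[task]
        if task in seen:
            # The chain re-entered the path: the cycle starts at `task`.
            return path[path.index(task):]
    return None

# Tasks 1, 2 and 3 wait on each other in a ring; task 4 waits on task 1.
last_calls = {1: 2, 2: 3, 3: 1, 4: 1}
cycle = find_wait_cycle(last_calls)
```

Here `cycle` identifies the three tasks in the ring, while a chain that ends at a task that is not waiting yields `None`.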
2.4 Group Debugging Option
An important and powerful option provided with the Monitor is the ability to debug a set of processes as a group. A subset of active processes can be selected as a group and can be run in a lock-step fashion. Much like the Single Program Multiple Data model of computation, the processes can be made to execute their "next statements" synchronously, and wait for all of them to finish one statement before executing the next. This is particularly suited for applications using the master-slave or symmetric-processes models of computation. The group execution option also makes it easier to give commands to the processes (debuggers): instead of giving a single command, say a stop at 20 command, to each of the processes individually, the same command can be given to the group, and it is then transmitted to all the processes belonging to the group.
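Lock-step group execution can be modeled as broadcasting one command to every member of the group and waiting for all of them to finish before the next command is sent. The Python sketch below illustrates this under stated assumptions: `FakeDebugger` and `group_command` are hypothetical names, and a real debugger session is replaced by an object that merely records the commands it receives.

```python
from concurrent.futures import ThreadPoolExecutor

class FakeDebugger:
    """Hypothetical stand-in for one debugger controlling one task."""
    def __init__(self, tid):
        self.tid = tid
        self.history = []

    def run_command(self, command):
        # A real debugger would execute the command on its task here.
        self.history.append(command)
        return f"task {self.tid}: {command} done"

def group_command(debuggers, command):
    """Send one command to every member and wait until all have finished."""
    with ThreadPoolExecutor(max_workers=len(debuggers)) as pool:
        # list(map(...)) returns replies in member order, and only after
        # every member has completed the command.
        return list(pool.map(lambda d: d.run_command(command), debuggers))

group = [FakeDebugger(tid) for tid in (1, 2, 3)]
for command in ["stop at 20", "next", "next"]:
    replies = group_command(group, command)
```

Each pass through the loop completes on all three members before the following command is issued, which is the lock-step property described above.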
2.5 Monitor: A Better Interface to PVM Console
The Monitor program can also be used simply as a better interface to PVM, since the Monitor's debugging capability is optional. For example, using the Detach Monitor option of the Monitor, the process-controlling feature of the Monitor can be turned off, and the graphic interface to the PVM Console can be used just as the Console would be used otherwise.
2.6 Implementation Details
As mentioned above, the debugging environment software consists of a "Monitor" program, and a set of modifications to the PVM software. The Monitor program has been designed to provide the user a run-time debugging environment with a graphic interface.
2.6.1 Internals of the Monitor
The Monitor is itself implemented as a PVM task. Initially it starts up the PVM Console by forking and execing the Console with its standard input, output and error streams directed to the Monitor through pipes. The Monitor then enrolls itself as a PVM task by calling pvm_mytid(). It opens an Internet socket, and binds it to a vacant port number. The address of the host on which the Monitor resides, coupled with the port number, is unique across the Internet, and completely identifies the Monitor. The Monitor announces its existence (address,port) to all the daemons that form the virtual machine at that moment. To allow for this, the PVM software has been modified to accommodate a special request pvm_mon_enter() from the Monitor program. The pvm_mon_enter() request informs the local pvm daemon about the Monitor's listening address. The local daemon then broadcasts the same information to all other daemons in the configuration. If any more hosts are later added to the configuration (which will be done by the master pvm daemon), the information about the Monitor is passed on to the new daemons too.

From that point on, when any daemon is about to spawn a process, it does so in such a way that the new process is completely under the control of the Monitor. This is accomplished as follows:
1. Since the daemon knows the (address,port) of the Monitor, it connects to the Monitor. The Monitor itself would be listening on the same (address,port) for such connections. Let the resulting connected socket-pair be called s_out.
2. When connected, the daemon passes some information about the process to be spawned to the Monitor, such as the executable name, the TID, the time of spawning, etc.
3. The daemon then opens a socket in the Internet domain, binds it to a vacant port, and passes this socket's (address,port) pair to the Monitor. The daemon then waits listening on that socket.
4. After getting the (address,port) of the daemon's temporary socket, the Monitor connects to the daemon. Let the resulting connected socket-pair be s_in.
5. The daemon listens one more time on the same listening socket, and the Monitor connects once again to the daemon. Let the resulting connected socket-pair be s_err (see footnote 2).
6. The daemon now makes s_in, s_out, and s_err the standard input, output and error streams, respectively, of the new task to be spawned. It now forks and execs the new process to be spawned under the control of a debugger (see footnote 1). The Monitor in turn adds the (TID, s_in, s_out, s_err) tuple to the list of processes that it is monitoring.
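The connection sequence described above can be sketched with ordinary TCP sockets on the loopback interface. This Python model is an illustration only: the function names, the JSON line used to carry the task information, and the use of a thread to play the Monitor are all assumptions made here, and no debugger or PVM daemon is involved.

```python
import json
import socket
import threading

def monitor_accept_task(listener, tasks):
    """Monitor side: accept one daemon and build the task's three streams."""
    s_out, _ = listener.accept()                 # first connection: daemon -> Monitor
    info = json.loads(s_out.makefile("r").readline())
    addr = (info["host"], info["port"])          # daemon's temporary socket
    s_in = socket.create_connection(addr)        # Monitor connects back (s_in)
    s_err = socket.create_connection(addr)       # Monitor connects back again (s_err)
    tasks.append((info["tid"], s_in, s_out, s_err))

def daemon_prepare_spawn(monitor_addr, tid, executable):
    """Daemon side: obtain s_in, s_out, s_err for a task about to be spawned."""
    s_out = socket.create_connection(monitor_addr)
    temp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    temp.bind(("127.0.0.1", 0))                  # port 0 asks for any vacant port
    temp.listen(2)
    host, port = temp.getsockname()
    info = {"tid": tid, "executable": executable, "host": host, "port": port}
    s_out.sendall((json.dumps(info) + "\n").encode())
    s_in, _ = temp.accept()
    s_err, _ = temp.accept()
    temp.close()
    return s_in, s_out, s_err

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
tasks = []
worker = threading.Thread(target=monitor_accept_task, args=(listener, tasks))
worker.start()
d_in, d_out, d_err = daemon_prepare_spawn(listener.getsockname(), 42, "slave")
worker.join()

tid, m_in, m_out, m_err = tasks[0]
m_in.sendall(b"next\n")        # a debugger command typed at the Monitor
command = d_in.recv(16)        # arrives on what would be the task's stdin
d_err.sendall(b"oops\n")       # error output from the task side
error_text = m_err.recv(16)    # shows up at the Monitor

for sock in (d_in, d_out, d_err, m_in, m_out, m_err, listener):
    sock.close()
```

In this model the temporary listening socket is created per task, so the two return connections cannot be confused with connections that belong to another task being spawned at the same time.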
2.6.2 Files Changed in PVM Software

- ddpro.c: To broadcast the Monitor's address from the local daemon to the other daemons
- lpvm.c: To connect the PVM library of the process to the Monitor so that the calls can be logged
- lpvmgen.c: To modify all the PVM library routines so that calls to them can be logged by the Monitor
- mon_util.c: A utility function added
- pvmd.c: To modify the pvm daemon to spawn the process under debugger control after making connections with the Monitor
- startup.c: To pass on the Monitor address to newly initiated daemons
- tdpro.c: To let the Monitor program request the local daemon to register and broadcast its address
- ddpro.h: To add a new message type for inter-daemon communication
- pvm3.h: To add a prototype
- tdpro.h: To add a new message type for Monitor-local daemon communication

Footnotes to Section 2.6.1

1. In fact, the script pvm3/lib/debugger is invoked, which in turn invokes the debugger appropriate to the host's architecture. The pvm3/lib/debugger script that is supplied along with the PVM software has been modified slightly.
2. It is important that, after the first connection from the daemon to the Monitor, the Monitor connects back to the daemon for the remaining two connections, instead of the daemon connecting to the Monitor. This is because connections for different processes could occur at the same time, and possibly get mixed up.
3 Tests

The modified PVM software was installed on two different architectures (SUN4 and PMAX). The Monitor program (based on the Athena Widgets of X11R5) was compiled and tested on a Sun workstation. Tasks were spawned on different Sun workstations sharing file systems, as well as on a PMAX machine. Implementation of the program on RISC6000 is in progress. The standard example programs that come along with the PVM release have been used in testing. For group debugging, the master program that spawns several instances of slave has been used, where all the slaves were split into one or more groups and run in lock-step fashion (see Figure 2).
4 Future Work

The following can be done as a continuation of the work:
- Application Management: A graphic interface to compiling/building the modules of an application would make the development and maintenance stages of the application easier for the user. It would also allow the user to keep track of the location of each module of the application in the virtual machine.
- Graphic Communication Event Display: A graphic display of the communication events provides visualization of process interaction to the user, and also allows for faster error detection.
- Porting: The debugging environment software can be tested on several more architectures.
- Logging PVM calls to files: Options can be provided to record the log of PVM calls by PVM processes to files, thus allowing for post-mortem analysis.
5 Summary

A debugging environment was presented for programming with the Parallel Virtual Machine (PVM). The environment consists of two parts: an X Windows-based program, called the Monitor, that provides a graphic interface, and a set of modifications to the PVM software system. The Monitor provides the maximum control possible over the PVM processes. The processes are initiated under debuggers which are controlled by the Monitor. The PVM software was modified to allow processes to be spawned under the Monitor's control. The software was also modified to overload the PVM library routines, so that information about the calls made to them by the processes is sent to the Monitor, which gathers the calls into a log. The Monitor can be dynamically attached to the PVM system. Options are provided to detach/re-attach the Monitor to the PVM system dynamically at any moment and as often as desired. A "group debugging" feature is provided to allow a set of processes to be treated as a group for debugging purposes.

Figure 2: A view of a debugging session

Some of the significant features of the environment include:
- Complete control of the pace and order of execution of individual processes
- Verification of the local correctness of the processes using the debuggers, as well as of the correctness of inter-relationships among the processes using the log of PVM calls for each process
- The entire session operates at run time
- Facility to debug processes as groups
- Simplified window management for the user
- Usage of Athena Widgets (X11R5) for the graphic interface of the Monitor program, making the program portable
- Easy detection of deadlocks
6 Bibliography

References

[1] Al Geist, et al., PVM 3.0 User's Guide and Reference Manual, ORNL Technical Report TM-12187, February 1993.
[2] A. Lenstra, M. Manasse, "The Number Field Sieve", Proc. Symposium on the Theory of Computing, Baltimore, May 1990.
[3] Adam Beguelin, et al., A User's Guide to PVM Parallel Virtual Machine, Technical Report ORNL/TM-11826, Oak Ridge National Laboratory, July 1991.
[4] Adam Beguelin, Xab: A Tool for Monitoring PVM Programs, Oak Ridge National Laboratory Technical Report, June 1992.
[5] Bernd Bruegge, A Portable Platform for Distributed Event Environments, ACM SIGPLAN Notices, December 1991. Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging.
[6] Brian K. Grant and Anthony Skjellum, The PVM Systems: An In-Depth Analysis and Documenting Study, Concise Edition, Lawrence Livermore National Laboratory Technical Report, August 1992.
[7] C. Seitz, et al., The C Programmer's Abbreviated Guide to Multicomputer Programming, Caltech Computer Science Report CS-TR-88-1, January 1988.
[8] G. A. Geist and V. S. Sunderam, Network Based Concurrent Computing on the PVM System, Oak Ridge National Laboratory Technical Report, June 1990.
[9] H. E. Bal, Programming Distributed Systems, Silicon Press, 1990.
[10] Mauricio Arango, et al., Adventures with Network LINDA, Supercomputing Review, October 1990.
[11] V. S. Sunderam, PVM: A Framework for Parallel Distributed Computing, ORNL internal Technical Report.
A PVM Monitor User's Guide

Usage: xpvm [pvm-exec]

where pvm-exec is the complete path name of the PVM Console executable.
A.1 PVM Monitor Control Window

When the Monitor is executed, a control window is opened with buttons with the following labels (see Figure 3):
- PVM Console
- PVM Configuration
- PVM Task List
- Detach Monitor
- Quit

A.1.1 Console/Configuration/Task List

Actions:
Choosing any of the first three options, by clicking on it with the left mouse button, pops up a window corresponding to the option. For example, choosing the button PVM Console pops up the PVM Console window. If the window is already open, it is raised above all windows so that it is completely visible.
Use:
These are used to open the corresponding windows for the first time, to open them again if closed before, or to raise them if they are covered by other windows.
A.1.2 Detach/Re-attach Monitor

Actions:
Choosing the option Detach Monitor detaches the Monitor from controlling any processes that may be initiated from that point on. After the option is chosen, the label of the option changes to Re-attach Monitor. This "toggle" button is highlighted in reverse video to reflect the fact that monitoring is turned off. Note that the processes that are currently being controlled continue to be controlled. When the button shows Re-attach Monitor, choosing the option re-attaches the Monitor to the PVM system, so that any processes initiated from then on will be initiated under the Monitor's control.
Use:
By default, after the Monitor is started, all PVM processes run by the user are initiated under the Monitor's control (i.e., initiated under a debugger whose standard input, output and error streams are redirected to sockets to the Monitor). This option is chosen to suspend or resume the above action. The options Detach Monitor and Re-attach Monitor can be chosen any number of times during a session (the overhead involved is not significant). These options are particularly useful when a subset of the processes of an application is to be debugged without caring about those processes of the application that are known to be correct, or that are insignificant for the type of problem being debugged. This reduces the amount of resources required for monitoring the application.

Figure 3: PVM Monitor Control Window
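The behavior of the toggle can be summarized as a small state model: the attached state only decides whether newly initiated processes come under control, and detaching never releases processes that are already controlled. The Python sketch below is a toy illustration; the class `MonitorModel` and its method names are hypothetical and do not correspond to the Monitor's source.

```python
class MonitorModel:
    """Toy model of the Detach/Re-attach toggle (names are hypothetical)."""
    def __init__(self):
        self.attached = True      # default state after the Monitor starts
        self.controlled = []      # tasks running under a debugger

    def toggle(self):
        self.attached = not self.attached
        # The button label names the action the next click would perform.
        return "Detach Monitor" if self.attached else "Re-attach Monitor"

    def task_spawned(self, tid):
        # Only tasks started while attached come under the Monitor's control;
        # tasks already controlled are never removed by detaching.
        if self.attached:
            self.controlled.append(tid)

monitor = MonitorModel()
monitor.task_spawned(1)
label_after_detach = monitor.toggle()
monitor.task_spawned(2)
label_after_reattach = monitor.toggle()
monitor.task_spawned(3)
```

In this run, task 2 is started while the Monitor is detached and therefore runs uncontrolled, while tasks 1 and 3 remain under control.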
A.1.3 Quit

Actions:
The Quit button is used to quit the monitoring session leaving all the daemons and processes in their current state. This is similar to the quit option of the PVM Console. Before quitting, a quit command is passed to the PVM Console, and the PVM daemons are informed that the monitoring is stopped.
Use:
This option is used to quit the monitoring session without altering the states of the PVM daemons and processes. To stop all daemons and processes, and leave the system in a clean state, use the Halt option from the PVM Console window (see the discussion on PVM Console for more details).
A.2 PVM Console Window
This window is a "wrapper" around the standard PVM Console program that comes with PVM release version 3.0. It is provided mainly for two reasons:
1. To allow a new user of the Monitor program to use it effectively even while getting familiar with it. The interface provides almost all the features of the PVM Console.
2. To carry out some time-consuming tasks using the PVM Console instead of the Monitor itself. For example, adding a host takes some time, and the Monitor cannot wait for the result, since it has to keep processing user input. By passing the command on to the PVM Console, the Monitor can continue to cater to the user's requests, accept connections from daemons for controlling new processes, display output from running processes, and so on.
The PVM Console window consists of several panes (see Figure 4).
Output Text Pane: The top pane is a text pane that represents the standard output of the PVM Console. All the output from the PVM Console appears in this pane. Whenever the output does not fit horizontally or vertically within the current size of the pane, scroll bars appear in that direction, with which the desired part of the output can be brought into the display area. The size of the pane itself can be varied by using the grip at the bottom right-hand corner of the pane. The output of the PVM Console can be cleared from the pane at any time by choosing the Clear option in the Button Pane (see below).
Button Pane: The pane below the Output Text Pane houses several buttons for actions corresponding to the commands of the PVM Console.
- Configuration: Choosing this button sends a conf command to the PVM Console. A list of the hostnames that are currently in the virtual machine configuration appears in the Output Text Pane.
- Processes: Choosing this button sends a ps command to the PVM Console. A list of the PVM processes that are currently running in the virtual machine appears in the Output Text Pane.
- Help: Choosing this button sends a help command to the PVM Console. A list of PVM Console commands appears in the Output Text Pane. This is only provided for completeness.
- Clear: Choosing this button clears the output of the PVM Console in the Output Text Pane, so that output that is no longer needed can be removed. No command is sent to the PVM Console.
- Reset: Choosing this button sends a reset command to the PVM Console. This kills all PVM processes except consoles and resets all the internal PVM tables and message queues. The daemons are left in an idle state.
- Halt: Choosing this button sends a halt command to the PVM Console. This kills all PVM processes, including consoles, and then shuts down PVM. All daemons exit.
- Close: Choosing this button pops down the PVM Console window, so that the user can temporarily free some screen space when the window is not needed. No command is sent to the PVM Console. The window can be restored by choosing the PVM Console button from the PVM Monitor Control window.
Figure 4: PVM Console Window
Host Add/Delete Dialog: The pane below the Button Pane contains the interface to add a host to, or delete a host from, the virtual machine configuration. The host name can be entered in the text entry box, and then the button labeled Add or Delete can be chosen to add or delete the given host, respectively.
Task Spawn Dialog: The pane below the Host Add/Delete Dialog contains the interface to spawn a PVM task using the PVM Console. The executable name of the task, its arguments (if any), the host name on which to spawn the task or the architecture name of the set of hosts on which the task can be spawned, and the number of instances of the executable to spawn can all be specified using the corresponding text entry boxes. Any entry other than the executable name can be left blank. After the specification, choosing the button labeled SPAWN TASK sends the appropriate command to the PVM Console to initiate the task. Pressing the Return key in any of the text entry boxes of this dialog has the same effect as choosing the SPAWN TASK button. The button labeled CLEAR SPECS can be chosen to quickly clear the entries in all the text entry boxes.
Manual Command Dialog: The last pane contains the dialog for entering a command to the PVM Console manually. Entering a command here and pressing Return has the same effect as typing the command on the standard input of the PVM Console when it is run alone.
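The Button Pane behaviour described above amounts to a small lookup from button label to console command. The following is a minimal sketch of that idea, not the Monitor's actual code: the command strings conf, ps, help, reset and halt come from the text, while the names BUTTON_COMMANDS, on_button and send_to_console are hypothetical.

```python
# Hypothetical sketch: maps Button Pane labels to the PVM Console commands
# named in the text. None marks a button that acts locally and sends nothing.
BUTTON_COMMANDS = {
    "Configuration": "conf",
    "Processes": "ps",
    "Help": "help",
    "Clear": None,   # only clears the Output Text Pane
    "Reset": "reset",
    "Halt": "halt",
    "Close": None,   # only pops down the window
}


def on_button(label, send_to_console):
    """Forward the command for `label`, if any, and return what was sent."""
    command = BUTTON_COMMANDS[label]
    if command is not None:
        # A newline terminates the command, as if it had been typed by hand.
        send_to_console(command + "\n")
    return command


# Stand-in for the console's standard input: collect what would be written.
sent = []
on_button("Processes", sent.append)
on_button("Clear", sent.append)
print(sent)
```

Keeping the local-only buttons in the same table makes it explicit that Clear and Close never reach the PVM Console.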
A.3 PVM Configuration Window
The PVM Configuration window provides a list of the hostnames that currently form the virtual machine, and a list of the active tasks on each of those machines. It provides a better interface than the PVM Console by showing the configuration at a glance. The window contains four panes (see Figure 5).
Machine List: It contains a list of the hostnames that are present in the current virtual machine configuration. On selecting a hostname from the list, the list of active tasks (their TIDs) running on that host is shown in the Task List (see below).
Task List: It contains a list of the active tasks (TIDs) that are running on the host selected in the Machine List.
Machine Add/Delete Dialog: It contains a dialog interface similar to the Host Add/Delete Dialog of the PVM Console window. Selecting a hostname from the Machine List places the hostname in the entry box of this dialog; to delete the host, the button labeled Delete can then be chosen. The dialog is repeated here so that the user need not retype the hostname to delete it, and need not search for the PVM Console window if it is not open.
Button Pane: It contains two buttons labeled Refresh and Close. Choosing Refresh updates the lists of machines and tasks that are currently displayed. This option is needed because the PVM Console and the PVM processes do not provide any feedback to the Monitor about the virtual machine configuration. Nor can the Monitor update the status too often on its own, since it needs to continue processing other input, as mentioned previously. The Close button can be chosen to temporarily close the PVM Configuration window. It can be popped up again using the PVM Configuration button from the PVM Monitor Control window.
Figure 5: PVM Configuration Window
A.4 PVM Task List
This window provides a list of active tasks and a list of exited tasks, along with facilities to debug them. When a new task is spawned and monitoring is on (see PVM Monitor Control Window, Detach/Re-attach Monitor), the PVM daemon spawns the task under the control of a debugger whose standard streams are redirected to sockets attached to the Monitor. When the connections are established, the Monitor adds the new task ID to its list of active tasks. The task is then ready to be debugged. The task remains under the debugger's control, and it has to be started using the debugger's run command.
Tasks can be debugged individually, or as a group in lock-step fashion. When a single task is debugged individually, a window is provided for the task that displays the task's standard streams (which are the same as those of the debugger), along with some control buttons. Standard input can be entered for the task as well as for the debugger. A log of all the calls made by the task to the PVM library routines is also provided. Note that all of this information is collected, and all of these actions take place, at run time.
When a set of tasks is debugged as a group, the group itself first has to be specified using a group-specification window. After the specification, a window representing the group of tasks is provided. The group window has the same interface as the individual task debugging window, with two exceptions. The first difference is that a command given in a group window is transmitted to all the tasks in the group. For example, a "stop at 20" followed by a "print x" given in a group window results in all the tasks of the group receiving both commands. The second difference is that a list of the tasks that belong to the group is provided at the bottom of the window, from which any task can be selected to have its standard output, standard error, and log of PVM calls displayed in the group window.
The display status of tasks is not destroyed when the tasks exit. Thus, it is possible to examine the status of exited tasks. For example, the log of the PVM calls that a task has made can be examined even after the task exits.
Figure 6: PVM Task List Window
The PVM Task List window contains the following (see Figure 6):
Active Task List: It contains a list of the tasks that are currently active. All the tasks on the list are spawned under the control of a debugger. The type of debugger depends on the architecture of the host on which the task was spawned. The tasks on this list may be "freshly active", in which case a run command has to be given to get them running (tasks are initially dormant under the debugger's control). To debug or examine an active task individually, the task's TID can be selected in the list, and the button labeled Debug Active can be chosen to open a debugging window for it.
Exited Task List: It contains a list of the tasks that have exited. Selecting a TID from this list results in some information about the task being displayed in the Task Information pane.
Figure 7: Task Initiation Notification Window
Task Information: It contains some useful information about the task whose TID has most recently been selected from the lists of active and exited tasks. The task's executable name, the host on which it was initiated, the time of initiation, etc., are shown. To get information on any other task, its TID can be selected to change the display of information.
Control buttons: Buttons to control the tasks are provided:
- Debug Active: Choosing this button opens a window to debug the task that is highlighted in the Active Task List. If a window was opened previously for the same task, the same window is popped up.
- Examine Exited: Choosing this button opens a window to examine the state of the task that is highlighted in the Exited Task List. If a window was opened previously for the same task, the same window is popped up.
- Debug Group: Choosing this button opens a dialog window in which a group of active tasks to be debugged can be specified (see Group Specification Window).
- Close: This button can be chosen to temporarily close the PVM Task List window. It can be popped up again using the PVM Task List button from the PVM Monitor Control window.
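The bookkeeping implied by this section (a task moves from the active list to the exited list, its display state survives the exit, and a previously opened window is reused) can be sketched as follows. This is an illustrative model only; the class TaskList, its method names, and the dictionary used as a stand-in for a window are all assumptions, not the Monitor's implementation.

```python
class TaskList:
    """Hypothetical model of the Active and Exited Task Lists."""

    def __init__(self):
        self.active = []    # TIDs of tasks still under a debugger's control
        self.exited = []    # TIDs of tasks that have exited
        self.windows = {}   # TID -> window state, kept even after exit

    def task_started(self, tid):
        self.active.append(tid)

    def task_exited(self, tid):
        # Move the TID between lists; the window state is deliberately kept
        # so that the call log can still be examined afterwards.
        self.active.remove(tid)
        self.exited.append(tid)

    def open_window(self, tid):
        # "Debug Active" and "Examine Exited" both pop up the existing
        # window for a task if one was opened before.
        if tid not in self.windows:
            self.windows[tid] = {"tid": tid, "call_log": []}
        return self.windows[tid]


tasks = TaskList()
tasks.task_started(0x40001)                 # illustrative TID value
window = tasks.open_window(0x40001)
window["call_log"].append("pvm_mytid()")
tasks.task_exited(0x40001)
print(tasks.open_window(0x40001)["call_log"])
```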
A.5 Notification Window
Whenever a new task is initiated under the Monitor's control, the user is notified of the initiation of the task by a popup that appears at the top left corner of the screen (see Figure 7). The keyboard bell is rung when the popup appears. One notification popup is created per task initiated. If several tasks are initiated in rapid succession, the popups are laid on top of each other. Choosing the OK button of a popup removes it.
A.6 Task Debugging Window
This window is opened for each task that has been selected for individual debugging from the PVM Task List window. It consists of the following panes (see Figure 8):
Figure 8: Task Debugging Window
Figure 9: Group Specification Window
Standard Output Text Pane: This provides the display for the standard output of the debugger and the task that the debugger controls. Whenever the output does not fit the display size of the pane, scroll bars appear in the needed direction. Whenever the displayed output is no longer needed, the display can be cleared by choosing the button labeled Clear Output from the Control Button Pane.
Standard Error Text Pane: This provides the display for the standard error stream of the debugger and the task that the debugger controls.
PVM Call Log: This pane lists a log of all the PVM calls that the process has made up to that point in time. The arguments to the calls are displayed along with the call names. Arguments that are arrays are displayed as a comma-separated list of elements enclosed in square brackets. Arguments printed as hexadecimal numbers are marked as such. Since the Monitor is informed that the task is making a particular PVM call before the call is executed, a call that fails to return or that crashes the task still appears in the log, which makes such errors easier to locate.
Control Button Pane: This houses several command buttons that allow the user to issue commands to the debugger. The Close button can be used to close the window temporarily. The window can be popped up again using the Debug Active button from the PVM Task List window.
Manual Command Dialog: Using this dialog, more complicated commands can be given to the debugger. This dialog can also be used to provide input for the task if and when the task waits for input on its standard input.
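The display rules for the PVM Call Log can be illustrated with a small formatter. This is a sketch under stated assumptions: the text specifies only that arrays appear as comma-separated elements in square brackets and that hexadecimal values are marked, so the use of a 0x prefix as the marker, the overall entry layout, and the helper names format_arg and format_call are hypothetical. The call names pvm_send and pvm_mcast are real PVM library routines; the argument values are made up.

```python
def format_arg(value, as_hex=False):
    """Render one argument the way the log is described as showing it."""
    if isinstance(value, (list, tuple)):
        # Arrays: comma-separated elements enclosed in square brackets.
        return "[" + ", ".join(format_arg(v, as_hex) for v in value) + "]"
    if as_hex and isinstance(value, int):
        # Assumption: a 0x prefix is what marks a value as hexadecimal.
        return hex(value)
    return str(value)


def format_call(name, *formatted_args):
    """Join a call name and its already formatted arguments into one entry."""
    return name + "(" + ", ".join(formatted_args) + ")"


# pvm_send(tid, msgtag): the TID is shown in hexadecimal, the tag in decimal.
print(format_call("pvm_send", format_arg(0x40002, as_hex=True), format_arg(7)))

# pvm_mcast(tids, ntask, msgtag): the TID array is shown in brackets.
tids = [0x40002, 0x40003]
print(format_call("pvm_mcast", format_arg(tids, as_hex=True),
                  format_arg(len(tids)), format_arg(7)))
```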
A.7 Group Specification Window
This window presents a list of all the tasks that are currently active. The user can select a subset of them to be treated as a group for debugging purposes. After the specification is done, a Group Debugging Window is opened for the specified set of tasks. The window consists of the following (see Figure 9):
List of TID Toggles: This is a list of toggles, each of which represents a task. A highlighted toggle indicates that the task whose TID is given in the toggle label has been included in the group. Clicking on an unhighlighted TID highlights it, thus selecting it. Clicking on a highlighted TID unhighlights it, thus deselecting it.
Confirmation Buttons: Choosing the Done button accepts the selections and opens a Group Debugging Window. Choosing the Cancel button abandons the selection process.
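The toggle behaviour described above is a plain set membership flip, followed by either accepting or discarding the selection. A minimal sketch, assuming hypothetical names (GroupSpecification, click, done, cancel) and illustrative TID values:

```python
class GroupSpecification:
    """Hypothetical model of the TID toggles and the Done/Cancel buttons."""

    def __init__(self, active_tids):
        self.active_tids = list(active_tids)
        self.highlighted = set()

    def click(self, tid):
        # Clicking flips the toggle: select if unhighlighted, else deselect.
        if tid in self.highlighted:
            self.highlighted.remove(tid)
        else:
            self.highlighted.add(tid)

    def done(self):
        # Accept the selection, keeping the order of the displayed list.
        return [t for t in self.active_tids if t in self.highlighted]

    def cancel(self):
        # Abandon the selection process; no group is formed.
        self.highlighted.clear()
        return None


spec = GroupSpecification([0x40001, 0x40002, 0x40003])
spec.click(0x40001)
spec.click(0x40003)
spec.click(0x40001)   # second click deselects
print([hex(t) for t in spec.done()])
```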
A.8 Group Debugging Window
This window is identical to the Task Debugging window, except for the following (see Figure 10):
- A list of the TIDs that make up the group being debugged is displayed at the bottom of the window. Selecting a TID from this list causes the standard streams of the corresponding task to be shown in the panes at the top of the window.
- Every command given in this window is applied to all the tasks included in the group. This is useful when the tasks are to be executed in lock-step fashion.
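The two differences from the Task Debugging window can be modelled as a broadcast plus a display selector. The sketch below is an assumption-laden illustration: the class GroupWindow and its methods are hypothetical, and plain Python lists stand in for the sockets that carry each debugger's standard input.

```python
class GroupWindow:
    """Hypothetical model of a Group Debugging Window."""

    def __init__(self, debugger_inputs):
        # debugger_inputs: TID -> list standing in for that debugger's stdin.
        self.debugger_inputs = debugger_inputs
        # Show the first task of the group until the user selects another.
        self.displayed = next(iter(debugger_inputs))

    def send(self, command):
        # Every command goes to every task in the group (lock-step debugging).
        for stdin in self.debugger_inputs.values():
            stdin.append(command)

    def select(self, tid):
        # Choose whose standard streams the top panes display.
        if tid not in self.debugger_inputs:
            raise KeyError(tid)
        self.displayed = tid


group = GroupWindow({0x40001: [], 0x40002: []})
group.send("stop at 20")
group.send("print x")
group.select(0x40002)
print(group.debugger_inputs[0x40001], hex(group.displayed))
```

The commands "stop at 20" and "print x" are the examples used in the description of group debugging in Section A.4.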
Figure 10: Group Debugging Window