Visualizing Program Execution

Bharat Jayaraman
Department of Computer Science, SUNY at Buffalo, Buffalo, NY 14260

Charlotte M. Baltus
Department of Computer Science, SUNY at Buffalo, Buffalo, NY 14260

Abstract

The motivation for this work stems from the lack of good visual tools for describing the execution of procedure-level constructs such as procedures, functions, coroutines, iterators, methods, and processes. Our proposed solution to this problem is an extension of an old technique called the contour diagram, which was originally used to give semantics for Algol-like languages. Our extensions allow the contour diagram to be used for more modern languages, such as object-oriented languages, logic languages, etc. In this paper, we explain this extended notation, and its use in visualizing the execution of procedural, object-oriented and logic programs. The significance of this extension is that it can serve as a basis for program visualization tools.

1. Introduction

The motivation for this work stems from the lack of good visual tools for describing the execution of procedure-level constructs such as procedures, functions, coroutines, iterators, methods, and processes. It may be noted that most text-book descriptions of such constructs are not entirely satisfactory: either they are given in terms of implementation devices such as run-time stacks and pointers, which sacrifice clarity for efficiency, or else in terms of formal semantics, which sacrifice clarity for precision. Furthermore, these descriptions are not very suitable for a visual presentation.

Our solution to this problem is an extension of an old technique called the contour diagram, which was originally used to give semantics for Algol-like languages, i.e., for scope rules, recursion, parameter transmission, etc. [4]. (A very readable account of this use is in Organick et al. [5].) The contour diagram gives a precise, high-level, and visual account of the meaning of programs. The contribution of this paper lies in showing how this notation can be extended to account for more modern languages, such as object-oriented and logic languages [3]. The significance of this extension is that it can serve as a basis for developing visualization tools for these languages.

The contour model visualization of program execution is an example of an operational semantics for a language. It differs from traditional operational semantics in that it is pictorial rather than textual. Nevertheless it will be sufficiently formal to give a clear and unambiguous meaning of programs. In essence, a contour diagram consists of a set of nested boxes, or contours, where each box represents, using source-program identifiers (where possible), the run-time information on the activation of some procedure-level construct, i.e., the bindings for parameters and local variables, the executable instructions, and appropriate linkage information. The nesting structure of contours conveys the scoping of environments directly.

We show in this paper that contour models are very appropriate for object-oriented languages because contours make very explicit the important fact that objects are really environments. To show the versatility of this approach, we also discuss their use for another very different paradigm, logic programming, especially the modeling of backtracking and partially-defined data objects (through logic variables). In recent work, contour models have also been applied to the visualization of concurrent programs, especially Ada tasks [1].

Since our emphasis is on understanding program execution, we will not be concerned with compile-time issues, such as type checking. We will focus on execution at the procedure-level, i.e., we will examine what happens when a procedure is called, how parameters and non-locals are accessed, etc. Not emphasized in this discussion are the details of executing assignment statements, sequencing, conditional and iterative statements, as well as input/output. We assume the reader is familiar with various programming paradigms, especially object-oriented and logic programming concepts.

The remainder of this paper is organized as follows: Section 2 introduces the elements of the contour model and shows its use for scope rules, recursive procedures, and dynamically allocated data objects. Section 3 shows how objects and inheritance in object-oriented programming can be represented in the contour model. Section 4 considers the basic control and data objects from logic languages: backtracking and partially-defined structures. Finally, section 5 presents conclusions and areas of further work.

2. Contour Models: An Introduction

To keep this paper self-contained, we first describe how contour models work, and then discuss scope rules, recursion, and dynamic data objects. In the contour model, the effect of a procedure call is represented by a contour. A contour basically records information local to a procedure definition and linkage information to facilitate procedure return (see footnote 1). In general, the local information pertains to the identifiers declared in the procedure as well as the procedure code itself. The information about identifiers is structured as an array of entries, where each entry consists of an identifier name, its attribute, and its value. For example, if the identifier stands for a variable name, then its attribute field would hold its type and its value field would hold its run-time value (if one has been assigned).

In general there are different rules for contour creation and deletion, but for now let us assume that a procedure call results in the creation of a contour, and the return from a procedure results in the deletion of the created contour. Because recursive procedure calls are possible, there could in general be several outstanding (i.e., not-yet-completed) procedure calls at some point during program execution. Therefore, to record the execution state of a program, in general, we must maintain multiple contours and link them together in some way.

[Figure 1: a skeletal contour f^i. Its local part is a table of entries with columns id, attr, and value; its linkage part is the entry rpdl (return pt. & dynamic link); its body is the code begin ... end f.]

Figure 1. Basic Program Unit Contour.

Figure 1 shows a skeletal contour f^i for some procedure f. The superscript i is used to distinguish the different contours of f. Not shown in the figure are the internal details of the program unit and contour. The local part of a contour holds for each variable in the program unit its type and value information; in addition, the local part holds information on any procedures or functions defined in the program unit, as well as the executable instructions of the program unit. The linkage part holds information to enable procedure/function return once execution within this contour has completed. This linkage information is maintained as the value of the special variable rpdl, which stands for 'return point-dynamic link'.

We can now sketch the basic operation of a contour-model based abstract machine for an Algol-like language. Suppose that the top-level procedure is called main. Initially, a contour main^1 is created by the underlying operating system and control is given to the abstract machine. The abstract machine executes instructions of main^1 in sequence, starting at the first instruction. To keep track of its current instruction, the abstract machine has an instruction pointer. As variables are assigned values during execution, the associated fields in the contour are updated.

Suppose that the abstract machine is about to execute the first call on a procedure f. It would now construct a new contour f^1 and set its rpdl field to be 'ip in main^1', where ip is the instruction following the call to f in main^1. Exactly where a newly created contour is placed relative to the calling contour depends upon the precise semantics of the language, and we will discuss this issue in detail later. Parameters to f are next transmitted (these details are also discussed later) and execution continues with the first instruction in f^1. When f^1 eventually executes a return, its rpdl field is consulted to continue execution at ip in main^1. The contour f^1 is then deleted. Finally, when main^1 reaches its end, the rpdl field in main^1 is consulted to make the return back to the system.

Footnote 1. In order to model more advanced constructs such as coroutines and backtracking, additional information would need to be maintained in the contour.

2.1. Recursion, Scope Rules, Parameters
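As a concrete illustration of the call and return steps of the abstract machine, including several coexisting contours of one recursive procedure, the following is a minimal Python sketch. It is not the authors' machine: the instruction names ("call", "recurse_if_positive", "return"), the single parameter n, and the names Contour, PROGRAM, and run are all assumptions made for this example. Only the use of an instruction pointer and an rpdl field holding "ip in the calling contour" follows the description in the text.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Contour:
    label: str                      # e.g. "f^2": the second contour of f
    code: list                      # executable instructions of the program unit
    rpdl: Optional[tuple]           # (calling contour, return ip); None means "system"
    vars: dict = field(default_factory=dict)

# Hypothetical toy program: main calls f(2); f calls itself until n reaches 0.
PROGRAM = {
    "main": [("call", "f", 2)],
    "f": [("recurse_if_positive", "f"), ("return",)],
}

def run(program, entry="main"):
    counts, trace = {}, []

    def create(name, rpdl, n=None):
        # A call creates a new contour and records where to resume afterwards.
        counts[name] = counts.get(name, 0) + 1
        c = Contour(f"{name}^{counts[name]}", program[name], rpdl)
        if n is not None:
            c.vars["n"] = n
        trace.append(f"create {c.label}")
        return c

    current, ip = create(entry, None), 0      # main^1 is created by the system
    while current is not None:
        # Reaching the end of a unit behaves like an explicit return.
        op = current.code[ip] if ip < len(current.code) else ("return",)
        if op[0] == "call":
            current, ip = create(op[1], (current, ip + 1), op[2]), 0
        elif op[0] == "recurse_if_positive":
            n = current.vars["n"]
            if n > 0:
                current, ip = create(op[1], (current, ip + 1), n - 1), 0
            else:
                ip += 1
        elif op[0] == "return":
            # Consult rpdl to resume the caller, then delete this contour.
            trace.append(f"delete {current.label}")
            if current.rpdl is None:
                current = None
            else:
                current, ip = current.rpdl
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return trace

if __name__ == "__main__":
    print("\n".join(run(PROGRAM)))
```

Running the sketch creates main^1, f^1, f^2, and f^3 in that order and deletes them in the reverse order, so at the deepest point three contours of f are outstanding at once, each with its own rpdl.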

Contour diagrams are particularly appropriate for understanding statically-scoped languages. In these languages, the scope of an identifier x declared in a program unit f is the entire unit f and all nested units of f, except those nested units where x is redeclared. For example, in the program shown below, the scope of x declared in procedure main includes procedure p1 but not procedure p2 (both occurrences of p2), because x is redeclared inside p2. In the presence of recursion, it is necessary to define the scope of an identifier at execution time because multiple invocations of a procedure might coexist at execution time. This is where the contour model is particularly helpful: the scope of an identifier x declared in a contour f^i is the entire contour f^i and all its nested contours, except those contours where x is redeclared.

Where exactly is a contour created? For statically-scoped languages, the basic rule for contour creation is as follows:

    The j-th call on a procedure h declared in contour f^i results in the creation of contour h^j nested inside f^i.

We illustrate the above ideas with a program whose behavior is hard to explain without the aid of the contour model; the problem lies in the treatment of procedure parameters. We stress that this is not an example of a meaningful program; its sole purpose is to show the value of contour diagrams in clarifying the meaning of static scoping. Figure 2 shows the contour model for this program at the point when the print statement in the outer procedure p2 is about to be executed.

    procedure main
      var x,y : int;
      procedure p1(value y: int, q: procedure(reference int));
        procedure p2(reference x: int);
        begin
          x := y + 2; print x; q(y);
        end p2;
      begin
        if x==y then q(y) else p1(y+1,p2);
      end p1;
      procedure p2(reference x: int);
      begin
        x := y + 2; print (x);
      end p2;
    begin
      x := 2; y := x; p1(0, p2);
    end main;

When a procedure is passed in as the actual parameter, the corresponding formal parameter is referred to as a procedure parameter, e.g., parameter q in p1. The actual parameter matching a procedure parameter can either be a procedure or another procedure parameter. The main semantic issue with a procedure parameter concerns the contour, called the environment contour, in which the contour for the argument procedure is created. The environment contour is significant because it affects the resolution of any non-local identifiers of the argument procedure. There are three possible choices for the environment contour: (1) the contour in which the argument procedure is declared; or (2) the contour in which the argument procedure is passed as argument; or (3) the contour in which the argument procedure is eventually invoked. In statically-scoped languages, clearly the first choice is the correct one.

[Figure 2: contour diagram for the program above. Contour main^1 holds x = 2, y = 2, p1, p2, and rpdl = system. Nested inside it are p1^1 (y changed from 0 to 4, q = p2 in main^1), p1^2 (y changed from 1 to 2, q = p2 in p1^1), and p1^3 (y changed from 2 to 3, q = p2 in p1^2), together with the contours created for the calls on p2, whose reference parameter x is bound to a y in one of the p1 contours. The instruction pointer (i.p.) is at print(x).]

Figure 2. Visualizing Procedural Programs.

Space limitations preclude a detailed discussion of figure 2. The keys to this diagram are the bindings shown for the parameter q in p1^1, p1^2, and p1^3. Although the procedure p2 is bound to q in each case, the associated contours are different, and this makes all the difference in the meaning of the program. We claim that it is very difficult to convey the behavior of such programs without the contour model.
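To make the role of the environment contour concrete, the following Python sketch hand-translates the example program under choice (1), i.e., each procedure value carries the contour in which it was declared. It is an illustrative sketch under stated assumptions, not the authors' tool: the classes Contour, Ref, and Proc, and the function run_example, are names introduced here, and the translation of each procedure body was done by hand.

```python
class Contour:
    """One activation: local bindings plus the contour it is nested inside."""
    def __init__(self, name, env):
        self.env = env            # enclosing (environment) contour, None for main
        self.vars = {}            # identifier -> value
        self.calls = {}           # calls made on procedures declared here
        n = 1
        if env is not None:       # the j-th call on a procedure declared in env
            n = env.calls.get(name, 0) + 1
            env.calls[name] = n
        self.label = f"{name}^{n}"

    def find(self, ident):
        """Innermost contour, searching outward, that declares ident."""
        c = self
        while c is not None:
            if ident in c.vars:
                return c
            c = c.env
        raise NameError(ident)

    def value(self, ident):
        return self.find(ident).vars[ident]

class Ref:
    """Reference parameter: the variable `name` living in contour `home`."""
    def __init__(self, home, name):
        self.home, self.name = home, name
    def get(self):
        return self.home.vars[self.name]
    def set(self, v):
        self.home.vars[self.name] = v

class Proc:
    """Procedure value: code plus the contour in which it was declared."""
    def __init__(self, code, env):
        self.code, self.env = code, env
    def __call__(self, *args):
        return self.code(self.env, *args)

def run_example():
    output, created = [], []

    def outer_p2(env, x):                     # p2 declared in main
        c = Contour("p2", env); created.append(c)
        c.vars["x"] = x
        x.set(c.value("y") + 2)               # y resolves to main's y
        output.append(x.get())                # print(x)

    def inner_p2(env, x):                     # p2 declared inside p1
        c = Contour("p2", env); created.append(c)
        c.vars["x"] = x
        x.set(c.value("y") + 2)               # y resolves to the enclosing p1 contour
        output.append(x.get())                # print x
        c.value("q")(Ref(c.find("y"), "y"))   # q(y)

    def p1(env, y, q):
        c = Contour("p1", env); created.append(c)
        c.vars.update(y=y, q=q, p2=Proc(inner_p2, c))
        if c.value("x") == c.value("y"):      # x is non-local: main's x
            c.value("q")(Ref(c, "y"))
        else:
            c.value("p1")(c.value("y") + 1, c.value("p2"))

    main = Contour("main", None)
    main.vars.update(x=2, y=2, p1=Proc(p1, main), p2=Proc(outer_p2, main))
    main.value("p1")(0, main.value("p2"))
    ys = {c.label: c.vars["y"] for c in created if "y" in c.vars}
    return output, ys

if __name__ == "__main__":
    print(run_example())
```

Under these assumptions the program prints 3, 2, 4, and the final values of y in p1^1, p1^2, and p1^3 are 4, 2, and 3. The three q bindings all name a procedure called p2, yet each call resolves the non-local y in a different contour, which is the point the figure makes.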

2.2. Dynamic Data Objects

Dynamically allocated data objects are drawn outside all contours, in a separate area called the heap. Hence they are also known as heap-allocated objects. These objects can be thought of as data contours, i.e., they do not have any code or linkage information. The following program fragment serves to illustrate how data objects are represented:

    procedure test;
      type pair = record f1:int, f2:bool end;
      type arr = dynamic array of pair;
      var a,b: arr;
    begin
      a := ; b := a; ...
    end

In the above program, variables a and b are bound to dynamically allocated objects. The act of creation can be done in different ways (usually by a new operation); in the above example, this is achieved implicitly by the assignment of an array term to a. The term denotes an array of two record values, [10,true] and [20,false].

For dynamically allocated data objects, the semantics of an assignment such as b := a is of particular interest. There are two general options: assignment-by-copy and assignment-by-sharing. Figure 3 illustrates the difference visually. In the case of copying, further distinctions can be made, depending upon whether the copying is shallow or deep. These can also be clearly portrayed using contour diagrams.

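[Figure 3: two panels, each showing contour test^1 (with entries a: arr, b: arr, and rpdl) and the heap. (a) assignment-by-copy: the heap holds two array objects, arr^1 and arr^2, with records pair^1 = (10, true) and pair^2 = (20, false). (b) assignment-by-sharing: the heap holds a single array object arr^1 with records pair^1 = (10, true) and pair^2 = (20, false).]

Figure 3. Visualizing Dynamic Data Objects.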
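The difference between sharing, shallow copying, and deep copying can also be observed directly in code. The sketch below uses Python's standard copy module as a stand-in for the assignment semantics discussed above. The Pair class and the variable names are assumptions of this example, and the array contents mirror the two records [10,true] and [20,false].

```python
import copy
from dataclasses import dataclass

@dataclass
class Pair:
    f1: int
    f2: bool

a = [Pair(10, True), Pair(20, False)]

b_shared = a                    # assignment-by-sharing: one array object, two names
b_shallow = copy.copy(a)        # shallow copy: new array, same record objects
b_deep = copy.deepcopy(a)       # deep copy: new array and new record objects

a[0].f1 = 99                    # update a record reachable from a
a.append(Pair(30, True))        # update the array object itself

print(b_shared is a, len(b_shared))       # sharing sees both updates
print(len(b_shallow), b_shallow[0].f1)    # shallow copy sees only the record update
print(len(b_deep), b_deep[0].f1)          # deep copy sees neither update
```

With sharing, b_shared observes both updates. The shallow copy keeps its own two-element array but still shares the record objects, so it sees f1 = 99. The deep copy is unaffected by either update.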

3. Visualizing Object-Oriented Programs

The two main concepts in object-oriented languages are abstract data types (ADTs) and inheritance. Objects are instances of ADTs, which, in OOP languages, are defined by a construct called a class. When procedures are attached to an object, they must be executed in the context of the object in order that the fields of the object are correctly accessed. Thus, objects are really environments. Moreover, since these object-environments could be dynamically allocated, the invocation of a procedure defined in a class must be understood in a dynamically allocated object-environment. The contour model provides a very good way of presenting the execution of object-oriented programs. We illustrate with a sample object-oriented (C++-like) program.

    class tree {
      public:
        tree(int n) {value = n; left = right = NULL;};
        procedure insert(int);
        // plus other procedures, e.g., print, etc.
      private:
        int value;
        tree left;
        tree right;
    };

    // Implementation of insert
    procedure tree::insert(int n) {
      if (value==n) return;
      if (value