RESEARCH PAPER International Journal of Recent Trends in Engineering, Vol 1, No. 1, May 2009
A Node-Marking Technique for Slicing Concurrent Object-Oriented Programs
Madhusmita Sahu (1), Durga Prasad Mohapatra (2)
(1) Department of MCA, C V Raman Computer Academy, Bhubaneswar-752054, India. Email: [email protected]
(2) Department of CSE, National Institute of Technology, Rourkela-769008, India. Email: [email protected]
Abstract— We propose an efficient technique for slicing concurrent object-oriented programs. We use a dependence-based intermediate program representation, which we have named the Concurrent System Dependence Graph (CSDG), to represent concurrent object-oriented programs. The CSDG is an arc-classified digraph that represents various dependences, such as synchronization and communication dependences, between statements. Our slicing algorithm marks and unmarks the executed nodes of the CSDG appropriately at run time.

Index Terms— Program Slicing, Concurrent System Dependence Graph (CSDG), synchronization dependence, communication dependence.

I. INTRODUCTION
Weiser [4] first introduced the concept of a program slice. A static slice consists of all statements of a program that might affect the value of a variable at a program point of interest for every possible input to the program, whereas a dynamic slice consists of only those statements that actually affect the value of a variable at a program point of interest for a particular set of inputs. Many real-life object-oriented programs are concurrent in nature, in that they run on different machines connected to a network. Concurrent object-oriented programs are usually harder to understand and debug than sequential programs because of their nondeterministic nature, the lack of global states, unsynchronized interactions among processes, multiple threads of control, and a dynamically varying number of processes. Slicing techniques are helpful in this regard, since an increasing amount of resources is being spent on debugging, testing and maintaining these products.
In this paper, we propose an algorithm for dynamic slicing of concurrent object-oriented programs. First, we develop a dependence-based graph called the concurrent system dependence graph (CSDG) to represent concurrent object-oriented programs. Then, we propose a dynamic slicing algorithm, called the node-marking concurrent dynamic slicing (NMCDS) algorithm, for concurrent object-oriented programs.
The rest of the paper is organized as follows. In Section II, we present a brief introduction to the concurrency model in Java. In Section III, we discuss the method to develop the CSDG. In Section IV, we describe the node-marking concurrent dynamic slicing algorithm. In Section V, we compare our work with related work.

II. CONCURRENCY MODEL IN JAVA
Java supports concurrent programming using threads. A thread is a single sequential flow of control within a program. A program that contains multiple threads is called a multithreaded program. Java provides a Thread class that defines a set of standard operations on a thread, such as start(), stop(), join(), suspend() and resume(). Java provides the methods wait(), notify() and notifyAll() to support synchronization among different threads [5].
Fig. 1 shows an example of a concurrent Java program segment. In this example, SyncObject is a class with two synchronized methods, Swait() and Snotify(). Swait() invokes the wait() method and Snotify() invokes the notify() method. CompObject is a class that provides a method mul(CompObject, CompObject). If a1.mul(a2, a3) is invoked, then a1 = a2 * a3. The program contains two threads and four objects: o1, a1, a2 and a3. The objects a1, a2 and a3 are data objects; o1 is a synchronization object, which is used to synchronize the operations on a1, a2 and a3. These four objects are passed to the two threads, Thread1 and Thread2, as parameters. The details of the code are not listed in the example.
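The synchronization pattern that Section II attributes to SyncObject (synchronized methods Swait() and Snotify() that wrap wait() and notify()) might look like the following minimal sketch. The paper does not list the class bodies, so the signalled flag, the SyncDemo driver and the int array standing in for a shared data object are assumptions made only for illustration.

```java
// Hypothetical sketch: the paper names SyncObject, Swait() and Snotify()
// but does not list their bodies; everything else here is an assumption.
class SyncObject {
    // Remembers a notification so it is not lost if Snotify() runs first,
    // and guards against spurious wakeups.
    private boolean signalled = false;

    // Blocks the caller until some thread has called Snotify().
    synchronized void Swait() throws InterruptedException {
        while (!signalled) {
            wait();
        }
        signalled = false; // consume the signal so the object can be reused
    }

    // Records the signal and wakes up one thread blocked in Swait().
    synchronized void Snotify() {
        signalled = true;
        notify();
    }
}

class SyncDemo {
    // Runs one defining thread and one using thread; returns the value
    // that the using thread read from the shared object.
    static int run() {
        final SyncObject o1 = new SyncObject();
        final int[] shared = new int[1]; // stands in for a shared data object
        final int[] seen = new int[1];

        Thread consumer = new Thread(() -> {
            try {
                o1.Swait();          // wait node
                seen[0] = shared[0]; // use of the shared object
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread producer = new Thread(() -> {
            shared[0] = 6;           // definition of the shared object
            o1.Snotify();            // notify node
        });

        consumer.start();
        producer.start();
        try {
            consumer.join();
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(SyncDemo.run());
    }
}
```

In this sketch the use in the consumer thread depends on the wait, the wait depends on the notify in the other thread, and the value read depends on the definition in the other thread. These are the inter-thread relationships that the CSDG is intended to capture.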
III. CONSTRUCTION OF CONCURRENT SYSTEM DEPENDENCE GRAPH (CSDG)
A. Concurrent System Dependence Graph (CSDG)
To capture inter-thread synchronization and communication, we use a dependence-based representation called the concurrent system dependence graph (CSDG). We use the CSDG to slice concurrent Java programs. A concurrent system dependence graph GC of a concurrent object-oriented program P is a directed graph (NC, EC), where each node n ∈ NC represents a statement in P. For x, y ∈ NC, (x, y) ∈ EC iff one of the following holds:
1. y is control dependent on x. Such an edge is called a control dependence edge.
25 © 2009 ACADEMY PUBLISHER
2. y is data dependent on x. Such an edge is called a data dependence edge.
3. y is synchronization dependent on x. Such an edge is called a synchronization dependence edge.
4. y is communication dependent on x. Such an edge is called a communication dependence edge.
A CSDG of a concurrent object-oriented program captures the program dependencies that can be determined statically as well as the dependencies that may exist at run time. We use different types of edges to represent the different types of dependencies. These edges are: (i) control dependence edge, (ii) data dependence edge, (iii) synchronization dependence edge, and (iv) S-Communication dependence edge.
A CSDG can contain the following types of nodes:
1. definition (assignment) node, which represents a statement defining an object.
2. use node, which represents a statement using an object.
3. predicate node, which represents a statement containing an if construct.
4. notify node, which represents a statement containing a notify() method call.
5. wait node, which represents a statement containing a wait() method call.
Consider the example program shown in Fig. 1 and its CSDG in Fig. 2. In Fig. 2, node 16 is a definition node for object a1, node 2 is a use node for object a1, node 11 is a predicate node, nodes 3 and 10 are notify nodes, and nodes 5 and 8 are wait nodes. Node 14 is the start node of the main program, nodes 1 and 7 are the start nodes of Thread1 and Thread2, respectively, and the other nodes represent the corresponding statements. Threads t1 and t2 are started by statements 21 and 22, respectively, so there are control dependence edges from node 21 to node 1 and from node 22 to node 7. Note that statement 6 is not data dependent on statement 2, because a2 defined at statement 2 cannot be used at statement 6. So, there is no data dependence edge from node 2 to node 6. Rather, node 6 is S-Communication dependent on nodes 9 and 13.

Fig. 1: A concurrent Java program
Fig. 2: Concurrent System Dependence Graph (CSDG) for the program given in Fig. 1
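As a rough illustration of the arc-classified graph described in Section III, the sketch below stores, for each node, its predecessors together with the kinds of dependence edges that connect them. The class and method names (Csdg, EdgeKind, addEdge, predecessors, CsdgDemo) are hypothetical and not taken from the paper. The demo adds only the edges that the text states explicitly for Fig. 2, so it is a fragment of that graph, not a reconstruction of the full figure.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// The four edge kinds listed in Section III.
enum EdgeKind { CONTROL, DATA, SYNCHRONIZATION, S_COMMUNICATION }

// Minimal CSDG sketch: nodes are statement labels, edges carry a dependence kind.
class Csdg {
    // preds.get(y) maps each predecessor x of y to the kinds of the edges (x, y).
    private final Map<Integer, Map<Integer, EnumSet<EdgeKind>>> preds = new HashMap<>();

    // Adds the edge (x, y), meaning that y depends on x.
    void addEdge(int x, int y, EdgeKind kind) {
        preds.computeIfAbsent(y, k -> new TreeMap<>())
             .computeIfAbsent(x, k -> EnumSet.noneOf(EdgeKind.class))
             .add(kind);
    }

    // Returns the nodes that y depends on through edges of the given kind.
    SortedSet<Integer> predecessors(int y, EdgeKind kind) {
        SortedSet<Integer> result = new TreeSet<>();
        Map<Integer, EnumSet<EdgeKind>> incoming =
                preds.getOrDefault(y, Collections.emptyMap());
        for (Map.Entry<Integer, EnumSet<EdgeKind>> e : incoming.entrySet()) {
            if (e.getValue().contains(kind)) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}

class CsdgDemo {
    // Builds only the edges that the text names explicitly for Fig. 2.
    static Csdg build() {
        Csdg g = new Csdg();
        g.addEdge(21, 1, EdgeKind.CONTROL);          // statement 21 starts Thread1
        g.addEdge(22, 7, EdgeKind.CONTROL);          // statement 22 starts Thread2
        g.addEdge(10, 5, EdgeKind.SYNCHRONIZATION);  // wait node 5, notify node 10
        g.addEdge(3, 8, EdgeKind.SYNCHRONIZATION);   // wait node 8, notify node 3
        g.addEdge(9, 6, EdgeKind.S_COMMUNICATION);   // node 6 uses a2 defined at 9
        g.addEdge(13, 6, EdgeKind.S_COMMUNICATION);  // node 6 uses a2 defined at 13
        g.addEdge(4, 13, EdgeKind.S_COMMUNICATION);  // node 13 depends on node 4
        return g;
    }

    public static void main(String[] args) {
        Csdg g = build();
        System.out.println(g.predecessors(6, EdgeKind.S_COMMUNICATION));
        System.out.println(g.predecessors(6, EdgeKind.DATA));
    }
}
```

Storing predecessors rather than successors is a design choice of this sketch: a backward slice repeatedly asks which nodes a given node depends on, so the lookup direction matches the query.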
IV. NODE-MARKING CONCURRENT DYNAMIC SLICING (NMCDS) ALGORITHM
Before presenting our algorithm, we first introduce some definitions that are used in the algorithm. In the rest of the paper, the terms statement, node and vertex are used interchangeably.

A. Definitions
Definition 1 def(obj): Let obj be an object in a class in an object-oriented program P. A node x is said to be a def(obj) node if x represents a definition (assignment) statement that defines the object obj. In Fig. 1, nodes 2, 9 and 17 are the def(a2) nodes.
Definition 2 defSet(obj): The set defSet(obj) denotes the set of all def(obj) nodes. In Fig. 1, defSet(a2) = {2, 9, 17}.
Definition 3 use(obj): Let obj be an object in a class in the program P. A node x is said to be a use(obj) node if it uses the object obj. In Fig. 1, node 4 is a use(a3) node and nodes 2, 6 and 12 are use(a2) nodes.
Definition 4 useSet(obj): The set useSet(obj) denotes the set of all use(obj) nodes. In Fig. 1, useSet(a2) = {2, 6, 12}.
Definition 5 recDef(obj): For each object obj, recDef(obj) represents the node (the label number of the statement) corresponding to the most recent definition of the object obj. In Fig. 1, recDef(a2) is at statement 2 in thread t1 and at statement 9 in thread t2.
Definition 6 Concurrent Control Flow Graph (CCFG): A concurrent control flow graph (CCFG) G of a program P is a directed graph (N, E, Start, Stop), where each node n ∈ N represents a statement of the program P and each edge e ∈ E represents a potential control transfer among the nodes. Nodes Start and Stop are two unique nodes representing the entry and exit of the program P, respectively. There is a directed edge from node a to node b if control may flow from node a to node b.
Definition 7 Synchronization Dependence: A statement y in one thread is synchronization dependent on a statement x in another thread if the start or termination of the execution of x directly determines the start or termination of the execution of y through an inter-thread synchronization. Let y be a wait() node in thread t1 and x be the corresponding notify() node in thread t2. Then, the node y is said to be synchronization dependent on node x. For example, in Fig. 2, node 5 in Thread1 is synchronization dependent on node 10 in Thread2. Similarly, node 8 in Thread2 is synchronization dependent on node 3 in Thread1.
Definition 8 Communication Dependence: In Java there exist two types of communication dependencies. In the first, communication among different threads may be established through sockets, using constructs like getOutputStream() and getInputStream(). We have named this type of communication dependence M-Communication dependence. In the second, Java uses shared memory to support communication among threads. In this type of communication, two threads executing in parallel may exchange their data via shared objects. We have named this type of communication dependence S-Communication dependence. Since inter-thread communication using shared objects is very common, in this paper we consider only communication through shared objects, i.e., the S-Communication dependence. However, our approach can easily be extended to consider message-passing communication among threads using sockets.
Informally, a statement y in one thread is S-Communication dependent on a statement x in another thread if the value of an object defined at x is directly used at y through inter-thread communication. Let x be a def(obj) node in thread t1 and y be a use(obj) node in thread t2. Then the node y is said to be S-Communication dependent on node x. For example, in Fig. 2, node 6 in Thread1 uses the object a2, which is defined at nodes 9 and 13. So, node 6 in Thread1 is S-Communication dependent on nodes 9 and 13 in Thread2. Similarly, node 13 in Thread2 is S-Communication dependent on node 4 in Thread1.
B. Overview of NMCDS Algorithm
Before execution of a concurrent object-oriented program P, its CSDG is constructed statically, only once. We mark and unmark the executed nodes during program execution. When a wait() node is executed, the algorithm marks it and the corresponding notify() node. During the execution of a concurrent object-oriented program P, let dslice(u, obj) denote the dynamic slice with respect to the most recent execution of the node u. Let x1, x2, …, xk be all the marked predecessor nodes of u in the updated CSDG after an execution of the statement corresponding to node u. Then, the dynamic slice with respect to the present execution of the node u, for the object obj, is given by
dslice(u, obj) = {x1, x2, …, xk} ∪ dslice(x1, obj) ∪ dslice(x2, obj) ∪ … ∪ dslice(xk, obj).
Our NMCDS algorithm computes the dynamic slice with respect to the specified slicing criterion by simply looking up the corresponding dslice computed at run time.

Algorithm: Node-Marking Concurrent Dynamic Slicing (NMCDS) Algorithm
1. CSDG Construction: Construct the CSDG of the object-oriented program P before execution starts.
   (a) Add control dependence edges:
       for each test (predicate) node u do the following
         for each node x in the scope of u do the following
           Add control dependence edge (u, x).
   (b) Add data dependence edges:
       for each node x do the following
         for each object obj used at x do the following
           for each reaching definition u of obj do the following
             Add data dependence edge (u, x).
   (c) Add synchronization dependence edges:
       for each wait node x in thread t1 do the following
         for the corresponding notify node u in thread t2 do the following
           Add synchronization dependence edge (u, x).
   (d) Add communication dependence edges:
       for each use(obj) node x do the following
         for each def(obj) node u do the following
           Add S-Communication dependence edge (u, x).
2. Initialization: Do the following before each execution of P.
   (a) Unmark all the nodes of the CSDG.
   (b) Set dslice(u, obj) = φ for every object obj of each node u of the CSDG.
   (c) Set recDef(obj) = NULL for every object obj of the program P.
   // end of initialization
3. Run-time Updations: Run the program and carry out the following after each statement s of the program P is executed, until the program ends or a slicing command is given. Let the node u in the CSDG correspond to the statement s in the program P.
   (a) For every object obj used at node u, update dslice(u, obj) = {x1, x2, …, xk} ∪ dslice(x1, obj) ∪ dslice(x2, obj) ∪ … ∪ dslice(xk, obj), where {x1, x2, …, xk} are the marked predecessor nodes of u in the CSDG.
   (b) If u is a def(obj) node, then do the following:
       (i) update dslice(u, obj) = {u} ∪ dslice(u, obj).
       (ii) unmark the node recDef(obj).
       (iii) update recDef(obj) = u.
   (c) Mark the node u.
   (d) If u is a call vertex, then do the following:
       (i) mark the vertex u.
       (ii) mark the actual-in and actual-out vertices associated with u corresponding to the present execution of u.
       (iii) mark the method entry vertex of the corresponding called method for the present execution of the vertex u.
       (iv) mark the formal-in and formal-out vertices associated with the method entry vertex.
   (e) If u is a vertex representing the operator new, then do the following:
       (i) mark the vertex u.
       (ii) mark the actual-in and actual-out vertices associated with u corresponding to the present execution of u.
       (iii) mark the method entry vertex of the corresponding constructor method for the present execution of u.
       (iv) mark the formal-in and formal-out vertices associated with the method entry vertex of the constructor method.
   (f) If u is a node representing the start() method, then do the following:
       (i) mark the node u.
       (ii) mark the node representing the corresponding run() method.
   (g) If u is a wait node, then do the following:
       (i) mark the node u.
       (ii) mark the corresponding notify node.
   // end of run-time updations
4. Slice Look Up:
   (a) If a slicing command is given, carry out the following:
       (i) look up dslice(u, obj) for the object obj for the content of the slice. // node u corresponds to statement s
       (ii) display the resulting slice.
   (b) If the program has not terminated, go to step 3.

Working of NMCDS Algorithm
We illustrate the working of our NMCDS algorithm with the help of an example. Consider the Java program given in Fig. 1. The CSDG of the program is given in Fig. 2. During the initialization step, our NMCDS algorithm first unmarks all the nodes of the CSDG and sets dslice(u, obj) = φ for every node u of the CSDG. Now, for the input data argm[0]=1, argm[1]=1 and argm[2]=2, the algorithm marks and unmarks the executed nodes. The algorithm marks the node pairs 3 and 8, and 10 and 5, because synchronization dependencies exist between statements 3 and 8 and between statements 10 and 5. For the given input values, statement 6 is communication dependent on statement 9, so the algorithm marks the nodes 9 and 6. The algorithm also marks the associated actual parameter vertices at the call site and the formal parameter vertices at the called method.
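The run-time part of the algorithm (steps 2 to 4, restricted to sub-steps (a), (b), (c) and (g) of step 3) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: one slice is kept per node instead of one per (node, object) pair, the call, new and start() cases are omitted, and the names NmcdsSlicer, executed, slice and NmcdsDemo are hypothetical. The four-node program in the demo is invented for illustration and is not the program of Fig. 1.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Simplified NMCDS sketch; one slice per node is an assumption of this sketch.
class NmcdsSlicer {
    private final Map<Integer, Set<Integer>> preds = new HashMap<>();   // static CSDG edges of all kinds
    private final Map<Integer, String> definedObject = new HashMap<>(); // def(obj) nodes
    private final Map<Integer, Integer> notifyOf = new HashMap<>();     // wait node -> notify node
    private final Set<Integer> marked = new HashSet<>();                // initially all nodes unmarked
    private final Map<Integer, Set<Integer>> dslice = new HashMap<>();  // initially empty slices
    private final Map<String, Integer> recDef = new HashMap<>();        // initially no recent definition

    // Step 1 (static construction): edge (from, to) means "to depends on from".
    void addEdge(int from, int to) {
        preds.computeIfAbsent(to, k -> new TreeSet<>()).add(from);
    }
    void addDef(int node, String obj) { definedObject.put(node, obj); }
    void addWait(int waitNode, int notifyNode) { notifyOf.put(waitNode, notifyNode); }

    // Step 3: call after the statement corresponding to node u has executed.
    void executed(int u) {
        // (a) the slice is the marked predecessors together with their slices
        Set<Integer> slice = new TreeSet<>();
        for (int x : preds.getOrDefault(u, Collections.emptySet())) {
            if (marked.contains(x)) {
                slice.add(x);
                slice.addAll(dslice.getOrDefault(x, Collections.emptySet()));
            }
        }
        // (b) a definition joins its own slice and replaces the previous recDef
        String obj = definedObject.get(u);
        if (obj != null) {
            slice.add(u);
            Integer previous = recDef.put(obj, u);
            if (previous != null) {
                marked.remove(previous);
            }
        }
        dslice.put(u, slice);
        // (c) mark the executed node
        marked.add(u);
        // (g) a wait node also marks its corresponding notify node
        Integer notifyNode = notifyOf.get(u);
        if (notifyNode != null) {
            marked.add(notifyNode);
        }
    }

    // Step 4: slice look-up, with no graph traversal at query time.
    Set<Integer> slice(int u) {
        return dslice.getOrDefault(u, Collections.emptySet());
    }
}

class NmcdsDemo {
    // Hypothetical program: 1 defines a (thread B), 2 notifies (thread B),
    // 3 waits (thread A), 4 uses a (thread A).
    static Set<Integer> run() {
        NmcdsSlicer s = new NmcdsSlicer();
        s.addDef(1, "a");
        s.addWait(3, 2);
        s.addEdge(2, 3); // synchronization dependence
        s.addEdge(1, 4); // S-Communication dependence
        s.addEdge(3, 4); // assumed dependence of the use on the preceding wait
        for (int u : new int[] {1, 2, 3, 4}) {
            s.executed(u);
        }
        return s.slice(4);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because each executed node stores its slice immediately, answering a slicing command in this sketch is a single map look-up, which mirrors the claim that the slice is already available when it is requested.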
Fig. 3: The updated CSDG for the program given in Fig. 1
Now, we shall find the backward dynamic slice computed with respect to the slicing criterion at statement 6 for object a3. According to the NMCDS algorithm, the dynamic slice at statement 6 is given by the expression dslice(6, a3) = {1, 5, 9} ∪ dslice(1, a3) ∪ dslice(5, a3) ∪ dslice(9, a3). By evaluating the expression recursively, we get the final dynamic slice at statement 6. The statements included in the dynamic slice are shown as shaded vertices in Fig. 4 and in rectangular boxes in Fig. 5.

C. Complexity Analysis
Space Complexity: The worst-case space complexity of the NMCDS algorithm is O(N^3), where N is the number of nodes in the CSDG or, equivalently, the number of statements in the program.
Time Complexity: The worst-case time complexity of the NMCDS algorithm is O(N^2 S), where N is the number of nodes in the CSDG or, equivalently, the number of statements in the program, and S is the length of the execution of the program.
Fig. 4: The dynamic slice of the program given in Fig. 1 for the slicing criterion

V. COMPARISON WITH RELATED WORK
Zhao et al. [2] developed a dependence-based representation called the concurrent program dependence graph (CPDG) to represent program dependencies in a concurrent Java program. The CPDG is a digraph that consists of a collection of dependence graphs, each representing a single method in the class, and a few additional vertices and arcs to model parameter passing between different methods in a class and inter-thread synchronization and communication between different threads. Krinke [1] developed a static slicing algorithm for threaded programs, but did not consider thread synchronization in the algorithm. Nanda and Ramesh [3] extended Krinke's technique [1] to compute static slices of concurrent programs with synchronization. None of these approaches [1,2] considered the dynamic slicing aspects.
Mohapatra et al. [6,7] and Lallchandani et al. [8] developed techniques to compute dynamic slices of concurrent object-oriented programs. Their approaches involve marking and unmarking the edges of the dependence graph during the execution of the program. For large programs having many method calls and threads, the dependence graph contains a larger number of dependence edges, and their algorithms take more time to mark and unmark all the required dependence edges. In contrast, the NMCDS algorithm marks and unmarks only the executed nodes and thus takes comparatively less time.

CONCLUSION
In this paper, we have proposed a novel algorithm for computing dynamic slices of concurrent object-oriented programs. We have named our algorithm the node-marking concurrent dynamic slicing (NMCDS) algorithm. Our algorithm uses the concurrent system dependence graph (CSDG) as the intermediate representation. The NMCDS algorithm is based on marking and unmarking the nodes of the CSDG as and when the dependencies arise and cease at run time. Our algorithm does not use any trace file to store the execution history. Also, it does not create additional nodes at run time. Another advantage of our approach is that when a request for a slice is made, the slice is already available. We consider the Java concurrency model, though the approach can easily be extended to handle other concurrency models.

REFERENCES
[1] Krinke, J. Static slicing of threaded programs. ACM SIGPLAN Notices 33 (April 1998), 35-42.
[2] Zhao, J., and Li, B. Dependence based representation for concurrent Java programs and its application to slicing. In Proceedings of ISFST (2004), pp. 105-112.
[3] Nanda, M. G., and Ramesh, S. Slicing concurrent programs. In ACM International Symposium on Software Testing and Analysis (August 2000).
[4] Weiser, M. Programmers use slices when debugging. Communications of the ACM 25(7): 446-452, July 1982.
[5] Naughton, P., and Schildt, H. Java - The Complete Reference. McGraw-Hill, 3rd Edition, 1998.
[6] Mohapatra, D. P., Mall, R., and Kumar, R. Computing dynamic slices of concurrent object-oriented programs. Information & Software Technology 47(12): 805-817 (2005).
[7] Mohapatra, D. P., Mall, R., and Kumar, R. A novel method for computing dynamic slices of concurrent C++ program. In Proceedings of 12th International Conference on Advanced Computing & Communications (ADCOM-04), Ahmedabad, December 2004.
[8] Lallchandani, J. T., and Mall, R. Computation of dynamic slices for object-oriented concurrent programs. In Proceedings of Asia Pacific Software Engineering Conference (APSEC'05), 2005, pp. 341-350.