
Proceedings of the 33rd Hawaii International Conference on System Sciences - 2000

On Verifying Distributed Multithreaded Java Programs

Jessica Chen
School of Computer Science, University of Windsor
401 Sunset Avenue, Windsor, Canada N9B 3P4
[email protected]

Abstract

Distributed multithreaded software systems are becoming increasingly important in modern networked environments. For these systems, concurrency control and thread synchronization make it much harder to guarantee quality through traditional extensive testing. In contrast to testing, software verification under suitable formalisms and methodologies usually gives us higher confidence in the system. In this paper, we consider translating the parts of program code that are sensitive to concurrency control into a formal description, so that we can reuse existing verification tools to enhance our confidence in the final code. The Java language is gaining popularity in distributed multithreaded system development, and CCS is a convenient formalism for describing concurrent and multiprocess systems. Under a set of reasonable restrictions, we present a general framework for translating the thread control and synchronization portion of distributed, multithreaded Java programs into a formal specification in CCS. With the translated process terms, we are able to use model checkers to verify properties expressed in modal $\mu$-calculus, such as invariance, eventualities and fairness, which are by nature hard to test.

1. Motivations

With the advances of modern computers and computer networks, distributed multithreaded software systems are becoming more and more popular. Improving the quality of these systems is an important issue. However, distributed systems are usually much more complex, mainly due to concurrency control and thread synchronization. High confidence in these kinds of systems is much harder to reach because a multithreaded environment introduces many more uncertainties into the system behavior, and testing of these systems is much harder to control. Certain properties that usually should hold for multithreaded software systems, such as invariance, eventualities and fairness, are by nature hard to test.

In contrast to testing, software verification under suitable formalisms and methodologies usually gives us higher confidence in our conclusion. Testing usually checks the program code against the system requirements (footnote 1), so the confidence we obtain is about the final system. Traditional verification, on the other hand, verifies the correctness of a formal description of the system behavior against the system requirements. It is not directly related to the code. Very often, additional work (e.g. testing) is still necessary to link the verified description to the final product.

Moreover, formal verification techniques are hard to apply to most software systems. This is mainly because both the system requirements and the description of system behavior need to be expressed in given formal languages, and this is usually hard, if not impossible, for ordinary designers and developers. To overcome this difficulty, various researchers have recently worked on obtaining formal descriptions from semiformal or informal designs. However, even a semiformal or informal design may not always be available, for example:

1. some parts of software systems, like the coordination among multiple threads, can hardly be expressed clearly in semiformal or informal design notations like UML [3];

2. even with semiformal or informal design notations, as an unfortunate reality in the current software industry, design specification may still be harder than programming for most programmers. People are likely to use directly the programming languages that they are much more familiar with.

Footnote 1: This includes module testing, subsystem testing etc., which are dependent on internal design.

This work is supported by the Natural Sciences and Engineering Research Council of Canada under grant number OGP 0209774.

0-7695-0493-0/00 $10.00 (c) 2000 IEEE



Facing these problems, we follow the idea of backward engineering and consider the translation from code (rather than from informal design) into formal descriptions. In this way,

1. we can reuse formal verification tools to check some important properties that are hard to check by testing techniques;

2. we do not require ordinary designers to formalize their description of the system behavior;

3. in addition, compared to the general procedure of specification and verification, verification based on such translated descriptions gives us more confidence in the final system, because we do not have to worry about the consistency between the program code and the formal description of the system behavior.

Since the use of formal specification and verification is still quite restricted, we only translate a small portion of the system that is hard to test, i.e. the one involving concurrency control and thread synchronization. In this paper, we present our experiment on systematically translating the part of a Java program that involves thread control and synchronization into a process term of CCS (Calculus of Communicating Systems) [19].

Multithreaded programming languages are widely used as coding languages for concurrent systems. We have chosen Java in particular because it is gaining popularity for the development of a large number of distributed applications. The Java language provides native support for multithreaded programming, as well as easy-to-use, high-level communication facilities such as Remote Method Invocation (RMI) for distributed systems development.

Process algebras, e.g. CCS [19], CSP [14] and ACP [2], are generally recognized as a convenient tool for describing concurrent and multi-process systems. Moreover, for finite state processes (processes that can be interpreted on finite transition systems), various practical tools have been developed to verify whether a process satisfies certain properties (see e.g. [4, 6, 9, 16, 18, 25]), where the properties characterize the requirements of the system and can be described by a formula of a modal/temporal logic such as CTL [8] or $\mu$-calculus [17, 24]. Here we consider reusing the model checker implemented in CWB (Edinburgh Concurrency Workbench) [25]. We translate pieces of Java programs into CCS in order to verify properties given in modal $\mu$-calculus [17, 24]. A prototype of the automated translation is currently under implementation.

The rest of the paper is organized as follows. Section 2 gives a short review of the Java concurrency control mechanism and of CCS notations. In Section 3, we present the framework of our translation from parts of Java programs to CCS process terms. Section 4 briefly introduces modal $\mu$-calculus and explains how to express some important system requirements with it. The comparison with related work is given in Section 5. The last section is dedicated to conclusions and some directions for future work.

2. Java concurrency control, RMI and CCS notations

As mentioned previously, we focus our formal verification only on the part of a Java program that involves concurrency control and communication among multiple threads, possibly running on different hosts. In this section, we give a brief review of Java concurrency control and Java RMI. We show a typical example that involves synchronization among multiple threads residing on two hosts communicating via RMI. This example is used later on to explain our verification strategy. In the latter part of the section, we give a brief review of the basic CCS notations for readers who are not familiar with them.

2.1. Java concurrency control

The Java language and runtime system support thread synchronization through the use of monitors, which were originally introduced in [13]. Generally, the critical sections in Java programs are defined at the method level (identified by the synchronized keyword), and the Java platform uses monitors to synchronize method calls on an object: each object with synchronized methods is a monitor that allows only one thread at a time to execute a synchronized method of that object. This is accomplished by locking the object when a synchronized method is invoked, so that no other thread can invoke any synchronized method on this object at the same time. A synchronized method automatically performs a lock action when it is invoked; its body is not executed until the lock action has successfully completed. When execution of the method's body completes, either normally or abruptly, an unlock action is automatically performed on that same lock.

In addition to having an associated lock, every object with synchronized methods has an associated wait set of threads. A thread executing in a synchronized method may voluntarily call wait() to release the lock on the monitor object and put itself into the wait set of this object. When notify() is called and the wait set is not empty, some arbitrarily chosen thread (footnote 2) is removed from the wait set and re-enabled for thread scheduling. The awakened thread will compete in the usual manner with any other threads that might be actively competing to synchronize on this object. Analogously, when notifyAll() is called, every thread in the wait set of the object is removed from the wait set and re-enabled for thread scheduling (footnote 3).

Footnote 2: In this work, we do not consider the various priorities among the threads.

Footnote 3: Note that suspend() and resume() are another pair of methods to coordinate the threads. Since they are deprecated in the new version of Java, we do not include them in our discussion.

2.2. Java Remote Method Invocation

In Java programs, there are several ways to communicate between different processes and threads on the same or different hosts, such as low-level message passing with sockets or higher-level Remote Procedure Call (RPC), using either CORBA middleware or RMI. RMI is recommended in pure Java programs because it is easier to use and it is gaining popularity in the Java community. To use RMI, remote interfaces including each method to be invoked remotely have to be declared. Implementations of these remote interfaces should be provided in the related class definitions. At runtime, objects that provide remote methods usually have to be created and registered in some kind of naming server so that they can be looked up by other objects in the network. When a remote method is invoked, the RMI runtime may (or may not) execute the method in a separate thread it creates. So the remote object implementation of the method has to be thread-safe.

2.3. An example

To illustrate a typical scenario in using synchronized methods and Java RMI, let us consider a distributed arbiter system that resolves requests from users for a shared resource. For simplicity, we assume that there are only two users, residing on two hosts A and B. One possible design is to control the access to the shared resource through the use of an implicitly shared token. Every user must hold the token in order to access the shared resource. Method passToken() is called remotely to pass the implicit token between the two hosts. The following interface Arbiter declares that method passToken() can be remotely called.

    import java.rmi.*;

    public interface Arbiter extends Remote {
        public void passToken() throws RemoteException;
    }

An implementation of interface Arbiter for the part of the program on host A is given below.

    import java.rmi.server.*;
    import java.rmi.*;

    class Arbiter1Impl extends UnicastRemoteObject implements Arbiter {
        private boolean tokenIsHere = true;   // the token starts on host A
        private boolean inCS = false;         // the user is not in the critical section
        private Arbiter node2 = null;         // remote reference to the arbiter on host B

        public Arbiter1Impl() throws Exception {
            Naming.rebind("//montague:2000/node1", this);
            Thread.sleep(3000);
            node2 = (Arbiter) Naming.lookup("//SGI:2000/node2");
        }

        public void run() throws Exception {
            getToken();
            doCS();
        }

        public synchronized void getToken() throws Exception {
            if (!tokenIsHere) {
                node2.passToken();
                tokenIsHere = true;
            }
            inCS = true;
        }

        public synchronized void doCS() {
            /* ...do work in critical section... */
            inCS = false;
            notify();
        }

        public synchronized void passToken() throws RemoteException {
            if (inCS) {
                try {
                    wait();
                } catch (Exception e) {
                    e.printStackTrace(System.err);
                }
            }
            tokenIsHere = false;
        }

        public static void main(String args[]) {
            try {
                Arbiter1Impl node1 = new Arbiter1Impl();
                while (true) {
                    node1.run();
                }
            } catch (Exception e) {
                e.printStackTrace(System.err);
            }
        }
    }

Here, we use boolean variable tokenIsHere to denote whether the token is on this host, and variable inCS to denote whether the user on this host is currently in the critical section, i.e. accessing the shared resource. Initially, tokenIsHere is true (i.e. the token is on host A) and inCS is false (i.e. the user is not accessing the resource). On host A, we create an object node1 and register it under the name "node1" for Remote Method Invocation. The user on host A enters the critical section when it successfully enters method doCS(). For simplicity, we have omitted the part of the program corresponding to what the user does in the critical section; it does not affect our verification. When the user exits the critical section, it sets inCS to false and then notifies other threads that might be waiting to enter the critical section. Among the methods defined for node1, getToken(), doCS() and passToken() are synchronized methods, and passToken() is only called remotely. The implementation on host B is almost the same, except that (i) variable tokenIsHere is initially set to false; (ii) we create object node2 on this host and register it under the name "node2" so that node1 can call its method passToken().

2.4. CCS notations

The process terms in CCS are constructed from a set of atomic actions (simply called actions below) and a set of operators. Let $A$ be a set of actions ranged over by $a$. We assume that $A$ comes equipped with a mapping $\bar{\cdot} : A \to \bar{A}$ such that $\bar{\bar{a}} = a$ for every $a \in A$. We use $\bar{A}$ to denote $\{\bar{a} \mid a \in A\}$. Following the convention, we use action $\bar{a}$ to denote sending a signal via channel $a$ and its complementary action $a$ to denote receiving a signal via $a$. The special action $\tau$, where $\tau \notin A \cup \bar{A}$, is used to model "silent" or invisible transitions: those transitions that typically occur when two processes in the system perform an internal communication. In the following, we consider $A \cup \bar{A} \cup \{\tau\}$ as the set of actions, i.e. the actions of sending and receiving signals, and the special action $\tau$. The syntax of basic CCS is given by

$$P ::= nil \;\mid\; \alpha.P \;\mid\; P + P \;\mid\; P | P \;\mid\; P[f] \;\mid\; P \backslash \Lambda$$

where (i) $\alpha$ is an action; (ii) $f$ is a relabelling of the actions; and (iii) $\Lambda$ is a set of actions. Informally,

- $nil$ represents the process which cannot perform any action;

- $\alpha.P$ denotes a process which can only perform $\alpha$ and then behave like $P$;

- $P_1 + P_2$ is a process which can act either as $P_1$ or as $P_2$, and the choice depends on the environment in which $P_1 + P_2$ is used;

- $P_1 | P_2$ is a process which can perform any sequence of actions obtained by arbitrary interleaving of the sequences of actions which $P_1$ and $P_2$ can perform; moreover, it can perform silent moves whenever $P_1$ and $P_2$ are able to perform complementary actions;

- $P[a_1/b_1, \ldots, a_n/b_n]$ denotes the process derived from $P$ by replacing actions $b_1, \ldots, b_n$ in $P$ with $a_1, \ldots, a_n$;

- $P \backslash \Lambda$ is a process which cannot perform actions in $\Lambda$. In fact, $\Lambda$ is a set of actions that are restricted to be internal: if $a \in \Lambda$, then whenever a process in $P$ does an $a$ action, it must be executed simultaneously with action $\bar{a}$ of another process in $P$.

The semantics of a process is given in terms of transition systems. A transition system is a quadruple $(S, Act, \to, s_0)$ where (i) $S$ is a set of states; (ii) $Act$ is a set of actions; (iii) $\to \;\subseteq S \times Act \times S$ is a transition relation; and (iv) $s_0$ is an initial state. Given an initial process $p_{init}$, we associate to $p_{init}$ a transition system whose states are the processes reachable from $p_{init}$ via the transitions inferred by using a set of structural rules. The transition relation $\to$ in the transition system is defined as the least relation satisfying this set of structural rules. The transition system associated to a process $p$ can be automatically generated from $p$ by tools such as CWB. When the transition system associated to $p$ is finite, it forms the basis for formally verifying whether $p$ satisfies certain properties given in modal $\mu$-calculus (see Section 4).
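The structural rules themselves are not listed in this section. For orientation, the block below sketches the standard operational rules of basic CCS as they are usually presented in the literature; it is a reference sketch rather than a reproduction of the rules used by CWB. Here $\alpha$ ranges over $A \cup \bar{A} \cup \{\tau\}$, $a$ over $A \cup \bar{A}$, $\Lambda$ is the notation assumed here for the restriction set, and $K \stackrel{def}{=} P$ stands for a recursive definition such as those used for the lock process in Section 3.1.

```latex
\begin{align*}
&\text{(prefix)} && \alpha.P \xrightarrow{\alpha} P \\[4pt]
&\text{(sum)} && \frac{P_1 \xrightarrow{\alpha} P_1'}{P_1 + P_2 \xrightarrow{\alpha} P_1'}
   \qquad \frac{P_2 \xrightarrow{\alpha} P_2'}{P_1 + P_2 \xrightarrow{\alpha} P_2'} \\[4pt]
&\text{(par)} && \frac{P_1 \xrightarrow{\alpha} P_1'}{P_1 | P_2 \xrightarrow{\alpha} P_1' | P_2}
   \qquad \frac{P_2 \xrightarrow{\alpha} P_2'}{P_1 | P_2 \xrightarrow{\alpha} P_1 | P_2'} \\[4pt]
&\text{(com)} && \frac{P_1 \xrightarrow{a} P_1' \qquad P_2 \xrightarrow{\bar{a}} P_2'}
   {P_1 | P_2 \xrightarrow{\tau} P_1' | P_2'} \\[4pt]
&\text{(rel)} && \frac{P \xrightarrow{\alpha} P'}{P[f] \xrightarrow{f(\alpha)} P'[f]} \\[4pt]
&\text{(res)} && \frac{P \xrightarrow{\alpha} P'}{P \backslash \Lambda \xrightarrow{\alpha} P' \backslash \Lambda}
   \quad (\alpha \notin \Lambda \cup \bar{\Lambda}) \\[4pt]
&\text{(con)} && \frac{P \xrightarrow{\alpha} P'}{K \xrightarrow{\alpha} P'}
   \quad (K \stackrel{def}{=} P)
\end{align*}
```

Rule (com) is the only source of $\tau$ moves between parallel components, and rule (res) is what forces a restricted action to occur only as part of such a synchronization.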

3. Translating Java programs into CCS description

Since model checking is limited to concrete states (states without variables) and the verification is based on the state expansion of a process, we have to restrict ourselves to a small part of a program and make certain reasonable assumptions about this part of the program. In our current work, we assume that, in the portion of the Java program to be verified,

- the objects and threads can be statically determined;

- only boolean variables are allowed (for the moment, we do not consider other types such as integers);

- methods have no local variables, no parameter passing and no recursive calls.

These assumptions form the starting point of our prototype, and they guarantee that we obtain a finite state transition system from the translated process, a condition that is essential for using the model checker in CWB. Under these assumptions, given a portion of a Java program to be verified, let us say that there are $s$ methods, $t$ threads, and $l$ objects among which $m$ have synchronized methods. In this setting, the translated process has the following framework:

$$S = (Lock_1 \mid \ldots \mid Lock_m \mid Wait_1 \mid \ldots \mid Wait_m \mid V_1 \mid \ldots \mid V_n \mid M_{1,1,1} \mid \ldots \mid M_{l,s,t}) \backslash \Lambda$$

Here,

1. Since we know all the objects statically, we give each of them an object identifier $i$ ($1 \le i \le l$). Assume that among these objects, object $i$ with $1 \le i \le m$ has synchronized methods. For each object $i$ with $1 \le i \le m$, we have a lock process $Lock_i$ and a wait-set process $Wait_i$ in the translated process. As explained in Section 2, a monitor of object $o$ actually has two tasks: (i) to synchronize the calls to $o$'s synchronized methods via lock and unlock actions; (ii) to coordinate the threads accessing $o$ when wait() and notify() are called. We use the lock process to simulate the lock mechanism of a monitor. Similarly, we use the wait-set process to simulate the monitor's management of the wait set of threads. The lock process and the wait-set process are explained in detail in Sections 3.1 and 3.2 respectively.

2. Since we know all the objects statically, we also know all the variables of each object. Giving each variable of an object a variable identifier $i$, we use $V_i$ ($1 \le i \le n$) to denote the variable process corresponding to variable $i$. Note that for two variables defined in the same class but belonging to different objects, we use different variable identifiers. The variable process is explained in detail in Section 3.3.

3. Each method is given an identifier $j$ ($1 \le j \le s$), and we use $M_{i,j,k}$ ($1 \le i \le l$, $1 \le j \le s$, $1 \le k \le t$) to denote the method process which simulates thread $k$'s execution of method $j$ of object $i$, where $i$ is an object identifier and $j$ is a method identifier. Under the assumption that there is no recursive method call, for each method call there is one method process corresponding to it. The method process is explained in detail in Section 3.4.

4. $\Lambda$ is a set of actions used in the processes. Its content will be clear once we have introduced the lock processes, wait-set processes, variable processes and method processes. We summarize the content of $\Lambda$ in Section 3.5.
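To make the shape of the framework concrete, the following sketch enumerates the parallel components of S for a statically known program. It is only an illustration: the class TranslatedSystem, its method components and the naming scheme of the strings are hypothetical and are not part of the translator described here. As a simplifying assumption, it creates method processes only for the methods that actually belong to each object, which is how the arbiter instance in Section 3.5 is written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: lists the components Lock_i, Wait_i, V_i and M_i,j,k of S.
final class TranslatedSystem {
    /**
     * @param monitors        number m of objects with synchronized methods
     * @param variables       number n of boolean variables
     * @param methodsOfObject object identifier -> identifiers of its methods
     * @param threads         number t of threads
     */
    static List<String> components(int monitors, int variables,
                                   Map<Integer, List<Integer>> methodsOfObject,
                                   int threads) {
        List<String> parts = new ArrayList<>();
        for (int i = 1; i <= monitors; i++) parts.add("Lock_" + i);
        for (int i = 1; i <= monitors; i++) parts.add("Wait_" + i);
        for (int v = 1; v <= variables; v++) parts.add("V_" + v);
        // One method process per (object, method, thread) triple.
        for (Map.Entry<Integer, List<Integer>> e : methodsOfObject.entrySet()) {
            for (int j : e.getValue()) {
                for (int k = 1; k <= threads; k++) {
                    parts.add("M_" + e.getKey() + "," + j + "," + k);
                }
            }
        }
        return parts;
    }
}
```

For a program with two monitor objects, four variables, five methods per object and two threads, this yields 2 + 2 + 4 + 20 components, which gives a feeling for how quickly the state space of S can grow.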

3.1. The lock process

We use the lock process to simulate the lock mechanism of a monitor. The task of this process is to lock and unlock the monitor object. A general lock process $Lock$ is predefined as

$$Lock = lock.unlock.Lock$$

which can recursively lock and unlock the monitor object by doing actions $lock$ and $unlock$. These two actions are executed simultaneously with actions $\overline{lock}$ and $\overline{unlock}$ respectively, performed by the threads that access the monitor object: any thread that intends to get permission to execute a synchronized method controlled by $Lock$ should do an action $\overline{lock}$, which is complementary to $lock$. This $\overline{lock}$ action by a thread $t$ can succeed only when the lock process performs action $lock$. If the lock process is in state $unlock.Lock$ and cannot perform $lock$, or if $Lock$ is currently synchronizing with the $\overline{lock}$ action of another thread, then the method call of thread $t$ is blocked. As we may have more than one monitor, we use $Lock_i$, together with actions $lock_i$ and $unlock_i$, for the $i$th lock process. Here $i$ is an object identifier. $Lock_i$ is defined by relabelling the actions in $Lock$ as:

$$Lock_i = Lock[lock_i/lock,\; unlock_i/unlock].$$
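The behaviour of Lock can be pictured as a two-state machine that alternates between accepting lock and accepting unlock. The sketch below mimics this in Java. The class LockProcess and its methods are hypothetical names chosen for illustration, and action names are plain strings, so the distinction between an action and its complement is ignored.

```java
// Hypothetical sketch of the CCS term Lock = lock.unlock.Lock as a two-state machine.
final class LockProcess {
    // false: the process is in state Lock; true: it is in state unlock.Lock
    private boolean held = false;

    /** Returns the single action the lock process can perform in its current state. */
    String enabledAction() {
        return held ? "unlock" : "lock";
    }

    /**
     * Tries to synchronize with the lock process on the given action.
     * Returns false when the action is not enabled, which models a blocked caller.
     */
    boolean offer(String action) {
        if (!action.equals(enabledAction())) {
            return false;
        }
        held = !held; // move to the other state
        return true;
    }
}
```

A second attempt to lock while the monitor is held returns false, which corresponds to the blocked method call of thread t described above.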

3.2. The wait-set process

The wait-set process maintains the set of threads that are waiting to be notified and resumes them at the proper time. It is constructed from three kinds of actions: wait, notify and resume. When thread $i$ calls wait(), it performs action $\overline{wait_i}$. This action should be executed simultaneously with action $wait_i$ performed by the wait-set process. When a thread calls notify() or notifyAll(), it performs action $\overline{notify}$ or $\overline{notifyAll}$ respectively. These actions are executed simultaneously with actions $notify$ and $notifyAll$ respectively by the wait-set process. After performing $notify$ or $notifyAll$, the wait-set process is able to perform actions $\overline{resume_i}$ to resume certain threads $i$ waiting to be notified. Note that the wait-set process needs to know the identity of the threads that are waiting to be notified. However, it does not need to know the identity of the thread that calls notify() or notifyAll().

To facilitate the translation, we assume a fixed maximum number of threads which can simultaneously call wait() within the same object. A general wait-set process that handles at most $n$ such threads is predefined. To simplify the explanation, we show below the general wait-set process with $n = 2$:

$$\begin{aligned}
Wait &= wait_1.WA(\overline{resume_1}) + wait_2.WA(\overline{resume_2}) + notify.Wait + notifyAll.Wait \\
WA(r) &= wait_1.WB(r, \overline{resume_1}) + wait_2.WB(r, \overline{resume_2}) + notify.r.Wait + notifyAll.r.Wait \\
WB(r_1, r_2) &= notify.(r_1.WA(r_2) + r_2.WA(r_1)) + notifyAll.r_1.r_2.Wait
\end{aligned}$$

Here we use parameterized processes $WA(r)$ and $WB(r_1, r_2)$ to simplify the description. Processes with actions as parameters are accepted by CWB. In this description, we assume that this monitor is used by threads 1 and 2. Actions $wait_i$ and $resume_i$ ($i = 1, 2$) are for accepting the wait call from thread $i$ and for resuming thread $i$ respectively.

A wait-set process is defined by relabelling the actions in the general wait-set process. There are two kinds of relabelling in this regard: (i) the relabelling of the wait and resume actions with actions carrying the proper thread identifiers; (ii) the relabelling of the notify and notifyAll actions with actions carrying the object identifier of the wait-set process. Generally, wait-set process $i$, which manages the wait and notify calls from threads $u$ and $v$, is defined as

$$Wait_i = Wait[wait_u/wait_1,\; resume_u/resume_1,\; wait_v/wait_2,\; resume_v/resume_2,\; notify_i/notify,\; notifyAll_i/notifyAll]$$

In the example of the distributed arbiter, we have two threads 1 and 2 on different hosts. The wait-set process $Wait_1$ on host A handles the wait() calls from thread 2 on host B, and vice versa. Thus, we have

$$\begin{aligned}
Wait_1 &= Wait[wait_2/wait_1,\; resume_2/resume_1,\; notify_1/notify,\; notifyAll_1/notifyAll] \\
Wait_2 &= Wait[wait_1/wait_1,\; resume_1/resume_1,\; notify_2/notify,\; notifyAll_2/notifyAll]
\end{aligned}$$
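The three definitions Wait, WA(r) and WB(r1, r2) can be read as a small state machine. The following sketch simulates the general wait-set process for two threads. The class WaitSetProcess and its methods are hypothetical, action names are plain strings (complements are not distinguished), and the encoding of the intermediate "must resume" states is an assumption of this sketch rather than something prescribed by the translation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical simulation of the general wait-set process for n = 2.
final class WaitSetProcess {
    private final List<Integer> waiting = new ArrayList<>();      // 0, 1 or 2 entries: Wait, WA, WB
    private final Deque<Integer> resumeQueue = new ArrayDeque<>(); // resumes that must happen next, in order
    private boolean pickOne = false; // after notify in WB: r1.WA(r2) + r2.WA(r1)

    /** Actions the process can perform in its current state. */
    Set<String> enabled() {
        Set<String> out = new TreeSet<>();
        if (!resumeQueue.isEmpty()) {
            out.add("resume" + resumeQueue.peekFirst());
        } else if (pickOne) {
            for (int t : waiting) out.add("resume" + t);
        } else {
            if (waiting.size() < 2) { // Wait and WA(r) accept further wait calls, WB does not
                out.add("wait1");
                out.add("wait2");
            }
            out.add("notify");
            out.add("notifyAll");
        }
        return out;
    }

    /** Performs the action if it is enabled; returns false otherwise. */
    boolean step(String action) {
        if (!enabled().contains(action)) return false;
        if (action.startsWith("resume")) {
            if (!resumeQueue.isEmpty()) {
                resumeQueue.removeFirst();
            } else { // nondeterministic choice after notify with two waiting threads
                int t = Integer.parseInt(action.substring("resume".length()));
                waiting.remove(Integer.valueOf(t));
                pickOne = false;
            }
        } else if (action.startsWith("wait")) {
            waiting.add(Integer.parseInt(action.substring("wait".length())));
        } else if (action.equals("notify")) {
            if (waiting.size() == 1) {
                resumeQueue.add(waiting.remove(0)); // notify.r.Wait
            } else if (waiting.size() == 2) {
                pickOne = true;
            } // with an empty wait set, notify has no effect: notify.Wait
        } else { // notifyAll: resume every waiting thread in arrival order
            resumeQueue.addAll(waiting);
            waiting.clear();
        }
        return true;
    }
}
```

Calling step with wait1, wait2 and notifyAll in that order leaves resume1 as the only enabled action, followed by resume2, which mirrors the term notifyAll.r1.r2.Wait.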

3.3. Variable processes

In CCS, each variable is considered as a process. At the moment, we consider only boolean variables. For a boolean variable $v$, we can define a process with four actions: $setT$ to set $v$ to true, $setF$ to set $v$ to false, $getT$ to get value true from $v$, and $getF$ to get value false from $v$. We use $V(getT)$ and $V(getF)$ to denote the process where the current truth value of $v$ is true and false respectively. The general variable process is predefined as

$$V(r) = setT.V(getT) + setF.V(getF) + r.V(r)$$

where $r$ can be either $getT$ or $getF$. $V(r)$ is able to perform a $setT$ action or a $setF$ action and then behave as $V(getT)$ or $V(getF)$ respectively. Besides, $V(getT)$ can also perform $getT$ and then behave as $V(getT)$, and $V(getF)$ can also perform $getF$ and then behave as $V(getF)$. For variable $i$, where $i$ is the variable identifier, we obtain its variable process by relabelling the actions in the general variable process using the variable identifier:

$$V_i(r) = V(r)[setT_i/setT,\; getT_i/getT,\; setF_i/setF,\; getF_i/getF]$$
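Read operationally, V(r) is a process whose state is the current truth value: the set actions are always enabled, while only the get action that matches the current value is enabled. A minimal Java sketch of this reading follows. The class BoolVarProcess is a hypothetical name, and the sketch again treats action names as plain strings.

```java
// Hypothetical sketch of V(r) = setT.V(getT) + setF.V(getF) + r.V(r).
final class BoolVarProcess {
    private boolean value; // true corresponds to V(getT), false to V(getF)

    BoolVarProcess(boolean initial) {
        this.value = initial;
    }

    /** Performs the action if it is enabled in the current state; returns false otherwise. */
    boolean offer(String action) {
        switch (action) {
            case "setT":
                value = true;   // always enabled, moves to V(getT)
                return true;
            case "setF":
                value = false;  // always enabled, moves to V(getF)
                return true;
            case "getT":
                return value;   // enabled only while the value is true
            case "getF":
                return !value;  // enabled only while the value is false
            default:
                return false;
        }
    }
}
```

A method process that branches on a variable can therefore be written as a choice between a getT branch and a getF branch: exactly one of the two can synchronize with the variable process at any moment.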

3.4. The method process

Now we use the example of the distributed arbiter to illustrate the construction of the method process, which simulates the execution of a method. As explained previously, method process $M_{i,j,k}$ simulates thread $k$'s execution of method $j$ of object $i$. Except for the process corresponding to the main method, each method process $M_{i,j,k}$ recursively starts with action $mStart_{i,j,k}$ and ends with action $\overline{mEnd_{i,j,k}}$. When object $i_1$ in method $j_1$ calls, in thread $k_1$, method $j_2$ of object $i_2$, the method process $M_{i_1,j_1,k_1}$ performs $\overline{mStart_{i_2,j_2,k_1}}$ followed by $mEnd_{i_2,j_2,k_1}$. These two actions are synchronized with actions $mStart_{i_2,j_2,k_1}$ and $\overline{mEnd_{i_2,j_2,k_1}}$ performed by $M_{i_2,j_2,k_1}$. Thus, when $M_{i_1,j_1,k_1}$ has performed $\overline{mStart_{i_2,j_2,k_1}}$, it is blocked and $M_{i_2,j_2,k_1}$ is activated. $M_{i_1,j_1,k_1}$ remains blocked until $M_{i_2,j_2,k_1}$ performs $\overline{mEnd_{i_2,j_2,k_1}}$.

In the example of the distributed arbiter, let us name methods run(), getToken(), doCS(), passToken() and main(String args[]) of object 1 on host A as methods 1, 2, 3, 4, 5, and methods run(), getToken(), doCS(), passToken() and main(String args[]) of object 2 on host B as methods 6, 7, 8, 9, 10. When object 1 executes run() in thread 1, it calls methods getToken() and doCS() of itself. So we have method process

$$M_{1,1,1} = mStart_{1,1,1}.\overline{mStart_{1,2,1}}.mEnd_{1,2,1}.\overline{mStart_{1,3,1}}.mEnd_{1,3,1}.\overline{mEnd_{1,1,1}}.M_{1,1,1}$$

The method process for the main(String args[]) method is active at the beginning: it does not need to be synchronized by a start signal. For example, we have

$$M_{1,5,1} = \overline{mStart_{1,1,1}}.mEnd_{1,1,1}.M_{1,5,1}$$

That is, on host A, we start from the main(String args[]) method of object 1 by recursively calling the run() method (footnote 4). Note that $M_{1,5,1}$ does not start with action $mStart_{1,5,1}$.

Besides the Java program, the user is also allowed to input to the translator some additional signals which can be used as externally observable actions for the verification (like a print-out statement in a program). In our example, we intend to say that when method doCS() is called, it enters the critical section (which is not shown in the program). When it exits, it changes inCS to false and notifies other threads that might be waiting for the token. In this setting, we can ask the translator to add, for example, signals $enter_1$ followed by $exit_1$ right before setting inCS to false, to denote that object 1 enters and exits the critical section. Similarly, we can add signals $enter_2$ and $exit_2$ in the same place for object 2. Thus, we have method process

$$M_{1,3,1} = mStart_{1,3,1}.\overline{lock_1}.enter_1.exit_1.\overline{setF_2}.\overline{notify_1}.\overline{mEnd_{1,3,1}}.\overline{unlock_1}.M_{1,3,1}$$

When object 1 calls doCS(), $M_{1,3,1}$ is activated by receiving a signal $mStart_{1,3,1}$. It then performs the following actions in sequence: (i) try to get the lock by synchronizing with $Lock_1$ using action $lock_1$; this is necessary because method doCS() is a synchronized one; (ii) give out signals $enter_1$ and $exit_1$, as the user requested for external observation; (iii) set variable 2 (inCS of object 1) to false; (iv) notify other threads that might be waiting to enter the critical section, by synchronizing with $Wait_1$ using action $notify_1$; (v) inform the caller of method 3 of the end of executing this method; (vi) unlock monitor object 1 by synchronizing with $Lock_1$ using action $unlock_1$.

Using Java RMI, one can make a remote method call just like making a local call. Similarly, the simulation of a remote method call in CCS is the same as the simulation of a local method call. This is because in CCS, parallelism can be interpreted as parallel execution of multiple threads either on the same machine or on different machines. Consider, for example, method 2 (getToken() on host A) in the distributed arbiter, executed in thread 1. The corresponding method process is constructed as:

$$\begin{aligned}
M_{1,2,1} = mStart_{1,2,1}.\overline{lock_1}.(\;&\overline{getF_1}.\overline{mStart_{2,9,1}}.mEnd_{2,9,1}.\overline{setT_1}.\overline{setT_2}.\overline{mEnd_{1,2,1}}.\overline{unlock_1}.M_{1,2,1} \\
+\;&\overline{getT_1}.\overline{setT_2}.\overline{mEnd_{1,2,1}}.\overline{unlock_1}.M_{1,2,1}\;)
\end{aligned}$$

Footnote 4: For the moment, we do not take exception handling into account.
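The pattern followed by the process for doCS() (wait for the start signal, take the lock, run the translated body, report the end of the method, release the lock) suggests how a translator might assemble the term for a synchronized method without branching. The sketch below builds such a term as text. Everything in it is hypothetical: the class MethodProcessBuilder, the textual notation, and in particular the leading apostrophe, which is used here only as this sketch's ASCII stand-in for a complementary action. Which actions carry the complement mark is likewise an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that prints the term of a straight-line synchronized method.
final class MethodProcessBuilder {
    /**
     * @param obj    object identifier i
     * @param method method identifier j
     * @param thread thread identifier k
     * @param body   already translated actions of the method body, in order
     */
    static String synchronizedMethod(int obj, int method, int thread, List<String> body) {
        String id = obj + "," + method + "," + thread;
        List<String> actions = new ArrayList<>();
        actions.add("mStart_" + id);   // activated by the caller's start signal
        actions.add("'lock_" + obj);   // acquire the monitor of the object
        actions.addAll(body);          // translated method body
        actions.add("'mEnd_" + id);    // tell the caller that the method has finished
        actions.add("'unlock_" + obj); // release the monitor
        // The process is recursive: after the last action it behaves as M_id again.
        return "M_" + id + " = " + String.join(".", actions) + ".M_" + id;
    }
}
```

For instance, `MethodProcessBuilder.synchronizedMethod(1, 3, 1, Arrays.asList("enter_1", "exit_1", "'setF_2", "'notify_1"))` produces a line with the same sequence of actions as the process given for doCS() on host A. Methods with conditionals, such as getToken(), would additionally need a choice between a getT branch and a getF branch, which this sketch does not cover.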

3.5. The restriction to the translated process As we discussed so far, the CCS process corresponding to the distributed arbiter can be expressed as

S = (Lock1 j Lock2 j W ait1 j W ait2 j V 1 j : : : j V4 j M1;1;1 j : : : j M1;5;2 j M2;6;1 j : : : j M2;10;2 )n

The restriction in the translated process contains all the actions used for the internal synchronization among the lock processes, wait-set processes, variable processes and method processes. Generally, we have

$$= \{\, lock_i,\ unlock_i,\ wait_j,\ resume_j,\ notify_i,\ notifyAll_i,\ setT_k,\ setF_k,\ getT_k,\ getF_k,\ mStart_{p,q,j},\ mEnd_{p,q,j} \mid 1 \le i \le m,\ 1 \le j \le t,\ 1 \le k \le n,\ 1 \le p \le l,\ 1 \le q \le s \,\}$$

In the example of the distributed arbiter, this set is

$$\{\, lock_i,\ unlock_i,\ wait_j,\ resume_j,\ notify_i,\ notifyAll_i,\ setT_k,\ setF_k,\ getT_k,\ getF_k,\ mStart_{p,q,j},\ mEnd_{p,q,j} \mid 1 \le i, j, p \le 2,\ 1 \le k \le 4,\ 1 \le q \le 10 \,\}$$

4. The formal verification

Once we have a representation of a portion of a distributed system in CCS, we can use it to infer some important properties of the system. In CWB, the modal $\mu$-calculus is adopted as the specification language for describing system properties. Here we give a brief review of the modal $\mu$-calculus. We refer to the excellent tutorial article [25] for a thorough introduction to the modal $\mu$-calculus and its use in the context of concurrent processes. The formulae of the modal $\mu$-calculus consist of: (i) propositions ($true$ and $false$); (ii) logical connectives ($\wedge$, $\vee$, $\neg$); (iii) modalities, which denote the capability of performing certain actions in a given configuration; (iv) least and greatest fixpoint constructs, which denote "temporal" properties of the system, typically defined by induction and coinduction.

Formally, $\mu$-formulae are formed inductively according to the following abstract syntax (note that not all of the constructs are independent):

$$\varphi ::= true \mid false \mid \neg\varphi \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1 \vee \varphi_2 \mid \langle L \rangle \varphi \mid [L]\varphi \mid \mu X.\varphi \mid \nu X.\varphi \mid X$$

where $X$ is a variable symbol in a given set $Var$ of variables, and $L$ is a set of actions in $Act$. As usual in the $\mu$-calculus, for formulae of the form $\mu X.\varphi$ and $\nu X.\varphi$ we require the syntactic monotonicity of $\varphi$ with respect to $X$: every occurrence of the variable $X$ in $\varphi$ must be within the scope of an even number of negation signs. This requirement guarantees the existence of the least and the greatest fixpoints associated with $\varphi$.

The formulae of the modal $\mu$-calculus are interpreted over transition systems. Given a transition system $T = \langle S, Act, \rightarrow, s_0 \rangle$, a valuation $V$ on $T$ is a mapping from variables in $Var$ to subsets of the states in $T$. We assign meaning to $\mu$-formulae by associating to $T$ and $V$ an extension function which maps $\mu$-formulae to subsets of $S$.

The boolean connectives have the expected meanings.

The extension of $\langle L \rangle \varphi$ consists of all the states $s \in S$ such that, starting from $s$, there is an execution of some action in $L$ that leads to a successor state $s'$ included in the extension of $\varphi$. Thus, for example, $\langle \{a\} \rangle true$ expresses the capability of executing action $a$.

The extension of $[L]\varphi$ consists of all the states $s$ such that, starting from $s$, each execution of an action in $L$ leads to a successor state $s'$ included in the extension of $\varphi$. Thus, for example, $[\{a\}] false$ expresses the inability to execute action $a$.

The extension of $\mu X.\varphi$ is the smallest subset $E$ of $S$ such that, assigning to $X$ the extension $E$, the resulting extension of $\varphi$ is contained in $E$.

Similarly, the extension of $\nu X.\varphi$ is the largest subset $E$ of $S$ such that, assigning to $X$ the extension $E$, the resulting extension of $\varphi$ contains $E$.
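When $S$ is finite, the two fixpoints described above can also be reached by simple iteration. The following is a sketch under that assumption, where $f(E)$ is shorthand introduced here (not notation from the paper) for the extension of $\varphi$ when $X$ is assigned $E$; syntactic monotonicity makes $f$ monotone.

```latex
\begin{align*}
\mu X.\varphi &: \quad \bigcap \{\, E \subseteq S \mid f(E) \subseteq E \,\}
   \;=\; \bigcup_{i \ge 0} f^{i}(\emptyset), \\
\nu X.\varphi &: \quad \bigcup \{\, E \subseteq S \mid E \subseteq f(E) \,\}
   \;=\; \bigcap_{i \ge 0} f^{i}(S).
\end{align*}
```

The least fixpoint is approached from below, starting with the empty set, and the greatest fixpoint from above, starting with all of $S$; on a finite state set both chains stabilize after at most $|S|$ steps.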

Now we show examples of $\mu$-formulae that typically characterize some important system requirements. Let $Act$ be the set of all external actions appearing in the system. In the distributed arbiter example, as we explained, the user can ask the translator to add some external actions for verification purposes, e.g. actions $enter_1$ and $exit_1$ for node1 to enter and exit the critical section, and actions $enter_2$ and $exit_2$ for node2 to enter and exit the critical section. In this setting, $Act = \{enter_1, enter_2, exit_1, exit_2\}$.

1. $\mu X.(\varphi \vee \langle Act \rangle X)$ expresses that there exists an evolution of the system such that eventually $\varphi$ holds. Indeed, its extension $E$ is the smallest set that includes (1) the states in the extension of $\varphi$; and (2) the states that can execute an action leading to a successor state that is in $E$. In other words, $E$ includes each state $s$ such that there exists a run from $s$ leading eventually (i.e. in a finite number of steps) to a state in the extension of $\varphi$.

2. Let $Always(\varphi) = \nu X.(\varphi \wedge [Act]X)$. As explained in the previous item, $Always(\varphi)$ expresses the invariance of $\varphi$ under all evolutions of the system. Typically, deadlock freeness is an invariance. It can be expressed as $Always(\langle Act \rangle true)$, that is, it is always true that the system can perform some action.

3. $Always([\{a_1\}] false \vee [\{a_2\}] false)$ says that at any time, it is impossible that both $a_1$ and $a_2$ are enabled. In the distributed arbiter example, with actions $enter_1$, $enter_2$, $exit_1$ and $exit_2$, mutual exclusion on the critical section can be expressed simply as $Always([\{exit_1\}] false \vee [\{exit_2\}] false)$, i.e. at no moment is the system able to perform both actions $exit_1$ and $exit_2$ (which would mean that both users are in the critical section).

4. $Even(a) = \mu X.(\langle Act \rangle true \wedge [Act - \{a\}]X)$ expresses that action $a$ will eventually occur. Let us add two more external actions $req_1$ and $req_2$ in the methods getToken() on the two machines respectively. Then $Always([req_1] Even(enter_1))$ expresses that whenever user 1 raises a request on host A, he/she will eventually enter the critical section.
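To make the four patterns above concrete, here is a minimal explicit-state sketch in Java that evaluates them by fixpoint iteration on a small hand-written transition system. It is not the CWB and not the translated arbiter process: the class `MuCalcSketch`, its method names, and the three-state toy system are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Explicit-state evaluation of a few mu-calculus patterns on a toy transition system. */
final class MuCalcSketch {
    /** One labelled transition: from --action--> to. */
    static final class Edge {
        final int from; final String action; final int to;
        Edge(int from, String action, int to) { this.from = from; this.action = action; this.to = to; }
    }

    final Set<Integer> states = new HashSet<>();
    final List<Edge> edges = new ArrayList<>();

    void add(int from, String action, int to) {
        states.add(from); states.add(to);
        edges.add(new Edge(from, action, to));
    }

    /** <L>phi: states with SOME L-transition into phi. */
    Set<Integer> diamond(Set<String> L, Set<Integer> phi) {
        Set<Integer> r = new HashSet<>();
        for (Edge e : edges) if (L.contains(e.action) && phi.contains(e.to)) r.add(e.from);
        return r;
    }

    /** [L]phi: states whose EVERY L-transition leads into phi. */
    Set<Integer> box(Set<String> L, Set<Integer> phi) {
        Set<Integer> r = new HashSet<>(states);
        for (Edge e : edges) if (L.contains(e.action) && !phi.contains(e.to)) r.remove(e.from);
        return r;
    }

    /** Iterate f until nothing changes: start = {} yields mu, start = all states yields nu. */
    static Set<Integer> fix(UnaryOperator<Set<Integer>> f, Set<Integer> start) {
        Set<Integer> cur = start;
        while (true) {
            Set<Integer> next = f.apply(cur);
            if (next.equals(cur)) return cur;
            cur = next;
        }
    }

    Set<Integer> some(Set<String> act, Set<Integer> phi) {   // mu X.(phi or <Act>X)
        return fix(x -> union(phi, diamond(act, x)), new HashSet<>());
    }
    Set<Integer> always(Set<String> act, Set<Integer> phi) { // nu X.(phi and [Act]X)
        return fix(x -> inter(phi, box(act, x)), new HashSet<>(states));
    }
    Set<Integer> even(Set<String> act, String a) {           // mu X.(<Act>true and [Act - {a}]X)
        Set<String> rest = new HashSet<>(act);
        rest.remove(a);
        Set<Integer> canMove = diamond(act, states);
        return fix(x -> inter(canMove, box(rest, x)), new HashSet<>());
    }

    static Set<Integer> union(Set<Integer> a, Set<Integer> b) { Set<Integer> r = new HashSet<>(a); r.addAll(b); return r; }
    static Set<Integer> inter(Set<Integer> a, Set<Integer> b) { Set<Integer> r = new HashSet<>(a); r.retainAll(b); return r; }

    /** Hypothetical toy system: 0 = idle, 1 = user 1 in CS, 2 = user 2 in CS. */
    static MuCalcSketch toy() {
        MuCalcSketch t = new MuCalcSketch();
        t.add(0, "enter1", 1); t.add(1, "exit1", 0);
        t.add(0, "enter2", 2); t.add(2, "exit2", 0);
        return t;
    }

    public static void main(String[] args) {
        MuCalcSketch t = toy();
        Set<String> act = Set.of("enter1", "exit1", "enter2", "exit2");
        Set<Integer> none = Set.of();
        System.out.println("deadlock-free: " + t.always(act, t.diamond(act, t.states)).contains(0));
        Set<Integer> mutex = union(t.box(Set.of("exit1"), none), t.box(Set.of("exit2"), none));
        System.out.println("mutual exclusion: " + t.always(act, mutex).contains(0));
        System.out.println("Even(enter1): " + t.even(act, "enter1").contains(0));
    }
}
```

On this toy system the first two properties hold in state 0, while `Even(enter1)` does not: the run that alternates `enter2` and `exit2` forever never performs `enter1`. A real model checker works on the state space generated from the CCS term rather than on a hand-written edge list.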

With the translated process $S$ for the distributed arbiter, we can use CWB to verify many interesting properties expressible in the modal $\mu$-calculus against the Java program, for example whether the program is deadlock free and whether mutual exclusion is guaranteed. Notably, the answer to the verification of the $\mu$-formula

$Always([exit_1] false \vee [exit_2] false)$

(the mutual exclusion property) is negative. Following a trace (provided by CWB) leading to an error situation, we can easily find the reason: the Java run-time system does not guarantee that a waiting thread, once notified, will immediately be assigned the monitor to access the synchronized method and continue its work. For instance, assume that the token is in host A, user 1 is in the critical section, and user 2 has called wait(). When user 1 exits from the critical section, user 2 is notified, but this does not mean that user 2 will immediately be able to access the synchronized method passToken() again. In fact, user 1 may at this time raise a request again and access method getToken(). If user 2 accesses method passToken() right after user 1 has finished method getToken(), then user 2 will be able to return from the remote method invocation and enter its critical section. At the same time, user 1 can also access method doCS() and enter its critical section.

Once we have found the cause of the negative answer to mutual exclusion, we can easily see that the if-statement in method passToken() should be changed into a while-statement. This kind of diagnosis, by way of a trace leading to an error state, is currently not automatic, but we intend to work on its automation (see Section 6). Note also that the answer to the verification of the $\mu$-formula

$Always([req_1] Even(enter_1))$

is also negative. That is, user 1 may starve: it is possible that he/she raises a request at a certain point but never gets the chance to enter the critical section afterwards. The explanation is similar to the previous one: although user 1 can call wait() to wait for the token, and user 2 calls notify() when it finishes the critical section, this does not guarantee that user 1 can access method passToken(). In fact, it is possible that user 2 repeatedly raises requests and gets the chance to enter the critical section each time, before user 1 gets even one chance to access the synchronized method passToken().
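The if-versus-while repair discussed in this section can be sketched as follows. This is a hypothetical, single-JVM stand-in and not the arbiter code itself: RMI is left out, the class `TokenGuard` and the methods `tryEnterCS()`, `exitCS()` and `demo()` are names assumed for illustration, and only `passToken()` and the flag `inCS` echo names used in the text.

```java
/** Hypothetical single-JVM stand-in for one node of the arbiter (no RMI). */
final class TokenGuard {
    private boolean hasToken;       // true while this node holds the token
    private boolean inCS = false;   // true while the local user is in its critical section

    TokenGuard(boolean hasToken) { this.hasToken = hasToken; }

    /** Local user tries to enter; succeeds only while this node holds the token. */
    synchronized boolean tryEnterCS() {
        if (!hasToken) return false;
        inCS = true;
        return true;
    }

    /** Local user leaves the critical section and wakes up any waiting passToken() call. */
    synchronized void exitCS() {
        inCS = false;
        notifyAll();
    }

    /** Called on behalf of the other node: gives the token away once the local user is outside. */
    synchronized void passToken() throws InterruptedException {
        // 'while', not 'if': a notified thread still has to re-acquire the monitor,
        // and the local user may have re-entered the critical section in the meantime.
        while (inCS) {
            wait();
        }
        hasToken = false;  // the condition !inCS was re-checked under the monitor
    }

    /** Small demo: the hand-over completes only after exitCS(); afterwards the local user is refused. */
    static boolean demo() {
        TokenGuard g = new TokenGuard(true);
        g.tryEnterCS();
        Thread remote = new Thread(() -> {
            try {
                g.passToken();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        remote.start();
        g.exitCS();
        try {
            remote.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !g.tryEnterCS();
    }

    public static void main(String[] args) {
        System.out.println("token handed over safely: " + demo());
    }
}
```

Re-checking the condition in a loop addresses the mutual exclusion violation only. It does nothing for the starvation property, since `notify()` and `notifyAll()` give no guarantee about which thread obtains the monitor next.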

5. Related works

Formal verification techniques rely on both a formal description of the system properties to be verified and a formal description of the system behavior. In recent years, various researchers have been exploiting formal verification techniques to check the correctness of either a design or a program against its requirements. The focus has been on deriving formal descriptions of system behavior. Since the Unified Modeling Language (UML) [3] is well accepted by the software industry as a graphical design notation, people have been studying translations of UML descriptions into various formalisms. For example, [10] discussed the formalization of class diagrams in UML using Z [22], and [11] introduced an approach to formalizing some aspects of UML so that the integrity of a UML design can be checked automatically. People have also followed different lines of research in applying formal verification techniques to verify the correctness of program code against the requirements: (i) the solution given in [23] is to provide an additional (Java) class library which supports constructs of the process algebra CSP [14]; (ii) the solution given in [5, 12, 21], and in our work, is to translate the program code written in a certain language into some formal description language. Following (i), the software designers (or programmers) are still required to be familiar with (and actually use) the formal description language. Following (ii), in [21], Shatz et al. described a tool to verify the correctness of Ada tasking programs using Petri nets. A translator from a concurrent extension of C++ into Promela [15] is presented in [5], and a translator from a subset of Java into Promela can be found in [12]. In both of these works, the translated programs are then given to the SPIN model checker (see e.g. [16]) as input to verify system properties given in LTL [20]. Our work also falls in this line of research. We translate a concurrency-sensitive subset of Java into CCS in order to verify system properties given in the modal $\mu$-calculus. Among the above-mentioned works, [12] is the one most closely related to ours. Among the differences between the two, we would like to mention the following two points:

1. Since the chosen formal descriptions are different, we can do different kinds of analysis on the translated programs. The analysis of system behavior with SPIN lies in the linear structure of the model. This includes, for example, checking trace equivalence and checking whether a linear-time temporal logic formula is satisfied. The analysis of system behavior with CWB, on the other hand, lies in the branching structure of the model. So we can, for example, check bisimulation equivalence [19] and check whether a branching-time temporal logic formula is satisfied.

2. Since the operational semantics of CCS is quite clean and clear, we have indirectly defined in our work the semantics of a subset of Java in terms of transition systems. There exists a clear relationship between the states in the execution of a Java program and the states in the defined transition system. With such a relationship, we are able to derive many analytic results (not restricted to verification) from the model and then explain them at the level of the programming language.
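The difference between the linear and the branching view can be illustrated with a standard textbook pair of CCS processes. The example is added here for illustration and does not come from the arbiter; the names $P$ and $Q$ are arbitrary.

```latex
\begin{align*}
P &= a.(b.0 + c.0), \qquad Q = a.b.0 + a.c.0, \\
\mathit{traces}(P) &= \mathit{traces}(Q) = \{\varepsilon,\ a,\ ab,\ ac\}, \\
P &\models \langle \{a\} \rangle \bigl( \langle \{b\} \rangle true \wedge \langle \{c\} \rangle true \bigr),
\qquad
Q \not\models \langle \{a\} \rangle \bigl( \langle \{b\} \rangle true \wedge \langle \{c\} \rangle true \bigr).
\end{align*}
```

The two processes are trace equivalent but not bisimilar: after $a$, process $P$ still offers both $b$ and $c$, whereas $Q$ has already committed to one of them. A branching-time formula such as the one above can tell them apart.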

We can also define the operational semantics of Java programs in terms of transition systems directly, instead of using the semantics of an existing process algebra. For example, in [1, 7], the semantics of subsets of Java are defined directly with a set of structural rules. These may also lead to the application of formal verification techniques to Java programs. Since they are not oriented towards applying existing verification tools, the real application of these semantics to formal verification still needs a lot of work.

6. Conclusions and future work

Based on the Java language specification, we proposed a CCS-based modeling of Java thread control and synchronization mechanisms. This partially laid the groundwork for formal descriptions of Java programs involving multiple threads. Furthermore, we proposed a general framework for translating the thread control and synchronization portion of distributed, multithreaded Java programs into a formal description in CCS. The resulting description can be used for various purposes. In particular, it can be used for formal verification against certain required properties, such as deadlock/livelock freeness, fairness, eventualities, etc. These kinds of properties are usually very hard to test with general testing techniques. We have used the distributed arbiter example written in Java RMI. Using CORBA middleware in our framework is very similar to using RMI. Message passing with stream sockets can also be used in a similar way. As far as concurrency control and thread synchronization are concerned, they make no difference to the underlying translation and verification.⁵ The example we have shown here has several interesting variations. As we discussed, the if check of variable inCS in method passToken() may actually lead both nodes into method doCS() simultaneously (i.e. the mutual exclusion property does not hold with this implementation). This subtle error is very hard to detect in ordinary testing because the probability of hitting it is relatively low. Without the help of verification tools, this error would easily slip into the product in customers' hands. We have detected the above error in the translated CCS process by using the tracing capability of CWB. This indicates that if we can make use of these kinds of capabilities of the model checking tools, we might also be able to automatically identify the problems at the programming language level. In the future, we will include the diagnosis in our Java program verification tool. Due to the limitations of the verification tools, the translation is restricted to a small portion of a distributed system. Such work obviously does not scale to the whole set of Java programs. To make it more useful, we intend to explore possible ways of combining it with testing techniques so that we can improve the effectiveness of both testing and verification. There are two main directions that we are interested in:

1. To find a way to introduce into the formal verification the dynamic interaction with the external world. Such interactions may be either between the portion of the program being verified and the rest of the system, or between this portion and the external world of the system. Obviously, they are based on sample test cases. The interaction would give us information about the execution of the system at a given point of the program (a kind of debugging), and would help us to guide the further execution of the verification procedure and thus significantly reduce the state space of the model being verified.

2. To use this formal verification technique on some concurrency-sensitive classes as part of unit testing, and use some testing tools on the other classes, then investigate a suitable way to derive conclusions on the behavior of the global system.

In this paper, we focused mainly on how to obtain a description of system behavior from code. The requirement specification in the $\mu$-calculus still mainly relies on user input. But we can see from the above analysis that some system properties are quite typical and generally used in distributed multithreaded systems. These properties can be considered as instantiations of certain general $\mu$-formulae with some actions as parameters. Hence, we consider it possible to provide a relatively simple user interface for the designers to express some of their requirements without requiring knowledge of the modal $\mu$-calculus.

We will continue to study the derivation of CCS descriptions from pieces of Java programs in which different priorities are assigned to the threads. Basic CCS does not explicitly support the expression of various priorities among processes. However, we will look for a suitable way to simulate it.

Another possible way to extend this work is to consider using symbolic model checking tools (see e.g. [18]) so that we can avoid the complete state space expansion in the verification. This will also help us to handle integer variables in Java programs.

⁵ Of course, error handling or exception handling may be different.

References

[1] I. Attali, D. Caromel, and M. Russo. A formal executable semantics for Java. In OOPSLA'98 Workshop on Formal Underpinnings of the Java Paradigm. Vancouver, Canada, October 1998.
[2] J. Bergstra and J. Klop. Process algebra for synchronous communication. Information and Control, 60:109–137, 1984.
[3] G. Booch, I. Jacobson, and J. Rumbaugh. The Unified Modeling Language User Guide. Addison-Wesley, 1998.
[4] G. Boudol, R. de Simone, V. Roy, and D. Vergamini. Process calculi, from theory to practice: Verification tools. In Proc. of Workshop on Automatic Verification Methods for Finite State Systems, LNCS 407. Springer-Verlag, 1990.
[5] T. Cattel. Modeling and verification of sC++ applications. In Proc. of the Tools and Algorithms for the Construction and Analysis of Systems, LNCS 1384, pages 232–248. Springer-Verlag, 1998.
[6] R. Cleaveland and S. Sims. The NCSU concurrency workbench. In Computer-Aided Verification (CAV'96), LNCS 1102, pages 394–397. Springer-Verlag, 1996.
[7] E. Coscia and G. Reggio. A proposal for a semantics of a subset of multi-threaded "good" Java programs. In OOPSLA'98 Workshop on Formal Underpinnings of the Java Paradigm. Vancouver, Canada, October 1998.
[8] E. A. Emerson. Temporal and modal logic. In Handbook of Theoretical Computer Science, volume B, chapter 16. Elsevier Science Publishers B.V., 1990.
[9] E. A. Emerson. Automated temporal reasoning about reactive systems. In Logics for Concurrency: Structure versus Automata, LNCS 1043, pages 41–101. Springer-Verlag, 1996.
[10] R. B. France, J. M. Bruel, M. M. Larrondo-Petrie, and M. Shroff. Exploring the semantics of UML type structures with Z. In IFIP Proc. of Formal Methods in Open Object-based Distributed Systems, pages 247–257. Chapman & Hall, 1997.
[11] A. Hamie, J. Howse, S. Kent, R. Mitchell, and F. Civello. A formal semantics for checking and analysing UML models. In 13th Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications, 1998.
[12] K. Havelund. Java PathFinder. In The 6th International SPIN Workshop, LNCS 1680. Springer-Verlag, 1999.
[13] C. A. R. Hoare. Monitors: An operating system structuring concept. Communications of the ACM, 17(10):549–557, 1974.
[14] C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall Int., London, 1985.
[15] G. Holzmann. The Design and Validation of Computer Protocols. Prentice Hall, 1991.
[16] G. Holzmann. The model checker SPIN. IEEE Transactions on Software Engineering, 23(5), May 1997.
[17] D. Kozen. Results on the propositional $\mu$-calculus. Theoretical Computer Science, 27(2):333–354, 1983.
[18] K. L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, 1993.
[19] R. Milner. Communication and Concurrency. Prentice Hall, London, 1989.
[20] A. Pnueli. The temporal logic of programs. In Proc. of 18th IEEE Symp. on Foundations of Computer Science, pages 46–57. 1977.
[21] S. Shatz, K. Mai, C. Black, and S. Tu. Design and implementation of a Petri Net based toolkit for Ada tasking analysis. IEEE Transactions on Parallel and Distributed Systems, 1(4), October 1990.
[22] J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall, 1992.
[23] G. Stiles. Safe and verifiable design of multithreaded Java programs with CSP and FDP. In OOPSLA'98 Workshop on Formal Underpinnings of the Java Paradigm. Vancouver, Canada, October 1998.
[24] C. Stirling. Modal and temporal logics. In Handbook of Logic in Computer Science, Volume 2, pages 477–563. Oxford University Press, 1992.
[25] C. Stirling. Modal and temporal logics for processes. In Logics for Concurrency: Structure versus Automata, LNCS 1043, pages 149–237. Springer-Verlag, 1996.

