Specifying Multithreaded Java Semantics for Program Verification Abhik Roychoudhury
Tulika Mitra
Department of Computer Science School of Computing National University of Singapore Singapore 117543
Department of Computer Science School of Computing National University of Singapore Singapore 117543
[email protected]
[email protected]
ABSTRACT The Java programming language supports multithreading where the threads intera t among themselves via read/write of shared data. Most urrent work on multithreaded Java program veri ation assumes a model of exe ution that is based on interleaving of the operations of the individual threads. However, the Java language spe i ation (whi h any implementations of Java multithreading must follow) supports a weaker model of exe ution, alled the Java Memory Model (JMM). The JMM allows ertain reordering of operations within a thread and thus permits more behaviors than the interleaving based exe ution model. Therefore, programs veri ed by assuming interleaved thread exe ution may not behave orre tly for ertain Java multithreading implementations. The main diÆ ulty with the JMM is that it is informally des ribed in an abstra t rule-based de larative style, whi h is unsuitable for formal veri ation. In this paper, we develop an equivalent formal exe utable spe i ation of the JMM. Our spe i ation is operational and uses guarded
ommands. We then use this exe utable model to verify popular software onstru tion idioms ( ommonly used program fragments/patterns) for multithreaded Java. Our prototype veri er tool dete ts a bug in the widely used \DoubleChe ked Lo king" idiom, whi h veri ers based on interleaving exe ution model annot possibly dete t.
1. INTRODUCTION The Java programming language supports multithreaded programming where multiple threads an ommuni ate via reads/writes of shared obje ts (see [21℄ for a detailed dis ussion on software design using multithreaded Java). Multithreading is a useful te hnique as it allows the programmer to stru ture dierent parts of the program into dierent threads. Implementing the user interfa e of a software as a separate thread is a ommon example of su h stru turing.
Con rete real-life uses and appli ations of Java multithreading are presented in [18℄. Java threads an be run on multiple hardware pro essors or on a single pro essor through a thread library (su h as POSIX threads [7℄). As the implementations of multithreading are varied, the Java Language Spe i ation (JLS) pres ribes ertain abstra t rules whi h any implementation of Java multithreading must follow [16℄. These rules are alled the Java Memory Model (JMM). However, the JMM is more
omplex than an interleaved exe ution of the threads, where ea h thread exe utes in program order. The operations in any Java thread in lude read/write of shared variables and syn hronization operations like lo k/unlo k. In order to allow standard ompiler and hardware optimizations, the JMM permits these operations within a thread to be ompleted out-of-order. Thus, the permitted set of exe ution tra es under the JMM is a superset of the simple interleaved exe ution of the individual threads. This makes the debugging and veri ation of multithreaded Java software very diÆ ult as we have to onsider:
arbitrary interleaving of the threads,
ertain (not all) re-orderings of operations in the individual threads.
There is urrently a huge body of ongoing work on employing stati analysis and model he king te hniques [10℄ for on urrent Java program veri ation [14, 19, 25, 26, 31℄. Some of these te hniques translate the program to a formal model [19, 25℄ and then use data ow analysis/ model
he king to sear h the state spa e of this model. Others [14, 31℄ dire tly analyze program sour e ode by employing te hniques su h as stateless sear h and persistent sets. However, a ommonality among all these te hniques is that they assume the underlying exe ution model of a multithreaded program to be sequentially onsistent [20℄.
Sequential Consistency. Before pro eeding any further, let us elaborate on this point. An exe ution model for multithreaded programs is sequentially onsistent if for any program P (a) any exe ution of P is an interleaving of the operations in the onstituent threads (b) the operations in ea h onstituent thread exe ute in program order. Thus, in the following program with two threads (Op1 ; Op2 )
k (Op1 ; Op2 ) 0
0
Op1 ; Op1 ; Op2 ; Op2 is a sequentially onsistent exe ution but Op1 ; Op2 ; Op2 ; Op1 is not. As sequential onsisten y denotes 0
0
0
0
the programmer's intuitive understanding of the exe ution model, it is generally an useful model to assume for purposes of program veri ation. Unfortunately, it is not suÆ ient to assume a sequentially
onsistent exe ution model for verifying multithreaded Java programs. The reason for this lies in the JMM. The urrent JMM (whi h any implementation of Java multithreading must follow) is weaker than sequential onsisten y, that is, it allows more behaviors than simple interleaving of the operations of the individual threads. Thus, assuming sequential onsisten y during program veri ation of an invariant property might lead to an observable violation of the invariant in a veri ed- orre t program on some exe ution platforms !! Examples of su h programs even in lude some popular multithreaded Java software onstru tion idioms su h as the \Double-Che ked Lo king" idiom [29℄. There ould be several solutions to this problem. First, we ould develop a restri ted fragment of Java programs for whi h the JMM guarantees sequentially onsisten y [2℄. Programmers are then en ouraged to write programs only in this fragment. Se ondly, we ould hange the JMM altogether (this is being seriously onsidered by an expert group, see [1℄). Finally, we ould develop an exe utable formal des ription of the JMM and in orporate it into program veri ation. Let us now study ea h of these solutions in depth. In the rst solution, the fragment of Java programs for whi h the exe ution will always appear to be sequentially onsistent are the so- alled \properly syn hronized programs" or \data-ra e-free programs" [3℄. Intuitively, these programs ensure that whenever a thread a
esses a shared obje t, it possesses a lo k to the obje t. There are two diÆ ulties with this solution. First, even if the user's program is \properly syn hronized", he/she might use software libraries from untrusted sour es whi h are not guaranteed to be \properly syn hronized". Moreover, demanding syn hronization for every shared obje t a
ess is known to ause una
eptable performan e overheads in pra ti e [3, 21℄. For the purposes of program veri ation, the se ond solution also has ertain diÆ ulties. First of all, even if the JMM is hanged, the new memory model will not be implemented for some time to ome. Existing Java Virtual Ma hines (JVM) will ontinue to be used on unipro essor as well as multipro essor platforms. Therefore, hanging the JMM now will not solve the problem of the Java programmer for many years to ome. Moreover, the two on rete proposals for an improved JMM [22, 23℄ (whi h were proposed very re ently, and are now being hotly debated) are also weaker than sequential onsisten y. In fa t sin e the Java memory model des ribes all possible program behaviors on all possible platforms, it is unrealisti to de ne a Java memory model whi h enfor es sequential onsisten y. In this paper, we advo ate the third solution. We ompose a formal des ription of the JMM along with a formal model of the program for the purposes of program veri ation. Java's \write-on e, run-everywhere" strategy makes it important to develop programs whi h do not behave differently on dierent platforms. Moreover, a veri ed- orre t program behaving in orre tly on ertain platforms is of parti ular on ern! As the JMM aptures all possible behaviors, in orporating it into program veri ation allows platform in-
dependent reasoning. However to in lude the JMM in program veri ation, we have to take note of the following issues. First, the JMM [16℄ is informally des ribed in a de larative style. It spe i es
ertain rules that must never be violated in a multithreaded exe ution. In other words, the model is neither operational nor exe utable. This makes the JMM almost impossible to reason with (see [28℄ for the omplexities of informal reasoning about the JMM). We develop an exe utable spe i ation of the JMM in this paper. Se ondly, program veri ation via model he king suers from the state spa e explosion problem. Composing the memory model along with the program model further blows up the state spa e to be explored. However, note that the Java memory onsisten y model oin ides with sequential
onsisten y for \properly syn hronized" programs, that is, programs where any a
ess to shared data is pre eded by expli it syn hronization. In reality, a signi ant portion of a multithreaded Java program is \properly syn hronized". The \unsyn hronized" portion using the performan e enhan ing features of the JMM (whi h allow reordering of operations within a thread) mostly appear in \low-level" program fragments. These are widely used software onstru tion idioms performing a spe i task. These program fragments are typi ally exe uted many times in the ourse of program exe utions. Hen e they are optimized by avoiding expli it syn hronization for every shared data a
ess. We have used our exe utable memory model to debug and verify su h program fragments.
Summary of Results. Con retely, our ontributions are:
We develop a formal exe utable spe i ation of the JMM: the rules for implementing multithreading in the Java programming language. Our operational style spe i ation des ribes all possible behaviors that any implementation of Java multithreading an exhibit.
Our approa h of (a) onstru ting an exe utable JMM and (b) omposing it with the program model for program veri ation, is ompletely generi . It is not tied to the urrent JMM.
We have used our exe utable model of JMM to develop a prototype invariant he ker. This tool is parti ularly useful for verifying unsyn hronized program fragments. Our he ker has been used to dete t a bug in the widely used \Double-Che ked Lo king" software
onstru tion idiom [29℄.
Finally, our formal JMM spe i ation alleviates the diÆ ulty in understanding the urrent JMM [16℄. The rule-based JMM has been des ribed as \very hard to understand" [3, 28℄ and most reasoning has been done via informal ounter-example onstru tion.
Organization. The rest of the paper is organized as follows. In Se tion 2, we dis uss related work on the JMM and program veri ation. Se tion 3 dis usses the informal spe i ation of the JMM, while Se tion 4 presents the formal spe i ation. Se tion 5 dis usses appli ations of our JMM spe i ation in program veri ation. Finally, we dis uss the broad impli ations of our work in Se tion 6.
2. RELATED WORK Veri ation of Java programs has been studied extensively. Spe i ally, signi ant progress has been a hieved re ently in multithreaded Java program veri ation [6, 11, 14, 19, 25, 26, 30℄. Out of these works, [11, 19, 25℄ extra t formal model from Java sour e ode and analyze the formal model, while [6, 14, 30℄ propose te hniques to dire tly analyze the sour e ode by modifying the state spa e sear h algorithm. These works appear at a higher level of abstra tion than our work. They assume a simple exe ution model of sequential onsisten y and develop algorithms to analyze sequentially onsistent exe ution tra es of a program. Our work on entrates on formalizing the underlying exe ution model, but does not address the issue of state-spa e sear h algorithms. Re ently, some resear h has been dire ted towards developing exe utable models of the JVM [4, 24℄. In parti ular, Moore [24℄ develops a formal model of a multithreaded JVM and advo ates its use for verifying Java programs. Here, the only dieren e from onventional program veri ation is that instead of sour e ode veri ation, the byte- ode is veri ed. This work still suers from the problems we dis ussed earlier: the reasoning performed is platform dependent be ause a spe i JVM is formalized (whi h enfor es sequential onsisten y). Any platform independent veri ation of Java programs must take into a
ount the JMM. The JMM has been a topi of intense resear h in the past few years. The informal model was rst developed in the Java Language Spe i ation [16℄. Pugh [28℄ rst pointed out the diÆ ulties in informally reasoning about the model and suggested hanges. Subsequently, resear hers have proposed several improvements to the model [22, 23℄. Contrary to these works, our work does not address the question: \What should be the semanti s for multithreaded Java" ? Instead, it argues that multithreaded Java semanti s (the urrent one or any future improvement) should be in orporated into Java program veri ation. Sin e the in eption of the JMM, several formalizations of Java on urren y have been proposed, [5, 8, 15, 17℄ to name a few. Some of these [5℄ fo us only on language level on urren y onstru ts without onsidering the memory model. Some others [8, 15℄ onstru t non-exe utable spe i ations of the memory model. Most importantly, the goal of all these works is to have a lear understanding of Java on urren y (via formal spe i ation) and then perform human reasoning. Our goal is dierent. We have developed an exe utable formal JMM spe i ation for (semi)-automated reasoning about Java programs. This allows us to verify nontrivial software fragments, whi h would be extremely umbersome to perform with human reasoning. Developing exe utable memory models has been studied in the ontext of hardware multipro essors [13, 27℄. Similar to Java threads, hardware shared-memory multipro essors also impose a onsisten y model whi h di tates the allowed intera tions among the pro essors via a shared memory.
3. THE JAVA MEMORY MODEL In this se tion, we present the Java Memory Model (JMM) given in [16℄. The model is abstra t and is not onstru ted as a guide for implementing Java multithreading. Rather any Java multithreading implementation is supposed to allow only behaviors allowed by the model. We onstru t a formal
exe utable des ription of the model in the next se tion. The Java threads intera t among themselves via shared variables. For any shared variable v, ea h thread (a) possesses a lo al opy of v and (b) is allowed to a
ess the global master opy of v in main memory. The JMM essentially imposes onstraints on the intera tion of the threads with the master opy of the variables and thus with ea h other. The model de nes the following a tions for reading/writing the lo al/master opy of v in thread t. uset (v): Read from the lo al opy of v in t assignt(v): Write into the lo al opy of v in t readt(v): Initiate reading from master opy of v to lo al opy of v in t. loadt(v): Complete reading from master opy of v to lo al opy of v in t. storet (v): Initiate writing the lo al opy of v in t into master opy of v writet (v): Complete writing the lo al opy of v in t into master opy of v Apart from the above a tions, ea h thread t may perform lo k/unlo k on shared variables, denoted lo kt and unlo kt respe tively. When a thread exe utes a virtual ma hine instru tion that uses/assigns the value of a variable, it a
esses the lo al opy of that variable. Before unlo k, the lo al opy is transferred to the master opy through store and write a tions. Similarly, after lo k a tion the master opy is transferred to the lo al opy through read and load a tions. Given the above de nitions, we an now onsider a multithreaded program as \properly syn hronized", if every a
ess to a shared variable o
urs between a lo k and its orresponding unlo k. Two important points need to be noted here. First, the lo al opies of shared variables on eptually form a thread's private \ a he". Se ondly, data transfer between the lo al and the master opy is not modeled as an atomi a tion. This is to model the realisti transit delay when the master
opy is lo ated in the hardware shared memory and the lo al
opy is in the hardware a he. Among the eight a tions mentioned above, a thread in a Java program invokes only four of them: use, assign, lo k, and unlo k. Ea h thread invokes these a tions in its program order. The other four (load, store, read, and write) are invoked arbitrarily by the multithreading implementation, subje t to temporal ordering onstraints spe i ed in the JMM. A major diÆ ulty in reasoning about the JMM (as reported in literature [28℄) seems to be these ordering onstraints. They are given in an informal, rule-based, de larative style. It is diÆ ult to reason how multiple rules determine the appli ability/non-appli ability of an a tion. Our operational spe i ation avoids this diÆ ulty by modeling ea h a tion as a guarded ommand. Details appear in the next se tion. We on lude this se tion by brie y explaining why the JMM is weaker than sequential onsisten y. Note that the threads annot dire tly invoke a tions whi h modify the master opy of a shared variable. Therefore, modi ations to the master opy of a shared variable an omplete out-oforder. As a result, writes to shared variables are not seen by all threads in the same order. For example in the following program with two threads:
(assign u, 1 ; assign v, 2) k Op The following is a legal tra e of the program: assign u, 1; % Lo al writes to u assign v , 2; % Lo al writes to v store v , 2; write v , 2; % Write master opy of v Op % thread 2 exe utes here store u, 1; write u, 1 % Write master opy of u In this tra e, the write operations of the rst thread do not
omplete in program order. In fa t, when the se ond thread exe utes Op it an observe (via reads) the old value of u, and new value of v. This is never possible under sequential
onsisten y.
4. JMM SPECIFICATION This se tion presents a formal exe utable spe i ation of the Java Memory Model (JMM). Our spe i ation style is operational. In parti ular, we des ribe ea h a tion in the JMM as a guarded ommand. First we present an exe utable spe i ation of the ore memory model onsisting of eight a tions. A proof of equivalen e of our exe utable formal des ription of the JMM with the rule-based de larative des ription in the Java language spe i ation [16℄ appears in the appendix.
4.1 Core Memory Model Our model is an asyn hronous on urrent omposition of
n Java threads Th1 ; : : : ; Th and a single main memory pron
ess MM. Communi ation among pro esses takes pla e via shared data. Ea h pro ess an perform a set of a tions, ea h of whi h is modeled by a guarded ommand. The asyn hronous on urrent omposition of these pro esses is the union of the guarded ommands of the onstituent pro esses.
Local States. We now pro eed to des ribe the lo al states of the Thi and MM pro esses. Then, we formally des ribe the a tions whi h the threads and the main memory pro esses exe ute. Note that the threads ommuni ate via shared program variables fv1 ; : : : ; vm g; we denote the type of vj as j . The program variables are not part of the lo al states of Thi or MM. Rather thread Thi maintains a lo al opy of ea h program variable vj , while MM maintains the master opy. Set of Shared Variables = fv1 ; v2 ; : : : ; v g v1 : 1 , v2 : 2 , : : :, v : T hread state = (Ca he , Rd qs , W r qs ) Ca he = [ a he 1 ; : : : ; a he ℄ Rd qs = [rd q 1 ; : : : ; rd q ℄ W r qs = [wr q 1 ; : : : ; wr q ℄
a he = (rvalue ; dirty ; stale ) rvalue : dirty ; stale : Boolean rd q ; wr q : Queue of m
m
m
i
i
i
i;
i
i
i;j
i;m
i;
i;m
i;j
j
i;j
i
i;m
i;
i;j
i;j
i
i;j
i;j
i;j
i;j
j
The lo al state of a thread pro ess Thi an be des ribed by a 3-tuple (Ca hei , Rd qsi , W r qsi ) as shown above. Ca hei
ontains the lo al opy of the shared variables (it need not
orrespond to a physi al a he). Rd qsi and W r qsi ea h denote exa tly m queues, one for ea h shared variable. The lo al opy of the shared variable vj in Thi is des ribed by a hei;j = (rvaluei;j ; dirtyi;j ; stalei;j ). The rst omponent rvaluei;j is the value of vj in the lo al opy of Thi .
The se ond omponent dirtyi;j is a bit indi ating whether the lo al opy of vj is dirty, that is, there is an assignment to vj by Thi whi h is not yet visible to other threads (via store, write a tions). The third omponent stalei;j is a bit indi ating whether the lo al opy is stale, that is, the lo al
opy does not re e t re ent write(s) whi h is (are) visible to some other threads. As mentioned before, read/write of the master opy of a variable is not modeled as atomi operation. A read a tion need not immediately pre ede its orresponding load a tion and a write a tion need not immediately follows its
orresponding store a tion. The set of queues Rd qsi and W r qsi model this transit delay. Queue rd qi;j ontains values of the variable vj as obtained (from master opy) by Thi 's read a tions, but for whi h the orresponding load a tions (to update the lo al opy) are yet to be performed. Similarly, queue wr qi;j ontains values of the variable vj as obtained (from lo al opy) by Thi 's store a tions, but for whi h the orresponding write a tions (to update the master opy) are yet to be performed. The lo al state of the main memory pro ess MM is a pair (Memvals, Lo k state). Memvals are the values of the master opy of shared variables: mvalj denotes the value of the shared variable vj in the main memory. The variable Lo k state re ords, for ea h thread, the number of lo k a tions exe uted for whi h the mat hing unlo k a tions are yet to o
ur; lo k nti is a natural number. If thread i has exe uted l lo k a tions for whi h the mat hing unlo k a tions have not o
urred, then lo k nti = l.
MM state = (Memvals, Lo k State) Memvals = [ mval1 , mval2 , : : :, mval ℄ Lo k state = [ lo k nt1 , lo k nt2 , : : :, lo k nt ℄ mval : lo k nt : nat m
n
j
j
i
The JMM enfor es
8i; j (i 6= j ) (lo k nt
i
= 0 _ lo k ntj = 0))
as an invariant. In other words, at most one thread an possess a lo k at any given time. This does not prevent unsafe a
esses of shared variables, be ause a thread may not a quire a lo k before a
essing shared variables (as is the ase in unsyn hronized program fragments). In this paper, we onsider only a single lo k. This an be extended straightforwardly to the ase of multiple lo ks.
Actions. Figure 1 formally des ribes the eight dierent a tions performed by Thi and MM as mentioned in JLS [16℄. At any time step, these pro esses an exe ute either a program a tion or a platform a tion whi h are de ned below. Definition 1 (Program A tion). An a tion invoked by the program running as thread Thi is alled a program a tion. The a tions usei , assigni , lo ki , and unlo ki are program a tions. Definition 2 (Platform A tion). An a tion whi h is performed by the underlying multithreading implementation is alled a platform a tion. The a tions (loadi , storei , readi , and writei ) are platform a tions.
Typi ally, the purpose of exe uting platform a tions is to enable those program a tions whi h are urrently disabled.
A tion usei (j ) :
:stale ! return a he i;j
i;j
A tion assigni (j; val) : empty(rd qi;j ) ! a hei;j := val; dirtyi;j := true; stalei;j := false A tion loadi (j ) : i;j ) ! a hei;j := dequeue(rd qi;j ); stalei;j := false
:empty(rd q
A tion storei (j ) : dirtyi;j ^ empty(rd qi;j ) ^ :full(wr qi;j ) ! enqueue( a hei;j , wr qi;j ); dirtyi;j := false A tion readi (j ) :
:dirty ^ empty(wr q i;j
i;j
) ^ :full(rd qi;j ) ! enqueue(mvalj , rd qi;j )
A tion writei (j ): :empty(wr qi;j ) ! mvalj := dequeue(wr qi;j ) A tion lo ki : (8k 6= i lo k ntk = 0)
V 81 j m (empty(rd q ) ^ :dirty i;j
i;j
)!
lo k nt := lo k nt + 1; for j := 1 to m do stale i
i
i;j
:= true
A tion unlo kV i: lo k nti > 0 81 j m (empty(wr qi;j ) ^ :dirtyi;j ) ! lo k nti := lo k nti
1
Initial Conditions: 81 i n; lo k nti = 0 81 i n; 81 j m :dirtyi;j ^ stalei;j ^ empty(rd qi;j ) ^ empty(wr qi;j ) Figure 1: A tions in the ore memory model
We model ea h a tion as a guarded ommand of the form
G ! B, where the guard G is rst evaluated; if G is true, then the body B is exe uted atomi ally. The guarded- ommand
notation for des ribing on urrent systems has been popularized by many resear hers in luding Chandy and Misra in their Unity programming language [9℄. We denote a tion usei (j ) as a use a tion on shared variable vj by Thi ; similarly for assign, load, store, read, and write. The a tion lo ki denotes lo king of all shared variables by Thi ; similarly for unlo ki .
Understanding the JMM. We now explain the diÆ ulty in understanding/reasoning about the rule-based JMM and how our guarded- ommand spe i ation over omes that dif ulty. Typi ally several rules of the rule based JMM ontribute to the appli ability of an a tion. Thus it is diÆ ult to omprehend the appli ability ondition of an a tion. Our formal model makes this appli ability ondition expli it via the guards in ea h a tion. In the following, we give one example to illustrate this point. We use the notation < to denote the temporal ordering relation among a tions. In the JMM, no rule dire tly prevents assigni (j ) to take pla e between a readi (j ) and the orresponding loadi (j ). However, it is prevented by the intera tion among three different rules of the JMM. One rule requires read, load and store, write to be uniquely paired, where: readi (j ) < loadi (j ) and storei (j ) < writei (j ):
Another rule states that a store must invervene between an assign and a load a tion. assigni (j ) < loadi (j ) ) assigni (j ) < storei (j ) < loadi (j )
Yet another rule ensures that storei (j ) < loadi (j ) ) writei (j ) < readi (j ) where writei (j ) (readi (j )) is the write (read) orresponding to storei (j ) (loadi (j )). Thus, from these three rules we get
assigni (j ) < loadi (j ) ) assigni (j ) < storei (j ) < writei (j ) < readi (j ) < loadi (j )
In other words, we infer that an assigni (j ) annot take pla e between a readi (j ) and the orresponding loadi (j ). This restri tion is expli itly stated in our spe i ation with empty(rd qi;j ) as the guard for assigni (j ) a tion.
4.2
Volatile Variables
In this se tion, we extend our memory model to handle volatile variables. The Java Language Spe i ation (JLS) [16℄ des ribes a variable v as volatile, if every a
ess of v by a thread leads to an a
ess of the master opy of v in the main memory. In other words, the notion of volatile variables disables the ee t of a hing. In addition to the shared program variables des ribed in the previous se tion, let fvm+1 ; : : : ; vm+k g be volatile variables of type vol 1 . First, we extend the lo al states of the thread and main memory pro esses to in lude states for the volatile variables. Here the main dieren e is that we do not have separate read and write queues for ea h volatile variable. Instead, the reads of all volatile variables for Thi are re orded in a single queue vol rd qi , similarly for writes. This models the requirement that not only the memory a
esses of the same volatile variable but also those of dierent volatile variables should pro eed in order. 1 All volatile variables are assumed to be of same type; the model an be easily extended if they are of dierent types.
Ca he = [ a he 1 ; : : : ; a he + ℄ Memvals = [ mval1 ; mval2 ; : : : ; mval + ℄ Rd qs = [ rd q 1 ; : : : ; rd q ; vol rd q ℄ W r qs = [ wr q 1 ; : : : ; wr q ; vol wr q ℄ vol rd q ; vol wr q : Queue of (VolVarId; ) VolVarId : m + 1; : : : ; m + k i
i;
i;m
k
m
i
i;
i
i;m
i;
i
i;m
i
k
i
i
vol
We des ribe the a tions on volatile variables. We denote the use of volatile variable vj by Thi as use volatilei (j); similarly for other a tions. The extension of read and write in presen e of volatile variables is straightforward. Instead of updating the read (write) queue of an individual nonvolatile variable we now update vol rd qi (vol wr qi ). For lo ki and unlo ki a tions, instead of he king the read and write queues of only non-volatile variables, we he k the read and write queues of both volatile and non-volatile variables. The extensions for other a tions (shown in Figure 2) are more involved. Note that ea h use/assign of a volatile variable requires a main memory a
ess, that is, load/store. Moreover, the load must immediately pre ede an use and the store must immediately follow an assign. Thus, stalei;j is true after every a
ess of the lo al opy of the volatile variable vj in Thi ; this for es the next a
ess to go to main memory. Also, dirtyi;j is true if Thi has performed exa tly one update (via assign a tion) on the lo al opy of volatile variable vj whi h is not yet propagated to the master opy. Multiple updates of the lo al opy of a volatile variable is not possible without updating the master opy.
4.3 Prescient Stores The JLS also allows pres ient stores | that is, a store whi h o
urs before the assign. This optimization is allowed only if the value that is written by assign is known beforehand. We de ne a pres ient store as pending if the store has taken pla e, but the orresponding assign has not yet taken pla e. Pres ient stores are allowed only for non-volatile variables. To in orporate pres ient stores into our memory model of Se tion 4.1, we extend the thread state. We add to a hei;j an extra state variable pres ienti;j . The type of pres ienti;j is fnilg [ j . Thus pres ienti;j = nil if there is no pending pres ient store on variable vj by thread Thi ; otherwise pres ienti;j holds the value of the pending pres ient store on vj by Thi . We de ne a new a tion pres ient storei (j) A tion pres ient storei (j ) : empty(rd qi;j ) ^ :full(wr qi;j ) ! pi k val 2 j ; enqueue(val, wr qi;j ); pres ienti;j := val; dirtyi;j := false Note that we have weakened the guard of storei (j) by removing the ondition dirtyi;j = true. This is be ause a pres ient store pre edes an assign whi h sets the dirty bit. The assign a tion is modi ed to ensure that the assign writes the same value as the orresponding pres ient store. The modi ation re e ts both: (a) a normal assign as shown in Se tion 4.1 when pres ienti;j = nil, (b) a delayed assign for a pre eding pres ient store where pres ienti;j 6= nil. In the se ond ase, we do not set the dirtyi;j bit; this prevents an unne essary store a tion following the delayed assign. A tion assigni (j; val W) : pres ienti;j = val (pres ienti;j = nil ^ empty(rd qi;j )) !
a hei;j := val; stalei;j := false; if pres ienti;j = nil then dirtyi;j := true; pres ienti;j := nil
Note that, pres ienti;j 6= nil is true only between a pres ient store and the orresponding delayed assign. A
ording to the JLS [16℄, no lo k, load, or store a tions an o
ur between a pres ient store and a delayed assign. This is ensured in our model by strengthening the guards of lo ki , loadi (j ) and storei (j ) with the ondition pres ienti;j = nil.
4.4
Waiting and Notification
Java supports the feature of waiting and noti ation. A thread Thi , whi h has a quired the lo k, may voluntarily release it via a wait. Thi is added to the set of waiting threads. Subsequently, Thj (i 6= j ) a quires the lo k and de ides to notify one (or more) of the threads from the list of waiting threads, possibly Thi . Thread Thi , however, an pro eed only after Thj (the urrent owner of the lo k) releases the lo k. To model waiting and noti ation, we extend the lo al state of MM with a state variable W ait set : set of T hread id. Also, we onjoin the ondition i 62 W ait set to the guards of the a tions usei , assigni , and lo ki . This prevents a waiting thread from progressing. The guard of unlo ki is not
hanged, as a waiting thread must be allowed to unlo k (so that other threads an progress). The other a tions (load, store, read, and write) orrespond to a tions taken by the JVM implementation. They are not dire tly red by the Java program and therefore their guards are unae ted. To model the three well-known syn hronization onstru ts wait, notify, and notifyAll, we add a tions or guarded
ommands to our model. The des ription of these a tions follows dire tly from the standard notions of waiting, resumption and noti ation. Details are omitted for spa e
onsiderations.
5. VERIFYING PROGRAMS In this se tion, we dis uss how our exe utable Java Memory Model an be used for verifying on urrent Java programs. For this purpose, we rst dis uss how ea h thread is modeled and how the threads are s heduled. Subsequently we dis uss te hniques for alleviating the state spa e explosion problem.
Modeling each thread. Given a multithreaded program k Th2 k : : : k Thn , we model ea h thread as follows:
Th1
read of a variable a is onverted to a tion use(a) write of a variable a is onverted to a tion assign(a) any ode fragment marked as syn hronized is pre eded by lo k and is su
eeded by unlo k.
The reader will observe that we have not dis ussed the modeling of program expressions through use and assign a tions. To model the statement = a + b, we must exe ute use(a); use(b) followed by assign( , v) where is the addition of the values returned by use(a) and use(b). This an be a
ommodated by (1) extending the exe utable model with a register set R to hold the values returned by use a tions, (2) extending assign to be of the form assign(V , E ) where V is a shared variable and E is an expression ontaining registers in R. The exa t des ription of possible expressions depends on the type of the shared variables. R
R
A tion use volatilei (j ) :
:stale ^ :dirty ! stale i;j
i;j
i;j
:= true; return a hei;j
A tion assign volatilei (j; val) : :dirtyi;j ^ stalei;j ^ empty(vol rd qi ) ! a hei;j := val; dirtyi;j := true; stalei;j := false A tion load volatilei (j ) :
:dirty ^ stale ^ :empty(vol rd q ) ! (j; val) := dequeue(vol rd q ); a he i;j
i;j
i
i
i;j
:= val; stalei;j := false
A tion store volatilei (j ) : dirtyi;j ^ :stalei;j ^ empty(vol rd qi ) ^ :full(vol wr qi ) ! enqueue((j; a hei;j ), vol wr qi ); dirtyi;j := false; stalei;j := true Figure 2: A tions for a
essing volatile variables
Scheduling the threads. At ea h time step, any one thread exe utes either a program a tion or a platform a tion (refer De nitions 1 and 2 in Se tion 4.1). Be ause there an be several enabled a tions at any given time, we an adopt a s heduling strategy to rule out ertain behaviors. For example, given a multithreaded program Th1 ; : : : ; Thn our s heduling an pro eed as follows:
If the next program a tion of any thread is enabled then pi k one su h thread Th ; i
exe ute the next program a tion of Thi else pi k a thread Thj with enabled platform a tion; exe ute any enabled platform a tion of Thj .
The above poli y portrays the situation that platform a tions are exe uted only to enable program a tions.2 The above s heduling algorithm does not guarantee sequential
onsisten y, even though the program a tions are started in program order in ea h thread. Re all that in the JMM, exe ution of a program a tion does not update the shared memory. The \ee t" of the program a tions are updated to the shared memory via the platform a tions whi h may
omplete out-of-order.
Invariant Checker. To verify an invariant (property that must hold in every state of every exe ution tra e), we exhaustively he k the states of every exe ution tra e. Our invariant he ker has been implemented on top of a memoized logi programming system XSB [32℄. Be ause our model is expressed in guarded- ommand notation, the Mur' model he ker [12℄ is a andidate implementation vehi le as it supports a guarded- ommand{based spe i ation language. However, note that in the veri ation of any multithreaded program, it is suÆ ient to he k only those exe utions whi h are generated by our s heduling strategy. In other words, we want to program (i.e. prune) the traversal strategy of the sear h spa e of multithreaded exe utions. This programming apability is very naturally supported in a general purpose logi programming system where omputation pro eeds by sear h. A prototype he ker based on our exe utable memory model has been built using the XSB logi programming system. The he ker ould be used in two modes. Either we ould sear h the entire sear h spa e onsisting of all allowed exe ution tra es of program a tions and platform a tions in the threads of a program; or we ould 2 We an relax this strategy to allow ertain platform a tions (su h as store, write) to pro eed even in the presen e of enabled program a tions.
input rules to prune the sear h spa e based on some s heduling algorithm. In the following, we dis uss some te hniques for further redu ing the state spa e explosion.
State Space Reduction. Our exe utable memory model maintains elaborate state information for ea h thread: the lo al a he, as well as the read/write queues. Therefore,
omposing the model of full-blown Java programs along with the underlying Java memory model an result in a tremendous state spa e explosion. To alleviate this problem, we propose the use of our exe utable model for program veri ation as follows. In a multithreaded program Th1 k Th2 k : : : k Thn the user hooses only one program path in ea h thread Thi . A program path essentially en odes a hoi e in every ontrol bran h, and ea h of these hoi es impose onstraints on program variables. Note that the program path that is hosen in thread Thi need not be bounded, for example, a nitely represented unbounded loop an be hosen in Thi . To represent the onstraints on program variables imposed by a program path, we extend the use a tion. The syntax of use is extended to represent onstraints on the value returned by a use a tion, for example, use(a) = 0. We do not spe ify the onstraint domain here as this is not entral to our methodology. For the purposes of this paper's illustration, it suÆ es to onsider arithmeti equality and inequality
onstraints. Thus, if the onstraint use(a) = 0 appears in a thread, then it means that (1) use(a) is exe uted to return a value v, and (2) the he k v = 0 is performed, all in one atomi step. Then we exhaustively he k all possible exe ution tra es made from the hosen program paths of Th1 ; : : : ; Thn , whi h are allowed by the JMM. Our approa h is motivated by the fa t that reasoning about the exe ution tra es allowed by the JMM requires low-level understanding of the a tions/datastru tures of the JMM, and needs to be automated. However, reasoning about the program paths of a thread Thi requires understanding the sour e ode of Thi . In parti ular, the user will hoose program paths 1 ; : : : ; n in threads Th1 ; : : : ; Thn if he/she suspe ts a legal tra e of 1 ; : : : ; n to violate the invariant being veri ed. This is a reative step, but still does not require the user to reason about the JMM. This task is left to the invariant he ker whi h automati ally
on rms/refutes the user's suspi ion. Case Study. We now illustrate the he ker's use in nding a bug in a ommonly used software onstru tion idiom
Thread 1 use(Inst) = null. lo k. use(Inst) = null. assign(Data, newval). assign(Inst, newptr). unlo k. Y := use(Data). assign(Ret, Y).
Thread 2
use(Inst)6= null. Y := use(Data). assign(Ret, Y).
Figure 3: Program paths in two threads running Double-Che ked Lo king
of multithreaded Java: the Double-Che ked Lo king idiom. Double-Che ked Lo king [29℄ is a widely used pattern in multithreaded Java programs (see [18℄ for a dis ussion of its use). This program fragment is used for eÆ ient lazy instantiation of a singleton lass. A singleton lass is a lass with only one instan e; for multithreaded programs this instan e is shared by multiple threads. Double-Che ked Lo king is a program fragment for instantiation in whi h (a) only one instan e is generated, and (b) the instan e is generated only on-demand. Consider a method getInstan e whi h instantiates a singleton lass Singleton. Clearly, getInstan e must he k whether an instan e already exists, before reating an instan e. This is to ensure that only one instan e is generated. In a multithreaded program however, this is not enough. To avoid multiple instantiations of the Singleton lass by multiple threads, the getInstan e method must be exe uted as a riti al se tion. This is a hieved by syn hronization, as shown in the following program fragment. Note that this program fragment will be run by multiple threads. private stati Singleton instan e = null; .... // the other fields publi stati syn hronized Singleton getInstan e() { if (instan e == null) instan e = new Singleton(); return instan e; }
However, there is a substantial performan e overhead for syn hronizing on every invo ation of getInstan e. DoubleChe ked Lo king [18, 29℄ is an eÆ ient s heme whi h avoids su h syn hronization. Note that after the reation of an instan e of the Singleton lass is ompleted, there is no need to syn hronize; any invo ation of getInstan e should simply return this instan e. Double-Che ked Lo king avoids these redundant syn hronizations. A program fragment implementing Double-Che ked Lo king is shown in Figure 4. Any thread whi h invokes getInstan e will exe ute this program fragment. Note that if instan e is null (i.e., an instan e of the Singleton lass has not yet been reated), then the program fragment for es syn hronization and he ks whether instan e is null again within the riti al se tion. In between the rst instan e == null he k and the syn hronization, another thread may invoke the method getInstan e, nd that instan e is null, and then
reate an instan e of the Singleton lass. Hen e the need for the se ond instan e == null he k. We have not shown the other elds of Singleton, whi h get initialized in the onstru tor of the Singleton lass. We
private stati Singleton instan e = null; .... // the other fields publi stati Singleton getInstan e() { if (instan e == null){ syn hronized (Singleton. lass) { if (instan e == null) instan e = new Singleton(); } } return instan e; } Figure 4: Double-Che ked Lo king
want to he k that when multiple threads run getInstan e
on urrently, any invo ation of getInstan e always returns an initialized obje t, that is, the elds of the obje t returned by getInstan e are not uninitialized garbage. For this purpose, it is suÆ ient to onsider only one eld of the obje t
alled datafield, whi h we assume to be initialized in the
onstru tor. When several threads run getInstan e on urrently, one thread allo ates a Singleton obje t and returns it, while other threads simply return the already allo ated obje t. To show this, we an onstru t the program paths shown in Figure 3 with two threads running on urrently. Thread 1 allo ates the Singleton and returns it, while Thread 2 returns the Singleton whi h has been allo ated by Thread 1. In gure 3, Inst and Data denote two shared memory lo ations ontaining instan e and datafield of the Singleton obje t. The lo ation Ret holds the value of o.datafield where o is the obje t returned by getInstan e. We ould have modeled two dierent lo ations Ret1, Ret2 to hold the values returned by the dierent threads. Modeling them as the same lo ation only simpli es the invariant to be proved. Initially, Inst = null, Ret = null and Data = garbage. We need to prove that Ret 6= garbage is an invariant. To ensure that this property holds in every multithreaded implementation, we must show that for every exe ution tra es allowed by the JMM, a state in whi h Ret = garbage is never rea hable. This is a
omplished automati ally by our invariant he ker. The program paths shown in Figure 3 are input to the he ker. The he ker yields a ounterexample in only 0:15 se onds on a Pentium-4 1.3 GHz workstation with 1 GB of memory. In other words, the he ker generates a tra e where the obje t returned by getInstan e an ontain garbage values in the data elds. This shows that the
Double-Che ked Lo king program fragment is unsafe to use in a multithreaded environment. A ounter-example tra e
onstru ted by the he ker is: read(Inst), load(Inst), use(Inst) = null lo k read(Inst), load(Inst), use(Inst) = null assign(Data, newval), assign(Inst, newptr) store(Inst), write(Inst) read(Inst), load(Inst), use(Inst) = null read(Data), load(Data), Y := use(Data) assign(Ret, Y), store(Ret), write(Ret) 6
Thread 1 Thread 1
re-orderings, whose ee t is not visible by other threads, are allowed to pro eed. This provides the eÆ ien y of a weak memory onsisten y model while maintaining the programmer's intuitive abstra tion of a single shared memory, as in sequential onsisten y.
Thread 1 Thread 1 Thread 1 Thread 2 Thread 2 Thread 2
This orresponds to the situation where Thread 1 reates an instan e by setting Data and Inst. This updates the lo al
opies of Data and Inst. The master opy of Inst is then updated. Be ause now Inst 6= null, thread 2 exe utes; it reads the master opy of Data and assigns this value to Ret. However, the master opy of Data has not been updated yet (i.e., the write on o.datafield by Thread 1 has not
ompleted), whi h auses Ret = garbage. Thus, if the writes of Data and Inst are re-ordered then the Double-Che ked Lo king program fragment is unsafe to use for multithreaded programs. This re-ordering is allowed by the JMM and will be performed in many multi-pro essor implementations, for example, SUN SPARC, DEC Alpha. To safely use the Double-Che ked Lo king program fragment on su h implementations, we need to turn o this reordering by expli itly inserting a memory-barrier instru tion in the onstru tor of Singleton. This memory-barrier instru ts the underlying implementation not to re-order operations a ross the barrier.
6. DISCUSSIONS In this paper, we have used formal spe i ation and veri ation te hniques to analyze multithreaded Java programs. Our work is on entrated on formally spe ifying the Java Memory Model(JMM), the rules imposed by the Java language spe i ation for any implementation of multithreading. We demonstrate (with a on rete ase study) why reasoning about the JMM is ne essary to verify multithreaded Java programs in a platform-independent fashion. Even though this paper has fo used on the JMM, the approa h an apply to any multithreaded programming dis ipline. Typi ally, veri ation te hniques for multithreaded programs assume a sequentially onsistent exe ution model. The fo us there is on the automation/eÆ ien y of sear hing the sequentially onsistent exe ution tra es. However, multithreaded programming languages (su h as Java) might impose weaker onsisten y models in order to allow for eÆ ient implementations. This raises the question of generating a formal exe utable spe i ation of these weak onsisten y models. Weak memory onsisten y models [3℄ have traditionally been des ribed de laratively as a set of rules. Constru ting an equivalent exe utable formal model serves many purposes: understanding the onsisten y model, using the onsisten y model to aid veri ation of multithreaded programs. In this paper, we have undertaken this approa h for a realisti multithreaded programming language (Java), and explored its utility. Our te hnique an be used to dete t reorderings whi h produ e ounter-intuitive results in the exe ution of a multithreaded program | that is, break the programmer's intuition of sequential onsisten y. These reorderings an then be expli itly disabled. The rest of the
7. ACKNOWLEDGMENTS This work was partially supported by National University of Singapore Resear h Proje t R-252-000-095-112.
8. REFERENCES [1℄ Java Spe i ation Request (JSR) 133. Java Memory Model and Thread Spe i ation revision. In http://j p.org/jsr/detail/133.jsp, 2001. [2℄ S. Adve. Memory model tutorial. In Revising the Java Thread Spe i ation Workshop, OOPSLA, 2000. [3℄ S.V. Adve, V.S. Pai, and P. Ranganathan. Re ent advan es in memory onsisten y models for hardware shared-memory systems. IEEE spe ial issue on distributed shared-memory, 87(3), 1999. [4℄ G. Barthe et al. A formal exe utable semanti s of the Java ard platform. In European Symposium on Programming, LNCS 2028, 2001. [5℄ E. Borger and W. S hulte. A programmer friendly modular de nition of the semanti s of Java. In Formal Syntax and Semanti s of Java, LNCS 1523, 1999. [6℄ D.L. Bruening. Systemati testing of multithreaded Java programs. Master's thesis, MIT, 1999. [7℄ David R. Butenhof. Programming with POSIX threads. Addison Wesley, 1997. [8℄ P. Cen iarelli et al. An event based stru tural operational semanti s of multithreaded Java. In Formal Syntax and Semanti s of Java, LNCS 1523, 1999. [9℄ K. Mani Chandy and J. Misra. Parallel Program Design: a foundation. Addison Wesley, 1988. [10℄ E.M. Clarke, E.A. Emerson, and A.P. Sistla. Automati veri ation of nite-state on urrent systems using temporal logi spe i ations. ACM Transa tions on Programming Languages and Systems, 8(2), 1986. [11℄ J. Corbett et al. Bandera: Extra ting nite state models from Java sour e ode. In ACM/IEEE International Conferen e on Software Engineering (ICSE), 2000. [12℄ D. L. Dill. The Mur' veri ation system. In Computer Aided Veri ation (CAV), LNCS 1102, 1996. [13℄ D.L. Dill, S. Park, and A. Nowatzyk. Formal spe i ation of abstra t memory models. In Symposium on Resear h on Integrated Systems. MIT Press, 1993. [14℄ P. Godefroid. Model he king for programming languages using VeriSoft. In ACM Symposium on Prin iples of Programming Languages (POPL), 1997. [15℄ A. Gontmakher and A. S huster. Java onsisten y: non-operational hara terizations for Java memory behavior. ACM Transa tions on Computer Systems, 18(4), 2000. [16℄ J. Gosling, B. Joy, and G. Steele. The Java Language Spe i ation. Chapter 17, Addison Wesley, 1996.
[17℄ Y. Gurevi h, W. S hulte, and C. Walla e. Investigating Java on urren y using Abstra t State Ma hines. In Abstra t State Ma hines Workshop, LNCS 1912, 2000. [18℄ A. Holub. Taming Java Threads. Berkeley CA, APress, 2000. [19℄ G. Holzmann and M. Smith. A pra ti al method for verifying event driven software. In ACM/IEEE International Conferen e on Software Engineering (ICSE), 1999. [20℄ Leslie Lamport. How to make a multipro essor
omputer that orre tly exe utes multipro ess programs. IEEE Transa tions on Computers, 28(9), 1979. [21℄ Douglas Lea. Con urrent Programming in Java : Design Prin iples and Patterns. Addison Wesley, 1997. [22℄ J. Maessen, Arvind, and X. Shen. Improving the Java Memory Model using CRF. In ACM OOPSLA, 2000. [23℄ J. Manson and W. Pugh. Core semanti s of multithreaded Java. In ACM Java Grande Conferen e, 2001. [24℄ J.S. Moore. Formal models of Java at the JVM level { a survey from the ACL2 perspe tive. In Workshop on Formal Te hniques for Java Programs, in asso iation with ECOOP, 2001. [25℄ G. Naumovi h, G.S. Avrunin, and L.A. Clarke. Data
ow analysis for he king properties of on urrent Java programs. In ACM/IEEE International Conferen e on Software Engineering (ICSE), pages 399{410, 1999. [26℄ G. Naumovi h, G.S. Avrunin, and L.A. Clarke. An eÆ ient algorithm for omputing MHP information for on urrent Java programs. In ESEC/FSE, LNCS 1687, pages 338{354, 1999. [27℄ S. Park and D.L. Dill. An exe utable spe i ation and veri er for relaxed memory order. IEEE Transa tions on Computers, 48(2), 1999. [28℄ W. Pugh. Fixing the Java Memory Model. In ACM Java Grande Conferen e, 1999. [29℄ D. S hmidt and T. Harrison. Double- he ked lo king: An optimization pattern for eÆ iently initializing and a
essing thread-safe obje ts. In 3rd Annual Pattern Languages of Program Design onferen e, 1996. [30℄ S.D. Stoller. Model he king multithreaded distributed Java programs. In SPIN Workshop on Model Che king of Software, LNCS 1885, 2000. [31℄ W. Visser, K. Havelund, G. Brat, and S. Park. Model
he king programs. In IEEE International Conferen e on Automated Software Engineering, 2000. [32℄ XSB. The XSB logi programming system v2.2, 2000. Available for downloading from http://xsb.sour eforge.net/.
APPENDIX We present the proof of equivalen e of our exe utable Java Memory model and the rule-based memory model given in the Java Language Spe i ation (JLS) [16℄. The proof follows from two lemmas of soundness and ompleteness. First we formalize the notion of a tra e. Definition 3
(Tra e).
Given a program with n > 1
threads, an exe ution tra e is a mapping N
! A t f1; : : : ; ng
where A t is the set of permissible a tions by any thread and f1; : : : ; ng denotes the thread id. Thus, an exe ution tra e is a sequen e of a tions of the various threads.
The permissible a tions are fuse, assign, load, store, read, write, lo k, unlo kg. Some of these a tions (read, write, lo k and unlo k) involve intera tion between a thread and the main memory pro ess. Note that the notion of an exe ution tra e does not distinguish between the program a tions (use, assign, lo k, unlo k), and platform a tions (load, store, read and write) for a parti ular thread. Given the above de nition, we an prove soundness of our JMM spe i ation as follows.
Lemma 1 (Soundness). Any exe ution tra e of a multithreaded Java program whi h is allowed by our exe utable memory model is also allowed by the rules of the JLS [16℄.
We prove this by showing that any exe ution tra e in our exe utable model obeys all the rules in the JLS. The detailed proof given below is a ase-by- ase analysis of the rules in JLS. Proof:
Execution Order Rules. Ea h tra e allowed by our model satis es the rst four rules of exe ution order in JLS by de nition (De nition 3). Also, lo ki /unlo ki a tions are performed jointly by T hreadi and MM sin e they are invoked by T hreadi and they modify the state of the MM pro ess. To guarantee that ea h loadi is uniquely paired with a pre eding readi , note that loadi dequeues an entry from some rd qi;j . Hen e it is uniquely paired with the readi instru tion whi h enqueued this entry to rd qi;j previously. The pairing of storei with writei is guaranteed by our exe utable model similarly. Rules about Variables. Let be a tra e of a program P , s.t. is allowed by our exe utable memory model. P
P
Be ause our model invokes the use, assign a tions as per their o
urren e in threads of program P , the rst rule is satis ed by P . The se ond rule requires a storei (j ) to intervene between assigni (j ) and loadi (j ). This is ensured in our model by the dirtyi;j bit. An assigni (j ) will always set dirtyi;j to true. Now, a loadi (j ) an be applied only if rd qi;j is non-empty, that is, there is a pending readi (j ). As rd qi;j is empty before the exe ution of assigni (j ), a readi (j ) must intervene between assigni (j ) and loadi (j ). Now readi (j ) an be applied only if dirtyi;j is false. Sin e storei (j ) is the only a tion whi h an set dirtyi;j to false, therefore we guarantee that storei (j ) must intervene. The third rule requires assigni (j ) to intervene between loadi (j ) / storei (j ) and a subsequent storei (j ). This is also ensured by the dirtyi;j bit. After the exe ution loadi (j ) / storei (j ), the bit dirtyi;j is guaranteed to be false (refer Figure 1). Also, the guard of a subsequent storei (j ) requires dirtyi;j to be true. Therefore, there must be an intervening a tion whi h sets dirtyi;j to be true. Sin e assigni (j ) is the only su h a tion, it must intervene. The fourth and fth rules require assigni (j ) or loadi (j ) to pre ede the rst o
urren e of usei (j ) or storei (j ). The usei (j ) requires stalei;j to be false. Sin e all stale bits are
initially true, therefore a tions setting stalei;j to false must pre ede usei (j ). The only a tions setting stalei;j to false are assigni (j ), loadi (j ). Similarly, we an show that the rst o
urren e of storei (j ) must be pre eded by assigni (j ) by taking the dirtyi;j bit into onsideration. The sixth rule requires every loadi (j ) to be pre eded by a orresponding readi (j ) whi h transmits the same r-value as loadi (j ). As mentioned before, this is ensured in our model by the rd qi;j queue. Sin e loadi (j ) dequeues values whi h were enqueued into rd qi;j by a pre eding readi (j ), we an see that this rule is always satis ed by tra es in our exe utable model. The dependen e between storei (j ) and writei (j ) as demanded by the seventh rule is shown similarly (by onsidering the wr qi;j queue). The last rule requires (a) readi (j ) to be in the same order as orresponding loadi (j ) a tions, (b) writei (j ) a tions to be in the same order as the orresponding storei (j ) a tions, ( ) if a storei (j ) pre edes loadi (j ), then the orresponding writei (j ) pre edes the orresponding readi (j ), and (d) if a loadi (j ) pre edes storei (j ), then the orresponding readi (j ) pre edes the orresponding writei (j ). Requirement (a) and (b) are ensured by the FIFO dis ipline of rd qi;j and wr qi;j respe tively. Requirement ( ) is ensured by the guard of readi (j ) whi h is enabled only if wr qi;j is empty, that is, there is no pending writei (j ). Similarly, requirement (d) is ensured by disabling storei (j ) if rd qi;j is non-empty. Note that the guard :dirtyi;j is used for the readi (j ) a tion to prevent a deadlo k. Without that guard, a readi (j ) an follow an assigni (j ). But after that the loadi (j ) an be enabled only if there is an intervening storei (j ) and the storei (j ) an be enabled only if there is an intervening loadi (j ). The guard ensures that a readi (j )
annot follow an assigni (j ) without intervening storei (j ) and writei (j ).
Rules about Locks. The rst rule requires that only one thread at a time owns a lo k. This is ensured in our model, sin e the ondition 8k 6= i lo k ntk = 0 in the guard of lo ki ensures that all threads other than i do not own the lo k. The se ond rule requires that only a thread owning a lo k an exe ute an unlo k. This is ensured in our model sin e unlo ki is exe uted only if lo k nti > 0 i.e. thread i owns the lo k. Rules about Interaction of Locks and Variables. The rst rule requires storei (j ) and the orresponding writei (j ) to intervene between an assigni (j ) and an unlo ki . This is ensured in our model sin e the guard of unlo ki is true provided empty(wr qi;j ) holds (no pending writei (j )) and :dirtyi;j holds (no pending storei (j )). The se ond rule requires assigni (j ) or readi (j )/loadi (j ) pair to intervene between a lo ki and a subsequent usei (j ) / storei (j ). In our model, after the lo ki a tion is exe uted we must have 8j stalei;j ^ :dirtyi;j . Therefore usei (j ) / storei (j ) a tions are not enabled. The only a tion setting dirtyi;j is assigni (j ), and the only a tions resetting stalei;j are loadi (j ) and assigni (j ). Therefore, one of these a tions must intervene. Furthermore, loadi (j ) an intervene only if rd qi;j is non-empty. Sin e rd qi;j is empty when lo ki is exe uted, the readi (j ) orresponding to the intervening loadi (j ) must also take pla e after lo ki . 2
We prove ompleteness of our JMM spe i ation w.r.t. the model des ribed in [16℄. Lemma 2 (Completeness). Any exe ution tra e of a multithreaded Java program whi h is allowed by the rules of the Java language spe i ation [16℄ is also allowed by our exe utable memory model. Proof: Consider some exe ution tra e allowed by [16℄ whi h is not allowed by our model. Consider the rst a tion a in whi h is disallowed by our model but is allowed by [16℄. Sin e there are only eight a tions in both models, a
an only be one of them. The eight ases are shown below, and for ea h of them a ontradi tion is obtained. Thus, no su h tra e may exist. a = usei (j ) : Then stalei;j must be true. Then, a o
urs before any assigni (j ) / loadi (j ) or after o
urren e of lo ki (by indu tion on appli ation of our rules). This is disallowed by [16℄ as well. a = assigni (j ) : Then rd qi;j is non-empty. If this tra e is allowed by [16℄ then the tra e annot have a subsequent loadi (j ) until dirtyi;j is reset, whi h is only possible by storei (j ). But storei (j ) again annot be exe uted by [16℄ if rd qi;j is non-empty (be ause then storei (j ) will pre ede a loadi (j ) but the orresponding read, write will be in reverse order). Thus thread i
annot progress and su h a tra e is disallowed. a = loadi (j ) : If rd qi;j is empty, no value an be loaded. Exe ution of loadi (j ) is disallowed by [16℄ also in su h ases. a = storei (j ) : If dirtyi;j is false, then there is no pre eding assigni (j ) without a subsequent storei (j ) (by indu tion on our rules). This is disallowed in [16℄ by the rules about variables (third rule). If rd qi;j is non-empty then storei (j ) will pre ede a loadi (j ) but the orresponding read, write will be in reverse order. Again if wr qi;j is full, no value an be stored. All these
ases are again disallowed by [16℄. a = readi (j ) : If dirtyi;j is true or wr qi;j is nonempty, then there is a pre eding assign leading to a pending storei (j ) or writei (j ) operation. This will lead to a store pre eding a load where the orresponding read and write are in reverse order, violating the last rule about variables in [16℄. Again non-full rd qi;j must trivially hold. a = writei (j ) : Non-empty wr qi;j must trivially hold. a = lo ki : If lo k ntk > 0 where k 6= i, then thread k
ontain has exe uted lo k for whi h the orresponding unlo k has not been performed yet (by indu tion). If rd qi;j is non-empty then there will be a load after lo k whose read has been exe uted before lo k. If dirtyi;j is true, then there will be a store after lo k whose assign is exe uted before lo k. All of these situations violate the rules for lo ks and their intera tion with variables in [16℄. a = unlo ki : If lo k nti = 0 then thread i does not own the lo k (by indu tion). If dirtyi;j is true or wr qi;j is non-empty for some variable j , then there is a pre eding assign leading to a pending storei (j ) or writei (j ) operation. Ea h of these situations are again disallowed by the rules for lo ks and their intera tion with variables in [16℄. 2