Task Types for Pervasive Atomicity

Aditya Kulkarni, SUNY Binghamton, [email protected]
Yu David Liu, SUNY Binghamton, [email protected]
Scott F. Smith, The Johns Hopkins University, [email protected]
Abstract

Atomic regions are an important concept in correct concurrent programming: since atomic regions can be viewed as having executed in a single step, atomicity greatly reduces the number of possible interleavings the programmer needs to consider. This paper describes a method for building atomicity into a programming language in an organic fashion. We take the view that atomicity holds for whole threads by default, and that a division into smaller atomic regions occurs only at points where an explicit need for sharing is declared. A corollary of this view is that every line of code is part of some atomic region. We define a polymorphic type system, Task Types, to enforce most of the desired atomicity properties statically. We show the reasonableness of our type system by proving that type soundness, isolation invariance, and atomicity enforcement properties hold at run time. We also present initial results of a Task Types implementation built on Java.

Categories and Subject Descriptors D.3.3 [Programming Languages]: Language Constructs and Features—Concurrent Programming Structures

General Terms Design, Languages, Theory

Keywords Pervasive Atomicity, Type Systems, Sharing-Aware Programming
1. Introduction
In an era when multi-core programming is becoming the rule, not the exception, the property of atomicity – that program execution in the presence of interleavings has the same effect as a sequential execution – is a crucial invariant. Some programming languages now support a notion of atomic block, requiring the block to be viewable as executing atomically; this means there will not be any interleavings violating the sequential view of that block, and program meaning is greatly clarified. One weakness of atomic blocks, however, is that guarantees of atomicity hold
only in so-marked blocks, and code outside of the marked blocks may well have anomalous behaviour upon interleaving. When atomicity-enforcing code interleaves with non-atomicity-enforcing code, only a weaker guarantee known as weak atomicity [CMC+06; SMAT+07; ABHI08] may hold. This paper – built on top of our previous Coqa language [LLS08] – takes the opposite route to address atomicity. Instead of indicating which subparts of a thread should be atomic, a programmer of our language divides the thread into subzones of atomic execution, and every single line of code must be part of some atomic zone. This design principle, which we call pervasive atomicity, eliminates weak atomicity by design. The programming approach is the opposite of atomic blocks: by default threads are completely atomic and uncommunicative, and specific atomicity break points are then inserted where the thread needs to communicate with other threads. In addition, since the number of zones is much smaller than the number of program instructions, the conceptual number of program interleavings is significantly reduced, a boon for program analysis and testing, and ultimately for the deployment of more reliable software.

Unfortunately, Coqa is not ideal: it requires dynamic monitoring to limit object sharing between threads. Dynamic monitoring mechanisms are known to incur heavy overhead, which can be particularly bad here since every object may need to be monitored. They can also suffer from problems of deadlock or livelock. Our initial Coqa compiler relied on ad hoc optimizations to achieve tolerable performance. What is needed is a principled means to keep the benefits of pervasive atomicity, but without the high cost.

This paper answers this need by developing a static, declarative method for dividing objects between threads, Task Types. It is common knowledge [CGS+99; WR99; Bla99] that if an object is accessed by only one thread, then dynamic atomicity enforcement on that object is unnecessary. Task Types lift this simple notion to the programming level, effectively enforcing a non-shared memory model by default at compile time. Figure 1(a) illustrates how objects are statically localized in threads via Task Types. Here the two rectangular boxes represent two runtime threads, called tasks in our language. The solid black objects are special task objects which launch a new task every time they are invoked, while the white objects are the non-shared ordinary objects, the vast majority of objects, which are accessed by only one task.
[Figure 1. Isolation and Sharing at Run Time. Legend: non-shared task objects, shared task objects, non-shared ordinary objects, shared ordinary objects; arrows denote messaging. Panels: (a) tasks with isolated objects, (b) sharing through a shared task object, (c) sharing through a shared ordinary object.]

Our type system statically guarantees this picture of isolation. Making non-shared memory the default case has parallels with Actor-based languages [Agh90; Arm96; SM08], MPI [GLS94], and DPJ [BAD+09], amongst others. Compared to these approaches, Task Types aim to get the benefits of these models while offering a more familiar setting for programmers: only minor changes need to be made to most Java programs to make them compilable by the Task Types compiler. One reason why fewer program changes are required is that we support limited inter-task sharing, unlike the above models.

Fig. 1(b) and Fig. 1(c) are examples of the forms of inter-task object sharing that we support. In Fig. 1(b), two tasks communicate through shared task objects, special objects which may themselves hold a set of non-shared ordinary objects and serve as a rendezvous point for other tasks. An important benefit of limited sharing is the degree to which atomicity properties are preserved: the sending task has one zone of atomicity from its start to the shared task object invocation, the shared task itself is an atomic zone, and the task execution after the invocation finishes is a third zone of atomicity. So, our approach sits between the extreme lack of sharing of Actor or Actor-like languages [Agh90; Arm96; SM08; GLS94; BAD+09] and the uncontrolled sharing of current multithreaded languages; by aiming in the middle we achieve a reasonable compromise between ease of programmability and the production of reliable code. Fig. 1(c) shows an additional form of sharing that we support, the shared ordinary objects. These objects allow Coqa-style sharing to be used in limited cases within Task Types: only one task may use a shared ordinary object at a time, so no atomicity of tasks is ever violated due to access of these objects. As in Coqa, run-time support is needed to ensure two different tasks never access such an object at the same time. The programming choice between using a shared task object or a shared ordinary object reflects a clear choice between more parallelism or more atomicity.
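To make the three zones concrete, the following minimal sketch (in our language, with hypothetical names Sender, Service, stepA, stepB, and handle) marks where the atomicity break point falls:

  task class Sender {
    void run(Service s) {    // Service: a hypothetical shared task class
      stepA();               // zone 1: atomic from the task's start up to the shared invocation
      s !-> handle();        // zone 2: the shared task executes as its own atomic zone
      stepB();               // zone 3: atomic after the invocation returns
    }
  }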
Ownership types [CPN98; Cla01; BLR02] and region types [TT97; Gro03; CCQR04] are well-known type-based techniques for static partitioning of memory. Task Types are strongly related to such systems, but differ in two important aspects: explicit sharing exceptions are allowed in Task Types, and static type variables must invariably align with runtime tasks.
2. Informal Discussion
In this section, we highlight a number of features of our language, focusing on its type system. We use a simplified MapReduce algorithm [DG04] to illustrate basic language features; the code is in Fig. 2. MapReduce represents a common type of multi-core algorithm, of the “embarrassingly parallel” style. Later in this section, we will discuss a program of the opposite nature, a high-contention PuzzleSolver [LLS08]. Together, these aim to give readers a real feel for programming in Task Types.
2.1 Sharing-Aware Programming
Task Types encourage programmers to make upfront sharing decisions by associating class modifiers µ with classes. The default choice (µ = ∅, i.e. no modifier) aligns precisely with the principle of having non-shared memory as the default. The objects instantiated from these classes are non-shared ordinary objects, and messaging to these objects uses the standard Java dot (.) invocation symbol. For instance, the code for MapReduce in Fig. 2 indicates WorkUnit is not shared. The WorkUnit object encapsulates data and a unit of work that needs to be done on the data, and each such instance is indeed exclusively used by each Mapper. Here Mapper is declared as a task, meaning each Mapper object is a (non-shared) task object. Each such object spawns a new task (thread) when sent a message, in analogy to how an actor handles a message [Agh90]. Non-shared tasks are simply threads with a distinguished object representative, and the execution of the body of the invoked method constitutes the lifetime of a task. Note that the completion of the task does not end the lifetime of the task object or any state associated with it – the object may later receive and handle another message, following the Actor model. Here the main method's expression m -> map(ul, r) creates a non-shared task by sending a map message to the entry object m. Such invocations are asynchronous and have no interesting return value. Here, twenty Mapper threads execute in parallel. Individually, each task object keeps a queue of all received messages and processes them serially, following Actors. This sequential processing constraint preserves atomicity; if multiple tasks need to run in parallel, the programmer instantiates the same task class multiple times, as illustrated by the twenty Mapper task objects in the example. Any shared classes must be explicitly declared, and the exclamation mark (!) must also be used in the symbol for sending messages to these objects, so that sharing is highlighted in the source code.
  task class Mapper {
    void map(Loader ul, Reducer r) {
      WorkUnit wu = ul !-> loadWorkUnit();
      r !-> reduce(wu.work());
    }
  }

  shared task class Reducer {
    Counter toReduce;
    int sum;
    void Reducer(CtrFactory cf) { toReduce = cf !. newCtr(); }
    void reduce(int rt) {
      sum = sum + rt;
      toReduce.dec();
      if (toReduce.val() == 0) { ...output sum... }
    }
  }

  shared task class Loader {
    Counter toLoad;
    Loader(CtrFactory cf) { toLoad = cf !. newCtr(); }
    WorkUnit loadWorkUnit() {
      toLoad.dec();
      return new WorkUnit(this);
    }
  }

  class WorkUnit {
    Loader l;
    WorkUnit(Loader l) { l = l; }
    int work() { ...return result... }
  }

  shared class CtrFactory {
    int i;
    CtrFactory(int i) { i = i; }
    Counter newCtr() { return new Counter(i); }
  }

  class Counter {
    int v = 0;
    Counter(int v) { v = v; }
    void dec() { v--; }
    int val() { return v; }
  }

  task class Main {
    void main() {
      int NUM = 20;
      CtrFactory cf = new CtrFactory(NUM);
      Reducer r = new Reducer(cf);
      Loader ul = new Loader(cf);
      for (int i = 1; i <= NUM; i++) {
        Mapper m = new Mapper();
        m -> map(ul, r);
      }
    }
  }

[The figure also contains two diagrams. The “Pre-Twinned Access Graph” has one node each for Main, Mapper, Reducer, Loader, CtrFactory, WorkUnit, Counter (toLoad), and Counter (toReduce). The “Static Access Graph” produced by the type system twins the task nodes: Main, Mappers, Reducers, Loaders, CtrFactory, WorkUnits, Counters (toReduce), and Counters (toLoad). Edges denote the static access relation; the nodes are the static type variables for their shadowed counterparts in Fig. 1.]
Figure 2. A Simplified Map-Reduce Example
  Access \ Execution       atomicity zone entry (task ∈ µ)              resource (task ∉ µ)
  shared (shared ∈ µ)      µ = shared task (shared task object o3)      µ = shared (shared ordinary object o4)
  isolated (shared ∉ µ)    µ = task (non-shared task object o2)         µ = ∅ (non-shared ordinary object o1)

  messaging       what it is              why you should use it
  o1.m(v)         intra-task messaging    promotes mutual exclusion and atomicity
  o2 -> m(v)      task creation           promotes parallelism by starting up a new thread
  o3 !-> m(v)     shared “service”        allows for atomicity-breaking sharing; promotes parallelism with early free
  o4 !. m(v)      shared “data”           allows for atomicity-preserving sharing

Figure 3. Four Kinds of Objects, Four Kinds of Messaging
In the example, both Reducer and Loader are shared task objects: in particular, all twenty Mapper objects share a single Reducer object to sum up the numerical results, via the invocation r !-> reduce(wu.work()), one by one. Shared task messaging does not run in parallel with the invoker – o !-> m(v) is synchronous and blocking and does not spawn a thread. This point is made clear by the ul !-> loadWorkUnit() expression: the Mapper object has to wait for the return of the WorkUnit. What makes a shared task a “task” is its ability to build its own atomicity zone: the shared task maintains its own objects and frees them all when the shared task ends, i.e. when the method returns. This nature of shared tasks helps improve system performance, by not holding onto objects for too long, and makes rendezvous between two “live” tasks possible.

Shared ordinary objects, such as CtrFactory, can only be accessed over the lifetime of one task at a time, and as such will never break the atomicity of a task. However, they may be accessed by one task first, and then accessed by another once the first task has completed. As a result, shared ordinary objects must be dynamically monitored. Shared task objects and shared ordinary objects are related, but their effects on atomicity and parallelism are clearly different.

Fig. 3 summarizes the four kinds of objects and their messaging. To represent the four possibilities, we use the presence or absence of two modifiers, task and shared; the first determines the execution policy (a task or not), and the second the access policy (shared or non-shared). For convenience, we will equivalently view µ as a set of zero to two keywords. Thus for instance, task ∉ µ matches either case in the right column of the first table.
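As a quick illustration of the table, the following hedged sketch pairs each invocation symbol with its object kind; Buffer (ordinary), Worker (task), Logger (shared task), and Registry (shared) are hypothetical classes:

  task class Demo {
    void run(Worker w, Logger log, Registry reg) {
      Buffer b = new Buffer();  // µ = ∅: non-shared ordinary, isolated in this task
      b.fill();                 // o1.m(v): intra-task messaging, stays in the current atomic zone
      int n = reg !. size();    // o4!.m(v): shared “data”, atomicity-preserving mutual exclusion
      log !-> flush();          // o3!->m(v): shared “service”, an atomicity break point
      w -> work(n);             // o2->m(v): task creation, asynchronous, returns Unit
    }
  }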
2.2 Static Enforcement of Atomicity
Our previous Coqa compiler [LLS08] implemented atomicity solely with locks, and we proved that a running task is atomic when it locks every (ordinary) object it accesses; if a task attempts to access an object which is already locked
by another task, it blocks until the locking task's execution is completed. The main property of Task Types is that non-shared ordinary objects are provably isolated inside at most one runtime task throughout their lifetime, and thus need no mechanism to support mutually exclusive access at runtime. The main goal of the type system is to prove that such isolation indeed holds at compile time, so that the run-time lock monitoring can be removed.

We now give a high-level description of how limited, safe sharing of ordinary objects is supported in the type system. A key structure involved in typechecking is the static access graph, a static directed graph whose edges record object access via field read/write or message send. Since some data sharing needs to be supported, the static access graph is not a strict hierarchy; in particular, nodes representing shared task objects are sharing points. In the MapReduce example, Loader and Reducer share a CtrFactory. The MapReduce static access graph is illustrated in a box inside Fig. 2. The graph generated by the type system is the bottom one, labeled “Static Access Graph.” For the purpose of presenting basic ideas, however, let us first focus on a simplified version of that graph, labeled “Pre-Twinned Access Graph” in the figure. Over this graph we define an access path to be a path from a non-shared task object to a non-shared ordinary object; then we say a cut vertex exists in the graph for some non-shared ordinary object o iff all distinct access paths to o in fact go through a single “cut” node in the static access graph. The name of this property comes from the graph-theoretic notion underlying this invariant, formalized as predicate cutExists in Sec. 3.5.

To see a potential violation of isolation, suppose we removed the shared task modifier from class Loader (and changed all !-> symbols to . for messaging to its instances). The resulting program is obviously troublesome at runtime – different Mapper instances would race to mutate field toLoad in class Loader, violating its mutually exclusive access and invalidating our atomicity model. Such a bug would be caught by our type system because there is no cut
vertex for the access paths of Loader, in this case from Main and from Mapper. Note that the typechecking described above subsumes aggregate locking, i.e. if a large data structure needs to be lock-protected, there should be no need to lock each and every element. To see how this can be supported, observe that when the cut vertex above is a shared ordinary object, all non-shared ordinary objects can “hide” behind it, and the type system still typechecks.

The example above also shows the need for a polymorphic type system. Two different Counter's, toLoad and toReduce, are created there by two invocations of the factory method newCtr. Even though there is only one new Counter statement in the program, two instances are created at runtime. Task Types aim to be maximally expressive in this case, and use context-sensitivity to give unique typings to each counter, as is reflected in the diagram in Fig. 2. The form of context-sensitivity employed in Task Types follows [WS01; EGH94; MRR05; WL04].

We now summarize why the MapReduce example typechecks. The Mapper map method can freely access its work unit wu; it was created by the Loader and thus the object is passed across a task boundary, but the only task with access to it is the particular mapper task itself; this means the mapper is the cut vertex, i.e. the cutExists typing predicate holds for wu. The reducer, ul, and all the mappers are tasks and so can be shared freely.

Task Twinning We now explain why the “Pre-Twinned Access Graph” is not good enough for preserving isolation of non-shared ordinary objects. Observe in that graph that only one node is created for Mapper, even though there might be 20 Mapper's at run time. The fundamental problem is that static approaches have to finitely approximate the in-principle unbounded tasks that arise in the presence of recursion. Since different runtime contexts must share static representations for the analysis to remain finite, we need to make sure this approximation does not introduce errors, and it does in fact introduce some very subtle complications. Suppose we added an additional field called secretShare of type WorkUnit to class Loader, and we changed the return statement at the end of loadWorkUnit to return secretShare (a sketch of this ill-typed variant appears at the end of this subsection). It is not hard to see this is a problematic program, as all Mapper instances would be sharing the same WorkUnit stored in secretShare; furthermore, the cut vertex predicate above would be unable to detect this problem if we chose to use the “Pre-Twinned Access Graph” for analysis. The root of the problem is that the type system as described up to now creates only one instantiation of Mapper to model the many created at runtime, and that approximation does not soundly model sharing between the different Mapper instances at runtime.

To address this case we invent a technique called task twinning: we make two static instances instead of a single task instance, which here means two Mapper's, two
Loader's and two Reducer's. The type system in fact produces the graph at the bottom of Fig. 2. This technique intuitively captures the fact that every program point for task instantiation may potentially lead to more than one task instantiation at run time, so it directly uses two distinct static type variables to split all possible runtime objects into two subsets. Combining this with our polymorphic treatment of method invocations, each object instantiated inside the scope of twinned task objects is also multiplied in the graph – there are two Counter instances for toReduce (one for each twinned Reducer), two Counter instances for toLoad (one for each twinned Loader), and four WorkUnit instances (from each twinned Mapper invoking loadWorkUnit of each twinned Loader).

Our formal system is constructed to handle the case where every line of code may potentially be involved in recursion, and hence every task object needs to be twinned. In practice, mechanical application of twinning is not always necessary. For instance, the Reducer and Loader objects are in fact instantiated in the bootstrapping main method, so the system would still be sound even if our static access graph had not twinned them. We treat simplifications of this flavor as implementation-level optimizations, and do not model them in the theory, except that the bootstrapping task Main is trivially singular, so we do not twin it. Now, looking at the actual “Static Access Graph” the type system generates in Fig. 2, we see that the cut vertex property still holds for every non-shared ordinary object in the graph. If, however, the program incorporated the secretShare modification above, there would be two WorkUnit instances in the full static graph, and each WorkUnit would be accessed by both Loader's in addition to both Mapper's. Such a graph would violate the cut vertex condition, since there is an access path to each WorkUnit from each Mapper and there is no cut vertex dominating both Mapper's.

Properties For simplicity of presentation, the static access graph drawn in Fig. 2 does not distinguish between read and write access. Our formal type system is more refined, and is constructed using a non-exclusive-read, exclusive-write principle. Additionally, we can prove there is no atomicity violation when field writes in constructors are treated as non-exclusive. Immutable objects can consequently be freely shared. In Section 4 we prove type soundness and decidability of type inference for Task Types. We also show how Task Types preserve a non-shared memory model for all objects declared as non-shared and ordinary. Since these objects do not need lock protection to preserve atomicity, Task Types have the same pervasive atomicity properties as Coqa but with a lower run-time cost. Pervasive atomicity also provably subsumes race condition freedom [LLS08], so programmers are also spared from race conditions –
for example, in the reduce method, sum = sum + rt is guaranteed to compute predictable results.
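For concreteness, here is a hedged sketch of the ill-typed secretShare variant of Loader discussed above (the field and the changed return are the hypothetical modifications; the rest follows Fig. 2):

  shared task class Loader {
    Counter toLoad;
    WorkUnit secretShare;                 // hypothetical field added for illustration
    Loader(CtrFactory cf) { toLoad = cf !. newCtr(); }
    WorkUnit loadWorkUnit() {
      toLoad.dec();
      secretShare = new WorkUnit(this);
      return secretShare;                 // every Mapper now receives the same WorkUnit;
    }                                     // no cut vertex exists for it, so typechecking fails
  }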
2.3 Programmability
Effective Task Types programming requires the programmer to develop a careful plan for object sharing. Declaring as many objects as possible to be non-shared ordinary objects (µ = ∅) will increase run-time performance and at the same time increase the size of atomic zones, so the goal of the programmer is simply to maximize the non-shared ordinary classes while still allowing the program to typecheck. Objects that cannot typecheck as unshared ordinary should be typechecked as shared ordinary (µ = shared) or shared task (µ = shared task). Fortunately, problems with typing a non-shared ordinary object can always be solved by hoisting it to a shared one, since the type system imposes no additional constraints on shared objects. So, this is always available as a last resort; the key is to hoist up as few objects as possible, to obtain the largest zones of atomicity and the highest performance. Both shared ordinary and shared task objects are locked for mutual exclusion, but shared ordinary objects do not add new atomicity break points into tasks and are also more appropriate when a task makes continual use of an object. So, shared is preferred over shared task if the object really “should be” local to the task, but the type system is too weak to realize that. A good situation in which to use shared tasks is when the class wraps up a relatively independent “service,” so that when the service is completed, “partial victory” can be declared. Examples include Reducer and Loader in the MapReduce example.

We believe this additional programmer focus on object sharing is time well spent if the final goal is the production of reliable software. It is our belief that the vast majority of objects in a vast range of applications are ordinary objects not shared across tasks, and they can be programmed normally, so additional planning is required only for the shared object portions of the code. We do not expect this paradigm to extend to every line of every single application – just as there is a rare need to escape to C in Java, there will be rare cases where, for efficiency, the Task Types framework needs to be bypassed. One such example is data structure implementations that use hand-over-hand locking.

We next describe a benchmark program with significant contention. Intuitively, this is precisely the category of programs one would expect Task Types to be uncomfortable with, so it will be more of a stress test of our language.
2.3.1 The PuzzleSolver Benchmark
In Section 5 we discuss the implementation and a few benchmarks of its performance. One benchmark, PuzzleSolver, solves a generalized version of the 15-puzzle, the famous 4x4 sliding puzzle where the numbers 1-15 must be slid into numerical order. The primary worker tasks of the program
are n SolverTasks, each of which in parallel takes existing legal gameplay move sequences (PuzzleMove's) from a central work queue (PuzzleTaskQueue) and puts back on the queue all gameplay moves extending the grabbed play sequence by one game step, if any exist. All worker tasks loop on this activity until there are no more move sequences to extend on the shared queue. In order to avoid repeated play, all visited board states (PuzzlePositions) are centrally logged in a PositionSeen object, so no worker SolverTask will add an already-investigated position to the PuzzleTaskQueue. Lastly, there are classes Block and Puzzle, the former containing the ID and size of a block (the generalization supports n × m blocks), and the latter checking for legal puzzle moves.

The primary interest for Task Types is what sharing declarations are placed on the classes. SolverTask is obviously a task and is therefore declared a task. Classes PuzzleTaskQueue and PositionSeen are data structures that must be shared by all tasks and so are logically declared shared task, which also implicitly guarantees their mutually exclusive access. Classes Block and Puzzle are not changed after they are constructed, and even though they are shared across worker tasks it is still possible to declare them as ordinary unshared classes since they are known to be immutable – this is an example of the practical usefulness of the non-exclusive read we support in the type system, as discussed in the previous subsection.

The only class on which the typechecker is less than optimal is PuzzlePosition, which must be declared shared task in order to typecheck. The PuzzlePosition objects are used by SolverTask's to replay PuzzleMove sequences on the grid, and each worker task has its own private PuzzlePosition which logically is unshared. So it sounds like PuzzlePosition could be declared unshared ordinary, but there is a subtle problem: there is also a distinguished PuzzlePosition called initial position which holds the initial board configuration and is set up by the main task that launches the worker tasks. This initial position is passed to each worker task so they can set up their own PuzzlePosition (by copying initial position), but even though they are only reading it, the main task had written to initial position when the data was read from a file, so it is considered owned by the main task and cannot be read by the worker tasks. If Task Types were to support a per-instance declaration of sharing policy, the private per-worker-task PuzzlePosition objects could be declared unshared ordinary, and only the initial position would need to be a shared task. A flow-sensitive typing would also support this, since initial position is not mutated after it is read from the file. And a call-by-copy syntax, as discussed in Sec. 7, would solve this problem as well. For simplicity we elected not to include any of these three extensions in the
current design, but conceptually one will likely be needed in a future extension to increase expressiveness.
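The sharing declarations described above can be summarized as follows (a hedged sketch with method bodies elided, not the benchmark's actual source):

  task class SolverTask { ... }              // worker: one solver task per instance
  shared task class PuzzleTaskQueue { ... }  // central work queue; mutually exclusive access
  shared task class PositionSeen { ... }     // central log of visited board states
  class Block { ... }                        // immutable after construction: unshared ordinary
  class Puzzle { ... }                       // immutable after construction: unshared ordinary
  shared task class PuzzlePosition { ... }   // forced to shared task by initial position (see above)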
3. The Formal System
3.1 Abstract Syntax
The core syntax of our language is formalized in Fig. 4, where notation X̄ represents a sequence of X's. As a convention, metavariable c is used for class names, m for method names, f for field names, and x for variable names. Special class names include Object_µ, the root classes for the inheritance hierarchy, one for each class modifier µ. The bootstrapping code is located inside the body of class Main's no-argument method main. This class directly inherits from class Object_task with no additional fields. We encode the unit/void type via class name Unit =def Object_∅. Default constructors are supported: a constructor for class c is syntactically a method in c whose method name is also c. To make the formalism more uniform, we assume all constructors return this as the return value. In this formalism-friendly syntax, field read/write expressions are annotated with a scope modifier ζ, denoting whether the expression is lexically scoped in a constructor (ζ = cons) or not (ζ = reg).

Parametric Polymorphism Support The formal syntax differs from the programmer syntax in that several program annotations are included to help streamline the formal presentation. Programmer syntax new c(e) is formally expressed as new_A c(e) when class c has a modifier with task ∉ µ (i.e. “resource” object instantiation), and as new_{A1,A2} c(e) otherwise (i.e. “concurrency unit” object instantiation). The associated subscript A serves to distinguish different instantiation sites. Structurally, each A represents a list of type variables α (of set STV). The first type variable in A represents the object instantiated at the specific program point where the expression occurs, and each of the rest represents an object that can be stored in a field of the instantiated object. For any two distinct new expressions in the source code, we require their respective A's to have distinct elements. The new_{A1,A2} c(e) form for concurrency unit object instantiation is present to realize our need for task twinning to model self-sharing, as outlined in Sec. 2.2. To support context sensitivity, each call site is differentiated by associating a singleton list [α] with each messaging expression, e ∗[α] m(e′). For any two distinct messaging expressions in the source code, we require their respective [α]'s to be distinct. As examples, the expressions in the Loader class of Fig. 2 are automatically annotated as follows:

  cf!.newCtr()          as  cf !.[α100] newCtr(null)
  toLoad.dec()          as  toLoad.[α101] dec(null)
  new WorkUnit(this)    as  new[α102,α103] WorkUnit(this)
For each class definition µ class c extends c′ {F̄ M̄} inside codebase C and each method definition c′ m(c″ x){e} inside M̄, we define function mbody(π) = x.e to return the method body, and deterministic function mtype(π) = ∀A.(c″ → c′) to return the signature for method index π = ⟨c; m⟩. Both functions are implicitly parameterized by the fixed codebase C. The type variable list above, A = [α″, α′], includes two distinct type variables representing the argument and the return value respectively. To illustrate, selected methods of Fig. 2 have the following signatures:

  mtype(⟨Loader; Loader⟩)          = ∀[α200, α201].(CtrFactory → Loader)
  mtype(⟨Loader; loadWorkUnit⟩)    = ∀[α202, α203].(Unit → WorkUnit)
  mtype(⟨CtrFactory; newCtr⟩)      = ∀[α204, α205].(Unit → Counter)
  mtype(⟨Counter; dec⟩)            = ∀[α206, α207].(Unit → Unit)
To model inheritance, we further define mtype(⟨c′; m⟩) = mtype(⟨c; m⟩) for any c′ a subclass of c. This is the only case where mtype may compute overlapping A's for different π's; in all other cases, if mtype(π1) and mtype(π2) compute A1 and A2 respectively and π1 ≠ π2, then A1 and A2 have disjoint elements. mtype(⟨Main; main⟩) = ∀A.(Unit → Unit) for some (uninteresting) A.

Auxiliary Definitions For each class µ class c extends c′ {F̄ M̄} in C, we further define modifier(c) = µ and supers(c) = {c} ∪ supers(c′). Here we assume c ≠ Object_µ and that there is no cycle on the inheritance chain induced by C. In addition, we define modifier(Object_µ) = µ and supers(Object_µ) = {Object_µ}. These functions are also implicitly parameterized by C.

We now define standard mathematical notation used in this paper. [ ] represents the empty sequence, x : [x1, …, xn] =def [x, x1, …, xn], and |[x1, …, xn]| = n. When the ordering of a sequence does not matter, we liberally convert it to its unordered counterpart (a set) and apply standard set operators such as ∈ and ⊆ to it. A special kind of sequence, a mapping sequence, is denoted x̄ ↦ ȳ and defined as [x1 ↦ y1, …, xn ↦ yn] for some unspecified length n. Given ι the mapping sequence above, dom(ι) =def {x1, …, xn} and range(ι) =def {y1, …, yn}. We write ι[x ↦ y] for mapping update: ι and ι[x ↦ y] are identical except that ι[x ↦ y] maps x to y. Updatable mapping concatenation is defined as ι1 ι =def ι1[x1 ↦ y1]…[xn ↦ yn]. We also write ι1 ι as ι1 ⊎ ι, except that the latter requires the precondition dom(ι1) ∩ dom(ι) = ∅. Given two sequences, a “zip”-like operator ⟼ produces a mapping sequence: [x1, …, xn] ⟼ [y1, …, yn] =def [x1 ↦ y1, …, xn ↦ yn].
3.2 A Bird's Eye View of the Type System
Task Types are a constraint-based type system, with (T-Program) as the top-level typing rule:
  C ::= c ↦ µ class c extends c′ {F̄ M̄}                                     classes
  µ ::= ∅ | shared | task | shared task                                     class modifier
  F ::= c f                                                                 fields
  M ::= m ↦ c m(c′ x){e}                                                    methods
  e ::= x | null | this | f^ζ | f^ζ := e | (c)e | e ∗[α] m(e)
        | new_A c(e) | new_{A,A′} c(e)                                      expression
  ∗ ::= !-> | !. | -> | .                                                   method invocation symbol
  α ∈ STV                                                                   annotated type variable
  A ::= ᾱ                                                                   α sequence
  ζ ::= reg | cons                                                          scope modifier
  π ::= ⟨c; m⟩                                                              method index

Figure 4. Abstract Syntax
  ⊢cls C(ci) \ C(ci)  for all ci ∈ dom(C)        WF(⌊C⌋)
  ────────────────────────────────────────────── (T-Program)
                      ⊢p C : C

This rule defines typechecking in two phases:

• First, constraints are collected for each method of each class modularly, and inconsistencies are detected as early as possible. The per-class typing rule ⊢cls and the related expression typing rules are defined in Fig. 5. At the end of this phase, constraints are stored in a per-class, per-method fashion, in constraint store C.

• Second, a closure phase propagates inter-procedural information. The resulting constraint closure ⌊C⌋ is computed as defined in Sec. 3.4. Afterwards, the static access graph represented in the closure is checked by a simple WF() function (Sec. 3.5), determining whether static isolation for non-shared ordinary objects holds.

We favor a phased definition to be more realistic with respect to object-oriented languages. Today's OO languages often rely on a modular phase to find as many bugs as possible (either via source code compilation or bytecode verification), and delay as little non-modular constraint solving as possible to dynamic class loading time (such as that related to Java subtyping [LB98]). Presenting Task Types in phases de facto describes how it can be constructed in a language with dynamic class loading. Notably, extracting a modular phase out of an inter-procedural algorithm is no trivial task for object-oriented programs, as the latter are fundamentally mutually recursive: class c1 might contain expression new c2(e1) whereas class c2 might contain expression new c1(e2); or class c1's method m1 might invoke c2's m2, which in turn invokes c1's m1. By constructing an explicit formal definition of the modular phase here, we gain confidence in the ability to construct a practical and decidable typechecking process.
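The two phases can be read as the following driver sketch in Java-like pseudocode; all identifiers (ClassDef, ClassConstraints, ClosureGraph, typecheckClass, closeConstraints, wellFormed, TypeError) are hypothetical, not part of our formalism:

  // Phase 1 collects per-class constraint stores; phase 2 closes and checks them.
  static void typecheckProgram(List<ClassDef> codebase) {
      Map<String, ClassConstraints> C = new HashMap<>();
      for (ClassDef cls : codebase) {               // modular phase: rule ⊢cls, class by class
          C.put(cls.name(), typecheckClass(cls));   // local inconsistencies rejected here
      }
      ClosureGraph closure = closeConstraints(C);   // closure phase (Sec. 3.4)
      if (!wellFormed(closure)) {                   // WF(): isolation on the static access graph
          throw new TypeError("a non-shared ordinary object may be shared across tasks");
      }
  }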
3.3 Modular Class Typing
Expressions are typed via the ⊢ rules of Fig. 5. Classes are typed via judgment ⊢cls, and method bodies via ⊢m. All typing rules are implicitly parameterized by the fixed codebase C. Types are always of the form c@α, with c the class name analogous to Java's object type, and α the type variable associated with a specific object instance, needed for the polymorphic type system. Typing environment Γ maps variables x, field names f, and this to their types. We delay the discussion of class-indexed constraint store C and method-indexed constraint store M until judgments ⊢cls and ⊢m are explained. Per-method constraint store K is a set of constraints collected for each method body. Each constraint is represented by metavariable κ. The meanings of specific constraints will be clarified later, but in general a ≤ constraint intuitively records flow, and a ⇢ constraint intuitively records access. Special variables α̂ are used only in ≤ constraints.

Access Constraints A main goal of our type system is to generate the static access graph that was informally described in Section 2. The nodes of such a graph are type variables representing individual objects. The edges, as collected in the modular type checking phase, are called access constraints. They are of the form thost ⇢θ α, meaning that an object represented by type variable α is accessed. Constraint label θ, also called the access mode, ranges over three values: when θ = R, the constraint asserts that object α is non-exclusively “read”; when θ = W, α is exclusively “written”; when θ = T, α is the entry/facade object for a “protected zone” of sharing. In our type system, ⇢T constraints are generated when α is a shared task object or a shared ordinary object. Observe that both kinds of objects are dynamically protected upon entry, and the objects completely hiding behind them need not be dynamically protected. The ⇢T constraints make the static access graph aware of the dynamic protection points and reason accordingly.

Placeholder type variable thost denotes the “accessor.” Intuitively, the accessor should have been a type variable representing the entry facade object for a protected zone.
  (T-Read)
    Γ(f) = c@α
    ──────────────────────────────
    Γ ⊢ f^ζ : c@α \ aC(R, Γ(this), ζ)

  (T-Write)
    Γ(f) = c@α1        Γ ⊢ e : c@α2 \ K
    ──────────────────────────────
    Γ ⊢ f^ζ := e : c@α1 \ K ∪ {α2 ≤ α1} ∪ aC(W, Γ(this), ζ)

  (T-Msg)
    Γ ⊢ e : τ \ K        τ = c@α′        mtype(⟨c; m⟩) = ∀A.(c1 → c2)
    Γ ⊢ e′ : c1@α″ \ K′        ∗ matches modifier(c) returns c2        κ = [α]^{α′,m,α″}
    ──────────────────────────────
    Γ ⊢ e ∗[α] m(e′) : c2@α \ K ∪ K′ ∪ {κ} ∪ aC(T, τ, ζ)

  (T-New)
    A = [α, …]        Γ ⊢ e : c′@α′ \ K′        τ = c@α
    mtype(⟨c; c⟩) = ∀A′.(c′ → c)        κ = [α]^{α,c,α′}
    ──────────────────────────────
    Γ ⊢ new_A c(e) : τ \ K′ ∪ {A^c ≤ α, κ} ∪ aC(T, τ, ζ)

  (T-NewTask)
    Γ ⊢ new_{A1} c(e) : c@α1 \ K1        Γ ⊢ new_{A2} c(e) : c@α2 \ K2        task ∈ modifier(c)
    ──────────────────────────────
    Γ ⊢ new_{A1,A2} c(e) : c@α1 \ K1 ∪ K2 ∪ {α1 ≤ α2, α2 ≤ α1}

  (T-Sub)
    Γ ⊢ e : c′@α \ K        c ∈ supers(c′)
    ──────────────────────────────
    Γ ⊢ e : c@α \ K

  (T-Cast)
    Γ ⊢ e : c′@α \ K
    ──────────────────────────────
    Γ ⊢ (c) e : c@α \ K

  (T-This)  Γ ⊢ this : Γ(this) \ ∅        (T-Var)  Γ ⊢ x : Γ(x) \ ∅        (T-Null)  Γ ⊢ null : τ \ ∅

  (T-Cls)
    fields(c) = ∀A.Γ        ⊢cls µ′ class c′ … \ ∀A′.M′
    Γ ⊢m mbody(πj) : mtype(πj) \ ∀Aj.∀Sj.Kj   for all mj ∈ dom(M̄), πj = ⟨c; mj⟩
    ──────────────────────────────
    ⊢cls µ class c extends c′ {F̄ M̄} \ ∀A.(M′ ⊎ [mj ↦ ∀Aj.∀Sj.Kj])

  (T-ClsTop)
    fields(Object_µ) = ∀A.Γ
    ──────────────────────────────
    ⊢cls µ class Object_µ \ ∀A.[ ]

  (T-Md)
    Γ ⊎ [x ↦ c1@α1] ⊢ e : c2@α3 \ K        A = [α1, α2]        S = labels(e)
    ──────────────────────────────
    Γ ⊢m x.e : ∀A.(c1 → c2) \ ∀A.∀S.(K ∪ {α3 ≤ α2})

  .     matches ∅             returns c
  !.    matches shared        returns c
  !->   matches shared task   returns c
  ->    matches task          returns Unit

  τ ::= c@α                                 types
  Γ ::= t ↦ τ                               typing environment
  t ::= x | this | f                        environment variable
  S ::= Ā                                   A sequence
  K ::= κ̄                                   per-method constraint store
  κ ::= α̂ ≤ α | A^{α,m,α′} | thost ⇢θ α     constraint
  α̂ ::= α | A^c                             flow element
  θ ::= T | R | W                           access mode
  C ::= c ↦ ∀A.M                            class-indexed constraint store
  M ::= m ↦ ∀A.∀S.K                         method-indexed constraint store

  aC(θ, c@α, ζ) =def  {thost ⇢θ α}   if modifier(c) = ∅, θ = R or W, ζ = reg,
                                     or shared ∈ modifier(c), θ = T
                      ∅              otherwise

Figure 5. Typing Rules
A placeholder is used because no concrete type variable naming this object is known during the modular type checking phase. Consider for example the case where the toLoad Counter object is written by object Loader because toLoad's field v is written in method dec. When class Counter is typed, we cannot directly determine from what “protected zone” method dec is invoked – one needs to backtrack over all possible call paths to find the first non-ordinary object. We therefore put thost here and rely on the closure phase (Sec. 3.4) to instantiate it; such a scheme is common in inter-procedural analyses with a modular phase.

Access constraints are collected by the typing rules via aC(θ, τ, ζ). This convenience function collects constraints when the type of the accessed entity is τ and the scope of access is ζ; it is defined in Fig. 5. (T-Msg) puts aC(T, τ, ζ) into the constraint set, i.e. it collects ⇢T constraints when the message receiver is a shared task object or a shared ordinary object (regardless of the scope). This is consistent with our previous discussion of “protected zones.” The same constraint is collected in (T-New) for constructor calls. For non-shared ordinary objects, an access occurs when a field of the object is read or written; the related constraints are collected in (T-Read) and (T-Write) respectively. Observe that when the field read/write happens in a constructor, the access is not recorded as a constraint. This is sound because, while an object is being constructed, its reference has not yet been created, let alone leaked to another task and accessed by it – such read/write access fundamentally cannot lead to atomicity violations.

Expressions with Polymorphic Typing In principle, polymorphic typing behaves rather like let-polymorphism: the type constraints of the polymorphic code – here, the invokee's method body – should be “refreshed” and added to the invoker's constraint set. What complicates matters is the fundamentally recursive nature of OO programs: naively merging the invokee's constraints may lead to nontermination due to an infinite regress of refreshing along a recursive call. It is for this reason that our type system in the modular phase only places a delayed contour marker in the constraint set, delaying the task of refreshing and merging to the closure phase. Marker [α]^{α′,m,α″} added in (T-Msg) indicates the need to merge (at closure time) the constraints of method m of object α′, with argument α″ and return value α. For instance, typing the annotated expression cf !.[α100] newCtr(null) in class Loader generates the following marker constraint:

  Kloader1 = {[α100]^{α200,newCtr,α100}}

where α200 is the type variable associated with variable cf – the latter is the argument of the constructor; its associated type variable was computed by mtype(⟨Loader; Loader⟩) earlier. The choice of type variable to type null is irrelevant; we use α100 here. (T-Msg) also contains a predicate “∗ matches µ returns c”, which matches the different method invocation symbols (∗) with class modifiers µ. In addition, it requires that asynchronous top-level task creation has no interesting return value.

In (T-New), the type variable representing the object instantiated by new_A c(e) is the first element of A, consistent with how A is constructed in the formal syntax. Since all such annotations contain disjoint type variables, new expressions at different program points are given different type variables. Rule (T-New) also contains a flow constraint of the form A^c ≤ α. This constraint says that instantiation site A^c flows into type variable α. This constraint, together with the transitivity of ≤ as defined in the closure phase, is used to trace any type variable back to its concrete instantiation point(s) – a concrete type analysis scheme essential for languages with aliasing. Lastly, a similar delayed contour marker is added for a constructor call. For instance, typing the annotated expression new[α102,α103] WorkUnit(this) in class Loader leads to the following constraints:

  Kloader2 = {[α102, α103]^WorkUnit ≤ α102, [α102]^{α102,WorkUnit,α300}}

assuming this was given type α300. (T-NewTask) types expression new_{A1,A2} c(e), which is used when task ∈ modifier(c). This is how task twinning is reflected in the type system: for each statically known task object, the type system instantiates it twice, with A1 and A2 respectively.

Other Rules Standard nominal subtyping is supported by (T-Sub). Casting is typed by (T-Cast); we do not single out stupid casts as warnings [IPW99], as this does not affect soundness. (T-Read), (T-Write), (T-This), and (T-Var) rely on the typing environment. Given a class µ class c extends c′ {F̄ M̄} in codebase C where F̄ = [c1 f1, …, cn fn], a typing environment with field and this type information is prepared via the following deterministic function:
  fields(c) =def ∀A.((Γ ⊎ [f1 ↦ c1@α1, …, fn ↦ cn@αn])[this ↦ c@α0])
    if fields(c′) = ∀A′.Γ,  A = A′ ⊎ [α1, …, αn],  Γ(this) = c′@α0,  and α1, …, αn distinct

and the base case is fields(Object_µ) = ∀[α].[this ↦ Object_µ@α]. This function deterministically assigns each field a distinct type variable, as well as assigning one for this. The function is implicitly parameterized by the fixed codebase C. It supports field inheritance but disallows field shadowing for simplicity. As an example, the function behaves as follows on class Loader:

  fields(Loader) = ∀[α300, α301].[toLoad ↦ Counter@α301, this ↦ Loader@α300]
To make the parametric nature of constraints more explicit, a class-indexed constraint store C computed in (T-Cls) and (T-ClsTop) is always of the form ∀A.M, where A is computed by the previous fields function. A method-indexed constraint store M computed in (T-Md) is always of the form ∀A.∀S.K, with A containing the two type variables representing the argument and the return value of the method, and S being the set of A labels appearing in the body of the method. S is computed by the following function:
  labels(x) =def ∅
  labels(new_A c(e)) =def {A} ∪ labels(e)
  labels(new_{A1,A2} c(e)) =def {A1, A2} ∪ labels(e)
  labels(e ∗_A m(e′)) =def {A} ∪ labels(e) ∪ labels(e′)
  labels((c)e) =def labels(e)
  …

We require the type variables computed by fields, by mtype, and by labels to be pairwise disjoint. We now provide a complete picture of all the constraints produced for class Loader:

  C(Loader) = ∀[α300, α301].M
  M(Loader) = ∀[α200, α201].∀[[α100]].(Kloader1 ∪ {α300 ≤ α201} ∪ {thost ⇢T α200})
  M(loadWorkUnit) = ∀[α202, α203].∀[[α101], [α102, α103]].(Kloader2 ∪ {[α101]^{α301,dec,α101}} ∪ {α102 ≤ α203})

3.4 Type Closure
We first define the reflexive and transitive binary relation Ω ↪Δ Ω′ by the proof system in Fig. 6. This relation denotes that Ω closes to Ω′ under calling context Δ. A calling context Δ is represented as a sequence of tuples δ, each of the form ⟨β; c; m⟩, denoting a call to method m of object β of class c on the call chain. We use β, B, ω, Ω to represent the closure-time counterparts of modular-typing-time α, A, κ, K, respectively. We differentiate these syntactic entities to highlight the fact that the β's and α's are chosen to be disjoint. The set of all β's is denoted CTV. It includes one special type variable tmain to represent the bootstrapping task.

Type closure ⌊C⌋, as used in (T-Program), is defined as the largest set Ω for which relation boot ↪[ ] Ω holds under the implicit C, where boot is sugar for {[tmain]^{tmain,main,tmain}, [tmain]^Main ≤ tmain}. Intuitively, the delayed contour marker in this set indicates that the main method of the Main class is invoked, with tmain representing the “main” task and irrelevant argument and return values. The overall goal of type closure is to merge local per-class and per-method constraints into one global
set, so that all access constraints β ⇢θ β′ can form one static access graph for the key well-formedness check. Rule (C-Canon) “canonizes” access constraints, i.e. it traces back the chain of flow constraints so that the type variable generated at the instantiation point is used as a canonical name for the object. This is de facto applying a concrete type analysis to the object aliases, expressed as type variables here, appearing in access constraints. We use constraint form β →θ β′ to represent the canonized version. To facilitate the process of instantiation-point back-tracing, rules (C-Flow=) and (C-Flow+) assert that flow constraints ≤ are reflexive and transitive. The transitivity rule (C-Flow+) enables any type variable to ultimately find its instantiation point(s) via the previously explained flow constraint placed in (T-New). This is why, in the map method of the Mapper class, our type system ultimately finds out what objects flow into ul, even though ul is not instantiated in its scope. (C-Task=) and (C-Task+) define reflexivity and transitivity for →T: if a protected zone β1 encloses protected zone β2, and β2 encloses β3, then β1 can be viewed as enclosing β3.

The main complexity of the closure algorithm arises from context sensitivity, captured by (C-Contour). Recall that for a context-sensitive algorithm, the type constraints associated with a method body need to be “refreshed” according to the specific calling context. We define a function for picking type variables:

  gen(Δ, A) =def generate(collapse(Δ), A)

where generate is a deterministic function defined as follows:

  generate(Δ, A) =def B   where |A| = |B|, all elements in B distinct

with the additional requirement that for any A1, A2, the type variables in sequences generate(Δ, A1) and generate(Δ′, A2) are disjoint if collapse(Δ) ≠ collapse(Δ′) or A1 ≠ A2. Note this requirement can be concretely satisfied by indexing the variables on the call string. Function collapse determines whether a recursive invocation has been made, and if so, reuses the results of the initial invocation:

  collapse([ ]) =def [ ]
  collapse(δ : Δ) =def [δ, δ1, …, δn]    if collapse(Δ) = […, δ, δ1, …, δn]
  collapse(δ : Δ) =def δ : collapse(Δ)   if δ ∉ collapse(Δ)

Given calling context Δ = [⟨β1; c1; m1⟩, …, ⟨βn; cn; mn⟩], partial function pzone(Δ) computes the “current protected zone.” It is defined as the βi for the i ∈ [1..n] such that modifier(ci) ≠ ∅ and, for any j with 1 ≤ j < i, modifier(cj) = ∅. Substitution notation •[σ] replaces all type variables α ∈ dom(σ) in • with σ(α).
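As a sanity check of the collapse definition, here is a hedged Java sketch (CallSite standing in for a call-site tuple δ = ⟨β; c; m⟩):

  import java.util.*;

  record CallSite(String beta, String cls, String method) { }

  static List<CallSite> collapse(List<CallSite> delta) {
      if (delta.isEmpty()) return List.of();
      CallSite head = delta.get(0);
      List<CallSite> rest = collapse(delta.subList(1, delta.size()));
      int i = rest.indexOf(head);
      if (i >= 0) return rest.subList(i, rest.size()); // recursion detected: reuse suffix [δ, δ1, ..., δn]
      List<CallSite> out = new ArrayList<>();
      out.add(head);                                   // no recursion: δ : collapse(Δ)
      out.addAll(rest);
      return out;
  }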
  (C-Canon)    {β1′ ⇢θ β2′, (β1 : B1)^{c1} ≤ β1′, (β2 : B2)^{c2} ≤ β2′} ↪Δ {β1 →θ β2}

  (C-Flow=)    ∅ ↪Δ {β ≤ β}
  (C-Flow+)    {β̂1 ≤ β, β ≤ β2} ↪Δ {β̂1 ≤ β2}

  (C-Task=)    ∅ ↪Δ {β →T β}
  (C-Task+)    {β1 →T β2, β2 →T β3} ↪Δ {β1 →T β3}

  (C-Contour)
    δ = ⟨β; c; m⟩    Δ′ = δ : Δ    C(c) = ∀A1.M    M(m) = ∀A2.∀S.K
    σ = (A1 ⟼ B) ⊎ (A2 ⟼ [βarg, βret]) ⊎ ⨄_{A ∈ S} (A ⟼ gen(Δ′, A))
    ──────────────────────────────
    {[βret]^{β,m,βarg}, B^c ≤ β} ↪Δ {⟨K[σ]⟩δ}

  (C-GlobalIntro)    {β̂ ≤ β, ⟨Ω⟩δ} ↪Δ {⟨Ω ∪ {β̂ ≤ β}⟩δ}

  (C-GlobalElim)
    B^{β,m,βarg} ∉ Ω′
    ──────────────────────────────
    {⟨Ω ∪ Ω′⟩δ} ↪Δ Ω′[thost ↦ pzone(δ : Δ)] ∪ {⟨Ω⟩δ}

  (C-Context)
    Ω1 ↪^{δ:Δ} Ω2
    ──────────────────────────────
    ⟨Ω1⟩δ ↪Δ ⟨Ω2⟩δ

  (C-Union)
    Ω ↪Δ Ω1    Ω1 ↪Δ Ω2
    ──────────────────────────────
    Ω ↪Δ Ω1 ∪ Ω2

  (C-Subset)
    Ω1 ↪Δ Ω2
    ──────────────────────────────
    Ω ∪ Ω1 ↪Δ Ω ∪ Ω2

  β ∈ CTV                                                   type variables in closure
  B ::= β̄                                                   β sequence
  Δ ::= δ̄                                                   calling context
  δ ::= ⟨β; c; m⟩                                           call site
  Ω ::= ω̄                                                   closure
  ω ::= β̂ ≤ β | B^{β,m,β′} | β ⇢θ β′ | β →θ β′ | ⟨Ω⟩δ       constraints in closure
  β̂ ::= β | B^c                                             flow elements in closure
  σ ::= α ↦ β                                               substitution

  WF(Ω) =def ∀β. (β′ →W β) ∈ Ω ⟹ cutExists(Ω, {β′ | β′ →θ β ∈ Ω})

  cutExists(Ω, B) =def ∃βc. ⋀_{β ∈ B} (βc →T β) ∧ (∀βc′ ≠ βc. (⋀_{β ∈ B} (βc′ →T β)) ⟹ (βc′ →T βc))
Figure 6. Type Constraint Closure and Isolation Preservation

Let us now illustrate one inductive step of closure: when the constraints associated with the Mapper class's map method have been merged, we show how (C-Contour) helps merge in the constraints for the method body of loadWorkUnit. Let us assume the closure at that step includes:

  [β401]^{β402,loadWorkUnit,β403}    which results from typing ul !->[α401] loadWorkUnit() in Mapper
  [β404, β405]^Loader ≤ β402         from typing new[α404,α405] Loader(cf) in Main, and by flow transitivity

Using the definition for C(Loader) given previously, the substitution built up in (C-Contour) thus is:

  A1 ⟼ B                     is   [α300 ↦ β404, α301 ↦ β405]
  A2 ⟼ [βarg, βret]          is   [α202 ↦ β403, α203 ↦ β401]
  ⨄_{A ∈ S} A ⟼ gen(Δ′, A)   is   [α101 ↦ β601, α102 ↦ β602, α103 ↦ β603]

and given that β406 represents the Mapper object and Δ′ = [⟨β402; Loader; loadWorkUnit⟩, ⟨β406; Mapper; map⟩, ⟨tmain; Main; main⟩], the gen function gives:

  gen(Δ′, [α101]) = [β601]
  gen(Δ′, [α102, α103]) = [β602, β603]
There are two interesting points here. First, if function loadWorkUnit were invoked via different call chains, the gen function would map Δ′ to different type variable lists, so that different instances of WorkUnit instantiated from loadWorkUnit can be differentiated. This is also why the different Counter instances in Fig. 2 can be approximated statically even though they are all instantiated from one program point. Second, rather than immediately merging the substituted constraints into the type closure, a special contextual constraint of the form ⟨Ω⟩δ is used, a marker denoting that constraints Ω in the calling context of δ are to be merged into the type closure. The real merging happens in (C-GlobalElim). Here any constraint other than a delayed contour marker can be “yanked” out of the contextual constraint – in other words, these constraints are not “context-sensitive”. Note that when this happens, the placeholder for the “current protected zone,” thost, needs to be properly replaced. Continuing with the example above, any constraint inside the ⟨⟩ of the contextual constraint obtained via (C-Contour) can be “yanked” out with thost replaced by β402, which represents the Loader object itself. This is the value of thost here because shared tasks create protected zones of their own. The other rules related to contextual constraints, (C-GlobalIntro) and (C-Context), are self-explanatory.
3.5 Isolation Preservation
Isolation is enforced by the WF function in Fig. 6. It checks that for any non-shared ordinary object, either it is accessed only once, or all accesses are reads, or the accessing tasks, collected in B, satisfy cutExists(Ω, B). The cutExists(Ω, B) function in Fig. 6 formally defines the notion of cut vertex alluded to in Sec. 2.2: the subgraph including all static access paths ending with a type variable in B must have a cut vertex (or articulation point), and if there is more than one cut vertex for that subgraph, there must be a “least upper bound” among them. The definition is phrased so as to allow a shared task (or the ordinary objects it owns) to access objects belonging to its “ancestor” tasks, as long as the cut vertex invariant is not violated.
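Operationally, the check can be read as the following hedged Java sketch over the canonized graph; nodes and tEdge (standing for the reflexive, transitive →T relation) are assumed abstractions, not part of our formalism:

  import java.util.*;
  import java.util.function.BiPredicate;

  static boolean cutExists(Set<String> nodes, BiPredicate<String, String> tEdge, Set<String> B) {
      Set<String> candidates = new HashSet<>();
      for (String c : nodes)                            // candidates: βc with βc →T β for every β ∈ B
          if (B.stream().allMatch(b -> tEdge.test(c, b)))
              candidates.add(c);
      return candidates.stream().anyMatch(bc ->         // some βc is T-reached by every other candidate
          candidates.stream().allMatch(c -> c.equals(bc) || tEdge.test(c, bc)));
  }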
4. Operational Semantics and Formal Properties
In this section we briefly describe the operational semantics of our language, with a focus on features related to stating the proven formal properties. Small-step reductions S ⇒ S′ are defined over configurations S = ⟨H; Σ; e⟩, for H the object heap, Σ the dynamic constraint set, and e the expression. The reductions are implicitly parameterized by class list C. We use S ⇒* S′ to represent multi-step reduction, defined as the transitive closure of ⇒. We use ⇒C S to represent a computation of program C starting in the initial state and computing in multiple steps to S. We use S ⇑ to mean there is some computation of S that computes forever.
  H ::= o ↦ ⟨B^c; Fd⟩                                       heap
  Σ ::= p →θ o | p ⇝θ o                                     dynamic constraint set
  β ::= ⋯ | o                                               extended type variable
  ω ::= ⋯ | β ⇝θ β′                                         extended constraint
  Fd ::= f ↦ v                                              field store
  v ::= o | null                                            value
  o, p, q ∈ OID                                             object ID
  e ::= ⋯ | ⟨e⟩δ | θ^o | e; e | post e | e ∥ e              extended expressions
  E ::= • | E ∥ e | e ∥ E | f^ζ := E | (c)E
        | E ∗[β] m(e) | v ∗[β] m(E)
        | new_B c(E) | new_{B1,B2} c(E) | E; e | ⟨E⟩δ       evaluation context
Figure 7. Dynamic Semantics Definitions

Fig. 7 defines the related data structures. To reuse the data structures already defined for the static semantics, we extend type variables β to include o's as well. As a result, the earlier definition of a calling context – a sequence of call sites of the form ⟨β; c; m⟩ – can also be read as a dynamic calling context, a list of ⟨o; c; m⟩'s. H is a mapping from objects o to their field stores (Fd) together with the program point information B^c of their instantiation. Expressions are extended with values v, which are either object IDs or null. The auxiliary expression ⟨e⟩^δ represents an expression e evaluated inside a dynamic call site δ. Expression o^θ is a helper expression indicating that o is accessed in mode θ, and post e is a helper expression that "posts" a "root" task for later execution. The parallel operator e ∥ e′ is commutative. The dynamic constraint set Σ includes two kinds of constraints: →θ and ⇝θ. We reuse the →θ constraint from the static type system; it is computed only for stating theorems and is not collected by the compiler. We use E to represent evaluation contexts. The initial configuration is ⟨Hinit; ∅; einit⟩ where Hinit = [o ↦ ⟨[tmain]^Main; [ ]⟩] for some o and einit = o ->[tmain] main(null).

Operational Semantics. A complete definition of the operational semantics can be found online [TT]. Here we focus only on what is relevant for stating the meta-theory, especially the access invariants. First, reduction under a particular dynamic calling context ∆ is written S ⇒∆ S′; this relation is connected with ⇒ by a self-evident context rule:

  H, Σ, E[e] ⇒ H′, Σ′, E[e′]   if H, Σ, e ⇒^{cxt(E)} H′, Σ′, e′

where

  cxt(•) = [ ]
  cxt(⟦E⟧^o) = cxt(E)
  cxt(⟨E⟩^δ) = cxt(E) : [δ]
  cxt((c)E) = cxt(E)
  . . .
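A small worked instance of these equations (our own example, not from the paper): for the context E = ⟨(c) •⟩^δ with δ = ⟨o; c′; m⟩, the redex sits inside one dynamic call site, and

  % worked example of the cxt equations
  \mathit{cxt}(\langle (c)\,\bullet\rangle^{\delta})
    = \mathit{cxt}((c)\,\bullet) : [\delta]
    = \mathit{cxt}(\bullet) : [\delta]
    = [\,] : [\delta]
    = [\delta]

so the reduction of the redex is performed under the one-element dynamic calling context [δ].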
The reductions related to access take the following shape, where H′ and e′ stand for information irrelevant here and elided:

  H, Σ, f^reg := v     ⇒∆  H′, Σ, o^W; e′     if ∆ = ⟨o; c; m⟩ : ∆′
  H, Σ, f^reg          ⇒∆  H, Σ, o^R; e′      if ∆ = ⟨o; c; m⟩ : ∆′
  H, Σ, o ∗[β] m(v)    ⇒∆  H, Σ, o^T; e′
  H, Σ, o^θ            ⇒∆  H, Σ ∪ dset(µ, θ, p, o), e′
                           if progressable(H, Σ, µ, θ, p, o), p = pzone(∆),
                              H(o) = ⟨B^c; Fd⟩, modifier(c) = µ
  H, Σ, ⟨v⟩^{⟨o;c;m⟩}   ⇒∆  H, Σ − {o1 ⇝θ o2 | o1 or o2 is o}, v
                           if H(o) = ⟨B^c; Fd⟩, modifier(c) ∈ task
  H, Σ, ⟨v⟩^{⟨o;c;m⟩}   ⇒∆  H, Σ, v            if H(o) = ⟨B^c; Fd⟩, modifier(c) ∉ task

In essence, all object access control (field read/write or messaging) is delegated to the reduction of o^θ, where the function progressable(H, Σ, µ, θ, p, o) is defined as:

  µ \ θ                 | T                               | R or W
  ordinary (no modifier)| true                            | true
  shared                | roots(H, Σ, o) ⊆ roots(H, Σ, p) | true
  shared task           | roots(H, Σ, o) ⊆ roots(H, Σ, p) | true
  task                  | (o ⇝T o) ∉ Σ                    | true
and the function dset(µ, θ, p, o) is defined as:

  µ \ θ                 | T                                  | R or W
  ordinary (no modifier)| ∅                                  | {p ⇝θ o, p →θ o}
  shared                | {p ⇝T o, p →T o}                   | ∅
  shared task           | {p ⇝T o, p →T o}                   | ∅
  task                  | {p ⇝T o, p →T o, o ⇝T o, o →T o}   | ∅
By now the difference between →θ and ⇝θ should be clear: →θ constraints grow monotonically in the dynamic constraint set, recording the entire "history" of dynamic accesses since the program was bootstrapped. ⇝θ constraints, on the other hand, are removed whenever an invocation of a (shared or non-shared) task object ends – intuitively, a task "frees" all the objects it accessed at the end of its execution. For that reason, ⇝θ records the access relation only at a specific runtime snapshot. We next define a function to compute the set of "root" task objects (i.e., non-shared task objects) that are currently accessing o.

Definition 1 (Roots). Function roots(H, Σ, o) is defined as the largest set of o′ such that H(o′) = ⟨B^c; Fd⟩, modifier(c) = task, and there exists some n ≥ 0 with {o′ ⇝T o1, . . . , on−1 ⇝T on, on ⇝θ o} ⊆ Σ.
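To make the blocking conditions concrete, the following is a direct transcription of the progressable table into Java (a sketch under our own encoding assumptions: the enums, the Set representation of roots, and the targetBusy flag standing for (o ⇝T o) ∈ Σ are ours, not the implementation's):

  import java.util.Set;

  enum Mode { T, R, W }
  enum Modifier { ORDINARY, SHARED, SHARED_TASK, TASK }

  final class Progressable {
      // rootsOfO and rootsOfP stand for roots(H, Sigma, o) and
      // roots(H, Sigma, p); targetBusy stands for (o ~T~> o) in Sigma.
      static boolean progressable(Modifier mu, Mode theta,
                                  Set<String> rootsOfO,
                                  Set<String> rootsOfP,
                                  boolean targetBusy) {
          if (theta != Mode.T) return true;     // reads/writes never block
          switch (mu) {
              case ORDINARY:    return true;    // messaging ordinary objects never blocks
              case SHARED:
              case SHARED_TASK: return rootsOfP.containsAll(rootsOfO);
              case TASK:        return !targetBusy;  // one message at a time
              default:          return false;   // unreachable
          }
      }
  }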
Let us now study liveness. The reduction for o^θ shows that field reads/writes are never blocked and that messaging to non-shared ordinary objects is never blocked. This is precisely why declaring objects as non-shared ordinary objects improves performance. When the messaging target is a shared ordinary/task object, execution is not blocked iff the target is not accessed by any task (roots(H, Σ, o) = ∅) or the access is reentrant (roots(H, Σ, o) = roots(H, Σ, p)). These two cases, combined with the fact that |roots(H, Σ, p)| = 1 (by the nature of how call stacks grow and shrink), can be summarized by the predicate roots(H, Σ, o) ⊆ roots(H, Σ, p). If the target is a non-shared task object, messages are processed one at a time, as discussed in Sec. 2; this is why the pre-condition (o ⇝T o) ∉ Σ is used in that case.

Properties. We next discuss the properties of our language, starting with some definitions. Since the language runtime can block, deadlocks are possible:

Definition 2 (Deadlock). S = ⟨H; Σ; ⟨E[o0^T]⟩^{⟨p0;c0;m0⟩} ∥ · · · ∥ ⟨E[on^T]⟩^{⟨pn;cn;mn⟩} ∥ e⟩ is a deadlock configuration iff for i = 0..n, H(oi) = ⟨Bi^{ci}; Fdi⟩ and pi ∈ roots(H, Σ, o(i+1) mod (n+1)).

This definition shows how Task Types programming can help programmers reduce the likelihood of deadlocks by encouraging the default non-shared memory model. Deadlock cannot happen on non-shared ordinary objects: if there are no locks, there are no deadlocks. (A schematic example of a deadlocked program shape appears after Theorem 2 below.)

Next, some run-time exceptions standard in Java-like languages are possible in our language:

Definition 3 (Null Pointer Exception). S leads to a null pointer exception iff S = ⟨H; Σ; E[null ∗[β] m(v)]⟩.

Definition 4 (Bad Cast). Configuration S leads to a bad cast exception iff S = ⟨H; Σ; E[(c′)o]⟩ where H(o) = ⟨B^c; Fd⟩ and c′ ∉ supers(c).

The type soundness property is as follows.

Theorem 1 (Type Soundness). If ⊢p C : C, then there exists S such that ⇒C S, where either S ⇑, or S = ⟨H; Σ; v⟩, or S is a deadlock configuration, or S leads to a null pointer or bad cast exception, for some H, Σ.

This theorem states that the execution of a statically typed program either diverges, reaches a deadlock configuration, raises a standard exception, or computes to a value. It is established by Subject Reduction and Progress lemmas, together with the simple fact that the bootstrapping process leads to a well-typed initial state.

Theorem 2 (Type Decidability). For any program C it is decidable whether there exists a C such that ⊢p C : C.
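As promised above, here is a minimal program shape that can reach a deadlock configuration in the sense of Definition 2 (a sketch in the paper's surface syntax; the class names and bodies are our own illustration):

  shared class Account {
    int balance;
    void move(int n) { balance += n; }
  }

  task class Transfer {
    void run(Account from, Account to) {
      from.move(-1);   // this task now holds 'from' until it ends
      to.move(+1);     // blocks while another task holds 'to'
    }
  }

If two Transfer tasks are launched over the same two accounts in opposite argument order, each can come to hold one Account while its o^T helper waits on the other, giving exactly the circular roots() condition of Definition 2. No such cycle can involve non-shared ordinary objects, since accessing them never blocks.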
We now move on to state theorems related to the non-shared memory model. Let us first introduce the definition of a well-partitioned heap:

Definition 5 (Well-Partitioned Heap). partitioned(H, Σ) holds iff for all o such that (o′ ⇝W o) ∈ Σ, |roots(H, Σ, o)| = 1.

The predicate says that, at run time, if a non-shared ordinary object is write-accessed, then all accesses to it must be initiated by a single non-shared task. We now describe some properties related to task isolation.

Theorem 3 (Static Task Isolation). If ⇒C ⟨H; Σ; e⟩, then partitioned(H, Σ).

This theorem says that a written non-shared ordinary object cannot be accessed by more than one "root" task at the same time. The theorem may appear appealing, but it is in fact weaker than what Task Types enforce for non-shared ordinary objects, because it cannot prevent a non-shared ordinary object from first being accessed by one task, released, and then accessed by another (a schematic illustration of this gap appears after Theorem 6 below). The theorem that fully reflects the spirit of non-shared ordinary objects is the following:

Theorem 4 (Thread Locality). If ⊢p C : C and ⇒C ⟨H; Σ; e⟩, then WF(Σ).

This crucial theorem says that the run-time constraint set in fact "conforms" to the statically checked one, in that cut vertices still exist for all non-shared ordinary objects. Note that WF works on →θ constraints, which here include the entire "history" of accesses. This theorem implies that each non-shared ordinary object is accessed by at most one task over its entire lifetime.

Let us now state some concurrency properties:

Theorem 5 (Race Freedom). If ⊢p C : C, then there are no race conditions for field access in any execution of C.

This theorem can be easily derived from Thm. 3, which covers the most important subcase: fields of non-shared ordinary objects can never be accessed by more than one "root" task. For fields of other kinds of objects, the pre-condition progressable() suffices to guarantee race freedom. Last, we can prove that Task Types programs preserve atomicity:

Theorem 6 (Atomicity). If ⊢p C : C, then pervasive quantized atomicity as defined in [LLS08] is preserved for all executions of C.

Since this property is identical to the one defined in [LLS08], we defer it to the accompanying technical report online [TT]. Informally, it is the pervasive, partitioned atomicity property described in Sec. 2.2. The definition of that property is based on the theory of reduction [Lip75], a standard method for proving atomicity properties.
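To illustrate the gap between Theorem 3 and Theorem 4, consider this schematic program shape (our own example in the paper's surface syntax):

  class Buffer { int x; }                 // non-shared ordinary

  task class Producer { void run(Buffer b) { b.x = 1; } }
  task class Consumer { void run(Buffer b) { b.x = 2; } }

If a Producer task runs to completion on some Buffer b before a Consumer task touches it, partitioned(H, Σ) holds at every step, since at no instant is b write-accessed by two roots; Theorem 3 alone would permit this. But the →W history in Σ records writes to b from two distinct tasks, so WF(Σ) fails: Theorem 4 rules the program out, which is exactly what the static type system enforces.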
5. Implementation and Evaluation
A prototype implementation of Task Types has been built on top of the CoqaJava compiler [LLS08], which is in turn built using the Polyglot Java framework [NCM03]. We support the core syntax presented in Fig. 4, plus standard Java features including primitive values, local variable declarations and assignment, local method invocations, public field access, multiple-argument methods, the return statement, field initializers, arrays, static classes and methods, method and constructor overloading, super invocations in constructors, super invocations for regular methods, and control-flow expressions (loops, branches, and exception handling). We rely on Polyglot 2.4's built-in inner class remover to process inner classes, and we conservatively wrap static fields as shared ordinary objects (with lock protection) among all instances, since they are global variables and any access to them may constitute an atomicity break point. The current implementation does not support reflection or native methods.

We now report some preliminary benchmarks. We picked two programs at opposite ends of the contention spectrum: the low-contention RayTracer and the high-contention PuzzleSolver. All benchmarks were performed on a machine with four 6-core Intel E7450 2.4GHz CPUs (24 cores total) and 64GB RAM, running Debian GNU/Linux 2.6.26.

RayTracer. RayTracer is a computationally intensive algorithm for rendering 3D images and is taken from the Java Grande suite [SBO01]. We tweaked the program to fit the new Task Types syntax. In the test runs we created 12 non-shared tasks to process individual Scene objects concurrently; each scene contains 150 * 500 pixels. The program was executed 50 times and the first 2 runs were discarded. Table 1 below reports the elapsed time in milliseconds, the median of the 48 hot runs.
              # Cores:    1      2      4      6     12
  Coqa                   741    593    450    364    340
  Task Types             622    540    279    264    289

Table 1. RayTracer Benchmark: Coqa vs. Task Types (elapsed time, ms)

In the table, the "Task Types" results are for an implementation of RayTracer with the minimal sharing declarations needed for the program to pass the Task Types typechecker. The program contains 1955 lines of code in 16 classes. We are able to declare 12 of these classes as non-shared ordinary classes (µ empty); two are non-shared task classes, i.e., thread launchers; and two needed to be declared shared. The fact that the vast majority of classes are typable as non-shared ordinary classes confirms our earlier assertion that most objects can be coded normally – no special sharing declaration is needed. "Coqa" is a re-implementation of the same program with the ordinary classes above explicitly declared as shared, meaning the program follows the Coqa model: all non-task objects are guarded with runtime locks, so at most one task can use them at a time.
As can be seen, the use of non-shared objects led to a nontrivial performance improvement, averaging about 20% faster across the cases here. This preliminary result confirms that a nontrivial speedup is obtained compared to the purely dynamic approach of Coqa.

PuzzleSolver. The PuzzleSolver benchmark was discussed earlier in Sec. 2.3.1. We benchmarked both the Task Types re-implementation of PuzzleSolver and the original Java implementation. To give meaningful multicore data, we created 1 task in the 1-core run, 2 tasks in the 2-core run, and so on. The programs were executed 20 times and the first 2 runs were discarded. Table 2 below reports the elapsed time in milliseconds, the median of the remaining 18 hot runs.
              # Cores:    1      2      4      6     12
  Task Types             123    617     94     99    100
  Java                    80    564     91     74     82

Table 2. PuzzleSolver Benchmark Results (elapsed time, ms)

Observe that for a single-core execution, the Task Types implementation is about 35% slower; as the number of cores increases, the slowdown decreases to around 20%. Observe also that, due to high contention, this benchmark is not appreciably faster with multiple cores. It also exhibits an odd slowdown at two cores which is hard to explain, but which lies in the underlying Java concurrency implementation, since both implementations exhibit it. Our implementation of runtime shared task locking is still inefficient and can be improved, and we are also unnecessarily locking PuzzlePosition accesses in this benchmark, as pointed out in Sec. 2.3.1. So, we expect these numbers to improve considerably in a production implementation. That is not to say we expect no runtime penalty at all; it is unlikely to go all the way to zero, as some of the locking is necessary for correctness.

Compilation Time. Compilation of both benchmarks took only a few seconds; the constraint closure step took around 1/3 second in each case, which shows it is not a significant factor here. More efficient implementations of context-sensitive/polymorphic type closure have been investigated [MRR05; EGH94; Age96; WS01; WL04]. As we move on to larger code samples, we will consider implementing various optimizations such as BDDs [BLQ+03; WL04] and constraint garbage collection [WS01] if they are needed.
6. Related Work
Type Systems, Static Analyses, and Logics. Flanagan and Qadeer [FQ03] first explored type system support for atomicity; their work takes a shared-memory model as its basis and supports reasoning about the composability of atomic blocks over the shared memory, whereas Task Types can be viewed as carving out a non-shared-memory model in which atomicity violations cannot arise. STM systems primarily use dynamic means to enforce atomicity [HF03; WJH04; CMC+06]. The problem of weak atomicity has attracted significant interest recently, and a number of static or hybrid solutions exist to ensure transactional and non-transactional code do not interfere with each other. Such a property is enforced by Shpeisman et al. [SMAT+07] via a hybrid approach with dynamic escape analysis and a not-accessed-in-transaction static analysis. The latter has the good property of allowing "data handoff": a transaction can pass along an object without accessing it. Data handoff is useful when different threads share a data structure (say, a queue) but not the elements in it. This property also holds for Task Types: passing along an object reference, or storing it, is not considered access, since atomicity is not violated. AME [ABHI08] describes a conceptually static procedure to guarantee violation-freedom between transactional and non-transactional code. Harris et al. [HMPJH05] used the monads of Haskell to separate computations with effects from those without.

Constructing type systems to assure race-condition freedom is a well-explored topic. These systems work under significantly different assumptions than ours: they typically assume a Java-like shared memory with explicit lock acquire/release expressions. Given the non-shared memory assumption and protected access to shared task objects, race conditions do not occur in our system. Static inference techniques have been designed to automatically insert locks to enforce atomicity for atomic blocks [MZGB06; HFP06; EFJM07; CCG08]. These techniques assume that atomic blocks are a fundamentally "local" construct, and make assumptions that are unrealistic for supporting the larger atomic regions we aim for: Autolocker [MZGB06] requires all invocations in an atomic block to be inlined; a static bound on lock resources is needed in others [EFJM07; CCG08; HFP06]; and some require that all objects accessed in the atomic block not be accessed elsewhere [HFP06].

Our locality enforcement algorithm is related to ownership/region types, and particularly their inference [CCQR04; Wre03; DM07; LS08]. Cut vertices are evocative of dominators in owners-as-dominators ownership type systems; in Task Types, DAGs can be freely formed inside the boundary of the dominator. A major challenge of our design – the problem that motivated task twinning – is not an issue for ownership/region types: when recursion happens, there is nothing wrong with an ownership/region declaration/inference system assigning the same type variable to all recursive occurrences, but this would lead to atomicity violations in our setting; "two copies" of the task static instances are needed.
There are many static analysis algorithms for tracking the flow of objects. Closest to our work are several thread escape analyses [CGS+99; WR99; Bla99] for object-oriented languages, which use reachability graphs to prevent or track alias escape from threads. Task Types share a focus on thread locality with this work, but differ in two important respects: (1) the pervasive atomicity of our language requires shared-task-style sharing, a "partial escape" that should be allowed but has no representation in these analyses; (2) fundamentally, object references do not need to be confined to guarantee atomicity: it is perfectly fine for a task to create an object and pass it to another task, which in turn stores it or passes it further to a third task. The key to atomicity is that there is no conflicting object access. These differences produce unique issues that escape analyses do not encounter and that we need to solve here.

A type system is a logical system, and our work bears a distant relation to program logics such as separation logic. In separation logic, local properties of the heap can be guaranteed by partitioning the heap, reasoning about the components, and then soundly re-composing to obtain the property over the full heap. Task Types share this philosophy of heap partitioning in how the access relations partition objects. Although, as far as we know, there are currently no separation logics for concurrent object-oriented languages, there has been work on separation logics for Java [PB05; PB08] and for concurrency [OHe07; Bro07]. We believe the locality property of the relation cutExists is difficult to express with heap partitioning, since cut vertices are a subtle form of heap sharing; these may be useful areas on which to focus extensions of separation logic.

Language Designs for Atomicity. The shortcomings of explicitly declared atomic blocks are summarized in [VTD06]. In that work, a data-centric approach is taken: the fields of an object are partitioned into statically labelled atomic sets, and access to fields in the same set is guaranteed to be atomic, analogous to declaring different shared tasks for different atomic sets in Task Types. Their data-centric approach is a step forward compared with atomic blocks, but the design philosophy is still no-atomicity-unless-you-declare-it, and hence fundamentally different from our notion of pervasive atomicity.

The Actor model and related message-passing languages [Agh90; Arm96; SM08] achieve atomicity by imposing stronger limitations on object sharing: threads communicate only at thread launch. A primary appeal of this model is that each actor message handler thread executes atomically. If a synchronous communication is needed, however, the sender needs an explicit message handler for processing the return value. The synchronous sender must thus be coded as two handlers: the code for actions up to and including the send, and the post-send return-value handler (the continuation); this breaks the sender into two different atomicity regions. Additionally, it obscures the control flow that would be apparent in synchronous messaging syntax, making programming more difficult. Some implemented Actor-based languages do include implicit CPS-transformation syntax to ease coding, but that convolutes the code's meaning (variable scoping, for example) and does not repair the fact that the span of an atomic region was broken. Kilim [SM08] is a more recent actor-like language, with a focus on providing refined message-passing mechanisms without sacrificing the isolation property of Actors. The Kilim type system relies on extra programmer declarations called isolation modifiers to denote how each parameter can be passed/used in the inter-actor context. Kilim and Task Types share a focus on object isolation, but in orthogonal design spaces: Kilim on message passing, Task Types on atomicity. A recent work closer to our spirit of pervasive atomicity is AME [IB07; ABHI08]. The language constructs of AME are along the lines of Actors, where an async e expression starts up an atomicity-preserving thread. AME differs from our work in its support for an expression unprotected e, meaning atomicity is not preserved for e; this is fundamentally different from pervasive atomicity. In addition, AME does not support synchronous messaging, and it does not overlap with the static type system aspect of our work.

A more conservative approach than atomicity is to design determinism into a concurrent language, such as building a type and effect system to guarantee threads are deterministic [BAD+09]. While this system is very useful for some algorithms, the properties that make algorithms deterministic are surprisingly subtle, and their system is often too weak to detect the determinism. Also, many algorithms are in fact not deterministic, because some choices can be arbitrary; note that both of our benchmarks are fundamentally arbitrary – the PuzzleSolver, for example, has tasks competing for next moves, and processor timing dictates which task wins. The pervasive atomicity of this paper is a less rigid notion for the programmer, in that threads must be divided into deterministic segments but need not be wholly deterministic. The two approaches are compatible; perhaps an ideal language would use the DPJ approach when determinism was feasible to guarantee, and fall back to Task Types when not.
7. Discussion
In our initial experience we have not found it difficult to port basic concurrency patterns to Task Types, but further work is needed to increase the accuracy of the system to cover the full spectrum of programming patterns. Since increased accuracy also brings increased complexity, there needs to be strong justification for an extension before it is made. We left several potentially useful extensions out of this paper; we discuss them now.
Simple Extensions. Several features are left out of the formal system for simplicity, but can be added with minimal change. The first feature is per-object sharing declarations (rather than the per-class sharing declarations found in the core calculus). For instance, the PuzzlePosition class of Sec. 2.3.1 is a case where a per-object declaration of the sharing policy, in place of the current per-class declaration, would have allowed the program to typecheck without the need for runtime locks (a sketch appears at the end of this subsection). This extension is technically trivial: because our polymorphic type system is able to differentiate objects of the same class anyway, the only changes needed to implement it are (1) turning types from c@α into a more verbose form such as c@α of µ, and (2) whenever a flow constraint is generated, collecting a constraint asserting that both sides have compatible modifiers.

A second extension, which would also have allowed the same program to typecheck, is a call-by-copy parameter-passing mechanism. This is a variation on immutability that allows object data to be freely passed from one task to another: since the callee gets a fresh copy, no back-channel is created.

A third extension, in which the granularity of sharing could be made finer still, is to take individual fields of objects as the atomic units of sharing, meaning some fields could be owned by one thread and other fields by another. This approach may indeed allow more programs to typecheck, and it is not difficult to implement, but it is not clear to us that it is a good idea for object-oriented programmers (as opposed to C programmers): it violates a principle of object-based design in that such an object has dual allegiance, and may in fact be a sign that the object should be refactored into two different objects.

On the design spectrum with full declaration at one end and full inference at the other, Task Types lie somewhere in the middle: they form a type inference system, but shared and task classes and messaging must be explicitly declared. In general, the design principle here is that sharing policies and execution policies for classes are essential and should be declared, but their enforcement can be the job of a program analysis – in this case a polymorphic type inference algorithm – for programmability reasons. The ideas here are to some degree independent of this spectrum: it would be possible to infer sharing when needed, and it would also be possible to have an explicit declarative type system that captures the necessary well-formedness properties.
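A minimal sketch of how the per-object extension might look at the source level (purely illustrative; the new shared form is our own invented syntax – the paper only specifies the type-level change from c@α to c@α of µ):

  class Solver {
    void search() {
      // stays a non-shared ordinary object, owned by this task:
      PuzzlePosition scratch = new PuzzlePosition();
      // hypothetical per-object declaration: only this instance is
      // shared, so 'scratch' above needs no runtime locks:
      PuzzlePosition frontier = new shared PuzzlePosition();
    }
  }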
Non-Trivial Extensions. Task Types disallow a non-shared ordinary object from being used in "phases", each of which belongs to a different task: "from bootstrapping to a particular point of execution, object o belongs to task X; from that point on, o will never be used by X again, and can be grabbed by task Y." To see why this could in some cases be a useful feature, consider the Swing AWT invokeAndWait method. This method allows one thread to pass some GUI-manipulating code to the AWT thread, where it will be queued up as an event and handled. Since Swing is not thread-safe, this is the only safe way another thread can interact with Swing widgets outside of listeners. Consider the following example, taken from the Swing threading documentation:

  void printTextField() throws Exception {
    final String[] myStrings = new String[2];
    Runnable getTextFieldText = new Runnable() {
      public void run() {
        myStrings[0] = textField0.getText();
        myStrings[1] = textField1.getText();
      }
    };
    SwingUtilities.invokeAndWait(getTextFieldText);
    System.out.println(myStrings[0] + " " + myStrings[1]);
  }
Here invokeAndWait passes the code snippet in run() for execution on the AWT thread; the current thread blocks until run() completes. If this Java code were ported directly to Task Types, the myStrings array would need to be statically owned by both the main task and the AWT task, so typing would fail; not even making the array a shared ordinary object would help, since even after run() completes the AWT thread does not release the array, because the AWT task itself has not completed. A call-by-copy semantics, as discussed above, would restore typability, since the original task would get a fresh copy of the data; however, this adds unnecessary overhead.

This example shows it may be useful to add an object transfer feature. As can be seen above, tasks hold on to shared ordinary objects even when they are in fact finished with them; early release at an existing atomicity break point would not increase the number of atomic regions and would allow programs such as the above to statically typecheck: at the point where run() completes, the myStrings array can safely be transferred back to the original task (one possible surface form is sketched at the end of this section). For the above example, a flow-insensitive type analysis should be able to determine that myStrings does not escape run(), and if a singular capability to run() were passed to the AWT task, object transfer upon return would be sound. In more complex cases it may be necessary to use a flow-sensitive analysis to detect when an object is no longer used.

We discussed task twinning in Sec. 2.2. This method is relatively easy to prove correct, but in practice its conservative definition can be refined in several situations, such as when a program point is obviously not instantiating more than one task, or when the instantiated tasks are obviously not accessing the same objects. One refinement was mentioned in Sec. 2.2: twinning is not needed when a simple analysis determines a program point is not in a recursive context. Another refinement would track information flow in and out of the task object: no sharing can happen between two tasks instantiated from the same program point – so twinning is not needed – if there is no flow leading them to share.

Limitations. A common complaint about constraint-based type systems such as this one is that they are non-modular, since the whole program is needed to compute the constraint closure. This mainly leads to two problems: (1) the burden of typechecking overhead at dynamic loading time à la Java, and (2) the difficulty of printing precise error messages. As discussed in Sec. 3, problem (1) motivated us to formulate our system in two phases, so that many type errors can in fact be discovered modularly. Furthermore, if dynamically loaded code were certified by signature to be identical to the compile-time code, no additional dynamic link checking would be needed. We agree that problem (2) is a real challenge; indeed, any system with type inference across modularity boundaries – such as the type inference of ML – faces this challenge. As our compiler matures, we are interested in studying various error-localization approaches.
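Returning to the object transfer idea above, one purely illustrative surface form (the release construct is our invention; the paper proposes no concrete syntax, and a real design would need the capability analysis just described):

  void printTextField() throws Exception {
    final String[] myStrings = new String[2];
    // ... getTextFieldText defined as before ...
    SwingUtilities.invokeAndWait(getTextFieldText);
    // hypothetical: the AWT task hands myStrings back to this task at
    // an existing atomicity break point, keeping the cut vertex intact
    release myStrings;
    System.out.println(myStrings[0] + " " + myStrings[1]);
  }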
8. Conclusion
We have presented an approach to achieving atomicity in multithreaded object-oriented programs, making the following contributions:

• We develop a top-down, mostly static approach to enforcing atomicity, as opposed to the usual bottom-up, dynamic approach. Atomicity with Task Types is top-down in the sense that it is pervasive, instead of building islands of atomic blocks. Principled language constructs for sharing between tasks are provided, and strong atomicity is achieved for all code regardless of sharing.

• We provide a programming model whose language syntax requires only minor changes to standard object syntax, but whose philosophy – a non-shared memory model by default, with explicit support for data sharing between threads – is significantly different from the norm. As a consequence, object sharing between tasks is brought front and center for the programmer, where it should be.

• Task Types incorporate a precise and provably sound polymorphic/context-sensitive analysis which statically verifies that non-shared ordinary objects are appropriately partitioned between tasks. Since the partitioning is verified statically, there is no need for dynamic partitioning of non-shared objects between tasks, and thus no additional runtime overhead.

• The viability of Task Types has been confirmed by a prototype compiler, and initial benchmark results are reasonable.

Task Types are a complex type system; we believe this stems from complexity inherent in writing shared-memory concurrent programs correctly. Although it could be argued that this means shared-memory concurrency should be abandoned, we are perhaps too far down that road to turn around, and Task Types represent a compromise between the "Wild West" of Java programming today and the rigid straitjacket of non-shared-memory concurrency. Programmers also need not understand every detail of the type system – only that some object is shared more than the type system allows, and that either its use must be restricted or its class must be lifted to a shared class. As in many type system approaches, the programmer burden of typing under Task Types will be nontrivial. That is not necessarily a bad thing: the ML language, by analogy, has a type system that is challenging for new programmers but greatly increases productivity in the long term, because it catches what would otherwise be difficult run-time bugs, and the extra time spent typechecking is more than made up by the time saved in later debugging. We believe Task Types will offer a similar advantage for concurrent programming.

The webpage of Task Types [TT] includes all reduction rules, a complete set of proofs for all theorems presented in this paper, and the source code of the compiler.
References

[ABHI08] Martín Abadi, Andrew Birrell, Tim Harris, and Michael Isard. Semantics of transactional memory and automatic mutual exclusion. In POPL '08, pages 63–74, 2008.
[Age96] Ole Agesen. Concrete type inference: delivering object-oriented applications. PhD thesis, Stanford University, Stanford, CA, USA, 1996.
[Agh90] Gul Agha. Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, Cambridge, Mass., 1990.
[Arm96] J. Armstrong. Erlang – a survey of the language and its industrial applications. In INAP '96: The 9th Exhibition and Symposium on Industrial Applications of Prolog, pages 16–18, Hino, Tokyo, Japan, 1996.
[BAD+09] Robert L. Bocchino, Jr., Vikram S. Adve, Danny Dig, Sarita V. Adve, Stephen Heumann, Rakesh Komuravelli, Jeffrey Overbey, Patrick Simmons, Hyojin Sung, and Mohsen Vakilian. A type and effect system for Deterministic Parallel Java. In OOPSLA '09, pages 97–116, 2009.
[Bla99] Bruno Blanchet. Escape analysis for object-oriented languages: application to Java. SIGPLAN Not., 34(10):20–34, 1999.
[BLQ+03] Marc Berndl, Ondřej Lhoták, Feng Qian, Laurie Hendren, and Navindra Umanee. Points-to analysis using BDDs. In PLDI '03, pages 103–114, 2003.
[BLR02] Chandrasekhar Boyapati, Robert Lee, and Martin Rinard. Ownership types for safe programming: preventing data races and deadlocks. In OOPSLA '02, pages 211–230, 2002.
[Bro07] Stephen Brookes. A semantics for concurrent separation logic. Theoretical Computer Science, 375(1-3):227–270, 2007.
[CCG08] Sigmund Cherem, Trishul Chilimbi, and Sumit Gulwani. Inferring locks for atomic sections. In PLDI '08, pages 304–315, 2008.
[CCQR04] Wei-Ngan Chin, Florin Craciun, Shengchao Qin, and Martin Rinard. Region inference for an object-oriented language. In PLDI '04, pages 243–254, 2004.
[CGS+99] Jong-Deok Choi, Manish Gupta, Mauricio Serrano, Vugranam C. Sreedhar, and Sam Midkiff. Escape analysis for Java. In OOPSLA '99, pages 1–19, 1999.
[Cla01] Dave Clarke. Object Ownership and Containment. PhD thesis, University of New South Wales, July 2001.
[CMC+06] B. Carlstrom, A. McDonald, H. Chafi, J. Chung, C. Minh, C. Kozyrakis, and K. Olukotun. The Atomos transactional programming language. In PLDI '06, June 2006.
[CPN98] David G. Clarke, John M. Potter, and James Noble. Ownership types for flexible alias protection. In OOPSLA '98, pages 48–64. ACM Press, 1998.
[DG04] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. In OSDI '04, 2004.
[DM07] W. Dietl and P. Müller. Runtime universe type inference. In IWACO '07, 2007.
[EFJM07] Michael Emmi, Jeffrey S. Fischer, Ranjit Jhala, and Rupak Majumdar. Lock allocation. In POPL '07, pages 291–296, 2007.
[EGH94] Maryam Emami, Rakesh Ghiya, and Laurie J. Hendren. Context-sensitive interprocedural points-to analysis in the presence of function pointers. In PLDI '94, pages 242–256, 1994.
[FQ03] Cormac Flanagan and Shaz Qadeer. A type and effect system for atomicity. In PLDI '03, pages 338–349, 2003.
[GLS94] William D. Gropp, Ewing Lusk, and Anthony Skjellum. Using MPI: Portable Parallel Programming with the Message Passing Interface. MIT Press, Cambridge, MA, 1994.
[Gro03] Dan Grossman. Type-safe multithreading in Cyclone. In TLDI '03, pages 13–25, 2003.
[HF03] Tim Harris and Keir Fraser. Language support for lightweight transactions. In OOPSLA '03, pages 388–402, 2003.
[HFP06] Michael Hicks, Jeffrey S. Foster, and Polyvios Pratikakis. Lock inference for atomic sections. In TRANSACT '06, June 2006.
[HMPJH05] Tim Harris, Simon Marlow, Simon Peyton-Jones, and Maurice Herlihy. Composable memory transactions. In PPoPP '05, pages 48–60, 2005.
[IB07] Michael Isard and Andrew Birrell. Automatic mutual exclusion. In HOTOS '07: Proceedings of the 11th USENIX Workshop on Hot Topics in Operating Systems, pages 1–6, Berkeley, CA, USA, 2007. USENIX Association.
[IPW99] Atsushi Igarashi, Benjamin Pierce, and Philip Wadler. Featherweight Java: a minimal core calculus for Java and GJ. In ACM Transactions on Programming Languages and Systems, pages 132–146, 1999.
[LB98] Sheng Liang and Gilad Bracha. Dynamic class loading in the Java virtual machine. In OOPSLA '98, pages 36–44. ACM Press, 1998.
[Lip75] Richard J. Lipton. Reduction: a method of proving properties of parallel programs. Commun. ACM, 18(12):717–721, 1975.
[LLS08] Yu David Liu, Xiaoqi Lu, and Scott F. Smith. Coqa: Concurrent objects with quantized atomicity. In CC '08, March 2008.
[LS08] Yu David Liu and Scott Smith. Pedigree types. In 4th International Workshop on Aliasing, Confinement and Ownership in Object-Oriented Programming (IWACO), pages 63–71, July 2008.
[MRR05] Ana Milanova, Atanas Rountev, and Barbara G. Ryder. Parameterized object sensitivity for points-to analysis for Java. ACM Trans. Softw. Eng. Methodol., 14(1):1–41, 2005.
[MZGB06] Bill McCloskey, Feng Zhou, David Gay, and Eric Brewer. Autolocker: synchronization inference for atomic sections. In POPL '06, pages 346–358, 2006.
[NCM03] Nathaniel Nystrom, Michael R. Clarkson, and Andrew C. Myers. Polyglot: An extensible compiler framework for Java. In CC '03, volume 2622, pages 138–152, NY, April 2003. Springer-Verlag.
[OHe07] Peter W. O'Hearn. Resources, concurrency, and local reasoning. Theoretical Computer Science, 375(1-3):271–307, 2007.
[PB05] Matthew Parkinson and Gavin Bierman. Separation logic and abstraction. In POPL '05, pages 247–258, 2005.
[PB08] Matthew J. Parkinson and Gavin M. Bierman. Separation logic, abstraction and inheritance. In POPL '08, pages 75–86, New York, NY, USA, 2008. ACM.
[SBO01] L. A. Smith, J. M. Bull, and J. Obdržálek. A parallel Java Grande benchmark suite. In Supercomputing '01: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing, 2001.
[SM08] Sriram Srinivasan and Alan Mycroft. Kilim: Isolation-typed actors for Java. In ECOOP '08, 2008.
[SMAT+07] Tatiana Shpeisman, Vijay Menon, Ali-Reza Adl-Tabatabai, Steven Balensiefer, Dan Grossman, Richard L. Hudson, Katherine F. Moore, and Bratin Saha. Enforcing isolation and ordering in STM. In PLDI '07, pages 78–88, 2007.
[TT] http://www.cs.binghamton.edu/~davidL/tasktypes.
[TT97] Mads Tofte and Jean-Pierre Talpin. Region-based memory management. Information and Computation, 1997.
[VTD06] Mandana Vaziri, Frank Tip, and Julian Dolby. Associating synchronization constraints with data in an object-oriented language. In POPL '06, pages 334–345, 2006.
[WJH04] Adam Welc, Suresh Jagannathan, and Antony L. Hosking. Transactional monitors for concurrent objects. In ECOOP '04, pages 519–542, 2004.
[WL04] John Whaley and Monica S. Lam. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In PLDI '04, pages 131–144, 2004.
[WR99] John Whaley and Martin Rinard. Compositional pointer and escape analysis for Java programs. In OOPSLA '99, pages 187–206, 1999.
[Wre03] Alisdair Wren. Ownership type inference. Master's thesis, Imperial College, 2003.
[WS01] Tiejun Wang and Scott F. Smith. Precise constraint-based type inference for Java. In ECOOP '01, pages 99–117, 2001.
A Dynamic Semantics

  (R-WriteReg)   H, Σ, f^reg := v ⇒∆ H[o ↦ ⟨B^c; Fd[f ↦ v]⟩], Σ, o^W; v    if ∆ = ⟨o; c; m⟩ : ∆′
  (R-ReadReg)    H, Σ, f^reg ⇒∆ H, Σ, o^R; Fd(f)                           if ∆ = ⟨o; c; m⟩ : ∆′
  (R-WriteCons)  H, Σ, f^cons := v ⇒∆ H[o ↦ ⟨B^c; Fd[f ↦ v]⟩], Σ, v        if ∆ = ⟨o; c; m⟩ : ∆′
  (R-ReadCons)   H, Σ, f^cons ⇒∆ H, Σ, Fd(f)                               if ∆ = ⟨o; c; m⟩ : ∆′
  (R-Msg)        H, Σ, o ∗[β] m(v) ⇒∆ H, Σ, o^T; wrap(∗, e)                if e = fun(⟨o; c; m⟩ : ∆, H, v)
  (R-New)        H, Σ, new_{B1} c1(v) ⇒∆ H ⊎ (q ↦ ⟨B1^{c1}; {f ↦ null | f ∈ dom(Γ)}⟩), Σ, q ∗[β] c1(v)
                     if q fresh, fields(c1) = ∀A.Γ, ∗ matches modifier(c1) returns c1, B1 = β : B1′
  (R-NewT1)      H, Σ, new_{B1,B2} c0(v) ⇒∆ H, Σ, new_{B1} c0(v)
  (R-NewT2)      H, Σ, new_{B1,B2} c0(v) ⇒∆ H, Σ, new_{B2} c0(v)
  (R-Cast)       H, Σ, (c′)o ⇒∆ H, Σ, o                                    if c′ ∈ supers(c)
  (R-⟨⟩-Elim1)   H, Σ, ⟨v⟩^{⟨o;c;m⟩} ⇒∆ H, Σ, v                             if µ ∉ task
  (R-⟨⟩-Elim2)   H, Σ, ⟨v⟩^{⟨o;c;m⟩} ⇒∆ H, Σ\o, v                           if µ ∈ task
  (R-Cont)       H, Σ, v; e ⇒∆ H, Σ, e
  (R-Access)     H, Σ, o^θ ⇒∆ H, Σ ∪ dset(µ, θ, p, o), null                if progressable(H, Σ, µ, θ, p, o), p = pzone(∆)
  (R-Commute)    H, Σ, e ∥ e′ ⇒ H, Σ, e′ ∥ e
  (R-∥-Elim)     H, Σ, v ∥ e ⇒ H, Σ, e
  (R-Post)       H, Σ, E[post e] ⇒ H, Σ, E[null] ∥ e
  (R-Context)    H, Σ, E[e] ⇒ H′, Σ′, E[e′]                                 if H, Σ, e ⇒^{cxt(E)} H′, Σ′, e′

Fig. 1. Dynamic Semantics (let H(o) = ⟨B^c; Fd⟩ and modifier(c) = µ for all rules)
Small-step reduction is defined as the ⇒ relation in Fig. 1. All rules are implicitly parameterized by the code base C. A reduction is stuck if a pre-condition is not satisfied. The partial function σσ′ is defined when dom(σ′) ⊆ dom(σ), and is undefined otherwise. The following function "context-sensitively" computes the body of a function given a dynamic calling context ∆ and argument v:

  fun(∆, H, v) = ⟨e{v/x}[σ]⟩^{⟨o;c;m⟩}

where mbody(⟨c′; m⟩) = x.e, c′ ∈ supers(c), and for any c″ such that c″ ≠ c′, c′ ∈ supers(c″), and c″ ∈ supers(c), mbody(⟨c″; m⟩) is undefined; σ = ⊎_{A ∈ labels(e)} [A ↦ gen(d2s(∆, H), A)]; and ∆ = ⟨o; c; m⟩ : ∆′.

Function d2s(∆, H) computes the (static) calling context given the dynamic calling context ∆ and a heap H:

  d2s([ ], H) = [ ]
  d2s(⟨o; c; m⟩ : ∆, H) = ⟨β; c; m⟩ : d2s(∆, H)   if H(o) = ⟨(β : B)^c; Fd⟩

The following function wraps the function body up for appropriate messaging:

  wrap(∗, e) = post e   if ∗ = ->
  wrap(∗, e) = e        otherwise

The following is defined for dynamic constraint set subtraction:

  Σ\o = Σ − {o1 ⇝θ o2 | o1 = o or o2 = o}
B Auxiliary Definitions for the Proof

We first define a predicate twinTasks(Ω), which holds iff each task-representing type variable in Ω has two instantiation points.

Definition 1 (Twin Instantiation Points for Task Objects). Predicate twinTasks(Ω) holds iff for any B1^c ≤ β ∈ Ω where modifier(c) ∈ task, it follows that B2^c ≤ β ∈ Ω where B1 and B2 contain distinct elements.

Lemma 1 (twinTasks Preservation Over Closure). If twinTasks(Ω1) and Ω1 ↪∆ Ω2, then twinTasks(Ω2).

Proof. Case analysis on e1. The only interesting case is when e1 = new_{B1,B2} c(e1′).

B.1 Dynamic Typing Rules

We use the notation Ω_C to denote the largest set Ω′ for which the relation Ω ↪^{[]} Ω′ holds under the implicit C; in that sense, Ω_C can be viewed as sugared syntax for the closure of Ω bootstrapped by C. Dynamic typing is defined by (T-Config) in Fig. 2. The figure also includes rules to type the heap ((T-Heap) and (T-HeapCell)) and auxiliary rules for typing expressions that only show up at run time. For expressions already present in source code, the dynamic typing rules are identical to the expression typing rules provided in the main text, except that every occurrence of α is replaced with β. For convenience, we do not redefine these rules. A label such as (T-Read) used in the rest of the proof usually refers to the dynamic typing rule unless explicitly noted.
(T-Config)
  ⊢p C : C    Γ, Σ ⊢h H\Ω1    Γ ⊢ e : τ\Ω2    Ω = (Ω1 ∪ Ω2)_C
  WF(Ω)    twinTasks(Ω)    WF(Σ)    Γ(dynamic) = Σ    FTV(Σ) ⊆ dom(H)
  ⟹  C ⊢r ⟨H; Σ; e⟩

(T-Heap)
  Γ ⊢hc H(oi) : Γ(oi)\Ωi for all oi ∈ dom(H), i = [1..n]
  ⟹  Γ, Σ ⊢h H\(Ω1 ∪ · · · ∪ Ωn)

(T-HeapCell)
  B = β : B′    fields(c) = ∀A.Γ′    dom(Fd) = [f1, . . . , fn]
  Γ ⊢ Fd(fi) : ci@βi\∅ for i = [1..n]    Γ′[A ↦ B](fi) = ci@βi′ for i = [1..n]
  ⟹  Γ ⊢hc ⟨B^c; Fd⟩\{B^c ≤ β, β1 ≤ β1′, . . . , βn ≤ βn′}

(T-Obj)
  Γ(o) = (β : B)^c  ⟹  Γ ⊢ o : c@β\∅

(T-Access)
  Γ ⊢ o : τ\∅    Σ = Γ(dynamic)
  θ = W ⟹ ∃p ∈ tasks(∆). (p →W p′) ∈ Σ for all (p′ →T o) ∈ Σ
  ⟹  Γ ⊢ o^θ : τ′\aC(θ, τ)

(T-ObjectScope)
  δ = ⟨o; c; m⟩    ∆ = δ : Γ(cxt)    (FV(e) − OID) ⊆ dom(Γ′)    fields(c) = ∀A.Γ′
  Γ (Γ′[A ↦ B]) (cxt ↦ ∆) ⊢ e : τ\Ω
  (p →W o) ∈ Γ(dynamic) ⟹ p ∈ tasks(∆)
  ⟹  Γ ⊢ ⟨e⟩^δ : τ\{⟨Ω⟩^{⟨β;c;m⟩}}

(T-Post)
  Γ ⊢ e : Unit@β\Ω    FV(e) ⊆ OID  ⟹  Γ ⊢ post e : Unit@β\Ω

(T-Parallel)
  Γ ⊢ e : τ\Ω    Γ ⊢ e′ : τ′\Ω′  ⟹  Γ ⊢ e ∥ e′ : τ″\Ω ∪ Ω′

(T-Continue)
  Γ ⊢ e : τ\Ω    Γ ⊢ e′ : τ′\Ω′  ⟹  Γ ⊢ e; e′ : τ′\Ω ∪ Ω′

  t ::= . . . | o | cxt | dynamic        τ ::= . . . | B^c | ∆

  tasks(∆) = {o | ⟨o; c; m⟩ ∈ ∆, modifier(c) ∈ task}

Fig. 2. Dynamic Typing Rules
B.2 Free Variables Definition for Expressions

Defined as the FV function below. We use OID to represent the set of object IDs.

  FV(x) = {x}
  FV(new_B c(e)) = FV(e)
  FV(new_{B1,B2} c(e)) = FV(e)
  FV(this) = {this}
  FV(f) = {f}
  FV(f :=θ e) = {f} ∪ FV(e)
  FV((c)e) = FV(e)
  FV(e ∗[B] m(e′)) = FV(e) ∪ FV(e′)
  FV(o) = {o}
  FV(null) = ∅
  FV(⟨e⟩^{⟨o;c;m⟩}) = (FV(e) ∩ OID) ∪ {o}
  FV(o^θ) = {o}
  FV(e; e′) = FV(e) ∪ FV(e′)
  FV(e ∥ e′) = FV(e) ∪ FV(e′)
  FV(post e) = FV(e)

Note that FV(⟨e⟩^{⟨o;c;m⟩}) is defined as (FV(e) ∩ OID) ∪ {o} rather than FV(e) ∪ {o}. In other words, every object forms its own name scope for local variables x, field names f, and this. The only globally scoped typing environment items are object IDs.

B.3 Free Type Variables for Source-Code Expressions

  FTV(e) = {α | α ∈ A, A ∈ labels(e)}

B.4 Free Type Variables for Types in Modular Type Checking

  FTV(c@α) = {α}

B.5 Free Type Variables for Typing Context in Modular Type Checking

  FTV([t1 ↦ τ1, . . . , tn ↦ τn]) = FTV(τ1) ∪ · · · ∪ FTV(τn)

B.6 Free Type Variables for Modular Constraint Store

  FTV({κ1, . . . , κn}) = FTV(κ1) ∪ · · · ∪ FTV(κn)

B.7 Free Type Variables for Modular Constraints

  FTV(α̂ ≤ α̂′) = FTV(α̂) ∪ FTV(α̂′)
  FTV([α]^{α′,m,α″}) = {α, α′, α″}
  FTV(α →θ α′) = {α, α′}

B.8 Free Type Variables for Flow Elements in Modular Constraints

  FTV(α) = {α}
  FTV([α1, . . . , αn]^c) = {α1, . . . , αn}
C Proof

C.1 Notation Conventions

For clarity, the numbers 1 and 2 are usually used as subscripts to represent the pre-reduction state and the post-reduction state in lemmas involving reductions. To differentiate variables in other situations, we often use Roman-letter subscripts.

C.2 Properties of Subtyping

Lemma 2 (Subtyping Derivation). If Γ ⊢ e : c@α\Ω, then there must exist Γ ⊢ e : c′@α\Ω with c ∈ supers(c′), where the last step of the derivation of the second judgment is not an instance of (T-Sub).

Proof. This is a basic property of the derivation tree: an instance of (T-Sub) cannot be a tree leaf. A simple proof can be constructed by contradiction.
C.3 Properties of Typing Contexts

Lemma 3 (Free Variables are Closed under Typing Contexts). If Γ ⊢ e : τ\Ω, then FV(e) ⊆ dom(Γ).

Proof. Case analysis on the judgment.

Lemma 4 (Left Strengthening and Weakening of Typing Contexts). Γ ⊢ e : τ\Ω iff Γ′ Γ ⊢ e : τ\Ω for any Γ′.

Proof. By the previous lemma, FV(e) ⊆ dom(Γ). For any variable in dom(Γ′), if it does not show up in dom(Γ), typing is not affected; otherwise, by the definition of context extension, the mapping in Γ′ is superseded by the mapping of the same variable in Γ.

Lemma 5 (Right Strengthening and Weakening of Typing Contexts). Γ ⊢ e : τ\Ω iff Γ Γ′ ⊢ e : τ\Ω where FV(e) ∩ dom(Γ′) = ∅.

Proof. Case analysis on the first judgment.

Lemma 6 (Right Strengthening and Weakening of Typing Contexts for Heap Typing). Γ ⊢h H\Ω iff Γ Γ′ ⊢h H\Ω where OID ∩ dom(Γ′) = ∅.

Proof. See (T-Heap) and (T-HeapCell).

Lemma 7 (Disjoint Right Strengthening of Typing Contexts). If Γ1 ⊢ e : τ\Ω then Γ1 ⊎ Γ ⊢ e : τ\Ω.

Proof. Given Γ1 ⊢ e : τ\Ω, by Lem. 3, FV(e) ⊆ dom(Γ1). By the definition of ⊎, dom(Γ1) ∩ dom(Γ) = ∅, hence FV(e) ∩ dom(Γ) = ∅. The conclusion holds by Lem. 5. (Obviously Γ1 ⊎ Γ = Γ1 Γ by the definition of context extension.)

Lemma 8 (Judgment Instantiation). Given Γ ⊢ e : τ\K and B ∩ (FTV(Γ) ∪ FTV(e) ∪ FTV(τ) ∪ FTV(K)) = ∅, then Γ[σ] ⊢ e[σ] : τ[σ]\K[σ] for any σ = A ↦ B and any A with A ⊆ FTV(Γ) ∪ FTV(e).

Proof. Intuitively this is merely a renaming of type variables in the derivation tree. To prove it rigorously, use induction on the derivation tree and case analysis on the last step leading to the judgment.

Lemma 9 (this Strengthening). Given Γ ⊢ e : τ\K, Γ(this) = c@α, and c ∈ supers(c′), then Γ[this ↦ c′@α] ⊢ e : τ\K.

Proof. Case analysis on the last step leading to the assumed judgment. If it is an instance of (T-Var) and e = this, the conclusion holds by applying (T-Sub). If it is an instance of (T-Read) or (T-Write), note that Γ(this) is used as an argument to the aC function, but that function does not depend on the class name. All other cases are trivial.
Properties of Values
Lemma 10 (Irrelevant Typing Context for Values). Γ ` v : τ \Ω iff Γ Γ 0 ` v : τ \Ω for any Γ 0 such that dom(Γ 0 ) ∩ OID = ∅. Proof. Only object IDs and null are values. Rigorously, induction on the derivation leading to the first judgment. The last step is either (T-Obj), (T-Null), (T-Sub). Lemma 11 (Empty Constraints for Values). If Γ ` v : τ \Ω, then Ω = ∅. Proof. Ditto. Lemma 12 (Substitution). For any Γ such that Γ ` v : τ 0 \Ω, [x 7→ τ 0 ] ` e : τ \Ω 0 , then Γ ` e{v/x } : τ \Ω ∪ Ω 0 . Proof. The language is call-by-value, so the proof here is in fact simpler than substitution lemmas that can replace variables with arbitrary expressions. Definition 2 (Well-formed Substitution). Ω `WF σ holds iff dom(σ) ⊆ Ω and range(σ) ∩ FTV (Ω) = ∅.
C.5 Constraints

We first emphasize a fact that has been stated informally in the paper: the deterministic choice of type variables for fields and mtype.

Lemma 13 (Deterministic Choice of Signature Type Variables). Given a codebase C:
– fields(c) is deterministically defined over the implicit C for any c ∈ dom(C).
– mtype(π) is deterministically defined over the implicit C for any π = ⟨c; m⟩ where C(c) = µ class c extends c′ {F M} and m ∈ dom(M).

Lemma 14 (Decidable Expression Typing). Given an implicit code base C, a typing context Γ, an expression e, and a type τ, there exists a decision procedure to determine whether a K exists such that Γ ⊢ e : τ\K.

Proof. Proof by induction. We first attempt to construct the last step of the derivation leading to Γ ⊢ e : τ\K. There are only finitely many ways to construct it: (a) use the typing rule corresponding to the structure of e; or (b) use (T-Sub). Since each expression has only one corresponding rule matching its structure, constructing (a) or proving its impossibility is obviously decidable. As for (b), note that given τ = c@α, the only way to construct the judgment is to find a c′ such that c ∈ supers(c′), and for a fixed code base the number of subclasses of a particular class is finite, so there are only finitely many ways to use (b). Summing up (a) and (b), there are finitely many ways to construct the last step. If the last step used is not (T-Sub), a simple case analysis reveals that the judgment(s) in the premise must concern sub-expressions of e; by induction there exists a decision procedure for them, so the combined effort of constructing the next-to-last step is also finite. If the last step used is (T-Sub), we cannot appeal to induction since e remains the same; note, however, that we disallow cyclic inheritance chains in our language, so only finitely many steps of (T-Sub) can be applied before a class name with no subclasses appears in the type to be proved. At that point, a typing rule corresponding to the structure of the expression is used, and the conclusion holds by induction. The reasoning here is similar to Lem. 2.

Lemma 15 (Deterministic Expression Typing). Given an implicit code base C, if Γ ⊢ e : c1@α1\K1 and Γ ⊢ e : c2@α2\K2, then α1 = α2, K1 = K2, and either c1 ∈ supers(c2) or c2 ∈ supers(c1).

Proof. Proof by induction, with case analysis on the last step leading to Γ ⊢ e : c1@α1\K1. For each form of expression e, there are only two ways a derivation can be constructed: via the (unique) rule the expression structurally corresponds to, or via (T-Sub). If (T-Sub) is not used, the interesting case is perhaps (T-Msg). Let e = eobj ∗[α] m(e′). By induction, assume there are two ways the expression computing the message receiver eobj can be typed. Then the class names associated with the two types of eobj must be two classes cobj1 and cobj2, with either cobj1 ∈ supers(cobj2) or cobj2 ∈ supers(cobj1). According to our definition of mtype, mtype(⟨cobj1; m⟩) = mtype(⟨cobj2; m⟩), so the conclusion holds by induction. If (T-Sub) is used, note that the only difference between the premise judgment and the conclusion judgment is the class name, and Java-style nominal subtyping is observed.

Lemma 16 (Deterministic Expression Typing at Dynamic Time). Given an implicit code base C, if Γ ⊢ e : c1@β1\Ω1 and Γ ⊢ e : c2@β2\Ω2, then β1 = β2, Ω1 = Ω2, and either c1 ∈ supers(c2) or c2 ∈ supers(c1).

Proof. Entirely analogous to Lem. 15.

Definition 3 (Declared Type). If ⊢p C : C, the declared type of α, denoted decType(α, C), is defined as follows:
– If fields(c) = ∀A.Γ for some c, and α ∈ A, then decType(α, C) = c′ where there exists some f such that Γ(f) = c′@α.
– If mtype(⟨c; m⟩) = ∀A.(c1 → c2) for some c and m, and A = [α1, α2], then decType(α, C) = c1 if α = α1, or decType(α, C) = c2 if α = α2.
– If mbody(⟨c; m⟩) = x.e for some c and m, and e contains a subexpression new_A c(e′) with A = [α1, . . . , αn] and α = αi for some 1 ≤ i ≤ n, then decType(α, C) = decType(αi′, C) where fields(c) = ∀A′.Γ and A′ = [α1′, . . . , αn′].
– If mbody(⟨c; m⟩) = x.e for some c and m, and e contains a subexpression new_{A1,A2} c(e′) with A1 = [α1, . . . , αn] or A2 = [α1, . . . , αn], and α = αi for some 1 ≤ i ≤ n, then decType(α, C) = decType(αi′, C) where fields(c) = ∀A′.Γ and A′ = [α1′, . . . , αn′].
– If mbody(⟨c; m⟩) = x.e for some c and m, and e contains a subexpression eobj ∗[α] mobj(earg), and Γ ⊎ [x ↦ c1@α1] ⊢ eobj : cobj@αobj, where fields(c) = ∀A.Γ, mtype(⟨c; m⟩) = ∀[α1, α2].(c1 → c2), and mtype(⟨cobj; mobj⟩) = ∀[αobj1, αobj2].(cobj1 → cobj2), then decType(α, C) = cobj2.

Lemma 17 (Deterministic decType Function). If decType(α, C) = c1 and decType(α, C) = c2, then c1 = c2.

Lemma 18 (Defined Declared Types). If ⊢p C : C, then decType(α, C) is defined for any α ∈ FTV(K) such that C(c) = ∀A.M and M(m) = ∀A′.∀S.K hold for some c, m.

We now provide a definition of flow consistency. Intuitively, it means that both ends of a flow must have compatible Java-style types (class names). Note that our flow constraints may allow the declared type of a superclass to flow into the declared type of a subclass, due to casting.

Definition 4 (Consistent Flow Constraint in the Modular Constraint Store). Given a fixed code base C, a flow constraint is said to be consistent iff:
– For a flow constraint of the form α1 ≤ α2, either c2 ∈ supers(c1) or c1 ∈ supers(c2), where ci = decType(αi, C) for i = 1, 2.
– For a flow constraint of the form A^c ≤ α, c = decType(α, C).

Lemma 19 (Consistent Flows in the Modular Constraint Store). If ⊢p C : C, then any flow constraint α̂1 ≤ α̂2 ∈ K is consistent, where C(c) = ∀A.M and M(m) = ∀A′.∀S.K hold for some c, m.

Proof. Case analysis on the expression rules. If the flow constraint is of the form α1 ≤ α2, use Lem. 15. If it is of the form A^c ≤ α, it immediately follows from (T-New).

Definition 5 (Type Variable Instantiation). Function genVar(∆, α) = β is defined iff there exist some A and B such that gen(∆, A) = B, where A = [α1, . . . , αk, α, . . . , αn] and B = [β1, . . . , βk, β, . . . , βn].

Definition 6 (Consistent Flow Constraint in Closure). Given a fixed code base C and static calling context ∆, a flow constraint is said to be consistent iff:
– For a flow constraint of the form β1 ≤ β2, either c2 ∈ supers(c1) or c1 ∈ supers(c2), where ci = decType(αi, C) and genVar(∆, αi) = βi for i = 1, 2.
– For a flow constraint of the form B^c ≤ β, either c′ ∈ supers(c) or c ∈ supers(c′), where c′ = decType(α, C) and genVar(∆, α) = β.

Lemma 20 (Consistent Flows in One-Step Closure). Given a fixed code base C and Ω, ∆ ⊢c Ω′, if every flow constraint in Ω under any static calling context is consistent, then every flow constraint in Ω′ under any static calling context is consistent.

Lemma 21 (Consistent Flows in Closure). If ⊢p C : C, then any flow constraint in C is consistent.

Definition 7 (Consistent Modifiers). The overloaded predicate consistentMod is defined as follows:
– consistentMod(K) holds iff for any α such that A1^{c1} ≤ α ∈ K and A2^{c2} ≤ α ∈ K, modifier(c1) = modifier(c2).
– consistentMod(Ω) holds iff for any β such that B1^{c1} ≤ β ∈ Ω and B2^{c2} ≤ β ∈ Ω, modifier(c1) = modifier(c2).

Lemma 22 (Consistent Modifiers for the Modular Constraint Store). If ⊢p C : C, then consistentMod(K) for any K such that C(c) = ∀A.M and M(m) = ∀A′.∀S.K hold for some c, m.

Proof. Case analysis over all expression rules. The only rule that generates such a constraint is (T-New), in which case the type variable on the right-hand side of the flow constraint is the first element of the label A associated with each new expression. Recall that we require all program labels to contain unique variables, so for any two constraints A1^{c1} ≤ α1 and A2^{c2} ≤ α2, it is not possible that α1 = α2. consistentMod(K) obviously holds.

Lemma 23 (Consistent Modifiers for Closure). If ⊢p C : C and consistentMod(Ω0), then consistentMod(Ω), where Ω0, ∆ ⊢c Ω with the implicit C.

Proof.

Lemma 24 (Deterministic Function pzone). pzone(∆) is deterministic.

Proof. Directly follows from the definition of pzone.
Evaluation Context
Lemma 25 (Type Variable thost Only Show Up in Access Constraints). If Γ ` e : θ
c@α\K, and for any κ ∈ K, κ 6= (α 99K α0 ), then thost 6= FTV (κ); If Γ ` e : θ
c@β\Ω, and for any ω ∈ Ω, ω 6= (β 99K β 0 ), then thost 6= FTV (ω). Proof. Case analysis on e for the typing rules. Lemma 26 (Redex Substitution). Given the implicit code base C where `p C : C, and the facts that Γ `h H1 \Ωh1 Γ `h H2 \Ωh2 Γ ` E[e1 ] : cE @βE1 \ΩE1 (EScope) Γ Γin ` e1 : ce @βe1 \Ωe1 is root of a subderivation of (EScope) for some Γin Ωh1 ∪ Ωe1
Γ Γin ` e2 : ce @βe2 \Ωe2
Γ Γin (cxt)
,→
Ωh2 ∪ Ωe2 ∪ {βe2 ≤ βe1 }
(FV (e2 ) − OID) ⊆ (FV (e1 ) − OID) Γ (cxt)
then there exists Ω2 such that Γ ` E[e2 ] : cE @βE2 \ΩE2 and Ωh1 ∪ ΩE1 ,→ Ωh2 ∪ ΩE2 ∪ {βE2 ≤ βE1 }. Proof. Perform case analysis on E. This proof details three non-trivial proofs, when E = •, when E = hEnest iδ , and when E = f:=θ Enest . All other cases of E should be analogous to the proof of the case of E = f:=θ Enest , if not simpler. Case E = • In this case, (EScope) can be re-written as Γ ` e1 : cE @βE1 \ΩE1 . By assumption Γ Γin ` e1 : ce @βe1 \Ωe1 is root of a subderivation of (EScope) for some Γin . By the way typing rules are constructed, the only rule that allows the judgment of a subderivation to have a different typing environment as the super one is to go through an instance of (T-ObjectScope). But that rule would lead to typing of different expressions for the pre-quent and the sequent. This would not be possible for the case here since both judgments attempt to type the same expression e1 . Hence Γin = [ ]. Thus by Lem. 16 (Deterministic Expression Typing at Dynamic Time), then ΩE1 = Ωe1 and βE1 = βe1 ,
and either ce ∈ supers(cE) or cE ∈ supers(ce). Since Γ Γin ⊢ e1 : ce@βe1\Ωe1 is a subderivation of (EScope), clearly ce ∈ supers(cE) is not possible. Hence cE ∈ supers(ce). By assumption, Γ Γin ⊢ e2 : ce@βe2\Ωe2. By (T-Sub) and the definition of E,

Γ ⊢ E[e2] : cE@βe2\Ωe2

This concludes the first part of the proof. For the second part, the goal is to prove

Ωh1 ∪ ΩE1 ↪^{Γ(cxt)} Ωh2 ∪ Ωe2 ∪ {βe2 ≤ βE1}

This is exactly the assumption, given the facts that Γin = [ ], ΩE1 = Ωe1, and βE1 = βe1.

Case E = ⟨Enest⟩^δ. In this case, (EScope) can be rewritten as Γ ⊢ ⟨Enest[e1]⟩^δ : cE@βE1\ΩE1. By the basic property of subtyping derivations (Lem. 2) and (T-ObjectScope), it is known that

δ = ⟨o; c; m⟩
Γ(o) = B^c
B = β : B′
∆ = ⟨β; c; m⟩ : Γ(cxt)
(FV(Enest[e1]) − OID) ⊆ dom(Γfield)
fields(c) = ∀A.Γfield
Γ (Γfield[A ↦ B]) (cxt ↦ ∆) ⊢ Enest[e1] : cEin@βE1\ΩE1in   (EScope-Nest)
ΩE1 = {⟨ΩE1in⟩^{⟨β;c;m⟩}}
cE ∈ supers(cEin)

By assumption, Γ Γin ⊢ e1 : ce@βe1\Ωe1 is the root of a subderivation of (EScope) for some Γin. By the way the derivation tree for (EScope) is constructed, Γ Γin ⊢ e1 : ce@βe1\Ωe1 is the root of a subderivation of (EScope-Nest) (assuming the definition of subderivation is reflexive). By the way Γfield is defined, i.e., the definition of fields, it is known that OID ∩ dom((Γfield[A ↦ B]) (cxt ↦ ∆)) = ∅. By Lem. 6 (Right Strengthening and Weakening of Typing Context for Heap Typing),

Γ (Γfield[A ↦ B]) (cxt ↦ ∆) ⊢h H1\Ωh1
Γ (Γfield[A ↦ B]) (cxt ↦ ∆) ⊢h H2\Ωh2

By induction, there exists

Γ (Γfield[A ↦ B]) (cxt ↦ ∆) ⊢ Enest[e2] : cEin@βE2in\ΩE2in   (EScope-InductNest)
Ωh1 ∪ ΩE1in ↪^{∆} Ωh2 ∪ ΩE2in ∪ {βE2in ≤ βE1}   (EScope-ConsInNest)

By assumption, (FV(e2) − OID) ⊆ (FV(e1) − OID). Earlier it was also shown that (FV(Enest[e1]) − OID) ⊆ dom(Γfield). By the definition of FV, it is not hard to see that (FV(Enest[e2]) − OID) ⊆ dom(Γfield). Thus by (T-ObjectScope), (EScope-InductNest), and all other conditions listed above:

Γ ⊢ ⟨Enest[e2]⟩^δ : cEin@βE2in\{⟨ΩE2in⟩^{⟨β;c;m⟩}}

By the way E is defined, ⟨Enest[e2]⟩^δ = E[e2]. By the earlier fact that cE ∈ supers(cEin) and (T-Sub),

Γ ⊢ E[e2] : cE@βE2in\{⟨ΩE2in⟩^{⟨β;c;m⟩}}

Let βE2 = βE2in and ΩE2 = {⟨ΩE2in⟩^{⟨β;c;m⟩}}. The judgment above is the conclusion judgment. Given the way ΩE1 is structured, the rest of the proof is to establish

Ωh1 ∪ {⟨ΩE1in⟩^{⟨β;c;m⟩}} ↪^{Γ(cxt)} Ωh2 ∪ {⟨ΩE2in⟩^{⟨β;c;m⟩}} ∪ {βE2in ≤ βE1}   (EScope-FinalGoalNest)

By (EScope-ConsInNest) and (C-Context),

⟨Ωh1 ∪ ΩE1in⟩^{⟨β;c;m⟩} ↪^{Γ(cxt)} ⟨Ωh2 ∪ ΩE2in ∪ {βE2in ≤ βE1}⟩^{⟨β;c;m⟩}

By (T-Heap), (T-HeapCell), and the assumptions Γ ⊢h H1\Ωh1 and Γ ⊢h H2\Ωh2, all constraints in Ωh1 and Ωh2 are flow constraints. By (C-GlobalIntro),

Ωh1 ∪ ⟨ΩE1in⟩^{⟨β;c;m⟩} ↪^{Γ(cxt)} ⟨Ωh1 ∪ ΩE1in⟩^{⟨β;c;m⟩}

By assumption Γ ⊢h H2\Ωh2, (T-Heap), and (T-HeapCell), all constraints in Ωh2 are flow constraints. By Lem. 25 (Type Variable thost Only Shows Up in Access Constraints), it is thus known that thost ∉ FTV(Ωh2 ∪ {βE2in ≤ βE1}). By (C-GlobalElim),

⟨Ωh2 ∪ ΩE2in ∪ {βE2in ≤ βE1}⟩^{⟨β;c;m⟩} ↪^{Γ(cxt)} Ωh2 ∪ {⟨ΩE2in⟩^{⟨β;c;m⟩}} ∪ {βE2in ≤ βE1}

Thus (EScope-FinalGoalNest) follows from the last three relations and the transitivity of ↪^{Γ(cxt)}.

Case E = f:=θ Enest. In this case, (EScope) can be rewritten as Γ ⊢ f:=θ Enest[e1] : cE1@βE1\ΩE1. By the basic property of subtyping derivations (Lem. 2) and (T-Write), it is known that

Γ(f) = cfield@βE1
Γ ⊢ Enest[e1] : cfield@βfield1\Ωfield1   (EScope-Write)
ΩE1 = Ωfield1 ∪ {βfield1 ≤ βE1} ∪ aC(θ, Γ(this))
cE1 ∈ supers(cfield)

By assumption, Γ Γin ⊢ e1 : ce@βe1\Ωe1 is the root of a subderivation of (EScope) for some Γin. By the way the derivation tree for (EScope) is constructed, Γ Γin ⊢ e1 : ce@βe1\Ωe1 is the root of a subderivation of (EScope-Write) (assuming the definition of subderivation is reflexive). By induction,

Γ ⊢ Enest[e2] : cfield@βfield2\Ωfield2
Ωh1 ∪ Ωfield1 ↪^{Γ(cxt)} Ωh2 ∪ Ωfield2 ∪ {βfield2 ≤ βfield1}   (EScope-ConsInWrite)

By (T-Sub) and the earlier fact that cE1 ∈ supers(cfield), Γ ⊢ Enest[e2] : cE1@βfield2\Ωfield2. Earlier it has been shown that Γ(f) = cfield@βE1. By (T-Write):

Γ ⊢ f:=θ Enest[e2] : cfield@βE1\Ωfield2 ∪ {βfield2 ≤ βE1} ∪ aC(θ, Γ(this))

Note that E[e2] = f:=θ Enest[e2]. So the judgment above completes the first part of the proof, with βE2 = βE1 and ΩE2 = Ωfield2 ∪ {βfield2 ≤ βE1} ∪ aC(θ, Γ(this)). With these, and the definition of ΩE1 above, the rest of the proof is to show

Ωh1 ∪ Ωfield1 ∪ {βfield1 ≤ βE1} ∪ aC(θ, Γ(this)) ↪^{Γ(cxt)} Ωh2 ∪ Ωfield2 ∪ {βfield2 ≤ βE1} ∪ aC(θ, Γ(this)) ∪ {βE1 ≤ βE1}   (EScope-FinalGoalWrite)

To achieve this, note that (EScope-ConsInWrite) already holds. By (C-Union):

Ωh1 ∪ Ωfield1 ∪ {βfield1 ≤ βE1} ∪ aC(θ, Γ(this)) ↪^{Γ(cxt)} Ωh2 ∪ Ωfield2 ∪ {βfield2 ≤ βfield1} ∪ {βfield1 ≤ βE1} ∪ aC(θ, Γ(this))

By (C-Flow+), {βfield2 ≤ βfield1} ∪ {βfield1 ≤ βE1} ↪^{Γ(cxt)} {βfield2 ≤ βE1}. By (C-Union),

Ωh2 ∪ Ωfield2 ∪ {βfield2 ≤ βfield1} ∪ {βfield1 ≤ βE1} ∪ aC(θ, Γ(this)) ↪^{Γ(cxt)} Ωh2 ∪ Ωfield2 ∪ {βfield2 ≤ βE1} ∪ aC(θ, Γ(this))

By (C-Flow=), ∅ ↪^{Γ(cxt)} {βE1 ≤ βE1}. By (C-Union),

Ωh2 ∪ Ωfield2 ∪ {βfield2 ≤ βE1} ∪ aC(θ, Γ(this)) ↪^{Γ(cxt)} Ωh2 ∪ Ωfield2 ∪ {βfield2 ≤ βE1} ∪ aC(θ, Γ(this)) ∪ {βE1 ≤ βE1}

(EScope-FinalGoalWrite) therefore holds as the result of the last three ↪^{Γ(cxt)} relations and the definition of ↪^{Γ(cxt)}.

C.7 Subject Reduction
Definition 8 (Object Static Representation). Function staticizeO(o, H) computes the static representation of an object o based on heap H; it is defined as β, where H(o) = ⟨B^c; Fd⟩ and B = β : B′.

Definition 9 (Static Representation of Dynamic Access History). Function staticize(Σ, H) is defined as {staticizeO(o1, H) →^θ staticizeO(o2, H) | o1 →^θ o2 ∈ Σ}.

Lemma 27 (Function fields on Inheritance Chain). If fields(c) = ∀A.Γ and c ∈ supers(c′), then Γ(this) = c@αthis, fields(c′) = ∀A ⊎ Aadd.((Γ ⊎ Γadd)[this ↦ c′@αthis]), and dom(Γadd) are field names.

Proof. Directly follows the definition of fields.

Lemma 28 (Return Type Variables on a Signature Never Overlap with Expression Type Variables and their Constraints). If Γ ⊢ e : c@α\K, mtype(∆) = ∀Asig.(cv → cret), and Asig = [α1, α2], then {α} ∪ FTV(K) and α2 are disjoint.

Proof. Case analysis on e over the typing rules. The type variables associated with signature return types are never used for typing any expression.

Lemma 29 (Type Variable thost Never Overlaps with Expression Type Variables). If Γ ⊢ e : c@α\K, then α ≠ thost.

Proof. Case analysis on e over the typing rules.

Lemma 30 (Typing the Result of the fun Function). Given the implicit code base C where ⊢p C : C and

Γ ⊢h H\Ωh
H(oin) = ⟨Bobj^{cin}; Fd⟩
Bobj = βin : B′obj
δ = ⟨oin; cin; m⟩
mtype(⟨cin; m⟩) = ∀Asig.(cv → cret)
Γ(cxt) = d2s(∆dyn, H)
∆instc = ⟨βin; cin; m⟩ : Γ(cxt)
Γ ⊢ v : cv@βv\∅

then Γ ⊢ fun(δ : ∆dyn, H, v) : cret@βfun\Ωfun and {[βret]^{βin,m,βv}, Bobj^{cin} ≤ βin} ↪^{Γ(cxt)} (Ωfun ∪ {βfun ≤ βret}).
Proof. Step 1: Deriving information from static typing. According to the definition of mtype, given that mtype(⟨cin; m⟩) is defined, there must exist some class name cdef such that

µ class cdef extends csup {M F} ∈ range(C)
cretm m(cargm x){e} ∈ range(M)
cdef ∈ supers(cin)

Now let us pick cdef such that: 1) if mbody(⟨cin; m⟩) is defined, pick cdef = cin; 2) if mbody(⟨cin; m⟩) is not defined, pick cdef to be the most precise superclass of cin where mbody(⟨cdef; m⟩) is defined – that is, for any ctemp such that ctemp ≠ cdef, cdef ∈ supers(ctemp), and ctemp ∈ supers(cin), mbody(⟨ctemp; m⟩) is undefined. According to the definition of mbody, it is known that mbody(⟨cdef; m⟩) = x.e. By assumption ⊢p C : C, and by (T-Program), (T-Cls), (T-Md) and the mbody and mtype equations above, we know

fields(cdef) = ∀Adef.Γdef
C(cdef) = ∀Adef.Mdef
Mdef(m) = ∀Asig.∀S.(K ∪ {α ≤ αret})
Asig = [αv, αret]
Γdef ⊎ [x ↦ cv@αv] ⊢ e : cret@α\K
S = labels(e)
(Fun-Def)

Earlier it was known that cdef ∈ supers(cin). By Lem. 27 (Function fields on Inheritance Chain),

fields(cin) = ∀Aobj.Γin
Aobj = Adef ⊎ Aadd for some Aadd
Γdef(this) = cdef@αthis
Γin = (Γdef ⊎ Γadd)[this ↦ cin@αthis] for some Γadd
FTV(Γadd) = Aadd
dom(Γadd) are field names

By the fact that dom(Γadd) are field names, dom(Γadd) ∩ dom(Γdef ⊎ [x ↦ cv@αv]) = ∅. By Lem. 7 (Disjoint Right Strengthening of Typing Context) and (Fun-Def), Γdef ⊎ Γadd ⊎ [x ↦ cv@αv] ⊢ e : cret@α\K. By Lem. 9 (this Strengthening), (Γdef ⊎ Γadd ⊎ [x ↦ cv@αv])[this ↦ cin@αthis] ⊢ e : cret@α\K. Given the trivial fact that x and this are different, the judgment above is

Γin ⊎ [x ↦ cv@αv] ⊢ e : cret@α\K   (Fun-Body)

Before we move on to the next part of the proof, note that given that supers(cin) is defined, cin ∈ dom(C). Thus by (T-Cls) and the previous definition of fields(cin),

C(cin) = ∀Aobj.M   (Fun-Cls)

Earlier we mentioned how cdef is picked. If mbody(⟨cin; m⟩) is defined, then cdef = cin and hence M = Mdef. If mbody(⟨cin; m⟩) is not defined, cdef is picked to be the most specific superclass of cin where mbody is defined. According to (T-Cls), it is obvious that M(m) = Mdef(m). So in either case

M(m) = ∀Asig.∀S.(K ∪ {α ≤ αret})   (Fun-Md)
Step 2: Proving the judgment. Now let σ = ⊎_{A∈labels(e)} A ↦ gen(∆instc, A). Earlier we have shown S = labels(e). Thus σ = ⊎_{A∈S} A ↦ gen(∆instc, A). By the way ∆instc is constructed, this is

σ = ⊎_{A∈S} A ↦ gen(⟨βin; cin; m⟩ : d2s(∆dyn, H), A)

By definition, it is known that dom(Aobj) only contains type variables associated with fields, dom(Asig) only contains type variables associated with method signatures, and dom(A) for any A ∈ S only contains type variables associated with polymorphic sites (either instantiation or messaging). By our assumption of annotated programs, the type variables in dom(Aobj), dom(Asig), and any dom(A) for A ∈ S are distinct from each other. Hence, it is legitimate to let

σ′ = [Aobj ↦ Bobj] ⊎ [αv ↦ βv] ⊎ σ   (Fun-SubstInner)
σ″ = [αret ↦ βret] ⊎ σ′   (Fun-Subst)
It is obvious that range(σ′) does not intersect FTV(Γin) ∪ {αv, α} ∪ FTV(e) ∪ FTV(K), because the former contains only β-form type variables, whereas the latter are all α-form. In addition, dom(σ′) ⊆ FTV(Γin) ∪ {αv} ∪ FTV(e). By the lemma on judgment instantiation (Lem. 8) and (Fun-Body), we have

Γin[σ′] ⊎ [x ↦ cv@αv[σ′]] ⊢ e[σ′] : cret@α[σ′]\K[σ′]

By the definition of Γin, it is known that FTV(Γin) = Aobj. Thus Γin[σ′] = Γin[Aobj ↦ Bobj]. By the way Asig is defined, it is known that cv@αv[σ′] = cv@βv. Similarly, e[σ′] = e[σ]. Thus

Γin[Aobj ↦ Bobj] ⊎ [x ↦ cv@βv] ⊢ e[σ] : cret@α[σ′]\K[σ′]

We now extend the typing context with Γ on the left. By the lemma on “Left Strengthening and Weakening of Typing Context” (Lem. 4):

Γ Γin[Aobj ↦ Bobj] ⊎ [x ↦ cv@βv] ⊢ e[σ] : cret@α[σ′]\K[σ′]

Previously, it was demonstrated that Γ ⊢ v : cv@βv\∅. Note that by the construction of Γin, o ∉ dom(Γin) for any o. Hence o ∉ dom(Γin[Aobj ↦ Bobj]) for any o. By the lemma on Irrelevant Typing Context for Values (Lem. 10),

Γ Γin[Aobj ↦ Bobj] ⊢ v : cv@βv\∅

Thus by the substitution lemma (Lem. 12), Γ Γin[Aobj ↦ Bobj] ⊢ e[σ]{v/x} : cret@α[σ′]\K[σ′]. Now let e′ = e{v/x}[σ]. Since σ only works on type variables, e{v/x}[σ] = e[σ]{v/x}. The judgment above is Γ Γin[Aobj ↦ Bobj] ⊢ e′ : cret@α[σ′]\K[σ′]. By definition, e is the body of a method. By the way e′ is constructed, it is obvious that there is no sub-expression of e′ of the form ⟨esome⟩^{δsome}; the latter is the only expression whose typing depends on Γ(cxt). Hence

Γ Γin[Aobj ↦ Bobj] (cxt ↦ ∆instc) ⊢ e′ : cret@α[σ′]\K[σ′]   (Fun-NewBody)

We next study what free variables may exist in e′. Following (Fun-Body) and the lemma on “Free Variables are Closed under Typing Context” (Lem. 3), FV(e) ⊆ dom(Γin) ∪ {x}. Also recall e′ = e{v/x}[σ]. If v ∈ OID, then by the definition of substitution, FV(e{v/x}) ⊆ dom(Γin) ∪ {v}; otherwise, if v = null, then FV(e{v/x}) ⊆ dom(Γin). Substitution σ only affects type variables, so it does not affect free variables. Hence, if v ∈ OID, FV(e′) ⊆ dom(Γin) ∪ {v}; otherwise, if v = null, FV(e′) ⊆ dom(Γin). In general:

FV(e′) − OID ⊆ dom(Γin)

By assumption H(oin) = ⟨Bobj^{cin}; Fd⟩, (T-Heap), and (T-HeapCell), it is known that Γ(oin) = Bobj^{cin}. With these two facts, judgment (Fun-NewBody), and the fact that fields(cin) = ∀Aobj.Γin, by (T-ObjectScope), Γ ⊢ ⟨e′⟩^{⟨oin;cin;m⟩} : cret@α[σ′]\{⟨K[σ′]⟩^{⟨βin;cin;m⟩}}. By the definition of fun, this is the first part of the conclusion, with βfun = α[σ′] and Ωfun = {⟨K[σ′]⟩^{⟨βin;cin;m⟩}}.

Step 3: Proving the closure judgment. With the way C(cin) is constructed in (Fun-Cls), the way M(m) is constructed in (Fun-Md), the way ∆instc is constructed by assumption, the way σ″ is constructed in (Fun-Subst), and by (CC-Contour),
{[βret]^{βin,m,βv}, Bobj^{cin} ≤ βin} ↪^{Γ(cxt)} {⟨(K ∪ {α ≤ αret})[σ″]⟩^{⟨βin;cin;m⟩}}

By the definitions of σ′ and σ″, and the obvious fact that [αret ↦ βret] ⊎ σ′ = σ′ ⊎ [αret ↦ βret], the set {⟨(K ∪ {α ≤ αret})[σ″]⟩^{⟨βin;cin;m⟩}} is {⟨K[σ′][αret ↦ βret] ∪ {α[σ′][αret ↦ βret] ≤ βret}⟩^{⟨βin;cin;m⟩}}. By Lem. 28 (Return Type Variables on a Signature Never Overlap with Expression Type Variables and their Constraints), α and αret are distinct. Thus, if α ∉ dom(σ′), then α[σ′][αret ↦ βret] = α = α[σ′]. If α ∈ dom(σ′), then by the definition of σ′, range(σ′) and {αret} do not overlap, so α[σ′][αret ↦ βret] = α[σ′]. Therefore, in all cases, α[σ′][αret ↦ βret] = α[σ′]. The set above can be rewritten as {⟨K[σ′][αret ↦ βret] ∪ {α[σ′] ≤ βret}⟩^{⟨βin;cin;m⟩}}. By Lem. 28 again, αret ∉ FTV(K). By the way σ′ is constructed, αret ∉ range(σ′). So K[σ′][αret ↦ βret] = K[σ′], and the above set can be rewritten as {⟨K[σ′] ∪ {α[σ′] ≤ βret}⟩^{⟨βin;cin;m⟩}}. Thus we know

{[βret]^{βin,m,βv}, Bobj^{cin} ≤ βin} ↪^{Γ(cxt)} {⟨K[σ′] ∪ {α[σ′] ≤ βret}⟩^{⟨βin;cin;m⟩}}

By Lem. 25 (Type Variable thost Only Shows Up in Access Constraints), it is thus known that thost ∉ FTV({α[σ′] ≤ βret}). By (C-GlobalElim),

{⟨K[σ′] ∪ {α[σ′] ≤ βret}⟩^{⟨βin;cin;m⟩}} ↪^{Γ(cxt)} {⟨K[σ′]⟩^{⟨βin;cin;m⟩}} ∪ {α[σ′] ≤ βret}

With the previous assumptions about βfun and Ωfun, and the transitivity of ↪^{Γ(cxt)}, the second half of the conclusion holds.
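Step 2's instantiation σ = ⊎_{A∈S} A ↦ gen(∆instc, A) can be pictured concretely as a per-contour freshening map. The following Java sketch is only an illustration under a stated assumption — that gen yields fresh β-variables determined by the calling contour; the class and method names are hypothetical:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of sigma = U_{A in S} A |-> gen(Delta, A): each polymorphic
// label A of the method body maps to contour-determined variables.
// A real implementation would memoize on (Delta, A) so that the same
// contour always yields the same beta-variables.
final class Instantiator {
    private int fresh = 0;

    // gen(Delta, A): one fresh beta-variable per alpha in the label A.
    Map<String, String> gen(String contour, List<String> label) {
        Map<String, String> sigma = new HashMap<>();
        for (String alpha : label) {
            sigma.put(alpha, "beta_" + contour + "_" + (fresh++));
        }
        return sigma;
    }

    // Union over all labels S = labels(e) of the method body.
    Map<String, String> buildSigma(String contour, List<List<String>> labels) {
        Map<String, String> sigma = new HashMap<>();
        for (List<String> label : labels) sigma.putAll(gen(contour, label));
        return sigma;
    }
}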
Lemma 31 (Field Names and this in Typing Context). If Γ ⊢ e : τ\Ω, Γ(cxt) = ⟨β; c; m⟩ : ∆, and fields(c) = ∀A.Γobj, then there exists a unique o such that Γ(o) = B^c and Γ(t) = Γobj[A ↦ B](t) for any t ∈ dom(Γobj).

Proof. Case analysis on the judgment. The only interesting rule is (T-ObjectScope), which leads to the conclusion based on the assumptions of that rule. All other rules immediately follow by induction.

Lemma 32 (Reduction Type Preservation Under a Calling Context). Given the implicit code base C where ⊢p C : C and the facts that

H1, Σ1, e1 =⇒^∆ H2, Σ2, e2   (SR-Reduce)
Γ1 ⊢h H1\Ωh1   (SR-H1)
Γ1 ⊢ e1 : cexp@β1\Ωe1   (SR-E1)
Γ1(cxt) = d2s(∆, H1)   (SR-TE1)

then there exist some Γ2, Ωh2, Ωe2 such that

Γ2 ⊢h H2\Ωh2   (SR-Goal0)
Γ2 ⊢ e2 : cexp@β2\Ωe2   (SR-Goal1)
staticize(Σ1, H1) ∪ Ωe1 ∪ Ωh1 ↪^{Γ1(cxt)} staticize(Σ2, H2) ∪ Ωe2 ∪ Ωh2 ∪ {β2 ≤ β1}   (SR-Goal2)
Γ2 = Γ1 ⊎ Γ0 for some Γ0   (SR-Goal3)
(FV(e2) − OID) ⊆ (FV(e1) − OID)   (SR-Goal4)
Proof. Perform induction on the length of the reduction sequence, and case analysis on the last step of the reduction rule being used that leads to H1, Σ1, e1 =⇒^∆ H2, Σ2, e2. Note that for the following cases:

– We do not prove (SR-Goal0) if H2 = H1, as it strictly follows from assumption Γ1 ⊢h H1\Ωh1. In that case we also let Γ2 = Γ1.
– If we are able to prove β1 = β2, we only prove Ωe1 ∪ Ωh1 ↪^{Γ1(cxt)} Ωe2 ∪ Ωh2 for (SR-Goal2), because the conclusion then holds by (C-Union) and ∅ ↪^{Γ1(cxt)} {β1 ≤ β1} (by (C-Flow=)).
– If we are able to prove Γ2 = Γ1, we do not prove (SR-Goal3), since it holds trivially with Γ0 = [ ].
Case (R-Msg): By that rule, e1 = oin ∗[βret] m(v), e2 = T^{oin} ; wrap(∗, fun(δ : ∆, H1, v)), δ = ⟨oin; cin; m⟩, H2 = H1, and H1(oin) = ⟨Bobj^{cin}; Fd⟩. With (SR-E1), by the basic property of subtyping derivations (Lem. 2) and (T-Msg), it is known that

Γ1 ⊢ oin : τin\Ωin
Γ1 ⊢ v : cv@βv\Ωv
τin = coin@βoin
Ωe1 = Ωin ∪ Ωv ∪ {[βret]^{βoin,m,βv}} ∪ aC(T, τin)
mtype(⟨coin; m⟩) = ∀Asig.(cv → cret)
cexp ∈ supers(cret)
∗ matches modifier(coin) returns cret@βret
β1 = βret

Let Bobj = βin : B′obj. By (T-Heap) and (T-HeapCell), Γ1(oin) = Bobj^{cin}. Previously, it was shown that Γ1 ⊢ oin : τin\Ωin. According to (T-Obj), by the basic property of subtyping derivations (Lem. 2), it is known that coin ∈ supers(cin) and βin = βoin. Thus [βret]^{βoin,m,βv} = [βret]^{βin,m,βv}. By the definition of mtype and the earlier fact that βin = βoin, mtype(⟨coin; m⟩) = mtype(⟨cin; m⟩). By the lemma on Empty Constraints for Values (Lem. 11), Ωv = ∅. With these conditions, (SR-H1), and (SR-TE1), according to Lem. 30 (Typing the Result of the fun Function):

Γ1 ⊢ fun(δ : ∆, H1, v) : cret@βfun\Ωfun and {[βret]^{βin,m,βv}, Bobj^{cin} ≤ βin} ↪^{d2s(∆,H1)} (Ωfun ∪ {βfun ≤ βret})   (SR-MsgEntail)

By (T-Sub) and the earlier fact that cexp ∈ supers(cret), Γ1 ⊢ fun(δ : ∆, H1, v) : cexp@βfun\Ωfun. Now perform case analysis on ∗. If ∗ ≠ ->, then fun(δ : ∆, H1, v) = wrap(∗, fun(δ : ∆, H1, v)). If ∗ = ->, then by the previous fact that “∗ matches modifier(coin) returns cret@βret”, it is known that cret = Unit = Object. Earlier we have shown cexp ∈ supers(Object). By the definition of supers, cexp = Object = Unit. Hence Γ1 ⊢ fun(δ : ∆, H1, v) : Unit@βfun\Ωfun. By the definitions of FV and fun, FV(fun(δ : ∆, H1, v)) ⊆ OID. Thus by (T-Post), Γ1 ⊢ wrap(∗, fun(δ : ∆, H1, v)) : cexp@βfun\Ωfun holds. This ends the case analysis; in all cases,

Γ1 ⊢ wrap(∗, fun(δ : ∆, H1, v)) : cexp@βfun\Ωfun

By the earlier fact Γ1 ⊢ oin : τin\Ωin and the lemma on Empty Constraints for Values (Lem. 11), Ωin = ∅. By (T-Access), it holds for any τtemp that Γ1 ⊢ T^{oin} : τtemp\aC(T, τin). Thus by (T-Continue),

Γ1 ⊢ T^{oin} ; wrap(∗, fun(δ : ∆, H1, v)) : cexp@βfun\aC(T, τin) ∪ Ωfun

Thus (SR-Goal1) holds, with Ωe2 = aC(T, τin) ∪ Ωfun and β2 = βfun. By assumption (SR-E1), (T-Heap), and (T-HeapCell), {Bobj^{cin} ≤ βin} ⊆ Ωh1. Thus obviously {[βret]^{βin,m,βv}, Bobj^{cin} ≤ βin} ⊆ Ωe1 ∪ Ωh1. By (SR-MsgEntail) and (C-Union) (applying Ωe1 ∪ Ωh1 on both ends of (SR-MsgEntail)), it is known that

Ωe1 ∪ Ωh1 ↪^{d2s(∆,H1)} (Ωfun ∪ {βfun ≤ βret}) ∪ Ωe1 ∪ Ωh1

Earlier it was shown that β1 = βret and β2 = βfun. We also know the structure of Ωe1. By (SR-TE1), Γ1(cxt) = d2s(∆, H1). Hence the relation above can be rewritten as

Ωe1 ∪ Ωh1 ↪^{Γ1(cxt)} Ωfun ∪ {β2 ≤ β1} ∪ Ωin ∪ Ωv ∪ {[βret]^{βoin,m,βv}} ∪ aC(T, τin) ∪ Ωh1

Thus (SR-Goal2) holds by (C-Subset) and the transitivity of ↪^{Γ1(cxt)}. By the way the fun and wrap functions are defined, and the definition of FV, it is known that FV(e2) ⊆ OID. (SR-Goal4) holds trivially.
Case (R-Read): By that rule, e1 = f, e2 = R^o ; Fd(f), ∆ = ⟨o; c; m⟩ : ∆′, H2 = H1, and H1(o) = ⟨B^c; Fd⟩. By (SR-E1), the basic property of subtyping derivations (Lem. 2), and (T-Read), it is known that Γ1(f) = cread@β1, cexp ∈ supers(cread), and Ωe1 = aC(R, Γ1(this)). With (SR-Reduce), the reduction can happen; thus Fd(f) must be defined, and hence f ∈ dom(Fd). By this fact, (SR-H1), (T-Heap), (T-HeapCell), and the fact that H(o) = ⟨B^c; Fd⟩, it holds that

Γ1(o) = B^c
fields(c) = ∀A.Γobj
B = βhead : B′
Γ1 ⊢ Fd(f) : cfd@βfd\∅
Γobj[A ↦ B](f) = cfd@β′fd
{βfd ≤ β′fd} ⊆ Ωh1

Given the fact that Γobj[A ↦ B](f) is defined, it is known that f ∈ dom(Γobj). By Lem. 31 (Field Names and this in Typing Context) and the previous fact that Γ1(o) = B^c, Γ1(f) = Γobj[A ↦ B](f). Since it was previously shown that Γ1(f) = cread@β1, it is known that cread = cfd and β1 = β′fd. Hence Γ1 ⊢ Fd(f) : cread@βfd\∅. Earlier it has been shown that cexp ∈ supers(cread); by (T-Sub), Γ1 ⊢ Fd(f) : cexp@βfd\∅. Earlier it was known that Γ1(o) = B^c and B = βhead : B′. By (T-Obj), Γ1 ⊢ o : c@βhead\∅. By (T-Access) and (T-Continue), Γ1 ⊢ R^o ; Fd(f) : cexp@βfd\aC(R, c@βhead). Hence (SR-Goal1) holds, where β2 = βfd and Ωe2 = aC(R, c@βhead). By the definition of fields, this ∈ dom(Γobj). By Lem. 31 again, Γ1(this) = Γobj[A ↦ B](this). By the definition of fields and the structure of B, Γobj[A ↦ B](this) = c@βhead. Hence Γ1(this) = c@βhead. Therefore Ωe1 = Ωe2. Earlier it was shown that β1 = β′fd, β2 = βfd, and {βfd ≤ β′fd} ⊆ Ωh1. Hence Ωh1 ∪ Ωe1 ↪^{Γ1(cxt)} Ωh1 ∪ Ωe2 ∪ {β2 ≤ β1} holds by the reflexivity of ↪^{Γ1(cxt)}, and (SR-Goal2) holds. By the definition of Fd, Fd(f) ∈ V. Whether Fd(f) is null or an object ID o, FV(Fd(f)) − OID = ∅. Hence FV(e2) − OID = ∅ by the structure of e2, and (SR-Goal4) holds.
Case (R-Write): By that rule, e1 = f:=θ v, e2 = θ^o ; v, ∆ = ⟨o; c; m⟩ : ∆′, H2 = H1[o ↦ ⟨B^c; Fd[f ↦ v]⟩], and H1(o) = ⟨B^c; Fd⟩. By (SR-E1), the basic property of subtyping derivations (Lem. 2), and (T-Write), it is known that Γ1(f) = cwrite@β1, Γ1 ⊢ v : cwrite@βv\Ωv, cexp ∈ supers(cwrite), and Ωe1 = Ωv ∪ {βv ≤ β1} ∪ aC(θ, Γ1(this)). By Lem. 11 (“Empty Constraints for Values”), Ωv = ∅. By (SR-H1), (T-Heap), (T-HeapCell), and the fact that H(o) = ⟨B^c; Fd⟩:

Γ1(o) = B^c
fields(c) = ∀A.Γobj
B = βhead : B′
dom(Fd) = [fFd1, . . . , fFdn]
Γ1 ⊢ Fd(fFdi) : cFdi@βFdi\∅ for i = [1..n]
Γobj[A ↦ B](fFdi) = cFdi@β′Fdi for i = [1..n]
Ωh1 = {B^c ≤ βhead, βFd1 ≤ β′Fd1, . . . , βFdn ≤ β′Fdn} ∪ Ω for some Ω

With (SR-Reduce), the reduction can happen, so f ∈ dom(Fd). Thus f = fFdk for some k ∈ [1..n], and hence Γ1 ⊢ Fd(f) : cFdk@βFdk\∅ and Γobj[A ↦ B](f) = cFdk@β′Fdk. Given the fact that Γobj[A ↦ B](f) is defined, it is known that f ∈ dom(Γobj). By Lem. 31 (Field Names and this in Typing Context) and the previous fact that Γ1(o) = B^c, Γ1(f) = Γobj[A ↦ B](f). Since it was previously shown that Γ1(f) = cwrite@β1, it is known that cwrite = cFdk and β1 = β′Fdk. Previously, it was shown that Γ1 ⊢ v : cwrite@βv\∅. Obviously Fd[f ↦ v](f) = v, hence Γ1 ⊢ Fd[f ↦ v](f) : cFdk@βv\∅. Hence by (T-Heap) and (T-HeapCell), Γ1 ⊢h H[o ↦ ⟨B^c; Fd[f ↦ v]⟩]\Ωh2, and (SR-Goal0) holds. In this context, Ωh2 = {B^c ≤ βhead, βFd1 ≤ β′Fd1, . . . , βFd(k−1) ≤ β′Fd(k−1), βv ≤ β1, βFd(k+1) ≤ β′Fd(k+1), . . . , βFdn ≤ β′Fdn} ∪ Ω. Earlier it was known that Γ1(o) = B^c and B = βhead : B′. By (T-Obj), Γ1 ⊢ o : c@βhead\∅. By (T-Access), (T-Continue), and the earlier fact Γ1 ⊢ v : cwrite@βv\∅, Γ1 ⊢ θ^o ; v : cwrite@βv\aC(θ, Γ1(this)). Earlier it has been shown that cexp ∈ supers(cwrite). By (T-Sub), Γ1 ⊢ θ^o ; v : cexp@βv\aC(θ, Γ1(this)). Hence (SR-Goal1) holds, where Ωe2 = aC(θ, Γ1(this)) and β2 = βv. By the definition of fields, this ∈ dom(Γobj). By Lem. 31 again, Γ1(this) = Γobj[A ↦ B](this). By the definition of fields and the structure of B, Γobj[A ↦ B](this) = c@βhead. Hence Γ1(this) = c@βhead. Therefore, given the definition of Ωe1 above, Ωe1 = Ωe2 ∪ {βv ≤ β1}. Comparing Ωh1 and Ωh2, it is obvious that Ωh1 ∪ {βv ≤ β1} = Ωh2 ∪ {βFdk ≤ β1}. Thus Ωh1 ∪ Ωe1 = Ωh1 ∪ Ωe2 ∪ {βv ≤ β1} = Ωe2 ∪ Ωh2 ∪ {βFdk ≤ β1} ∪ {βv ≤ β1}. By the reflexivity of ↪^{Γ1(cxt)}, Ωh1 ∪ Ωe1 ↪^{Γ1(cxt)} Ωe2 ∪ Ωh2 ∪ {βFdk ≤ β1} ∪ {βv ≤ β1}. (SR-Goal2) holds by (C-Subset) and the transitivity of ↪^{Γ1(cxt)}. Whether v is null or an object ID o, FV(v) − OID = ∅. Hence FV(e2) − OID = ∅ by the structure of e2, and (SR-Goal4) holds.
Case (R-New): By that rule, e1 = new_B cnew(v), e2 = q ∗[β] cnew(v), B = β : B′, ∆ = ⟨o; c; m⟩ : ∆′, H2 = H1 ⊎ (q ↦ ⟨B^{cnew}; ⊎_{f∈dom(Γobj)} f ↦ null⟩), fields(cnew) = ∀A.Γobj, q fresh, and ∗ matches modifier(cnew) returns cnew. By (SR-E1), the basic property of subtyping derivations (Lem. 2), and (T-New), it is known that Γ1 ⊢ v : cv@βv\Ωv, β1 = β, mtype(⟨cnew; cnew⟩) = ∀A.(cv → cnew), Ωe1 = Ωv ∪ {B^{cnew} ≤ β, [β]^{β,cnew,βv}} ∪ aC(T, cnew@β), and cexp ∈ supers(cnew). Now let Γ2 = Γ1[q ↦ B^{cnew}]. Since q is fresh, this is Γ2 = Γ1 ⊎ [q ↦ B^{cnew}], so (SR-Goal3) holds. By Lem. 11 (“Empty Constraints for Values”), Ωv = ∅. By Lem. 10 (“Irrelevant Typing Context for Values”):

Γ2 ⊢ v : cv@βv\∅   (New-V)

Obviously Γ2(q) = B^{cnew}. By (T-Obj),

Γ2 ⊢ q : cnew@β\∅   (New-Obj)

By the earlier facts that mtype(⟨cnew; cnew⟩) = ∀A.(cv → cnew) and ∗ matches modifier(cnew) returns cnew, and by (T-Msg), (New-V), and (New-Obj),

Γ2 ⊢ q ∗[β] cnew(v) : cnew@β\{B^{cnew} ≤ β, [β]^{β,cnew,βv}} ∪ aC(T, cnew@β)

By the definitions of e2 and Ωe1, the judgment above is precisely Γ2 ⊢ e2 : cnew@β\Ωe1. Earlier it was known that cexp ∈ supers(cnew). By (T-Sub), Γ2 ⊢ e2 : cexp@β\Ωe1. (SR-Goal1) holds, with β2 = β and Ωe2 = Ωe1. Now let Γobj[A ↦ B] = [fFd1 ↦ cFd1@βFd1, . . . , fFdn ↦ cFdn@βFdn]. By (T-Null), it is obvious that Γ2 ⊢ null : cFdi@βFdi\∅ for i = 1..n. By (T-HeapCell), (T-Heap), and (SR-H1):

Γ2 ⊢h H2\Ωh1 ∪ {B^{cnew} ≤ β, βFd1 ≤ βFd1, . . . , βFdn ≤ βFdn}

Hence (SR-Goal0) holds, where Ωh2 = Ωh1 ∪ {B^{cnew} ≤ β, βFd1 ≤ βFd1, . . . , βFdn ≤ βFdn}. Earlier it was shown that {B^{cnew} ≤ β} ⊆ Ωe1 = Ωe2. Hence Ωh2 ∪ Ωe2 = Ωh1 ∪ Ωe1 ∪ {βFd1 ≤ βFd1, . . . , βFdn ≤ βFdn}. By the reflexivity of ↪^{Γ1(cxt)},

Ωh1 ∪ Ωe1 ∪ {βFd1 ≤ βFd1, . . . , βFdn ≤ βFdn} ↪^{Γ1(cxt)} Ωh2 ∪ Ωe2

By (C-Flow=), ∅ ↪^{Γ1(cxt)} {βFd1 ≤ βFd1, . . . , βFdn ≤ βFdn}. By the reflexivity of ↪^{Γ1(cxt)}, Ωh1 ∪ Ωe1 ↪^{Γ1(cxt)} Ωh1 ∪ Ωe1. By (C-Union),

Ωh1 ∪ Ωe1 ↪^{Γ1(cxt)} Ωh1 ∪ Ωe1 ∪ {βFd1 ≤ βFd1, . . . , βFdn ≤ βFdn}

Earlier it was shown that β1 = β2 = β. By (C-Flow=), ∅ ↪^{Γ1(cxt)} {β ≤ β}. By the reflexivity of ↪^{Γ1(cxt)}, Ωh2 ∪ Ωe2 ↪^{Γ1(cxt)} Ωh2 ∪ Ωe2. By (C-Union),

Ωh2 ∪ Ωe2 ↪^{Γ1(cxt)} Ωh2 ∪ Ωe2 ∪ {β ≤ β}

(SR-Goal2) thus holds following the three listed ↪^{Γ1(cxt)} relations above and transitivity. By the definition of FV, FV(e2) = {q} ∪ FV(v). Whether v is an object ID or null, FV(e2) ⊆ OID. Hence (SR-Goal4) holds.
Case (R-NewT1): By that rule, e1 = new_{B1,B2} cnew(v), e2 = new_{B1} cnew(v), and H2 = H1. By (SR-E1), the basic property of subtyping derivations (Lem. 2), and (T-NewTask), it is known that Γ1 ⊢ e2 : ce@β1\Ωte1, Γ1 ⊢ new_{B2} cnew(v) : ce@βt2\Ωte2, Ωe1 = Ωte1 ∪ Ωte2 ∪ {β1 ≤ βt2, βt2 ≤ β1}, and cexp ∈ supers(ce). By (T-Sub), Γ1 ⊢ e2 : cexp@β1\Ωte1. (SR-Goal1) holds, where β2 = β1 and Ωe2 = Ωte1. With this last equation, (SR-Goal2) holds by the reflexivity of ↪^{Γ1(cxt)} and (C-Subset). Note that FV(e2) = FV(v). Whether v is an object ID or null, FV(v) ⊆ OID. Thus (SR-Goal4) holds.
Case (R-NewT2): By that rule, e1 = new_{B1,B2} cnew(v), e2 = new_{B2} cnew(v), and H2 = H1. By (SR-E1), the basic property of subtyping derivations (Lem. 2), and (T-NewTask), it is known that Γ1 ⊢ new_{B1} cnew(v) : ce@β1\Ωte1, Γ1 ⊢ e2 : ce@βt2\Ωte2, Ωe1 = Ωte1 ∪ Ωte2 ∪ {β1 ≤ βt2, βt2 ≤ β1}, and cexp ∈ supers(ce). By (T-Sub), Γ1 ⊢ e2 : cexp@βt2\Ωte2. (SR-Goal1) holds, where β2 = βt2 and Ωe2 = Ωte2. Hence by the reflexivity of ↪^{Γ1(cxt)},

Ωh1 ∪ Ωe1 ↪^{Γ1(cxt)} Ωh1 ∪ Ωte1 ∪ Ωte2 ∪ {β1 ≤ β2, β2 ≤ β1}

(SR-Goal2) holds by (C-Subset) and the transitivity of ↪^{Γ1(cxt)}. (SR-Goal4) holds for the same reason as in the previous case, (R-NewT1).
Case (R-Cast): By that rule, e1 = (ccast)o, e2 = o, ccast ∈ supers(c), H1(o) = ⟨B^c; Fd⟩, and H2 = H1. By (SR-E1), the basic property of subtyping derivations (Lem. 2), and (T-Cast), it is known that Γ1 ⊢ o : ce@β1\Ωe1 and cexp ∈ supers(ccast). By Γ1 ⊢ o : ce@β1\Ωe1, the basic property of subtyping derivations (Lem. 2), and (T-Obj), it is known that Γ1(o) = Bo^{co}, ce ∈ supers(co), and Bo = β1 : B′o. With H1(o) = ⟨B^c; Fd⟩ and (SR-H1), Γ1(o) = B^c. Thus Bo = B and co = c. Thus by (T-Obj), Γ1 ⊢ o : c@β1\Ωe1. Previously it was known that ccast ∈ supers(c) and cexp ∈ supers(ccast); by (T-Sub), Γ1 ⊢ o : cexp@β1\Ωe1. Hence (SR-Goal1) holds, where β2 = β1 and Ωe2 = Ωe1. Hence Ωh1 ∪ Ωe1 ↪^{Γ1(cxt)} Ωh2 ∪ Ωe2 by the reflexivity of ↪^{Γ1(cxt)}. Obviously FV(e2) ⊆ OID, so (SR-Goal4) holds trivially.
Case (R-⟨⟩-Elim1) or Case (R-⟨⟩-Elim2): In both rules, e1 = ⟨v⟩^{⟨o;c;m⟩}, e2 = v, and H2 = H1. By (SR-E1), the basic property of subtyping derivations (Lem. 2), and (T-ObjectScope), it is known that Γ1 Γobj[A ↦ B] (cxt ↦ ∆) ⊢ o : ce@β1\Ωe1in and cexp ∈ supers(ce), where fields(c) = ∀A.Γobj, Γ1(o) = B^c, B = β : B′, and ∆ = ⟨β; c; m⟩ : Γ1(cxt). By the definition of fields, it is known that dom(Γobj[A ↦ B] (cxt ↦ ∆)) and OID are disjoint. By the lemma on Irrelevant Typing Context for Values (Lem. 10), Γ1 ⊢ o : ce@β1\Ωe1in. By (T-Sub) and cexp ∈ supers(ce), Γ1 ⊢ o : cexp@β1\Ωe1in. Hence (SR-Goal1) holds, where β2 = β1 and Ωe2 = Ωe1in. By Lem. 11 (Empty Constraints for Values), Ωe1in = ∅. Hence (SR-Goal2) holds by the reflexivity of ↪^{Γ1(cxt)} and (C-Subset). (SR-Goal4) holds in the same way as in the previous case, (R-NewT1).
Case (R-Cont): By that rule, e1 = v; e, e2 = e, and H2 = H1. By (SR-E1), the basic properties of subtyping (Lem. 2), and (T-Continue), Γ1 ⊢ e : c′exp@β1\Ωe, cexp ∈ supers(c′exp), and Ωe ⊆ Ωe1. By (T-Sub), Γ1 ⊢ e : cexp@β1\Ωe. (SR-Goal1) holds, with β2 = β1 and Ωe2 = Ωe. (SR-Goal2) holds by (C-Subset). (SR-Goal4) holds trivially by the definition of FV over v; e and a simple case analysis on v: FV(e2) − OID = FV(e1) − OID.

Case (R-Access): By that rule, e1 = θ^o, e2 = null, and H2 = H1. By (T-Null), Γ1 ⊢ null : cexp@β1\∅. (SR-Goal1) holds, where β2 = β1 and Ωe2 = ∅. (SR-Goal2) holds by (C-Subset). (SR-Goal4) holds trivially, as FV(null) = ∅.

Lemma 33 (Subderivations Inside an E Context). If Γ ⊢ E[e] : τ\Ω, Γ ⊢h H\Ωh, and Γ(cxt) = [ ], then there must exist a judgment Γ Γ′ ⊢ e : τ′\Ω′ which is the root of a subderivation of the former judgment, with dom(Γ′) ∩ OID = ∅ and d2s(cxt(E), H) = Γ Γ′(cxt).
Proof. A simple case analysis on E. The non-trivial fact is that even though the derivation tree for Γ ⊢ E[e] : τ\Ω contains sub-judgments whose typing contexts differ from Γ, they never change the typing of object IDs, i.e., dom(Γ′) ∩ OID = ∅.

Lemma 34 (Subject Reduction). Given the implicit code base C where ⊢p C : C and the facts that

H1, Σ1, e1 ⇒ H2, Σ2, e2   (SRM-Reduce)
Γ1 ⊢h H1\Ωh1   (SRM-H1)
Γ1 ⊢ e1 : cexp@β1\Ωe1   (SRM-E1)
Γ1(cxt) = [ ]   (SRM-CT1)

then there exist some Γ2, Ωh2, Ωe2 such that

Γ2 ⊢h H2\Ωh2   (SRM-Goal0)
Γ2 ⊢ e2 : cexp@β2\Ωe2   (SRM-Goal1)
staticize(Σ1, H1) ∪ Ωe1 ∪ Ωh1 ↪^{[ ]} staticize(Σ2, H2) ∪ Ωe2 ∪ Ωh2 ∪ {β2 ≤ β1}   (SRM-Goal2)
Γ2 = Γ1 ⊎ Γ0 for some Γ0   (SRM-Goal3)
Proof. Case (R-Context): By that rule, e1 = E[e], e2 = E[e′], H1, Σ1, e =⇒^∆ H2, Σ2, e′, and cxt(E) = ∆. By (SRM-E1) and Lem. 33 (Subderivations Inside an E Context),

Γ1 Γ1′ ⊢ e : cpe@βpe1\Ωpe1   (SRM-ContextE)
(SRM-ContextE) is the root of a subderivation of (SRM-E1)
dom(Γ1′) ∩ OID = ∅
Γ1 Γ1′(cxt) = d2s(cxt(E), H1)

With dom(Γ1′) ∩ OID = ∅, by Lem. 6 (Right Strengthening and Weakening of Typing Context for Heap Typing) and (SRM-H1), Γ1 Γ1′ ⊢h H1\Ωh1. Previously, it was known that cxt(E) = ∆ and Γ1 Γ1′(cxt) = d2s(cxt(E), H1); hence Γ1 Γ1′(cxt) = d2s(∆, H1). With these facts, together with the previous fact H1, Σ1, e =⇒^∆ H2, Σ2, e′ and (SRM-ContextE), by Lem. 32 (Reduction Type Preservation Under a Calling Context):

(Γ1 Γ1′) ⊎ Γ0 ⊢h H2\Ωh2 for some Γ0   (SRM-ContextH2)
(Γ1 Γ1′) ⊎ Γ0 ⊢ e′ : cpe@βpe2\Ωpe2   (SRM-ContextE′)
Ωh1 ∪ Ωpe1 ↪^{d2s(∆,H1)} Ωh2 ∪ Ωpe2 ∪ {βpe2 ≤ βpe1}   (SRM-ContextIm)
(FV(e′) − OID) ⊆ (FV(e) − OID)   (SRM-ContextFV)

With the definitions of ⊎ and context extension, it is known that dom(Γ1) ∩ dom(Γ0) = ∅. Hence (Γ1 Γ1′) ⊎ Γ0 = (Γ1 ⊎ Γ0) Γ1′. Let Γ2 = Γ1 ⊎ Γ0. (SRM-Goal3) holds obviously. Previously it was shown that dom(Γ1′) ∩ OID = ∅. By Lem. 6 (Right Strengthening and Weakening of Typing Context for Heap Typing) and (SRM-ContextH2), Γ2 ⊢h H2\Ωh2, so (SRM-Goal0) holds. With Lem. 7 (Disjoint Right Strengthening of Typing Context) and (SRM-E1),

Γ2 ⊢ E[e] : cexp@β1\Ωe1   (SRM-ContextEnvN)

With (SRM-ContextE) and Lem. 7 (Disjoint Right Strengthening of Typing Context), (Γ1 Γ1′) ⊎ Γ0 ⊢ e : cpe@βpe1\Ωpe1, which is

Γ2 Γ1′ ⊢ e : cpe@βpe1\Ωpe1   (SRM-ContextEN)

Rewrite (SRM-ContextE′):

Γ2 Γ1′ ⊢ e′ : cpe@βpe2\Ωpe2   (SRM-ContextEN′)

With dom(Γ1′) ∩ OID = ∅, by Lem. 6 (Right Strengthening and Weakening of Typing Context for Heap Typing) and (SRM-H1),

Γ2 ⊢h H1\Ωh1   (SRM-ContextH1)

Previously, it was known that the domains of Γ1 Γ1′ and Γ0 are disjoint. Given cxt ∈ dom(Γ1), cxt ∉ dom(Γ0). Given the relationship between Γ2 and Γ1, it is obvious that Γ1 Γ1′(cxt) = Γ2 Γ1′(cxt) = d2s(∆, H1). Rewrite (SRM-ContextIm):

Ωh1 ∪ Ωpe1 ↪^{Γ2 Γ1′(cxt)} Ωh2 ∪ Ωpe2 ∪ {βpe2 ≤ βpe1}   (SRM-ContextImN)

By assumption (SRM-CT1), Γ1(cxt) = [ ]; therefore Γ2(cxt) = [ ]. With (SRM-Goal0), (SRM-ContextEnvN), (SRM-ContextEN), (SRM-ContextEN′), (SRM-ContextImN), (SRM-ContextFV), (SRM-ContextH1), and Lem. 26 (Redex Substitution), Γ2 ⊢ E[e′] : cexp@β2\Ωe2 and Ωh1 ∪ Ωe1 ↪^{[ ]} Ωh2 ∪ Ωe2 ∪ {β2 ≤ β1}. These are (SRM-Goal1) and (SRM-Goal2).
Case (R-Commute): By that rule, e1 = e ∥ e′, e2 = e′ ∥ e, and H2 = H1. Let Γ2 = Γ1; (SRM-Goal3) holds obviously, as does (SRM-Goal0). By (SRM-E1), the basic properties of subtyping (Lem. 2), and (T-Parallel), Γ1 ⊢ e : ce@βe\Ωe, Γ1 ⊢ e′ : c′e@β′e\Ω′e, and Ωe1 = Ωe ∪ Ω′e. By (T-Parallel), Γ1 ⊢ e′ ∥ e : cexp@β1\Ω′e ∪ Ωe. Thus (SRM-Goal1) holds, with Ωe2 = Ωe1 and β2 = β1. Thus (SRM-Goal2) holds by the reflexivity of ↪^{[ ]}, (C-Flow=), and (C-Union).
Case (R-∥-Elim): By that rule, e1 = v ∥ e, e2 = e, and H2 = H1. Let Γ2 = Γ1; (SRM-Goal3) holds obviously, as does (SRM-Goal0). By (SRM-E1), the basic properties of subtyping (Lem. 2), and (T-Parallel), Γ1 ⊢ e : ce@βe\Ωe and Ωe ⊆ Ωe1. Since βe and ce can be freely picked according to (T-Parallel), we pick βe = β1 and ce = cexp. Hence Γ1 ⊢ e : cexp@β1\Ωe. Thus (SRM-Goal1) holds, with β2 = β1 and Ωe2 = Ωe. (SRM-Goal2) holds by (C-Flow=), (C-Union), and (C-Subset).
Case (R-Post): By that rule, e1 = E[post e], e2 = E[null] ∥ e, and H2 = H1. Let Γ2 = Γ1; (SRM-Goal3) holds obviously, as does (SRM-Goal0). By (SRM-E1) and Lem. 33 (Subderivations Inside an E Context),

Γ1 Γ1′ ⊢ post e : cpe@βpe1\Ωpe1   (SRM-PostE)
(SRM-PostE) is the root of a subderivation of (SRM-E1)
dom(Γ1′) ∩ OID = ∅
Γ1 Γ1′(cxt) = d2s(cxt(E), H1)

With (SRM-PostE), the basic properties of subtyping (Lem. 2), and (T-Post), cpe ∈ supers(Unit), FV(e) ⊆ OID, and

Γ1 Γ1′ ⊢ e : Unit@βpe1\Ωpe1   (SRM-PostIn)

By the definition of supers and the definition of Unit, cpe = Unit. By (T-Null), Γ1 Γ1′ ⊢ null : Unit@βpe1\∅. Note that (SRM-PostE) is the root of a subderivation of (SRM-E1); hence (SRM-PostIn) is the root of a subderivation of (SRM-E1) as well. A simple case analysis reveals that if we replace the subtree rooted at (SRM-PostE) with Γ1 Γ1′ ⊢ null : Unit@βpe1\∅, the resulting tree is a derivation of Γ1 ⊢ E[null] : cexp@β1\Ωe1 − Ωpe1. Previously it was shown that FV(e) ⊆ OID, and it was also known that dom(Γ1′) ∩ OID = ∅; by Lem. 5 (Right Strengthening and Weakening of Typing Context), Γ1 ⊢ e : Unit@βpe1\Ωpe1. With the two judgments above and (T-Parallel),

Γ1 ⊢ E[null] ∥ e : cexp@β1\Ωe1

(SRM-Goal1) holds, where β2 = β1 and Ωe2 = Ωe1. Thus (SRM-Goal2) holds by the reflexivity of ↪^{[ ]}, (C-Flow=), and (C-Union).

C.8 Progress and Initial Configuration Typing
Lemma 35 (Progress). If C ⊢r S, then either S is a deadlock configuration, or it leads to a null pointer or bad cast exception, or there exists some S′ such that S ⇒ S′.

Proof. The only interesting case is (R-Access). Suppose S is not a deadlock configuration, nor does it lead to a null pointer or bad cast exception. Then according to the definition of deadlocks, given S = ⟨H; Σ; ⟨E0[θ0^{o0}]⟩^{⟨p0;c0;m0⟩} ∥ · · · ∥ ⟨En[θn^{on}]⟩^{⟨pn;cn;mn⟩} ∥ e⟩, where H(oi) = ⟨Bi^{ci}; Fdi⟩ for i = 0..n, it is not the case that both pi ∈ roots(H, Σ, o_{(i+1) mod (n+1)}) for all i and θ0 = θ1 = θ2 = · · · = θn = T. Without loss of generality, there are only the following subcases: 1) For some i, θi ≠ T. The precondition progressable(H, Σ, µ, θ, pi, oi) of (R-Access) is then trivially satisfied, and the reduction progresses. 2) θ0 = θ1 = θ2 = · · · = θn = T, but for some i, roots(H, Σ, oi) = ∅. This satisfies the precondition progressable(H, Σ, µ, θ, pi, oi) of (R-Access), and the reduction progresses. 3) θ0 = θ1 = θ2 = · · · = θn = T, but for some i, roots(H, Σ, oi) = roots(H, Σ, pi). This again satisfies the precondition progressable(H, Σ, µ, θ, pi, oi) of (R-Access), and the reduction progresses. 4) θ0 = θ1 = θ2 = · · · = θn = T, and for every i, roots(H, Σ, oi) ≠ ∅ and roots(H, Σ, oi) ≠ roots(H, Σ, pi). According to the way the stack grows and shrinks, it is known that |roots(H, Σ, oi)| ≤ 1. Without loss of generality, there must exist some j such that roots(H, Σ, oj) = {psome} and psome ≠ pi for any i (otherwise a deadlock configuration would form). In this case, expression e must contain some sub-expression of the form ⟨Esome[θsome^{osome}]⟩^{⟨psome;csome;msome⟩} (this can be seen intuitively from the way (R-Msg) and (R-Post) work; rigorously, by induction over the reduction rules), and the rest of the proof follows by induction on the reduction of e.

Lemma 36 (Initial Configuration). If ⊢p C : C and H = [omain ↦ ⟨[tmain]^{Main}; [ ]⟩] for some omain, then there exist some Γ such that Γ(cxt) = [ ], some Ω such that Γ ⊢h H\Ω, and some τ, Ω′ such that Γ ⊢ omain ->[tmain] main(null) : τ\Ω′ and Ω ∪ Ω′ = boot.

Proof. We first prove Γ ⊢h H\Ω. Observe that by the syntactic restriction on the Main class, the class directly derives from Object^{task} with no extra fields. Thus fields(Main) = [this ↦ Main@αmc] for some αmc, and there are no field names f in the domain of the mapping above. Thus by (T-Heap) and (T-HeapCell),

[omain ↦ [tmain]^{Main}] ⊢h H\{[tmain]^{Main} ≤ tmain}

Now let Γ = [omain ↦ [tmain]^{Main}] ⊎ [cxt ↦ [ ]]. Obviously cxt ∉ OID. By Lem. 6 (Right Strengthening and Weakening of Typing Context for Heap Typing), Γ ⊢h H\{[tmain]^{Main} ≤ tmain}. Hence Ω = {[tmain]^{Main} ≤ tmain}.

We now prove Γ ⊢ omain ->[tmain] main(null) : τ\Ω′. By the previous judgment and (T-Obj), Γ ⊢ omain : Main@tmain\∅. The argument of this expression is null; by (T-Null), Γ ⊢ null : Unit@tmain\∅. By the definition of mtype, mtype(⟨Main; main⟩) = ∀A.(Unit → Unit). By the restriction on class Main, modifier(Main) = task. Hence “-> matches modifier(Main) returns Unit” holds by definition. Hence by (T-Msg), Γ ⊢ omain ->[tmain] main(null) : Unit@tmain\{[tmain]^{tmain,main,tmain}} ∪ aC(T, Main@tmain). By the definition of aC, aC(T, Main@tmain) = ∅. Hence the judgment holds, where τ = Unit@tmain and Ω′ = {[tmain]^{tmain,main,tmain}}. By the definition of boot, boot = Ω ∪ Ω′.

C.9 Soundness
Lemma 37 (Soundness). As stated in the paper.

Proof. See Lem. 34, Lem. 35, and Lem. 36.

C.10 Atomicity
We write step str = (S, r, S′) to denote a transition S ⇒ S′ by reduction rule r. We let change(str) = (so, r, s′o) denote the fact that the begin and end heaps of step str differ at most on their state of o, taking it from so = ⟨B^c; Fd⟩ to some s′o. Similarly, the change in two consecutive steps which changes at most two objects o1 and o2, o1 ≠ o2, is represented as change(str1 str2) = ((so1, so2), r1 r2, (s′o1, s′o2)). If o1 = o2, then change(str1 str2) = (so1, r1 r2, s′o1).

In this section we assume ⊢p C : C for some fixed C, and all computation steps are from the well-typed program C. Since programs are all typed, for this section we change the definition of progressable(H, Σ, µ, T, p, o) in Section ?? to make the task case match the shared case – since reductions are over well-typed programs only, by Theorem ?? the condition roots(H, Σ, o) ⊆ roots(H, Σ, p) is equivalent to true. Similarly, the definition of dset(µ, θ, p, o) is also equivalent if changed to make the task case match the shared case.

Since the (R-Access) rule has different atomicity behavior depending on whether it is an access of a non-task object or not, in this section we view that rule as being split down the middle into two equivalent rules: (R-Access-) for the case where µ is non-task (ordinary or shared) or θ ≠ T, and (R-Access-T) for the case of task messaging (shared task or task).

Definition 10 (Local and Nonlocal Step). A step str = (S, r, S′) is a local step if r is one of the local rules: (R-Read), (R-Write), (R-New), (R-NewT1), (R-NewT2), (R-Cast), (R-⟨⟩-Elim), (R-∥-Elim), (R-Cont), (R-Msg), (R-Access-), or (R-Post). str is a nonlocal step if r is one of the nonlocal rules: (R-Access-T) or (R-⟦⟧-Elim).

We bundle the (R-Context) rules that build up a context with the underlying rules, so a step st is always a non-(R-Context) rule followed by 0 or more (R-Context) rules, and the (R-Context) applications are implicit. It is not hard to see that this simplification does not change the meaning of executions. Additionally, the (R-Access-) rules for the R and W cases of θ only write information to Σ for purposes of invariants in subject reduction, and so w.l.o.g. we can assume those firings are replaced with a no-op rule that does nothing.

Atomicity is a property which states that we can re-order arbitrary computation steps of a given thread to be atomically sequenced and “still observe the same results”; to formalize this property we need to formalize what is observable. Ultimately, the only events that are observable are I/O actions: reading and writing to files, the network, etc. Here we prove an “internal” notion of observation, namely that there is an equivalent computation with the exact same nonlocal step firings. Since I/O will be implemented by a shared task, messages to the I/O object will in fact be nonlocal (R-Access-T) steps and will be in our category of observables. The relationship between these internal observables and actual I/O is elaborated more in [?].

In atomicity properties the runtime states S are not going to align; only the actions of the nonlocal rules need to align: the particular rule that fired and what relevant objects it worked on. In order to capture the actions of the nonlocal rules, we extend those rules to a labeled transition system (LTS): every nonlocal rule has a label. To the (R-Access-T) rule we add a label (R-Access-T)(p, o), meaning task p messaged another task object o. To the (R-⟦⟧-Elim) rule we add a label (R-⟦⟧-Elim)(o), meaning task o completed. These labels are used as the observable; the local rules carry no labels, since they are local steps internal to a task and are not observable.

Lemma 38. In any given local step str, at most one object o's state can be changed, from so to s′o (so is null if str creates o).

Proof. Direct, by inspection of the reduction rules.

Definition 11 (Computation Path). A computation path π is a finite sequence of steps str1 str2 . . . str(i−1) stri such that str1, str2, . . . , str(i−1), stri = (S0, r1, S1), (S1, r2, S2), . . . , (Si−2, ri−1, Si−1), (Si−1, ri, Si).

Here we only consider finite paths, as is common in process algebra, which simplifies our presentation. Infinite paths can be interpreted as a set of ever-increasing finite paths. For brevity, we use computation path and path interchangeably when no confusion arises.

Note that by inspection of the rules, the expression e at any point in reduction is always of the form e = ⟦e1⟧^{t1} ∥ . . . ∥ ⟦en⟧^{tn}, indicating that tasks with OIDs t1 . . . tn are currently running in parallel; furthermore, ti ≠ tj for all i, j.

Definition 12 (Observable Behavior). The observable behavior of a computation path π, ob(π), is the label sequence of all nonlocal steps occurring in π.

Definition 13 (Observable Equivalence). Two paths π1 and π2 are observably equivalent, written π1 ≡ π2, iff ob(π1) = ob(π2).

From this point onward, for clarity we add t to the OID set to range over object IDs, the intention being that the t's are in fact tasks at runtime.
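Definitions 12 and 13 transcribe directly into code: equivalence compares only the label sequences of nonlocal steps. The following Java sketch is an illustration only; the Step representation and accessor names are hypothetical:

import java.util.ArrayList;
import java.util.List;

// Hypothetical step representation: a step carries a label exactly
// when it is nonlocal ((R-Access-T) or (R-[[ ]]-Elim)); null otherwise.
record Step(String rule, String label) {}

final class Observables {
    // ob(pi): the label sequence of all nonlocal steps occurring in pi.
    static List<String> ob(List<Step> path) {
        List<String> labels = new ArrayList<>();
        for (Step st : path)
            if (st.label() != null) labels.add(st.label());
        return labels;
    }

    // pi1 is observably equivalent to pi2 iff ob(pi1) = ob(pi2).
    static boolean observablyEquivalent(List<Step> p1, List<Step> p2) {
        return ob(p1).equals(ob(p2));
    }
}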
Definition 14 (Object-blocked). A task t is in an object-blocked state S at some point in a path π if it would be enabled for a next step str = (S, r, S′) for which r is an (R-Access) step on object o, except for the fact that there is a capture violation on o: the progressable(H, Σ, µ, θ, p, o) precondition fails to hold in S, and so the step cannot in fact be the next step at that point.

An object-blocked state is a state in which a task cannot make progress because it has to wait for an object to become available to it.

Definition 15 (Sub-path and Maximal Sub-path). Given a fixed π, for some task t, a sub-path sπt of π is a sequence of steps in π which are all local steps of task t. A maximal sub-path is an sπt in π which is longest: no local t steps in π can be added to the beginning or the end of sπt to obtain a longer sub-path.

Note that the steps in sπt need not be consecutive in π; they can be interleaved with steps belonging to other tasks.

Definition 16 (Pointed Maximal Sub-path). For a given path, a pointed maximal sub-path for a task t (pmsπt) is a maximal sub-path sπt such that either 1) it has one nonlocal step appended to its end, or 2) there are no more t steps ever in the path. The second case is the technical case of when the (finite) path has ended but the task t is still running. The last step of a pmsπt is called its point. We omit the t subscript on pmsπt when we do not care which task a pmsπ belongs to.

Since we have extended the pmsπ maximally and have allowed inclusion of one nonlocal step at the end, we have captured all the steps of any path in some pmsπ (a sketch of this partition follows Lemma 39 below):

Lemma 39. For a given path π, all the steps of π can be partitioned into a set of pmsπ's where each step str of π occurs in precisely one pmsπ, written str ∈ pmsπ.

Proof. This immediately follows from the definition of pmsπ.

Given this fact, we can make the following unambiguous definition.

Definition 17 (Indexed pmsπ). For some fixed path π, define pmsπ(i) to be the ith pointed maximal sub-path in π, where all the steps of pmsπ(i) occur after any of pmsπ(i − 1) and before any of pmsπ(i + 1).

The pmsπ's are the units which we need to serialize: they are all spread out in the initial path π, and we need to show there is an equivalent path where each pmsπ runs in turn as an atomic unit.

Definition 18 (Path around a pmsπ(i)). The path around a pmsπ(i) is the finite sequence of all of the steps in π from the first step after pmsπ(i − 1) to the end of pmsπ(i) inclusive. It includes all steps of pmsπ(i) and also all the interleaved steps of other tasks.

Definition 17 defines a global ordering on all pmsπ's in a path π without concern for which task a pmsπ belongs to. The following Definition 19 defines a task-scope index of a pmsπ, which indicates the local ordering of the pmsπ's of a task within the scope of that task.
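The partition behind Lemma 39 can be computed in one pass: grow one open sub-path per task, and let a nonlocal step be the point that closes the current sub-path of its task. The following Java sketch is illustrative only; the TStep record and its accessors are hypothetical:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical step view: its owning task and whether it is local.
record TStep(String task, boolean local) {}

final class PmsPartition {
    // Partition a path into pointed maximal sub-paths (Definition 16).
    static List<List<TStep>> partition(List<TStep> path) {
        Map<String, List<TStep>> open = new HashMap<>();
        List<List<TStep>> closed = new ArrayList<>();
        for (TStep st : path) {
            List<TStep> cur =
                open.computeIfAbsent(st.task(), t -> new ArrayList<>());
            cur.add(st);
            if (!st.local()) {           // the point: close this sub-path
                closed.add(cur);
                open.remove(st.task());
            }
        }
        // Case 2) of Definition 16: the (finite) path ended while some
        // tasks were still running; their open sub-paths are also pms's.
        closed.addAll(open.values());
        return closed;
    }
}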
Definition 19 (Task Indexed pmsπ). For some fixed path π, define pmsπ_{t,i} to be the ith pointed maximal sub-path of task t in π, where all the steps of pmsπ_{t,i} occur after any of pmsπ_{t,i−1} and before any of pmsπ_{t,i+1}. For a pmsπ_{t,i}, if we do not care about its task-scope ordering, we omit the index i and simply use pmsπt.

Definition 20 (Waits-for and Deadlocking Path). For some path π, pmsπ_{t1,i} waits-for pmsπ_{t2,j} if t1 goes into an object-blocked state in pmsπ_{t1,i} on an object captured by t2 in the blocked state. A deadlocking path π is a path where this waits-for relation has a cycle: pmsπ_{t1,i} waits-for pmsπ_{t2,j} while pmsπ_{t2,i′} waits-for pmsπ_{t1,j′}. Hereafter we assume in this theoretical development that there are no such cycles.

Definition 21 (Quantized Sub-path and Quantized Path). A quantized sub-path contained in π is a pmsπt of π where all steps of pmsπt are consecutive in π. A quantized path π is a path consisting of a sequence of quantized sub-paths.

The main technical lemma is the following Bubble-Down Lemma, which shows how local steps can be pushed down in the computation. Use of such a lemma is the standard technique for showing atomicity properties. In this approach, all state transition steps of a potentially interleaving execution are categorized based on their commutativity with consecutive steps: a right mover, a left mover, a both-mover, or a non-mover. The reduction is defined as moving the transition steps in the allowed direction. We show that the local steps are right movers; in fact they are both-movers, but that stronger result is not needed.

Definition 22 (Step Swap). For any two consecutive steps str1 str2 in a computation path π, a step swap of str1 str2 is defined as swapping the order of application of the rules in the two steps, i.e., applying r2 first and then r1. We let st′r2 st′r1 denote a step swap of str1 str2.

Definition 23 (Equivalent Step Swap). For two consecutive steps str1 str2 in a computation path π, where str1 ∈ pmsπ_{t1}, str2 ∈ pmsπ_{t2}, t1 ≠ t2, and str1 str2 = (S, r1, S′)(S′, r2, S″), if the step swap of str1 str2, written st′r2 st′r1, gives a new path π′ such that π ≡ π′ and st′r2 st′r1 = (S, r2, S*)(S*, r1, S″), then it is an equivalent step swap.

Lemma 40 (Bubble-Down Lemma). For any path π with any two consecutive steps str1 str2, where str1 ∈ pmsπ_{t1}, str2 ∈ pmsπ_{t2}, and t1 ≠ t2, if str1 is a local step, then a step swap of str1 str2 is an equivalent step swap.

Proof. According to Definitions 12 and 13, ≡ is defined by the sequence of labels of nonlocal steps occurring in a path, so a step swap of str1 str2 always gives a new path π′ with π′ ≡ π, since str1 is a local step and swapping it with any str2 never changes the ordering of labels. Therefore, to show that the step swap of str1 str2 is an equivalent step swap, we only need to prove that if str1 str2 = (S, r1, S′)(S′, r2, S″), then the step swap of str1 str2 is st′r2 st′r1 = (S, r2, S*)(S*, r1, S″).

Because str1 is a local step, it can change at most one object's state on the heap H, and a local step does not change Σ except in (R-Access-). We can then represent
str1 as str1 = (⟨H1; Σ1; e1⟩, r1, ⟨H2; Σ2; e2⟩), where H1 and H2 differ at most on one object o.

(Case I) str2 is also a local step. In this case, str2 = (⟨H2; Σ2; e2⟩, r2, ⟨H3; Σ3; e3⟩). Because str1 and str2 are steps of different tasks t1 and t2, by inspection of the rules str1 and str2 must run in different tasks (t1 and t2 respectively). Suppose change(str1) = (so1, r1, s′o1) and change(str2) = (so2, r2, s′o2). If o1 ≠ o2, then it must hold that change(str1 str2) = ((so1, so2), r1 r2, (s′o1, s′o2)). Swapping the order of str1, str2 by applying r2 first and then r1 results in the change ((so1, so2), r2 r1, (s′o1, s′o2)), which by inspection of the local rules has the same start and end states as change(str1 str2), regardless of what r1 and r2 are. If o1 = o2 = o, then change(str1) = (so, r1, s′o), change(str2) = (s′o, r2, s″o), and change(str1 str2) = (so, r1 r2, s″o). We consider each of the local rules for this case, against any other possible local rule. The only local rules that may potentially conflict on the same object are (R-Read)/(R-Write)/(R-Access-): the only other local rule which accesses the object heap or Σ is (R-New), and by inspection of the rules the just-created object name stays local to the current task unless it is later passed out by that task, so there can be no interference. Now, for (R-Read)/(R-Write), each operation must occur in a nearest enclosing intra-task context ⟨·⟩^Z where Z = ⟨B; o⟩, with this being the shared o; however, such a state could not have been arrived at unless progressable held for o, and by inspection of that definition it can hold for at most one task at a time – a contradiction. For the (R-Access-) case, the indicated operation will in fact not complete until after the θ condition on o has succeeded, and by the definitions of dset and progressable at most one task can execute two such statements in sequence on the same object o; thus there cannot have been two different tasks executing these two operations on the same object – a contradiction.

(Case II) str2 is a nonlocal step. Let str1 = (⟨H1; Σ1; e1⟩, r1, ⟨H2; Σ2; e2⟩), where H2 differs from H1 at most on one object o, since str1 is a local step.

Subcase a: r2 = (R-Access-T). Since r1 is a local step, it cannot be another (R-Access-T); it could however be an (R-Read)/(R-Write)/(R-Access-) which could conflict on o. However, if it were any of these three, the progressable precondition on o (which for read/write is, as above, the o on the nearest enclosing context ⟨·⟩^Z where Z = ⟨B; o⟩) would again restrict firing of these rules, since progressable must have held for the (R-Access-T) firing in the other task and the condition can hold for only one task at a time.

Subcase b: r2 = (R-⟦⟧-Elim). The potential problem here is if the o in (R-⟦⟧-Elim) were in fact accessed in str1; however, as in the previous subcase, progressable will again preclude the same object being accessed by (R-Read)/(R-Write)/(R-Access-).

Given this lemma we can now directly prove the Quantized Atomicity Theorem.

Theorem 1 (Quantized Atomicity). For all paths π there exists a quantized path π′ such that π′ ≡ π.

Proof. Proceed by first sorting all pmsπ's of π into a well-ordering induced by the ordering of their points in π. Write pmsπ(i) for the i-th indexed pmsπ in this ordering.
Suppose that there are n pmsπ's in total in π, for some n. We proceed by induction on n to show that for all i ≤ n, π is equivalent to a path πi in which the 1st through ith indexed pmsπ's in this ordering have been bubbled to be quantized sub-paths in a prefix of πi: π ≡ πi = pmsπ(1) . . . pmsπ(i) . . . , where pmsπ(k) is quantized for k = 1 . . . i. With this fact, for i = n we have π ≡ πn = pmsπ(1) . . . pmsπ(n), where pmsπ(k) is quantized for k = 1 . . . n, proving the result.

The base case n = 0 is trivial, since the path is empty. Assume by induction that all pmsπ(i) for i < n have been bubbled to be quantized sub-paths, and that the bubbled path πi = pmsπ(1) . . . pmsπ(i) . . . , where pmsπ(k) is quantized for k = 1 . . . i, has the property πi ≡ π. Then the path around pmsπ(i + 1) includes steps of pmsπ(i + 1) or pmsπ's with bigger indices. By repeated applications of the Bubble-Down Lemma, all the local steps that do not belong to pmsπ(i + 1) can be pushed down past its point, defining a new path πi+1. In this path, pmsπ(i + 1) is also now a quantized sub-path, and πi+1 ≡ π because πi ≡ π and the Bubble-Down Lemma, which turns πi into πi+1, does not shuffle any nonlocal steps, so πi ≡ πi+1.
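The induction above is effectively a stable reordering: foreign local steps are repeatedly swapped past the steps of the current pmsπ. The toy Java sketch below (reusing the hypothetical TStep record from the earlier partition sketch) manipulates step sequences only; the obligation that the swapped runs reach the same states is exactly what Lemma 40 establishes, so this is an illustration, not a proof:

import java.util.ArrayList;
import java.util.List;

// Toy illustration of one Bubble-Down application: swapping an adjacent
// (local step, other-task step) pair never changes ob(pi), since only
// nonlocal steps carry labels.
final class BubbleDown {
    static List<TStep> bubbleDownOnce(List<TStep> path, int i) {
        // Swap steps i and i+1; legal when path.get(i) is local and the
        // two steps belong to different tasks (Lemma 40's side condition).
        TStep a = path.get(i), b = path.get(i + 1);
        if (!a.local() || a.task().equals(b.task()))
            throw new IllegalArgumentException("not a legal bubble-down swap");
        List<TStep> out = new ArrayList<>(path);
        out.set(i, b);
        out.set(i + 1, a);
        return out;
    }
}

Repeatedly applying bubbleDownOnce to push foreign local steps past a pmsπ's point yields a path in which that pmsπ is consecutive, mirroring the quantization of Theorem 1.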