Verifying Stateful Programs with Substructural State and ... - Microsoft

1 downloads 0 Views 420KB Size Report
an object invariant in which the name self refers to the object it- self. Note that the .... For this, we provide functions in Prims that construct and destruct lists while .... Errors that may be reported when verifying FX programs include type mismatch ...
Verifying Stateful Programs with Substructural State and Hoare Types Johannes Borgstr¨om ∗

Juan Chen

Nikhil Swamy

Uppsala University [email protected]

Microsoft Research, Redmond [email protected]

Microsoft Research, Redmond [email protected]

Abstract A variety of techniques have been proposed to verify stateful functional programs by developing Hoare logics for the state monad. For better automation, we explore a different point in the design space: we propose using affine types to model state, while relying on refinement type checking to prove assertion safety. Our technique is based on verification by translation, starting from FX, an imperative object-based surface language with specifications including object invariants and Hoare triple computation types, and translating into Fine, a functional language with dependent refinements and affine types. The core idea of the translation is the division of a stateful object into a pure value and an affine token whose type mentions the current state of the object. We prove our methodology sound via a simulation between imperative FX programs and their functional Fine translation. Our approach enables modular verification of FX programs supported by an SMT solver. We demonstrate its versatility by several examples, including verifying clients of stateful APIs, even in the presence of aliasing, and tracking information flow through sideeffecting computations.

1.

Introduction

Several recent papers propose a verification methodology for stateful functional programs by developing Hoare logics for the state monad (Nanevski et al. 2006; Swierstra 2009; Borgstr¨om et al. 2010). Tools based on this approach are known to be powerful. For example, the Ynot tool has been used to carry out interactive proofs of correctness for programs that manipulate B+ trees (Malecha et al. 2010), a pointer structure with tricky sharing properties. While such successes make a good case for interactive machineassisted proofs in a Hoare logic for the state monad, this paper explores a different point in the design space of a program logic for stateful functional programs. We are motivated primarily by a desire to automate the modular verification of functional programs that make use of local state. We aim to develop a verification methodology based on classical first-order logic, for which powerful off-the-shelf solvers already exist. (In contrast, tools like Ynot work with separation logic to recover modularity, but automation for separation logic remains a significant challenge, although the situation is improving (Chlipala et al. 2009).) Our work considers a well-known alternative to monadic state as the basis of a verification methodology: substructural state, by which we mean state modeled using linear (use-once) or affine (use-at-most-once) types (Wadler 1990). Our insight is that the problem of verifying stateful functional programs can be factored into two pieces. First, we can use an affine type system to model ∗ The

first author was employed by Microsoft Research, Cambridge, for the duration of his work on this paper.

the stateful behavior of the program in a purely functional style. Then, to prove the safety of assertions in the program, we can rely on automated refinement type checking for a first-order classical logic. Affine types partition the state of the program into a number of disjoint pieces, thus yielding a modular verification procedure without necessitating the use of the separation logic connectives. In addition to improved automation, modeling state using affine types is attractive since it enables combinations of Hoare logic with other program verification disciplines. We work out one such example in detail in §5.1, where we combine our basic approach with a discipline of fractional capabilities to control aliasing. Concretely, our work is grounded in the context of Fine (Swamy et al. 2010), a purely functional language with a type system based on a combination of dependent refinements and affine types. Our contributions include the following: • We present FX (§3), an extension of the term syntax of Fine

with stateful commands including object allocation, deletion and mutation. We also extend the type language with Hoare types (Nanevski et al. 2006) to give specifications (in the style of Hoare triples) to stateful code in FX. • We show how to translate FX programs to Fine (§4), using

Fine’s affine types to model state, and dependent refinements for the translation of Hoare types. The core idea is to divide a stateful object into a pure value and an affine token whose type mentions the current state of the object. We prove that our translation is a simulation, which implies that refinement type checking in Fine guarantees assertion safety for FX. • We demonstrate the effectiveness of our approach on a variety

of examples (§5), which, although simpler than the B+ trees verified interactively in Ynot, are still known to be challenging. Our examples include a conference management program previously verified for authorization properties by Swamy et al. (2010); a client of a stateful API of collections and iterators studied by Bierhoff and Aldrich (2007); and an encoding of information flow tracking suitable for use with programs that may leak information via side effects (an enhancement of a technique studied by Swamy (2008)). Although the results of this paper are limited (primarily) to firstorder programs, we discuss ongoing work that naturally extends our approach to the higher-order case in §5.3. The remainder of this section presents a short overview of our work, highlighting some of the technical challenges we solve. 1.1

Marrying substructural and dependent types

Linear types have been known as an effective tool for modeling state for the last 20 years (Wadler 1990), roughly as long as the monadic approach. Yet, with a few notable exceptions, verification tools and programming languages have only seldom adopted linear

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

types as model of state. One reason for this, particularly in the typebased approach to verification, is the subtle interaction between linear and dependent types. The metatheory of an arbitrary mixture of these types is usually considered intractable. For example, Linear LF (Cervesato and Pfenning 2002), perhaps the canonical presentation of linearity and dependency, restricts linear values from appearing as indices in types. Fine has a similar restriction on affine indices, placing seemingly severe limitations on expressiveness. For a sense of the difficulty, consider implementing a function incr that increments an affine integer x:aInt. One might write it as λ x:aInt. x+1, consuming the original integer x and threading back the incremented version to the caller. We would like to give this function a more precise type, such as x:aInt → {y:aInt | y>x}. This is the type of a dependent function from affine integers x to affine integers y, where the refinement type to the right of the arrow states that the returned value y is greater than x. However, this type is illegal in Linear LF and Fine, because the refinement formula y>x is interpreted as a type, and the affine values x and y cannot be used at the type level. A key technical contribution of our work is to surmount this difficulty by using a combination of value-indexed types and affine capabilities (Walker et al. 2000) to permit unrestricted refinements of stateful values. In our solution, the type of an increment function, say incr2, could be x:int → Token x → (y:{y:int | y > x} ∗ Token y), where Token :: int ⇒ A is a dependent type constructor that constructs an affine type from an int-typed value. (The kind of affine types is A .) The type of incr2 indicates that to increment x, client code must present a token for x. The incr2 function destructs the token for x, produces y, and a new token for y. By making Token x affine, we ensure that the client can no longer use x in a context that expects a mutable integer. By indexing the type of the token with the value of x, we ensure that a token of one value cannot be confused with a token for another. Finally, by separating x from its token, we can use x in refinement formulas, e.g., in the formula y > x in the return type of incr2. 1.2

Shielding programmers from affinity

Clearly, requiring programmer to directly manipulate tokens is far from optimal—mutable values and their tokens have to be threaded through the program in a functional style. Additionally, exposing affinity to the programmer is also questionable, since it limits the use of library code—data structures that contain affine values become affine themselves, making it hard to use, say, a standard non-affine lists to hold affine values. To counter these difficulties, we present FX, a surface syntax with imperative commands that hides the low-level plumbing of tokens and affinity from the programmer. To illustrate FX, consider a snippet of an FX implementation of ConfWeb, a verified conference management tool we had previously implemented directly in Fine. 1 type review = {[txt:string; u:user]} 2 val upd review: r:review → x:string → 3 {(s1) >} :unit {(s2) s2(r).txt=x && s2(r).u=s1(r).u} 4 let upd review r x = r.txt := x

Here, we define a mutable object type review of reviews, with fields for review text and the reviewer u. Line 4 shows the implementation of a function upd review that updates a review r by assigning to its txt field. FX programmers can use imperative commands like assignment rather than threading mutable objects through the program. Mutable objects can also be placed in standard data structures, without having to re-implement libraries to deal with affinity. Specifications in FX are written using Hoare types, a notation resembling Hoare triples. Lines 2-3 show the specification of upd review, with the form x:t1 → {(s1) Φ }r:t2 {(s2) Ψ }, where the formula Φ is a pre-condition on the function argument x:t1 and prestate s1; r:t2 is the name and type of the return value; and Ψ is a

post-condition on the argument x, the pre-state s1, the return value r, and the post-state s2. The main technical development of this paper is a translation from FX to Fine that introduces affine tokens and threads stateful values and their tokens through the program. The main results are Theorem 1, which establishes that the translation is a simulation, and Corollary 2, which establishes progress for FX programs that translate to well-typed Fine programs.

2.

Overview

We begin with a brief review of the Fine programming language, then discuss ConfWeb, a conference management application we previously implemented in Fine. This application motivates, in part, the need for better support for imperative features in Fine. §2.2 presents the design of FX using examples from ConfWeb. §2.3 presents informally our methodology of verifying FX programs via translation to Fine. 2.1

A brief review of Fine

Fine is an experimental, purely functional programming language on the .NET platform. The principal novelty of Fine is in its type system, which is designed to support static verification of safety properties via a mixture of refinement and substructural types. Our work on Fine has, thus far, focused primarily on verifying security properties, including properties related to authorization policies and information flow controls. However, by design, the type system of Fine has little in it that is security-specific. This paper is a step towards putting Fine to use for the general-purpose verification of programs that mix both functional and imperative idioms. We discuss several elements of Fine’s design next—for details, consult our prior papers (Swamy et al. 2010; Chen et al. 2010). Value-indexed types. Types in Fine can be indexed both by types (e.g., list int) as well as by values. For example, array int 17 could represent the type of an array of 17 integers, where the index 17:nat is a natural number value. Fine prohibits non-value expressions from appearing as type indices, in effect forbidding type-level computations. This restriction limits expressiveness, but considerably simplifies Fine’s metatheory and implementation. Dependent function types. Functions in Fine are, in general, given dependent function types, i.e., their range type depends on their argument. Dependent function types are written x:t → t’, where the formal name x of the parameter of type t is in scope in t’. For example, the type of a function that allocates an array of n integers can be given the type n:nat → array int n. When a function is nondependent, we simply drop the formal name. Refinement types. A refinement type in Fine is written {x:t | φ }, where φ is a formula in which x may appear free. Formulas are drawn from the same syntactic category as types, although, for readability, we typeset formulas differently and use distinct metavariables (φ versus t). In practice, types are refined by formulas from a first-order logic with equality, extended with user-defined predicates. For example, we may give the following type to a list permutation: ∀α .l:list α → {m:list α | ∀x:α . In x l ⇔ In x m}. Universal quantifiers in formulas are represented using dependent arrows and existential quantifiers using dependent pairs; In is a userdefined predicate for list membership; and connectives like ⇔ are represented using indexed types (we elaborate below). Refinement type checking. A refinement type {x:t | φ } is inhabited by values v:t, for which φ [v/x] is derivable. Derivability is defined with respect to assumptions induced by the program context (e.g., equalities due to pattern matching) as well as a set of assumptions that axiomatize user-provided predicates. We formalize this using an LCF-style (Milner 1979) kernel for Fine, which contains the inference rules for a classical first-order logic with equality, ex-

2

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

tended with constructors corresponding to user-provided axioms. As such, user-provided axioms must be used with care, since faulty axioms compromise soundness. Derivability is decided by relying on Z3 (de Moura and Bjorner 2008), an SMT solver. Formally, refinement types are viewed as Σ-types. However, we include a program transformation (somewhat similar to the coercions used by Sozeau (2006)) that systematically inserts pack/unpack operations so as to equip refinement types with a subtyping relation that allows programmers to view {x:t | φ } as a subtype of t. Affine types. Dependent refinements in Fine bear close resemblance to constructs found in related languages like F7 (Bengtson et al. 2008) and Sage (Flanagan 2006), despite several technical differences. A substantial difference in Fine however, is the addition of affine types. The interaction between dependent and affine types in Fine is strictly regulated by the no-affine-indices restriction— this prohibits the use of values with affine types as type indices. As such, Fine is closely related to Linear LF (Cervesato and Pfenning 2002), which integrates linear and dependent types, with a similar restriction that prevents linear resources being mentioned in types. This paper shows that even with this restriction, the combination of dependent and affine types available in Fine provides a powerful set of primitives for verifying programs that use mutable local state. Kind language. To enforce the no-affine-indices restriction (as well as to keep account of type constructors) Fine employs a system of kinds. The (slightly simplified) syntax of kinds is shown below: kinds

k

::=

? | A | α ::k ⇒ k | x:t ⇒ k

Types in Fine are divided into two basic kinds: ?, the kind of normal non-affine types, and A, the kind of affine types. A ?-kinded type can be coerced to the A universe using the modality ¡, e.g., int::? , while ¡int::A . Type constructors are given arrow kinds, which come in two flavors. The first, α ::k ⇒ k0 is the kind of type functions that construct a k0 -kinded type from a k-kinded type α . Just as at the term level, type-level arrows are dependent—the type variable α can appear free in k0 . Type functions that construct value-indexed types are given a kind x:t ⇒ k, where x names the formal of type t and x can appear free in k. In both cases, when the kind is nondependent, we simply drop the formal name. The no-affine-indices manifests itself as a restriction on kinds x:t ⇒ k, where the domain type t must have kind ?. For example, the kind of list is ? ⇒ ? ; the kind of the value-indexed array constructor is ? ⇒ nat ⇒ ? ; the kind of the propositional connective And is ? ⇒ ? ⇒ ? ; the kind of the user-defined predicate In is α ::? ⇒ α ⇒ list α ⇒ ? . Module system. Aside from the core type system, Fine has a simple module system allowing us to define certain types private to a module, which in effect forces clients of the module to view values of these private types abstractly. Fine uses the module system to prove secrecy and authenticity properties. In this paper, the module system comes in handy for the definition of unforgeable capabilities and the corresponding authenticity property plays a crucial role in proving the correctness of our translation from FX to Fine. Proof-carrying compilation. Finally, it is worth mentioning that Fine is compiled in a proof-carrying style to DCIL, .NET bytecode enhanced with dependent and affine types in a backwardscompatible way. In addition to certification, compilation to DCIL allows Fine to interoperate easily with the other .NET languages. By virtue of the translation of the forthcoming sections, FX too enjoys a proof-carrying translation to DCIL. 2.2

ConfWeb: A first taste of FX

Our prior work on Fine included the development of an application ConfWeb, a conference management tool based on the Continue server (Krishnamurthi 2003). Continue’s behavior is governed by an authorization policy, modeled with liberal use of object-

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

type user=int type NoConflict :: user ⇒ user ⇒ ? assume nc sym: ∀u1, u2. NoConflict u1 u2 ⇒ NoConflict u2 u1 val checkNC: u1:user → u2:user → {b:bool | b=true ⇔ NoConflict u1 u2} type review = {[txt:string; u:user]} type paper = {[authors:list user; revs:list review | ∀u:user, r:review. (In u self.authors && In r self.revs) ⇒ NoConflict u r.u ]} (∗ Allocation function with a Hoare type ∗) val mk review: txt:string → u:user → {(s1) >} r:review {(s2) s2(r).txt=txt && s2(r).u=u}

let mk review txt u = new review{[txt=txt; u=u]} (∗ Pure library function can operate on a list of mutable objects ∗) val for all: ∀α ::? , P::α ⇒ ? . f:(x:α → {b:bool | b=true ⇔ P x}) → l:list α → {b:bool | b=true ⇔ ∀x:α . In x l ⇒ P x} (∗ Impure function mutates a paper; also given a Hoare type ∗) val add review: p:paper → r:review → {(s) not(In s(r) s(p).revs)} o:option review {(t) (In s(r) t(p).revs || t(o)=Some s(r))&& ...}

let add review p r = if for all (checkNC r.u) p.authors then let tl = (p.revs :=: Nil) in p.revs :=: Cons r tl; None else Some r

Figure 1. Invariants on mutable objects in ConfWeb orientation and mutable state in Alloy (Jackson 2002). A basic data structure in the model is a paper, an object with mutable fields holding a paper submission and list of its associated reviews. An invariant on paper ensures that the reviews of an paper are from reviewers not conflicted with the paper’s authors. Our prior implementation of ConfWeb modeled many of these stateful features, but some features (including invariants on mutable objects like paper) proved to be too cumbersome to implement directly in Fine. In this section, we introduce FX, a surface syntax for Fine with direct support for mutable objects. FX considerably simplifies programming with mutable objects and makes an implementation of ConfWeb more faithful to its original specification. Subsequent sections shows how mutable state can be systematically translated away via translation to purely functional Fine programs. Figure 1 shows a fragment of ConfWeb written in FX. The key data structure in ConfWeb is paper (line 6), a mutable object with fields containing the authors of a paper and its reviews (in addition to other fields, not shown). We verify that this program preserves the no-conflict invariant on paper. The full program translates to a 109-line Fine program that is verified automatically in 8 seconds. Mutable objects and invariants. FX extends Fine with a notation for records with mutable fields—we call these types objects, although our core calculus for FX does not include other OO features like inheritance. Object types are of the form {[f1:t1;...;fn:tn | φ]}, where each fi is a mutable field of type ti, and the formula φ is an object invariant in which the name self refers to the object itself. Note that the values of non-object types like string and int are immutable. The paper object and its invariant. Line 6 defines paper, an object with a field authors (a list of immutable integers, standing for user ids); and a field revs, a list of objects representing the reviews of a paper. The invariant on paper is a formula which uses two user-defined predicates. The first predicate, In, stands for list membership. The predicate NoConflict is a binary predicate to reflect the absence of conflict-of-interest between a pair of users. Line 2 shows the kind of NoConflict and line 3 shows a user-provided axiom that asserts that NoConflict is a symmetric relation. Rather than provide further axioms to define NoConflict, at line 4 we show the signature of a trusted external function checkNC, where we use dependent refinement types in Fine to assert that checkNC decides the NoConflict relation. The implementation of this function consults

3

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

a database that records conflict-of-interest declarations by authors and PC members of the conference—interoperability between Fine and the rest of .NET makes it easy to implement this function in F# and to call it from FX. Impure functions and Hoare types. At line 11, we show the implementation of a function mk review, which allocates a new review object. Its specification at lines 9-10 shows the type of an impure function, which, in general, has the form x:t → {(s1) Φ }r:t’ {(s2) Ψ }, as explained in §1.2. For mk review, the pre-condition Φ is trivial— >, which always holds. The post-condition Ψ contains terms of the form s2(r)—this stands for the value of the mutable object r in the state s2. In this case, s2(r) is a record value of the object type review. The post-condition of mk review states that the fields of the returned object r contain the arguments passed to the function. Interacting with libraries. Next, at lines 13-15, we show the type of for all, a library function on lists. This is a higher-order function whose argument f is a (pure) boolean-valued function over the α typed elements of its second argument l, where f decides some property P of the list elements. The type we give to for all is polymorphic in both α and P. The return type indicates that for all returns true if, and only if, f returns true on each element of l. The type also indicates that both f and for all are pure, since neither of them has a Hoare type. This pure function can be programmed in Fine and verified against its specification. The implementation is entirely standard—we omit it here for brevity. Mutation and controlled aliasing. Finally, at lines 17-23, we show the type and implementation of a function add review that tries to add a review to a paper. Informally, the implementation checks that the review’s author is not conflicted with any of the paper’s authors and, if the check succeeds, adds the review to the paper and returns None. Otherwise, add review leaves the paper unchanged and returns the review to the caller. The type shows an impure function, where the precondition requires the review r to not be present in the paper p’s revs list; the returned value is an option review, where the post-condition shows that the review r has been stored in the paper p, or that r is contained in returned option value. The implementation of add review reveals several subtleties of FX. First, to properly model stateful invariants on objects, FX forbids constructing aliases to mutable objects. (§5.1 shows how to relax this restriction.) If a mutable object o is stored in another object c’s f field, then a field projection c.f constructs an alias to the object o. As a result, FX does not permit unrestricted field projection. Instead, we provide other operations to emulate field projections (which get compiled to a small set of function calls in Fine). At line 22, we call for all, providing explicit instantiations for its type and predicate parameters, α and P respectively. We apply checkNC to r.u; project p.authors; and pass both to for all. The projections here are safe since the type of for all is pure, and hence does not create any unsafe aliases to p. Additionally, since the user type is immutable, it is always safe to project r.u. If the no-conflict check fails, we return r to the caller. Otherwise, we update the paper, adding the new review r to p.revs and returning None. The update uses FX’s swap command, c.f :=: v, which replaces the f field of c with the value v and returns the original contents of c.f. We first swap out the original list of reviews from p; add our new review to it; and then swap the extended list back in. Note that it would also be safe to project p.revs and use it in a normal assignment p.revs := Cons r p.revs, since this creates no aliases. Projections of fields that contain primitive or immutable types are also safe. In both branches of add review, the mutable object r is stored within another object—in the then-branch, within p; in the elsebranch, in the returned option object. Since creating unrestricted aliases to mutable objects is forbidden, and the caller retains a reference to r, we need to indicate to the caller that its reference

r is invalidated. We use the absence of a sub-term t(r) in the postcondition to indicate that r is invalidated in the post-state, i.e., that

it may no longer be used by the caller. By contrast, the type of upd review shows that the review r is still accessible to the caller by mentioning s2(r) in its post-condition. Currently, FX cannot express that a reference has been conditionally consumed, e.g., returning a boolean success code b from add review to indicate in the postcondition that r is consumed only if b=true. 2.3

Verifying FX programs via translation to Fine

This section sketches the translation of FX to Fine. The main idea is to thread record values corresponding to mutable objects through the program in a purely functional style. For this functional translation to be sound, we also thread affine capabilities for each object to render stale record values unusable. As discussed in §1, by making the capabilities affine, while giving record values ?-kinded types, we circumvent the no-affine-indices restriction and give precise refinement types (corresponding to object invariants, pre- and postconditions) to the functional translation of imperative FX code. As a result, we can rely on Fine’s automated refinement type checking system to verify FX programs. A module for FX primitives. The translation is organized around a Fine module, Prims, which collects definitions of FX primitives. Certain types will be private to Prims, and the metatheory of Fine (specifically, the value abstraction theorem provided by its module system) guarantees that clients of Prims treat these types abstractly. Objects and constructors. An object with mutable fields in FX, say r:review, where review = {[txt:string; u:user]} from Figure 1, is represented (roughly) in Fine as a pair of 1) a record value r:review where review is a record type in Fine, {txt:string; u:user}; and 2) a token value v:Token r, used as a capability for r, where Token::α ::? ⇒ α ⇒ A, is a constructor of an affine, value-indexed type. Given an object type declaration in FX, we generate several public functions in Prims, corresponding to constructors, field updates, etc., as shown below. Imperative commands in FX get translated into calls to these functions. 1 2 3 4 5 6 7 8

private type Token:: α ::? ⇒ α ⇒ A = | MkToken: x:α → Token x private type review = {txt:string; u:user} val mkReview: txt:string → u:user → unit → (r:rev ∗ Token r ∗ { :unit | r.txt=txt && r.u=u}) val reviewUpdTxt: r:review → txt:string → (Token r ∗ unit) → (r’:review ∗ Token r’ ∗ { :unit | r’.txt=txt && r’.u=r.u}) val reviewReadTxt: r:review → (Token r ∗ unit) → (txt:string ∗ Token r ∗ { :unit | r.txt=txt})

The function mkReview is the constructor function of review. In addition to the constructed record value r and its token, it returns a unit value refined with a formula recording information about the contents of r. The reviewUpdTxt function updates the txt field. Its arguments include r the object to be updated and its token; the return type shows r’ and its token, where the refined unit shows that r’ differs from r in only its txt field. We also show reviewReadTxt, a pure function that serves as a projection function for the txt field. Note that all three functions take seemingly redundant unit-typed arguments and return refined units—we include these because, as we shall see, they help keep our translation uniform. Finally, the Token type, which has a single constructor MkToken, is declared private to Prims to ensure that its values cannot be manufactured directly by clients. Likewise, the review record type is also private. Stateful operations on r:review (e.g., updating one of its fields) require a client to present a v:Token r attesting to the validity of r; the operation consumes the token, rendering r invalid for further use in stateful operations. Pure operations like reviewReadTxt thread the token back to the caller for further use. Hoare types. The Fine type below is the translation of the FX type r:review → {(s1) s1(r).u=1} x:int {(s2) s2(r).u=s1(r).u+x}.

4

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

Nested objects and object invariants. Translating objects with nested objects does not add any further complexity. Object invariants too are easily translated using refinement types. The translation of the FX type paper (from Figure 1) is shown below. The constructor and destructor are similar to those for list and thus omitted.

1 r:review → (Token r ∗ { :unit | r.u=1}) → 2 (x:int ∗ r’:review ∗ Token r’ ∗ { :unit | r’.u=r.u+x})

This type shows a function that takes as arguments tokens for all the variables x where s1(r) is a subterm in the pre- or post-condition; in this case a token for r, since only s1(r) is mentioned. In addition to the token, we have a unit argument refined with the translation of the pre-condition. The return type of the function shows x:int the value returned by the impure FX computation. We also include in the return type all variables y where s2(y) is a subterm of in the post-condition (i.e., r,); tokens for these values; and, the translation of the post-condition formula expressed as a refinement on a unit value. The scoping rules of dependent functions, pairs (aka sums), and refinement types naturally allow the post-condition to describe properties of the initial value of r in the pre-state, the returned value x, and the value of r in the post-state, (i.e., r’). Adapting libraries to work with mutable objects. A significant advantage of our token-based translation is that it lends itself naturally to the re-use of types and libraries that were originally constructed for purely functional programs. For example, we show below how the standard list library can be adapted to work with lists of mutable objects. The list type shown below is the standard type for a list of non-affine values. FX programs are free to use the constructors of list directly to store immutable values within them, and to pattern match on lists to extract immutable values as well. But, we require a bit more care to store mutable objects within lists, since the constructed list would capture a reference to the object. For this, we provide functions in Prims that construct and destruct lists while managing tokens appropriately. 1 2 3 4 5 6 7

type list :: ? ⇒ ? = Nil : list α | Cons : α → list α → list α val mkCons: hd:α → tl:list α → (Token hd ∗ Token tl ∗ unit) → (l:list α ∗ Token l ∗ { :unit | l=Cons hd tl}) val unCons: l:list α → (Token l ∗ { :unit | ∃hd,tl. l=Cons hd tl}) → (hd:α ∗ tl: list α ∗ Token hd ∗ Token tl ∗ { :unit | l=Cons hd tl}) val mkNil: unit → (l:list α ∗ Token l ∗ { :unit | l=Nil})

The function mkCons takes an object value hd:α and a list of object values tl:list α ; tokens for each of them; consumes the tokens for hd and tl and returns a new list l, a token for l, and a refined unit capturing the structure of l. While it is possible for a client program to pattern match against l to extract its components, to mutate any object in l, a client must destruct l by presenting l and its token to unCons; proving a pre-condition that l is indeed a Cons-cell; and receiving the components of l, tokens for each of them, and a postcondition relating the components to the original l. By adopting a token-based approach, not only have we been able to circumvent the no-affine-indices restriction, but we have been able to store mutable objects in standard lists, operate over these lists using the standard implementations of map, fold, for all, etc., since these functions are pure. Impure functions that need to explicitly manipulate the objects in a list, need to be written and verified in FX. Even if the no-affine-indices restriction were to somehow be lifted, representing mutable objects using affine types directly would lead to significant difficulties. For example, if our translation were to translate the mutable object rev in FX to an affine record arev::A, then we would need an alternative type of list, say alist :: A → A, in which to store arev::A values and a duplicate standard library to operate over alist α values (which would be much more cumbersome to write, since we would have to pay attention to the affinity of both the list and of the α -typed values it contains). Even in a system like F◦ (Mazurak et al. 2010), which supports sub-kinding between normal and linear kinds, we would have to construct separate libraries to work with linear and nonlinear lists. Tokens make life much easier!

1 private type paper = {self:{authors:list user; revs: list rev} | 2 ∀u:user, r:rev. In u self.authors && In r self.revs ⇒ NoConflict u r.u } 3 val swapRevs: p:paper → newrl:list rev → (Token p ∗ Token newrl ∗ 4 { :unit | ∀u:user, r:rev. In u p.authors && In r newrl ⇒ NoConflict u r.u}) → 5 (oldrl:list rev ∗ p’:paper ∗ Token oldrl ∗ Token p’ ∗ 6 { :unit | p’.authors=p.authors && p’.revs=newrl && p.revs=oldrl})

The refinement on the paper record shows the invariant from FX. Swapping a new value newrl into the revs field of a paper p calls swapRevs and has to prove that newrl satisfies the object invariant. The tokens of the old record p and the new field value newrl are consumed. The token for the old field value oldrl is available again. As an example, we show how to translate an FX expression p.revs :=: Cons r tl; None using swapRevs: 1 2 3 4

let newrl, tok newrl = mkCons r tl (tok r, tok tl, ()) in let oldrl, p’, tok oldrl, tok p’, = swapRevs p newrl (tok p, tok newrl, ()) in let none, tok none, () = mkNone () in none, p’, tok none, tok p’, ()

At line 1, we construct the new review list newrl by calling mkCons. We then call swapRevs to swap newrl in and the old field value oldrl out, creating a new record p’. We create a record none corresponding to None at line 3. At line 4, in addition to returning none, we thread out p’ and its token, since it represents an updated object that may be accessed later. 2.4

Understanding FX error messages

Errors that may be reported when verifying FX programs include type mismatch, e.g., passing an integer instead of a string to up review, and assertion failures, i.e., assertions in FX programs are translated to unsatisfiable formulas in Fine. Another kind of errors relates to aliasing and is more subtle. Aliasing restrictions in FX are violated if a program tries to use an object that is unavailable, because the object has been deleted, or nested inside another object, or assigned to a local variable. More experience with FX programming and implementation in the future would give us more insights into user-friendly error reporting. 2.5

DCIL

and the efficient execution of FX programs

The purely functional model of FX in terms of Fine, while convenient for verification and equational reasoning, leaves much to be desired in terms of producing an efficient execution strategy for FX programs. While not the main focus of this paper, in this section, we briefly consider how FX programs may be executed efficiently by translating them first to Fine and thence to DCIL—a dependently typed bytecode language to which we compile Fine in a type-preserving style. We begin with a few brief remarks to introduce DCIL. DCIL is a typing discipline for a purely functional fragment of the .NET byte code language (CIL) (ECMA 2006). It extends the type system of CIL with value-indexed types and affine types, but not refinement types. DCIL is backwards compatible with CIL, and DCIL programs can be executed on standard .NET virtual machines. Conceptually, our type-preserving compilation strategy encodes refinement types in Fine using dependent pairs in DCIL.A value v of a refined type {x:t | φ } in Fine is translated to a t-typed value v and a proof of the refinement formula φ [v/x]. A consequence of our type-preservation results for Fine and DCIL is that FX programs also enjoy compilation in a proof-carrying style to .NET byte code. However, after verification of DCIL byte code, we may choose to optimize the program by, say, erasing computationally irrelevant

5

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

fragments of the program such as proof terms. In what follows, we discuss a few further optimizations that aim, roughly, to undo the functional encoding of state wrought by the translation from FX to Fine—of course, we have yet to implement any of these optimizations. Erasure of tokens and units. The first ineffciency in our translation of FX to Fine is the introduction of affine tokens and refined unit values. Tokens are threaded through computation to witness the accuracy of the functional model of state, and the unit values appear as placeholders which witness the validity of pre- and post-conditions—both of these uses are computationally irrelevant. Since both tokens and units are identifiable by their types, we expect a type-based erasure of these fragments of the program to be a simple first optimization. In-place updates. A convenient aspect of our translation is that the parts of the program that perform field updates are demarcated as separate functions within the Prims module, e.g., field updates and swaps are modeled by functions like reviewUpdTxt and swapRevs. These functions are expensive since they copy their arguments rather than performing in-place updates. While Fine itself has no facilities to model in-place updates, DCIL programs, because they are also CIL programs do not have this restriction. A straightforward strategy, then, is for functions like reviewUpdTxt to be implemented in CIL using in-place updates. We can then rely on the .JIT compiler to inline the function call. Elimination of threading. A final inefficiency of our translation is that mutable objects are always threaded through computations; although, clearly, once mutation is implemented using in-place updates, threading objects is redundant. A pass to eliminate threading would be a nice optimization. However, in contrast to the two previous optimizations, elimination of threading requires a somewhat more sophisticated whole-program transformation.

3.

Formalization of core FX

This section formalizes the syntax and the dynamic semantics of a core subset of FX. As mentioned previously, FX is considered a surface syntax for Fine. FX has no static semantics of its own. Instead, we introduce the notion of type shapes for FX and formalize a simple syntactic analysis of FX to compute these shapes. Type shapes serve only to facilitate the translation of FX to Fine (§4), where the types of an FX program can be interpreted using Fine’s type system. We show that this interpretation is sound by proving a simulation between FX programs and their Fine counterparts. Core FX is a lambda calculus augmented with imperative commands for object allocation, deletion, and mutation. It uses a monomorphic typing discipline based around dependent functions, Hoare types and objects with mutable fields. Its syntax is below. Core syntax of FX τ ::= TC | x:τ → C | {[ f :τ | φ ]} C ::= {(s)Φ} r:τ {(s0 )Ψ} φ , Φ, Ψ ::= P p | ∀x:τ.Φ | ∃x:τ.Φ | Φ ∧ Φ0 | Φ ∨ Φ0 | ¬Φ | > | ⊥ S ::= assume φ | P :: k | S; S0 p ::= v | s(η) | p. f v ::= η | c | error | λ x:τ.e η ::= x | ` e ::= v | v v0 | let x = e in e0 | e:C | assert (s)Φ | newτ {[ f = v]} | destruct v as {[ f = x]} in e | v. f

types Hoare types formulas signature atoms values names terms :=: v0

FX kinds are the same as those in Fine (see §2.1) and omitted here. FX types τ include nullary type constants TC, like unit. Abstractions are given impure, dependent function types, x:τ → C, where x names the formal and is bound in the computation type C, whose syntax was described in §1.2. (Pure function types x:τ → τ 0

are sugar for x:τ → {(s)>} τ 0 {(t)>}, since the pre- and postconditions > do not mention any objects.) We also have object types {[ f :τ | φ ]}, which can be thought of as records with mutable fields f1 :τ1 , . . . , fn :τn . The formula φ specifies an object invariant, referring to each of the fields using self. f1 , . . . , self. fn . Formulas are ranged over by φ , Φ, and Ψ, and are first order formulas defined over a set of n-ary base predicates P over atoms p. The atoms include FX values v; field projections (which are not themselves values in FX); and terms of the form s(η), which stands for the value of an object named η in the state s. We use φ for object invariants, which cannot refer to state variables, while Φ and Ψ may contain state variables. A signature S collects top-level user-specified assumptions assume φ and declarations of predicates P :: k of an FX program. The assumptions are global and do not refer to state variables. The predicate declarations specify kinds (k) of predicates (P). Values in FX are names (variables x or memory locations `); constants c, where the type of each constant c is denoted TCc ; error, used only during execution to indicate failures; or lambda abstractions. Expressions e are required to be in A-normal form (Flanagan et al. 1993). Dependent typing in FX and in Fine only permits values to index types—A-normal form helps ensure that expressions do not escape into the type level. The expression forms include the standard function application v v0 and let-binding let x = e in e0 ; the form e:C ascribes a computation type C to e; and assert (s)Φ asserts formula Φ of the current state s. The imperative fragment of FX includes two core constructs for object allocation and deletion. Allocation is newτ {[ f = v]}, which creates a new object of type τ initializing its fields f1 , . . . , fn to v1 , . . . , vn . Expression destruct v as {[ f = x]} in e destructs an object v, and binds the contents of its fields f1 , . . . , fn to fresh variables x1 , . . . , xn in e. The fresh variables xi are in scope within e, while v is no longer accessible in e or later in the program. Using these two constructs alone, we can encode field updates, field swaps, etc. However, we include field swaps, v. f :=: v0 , as a primitive as well, since we use this in our examples. This instruction assigns the new value v0 to the field f of v and returns the old value of the field. 3.1

Dynamic semantics

Runtime configurations in FX are expressions e paired with a store σ , a finite map from memory locations ` to object values {[ f = v]}. We show the dynamic semantics of FX below. Congruence rules are elided. The semantics has the form (e, σ ) → (e0 , σ 0 ), and is mostly as one would expect, except for two subtleties. First, to specify progress for FX, it is convenient to ensure that FX programs never get stuck. Instead, we mark certain evaluation steps as erroneous and define safe FX programs as those that are guaranteed to reduce without taking erroneous steps. Specifically, when evaluating assert (s)Φ we check whether the current values in the store satisfy the formula Φ, under the assumptions in the signature S. Subterms s(`) in Φ are replaced with vˆ` , which is a location-free object value corresponding to σ (`) (where σ ` ` ↓ vˆ` is in §4). Assertions evaluate to the error value if the check fails (the last rule). Although some care has to be taken to translate the two-state predicate Ψ properly, the translation of ascriptions (e:C) to assertions is straightforward. The other subtlety is that bindings in the store σ are tagged with a superscript ι, indicating whether a location represents allocated (⊕) or freed ( ) memory. These tags have no impact on the reduction of FX programs, but the translation of FX to Fine uses these tags to prove that if an FX program is translated to a well-typed Fine program, it never references freed memory. The reduction rule for new creates a new memory location ` and tags it with ⊕. The rule for destruct substitutes the bound variables with the object contents

6

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

in e and tags the destructed location with . Finally, the rule for swap updates the object and returns the old value, as expected. Expressions with failures such as assertion failures or missing fields evaluate to the error value. Selected dynamic semantics of

FX: (e, σ ) → (e0 , σ 0 )

σ ::= σ + {` 7→ {[ f = v]}}ι | •

where ι ::= ⊕ |

(let x = v in e, σ ) → (e[v/x ], σ ) (let x = v:C in e, σ ) → (e[v/x ], σ ) ((λ x:τ.e) v2 , σ ) → (e[v2/x ], σ ) (assert (s)Φ, σ ) → ((), σ )

L ET L ET C F UN if S |= Φ[vˆ` /s(`)], ∀` ∈ FV (Φ) where σ ` ` ↓ vˆ`

((e:{(s)Φ} r:τ {(t)Ψ}), σ ) → (let = assert (s)Φ in let x = e in let = assert (t)Ψ[x/r, σ (`)/s(`)] in x, σ ) (newτ {[ f = v]}, σ ) → (`, σ + {` 7→ {[ f = v]}}⊕ ) ` 6∈ dom(σ ) (destruct ` as {[ f = x]} in e, σ + {` 7→ {[ f = v]}}ι ) → (e[vi/xi ]ni=1 , σ + {` 7→ {[ f = v]}} ) (`. fi :=: v0i , σ + {` 7→ {[ f = v]}}ι ) → (vi , σ + {` 7→ {[ · · · ; fi = v0i ; · · · ]}}ι ) (e, σ ) → (error, σ ) e 6= v otherwise 3.2

Computing type shapes

In preparation for the translation of FX to Fine, we define a simple syntactic analysis of FX programs to compute type shapes. Type shapes describe an equivalence relation on computation types, where types that describe expressions with the same footprint (i.e, the set of locations read or written) are placed in the same equivalence class. The footprint of an expression (and hence the shape of its type) guides the translation of §4 since it determines the values that we must thread into and out of a translated FX expression. Formally, we define two functions, iv (input variables) and ov (output variables), from computations types C to sets of names X,Y ⊆ 2η . Given a computation type C = {(s)Φ} r:τ {(t)Ψ}, iv(C) = {η | s(η) ∈ Subterms(Φ, Ψ)} and ov(C) = {η | t(η) ∈ Subterms(Ψ)} \ {r}. We write C ∼ {X} r:τ 0 {Y }, when X = iv(C), Y = ov(C), and τ ∼ τ 0 . The last relation, τ ∼ τ 0 is a structural congruence on type shapes lifted into function types, i.e., τ ∼ τ and x:τ1 → C1 ∼ x:τ2 → C2 when τ1 ∼ τ2 and C1 ∼ C2 , where we write C1 ∼ C2 when C1 ∼ {X} r:τ {Y } and C2 ∼ {X} r:τ {Y }. Intuitively, the input variables of an expression e (strictly, e’s type C) represent the objects that may be read, modified, deleted, or stored inside other objects by e; the output variables represent the subset of the input variables (excluding the returned value r) that are still accessible after e has been evaluated. The first judgment Γ ` v ∼ τ says that the value v (if it type checks in Fine) has a type that is in the same shape equivalence class as τ, where Γ maps names to their types. Likewise, the judgment Γ ` e ∼ C says that an expression (if it type checks in Fine) has a type that is in the same equivalence class as C. In some cases, we overload notation and write Γ ` e ∼ {X} r:τ {Y }, with the obvious meaning, i.e., the type of e is in the equivalence class described by {X} r:τ {Y }. We also write η:τ, X to conditionally add a name to an input or output variable set—η:τ, X means η, X when τ is an object type, and X otherwise. Computing type shapes: Γ ` v ∼ τ and Γ ` e ∼ C Γ ` η ∼ Γ(η)

Γ ` e ∼ C0 C0 ∼ C Γ ` (e:C) ∼ C

X = {η | s(η) ∈ Subterms(Φ)} Γ ` assert (s)Φ ∼ {X} : unit {X} τ 0 = {[ f :τ | φ ]}

Γ`v∼τ

Γ ` (newτ 0 {[ f = v]}) ∼ {v:τ} r:τ 0 {}

Γ ` v1 ∼ {[ f :τ | φ ]} Γ ` v2 ∼ τk Γ ` (v1 . fk :=: v2 ) ∼ {v1 , v2 :τk } r:τk {v1 } Γ ` v ∼ {[ f :τ | φ ]}

Γ, z:τ ` e ∼ {X} r:τ {Y }

Γ ` destruct v as {[ f = z]} in e ∼ {X \ {z} ∪ v} r:τ {Y \ z} Γ ` e1 ∼ {X1 } x:τ1 {Y1 } Γ, x:τ1 ` e2 ∼ {X2 } r:τ2 {Y2 } Γ ` let x = e1 in e2 ∼ {X1 ∪ (X2 \ {x})} r:τ2 {Y2 ∪ (Y1 \ X2 ) \ {x}}

The first four rules compute shapes for values and function application. An object value v, when used in an expression context, can be given a computation type that includes itself as an input variable, and no output variables. We give assert expressions assert (s)Φ a shape where the input and output variables are the same, namely those referenced by Φ, i.e., all the objects mentioned in an assertion are still accessible after the assertion has been evaluated. An expression e can be ascribed the type C, (using e:C) so long as the ascription does not change the shape of the type. When computing type shapes for the imperative operations, we pay attention to the aliasing relationships induced by each operation. However, we do not attempt to check here that FX programs respect the aliasing constraints on the object graph necessary for assertion safety—all checking is left to Fine. With this principle in mind, consider the rule for new. It states that the input variables are the stateful field values v, while the output variable set is empty. Since references to each of the v are captured by the newly allocated object, the empty output variable set indicates that the v should no longer be accessed directly by the program. The swap statement is similar. Its input variables include the object being updated v1 and the new contents of the field v2 (if its type τk is an object type); the output variables only include v1 since the reference to v2 has been captured by v1 . For a destruct expression, the input variables include v, the object being destructed, and e’s input variables but not the z since these are locally bound; for the same reason, z are excluded from the output variables. Note that e (or some enclosing expression) may attempt to use v after it has been destructed—this situation will be trapped by the Fine type checker. Finally, we have lets. Here, the input variables are the union of the input variables of e1 and e2 , excluding the let-bound variable x. The output variables are the outputs of e2 (excluding x) and those outputs of e1 that are not in the footprint of e2 .

4.

Translating FX to Fine

As sketched in §2.3, FX programs are translated to functional Fine programs. We formalize this translation here, and prove that the translation of FX programs preserve their semantics. Since Fine programs can be verified using refinement type checking, we also obtain the result that the safety of FX programs can be established by the Fine type checker. Our description of the translation starts by describing the role of tokens and their types in our translation. We then describe the translation of types and finally the translation of terms. The section concludes with discussion of our main results. 4.1

FX

objects, tokens, and ownership

For safety in the presence of mutation and aliasing, we require FX programs to respect an ownership discipline on the object store. Affine tokens play a central role in our translation, and serve primarily to enforce this ownership discipline. To illustrate this point, we first discuss how to represent FX object values in Fine. As

Γ ` c ∼ TCc

Γ, x:τ ` e ∼ C Γ ` λ x:τ.e ∼ x:τ → C

Γ`v∼τ Γ ` v ∼ {v:τ} r:τ {}

Γ ` v1 ∼ (x:τ 0 → C) Γ ` v2 ∼ τ 0 Γ ` v1 v2 ∼ C[v2/x ]

7

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

we have seen, FX object values are locations `, where a store σ maintains a mapping from locations to records holding the object’s fields. In Fine, we represent object values as immutable records. The translation of objects is described by the judgment below. σ (`) = {[ f = v]}ι

σ ` vi ↓ vˆi ∀i

σ ` ` ↓ { f1 : vˆ1 , . . . , fn : vˆn }` This rule looks up the binding for a location ` in σ , and produces a tuple of Fine values, where each vˆi , recursively, is the translation of the components of the object stored at `.1 Fine values that correspond to FX objects are tagged with the location ` from which the value was derived—we explain its significance shortly. Now, given a store σ = `1 7→ {[ f = 0]}⊕ , consider the simple FX program and its translation to Fine shown below (eliding irrelevant unit-values from the Fine versions for clarity). FX (1) Fine (1)

let x = newτ 0 {[g = `1 ]} in let y = `1 . f :=: 1 in x let x, tokx = mkτ 0 { f = 0}`1 tok`1 in let y, toky = swapτ { f = 0}`1 1 tok`1 in x, tokx

The FX program allocates a new object x storing the object `1 inside it, then updates `1 and returns x. The program above violates ownership, since after a single step of execution, the store contains a new location `2 7→ {[g = `1 ]}⊕ , meaning that a reference to `1 has been captured by `2 , and the very next step mutates `1 directly. The image of the FX program under the translation is the Fine program shown directly beneath it. The Fine program contains calls to the functions mkτ 0 and swapτ (discussed in §2.3), passing in Fine values corresponding to the object `1 (i.e., { f = 0}), and, importantly, the variable tok`1 representing a token or capability to use the immutable record value in a stateful way. Tokens are, of course, affine and since tok`1 is used more than once, the Fine program fails to type check, and the FX program is dismissed as potentially unsafe. Not all violations of ownership are as easily detected as in this first example. One problematic case is when a reference to an object is captured by a closure, and somewhere else in the program the captured object is updated via an alias, as in the example shown below (with the same initial store σ ). FX (2) Fine (2)

let g = λ :unit.`1 in let = `l . f :=: 1 in g let g = λ :unit.λ x:Token { f = 0}`1 .({ f = 0}`1 , x) in let = swapτ { f = 0}`1 1 tok`1 in g

In this case, we translate the thunk g by adding an additional argument for the token of the captured object value { f = 0}`1 . Notice that the type of the token variable (Token { f = 0}`1 ) is indexed by the value for which it is a capability. Any operation that mutates `1 consumes its token, producing a new token with a type indexed by the new value. Thus, after an update the thunk g becomes unusable since a token with the appropriate type cannot be passed in. (Unless the update does not change the value of the object, in which case it is still perfectly safe to use g). Since this program never attempts to use g, it is safe in FX and is also type correct in Fine. In other words, in addition to tokens being affine, giving them value-indexed types is also crucial for safety. Token threading, for all its benefits (recall also the discussion of §2.3), also makes it harder to prove that our translation from FX to Fine is a simulation. Informally, we wish to show that when an FX program e1 is translated to a well-typed Fine program eˆ1 , and if e1 steps to e2 , then eˆ1 steps (using one or more steps) to some eˆ2 , where eˆ2 is semantically equivalent to the image of e2 under the translation. If we inspect the example program FX (2) above, we see that after several steps it reduces 1 Throughout

this section, syntactic elements from Fine are indicated using the hatted version of the corresponding elements in FX.

to λ :unit.`1 , in a store where `1 7→ {[ f = 1]}⊕ , which translates to λ :unit.λ x:Token { f = 1}`1 .({ f = 1}`1 , x) . However, this Fine lambda term is syntactically distinct from the value of g in the program Fine (2) above—g contains a stale value for `1 . In proving our simulation result we show that, despite the syntactic differences, these two values are in fact semantically equivalent. We use Fine’s module system (formalized and proved sound previously using the colored brackets of Grossman et al. (2000)) to show that stale values that result from the reduction of Fine programs are semantically irrelevant since these values must always be treated abstractly. The `-superscripts on values facilitate this proof. 4.2

Type translation

The translation of FX types τ and computation types C to Fine ˆ is shown below. Throughout the types τˆ (Cˆ is a synonym for τ) translation, we use variables named tokη for η’s token. We also use toks(X) to mean the set of tokens for variables in X, i.e., toks(η, X) = tokη , toks(X) with toks(·) = ·. Translation of types: Γ ` τ

τˆ and Γ ` C

∀i.Γ ` τi Γ ` TC

TC Γ`τ

Γ ` x:τ → C

τˆi

φˆ

ˆ | φˆ } {self :{ f : τ}

Γ ` {[ f :τ | φ ]} Cˆ

C ∼ {X} r:τ {Y } x:τˆ → ({Token x j }x j ∈X ∗ { :unit | Pre(C)Γ }) → Cˆ τˆ

Γ; x : τ ` C

Γ`φ



C ∼ {X} r:τ {Y } Γ`τ τˆ ∀y j ∈ Y, Γ ` y j : τ j Γ ` τj τˆ j Γ ` C (r:τˆ ∗ {y0j :τˆ j }y j ∈Y ∗ {Token y0j }y j ∈Y ∗ { :unit | Post(C)Γ })

where, for C = {(s)Φ} r:τ {(t)Ψ} ˆ ˆ Pre(C)Γ = Φ if Γ ` Φ[η/s(η)] Φ ˆ if Γ ` Ψ[η/s(η)][η 0 /t(η)] Post(C)Γ = Ψ

ˆ Ψ

Type constructors TC remain the same. Object types in FX are translated to refined record types in Fine, with each field type translated to τˆ j , and the invariant translated to φˆ as described shortly. Function types in FX x:τ → C are translated to Fine function types ˆ tokens for that bind the same parameter x (with translated type τ), each input variable of C, and a unit value refined with the translated precondition Pre(C)Γ . The function returns a value with the transˆ A computation type C is translated to lated computation type C. a dependent tuple, consisting of the return value r (with the transˆ the output variables y0j and their tokens, and a lated return type τ), unit value whose type is refined with the translated post-condition Post(C)Γ . The output variables y j are rebound to y0j in the tuple type, to stand for the values of y j at the function exit point. Translation of FX formulas is straightforward and is defined ˆ and Γ ` φ by the judgments Γ ` Φ Φ φˆ . The rules in these judgments are congruences over the structure of formulas, where, in the base case, value-indexed predicates Pv¯ are translated to ¯ˆ for Γ ` vi Pv, vˆi , the value translation judgment of the next section. We use two macros Pre(C)Γ and Post(C)Γ for translating pre- and post-conditions of computations to refinement formulas in Fine. The refinement formula replaces the stateful terms in a formula, s(η) and t(η), with the corresponding name bindings for the immutable values in Fine that correspond to η. Translation of signatures. Predicate declarations P :: k are translated to type constructors in Fine with the same name P and the same kind k. Assumptions assume φ are translated to data constructors in Fine representing proofs of φˆ if Γ ` φ φˆ . These data constructors are included in Fine’s LCF-styple proof kernel, along with the standard first-order logic inference rules (see §2.1).

8

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

4.3

Value and expression translation

This section describes the translation of FX expressions. An FX expression e is translated to a Fine terms e, ˆ where the free variables of eˆ correspond to the object locations ` in e. For translation of source code the translated term eˆ is closed, since store locations do not occur in the source code of a program. The value translation judgment Γ ` v vˆ states that an FX value v is translated to a Fine value v. ˆ The translation of FX names η (variables or locations) is trivial—these are translated to Fine variables with the same names, which may be substituted away when the Fine expression is closed. The rule for translation of functions is shown below. Γ, x:τ1 ; • ` (e:C) eˆ Γ ` τ1 τˆ1 τˆ2 = ({Token η}η∈iv (C) ∗ { :unit | Pre(C)Γ }) Γ ` λ x:τ1 .(e:C)

λ x:τˆ1 .λ y:τˆ2 .let tokη , = y in eˆ

This rule follows the translation of function types. We add a parameter y:τˆ2 to receive the tokens for the function’s input variables. The function body e:C is translated to e, ˆ preceded by a let that unpacks the components of the tuple y, giving token variables their required names and types, tokη :Token η. Our formalization does not attempt to infer types, or pre- and post-conditions for functions. As such, the rule above expects both the formal parameter x to be decorated with a type and for the function body to be ascribed a computation type. In practice, we expect weakest precondition inference to alleviate some of this difficulty, although, of course, annotations will be expected at least on recursive functions. Additionally, while our simulation result holds for translations of arbitrary FX programs (that are well-typed in Fine), in practice we expect FX programs to be closure-converted prior to translation. The judgment Γ; Y˙ ` e eˆ (below) translates an FX expression e to a Fine expression e. ˆ The environment includes Y˙ , either empty (•) or a set of names Y representing the output variables of a computation. We define an operation on output sets: Y˙ Y 0 returns Y 0 if Y˙ = •, and Y˙ otherwise. Intuitively, Y˙ corresponds to locations that may have been modified in expressions preceding e in a block of lets. Fine values (and tokens) corresponding to locations in this set need to be “threaded out” in the translation. Translation of expressions, Γ; Y˙ ` e eˆ Γ`v∼τ Γ ` v vˆ T-Return Γ;Y ` v (v,Y, ˆ toks(v:τ,Y ), ())

T-Let

T-Assert

T-Chk

T-App

T-New

Γ ` e1 ∼ {X1 } x:τ1 {Y1 } Γ; • ` e1 eˆ1 Γ ` let x = e1 in e2 ∼ C Γ, x:τ1 ; Y˙ ov(C) ` e2 eˆ2 Γ; Y˙ ` let x = e1 in e2 let x,Y1 , tokx , tokY1 , = eˆ1 in eˆ2 Γ ` Φ[η/s(η)]

ˆ Φ

Γ; • ` assert (s)Φ

X = ivs (Φ) ˆ ASRTX (Φ)

Γ; • ` e eˆ C ∼ {X} r:τ {Y } C = {(s)Φ} r:τ {(t)Ψ} X1 = ivs (Φ) X2 = ivt (Ψ) Γ; • ` (e:C) let , X1 , tokX1 , = ASRTX1 (Pre(C)Γ ) in let r,Y, toks(r:τ,Y ), = eˆ in let , X2 , tokX2 , = ASRTX2 (Post(C)Γ ) in (r,Y, toks(r:τ,Y ), ()) Γ ` v1 v2 ∼ C Γ ` v1 vˆ1 Γ ` v2 vˆ2 Γ; • ` v1 v2 vˆ1 vˆ2 (toks(iv(C)), ()) Γ ` (newτ {[ f = v]}) ∼ C Γ; • ` (newτ {[ f = v]})

∀i. Γ ` vi

vˆi

mkτ vˆ1 · · · vˆn (toks(iv(C)), ())

T-Upd

T-Destr

Γ ` (v1 . fk :=: v2 ) ∼ C Γ; • ` (v1 . fk :=: v2 )

Γ ` v1

vˆ1

Γ ` v2

vˆ2

upd fk (vˆ1 , vˆ2 , toks(iv(C)), ())

Γ ` v ∼ {[ f :τ]} Γ ` v vˆ Γ, zi :τi ; Y˙ ` e Γ; Y˙ ` destruct v as {[ f = z]} in e let zi , tokzi , = destrτ vˆ tokv in eˆ



To illustrate the significance of Y˙ , consider the FX expression let y = (`. f :=: 1) in y. The type shape computed for this expression is {`} r:int {`}, indicating that ` was modified in the expression and is still accessible after the expression has been evaluated. When translating the body of the let (i.e., y), we need to return y as well as the value and token corresponding to the updated value of `. The rule (T-Return) does just this. It translates values v that appear in expression contexts to tuples containing the translated value v, ˆ any additional values dictated by the output set Y (if any), tokens for the object values, and finally a unit for the post-condition. The other aspect of the Y˙ environment is illustrated in the translation of let expressions let x = e1 in e2 . The first premise computes the shape of the let-bound expression e1 for its set of output variables Y1 , and binds names and tokens for these in the translation. To translate the body e2 , we compute the shape C of the entire let expression for its set of outputs Y = ov(C)—these are locations that may have been modified in either e1 or e2 and are still live, and so, must be threaded out when translating e2 . We do this by translating e2 in a context Y˙ Y , which ensures that Y is threaded out of e2 , unless Y˙ is non-empty, in which case the output variables of some enclosing let are threaded out of e2 . The remaining rules are mostly straightforward. To translate assertions, we define the following macros: ivs (Φ) = {η | s(η) ∈ ˆ = ((), X, toks(X), (() : { : unit | Subterms(Φ)} and ASRTX (Φ) ˆ Φ})). Assertions reduce to unit in FX—the first unit value in the tuple corresponds to this. The last unit value with a refined type corresponds to a check of the assertion formula. Additionally, we include values and tokens for each object value ` ∈ X. This ensures that assertions in Fine never refer to stale values, thereby keeping the behavior of Fine assertions in correspondence with FX. The ascription form (e:C) is translated, as expected, using a pair of assertions. The tokens used in the assertion are rebound so that the capabilities they stand for are not consumed. Applications (v1 v2 ) are translated to pass in tokens (computed using the type shape judgment) for the input variables of v1 , as well as a unit value for the precondition (T-App). The Fine type checker must prove that the unit value passed in can be given a refined type corresponding to the precondition of v1 . The translation of new uses the constructor mkτ of the record type τ (T-New). (The function mkRev in §2.3 is an instance of such a constructor.) The constructor takes the initial values vˆ1 , . . . , vˆn for the fields, consumes the tokens for the input variables (again computed using type shapes), and returns the new record r and its token tokr . The translation of swaps using (T-Upd) is similar. The translation of a destruct expression calls the destructor destrτ of the record type τ (T-Destr). The destructor consumes the token tokv for the record to destruct, and returns the field values and their tokens. These are let-bound and in scope for e, ˆ the translated body. The main result of this section is Theorem 1, which shows that the translation from FX to Fine preserves the semantics of the FX program, if the resulting Fine program is well-typed. More precisely, the translation is a weak simulation modulo strong bisimilarity. We do not give meaning to FX programs that translate to ill-typed Fine porgrams since ill-typed Fine programs are never executed.

9

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

From this result we derive Corollary 2, a progress result for that shows that FX programs that translate to well-typed Fine programs do not reach the error state. An additional result is that the computation types given to FX programs describe small footprints. §A gives formalizations of all lemmas and proofs. Setting up for the simulation. Using the rules of §4.3, an FX expression e produces e, ˆ a Fine program that contains free variables corresponding to locations ` and tokens for locations tok` . Given a store σ that maps locations ` to values v` , free location variables ` are eliminated via substitution with the corresponding translated Fine values vˆ` . The semantics of Fine also makes it convenient to handle free token variables, since it permits affine values to be held in a memory, M, rather than inlined in the program. We translate the FX store σ to a Fine memory M, which maps free token variables tok` in eˆ to Fine values MkToken vˆ` . The reduction of Fine programs has the fine form (M, e) ˆ → (M 0 , eˆ0 ). Reads from M are destructive, and Fine’s metatheory establishes that affine values are used at most once. We present a simplified version of our simulation lemma below, and illustrate its structure graphically. To type check the Fine programs eˆ1 and eˆ2 in the statement below, we use an environment including a signature Sˆbase for FX primitives (corresponding to the Prims module of §2.3), and affine capabilities for every token in M. FX

FX

(e1 , σ1 ) ↓ (e2 , σ2 )

(M2 , eˆ02 )

∼ =

Fine (M1 , eˆ1 ) ↓+ (M2 , eˆ2 )

Theorem 1 (Fine programs simulate FX programs). If an FX runtime configuration (e1 , σ1 ) translates to a well-typed Fine configuration (M1 , eˆ1 ), and (e1 , σ1 ) steps to (e2 , σ2 ), then (e2 , σ2 ) translates to a well-typed Fine configuration (M2 , eˆ02 ), and (M1 , eˆ1 ) takes multiple steps to a configuration (M2 , eˆ2 ) such that eˆ02 is strongly bisimilar to eˆ2 . Theorem 1 establishes that our translation relation is a simulation up to strong bisimilarity, between eˆ2 and eˆ02 above. The relation M2 ` eˆ2 ∼ = eˆ02 states that eˆ2 and eˆ02 reduce in “lock step”, and is derived from a value abstraction theorem provided by Fine’s module system. In addition to this core simulation result, the full version of Theorem 1 establishes several other key properties. Among these are 1. that FX programs that translate to well-typed Fine programs respect aliasing and ownership constraints on the object graph; and 2. that the types given to effectful expressions only mention locations that they access or modify. Theorem 1 also yields an important corollary. Since the error state cannot be translated to Fine, FX programs that translate to well-typed Fine programs do not reach the error state. Corollary 2 (Progress for FX). Given a closed FX program e. ˆ then for any reduction If ·; • ` e e, ˆ where Sˆbase ; ·; · ` eˆ : τ, fx sequence (e, ·) →∗ (e0 , σ 0 ), e0 is guaranteed to not be error.

5.

Examples

This section illustrates the use of FX using further examples. We have applied the translation of §4 (manually, at present) to extended versions of these examples (and the ConfWeb example of §2), and verified the resulting programs using the Fine type checker. 5.1

Verifying stateful APIs in the presence of aliasing

In this section, we show how to verify two properties of clients of a stateful API of collections and iterators, even in the presence of aliasing. First, we ensure that the collection underlying an iterator is never modified while an iteration over the collection is in progress. Second, we ensure that a client never attempts to extract

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43

(∗ Fragment of a library implementing fractional permissions ∗) private type trk α = {[v:α ; p:rat]} type Aliases :: (α ::? ⇒ α ⇒ α ⇒ ? ) assume A refl: ∀x:trk α , y:trk α . x.v=y.v ⇔ Aliases x y val split: x:trk α → {(s) >} y:trk α {(t) Aliases t(x) s(x) && Aliases t(x) t(y) && t(y).p=t(x).p=s(x).p/2 } val join: x:trk α → y:trk α → {(s) Aliases s(x) s(y)} unit {(t) Aliases t(x) s(x) && t(x).p=s(x).p+s(y).p} (∗ Fragment of a safe wrapper for the Collections API ∗) type collection α type iterator α type coll α = trk (collection α ) type istate = HasMore | Unknown type iter α = trk {[i:iterator α ; c:coll α ; st:istate]} val newColl: unit → {(s) >} c:coll α {(t) t(c).p=1} val add: c:coll α → y:trk α → {(s) s(c).p=1 && s(y).p > 0} unit {(t) t(c).p=s(c).p && Aliases t(c) s(c) && t(y).p=s(y).p/2 } val iterator: c:coll α → {(s) s(c).p>0} i:iter α {(t) t(c).p=s(c).p/2 && Aliases t(c) s(c) && t(i).p=1 && Aliases t(c) t(i).v.c && t(i).v.c.p = t(c).p } val finalize: c:coll α → i:iter α → {(s) s(i).p=1 && Aliases c s(i).v.c} unit {(t) t(i).p=0 && Aliases t(c) s(c) && t(c).p = s(c).p + s(i).v.c.p} val next: i:iter α → {(s) s(i).p=1 && s(i).v.st=HasMore} y:trk α {(t) t(i).p=s(i).p && t(y).p > 0 && Aliases s(i) t(i) && t(i).v.i=s(i).v.i} val hasNext: i:iter α → {(s) s(i).p>0 } b:bool {(t) t(i).p=s(i).p && Aliases t(i) s(i) && (b=true ⇔ t(i).v.st=HasMore) } (∗ A client of the Collections API ∗) val client: c:coll α → {(s) s(c).p=1} unit {(t) t(c).p = s(c).p && t(c).v=s(c).v} let client c = let it1 = iterator c in let rec loop1 (c:coll α ) (it1:iter) : {(s) Aliases s(it1).v.c s(c) && s(it1).p=1 && s(c).p> 0} unit {(t) t(it1).p=s(it1).p && t(it1).v.c=s(it1).v.c && t(c).p=s(c).p} = if hasNext it1 && ... then let rec loop2 (c:coll α ) (it2:iter) : {(s) Aliases s(it2).v.c s(c) && s(it2).p=1} unit {(t) t(c).p=s(c).p+s(it2).v.c.p} = if hasNext it2 then let a = next it2 in ... loop2 c it2 else finalize c it2 in let it2 = iterator c in loop2 c it2; let b = next it1 in ... loop1 c it1 else () in loop1 c it1; finalize c it1

Figure 2. Controlled aliasing using fractional permissions

elements from an iterator that has been exhausted. Our example is adapted from the work of Bierhoff and Aldrich (2007), who develop a special-purpose type system that uses linear logic to check type-state properties in the presence of aliasing, and apply it to clients of the Java collections library. A variant of this example has also been studied by Krishnaswami et al. (2009), who verify a similar program using interactive proofs in a higher-order separation logic. In contrast, we enforce an aliasing discipline by developing a library of fractional permissions (Boyland 2003), and rely on an SMT solver for assertion checking. We highlight three elements of our solution. First, the librarybased approach to aliasing illustrates the flexibility of our approach— variations on our permission scheme could be implemented in a similar fashion. This library also illustrates the value of substructural state—affine tokens introduced by the translation ensure proper use of the library. Next, we leverage local state, a feature of FX. Permissions are mutable fields within structured, alias-controlled objects. A similar approach using monadic state would involve explicitly modeling a map from alias-controlled objects to their permissions. Finally, we show how the local state of permissions for collections and iterators can be easily combined, and for these permissions to be used with a state machine that

10

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

tracks whether or not an iterator has been exhausted. The modularity afforded by our approach makes these combinations natural. The program in Figure 2 contains three parts. First, we have the interface of our permissions library. Next, we show the interface of a wrapper to an underlying implementation of the .NET collections library. Although not shown, the wrapper is programmed in FX and contains calls to the permissions library to manage aliases. Finally, we show a client program that uses the wrapped collections library. Although we have yet to formalize this, we conjecture that for clients that verify against the wrapper, calls to the wrapper can be replaced with direct calls to the underlying library. Permissions library. Lines 1-9 of Figure 2 show the interface of our fractional permissions library. At a high level, our library defines a type trk α that associates a permission p:rat (a rational number) with an α -typed value. Using split, a client can construct an alias y to a tracked object x, but the permission previously associated with x is split between x and y. The join function allows a client to destroy y, an alias of x, coalescing the permission previously associated with y into x’s new permission. The library implementation (not shown) directly manipulates the private permission field, and uses A refl to generate its post-condition Aliases. Two tracked values are Aliases if they refer to the same underlying object. Wrapper of the collections library. Lines 10-27 show the interface of our wrapper of the .NET collection API. The abstract types collection α and iterator α correspond to the types implemented by the .NET library. We then define coll α , the tracked version of a collection α . The type iter α is the tracked version of an iterator and associates a permission with a record of three fields, and illustrates how local state in FX can be easily composed. The first field, i, is the iterator itself; the second field c:coll α holds a tracked reference to the collection from which the iterator was derived; the last field represents a state machine to track when an iterator is exhausted. At line 16 we show the signature of newColl, a function that allocates a new collection. Its post-condition shows that the returned collection has a full permission. The add function (17-18) allows an object y to be added to the collection c. Since this mutates the underlying collection, the pre-condition of add requires c to hold a full permission. Since c captures a reference to y, the post-condition shows the permission of y halved. Lines 19-20 shows show the signature of iterator, which produces an iterator i over a collection c. Calling iterator requires only a non-zero permission on c, but the post-condition shows that it consumes half of c’s permission, since the iterator captures a reference to the collection from which it was derived. By halving the permission of c, we ensure that c cannot be modified while the iterator is still live. The implementation of iterator (not shown) calls split to stash a reference to c in the returned object. The finalize operation (22-23) allows an iterator i to be destroyed, so long as there are no other aliases to i. Its post-condition restores the permission to the collection from which the iterator was derived—the implementation of finalize calls to join. The function next (24-25) allows a client to extract a tracked object from an iterator. The pre-condition of next requires the iterator to be in a state where it has more elements. In the postcondition, the state of the iterator is updated to indicate that it may or may not have more elements. In order to establish the precondition of next, clients can call and test the result of hasNext. A client program. Lines 29-44 show a client function that takes a collection c with a full permission, extracts an iterator it1 (line 31), and iterates over it1 using loop1 and finalizes the iterator afterward (line 44), leaving c with a full permission. At each iteration of loop1, we extract another iterator it2 from c (line 41) and iterate over it in loop2 and finalize it2 before exiting loop2 (line 40). Concurrent modifications to the collection c (say, by calling add)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

module InfoFlow type level = High:level | Low:level | J:level → level → level type LEQ :: level ⇒ level ⇒ ? assume Lattice assumptions: LEQ Low High, ... (∗ The program counter sensitivity level, initially Low ∗) private type pc = {[v:level]} let pc=new pc{[ v=Low]} (∗ An abstract monad for leveled data ∗) private type L :: ? ⇒ level ⇒ ? = MkL : α → l:level → L α l val return: l:level → α → L α l val bind: l:level → m:{m:level | LEQ l m} → pc1:level → x:L α l → f:(α → {(s2) s2(pc).v=J l pc1 } L β m {(s2’) s2’(pc)=s2(pc)}) → {(s1) LEQ s1(pc).v pc1 } L β m {(s1’) s1’(pc)=s1(pc)} let bind l m pc1 (MkL x ) f = let tmp = pc.v in pc.v := J l pc1; let r = f x in pc.v := tmp; r (∗ Channels on which to send data (side effects) ∗) type Ch :: ? ⇒ level ⇒ ? val write: α → l:level → Ch α l → {(s1) LEQ s1(pc).v l} unit {(s2) s2(pc)=s1(pc)} end val client: L str Low → L str High → Ch str High → {(s1) s1(pc).v=Low } L str High {(s2) s2(pc)=s1(pc)} let client lx1 lx2 chan = bind Low High Low lx1 (fun x1 → bind High High High lx2 (fun x2 → let x = strcat x1 x2 in (write x High chan); return High x))

Figure 3. Tracking information flow through side-effects fail to type check, since while there are still iterators extant, c has less than a full permission. Notice that aside from the annotation with loop invariants, the client function is fairly direct. Pleasingly, client contains no explicit calls to split or join. All these operations are factored into the implementation of the wrapped collection API, and the specifications of this API in effect manage the implicit aliasing behavior of the client program. Clients can also call split and join directly to explicitely manage aliases, should the need arise. Translating to Fine. Translating Figure 2 to Fine is relatively straightforward. We first closure-convert the nested loops, hoisting them to the top-level. Thereafter, the translation follows the rules of §4 directly (which generalizes readily to handle type abstraction and application). Our implementation uses a more complex model for permissions that maintains separate fractions for read and write permissions; a larger API for collections, including functions to remove elements from collections and to query the size of a collection; and finally, a more complex client program. The resulting Fine program is 244 lines long, and its verification involves proving 29 goals, which we discharge in 15 seconds. 5.2

Tracking implicit information flows through impure code

Our second example develops an information-flow tracking library suitable for programs with side-effects. The basic idea is to combine a monadic treatment of dependency (as in DCC (Abadi et al. 1999)) with the program-counter technique used (originally) by Fenton (1973) in his Data Mark Machine. In short, we represent α -typed data protected at security level l as values of the type L α l, where the type L represents a family of monads arranged in a lattice according to the ordering on security levels. Additionally, we maintain a global stateful value pc that accounts for the influence of protected data on reaching the current program point. Effectful operations (such as writing a message to a channel) have preconditions (expressed as constraints on pc) to ensure that they are not control dependent on secret data. We prove two properties of client programs. First, that Low channels never carry data that is marked High-security; and, secondly, that Low channels are never used at High security-level program points. Using these two properties, a syntactic proof of a

11

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

noninterference property is possible. Although we have not proved noninterference for the program of Figure 3 per se, a similar encoding was proved noninterferent by Swamy (2008). Note that this program does not make use of local state—there is a single global piece of state called pc. As such, this program could easily be modeled and verified using a monadic treatment of state. However, this example highlights two features of FX. First, it illustrates the ability of FX to handle programs that manipulate global state. Second, it serves as initial evidence that our approach generalizes to higher order programs. Our information flow library example makes use of a second-order function bind, where the type of its function argument is parametrized by the preceding arguments to the bind function. We discuss further generalizations to higher order programs in §5.3. The API. Figure 3 shows a module InfoFlow (lines 1-20) which defines the monad L α l and exports it abstractly to the client (2125). InfoFlow begins with a definition of the security levels using the type level, which stands for a two-point lattice, ordered by the relation LEQ. The ordering is axiomatized by user-provided assumptions—we show one such assumption at line 4; the others are standard. At line 6, we define the type pc, a mutable object with a single field, v, which will hold the sensitivity of the program counter—we initialize pc to Low at line 7. We define the L α l type at line 9 and the type of function return at line 10, which allows any value to be injected into the monad using any level. To appreciate the type of bind in Figure 3, it is instructive to consider a type of bind for flow controls in purely functional code: l:level → m:{m:level |LEQ l m} → x:L α l → f:(α → L β m) → L β m This version of bind allows f to view the protected data x:L α l at the unprotected type α , but the constraint LEQ l m ensures that the result f computes from x is at least as secret as x itself. However, this model is only sound if f is a pure function—nothing prevents f from writing its unprotected α -typed argument to a public channel. The purpose of the program-counter pc is to prevent such leaks, and it plays a role in two parts of InfoFlow. Consider, first, our model of channels at lines 16-19. Here, we define Ch α l, the type of a channel on which to send (or receive) α -typed data to parties privileged to view data at secrecy level l. To protect against leaks via implicit flows, the pre-condition of write states that the sensitivity of the program counter pc must not be greater than l. Next, consider the type of bind at line 11—we give the function f passed to bind the type α → {(s2) s2(pc).v=J l pc1} L β m {(s2’) s2’(pc)=s2(pc)}. This type allows f to view the protected data x:L α l at its underlying type α . But, the pre-condition on f’s type says that the state of pc is elevated to be the join of pc1 (the value of pc at the time write is called) and l (the level on the secret data x). This ensures that f cannot satisfy the pre-condition of write if f calls write using a channel c:Ch α l2, for some l2 less secret than l. A proof of noninterference for this scheme relies crucially on f treating pc values abstractly—we achieve this by marking the pc type private. This ensures, for example, that f cannot mutate the pc itself. The only mutation of pc occurs in the implementation of bind, which elevates the pc before calling f; then restores it before returning the value computed by f. The client program concatenates Low and High strings, writes the result on a High channel, and returns a High string. The explicit level arguments to bind (e.g., the three occurrences of High on line 24) lead to some syntactic noise—this could easily be eliminated with some inference for implicit parameters. Translating to Fine. As previously, translating Figure 3 to Fine begins with a simple closure-conversion step, turning the global variable pc into an argument of every stateful function. The resulting Fine program is 81 lines long, and produces 7 verification goals that are discharged automatically in 6 seconds.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

type pt = {[x:real; y:real]} val div x by y: p:pt → {(s) s(p).y ¡= 0} unit {(t) t(p).x = s(p).x/s(p).y } let div x by y p = p.x := p.x/p.y val iter : f:(x:α → {(s) PreF s x} unit {(t) PostF s t x}) → l:list α → {(s) Inv s l && ∀s1:State, s2:State, hd:α , tl:list α . ((Inv s1 Nil ⇒ Post s1 s2 Nil) && ((Inv s1 (Cons hd tl) ⇒ PreF s1 hd) && (Inv s1 (Cons hd tl) && PostF s1 s2 hd ⇒ Inv s2 tl) && (PostF s1 s2 hd && Post s1 s2 tl ⇒ Post s1 s2 (Cons hd tl))) } unit {(t) Post s t l}

let rec iter f l = match l with Nil → () | Cons hd tl → f hd; iter f tl prop XDiv :: State ⇒ State ⇒ list pt ⇒ ? assume XD nil: ∀s1:State, s2:State, XDiv s1 s2 Nil assume XD cons: ∀s1:State, s2:State, hd:pt, tl:list pt. ((s2(hd).x = s1(hd).x/s1(hd).y) && (XDiv s1 s2 tl)) ⇒ XDiv s1 s2 (Cons hd tl)

prop YNEQ0 :: State ⇒ list pt ⇒ ? assume YNEQ0 Nil:∀ s:(YNEQ0 s Nil) assume YNEQ0 Cons: ∀s:State, hd:pt, tl:list pt. (s(hd).y ¡= 0) && (YNEQ0 s tl) ⇒ YNEQ0 s (Cons hd tl)

val div all: pl:list pt → {(s) YNEQ0 s pl} unit {(t) XDiv s t pl } let div all pl = iter div x by y pl

Figure 4. Mutating a list of objects using a higher-order iterator 5.3

Extension to higher-order programs

Although our token-threading translation works equally well for both first- and higher-order programs, we have yet to formalize an extension of the assertion language of FX to work with higherorder stateful code. However, this extension is fairly natural, since the kind language of Fine makes it easy to write types for higherorder programs. A simple example of this was shown in §2.2, where the type of for all abstracted over the predicate P decided by its argument f. However, in the case of for all, the argument f was required to be a pure function. Below is a detailed example showing how extend this scheme to higher-order stateful code. We intend to elaborate on the application of FX to higher-order programs in future work. Our final example illustrates how FX generalizes to the verification of higher-order stateful programs. We use extensively polymorphism which is provided by the implementation of Fine. This example uses a technique due to R´egis-Gianas and Pottier (2008) for purely functional programs. A recent extension of this work to higher-order stateful programs has been developed in the context of the Who verifier (Kanig and Filliˆatre 2009), requiring a custom effect analysis and assuming a strict no-aliasing discipline. Our example shows that Fine’s general-purpose affine and refinement types are up to the task too, and, although we have yet to combine them, we conjecture that the fractional permissions library of §5.1 can be used in conjunction with higher-order stateful computations to relax Who’s restrictions on aliasing. At a high-level, the program in Figure 4 works with points, objects with two mutable fields, x and y, called pt. Given a list of points pl, the client program div all calls a library function iter on pl, passing in a stateful function div x by y which updates the x field of each point in the list by dividing it by the y field. Given that pl is a list in which every point has a non-zero y field, we would like to prove that after updating pl, the x field of each point in the updated list has been divided by the corresponding y field, and that no divide-by-zero errors occur. The type and definition of iter. The main complexity in this example is in the type of iter, shown at lines 4-12. The traditional arguments of iter are a type argument α ::? for the type of the list elements; a function argument f of type α → unit; and a list l:list α .

12

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

The implementation of iter (line 13) applies f to each element of l and returns unit. To verify iter independently of its uses, we give it a specification that generalizes over the specification of f—the additional type arguments to iter (PreF etc.) serve this purpose. First, the predicates PreF and PostF correspond to the pre- and post-conditions of the function f. We use an FX-specific type State for the state variables bound in computation types. (The State type has no counterpart in Fine—they are erased in the translation.) PreF has kind State ⇒ α ⇒ ? , meaning a one-state predicate applied to the α -typed values, i.e., for a state variable s and x:α , PreF s x is a valid type. PostF has kind State ⇒ State ⇒ α ⇒ ? , a twostate predicate on α -typed values. We use these predicates in f’s type: (x:α → {(s) PreF s x} unit {(t) PostF s t x }). PreF allows us to speak about the value of x in the pre-state s; PostF relates the value of x in the pre-state s to its value in the post-state t. In addition to the preand post-conditions of f, iter is also parametrized by an inductive invariant Inv on the list l; as well as a post-condition Post for iter itself. Inv, like PreF, is a single-state predicate for the state of the list just prior to each (recursive) call to iter. The post-condition Post relates the state of the list before and after each call to iter. Then, at lines 7-12, we see the computation type for the body of iter. The pre-condition expresses a number of constraints between the four quantified predicates. First, we require the inductive invariant to be true of the list argument l in the initial state (Inv s l on line 7); line 8 is the base case of the induction; line 9 establishes the pre-condition of f on each iteration; line 10 re-establishes the inductive invariant; line 11 is the induction step; the post-condition on line 12 is straightforward. A client program. At lines 14-17, we define an inductive, two-state predicate XDiv on lists of points to say, roughly, that each point in the list is updated so that its x field is divided by its y field. At lines 18-21, we define another inductive (one-state) predicate, YNEQ0, to say that every point in a list has a non-zero y field. At line 22 we show the type of div all. In the implementation of div all, we call iter instantiating each of its type and predicate parameters. The instantiation of α to pt is easy. But, to instantiate the predicate parameters, we rely on predicate literals of the form \x.P, where x is an abstracted value variable in the body of the proposition P. Using this notation, we instantiate PreF and PostF to predicates corresponding to the pre- and post-conditions of div x by y; Inv and Post to the invariant and the expected post-condition of div all. Finally, we pass in the function argument and the list itself. Translating to Fine. Admittedly, our example program uses some rather powerful machinery to describe a type for iter—machinery that is considerably beyond the formalism of prior sections. However, an extension of our translation to account for these features is relatively straightforward. Indeed, a significant virtue of our approach of verifying stateful FX programs via translation to functional Fine programs is that much of the complexity associated with quantifying over state predicates is simply eliminated (although, we still rely on the ability in Fine to quantify over predicates on values). Our translation produces an 89 line program in Fine, with 13 proof obligations that we discharge in about 7 seconds. As stated, our formalization of translation in Section 4 is sufficiently expressive to handle higher-order stateful functions abstractions—threading tokens through such computations “just works”. However, without the ability to quantify over predicates, higher-order stateful functions are not particularly useful. The main limitation of our formalism is that it lacks polymorphism and an associated language of kinds. Extending our formalism to handle polymorphism over ? -kinded types is trivial; quantification over non-stateful predicates, as in the type of for all from §2 also poses no additional challenges. However, translating quantification over stateful predicates does take some care—we discuss this below.

Each n-ary one-state predicate of kind State ⇒ t1 ⇒ ... tn ⇒ ? , is translated to a predicate in Fine by erasing the State argument, i.e., t1 ⇒ ... tn ⇒ ? . No translation of the types t1 ... tn is needed either, since object types in FX are translated to records of the same name and fields in Fine. Applications of one-state predicates are translated trivially—we simply drop the state variable argument and the other arguments remain unchanged. For each n-ary two state predicate, State ⇒ State ⇒ t1 ⇒ ... tn ⇒ ? in FX, we generate a 2n-ary predicate in Fine of kind t1 ⇒ t1 ⇒ ... ⇒ tn ⇒ tn ⇒ ∗. To translate applications of two-state predicates we follow a pattern similar to the translation of post-conditions in computation types, i.e., the Post(Ψ,C) thing from Section 4. We used primed variables etc. For example, the instantiation of PostF at line 27 in Figure 4 becomes \p.\p’. p’.x=p.x/p.y; the return type of iter (r:unit ∗ l’:list α ∗ Valid l’ ∗ { :unit | Post l l’}); etc. The simplicity of this translation is a direct consequence of our functional simulation—it allows us to simply erase state variable arguments and to relate the value of a variable in the pre-state to its value in the post-state using a pair of immutable values.

6.

Related work and conclusions

Our work is perhaps most closely related to the work of Chargu´eraud and Pottier (2008), who show how to functionalize imperative programs using a translation based on linear capabilities. Like us, they prove their translation sound via a simulation argument. However, their work does not include a program logic, although motivated by the desire to verify programs; our work includes the use of Hoare types and their translation to refinement types. Additionally, in the absence of a logic, Chargu´eraud and Pottier embed a specific aliasing discipline into their calculus, (based on the adoption and focus constructs of F¨ahndrich and DeLine (2002)). In contrast, because of the expressiveness of refinement types, we show how aliasing controls can be encoded using a library of fractional permissions. Our use of tokens in the translation of FX to Fine, is closely related to Walker et al.’s calculus of capabilities (Walker et al. 2000). Whereas Walker et al. prove a syntactic type soundness property for the capability calculus (which is sufficient for their domain of safe memory management), we prove the correctness of our tokenbased translation using a simulation argument. Instead of targeting just memory safety properties, the ATS language (Zhu and Xi 2005) combines stateful views with indexed types for full verification, where stateful views are described using linear logic. This gives ATS the ability to reason directly about pointer manipulations, but at a price, since linear logic is hard to automate. FX is also related to HOOP (Flanagan et al. 2006), a language that uses dependent types to express refinements on imperative objects. However, unlike FX, refinements in HOOP can only mention immutable data. Finally, our work is closely related to the work of Borgstr¨om et al. (2010), who use Hoare types for a state monad to verify stateful computations. However, as argued throughout this paper, the state monad suffers from a lack of modularity—a monolithic state is threaded through the entire program, making it difficult (without resorting to separation logic, as in Ynot), to reason locally about parts of the state or to mix stateful idioms. In contrast, FX programs use local state in the form of mutable objects, and, as illustrated by the example of §5.1, permits multiple stateful idioms to be combined in a modular way. In summary, we have presented FX, a functional language with support for mutable objects. We show how FX programs can be translated to functional Fine programs, using affine types to model state and refinement type checking for verification. Our approach extends the scope of automatic, modular verification for programs that use mutable local state and aliasing.

13

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

References M. Abadi, A. Banerjee, N. Heintze, and J. G. Riecke. A core calculus of dependency. In POPL, 1999.

D. Zhu and H. Xi. Safe programming with pointers through stateful views. In PADL, 2005.

S. Arun-Kumar and M. Hennessy. An efficiency preorder for processes. Acta Inf., 29(9):737–760, 1992. ISSN 0001-5903. J. Bengtson, K. Bhargavan, C. Fournet, A. D. Gordon, and S. Maffeis. Refinement types for secure implementations. In CSF, 2008. K. Bierhoff and J. Aldrich. Modular typestate checking of aliased objects. OOPSLA, 2007. J. Borgstr¨om, A. Gordon, and R. Pucella. Roles, stacks, histories: A triple for Hoare. J. Funct. Program., 2010. To appear. J. Boyland. Checking interference with fractional permissions. In SAS, pages 55–72. Springer, 2003. I. Cervesato and F. Pfenning. A linear logical framework. Inf. Comput., 179 (1), 2002. A. Chargu´eraud and F. Pottier. Functional translation of a calculus of capabilities. In ICFP ’08, 2008. J. Chen, R. Chugh, and N. Swamy. Type-preserving compilation of end-toend verification of security enforcement. In PLDI ’10. ACM, 2010. A. Chlipala, G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Effective interactive proofs for higher-order imperative programs. In ICFP, 2009. L. de Moura and N. Bjorner. Z3: An efficient SMT solver. In TACAS, 2008. ECMA. Standard ecma-335: Common language infrastructure (cli), 2006. M. F¨ahndrich and R. DeLine. Adoption and focus: practical linear types for imperative programming. In PLDI, 2002. J. Fenton. Information Protection Systems. PhD thesis, U. Cambridge, 1973. C. Flanagan. Hybrid type checking. In POPL, 2006. C. Flanagan, A. Sabry, B. F. Duba, and M. Felleisen. The essence of compiling with continuations. In PLDI, 1993. C. Flanagan, S. N. Freund, and A. Tomb. Hybrid types, invariants, and refinements for imperative objects. In FOOL/WOOD ’06, 2006. D. Grossman, G. Morrisett, and S. Zdancewic. Syntactic type abstraction. ACM TOPLAS, 22(6):1037–1080, 2000. ISSN 0164-0925. D. Jackson. Alloy: a lightweight object modelling notation. TOSEM, 11(2), 2002. J. Kanig and J.-C. Filliˆatre. Who: a verifier for effectful higher-order programs. In ML. ACM, 2009. S. Krishnamurthi. The Continue server. In PADL, 2003. N. R. Krishnaswami, J. Aldrich, L. Birkedal, K. Svendsen, and A. Buisse. Design patterns in separation logic. In TLDI, 2009. G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In POPL, 2010. K. Mazurak, J. Zhao, and S. Zdancewic. Lightweight linear types in System F◦ . In TLDI, 2010. R. Milner. LCF: A way of doing proofs with a machine. In MFCS, 1979. A. Nanevski, G. Morrisett, and L. Birkedal. Polymorphism and separation in Hoare type theory. In ICFP, 2006. Y. R´egis-Gianas and F. Pottier. A Hoare logic for call-by-value functional programs. In MPC, 2008. M. Sozeau. Subset coercions in Coq. In TYPES. Springer-Verlag, 2006. N. Swamy. Language-based Enforcement of User-defined Security Policies. PhD thesis, University of Maryland, College Park, August 2008. N. Swamy, J. Chen, and R. Chugh. Enforcing stateful authorization and information flow policies in Fine. In ESOP, 2010. W. Swierstra. A Hoare logic for the state monad. In TPHOLs, 2009. P. Wadler. Linear types can change the world! CONCEPTS AND METHODS. North, 1990.

In PROGRAMMING

D. Walker, K. Crary, and G. Morrisett. Typed memory management via static capabilities. ACM TOPLAS, 22(4), 2000.

14

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

Syntax of instrumented Fine expressions eˆ uˆ vˆ

::= ::= ::=

vˆ | vˆ1 vˆ2 | let x = eˆ1 in eˆ2 ˆ eˆ | (vˆ1 , vˆ2 ) x | ` | tok` | Dvˆ1 , . . . , vˆn | λ x:τ. uˆ | uˆ`:τ

Expressions pre-values values

Extending typing relation in Fine for l-annotated values ˆ Γ; ˆ · ` uˆ : τ S; Sˆ 3 (prims, τ) ˆ Γ; ˆ X ` uˆ`:τ : τ S;

T-Loc

Note the empty X-set in the premise of this rule. Values corresponding to objects are token-free.

Reduction for Fine expressions R-Beta

fine



R-Let

fine



(M; λ x.eˆ v) ˆ pub (M; e[ ˆ v/x]) ˆ

(M; let x = vˆ in e) ˆ pub (M; e[ ˆ v/x]) ˆ

fine

0 0 (M; eˆ1 ) → p (M ; eˆ1 )

R-Ctx

fine

0 0 (M; let x = eˆ1 in eˆ2 ) → p (M ; let x = eˆ1 in eˆ2 )

R-Pair

fine



(M; let x, y = (vˆ1 , vˆ2 ) in e) ˆ pub (M; e[ ˆ vˆ1 /x][vˆ2 /y]) vˆbase = (vˆ1 , . . . , vˆn )`:τ M = M0 , TokenVal (vˆ1 ), . . . , TokenVal (vˆn )

` fresh M 0 = M0 , (tok` , MkToken vˆbase ) fine

R-New



(M; mkτ vˆ1 · · · vˆn (tokens(vˆ1 , . . . , vˆn ), ())) prims (M 0 ; vˆbase , tok` , ()) vˆbase = (vˆ1 , . . . , vˆk , . . . , vˆn )`:τ vˆ0base = (vˆ1 , . . . , vˆ0k , . . . , vˆn )`:τ

M = M0 , TokenVal vˆbase , TokenVal vˆ0k M 0 = M0 , TokenVal vˆ0base , TokenVal vˆk fine

R-Upd

→ (M; upd fk (vˆbase , vˆ0k , tokens(vˆbase , vˆ0k ), ())) prims (M 0 ; vˆk , vˆ0base , tokens(vˆk , vˆ0base ), ())

vˆbase = (vˆ1 , . . . , vˆn )`:τ M = M0 , TokenVal vˆbase M 0 = M0 , TokenVal vˆ1 , . . . , TokenVal vˆn fine

(M; destrτ vˆbase 1. where

2. 3.

R-Destr



tokvˆbase ) prims (M 0 ; vˆ1 , . . . , vˆn , tokens(vˆ1 , . . . , vˆn ), ())

TokenVal vˆ = (tokl , MkToken v) ˆ when vˆ = (vˆ1 )`:τ otherwise, TokenVal vˆ = · token(v) ˆ = tokl when vˆ = (vˆ1 )`:τ otherwise token(v) ˆ =· ¯ˆ = token(vˆ1 ), tokens(v) ¯ˆ tokens(vˆ1 , v) with tokens() = ·

Figure 5. Syntax and semantics of Fine (Reproduced with minor revisions from ESOP ’10)

A.

Soundness of the translation

The translation of runtime configurations of FX programs entails one additional technicality that amount to further book-keeping for tokens. The subtlety that needs to be handled is that when a program e = let x = x. f := 1; e1 in e2 reduces to e0 = let x = e1 in e02 , the type shape computed for (x. f := 1; e1 ) is, in general, different from the shape computed for e1 alone; e.g., the shape for e1 alone would omit x as an output variable. Since the shape of the types govern the structure of the token threading, we need to account for the tokens threaded when translating e when translating the reduction of e, i.e, e0 . We accomplish this in the translation judgment (Figure 6) Γ; Y ` e =⇒ C e. ˆ This judgment is a reproduction of the rules of §4, with two changes. First, the judgment is indexed by a structure Y that specifies the set of output variables to be threaded out of each sub-expression in the program. Second, we find it convenient in our proofs to inline the definition of the type-shape judgment into the definition of translation judgment—thus, in addition to translating e to e, ˆ we compute a type shape C for e while accounting explicitly for the tokens that, according to Y, must be threaded through. This additional judgment is simply a technical device. We prove (Lemma 13) that whenever Γ; Y˙ ` e eˆ then ∃Y,C.Γ; Y ` e =⇒ C e. ˆ Definition 3 (Base signature). A type declaration in FX is of the form td = (τ = {[ f1 : τ1 ; . . . ; fn : τn ]}) and an FX signature is a sequence of type declarations, S ::= td, S|·.

15

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

Γ; Y ` e =⇒ C

T-Return

T-Assert

T-Chk

T-New

T-Destr

eˆ Translation of FX runtime configurations, where Y ::= Y | (1:Y, 2:Y)

Γ`v∼τ Γ`v Γ;Y ` v =⇒ C Γ ` Φ[η/s(η)]

vˆ C ∼ {v:τ} r:τ {Y } (v,Y, ˆ toks(v:τ,Y ), ())

ˆ Φ

T-Let

Γ; Y.1 ` e1 =⇒ C1 eˆ1 C1 ∼ = {X1 } x:τ1 {Y1 } Γ, x:τ1 ; Y.2 ` e2 =⇒ C2 eˆ2 C2 ∼ = {X2 } s:τ2 {Y2 } C∼ = {X1 ∪ X2 \ {x}} s:τ2 {Y2 } Γ; Y ` let x = e1 in e2 =⇒ C let x,Y1 , tokx , tokY1 , = eˆ1 in eˆ2

C ∼ {X} :unit {X} ˆ ASRTX (Φ)

X = ivs (Φ)

Γ; • ` assert (s)Φ =⇒ C

T-App

Γ ` v1 v2 ∼ C Γ ` v1 vˆ1 Γ ` v2 vˆ2 Γ; • ` v1 v2 =⇒ C vˆ1 vˆ2 (toks(ov(C)), ())

Γ; Y ` e =⇒ C0 eˆ C ∼ {X} r:τ {Y } ∼ C0 C = {(s)Φ} r:τ {(t)Ψ} X1 = ivs (Φ) X2 = ivt (Ψ) Γ; Y ` (e:C) =⇒ C let , X1 , tokX1 , = ASRTX1 (Pre(C)Γ ) in let r,Y, toks(r:τ,Y ), = eˆ in let , X2 , tokX2 , = ASRTX2 (Post(C)Γ ) in (r,Y, toks(r:τ,Y ), ()) Γ ` (newτ {[ f = v]}) ∼ C

∀i. Γ ` vi

vˆi

T-Upd

Γ; • ` (newτ {[ f = v]}) =⇒ C mkτ vˆ1 · · · vˆn (toks(iv(C)), ()) Γ ` v ∼ {[ f :τ]} Γ`v C0 ∼ {X} r:τ {Y }

Γ, zi :τi ; Y ` e =⇒ C0 C ∼ {v, X} r:τ {Y }



Γ; Y ` destruct v as {[ f = z]} in e =⇒ C

Γ ` (v1 . fk :=: v2 ) ∼ C Γ ` v1 vˆ1 Γ ` v2 Γ; • ` (v1 . fk :=: v2 ) =⇒ C upd fk (vˆ1 , vˆ2 , toks(iv(C)), ())

vˆ2



let zi , tokzi , = destrτ vˆ tokv in eˆ

σ ` v ↓ v; ˆ L Translating FX objects to Fine values σ (`) = {[ f = v]}ι

σ ` vi ↓ vˆi ; L

σ ` ` ↓ (vˆ1 , . . . , vˆn

∀i

· ` λ x:τ.e ⇒ τ 0 vˆ σ ` λ x:τ.e ↓ v; ˆ·

)`:τ ; L, `

σ ` c ↓ c; ·

σ ` σ ↓ M Translating FX stores to Fine token stores ¯ σ0 ` ` ↓ v; ˆL σ0 ` (σ \ L) ↓ M ∀`0 .σ (`0 ) = {[ f = v]}⊕ ⇒ ` 6∈ FV (v) σ0 ` (σ , {` 7→ {[ f = v]}⊕ }) ↓ M, (tok` , MkToken v) ˆ σ0 ` σ ↓ M σ0 ` (σ , {` 7→ {[ f

= v]} }) ↓ M

M-2

σ0 ` σ2 , σ1 ↓ M σ0 ` σ1 , σ2 ↓ M

σ ` eˆ1 ↓ eˆ2 ; M Closing Fine expressions σ ; Γ; Y ` e =⇒ C T-Trans

M-3

M-1

σ0 ` · ↓ ·

M-4

σ `σ ↓M eˆ2 = ρ eˆ1 ρ = {[`/vˆ ] | ` ∈ FV (eˆ1 ) ∧ σ ` ` ↓ v; ˆ L} σ ` eˆ1 ↓ eˆ2 ; M

e; ˆ M Top-level judgment: Translating FX expressions to Fine programs with token stores

Γ; Y ` e =⇒ C eˆ1 σ ` eˆ1 ↓ e; ˆM σ ; Γ; Y ` e =⇒ C e; ˆM Figure 6. Translating FX runtime configurations to Fine

16

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

ˆ using the following judgment: td, S Given an FX signature S, we derive a Fine signature S, translation of a single type declaration is defined below:

Sˆtd , Sˆ when td

Sˆtd and S

ˆ The S.

(τ = {[ f1 :τ1 ; . . . ; fn :τn | Φ]}) (prims, τ::?), (prims, Tokenτ ::τ ⇒ A) (prims, MkTokenτ : x:τ → Tokenτ x) (prims, Dτ : f1 :τ1 → . . . → fn :τn → τ), V mkτ : x1 :τ1 → . . . → xn :τn → (tokens(x1 :τ1 , . . . , xn :τn ) ∗ { : unit | Φ[xi /self. fi ]}) → (x:τ ∗ Token x ∗ { : unit | i x. fi = xi }), 0 0 0 0 0 0 upd fk : x:τ → y :τk → (tokens(x:τ, y :τk ) ∗ unit) → (y:τk ∗ x :τ ∗ tokens(y:τk , x :τ) ∗ { : unit | y = x. fk ∧ y = x . fk }) V destrτ : x:τ → (tokens(x:τ) ∗ unit) → ( f1 :τ1 ∗ . . . ∗ fk :τk ∗ tokens( f1 :τ1 , . . . , fn :τn ) ∗ { : unit| fi = x. fi }) ˆ Γ, ˆ such that Lemma 4 (Preservation for Fine). Given (M, eˆ1 ) and S; ˆ Γˆ |= M (1) S; ˆ Γ; ˆ X ` eˆ1 : τ, ˆ for X ⊆ dom M (2) S; fine

0 ˆ ) (3) (M, eˆ1 ) → p (M , e 2

ˆ Γˆ 0 ; X 0 ` eˆ2 : τˆ and S; ˆ Γˆ 0 |= M 0 . Then, there exists Γˆ 0 , X 0 such that S; 0 Furthermore, for ∆X = (dom(M) ∪ dom(M )) \ (dom(M) ∩ dom(M 0 )), if dom(M 0 ) ⊇ dom(M) then X 0 = X ∪ ∆X ; otherwise X 0 = X \ ∆X . Proof. Extension of Fine’s soundness proof to account for the three additional base cases, R-New, R-Upd, and R-Destr. ˆ Γ, ˆ such that Lemma 5 (Progress for Fine). Given (M, eˆ1 ) and S; (1) (2) (3)

ˆ Γˆ |= M S; ˆ Γ; ˆ X ` eˆ1 : τ, ˆ for X ⊆ dom M S; eˆ1 6= vˆ fine

0 ˆ ). Then, there exists M 0 , eˆ2 such that (M, eˆ1 ) → p (M , e 2

Proof. Extension of Fine’s soundness proof to account for the three additional base cases, R-New, R-Upd, and R-Destr. Equivalence of Fine configurations M1 ∼ M1 ` eˆ1 ∼ = M2 = eˆ2 (M1 , eˆ1 ) ∼ = (M2 , eˆ2 ) (M1 , eˆ1 ) ∼ = (M2 , eˆ2 ) M1 ∼ = M2

(M2 , M1 ) ∼ =M (M1 , M2 ) ∼ =M

M1 ∼ = M2 ˆ (M1 , (tokl , v)) ˆ ∼ = (M2 , (tokl , v))

·∼ =·

tokl 6∈ dom(M) ∼ `:τ M ` uˆ`:τ 1 = uˆ2

M ` eˆ1 ∼ = eˆ2

C-Cfg

C-Dead

M = M 0 , (tokl , MkToken (vˆ1 , . . . , vˆn )`:τ ) uˆi = (vˆ01 , . . . , vˆ0n ) ∼ uˆ`:τ M ` uˆ`:τ = 1

M ` eˆ ∼ = eˆ

∃ k, vˆk 6= vˆ0k

M ` eˆi ∼ ∀i = eˆ0i ∼ M ` let x¯ = eˆ1 in eˆ2 = let x¯ = eˆ01 in eˆ02

C-id

M ` vˆi ∼ ∀i = vˆ0i ∼ D vˆ0 . . . vˆ0 M ` D vˆ1 . . . vˆn = 1

M ` eˆ1 ∼ = eˆ2 ˆ eˆ1 ∼ ˆ eˆ2 M ` λ x:τ. = λ x:τ.

C-Data

n

C-Lam

C-Stale

2

C-Let

M ` vˆi ∼ ∀i = vˆ0i ∼ (vˆ0 , vˆ0 ) M ` (vˆ1 , vˆ2 ) = 1

M ` vˆi ∼ ∀i = vˆ0i 0 ∼ M ` (vˆ1 vˆ2 ) = (vˆ vˆ0 )

C-Pair

2

C-App

1 2

Lemma 6 (Structural equivalence). For all (M1 , eˆ1 ) and (M2 , eˆ2 ) such that (M1 , eˆ1 ) ∼ = (M2 , eˆ2 ). There M1 = M2 , and exists eˆ and σ1 = (x1 , ul1 ), . . . (xn , uln ) and σ2 = (x1 , vl1 ), . . . (xn , vln ) Such that eˆ1 = σ1 eˆ and eˆ2 = σ2 eˆ And vice versa. Proof. Easy induction on the structure of the equivalence relation.

17

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

Lemma 7 (Equivalent Fine configurations simulate each other). Given and FX signature S and its Fine counterpart Sˆbase such that S Then, for all (M1 , eˆ1 ), (M2 , eˆ2 ) such that A1 (M1 , eˆ1 ) ∼ = (M2 , eˆ2 ) ˆ 1 ); dom M1 ` eˆ1 : τ A2 Sˆbase ; Γ(M ˆ 1 ); dom M1 ` eˆ2 : τ A3 Sˆbase ; Γ(M

Sˆbase .

fine

0 0 A4 (M1 , eˆ1 ) → p (M , e 1 ˆ1 ) fine

0 0 0 0 ∼ 0 0 Then, (M2 , eˆ2 ) → p (M , e 2 ˆ2 ) And (M1 , eˆ1 ) = (M2 , eˆ2 )

Proof. We start by appealing to Lemma 6, and deriving e, ˆ σ1 , and σ2 such that eˆ1 = σ1 eˆ and eˆ2 = σ2 e. ˆ From well-typedness and an inversion of (T-Loc) we find that for each ul:τ in the range of σ1 , σ2 ; we have S 3 (prims, τ). We proceed by induction on the structure of the reduction relation (A4). We consider two main sub-cases. Sub-case p = pub: From the metatheory of Fine, specifically from the soundness of Fine’s module system, we have a value abstraction result. ˆ Γ(M), ˆ Namely, for S; x : (prims, τˆx ); X ` e : τ and for any two u1 and u2 such that Sˆ ` u1 : τˆx and Sˆ ` u2 τˆx . fine



If (M, e[u1 /x]) pub (M 0 , e0 ). fine



Then, there exists e00 such that e0 = e00 [u1 /x] and (M, e[u2 /x]) pub (M 0 , e00 [u2 /x]). From this result, and the inverse direction of Lemma 6, we get our conclusion. Sub-case p = prims: Sub-sub-case R-New: We have e1 = mkτˆ uˆl1 , . . . , uˆln (l1 , . . . , ln , ()) e2 = mkτˆ vˆl1 , . . . , vˆln (l1 , . . . , ln , ()) Since we have M1 = M2 we have M1 (li ) = M2 (li ) = MkToken wˆ i : Tokenwˆ i . From well-typedness of e1 and e2 we have li : Tokenuˆi and li : Tokenvˆi . Together, from convertibility of types, we have ui = vi = li . Thus, e1 = e2 and we have the conclusion. Sub-sub-case R-Upd and R-Destr: Analogous to (R-New)—the presence of the tokens and M1 = M2 guarantees that e1 = e2 . Sub-sub-case R-Ctx: Simple, from the induction hypothesis.

A store and expression respect ownership constraints on object graph RespectsOwnership σ eˆ

RespectsOwnership σ eˆ ⇐⇒ OwnershipOK σ {l | tokl ∈ FV (e)} ˆ

OwnershipOK σ L , for L = l1 , . . . , ln OwnershipOK (σ2 , σ1 ) L OwnershipOK (σ1 , σ2 ) L FreeLocs(v) ∩ dom(σ ) = 0/

RO-Perm

OwnershipOK • L

l 6∈ Reachable σ L

RO-Nil

OwnershipOK σ L

OwnershipOK (σ + {l 7→ {[ f = v]} }) L FreeLocs(v) ∩ dom(σ ) = 0/ Parents σ l ∩ Reachable σ L = 0/ OwnershipOK σ L OwnershipOK (σ + {l 7→ {[ f = v]}⊕ }) (L, l) l 6∈ L FreeLocs(v) ∩ dom(σ ) = 0/ |Parents σ l ∩ Reachable σ L| ≤ 1 OwnershipOK σ L OwnershipOK (σ + {l 7→ {[ f = v]}⊕ }) L

RO-Dead

RO-Root

RO-Nested

` ∈ Reachable σ L `∈L ` ∈ Reachable σ L

(Parents σ `) ∩ (Reachable σ L) 6= 0/ ` ∈ Reachable (σ + {` 7→ }ι ) L

We write Parents σ ` for {`0 | `0 ∈ dom(σ ) ∧ ` ∈ FreeLocs(σ (l 0 ))}, and Ancestors σ ` for the fixed point of this relation. We write Live σ for {` ∈ dom(σ ) | σ (`) = {[ f = v]}⊕ }

18

Preliminary version made available for peer review. Subject to revision. – October 12, 2010

2010/10/12

Lemma 8 (Parents). Given σ , L such that OwnershipOK σ L. Then ∀l ∈ dom σ .|L ∩ (l ∪ Ancestors σ l)| ≤ 1 Proof. Straightforward, by induction on the definition of OwnershipOK σ L. Lemma 9 (Canonical derivations for RO). If OwnershipOK σ (L1 , L2 ). Then, there exists a permutation (σ0 , σ1 , σ2 ) of σ , Such that, OwnershipOK (σ0 , σ1 , σ2 ) (L1 , L2 ). Where, (1) (2) (3) (4)

OwnershipOK (σ0 , σ1 ) L1 and OwnershipOK σ2 L2 dom(σ0 ) = dom(σ ) \ Live (σ ) dom(σ1 ) = Reachable σ L1 ∩ Live (σ ) dom(σ2 ) = Reachable σ L2 ∩ Live (σ )

Proof. By induction on the definition of OwnershipOK σ L, using the fact that the side conditions of the inductive rules are monotonic in σ. Lemma 10 (Extensibility of RO). If OwnershipOK σ1 L1 and OwnershipOK σ2 L2 . where, L1 ∩ L2 = 0, / L10 = L1 \ Reachable σ2 L2 , ∀l ∈ dom(σ1 ).∀l 0 ∈ dom(σ2 ). l ∈ FreeLocs(σ2 (l 0 )) ⇒ l 0 ∈ Reachable σ2 L2 , and ∀l ∈ dom(σ2 ).∀l 0 ∈ dom(σ1 ). l ∈ FreeLocs(σ1 (l 0 )) ⇒ σ1 (l 0 ) = {[ f = v]} . Then, OwnershipOK (σ1 , σ2 ) (L10 , L2 ). Proof. By induction on the derivation of OwnershipOK σ1 L1 , using the fact that no dead locations in σ2 are reachable from L1 , L2 . Lemma 11 (Soundness of small footprints on single steps). fx If Γσ ` e : C, where C ∼ = {X} { } and (σ , e) → (σ 0 , e0 ). 0 Then, there exists σ0 , σ1 , σ1 such that σ is a permutation of σ0 , σ1 ; and σ 0 is a permutation of σ0 , σ10 ; and dom(σ1 ) = X and FreeLocs(σ2 ) = dom σ1 ∪ L, where L fresh. Proof. By induction on the derivation of the transition. Lemma 12 (Input variables correspond to tokens). If Γσ ` e ⇒ C and C ∼ = {X} { } ˆ and Sˆbase ; Γˆ σ ; dom(Γˆ σ ) ` eˆ : τ.



Then FreeTokens(e) ˆ = X. Proof. Induction over the structure of the token-threading judgment. Lemma 13 (Memo-ized derivations). If Γ; Y˙ |= e ⇒ C

e. ˆ Then, ∃Y.Γ; Y ` e =⇒ C



Proof. Induction on the rewriting judgment. eˆ and C ∼ ˆ = {l, X} r : τ { }. Then tokl ∈ FV (e).

Lemma 14 (Input variables correspond to tokens). If Γσ ` e ⇒ C Proof. Induction over the structure of the token-threading judgment. Lemma 15 (Type translation). If Γσ ; Y ` e =⇒ C

ˆ Γσ ` C ⇒ C. ˆ e. ˆ Then, exists C,

Proof. Induction over the structure of the token-threading judgment. Lemma 16 (Well-typed translations preserve types). If Γσ ` e ⇒ C

eˆ and Γˆ σ ` eˆ : τˆ and Γσ ` C

ˆ Then Γˆ σ ` τˆ

Suggest Documents