Microsoft Research, Cambridge, U.K. ... Formalizations of virtual machines such as the Java Virtual .... Pusch's work on the JVM [18, 10, 12] and van Inwegen's.
Automating Type Soundness Proofs via Decision Procedures and Guided Reductions Don Syme and Andrew D. Gordon Microsoft Research, Cambridge, U.K.
Abstract Operational models of fragments of the Java Virtual Machine and the .NET Common Language Runtime have been the focus of considerable study in recent years, and of particular interest have been specifications and machine-checked proofs of type soundness. In this paper we aim to increase the level of automation used when checking type soundness for these formalizations. We present a semi-automated technique for reducing a range of type soundness problems to a form that can be automatically checked using a decidable first-order theory. Deciding problems within this fragment is exponential in theory but is often efficient in practice, and the time required for proof checking can be controlled by further hints from the user. We have applied this technique to two case studies, both of which are type soundness properties for subsets of the .NET Common Language Runtime. These case studies have in turn aided us in our informal analysis of that system. 1
Introduction
Formalizations of virtual machines such as the Java Virtual Machine (JVM) or the .NET Common Language Runtime (CLR) have been the focus of considerable study in recent years [15, 12, 14, 4]. Of particular interest have been specifications and proofs of type soundness for these systems, frequently involving machine-checked proofs using interactive theorem provers [16, 18, 17]. While the automation available in interactive theorem provers has increased, both the kind of automation applied (e.g. rewriting) and the manner of its application (e.g. tactics) tend to be substantially adhoc. The proof scripts needed to check these properties are often many thousands of lines long.1 In this paper we aim to increase the level of automation applied to this problem, focusing on one particluar automated decision procedure and one particular form of user guidance. We isolate out the user’s guidance into a component called a guided reduction, which indicates how to extract the relevant facts that make the proof go through for particular cases. Applying the reduction is an automated process that transforms the type soundness problem into a form that can be automatically checked using case-splitting and validity checking within a combination of decidable firstorder theories. The particular decision procedure used in this paper is essentially the algorithm used by the Stanford 1 See Chapter 3 of [16] for a discussion of this and why it is problematic.
Validity Checker (SVC) [1], which has been successfully applied to large hardware verification proofs. We have applied this technique to models of subsets of the .NET Common Language Runtime, which has in turn aided our informal analysis of that system. This paper is structured as follows. In the remainder of this section we consider the background to this work, including a number of particularly relevant studies of Java and the JVM. In §2 we describe Spark, a model of a fragment of the IL of the .NET Common Language Runtime, which is used for explanatory purposes in this paper. In §3 we describe guided reductions, our new technique for semi-automatically converting high-level statements of type soundness into a form suitable for analysis by an automated decision procedure. In §4 and §5 we apply this combination of techniques to two case studies, and in §6 we discuss interesting potential avenues for future work. 1.1
Background: Type Soundness for Virtual Machines
In this section we consider the typical structure of a type soundness specification for a virtual machine. A good supply of examples exists against which to compare this structure, e.g. [16, 12, 13, 11], and we have examined these examples to check that they fall within the general structure described here. A structured operational semantics (SOS) used in a type soundness proof typically has the following components, as illustrated in Figure 1: A formal description of programs. A specification of the syntax of programs accepted for execution; A formal description of typechecking. A component modelling all the checks performed on a program prior to execution, i.e. typing rules for a source language or verification checks for a virtual machine; A formal description of execution. An abstract interpreter or model of compilation/execution; A type soundness property. Type soundness is usually specified by properties that express that (a) no errors of particular kinds occur during execution; (b) the machine always makes progress. Formal descriptions of systems as complex as the JVM or the .NET CLR vary substantially according to the specification methodology used, the exact logic in which the system is formalized and individual choices about how to model operations in the logic. However, the components above are
Figure 1: Typical Structure of a Type Soundness Specification for a Virtual Machine Proposition 2 If p : τ and p ` s ≤: S then there exists an s0 such that p ` s ; s0 and furthermore p ` s0 ≤: S. Whether the proof of such a property is effectively automatable obviously depends on the nature of the relations :, ; and ≤:. We stress that previous work on machinechecking such propositions has applied essentially adhoc automation techniques. While this paper does not attempt to achieve complete automation of the proofs of such properties, we believe it offers a first step in that direction.
always recognisable. The primary points of departure between different descriptions of the same system are: • Functional v. relational models of type checking and execution. • Big-step v. small-step models of execution, i.e. the representation of execution may be either an evaluation relation, and evaluation function or a transition relation.
1.2
Related Work
Wright and Felleisen’s 1994 work presented a systematic syntactic approach to a range of type soundness proofs for source languages [20], and in common with many other authors we have adopted aspects of their methodology in this paper. No prior work has attempted to systematically apply decision procedures or other particular automated techniques to type soundness proofs. However, there has been considerable work on using interactive theorem proving for these kinds of proofs, most notably Nipkow, von Oheimb and Pusch’s work on the JVM [18, 10, 12] and van Inwegen’s work on Standard ML [17]. Syme’s work on Java used a more restrictive proof style and applied decision procedures to prove resulting obligations [16]. There have been other efforts to formalize aspects of virtual machine descriptions but without mechanized proof checking, e.g. [13, 4], as well as a set of very extensive Abstract State Machine (ASM) descriptions of the JVM [14]. The work presented in this paper has also been inspired by Norrish’s treatment of C [11] and the general background of HOL theorem proving [5].
• The representation of errors. That is, if things go wrong during type checking or execution this may be modelled by either explicit error propogation and/or nonmembership of various relations. • The atomicity of execution steps. For most abstract models of execution, even small-step ones, steps may be larger than the one machine instruction that is the level of granularity on a real machine. This can effect the strength of the type soundness result proved. • The degree of realism of the model of execution, e.g. whether it models important features of execution such as optimizations.2 We now give example forms of the terms and predicates for the different components of a specifcation. We stress that the exact form of the functions and predicates differ in detail between systems, but the essence of the techniques used do not, and that our techniques are equally applicable to many particular specifications. Programs: a type Prog or programs p where P rog is defined via structural types such as lists, finite maps, records, integers, strings, products and sums;
2
Spark, a Little .NET Common Language Runtime
We now give a concrete example of a type soundness specification that serves to motivate our techniques to substantially automate type soundness proofs. A larger case study is discussed in §5. Our example is motivated by the instruction set of the .NET Common Language Runtime (CLR) [9, 8] and is called Spark.3 We informally introduce Spark in §2.1, and §2.2 gives a mixed relational/functional specification of the execution and verification of Spark programs. We also restate the type soundness property for Spark.
Checking: a predicate p : τp , indicating that the program p has the given type τp . This is usually defined compositionally in terms of a number of predicates p ` item : τitem indicating that various sub-components item of p are well-typed given the context of the whole program. Execution: a type State of states s, an initial state s0 and a relation p ` s ; s0 indicating that if the virtual machine is in state s running program p then it may take a step to state s0 .
2.1
Given the relations and functions above, type soundness can be defined as inhabitation of the transition relation:
Informal Description of Spark
A program in the Spark bytecode language consists of a single method implementation paired with a signature. A method implementation consists of an array of instructions. A method signature describes the number and types of the method’s arguments, the type of the method’s result, and the number and types of the method’s local variables. The state of the Spark execution engine when running a program consists of a program counter, a set of arguments, a set of local variables, and an evaluation stack. The values of arguments, local variables, and stack slots may be either signed integers or IEEE floating point numbers. The execution engine initializes the program counter to point at the first instruction of the method, the local variables to all contain zero, and the evaluation stack to be empty. The
Proposition 1 If p : τ and p ` s0 ;∗ s then there exists an s0 such that p ` s ; s0 . That is, none of the errors detected by the particular model of execution occur when running checked (well-typed) programs. Propositions like this are typically proved via an invariant that specifies “good” states, i.e. a type StateType recording expected shapes S of state structures arising at runtime (stack frames, heap entries etc.), and a predicate p ` s ≤: S indicating when a state conforms to a state type. Then the statement of soundness becomes: 2 The incorrect implementation of optimizations can break type soundness, so the strength of the property proven is relative not only to the typing rules but also the model of execution.
3 The CLR project was once called Lightning, hence the name: Spark, a little Lightning.
2
execution engine repeatedly executes instructions, and terminates when it encounters a return instruction, returning a value. Execution may, in general, lead to untrapped errors, when the execution engine is in a state where the instruction indexed by the program counter does not make sense. For example, it may refer to an argument or local variable that does not exist, or attempt to pop an empty evaluation stack, or apply an integer operation to a float, or branch to an address that does not exist. The purpose of the Spark verifier is to rule out any method whose execution could lead to an untrapped error. 2.2
type instr = | I_ret | I_ldarg | I_starg | I_ldc | I_ldloc | I_stloc | I_br | I_ble | I_pop | I_add | I_mul
We describe execution and verification of Spark programs by programming transition functions over data structures in a version of the Caml dialect of ML [7]. Our descriptions of Spark avoid all the imperative features of ML and use no recursion. Hence, we can directly interpret our ML data structures and procedures as mathematical sets and total functions, respectively. We have built a tool that imports our ML definitions into the DECLARE theorem prover [16], which essentially implements the higher order logic of the HOL system. The ML definitions are interpreted as phrases of higher order logic. §2.2 explains ML data structures that define the instruction set of the Spark intermediate language, and the metadata accompanying each method. §2.3 describes ML procedures which define initialization and transition functions of the Spark execution engine. Similarly, §2.4 describes ML procedures that define initialization and transition functions for the Spark verifier. Instructions in the intermediate language need to index particular program addresses, method arguments, and local variables. We make the following ML type definitions to describe such indexes, where the type int is the primitive type of integers.
Item types, method signatures: type itemT = I | F type msig = {argsT: itemT list; retT: itemT; locsT: itemT list}
item type signed integer IEEE f.p. number method signature argument types return types local variable types
Finally, the following record type describes the methods run by our baby execution engine: each consists of a signature paired with a method implementation. Method: type meth = {msig: msig; instrs: instr list}
Indexes occurring in instructions:
2.3
bytecode address argument index local variable index
method method signature method implementation
The Spark Execution Function
Our description of the execution of individual instructions by the execution engine is presented in Appendix A. It specifies an ML function step with the following type:4 step: meth -> state -> state option
Next, we define an algebraic type to represent numeric constants.
This is a functional description of a deterministic transition relation. The type state is defined as follows:
Constants occurring in instructions: type const = | Const_I of int | Const_F of float
arg_idx arg_idx const loc_idx loc_idx addr addr
The metadata accompanying a method implementation is a signature, which describes the number and types of its arguments, the type of its result, and the number and types of its local variables. In the following, we define an algebraic type itemT whose values I and F are the execution engine types of integers and reals, respectively. Next, we define a record type msig whose values are records describing the types of a method’s arguments, result, and locals.
The Specification of Spark
type addr = int type arg_idx = int type loc_idx = int
of of of of of of of
instruction exit the method load an argument store into an argument load an integer or float load a local store into a local unconditional branch conditional branch pop an element off the stack addition multiplication
Items, states:
constant integer IEEE f.p. number
type item = Int of int | Float of float
item
(The type float is the primitive type of IEEE floating point numbers. In general, each value of an algebraic type consists of one of several constructors, possibly applied to an argument. For example, each value of the type const is either an application Const_I i, where i : int, or an application Const_F f , where f : float.) The instruction set is the following algebraic type instr.
type state = {args: item list; locs: item list; stack: item list; pc: addr}
execution state items in arguments items in local variables items on the stack program counter
Spark intermediate language instructions:
4 The option type constructor or an equivalent is part of the standard libraries of most theorem provers. It simply adds a distinguished element to a type, type ’a option = None | Some of ’a, with projection function the: ’a option -> ’a.
integer IEEE f.p. number
For the purposes of this paper we ignore the generation of initial states.
3
2.4
The Spark Verification Checks
let destOK (methT, (daddr, daddrT)) ↔ match (read methT daddr) with | None -> false | Some daddrT’ -> daddrT = daddrT’
The purpose of a bytecode verifier is to attempt to guarantee, given a method as input, that no execution path through the method can lead to an untrapped error. The verifier achieves this by computing, for each instruction in the method, a precondition on the types of items on the stack immediately before the instruction’s execution. If the verifier can check that the effect of simulating each instruction on the types of items on the stack is consistent with the preconditions of all successor instructions, then it can guarantee that no untrapped error can occur. The consistency checks made by the verifier are not necessary conditions; there are inevitably false positives where a method fails verification even though no execution path through it leads to an untrapped error. In this paper we represent verification checks by a relation that relies on being given a summary of information that would typically be inferred during the execution of a series of verification checks. Pusch and Nipkow have shown how to formalize the link between a verification algorithm and a relational view of the checks made during verification [12]. We follow their approach of defining the verification checks at particular instructions so that they could be shared between an algorithmic and relational specification. A precondition on an address is simply a list of types representing the shape of the stack prior to execution of that address. The ML type stackT represents a precondition. Preconditions for all or some of the addresses in a method are represented by ML values of type methT, a list indexed by addresses.
let addr_hastype (addr,m,mT) ↔ match (read m addr) with | None -> true | Some(stackT) -> begin match (dests (m,addr,stackT)) with | Some(dests) -> ∀d ∈ dests. destOK (mT,d) | _ -> false end let hastype (m,mT) ↔ ∀addr . addr_hastype (addr,m,mT)
(Here read : ’a list -> int -> ’a option indexes into a list, returning None if the index is out-of-range.) 2.5
In this section, we define what it means for an execution state to conform to a method typing, mτ . The relation stackOK(stack , stackT ) means that the list stack conforms to the type list stackT when executing program m. For the relation to hold, the lists must have the same length, and each item in stack must conform to the corresponding type in stackT. Conformant items, stack, locals and arguments: let valOK(v,vT) ↔ match v,vT with | Int _, I -> true | Float _, F -> true | _,_ -> false
Stack and method typings: type stackT = itemT list type addrT = addr × stackT type methT = stackT list
Type Soundness for Spark and the Dynamic Conformance Relations
types of items on stack address with its type stack type for each addr
let stackOK(stack,stackT) ↔ (length(stack) = length(stackT)) ∧ ∀j < length(stack). valOK(nth stack j,nth stackT j)
Our main subroutine is a function dests that simulates a an instruction and computes its destinations. It depends on a subroutine effect to simulate the effect of running an instruction on a precondition.
Likewise for argsOK and locsOK.
(The function nth: ’a list -> int -> ’a fetches the nth element from a list.) The relation stateOK(s, mτ ) means that the execution state s conforms to the preconditions mτ . For the relation to hold, the arguments and locals maps must conform to the signature of the method, and there must be a precondition stackT assigned to the current program counter by mτ such that the current stack s.stack conforms to stackT .
effect: msig × instr × stackT -> stackT option dests: meth × addr × stackT -> addrT list option The definition of these functions is shown in Appendix B. The remainder of the checks are defined in relational form (we continue to use program-like syntax for consistency). The relation addr hastype(addr , m, τm ) means that an instruction instr occurring at address addr in a method m is well-typed, where τm is the set of preconditions for the whole method. For the relation to hold, the effect of the instruction must be compatible with its precondition, and with the preconditions of all instructions reachable from the instruction. The relation hastype(m, τm ) is the primary typing predicate and means that the method m is well-typed with respect to the stack type τm . For the relation to hold, each of the instructions of the method must be well-typed with respect to the preconditions τm .5
Conformant state: let stateOK(s,sT) ↔ argsOK(s.args,sT.argsT) ∧ locsOK(s.locs,sT.locsT) ∧ match read methT s.pc with | Some(stackT) -> stackOK (s.stack,stackT) | _ -> false end
The key type soundness propositions can now be stated: Proposition 3 If hastype(m, τm ) and stateOK(s, mτ ) then there exists and s0 such that step(s) = Some(s0 ) and stateOK(s0 , mτ ).
Typing predicates:
We now want to substantially automate the proof of this and similar, more complex propositions by using a decision procedure combined with a little user guidance.
5 In this paper we ignore initialization conditions for simplicity: in our actual formulation the precondition on the first instruction of the method must be the empty stack.
4
3
Controlling a Decision Procedure via Guided Reductions
interactive proof checking. Techniques where no human interaction is required are just called decision procedures.
Recent efforts have shown that it is feasible to machine-check type soundess properties for non-trivial models of execution and verification, certainly including propositions such as Proposition 3. However, the techniques currently require substantial amounts of human guidance. We reiterate that it is an open question as to whether practical fully automated techniques exist for checking classes of type soundness properties, of which Proposition 3 is but one simple example. It is important to remember that the execution and verification checks can be almost arbitrarily more complex than the example shown in the previous section, all the way up to doing proofs about realistic implementations of fullly featured virtual machines. Thus these properties will remain a challenging and fertile area for applying automated techniques for a long time to come. However in this paper we do not aim for full automation but rather seek to reduce and limit the amount of human intervention required in the proof effort. For pragmatic reasons we are interested in using decision procedures to perform our automated reasoning, as it is necessary to produce simple counter-examples for failed proof efforts. We will apply essentially the technique implemented by the Stanford Validity Checker (SVC) [1] (equally applicable would be its successor, CVC [2]). This procedure checks the validity of quantifier-free formula first-order logic with respect to theories for arithmetic, products, arrays (maps), sums and conditionals. Good counter-examples can be generated when proofs fail. SVC does not make use of axioms for finite discrimination or induction over datatypes, and the axioms it uses for indexed data structures apply equally to lists accessed using indexing functions, partial functions, total functions and finite maps. SVC has been successfully used for proofs about abstracted descriptions of microprocessors, for example where the ALU is modelled by a mathematical function rather than a circuit. It is an open question if other decision procedures (e.g. [6]) can be applied to the kind of proofs described in this paper. 3.1
3.2
An Example of a Decidable Type Soundness Proof
First let us consider a type soundness proof that can be fully automated using our target decision procedure: this will give a flavour of what can and can’t be done automatically. Consider a very trivial machine containing a single stack reg that contains zero or more values ([v1 ; ...; vn ])). A program consists of a single instruction and a type for the register is either A, indicating at least one value is definitely present, or else B. We only consider one instruction, “cgt”, which takes a value from the register and pushes either 0 or 1: Definitions for Example let hastype (inst,regT) ↔ match inst with | I_cgt -> (regT = A) | _ -> false let step (inst,reg) = match inst with | I_cgt -> begin match reg with | (v :: rest) -> Some(f v::rest) | [] -> None end | _ -> None let stateOK(reg,regT) ↔ match reg,regT with | (v :: _),A -> true | [],B -> true | _ -> false
Following the form of Example Proposition 2, type soundness for this trivial machine will be: Proposition 4 If hastype(p, pτ ) and stateOK(reg, reg τ ) then there exists reg 0 such that step(p, reg) = Some(reg 0 ) and stateOK(reg 0 , reg τ ).
Guided Proof Checking
Variations on the “transform and give to a decision procedure” theme are used on an adhoc basis in theorem proving and the combination of automatic transformations with decision procedures occurs often in computing. However our transformations are not automatic: i.e. they are non-trivial and represent insight on the part of the user. So in order to demonstrate that we have not resorted to full-blown interactive theorem proving we need a characterization of the input specified by the user. In this paper we rigorously seperate proof checking into three steps:
Expanding all definitions, adding the assumption p = I cgt, and replacing pattern matching with its equivalent let/test/destructor form is an automatable process and gives the following:6 Example after Preparation for Decision Procedure let let let let let let let let let let let
1. The user specifies a transformation that reduces the problem to one within a recognised, decidable logic; 2. The system automatically applies the transformation; 3. The resulting formula is passed to a decision procedure, which returns “OK” or a counter-example.
p = I_cgt in regT = reg0T in reg’ = reg0 in v = hd reg’ in rest = tl reg’ in reg’’ = reg0 in regT’’ = reg0T in v’ = hd reg’’ in reg’’’ = reg1 in regT’’’ = reg1T in v’’ = hd reg’’’ in
(if is_cgt p then if (regT = A) then Some(A) else None else None) = Some(reg1T) ∧
When proof is divided in this away we call it guided proof checking, and we call our particular technique guided reduction. In contrast, we call techniques where the user interactively applies further proof methods to residue problems
6 We do not simplify the formula under the assumptions: this is the job of the decision procedure.
5
• The problem statement may include large, irrelevant definitions such as that for valOK. A heuristic-based case-splitting decision procedure such as SVC can be easily misled by the presence of such terms. Better heuristics can help, but ultimately these definitions are only needed for some branches of a proof and their presence on other branches greatly hinders both the automation and the interpretation of counter-examples.
(if is_cgt p then if is_cons reg’ then Some((if v > 0 then 1 else 0)::rest) else if is_nil reg’ then None else arb else None) = Some(reg1) ∧ (if is_cons reg’’ ∧ is_A regT’’ then true else if is_nil reg’’ ∧ is_B regT’’ then true else false) → (if is_cons reg’’’ ∧ is_A regT’’’ then true else if is_nil reg’’’ ∧ is_B regT’’’ then true else false)
The predicate stackOK could be defined recursively, e.g. from left-to-right along the list. However the automated routine must then determine how many times to unwind that recursion. In addition, the number of unwindings depends upon the branch of the problem, as we shall see, and on some branches of some proofs may be indexed by a parameter, for example when n arguments are popped off a stack at a call instruction in a virtual machine. Furthermore, whatever techniques we use have to be able to cope with random access structures such as lists indexed by position and finite maps, which do not fit nicely into a recursive framework.
(Here → is logical implication.) Now, this is exactly the kind of problem that is immediately decidable by an SVC-like decision procedure. SVC repeatedly looks for propositions to case-split on, propogates resulting facts and equalities and applies simplification rules to keep the term in a canonical form. SVC is blindingly fast for small propositions, and can perform 100,000s of equation propogations per second, partly because of the efficient graph-based term representation used. It is extremely good at trimming parts of the case-splitting search space on the basis of known facts, because term sizes are reduced as facts become known. 3.3
3.4
Our Proof Guidance
We now describe our technique to let the user avoid the problems associated with uncontrolled unwinding of large definitions and first-order quantifiers. A proof script is made up of three parts:
Motivating Guided Reductions
Unfortunately, not all type soundness problems may be solved immediately by the application of a SVC-like decision procedure. Consider the following problem:
• The specification of a set of problematic predicates. • The specification of how and where to apply the fundamental rules associated with problematic predicates. This is called a guided reduction.
A Problem Not Immediately Provable by SVC
let valOK(v,vT) ↔ some large expression let stackOK(stack,stackT) ↔ (length(stack) = length(stackT)) ∧ ∀j < length(stack). valOK(el(j,stack),el(j,stackT))
• The specification of some additional heuristic information that may be necessary for the efficient execution of the decision procedure in very large problems, for example case-split orderings.
stack0T [] ∧ stack1T = hd(stack0T)::hd(stack0T)::(tl(stack0T)) ∧ stack0 [] ∧ stack1 = hd(stack0)::hd(stack0)::(tl(stack0)) ∧ stackOK(stack0,stack0T) (A) → stackOK(stack1,stack1T) (B)
This constitutes the full input specified by the user. The well-formedness of the guided reduction can be checked automatically. The process of applying the reduction involves: • Expanding all definitions of all terms representing applications of rules.
This is based on the cae for the “dup” (duplicate) instruction when proving Proposition 1 from §2.5. It is similar to the example from the previous section, except that the types for the stack are now a list of types, rather than a single “bit”, and the predicate stackOK, which checks that a stack matches its typing, uses the ∀ operator. Such a problem is not immediately solvable via the SVC technique for two reasons:
• Expanding the definitions of all non-problematic predicates and functions. • Replacing pattern matching by the equivalent test/get form. The problem is then submitted to the decision procedure. The first part of the proof script is a specification of a set of problematic predicates and functions. These are typically:
• The operators length, nth and ∀ lie in theories not understood by the decision procedure, so the procedure regards them as uninterpreted. Looking at this another way, the way stackOK has been defined as a universal first order predicate has lead us into reasoning about both first-order and equational theories simultaneously. If we were using a first-order theorem prover, we could throw in axioms about these operations and perform a proof search. However it is well known that combining such problems are difficult, and while progress has been made recently to determine forms of such problems that are tractable [3], it is not yet clear if the techniques will scale up to very large verification problems while providing the high-quality counter-example feedback that is required.
• Those whose definition is recursive; • Those whose definition involves operators such as ∀ that lie outside the theory supported by the target decision procedure; • Those that are not asserted to be irrelevant to the property being proved and may thus be treated as abstract; • Those whose uncontrolled expansion creates an unacceptable blow-out in proof checking times;
6
• Those whose uncontrolled expansion creates unacceptable confusion in the counter-examples generated for failed proof attempts;
Indexed (Strengthening). A rule r of the form ∀y1 , . . . , yn . Q[x1 , . . . , xm , y1 , . . . , yn ] → p(x1 , . . . , xm ) for a problematic predicate p.
• Those which are used in rules for other problematic predicates.
For example, after a trivial rearrangment of quantifiers stackOKHead is a definitional rule, and stackOKHeadW is weakening rule. Note we rely on rules being in one of the above forms: for example, this paper does not consider induction principles for inductive relations or recursive term structures. These will be considered in a later paper.
The non-problematic predicates and functions are typically those which have a lone equational axiom of the form p(x1 , . . . , xn ) = Q[x1 , . . . , xn ] where Q contains no problematic predicates. The definitions of all such predicates and functions will be expanded before the problem is sent to the decision procedure. In our problem domain, problem specifications are “complex but shallow”, and the majority of functions and predicates are not problematic. The problematic predicates in the example from §3.3 are valOK and stackOK. There are no problematic functions. Problematic predicates are left uninterpreted except as specified in the guided reduction given by the user. We assume that a set of rules are available for problematic predicates. For functions these are always the equational definitions of the functions, and for predicates are often the definitions of the predicates themselves. More generally they can be a set of useful lemmas related to the predicates which follow immediately and trivially from its definition. For this paper we simply assume these results as axioms, but [16] shows how similar results can be derived automatically from definitional forms such as equations involving quantifiers or least-fixed-point operators. We assume all rules containing problematic predicates can be classified as information preserving (no suffix), weakeners (W) and strengtheners (S). For functions we assume rules that are equations for the function. Multiple rules can exist for each problematic predicate and function. For example, here are two rules for stackOK. They follow directly from the definition in §3.3.
3.5
An Informal Look at Some Guided Reductions
The second part of a proof script is a specification of how and where to apply the fundamental rules associated with problematic predicates, called a guided reduction. Consider the following informal specification of a guided reduction for the problem specified in §3.3 above: “If the instruction is “dup”, apply the stackOKHead rule once to the input stack (as characterized by fact (A)) and twice to the output stack (as characterized by fact (B)).” Informally, applying this reduction to the problem in §3.3 means replacing the specified facts and goals with the r.h.s. of rule stackOKHead and leaving remaining instances of stackOK and valOK uninterpreted. The problem is then immediately solvable by a decision procedure such as SVC. In this example we have effectively used guided reductions for controlled rewriting of the relevant predicates. Note that uncontrolled rewriting using rule stackOKHead does not terminate. Two more example informal descriptions of guided reductions are as follows:
Rules for stackOK
• “If the instruction is I pop, apply stackOKHead once to the input stack”.
stackOKHead |stackOK(stack,stackT) ↔ match (stack,stackT) with | [],[] -> true | (h::t),(hT::tT) -> valOK(h,hT) | _,_ -> false
• “If the instruction is a binary operation b, apply stackOKHead twice to the input stack (revealing the input values) and once to the output stack (revealing the output value). Further for each input and output value apply the valOKDef rule.”
∧ stackOK(t,tT)
The above examples indicate that we would like guided reductions which can be combined together in meaningful ways, for example chaining (“apply stackOKHead to the input stack and valOKDef to the value that results”) and disjunction (“either apply stackOKHead once or twice”), and which can be conditional, where the conditionals are based on abstracted criteria (a binary instruction) rather than primitive syntactic criteria (e.g. a list of specific instructions).
stackOKHeadW |stackOK(stack,stackT) ∧ stack [] ∧ stackT [] → valOK(hd stack,hd stackT) ∧ stackOK(tl stack,tl stackT)
We consider the following forms for rules: Definitional. A rule r of the form p(x1 , . . . , xn ) ↔ Q[x1 , . . . , xm ] for a problematic predicate p. Weakening. A rule r of the form p(x1 , . . . , xm ) Q[x1 , . . . , xm ] for a problematic predicate p.
3.6
→
The Algebra of Guided Reductions
Formally, a guided reduction r for a predicate p is a specification of a replacement predicate for p using one of the following forms:
Strengthening. A rule r of the form Q[x1 , . . . , xm ] → p(x1 , . . . , xm ) for a problematic predicate p.
Identity. The predicate p itself. See §3.6.1. A Rule Operator. An operator corresponding to one of the rules for p supplied with appropriate guided reductions as arguments, as specified below. See §3.6.2 and See §3.6.3.
Indexed (Weakening). A rule r of the form ∀y1 , . . . , yn . p(x1 , . . . , xm ) → Q[x1 , . . . , xm , y1 , . . . , yn ] for a problematic predicate p.
7
3.6.3
A Monotone Combinator. Guided reductions can be combined using combinators monotone or antimonotone in each of their arguments. The examples we use in this paper are given in below. See §3.6.4.
Weakening (likewise strengthening) indexed rules of the form ∀y1 , . . . , yn .p(x1 , . . . , xm ) → Q[x1 , . . . , xm , y1 , . . . , yn ] for a problematic predicate p give rise to an operator for building weakening (likewise strengthening) guided reductions. For example, the following indexed rule justifies random access into a list:
Guided reductions must be well-formed and are categorized as weakening and/or strengthening. An information preserving rule is one that is both weakening and strengthening. The fundamental property required of a guided reduction is that a weakening reduction r for p must satisfy ∀x.p(x) → r(x), and a strengthening reduction must satisfy ∀x.r(x) → p(x). This is easily demonstrated for each of the forms we describe below. 3.6.1
Weakening rule for argsOK argsOKPoint |∀ i. argsOK(args,argsT) → match (read args i, read argsT i) with | None,None -> true | Some v, Some vT -> valOK(v,vT) | _,_ -> false
Identity Reductions
Each problematic predicate can itself be used as a guided reduction indicating that no reduction should be performed. These are well-formed and information preserving. For example, the guided reduction stackOK is a well-formed information preserving guided reduction. 3.6.2
For Spark this rule follows immediately from the definition of argument conformance in §2.5. This gives rise to the following operator for building weakening guided reductions:
Using Simple Rules
Operator for rule argsOKPoint
Each definitional rule r of the form p(x1 , . . . , xk ) ↔ P [x1 , . . . , xk , q1 , . . . , qm ] problematic predicates p, q1 , . . . , qm gives rise to an operator R parameterized by predicate variables V1 , . . . , Vn , one for each occurrence of a problematic predicate in P (i.e. we have n ≥ m). R(Q1 , . . . , Qn )(x1 , . . . , xk ) holds if and only if P [x1 , . . . , xk , V1 , . . . , Vn ]. Here the replacement of Q1 , . . . , Qn replaces the n individual occurrences of q1 , . . . , qm in P . For example, the rule “stackOKHead” gives rise to the following operator:7
argsOKPointW i V (args,argsT) ::= match (read args i, read argsT i) with | None,None -> true | Some v, Some vT -> V(v,vT) | _,_ -> false
In other words, this operator lets us pick out an index i at which to reveal the fact that valOK holds, and furthermore to reveal additional information about that value by giving an appropriate argument for V . Thus the operators give a compact notation for supplying important instantiations and chaining inferences. In effect we are taking advantage of the fact that in “complex but shallow” problem domains specifying a few critical inferences can open the way for automation to do very useful amounts of work. Once again conditions apply for the argument given to predicate variables such as V . In particular, if they are used in a positive (likewise negative) position on the right of the definition then they must be given a weakening (likewise strengthening) guided reduction. For example, the guided reduction argsOKPointW 3 argsOK is a well-formed weakening reduction, because argsOK is well-formed and information preserving.
Operator for rule stackOKHead stackOKHead V1 V2 (stack,stackT) ↔ match (stack,stackT) with [],[] -> true | (h::t),(hT::tT) -> V1(h,hT) ∧ V2(t,tT) | _,_ -> false
A position within a nesting of first order connectives is defined as positive, negative or neutral in the usual way, i.e. according to the markup scheme: + ∧ +; + ∨ +; − → +; 0 ↔ 0; ¬−; ∀+; ∃+. For example, the variables V1 and V2 occur in positive positions on the right side of the definition of stackOKHead. For a guided reduction R(A1 , . . . , An ), each Ai must be well-formed, and further the guided reduction is weakening (likewise strengthening) if each Ai is:
3.6.4
Combining Reductions
It is now easy to write operators to combine guided reductions: Search Operators
• weakening (likewise strengthening) when the corresponding position in P is positive; or
(p1 OR p2) ::= λx. p1(x) ∨ p2(x) (p1 AND p2) ::= λx. p1(x) ∧ p2(x)
• strengthening (likewise weakening) when the corresponding position in P is negative.
The operator OR former is typically applied to guided reductions used to transform goals, and effectively describes multiple ways of proving the same goal. The operator AND is applied to guided reductions used to transform facts, and effectively describes how to derive multiple pieces of information from the same fact. Guided reductions built using these operators are weakening/strengthening if both arguments weakening/strengthening. The following guided reductions discard facts and goals and are respectively weakening and strengthening.
In other words, each rule defines an operator that is neutral, monotone or anti-monotone in each of its predicate arguments according to the way the predicates corresponding to the arguments are used in the body of that rule. For example, the guided reduction stackOKHead(p1 , p2 ) weakening if both p1 and p2 are weakening and strengthening if both p1 and p2 are strengthening. 7
Using Indexed Rules
Here we use curried syntax for the extra higher-order arguments.
8
Operators for discarding information
4
DoNotUse ::= λx. true DoNotProve ::= λx. false
Case Study 1: Spark
We now consider the use of our techniques to prove Proposition 3. The overall proof script required to prove the first part of Proposition 3 is:8
The if/then/else operator lets the user choose an appropriate reduction based on a condition. It is strengthening is both p1 and p2 are strengthening, likewise weakening. The => operators let us conditionally extract extra information from a fact or goal for use on those branches of a proof where the guard predicate holds:
Proof script for Spark Soundness (1) Proposition: hastype(1) (m,methT) ∧ stateOK(2) (s0,methT) --> step (m,s0) None
Conditional Guidance Problematic predicates: stateOK, stackOK, hastype, locsOK, argsOK
(if g then p1 else p2) ::= λx. (if g then p1(x) else p2(x)) (g => p1) && p2 ::= (if g then p1 else DoNotUse) AND p2 (g => p1) || p2 ::= (if g then p1 else DoNotProve) OR p2
Replace (1) by: hastypePointW pc0 Replace (2) by: stateOKRule ((is_stloc i) => stackOKHead1 stackOK && (is_starg i) => stackOKHead1 stackOK && (is_binop i) => stackOKHead2 stackOK && (is_ret i) => stackOKHead2 stackOK && (is_ble i) => stackOKHead1 stackOK && (is_pop i) => stackOKHead1 stackOK && stackOK) ((is_stloc i or is_ldloc i) => locsOKPointW loc0 && locsOK) ((is_starg i or is_ldarg i) => argsOKPointW arg0 && argsOK) Where pc0 ::= s0.pc i ::= match (read m.instrs pc0) with Some(i) -> i loc0 ::= (match i with I_stloc x -> x | I_ldloc x -> x) arg0 ::= (match i with I_starg x -> x | I_ldarg x -> x)
Examples of the use of these are given below. 3.6.5
Guided Reductions as Term Replacement
When authoring a guided reduction the user directly replaces uses of problematic predicates by applications of these operators. The correctness of this process can be determined syntactically, by checking that the weakening (likewise strengthening) guided reductions are only applied to facts (likewise goals). Guided reductions could be authored in other forms, e.g. as tactics in a theorem prover such as HOL or Isabelle. However there are important practical benefits to representing guided reductions by predicate-replacement: • The terms are type-checked in combination with the term defining the problem itself, which captures many errors early on;
The rules for the problematic predicates hastypePointW, stackOKHead1, stackOKHead2, argsOKPointW and locsOKPointW are derived immediately from the definitions given in §2. The rule stateOKRule is also derived from the definition of stateOK in §2.5 and composes choices for guided reductions for the input stack, locals and arguments. In practice we prove the entire soundness property in one step by a similar script, the proposition proved begin:
• The terms may involve proof constants from the problem specification; • The terms occur directly in position rather than as a later, disassociated operation, reducing the fragility of the guided-reduction vis-a-vis reorderings and restructurings of the problem statement. 3.6.6
Example The Spark Soundness Theorem
Finally, back to our example. The guided reduction in §3.5 can be formalized by replacing the predicate stackOK in formula (1) in §3.3 by
hastype(m,methT) ∧ stateOK(s0,methT) --> match step(s0) with | None -> false | Some(s1) -> state_ok(s1,methT)
(i = I_dup) => (stackOKHead valOK stackOK) && stackOK and the occurrence in formula (2) in §3.3 by
The overall proof script for this property is under 40 lines. The really promising thing about these proof scripts is just how much has not been mentioned. In particular, if we examine the definitions in §2 and the Appendix, no mention has been made of functions such as binop, step, pop2, |>, dests, effector instr_hastype. The decision procedure is fed a very large term with all the definitions of these functions expanded. The process of case-splitting through all the instructions and all the failure/success cases implicit in the execution and verification semantics happens automatically.
(i = I_dup) => (stackOKHead valOK (stackOKHead valOK stackOK)) || stackOK This replacement is justified because the guided reductions are respectively weakening and strengthening. The application of the guided reduction simply involves expanding all definitions of operators and non-problematic predicates and functions and applying the decision procedure, which is now able to check the validity of the resulting formula.
8 This is a tidied-up version of the actual proof script, which is a little more arcane.
9
primitive operations.11 The fragment included a subtyping relation with appropriate rules. Some aspects of this proof are beyond the scope of this paper, in particular the use of guided reductions in the presence of inductively defined relations over recursive term structures. Apart from this issue the core technique used was essentially as described in §3. The Baby-IL instuctions for which we have verified the corresponding soundness property include:
Proof Times After the application of this reduction the validity of the resulting formula is passed to the decision procedure and a counter-example, if any, is returned. The overall size of the expanded problem sent to the decision procedure would run for hundreds of pages (many sub-terms are shared within the problem). Our implementation of the SVC decision procedure takes 14.4s to prove the first part of the soundness proof9 , with 5217 case-splits and 1377 unique terms constructed.
• • • • • • • • • • • • •
Split Hints We have always found that proof times can be dramatically reduced by specifying some simple and natural case-split orderings. For example, if we specify that the first split should be on the kind of instruction, the time reducses to 0.25s with 109 case-splits.10 4.1
Producing Counter-examples
Consider what happens if we omit the check for vT = vT2 for the instruction starg in the definition of effect from the Appendix. Type soundness no longer holds, but what does this do to our proof? A 30 line counter-example is printed, containing, among other things:
The components of the specification are:
Sample Counter-example: Predicate Form
• A term model of programs consisting of 20 lines of ML datatype definitions;
is_starg i0 is_F (nth m.mref.argsT (starg_get0 i0)) is_Int (hd s0.stack)
• A pseudo-functional12 small-step execution semantics comprising 180 lines of ML code, including some uninterpreted operations (e.g. a function that resolves virtual call dispatch is assumed);
This suggests that the verifier is unsound when the instruction is a starg, the type of the argument being written is F, i.e. a floating point number is expcted, but the first value on the stack is a Int, i.e. an integer value - the bug has been detected! Counter-example predicates like these can also be solved to give a sample input (with some unknowns) that exposes the error – this shortens the counter-example:
• A functional type checker comprising 90 lines of ML code and some uninterpreted operations; • A specification of conformance akin to that found in [16].
Sample Counter-example: Structural Form
The proof assumes the following lemmas, which we have proved by inspection:
m --> { mref = { argsT = { ?x -> F; ... }; ... }; instrs = { pc0 -> I_starg ?x; ... } }; s0 --> { stack = Int ... :: ...; ... }
• Weakening/strengthening rules about the problematic predicates - effectively we used these as axiomatic specifications of these predicates.
Here “...” and ?x represent places where the counterexample is not concrete, and a -> b indicates that index a in a list contains value b. The case study could be made concrete by searching for arbitrary terms which satisfy the remaining non-structural constraints, but we have not attempted to do this. 5
Loading a constant; Sequencing operations; Conditionals; While loops; Call instructions (passing multiple parameters); Virtual call instructions; Loading and storing arguments; Boxing structs (inline valus) to objects; Creating new objects; Creating new struct (inline) objects; Loading the address of an argument; Loading the address of a field in an object; Loading and storing via a pointer.
• 11 lemmas about recursive operations such as “write into a nested location within a struct (inline value) given a path into that value”; • One lemma about the existence of a heap typing that records the types of all allocations that will occur during execution;
Case Study 2: Investigating Baby-IL
• A lemma connecting the typechecking process to an initial term-conformance predicate, of the kind stated and proved in Chapter 7 of [16].
Our second case study consists of verifying the type soundness of a small-step term rewriting system corresponding to the Baby-IL fragment of MS-IL described in [4]. Our verification is up to a number of assumptions about some
11 We were simultaneously developing a written proof for this subset, and our primary aim was to use machine-assisted checking to help clarify our understanding of that proof. Thus for our purposes it was not essential to discharge all the assumptions. 12 By which we mean a relation written in the form R = lfp(F ) for some functional F and where F is itself defined using a function that defines a relation by returning an option.
9
This is using a 750 MHz Pentium III. Noted that our implementation of the SVC procedure is neither as optimized as SVC and does not include the full set of heuristics implemented in that system. SVC and its successor CVS should ran faster still. 10
10
The overall guided reduction was specified in tabular form, with 41 rows (each corresponding to one rewrite action in the execution semantics), and 7 columns (specifying guided reductions for the input heap, stack frames, input term, execution step, output heap, output stack frames and output term). The table was sparse, with 60% of entries being an entry indicated that no special reasoning about the information was needed for that branch of the proof. This left around 120 entries for the essence of the proof, each a couple of identifiers long. In contrast the proof performed in [16] took around 2000 lines of proof script, despite applying very considerable amounts of proof automation. The proof was developed over the course of a week, again in contrast to previous efforts which took many months. We typically executed the proof for each the instruction independently (by commenting parts of an induction theorem), and each instruction took under 10 sec. to verify. We also found many mistakes in both our verifier and our model of execution during this process, though this is normal for specifications that are not tested by other means. 6
will reveal if they transfer in practice. For example, applying the techniques outlined in this paper to the recent extensive ASM descriptions of Java and the JVM [14] could form the starting point for mechanically checked proofs of properties for much larger formal models. The proof guidance technique described in this paper is novel, especially the automatic generation of combinators for a proof algebra from a specification of basic axioms for problematic predicates and functions. We have not described how inductive and other second-order proof techniques fit into this framework. It would also be very interesting to apply similar techniques to other problem domains. For example, combining theorem proving with model checking is a significant area of current research, and there is a great need for techniques to guide the process of decomposing hardware verification properties into problems that can be independently model checked. Guided reductions may have a role to play here. Acknowledgements John Matthews commented on a draft of this paper and worked on early attempts to prove facts about our formal models. We also thank Corin Pitcher and Paul Hankin for working with us on our attempts to formalize and test aspects of the .NET CLR.
Conclusion and Future Directions
This paper has presented a new semi-automated technique for mechanically checking the type soundness of virtual machines, and two case studies applying that technique. It is the first time SVC-like decision procedures have been extensively applied to a problem domain that was previously exclusively tackled using interactive theorem proving. The manual part of the proof technique is based on an algebra of “guided reductions” built using combinators that are automatically derived from definitions and rules for the predicates and functions being manipulated. The guided reductions allow the user to control the unwinding of recursive definitions and to give instantiations for certain crucial first-order rules. This gives a compact but controlled way of specifying the information necessary for different parts of a proof, and the proof hints can be combined to express finite proof search and conditional guidance. The automated part of the proof uses SVC-like validity checking for a quantifier free theory of arithmetic and structured terms, and although exponential in theory has proved efficient and controllable in practice, sometimes by giving hints regarding case-split orderings. This mirrors experiences with using these algorithms for hardware verification [1]. We have also described two case studies applying these techniques to fragments of the .NET CLR’s intermediary language, and when contrasted with interactive theorem proving, these case studies have certainly benefited from the increased use of automation. We found that the semiautomatic proof checking process was truly effective in helping us understand the problem we were looking at, especially for the second, larger case study. The results from the case-studies, although preliminary, indicate that the problem domain is highly automatable and that it is worthwhile to pursue the quest of a disciplined combination of proof guidance and proof automation. There are many interesting future directions for this work. It is certain that further automation can be applied in this problem domain, perhaps even achieving fully automated checking for important classes of soundness properties. It is also likely that properties other than type soundness will benefit from such treatment. In addition, applying our present combination of techniques to new specifications
References [1] Clark Barrett, David Dill, and Jeremy Levitt. Validity checking for combinations of theories with equality. In Mandayam Srivas and Albert Camilleri, editors, Formal Methods In Computer-Aided Design, volume 1166 of Lecture Notes in Computer Science, pages 187–201. Springer-Verlag, November 1996. Palo Alto, California, November 6–8. [2] Clark W. Barrett, David L. Dill, and Aaron Stump. A generalization of Shostak’s method for combining decision procedures. In Frontiers of Combining Systems (FROCOS), Lecture Notes in Artificial Intelligence. Springer-Verlag, April 2002. [3] Anatoli Degtyarev and Andrei Voronkov. Equality reasoning in sequent-based calculi. In Handbook of Automated Reasoning, Volume I, pages 611–706. Elsevier Science and MIT Press, 2001. [4] A. Gordon and D. Syme. Typing a multi-language intermediate code. In 27th Annual ACM Symposium on Principles of Programming Languages, January 2001. [5] M.J.C. Gordon and T.F. Melham. Introduction to HOL: A theorem-proving environment for higher-order logic. Cambridge University Press, 1993. [6] J.G. Henriksen, J. Jensen, M. Jørgensen, N. Klarlund, B. Paige, T. Rauhe, and A. Sandholm. Mona: Monadic second-order logic in practice. In Tools and Algorithms for the Construction and Analysis of Systems, First International Workshop, TACAS ’95, LNCS 1019, 1995. [7] Xavier Leroy. The Objective Caml system, documentation and user’s guide. INRIA, Rocquencourt, 1999. Available from http://caml.inria.fr.
11
Items, states:
[8] Serge Lidin. Inside Microsoft .NET IL Assembler. Microsoft Press, 2002. [9] Microsoft Corporation. Microsoft IL Assembly Programmer’s Reference Manual, July 2000. Part of the .NET Framework Software Development Kit, distributed on CD at the Microsoft Professional Developers Conference, Orlando, Florida, July 11–14, 2000. [10] Tobias Nipkow, David von Oheimb, and Cornelia Pusch. µJava: Embedding a programming language in a theorem prover. In F.L. Bauer and R. Steinbr¨ uggen, editors, Foundations of Secure Computation. Proc. Int. Summer School Marktoberdorf 1999, pages 117–144. IOS Press, 2000.
item
type state = {args: item list; locs: item list; stack: item list; pc: addr}
execution state items in arguments items in local variables items on the stack program counter
integer IEEE f.p. number
We use several auxiliary definitions to define the step function: Pushing, Updating, Branching, Failing, Popping:
[11] M. Norrish. C formalised in HOL. PhD thesis, University of Cambridge, 1998.
let push x s = Some {s with stack=x::s.stack} let starg x y s = Some {s with args= write x y s.args} let stloc x y s = Some {s with locs= write x y s.locs} let br addr s = Some({s with pc=addr}) let next s = br (s.pc+1) s let error (s: string) = None let pop s = match s.stack with [] -> error "pop" | (h::t) -> Some(h,{s with stack=t})
[12] C. Pusch. Proving the soundness of a Java bytecode verifier specification in Isabelle/HOL. In TACAS’99, Lecture Notes in Computer Science. Springer Verlag, 1999. [13] Z. Qian. A formal specification of JavaTM virtual machine instructions for objects, methods and subroutines. In J. Alves-Foss, editor, Formal Syntax and Semantics of Java, volume 1532 of Lecture Notes in Computer Science, pages 271–312. Springer Verlag, 1999. [14] Robert St¨ ark, Joachim Schmid, and Egon B¨ orger. Java and the Java Virtual Machine. Springer Verlag, 2001.
(Here “::” is the “cons” operation for lists, and pattern matching is the left-to-right satisfaction of equational constraints. A record update such as {s with stack=x::s.stack} denotes the outcome of making a fresh copy of the record s, and then updating its stack field to be x::s.stack, which itself denotes the outcome of consing the value x onto the front of the list s.stack. A list update such as write x y s.args denotes the outcome of making a fresh copy of the list s.args, replacing the entry at idx by y. The list is unchanged if the index is out-of-range.) The push function adds an element to the stack, and cannot fail, hence it always returns a Some value. The starg and stloc functions update an argument or a location, and cannot fail. The functions br and next update the program counter in the state. The first implements a branch to an arbitrary address; the second branches to the next address in sequence. The function error represents an untrapped error by returning None. Its string argument, which it discards, documents the cause of failure. During testing, we use the argument for debugging. The function pop attempts to pop an item off the stack. If the stack is empty, it calls error to signal an untrapped error. Otherwise, the Some constructor indicates error-free execution. When we come to program the step function, we often have to deal with intermediate calls to functions, like pop, that signify untrapped errors by returning None. A common pattern is for the whole call to step to return None if any of its intermediate calls return None. It is convenient to express this pattern using an infix combinator |>. Suppose that res has type Some S and f and has type S -> Some T. Then res |> f has type Some T, and means: evaluate res, and if its result is Some x, evaluate f x, but if the result is None, return None at once.
[15] R. Stata and M. Abadi. A type system for Java bytecode subroutines. In Proceedings POPL’98, pages 149– 160. ACM Press, 1998. [16] D. Syme. Declarative Theorem Proving for Operational Semantics. PhD thesis, University of Cambridge, 1998. [17] M. VanInwegen. The Machine-Assisted Proof of Programming Language Properties. PhD thesis, Department of Computer and Information Science, University of Pennsylvania, May 1996. [18] D. von Oheimb and T. Nipkow. Machine-checking the Java specification: Proving type-safety. In J. AlvesFoss, editor, Formal Syntax and Semantics of Java, volume 1532 of Lecture Notes in Computer Science, pages 119–156. Springer Verlag, 1999. [19] Philip Wadler. Comprehending monads. Mathematical Structures in Computer Science, 2:461–493, 1992. (Special issue of selected papers from 6’th Conference on Lisp and Functional Programming.). [20] Andrew K. Wright and Matthias Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38–94, 1994. A
type item = Int of int | Float of float
The Spark Execution Semantics
We refer to the machine values that are passed as arguments, stored in locals or on the stack, as items. An item, of ML type item, is either a signed integer or a float. A state, of ML type state, consists of the method being executed, arguments, local variables, an evaluation stack, and a program counter. Code is included as part of the state, as this would, in principle, allow us to model dynamic loading and JIT compilation.
Sequential composition of computations: let (|>) res f = match res with Some x -> f x | None -> None
12
Using the |> combinator, the following function attempts to pop two items off the stack.
(For the purposes of this paper, the machine executes an infinite loop when a “return” is encountered. In our actual formalization we have represent the completion of execution by returning a special result from the step function.) This completes our formalization of the Spark execution engine.
Pop two items off the stack: let pop2 s0 = pop s0 |> fun (v1,s1) -> pop s1 |> fun (v2,s2) -> Some (v1,v2,s2)
pop v1 off s0, leaving stack s1 pop v2 off s1, leaving stack s2 return residual s2 plus v1, v2
B This code sequence exemplifies how we write functional models of imperative operations, with the |> combinator acting as a sequential composition. This imperative style improves the readability of our specifications by hiding many details of error handling. Still, because we can interpret the notation as a logical function, we can still conveniently reason about the code. This style is often known as monadic functional programming; see [19] for an introduction. It is a convenience rather than a necessity; if error handling were explicit our automatic analyses would still be possible. By combining previously described functions, we get the following function for evaluating binary operations on integer values:
The Spark Verification Semantics
To implement these functions we introduce auxiliary functions to push and pop types. They are analogous to the functions push, pop and pop2 on item stacks. Push and pop types: let pushT x y = Some(x::y) let popT (xs: stackT) = match xs with | [] -> error "popT" | x::xs’ -> Some (x,xs’)
Execute binary arithmetic instructions: let binop f1 f2 s0 = pop2 s0 |> fun (v1,v2,s1) -> pop v1, v2 off, leaving s1 match v1,v2 with determine style of operation Int i1,Int i2 -> integer operation push (Int (f1 i1 i2)) s1 execute f1 and push | Float r1,Float r2 -> floating point operation push (Float (f2 r1 r2)) s1 execute f2 and push | _ -> error "binop"
Next, the ML step function formalizes the description of instruction execution from §2.2.
push a type onto the stack pop one type off the stack
Next, we define the effect function. A call effect instr msig stackT returns None if instr is incompatible with stackT, and otherwise returns Some stackT’ where stackT’ is the effect of simulating instr on stackT. An auxiliary function binopT deals with simulating the add and mul instructions. Effect of an instruction: let binopT stackT = popT stackT |> fun (vT1,stackT1) -> popT stackT1 |> fun (vT2,stackT2) -> match vT1,vT2 with | I,I -> pushT I stackT2 | F,F -> pushT F stackT2 | _ -> error "binopT"
A single step of execution: let step p s = read p.instrs s.pc |> fun instr -> match instr with | I_ret -> pop s |> (fun (v,s’) -> if s’.stack=[] then Some(s) else error "stack not empty on return") | I_ldarg x -> read s.args x |> (fun v -> push v s |> next) | I_starg x -> pop s |> (fun (v,s’) -> starg x v s’) |> next | I_ldc x -> begin match x with Const_I a -> Some(Int a) | Const_F b -> Some(Float b) end |> (fun v -> push v s) |> next | I_ldloc x -> read s.locs x |> (fun v -> push v s |> next) | I_stloc x -> pop s |> (fun (v,s’) -> stloc x v s’) |> next | I_br off -> br off s | I_ble off -> pop s |> (fun (v,s’) -> if (match v with Int a -> a b pop s |> (fun (v,s’) -> Some(s’)) |> next | I_add -> binop (+) (+.) s |> next | I_mul -> binop (*) (*.) s |> next | _ -> error "step"
13
let effect (msig,instr,stackT) = match instr with | I_ldloc n -> read msig.locsT n |> fun vT -> pushT vT stackT | I_stloc n -> read msig.locsT n |> fun vT -> popT stackT |> fun (vT2,stackT’) -> if vT = vT2 then Some stackT’ else error "stloc" | I_ldarg n -> read msig.argsT n |> fun vT -> pushT vT stackT | I_starg n -> read msig.argsT n |> fun vT -> popT stackT |> fun (vT2,stackT’) -> if vT = vT2 then Some stackT’ else error "starg" | I_ldc v -> (match v with | Const_I _ -> Some I | Const_F _ -> Some F | _ -> error "push") |> fun vT -> Some(vT::stackT) | I_pop -> popT stackT |> fun (vT,stackT’) -> Some stackT’ | I_add -> binopT stackT | I_mul -> binopT stackT | I_br addr -> Some stackT | I_ble addr -> popT stackT |> fun (vT,stackT’) -> if vT = I or vT = F then Some stackT’ else error "ble" | I_ret -> popT stackT |> fun (vT,stackT’) -> if vT = msig.retT & stackT’ = [] then Some stackT’ else error "ret" | _ -> error "effect"
The dests function computes expected results of the typing simulation for a particular instruction. A call dests(m, i, pre) uses effect to compute the postcondition post of simulating instruction i of method m given precondition pre. Then it returns a list of pairs (j,post) such that j is reachable from i. New destinations from instruction simulation: let dests m (i,pre) = read m.instrs i |> (fun instr -> effect (m.msig,instr,pre) |> (fun post -> match instr with | I_ble j -> Some([ (j, post); (i+1, post) ]) | I_br j -> Some([ (j, post) ]) | I_ret -> Some([ ]) | _ -> Some([ (i+1, post) ])))
14