Reliable Frameworks for Extensible Compilers∗

Jason Hickey, Nathan Gray, Aleksey Nogin, Cristian Țăpuș
Caltech, M/C 256-80, 1200 E California Blvd, Pasadena, CA 91125, USA
{jyh,n8gray,nogin,crt}@cs.caltech.edu

ABSTRACT We present a new methodology for compiler design, based on the use of a transformation logic defined within an existing general-purpose logical framework. We demonstrate how this methodology can be used to address several central issues in compiler design and implementation: ease of implementation, extensibility, compositionality, and trust.

Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors—Translator writing systems and compiler generators; D.2.4 [Software Engineering]: Software/Program Verification—Formal methods

General Terms Reliability, Verification

Keywords Formal compiler, compositional compiler, extensible compiler, higher-order abstract syntax, logical programming environment

1.

INTRODUCTION

We present a new methodology for compiler design, based on the use of a transformation logic defined within an existing general-purpose logical framework. In our approach the central part of the compiler is a set of specifications in a formal language; these specifications follow a standard textbook account of programming language semantics almost to the letter.

∗ This work was supported in part by the DoD Multidisciplinary University Research Initiative (MURI) program administered by the Office of Naval Research (ONR) under Grant N00014-01-1-0765, the Defense Advanced Research Projects Agency (DARPA), the United States Air Force, the Lee Center, and by NSF Grant CCR 0204193.


Most of the work required to turn these specifications into an actual compiler is handled automatically by MetaPRL, an existing general-purpose formal toolkit. We demonstrate how this methodology can be used to address several central issues in compiler design and implementation: ease of implementation, extensibility, compositionality, and trust.

The formal framework provides a well-defined syntax of terms, types, and programs. We represent programs and program transformations using higher-order abstract syntax (HOAS); binding, scoping, and substitution are handled automatically by the framework. HOAS also allows mixing the object language with the meta-language, explicitly expressing the intermediate states of the compilation process. In addition, the framework provides a rich tactic language for guiding proofs and transformations and for automatically extracting such guidance information from annotated specifications. Finally, the framework provides us with an interactive program refinement mode (initially designed for interactive formal proof development), which, together with the explicit meta-language, proved to be an extremely powerful debugging tool.

Compositionality is a well-established principle in the construction of logical theories. For example, a standard development of the predicate calculus might define a core logic including variables and implication, and each of the other connectives, like conjunction, disjunction, and quantification, would be added as a logical extension to the core theory. While the consistency of the full calculus is not necessarily ensured by the consistency of the parts, the framework provides both a methodology for compositionality (all logics are to be viewed as open-ended, and extensions must act as conservative extensions) as well as a mechanism for extensibility.

In the compiler domain, we take a similar approach to compositionality and extensibility. The compiler we present defines a core theory for System F (variables, functions, application, and second-order quantifiers). The core is divided into transformation stages including type inference, type checking, CPS transformation, closure conversion, and assembly code generation. Each stage consists of a set of formal rules for transformation, as well as a table-driven extensible tactic to guide the transformation. Additional components for Boolean values, arithmetic, tuples, arrays, recursive functions, etc., are defined as independent extensions. Each extension defines its own set of formal rules for each transformation stage, and it adds new strategy code to the tactic used to control that stage. By ensuring locally that the component acts as a conservative extension

of the core and the other components it is derived from, we get a strong guarantee that in the compiler for the entire language, with all its extensions, there will be no unexpected interactions between different compiler modules or different language features.

Another extremely important and challenging issue in compiler development is reliability and trust. In the context of a compiler, it is useful to make a distinction between trusted code and untrusted code. Trusted code is code where flaws have the potential to cause the compiler to produce incorrect output for some input program. Flaws in untrusted code may cause the compiler to fail to produce output on some valid input programs, but they cannot cause the compiler to produce incorrect output. When a compiler is implemented in a general-purpose language, it is often difficult to isolate the parts of the compiler that must be trusted, and in the worst case the entire code base must be trusted. Trust is also a central issue in compositionality and ease of implementation: if the invariants that specify the compiler involve complex interactions between many parts of the implementation, maintaining and extending the compiler can be quite difficult.

In our approach, compiler transformations are each defined in two parts: a set of trusted transformation rules and untrusted tactic code to direct the transformation strategy. The transformation rules are defined in a formal logic using notation similar to that in the literature; they represent only a small part of the compiler, and they are verifiable. That is, the entire trusted code path is small, precisely and formally defined, and it may be validated against a program semantics if desired. A number of guarantees are provided by the framework itself. For example, the HOAS implementation ensures that even incorrectly specified or invalid program transformations cannot violate scoping or accidentally capture a variable. Even the framework implementation does not have to be trusted: the tool is capable of retaining and providing a full log of the program transformations performed during the compilation process, and if an extreme level of confidence is needed, an independent checker could be implemented to verify that all steps in the log are faithful instances of the formally specified allowed transformations.

Organization. Although the principles just proposed seem reasonable enough, it is not entirely clear a priori whether they can be translated to practice. One of the biggest challenges is that all program transformations must be constructed from a fixed number of rewrite rules that each describe a pattern over a fixed number of program points. In other words, global program transformations must be composed of a sequence of local transformations, and it is not always obvious how to do this. In addition, global transformations require knowledge of the entire program syntax, which can be at odds with compositionality.

This paper is based on a case study of a compiler for an ML-like source language, compiled to assembly code for the Intel x86 machine architecture. As mentioned, the core is based on the language of System F. There are extensions for 1) additional base types like Boolean values and integers, 2) aggregates like arrays and tuples, and 3) recursive functions. The backend uses HOAS to define a scoped x86 assembly language [10]. The compiler stages include type inference, type checking, CPS transformation, closure conversion, and

assembly code generation. The compiler is implemented in the MetaPRL [9, 11] logical framework. As we show, each of the standard compiler stages can be precisely and concisely defined formally. The precision comes from using the formal notation, and the brevity follows from the rich set of tools provided by the logical framework.

We begin the account with a description of terminology (Section 2) and the overall compiler architecture (Section 3), and follow it with a description of a few of the key stages of the compiler. Type checking (Section 4.1) is a simple stage, where compositionality is quite straightforward. In contrast, type inference (Section 4.2) requires more complex control flow; nearly the entire phase is implemented informally through untrusted code, and finalized with a formal type check. We present the CPS conversion (Section 4.3) based on the work of Danvy and Filinski [5] and show how the use of HOAS and derived rules in a logical framework can make our implementation simpler than Danvy and Filinski's original account. We present closure conversion (Section 4.4) as a stage that has a particularly elegant formalization in the logical framework. Section 5 describes the production of machine code based on a scoped version of the Intel instruction set [10].

Scope of this paper. This account of formal, extensible compilers focuses on the methodology. While the work we present is intended to be verifiable, we do not specifically address the issue of verification. Even though the trusted code base is precisely defined and quite small, it is likely that a full simulation-based verification remains impractical, although other methods may be possible [7]. Also, in most cases we will present the formal part of a transformation without detailed discussion of the pre-existing logical framework mechanisms that turn these transformations into the actual compiler.

Contributions of this work include the following.

• A formal compiler architecture, designed for reliability, extensibility, and ease of implementation. The compiler is structured around language features, rather than compiler stages.

• An implementation in the MetaPRL logical framework, giving a case study for compiling an ML-like language to a typed intermediate language, then to Intel x86 assembly code.

• All transformations preserve program invariants such as scoping and well-typedness. These invariants are never violated, even temporarily.

• Transformations are stated abstractly, in the form one would find in a standard semantics textbook. The framework converts this to an implementation using pre-existing methods for proof search and automation.

• The compiler description allows free mixing of the meta-language with the object language, leading to an implementation that is simple because it is locally defined.

Another interesting aspect of our approach is the representation of arities: in our account functions may take zero or more arguments, and tuples may have zero or more components. The type system must be able to express arity information as well. We take an unusual approach, using sequent notation to represent functions, where the premises represent the function parameters and the conclusion represents the function body.
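To make the sequent representation concrete (a small illustration of our own, consistent with the notation introduced in Section 3), a two-argument function such as λs(x : Z, y : Z).x + y is written as the sequent x : Z, y : Z ⊢s x + y, so the arity of a function is simply the number of hypotheses of the sequent that represents it.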

2.

TERMINOLOGY

All logical syntax in the MetaPRL framework is expressed in the language of terms. The general syntax of all terms has three parts. Each term has 1) an operator-name (like "sum"), which is a unique name identifying the kind of term; 2) a list of parameters representing constant values; and 3) a set of subterms with possible variable bindings. We use the following syntax to describe terms:

   opname [p1; ···; pn] {v1.t1; ···; vm.tm}

where opname is the operator name, [p1; ···; pn] is the list of parameters, and {v1.t1; ···; vm.tm} is the set of subterms, each vi standing for a (possibly empty) sequence of bound variables.

All the free occurrences of the variables vi in ti are considered bound by the operator. Below are a few examples of terms that could be used in a formalization of a simple lambda calculus.

   Displayed form     Term
   1                  integer[1]{}
   λx.b               lambda[]{ x. b }
   f(a)               apply[]{ f; a }
   x + y              sum[]{ x; y }

Numbers have an integer parameter. The lambda term contains a binding occurrence: the variable x is bound in the subterm b. Each operator has a fixed arity, which includes a fixed number of parameters, a fixed number of subterms, and a fixed number of bindings for each subterm. (More specifically, if two operators have different arities, they will be considered to be distinct even if they happen to have the same opname.)

In addition to the basic term language described above, the framework also provides three special kinds of terms. The first kind is the simple first-order (object language) variables. These are the variables that can be bound by the corresponding binding occurrences. Another class of special terms are second-order (meta-language) variables, which explicitly define scoping and substitution [17]. A second-order variable pattern has the form v[v1; ···; vn], which represents an arbitrary term that may have free first-order variables v1, ..., vn. The corresponding substitution has the form v[t1; ···; tn], which specifies the simultaneous, capture-avoiding substitution of the terms t1, ..., tn for v1, ..., vn in the term matched by v. Second-order variables are used to specify logical rules and term rewrites. For example, the rule for β-reduction could be specified with the following rewrite.

   (λx.v1[x]) v2 ←[beta]→ v1[v2]

The left-hand side of the rewrite is a pattern called the redex. The v1[x] stands for an arbitrary term that may have free occurrences of the first-order variable x, and v2 is another arbitrary term. The right-hand side of the rewrite is called the contractum. The second-order variable v1[v2] substitutes the term matched by v2 for x in v1. A term rewrite specifies that any term that matches the redex can be replaced with the contractum, and vice versa.

The second-order notation can also express the lack of bound occurrences of a certain variable. The following rewrite is valid in second-order notation and would be provable in the presence of β-reduction.

   (λx.v[]) 1 ←[const]→ (λx.v[]) 2

In the context λx, the second-order variable v[] matches only those terms that do not have x as a free variable. No substitution is performed; the β-reduction of both sides of the rewrite yields v[] ←→ v[], which is valid reflexively. Normally, when a second-order variable v[] has an empty argument list [], we omit the brackets and use the simpler notation v.

The last class of special terms are sequents (sometimes also called telescope terms) of the form x1 : t1; ...; xn : tn ⊢a c. The term c is the conclusion of the sequent; the terms ti (1 ≤ i ≤ n, where n may be 0) are its hypotheses; the variables xi introduce binding occurrences (each xi is bound in all tj for j > i and in c). Finally, the term a is the sequent argument that specifies what kind of sequent it is; essentially, the argument plays the same role for sequents as the operator name plays for ordinary terms. Sequent schemas [17] may include context meta-variables that stand for arbitrary lists of hypotheses. For example, the sequent schema Γ; x : T[]; Δ[x] ⊢a[] c[x], where Γ and Δ are context variables and T, a, and c are second-order variables, stands for an arbitrary sequent with at least one hypothesis.

MetaPRL is a tactic-based prover that uses OCaml [20] as its meta-language. When a rewrite is defined in MetaPRL, the framework creates an OCaml expression that can be used to apply the rewrite. Code to guide the application of rewrites is written in OCaml, using a rich set of primitives provided by MetaPRL. MetaPRL automates the construction of most guidance code; we describe rewrite strategies only when necessary. For clarity, we will describe syntax and rewrites using the displayed forms of terms.

The compilation process is expressed in MetaPRL as a judgment of the form Γ ⊢ ⟨⟨e⟩⟩, which states that the program e is compilable in logical context Γ. The meaning of the ⟨⟨e⟩⟩ judgment is defined by the target architecture: a program e′ is compilable if it is a sequence of valid assembly instructions. The compilation task is a process of rewriting the source program e to an equivalent assembly program e′.
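As a small worked instance of this matching (our own illustration, not from the original text), consider applying the β rewrite above to the term (λx. x + 1) 2. The redex pattern matches with v1[x] := x + 1 and v2 := 2, so the contractum v1[v2] is the term 2 + 1; the substitution is capture-avoiding by construction, since it is performed by the framework rather than by the rule author.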

3.

COMPILER OVERVIEW

A compiler is defined by a sequence of transformations that take a program in a source language and translate it to a program in a target language. In this case study, the full source language is an ML-like source language with type inference and higher-order functions. The guiding principles behind the architecture are compositionality and extensibility. The sequence of transformations is defined first for a core language that is as minimal as possible, and each transformation in the core is designed to be extensible. Additional programming language constructs are added as components that define extensions to each of the stages in the core. Figure 1 shows a diagram of the compiler architecture for the case study, where the core and the extensions are represented horizontally. Extensions do not have to define code for each of the stages; for example, closure conversion applies only to functions, and the other extensions may ignore it. The extensions may also have dependencies on one another, as shown by the arrows on the left of each extension: tuples require integers, which require general operations for arithmetic, which require Boolean values for relations.


Figure 1: The high-level compiler architecture is designed around a sequence of transformations for a core language based on the polymorphic lambda calculus. Each extension defines new types and values, as well as an extension to each of the core stages. The vertical arrows indicate extensions to core stages; the code is structured horizontally. The numbers within the stages indicate lines of trusted code.

The compiler includes an initial informal phase that uses the Phobos extensible parser to convert the textual source code to the term representation used by the logical framework [6]. The syntax for the typed intermediate language for the case study is shown in Figure 2. The source language itself is similar, with the types erased. The syntax is shown in beautified form; internally each of the expressions and types uses native MetaPRL notation. The arities of functions, application, and tuples are unconstrained. Internally, functions and their types use sequent notation. For example, the sequent x1 : t1, ..., xn : tn ⊢κ e is used to represent the function λκ(x1 : t1, ..., xn : tn).e. There are three kinds of functions and application: λr represents a recursive function (f is the recursive binding); λs represents a "normal" function; and an application e(e1, ..., en : t1, ..., tn)c represents a closure application (the runtime passes the arguments as a tuple).

4.

KEY TRANSFORMATIONS

For illustration, we present several key stages of the compiler that exemplify both the formal nature of the compilation process and its compositionality. We wish to illustrate both that 1) the formal definition of the compiler transformations is natural, and 2) the methodology is compositional. We present the following transformations.

Type checking: this is an example of a straightforward stage that one would expect to work well in a logical framework.

Type inference: in contrast, type inference is not nearly as straightforward; however, the compositional formulation is quite natural.

CPS conversion: we present a very straightforward implementation based on the ability of the framework to combine the meta-language and the object language, and we show how the tail-call optimizations can be derived formally from eta-reduction.

Closure conversion: closure conversion has an elegant formalization that avoids the usual dataflow analysis of traditional compilers.

4.1

Type checking

Type checking is often specified as a set of inference rules in a sequent calculus. This, of course, has a natural formulation in a logical framework. A few of the rules are shown in Figure 3.

   Γ, x : t, Δ ⊢ x : t                                            (axiom)

   Γ ⊢ ti = si (1 ≤ i ≤ n)    Γ, x1 : t1, ..., xn : tn ⊢ e : t
   ------------------------------------------------------------  (abs)
   Γ ⊢ λs(x1 : t1, ..., xn : tn).e : (s1, ..., sn) → t

   Γ ⊢ e : (t1, ..., tn) → t    Γ ⊢ ei : ti (1 ≤ i ≤ n)
   ------------------------------------------------------------  (app)
   Γ ⊢ e(e1, ..., en : t1, ..., tn) : t

   Γ ⊢ e1 : B    Γ ⊢ e2 : t    Γ ⊢ e3 : t
   ------------------------------------------------------------  (if)
   Γ ⊢ if e1 then e2 else e3 : t

Figure 3: A few of the typing rules for the calculus. The rules are stated in the component to which they belong. For instance, the axiom, abs, and app rules are part of the core theory, while the if rule is part of the Boolean component. All rules from all included extensions are used during type checking.

When the compiler is built, the framework collects the inference rules from all the extensions included in the compiler and builds a lookup table. During the type checking phase of the compilation process the framework builds the type checking theorem using proof search. Normally, type checking is syntax directed, and we only need to provide the system with the type-checking rules (annotated as belonging to the type checking stage); the rest happens automatically. However, for more expressive languages, other cases may arise: the proof search may not be syntax directed, or it may even be undecidable. In either case, the programmer can still intervene and provide the framework with explicit code for directing the proof search. To give an idea, here is how the if rule looks in the MetaPRL native syntax:

   prim if_intro {| intro [] |} :
      sequent{ <H> >- 'e1 in TyBool } -->
      sequent{ <H> >- 'e2 in 'ty } -->
      sequent{ <H> >- 'e3 in 'ty } -->
      sequent{ <H> >- If{ 'e1; 'e2; 'e3 } in 'ty }

Expressions

   Core language
      e ::= x                                       Variables
          | (e : t)                                 Type constraint
          | λκ(x1 : t1, ..., xn : tn).e             Functions
          | e(e1, ..., en : t1, ..., tn)κ           Function application
          | Λ(α1, ..., αn).e                        Type abstraction
          | e[t1, ..., tn]                          Type application

   Boolean values
          | true | false                            Constants
          | if e then e else e                      Conditional

   Integers
          | i                                       Integer constants
          | e binop e                               Arithmetic
          | e relop e                               Relations

   Tuples
          | (e1, ..., en)                           Tuples
          | e.i                                     Projection

   Recursive functions
          | λr(x1 : t1, ..., xn : tn, f : t).e

Types

      t ::= ⊥                                       Empty type
          | ⊤                                       All programs
          | (t1, ..., tn) →κ t                      Function types
          | ∀(α1, ..., αn).t                        Polymorphism
          | B                                       Boolean type
          | Z                                       Integer type
          | t1 ∗ ··· ∗ tn                           Product type

Binary operators

      binop ::= + | − | ···                         Binary operations
      relop ::= < | ≤ | ···                         Binary relations

Function kinds

      κ ::= s | c | r

Figure 2: The typed intermediate language is based on the polymorphic lambda calculus. Extensions add Boolean values, arithmetic, tuples, arrays (not shown), and recursive functions. The source language is a type-erased version of the intermediate language.

In this rule, prim is a MetaPRL keyword that signifies that the rule is being added as a primitive axiom, in other words, as trusted code (as opposed to a theorem, which would be derived and would not have to be trusted). if_intro is the name of the rule. The {| intro [] |} annotation signifies that the rule should be added to the syntax-directed type-checking table [8]. Finally, <H> is the MetaPRL notation for the context Γ, and the 'v syntax is used to input a second-order variable v. These five lines are the only code that needs to be added to the compiler for it to be able to type check if-then-else expressions.

4.2

Type inference

Type inference is an example of a transformation where the main algorithm is informal (that is, it is specified using mostly OCaml code rather than the term language). The type inference algorithm is an informal procedure that, given a program without type annotations, constructs an equivalent program with explicit typing. The formal part of the transformation is quite brief, as shown in the rule below. The erase(e) term performs a type erasure on the expression e. The rule states that the type-erased version of an expression e is compilable if e is compilable and well-typed with some type t. The objective of type inference is to determine the fully-typed program e and its type t.

   Γ ⊢ ⟨⟨e⟩⟩    Γ ⊢ e : t
   ------------------------  (infer)
   Γ ⊢ ⟨⟨erase(e)⟩⟩

We construct type inference as a compositional version

of algorithm W [4]. We rely on a unification function unify(s, t1, t2) provided by the MetaPRL toolkit: it takes a substitution s and two terms t1, t2, and augments the substitution to unify the two terms (if possible). The inference procedure is again syntax directed. For each kind of expression, we define a function that takes five arguments (infer, T, V, s, e): infer is the type inference function itself, T is a set of type variables that are bound in expression e, V is a type environment for the variables bound in e, s is the current substitution, and e is the expression whose type is being inferred. The procedure produces a triple (e′, s′, t′). For example, the following procedure infers the type for a variable.

   let infer_var(infer, T, V, s, v) =
      if v ∉ dom(V) then error("unbound variable v");
      (v, s, V(v))

For functions, the process is similar. The parameters are given fresh types, and the infer function is used to perform type inference on the function body.

   let infer_lambda(infer, T, V, s, λκ(x1, ..., xn).e) =
      let V′ = V with (x1 : α1, ..., xn : αn) for fresh type variables α1, ..., αn in
      let (e′, s′, t′) = infer(T, V′, s, e) in
         (λκ(x1 : α1, ..., xn : αn).e′, s′, (α1, ..., αn) → t′)

Applications make use of unification.

   let infer_apply(infer, T, V, s, e(e1, ..., en)) =
      let (e′, s, t′) = infer(T, V, s, e) in
      let (e′i, s, t′i) = infer(T, V, s, ei) in     (for each 1 ≤ i ≤ n)
      let s = unify(s, t′, (t′1, ..., t′n) → α) in
         (e′(e′1, ..., e′n : t′1, ..., t′n), s, α)
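To make the extension protocol concrete, the following self-contained OCaml sketch (our own illustration, not the paper's MetaPRL code) shows how a Boolean component could contribute an inference case for if-expressions, following the (infer, T, V, s, e) signature and the unify operation described above. The toy types and the simplified unifier are assumptions made for this example only.

   (* Toy types and expressions for the illustration only. *)
   type ty = TyBool | TyInt | TyVar of int
   type exp = Bool of bool | Int of int | If of exp * exp * exp

   (* A substitution maps type variables to types. *)
   module S = Map.Make (Int)
   type subst = ty S.t

   let rec resolve (s : subst) (t : ty) : ty =
     match t with
     | TyVar i -> (match S.find_opt i s with Some t' -> resolve s t' | None -> t)
     | _ -> t

   (* A very simplified unifier: no occurs check, no function types. *)
   let unify (s : subst) (t1 : ty) (t2 : ty) : subst =
     match resolve s t1, resolve s t2 with
     | TyVar i, t | t, TyVar i -> S.add i t s
     | t1', t2' when t1' = t2' -> s
     | _ -> failwith "type error"

   (* The inference case an if-then-else extension would register:
      the condition unifies with TyBool, the two branches unify with
      each other, and the result is the (resolved) branch type. *)
   let infer_if infer (tvars, env, s, e) =
     match e with
     | If (e1, e2, e3) ->
         let e1', s, t1 = infer (tvars, env, s, e1) in
         let s = unify s t1 TyBool in
         let e2', s, t2 = infer (tvars, env, s, e2) in
         let e3', s, t3 = infer (tvars, env, s, e3) in
         let s = unify s t2 t3 in
         (If (e1', e2', e3'), s, resolve s t2)
     | _ -> invalid_arg "infer_if"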


As in the type checking case, the code here is very straightforward, and MetaPRL takes care of combining the individual functions from the compiler core and all the included extensions into the syntax-directed lookup table that is then used to perform the actual type inference:

   let infer_type tbl =
      let rec infer (T, V, s, t) =
         let inf =
            try lookup tbl t with
               Not_found -> report_internal_error
         in
            inf (infer, T, V, s, t)
      in
         (fun t -> infer (empty_T, empty_V, empty_s, t))

4.3

CPS conversion

We implement the CPS conversion by adding a new term to the meta-language, CPS{e; t; v.c[v]}, where the first argument e is the expression that is being converted, the second argument t is the type of that expression, and the third argument is the meta-continuation of the CPS process. In other words, c is the rest of the program and v marks the location where the CPS conversion of e should go. The following rule specifies CPS for object variables.

   CPS{!x; t; v.c[v]} ←[cps var]→ c[!x]

The notation !x is MetaPRL syntax for first-order variables that are bound outside of the local scope of the rewrite rule. In this rule, the meta-continuation is consumed: the rewrite puts the variable into the appropriate location and returns the whole expression. In the rule for let expressions, a new meta-continuation is created.

   CPS{let v1 : t1 = e1 in e2[v1]; t2; v2.c[v2]}
      ←[cps let]→
   CPS{e1; t1; v3. let v1 : TyCPS{t1} = v3 in CPS{e2[v1]; t2; v2.c[v2]}}

TyCPS here is a meta-term that is used to specify the CPS conversion for types (adding an extra argument to all function types), similarly to how the CPS term is used to specify the CPS conversion for expressions. The rule for the CPS of applications could be specified the following way:

   CPS{f(es : ts); t; v.c[v]}
      ←[cps apply]→
   CPS{f; ts → t; vf.
      CPS{es; ts; ve.
         let c2 = λs v : TyCPS{t}.c[v] : (TyCPS{t} → ⊥) in
            vf(c2, ve : (TyCPS{t} → ⊥), TyCPS{ts})}}

In our implementation we add a meta-let operation to the meta-language.

   letm v = e1 in e2[v] ←[meta let]→ e2[e1]

Using this operation, the cps apply rule is written as the following.

   CPS{f(es : ts); t; v.c[v]}
      ←[cps apply]→
   CPS{f; ts → t; vf.
      CPS{es; ts; ve.
         letm t′ = TyCPS{t} in
         letm t″ = t′ → ⊥ in
         let c2 = λs v : t′.c[v] : t″ in
            vf(c2, ve : t″, TyCPS{ts})}}

This is more efficient, as the type t only has to be converted once, not three times. Again, the ability to combine the object language with the meta-language yields very compact, straightforward, and precise formal code. The ability to manipulate the meta-continuations also makes the rules for the conversion of the argument lists very concise.

   CPS{e1 :: es; t1 :: ts; v.c[v]} ←[cps args cons]→ CPS{e1; t1; v1.CPS{es; ts; vs.c[v1 :: vs]}}

   CPS{(); (); v.c[v]} ←[cps args nil]→ c[()]

Next, we define the tail-recursive version of the transformation as TailCPS{e; t; k} := CPS{e; t; v.k(v)}. Using this definition we formally derive the tail-call optimizations from the eta-reduction rule. Note that we use meta-language notation in place of Danvy and Filinski's "static" operators @ and λ [5].
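As a small worked instance of these definitions (our own illustration, not from the original text), TailCPS{!x; t; k} = CPS{!x; t; v.k(v)} reduces by the cps var rule directly to k(!x). Similarly, for a trivial let expression,

   CPS{let y : t = !x in y; t; v.k(v)}
      → [cps let] →  CPS{!x; t; v3. let y : TyCPS{t} = v3 in CPS{y; t; v.k(v)}}
      → [cps var] →  let y : TyCPS{t} = !x in CPS{y; t; v.k(v)}
      → [cps var] →  let y : TyCPS{t} = !x in k(y)

so the object-level program is rebuilt around the hole marked by the meta-continuation.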

4.4

Closure conversion

The objective of closure conversion is to transform all functions in the program so that they are closed. Abstractly, this can be accomplished as follows: find a function in the program; if it has a free variable, add the variable to the function’s parameters; and replace the function by a partial application to the variable in question. The rules in Figure 4 capture this algorithm. In the first step, the init rule acts as an inverse-beta reduction to wrap any normal function in an empty closure. The close rule can be used to place a trivial let-definition before any expression e[x] with free variable x. Although this transformation is valid in any context, the closure conversion phase applies it only around closure expressions. In the third step, the let is folded into the closure by adding the variable as a new parameter in the closure function, as well as a new argument in the closure application. As a final step, functions are hoisted to top-level using the meta let transformation used in CPS conversion. Here the formal specification does not provide the system with enough information on how exactly the transformation should be applied. To address this, the compiler includes guidance code that takes care of descending into terms, finding all the lambda terms, finding all the free variables in them and closing over the free variables. The framework makes it easy to access the set of free variables of a certain term; we put only those variables that actually occur freely in the body of the function, not all those that are in scope. The guidance code for closure conversion is extensible, using a mechanism similar to that used in type inference. However, since only functions and let-expressions need to be handled specially, there are far fewer cases to consider.

   λs(x1 : t1, ..., xn : tn).e
      ←[init]→
   (λc().λs(x1 : t1, ..., xn : tn).e)()c

   e[!x] ←[let]→ let x : t = !x in e[x]

   let x0 : t0 = !x′0 in (λc(x1 : t1, ..., xn : tn).e)(x′1, ..., x′n : t′1, ..., t′n)c
      ←[close]→
   (λc(x0 : t0, x1 : t1, ..., xn : tn).e)(!x′0, x′1, ..., x′n : t0, t′1, ..., t′n)c

Figure 4: Closure conversion in 3 steps: the init rule adds a closure function to each normal function; the let rule is used to add a let-definition around any expression with a free variable; and the close rule adds the variable in a let-definition to a closure function.
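As a small worked trace (our own illustration) of the three rules, consider a function λs(x : Z).x + y whose body mentions a variable y bound in an enclosing scope:

   λs(x : Z).x + y
      → [init]  →  (λc().λs(x : Z).x + y)()c
      → [let]   →  let y : Z = !y in (λc().λs(x : Z).x + y)()c
      → [close] →  (λc(y : Z).λs(x : Z).x + y)(!y : Z)c

After the close step the closure function no longer has free variables; the outer occurrence !y has become an argument of the closure application, and the closed function can be hoisted to top level.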

5.

X86 BACKEND

Once closure conversion has been performed, all function definitions are closed, and it becomes possible to generate assembly code. When formalizing the assembly code, we continue to use higher-order abstract syntax: registers and variables in the assembly code correspond to variables in the meta-language. There are two important properties we must maintain. First, scoping must be preserved: there must be a binding occurrence for each variable that is used. Second, in order to facilitate reasoning about the code, variables/registers must be immutable. These two requirements seem at odds with the traditional view of assembly, where assembly instructions operate by side-effect on a finite register set. In addition, the Intel x86 instruction set architecture primarily uses two-operand instructions, where the value in one operand is both used and modified in the same instruction. For example, the instruction add r1, r2 performs the operation r1 ← r1 + r2, where r1 and r2 are registers.

To address these issues, we define an abstract version of the assembly language that uses a three-operand version of the instruction set. The instruction add v1, v2, v3.e performs the abstract operation let v3 = v1 + v2 in e. The variable v3 is a binding occurrence, and it is bound in the body of the instruction e. In our account of the instruction set, every instruction that modifies a register has a binding occurrence of the variable being modified. Instructions that do not modify registers use the traditional non-binding form of the instruction. For example, the instruction add v1, (%v2); e performs the operation (%v2) ← (%v2) + v1, where (%v2) means the value in memory at location v2. The complete abstract instruction set that we use is shown in Figure 5 (the Intel x86 architecture includes a large number of complex instructions that we do not use).

Instructions may use several forms of operands and addressing modes.

• The immediate operand $i is a constant number i.

• The register operand %v refers to register/variable v.

• The indirect operand (%v) refers to the value in memory at location v.

• The indirect offset operand i(%v) refers to the value in memory at location v + i.

• The array indexing operand i1(%v1, %v2, i2) refers to the value in memory at location v1 + v2 ∗ i2 + i1, where i2 ∈ {1, 2, 4, 8}.

The instructions can be placed in several main categories.

• mov instructions copy a value from one location to another. The instruction mov o1, v2.e[v2] copies the value in operand o1 to variable v2.

• One-operand instructions have the forms inst1 o1; e (where o1 must be an indirect operand) and inst1 v1, v2.e. For example, the instruction inc (%r1); e performs the operation (%r1) ← (%r1) + 1; e; and the instruction inc %r1, r2.e performs the operation let r2 = r1 + 1 in e.

• Two-operand instructions have the forms inst2 o1, o2; e, where o2 must be an indirect operand, and inst2 o1, v2, v3.e. For example, the instruction add %r1, (%r2); e performs the operation (%r2) ← (%r2) + r1; e; and the instruction add o1, v2, v3.e is equivalent to let v3 = o1 + v2 in e.

• There are two three-operand instructions, one for multiplication and one for division, having the form inst3 o1, v2, v3, v4, v5.e. For example, the instruction div %r1, %r2, %r3, r4, r5.e performs the following operation, where (r2, r3) is the 64-bit value r2 ∗ 2^32 + r3. The Intel specification requires that r4 be the register eax, and r5 the register edx.

      let r4 = (r2, r3) / r1 in
      let r5 = (r2, r3) mod r1 in
         e

• The comparison instruction has the form cmp o1, o2; e, where the processor's condition-code register is modified by the instruction. We do not model the condition-code register explicitly in our current account. However, doing so would allow greater flexibility during code-motion optimizations on the assembly.

• The unconditional branch operation jmp o(o1, ..., on) branches to the function specified by operand o, with arguments (o1, ..., on). The arguments are provided so that the calling convention may be enforced.

• The conditional branch operation if cc then e1 else e2 is a conditional. If the condition code cc matches the value in the processor's condition-code register, then the instruction branches to expression e1; otherwise it branches to expression e2.

5.1

Translation to concrete assembly

Since the instruction set as defined is abstract, and contains binding structure, it must be translated before actual generation of machine code. The first step in doing this is register allocation: every variable in the assembly program must be assigned to an actual machine register. This step corresponds to an α-conversion where variables are renamed to the names of actual registers; the formal system merely validates the renaming. We describe this phase in Section 5.3 on register allocation.

   l     ::= string                                  Function labels
   r     ::= eax | ebx | ecx | edx
           | esi | edi | esp | ebp                   Registers
   v     ::= r | v1, v2, ...                         Variables
   om    ::= (%v) | i(%v) | i1(%v1, %v2, i2) | l     Memory operands
   or    ::= %v                                      Register operand
   o     ::= om | or | $i | $l                       General operands ($i, $l are constants)
   cc    ::= eq | lt | gt | ···                      Condition codes
   inst1 ::= inc | dec | ···                         1-operand opcodes
   inst2 ::= add | sub | and | ···                   2-operand opcodes
   inst3 ::= mul | div                               3-operand opcodes
   cmp   ::= cmp | test                              Comparisons
   jmp   ::= jmp                                     Branch
   e     ::= mov o, v.e                              Copy
           | inst1 om; e                             1-operand memory instruction
           | inst1 or, v.e                           1-operand register instruction
           | inst2 or, om; e                         2-operand memory instruction
           | inst2 o, or, v.e                        2-operand register instruction
           | inst3 o, or, or, v1, v2.e               3-operand register instruction
           | cmp o1, o2; e                           Comparison
           | jmp o(or, ..., or)                      Branch
           | if cc then e1 else e2                   Conditional branch
           | let l = λ(v1, ..., vn).e1 in e2         Functions

Figure 5: Scoped Intel x86 instruction set. Instructions that would normally modify registers are represented in a form that contains a binding occurrence for the result. The abstract assembly also retains function structure and scoping.

The final step is to generate the actual program from the abstract program. This requires only local modifications, and is implemented during printing of the program (that is, it is implemented when the program is exported to an external assembler). The main translation is as follows.

• Memory instructions inst1 om; e, inst2 or, om; e, and cmp o1, o2; e can be printed directly.

• Register instructions with binding occurrences require a possible additional mov instruction. For the 1-operand instruction inst1 or, r.e, if or = %r, then the instruction is implemented as inst1 r. Otherwise, it is implemented as the following two-instruction sequence.

      mov   or, %r
      inst1 %r

  Similarly, the two-operand instruction inst2 o, or, r.e may require an additional mov from or to r, and the three-operand instruction inst3 o, or1, or2, r1, r2.e may require two additional mov instructions.

• The jmp o(o1, ..., on) instruction prints as jmp o. This assumes that the calling convention has been satisfied during register allocation, and all the arguments are in the appropriate places.

• The if cc then e1 else e2 instruction prints as the following sequence, where cc′ is the inverse of cc, and l is a new label.

      jcc′ l
      e1
   l: e2

• A function definition let l = λ(v1, ..., vn).e1 in e2 prints as the following sequence. Again, this assumes that the calling convention is satisfied during register allocation.

   l: e1
      e2

The compiler back-end then has two stages for code generation and register allocation, described in the following sections.

5.2

Code generation

The production of assembly code is primarily a straightforward translation of operations in the intermediate code to operations in the assembly. There are two main kinds of translations: translations from expressions to operands, and translation of expressions into instruction sequences. We express these translations with the term codegen e, which is the translation of the IR expression e to an assembly expression; and let v = codegen a in e[v], which produces the assembly operand for the atom a and substitutes it for the variable v in assembly expression e[v].

5.2.1

Translation of the core language

The core language contains just functions, application, and let-expressions. In addition, the closure conversion stage classifies functions into two classes: the "normal" functions λs(x1 : t1, ..., xn).e, and the "closure" functions λc(x1 : t1, ..., xn).e. In the core, the semantics of these two kinds of functions is the same. However, the function annotation serves as a hint to code generation: the arguments to closure functions are to be passed as a tuple, and the arguments to normal functions are passed according to the normal calling convention. One invariant of closure conversion is that these two kinds of functions are always nested: every normal function is wrapped in a closure function. During code generation, we add an extra argument to the inner function that represents the function environment, and then project the elements, as shown in Figure 6. The closure itself is represented as a tuple (f, v0, ..., vn), where f is the function, and v0, ..., vn are the arguments to the closure function. The closure application applyc allocates this tuple. The normal application applys expects a closure; it adds the closure to the argument list, and branches to the function.
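Continuing the closure-conversion example from Section 4.4 (our own illustration), the nested pair λc(y : Z).λs(x : Z).x + y is translated by the fun rule of Figure 6 into a single function that takes the environment as its first argument and projects the closed-over variable out of it:

   let l = λ(env, x).
      mov 4(%env), y.
      codegen (x + y)
   in ...

and the corresponding applyc allocates the environment tuple (l, y) that is later passed as env.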

5.2.2

Memory operations

Closure creation requires allocation of a tuple. The memory operations shown in Figure 7 are among the most complicated translations. For the runtime, we use a contiguous heap and a copying garbage collector. The representation of all memory blocks in the heap includes a header word containing the number of bytes in the block (the number of bytes is always a multiple of the word size), followed by one word for each field. A pointer to a block points to the first field of the block (the word after the header word). The heap area itself is contiguous, delimited by base and limit pointers; the next allocation point is in the next pointer. These pointers are accessed through the context[name] pseudo-operand, which is later translated to an absolute memory address.

   let v = codegen λc(x1, ..., xm).λs(y1, ..., yn).e1 in e2[v]
      ←[fun]→
   let l = λ(x, y1, ..., yn).
      mov 4(%x), x1.
      ···
      mov 4m(%x), xm.
      codegen e1
   in e2[l]

   let x = codegen e0(e1, ..., en : t1, ..., tn)c in e[x]
      ←[applyc]→
   let v0 = codegen e0 in
   let v1 = codegen e1 in
      ···
   let vn = codegen en in
   let x = (%v0, %v1, ..., %vn) in
      codegen e[x]

   codegen e(e1, ..., en : t1, ..., tn)s
      ←[applys]→
   let v0 = codegen e in
   let v1 = codegen e1 in
      ···
   let vn = codegen en in
      jmp (%v0)(%v0, %v1, ..., %vn)

   codegen let x : t = e1 in e2[x]
      ←[let]→
   let x′ = codegen e1 in
      mov x′, x.
      codegen e2[x]

   let v2 = codegen v1 in e[v2] ←[var]→ e[%v1]

Figure 6: Code generation for functions and application. The arguments to a closure function are passed as tuples; the arguments to normal functions use the standard calling convention.

   let x = (o1, ..., on) in e[x]
      ←[tuple allocation]→
   reserve(n) in
      mov context[next], p1.
      add $4(n + 1), %p1, p2.
      mov %p2, context[next];
      mov $header(n), (%p1);
      add $4, %p1, x.
      mov o1, 0(%x);
      ···
      mov on, ((n − 1) ∗ 4)(%x);
      e[x]

   codegen let v : t = e1[e2] in e[v]
      ←[array subscript]→
   let v1 = codegen e1 in
   let v2 = codegen e2 in
      mov v1, p.
      mov v2, i′.
      sar $1, %i′, i.
      mov −4(%p), size′.
      sar $2, %size′, size.
      cmp %i, %size;
      if ae then goto bounds_error
      mov 0(%p, %i, 4), v.
      codegen e[v]

   reserve(i) in e
      ←[reserve]→
   mov context[limit], limit.
   sub context[next], %limit, free.
   cmp $4i, %free;
   if ae then gc(i) else e

Figure 7: Memory operations. A tuple allocation appends to the next heap location, denoted by context[next]. Array subscripting performs a bounds-check. Every allocation must be preceded by a reservation that calls the garbage collector if storage is not available.
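For instance (our own illustration of this layout, under the reading that the header counts only the fields), a 3-tuple (a, b, c) occupies four words: a header word containing 12 (three fields of four bytes each) followed by the three fields. The tuple pointer x produced by the tuple allocation rule points at the first field, so a projection e.2 reads the word at offset 4(%x), and the array bounds check recovers the element count by shifting the header value right by two.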

5.2.3

Translation for extensions

The transformation rules for code generation are collected into a resource; the code generation for extensions is accomplished by adding additional rewrite rules to the resource. For illustration, we define example translations for the integer and Boolean extensions.

Since the assembly code we generate is untyped, we use a 31-bit representation of integers, where the least-significant bit is always set to 1. Since pointers are always word-aligned, this allows the garbage collector to differentiate between integers and pointers. The division operation is the most complicated translation: first the operands a1 and a2 are shifted to obtain the standard integer representation, then the division operation is performed, and the result is converted back to the 31-bit representation. The translation for a representative set of integer operations is shown in Figure 8.

Code generation for Boolean values is similar. In this case,

we represent true as 1, and false as 0. For the conditional, we generate a conditional branch. For optimization, an important case is when the condition is a relation. In this case, there is no need to generate the intermediate Boolean value.

   let v = codegen i in e[v] ←[int]→ e[$(i ∗ 2 + 1)]

   let v = codegen e1 < e2 in e3[v]
      ←[lt]→
   let o1 = codegen e1 in
   let o2 = codegen e2 in
      cmp o1, o2;
      mov $0, x.
      setnz %x, v.
      e3[%v]

   let v = codegen e1 + e2 in e3[v]
      ←[add]→
   let v1 = codegen e1 in
   let v2 = codegen e2 in
      add v2, v1, tmp.
      dec %tmp, sum.
      e3[%sum]

   let v = codegen e1 / e2 in e3[v]
      ←[div]→
   let v1 = codegen e1 in
   let v2 = codegen e2 in
      sar $1, v1, v1′.
      sar $1, v2, v2′.
      mov $0, v3.
      div %v1′, %v2′, %v3′, q′, r′.
      shl $1, %q′, q″.
      or $1, %q″, q.
      e3[%q]

Figure 8: Code generation for integer operations. Integers use a 31-bit representation, where the least-significant bit is always 1.

   let v = codegen false in e[v] ←[false]→ e[$0]

   let v = codegen true in e[v] ←[true]→ e[$1]

   codegen if e1 then e2 else e3
      ←[if]→
   let x = codegen e1 in
      cmp $0, x;
      if nz then codegen e2 else codegen e3

   codegen if e1 < e2 then e3 else e4
      ←[iflt]→
   let v1 = codegen e1 in
   let v2 = codegen e2 in
      cmp v1, v2;
      if lt then codegen e3 else codegen e4

Figure 9: Code generation for Boolean operations. The Boolean constants are represented by the constants 0 and 1. The iflt rule shows an important optimization; in the case where the condition is a relation, the intermediate Boolean value need not be constructed.
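To see why the add translation in Figure 8 follows the machine add with a dec, note that for tagged integers (2a + 1) + (2b + 1) − 1 = 2(a + b) + 1. The small OCaml snippet below (our own illustration, not part of the compiler) checks this tag arithmetic:

   (* A tagged 31-bit integer i is represented as 2*i + 1, so its
      least-significant bit is 1 and the garbage collector can tell
      integers apart from word-aligned pointers. *)
   let tag i = (i lsl 1) lor 1
   let untag n = n asr 1

   (* Addition of two tagged values: (2a+1) + (2b+1) - 1 = 2(a+b) + 1,
      which is exactly the add-then-dec sequence in Figure 8. *)
   let tagged_add a b = a + b - 1

   let () = assert (untag (tagged_add (tag 3) (tag 4)) = 7)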

5.3

Register Allocation

Register allocation is one of the easier phases of the compiler formally: the main objective of register allocation is to rename the variables in the program to use register names. Because we are using higher-order abstract syntax, the formal problem is just an α-conversion, which can be checked readily by the formal system. From a practical standpoint, however, register allocation is an NP-complete problem, and the majority of the code in our implementation is devoted to a Chaitin-style [2] graph-coloring register allocator. These kinds of allocators have been well-studied, and we do not discuss the details of the allocator here. The overall structure of the register allocator algorithm is as follows.

1. Given a program p, run a register allocator R(p).

2. If the register allocator R(p) was successful, it returns an assignment of variables to register names; α-convert the program using this variable assignment, and return the result p′.

3. Otherwise, if the register allocator R(p) was not successful, it returns a set of variables to "spill" into memory. Rewrite the program to add fetch/store code for the spilled registers, generating a new program p′, and run register allocation R(p′) on the new program.

Part 2 is a trivial formal operation (the logical framework checks that p′ = p). The generation of spill code for part 3 is not trivial, however, as we discuss in the following section.
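The following minimal OCaml sketch (ours, not the paper's implementation) summarizes this three-step loop; the program type and the functions run_allocator, apply_renaming, and insert_spill_code are hypothetical stand-ins for the Chaitin-style allocator, the formal α-conversion check, and the spill rewrites of Section 5.3.1.

   type program = Prog of string list          (* placeholder representation *)

   type alloc_result =
     | Assignment of (string * string) list    (* variable -> register renaming *)
     | Spill of string list                    (* variables that must be spilled *)

   (* Stubs standing in for the real allocator and rewrites. *)
   let run_allocator (_ : program) : alloc_result = Assignment []
   let apply_renaming (_ : (string * string) list) (p : program) : program = p
   let insert_spill_code (_ : string list) (p : program) : program = p

   (* Step 1: run the allocator.  Step 2: on success, rename variables to
      registers (the framework validates that this is an alpha-conversion).
      Step 3: on failure, insert spill code and try again. *)
   let rec allocate_registers (p : program) : program =
     match run_allocator p with
     | Assignment renaming -> apply_renaming renaming p
     | Spill vars -> allocate_registers (insert_spill_code vars p)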

5.3.1

Generation of spill code

The generation of spill code can affect the performance of a program dramatically, and it is important to minimize the amount of memory traffic. Suppose the register allocator was not able to generate a register assignment for a program p, and instead it determines that variable v must be placed in memory. We can allocate a new global variable, say spill_i, for this purpose, and replace all occurrences of the variable with a reference to the new memory location. This can be captured by rewriting the program just after the binding occurrences of the variables to be spilled. The following two rules give an example.

   mov o, v.e[v] ←[smov]→ mov o, spill_i. e[spill_i]

   inst2 o, or, v.e[v] ←[sinst2]→ mov or, spill_i. inst2 o, spill_i; e[spill_i]

However, this kind of brute-force approach spills all of the occurrences of the variable, even those occurrences that could have been assigned to a register. Furthermore, the spill location spill_i would presumably be represented as the label of a memory location, not a variable, allowing a conflicting assignment of another variable to the same location. To address both of these concerns, we treat spill locations as variables, and introduce scoping for spill variables. We introduce two new pseudo-operands and two new instructions, shown in Figure 10.

   os ::= spill(v, s)             Spill operands
        | spill(s)
   e  ::= set or, s.e[s]          New spill
        | get os, v.e[v]          Get the spilled value

Figure 10: Spill pseudo-operands and instructions.

The instruction set or, s.e[s] generates a new spill location represented in the variable s, and stores the operand or in that spill location. The operand spill(v, s) represents the value in spill location s, and it also specifies that the values in spill location s and in the register v are the same. The operand spill(s) refers to the value in spill location s. The value in a spill operand is retrieved with the get os, v.e[v] instruction and placed in the variable v.

The actual generation of spill code then proceeds in two main phases. Given a variable to spill, the first phase generates the code to store the value in a new spill location, then adds copy instructions to split the live range of the variable so that all uses of the variable refer to different freshly-generated operands of the form spill(v, s). For example, consider the code fragment shown in Figure 11, and suppose the register allocator determines that the variable v is to be spilled, because a register cannot be assigned in code segment 2. The first phase rewrites the code as shown on the right of Figure 11: the initial occurrence of the variable is spilled into a new spill location s, and the value is fetched just before each use of the variable and copied to a new register. Note that the later uses refer to the new registers, creating a copying daisy-chain, but the registers have not actually been eliminated. Once the live range is split, the register allocator has the freedom to spill only part of the live range. During the second phase of spilling, the allocator will determine that register v2 must be spilled in code segment 2, and the spill(v2, s) operand is replaced with spill(s), forcing the fetch from memory, not the register v2. Register v2 is no longer live in code segment 2, easing the allocation task without also spilling the register in code segments 1 and 3.

   and o, or, v.
   ...code segment 1...
   add %v, o;
   ...code segment 2...
   sub %v, o;
   ...code segment 3...
   or %v, o;

      −→

   and o, or, v1.
   set %v1, s
   ...code segment 1...
   get spill(v1, s), v2
   add %v2, o;
   ...code segment 2...
   get spill(v2, s), v3
   sub %v3, o;
   ...code segment 3...
   get spill(v3, s), v4
   or %v4, o;

Figure 11: Spill example.

5.3.2

Formalizing spill code generation

The formalization of spill code generation can be performed in three parts. The first part generates new spill locations (line 2 in the rewritten code of Figure 11); the second part generates live-range splitting code (lines 4, 7, and 10); and the third part replaces operands of the form spill(v, s) with spill(s) when requested by the register allocator.

The first part requires a rewrite for each kind of instruction that contains a binding occurrence of a variable. The following two rewrites are representative examples. Note that all occurrences of the variable v are replaced with spill(v, s), potentially generating operands like i(%spill(v, s)). These kinds of operands are rewritten at the end of spill-code generation to their original form, e.g. i(%v).

   mov or, v.e[v]
      ←[smov]→
   mov or, v.
   set %v, s.
   e[spill(v, s)]

   inst2 o, or, v.e[v]
      ←[sinst2]→
   inst2 o, or, v.
   set %v, s.
   e[spill(v, s)]

The second rewrite splits a live range of a spill at an arbitrary point. This rewrite applies to any program that contains an occurrence of an operand spill(v1, s), and translates it to a new program that fetches the spill into a new register v2 and uses the new spill operand spill(v2, s) in the remainder of the program. This rewrite is selectively applied before any instruction that uses an operand spill(v1, s).

   e[spill(v1, s)] ←[split]→ get spill(v1, s), v2. e[spill(v2, s)]

In the third and final phase, when the register allocator determines that a variable should be spilled, the spill(v, s) operands are selectively eliminated with the following rewrite.

   spill(v, s) ←[spill]→ spill(s)

6.

RELATED WORK

FreshML [18] adds to the ML language support for straightforward encoding of variable bindings and alpha-equivalence classes. Our approach differs in several important ways. Substitution and testing for free occurrences of variables are explicit operations in FreshML, while MetaPRL provides a convenient implicit syntax for these operations. Binding names in FreshML are inaccessible, while in MetaPRL only the formal parts are prohibited from accessing the names. Informal portions, such as code to print debugging messages to the compiler writer or warning and error messages to the compiler user, can access the binding names, which aids development and debugging. FreshML is primarily an effort to add automation; it does not address the issue of validation directly.

Liang [13] implemented a compiler for a simple imperative language using a higher-order abstract syntax implementation in λProlog. Liang's approach includes several of the phases we describe here, including parsing, CPS conversion, and code generation using an instruction set defined using higher-order abstract syntax (although in Liang's case registers are referred to indirectly through a meta-level store, while we represent registers directly as variables). Liang does not address the issue of validation in this work, and the primary role of λProlog is to simplify the compiler implementation. In contrast to our approach, in Liang's work the entire compiler was implemented in λProlog, even the parts of the compiler where implementation in a more traditional language might have been more convenient (such as the register allocation code).

Hannan and Pfenning [7] constructed a verified compiler in LF (as realized in the Elf programming language) for the untyped lambda calculus and a variant of the CAM [3] runtime. This work formalizes both compiler transformations and verifications as deductive systems, and verification is against an operational semantics.

Previous work has also focused on augmenting compilers with formal tools. Instead of trying to split the compiler into a formal part and a heuristic part, one can attempt to treat the whole compiler as a heuristic, adding external code that watches over what the compiler is doing and tries to establish the equivalence of the intermediate and final results. For example, the work of Necula and Lee [15, 16] has led to effective mechanisms for certifying the output of compilers (e.g., with respect to type and memory-access safety), and for verifying that intermediate transformations on the code preserve its semantics. Pnueli, Siegel, and Singerman [19] perform verification in a similar way, not by validating the compiler, but by validating the result of each transformation using simulation-based reasoning.

Semantics-directed compilation [12] is aimed at allowing language designers to generate compilers from high-level semantic specifications. Although it has some overlap with our work, it does not address the issue of trust in the compiler. No proof is generated to accompany the compiler, and the compiler generator must be trusted if the generated

compiler is to be trusted.

Boyle, Resler, and Winter [1] outline an approach to building trusted compilers that is similar to our own. Like us, they propose using rewrites to transform code during compilation. Winter develops this further in the HATS system [21] with a special-purpose transformation grammar. An advantage of this approach is that the transformation language can be tailored for the compilation process. However, this significantly restricts the generality of the approach, and limits re-use of existing methods and tools.

7.

CONCLUSIONS AND FUTURE WORK

During the course of this work on the case study, we found that the implementation was easier than we expected, in part because the ability to mix the object language and the meta-language freely gave us more power than we anticipated. Because the account mirrors standard textbook specifications very closely, it is easy to believe in its correctness. The mechanisms for extensions and compositionality provided by the logical framework generalized naturally to the compiler design.

There are many open avenues to explore. We plan to investigate bounded polymorphism, which we will use for object systems and extensible tuples; the current core language already provides preliminary but incomplete support. We also plan to develop a representation of mutually recursive functions, which will require extending the support provided by the logical framework. Another big challenge for this approach is the formalization of optimization techniques that are normally implemented using global program analysis, such as global code-motion optimizations.

8.

REFERENCES

[1] J. Boyle, R. Resler, and K. Winter. Do you trust your compiler? Applying formal methods to constructing high-assurance compilers. In High-Assurance Systems Engineering Workshop, Washington, DC, August 1997.
[2] Gregory J. Chaitin, Marc A. Auslander, Ashok K. Chandra, John Cocke, Martin E. Hopkins, and Peter W. Markstein. Register allocation via coloring. Computer Languages, 6(1):47–57, January 1981.
[3] G. Cousineau, P.-L. Curien, and M. Mauny. The categorical abstract machine. Science of Computer Programming, 8(2):173–202, 1987.
[4] Luis Damas and Robin Milner. Principal type schemes for functional programs. In Ninth ACM Symposium on Principles of Programming Languages, pages 207–212, 1982.
[5] Olivier Danvy and Andrzej Filinski. Representing control: A study of the CPS transformation. Mathematical Structures in Computer Science, 2(4):361–391, 1992.
[6] Adam Granicz and Jason Hickey. Phobos: A front-end approach to extensible compilers. In 36th Hawaii International Conference on System Sciences. IEEE, 2002.
[7] John Hannan and Frank Pfenning. Compiler verification in LF. In Proceedings of the 7th Symposium on Logic in Computer Science. IEEE Computer Society Press, 1992.

[8] Jason Hickey and Aleksey Nogin. Extensible hierarchical tactic construction in a logical framework. Accepted to the TPHOLs 2004 conference, 2004.
[9] Jason Hickey, Aleksey Nogin, Robert L. Constable, Brian E. Aydemir, Eli Barzilay, Yegor Bryukhov, Richard Eaton, Adam Granicz, Alexei Kopylov, Christoph Kreitz, Vladimir N. Krupski, Lori Lorigo, Stephan Schmitt, Carl Witty, and Xin Yu. MetaPRL — A modular logical environment. In David Basin and Burkhart Wolff, editors, Proceedings of the 16th International Conference on Theorem Proving in Higher Order Logics (TPHOLs 2003), volume 2758 of Lecture Notes in Computer Science, pages 287–303. Springer-Verlag, 2003.
[10] Jason Hickey, Aleksey Nogin, Adam Granicz, and Brian Aydemir. Compiler implementation in a formal logical framework. In Proceedings of the 2003 Workshop on Mechanized Reasoning about Languages with Variable Binding, pages 1–13. ACM Press, 2003. http://doi.acm.org/10.1145/976571.976575. An extended version of the paper is available as Caltech Technical Report caltechCSTR:2003.002.
[11] Jason J. Hickey, Aleksey Nogin, Alexei Kopylov, et al. MetaPRL home page. http://metaprl.org/.
[12] Peter Lee. Realistic Compiler Generation. MIT Press, 1989.
[13] Chuck C. Liang. Compiler construction in higher order logic programming. In Practical Aspects of Declarative Languages, volume 2257 of Lecture Notes in Computer Science, pages 47–63, 2002.
[14] J. Gregory Morrisett, David Walker, Karl Crary, and Neal Glew. From System F to typed assembly language. In Principles of Programming Languages, 1998.
[15] George C. Necula. Translation validation for an optimizing compiler. ACM SIGPLAN Notices, 35(5):83–94, 2000.
[16] George C. Necula and Peter Lee. The design and implementation of a certifying compiler. In Proceedings of the 1998 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 333–344, 1998.
[17] Aleksey Nogin and Jason Hickey. Sequent schema for derived rules. In Víctor A. Carreño, César A. Muñoz, and Sofiène Tahar, editors, Proceedings of the 15th International Conference on Theorem Proving in Higher Order Logics (TPHOLs 2002), volume 2410 of Lecture Notes in Computer Science, pages 281–297. Springer-Verlag, 2002.
[18] Andrew M. Pitts and Murdoch Gabbay. A metalanguage for programming with bound names modulo renaming. In R. Backhouse and J. N. Oliveira, editors, Mathematics of Program Construction, volume 1837 of Lecture Notes in Computer Science, pages 230–255. Springer-Verlag, Heidelberg, 2000.
[19] A. Pnueli, M. Siegel, and E. Singerman. Translation validation. Lecture Notes in Computer Science, 1384:151–166, 1998.
[20] Pierre Weis and Xavier Leroy. Le langage Caml. Dunod, Paris, 2nd edition, 1999. In French.
[21] Victor L. Winter. Program transformation in HATS. In Proceedings of the Software Transformation Systems Workshop, May 1999.