ZU064-05-FPR
rubtmp13
June 15, 2012
14:48
© Cambridge University Press JFP TODO (): 1–90, 2012. doi:TODO
Printed in the United Kingdom
A unified treatment of syntax with binders
Nicolas POUILLARD and François POTTIER
INRIA
Abstract Atoms and de Bruijn indices are two well-known representation techniques for data structures that involve names and binders. However, using either technique, it is all too easy to make a programming error that causes one name to be used where another was intended. We propose an abstract interface to names and binders that rules out many of these errors. This interface is implemented as a library in Agda. It allows defining and manipulating term representations in nominal style and in de Bruijn style. The programmer is not forced to choose between these styles: on the contrary, the library allows using both styles in the same program, if desired. Whereas indexing the types of names and terms with a natural number is a well-known technique to better control the use of de Bruijn indices, we index types with worlds. Worlds are at the same time more precise and more abstract than natural numbers. Via logical relations and parametricity, we are able to demonstrate in what sense our library is safe, and to obtain theorems for free about world-polymorphic functions. For instance, we prove that a world-polymorphic term transformation function must commute with any renaming of the free variables. The proof is entirely carried out in Agda.
1 Introduction Many programmers have to deal with the mundane business of building and transforming data structures that contain names and binders. Compilers, code generators, static analysers, theorem provers, and type-checkers have this in common. They manipulate programs, formulae, proofs, and types. One central difficulty is the representation of variables, that is, the representation of names and binders. One traditional approach is to represent all occurrences of a variable identically. For this purpose, one typically uses character strings or integers. One can in fact use any data type that has an infinite number of elements and admits an equality test. The nature of its elements does not matter, which is why they are often known as atoms. While this seems to be the most obvious representation, it causes trouble when dealing with operations like substitution. Substitution must be capture-avoiding: variables must sometimes be renamed in order to avoid an accidental change of meaning of a name. The mathematical foundations of this technique, which we refer to as the nominal approach, are described by Pitts (2006). Another traditional approach is to use de Bruijn indices (de Bruijn, 1972). This representation is often referred to as “nameless” because variables are no longer identified by a name but by a notion of “distance” to the binding point. This approach solves part of the
problem by providing a canonical representation. However, due to its arithmetic flavor, the manipulation of de Bruijn indices remains an error-prone activity. In the end, regardless of which representation is used, programs that work with names can be hard to understand and are easy to get wrong. To remedy this, many proposals have been made, often in the form of finer-grained type systems for these representations (Altenkirch, 1993; McBride & McKinna, 2004; Bellegarde & Hook, 1994; Bird & Paterson, 1999; Altenkirch & Reus, 1999; Shinwell et al., 2003; Licata & Harper, 2009; Pottier, 2007). We continue in this direction by presenting a library whose abstract types allow safe and fine-grained programming with both named and nameless representations. This paper combines the conference papers by Pouillard and Pottier (2010) and by Pouillard (2011a) and extends them significantly. In the first conference paper, we present an abstract interface to names and binders. This interface comes with two implementations, one of which is in nominal style, the other in de Bruijn style. Thus, client code can be written and type-checked independently of which representation technique is ultimately chosen. In the second conference paper, Pouillard drops the nominal implementation and focuses on the de Bruijn side. Furthermore, he exploits dependent types, whereas the first conference paper avoids them. These decisions allow him to greatly simplify the interface of the library, which now has just one implementation. In the present paper, we stick with a single interface and a single implementation, but we extend them in such a way that the nominal representation and de Bruijn’s representation (as well as several others) are simultaneously available to the client. The connection with our previous papers is described in greater depth in section 12. 
Here, let us just stress again our main contribution: whereas the first conference paper “unifies” the nominal representation and de Bruijn’s representation by supporting one or the other, the present paper “unifies” them by supporting them simultaneously. This paper is organized as follows. In section 2, we briefly recall a few facts about AGDA, the language in which we write both our programs and proofs about these programs. Then, we begin the presentation of our library. We decide to initially focus on the fragment of the library that supports programming in nominal style (sections 3 to 7). We informally recall several existing approaches to representing data structures with names and binders in a nominal style (section 3). Then, we explain our new approach, by presenting the interface of (the nominal fragment of) our library (section 4), its usage (section 5), and its implementation (section 6). In section 7, we use logical relations and parametricity in order to explain and prove in what sense our library is safe. At this point, we are ready to extend the library with support for de Bruijn indices. We follow an analogous path as in the nominal case. First, we informally recall several existing approaches to working with de Bruijn indices in a “safe” manner. Then, we extend the interface and implementation of our library (section 9). We explain how these extensions can be exploited by the library’s users (section 10) and why they are sound (section 11). Let us stress again that this is not an alternative implementation of the library, but an extension of it. Finally, we review some of the related work (section 12) and discuss several directions for future work (section 13).
2 A brief introduction to AGDA Throughout the paper, our definitions are presented in the syntax of AGDA. Naturally, we cannot possibly provide a self-contained introduction to AGDA in this paper. In the following, we present a few of AGDA’s notations and conventions. We hope that this is enough for someone who is familiar with functional programming to understand our code. To better grasp the features of AGDA, we recommend the following further reading. Norell’s Ph.D. thesis (2007) describes the theoretical and practical aspects of the development of AGDA. Another introduction to AGDA, of intermediate size, can be found in the introduction of Pouillard’s Ph.D. thesis (2012). Types AGDA is a dependent type theory. This means, in particular, that it has very few built-in types. In fact, only two forms of types are built-in, namely function types and the type of all types. Everything else is a user-defined type. The language offers facilities for defining record types, inductive types, and co-inductive types. An ordinary (non-dependent) function type is written A → B. A dependent function type is written ( x : A ) → B or ∀ ( x : A ) → B. A function parameter can be implicit: in this case, the function type is written ∀{x : A} → B. This allows an actual parameter to be omitted at a call site if its value can be inferred from the context. The distinction between ∀ ( x : A ) → B and ∀{x : A} → B is quite superficial: in principle, after reconstructing the value of every omitted actual parameter, one can put every program in a form where every function parameter is explicit. This means, in particular, that the logical relations (and the “free theorems”) associated with the types ∀ ( x : A ) → B and ∀{x : A} → B are the same. The syntax of function types offers shortcuts for introducing multiple arguments at once and for omitting a type annotation, as in ∀{A} {i j : A} x → C. 
The type Set, also known as Set0, is the type of all “small” types, such as List String, ℕ, and Maybe (Bool × ℕ). The type Set1 is the type of Set and “others like it”, such as Set → Bool, ℕ → Set, and Set → Set. There is in fact an infinite hierarchy of types of the form Set ℓ, where ℓ is a universe level, roughly, a natural number.

Propositions There is no specific sort of propositions: instead, propositions inhabit Set ℓ for some ℓ. The unit type ⊤, a record type with no fields, represents the true proposition. The empty type ⊥, an inductive type with no constructors, represents the false proposition. Negation ¬ A is defined as A → ⊥.

Lexical conventions Whitespace is significant: x≤y is an identifier, whereas x ≤ y is an application. This makes it possible to name a variable after its type, deprived of any whitespace. For example, if one follows this convention, then a variable of type x ≤ y (that is, a proof of x ≤ y) is named x≤y. AGDA offers infix and “mixfix” notation. For instance, the expression x ≤ y is syntactic sugar for an application of the function _≤_ to the arguments x and y. Note how, in the name of the function, the underscore character “_” is used to indicate where the arguments should appear.
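The conventions above can be illustrated with a small sketch. It assumes a recent version of AGDA's standard library (for Data.Nat and Data.Bool); the names if_then_else_ (redefined locally rather than imported), ≤-suc, and 1≤3 are hypothetical and chosen only for this illustration.

```agda
open import Data.Nat using (ℕ; suc; _≤_; z≤n; s≤s)
open import Data.Bool using (Bool; true; false)

-- A mixfix operator: each underscore marks an argument position.
if_then_else_ : {A : Set} → Bool → A → A → A
if true  then t else e = t
if false then t else e = e

-- x≤y is a single identifier (a proof variable named after its type),
-- whereas x ≤ y is the application of _≤_ to x and y.
≤-suc : ∀ {x y} → x ≤ y → x ≤ suc y
≤-suc z≤n       = z≤n
≤-suc (s≤s x≤y) = s≤s (≤-suc x≤y)

-- The implicit arguments x and y of ≤-suc are inferred at the call site.
1≤3 : 1 ≤ 3
1≤3 = ≤-suc (s≤s z≤n)
```

The last definition also shows why implicit parameters are convenient: writing ≤-suc {1} {2} (s≤s z≤n) would be equivalent, but the type-checker can reconstruct both arguments from the expected type.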
Note that the same name can be used for different data constructors of distinct data types. AGDA makes use of type annotations to resolve ambiguities. AGDA programs can be constructed incrementally. To do so, one makes use of “holes”, which represent the unfinished parts of a program. Whatever appears inside a hole is not type-checked. The hole itself can receive any type. A hole is delimited by curly braces and exclamation marks, {!like this!}. In the following, we use holes when we want to omit a definition or a proof. A comment begins with -- and extends to the end of the line.

Notions of equality In type theory, there are several notions of equality. Definitional equality is the most basic notion. It is deeply rooted in the computation rules of the programming language, here, AGDA. Two terms are definitionally equal if and only if they reduce to a common term. In AGDA, reduction combines β-reduction, η-conversion, and expansion of the equations that appear as part of definitions. For instance, (λ x → x) suc zero is definitionally equal to suc zero, and suc is definitionally equal to λ x → suc x. The term zero + n is definitionally equal to n, because the definition of addition is by case analysis upon its first argument. For the same reason, n + zero is not definitionally equal to n. Definitional equality is decidable but weak.

When one wishes for a notion of equality that relies not just on computation, but also on reasoning, one turns to propositional equality. In AGDA and in this document, this equality is written ≡. Its negation is written ≢. Propositional equality is defined as an inductive type, in the following style:

  data _≡0_ {A : Set0} (x : A) : A → Set0 where
    refl : x ≡0 x

For simplicity, the above definition defines _≡0_, a version of propositional equality that is restricted to “small” types of type Set0. We cannot fully explain the subtleties of this definition.
Let us only point out that this inductive type has a single constructor, which requires both sides of the equality to be the same, that is, to be definitionally equal. Propositional equality subsumes definitional equality. For example, the proposition zero + n ≡0 n is true, because the terms zero + n and n are definitionally equal and (as a result) the term refl is an inhabitant of the type zero + n ≡0 n. Propositional equality is strictly more powerful than definitional equality. For instance, the proposition ∀ n → n + zero ≡0 n can be proved by a simple induction on n.

Although propositional equality is stronger than definitional equality, it is still an intensional notion of equality. It is sometimes the case that it cannot relate two functions even though they are pointwise equal, that is, they map every input to the same output. Thus, it makes sense to introduce an extensional notion of equality, namely pointwise equality, written ≗. The definition presented here is specialized to small types and non-dependent functions, and equality remains intensional on the codomain of the functions:

  _≗0_ : {A B : Set0} (f g : A → B) → Set0
  f ≗0 g = ∀ x → f x ≡0 g x
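The difference between these notions of equality can be made concrete with a short sketch. It assumes the standard library's propositional equality (_≡_, refl, cong) in place of the small-type variant defined above, and it restates pointwise equality locally under the name _≗_; the lemma names are hypothetical.

```agda
open import Data.Nat using (ℕ; zero; suc; _+_)
open import Data.Bool using (Bool; true; false; not)
open import Relation.Binary.PropositionalEquality using (_≡_; refl; cong)

-- Holds by computation alone: _+_ is defined by cases on its first argument.
zero+n : ∀ n → zero + n ≡ n
zero+n n = refl

-- Not a definitional equality: n + zero is stuck when n is a variable,
-- so the proof proceeds by induction on n.
n+zero : ∀ n → n + zero ≡ n
n+zero zero    = refl
n+zero (suc n) = cong suc (n+zero n)

-- Pointwise equality, restated for small types and non-dependent functions.
_≗_ : {A B : Set} (f g : A → B) → Set
f ≗ g = ∀ x → f x ≡ g x

-- Double negation is pointwise equal to the identity, by cases on the argument.
not-not : (λ x → not (not x)) ≗ (λ x → x)
not-not true  = refl
not-not false = refl
```

In zero+n, refl is accepted because both sides reduce to the same term; in n+zero, refl is accepted only in the base case, and the inductive case needs the congruence lemma cong.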
It is possible to show, for instance, that the functions λ x → not (not x) and λ x → x are pointwise equal. This amounts to proving ∀ x → not (not x) ≡ x; the proof is by cases on x. It is not possible to show that these functions are propositionally equal. The logical relation which we introduce in section 7 can be viewed as a generalized version of pointwise equality. It is extensional: it relates all functions that map related inputs to related outputs.

Everything is online We use some definitions from AGDA’s standard library, including natural numbers, booleans, lists, and applicative functors (pure and _⊛_). For the sake of conciseness, the code fragments presented in this document are sometimes not perfectly self-contained. For instance, we sometimes use the operation ∘′, a variant of the standard composition operation ∘, which is equipped with a simpler (non-dependent) type and facilitates type reconstruction. We choose to gloss over these details. The complete code is available online (Pouillard, 2011b).

3 Introduction to the nominal approach

In order to explain what problem we are trying to solve, let us informally review how people traditionally encode abstract syntax as an algebraic data type and why these encodings are usually not satisfactory.

3.1 Warm-up: the bare nominal approach

The bare approach to syntax in nominal style requires very little infrastructure to begin with. Names and binders are represented by so-called atoms. The most important operation that atoms offer is an equality test. The set of atoms is countably infinite, and there must be a way of obtaining “fresh” atoms when desired. (In the following informal discussion, we elide this aspect and simply assume that we have a number of constants of type Atom at hand.) Atoms are usually represented as natural numbers.

  -- A set of atoms (could be ℕ)
  Atom : Set

  -- Atom is countably infinite; here are some atoms:
  -- x, y, z... could be represented by 0, 1, 2...
  x y z f g ... : Atom

  -- The equality test on atoms
  _==A_ : (x y : Atom) → Bool

Using the type Atom, we can readily define algebraic data types for abstract syntax with names and binders. Our running example is the untyped λ-calculus; its definition appears below. The type TmA (Tm for “term” and A for “atom”) has four data constructors. The constructor V is for variables and carries just an atom. The constructor · represents function application and carries two subterms. The constructor ƛ carries an atom and
a subterm in which this atom is informally considered bound. Thus, the atom carried by ƛ is informally considered a binder, whereas the atom carried by V is a (free or bound) occurrence of a name: it refers to some binder. The constructor Let carries an atom and two subterms. The atom is informally considered bound in the second subterm only.

  data TmA : Set where
    V   : (x : Atom) → TmA
    _·_ : (t u : TmA) → TmA
    ƛ   : (b : Atom) (t : TmA) → TmA
    Let : (b : Atom) (t u : TmA) → TmA

It is striking that there is no formal distinction between the atoms that represent binders and those that represent occurrences. Neither is there any indication of the scope of the binders. This calls for improvement.

Here are two examples of terms that represent object-level syntax. They represent the identity function and the application function of the λ-calculus.

  -- λ x. x
  idTmA : TmA
  idTmA = ƛ x (V x)

  -- λ f. λ x. f x
  apTmA : TmA
  apTmA = ƛ f (ƛ x (V f · V x))

The strength of the “bare nominal” approach resides in its simplicity. The representation of names and binders is the same as in the concrete syntax. One important issue, though, is α-equivalence: there are multiple equivalent representations of the “same” term. Another issue is capture: it is easy to inadvertently change the meaning of a name by placing it in a context that happens to bind this name.

The need for a notion of α-equivalence arises out of the fact that choices of atoms are arbitrary. One would like to consider that two terms represent the “same” piece of syntax when they differ only by a consistent renaming of bound names. Two such terms are said to be α-equivalent. For example, here are two α-equivalent terms:

  tx : TmA
  tx = ƛ x (V f · V x)

  ty : TmA
  ty = ƛ y (V f · V y)

The term tx is α-equivalent to ty because consistently replacing the bound name x with the name y in the term tx yields the term ty. Conversely, consistently replacing y with x
in ty yields tx. However, α-equivalence can be a subtle notion. The term tf below is not α-equivalent to tx and ty:

  tf : TmA
  tf = ƛ f (V f · V f)

Although replacing the bound name x with the name f in the term tx does yield tf, this would be an inconsistent renaming. The name f that occurs in the renaming would be captured by the binder ƛ f that occurs in the term tx. Conversely, if one attempts to consistently replace f with x in tf, one finds that such a replacement is permitted, but does not yield tx.

Without delving any further into α-equivalence, let us emphasize the point of view that we adopt in this paper. Choices of bound names are purely a representation issue, so they should not be able to influence computation in an observable way. The flip side of this slogan is that “good” computations should not observably depend on choices of bound names. This can be stated as follows:

  A function is well-behaved if, when applied to α-equivalent arguments, it produces α-equivalent results.

Let us give a few examples of well-behaved functions. The function rmA removes all occurrences of an atom from a list of atoms. It operates only on free names and thus does not depend on choices of bound names, hence is well-behaved. The function fv, which constructs a list of the free variables (atoms) of a term, is also well-behaved.

  rmA : Atom → List Atom → List Atom
  rmA _ []       = []
  rmA x (y ∷ ys) = if x ==A y then rmA x ys else y ∷ rmA x ys

  -- rmA behaves well; for instance, this holds:
  test-rmA : rmA x [ x ] ≡ rmA y [ y ]
  test-rmA = refl -- both sides reduce to []

  fv : TmA → List Atom
  fv (V x)       = [ x ]
  fv (t · u)     = fv t ++ fv u
  fv (ƛ x t)     = rmA x (fv t)
  fv (Let x t u) = fv t ++ rmA x (fv u)

  -- fv behaves well; for instance, this holds:
  test-fv : fv tx ≡ fv ty
  test-fv = refl -- both sides reduce to [ f ]
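The fragments above leave Atom abstract, so they cannot be type-checked in isolation. The following sketch makes them self-contained under one loudly stated assumption: atoms are instantiated with natural numbers, and the constants x, y, f are arbitrarily chosen to be 0, 1, 2. The equality test is written by hand so as to depend only on the constructors of ℕ.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Bool using (Bool; true; false; if_then_else_)
open import Data.List using (List; []; _∷_; [_]; _++_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- Assumption of this sketch: atoms are natural numbers.
Atom : Set
Atom = ℕ

-- The equality test on atoms, by simultaneous recursion on both arguments.
_==A_ : Atom → Atom → Bool
zero  ==A zero  = true
suc m ==A suc n = m ==A n
_     ==A _     = false

data TmA : Set where
  V   : (x : Atom) → TmA
  _·_ : (t u : TmA) → TmA
  ƛ   : (b : Atom) (t : TmA) → TmA
  Let : (b : Atom) (t u : TmA) → TmA

rmA : Atom → List Atom → List Atom
rmA _ []       = []
rmA x (y ∷ ys) = if x ==A y then rmA x ys else y ∷ rmA x ys

fv : TmA → List Atom
fv (V x)       = [ x ]
fv (t · u)     = fv t ++ fv u
fv (ƛ x t)     = rmA x (fv t)
fv (Let x t u) = fv t ++ rmA x (fv u)

-- Three distinct atoms (an arbitrary choice).
x y f : Atom
x = 0
y = 1
f = 2

tx ty : TmA
tx = ƛ x (V f · V x)
ty = ƛ y (V f · V y)

-- Both sides compute to the same list, so refl is accepted.
test-fv : fv tx ≡ fv ty
test-fv = refl
```

Because the atoms are concrete numerals here, the type-checker can evaluate both sides of test-fv; with an abstract Atom, the same statement would require a proof about the equality test.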
We now illustrate misbehaving functions with two examples. The function ba computes the list of “bound atoms” of a term, that is, the list of atoms that appear as binders in this term. The function cmp-ba accepts two terms and, if they are ƛ-abstractions, compares the atoms bound by ƛ. The codomains of these functions are respectively List Atom and Bool. At these types, which do not contain any binders, α-equivalence is just equality.

  ba : TmA → List Atom
  ba (ƛ x t)     = x ∷ ba t
  ba (Let x t u) = x ∷ ba t ++ ba u
  ba (t · u)     = ba t ++ ba u
  ba (V _)       = []

  [x]≢[y] : [ x ] ≢ [ y ]
  [x]≢[y] = {! omitted !}

  -- ba does not behave well:
  test-ba : ba tx ≢ ba ty
  test-ba = [x]≢[y]

  cmp-ba : TmA → TmA → Bool
  cmp-ba (ƛ x _) (ƛ y _) = x ==A y
  cmp-ba _       _       = false

  -- cmp-ba does not behave well:
  test-cmp-ba : cmp-ba tx tx ≢ cmp-ba tx ty
  test-cmp-ba () -- true ≢ false

In traditional informal proofs, α-equivalence is identified with equality. This means that every function must map α-equivalent inputs to α-equivalent outputs, or it does not deserve to be called a “function”. Thus, whenever one defines a function, one must prove that it is well-behaved. In this paper, we wish to exploit the type system in order to meet this proof obligation. Using logical relations and parametricity, we will be able to prove, once and for all, that well-typed functions are well-behaved (with respect to a notion of α-equivalence which is itself type-directed). For an appropriate definition of the type of λ-terms, we will find that the function cmp-ba cannot be typed at all, and that the function ba can be typed only at a type that makes it harmless.

3.2 Using well-formedness judgements

One way to define the scoping rules of the object language is to define a well-formedness judgement. This can be done using an inductive predicate over the structure of terms. It is a well-known technique used in the definition of type systems. The standard presentation makes use of a set of inference rules where judgements are made of an environment, a term
and a type. The environment tracks the type of each introduced variable. To specify just the scoping rules, one follows the same presentation, without types. We directly focus on a formal presentation in AGDA since inference rules (and grammars) have a direct translation into inductive types. First, one defines environments, which are just lists of atoms:

  data Env : Set where
    ε   : Env
    _,_ : (Γ : Env) (x : Atom) → Env

Then comes the definition of the membership predicate. It states when an atom is a member of some environment. The definition consists of two rules. One rule applies when the atom is present at the first position in the environment. The other rule applies when it is present in the tail of the environment. We use AGDA comments to give each inference rule a traditional visual appearance:

  data _∈_ x : (Γ : Env) → Set where
    here  : -------------
            x ∈ (Γ , x)

    there : ∀ {y}
          → x ∈ Γ
          → -------------
            x ∈ (Γ , y)

Finally, one can give the scoping rules of the λ-calculus. The definition boils down to using an environment to track bound atoms, using the membership predicate at variable occurrences and pushing the binders onto the environment:

  data _⊢_ Γ : TmA → Set where
    V   : ∀ {x}
        → x ∈ Γ
        → ---------
          Γ ⊢ V x

    _·_ : ∀ {t u}
        → Γ ⊢ t
        → Γ ⊢ u
        → -----------
          Γ ⊢ t · u

    ƛ   : ∀ {t b}
        → Γ , b ⊢ t
        → -----------
          Γ ⊢ ƛ b t

    Let : ∀ {t u b}
        → Γ ⊢ t
        → Γ , b ⊢ u
        → ---------------
          Γ ⊢ Let b t u
We can now state that our example terms are well-scoped in the empty environment. Thanks to implicit arguments and to the definition of the membership predicate, the proof that a term is well-scoped is the term itself, expressed in de Bruijn notation (here and there respectively play the role of zero and successor).

  ⊢id : ε ⊢ idTmA
  ⊢id = ƛ (V here)

  ⊢ap : ε ⊢ apTmA
  ⊢ap = ƛ (ƛ (V (there here) · V here))

However, this technique is not exactly what we are looking for. Indeed, here, the scoping properties have to be explicitly stated on the side of each definition. We are looking for something more integrated into the types. This would enable well-formedness to be enforced in a more pervasive and automatic manner. Fortunately, this technique can be adapted so as to merge the scoping information and the term. We study this idea next.
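To see the judgement at work, here is a self-contained sketch of the definitions of this subsection. It rests on several assumptions that are not in the text above: atoms are instantiated with natural numbers, the environment Γ is bound explicitly in the types of here and there, fixities are chosen so that Γ , b ⊢ t parses as intended, and the lemma no-free-var is a hypothetical addition showing that a free variable is rejected in the empty environment.

```agda
open import Data.Nat using (ℕ)
open import Relation.Nullary using (¬_)

infixl 5 _,_
infix  4 _∈_
infix  3 _⊢_

-- Assumption of this sketch: atoms are natural numbers.
Atom : Set
Atom = ℕ

data TmA : Set where
  V   : (x : Atom) → TmA
  _·_ : (t u : TmA) → TmA
  ƛ   : (b : Atom) (t : TmA) → TmA
  Let : (b : Atom) (t u : TmA) → TmA

data Env : Set where
  ε   : Env
  _,_ : (Γ : Env) (x : Atom) → Env

data _∈_ (x : Atom) : Env → Set where
  here  : ∀ {Γ} → x ∈ (Γ , x)
  there : ∀ {Γ y} → x ∈ Γ → x ∈ (Γ , y)

-- The constructor names are overloaded; types resolve the ambiguity.
data _⊢_ (Γ : Env) : TmA → Set where
  V   : ∀ {x} → x ∈ Γ → Γ ⊢ V x
  _·_ : ∀ {t u} → Γ ⊢ t → Γ ⊢ u → Γ ⊢ t · u
  ƛ   : ∀ {t b} → Γ , b ⊢ t → Γ ⊢ ƛ b t
  Let : ∀ {t u b} → Γ ⊢ t → Γ , b ⊢ u → Γ ⊢ Let b t u

-- Two distinct atoms (an arbitrary choice).
x f : Atom
x = 0
f = 1

apTmA : TmA
apTmA = ƛ f (ƛ x (V f · V x))

-- The derivation mirrors the term, with membership proofs as indices.
⊢ap : ε ⊢ apTmA
⊢ap = ƛ (ƛ (V (there here) · V here))

-- A variable is never well-scoped in the empty environment:
-- no membership proof exists, so the pattern is absurd.
no-free-var : ∀ {a} → ¬ (ε ⊢ V a)
no-free-var (V ())
```

The lemma no-free-var is the formal counterpart of the informal remark that the scoping rules reject dangling names; it is stated on the side, which is precisely the inconvenience that the next subsection addresses.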
3.3 Well-scoped terms

We now merge the inductive definition of terms and the inductive definition of the scoping predicate. Thus, we end up with an inductive type of terms that is indexed by an environment. This environment is extended at binders and queried at variable occurrences:

  data TmJ Γ : Set where
    V   : ∀ {x} → x ∈ Γ → TmJ Γ
    _·_ : TmJ Γ → TmJ Γ → TmJ Γ
    ƛ   : ∀ b → TmJ (Γ , b) → TmJ Γ
    Let : ∀ b → TmJ Γ → TmJ (Γ , b) → TmJ Γ

The idea of encoding well-scopedness (and well-typedness as well) as part of the inductive definition of terms goes back at least as far as Pfenning and Lee (1989). It also appears in the original LF paper (Harper et al., 1993) and in the nested data type approach (Bellegarde & Hook, 1994; Bird & Paterson, 1999; Altenkirch & Reus, 1999). Hence, it is by now widely known. An instance of it in the recent literature is Chlipala’s type-preserving compiler (2007). One can define our two example terms in this style as well:

  idTmJ : TmJ ε
  idTmJ = ƛ x (V here)

  apTmJ : TmJ ε
  apTmJ = ƛ f (ƛ x (V (there here) · V here))
The function fv can easily be adapted to this style, while the function rmA can be reused:

  fv : ∀ {Γ} → TmJ Γ → List Atom
  fv (V {x} _)   = [ x ]
  fv (t · u)     = fv t ++ fv u
  fv (ƛ x t)     = rmA x (fv t)
  fv (Let x t u) = fv t ++ rmA x (fv u)

Alas, the functions ba and cmp-ba are not rejected by this style. Their types become:

  ba     : ∀ {Γ} → TmJ Γ → List Atom
  cmp-ba : ∀ {Γ1 Γ2} → TmJ Γ1 → TmJ Γ2 → Bool

Thus, although we have been able to formally describe the scoping rules, and to ensure that all terms are well-scoped by construction, we have not made much progress towards our goal. Indeed, two different but α-equivalent terms can still be distinguished. Worse, by introducing environments and membership proofs as explicit objects, we have introduced new issues. A function that receives a term as an argument now also receives (or otherwise has access to) the environment in which this term is well-scoped. This extra information can be used by the function. For instance, here are two functions that examine the environment by pattern-matching:

  fast-fv : ∀ {Γ} → TmJ Γ → List Atom
  fast-fv {ε} _ = [] -- for sure the term is closed
  fast-fv     t = fv t

  env-length : Env → ℕ
  env-length ε       = 0
  env-length (Γ , _) = 1 + env-length Γ

  env-length-TmJ : ∀ {Γ} → TmJ Γ → ℕ
  env-length-TmJ {Γ} _ = env-length Γ

The issue is that Γ is a concrete input to the functions fv, ba, cmp-ba, fast-fv, and env-length-TmJ. It is not erased: it is an ordinary argument, which can be examined by the function. The fact that it is an implicit argument offers a notational convenience, nothing more. This is not satisfactory. We would like to think of Γ as an “index”, that is, a piece of information that the type-checker keeps track of in order to reject ill-behaved code and that can be erased prior to running the program. Here, this is impossible: for instance, the value of env-length-TmJ t depends not just on the λ-term that t represents but also on the environment Γ in which t is considered.

In order to address these issues, we make use of abstract types. We replace environments, which have the concrete type Env above, with worlds, where the type World of worlds
remains abstract. Thus, worlds cannot be examined by client code. We set everything up so that worlds need not exist at runtime: that is, they can be erased prior to execution. However, we do not formally prove that worlds can be erased: we leave this issue for future work (see section 13). We replace the type Atom with two distinct types. An atom that is used in a binding position receives the abstract type Binder. We do not equip this type with an equality test, so the function cmp-ba is ruled out. An atom that serves as a (free or bound) occurrence is glued together with a proof of its membership in some world α, and this pair receives the abstract type Name α.

Because these types are abstract, we must equip them with a set of operations, otherwise they would be completely unusable. We select these operations so as to be able to prove, eventually, that well-typed functions are well-behaved. For instance, it is sound to equip the type Name α with an equality test. The framework of logical relations, which we develop in section 7, serves as a guide in the choice of these operations. For each operation independently, it allows us to automatically construct the statement of a lemma that must be proved to justify that this operation is safe.

4 The NomPa interface (nominal fragment)

Our library is called NomPa. To the client, it offers an interface whose types are abstract. Its implementation is hidden from the client. This allows us to exploit parametricity and to prove that well-typed client code is well-behaved. As announced earlier, in sections 4 to 7, we focus on a fragment of the library, which supports programming in nominal style. The rest of the library, which supports programming in de Bruijn style as well as in combinations of the two styles, is presented in section 8. Thus, the listing that appears in Figure 1 represents just the nominal fragment of the library. We now present each ingredient in turn.
First, here are the building blocks needed to define algebraic data types with names and binders in a nominal style.

4.1 Everything we need to define nominal syntax

In order to replace environments in the definition of well-scoped terms, we introduce an abstract notion of worlds. Hence, the interface begins with an abstract type of worlds:

  World : Set

We find it necessary to distinguish atoms that are used in a binding position and atoms that are used as (free or bound) occurrences. We refer to the former as binders and to the latter as names. To that end, we provide abstract types of binders and names. A binder is an atom that is meant to be used in a binding position. Internally, it is just an atom, but this fact is not exposed. Although, by traversing a term, one can gain access to all of the binders that appear in it, this does not imply that one can distinguish two α-equivalent terms. Indeed, the interface does not provide any means of distinguishing two binders: there is no equality test at type Binder.

  Binder : Set
  record NomPa : Set1 where
    constructor mk
    infixr 5 _◅_
    infix  3 _⊆_
    infix  2 _#_
    field
      -- Abstract types for worlds, names, and binders
      World  : Set
      Name   : World → Set
      Binder : Set

    _→N_ : (α β : World) → Set
    α →N β = Name α → Name β

    field
      -- Constructing worlds
      ø   : World
      _◅_ : Binder → World → World

      -- An infinite set of binders
      zeroB : Binder
      sucB  : Binder → Binder

      -- Converting back and forth between names and binders
      nameB : ∀ {α} b → Name (b ◅ α)

      -- There is no name in the empty world
      ¬Nameø : ¬ (Name ø)

      -- Two names can be compared; a binder and a name can be compared
      _==N_   : ∀ {α} (x y : Name α) → Bool
      exportN : ∀ {α b} → Name (b ◅ α) → Name (b ◅ ø) ⊎ Name α

      -- The fresh-for relation
      _#_  : Binder → World → Set
      _#ø  : ∀ b → b # ø
      suc# : ∀ {α b} → b # α → (sucB b) # (b ◅ α)

      -- World inclusion
      _⊆_     : World → World → Set
      coerceN : ∀ {α β} → (α ⊆ β) → (α →N β)
      ⊆-refl  : Reflexive _⊆_
      ⊆-trans : Transitive _⊆_
      ⊆-ø     : ∀ {α} → ø ⊆ α
      ⊆-◅     : ∀ {α β} b → α ⊆ β → (b ◅ α) ⊆ (b ◅ β)
      ⊆-#     : ∀ {α b} → b # α → α ⊆ (b ◅ α)

Figure 1. The NomPa interface (nominal fragment)
The type of names, Name, is indexed by a world. A world α can be thought of informally as a set of atoms. A name of type Name α can be thought of as a pair of an atom and a proof that this atom is a member of the set α. In other words, a name has type Name α if it “makes sense” in the world α.

  Name : World → Set

Throughout the paper, we often use “name transformers”, that is, functions from type Name α to type Name β. We write α →N β as a shorthand for this type.

The intuitive view of worlds as sets of atoms can be referred to as the unary view of worlds. In section 7.3, where we study logical relations, we introduce a binary view of worlds, whereby worlds are viewed as certain relations between sets of atoms. Although the unary view represents a good intuition (and we stick with it for the moment), the binary view is required in order to understand why (and prove that) well-typed functions are well-behaved.

We now introduce a way of extending a world with a binder. This is analogous to the constructor _,_ of section 3.2 for extending an environment with an atom.

  _◅_ : Binder → World → World

If a world is thought of as a set of atoms, then the set b ◅ α is just the union of the singleton set {b} and of the set α. There is no requirement that these sets be disjoint: the world b ◅ α is defined even if b is already a member of α. This reflects the fact that, in the nominal representation, it is permitted for two binders to bind the same name, even if one of them is nested in the other. In that case, one traditionally considers that the most recent binder “hides”, or “shadows”, the previous binder. For this reason, we pronounce b ◅ α as “b hides α”.

At this point, we have everything that is needed to build data types with names and binders. The new definition of Tm is very close to the previous one. All we have to do is use _◅_ instead of _,_, rename Γ to α, and use Name instead of a pair of an atom and a membership proof:

  data Tm α : Set where
    V   : Name α → Tm α
    _·_ : Tm α → Tm α → Tm α
    ƛ   : ∀ b → Tm (b ◅ α) → Tm α
    Let : ∀ b → Tm α → Tm (b ◅ α) → Tm α

  _→Tm_ : (α β : World) → Set
  α →Tm β = Tm α → Tm β

The description of the binding constructs ƛ and Let relies on dependent types: the binder b is used in the type of the subterm.

Here is a trivial example of a function that traverses a term and measures its size. It is remarkable for its simplicity: name abstractions are traversed without fuss. The code
would be exactly the same in the bare nominal approach of section 3.1. It is efficient: no renaming is involved. In comparison, in FreshML (Shinwell et al., 2003) or in the locally nameless approach (Aydemir et al., 2008; Charguéraud, 2011), crossing a binder can have a cost that is linear in the size of the subterm. Polymorphic recursion is exploited: on the line that deals with ƛ, the recursive call to size t is at some inner world.

    size : ∀ {α} → Tm α → ℕ
    size (V _)       = 1
    size (t · u)     = 1 + size t + size u
    size (ƛ _ t)     = 1 + size t
    size (Let _ t u) = 1 + size t + size u
4.2 Building binders and names

In order to build terms, we need binders and names. Binders are introduced via two primitive operations called zeroB and sucB. We then lift any natural number to a binder with a cheap convenience function:

    zeroB : Binder
    sucB  : Binder → Binder

    _B : ℕ → Binder
    zero B    = zeroB
    (suc n) B = sucB (n B)

In effect, binders are natural numbers with a restricted interface. Here, we give only zero and successor, but all of the arithmetic operations on natural numbers could be exposed as well. One key limitation, however, is that no information must be allowed to “leak” out of a binder. Hence, no equality test over binders is provided. The reason why this must be so will appear more clearly when we build the logical relation in section 7.3.

In order to build names, we provide a function, called nameB, which turns a binder into a name.

    nameB : ∀ {α} b → Name (b ◅ α)

While nameB can turn an arbitrary binder b into a name, it imposes that the resulting name inhabit a world of the form b ◅ α. Even though α can be instantiated at will, this is an important limitation. We will soon introduce a notion of world inclusion that allows overcoming this limitation (section 4.3). For the moment, we have enough building blocks to define a representation of the identity function!

    idTm : ∀ {α} → Tm α
    idTm = ƛ x (V (nameB x))
      where x = 0 B
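As a further illustration of these building blocks, the following sketch builds a representation of the self-application term λx. x x. It is not taken from the library: the interface fragment is postulated so that the block can be type-checked on its own, and the name deltaTm and the fixity chosen for _·_ are assumptions made for this example.

```agda
-- A stand-alone sketch: the NomPa fragment used here is postulated
-- (an assumption, so that the block does not depend on the library).
infixr 5 _◅_
infixl 6 _·_

postulate
  World  : Set
  Binder : Set
  Name   : World → Set
  _◅_    : Binder → World → World
  zeroB  : Binder
  nameB  : ∀ {α} b → Name (b ◅ α)

data Tm (α : World) : Set where
  V   : Name α → Tm α
  _·_ : Tm α → Tm α → Tm α
  ƛ   : ∀ b → Tm (b ◅ α) → Tm α

-- λx. x x: both occurrences of x must inhabit the world x ◅ α,
-- which is exactly the shape of world that nameB produces.
deltaTm : ∀ {α} → Tm α
deltaTm = ƛ x (V (nameB x) · V (nameB x))
  where x = zeroB
```

Because every variable occurrence refers to the nearest enclosing binder, no world inclusion witness is needed here.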
As another example, here is a representation of the λ-term that represents “false” in Church’s encoding, that is, λx.λx.x. We use the binder x twice on purpose to show that shadowing is permitted. To type-check the variable occurrence of x, the world x ◅ α is implicitly passed to nameB, which returns a name in the world x ◅ x ◅ α. Then, each of the two “ƛ x”’s takes away one “x ◅”, so the final term inhabits the world α.

    falseTm : ∀ {α} → Tm α
    falseTm = ƛ x (ƛ x (V (nameB x)))
      where x = 0 B

However, one faces an issue when one attempts to build a representation of the Church encoding of “true”, that is, λx.λy.x. The naïve approach does not type-check:

    -- this does not type-check
    trueTm : ∀ {α} → Tm α
    trueTm = ƛ x (ƛ y (V (nameB x)))
      where x = 0 B
            y = 1 B

While nameB x inhabits any world of the form x ◅ β, the context expects a name in the world y ◅ x ◅ α. We introduce means of moving a name from one world to another in section 4.3.

Closed terms. The above well-typed terms (idTm and falseTm) have no free variables. They are said to be closed. They both have type ∀ {α} → Tm α. This world-polymorphic type reflects the closedness property. Indeed, if any world will do, then the empty world, written ø, will do as well. Thus, idTm also has type Tm ø. Conversely, because the empty world is included in every world, we will be able to move any term from the empty world to an arbitrary world. In short, Tm ø and ∀ {α} → Tm α are isomorphic types.

To allow exploiting the fact that a term that inhabits the empty world must be closed, we introduce ¬Nameø, which witnesses that there is no name in the empty world:

    ø      : World
    ¬Nameø : ¬ (Name ø)

In various situations, this operation enables arguing that “this case is impossible”. For instance, when writing a function that looks up a name in an environment represented as a list, one typically uses ¬Nameø in the case where the environment is the empty list. This amounts to arguing that “because the name we are looking for is properly bound, the search cannot fall off the end of the environment”.

4.3 World inclusion

It is often necessary to widen the world which a name inhabits. Our earlier attempt to define trueTm illustrates this: for this definition to type-check, we need to widen the world
of the name x. Widening a world causes a loss of static information. For instance, we have seen that a term in the empty world must be closed. If one widens its world, then this term is no longer statically known to be closed. We call this operation “weakening” because it causes the loss of some static information. The name “weakening” is also widely used for the same purpose when referring to typing environments.

To account for the multiple ways in which we could widen a world, we introduce a world inclusion predicate. In Agda terminology, we introduce a type _⊆_ for witnesses of world inclusion. If a world α is included in a world β then it is permitted to transport a name (and a term as well, as we shall see) from α to β. The primitive operation coerceN serves this purpose. It takes an inclusion witness and a name, and returns the same name at a wider world.

    _⊆_     : World → World → Set
    coerceN : ∀ {α β} → α ⊆ β → (α →N β)

We also introduce an alias for coerceN, called _⟨-because_-⟩. It helps keep the code separate from the type-checking argument: the proof of world inclusion, which appears between the angle brackets, can safely be skipped by the reader.

    infix 0 _⟨-because_-⟩
    _⟨-because_-⟩ : ∀ {α β} → Name α → α ⊆ β → Name β
    _⟨-because_-⟩ n pf = coerceN pf n

    -- We can now write:
    x ⟨-because some proof -⟩
    -- Which is less noisy than:
    coerceN (some proof) x
World inclusion rules. A set of world inclusion rules is given in figure 1. World inclusion is reflexive and transitive (⊆-refl and ⊆-trans). The empty world is a least element for world inclusion (⊆-ø). For every binder b, the operation b ◅ _ is covariant, which means that it preserves world inclusion (⊆-◅).

The last world inclusion rule, ⊆-#, states that a world α is included in b ◅ α under the condition that b is not a member of α. This condition is required for soundness. At this point, we expect that the reader might be surprised, because an interpretation of worlds as sets suggests that this condition is unnecessary: indeed, if α is a set, then it is unconditionally a subset of { b } ∪ α. This shows the limitations of this way of thinking. When we introduce logical relations in section 7.3, we show that interpreting worlds as relations provides a richer viewpoint and explains why this condition is necessary. Without waiting until then, the following code snippet demonstrates that in the absence of this condition one could write ill-behaved programs:

    wrong : Binder → Binder → Bool
    wrong x y = nameB x ==N nameB y ⟨-because wrongProof -⟩
      where postulate wrongProof : y ◅ ø ⊆ x ◅ y ◅ ø
As this code snippet shows, in the absence of this condition, it would be permitted to compare names in different worlds, hence to compare binders, and, finally, to distinguish two α-equivalent terms.

The “fresh-for” relation. The last inclusion rule, ⊆-#, uses a “fresh-for” relation, written _#_. As suggested above, the proposition b # α must guarantee at least that the name b is not a member of the world α, interpreted as a set. In fact, we choose to equip the “fresh-for” relation with a strictly stronger meaning. Taking advantage of the fact that atoms are integers internally, we take b # α to mean that b dominates α, that is, b is strictly greater than every name that inhabits α.

We introduce two rules to produce witnesses of the relation _#_. These rules appear in figure 1. The first rule, _#ø, states that every binder is fresh for the empty world. The second rule, suc#, is the reason why we equip _#_ with a stronger meaning. It states that, if b dominates α, then the successor of b dominates b ◅ α. This gives us a simple and efficient way of building “fresh-for” witnesses. We note that, although this axiomatization of the “fresh-for” relation is not complete, it has proved sufficient for our purposes.

Emptiness of worlds. The inclusion relation can express emptiness. A world α is empty if and only if α is included in ø. We favor this definition of emptiness, as opposed to an intensional equality with the empty world, because it is more flexible. In section 9, we introduce a new operation on worlds, called +1. We will see that the world ø +1 is empty, in the sense that it is included in the empty world, yet it does not reduce to ø.

By combining coerceN and ¬Nameø, we obtain a proof that, if the world α is empty, then there is no name in the world α. This formulation reduces the goal of obtaining a contradiction to a world inclusion goal. This is beneficial if there is automation for constructing inclusion proofs.
    ¬Name : ∀ {α} → α ⊆ ø → ¬ (Name α)
    ¬Name α⊆ø = ¬Nameø ∘ coerceN α⊆ø

One can also introduce a version of this operation whose codomain is an arbitrary type A instead of the empty type:

    Name-elim : ∀ {A : Set} {α} → α ⊆ ø → Name α → A
    Name-elim pf x = ⊥-elim (¬Name pf x)

Here is a prototypical example of a function where we can avoid the case for variables:

    ex-Name-elim : Tm ø → {! ... !}
    ex-Name-elim (t · u)     = {! ... !}
    ex-Name-elim (ƛ b t)     = {! ... !}
    ex-Name-elim (Let b t u) = {! ... !}
    -- This last case is discarded by Name-elim
    ex-Name-elim (V x)       = Name-elim ⊆-refl x
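The two rules for the “fresh-for” relation discussed earlier in this section chain in a mechanical way. The sketch below, which postulates the relevant fragment of the interface so that it can be checked in isolation, derives witnesses for the first three binders. The names fresh0, fresh1 and fresh2 are hypothetical and do not come from the library.

```agda
-- Stand-alone sketch: the interface fragment is postulated (an assumption).
infixr 5 _◅_
infix  2 _#_

postulate
  World  : Set
  Binder : Set
  ø      : World
  _◅_    : Binder → World → World
  zeroB  : Binder
  sucB   : Binder → Binder
  _#_    : Binder → World → Set
  _#ø    : ∀ b → b # ø
  suc#   : ∀ {α b} → b # α → sucB b # (b ◅ α)

-- Binder 0 is fresh for the empty world.
fresh0 : zeroB # ø
fresh0 = zeroB #ø

-- Binder 1 dominates the world that contains binder 0.
fresh1 : sucB zeroB # (zeroB ◅ ø)
fresh1 = suc# fresh0

-- Binder 2 dominates the world that contains binders 1 and 0.
fresh2 : sucB (sucB zeroB) # (sucB zeroB ◅ zeroB ◅ ø)
fresh2 = suc# fresh1
```

Each application of suc# both advances the binder and extends the world with the previous binder, which is why a witness obtained this way always relates a binder to a world built from all smaller binders.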
Relational reasoning. Sometimes, one must build complex inclusion witnesses. While inference would be of great effect here, we leave it for future work. For the time being, we propose a modest syntactic tool to help build inclusion witnesses. The ⊆-Reasoning module gives access to the transitivity rule ⊆-trans in a style which focuses on the intermediate states of the reasoning, as opposed to the reasoning steps. The syntax is a list of worlds interspersed with inclusion witnesses. After the last world, a box ends the proof. We present an example of the use of this notation when we define trueTm in the next paragraph. The code for ⊆-Reasoning is a trick commonly used in the Agda standard library (Danielsson, 2011) and is given below for reference, but can safely be skipped.

    module ⊆-Reasoning where
      infix  2 _∎
      infixr 2 _⊆⟨_⟩_

      _⊆⟨_⟩_ : ∀ α {β γ} → α ⊆ β → β ⊆ γ → α ⊆ γ
      _⊆⟨_⟩_ _ = ⊆-trans

      _∎ : ∀ α → α ⊆ α
      _∎ _ = ⊆-refl
Building any term. We can now finally build all nominal terms (up to α-equivalence). In section 5.5, we prove that we can build all λ-terms by defining a function that converts a “bare nominal” λ-term in the style of section 3.1 to a term of type Tm. Pouillard (2012) shows that our system can encode not just the type of λ-terms, but more generally an arbitrary nominal signature in the sense of Pitts (2006). For the time being, we focus on the term trueTm, and show that we can now type-check it.

    trueTm : ∀ {α} → Tm α
    trueTm {α} = ƛ x (ƛ y (V xN))
      where x  = 0 B
            y  = 1 B
            xN = nameB x ⟨-because
                   x ◅ ø      ⊆⟨ ⊆-# (suc# (x #ø)) ⟩
                   y ◅ x ◅ ø  ⊆⟨ ⊆-◅ y (⊆-◅ x ⊆-ø) ⟩
                   y ◅ x ◅ α  ∎
                 -⟩
The proof that is required in order to “move x into the correct world” is involved in comparison with the simplicity of the example. Here, by fixing the empty world instead of allowing world polymorphism, we could cut the proof in half. To build larger examples, we define smart constructors which require only that one specify the distance to the binding site (Pouillard, 2011b).
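To make the remark about fixing the empty world concrete, here is a sketch of a variant of trueTm that inhabits Tm ø only. It is an illustration under stated assumptions rather than code from the library: the interface fragment is postulated so that the block stands alone, and the names x-in-scope and trueTmø are hypothetical.

```agda
-- Stand-alone sketch: the interface fragment is postulated (an assumption).
infixr 5 _◅_
infix  3 _⊆_
infix  2 _#_

postulate
  World   : Set
  Binder  : Set
  Name    : World → Set
  ø       : World
  _◅_     : Binder → World → World
  zeroB   : Binder
  sucB    : Binder → Binder
  nameB   : ∀ {α} b → Name (b ◅ α)
  _#_     : Binder → World → Set
  _#ø     : ∀ b → b # ø
  suc#    : ∀ {α b} → b # α → sucB b # (b ◅ α)
  _⊆_     : World → World → Set
  coerceN : ∀ {α β} → α ⊆ β → Name α → Name β
  ⊆-#     : ∀ {α b} → b # α → α ⊆ (b ◅ α)

data Tm (α : World) : Set where
  V   : Name α → Tm α
  _·_ : Tm α → Tm α → Tm α
  ƛ   : ∀ b → Tm (b ◅ α) → Tm α

x : Binder
x = zeroB

y : Binder
y = sucB zeroB

-- With the outer world fixed to ø, a single inclusion step suffices:
-- y is the successor of x, hence fresh for x ◅ ø.
x-in-scope : (x ◅ ø) ⊆ (y ◅ x ◅ ø)
x-in-scope = ⊆-# (suc# (x #ø))

-- λx. λy. x, as a term of the empty world only.
trueTmø : Tm ø
trueTmø = ƛ x (ƛ y (V (coerceN x-in-scope (nameB x))))
```

The step through ⊆-◅ and ⊆-ø disappears because nameB x can be instantiated directly at the world x ◅ ø; the price is that the resulting term is no longer world-polymorphic.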
4.4 Comparing and refining names
While two binders cannot be compared, our interface allows comparing two names that inhabit a common world. This may seem contradictory, since one can turn binders into names. In fact, binder comparison cannot be implemented in terms of name comparison because two arbitrary binders can be turned only into names in distinct worlds.

    _==N_ : ∀ {α} (x y : Name α) → Bool

While world inclusion gives a means of weakening the type of a name, we also need a means of strengthening the type of a name, that is, of refining its world. Naturally, this requires a dynamic check. We choose to offer an operation that compares a binder b and a name x and refines the type of x according to the outcome of the comparison. This operation is known as exportN. Assume that x is in the scope of b, that is, the name x inhabits the world b ◅ α. The function exportN tests whether x is equal to b. If so, x is refined to the world b ◅ ø, which only b inhabits. Otherwise, it is refined to α. In short, given a name of type Name (b ◅ α), the function exportN returns the same name, with a refined type that tells whether this name stands on the b side or on the α side.

    exportN : ∀ {b α} → Name (b ◅ α) → Name (b ◅ ø) ⊎ Name α
    exportN = maybe inj₂ (inj₁ (nameB _)) ∘′ exportN?

Actually, the primitive operation offered by the library is the function exportN?, which returns a result of type Maybe (Name α). The function exportN is then built on top of exportN?.

    -- A →? B is the type of partial functions from A to B
    A →? B = A → Maybe B

    exportN? : ∀ {b α} → Name (b ◅ α) →? Name α

The idea that a dynamic check yields refined static type information is quite old, and it is difficult to determine to whom it should be attributed. It is present, for instance, in Floyd and Hoare’s rules for reasoning about programs (Floyd, 1967; Hoare, 1969). Nevertheless, it is worth noting that exportN? is very much analogous to McBride’s type-refining name comparison operation thick (McBride, 2003). The function exportTm?, which we define later on top of exportN? (section 5.4.7), is analogous to McBride’s type-refining occur-check, but is able to deal with terms that contain binders.

On top of exportN?, we also build a convenient eliminator for names. It is simply the elimination of the result of exportN?.

    exportWith : ∀ {b α A} → A → (Name α → A) → Name (b ◅ α) → A
    exportWith v f = maybe f v ∘′ exportN?
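The eliminator exportWith can be exercised in isolation. The sketch below postulates its type, as stated above, together with the few interface items it mentions, and defines two small clients. The names isBound and eqOuter, as well as the local definition of Bool, are assumptions introduced for this example.

```agda
-- Stand-alone sketch: exportWith and its context are postulated (an assumption).
infixr 5 _◅_

data Bool : Set where
  true false : Bool

postulate
  World      : Set
  Binder     : Set
  Name       : World → Set
  _◅_        : Binder → World → World
  exportWith : ∀ {b α} {A : Set} → A → (Name α → A) → Name (b ◅ α) → A

-- Does this name refer to the nearest enclosing binder b?
isBound : ∀ {b α} → Name (b ◅ α) → Bool
isBound = exportWith true (λ _ → false)

-- Lift a comparison against an outer name x through the binder b:
-- a name captured by b is reported as different from x, and any other
-- name is first refined to the world α and then compared using eq.
eqOuter : ∀ {b α} → (Name α → Name α → Bool) → Name α → Name (b ◅ α) → Bool
eqOuter eq x = exportWith false (eq x)
```

In both definitions the first argument of exportWith handles the case where the name is the binder itself, and the second argument receives the name at its refined type Name α.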
5 Programming on top of NomPa (nominal fragment)

5.1 Example: computing free variables

We now have enough tools to present a more interesting example, namely the function fv, which constructs a list of the free variables of a term. At variables and applications, the code is straightforward. At a name abstraction, one easily collects the free variables of the body via a recursive call. However, this yields a list of names that inhabit the inner world of the abstraction, that is, a value of type List (Name (b ◅ α)). This list cannot be returned, because the codomain of fv is declared to be List (Name α). This is fortunate, since returning this list would let the bound name leak out of its scope! As before, we rely on an auxiliary function, rm, which removes all occurrences of a binder b in a list of names. A new feature of rm is that it now performs type refinement in the style of (and by using) exportN.

    rm : ∀ {α} b → List (Name (b ◅ α)) → List (Name α)
    rm b []       = []
    rm b (x ∷ xs) with exportN x -- b is implicit
    ... {- bound: x≡b -} | inj₁ _  = rm b xs
    ... {- free:  x≢b -} | inj₂ x′ = x′ ∷ rm b xs

    fv : ∀ {α} → Tm α → List (Name α)
    fv (V x)       = [ x ]
    fv (fct · arg) = fv fct ++ fv arg
    fv (ƛ b t)     = rm b (fv t)
    fv (Let b t u) = fv t ++ rm b (fv u)
The function rm applies exportN {b} to every name x in the list and builds a list of only those x’s that successfully export to the world α. It exhibits a typical way of using exportN to perform a comparison of a name against a binder together with a type refinement. This idiom is recurrent in the programs that we have written.
5.2 Example: working with environments
Here is another example, where we introduce the use of an environment.

    occurs : ∀ {α} → Name α → Tm α → Bool
    occurs x0 = occ (λ y → x0 ==N y)
      where
        OccEnv : World → Set
        OccEnv α = Name α → Bool

        extend : ∀ {α b} → OccEnv α → OccEnv (b ◅ α)
        extend = exportWith false

        occ : ∀ {α} → OccEnv α → Tm α → Bool
        occ Γ (V x)       = Γ x
        occ Γ (t · u)     = occ Γ t ∨ occ Γ u
        occ Γ (ƛ _ t)     = occ (extend Γ) t
        occ Γ (Let _ t u) = occ Γ t ∨ occ (extend Γ) u
The function occurs tests whether the name x0 occurs free in a term. An environment Γ is carried down, augmented when a binder is crossed, and looked up at variable occurrences. This environment is represented as a function of type Name α → Bool. The definition of extend states how to look up x in the environment extend Γ. (Recall that the function extend takes two implicit parameters, so extend Γ is synonymous with extend {α} {b} Γ.) To this end, one must first compare x and b. If x and b are equal, then this occurrence of x is not free, so occ Γ ( V x ) must return false. If they differ, one must look up x in Γ. This case analysis is concisely implemented by using the function exportWith, which was built on top of exportN. We believe that this code is written in a relatively natural and uncluttered style. There is no hidden cost: no renaming is required when a name abstraction is crossed. The type system forces us to use names in a sound way. For instance, in the definition of occ, forgetting to extend the environment when crossing a binder (that is, writing Γ instead of extend Γ) would cause a type error. In the definition of extend, attempting to check whether x occurs in Γ without first comparing x and b would cause a type error. Recall that the definition of the type Tm allows a newer binding to shadow an earlier one. Our type discipline guarantees that the code works properly in the presence of shadowing. Although representing an environment as a function is a simple and elegant representation, others exist. For instance, in the case of occurs, we could represent the environment as a list of binders: the code for this variant is online (Pouillard, 2011b).
Let us now consider an example where an environment is represented as an explicit data structure, namely an association list, where keys are binders. Here are the definitions of this data structure and of the environment lookup function:

    data DataEnv (A : Set) : (α β : World) → Set where
      ε     : ∀ {β} → DataEnv A β β
      _,_↦_ : ∀ {α β} (Γ : DataEnv A α β) b (x : A) → DataEnv A (b ◅ α) β

    lookup : ∀ {A α β} → DataEnv A α β → Name α → Name β ⊎ A
    lookup ε           = inj₁
    lookup (Γ , _ ↦ v) = exportWith (inj₂ v) (lookup Γ)

The type DataEnv A α β is the type of an environment, or environment fragment, where every name in the environment is associated with a datum of type A. We refer to the parameter α as the “inner world”, and to the parameter β as the “outer world”. The outer world can be thought of as the world that exists before the binders in the environment are introduced. The inner world is the world obtained after these binders are introduced.

The expression lookup Γ x looks up the name x in the environment Γ. The name x must make sense in the scope of Γ, that is, x must inhabit the inner world α. If x is found among the bindings, then the information associated with x is returned. This information has type A. If x is not found among the bindings, then x is returned, with a more precise type: indeed, since x is not among the names introduced by Γ, it must make sense outside Γ, that is, in the outer world β.

We illustrate the use of DataEnv with an alternative definition of the function fv. Here, the payload type parameter A is instantiated with the unit type ⊤. This variant avoids the need to take the bound atoms off the list by not inserting them in the first place. At variable occurrences, we use lookup to test whether the name is free or bound. If it is free, we wrap it in a singleton list (using the function [_]) and return it. If it is bound, we ignore it and return an empty list (using the function const []). The function [_,_]′ allows eliminating the sum produced by lookup. At every other node, we simply carry out a recursive traversal. Whenever a name abstraction is entered, the current environment Γ is extended with the bound name b.

    fv' : ∀ {α β} → DataEnv ⊤ α β → Tm α → List (Name β)
    fv' Γ (V x)       = [ [_] , const [] ]′ (lookup Γ x)
    fv' Γ (t · u)     = fv' Γ t ++ fv' Γ u
    fv' Γ (ƛ b t)     = fv' (Γ , b ↦ _) t
    fv' Γ (Let b t u) = fv' Γ t ++ fv' (Γ , b ↦ _) u

    fv : ∀ {α} → Tm α → List (Name α)
    fv = fv' ε

Admittedly, neither functions nor lists are the most efficient representation of environments. It would be nice to be able to implement environments using, say, balanced binary
search trees. At the moment, this cannot be done by the user, outside our library. The reason is, the library does not expose a total ordering on names. We cannot expose such an ordering: the logical relation which we build in section 7 forbids it. The library could, however, offer an efficient implementation of association maps whose keys are names: this would be permitted by the logical relation. We leave a deeper study of this issue for future work.
5.3 Example: comparing terms

We now show how terms can be tested for α-equivalence, or, more generally, for α-equivalence up to a certain relation over their free names. We first define the type |Cmp| F of functions that compare F-structures, where F is an indexed type. In the following, the type Ix of indices will be instantiated with World, and the type F will be instantiated with Name or Tm.

    |Cmp| : ∀ {Ix} (F : Ix → Set) (i j : Ix) → Set
    |Cmp| F i j = F i → F j → Bool

In order to compare two terms, we carry down an environment, which tells us how to compare two names. At name occurrences, we consult this environment. At application nodes, we carry it down. At name abstractions, we must extend it. To this end, we define the function extendNameCmp. This function receives a name comparator f for worlds α1 and α2, a name x1 in the world b1 ◅ α1, and a name x2 in the world b2 ◅ α2. Then, the function extendNameCmp attempts to export x1 through b1 and x2 through b2. If both attempts succeed, then we can use f to compare x1 and x2. If both attempts fail, then x1 is b1 and x2 is b2; hence, each of these two names is equal to the nearest enclosing binder, and we return true. If one attempt succeeds and the other fails, then we return false.

    extendNameCmp : ∀ {α1 α2 b1 b2} → |Cmp| Name α1 α2
                                    → |Cmp| Name (b1 ◅ α1) (b2 ◅ α2)
    extendNameCmp f x1 x2 with exportN? x1 | exportN? x2
    ... | just x1′ | just x2′ = f x1′ x2′
    ... | nothing  | nothing  = true
    ... | _        | _        = false

    cmpTm : ∀ {α1 α2} (Γ : |Cmp| Name α1 α2) → |Cmp| Tm α1 α2
    cmpTm Γ (V x1)        (V x2)        = Γ x1 x2
    cmpTm Γ (t1 · u1)     (t2 · u2)     = cmpTm Γ t1 t2 ∧ cmpTm Γ u1 u2
    cmpTm Γ (ƛ _ t1)      (ƛ _ t2)      = cmpTm (extendNameCmp Γ) t1 t2
    cmpTm Γ (Let _ t1 u1) (Let _ t2 u2) = cmpTm Γ t1 t2 ∧ cmpTm (extendNameCmp Γ) u1 u2
    cmpTm _ _             _             = false
In the above code, there is no need to compare names or binders found in the first term with names or binders found in the second term. Instead, we consider that two bound names are equal if and only if they were bound at the same time. In short, bound names are compared positionally. This explains why the lack of a function that compares binders is not a problem.

The function cmpTm must be able to accept two terms in different worlds for the recursion to go through successfully. In the end, though, this generality is often unnecessary. By supplying the name comparison function _==N_ as the initial name comparator, we obtain a specialized version of cmpTm, baptised _==Tm_. This homogeneous comparison function tests whether its arguments are α-equivalent.

    _==Tm_ : ∀ {α} → Tm α → Tm α → Bool
    _==Tm_ = cmpTm _==N_
5.4 Kits and traversals

We have seen that working with worlds requires explicitly moving names from world to world using operations like coerceN and exportN?. It quickly appears that it is necessary to lift these operations to user-defined algebraic data types, such as the type Tm, so that user-defined data structures can be moved from world to world. More generally, a number of operations on names can be lifted to user-defined algebraic data types.

Our experience with the library leads us to emphasize two points. First, in most of the operations on terms that we are about to present, only the parts that deal with the binding structure vary. The code that carries out the traversal is fixed and can be written only once. Second, the parts that are specific to each operation can be made reusable, so as to work not only with the generic traversal but with custom traversals as well. In order to share code and separate concerns, we introduce some infrastructure, which we later instantiate for the type Tm.

5.4.1 Traversal kits

We begin by introducing the notion of a traversal kit. A traversal kit is a record whose components indicate how a traversal should deal with names and binders. The first component of a traversal kit is the parameterized type Env of the environments that are carried down during the traversal. We are interested in traversals that perform some kind of translation. Thus, in an environment of type Env α β, the parameter α represents the world which the original term inhabits, while the parameter β represents the world which the transformed term inhabits. Such an environment maps names of type Name α to data of type Res β, where the type Res is itself a component of the traversal kit, so that it can vary from application to application.

The last three components of a traversal kit are functions. The function trName looks up a name in the environment, and is typically used when the traversal reaches an occurrence of a variable. The function trBinder indicates how to translate a binder. For instance, this function could be the identity. Or, if there is a need to avoid capture (as is the case
when defining capture-avoiding substitution), it could be a function that returns a fresh binder. The function trBinder has access to the environment, which, as we will see, can encapsulate a supply of fresh binders. Finally, the function extEnv indicates how to extend the environment when descending under a binder. It is worth noting that, while the source world α is extended with the binder b, the destination world β is extended with its image through the translation, that is, trBinder ∆ b.

    record TrKit (Env : (α β : World) → Set)
                 (Res : World → Set) : Set where
      constructor mk
      field
        trName   : ∀ {α β} → Env α β → Name α → Res β
        trBinder : ∀ {α β} → Env α β → Binder → Binder
        extEnv   : ∀ {α β} b (∆ : Env α β) → Env (b ◅ α) (trBinder ∆ b ◅ β)
In the following, we build a number of traversal kits. The coercing kit allows applying a world inclusion witness. The renaming kit allows applying a potentially effectful function of names to names. The substitution kit allows applying a function of names to terms. We also present a few ways of building new kits out of existing kits.

5.4.2 The coercing kit

Our first kit, called coerceKit, is simple and to the point. Its environment type is _⊆_. Its result type is Name. The action on names is coerceN. The action on binders is the identity, which means that this kit does not perform any kind of renaming or freshening. Finally, the environment extension operation is just the world inclusion rule ⊆-◅.

    coerceKit : TrKit _⊆_ Name
    coerceKit = mk coerceN (const id) ⊆-◅

To illustrate the use of coerceKit, we lift coerceN from names to terms. Here is an inductive definition of the function coerceTm:

    module CoerceTmWithCoerceKit where
      open TrKit coerceKit

      coerceTm : ∀ {α β} → α ⊆ β → α →Tm β
      coerceTm ∆ (V x)       = V (trName ∆ x)
      coerceTm ∆ (t · u)     = coerceTm ∆ t · coerceTm ∆ u
      coerceTm ∆ (ƛ b t)     = ƛ (trBinder ∆ b) (coerceTm (extEnv b ∆) t)
      coerceTm ∆ (Let b t u) = Let (trBinder ∆ b) (coerceTm ∆ t) (coerceTm (extEnv b ∆) u)
The function coerceTm takes an inclusion witness and an input term. The inclusion witness is carried down during the traversal, used at names, and extended at abstractions. In this code, because of the declaration open TrKit coerceKit, the variable trName refers to the trName component of coerceKit, which by definition is coerceN. Similarly, trBinder is the identity, and extEnv is ⊆-◅.

We have formulated this code in such a way that the traversal is actually independent of which traversal kit is used. Permitting such a formulation is the reason why we introduced traversal kits in the first place. We will soon see that it is possible to define a generic traversal function and to redefine coerceTm as an instance of the generic traversal with the coercing kit (sections 5.4.6 and 5.4.7).

5.4.3 The renaming kits

We now wish to define a “renaming kit” that allows applying an arbitrary function of names to names to (the free names of) a term. In order to avoid capture, we must perform “freshening”, that is, replace the binders found in the original term with fresh binders. For this purpose, we introduce the concept of a name supply. A name supply for the world α is just a pair of a binder, called seedB, and a proof that seedB is fresh for α:

    record Supply α : Set where
      constructor _,_
      field
        seedB : Binder
        seed# : seedB # α

It may seem surprising that a single fresh binder can be thought of as a name supply. The reason is that, thanks to the operation suc# (which was presented in section 4), a single fresh binder gives rise to an infinite stream of fresh binders. The function sucs, defined below, helps do this: it increments both the seed and the “fresh-for” proof. The constant zeros is an initial name supply.

    zeros : Supply ø
    zeros = 0 B , 0 B #ø

    sucs : ∀ {α} → (s : Supply α) → Supply (Supply.seedB s ◅ α)
    sucs (seedB , seed#) = sucB seedB , suc# seed#

In our system, a world-polymorphic function that does not need to generate fresh names is parameterized over just a world α, whereas a world-polymorphic function that needs to generate fresh names is typically parameterized over a world α and a name supply of type Supply α.

We are now in a position to define a renaming kit. We first define its environment type, SubstEnv α β. It is a record type. Its first two components, Res and trName, specify an action on names. This action is chosen by the end user: hence, the renaming kit is parametric in it. The last component of the environment, supply, is a name supply for
ZU064-05-FPR
rubtmp13
28
June 15, 2012
14:48
N. Pouillard and F. Pottier
the destination world β. This reflects the fact that we need to create fresh binders in the transformed term. record SubstEnv ( Res : World → Set ) α β : Set where constructor , field trName : Name α → Res β supply : Supply β open Supply supply public The renaming kit, renameKit, is defined as follows. First, we let Res be Name. RenameEnv : (α β : World ) → Set RenameEnv = SubstEnv Name Then, we provide definitions for the functions trName, trBinder, and extEnv. The definition of trName is trivial: it is the trName component of the environment. The function trBinder uses the supply component of the environment to obtain a fresh binder. The definition of extEnv is the most involved part of the kit. Because an environment is a pair of an action trName and a supply, the job of extEnv is to lift these two components through a binder. The manner in which trName is lifted, so as to obtain a new function trName0 , is depicted in figure 2. The function trName0 takes a name and uses exportWith to test whether this name is bound or free. If this name is bound (that is, equal to b), then the fresh binder that was chosen to stand for b, namely seedB, is returned. If this name is free, then exportWith refines its type to Name α, which allows us to apply trName to it. This produces a name in the world β, which is then imported back using coerceN. This call to coerceN is valid only because seedB is known to be fresh for the destination world. renameKit : TrKit RenameEnv Name renameKit = mk SubstEnv.trName trBinder extEnv where -- Each binder is translated to a fresh binder trBinder : ∀ {α β} → RenameEnv α β → Binder → Binder = seedB trBinder ( , ( seedB , ) ) extEnv : ∀ {α β} b (∆ : RenameEnv α β ) → RenameEnv ( b / α ) ( / β ) extEnv ( trName , ( seedB , seed#β ) ) 0 s B , ( suc ( seed , seed#β ) ) ) = ( trName where trName0 = exportWith ( nameB seedB) -- bound ( coerceN ( ⊆-# seed#β ) ◦ trName ) -- free
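To make the role of the name supply more concrete, the following standalone Agda sketch restates Supply, zeros, and sucs against a postulated interface, and adds a hypothetical helper, nthB, that reads off the n-th binder of the stream generated by a supply. The postulates merely stand in for the library's abstract types and operations; the names SupplySketch, zeroB#∅, nthB, and third-binder are assumptions of this sketch and are not part of the library.

```agda
module SupplySketch where

open import Agda.Builtin.Nat      using (Nat; zero; suc)
open import Agda.Builtin.Equality using (_≡_; refl)

-- Assumed abstract interface (postulated for the purpose of this sketch).
postulate
  World   : Set
  Binder  : Set
  ∅       : World
  _◃_     : Binder → World → World
  _#_     : Binder → World → Set
  zeroB   : Binder
  sucB    : Binder → Binder
  zeroB#∅ : zeroB # ∅
  suc#    : ∀ {α b} → b # α → sucB b # (b ◃ α)

infixr 5 _◃_

-- A name supply: a binder together with a proof that it is fresh for α.
record Supply (α : World) : Set where
  constructor _,_
  field
    seedB : Binder
    seed# : seedB # α

-- The initial supply, for the empty world.
zeros : Supply ∅
zeros = zeroB , zeroB#∅

-- Stepping a supply: the old seed enters the world, the new seed is its successor.
sucs : ∀ {α} (s : Supply α) → Supply (Supply.seedB s ◃ α)
sucs (b , b#α) = sucB b , suc# b#α

-- Hypothetical helper: the n-th binder of the stream generated by a supply.
nthB : ∀ {α} → Nat → Supply α → Binder
nthB zero    s = Supply.seedB s
nthB (suc n) s = nthB n (sucs s)

-- The third binder drawn from the initial supply.
third-binder : nthB 2 zeros ≡ sucB (sucB zeroB)
third-binder = refl
```

The world index of the supply grows at each step, which is why nthB is polymorphic in α: each recursive call takes place in a larger world than the previous one.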
[Diagram: the function trName′ : Name (b ◃ α) → Name (_ ◃ β) is obtained from trName : Name α → Name β by applying exportWith on the source side and coerceN on the target side.]
Figure 2. Lifting trName
Because we have defined Res to be Name, the above renaming kit works with total functions of names to names. In order to lift the function exportN? from names to terms, we need to deal with partial functions as well. This leads us to define another renaming kit, which is parameterized over a notion of effectful computation, that is, over an applicative functor. An applicative functor (McBride & Paterson, 2008) is halfway between a functor and a monad. Like a monad, an applicative functor has a unit, called pure. The function pure allows viewing a pure value as a potentially effectful one. Furthermore, an applicative functor comes with an effectful application, written ⊛. This operation takes an effectful function, an effectful argument, and produces an effectful result. As an illustration, here is how one uses an applicative functor to map an effectful function over a list:

    module MapA {E} (E-app : Applicative E) where
      open Applicative E-app

      mapA : {A B : Set} → (A → E B) → List A → E (List B)
      mapA _ []       = pure []
      mapA f (x ∷ xs) = pure _∷_ ⊛ f x ⊛ mapA f xs

In order to define our second and more general renaming kit, we reuse the type SubstEnv, but define the result type Res to be E ◦ Name, as opposed to just Name. The construction is parameterized with the applicative functor E. This allows us to support several kinds of effects. The code for renameAKit is similar to that of renameKit, so we omit it and show only its type:

    RenameAEnv : (E : Set → Set) (α β : World) → Set
    RenameAEnv E = SubstEnv (E ◦ Name)

    renameAKit : ∀ {E} → Applicative E → TrKit (RenameAEnv E) (E ◦ Name)
    renameAKit = {! code similar to renameKit omitted !}
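The behaviour of mapA with a failure effect can be checked on a small instance. The following standalone sketch is an illustration under stated assumptions: the Applicative record is a minimal stand-in for the one used in the paper, and Maybe, maybeApp, pred?, all-succeed, and one-fails are hypothetical names introduced here.

```agda
module MapASketch where

open import Agda.Builtin.List     using (List; []; _∷_)
open import Agda.Builtin.Nat      using (Nat; zero; suc)
open import Agda.Builtin.Equality using (_≡_; refl)

-- A local Maybe type, so that the sketch depends on no library version.
data Maybe (A : Set) : Set where
  nothing : Maybe A
  just    : A → Maybe A

-- Assumed minimal applicative-functor interface.
record Applicative (E : Set → Set) : Set₁ where
  infixl 4 _⊛_
  field
    pure : ∀ {A} → A → E A
    _⊛_  : ∀ {A B} → E (A → B) → E A → E B

-- The failure effect: an application fails as soon as either side fails.
maybeApp : Applicative Maybe
maybeApp = record { pure = just ; _⊛_ = ap }
  where
    ap : ∀ {A B} → Maybe (A → B) → Maybe A → Maybe B
    ap (just f) (just x) = just (f x)
    ap _        _        = nothing

module MapA {E : Set → Set} (E-app : Applicative E) where
  open Applicative E-app

  mapA : {A B : Set} → (A → E B) → List A → E (List B)
  mapA f []       = pure []
  mapA f (x ∷ xs) = pure _∷_ ⊛ f x ⊛ mapA f xs

open MapA maybeApp

-- Hypothetical effectful function: the predecessor, which fails on zero.
pred? : Nat → Maybe Nat
pred? zero    = nothing
pred? (suc n) = just n

-- Every element succeeds, so the whole traversal succeeds.
all-succeed : mapA pred? (1 ∷ 2 ∷ 3 ∷ []) ≡ just (0 ∷ 1 ∷ 2 ∷ [])
all-succeed = refl

-- A single failing element makes the whole traversal fail.
one-fails : mapA pred? (1 ∷ 0 ∷ 3 ∷ []) ≡ nothing
one-fails = refl
```

This is the same all-or-nothing behaviour that a partial renaming relies on when a function such as exportN? is lifted from names to terms: one name that cannot be exported makes the export of the whole term fail.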
5.4.4 The substitution kit

We now generalize the renaming kit along a different direction. Instead of actions that map names to names, we now wish to work with actions that map names to “terms”. The type family for terms does not have to be Tm: we parameterize the substitution kit over a type family F. We require that F be equipped with two operations. First, we require an operation that turns a name into a term. We call this operation V, by analogy with the data constructor V of the type Tm. Second, we require a way of coercing a term from one world to another. The substitution kit defines the type of environments to be SubstEnv F. The use of SubstEnv reflects the fact that we need to generate fresh binders in order to avoid capture, and the use of the parameter F reflects the fact that names are mapped to terms.

    -- Index-respecting functions
    F →◦ G = ∀ {i} → F i → G i

    -- The type for ‘coerce’ on an F-term
    Coerce F = ∀ {α β} → α ⊆ β → F α → F β

    substKit : ∀ {F} (V : Name →◦ F) (coerceF : Coerce F) → TrKit (SubstEnv F) F
    substKit = {! code similar to renameKit omitted !}
5.4.5 Other kits and combinators

We build a few other kits and combinators (Pouillard, 2011b). For instance, ◦-Kit composes two kits by pairing the two environments and composing their operations. Another combinator, starKit, takes a kit whose environments have type Env and builds a kit whose environments have type Star Env, where Star is Agda’s reflexive and transitive closure operator. Finally, the combinator mapKit allows pre-composing a function (of names to names) and post-composing a function (of results to results) with a kit to obtain a new kit. Here is the definition of mapKit:

    mapKit : ∀ {Env F G} (f : Name →◦ Name) (g : F →◦ G)
           → TrKit Env F → TrKit Env G
    mapKit f g kit = mk (λ ∆ → g ◦ trName ∆ ◦ f) trBinder extEnv
      where open TrKit kit
5.4.6 A reusable traversal

We now write a term-to-term transformation function that works with an arbitrary effect and with an arbitrary “name-to-term” kit. It is essentially a “map” function over terms: it maps terms to terms, and transforms names and binders as specified by the kit. More precisely, the function trTm traverses the term, carrying an environment. Name occurrences are transformed into terms via trName. (The data constructor V is not necessarily preserved.) Binders are transformed using trBinder. (In the code that follows, this information is implicit and is reconstructed by Agda.) The structure of the term is otherwise preserved. The operations of the applicative functor are used when constructing the new term. The function extEnv allows carrying the environment under a binding.

    module TraverseTm {E} (E-app : Applicative E)
                      {Env} (trKit : TrKit Env (E ◦ Tm)) where
      open Applicative E-app
      open TrKit trKit

      trTm : ∀ {α β} → Env α β → (Tm α → E (Tm β))
      trTm ∆ (V x)       = trName ∆ x
      trTm ∆ (t · u)     = pure _·_ ⊛ trTm ∆ t ⊛ trTm ∆ u
      trTm ∆ (ƛ b t)     = pure (ƛ _) ⊛ trTm (extEnv b ∆) t
      trTm ∆ (Let b t u) = pure (Let _) ⊛ trTm ∆ t ⊛ trTm (extEnv b ∆) u
It is convenient to also define a specialized version of trTm, which accepts a “name-to-name” kit and preserves the data constructor V.

    open TraverseTm

    trTm′ : ∀ {E} (E-app : Applicative E)
              {Env} (trKit : TrKit Env (E ◦ Name))
              {α β} → Env α β → (Tm α → E (Tm β))
    trTm′ E-app trKit = trTm E-app (mapKit id (Applicative._<$>_ E-app V) trKit)
5.4.7 Reusing the traversal

We can now collect the fruit of our work, by combining the reusable traversal with various kits. For the sake of simplicity, we demonstrate this at type Tm. In the actual implementation (Pouillard, 2011b), we further abstract over Tm and trTm by defining a sequence of parameterized modules. We now revisit the definition of coerceTm. Our earlier definition (section 5.4.2) can be replaced with a more concise one: all we have to do is instantiate the generic traversal trTm′ with the identity applicative functor (which denotes the absence of side effects) and with the coercing kit.

    -- The identity applicative functor
    id-app : Applicative id
    id-app = {! definition omitted !}
    coerceTm : ∀ {α β} → α ⊆ β → α →Tm β
    coerceTm = trTm′ id-app coerceKit

We would like to think of coerceTm as a coercion, that is, an identity function. One can informally check that if worlds and proofs of membership in a world were erased, then coerceTm would indeed boil down to the identity function, which means that applications of coerceTm could be optimized away. However, formally studying world erasure, as well as persuading Agda to perform this erasure, are left for future work. To obtain a function that renames a term, we instantiate the generic traversal trTm′ with the identity applicative functor and with the total renaming kit.

    renameTm : ∀ {α β} → Supply β → (α →N β) → (α →Tm β)
    renameTm s f = trTm′ id-app renameKit (f , s)

To obtain a function that renames a term while allowing for failure, we instantiate it with the applicative functor Maybe and with the partial renaming kit.

    renameTmA : ∀ {E} → Applicative E → ∀ {α β} → Supply β
              → (Name α → E (Name β)) → (Tm α → E (Tm β))
    renameTmA E-app s f = trTm′ E-app (renameAKit E-app) (f , s)

    renameTm? : ∀ {α β} → Supply β → (Name α →? Name β) → (Tm α →? Tm β)
    renameTm? = renameTmA Maybe.applicative

Obtaining a function that exports a term is now just a matter of instantiating renameTm? with the partial function exportN?.

    exportTm? : ∀ {b α} → Supply α → Tm (b ◃ α) →? Tm α
    exportTm? s = renameTm? s exportN?

Another useful special case of renameTm? is closeTm?. This function takes a term in any world and checks if the term is closed. If so, the same term is returned, and its type is refined to the empty world. Otherwise, the function fails by returning nothing:

    closeTm? : ∀ {α} → Tm α →? Tm ∅
    closeTm? = renameTm? zeros (const nothing)
Finally, in order to define capture-avoiding substitution, we use the substitution kit with arguments V (which means that free variables are mapped to themselves) and coerceTm.

    substTm : ∀ {α β} → Supply β → (Name α → Tm β) → (α →Tm β)
    substTm (s , s#) f = trTm id-app (substKit V coerceTm) (f , s , s#)

To illustrate the use of substTm, here is a simple function, baptised β-red, which performs a β-reduction when a β-redex appears at the root of its argument. The function exportWith a V maps b to a and maps x to V x when x differs from b.

    β-red : ∀ {α} → Supply α → Tm α → Tm α
    β-red s (ƛ b f · a) = substTm s (exportWith a V) f
    β-red _ t           = t
5.5 Building any λ-term

One way to argue that every λ-term can be represented using our type Tm is to define a conversion function from another type for λ-terms to the type Tm. We do so by choosing the “bare nominal” type TmA of section 3.1 as the source language. The process is very close to the combination of a specific renaming kit and a specific traversal function. The kit is specific because the source names are of type Atom and not Name. The traversal function is specific because the source and target types are not the same and because we fix the identity functor for simplicity. First, we introduce the type of environments. An environment holds a mapping from free atoms to free names and a name supply:

    module Conv-TmA→Tm where
      record Env α : Set where
        constructor _,_
        field
          trAtom : Atom → Name α
          supply : Supply α
        open Supply supply public
      open Env

Then, we define how an environment is extended. This is similar to what we did for the renaming kit:

    extEnv : ∀ {α} → Atom → (∆ : Env α) → Env (seedB ∆ ◃ α)
    extEnv bA ∆ = trN , sucs (supply ∆)
      where trN = λ xA → if bA ==A xA
                           then nameB (seedB ∆)
                           else coerceN (⊆-# (seed# ∆)) (trAtom ∆ xA)
It is then straightforward to define the conversion function, conv. We use trAtom at variable occurrences and extEnv when crossing a binding.

    conv : ∀ {α} → Env α → TmA → Tm α
    conv ∆ (V x)       = V (trAtom ∆ x)
    conv ∆ (ƛ b t)     = ƛ _ (conv (extEnv b ∆) t)
    conv ∆ (t · u)     = conv ∆ t · conv ∆ u
    conv ∆ (Let b t u) = Let _ (conv ∆ t) (conv (extEnv b ∆) u)
In order to use the function conv, one must provide an environment, that is, a mapping of (all) atoms to names in the world α together with a name supply for α. Of course, this is possible only if α is a non-empty world. For instance, the following environment, whose action maps all atoms to the name 0N and whose name supply begins at 1N, is an environment for the singleton world 0B ◃ ∅.

    emptyEnv : Env (0B ◃ ∅)
    emptyEnv = const (0N) , sucs zeros

By post-composing the conversion function conv emptyEnv with the test for closedness closeTm?, we obtain a function that converts a closed “bare nominal” term of type TmA to a term of type Tm ∅. This function fails if its argument is not a closed term.

    conv∅? : TmA →? Tm ∅
    conv∅? = closeTm? ◦ conv emptyEnv
5.6 Towards elaborate uses of worlds The type Tm is just one basic example of an algebraic data type that involves names and binders. Let us briefly present a few algebraic data type definitions that make more advanced use of worlds.
Contexts Consider a type C of one-hole contexts associated with Tm. The type C is indexed with two worlds α and β, which respectively play the role of an “outer world” and an “inner world”. The idea is that plugging a term of type Tm β into the hole of a context of type C α β produces a term of type Tm α. The definition of the type C is as follows:

    data C α : World → Set where
      Hole : C α α
      _·1_ : ∀ {β} → C α β → Tm α → C α β
      _·2_ : ∀ {β} → Tm α → C α β → C α β
      ƛ    : ∀ {β} b → C (b ◃ α) β → C α β
      Let1 : ∀ {β} b → C α β → Tm (b ◃ α) → C α β
      Let2 : ∀ {β} b → Tm α → C (b ◃ α) β → C α β

Contexts bind names: the hole can appear under one or several binders. This is why, in general, a context has distinct outer and inner worlds. A context contains a list of binders that “connects” the outer and inner worlds: these binders are carried by the constructors ƛ and Let2. A context and a term can be paired to produce a term-in-context. This can be viewed as a user-defined binding construct: the names introduced by the context are in scope in the term. In fact, a one-hole context for a data structure that involves binders is exactly what de Bruijn calls a “telescope” (1991). A telescope is a first-class object that has binding power, that is, it binds zero or more names.

    CTm : World → Set
    CTm α = ∃[ β ](C α β × Tm β)

It is straightforward to define a function plug from CTm α to Tm α, which accepts a pair of a context and a term and plugs the latter into the former. Conversely, one can define a family of focusing functions of type ∀ {α} → Tm α → CTm α that split a term into a pair of a context and a term. There are several such functions, according to where one wishes to focus. The contexts presented here are “ordinary” contexts: the root of the context is the root of the term, and as one goes down into the context, one goes down into the term. Of course, since a context is just a list, it is possible to hold it from the other end. A context that is “inside-out” is known as a “zipper” (Huet, 1997; McBride, 2001). Our system can express zippers, as well as the list reversal functions that allow transforming a telescope into a zipper and vice-versa.
Unfortunately, describing this in detail would take us a little too far, so we leave this for another occasion.
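As a small illustration of the remark that plug is straightforward to define, here is a standalone sketch of a curried variant that takes the context and the term as two separate arguments. It is written against a postulated interface: World, Binder, Name, and the extension operator are assumptions standing in for the library's abstract types, the type Tm is restated locally with explicit binders, and the module name PlugSketch is hypothetical.

```agda
module PlugSketch where

-- Assumed abstract interface (postulated for the purpose of this sketch).
postulate
  World  : Set
  Binder : Set
  Name   : World → Set
  _◃_    : Binder → World → World

infixr 5 _◃_

-- Terms, restated locally.
data Tm (α : World) : Set where
  V   : Name α → Tm α
  _·_ : Tm α → Tm α → Tm α
  ƛ   : ∀ b → Tm (b ◃ α) → Tm α
  Let : ∀ b → Tm α → Tm (b ◃ α) → Tm α

-- One-hole contexts, with outer world α and inner world β.
data C (α : World) : World → Set where
  Hole : C α α
  _·1_ : ∀ {β} → C α β → Tm α → C α β
  _·2_ : ∀ {β} → Tm α → C α β → C α β
  ƛ    : ∀ {β} b → C (b ◃ α) β → C α β
  Let1 : ∀ {β} b → C α β → Tm (b ◃ α) → C α β
  Let2 : ∀ {β} b → Tm α → C (b ◃ α) β → C α β

-- Plugging a term of the inner world into the hole yields a term of the outer world.
plug : ∀ {α β} → C α β → Tm β → Tm α
plug Hole         t = t
plug (c ·1 u)     t = plug c t · u
plug (u ·2 c)     t = u · plug c t
plug (ƛ b c)      t = ƛ b (plug c t)
plug (Let1 b c u) t = Let b (plug c t) u
plug (Let2 b u c) t = Let b u (plug c t)
```

In this sketch the world indices do all the bookkeeping: each clause type-checks only if the hole is filled with a term of the inner world, and the binders crossed on the way down are exactly those carried by ƛ and Let2.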
Multiple sorts of names Some object languages have multiple sorts of names. For instance, in Girard and Reynolds’ System F, there are term variables, which occur in terms, and type variables, which occur in types and in terms. Thus, it is natural to index object-level types with one world (which tells which type variables are in scope) and to index object-level terms with two worlds (one of which concerns type variables, the other of which concerns term variables).

    module SysF where
      infixr 5 _⇒_

      data Ty α : Set where
        V   : (x : Name α)         → Ty α
        _⇒_ : (σ τ : Ty α)         → Ty α
        ‘∀‘ : ∀ b (τ : Ty (b ◃ α)) → Ty α

      data Tm α γ : Set where
        V    : ∀ (x : Name α)                      → Tm α γ
        _·_  : ∀ (t u : Tm α γ)                    → Tm α γ
        ƛ    : ∀ b (τ : Ty γ) (t : Tm (b ◃ α) γ)   → Tm α γ
        _·τ_ : ∀ (t : Tm α γ) (τ : Ty γ)           → Tm α γ
        Λ    : ∀ b (t : Tm α (b ◃ γ))              → Tm α γ
5.7 Advanced example: normalization by evaluation

As an advanced example, we show how to express a normalization by evaluation algorithm in our system. This algorithm has been previously used as a benchmark by several researchers (Shinwell et al., 2003; Pitts, 2006; Licata & Harper, 2009; Cave & Pientka, 2012). The challenge lies in the way in which the algorithm mixes computational functions, name abstractions, and fresh name generation. The object language of interest is again the pure λ-calculus. The algorithm exploits two different representations of object-level terms, which are respectively known as syntactic and semantic representations. Because these representations differ only in their treatment of name abstractions, they can be given a common definition, which is parameterized over the representation of name abstractions:

    module M (Abs : (World → Set) → World → Set) where
      data T α : Set where
        V   : Name α → T α
        ƛ   : Abs T α → T α
        _·_ : T α → T α → T α

The parameter Abs has kind (World → Set) → (World → Set): it is an indexed-type transformer.
In order to obtain the syntactic representation, we instantiate Abs with the nominal abstractions that we have used everywhere so far: an abstraction is a package of a binder and of a term that inhabits an extended world. This yields the type Term of syntactic terms.

    SynAbsN : (World → Set) → World → Set
    SynAbsN F α = ∃[ b ](F (b ◃ α))

    open M SynAbsN renaming (T to Term)

In order to obtain the semantic representation, we instantiate Abs with a different notion of abstraction, in the style of higher-order abstract syntax: an abstraction is a computational function, which substitutes a term for the bound name of the abstraction. This yields the type Sem of semantic terms.

    SemAbs : (World → Set) → World → Set
    SemAbs F α = ∀ {β} → α ⊆ β → F β → F β

    open M SemAbs renaming (T to Sem)

Sem is not an inductive data type. Fortunately, with the --no-positivity-check flag, Agda accepts this type definition, at the cost of breaking strong normalization. (To minimize risk, we isolate this code in a module where this flag is activated.) Naturally, because untyped λ-calculus is not terminating, one cannot expect to be able to implement a terminating normalization procedure. Our semantic name abstractions involve bounded polymorphism in a world: we define SemAbs F α as ∀ {β} → α ⊆ β → F β → F β, as opposed to the more naïve F α → F α. This provides a more accurate and more flexible description of the behavior of substitution. Indeed, when instantiating an abstraction t with some term u, it makes perfect sense for u to inhabit a larger world than t, that is, for u to refer to certain names that are fresh for t. The result of the substitution then inhabits the same world as u: that is, it potentially refers to these fresh names, in addition to all of the names that occurred free in the abstraction t. The types SemAbs and Sem are covariant with respect to the parameter α. This would not be the case had we adopted the naïve definition of SemAbs.
In other words, it is possible to define a “coerce” operation for semantic terms:

    coerceSem : ∀ {α β} → α ⊆ β → (Sem α → Sem β)
    coerceSem pf (V a)   = V (coerceN pf a)
    coerceSem pf (ƛ f)   = ƛ (λ pf′ v → f (⊆-trans pf pf′) v)
    coerceSem pf (t · u) = coerceSem pf t · coerceSem pf u
At a semantic abstraction, no recursive call is performed, because the body of the abstraction is opaque: it is a computational function. Instead, we exploit the transitivity of world inclusion and build a new semantic abstraction that inhabits the desired world. Like coerceTm (section 5.4.7), coerceSem is a “coercion”, in the sense that, if worlds and
proofs of membership in a world were erased, coerceSem would boil down to the identity function. The normalization by evaluation algorithm makes use of environments. Here, environments are functions from names to semantic terms. The function _,_↦_ extends such an environment:

    EvalEnv : (α β : World) → Set
    EvalEnv α β = Name α → Sem β
    -- α is the inner world
    -- β is the outer world

    _,_↦_ : ∀ {α β} (Γ : EvalEnv α β) b → Sem β → EvalEnv (b ◃ α) β
    _,_↦_ Γ b v = exportWith v Γ
    -- meaning: b ↦ v
    --          x ↦ Γ x

An environment of type EvalEnv α β maps a name of type Name α to a semantic term that lies outside the scope of the environment, that is, a semantic term of type Sem β. The type EvalEnv α β is covariant in its destination world, as witnessed by the following coercion function:

    coerceEnv : ∀ {α β1 β2} → β1 ⊆ β2 → EvalEnv α β1 → EvalEnv α β2
    coerceEnv pf Γ = coerceSem pf ◦ Γ

The first part of the normalization by evaluation algorithm is a function eval that evaluates a syntactic term within an environment to produce a semantic term. When evaluating a λ-abstraction, we build a semantic abstraction, which encapsulates a recursive call to eval. The bounded polymorphism required by the definition of semantic abstractions forces us to coerce the environment Γ via coerceEnv.

    eval : ∀ {α β} → EvalEnv α β → Term α → Sem β
    eval Γ (ƛ (a , t)) = ƛ (λ pf v → eval (coerceEnv pf Γ , a ↦ v) t)
    eval Γ (V x)       = Γ x
    eval Γ (t · u)     = app (eval Γ t) (eval Γ u)
      where
        app : ∀ {α} → Sem α → Sem α → Sem α
        app (ƛ f) v = f ⊆-refl v
        app n     v = n · v
The second part of the algorithm reifies a semantic term back into a syntactic term. When reifying a semantic abstraction, we build a syntactic abstraction. This requires generating a fresh name, and leads us to parameterizing reify with a supply of fresh names.

    reify : ∀ {α} → Supply α → Sem α → Term α
    reify s (V a)   = V a
    reify s (n · v) = reify s n · reify s v
    reify (sB , s#) (ƛ f) =
      ƛ (sB , reify (sucs (sB , s#)) (f (⊆-# s#) (V (nameB sB))))
The constructor V has type Name α → Sem α. Hence, it is a valid initial environment of type EvalEnv α α. Evaluation under this initial environment, followed by reification, yields a normalization algorithm. This algorithm works with open terms: its argument, as well as its result, are terms in an arbitrary world α, provided we have a name supply for the world α.

    nf : ∀ {α} → Supply α → Term α → Term α
    nf supply = reify supply ◦ eval V

In particular, zeros is a name supply for the empty world, so we can normalize closed terms. Here is an example of the normalization of a closed term:

    idT : Term ∅
    idT = ƛ (0B , V (0N))

    test-nf : nf zeros ((idT · (idT · idT)) · idT) ≡ idT
    test-nf = refl
6 The NomPa implementation (nominal fragment)

The implementation of our library (of which only the nominal fragment has been presented so far) is not surprising. Most of the code consists of types and proofs. Although we now present some of the internal details of our library, we emphasize that none of these definitions are meant to be known to or used by the client. Worlds are defined first. A world is represented by a list of Booleans. An integer atom n is deemed a member of the world if and only if the nth element of the list is true. More formally, the meaning of a world is defined by the following membership predicate.

    World : Set
    World = List Bool

    ∅ : World
    ∅ = []
    _∈_ : ℕ → World → Set
    _     ∈ []          = ⊥
    zero  ∈ (false ∷ _) = ⊥
    zero  ∈ (true ∷ _)  = ⊤
    suc n ∈ (_ ∷ xs)    = n ∈ xs
The choice of a list of Booleans to represent a world was guided by two facts. First, operations over worlds are defined by structural induction. This makes type-level computation easier, and becomes especially important when we introduce support for de Bruijn indices (section 9). Second, because elements are ordered, modulo trailing occurrences of false, two equivalent sets are represented in the same way. Worlds are meant to be computationally irrelevant. This means that, prior to running a program, it should be possible in principle to erase worlds as well as proofs of membership in a world, proofs of world inclusion, and proofs of freshness. A program in which worlds have been erased should behave in the same manner as the original program in which worlds are present, but opaque. There are two reasons why we wish to have such an erasure property: first, it means that there is a clear “phase distinction” between the code that we wish to run and the world annotations that explain why this code makes sense; second, this guarantees that world annotations incur no performance penalty. Although we do not formally demonstrate that it is possible to erase worlds, the library is designed with this goal in mind. In particular, we are careful not to include in the library any operation that constructs a non-erasable result out of an erasable argument. An example of this would be an operation that accepts a world α and produces a binder b that is fresh for α. To compensate for the lack of such an operation, the client of the library must work with explicit name supplies where required. It might be possible to add this operation to the library (somewhat amazingly, it seems that the soundness proof in section 7 would support it) and to implement it, after erasure, as an effectful “global gensym” operation. We leave this idea for future work. A notion of irrelevance has recently been introduced in AGDA. The irrelevant function space is noted .( x : A ) → B. 
The value of an irrelevant argument not only cannot influence the result of a computation, but also cannot influence the type-checking process: any two irrelevant values of the same type are considered equal. Hence, irrelevance can be used only in situations where the only thing that matters is the existence of a value of a
certain type. In our setting, worlds cannot be considered irrelevant. The following example shows that considering an arbitrary world α as equal to the empty world leads to nonsense:

    module Worlds-should-be-erased-but-are-relevant
        (World : Set)           -- A type for worlds
        (∅ : World)             -- An empty world
        (Name : .World → Set)   -- Names are made world irrelevant
        (¬Name∅ : ¬ (Name ∅))   -- No name inhabits ∅
        .(α : World)            -- An irrelevant world
        (x : Name α)            -- A name
      where
      bot : ⊥                   -- ...and that’s the end of the world
      bot = ¬Name∅ x

One might however be able to apply Agda irrelevance to proof terms, such as world membership witnesses, fresh-for witnesses, and maybe inclusion witnesses as well. Our experiments were successful as far as the implementation is concerned, but led to trouble in the proofs. We leave this aspect to future work. Binders are represented by natural numbers. The operation _◃_ defines how to extend a world with a binder. Given a binder n and a world α, it updates the world α with the value true at index n.

    Binder : Set
    Binder = ℕ

    zeroB : Binder
    zeroB = zero

    sucB : Binder → Binder
    sucB = suc

    infixr 5 _◃_
    _◃_ : Binder → World → World
    zero  ◃ []      = true ∷ []
    suc n ◃ []      = false ∷ n ◃ []
    zero  ◃ (_ ∷ α) = true ∷ α
    suc n ◃ (b ∷ α) = b ∷ n ◃ α

The proof that _◃_ has the intended set-theoretic semantics is offered by the following lemma:

    ◃-sem : ∀ α x y → (x ∈ y ◃ α) ≡ (if x ==ℕ y then ⊤ else x ∈ α)
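The representation of worlds as lists of Booleans can be exercised on concrete instances. The sketch below is a standalone restatement of the membership predicate and of world extension using only Agda's builtin modules; the example world ex and the four checks that follow it are hypothetical illustrations, and the general lemma about the set-theoretic semantics of extension is not reproved here.

```agda
module WorldSketch where

open import Agda.Builtin.Nat      using (Nat; zero; suc)
open import Agda.Builtin.Bool     using (Bool; true; false)
open import Agda.Builtin.List     using (List; []; _∷_)
open import Agda.Builtin.Unit     using (⊤; tt)
open import Agda.Builtin.Equality using (_≡_; refl)

-- The empty type, defined locally.
data ⊥ : Set where

World : Set
World = List Bool

∅ : World
∅ = []

-- Membership: n is in the world iff the n-th Boolean exists and is true.
_∈_ : Nat → World → Set
_     ∈ []          = ⊥
zero  ∈ (false ∷ _) = ⊥
zero  ∈ (true ∷ _)  = ⊤
suc n ∈ (_ ∷ xs)    = n ∈ xs

-- Extension: set the Boolean at index n to true, padding with false if needed.
infixr 5 _◃_
_◃_ : Nat → World → World
zero  ◃ []      = true ∷ []
suc n ◃ []      = false ∷ (n ◃ [])
zero  ◃ (_ ∷ α) = true ∷ α
suc n ◃ (b ∷ α) = b ∷ (n ◃ α)

-- The world {0, 2}, built by extending the empty world twice.
ex : World
ex = 2 ◃ 0 ◃ ∅

ex-repr : ex ≡ true ∷ false ∷ true ∷ []
ex-repr = refl

-- 2 is a member of ex ...
two∈ex : 2 ∈ ex
two∈ex = tt

-- ... whereas 1 is not: the membership type computes to the empty type.
one∉ex : 1 ∈ ex → ⊥
one∉ex ()

-- Extending with a binder that is already present changes nothing.
ext-idem : 0 ◃ ex ≡ ex
ext-idem = refl
```

Because both operations are defined by structural recursion, all four checks hold by computation alone, which illustrates why this representation makes type-level computation easy.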
A name of type Name α is a pair of a binder (that is, a number) and a proof that this binder is a member of the world α. In Agda, we use a record:

    record Name α : Set where
      constructor _,_
      field
        binderN : Binder
        b∈α     : binderN ∈ α
    infixr 4 _,_
    open Name public

In order to produce a name out of a binder b, the operation nameB simply packs b together with a proof that b is a member of the world b ◃ α. This proof is easily constructed by hand.

    nameB : ∀ {α} b → Name (b ◃ α)
    nameB b = b , {! proof omitted !}

The equality test _==N_ and the function exportN? compare the integer values that underlie names and binders.

    _==N_ : ∀ {α} (x y : Name α) → Bool
    _==N_ (x , _) (y , _) = x ==ℕ y

    exportN? : ∀ {b α} → Name (b ◃ α) →? Name α
    exportN? {b} {α} (x , pf) =
      if x ==ℕ b then nothing else just (x , {! proof omitted !})

World inclusion is defined as set-theoretic inclusion, that is, as the preservation of membership. This is exploited in the definition of coerceN, where we need to build a proof that b is a member of β. Note that, after erasure, coerceN boils down to the function that maps b to b, that is, the identity function.

    infix 2 _⊆_
    _⊆_ : (α β : World) → Set
    α ⊆ β = ∀ x → x ∈ α → x ∈ β

    coerceN : ∀ {α β} → α ⊆ β → (α →N β)
    coerceN α⊆β (b , b∈α) = b , α⊆β b b∈α

The proofs of the world inclusion rules (figure 1) are computationally irrelevant. We omit them here. The remaining part is the fresh-for relation (_#_). To cope with the proof of suc#, we give two characterizations of this relation. One is a set of syntactic rules (omitted here)
and the other is semantic. These presentations are equivalent (Pouillard, 2011b). The semantic version was given earlier (section 4.3): it states that x # α holds if and only if x dominates α, that is, x is strictly greater than every name y that inhabits α.

    _#_ : Binder → World → Set
    x # α = ∀ y → y ∈ α → x > y

This strong definition of “fresh-for” allows us to implement the operation suc# without inspecting the world: after erasure, suc# is just the successor operation on natural numbers. Thus, we are able to generate fresh names in a manner that is compatible with erasure and is efficient.

7 Soundness of NomPa (nominal fragment)

Our library is written in Agda, a type-safe language. Thus, the property that “well-typed programs do not go wrong” comes for free. However, this does not quite satisfy us. Indeed, we have explained that certain operations, such as an equality test for binders, must not be provided to programmers, or it would be possible to write “ill-behaved” code. Yet, these operations are perfectly type-safe. So, type safety is not a sufficient criterion in order to determine which operations can and cannot be provided. Earlier, we stated informally that “a function is well-behaved if, when applied to α-equivalent arguments, it produces α-equivalent results”. This is, roughly speaking, the criterion that we are looking for: we would like to guarantee that every function that can be written by a client of our library is well-behaved. Of course, we must define this criterion in a more formal and more general manner. This involves defining what we mean by “α-equivalence”: we must define this relation not just at our example type Tm, where we have a pretty clear idea of what “α-equivalence” means, but at every type. Similarly, we must define “well-behavedness” not just at function types, but at every type. Fortunately, these two problems are the same. In the following, we build a logical relation, that is, a type-indexed equivalence relation.
This relation gives rise to a notion of “α-equivalence”: we consider that two AGDA expressions of type τ are “α-equivalent” if and only if they are related at type τ. It also gives rise to a notion of “well-behavedness”: we consider that an AGDA expression of type τ is “well-behaved” if and only if it is related to itself at type τ. The construction of a logical relation for AGDA is a standard technique (Bernardy et al., 2010). It is independent of our work. By relying on this technique, all we have to do is define the relation at each of the abstract types that we introduce (namely World, Name, ⊆ , etc.) and prove that each of the values that we introduce is related to itself. Each of these little proofs is independent of the others. This makes the soundness proof modular. This also facilitates the addition of new features: when considering the addition of a new operation, it is easy to construct the proof obligation that comes with it and to find out whether it is safe to add this operation. This section is organized as follows. First, we recall the basics of logical relations and parametricity (section 7.1). We give a toy example, so as to practice a bit (section 7.2). Then, we define the logical relation at each of our abstract types, and briefly describe the
proof obligations that arise about the operations of the library (section 7.3). Finally, we discuss the meaning of the “free theorems” that arise out of this construction (section 7.4).
7.1 Recap of the framework

A relation is said to be type-indexed, or type-directed, when it is inductively defined over the structure of types. Let R be such a type-directed relation, and let τ be a type. Then, Rτ is a relation on values of type τ, that is, we have Rτ : τ → τ → Set. Recall that Set serves as the type of propositions in AGDA. A type-directed relation is called a “logical” relation when it relates functions in an extensional manner, that is, when two functions are related if and only if they produce related results out of related arguments. Let Ar be a relation for the arguments and Br a relation for results. Two functions f1 and f2 are “logically” related if and only if for every pair of arguments (x1, x2) related by Ar, the results f1 x1 and f2 x2 are related by Br. This definition can be given in AGDA as well:

  RelatedFunctions Ar Br f1 f2 = ∀ {x1 x2} → Ar x1 x2 → Br (f1 x1) (f2 x2)

An expression of type τ “fits” a logical relation if and only if it is related to itself at type τ. A logical relation is universal if every well-typed program fits this logical relation. John Reynolds defined a logical relation for the polymorphic λ-calculus and proved that it is universal: this is the “Abstraction Theorem” (Reynolds, 1983). Bernardy et al. (2010) define a logical relation for every pure type system (PTS) and informally suggest how to extend it to AGDA. While no complete mechanized definition and proof exist, we refer to this extension as the “AGDA logical relation” and assume that it is universal. In the following, we briefly explain how the AGDA logical relation is defined and state our assumption in a precise way.

The AGDA logical relation In order to simplify things, the definitions that follow are not universe-polymorphic. The reader can find universe-polymorphic definitions in the full implementation (Pouillard, 2011b).
When attempting to formally define a logical relation within AGDA, one immediately faces a difficulty: AGDA does not allow structural induction over types, that is, over values of type Set. In order to work around this difficulty, a natural and common technique is to introduce an algebraic data type U that represents the syntax of AGDA’s types. The type U is known as a “universe”, and its elements are known as “codes”. Then, in order to construct an explicit connection between codes and the types that they are supposed to represent, one defines a function that assigns meanings to codes. This function, called El, has type U → Set. Thus, if τ is a code, then El τ is a type, and can be thought of as “the elements of τ”. Finally, the logical relation is defined by induction over codes. It is a function J K that maps a code τ to a relation over the elements of τ. In other words, the function J K has type ( τ : U ) → El τ → El τ → Set.
Unfortunately, because AGDA’s types involve quantification and dependent types, the algebraic data type of codes must involve some representation of names and binders. This adds a good deal of complexity to the universe technique. Perhaps ironically, we do not wish to deal with this complexity, as it would obscure our idea. Thus, we do not adopt the universe technique. Instead, we follow a simpler and more limited approach. First, we give a formal (noninductive) definition of the logical relation at every type constant. In AGDA, the type constants are → , Π, Set0, and the user-defined inductive data types, such as N. Thus, to each such constant κ, we associate a relation, which we write Jκ K. (Note that there are no spaces in this name: Jκ K is just the name of a new constant. We are not formally defining a function called J K.) Thus, we define J→K , JΠK, JSet0K, and one constant per user-defined inductive data type, such as JNK. Then, instead of giving a formal inductive definition of the logical relation at every type, we view the application of the logical relation to a type as an informal “macro-expansion” process. For instance, imagine we wish to compute the definition of the logical relation at type N → N → Bool. We cannot write J N → N → Bool K, with spaces near the brackets, because we have not formally defined a function called J K. Instead, we manually distribute the semantic brackets over the arrows and write JNK J→K JNK J→K JBoolK, without spaces near the brackets. Because we have formally defined the constants JNK, JBoolK, and J→K , this is a valid AGDA expression, whose meaning can be automatically computed by AGDA. The definition of J→K is just RelatedFunctions. The definition of JΠK is a dependent version of RelatedFunctions, where the relation that is required of the results is allowed to depend on the manner in which the arguments are related: JΠK Ar Br f1 f2 = ∀ {x1 x2} ( xr : Ar x1 x2) → Br xr ( f1 x1) ( f2 x2) Following Bernardy et al. 
(2010), the definition of the constant JSet0K is as follows: JSet0K : Set0 → Set0 → Set1 JSet0K A1 A2 = A1 → A2 → Set0 The type A1 → A2 → Set0 is the type of all relations between the types A1 and A2. Although this definition may seem somewhat cryptic at first glance, it allows recovering Reynolds’ definition of the logical relation at polymorphic types. In AGDA, a polymorphic type is encoded as a dependent type of the form ( A : Set0) → τ. By combining the above definitions of JΠK and JSet0K, one finds that the logical relation at such a polymorphic type involves a universal quantification over three things, namely two types A1 and A2 and a relation Ar between these types. This is Reynolds’ definition. We recall all of the above definitions in figure 3, and introduce some syntactic sugar. These definitions cover core type theory. Inductive data types are covered in a simple and systematic manner. The process is as follows: for each constructor κ of type τ, declare a new constructor Jκ K whose type is Jτ K κ κ. Record types are treated in an analogous manner. As an illustration, the logical relations for the inductive data types that are used in this paper are given in figure 4.
As announced earlier, we do not formally define a function J K. Instead, if τ is a closed type, we view J τ K as a “macro”, which can be manually expanded by replacing within τ every constant κ with Jκ K, every non-dependent arrow A → B with J A K J→K J B K, every dependent arrow (x : A) → B with ⟨ xr : J A K ⟩J→K J B K, etc. By convention, we use the subscript r in the expansion of dependent arrows. Here are a few examples of the manual expansion of the notation J K:

  -- What we would like to write but cannot:
  J N → N → Bool K =
  -- What we write instead:
  JNK J→K JNK J→K JBoolK =
  -- What this means:
  λ f1 f2 → ∀ {x1 x2} (xr : JNK x1 x2) {y1 y2} (yr : JNK y1 y2) → JBoolK (f1 x1 y1) (f2 x2 y2)

  -- The logical relation at a polymorphic type:
  J (A : Set0) → A → A K =
  JΠK JSet0K (λ Ar → Ar J→K Ar) =
  λ f1 f2 → ∀ {A1 A2} (Ar : A1 → A2 → Set0) {x1 x2} (xr : Ar x1 x2) → Ar (f1 A1 x1) (f2 A2 x2)

  -- Using the notation instead of JΠK:
  J (A : Set0) → List A K =
  ⟨ Ar : JSet0K ⟩J→K JListK Ar =
  λ l1 l2 → ∀ {A1 A2} (Ar : A1 → A2 → Set0) → JListK Ar (l1 A1) (l2 A2)
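The first of these expansions can be replayed in a small, self-contained AGDA file. What follows is only a sketch under stated assumptions: it writes the semantic brackets as ⟦ ⟧, spells out the type of the arrow relation instead of going through the constant for Set0, and uses a hypothetical function bothZero (not part of the paper) as a term of type ℕ → ℕ → Bool. Only modules of the AGDA standard library are imported.

```agda
open import Data.Nat  using (ℕ; zero; suc)
open import Data.Bool using (Bool; true; false)

infixr 0 _⟦→⟧_

-- Two functions are related when they map related arguments to related results.
_⟦→⟧_ : ∀ {A₁ A₂ B₁ B₂ : Set}
      → (A₁ → A₂ → Set) → (B₁ → B₂ → Set)
      → (A₁ → B₁) → (A₂ → B₂) → Set
(Aᵣ ⟦→⟧ Bᵣ) f₁ f₂ = ∀ {x₁ x₂} → Aᵣ x₁ x₂ → Bᵣ (f₁ x₁) (f₂ x₂)

-- Relations at the data types: one constructor per data constructor.
data ⟦Bool⟧ : Bool → Bool → Set where
  ⟦true⟧  : ⟦Bool⟧ true true
  ⟦false⟧ : ⟦Bool⟧ false false

data ⟦ℕ⟧ : ℕ → ℕ → Set where
  ⟦zero⟧ : ⟦ℕ⟧ zero zero
  ⟦suc⟧  : ∀ {m n} → ⟦ℕ⟧ m n → ⟦ℕ⟧ (suc m) (suc n)

-- Hypothetical example function, introduced only for this sketch.
bothZero : ℕ → ℕ → Bool
bothZero zero    zero    = true
bothZero zero    (suc _) = false
bothZero (suc _) _       = false

-- bothZero is related to itself at the expanded type ℕ → ℕ → Bool.
⟦bothZero⟧ : (⟦ℕ⟧ ⟦→⟧ ⟦ℕ⟧ ⟦→⟧ ⟦Bool⟧) bothZero bothZero
⟦bothZero⟧ ⟦zero⟧    ⟦zero⟧    = ⟦true⟧
⟦bothZero⟧ ⟦zero⟧    (⟦suc⟧ _) = ⟦false⟧
⟦bothZero⟧ (⟦suc⟧ _) _         = ⟦false⟧
```

The proof follows the case structure of bothZero itself, which is typical of such obligations: pattern-matching on the relation witnesses makes both sides of the goal reduce.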
The parametricity hypothesis As announced earlier, we assume that the AGDA logical relation is universal. We can now state this assumption in a precise way: (Parametricity hypothesis for AGDA) We assume that, for every well-typed term M of closed type τ, the theorem J τ K M M is provable. Because J K is an informal notation, as opposed to an AGDA function, the above hypothesis must be stated in an informal manner. Nevertheless, for a specific type τ, it is possible to give a formal statement of this hypothesis. For instance, in section 11.2, where we prove that “world-polymorphic functions commute with renamings”, we explicitly and formally use a hypothesis of the form J τ K f f, for a specific type τ and for a specific term f.
  JSet0K : ∀ (A1 A2 : Set0) → Set1
  JSet0K A1 A2 = A1 → A2 → Set0

  JSet1K : ∀ (A1 A2 : Set1) → Set2
  JSet1K A1 A2 = A1 → A2 → Set1

  _J→K_ : ∀ {A1 A2 B1 B2} → JSet0K A1 A2 → JSet0K B1 B2 → JSet0K (A1 → B1) (A2 → B2)
  Ar J→K Br = λ f1 f2 → ∀ {x1 x2} → Ar x1 x2 → Br (f1 x1) (f2 x2)

  infixr 0 _J→K_

  JΠK : ∀ {A1 A2} (Ar : JSet0K A1 A2)
          {B1 B2} (Br : (Ar J→K JSet0K) B1 B2)
        → ((x : A1) → B1 x) → ((x : A2) → B2 x) → Set1
  JΠK Ar Br = λ f1 f2 → ∀ {x1 x2} (xr : Ar x1 x2) → Br xr (f1 x1) (f2 x2)

  syntax JΠK Ar (λ xr → f) = ⟨ xr : Ar ⟩J→K f

  J∀K : ∀ {A1 A2} (Ar : JSet0K A1 A2)
          {B1 B2} (Br : (Ar J→K JSet0K) B1 B2)
        → JSet1K ({x : A1} → B1 x) ({x : A2} → B2 x)
  J∀K Ar Br = λ f1 f2 → ∀ {x1 x2} (xr : Ar x1 x2) → Br xr (f1 {x1}) (f2 {x2})

  syntax J∀K Ar (λ xr → f) = ∀⟨ xr : Ar ⟩J→K f
Figure 3. Logical relations for core type theory

We warmly encourage the reader to study Bernardy et al. (2010) in order to understand the subject in greater depth. The theorems obtained by instantiating the parametricity hypothesis with a specific type τ are known as “free theorems” (Wadler, 1989) because they allow us to establish a property of a term M of type τ without requiring us to reason about the definition of M. Usually, this property is non-trivial only if the type τ involves polymorphism. Indeed, in this case, the statement of the “free theorem” begins with universal quantifiers that can be instantiated in useful ways. In our setting, polymorphism arises out of two distinct sources. First, because the types defined by our library are abstract, the client must be polymorphic with respect to these types. Hence, the free theorem about the client begins with a series of universal quantifiers which we can instantiate in a suitable manner. Second, when the client defines a world-polymorphic function, this particular function comes with a powerful “free theorem”. Later on (section 11.3), we give a more detailed account of various function types and the strength of their associated “free theorems”.
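To make the idea of instantiating the quantifiers of a free theorem concrete, here is a minimal sketch for the type (A : Set) → A → A. It is an illustration under assumptions, not code from the library: the name ⟦Id⟧ stands for the hand-expanded logical relation at that type (with the semantic brackets written ⟦ ⟧), and f-is-id is a hypothetical name. The parametricity statement for f is taken as an explicit hypothesis, in the spirit of the parametricity hypothesis of section 7.1.

```agda
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- The logical relation at type (A : Set) → A → A, expanded by hand.
⟦Id⟧ : (f₁ f₂ : (A : Set) → A → A) → Set₁
⟦Id⟧ f₁ f₂ =
  ∀ {A₁ A₂ : Set} (Aᵣ : A₁ → A₂ → Set)
    {x₁ : A₁} {x₂ : A₂} (xᵣ : Aᵣ x₁ x₂) →
  Aᵣ (f₁ A₁ x₁) (f₂ A₂ x₂)

-- Free theorem: a function that is related to itself must be the identity.
-- The relation Aᵣ is instantiated with "the left component equals x".
f-is-id : (f : (A : Set) → A → A) → ⟦Id⟧ f f →
          (A : Set) (x : A) → f A x ≡ x
f-is-id f ⟦f⟧ A x = ⟦f⟧ {A} {A} (λ y _ → y ≡ x) {x} {x} refl
```

The whole proof is a choice of relation: nothing about the definition of f is inspected.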
7.2 An example: Boolean values represented by numbers

Logical relations help understand in what sense the interface offered by an abstract type is safe, or in other words, in what sense the abstraction offered by the interface is independent of the underlying representation (Reynolds, 1983; Mitchell, 1986). In order to explain this,
  data J⊥K : JSet0K ⊥ ⊥ -- no constructors

  data JBoolK : JSet0K Bool Bool where
    JtrueK  : JBoolK true true
    JfalseK : JBoolK false false

  data JNK : JSet0K N N where
    JzeroK : JNK zero zero
    JsucK  : (JNK J→K JNK) suc suc

  data _J⊎K_ {A1 A2 B1 B2} (Ar : JSet0K A1 A2) (Br : JSet0K B1 B2) : A1 ⊎ B1 → A2 ⊎ B2 → Set0 where
    Jinj1K : (Ar J→K Ar J⊎K Br) inj1 inj1
    Jinj2K : (Br J→K Ar J⊎K Br) inj2 inj2

Figure 4. Logical relations for inductive data types
we introduce a tiny example, where Boolean values are represented using natural numbers. We want 0 to represent false and any other number to represent true. Therefore, Boolean disjunction can be implemented using addition. We show that logical relations help build a “model” and ensure that an implementation respects this model. Then, parametricity can be used to show that a client that uses only the interface must also respect the model. Our tiny implementation of Booleans using natural numbers is given below. It contains a type B that we want to keep abstract. It contains obvious definitions for true, false, and disjunction ∨. Furthermore, it intentionally offers a dubious operation, is42?.

  B : Set
  B = N

  false : B
  false = 0

  true : B
  true = 1

  _∨_ : B → B → B
  m ∨ n = m + n

  is42? : B → B
  is42? 42 = true
  is42? _  = false

The function is42? is intuitively not “well-behaved” because, even though the natural numbers 41 and 42 both encode the Boolean value true, this function maps them to distinct results. Thus, we wish to define a criterion that allows us to easily and formally
tell which operations are safe and which are not. To begin, we define a binary relation JBK over the type B. The idea is that two natural numbers are related if and only if they have the same meaning, that is, if and only if they encode the same truth value. This relation is an “invariant” which every operation must preserve. Technically, we define JBK as an inductive data type. Its definition states that 0 is related with itself and that any two nonzero numbers are related.

  data JBK : B → B → Set where
    JfalseK : JBK 0 0
    JtrueK  : ∀ {m n} → JBK (suc m) (suc n)

In this approach, one explicitly defines when two values mean “the same thing”, but one does not explicitly define what that “thing” is. In this toy example, one could easily adopt a different (and simpler) approach, where one explicitly defines that 0 represents false and that any other number represents true. It would naturally follow that two numbers are equivalent if and only if they represent the same truth value. This works well because the inhabitants of our intended model, namely the Boolean values, have a canonical representation. In our real-world application, where the problem is to define α-equivalence at every type, it is easier to define when two terms are “equivalent” than it is to map every term to a canonical representative. This is why logical relations seem particularly natural and useful in our setting. Defining JBK suffices to define the logical relation at every type. This defines what we view as “good behavior”. If a piece of client code has type τ, we expect it to satisfy the free theorem J τ K. The reader might wonder, however, why we are allowed to choose the definition of JBK. After all, since B is internally defined as N, mustn’t we define JBK as JNK? If we define JBK in some other way, how do we know that the logical relation is still universal? To see why this makes sense, consider a client of the library.
This client can be thought of as a function that expects an implementation of the library as an argument. Thus, this function is parameterized over the type B and over the operations true, false, ∨, and is42?. In other words, this function is polymorphic in B. Thus, the “theorem for free” that comes with this function is universally quantified with respect to B and with respect to a relation JBK. This explains why we may define the relation JBK however we please. Naturally, we must still satisfy a few proof obligations. The “theorem for free” that comes with the client is further parameterized with the operations true, false, ∨, is42? and with proofs JtrueK, JfalseK, J∨K, Jis42?K that each of these operations is well-behaved. That is, we must prove that each of the operations offered by the library is related to itself. The data constructors JtrueK and JfalseK are obvious witnesses to the fact that the operations true and false are well-behaved. There remains to check whether ∨ and is42? are well-behaved. Every time, the statement that must be proved is constructed in a systematic manner: if an operation has type τ, then one must check that its implementation
is related to itself by the relation J τ K. For instance, here is the statement that must be proven about ∨:

  _J∨K_ : (JBK J→K JBK J→K JBK) _∨_ _∨_

Once unfolded, this statement looks like this:

  _J∨K_ : ∀ {x1 x2} (xr : JBK x1 x2) {y1 y2} (yr : JBK y1 y2) → JBK (x1 ∨ y1) (x2 ∨ y2)

This proposition states that the disjunction operation maps related arguments to related results. Now, thanks to the inductive definition of +, pattern-matching on the first relation argument suffices to cause the goal to reduce. Thus, we are able to offer the following nice-looking definition of J∨K, which one can recognize as the usual lazy definition of left-biased disjunction:

  JfalseK J∨K x = x
  JtrueK  J∨K _ = JtrueK
Let us now consider the question of the well-behavedness of the function is42?. Of course, there is no proof that this function is well-behaved. In fact, it is easy to prove that it is ill-behaved. It suffices to exhibit two related inputs, say 42 and 27, that are mapped to non-related outputs (we have is42? 42 = 1 and is42? 27 = 0).

  ¬Jis42?K : ¬ ((JBK J→K JBK) is42? is42?)
  ¬Jis42?K Jis42?K with Jis42?K {42} {27} JtrueK
  ... | ()   -- absurd

Note that is42? is rejected by our model with no consideration of which other operations are exported. Once the relation JBK is defined, it suffices to “turn the crank” to find out that is42? is ill-behaved. This modularity is precious and has helped us easily determine which operations could or could not be offered as part of the NOMPA library.
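The “different (and simpler) approach” mentioned earlier in this section, where each number is mapped to the truth value that it represents, can be sketched as follows. This is an illustration only: the function decode and the lemmas sound and complete are hypothetical names that do not appear in the library, and the semantic brackets are written ⟦ ⟧. The two lemmas show that, in this toy setting, the relation on B coincides with “having the same decoding”.

```agda
open import Data.Nat  using (ℕ; zero; suc)
open import Data.Bool using (Bool; true; false)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- The relation of this section, restated over ℕ so that the file stands alone.
data ⟦B⟧ : ℕ → ℕ → Set where
  ⟦false⟧ : ⟦B⟧ 0 0
  ⟦true⟧  : ∀ {m n} → ⟦B⟧ (suc m) (suc n)

-- Hypothetical meaning function: 0 is false, any other number is true.
decode : ℕ → Bool
decode zero    = false
decode (suc _) = true

-- Related numbers decode to the same truth value.
sound : ∀ {m n} → ⟦B⟧ m n → decode m ≡ decode n
sound ⟦false⟧ = refl
sound ⟦true⟧  = refl

-- Conversely, numbers with the same decoding are related.
complete : ∀ m n → decode m ≡ decode n → ⟦B⟧ m n
complete zero    zero    _  = ⟦false⟧
complete (suc _) (suc _) _  = ⟦true⟧
complete zero    (suc _) ()
complete (suc _) zero    ()
```

Such a decoding exists here because Boolean values have canonical representatives; for terms up to α-equivalence, no equally simple decoding is available, which is the motivation given above for working with relations directly.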
7.3 Relations for NOMPA

For NOMPA, we apply the same process as in the toy example. We define our expectations by defining a relation for each of the abstract types that the library advertises. Then, we prove that each operation offered by the library is well-behaved with respect to the logical relation that arises out of these definitions.

7.3.1 Relations for NOMPA types

For reference, the definitions are given in figure 5. We now describe them in turn.
The first thing to do is to define the constant JWorldK. What does it mean for two worlds to be related? It is useful to think of the manner in which JSet0K is defined. Recall that it is defined by the equation JSet0K A1 A2 = A1 → A2 → Set0. This means that the “theorem for free” that describes a polymorphic object is universally quantified over two types A1 and A2 and over a relation Ar between these types. Now, we would like to define JWorldK in such a way that, similarly, the “theorem for free” that describes a world-polymorphic object is universally quantified over two worlds α1 and α2 and over a relation between these worlds. Thus, it seems that we could perhaps let JWorldK α1 α2 be Name α1 → Name α2 → Set0, that is, the set of all relations between (the sets of names denoted by) α1 and α2. However, if we adopted such a definition, we would later be unable to prove that the name comparison operation ==N is well-behaved. Because this operation is world-polymorphic, we will have to prove that, for all worlds α1 and α2 and for every relation αr of type JWorldK α1 α2, this operation maps αr-related arguments to equal results. (Indeed, the relation JBoolK is just equality.) If αr was allowed to range over arbitrary relations, it would follow that ==N must be a constant function. Thus, we must restrict the set of allowable relations. It is clear that the equality test ==N is well-behaved if and only if every relation in JWorldK α1 α2 preserves equality in both directions, i.e., is functional and injective: Preserve-≡ R = ∀ x1 y1 x2 y2 → R x1 x2 → R y1 y2 → x1 ≡ y1 ↔ x2 ≡ y2 We allow JWorldK α1 α2 to contain all such relations. Thus, the “theorem for free” that describes a world-polymorphic object will be universally quantified over two worlds and over a functional and injective relation between them. In other words, we choose the definition of JWorldK that leads to the strongest possible “free theorems”, under the requirement that ==N be well-behaved. 
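The requirement Preserve-≡ can be explored in isolation. The sketch below is an illustration under assumptions, not library code: it reads the symbol ↔ as logical equivalence, encoded by a hypothetical operator _⇔_ as a pair of implications, and it introduces the hypothetical names Graph and graph-preserves. It shows one familiar source of relations that satisfy the requirement, namely the graph of an injective function.

```agda
open import Data.Product using (_×_; _,_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl; cong)

-- Logical equivalence as a pair of implications (an assumed reading of ↔).
_⇔_ : Set → Set → Set
A ⇔ B = (A → B) × (B → A)

-- A relation preserves equality in both directions.
Preserve-≡ : {A B : Set} (R : A → B → Set) → Set
Preserve-≡ R = ∀ x₁ y₁ x₂ y₂ → R x₁ x₂ → R y₁ y₂ → (x₁ ≡ y₁) ⇔ (x₂ ≡ y₂)

-- The graph of a function, viewed as a relation.
Graph : {A B : Set} → (A → B) → A → B → Set
Graph f x y = f x ≡ y

-- The graph of an injective function preserves equality:
-- left to right by congruence, right to left by injectivity.
graph-preserves : {A B : Set} (f : A → B) →
                  (∀ {x y} → f x ≡ f y → x ≡ y) →
                  Preserve-≡ (Graph f)
graph-preserves f inj x₁ y₁ _ _ refl refl = cong f , inj
```

In particular, every injective renaming of names gives rise to an admissible relation between worlds in this sense.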
It turns out that, with this definition of JWorldK, we will be able to prove that every other operation is well-behaved too. Because the type Name is parameterized with a world, the relation JNameK is parameterized with a relation between worlds. It is defined as the identity: two names are related by JNameK αr if and only if they are related by αr. The definition of the relation JBinderK is surprisingly simple: it is the full relation. Thus, every two binders are related by JBinderK, and a function that allows distinguishing between two binders is considered ill-behaved. This is consistent with our goal of disallowing functions that distinguish between two α-equivalent representations of a λ-term.
A consequence of this definition of JBinderK is that the library cannot provide an equality test over binders. Indeed, as demonstrated by the following theorem, if f is a function of type Binder → Binder → Bool, then it must be a constant function.

  -- JfK is the parametricity theorem for f
  JfK : (JBinderK J→K JBinderK J→K JBoolK) f f

  -- f-const is a corollary of JfK.
  -- f-const shows that f is a constant function.
  f-const : ∀ x1 x2 y1 y2 → f x1 y1 ≡ f x2 y2
  f-const x1 x2 y1 y2 with JfK {x1} {x2} _ {y1} {y2} _
  ... | JtrueK  = refl
  ... | JfalseK = refl

Next, we define the relation J⊆K. Again, we wish to adopt the most liberal definition with respect to which we can prove that the operations offered by the library are well-behaved. For this purpose, we exploit the fact that there is ultimately only one way to use an inclusion witness, which is to pass it as an argument to the operation coerceN. Thus, we posit that two world inclusion witnesses α1⊆β1 and α2⊆β2 are related if and only if coerceN α1⊆β1 and coerceN α2⊆β2 are related (see figure 5). Although this definition is arguably quite elegant, it may seem somewhat “magic” and opaque. Fortunately, one can formulate an equivalent definition in terms of inclusion of relations. Let us say that a relation R1 is a subset of a relation R2 if and only if every pair that is related by R1 is related by R2 as well. Then, two relations αr and βr are related by J⊆K if and only if αr is a subset of βr. Indeed, if one considers the definition of J⊆K in figure 5 and expands the definitions of JNameK and J→K, one finds:

  _J⊆K_ αr βr α1⊆β1 α2⊆β2 = ∀ {x1 x2} → (x1 , x2) ∈ αr → (coerceN α1⊆β1 x1 , coerceN α2⊆β2 x2) ∈ βr

Because the function coerceN behaves (after erasure) as the identity function, the right-hand side of the above equation informally means that the relation αr is a subset of the relation βr. We now define J∅K and J/K. Whereas ∅ is a world, J∅K is a relation between the empty world and itself.
We have no choice: there is only one such relation, namely the empty relation. We show its type and omit its definition:

  J∅K : JWorldK ∅ ∅

Whereas / maps a binder and a world to a new world, J/K maps a pair of binders and a relation between worlds to a new relation between worlds. In other words, it extends an existing relation between worlds, say αr, with a new pair of binders, say br. How should we define J/K? An idea that naturally comes to mind is to construct the set-theoretic union of the relation αr and of the singleton set {br}. However, this would not make
  Preserve-≡ : {A B : Set0} (R : A → B → Set0) → Set0
  Preserve-≡ R = ∀ x1 y1 x2 y2 → R x1 x2 → R y1 y2 → x1 ≡ y1 ↔ x2 ≡ y2

  -- JWorldK : JSet1K World World
  record JWorldK (α1 α2 : World) : Set1 where
    constructor _,_
    field
      R        : Name α1 → Name α2 → Set
      R-pres-≡ : Preserve-≡ R

  JNameK : (JWorldK J→K JSet0K) Name Name
  -- : ∀ {α1 α2} → JWorldK α1 α2 → Name α1 → Name α2 → Set
  JNameK (R , _) x1 x2 = R x1 x2

  JBinderK : JSet0K Binder Binder
  -- : Binder → Binder → Set
  JBinderK _ _ = ⊤

  J∅K : JWorldK ∅ ∅
  J∅K = (λ _ _ → ⊥) , (λ ())

  _J/K_ : (JBinderK J→K JWorldK J→K JWorldK) _/_ _/_
  -- not proper Agda def
  br J/K αr = { (b1, b2) } ∪ { (x, y) | (x, y) ∈ αr ∧ x ≢ b1 ∧ y ≢ b2 }

  J#K : (JBinderK J→K JWorldK J→K JSet0K) _#_ _#_
  -- : ∀ {b1 b2} → JBinderK b1 b2 → ∀ {α1 α2} → JWorldK α1 α2
  --   → b1 # α1 → b2 # α2 → Set
  J#K _ _ _ _ = ⊤

  _J⊆K_ : (JWorldK J→K JWorldK J→K JSet0K) _⊆_ _⊆_
  -- : ∀ {α1 α2} → JWorldK α1 α2 →
  --   ∀ {β1 β2} → JWorldK β1 β2 →
  --   α1 ⊆ β1 → α2 ⊆ β2 → Set
  _J⊆K_ αr βr α1⊆β1 α2⊆β2 = (JNameK αr J→K JNameK βr) (coerceN α1⊆β1) (coerceN α2⊆β2)

Figure 5. Relations for NOMPA types
sense: we must be careful to ensure that the resulting relation is functional and injective. If the first component of the pair br is already a member of the domain of the relation αr, or (symmetrically) if the second component of br is already a member of the codomain of αr, then the set-theoretic union of αr and {br } might not be functional and injective. This corresponds to a situation where a new binder “shadows” a previous one. In that case, we would like the new pair br to take precedence over any earlier bindings. Thus, the definition that we ultimately adopt can be described as the set-theoretic union of the
Figure 6. The effect of J/K on relations (three panels, each a relation between names drawn as dots numbered from 0: αr; ⟨4,2⟩J/K αr; ⟨4,4⟩J/K ⟨4,2⟩J/K αr)
relation αr, deprived of any bindings that conflict with br, and of the singleton set {br}. It can be informally defined as follows:

  _J/K_ : (JBinderK J→K JWorldK J→K JWorldK) _/_ _/_
  -- not proper Agda def
  br J/K αr = { (b1, b2) } ∪ { (x, y) | (x, y) ∈ αr ∧ x ≢ b1 ∧ y ≢ b2 }

The effect of J/K is illustrated in figure 6.
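The informal set-theoretic definition can be turned into a small AGDA model in order to check that the extended relation remains functional and injective. The sketch below rests on simplifying assumptions: names and binders are both modeled by plain natural numbers, worlds are ignored, the symbol ↔ is read as a pair of implications (the hypothetical operator _⇔_), and the names Rel, Extend and extend-pres are introduced only for this illustration. The library's actual definition may differ.

```agda
open import Data.Nat using (ℕ)
open import Data.Product using (_×_; _,_)
open import Data.Empty using (⊥-elim)
open import Relation.Binary.PropositionalEquality using (_≡_; _≢_; refl; sym)

_⇔_ : Set → Set → Set
A ⇔ B = (A → B) × (B → A)

-- Relations between names, with names modeled as natural numbers.
Rel : Set₁
Rel = ℕ → ℕ → Set

Preserve-≡ : Rel → Set
Preserve-≡ R = ∀ x₁ y₁ x₂ y₂ → R x₁ x₂ → R y₁ y₂ → (x₁ ≡ y₁) ⇔ (x₂ ≡ y₂)

-- { (b₁ , b₂) } ∪ { (x , y) | (x , y) ∈ R ∧ x ≢ b₁ ∧ y ≢ b₂ }
data Extend (b₁ b₂ : ℕ) (R : Rel) : ℕ → ℕ → Set where
  here  : Extend b₁ b₂ R b₁ b₂
  there : ∀ {x y} → x ≢ b₁ → y ≢ b₂ → R x y → Extend b₁ b₂ R x y

-- Extending a relation in this way keeps it functional and injective.
extend-pres : ∀ {b₁ b₂ R} → Preserve-≡ R → Preserve-≡ (Extend b₁ b₂ R)
extend-pres p _ _ _ _ here here = (λ _ → refl) , (λ _ → refl)
extend-pres p _ _ _ _ here (there y≢b₁ y≢b₂ _) =
  (λ b₁≡y → ⊥-elim (y≢b₁ (sym b₁≡y))) , (λ b₂≡y → ⊥-elim (y≢b₂ (sym b₂≡y)))
extend-pres p _ _ _ _ (there x≢b₁ x≢b₂ _) here =
  (λ x≡b₁ → ⊥-elim (x≢b₁ x≡b₁)) , (λ x≡b₂ → ⊥-elim (x≢b₂ x≡b₂))
extend-pres p x₁ y₁ x₂ y₂ (there _ _ rx) (there _ _ ry) = p x₁ y₁ x₂ y₂ rx ry
```

The two side conditions carried by the constructor there are exactly what rules out the mixed cases: a pair inherited from R can never share a component with the new pair.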
7.3.2 NOMPA values fit the relation

We now give a short overview of the proofs needed to show that the operations offered by the library fit the relation. Formally, for each operation p of type τ that appears in the interface of the library, we have to exhibit a proof JpK of the statement J τ K p p. All proofs can be found online (Pouillard, 2011b). Many of these proofs are immediate. For instance, because the relation JBinderK is the full relation, any operation whose return type is Binder is well-behaved. For some operations, the proof is just a matter of expanding the definitions. For instance, in the case of the operation nameB, which converts a binder to a name, the statement that must be proved is the following:

  JnameBK : (∀⟨ αr : JWorldK ⟩J→K ⟨ br : JBinderK ⟩J→K JNameK (br J/K αr)) nameB nameB
  -- : ∀ {α1 α2} (αr : JWorldK α1 α2)
  --     {b1 b2} (br : JBinderK b1 b2)
  --   → JNameK (br J/K αr) (nameB {α1} b1) (nameB {α2} b2)

That is, roughly speaking, we must prove that the names b1 and b2 are related by the relation br J/K αr. This follows immediately from the definition of J/K, since the effect of this operation is precisely to extend the relation αr with the pair (b1 , b2).
In the case of the equality test J==NK, once unfolded, the statement requires that the equality test commute with a renaming. In other words, the outcome of an equality test must not change when its inputs are consistently renamed.

  J==NK : (∀⟨ αr : JWorldK ⟩J→K JNameK αr J→K JNameK αr J→K JBoolK) _==N_ _==N_
  -- : ∀ {α1 α2} (αr : JWorldK α1 α2)
  --     {x1 x2} (xr : JNameK αr x1 x2)
  --     {y1 y2} (yr : JNameK αr y1 y2)
  --   → JBoolK (x1 ==N y1) (x2 ==N y2)
The proof is in two parts. First, we prove that the Boolean-valued function ==N decides propositional equality on names. Second, we exploit the fact that the relation αr preserves equality, that is, αr is functional and injective. The proof that exportN? is in the relation relies on two points. First, the success of (exportN? {b} x) (that is, whether it returns just or nothing) depends only on the equality between (nameB b) and x. Second, every pair in the relation (br J/K αr) is either in the relation (br J/K J∅K) or in the relation αr. This follows from our definition of J/K. Thanks to the definition of J⊆K, the proof that coerceN is well-behaved is immediate. There remains to show that each of the world inclusion rules (figure 1) is well-behaved. This can be done informally by a simple inspection of these rules, while keeping in mind that a world α must now be interpreted as a relation between worlds and world inclusion ⊆ must now be interpreted as inclusion of relations. In the last rule, ⊆-#, we can now see why the freshness hypothesis b # α is required. Indeed, the goal is to prove that the relation αr is a subset of the relation (br J/K αr). In the absence of any hypothesis about br and αr, this is false, because the operation J/K can deprive αr of certain pairs if there is shadowing. In the presence of the above freshness hypothesis, we can further assume that b1 is not in the domain of αr and that b2 is not in the codomain of αr. In this case, αr is indeed a subset of (br J/K αr).
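The role of the freshness hypothesis in the rule ⊆-# can be checked in a simplified model. As before, this is a sketch under assumptions rather than the library's proof: names and binders are plain natural numbers, Extend is a hypothetical inductive rendering of the informal definition of J/K, and Fresh is a hypothetical stand-in for the consequence of b # α stated above (b1 outside the domain, b2 outside the codomain). The lemma weaken shows the inclusion under freshness; kept and shadowed show that the inclusion fails when a binder is shadowed, using the pairs of figure 6.

```agda
open import Data.Nat using (ℕ)
open import Data.Product using (_×_; proj₁; proj₂)
open import Relation.Nullary using (¬_)
open import Relation.Binary.PropositionalEquality using (_≡_; _≢_; refl)

Rel : Set₁
Rel = ℕ → ℕ → Set

-- { (b₁ , b₂) } ∪ { (x , y) | (x , y) ∈ R ∧ x ≢ b₁ ∧ y ≢ b₂ }
data Extend (b₁ b₂ : ℕ) (R : Rel) : ℕ → ℕ → Set where
  here  : Extend b₁ b₂ R b₁ b₂
  there : ∀ {x y} → x ≢ b₁ → y ≢ b₂ → R x y → Extend b₁ b₂ R x y

-- Assumed reading of freshness: b₁ is not in the domain of R
-- and b₂ is not in its codomain.
Fresh : ℕ → ℕ → Rel → Set
Fresh b₁ b₂ R = ∀ {x y} → R x y → (x ≢ b₁) × (y ≢ b₂)

-- Under freshness, R is a subset of its extension.
weaken : ∀ {b₁ b₂ R} → Fresh b₁ b₂ R → ∀ {x y} → R x y → Extend b₁ b₂ R x y
weaken fresh r = there (proj₁ (fresh r)) (proj₂ (fresh r)) r

-- Without freshness the inclusion can fail: the pair (4 , 2) belongs to
-- the relation extended with (4 , 2) ...
kept : ∀ {R : Rel} → Extend 4 2 R 4 2
kept = here

-- ... but it is lost once the pair (4 , 4) shadows it.
shadowed : ∀ {R : Rel} → ¬ Extend 4 4 (Extend 4 2 R) 4 2
shadowed (there 4≢4 _ _) = 4≢4 refl
```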
7.3.3 An example of an ill-behaved operation

Consider the following function, which exposes a total ordering on names: