Dynamic Rebinding for Distributed Programming Gavin Bierman†
Michael Hicks‡
Peter Sewell†∗ Gareth Stoyle†
†
University of Cambridge {First.Last}@cl.cam.ac.uk
Abstract Most programming languages adopt static binding, but for distributed programming an exclusive reliance on static binding is too restrictive: dynamic binding is required in various guises. Typically it is provided only by ad-hoc mechanisms that lack clean semantics. In this paper we adopt a more foundational approach, showing how core dynamic rebinding mechanisms can be added to a CBV λ-calculus. To do so we first develop two refinements of the CBV reduction strategy with delayed instantiation, the redex-time and destruct-time semantics. Delayed instantiation ensures we obtain the most recent version of a definition following a rebinding. The destructtime semantics gives the basis for a λmark calculus that supports dynamic rebinding, with primitives to package values and marks to control which identifiers are dynamically rebound when they are unpackaged. We extend λmark with concurrency and communication, giving examples showing how wrappers for encapsulating untrusted code can be expressed, and discuss the extent to which it can be statically typed. Finally, we use the destruct-time semantics also as a basis for a λupdate calculus with simple primitives to provide type-safe, dynamic updating of code. We thereby put a variety of real-world mechanisms on a common semantic foundation. 1.
Introduction
Dynamic Rebinding Most programming languages employ static binding, with the meanings of each occurrence of an identifier determined by its compile-time context. In general, this gives more comprehensible code than dynamic binding alternatives, where the meaning of identifiers depend in some sense on their ‘use-time’ context. Static binding is also a requirement for most static type systems, with their well∗ Corresponding Author. Computer Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0FD, UK. Fax: +44 1223 334678.
Keith Wansbrough†
‡
Cornell University
[email protected]
known advantages. Modern software, though, is becoming increasingly dynamic, as it becomes ever more modular, extensible, and distributed. Exclusive use of static binding is too limiting in many ways. Simple uses of dynamic binding are ubiquitous, of course. Conventional executables will, when run, load so-called shared libraries to resolve standard library functions (e.g. read, write, etc.). Which libraries are loaded depends upon the context; for example, a machine might have a library compiled with profiling enabled and one without. Dynamically loading and resolving program code, whether shared libraries or otherwise, is achieved via dynamic linking. Modern languages often provide an interface to the dynamic linker so that programs can load new code at runtime, perhaps downloaded from the network [9, 8, 20, 23, 4]. While extremely useful, dynamic linking is not always sufficient, since it both predefines which identifiers are to be dynamically bound, and fixes those bindings once they are established. If code or running computations can be marshalled and communicated at runtime then some of their identifiers may need to be rebound upon a move to a new context – both ‘external’ identifiers of system-calls or language runtime library functions, and ‘internal’ identifiers from application libraries which exist in the new context and may have important location-dependent behaviour. Moreover, the set of identifiers to be rebound may change as it moves. For example, it may be desirable to acquire an organisation-specific library that, once resolved, should be fixed and carried with code moved within that organisation. A number of distributed process calculi have interaction primitives with location-dependent semantics but without higher-order functions [5, 27, 22, 24, 28, 6]. Another motivation for rebinding arises from the fact that many systems must provide uninterrupted service, but nonetheless must be dynamically updated to fix bugs or add new functionality [17, 14, 29, 11, 4]. One solution is to dynamically load new code and data into the program, and then rebind some of its existing identifiers to the new definitions. As with mobile code, potentially any identifer must be subject to rebinding. For security reasons, in several of these scenarios we may wish to customize which definitions are bound to which identifiers [23, 25, 19, 27]. For example, when loading an untrusted applet, we may bind its open identifier to a safe open function that only opens files in the /tmp directory. On the other hand, we want the flexibility to link trusted code with the unconstrained open function.
Goals and Approach While dynamic rebinding is clearly useful in practice, most modern programming languages provide only rather ad-hoc mechanisms. Moreover, no adequate semantic understanding of rebinding currently exists. Our goal in this paper is to identify the key aspects of dynamic rebinding, as a step towards the design of improved languages for distributed computation and dynamic updating. We relate our approach to prior work in detail in §6. We are most concerned with languages that have call-byvalue (CBV) reduction, as that fits well with wide-area communication, and static typing, as early detection of errors is particularly important in both distributed and long-running systems. Ultimately, dynamic rebinding primitives should be integrated with a module system that supports separate compilation, but to focus on the essentials we work in a lambda-calculus setting. Accordingly, we begin with a simply-typed CBV lambda-calculus and develop a number of related calculi, together addressing the various forms of dynamic binding above. We discuss the extent to which each can be statically typed. To keep the calculi as transparent as possible they are defined with operational semantics over the expression syntax, rather than with abstract machines. Three questions are fundamental: 1. How and when does rebinding occur within a program, and how does it interact with conventional CBV λ variable resolution? 2. How is it determined which identifiers can be rebound and which are fixed? 3. How can rebinding be used to support mobile and dynamically updateable programs? In the remainder of the introduction we present an overview of our work, which gives answers to these questions. Revisiting CBV λ-Calculus To address the first question, we start with the CBV λ-calculus, but re-examine the way in which identifiers are instantiated. The usual operational semantics substitutes out binders: the (app) and (let) rules (app) (λz :T .e)v (let) let z = v in e
−→ −→
{v /z }e {v /z }e
instantiate all instances of z as soon as the value v that it has been bound to has been constructed. This approach is not compatible with dynamic rebinding, as it loses too much information. To see this, suppose that e in let z = v in e transmits an expression containing z to some other machine, and we have indicated somehow that z should be dynamically rebound to the local definition when it arrives. With the (let) rule this would be futile, as the z is substituted away before the communication occurs. Similarly, a dynamic update of z after a (let) will be void. We therefore need a more refined semantics that preserves information about the binding structure of terms, allowing us to delay ‘looking up’ the value associated with an identifier as long as possible so as to obtain the most relevant/recent version of its definition. This must maintain the essentially call-by-value nature of the calculus, however – we elaborate below on exactly what this means. We present two reduction strategies with delayed instantiation in §2. The redex-time (λr ) semantics resolves identifiers when in redex position. While this is clean and simple, it is still unnecessarily eager, and so we formulate the
destruct-time (λd ) semantics to delay resolving identifiers until their values must be destructed. Dynamic Rebinding: the λmark Calculus With λd in place we can consider the second question concerning dynamic rebinding: how can the user specify which variables should be rebound and which should be fixed? Our answer is embodied in the λmark calculus of §3, which contains primitives for packaging a value such that some of its identifiers are fixed to bindings in the current context, while others will be rebound when unpackaged in a new scope (e.g. when the code is moved). Which bindings will be fixed is dynamically determined with respect to a mark. Let-bindings occuring ‘below the mark’ are grabbed and treated as part of the value, statically bound, whereas identifiers declared above the mark will be rebound when the value is unpackaged. For example, in the expression let y1 = 6 in mark M in let x1 = ( let z1 = 3 in grab M (y1 , z1 )) in let y1 = 7 in mark M 0 in ungrab M 0 x1 the pair (y1 , z1 ) is grabbed with respect to the context marked M , where y1 = 6, but ungrabbed with respect to the context M 0 , where y1 = 7. The z1 , on the other hand, is bound below mark M , so its binding z1 = 3 is grabbed and carried with it. The final result is (7, 3), not (6, 3), showing that z1 and y1 were respectively bound statically and rebound dynamically. Marks allow the programmer to cleanly and flexibly notate which definitions should be fixed and which should be rebindable, making this choice locally for each grab. Adding marks or changing their position is syntactically lightweight – it does not require any change to code except at grab/ungrab points. Moreover, it will usually be straightforward to change the let-bindings in programs that contain marks: changing let-bindings inside marks is as usual; changing them outside a mark may require corresponding changes outside other marks but no change to any grab and ungrab expressions. Contrast, for example, with primitives that require explicit lists of the identifiers to be rebound. Na¨ıve dynamic rebinding introduces new potential runtime errors, e.g. if the context does not contain a required binding. In some scenarios a runtime check will be essential, but others intuitively should be statically typable. We discuss the difficulties and give a partial solution for marks between top-level lets. Applications: Mobility and Update To answer the third question, we present calculi for two real-world applications of dynamic rebinding: mobile code and dynamic update. To support the first, we extend our initial development of λmark with π-calculus-style communication and external library functions, yielding the calculus λio mark . We present the relevant semantics and gives further examples in §4. To support dynamic updating, we extend λd with an update primitive that allows particular program variables to be rebound to new expressions; we present the resulting λupdate
step reduction relation e −→ e 0 , using evaluation contexts E , and a runtime-error predicate e err (though the rules for the latter are elided). For now we will be concerned only let x1 = 5 in xtran{{y ⇐ (x1 , 6)}} let x1 = 5 in with the behaviour of closed expressions, without external let y1 = (4, 6) in let y1 = (x1 , 6) in library functions (though the redex-time and destruct-time let z1 = update in let z1 = () in semantics must define −→ for all expressions). The choice π 1 y1 π 1 y1 of a small-step semantics will be important when we add dynamic rebinding and communication later. Both this and The update expression indicates that an update is possible the remaining calculi are simply-typed, with a typing judgeat the point during evaluation when update appears in ment Γ ` e:T defined as usual, where Γ ranges over sea reduction context. At that runtime point the user can quences of z :T pairs. We omit the typing rules for brevity. supply an update of the form {w ⇐ e}, indicating that w Notation: context application and composition are both should be rebound to expression e. In the example this written with a dot, e.g. E .E 0 and E .e. The set of free identiupdate is {y ⇐ (x1 , 6)}; the let-binder for y1 is modified fiers of an expression or context is written with fv( ); the list accordingly yielding the expression on the right above, and of identifiers that bind around the hole of a context E is writthence a final result of 5. Here any identifier in scope at ten hb(E ), defined by hb( ) = [], hb(E .(let z = e in )) = the update point can be rebound, to an expression that may hb(E ), z , and hb(E .A) = hb(E ) for any other atomic context mention any identifiers in scope at its binding point. We A. We overload ∈ for lists. Standard capture-avoiding subdefine what it means for an update to be well-typed with stitution of e for variable z in e 0 is written {e/z }e 0 . A numrespect to a program; applying well-typed updates preserves ber of evaluation examples are presented in Fig. 2, alongside typing. examples in the redex-time and destruct-time calculi, which λupdate requires delaying variable instantiation in the same we present next. We will look at these examples more closely manner as λmark , and is addressed with the same λd mechin the next two subsections. anisms in a clean semantics. Such a treatment is novel because it deals simply and cleanly with higher-order func2.2 Redex-time tions, largely ignored in past work. We imagine λupdate will The redex-time semantics is also in Fig. 1. Instead of form the core of future calculi that include other desirable substituting out bindings of identifiers to values, as in the features, such as state transformation, changing the types of construct-time (app) and (let), it preserves them for later invariables, multi-threading, etc. stantiation. The redex-time (app) simply records a binding In §§6–7 we relate our approaches to prior work, comment of the abstraction’s identifier to the application argument, on possible application to full-scale programming languages, e.g. and conclude.
calculus in §5. As an example, consider the expression on the left below:
2.
Call-by-value λ-calculus revisited This section revisits the call-by-value lambda calculus, exploring refined operational semantics that instantiate identifiers at different times. We first recall the standard construct-time semantics, instantiating identifiers as soon as they are bound to values. We then give novel redex-time and destruct-time semantics, which instantiate identifiers in redex position and under destructors respectively. Without dynamic binding or dynamic update primitives, the three semantics are observationally indistinguishable – we prove the last two are still essentially CBV despite their somewhat ‘lazy’ feel. They are both plausible foundations for the additions of binding and update primitives in the remainder of the paper; the increased observational power of such primitives splits them apart. All of the calculi use a standard syntax: Identifiers Integers Types Expressions
z n T e
::= ::= |
int | unit | T ∗ T 0 | T → T 0 z | n | () | (e, e 0 ) | πr e λz :T .e | ee 0 | let z = e in e 0
where r ranges over {1, 2}, and work up to alpha equivalence for expressions (though not for contexts). The calculus includes also a letrec for functions, but for brevity we elide the syntax and rules. 2.1 Construct-time The standard semantics, here called the construct-time semantics, is recalled at the top of Fig. 1. We define a small-
(λz :T .e)u
−→
let z = u in e
This is reminiscent of an explicit substitution [1], save that here the let will not be percolated through the term structure, and of the λlet -calculus [3], though we are in a CBV not CBN setting. Example (1) in Fig. 2 illustrates (app), contrasting it with the substitution approach of the constructtime semantics. Note that the resulting let z = 8 in 7 is a λr value, for reasons we explain below. The semantics must allow reduction under lets – in addition to the atomic evaluation contexts we had above, here A1 , we now have the binding contexts A2 ::= let z = u in . Reduction is closed under both. Variable resolution is now handled with the (inst) rule, which resolves the identifier z in redex position with the innermost enclosing let that binds that identifier. The side-condition z ∈ / hb(E3 ) ensures that the correct binding of z is used. The other side-condition (which can always be achieved by alpha conversion) prevents identifier capture, making E3 and let z = u in transparent for u. Example (2) in Fig. 2 illustrates identifier instantiation. While the construct-time strategy substitutes for x immediately, the redex-time strategy instantiates x under the let, following the evaluation order. Both this and the first example also illustrate a further aspect of the redex-time calculus: values u include let-bindings of the form let z = u in u 0 . Intuitively, this is because a value should ‘carry its bindings with it’ preventing otherwise stuck applications, e.g. (λx :int.x )(let z = 3 in 5) or (for an example where the let is not garbage) (λf :(int → int).x 2)(let z = 3 in λx :int.z ). Note that identifers are not values, so z , (z , z ) and let z =
Construct-time λc Values Atomic evaluation contexts Evaluation contexts (proj) πr (v1 , v2 ) (app) (λz :T .e)v (let) let z = v in e
::= n | () | (v , v 0 ) | λz :T .e ::= ( , e) | (v , ) | πr | e | v | let z = ::= | E .A
v A E
in e
−→ vr −→ {v /z }e −→ {v /z }e
e −→ e 0 E .e −→ E .e 0
Redex-time λr Values Atomic evaluation contexts Atomic bind contexts Evaluation contexts Bind contexts Reduction contexts (proj) πr (E2 .(u1 , u2 )) (app) (E2 .(λz :T .e))u (inst) let z = u in E3 .z
u A1 A2 E1 E2 E3
::= n | () | (u, u 0 ) | λz :T .e | let z = u in u 0 ::= ( , e) | (u, ) | πr | e | u | let z = in e ::= let z = u in ::= | E1 .A1 ::= | E2 .A2 ::= | E3 .A1 | E3 .A2
−→ E2 .ur −→ E2 .let z = u in e −→ let z = u in E3 .u
if fv(u) ∈ / hb(E2 ) if z ∈ / hb(E3 ) and fv(u) ∈ / z , hb(E3 )
e −→ e 0 E3 .e −→ E3 .e 0
Destruct-time λd Values Atomic evaluation contexts Atomic bind contexts Evaluation contexts Bind contexts Reduction contexts Destruct contexts
u A1 A2 E1 E2 E3 R
::= n | () | (u, u 0 ) | λz :T .e | let z = u in u 0 | z ::= ( , e) | (u, ) | πr | e | (λz :T .e) | let z = ::= let z = u in ::= | E1 .A1 ::= | E2 .A2 ::= | E3 .A1 | E3 .A2 ::= πr | u
(proj) πr (E2 .(u1 , u2 )) −→ E2 .ur (app) (E2 .(λz :T .e))u −→ E2 .let z = u in e if fv(u) ∈ / hb(E2 ) (inst-1) let z = u in E3 .R.E2 .z −→ let z = u in E3 .R.E2 .u if z ∈ / hb(E3 , E2 ) and fv(u) ∈ / z , hb(E3 , E2 ) (inst-2) R.E2 .let z = u in E20 .z −→ R.E2 .let z = u in E20 .u if z ∈ / hb(E20 ) and fv(u) ∈ / z , hb(E20 )
in e
e −→ e 0 E3 .e −→ E3 .e 0
Figure 1: Three Call-by-Value Lambda Calculi 3 in (z , z ) are not values. Values may contain free identifiers under lambdas, though, so λx :int.z is an open value (as usual) and let z = 3 in λx :int.z is a closed value. Because values may involve lets, some clean-up is needed to extract the usual result, for which we define [| n |] [| () |] [| (u, u 0 ) |] [| λx :T .e |] [| let z = u in u 0 |] [| z |]
= n = () = ([| u |], [| u 0 |]) = λx :T .e = {[| u |]/z }[| u 0 |] = z
taking any value (λr or λd ) and substituting out lets outside lambdas. The (proj) and (app) rules are straightforward except for the additional binding context E2 . This is necessary as a value may now have some let bindings around a pair or lambda; terms such as π1 (let z = 3 in (4, 5)) or (more interestingly) π1 (let z = 3 in (λx :int.z , 5)) would otherwise be stuck. The side condition for (app) can always be achieved by alpha conversion; it prevents capture.
2.3 Destruct-time The redex-time strategy is appealingly simple, but it instantiates earlier than necessary. In example (2) in Fig. 2, both occurrences of x are instantiated before the projection reduction. However, we could conceivably delay resolving x until after the projection; we see this behaviour in the destruct-time semantics in the third column. In many dynamic rebinding scenarios it is desirable to instantiate as late as possible. For example, in repeatedly-mobile code, we want to instantiate each identifier only as needed to always pick up local definitions. Similarly, for dynamically updateable code we want to delay looking up a variable so as to acquire the most recent version. To instantiate as late as possible, while remaining call-byvalue, we instantiate only identifiers that are immediately under a projection or on the left-hand-side of an application. In these ‘destruct’ positions their values are about to be deconstructed, and so their outermost pair or lambda structure must be made manifest. The destruct contexts R ::= πr | u can be seen as the outer parts of the construct-time (proj) and (app) redexes. The choice of destruct contexts is deter-
Construct-time λc
Redex-time λr
Destruct-time λd
(1) (λz .7)8 −→ 7
(λz .7)8 let z = 8 in 7
(λz .7)8 let z = 8 in 7
(2) −→ −→ −→
let x = 5 in π1 (x , x ) π1 (5, 5) 5
let x let x let x let x
= 5 in π1 (x , x ) = 5 in π1 (5, x ) = 5 in π1 (5, 5) = 5 in 5
let x = 5 in π1 (x , x ) let x = 5 in x
(3) −→ −→ −→
let x = (5, 6) in let y = x in π1 y let y = (5, 6) in π1 y π1 (5, 6) 5
let x let x let x let x
= (5, 6) in let y = (5, 6) in let y = (5, 6) in let y = (5, 6) in let y
(4) −→ −→ −→
π1 (π2 (let x = (5, 6) in (4, x )) π1 (π2 (4, (5, 6))) π1 (5, 6) 5
π1 (π2 (let x = (5, 6) in (4, x )) π1 (π2 (let x = (5, 6) in (4, (5, 6))) π1 (let x = (5, 6) in (5, 6)) let x = (5, 6) in 5
= x in π1 y = (5, 6) in π1 y = (5, 6) in π1 (5, 6) = (5, 6) in 5
let x let x let x let x
= (5, 6) in let y = (5, 6) in let y = (5, 6) in let y = (5, 6) in let y
=x =x =x =x
in π1 y in π1 x in π1 (5, 6) in 5
π1 (π2 (let x = (5, 6) in (4, x )) π1 (let x = (5, 6) in x ) π1 (let x = (5, 6) in (5, 6)) let x = (5, 6) in 5
Figure 2: Call-by-Value Lambda Calculi Examples mined by the basic redexes – the syntax does not include any arithmetic operations, so there is here no need to instantiate any identifier of int type. The essential change from the redex-time semantics is that now any identifier is a value ( u ::= ... | z ). The (proj) and (app) rules are unchanged. The (inst) rule is replaced by two that together instantiate identifiers in destruct contexts R. The first (inst-1) copes with identifiers that are let-bound outside a destruct context, e.g.: let z = (1, 2) in π1 z
−→
let z = (1, 2) in π1 (1, 2)
whereas in (inst-2) the let-binder and destruct context are the other way around: π1 (let z = (1, 2) in z ) −→ π1 (let z = (1, 2) in (1, 2)) Further, we must be able to instantiate under nested bindings between the binding in question and its use. Therefore, (inst-2) must allow additional bindings E2 and E20 between R and the let and between the let and z . Similarly, (inst1) must allow bindings E2 between the R and z ; it must allow both binding and evaluation contexts E3 between the let and the R, e.g. for the instance −→
let z = (1, (2, 3)) in π1 (π2 z ) let z = (1, (2, 3) in π1 (π2 (1, (2, 3)))
with E3 = π1 , R = π2 and E2 = . The conditions z ∈ / hb(E3 , E2 ) and z ∈ / hb(E20 ) ensure that the correct binding of z is used; the other conditions prevent capture and can always be achieved by alpha equivalence. Finally, because variables are now values, the context (u ) is now ambiguous. For example, supposing e −→ e 0 and taking e 00 = let f = λx :int.x in fe, we have both e 00 e 00
−→ let f = λx :int.x in f e 0 −→ let f = λx :int.x in (λx :T .x )e
and
To prohibit this nondeterminism, we replace the A1 evaluation context (u ) by (λx :T .e) , thus forcing the second
reduction. Examples To illustrate the differences in binding time, we revisit our Fig. 2 examples. A key trend to observe is the reduced loss of binding information as we move from left to right in the figure. In the construct-time case, we substitute for variables and eliminate their original definitions, losing information as to their origin. While redex-time does not eliminate variable definitions, it still eliminates relationships between variables; notice how the relationship between x and y is lost in the first evaluation step of example (3), or how the relationship between the two occurrences of x in the pair is lost after the first evaluation step in example (2). In both cases, if x were to be rebound (either by relocating the program to a different node, or by updating the local definition) after the first reduction step, the redex-time calculus would not ‘notice’ the change, while the destructtime calculus would (the same is true for example (4)). The difference between the redex-time and the destruct-time calculi can be succinctly summarized by saying that redex-time resolves identifiers “outside-in” while destruct-time resolves them on-demand, from the inside-out. Example (3) illustrates this well. In the redex case, the (outside) x is resolved in the first reduction step, and then the (inside) y in the second reduction step. By contrast, the destruct case first resolves the x (to y), and then the y. 2.4 Properties This subsection gives properties of our various λ-calculi: sanity checks to confirm that our definitions are coherent and more substantial results showing that λr and λd are essentially CBV. For lack of space some results and all details of proofs are omitted – the full details will appear in an extended paper. First, we recall the important property of evaluation contexts for λc , essentially as in [30], namely that any expression is either a value, an error, or has a unique decomposition into an evalutation context and a redex. As we have seen, our λr and λd have rather more subtle evaluation contexts, but
nonetheless an analogous result holds: Theorem 1 (Unique decomposition for λr and λd ). Let e be a closed expression. Then, in both the redex-time and destruct-time calculi, exactly one of the following holds: (1) e is a value; (2) e err; (3) there exists a triple (E3 , e 0 , rn) such that E3 .e 0 = e and e 0 is an instance of the left-hand side of rule rn. Furthermore, if such a triple exists then it is unique. (Note that the destruct-time error rules must include cases for identifiers in destruct contexts that are not bound by enclosing lets and so are not instantiatable, giving stuck nonvalue expressions.) Determinacy is a trivial corollary. We are also able to show type preservation and type safety properties for our three calculi. Theorem 2 (Type preservation for λc , λr and λd ). If Γ ` e:T and e −→ e 0 then Γ ` e 0 :T . Theorem 3 ¬(e err).
(Safety for λc , λr and λd ). If ` e:T then
Finally we should like to show that all three calculi are in some senses observationally equivalent, or put another way, both λr and λd are ‘essentially call-by-value’. As we noted earlier, values in λr and λd may need to ‘cleaned-up’ to be exactly equivalent to λc values. Theorem 4
(Observational Equivalence).
1. If ` e:int and e −→∗c n then e −→∗r u and e −→∗d u0 for some u and u 0 with [| u |] = [| u 0 |] = n. 2. If ` e:int and e −→∗r u (e −→∗d u) then ∃n.e −→∗c n and [| u |] = n. 3.
A Dynamic Rebinding Calculus: λmark This section introduces a simple extension of the call-byvalue lambda calculus that supports dynamic rebinding. We motivate the choice of primitives below, and then present the operational semantics in §3.1 and the problem of static typing in §3.2. Most applications require a mix of dynamically and statically bound variables; for example, system and application library functions should be dynamically bound, while local functions should be statically bound. To provide lightweight but flexible control over which are which, we allow the programmer to mark contexts, and then package values such that they can be unpackaged elsewhere with some of their identifiers dynamically rebound. Marking is done with an expression form e ::= ... | mark M in e Here the mark name M is taken from a new syntactic class. Packaging and unpackaging is done by expressions e ::= ... | grab M e | ungrab M e which are both with respect to a mark. A grab M e will first reduce e to a destruct-time value u, then will grab copies of all the bindings within the nearest enclosing mark M . Free identifiers of u that are not bound below the mark are subject to dynamic rebinding when this package is ungrabbed
– if this is with respect to a different occurrence of mark then those identifiers are rebound to their definitions above that occurrence. In the simplest practical case each program might have a single mark Lib in , distinguishing library code, which is defined above the mark, from application code, which is defined below it. We defer dealing with external libraries until §4; this section addresses only user-defined libraries. Explicit communication is also deferred to §4, so examples here are slightly artificial, coded using lets to move values between different scopes. The syntax of calculi in §2 was treated up to alpha equivalence. Now, though, occurrences of the ‘same’ identifier under different bindings must be related so that the identifier can be grabbed with respect to one and ungrabbed with respect to another. Accordingly, instead of working with identifiers x we work with variables xi that are pairs of an identifier x and a tag i. Alpha equivalence changes only the tags, so e.g. λx1 :T .x2 = λx2 :T .x2 6= λy2 :T .y2 and λx1 :T .λy1 :T .(x1 , y1 ) = λx2 :T .λy3 :T .(x2 , y3 ) The fv( ) and hb( ) functions now give sets and lists of variables, not identifiers. In practice tags would not appear in source programs; they are needed only for the semantics. 3.1 Semantics of λmark The simple dynamic binding calculus λmark extends the destruct-time λd -calculus of §2.3; the syntax and new semantic rules are given in Fig. 3, though once again letrec and error rules are elided. In explaining the semantics we revisit the example from §1, whose reduction sequence in shown in Fig. 4. In broad, the example grabs a value (y1 , z1 ) in one context, where y1 = 6, and ungrabs it in another where y1 = 7. The z1 is bound below the mark and so is treated statically. The first reduction step copies the bindings (here just z1 = 3) that are inside mark M and around the grab expression, ensuring that these have static-binding semantics. This gives a value grabbed M (let z1 = 3 in (y0 , z1 )) This grabbed M u form is used in the semantics but should not occur in source programs. The free variables of u are subject to rebinding when this is ungrabbed, so we regard all of fv(u) as bound in grabbed M u. This is emphasized in the example by showing a y0 alpha-variant. The second step instantiates the x1 under the ungrab M 0 with its value let z1 = 3 in ...grabbed.... The third step performs the ungrab, rebinding the y0 in the packaged value let z1 = 3 in (y0 , z1 ) to the innermost yi binder outside mark M 0 – here, to y2 . It also discards the redundant bindings. Modulo final instantiation, the result is (7, 3) not (6, 3), showing the y1 and z1 have been treated dynamically and statically respectively. On the other hand, putting the first let y1 = 6 inside the first mark would give (6, 3). Turning now to the details of the rules, the (proj), (app) and (inst-r ) rules are as in λd but with zk instead of z . The (grab) rule copies all bindings (and marks) between and the closest enclosing mark M , using the grab M the bindmark( ) auxiliary to extract the bind and mark
Dynamic Binding Calculus λmark : Syntax Integers
n
Types Expressions
Identifiers T e
::= ::= |
x , y, z
Tags
i, j , k
Context marks
M
int | unit | T ∗ T 0 | T → T 0 | Grabbed M T zi | n | () | (e, e 0 ) | πr e | λxi :T .e | ee 0 | let zk = e in e 0 mark M in e | grab M e | grabbed M u | ungrab M e
Dynamic Binding Calculus λmark : Semantics Values
u
Atomic evaluation ctxts
A1
Atomic bind and mark contexts
A2
Evaluation contexts Bind and mark contexts Reduction contexts Destruct contexts
E1 E2 E3 R
::= n | () | (u, u 0 ) | λxi :T .e | let zk = u in u 0 | zi | mark M in u | grabbed M u ::= ( , e) | (u, ) | πr | e | (λxi :T .e) | let zk = in e | grab M | ungrab M ::= let zk = u in | mark M in ::= | E1 .A1 ::= | E2 .A2 ::= | E3 .A1 | E3 .A2 ::= πr | u | ungrab M
Rules (proj), (app), (inst-r ) are exactly as in λd except with zk replacing z . These reductions are closed under E3 , whereas the (grab) and (ungrab) rules are global. E3 .mark M .E30 .grab M u −→ E3 .mark M .E30 .grabbed M (bindmark(E30 ).u) if mark M not around in E30 (ungrab) E3 .mark M .E30 .ungrab M .E2 .grabbed M 0 u −→ E3 .mark M .E30 .S (u) if S = rebind(fv(u), hb(E3 )) is defined, E30 does not contain mark M around the hole, and ran(S ) ∈ / hb(E30 ) (grab)
Figure 3: Dynamic Binding Calculus λmark components of a context E3 , discarding the evaluation context components: bindmark( ) = , bindmark(E3 .A1 ) = bindmark(E3 ), and bindmark(E3 .A2 ) = bindmark(E3 ).A2 . The (ungrab) rule rebinds the fv(u) to the let-binders in E3 around the nearest enclosing mark M , using the auxiliary function rebind( , ) to construct the appropriate substitution. Here rebind(V , L), for a set V and list L of variables, is defined if for all xi ∈ V there is some j with xj ∈ L, in which case it is the the substitution taking each such xi to the rightmost such xj . To keep a unique decomposition property the (ungrab) rule is global, not closed under additional E3 . Reduction must take place under a mark so A2 now contains mark M in . To maintain a CBV semantics both grab and ungrab should fully reduce their arguments, so they are included in the evaluation contexts A1 . The (ungrab) rule can only fire if the argument to ungrab is of the form grabbed M 0 u, so the destruct contexts must include ungrab M . 3.2 Static Typing for Dynamic Binding In some cases one would expect dynamic rebinding to require a run-time check to ensure safety, e.g. if code is sent to a site that may or may not provide some resource it requires. In other cases, though, it is intuitively obvious that an ungrab operation should succeed; one would like those cases to be captured by a static type system. Here we explore the design of such, identifying some important issues and giving a partial solution in the form of a system that excludes ‘most’ runtime errors. There are two key difficulties in defining a static type sys-
tem. The first is exhibited by the example let x1 = λy0 :int.y0 in let z1 = λw0 :unit.x1 in let x2 = () in mark M in (ungrab M (grab M (z1 ())))5 which has a runtime error. The (z1 ()) application reduces to x1 , after which this x1 is grabbed and rebound to the innermost x binding, i.e. x2 , hence () is applied to 5. To guarantee safety we must require for each x that all xi outside any mark have the same type. This should not be a problem in practice, as shadowing in libraries is rare. (In fact, for simplicity we require that all xi have the same type globally.) Note there is a possible alternative solution of forbidding all shadowing, but that would exclude our secure encapsulation examples, shown later. The second difficulty is in proving type preservation. Any sound type system for λmark must constrain the contexts around marks, ensuring that when ungrabbing a grabbed value the context of the ungrab mark contains bindings for all variables that were in the context of the grab mark. The problem is that reduction moves subterms, in particular subterms containing marks, so the shape of the context around a mark can change dynamically. These changes are not arbitrary, and we speculate that they could be captured in a sophisticated type preservation proof. Here, though, to get a simple type preservation proof we impose somewhat draconian restrictions to exclude marks that are dynamically moved in this way. The first restriction is to prohibit marks except between top-level let-bindings. This still permits our
let y1 = 6 in mark M in let x1 = ( let z1 = 3 in grab M (y1 , z1 )) in let y2 = 7 in mark M 0 in ungrab M 0 x1
(grab) (inst-1) −→ let y1 = 6 in −→ let y1 = 6 in mark M in mark M in let x1 = ( let x1 = ( let z1 = 3 in let z1 = 3 in grabbed M ( grabbed M ( let z1 = 3 in let z1 = 3 in (y0 , z1 ))) in (y0 , z1 ))) in let y2 = 7 in let y2 = 7 in mark M 0 in mark M 0 in 0 ungrab M x1 ungrab M 0 ( let z1 = 3 in grabbed M ( let z1 = 3 in (y0 , z1 )))
(ungrab) −→ let y1 = 6 in mark M in let x1 = ( let z1 = 3 in grabbed M ( let z1 = 3 in (y0 , z1 ))) in let y2 = 7 in mark M 0 in let z1 = 3 in (y2 , z1 )
Figure 4: Dynamic Binding Calculus λmark : Example examples to be typed and is reasonable if one thinks of marks as a module-level construct (see §7). It prevents mark movement by (app) and (inst-r ) rules, but unfortunately does not completely suffice – in examples such as mark M in mark M 0 in let x1 = grab M (λy1 :unit.grab M 0 e) in e 0 the (grab) rule can move the M 0 mark. We have no good way to exclude this by typing, so extend (grab) by an additional sidecondition allowing grab only with respect to the dynamically-innermost mark, giving a rule (vargrab). The failure of this condition will be a runtime error e err0 that is not prevented by the type system (the only such). Some key typing rules are below. In the judgement Γ `m ∆ e:T the ∆ is a global assignment of assumptions x :T ; the Γ is a list of marks M and xi :T assumptions, consistent with ∆; and m is a boolean recording whether e contains mark , used to express the top-level restriction. Types T include Grabbed M T for values of type T grabbed with respect to M.
Γ
Γ, M `m ∆ e:T mark M in e:T
`t∆
Γ, M , Γ0 `f∆ e:T Γ, M , Γ0 `f∆ grab M e:Grabbed M T Γ, M , Γ0 , M 0 , Γ00 `f∆ e:Grabbed M T Γ, M , Γ0 , M 0 , Γ00 `f∆ ungrab M 0 e:T
Γ, M , Γ0 `f∆
Γ00 , M `f∆ u:T Γ00 =α Γ grabbed M u:Grabbed M T
Theorem 5 (Type Preservation for λmark ). In the (var0 m 0 grab) semantics, if `m ∆ e:T and e −→ e then `∆ e :T Theorem 6 (Safety for λmark ). In the (vargrab) semantics, if `m ∆ e:T then ¬(e err). Of course the latter result does not exclude the e err0 described above; in practice it might give rise to an exception.
Extending λmark with IO: λio mark In this section we extend λmark just enough to show examples of the rebinding scenarios from §1. For lack of space almost all details are omitted; we show just the syntax and examples in Fig. 5, and touch on the main points below. Two additions are required: semantics for open terms, for programs that use external library functions such as print; and communication, to allow code movement. Configurations P are parallel compositions of expressions e, each with a thread id t. The semantics is with respect to a type environment Γt for each thread t, for external variables; these Γt must all be of the same shape but may have differing tags, so threads on machine 1 might be w.r.t. the print 1 system call and one on machine 2 w.r.t. print 2 . They are subject to some elided constraints. The semantics has labelled transitions for invocations of and returns from such calls: 4.
t:fi [|bindmark(E3 ).u|]
(lib-app) t:E3 .(E2 .fi )u −→ t:E3 .retT 0 if fi :T → T 0 ∈ Γt and fi ∈ / hb(E3 , E2 ) (lib-ret)
t:E3 .retT 0 if ` u 0 :T 0
t:u 0
−→
t:E3 .u 0
and the (grab) and (ungrab) rules must be modified slightly. Communication between threads is by asynchronous message passing on typed channels c, with output and input forms e!e 0 and e?e 0 . Only grabbed values should be communicated, so these are typed as below. Γ ` e:Chan M T Γ ` e 0 :Grabbed M T Γ ` e!e 0 :unit
Γ ` e:Chan M T Γ ` e 0 :(Grabbed M T ) → T 0 Γ ` e?e 0 :T 0
Their dynamic semantics is given by: (comm) t:E3 .c!u | t 0 :E30 .c?(λx :T .e) −→ t:E3 .() | t 0 :E30 .(λx :T .e)u where fv(u) empty (ensured by typing) with suitable additions to the A1 and R contexts. Typing of channels and of marks is global, with a ΓC and ΓM . Example P in Fig. 5 shows rebinding to an external print and an internal here on a communication from the left thread
λio mark : Syntax Integers Identifiers
n x , y, z
Types Expressions
T e
Configurations
P
λio mark : Γr ΓC ΓM Simple: t1 :
Strings Tags
s i, j , k
Channels Context marks
c M
Thread ids
t
::= int | unit | T ∗ T 0 | T → T 0 | Grabbed M T | Chan M T | string ::= xi | n | () | (e, e 0 ) | πr e | λxi :T .e | ee 0 | let zk = e in e 0 | letrec zk = λxi :T .e in e 0 | grab M e | grabbed M u | ungrab M e | mark M in e | retT | c | e!e 0 | e?e 0 | s ::= 0 | t:e | (P | P 0 )
Examples = = =
printr :string → unit r = 1, 2 c:Chan StdLib (unit → unit) StdLib:(print:string → unit, here:string)
P= let here 1 = “site 1” in t2 : let here 2 = “site 2” in mark StdLib in mark StdLib in let = print 1 here 1 in c?(λf1 :Grabbed StdLib (unit → unit).(ungrab StdLib f1 )()) c!grab StdLib (λx1 :unit.print 1 here 1 )
Secure encapsulation: Q =
t1 : let here 1 = “site 1” in mark StdLib in let = print 1 here 1 in c!grab StdLib (λx1 :unit.print 1 here 1 )
t2 : let here 2 = “site 2” in mark StdLib in let print 3 = (λs1 :string. let = print 2“sandboxed: ” in print 2 s1 ) in let here 3 = “site 33” in mark StdLib 0 in c?(λf1 :Grabbed StdLib (unit → unit).(ungrab StdLib 0 f1 )())
Figure 5: Dynamic Binding with IO and Communication λio mark to the right. It has a transition sequence with labels t1 :print 1“site 1” t1 :()
t2 :print 2“site 2” t2 :()
for the invocations and returns of the two external printr functions. Our rebinding calculus is powerful enough to perform customized linking, useful for implementing secure encapsulation. Example Q is similar to P but the receiver ungrabs the received value with respect to an encapsulated context, delimited by StdLib 0 , which reimplements both print and here with ‘safe’ versions. The safe print prints the warning string “sandboxed: ” before any output; the safe here provides the fake “site 33” to the encapsulated code, which has no way to access the true here 2 = “site 2” binding. Q has a transition sequence with labels t1 :print 1“site 1” t1 :() t2 :print 2“sandboxed: ” t2 :() 5.
t2 :print 2“site 33” t2 :()
Simple Update Calculus: λd + update This section introduces a simple extension of the call-byvalue lambda calculus that supports dynamic updating of code. Dynamic update is essentially dynamic rebinding: an update causes a binding to change. Consequently we again start from the destruct-time λd -calculus of §2.3. Whilst the calculus we take in this paper is quite simple (other variants will appear in another paper), it is still quite expressive,
and unlike other work in this area is not tied to a particular abstract machine, or a first-order setting. Our extension to the λd -calculus, the λupdate -calculus is given in Fig. 6 (the λd rules and error rules are elided). It is now convenient to use explicitly-typed lets, but the types are omitted in examples. We allow the programmer to place an expression update at points in the code where where an update could occur; defining such updating ‘safe points’ is useful for ensuring programs behave properly [17]. The intended semantics is that this expression will block, waiting for an update (possibly null) to be fed in. An update can modify any identifier that is within its scope (at updatetime), for example in let x1 = (let w1 = 4 in w1 ) in let y1 = update in let z1 = 2 in (x1 , z1 ) x1 may be modified by the update, but w1 , y1 and z1 may not. For simplicity we only allow a single identifier to be rebound to an expression of the same type, and we do not allow the introduction of new identifiers. We define the semantics of the update primitive using a labelled transition system, where the label is the updating expression. For example, supplying the label {x ⇐ π1 (3, 4)} means that the nearest enclosing binding of x is replaced with a binding to π1 (3, 4). Note that this update can be an
Simple Update Calculus: Syntax Integers Types Expressions
n T e
Identifiers ::= ::=
x , y, z
Tags
i, j , k
int | unit | T ∗ T 0 | T → T 0 xi | n | () | (e, e 0 ) | πr e | λxi :T .e | ee 0 | let zk :T = e in e 0 | letrec zk :T = λxi :T .e in e | update
Simple Update Calculus: Semantics
(upd-replace-ok)
S = rebind(fv(e), hb(E3 )) is defined Γ(E3 ) ` S (e):T ∀j .xj ∈ / hb(E30 ) {x ⇐e}
E3 .let xi :T = u in E30 .update −→ E3 .let xi :T = S (e) in E30 .()
Figure 6: Simple Update Calculus: λupdate expression, which may be open; if it is not a value then after the update it will be in the redex position. The static typing rule for update is trivial, as it is simply an expression of type unit. Naturally we have to perform some type checking at run-time, this is the second condition in the transition rule in Fig. 6. Notice however, that we do not have to type-check the whole program; it suffices to check that the expression to be bound to the given identifier has the required type in the context that it will evaluate in. The other conditions of the transition rule are similarly straightforward. The first ensures that a rebinding substitution is defined (as in §3.1), and that the context E3 has hole binders that are alpha-equivalent to the free variables of e. The third condition ensures that the binding being updated, xi , is the closest such binding occurrence for x . These conditions are sufficient to ensure that the following theorem holds. Theorem 7
(Type preservation for λupdate ). {x ⇐e 0 } 00
6.
Related Work
6.1 Lambda Calculi As we said in §2.2, our approach in λr and λd of using lets to record the arguments of functions has similarities to prior work on explicit substitutions [1] and on sharing in call-by-need languages [3]. Both technical details and motivation differ, though: for the first, our lets do not percolate through term structure and we are in a CBV setting; for the second, these works attempt to narrow the gap between λ-calculus theory and implementation by modeling sharing and variable resolution more realistically. In contrast, we are concerned with capturing when a variable is instantiated relative to some external event, such as an update or site relocation, as highlighted by the difference between λd and λr . A further choice in λr and λd is not to admit letcommutation rules such as the let-A rule of [3]:
If ` e:T and e −→ e then ` e 00 :T
let y = (let x = u in v ) in e −→ let x = u in let y = v in e
Our calculus is novel because its use of delayed instantiation cleanly supports updating higher-order functions. Consider the following program:
as for both λmark and λupdate it is important to preserve the shapes of the environment of expressions. This forces the adoption of let x = u in u 0 as a value form. Gunter et al. [15] devise a calculus with functional continuations and prompts, as a simplification of the control operators in SML/NJ. Their operational semantics has a similar feel to our λmark rules; for example they can place prompts (c.f. marks) in their code, and a control operator that grabs an evaluation context up to the nearest enclosing appropriate prompt. The key difference with λmark is that our grab grabs only the binding parts of the intervening context, discarding the evaluation-context parts. It is interesting to note that although they claim a type safety result, this does not exclude the possibility that a well-typed program can get ‘stuck’ if an appropriate prompt does not exist (c.f. §3.2).
let f1 = let w1 = let y1 = let z1 = (y1 , z1 )
λy1 .(π2 y1 , π1 y1 ) in λg1 .let = update in g1 (5, 6) in f1 (3, 4) in w1 f1 in
which contains an occurrence of update in the body of w1 . If, when w1 is evaluated, we update the function f : {f ⇐λp1 .p1 }
e −→∗ −−−−−−−−→ −→∗ u we have [| u |] = ((4, 3), (5, 6)). Delayed instantiation plays a key role here: with the λc semantics, the result would be [| u |] = ((4, 3), (6, 5)); i.e. the update would not take effect because the g1 in the body of w1 is substituted away by the (app) rule before the update occurs. More interesting examples involving, for example, updates that include updates, and updating recursive function definitions (e.g. for long-running loops) can also be expressed in λupdate .
6.2 Dynamic Binding We loosely use the term dynamic binding to refer to techniques that resolve variables based on their runtime context. In general, existing dynamic binding calculi instantiate variables too eagerly to best support rebinding.
Binding Calculi Hashimoto and Ohori introduce a typed context calculus [16] for expressing first-class evaluation contexts within the lambda calculus. Context holes can be ‘filled in’ with terms having free variables which are captured by the surrounding context. This is intuitively similar to our use of mark to delineate a context from which free variables may be resolved. However, in our case, variables are resolved at the last possible moment, while here they are resolved at context-application time; our approach permits rebinding if the code should move to a new site, for example. Dami defines a calculus λN [7] for dynamic binding that has two key features: variables are indexed by names, and application is performed relative to names, rather than variables, in two steps: (bind) and (close). This approach is naturally suited to dynamic extension, since binding is incremental, and Dami is able to use this fact to encode extensible records and first-class contexts. Because of the significant departure from the standard λ-calculus, it is hard to understand whether our approach can be encoded in λN , or whether this encoding could be easily adapted to existing CBV languages, as is our goal. It is worth noting that Dami’s encoding of first-class contexts is, like Hashimoto’s, too eager to best support rebinding. Dynamic Linking Dynamic linking allows program bindings to be resolved either at load-time or runtime, rather than statically, and has been formally modeled for low-level machine code [10, 19, 18], and high-level languages like Java [9]. However, once dynamically bound, a variable’s definition is fixed, precluding rebinding. Some dynamic linkers can be customized to consider a variety of criteria when resolving variables, such as security/trust information, performance or debugging flags, etc. Rouaix implemented namespace restriction for untrusted code during dynamic linking in OCaml [23]. DITools [25] enables both binding variables to customized functions (e.g. redirecting references to read to a special safe read to enhance security), and interposing new functions between callers and callees, say to perform profiling. Hicks et. al [19] describe custom linking as a benefit of their approach, since they define a dynamic linking framework, rather than a particular strategy. Very little theoretical work on custom binding exists, though some results about secure encapsulation have been developed by Sewell and Vitek [27]. Dynamic Scoping Dynamic scoping is a technique by which variables are resolved based on the dynamic call-graph. While flexible, it can result in unpredictable behavior, since variables can be inadvertently captured; this was referred to as the downward funarg problem in the LISP community. Our λr and λd calculi are not dynamically scoped: they delay the moment at which variable instantiation occurs (relative to standard CBV), but the result of that instantation is unchanged. When we add rebinding primitives in λmark and λupdate , both the type system and user controls (e.g. marks and grab) ensure the correct variables are captured. Dynamically-scoped languages can be translated to statically scoped ones by passing an explicit environment. Garrigue [13] presents a calculus based on streams that can be used to encode dynamic binding for particular, scope-free variables. Lewis et al propose to add syntactically-distinct, dynamically-scoped implicit parameters [21] to staticallyscoped languages. To avoid the downward funarg problem,
they disallow higher-order functions from using dynamically scoped variables, thus preventing inadvertent variable capture. For pragmatic reasons our approach requires no special variable syntax or type formation restrictions: all variables should be rebindable, as determined by the surrounding context (given by marks). 6.3 Rebinding in Distributed Calculi The JoCaml and Nomadic Pict languages for mobile computation [12, 28] provide rebinding to external functions, but the details are matters of implementation, not semantically specified – though a more principled proposal for JoCaml has been made by Schmitt [24]. Several distributed process calculi [5, 27, 22, 28, 6] provide interaction primitives where the semantics of a name depends on where it is used, most with respect to some location tree-structure. This allows a form of rebinding to application libraries. It is unclear how the mechanisms should be integrated with a functional programming language, however. 6.4 Dynamic Update Formalisms There is little formal treatment of dynamically updateable code as it relates to variable binding. Duggan [11] has a formal framework for updating types, but updating code is considered only informally, based on arguments around reference types. Gilmore et al. [14, 29] have a formal description of updating, but it is centered on abstract types, and is tied to their particular abstract machine. Neither of these systems properly handles updating first-class functions. Gilmore et al. require that a function not be active when it is updated; closures in activation records are active, and cannot thus be updated. Reference-based indirections require that the types of function arguments change in a way that interacts poorly with polymorphism [17]. 7. Conclusion Contribution In summary, we have established a clean semantic foundation for dynamic rebinding, in particular: • introducing the refined redex-time and destruct-time CBV λ-calculus reduction strategies with delayed instantiation; • introducing the λmark calculus with mark, grab and ungrab primitives for dynamic rebinding, with a clean operational basis in the destruct-time strategy; • discussing static typing for fragments of λmark ; • showing how λmark can be extended with communication and external functions to λio mark , thereby expressing examples of dynamic rebinding of transmitted code and of secure encapsulation; and • introducing the λupdate calculus to express software update, again with a clean destruct-time operational semantics. Future Work There are several directions that are worth pursuing. Firstly, we would like a more liberal static type system for λmark , preferably one that requires no runtime checking for soundness. The main difficulty here seems to be capturing the ways in which the environment of a mark can change – we speculate that an enriched term structure that explicitly records the DAG of scopes would enable a type preservation proof. Part of the motivation for this work is to cope with marshalling of values in distributed functional languages, but
this paper does not deal with issues of type coherence between separately-compiled runtimes. One might extend λio mark with type dynamic mechanisms [2] or build-time generativity [26]. The λio mark calculus has communication on channels but not π-calculus-style new channel generation. Adding these is an interesting problem, as the usual π semantics allows scope extrusion of new-binders but for grab/ungrab we require a semantics that preserves the shape of the binding environment outside marks. This paper has focussed on semantics for small calculi, but ultimately dynamic rebinding mechanisms should be integrated with full-scale programming languages. For ML-like languages with second-class module systems it may be natural to have mark only at the module level (loosely analogous to the λmark top-level mark restriction). Generalising, one might wish to grab/ungrab with respect to a set of structures rather than a single mark. Libraries may need careful design to work well with mobile code, to delimit any hard-tomove OS or library state. There are obvious problems with optimised implementation of calculi with redex- or destructtime semantics, as dynamic rebinding or update primitives invalidate general use of standard optimisations, e.g. inlining, and perhaps also environment-sharing schemes. For performance it will be important to identify conditions under which such optimisations are still valid – perhaps via a characterisation of contextual equivalence for λmark . We speculate that λmark can be implemented by recording at each mark the points in the heap of binders outside the mark and for grab traversing the heap until those points are reached. Finally, for dynamic update the λupdate calculus is only the beginning of a rigorous treatment. The full story must address correctness of updates with state transformation, changing the types of variables, multi-threading, and so on.
[11] [12]
[13]
[14]
[15]
[16] [17] [18]
[19]
[20] [21]
[22]
[23] 8.
REFERENCES
[1] M. Abadi, L. Cardelli, P.-L. Curien, and J.-J. L`evy. Explicit substitutions. In Proc. 17th POPL, pages 31–46, 1990. [2] M. Abadi, L. Cardelli, B. Pierce, and G. Plotkin. Dynamic typing in a statically typed language. ACM TOPLAS, 13(2):237–268, 1991. [3] Z. M. Ariola, M. Felleisen, J. Maraist, M. Odersky, and P. Wadler. A call-by-need lambda calculus. In Proc. 22nd POPL, pages 233–246, 1995. [4] J. Armstrong, R. Virding, C. Wikstrom, and M. Williams. Concurrent Programming in Erlang. Prentice Hall, second edition, 1996. [5] L. Cardelli and A. D. Gordon. Mobile ambients. In Proc. 1st FoSSaCS, LNCS 1378, pages 140–155, 1998. [6] T. Chothia and I. Stark. A distributed pi-calculus with local areas of communication. In Proc. 4th HLCL, ENTCS 41.2, 2000. [7] L. Dami. A lambda-calculus for dynamic binding. Theoretical Computer Science, 192(2):201–231, 1998. [8] POSIX dlopen specification. http://www.opengroup. org/onlinepubs/007904975/functions/dlopen.html. [9] S. Drossopoulou and S. Eisenbach. Manifestations of Java dynamic linking. http://www-dse.doc.ic.ac. uk/projects/slurp/dynamic_link/Manifest.pdf. [10] D. Duggan. Sharing in Typed Module Assembly
[24] [25]
[26] [27]
[28]
[29]
[30]
Language. In Proc. 3rd Workshop on Types in Compilation, pages 85–116, 2000. D. Duggan. Type-based hot swapping of running modules. In Proc. 5th ICFP, pages 62–73, 2001. C. Fournet, G. Gonthier, J.-J. L´evy, L. Maranget, and D. R´emy. A calculus of mobile agents. In Proc. 7th CONCUR, LNCS 1119, pages 406–421, 1996. J. Garrigue. Dynamic binding and lexical binding in a transformation calculus. In Proc. Workshop on Functional and Logic Programming, 1995. S. Gilmore, D. Kirli, and C. Walton. Dynamic ML without dynamic types. Technical Report ECS-LFCS-97-378, The University of Edinburgh, 1997. C. Gunter, D. R´emy, and J. Riecke. A generalisation of exceptions and control in ML-like languages. In Proc. FPCA, pages 12–23, 1995. M. Hashimoto and A. Ohori. A typed context calculus. Theoretical Computer Science, 266(1-2):249–272, 2001. M. Hicks. Dynamic Software Updating. PhD thesis, University of Pennsylvania, August 2001. M. Hicks and S. Weirich. A calculus for dynamic loading. Technical Report MS-CIS-00-07, University of Pennsylvania, 2000. M. Hicks, S. Weirich, and K. Crary. Safe and flexible dynamic linking of native code. In Proc. 3rd Workshop on Types in Compilation, pages 147–176, 2000. X. Leroy et al. The Objective Caml system release 3.04 documentation and user’s manual, Dec. 2001. J. R. Lewis, J. Launchbury, E. Meijer, and M. Shields. Implicit parameters: Dynamic scoping with static types. In Proc. 27th POPL, pages 108–118, 2000. J. Riely and M. Hennessy. Trust and partial typing in open systems of mobile agents. In Proc. 26th POPL, pages 93–104, 1999. F. Rouaix. A Web navigator with applets in Caml. In Proc. 5th World Wide Web Conference, pages 1365–1371, 1996. A. Schmitt. Safe dynamic binding in the join calculus. A. Serra, N. Navarro, and T. Cortes. DITools: Application-level support for dynamic extension and flexible composition. In Proc. USENIX Annual Technical Conference, pages 225–238, 2000. P. Sewell. Modules, abstract types, and distributed versioning. In Proc. 18th POPL, pages 236–247, 2001. P. Sewell and J. Vitek. Secure composition of untrusted code: Wrappers and causality types. In Proc. 13th Computer Security Foundations Workshop, pages 269–284, 2000. P. Sewell, P. T. Wojciechowski, and B. C. Pierce. Location-independent communication for mobile agents: a two-level architecture. In Internet Programming Languages, LNCS 1686, pages 1–31, 1999. C. Walton. Abstract Machines for Dynamic Computation. PhD thesis, University of Edinburgh, 2001. ECS-LFCS-01-425. A. K. Wright and M. Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38–94, 1994.