Lucretia - intersection type polymorphism for scripting languages

Marcin Benke
University of Warsaw∗
[email protected]

Viviana Bono
Dipartimento di Informatica dell’Università di Torino†
[email protected]

Aleksy Schubert
University of Warsaw∗
[email protected]
Scripting code may present maintenance problems in the long run. There is, therefore, a call for methodologies that make it possible to control the properties of programs written in dynamic languages in an automatic fashion. We introduce Lucretia, a core language with an introspection primitive. Lucretia is equipped with a (retrofitted) static type system based on local updates of types that describe the structure of objects being used. In this way, we deal with one of the most dynamic features of scripting languages, that is, the runtime modification of object interfaces. Judgements in our system have a Hoare-like shape, as they have a precondition and a postcondition part. Preconditions describe static approximations of the interfaces of visible objects before a certain expression has been executed, and postconditions describe them after its execution. The field update operation complicates the issue of aliasing in the system. We cope with it by introducing intersection types in method signatures.
1 Introduction
Dynamic languages optimise programmer time rather than machine time, and are very effective when small programs are constructed [13, 17]. The features that help in the development of short programs can, however, be detrimental in the long run. Succinct code, which has clear advantages for short-term programming, gives less information on what a particular portion of code is doing (and figuring this out is critical for software maintenance, see [14, 9]). As a result, the productivity of software development can be impaired in certain situations [12]. In particular, the strong invariants a programmer can rely on when reading statically typed code are no longer valid, e.g., the type of a particular variable can easily change in an uncontrolled way with each function call in the program. Still, systems that handle complex and critical tasks, such as the Swedish pension system [16], developed in Perl, are deployed and maintained. Thus it is desirable to study methodologies which help programmers in understanding their code and keeping it consistent. To this end, retrofitted type systems¹ may be an approach to bridge the gap between flexibility and type safety.

Our proposal is a retrofitted type system for a calculus with a reflection primitive. Our type system handles one of the most dynamic features of object-oriented scripting languages, the runtime modification of object interfaces. In particular, the runtime type of an object variable may change in the course of program execution. This feature can be tackled to some extent through the introduction of a single assignment form for local variables. Still, this cannot be applied easily to object fields.

∗ This work was partly supported by the Polish government grant no N N206 355836.
† This work was partly supported by the MIUR PRIN 2010-2011 CINA grant and by the ICT COST Action IC1201 BETTY.
¹ A retrofitted type system is a type system that was designed after the language. In particular, the term is used in the setting of dynamic languages to indicate a static type system flexible enough to accept their most common idioms, which would be ill-typed under a classical type system but are correct at run time.

Jakob Rehof (Ed.): Intersection Types and Related Systems (ITRS), EPTCS 177, 2015, pp. 65–78, doi:10.4204/EPTCS.177.6
© M. Benke, V. Bono, A. Schubert
This work is licensed under the Creative Commons Attribution License.

locations             Loc ∋ l
variables             Var ∋ x, y ::= (identifiers)
value names           VNames ∋ z, w ::= x | l
field names           Fnames ∋ n, m ::= (identifiers)
constants             ConstV ∋ c ::= (literals)
function value        FVal ∋ v_f ::= func(x_1, …, x_n){e}
values                Val ∋ v ::= c | v_f | l
function expressions  e_f ::= x | v_f
atomic expressions    a ::= v | z
expressions           Expr ∋ e ::= a | op_n(a_1, …, a_n) | new | a.n | a_1.n = a_2
                                 | let x = e_1 in e_2 | if (a) then e_1 else e_2
                                 | e_f(a_1, …, a_n) | ifhasattr (a, n) then e_1 else e_2
objects               Obj ∋ o ::= {} | {L_f}
fields list           L_f ::= n : v | n : v, L_f
stores                Heaps ∋ σ ::= · | (l, o)σ

Figure 1: Abstract syntax

On the other hand, the information that statically describes the evolution of the runtime type of a variable cannot be just a type in the traditional sense, but must reflect the journey of the runtime type throughout the control flow graph of the program. However, it would be very inconvenient to repeat the structure of the whole control flow graph for each variable in the program. It makes more sense to describe the type of each variable at program points which are statically available, and this is the approach we follow in this paper. In our calculus, a variable referring to an object is annotated with a type variable paired with a constraint expressing an approximation (a lower bound) of the actual type of the object.

Our type system design draws inspiration from the work on type-and-effect systems [11, 6, 1]. We present our typings in a different manner, i.e., one where an effect is described by two sets of constraints that express type approximations before and after the execution of an instruction. The sets of constraints together with the typed expression can be viewed as a triple in a Hoare-style program logic.

An important element of the language design is the way functions (called methods in object-oriented vocabulary) are handled. The function types describe contracts associated with the functions. We obtain a satisfactory level of flexibility of function application through type polymorphism. We use two kinds of polymorphism that serve two different purposes. The first is parametric polymorphism, similar to that of System F. Through universal quantifier instantiation we make it possible to adapt the function type to different sets of parameters. The second is a form of ad-hoc polymorphism obtained through the use of intersection types [3]. Its purpose is to provide particular contracts for specific aliasing schemes, i.e., one may describe additional possible behaviours of a function that cannot be described by instantiation of a universal type.
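The aliasing issue mentioned above can be made concrete with a small sketch. Python is used here only as a representative dynamic language, and the names `Obj` and `update_then_read` are hypothetical and not part of the calculus. The function updates a field of its first argument and then reads the same field from its second argument, so the type of the result depends on whether the two arguments refer to the same object.

```python
class Obj:
    """An empty object whose fields are added at run time."""


def update_then_read(a, b):
    # Field update: changes the interface (and the field type) of `a`.
    a.n = "text"
    # Field read: the result depends on whether `b` aliases `a`.
    return b.n


# Aliasing scheme 1: two distinct objects, so b.n keeps its integer value.
x, y = Obj(), Obj()
y.n = 1
distinct_result = update_then_read(x, y)

# Aliasing scheme 2: the same object passed twice, so b.n is now a string.
z = Obj()
z.n = 1
aliased_result = update_then_read(z, z)

print(type(distinct_result).__name__, type(aliased_result).__name__)
```

A contract obtained by instantiating one universally quantified signature would have to cover both outcomes at once. Listing one contract per aliasing scheme and combining them is, as far as this sketch goes, the role the text assigns to intersection types.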
2 Overview of the Calculus
The syntax of our calculus is depicted in Figure 1. The elements of the set VNames = Var ∪ Loc are called value names. The calculus is object-based and our objects are records of pairs fieldname:value.
(Let-Propag)   if σ, e_1 ⇝ σ′, e_1′ then σ, let x = e_1 in e_2 ⇝ σ′, let x = e_1′ in e_2
(Let-Reduce)   σ, let x = v in e ⇝ σ, e[x := v]
(Op-Eval)      σ, op_n(v_1, …, v_n) ⇝ σ, δ_n(op_n, v_1, …, v_n)
(β_v)          σ, func(x_1, …, x_n){e}(v_1, …, v_n) ⇝ σ, e[x_1 := v_1, …, x_n := v_n]
(If-True)      σ, if (true) then e_2 else e_3 ⇝ σ, e_2
(If-False)     σ, if (false) then e_2 else e_3 ⇝ σ, e_3
(Ifhtr-True)   σ, ifhasattr (l, n) then e_1 else e_2 ⇝ σ, e_1    when n ∈ dom(σ(l))
(Ifhtr-False)  σ, ifhasattr (l, n) then e_1 else e_2 ⇝ σ, e_2    when n ∉ dom(σ(l))
(New)          σ, new ⇝ (l, {})σ, l    l fresh
(SetAttr)      σ, l.n = v ⇝ σ[l := σ(l)[n := v]], v
(GetAttr)      σ, l.n ⇝ σ, σ(l)(n)    when n ∈ dom(σ(l))
Figure 2: Semantic rules of Lucretia

Moreover, the calculus is imperative, that is, it has side effects; therefore we have a heap where objects are stored. Methods are modelled by fields containing functions. There is no built-in concept of self, but it can be encoded (see the examples in Section 3). Values are either constants, functions, or locations (the latter do not appear in source programs, only in the semantics). Expressions include value names, primitive operation application, an object creation operation, field access, field update, let-assignment, function application, a conditional expression, and an introspection-based conditional expression checking whether a certain field belongs to an object.

The operational semantics is presented in Figure 2. The construct let is the only possible evaluation context of the calculus. Rule (Let-Propag) takes care of the propagation of the reduction, while (Let-Reduce) performs the appropriate substitution of the computed value v, once this is obtained. Rule (Op-Eval) applies the semantic counterpart δ_n of the operation symbol to the given arguments. Rule (β_v) is the call-by-value function application. Rules (If-True) and (If-False) are self-explanatory. Rules (Ifhtr-True) and (Ifhtr-False) check whether or not a certain field belongs to an object allocated in the heap, and choose a computation branch accordingly. Rule (New) allocates a fresh address in the heap. Rule (SetAttr) either adds the field n, initialised with value v, to the object allocated at location l, if n does not exist in the object, or updates n with v otherwise. Rule (GetAttr) extracts the value of the field n from the object at location l, if n belongs to the object. Note that the semantics is deterministic.
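The reduction rules described above can be sketched as a small-step interpreter. This is only an illustration under several assumptions that are not in the paper: terms are encoded as tagged Python tuples, constants are restricted to integers and booleans, the only primitive operation is a hypothetical `add`, substituted values are assumed closed (so no renaming is needed), and all names (`Loc`, `DELTA`, `subst`, `step`, `run`) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loc:
    """A heap location l (never written in source programs)."""
    addr: int


# delta: meaning of primitive operations (only "add" in this sketch).
DELTA = {"add": lambda a, b: a + b}


def is_value(e):
    """Values are constants (int/bool), locations, or function values."""
    return isinstance(e, (int, Loc)) or (isinstance(e, tuple) and e[0] == "func")


def subst(e, x, v):
    """e[x := v]; v is assumed closed, so only shadowing needs care."""
    if not isinstance(e, tuple):
        return e  # constant, location, or a field/operation name
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "func":
        _, params, body = e
        return e if x in params else ("func", params, subst(body, x, v))
    if tag == "let":
        _, y, e1, e2 = e
        return ("let", y, subst(e1, x, v), e2 if y == x else subst(e2, x, v))
    return (tag,) + tuple(subst(p, x, v) for p in e[1:])


def step(heap, e):
    """Perform one reduction step; raise ValueError if no rule applies."""
    tag = e[0] if isinstance(e, tuple) else None
    if tag == "let":
        _, x, e1, e2 = e
        if is_value(e1):  # (Let-Reduce)
            return heap, subst(e2, x, e1)
        heap2, e1_next = step(heap, e1)  # (Let-Propag)
        return heap2, ("let", x, e1_next, e2)
    if tag == "op":  # (Op-Eval)
        return heap, DELTA[e[1]](*e[2:])
    if tag == "app" and is_value(e[1]) and isinstance(e[1], tuple):  # (beta_v)
        _, params, body = e[1]
        for x, v in zip(params, e[2:]):
            body = subst(body, x, v)
        return heap, body
    if tag == "if" and isinstance(e[1], bool):  # (If-True), (If-False)
        return heap, (e[2] if e[1] else e[3])
    if tag == "ifhasattr":  # (Ifhtr-True), (Ifhtr-False)
        _, l, n, e1, e2 = e
        return heap, (e1 if n in heap[l] else e2)
    if tag == "new":  # (New): locations are never freed, so len(heap) is fresh
        l = Loc(len(heap))
        return {**heap, l: {}}, l
    if tag == "set":  # (SetAttr): add field n or overwrite it
        _, l, n, v = e
        return {**heap, l: {**heap[l], n: v}}, v
    if tag == "get" and e[2] in heap[e[1]]:  # (GetAttr)
        return heap, heap[e[1]][e[2]]
    raise ValueError(f"stuck term: {e!r}")


def run(e, heap=None):
    """Iterate step until a value is reached; return (heap, value)."""
    heap = {} if heap is None else heap
    while not is_value(e):
        heap, e = step(heap, e)
    return heap, e


# let o = new in let u = (o.count = 41) in
#   ifhasattr (o, count) then (let c = o.count in add(c, 1)) else 0
prog = ("let", "o", ("new",),
        ("let", "u", ("set", ("var", "o"), "count", 41),
         ("ifhasattr", ("var", "o"), "count",
          ("let", "c", ("get", ("var", "o"), "count"),
           ("op", "add", ("var", "c"), 1)),
          0)))
heap, result = run(prog)
print(result)
```

Each branch of `step` corresponds to one rule of Figure 2, and at most one branch applies to any term, which mirrors the determinism noted above. The heap is copied on every update rather than mutated, a choice made here only to keep each step a pure function from configurations to configurations.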
The usage of an object field depends on its type, and since the type clearly depends on the computation flow, we need to update the constraints via static analysis of the computation flow. To keep track of the knowledge about the current field set, we use judgements which combine usual typing judgements with Hoare-style triples: Ψ_1; Γ ⊢ e : t; Ψ_2, where the Ψ_i are constraint sets representing type information about the objects in expression e, respectively before and after considering the effects of the expression. We call them the precondition and the postcondition. The type information associated with an expression is, then, a combination of two items: a representation of its actual type and a set of constraints on objects in the relevant part of the heap. New fields can be added dynamically to our objects; moreover, any existing field can be assigned values of different types during the computation, as happens in dynamic languages (e.g., Python, JavaScript, Ruby). An object type, then, is not fixed once and for all. We decided, therefore, to type an object with a constrained type variable, written X