Type Checking for JavaScript
?
Christopher Anderson1 and Paola Giannini2 1
2
Imperial College of Science, Technology and Medicine
[email protected] Dipartimento di Informatica, Universit` a del Piemonte Orientale
[email protected]
Abstract. JavaScript is a powerful imperative object based language made popular by its use in web pages. It supports flexible program development by allowing dynamic addition of members to objects. Code is dynamically typed: a runtime access to a non-existing member causes an error. We suggest two static type systems for JavaScript that will detect such runtime type errors. Therefore, programmers can have the benefit of flexible programming offered by JavaScript with the safety offered by a static type system. We demonstrate our type systems with a formalism of JavaScript, JS0. Both type systems are based on structural recursive types. Members of a type are classified into definite and potential. The type of an object can evolve through assignment to potential members which turns them to definite. We first present a type system with integer and object types, and then we extend it adding also type variables to have polymorphism ` a la ML. We prove that our type systems are sound.
1
Introduction
JavaScript (see [12]) is a powerful imperative object based language made popular by its use in web pages. JavaScript supports flexible program development by allowing dynamic addition of members to objects. This allows creation of dynamic, interactive applications that runs completely within a Web browser. JavaScript code is embedded directly in web pages and interpreted as the page is loaded. Code is dynamically typed and if at runtime a field is accessed or method called that does not exist then a runtime type error generated. When such errors occur the user is usually presented with an error dialog box. We suggest two static type systems for JavaScript that will detect type errors that are currently only detected at runtime. Therefore, programmers can have the benefit of flexible programming with the safety offered by a static type system. We demonstrate our type systems with a formalism of JavaScript, JS0 . JS0 supports the standard JavaScript flexible features, e.g. functions creating objects, and dynamic addition/reassignment of fields and methods. Our type systems tackles the following challenges introduced by the flexible features of JS0 : 1) JS0 object structure is determined by assignment of members. 2) JS0 objects can have members added after the object is created. 3) JS0 methods are created by assigning functions to members. 4) JS0 methods can be shared between objects. 5) JS0 functions can have three different roles: creating objects, methods of objects and global functions. ?
Partially supported by EU within the FET - Global Computing initiative, project DART IST2001-33477, and by MURST Cofin’02 project McTati. The founding bodies are not responsible for any use that might be made of the results presented here.
We first address these issues with a typed version of JS0 , JST 0 , with explicit type annotations: 1) JST uses structural types of the form: µ α. < < m >. 1 : t1 · · · mn : tn > 0 2) JST allows members of a type to be annotated with ? indicating they are yet to be 0 assigned. 3) JST gives a function identifier, f, type f↑ indicating an alias to a function. 4) 0 and 5) JST requires each function to declare the type of the receiver and the parameter 0 it expects and the type of its result. The function can then be used as a method of any object that has a type which is a subtype of the type of the declared receiver. Therefore, the function can be method of objects with different structure, in case the function does not make requirements on its receiver it can be used as a global function. Our subtyping is a subtyping in width in the sense that a type t is subtype of type t0 if t has all the members of t0 and with the same type. The type t could have more members. T We then introduce the type system JSTP a la 0 which extends JS0 with polymorphism ` ML [13]. That is, type variables as types, and, since we have a subtyping relation between types, we can specify that there is a relation (partial order) between type variables. To match the type of the formal parameter with the one of the actual parameter (of a function containing type variables) we have to find a substitution compatible with the relation between its type variables that transforms the type of the formal parameter in the one of the actual parameter. So, in JSTP 0 functions have two orthogonal ways to be polymorphic. This allows to give type to more expressions. We see this work as a first step in statically typing languages like JavaScript. Our intention is to use type inference to automatically translate JavaScript code to the typed variant for type checking. This paper is organized as follows. In Section 2 we present an example introducing the features of JS0 (the subset of JavaScript considered) and of its type system. In Section 3 we define the syntax of JS0 and its operational semantics, and in Section 4 we give a typed T version of JS0 : JST 0 . The proof od soundness JS0 is outlined in Section 5, and in Section TP 6 we present JS0 an extenion to the type system of JST 0 to include type variables. We give the relevant definitions and the statement of soundness for JSTP 0 . Finally, in Section 7 we compare the our work with the one of others and outline our future directions.
2
Example
We start with an example demonstrating the classic untyped style of programming as seen in JavaScript. In Figure 1 we give an example that describes a scenario with people and their jobs. We define functions Person, moneyTrans, employPerson. The code preceeded by the comment //Main is the entry point to the program. Figure 1 demonstrates: 1) creating objects using functions, line 16. 2) Implicit addition creation of members in objects through assignment, lines 2, 3. 3) Acquiring methods through assignment of a function to a member, line 3. 4) Method call, paul.payMe(10) binds paul to this when moneyTrans is executed, line 16. 5) Addition of members after object creation, with employPerson adding member boss, line 12. 6) Global function call, where a function is called without a receiver, line 17. We now look at the same example in the context of a typed version of JavaScript. Figure 2 gives a typed version of Figure 1. The first thing to note is that unlike JavaScript there are type annotations for the formal parameter and return type of a function. Secondly, the types are recursive and structural. Consider the return type of function Person on line 1, µ α. . Types comprises a list of members each with their own type e.g. money:Int. Types have a bound variable α that allows a type to refer to itself e.g. boss:α.
Members in a type can be annotated with ? indicating a potential member. In the example, this is used for the boss member, boss:α ?, in the return type of function Person. Therefore, when objects are created using function Person they are given type, µ α. , which allows a member boss to be added later. This allows objects to evolve in a controlled manner. Note also, that the type of member boss is α indicating a recursive type. This captures our requirement that the boss of a person is also a person. When a potential member is assigned to, it become definite, loosing its ?. To keep the type system manageable we only track assignments to variables (function formal parameters and this) within the scope of a function. Therefore, for the effect of assignments to be visible outside a function the variable must be returned from the function with appropriate type. Consider function employPerson where assignment, x.boss = y, makes boss definite in the type of x. x is returned from employPerson with type µ α. . There are two key components to typing methods. Firstly, aliases to function identifiers, f, are given a type f↑. This way we can track aliases to functions in objects. For example, in the type, µ α. , member payMe is given type moneyTrans↑. This indicates... Secondly, each function declares a type for the receiver, this, it expects as seen on lines 2, 9, 16. Therefore, when a function is used as a method of an object upon calling we check that the receiver is a subtype of the declared type for this in the function. For example, with call paul.payMe(10) on line 24 we have that µ α. is a subtype of µ α. . Subtyping is based on the structures of the types concerned. For one t to be a subtype of t0 , t must declare at least the members defined in t0 with the same types. We saw this with the type of variable paul being a subtype of this declared in moneyTrans i.e. we can declare only the members we use in a type. This means that function moneyTrans can be used as a method of any object whose declared type contains member money of type Int.
1 2 3 4 5
function Person(x) { this.money = x; this.payMe = moneyTrans; this }
6 7 8 9
function moneyTrans(x) { this.money = this.money + x; }
10 11 12 13
function employPerson(x,y) { x.boss = y; x }
14 15 16 17
//Main john = new Person(100); paul = new Person(0); paul = employPerson(paul,john); paul.payMe(10); paul.boss
Fig. 1. Untyped JS0 Person Example
1 2 3 4 5 6
function Person(x:Int):µ α. < > { this:µ α. < >; this.money = x; this.payMe = moneyTrans; this }
7 8 9 10 11
function moneyTrans(x:Int):Int { this:µ α. < >; this.money = this.money + x }
12 13 14 15 16 17 18
function employPerson(x:µ α. < >, y:µ α. < >): µ α. < > { this:µ α. < >; x.boss = y; x }
19 20 21 22 23 24
//Main john:µ paul:µ john = paul =
α. < >; α. < >; new Person(100); new Person(0); paul = employPerson(paul,john); paul.payMe(10); paul.boss
Fig. 2. Typed JS0 Person Example
3
JS0
We have developed JS0 a subset of JavaScript which includes the following features: 1. Functions used to create objects, 2. Functions can be aliased and used as members of objects, 3. Members can be added to objects dynamically. We chose these features because, 1 represents the way objects are created in JavaScript, 2 is a way by which objects acquire methods, and 3 gives flexibility to programs. JS0 does not include the following JavaScript features: 1. Libraries of functions, 2. Native calls, 3. Global this (through a global object), 4. Dynamic variable creation, 5. Functions as objects, 6. Dynamic removal of members, 8. Delegation and prototyping, We omitted these features because 1,2 and 3 are not central to the paradigm, while 5,6 and 7 are too difficult to support in a statically typed language We can write the introductory examples from [10] in JS0 assuming libraries of functions, and predefined types floats, strings, etc. The syntax of JS0 is given in Figure 3 3.1
Operational Semantics
We have a structural operational semantics for JS0 that rewrites tuples of expressions, heaps and stacks into tuples of values, heaps and stacks in the context of a program, P. The signature of the rewriting relation ; is: Program ¢ Exp × Heap × Stack ¢ (Val ∪ Dev) × Heap × Stack
D∗ FuncDecl function f (x): { e } var locals f function identifier new f(e) object creation e; e sequence e.m(e) member call e.m member select f(e) global call lhs = e assignment e1 ? e2 : e3 conditional null null var ∈ EnvVars ::= this | x lhs ∈ LeftSide ::= x | e.m P ∈ Program D ∈ Decl F ∈ FuncDecl e ∈ Exp
::= ::= ::= ::=
Identifiers f ∈ FuncID ::= f | f 0 | . . . m ∈ MemberID ::= m | m0 | . . .
Fig. 3. Syntax of JS0 where: H ∈ Heap = Addr * f in Obj S ∈ Stack = ({this} ¢ Addr) ∪ ({x} ¢ Val) v ∈ Val = {null} ∪ FuncID ∪ Addr ∪ Int dv ∈ Dev = {nullPntrExc, stuckErr} Obj = { [[m1 : v1 ...mn : vn ]] | m1 ...mn are member identifiers v1 ...vn are values } As we can see, the heap maps addresses to objects, where addresses, Addr, are ι0 , ..ιn ... The stack maps this to an address and variables to values, where values, Val, are function identifiers (denoting functions), addresses (denoting objects), null , or integers. Finally objects are fnite mappings between member identifiers and values. To give a taste of the operational semantics, we show rule (mem-call). A full description of the rules is given in Appendix A. e1 , H, S ¢ ι, H1 , S1 e2 , H1 , S1 ¢ v0 , H2 , S0 H2 (ι)(m) = f P(f) = function f(x) {e 0 } e 0 , H2 , {this 7→ ι, x 7→ v0 } ¢ v, H0 , S00 e1 .m(e2 ), H, S ¢ v, H0 , S0
(mem-call)
In rule (mem-call) we first evaluate the receiver and then the actual parameter of the method. We obtain the function definition (corresponding to the method) by looking up the value of member m in the receiver (obtained by evaluation of e) in P. We execute
the body with a stack in which this refers to the receiver of the call and x to the value of the actual parameter. Finally we return the stack after the evaluation of the actual parameter and the heap resulting from the execution of the body of the method. So that this and x are bound to their value before the evaluation of the body, but the heap has the effects of the evaluation of the body. Returning to the example in Figure 1, executing the body of function Main in the presence an empty heap, H0 , with stack, S0 mapping john and paul to null will produce heap H1 and stack S1 such that john and paul map to ι0 and ι1 respectively: H1 (ι0 ) = [[money : 100, payMe : moneyTrans, boss : null]] H1 (ι1 ) = [[money : 10, payMe : moneyTrans, boss : ι0 ]] Note that the member payMe aliases function moneyTrans which was invoked when paul.payMe(10) was executed.
4
A Type System for JS0
In this section we introduce a fully-typed version of JS0 : JST 0. 4.1
Types
Figure 4 shows the parts of JST 0 that differ from JS0 along with the definitions of types. We make the following observations: – Functions have a return type preceded by a colon. – Function formal parameters are given types. – Function bodies start by declaring the type of the receiver, this. Types, t1 , ..., tn , comprise object types, function types, or Int (the type of integers). Object types list the methods and fields present in the object. We use the µ-notation to allow a type to refer to itself. So µ α.M where M =>, is the type of an object with members m1 , ..., mn of type t1 , ..., tn , respectively. The definition of free variables of a type is the standard one: S FV(µ α. >) = i∈1...n FV(ti ) − {α}, FV(α) = {α}, and FV(f↑) = FV(Int) = ∅. The type assigned to expressions by the type system are always closed, that is they do not contain free variables (see Figure 12 in Appendix B for formal definition of closed type, ` t ¦closed ). If the type of m is α or µ α0 .M0 or Int the member represents a field . In the case of α the type has the structure of the enclosing type (µ α.M). If the type of m is f↑, then m represents a method. We denote type f↑ to mean an alias to function for a compatible function (see Section 4.1), in an object of this type m will be bound to the function identifier. The types for the formal parameter, receiver and return value for a function, f, are found using lookup function L (defined in Figure 5) Some members of an object type are annotated with ?. If the type of an object contains m:t ? it means that the member m is a potential member, and in case it is present then its type is t. However, the object may not have the member m. The type of objects with potential members may evolve making a potential member into a definite member when it is assigned a value. In a well-typed program potential members may not be accessed.
Syntax P ∈ Program D ∈ Decl F ∈ FuncDecl
::= D ∗ ::= FuncDecl ::= function f (x:t):t0 { this:t00 ; e }
Types t ∈ Type ::= µ α.M | α | f↑ | Int M ∈ TypeMembers ::= < < (m : t[?])∗ > > α ∈ ObjType ::= α | α0 | α00 | . . .
Fig. 4. Syntax of JST 0
Congruence and Subtyping Types are congruent up to reordering of fields and αconversion. This is expressed formally by the judgement t ≡ t0 in Fig. 11 of the Appendix B. The judgement P ` t 4 t0 means that an object or function of type t can be used whenever one of type t0 is required. For object types, firstly consider the definite members. For these members we have subtyping in width in that all definite members of t0 must be present and congruent with those in t (third line of the rule for P ` µ α.M 4 µ α0 .M0 in Fig. 6.) To ensure that the types are closed we substitute occurrences of bound variables by their enclosing type. The second condition of the definition in Fig. 6 (fourth line of the rule for P ` µ α.M 4 µ α0 .M0 in Fig. 6) refers to the potential members in M0 . In particular, it says that if a potential member in M0 is also present as (a potential or definite) member of M its type in M0 must be congruent with the type in M (as before we have to close the types). This condition is needed to insure that the addition of a new member to an object does not break the compatibility of the object with some type the objects was compatible with before. For functions we require that the types of parameter, receiver, and return type must be congruent. In future versions of this work we may relax this restriction and allow controvariance on the receiver and parameter type and covariance on the return type. We can prove that subtyping is reflexive and transitive, so it is a partial order. Moreover, given types t, and t0 , it is decidable wether F ` t 4 t0 or not. 4.2
Typing of Expressions
Typing an expression, e in the context of a function store, F, and environment, Γ has the form: P, Γ ` e : t k Γ0 The environment, Γ, maps the receiver, this, and formal parameter, x, to types and has the form {this : ψ, x : t}. The environment on the right hand side of the judgement reflects the changes to the type of the receiver or parameter while typing the expression. The possible changes are the removal of annotations, that is the effect of the assignment to the annotated member of a value (compatible with its type).
L(P, f) = < Γ, t00 > where Γ = {this : t, x : t0 } P(f) = function f(x : t0 ) : t00 { this : t; e} M↓• = < < m1 : t 1 . . . mn : t n > > where {m1 : t1 . . . mn : tn } = {m : t | M(m) = t } M↓? = < < m1 : t 1 ? . . . mn : t n ? > > where {m1 : t1 . . . mn : tn } = {m : t? | M(m) = t?} Γ00 (var)[m 7→ t](m0 ) =
½
t if m0 = m 00 0 00 0 Γ (var)[m 7→ t](m ) = Γ (var)(m ) otherwise
Fig. 5. Operations on Types
L(P, f) =< Γ, t > L(P, f 0 ) =< Γ0 , t0 > Γ0 (this) ≡ Γ(this) Γ0 (x) ≡ Γ(x) t ≡ t0 P ` f↑4 f 0↑
P ` Int 4 Int
M↓• =< < m1 : t 1 . . . mn : t n > > M↓? =< < mn+1 : tn+1 ? . . . mn+p : tn+p ? > > 0 • 0 0 0 0 M ↓ =< < m1 : t 1 . . . mr : t r > > M0 ↓? =< < m0r+1 : t0r+1 ? . . . m0r+q : t0r+q ? > > ∀j ∈ 1 . . . r ∃i ∈ 1 . . . n m0j = mi ∧ ti [α/µ α.M] ≡ t0j [α0 /µ α0 .M0 ] ∀i ∈ 1 . . . n + p ( ∃j ∈ 1 . . . r + q m0j = mi =⇒ ti [α/µ α.M] ≡ t0j [α0 /µ α0 .M0 ] ) P ` µ α.M 4 µ α0 .M0
Fig. 6. Subtyping
Rules (var), (f unc), and (const) are straightforward. We now discuss the most interesting rules (mem − acc), (method − call), (new/global − call), (assign − add) and (cond). In (mem − acc) the expression must be of an object type in which the member m is defined and not potential i.e. annotated with ?. All occurrences of α are substituted by the type of the target of the member access to return a closed type. In (method − call) we check that the type of the receiver expression is an object type in which the member m has a function type. Moreover, the type of the receiver expression and the one of the actual parameter must be compatible (subtypes) with the type of the receiver and the parameter of the function. In (new/global − call) we make the requirement that the type of the receiver defined in the function is a subtype of the empty object type, P ` µ α. 4 Γ1 (this). This is consistent with the operational semantics, as in the case of global call there is no receiver defined and in object creation we start with an empty receiver object. In (assign − add) we may modify the type environment, by removing from the type of the member m (of this or of the parameter) the annotation ?. Starting from this point the member m may be accessed. The type of the expression assigned is checked for compatibility with the type of the member after the substitution of all the occurrences of α with the type of the variable that contains it. For example, consider environment, Γ1 , where Γ1 (x) has type µ α. >. With the assignment expression, x.m2 = x, (which could correctly follow x = null ) the environment Γ1 is updated to Γ2 where Γ2 (x) has type µ α. >. This reflects the updating of member m2 . Note that the type of x in Γ1 is compared with the type of the member m2 after the substitution of µ α. > for α. In (cond) the operation t t t0 is applicable (and so also the rule) only when the types t and t0 are compatible, that is: – either t ≡ t0 , – or t = f↑, t0 = f 0↑, and P ` f↑4 f 0↑ – or t = µ α.M, t0 = µ α0 .M0 and m : t00 ∈ M if and only if m : t000 ∈ M0 and t00 [α/t] ≡ t000 [α0 /t0 ]. So for object types the types must have the same members (that could be definite in one type and potential in the other) with congruent types. If t and t0 are compatible define the upper bound of t and t0 , t t t0 by: t if t0 = t if t = f↑ f↑ (1) t t t0 = µ α.M00 if t = µ α.M, t0 = µ α0 .M0 where • 00 00 • 00 000 0 • m : t ∈ M ↓ iff m : t ∈ M↓ and m : t ∈ M ↓ , and m : t00 ∈ M00 ↓? iff m : t00 ∈ M↓? or m : t000 ∈ M0 ↓?
For object types a member of t t t0 is definite if it is a definite member of both t and t0 , otherwise it is a potential member. Compatibility for environments is defined as follows: Γ and Γ0 are compatible if and only if for all var such that var ∈ Γ, and var ∈ Γ0 , Γ(var) is compatible with Γ0 (var). We can prove that if P, Γ ` e1 : t k Γ0 and P, Γ ` e2 : t0 k Γ00 then Γ00 and Γ0 are compatible, since Γ0 and Γ00 may differ from Γ only because some potential members of a var have become definite. Clearly if there is no relation between e1 and e2 there is no relation between t and t0 . If Γ and Γ0 are compatible Γ t Γ0 = {this : Γ(this) t Γ0 (this), x : Γ(x) t Γ0 (x)}
(2)
The rules (assign − update), which assumes that the member m be defined, and (var − ass) are obvious.
P, Γ ` this : Γ(this) k Γ P, Γ ` x : Γ(x) k Γ
(var)
P, Γ ` e : µ α.M k Γ0 M(m) 6= Udf P, Γ ` e.m : t[α/µ α.M] k Γ0
P, Γ ` e1 : µ α.M k Γ0 M(m) = f↑ P, Γ0 ` e2 : t k Γ00 L(P, f) = < Γ1 , t1 > P ` t 4 Γ1 (x) P ` µ α.M 4 Γ1 (this) P, Γ ` e1 .m(e2 ) : t1 k Γ
00
F(f) 6= Udf P, Γ ` f : f↑ k Γ
(mem − acc)
(f unc)
P, Γ ` null : µ α.M k Γ P, Γ ` n : Int k Γ
P, Γ ` e1 : t k Γ0 P, Γ0 ` e2 : t0 k Γ00 P, Γ ` e1 ; e2 : t0 k Γ00
(seq)
P, Γ ` e : t k Γ0 L(P, f) = < Γ1 , t1 > P ` t 4 Γ1 (x) P ` µ α. < >4 Γ1 (this) (method − call)
P, Γ ` var : µ α.M k Γ M(m) = t0 P, Γ ` e2 : t k Γ0 P ` t 4 t0 [α/µ α.M] Γ00 = Γ0 (x)[m 7→ t0 ] P, Γ ` var.m = e2 : t k Γ00
(assign − add)
P, Γ ` e1 : Int k Γ0 P, Γ0 ` e2 : t k Γ00 P, Γ0 ` e3 : t0 k Γ000 P, Γ ` e1 ? e2 : e3 : t t t0 k Γ00 t Γ000
(cond)
(const)
P, Γ ` new f(e) : t1 k Γ0 P, Γ ` f(e) : t1 k Γ0
P, Γ ` e1 : µ α.M k Γ0 M(m) = t0 P, Γ0 ` e2 : t k Γ00 P ` t 4 t0 [α/µ α.M] P, Γ ` e1 .m = e2 : t k Γ00
(new/global call)
(assign − update)
P, Γ ` e : t k Γ0 P ` t 4 Γ0 (x) P, Γ ` x = e : t k Γ0 [x 7→ t]
(var − ass)
Fig. 7. Type rules for expressions in JST 0 A program is correct if all the types used are closed and all the function used are defined and well-typed.
5
Formal Properties of the Type System
In this section we prove the soundness of our type system w.r.t. the operational semantics given in Section 3.1. We first define the notion of a value having a given type. The defi-
nition is given coinductively by first defining the properties that any agreement relation should have. Definition 1. Agreement of Values with (closed) types - Given a heap, H, and a program, P, we say that A ⊆ (Val × Type) is an agreement relation if: – (null, t) ∈ A if and only if t = µ α.M for some µ α.M – (n, t) ∈ A if and only if t = Int, – if (f, t) ∈ A, then t = f 0↑, and 1. L(P, f) =< Γ, t > L(P, f 0 ) =< Γ0 , t0 > 2. Γ0 (this) ≡ Γ(this) 3. Γ0 (x) ≡ Γ(x) 4. t ≡ t0 – if (ι, t) ∈ A, then t = µ α.M, and a. M↓• =>, M↓? => b. H(ι) = [[m1 : v1 . . . mp : vp ]] with n ≤ p c. for all 1 ≤ j ≤ n and t0 ≡ tj [α/t] , (vj , t0 ) ∈ A d. for all 1 ≤ j ≤ r, 1 ≤ i ≤ p and t0 ≡ t0j [α/t], if m0j = mi then (vj , t0 ) ∈ A We can prove that if (v, t) ∈ A and t ≡ t0 then also (v, t0 ) ∈ A. Moreover, if A and A0 are agreement relations also A ∪ A0 is an agreement relation. Therefore given a heap, H, and a program, P, the union of all agreement relations defines the relation between values and types, that says when a value has a given type. In the following we define when a value v is compatible with type t in H, Definition 2. P,H ` v J t holds if for some agreement relation A on H and P then (v, t) ∈ A Note that an address may be compatible with more than one type. In particular, if a value is compatible with a type, then it is compatible with all its supertypes. Lemma 1. If P ` t 4 t0 and P, H ` v J t then P, H ` v J t0 The following definition introduces the definition of a stack and a heap being compatible. Definition 3. P, Γ ` H, S ¦ holds if P, H ` S(this) J Γ(this) and P, H ` S(x) J Γ(x) We introduce a relation between pairs of heaps, stacks saying that a pair heap, stack can be obtained from the other during the evaluation of an expression. Definition 4. Given the heaps, H, and H0 , and the stacks S, and S0 , define P ` H, S ¤ H0 , S0 when S(this) = S0 (this) and for all addresses ι and types t if P,H ` ι J t holds then also P,H’ ` ι J t holds We can now state the main lemma and the soundness theorem. Lemma 2. For a well-formed program P, environment Γ, and expression e, such that: P, Γ ` e : t k Γ0 If e, H, S ¢ v, H0 , S0 , and P, Γ ` H, S ¦ then 1. P, H0 ` v J t, 2. P, Γ0 ` H0 , S0 ¦, and 3. P ` H, S ¤ H0 , S0
The following theorem asserts that if an expression is well-typed in a type environment Γ, then the evaluation of the expression starting in a heap and stack that agree with Γ cannot produce a run-time error. That is, the result of the evaluation is either a value of the right type, or it is a nullPntrExc exception. In particular, it is not a stuckErr error. Theorem 1. [Type Soundness] For a well-formed program P, and environment Γ, and expression e, such that: P, Γ ` e : t k Γ0 If, P, Γ ` H, S ¦ and e, H, S converges then – either e, H, S ¢ v, H0 , S0 , and P, H0 ` v J t, – or e, H, S ¢ nullPntrExc, H0 , S0
6
Adding Parametric Polymorphism: JSTP 0
T a la ML [13]. Subtyping allows a certain degree of JSTP 0 extends JS0 with polymorphism ` polymorphism. A function that has a formal parameter of type t can be applied to an actual parameter of type t0 which is a subtype of t. As we can see from the definition of subtyping this means for object types that t0 has at least all the members of t but may have more, and the type of corresponding members must be congruent. Consider defining a swap function that takes a parameter which is an object with two members and swaps the content of the members. This could be written, as follows:
function this:µ x.temp x.m1 = x.m2 = }
swap(x:µ α. < < m1 : t, m2 : t, temp : t? > >): t { α. < >; = x.m1; x.m2; x.temp
This function can be called with actual parameters that are object types having at least the members m1 and m2 of type t, and if they have the member temp this must also be of type t. So assuming we have an object with members m1 and m2 but of type t0 (which is not congruent to t) the function swap cannot be used, even though the only restriction that the function swap imposes on its parameter is that it has members m1 and m2 (and potentially temp) of the same type. It is well known that this kind of restriction can be expressed by parametric polymorphism, that is, using type variables. Type variables are instantiated (using a substitution) when calling the function to match the types of the actual parameters. So we add type variables to the syntax of types in Fig. 4: t ∈ Type
::= · · · | T
where T ∈ VarType ::= T | T0 | T00 | . . . Consider, however, a function that instead of swapping the members m1 and m2 , assign the value of m2 to m1 . function assign(x:µ α. < < m1 : t?, m2 : t0 > >):t { this:µ α. < >; x.m1 = x.m2; }
To be correct the only restriction is that type t0 be a subtype of t. To account for this we also add type variables constraints where a constraint, C, is a relation between type variables, that is C ⊆fin (VarType × VarType). Since a constraint is meant to represent a subtyping relation between type variables, we assume that the relation is a partial order. (When we write constraints we omit the pair needed for reflexivity.) We exdend function definitions so that they specify the constraints that are must hold between the type variables: function f(x : t) : t0 { C; this : t00 ; e} Functions swap and assign will have the following definition: function swap(x:µ α. < < m1 : T, m2 : T, temp : T? > >): T { ∅; // no constraints this:µ α. < >; x.temp = x.m1; x.m1 = x.m2; x.m2 = x.temp } function assign(x:µ α. < < m1 : T?, m2 : T0 > >):t { 0 {(T , T)}; // the type substituted for T0 must be a subtype of T this:µ α. < >; x.m1 = x.m2; }
If the signature of a function contains type variables and constraints, it means that it has all the types that can be obtained by substituting the type variables with types, however, the substitution must respect the set of constraints (of the function). Definition 5. A substitution S is a mapping from type variables to types which is the identity for all but finitely many arguments. We extend the subtyping relation, P ` t 4 t0 , given in Figure 6 to take into account type variables and their constraints, P, C `P t 4 t0 . For two type variables, T, T0 , to be subtypes there must be a corresponding entry in C: (T, T0 ) ∈ C P, C `P T 4 T0
Definition 6. Given sets of constraints C and C 0 and a substitution S, we say that S respects C 0 w.r.t. C, written P, C, S ² C 0 , if for all (T, T0 ) ∈ C 0 : P, C `P S(T) 4 S(T0 ) Typing an expression depends on the constraints between type variables so we extend the typing judgment of Section 4 to be of the form, P, Γ, C `P e : t k Γ0 . The only typing rules that are modified w.r.t. Section 4 are those for method and new/global calls. In these rules type parameter may be instantiated to match the actual parameters with the formal ones. The substitution used for the instantiation should respect the constraints of
the function. We give the rule for method call, (method-call): P, Γ, C `P e1 : µ α.M k Γ0 M(m) = f↑ P, Γ0 , C `P e2 : t k Γ00 L(P, f) = < Γ1 , t1 , C1 > ∃ S : P, C, S ² C1 P, C `P t 4 S(Γ1 (x)) P, C `P µ α.M 4 S(Γ1 (this)) P, Γ, C `P e1 .m(e2 ) : S(t1 ) k S(Γ00 )
(method − call)
Similarly the rule for global call and object creation, (global/new call): P, Γ, C `P e : t k Γ0 L(P, f) = < Γ1 , t1 , C1 > ∃ S : P, C, S ² C1 P, C `P t 4 S(Γ1 (x)) P, C `P µ α. 4 S(Γ1 (this)) P, Γ, C `P f(e) : S(t1 ) k S(Γ0 )
(new/global call)
Returning to the assign example above, consider the call assign(x) with environment, Γ = {this : t, x : µ α. >}, we can derive using (new/global call) the type: P, Γ, {(T2 , T1 )} `P assign(x) : T1 k Γ The substitution applied for typing the call, S, is S(T) = T1 and S(T0 ) = T2 , and S is such that: P, {(T2 , T1 )}, S ² {(T0 , T)} Assume instead that the environment, Γ = {this : t, x : µ α. >}. In this case we derive, P, Γ, {(T2 , T1 )} `P assign(x) : Int k Γ, with the substitution S0 such that S0 (T) = Int and S0 (T0 ) = Int. Definition 7. A substitution S is a ground substitutions, if for all T such that S(T) 6= T we have that S(T) does not contain type variables. The following lemma asserts that a derivation of the parameterized systems represents the set of derivations of the non parameterized systems that can be obtained by applying to it a ground substitution respecting the constraints of the derivation. Lemma 3. For a well-formed program P, environment Γ, set of constraints C, and expression e such that P, C, Γ `P e : t k Γ0 , we have that: for all ground substitutions S such that P, ∅, S ² C we have that P, S(Γ) ` e : S(t) k S(Γ0 ). The following theorem extends theorem 1 for the type system with type variables. It says that if an expression (containing type variables) is well-typed w.r.t. a set of constraints, then for all ground substitution respecting the constraints the evaluation of the expression starting in a heap and stack that agree with the ground substitution applied to the environment cannot produce a run-time error. The proof of the theorem relies on the previous lemma.
Theorem 2. [Type Soundness] For a well-formed program P, environment Γ, set of constraints C, and expression e, such that: P, C, Γ `P e : t k Γ0 If S is a ground substitution such that P, ∅, S ² C, and P, S(Γ) ` H, S ¦ and e, H, S converges then – either e, H, S ¢ v, H0 , S0 , and P, H0 ` v J S(t), – or e, H, S ¢ nullPntrExc, H0 , S0
7
Comparisons and Future Work
In this paper a flexible type system for an idealized version of JavaScript is defined, and its soundness proved. JavaScript is an object based language allowing extensible objects, and sharing of method bodies. Type systems for object based languages have been developed mainly in a functional setting, see [1] and [9]. An imperative type safe object oriented language, TOIL, was introduced in [6]. Even though the language is class based, its type system does not identify types with classes. This makes the definition of types similar to ours. TOIL, however, does not have extensible objects, so there is no need for identifying potential members. Extensible objects have being considered in a functional setting in [8]. An imperative calculus for extensible objects was proposed by Bono and Fisher, in [5]. In Bono and Fisher type system there are two type for objects: the proto-types that can be extended and the object-types that cannot. The type system tracks potential members. The main difference between our system and Bono Fisher type system is the fact that we use recursive types (instead of row types plus universal and existential quantification). This makes possible, for us, to have a decidable type inference algorithm, see the final paragraph of this section. We point out that Bono and Fisher aim was to encode classes in their object calculus, not to get a type inference algorithm. Recursive types with subtyping have been studied in conjunction with functional programming languages by various researchers, see for instance [3]. Alias types are used in [4] and [7] to track the evolution of objects. In particular, in [7] potential members are used for the same purpose as the current paper. Alias types are, however, very different from the types used in this paper. They are singleton types identified with the address of objects. The need for ensuring type safety also in dynamically typed languages has been widely recognized. See for instance [11], [2], and [14]. In these papers, constraints are defined that insure that terms for which the inferred constraints are solvable do not cause message not understood errors. We approach the same problem differently, we first define a type system, that has good properties, such as soundness (well-typed programs do not cause message not understood errors), and expressiveness (all the significant examples we have can be typed). Our next step will be defining a type inference algorithm that be such that the type inferred from a term is a type derivable for the term. In particular, we would like to achieve a principal typing, that is a typing from which all the typing of a term be derivable. Principality insures that we can do the type analysis in a modular way, that is we can type check two expressions separately and then type check their composition just based on their type information. This is not possible for the systems of [11], [2], and [14]
8
Acknowledgements
We would like to thank Sophia Drossopoulou for very detailed comments and useful suggestions, and Mario Coppo for his help and insight. We would also like to thank our colleagues at Imperial College Department of Computing and Dipartimento di Informatica
References 1. Mart´ın Abadi and Luca Cardelli. A Theory of Objects. Springer-Verlag, New York, NY, 1996. 2. Ole Agesen, Jens Palsberg, and Michael I. Schwartzbach. Type inference of SELF: Analysis of objects with dynamic and multiple inheritance. Lecture Notes in Computer Science, 707:247–??, 1993. 3. Roberto M. Amadio and Luca Cardelli. Subtyping recursive types. ACM Transactions on Programming Languages and Systems, 15(4):575–631, 1993. 4. C. Anderson, F. Barbanera, M. Dezani-Ciancaglini, and S. Drossopoulou. Can addresses be types? (a case study: objects with delegation). In WOOD ’03, volume 82 of ENTCS. Elsevier, 2003. 5. V. Bono and K. Fisher. An Imperative, First-Order Calculus with Object Extension. In Proc. of ECOOP’98, volume 1445 of LNCS, pages 462–497, 1998. A preliminary version already appeared in Proc. of 5th Annual FOOL Workshop. 6. Kim Bruce, A. Schuett, and R. van Gent. Polytoil: A type safe polymorphic object-oriented language. In Proceedings of the European Conference on Object-Oriented Programming (ECOOP), 1995. 7. F. Damiani and P. Giannini. Alias types for environment aware computations. In WOOD ’03, volume 82 of ENTCS. Elsevier, 2003. 8. K. Fisher. Type Systems for Object-Oriented Programming Languages. PhD thesis, Stanford University, 1996. Available as Stanford Computer Science Technical Report number STANCS-TR-98-1602. 9. K. Fisher, F. Honsell, and J. C. Mitchell. A Lambda Calculus of Objects and Method Specialization. Nordic Journal of Computing, 1(1):3–37, 1994. A preliminary version appeared in Proc. of IEEE Symp. LICS’93. 10. David Flanagan. JavaScript - The Definitive Guide. O’Reilly, 1998. 11. Justin O. Graver and Ralph E. Johnson. A type system for smalltalk. In Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 136–150. ACM Press, 1990. 12. Javascript Language Reference. http://developer.netscape.com/docs/manuals/js/. 13. R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML - Revised. MIT Press, 1997. 14. Mike Salib. Static type inference (for python) with starkiller. http://www.python.org/ pycon/dc2004/papers/1/paper.pdf, 2004.
A
Operation Semantics of JS0
Figure 8 gives a structural operational semantics for JS0 that rewrites tuples of expressions, heaps and stacks into tuples of values, heaps and stacks in the context of a program, P. The signature of the rewriting relation ; is: Program ¢ Exp × Heap × Stack ¢ (Val ∪ Dev) × Heap × Stack. and the definitions of the components are given in Section 3.1.
We now discuss the most interesting rules namely: (var), (mem-sel), (param-ass), (new), (cond-true), (cond-false). (Rule (mem-call) was discussed in the Section 3.1). In (var) the receiver (this) or parameter (x) are looked up in the stack, and heap and stack are unmodified. A function name, (f) is not looked up, as it is a value. In (mem-sel) member m is looked up in the receiver ι (obtained by evaluation of e) in the heap. If m is not found in ι, then execution is stuck. In (param-ass) we replace the value of x in the stack with the value obtained by execution of e. In (new) we execute the body of function f (looked up in P) with a stack that maps this to a fresh address that points to an empty object, formal parameter (x) to the value obtained by execution of the actual parameter. In (cond-true) and (cond-false) we evaluate the conditional test, e1 . If the value evaluates to 0 or 1 then e2 or e3 respectively is executed. Runtime Errors The operational semantic rules given in Figure 8 describe execution when there are no errors. Figures 9 gives rules for the cases where something has gone wrong. One can think of these rules as runtime type error detection with the exception of nullPntrExc. Figure 10 gives the rules for propagation of errors once they have been generated. Intuitively, an error is propogated upwards until it reaches the top, as with Java exceptions.
B
Definitions of congruence and well-formed programs
In Figure 11 we define congruence between types. Types are congruent if they differ for the order of members or if they are α-convertible. In Figure 12 we define well formed types. A type is well formed if its: – An integer type, Int. – A function alias, f↑, and f is defined in P. – An object type that contains unique member definitions and all the types used are closed. In the definition MD ::= (m : t[?])∗ In Figure 12 we define well formed types and well formed programs and functions. A program is well formed if all the functions it defines are well formed. A function is well formed if all the types in its definition (formal parameter, return type and type of this) are well formed and the body is well typed and has a type that is subtype of the declared return type.
this, H, S ¢ S(this), H, S x, H, S ¢ S(x), H, S f, H, S ¢ f, H, S sval, H, S ¢ sval, H, S e1 , H, S ¢ v0 , H1 , S1 e2 , H1 , S1 ¢ v, H0 , S0 e1 ; e2 , H, S ¢ v, H0 , S0
e, H, S ¢ ι, H0 , S0
(var)
e.m, H, S ¢ H0 (ι)(m), H0 , S0
(mem-sel)
(seq)
e1 , H, S ¢ ι, H1 , S1 e2 , H1 , S1 ¢ v, H2 , S0 H0 = H2 {ι.m C v}
e, H, S ¢ v, H0 , S00 S0 = {this 7→ S00 (this), x 7→ S00 (x)}
(mem-ass)
e1 .m = e2 , H, S ¢ v, H0 , S0 e1 , H, S ¢ v0 , H00 , S00 v0 > 0 e2 , H00 , S00 ¢ v, H0 , S0 e1 ? e2 : e3 , H, S ¢ v, H0 , S0
x = e, H, S ¢ v, H0 , S0
e1 , H, S ¢ 0, H00 , S00 e3 , H00 , S00 ¢ v, H0 , S0
(cond-true)
e, H, S ¢ v0 , H1 , S0 P(f) = function f(x) {e 0 } ι is new in H1 and H2 = H1 {ι 7→ [[]]} e 0 , H2 , {this 7→ ι, x 7→ v0 } ¢ v, H0 , S00 new f(e), H, S ¢ ι, H0 , S0 e1 , H, S ¢ ι, H1 , S1 e2 , H1 , S1 ¢ v0 , H2 , S0 H2 (ι)(m) = f P(f) = function f(x) {e 0 } e 0 , H2 , {this 7→ ι, x 7→ v0 } ¢ v, H0 , S00 e1 .m(e2 ), H, S ¢ v, H0 , S0
e1 ? e2 : e3 , H, S ¢ v, H0 , S0
(new)
(mem-call)
e, H, S ¢ v0 , H1 , S0 P(f) = function f(x) {e 0 } e 0 , H1 , {this 7→ Udf , x 7→ v} ¢ v, H0 , S00 f(e), H, S ¢ v, H0 , S0
(func-call)
Fig. 8. Operational Semantics of JS0
(param-ass)
(cond-false)
e, H, S ¢ null, H0 , S0
e, H, S ¢ v, H0 , S0 v 6= null v 6∈ Addr or H(v) = Udf
e.m, H, S ¢ stuckErr, H0 , S0 e.m, H, S ¢ nullPntrExc, H0 , S0 0 0 0 e.m = e 0 , H, S ¢ stuckErr, H0 , S0 e.m = e , H, S ¢ nullPntrExc, H , S e.m(e 0 ), H, S ¢ nullPntrExc, H0 , S0
S(x) = Udf x, H, S ¢ stuckErr, H, S x = e, H, S ¢ stuckErr, H, S
e, H, S ¢ ι, H0 , S0 H0 (ι)(m) = Udf e.m, H, S ¢ stuckErr, H0 , S0
e1 , H, S ¢ ι, H1 , S1 e2 , H1 , S1 ¢ v, H0 , S0 H0 (ι)(m) = Udf
e1 , H, S ¢ v, H0 , S0 v 6= null v 6∈ Addr or H(v) = Udf or H(v)(m) = Udf
e1 .m = e2 , H, S ¢ stuckErr, H0 , S0
e1 .m(e2 ), H, S ¢ stuckErr, H0 , S0
e1 , H, S ¢ ι, H1 , S1 e2 , H1 , S1 ¢ v0 , H0 , S0 H0 (ι)(m) = f P(f) = Udf
e, H, S ¢ v0 , H0 , S0 P(f) = Udf
e1 .m(e2 ), H, S ¢ stuckErr, H0 , S0
f(e), H, S ¢ stuckErr, H0 , S0
e1 , H, S ¢ ι, H0 , S0 e1 ? e2 : e3 , H, S ¢ stuckErr, H0 , S0
Fig. 9. Operational Semantics - generation of exceptions
e1 , H, S ¢ ι, H1 , S1 e2 , H1 , S1 ¢ dv, H0 , S0
e, H, S ¢ dv, H0 , S0 x = e, H, S ¢ dv, H0 , S0 f(e), H, S ¢ dv, H0 , S0 e.m, H, S ¢ dv, H0 , S0 e.m = e 0 , H, S ¢ dv, H0 , S0 e.m(e 0 ), H, S ¢ dv, H0 , S0
e1 .m = e2 , H, S ¢ dv, H0 , S0 e1 .m(e2 ), H, S ¢ dv, H0 , S0
e1 , H, S ¢ ι, H1 , S1 e2 , H1 , S1 ¢ v0 , H2 , S0 H2 (ι)(m) = f P(f) = function f(x){var y; e 0 } e 0 , H2 , {this 7→ ι, x 7→ v0 } ¢ dv, H0 , S00 e1 .m(e2 ), H, S ¢ dv, H0 , S0 e, H, S ¢ v0 , H1 , S0 P(f) = function f(x){var y; e 0 } e 0 , H1 , {this 7→ Udf , x 7→ v0 } ¢ dv, H0 , S00 f(e), H, S ¢ dv, H0 , S0 e1 , H, S ¢ dv, H0 , S0 or (e1 , H, S ¢ v, H00 , S00 and e2 , H00 , S00 ¢ dv, H0 , S0 and v > 1) or (e1 , H, S ¢ 0, H00 , S00 and e3 , H00 , S00 ¢ dv, H0 , S0 ) e1 ? e2 : e3 , H, S ¢ dv, H0 , S0 ’
Fig. 10. Operational Semantics - propagation of exceptions
Reflexity
t≡t Reordering ti ≡ t0i i ∈ 1...n µ α. < < m1 : t1 . . . mn : tn > > ≡ µ α. < < mπ(1) : t0π(1) . . . mπ(n) : t0π(n) > > (For some permuation π a bijection 1. . . n ¢ 1. . . n) alpha − conversion ti ≡ t0i i ∈ 1...n µ α. < < m1 : t1 . . . mn : tn > > ≡ µ α0 . < < m1 : t01 [α/α0 ] . . . mn : t0n [α/α0 ] > >
Fig. 11. Congruence for types
FV(t) = ∅ ` t ¦closed
F ` Int ¦
F(f) 6= Udf
∀m : M =< < MD1 m : t MD2 > >, M =< < MD3 m : t0 MD4 > >, =⇒ MD1 = MD3 , MD2 = MD4 ` µ α.M ¦closed
F ` f↑ ¦
F ` µ α.M ¦
P(f) = function f(x : t) : t0 { this : µ α.M; e } F ` t ¦, F ` t0 ¦, F ` µ α.M ¦ L(P, f) = < {this : µ α.M, x : t}, t0 > P, {this : µ α.M, x : t} ` e : t00 k Γ0 ` P¦ F ` t00 4 t0 P ` f¦ ∀ f : P(f) 6= Udf =⇒ P, P ` f¦ ` P¦
Fig. 12. Well formed types, Closed types and Well formed programs