The MLKit Standard ML Compiler Tool Kit
Martin Elsman
Department of Computer Science, University of Copenhagen (DIKU)
March 15, 2018
Overview of the MLKit
Outline
1 Overview of the MLKit
  - History
  - Compiler Overview
  - Modules and Recompilation
2 SMLserver
  - Overview
  - RDBMS Interfacing
  - The Region Model
  - Measurements
3 SMLtoJs
  - Features
  - JavaScript Integration
  - Tail Calls
  - Composing Js Fragments
  - Compiling in a Browser
  - Other Uses
  - Related Work and Conclusion
History
History of the MLKit
1 MLKit version 1
  - Test-bed for language development; 1990 Definition (static and dynamic semantics) [Tofte et al. 1991]
2 MLKit with Regions [1993–...]
  - Region inference and region-based runtime system for the Core language [Tofte et al. 1993–1999]
  - HPPA and X86 backends [1996]
  - Static interpretation of Modules and smart recompilation [1999]
  - Combination with GC [1999–2003]
  - Support for the entire Standard ML Basis Library [1999–2003]
3 SMLserver [2003–...]
  - Region-based bytecode interpreter combined with the Apache web server, with efficient DB integration [2003]
  - Used at ITU (course management, course evaluation, diplomas, ...) [2003–...]
4 SMLtoJs [2005–...]
  - MLKit bootstrapped in a browser; good support for tail calls [2010]
  - Used with SMLserver for single-language multi-tier applications
Many Contributors
1 MLKit version 1
  - Lars Birkedal, Nick Rothwell, Mads Tofte, David N. Turner
2 MLKit with Regions [1993–...]
  - Lars Birkedal, Peter Bertelsen, Martin Elsman, Niels Hallenberg, Tommy Højfeld Olesen, Peter Sestoft, Mads Tofte, Magnus Vejlstrup
3 SMLserver [2003–...]
  - Martin Elsman, Niels Hallenberg, Ken Friis Larsen, Carsten Varming
4 SMLtoJs [2005–...]
  - Martin Elsman
Compiler Overview
MLKit Overview
[Figure: compilation pipeline]
file.{sml,mlb}
  -> Frontend: lexing, parsing, type-checking, static interpretation of modules, pattern-match compilation
  -> Typed Lambda
       Optimisations:
       - function inlining
       - recursive function specialisations
       - constant propagation
       - dead code elim
       - let-floating
       - …
       -> Javascript -> file.js
  -> Region Lambda
  -> 3-address code
       -> Bytecode backend -> file.bc
       -> X86 backend -> file.o
Modules and Recompilation
Static Interpretation of Modules and Smart Recompilation
- Programmers organise source code in so-called MLB-files (same as for MLton).
- Modules (signatures, structures, and functors) are eliminated entirely at compile time.
- A smart recompilation system allows for cross-module optimisations while avoiding unnecessary recompilation when source code changes.
SMLserver
Overview
SMLserver (http://www.smlserver.org/)
A web server platform for Standard ML programs
Features
- Access to a variety of RDBMSs through an efficient generic interface that supports reuse of database connections
- Support for type-safe data caching, HTTP requests, filtering, and script scheduling
- Programs are compiled into bytecode files, which are loaded only once but may be executed many times
- A multi-threaded execution model allows multiple requests to be served simultaneously
- Integrated with the Apache web server
Why MLKit?
- No reference-tracing GC and no tags; memory management is region-based
RDBMS Interfacing
Interfacing to an RDBMS
- Support for Oracle, PostgreSQL, and MySQL
- Database pooling
  - A handle identifies a connection to an RDBMS
  - SMLserver maintains a configurable number of pools (of handles)
  - A database handle is owned by at most one script at a time
  - Handles are requested and released by the database functions in such a way that no deadlocks can occur
  - The programmer need not know about handles, unless transactions are used
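The ownership rule for handles can be pictured with a small JavaScript sketch. It is not SMLserver's implementation: `HandlePool`, `acquire`, `release`, and `withHandle` are hypothetical names, and the sketch reports an exhausted pool by returning `null` rather than blocking. It only illustrates that a handle has at most one owner at a time and is always returned to the pool.

```javascript
// Hypothetical sketch of a fixed-size pool of database handles.
// Not SMLserver code: all names here are assumptions for illustration.
class HandlePool {
  constructor(size) {
    // Handles are plain numbered tokens standing in for RDBMS connections.
    this.free = Array.from({ length: size }, (_, i) => ({ id: i }));
    this.owners = new Map(); // handle id -> owning script name
  }

  // Hand out a free handle, or null when the pool is exhausted.
  acquire(script) {
    const handle = this.free.pop();
    if (handle === undefined) return null;
    this.owners.set(handle.id, script);
    return handle;
  }

  // Return a handle; only its current owner may release it.
  release(script, handle) {
    if (this.owners.get(handle.id) !== script) {
      throw new Error("handle " + handle.id + " is not owned by " + script);
    }
    this.owners.delete(handle.id);
    this.free.push(handle);
  }

  // Acquire, run the body, and release even if the body throws, so a
  // failing script cannot leak a handle.
  withHandle(script, body) {
    const handle = this.acquire(script);
    if (handle === null) throw new Error("no free handle");
    try {
      return body(handle);
    } finally {
      this.release(script, handle);
    }
  }
}
```

Wrapping acquisition and release in one function such as `withHandle` is one way to keep handles out of the programmer's view, which matches the remark that handles only matter when transactions are used.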
The RDBMS Interface
  signature WEB_DB = sig
    val dml  : quot -> unit
    val fold : ((string->string) * 'a -> 'a) -> 'a -> quot -> 'a
    val qqq  : string -> string
    ...
  end
Notes
- Quotations are used for embedding SQL
- The function qqq escapes quotes (') for SQL string embedding
- The function dml can be used for executing insert and update statements
- The function fold folds over the rows returned by a SQL statement (similar to List.foldr)
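As a rough illustration of quote escaping and of folding over rows, here is a JavaScript analogue. Everything in it is an assumption made for the sketch: `escapeQuotes` and `foldRows` are hypothetical names, the escaping rule is the standard SQL one of doubling each single quote, rows are visited in array order, and the `person` table and its rows are invented in-memory data rather than a real query result.

```javascript
// Hypothetical JavaScript analogue of two WEB_DB functions.
// Assumption: escaping doubles each single quote, the standard SQL rule.
function escapeQuotes(s) {
  return s.replace(/'/g, "''");
}

// Fold over result rows. Each row is exposed as a lookup function from
// column name to string, mirroring the (string->string) argument of fold.
// Rows are visited in array order here; that order is an assumption.
function foldRows(f, init, rows) {
  let acc = init;
  for (const row of rows) {
    const get = (col) => row[col];
    acc = f(get, acc);
  }
  return acc;
}

// In-memory stand-in for the rows returned by a select statement.
const rows = [
  { name: "O'Brien", email: "ob@example.org" },
  { name: "Smith", email: "smith@example.org" },
];

// Collect the name column, then embed the first name in a new statement.
const names = foldRows((get, acc) => acc.concat([get("name")]), [], rows);
const sql =
  "select email from person where name = '" + escapeQuotes(names[0]) + "'";
```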
The Region Model
The Region-Based Memory Model
- Memory allocation and deallocation directives are inserted in the program at compile time
- Memory is allocated from a free list of region pages
- A region is a list of region pages; the free list contains the region pages currently unused
- Activation records stored on the runtime stack contain temporary variables and region descriptors
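The interplay between regions, region pages, and the free list can be sketched in JavaScript as follows. This is an illustration under stated assumptions, not the MLKit runtime: the page size, the function names, and the hand-placed `allocRegion`/`deallocRegion` calls (which the compiler would insert) are all hypothetical.

```javascript
// Hypothetical sketch of region pages and a shared free list.
// Page size and all names are assumptions, not MLKit internals.
const PAGE_SIZE = 4; // values per region page, kept tiny for illustration

const freeList = []; // region pages currently unused

function newPage() {
  // Reuse a page from the free list when possible.
  const page = freeList.pop() || { slots: [] };
  page.slots.length = 0;
  return page;
}

// A region is a list of region pages.
function allocRegion() {
  return { pages: [newPage()] };
}

// Allocate a value in a region, extending it with a page when full.
function allocIn(region, value) {
  let page = region.pages[region.pages.length - 1];
  if (page.slots.length === PAGE_SIZE) {
    page = newPage();
    region.pages.push(page);
  }
  page.slots.push(value);
  return value;
}

// Deallocating a region returns all of its pages to the free list at once.
function deallocRegion(region) {
  while (region.pages.length > 0) freeList.push(region.pages.pop());
}

// The alloc/dealloc calls are written by hand here, around a computation
// whose intermediate values are dead once the function returns.
function sumOfSquares(n) {
  const r = allocRegion();
  let total = 0;
  for (let i = 1; i <= n; i++) total += allocIn(r, i * i);
  deallocRegion(r);
  return total;
}
```

Because a whole region is released in one step, the cost of deallocation depends on the number of pages, not on the number of values stored in them.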
[Figure: The region-based memory model. Labels: Region Heap; regions r1, r2, r3; free list; Machine stack with activation records.]
The efficient thread-safe SMLserver memory model
[Figure: Code Caching holds the cached library code and the cached script code. Data Caching: each thread (Thread 1 ... Thread N) has a library cache, library regions, and script regions; all threads use a shared region page free list.]
When server boots
- load library code
- execute library code
- copy “Library regions” into “Library cache”
When executing script
- load script code
- execute script code
- deallocate script regions
- restore “Library regions”
Consequences of the Execution Model
- Execution starts in an initial heap each time a request is served.
- Thus, it is not possible to maintain state implicitly in web applications using Standard ML references or arrays.
- Instead, state must be maintained explicitly using an RDBMS, perhaps combined with SMLserver cache primitives.
- Alternative: emulate state with form variables or cookies.
- Region inference works very well in environments where programs run briefly but are executed often!
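A minimal JavaScript sketch of why implicit state does not survive between requests: the library state is copied once at boot and restored for every request. `bootServer`, `serve`, and `copy` are hypothetical names, and a JSON round trip stands in for copying region pages.

```javascript
// Hypothetical sketch: each request starts from the same initial heap.
// A JSON round trip stands in for copying library region pages.
function copy(value) {
  return JSON.parse(JSON.stringify(value));
}

function bootServer(initLibrary) {
  // Execute library code once and keep a pristine copy of its data.
  const libraryCache = copy(initLibrary());
  return {
    // Serve one request: run the script against a fresh copy of the
    // library state; the copy is dropped when the script returns.
    serve(script) {
      const libraryRegions = copy(libraryCache);
      return script(libraryRegions);
    },
  };
}

// A script that tries to keep a hit counter in mutable library state.
const server = bootServer(() => ({ counter: 0 }));
const hit = (lib) => {
  lib.counter += 1;
  return lib.counter;
};
const first = server.serve(hit);
const second = server.serve(hit);
```

Both `first` and `second` see the counter start from zero, which is the behavior the slide describes for Standard ML references and arrays.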
Measurements
Measurements with ApacheBench
Requests / second

Program     MosML MSP   AOLserver TCL   Apache PHP   SMLserver MSP
hello              55             724          489            1326
date               54             855          495            1113
db                 27             558          331             689
guest              25             382          274             543
calendar           36              27           37             101
mul                50             185          214             455
table              21              59          0.7              93
log                 8              12          0.4              31

- ApacheBench (v. 1.3d) uses eight threads (60 seconds each)
- Old experiments: 850 MHz Pentium 3 Linux box (384 MB RAM)
SMLtoJs
SMLtoJs: Higher-Order and Typed (HOT) Web browsing
http://www.smlserver.org/smltojs/
Easy development and maintenance of advanced Web browser libraries (e.g., Reactive Web Programming)
- Allow developers to build client-based web applications in a HOT language
Allow for existing code to execute in browsers
- Programs (e.g., SMLtoJs itself, which is written in SML)
- Libraries (e.g., the IntInf Basis Library module)
- Support all of SML and (almost) all of the SML Basis Library
Web programming without tiers
- Allow the same code to run both in the browser and on the server (e.g., complex serialisation code)
- Type-safe multi-tier applications
Features
Features of SMLtoJs
Supports all Browsers
- SMLtoJs compiles Standard ML programs to JavaScript for execution in all major web browsers.
Compiles all of Standard ML
- SMLtoJs compiles all of SML, including higher-order functions, pattern matching, generative exceptions, and modules.
Basis Library Support
- Supports most of the Standard ML Basis Library, including: Array2 ArraySlice Array Bool Byte Char CharArray CharArraySlice CharVector CharVectorSlice Date General Int Int31 Int32 IntInf LargeWord ListPair List Math Option OS.Path Pack32Big Pack32Little PackReal Random Real StringCvt String Substring Text Time Timer Vector VectorSlice Word Word31 Word32 Word8 Word8Array Word8ArraySlice Word8Vector Word8VectorSlice
- Additional Libraries: JsCore Js Html Rwp
JavaScript Integration and DOM Access
- ML code may call JavaScript functions and execute JavaScript statements.
- SMLtoJs has support for simple DOM access and for installing ML functions as DOM event handlers and timer callback functions.
Applies MLKit's optimisations
- Static interpretation of Modules
- Function inlining and constant propagation
- Specialisation of higher-order recursive functions (map, foldl)
JavaScript-specific optimisations
- Tail-call optimisation of so-called straight tail calls
- Unboxing of certain datatypes (lists, certain trees, etc.)
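To give a feel for the tail-call optimisation listed above, the following hand-written JavaScript contrasts a function that calls itself in tail position with a loop that computes the same result. It is a sketch of the general idea, not SMLtoJs output; `sumRec` and `sumLoop` are hypothetical names.

```javascript
// Direct translation: every call to sumRec grows the JavaScript stack.
function sumRec(n, acc) {
  if (n <= 0) return acc;
  return sumRec(n - 1, acc + n); // tail call to the enclosing function
}

// Loop form: the tail call becomes parameter reassignment plus a jump back
// to the top of the function body, so stack depth stays constant.
function sumLoop(n, acc) {
  while (true) {
    if (n <= 0) return acc;
    // Compute the new arguments before overwriting the old ones.
    const nextN = n - 1;
    const nextAcc = acc + n;
    n = nextN;
    acc = nextAcc;
  }
}
```

The loop form matters in browsers because JavaScript engines generally do not guarantee constant stack space for tail calls, so a deep recursion such as `sumRec(1000000, 0)` may overflow the stack while `sumLoop(1000000, 0)` does not.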
Example: Compiling the Fibonacci Function (fib.sml)
  fun fib n = if n < 2 then 1 else fib(n-1) + fib(n-2)
  val _ = print(Int.toString(fib 23))

Resulting JavaScript Code
  var fib$45 = function fib$45(n$48){ if (n$48 ...

  ... 'a1 * 'a2 -> 'b
  ...
  end

Phantom types are used to ensure proper interfacing:
  fun documentWrite d s =
    J.exec2 {stmt="return d.write(s);",
             arg1=("d",J.fptr),
             arg2=("s",J.string),
             res=J.unit} (d,s)
SMLtoJs inlines stmt if it is known statically; otherwise a Function object is created and stmt is resolved and executed at runtime.
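The runtime fallback mentioned above, creating a Function object from the statement text, can be sketched in plain JavaScript. The `exec2` below is a hypothetical stand-in, not the JsCore implementation, and `fakeDocument` is a stub replacing the browser's `document` so that the sketch also runs outside a browser.

```javascript
// Hypothetical runtime fallback for a two-argument JavaScript fragment:
// build a Function object from the argument names and the statement text,
// then apply it. This exec2 is a stand-in, not the JsCore implementation.
function exec2(spec, a1, a2) {
  const f = new Function(spec.arg1, spec.arg2, spec.stmt);
  return f(a1, a2);
}

// A stub object standing in for the browser's document.
const written = [];
const fakeDocument = {
  write: (s) => {
    written.push(s);
  },
};

// Counterpart of the documentWrite example, with the statement resolved
// at runtime rather than inlined by the compiler.
function documentWrite(d, s) {
  return exec2({ stmt: "return d.write(s);", arg1: "d", arg2: "s" }, d, s);
}

documentWrite(fakeDocument, "hello");
```

When the statement text is known at compile time, the same effect is obtained without the `Function` constructor by emitting `d.write(s)` directly, which is the inlining case the slide describes.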
JavaScript Integration
Library (Js) for Manipulating the DOM and Element Events
  signature JS = sig
    eqtype win and doc and elem                          (* dom *)
    val openWindow      : string -> string -> win
    val document        : doc
    val windowDocument  : win -> doc
    val documentElement : doc -> elem
    val getElementById  : doc -> string -> elem option
    val value           : elem -> string
    val innerHTML       : elem -> string -> unit
    datatype eventType = onclick | onchange              (* events *)
    val installEventHandler : elem -> eventType -> (unit->bool) -> unit
    type intervalId
    val setInterval     : int -> (unit->unit) -> intervalId
    val clearInterval   : intervalId -> unit
    val onMouseMove     : (int*int -> unit) -> unit
    ...
  end
Notice: implemented using the JsCore module.
The Inner Workings of SMLtoJs (no tail calls)
- SMLtoJs compiles SML to JavaScript through an MLKit IL.
- SML reals, integers, words, and chars are implemented as JavaScript numbers with explicit checks for overflow.
- SML variables are compiled into JavaScript variables.
- SML functions are compiled into JavaScript functions:
    [[fn x => e]]exp = function(x){ [[e]]stmt }
- SML variable bindings compile to JavaScript function applications:
    [[let val x = e in e0 end]]exp = function(x){ [[e0]]exp }([[e]]exp)
- When compilation naturally results in a JavaScript statement, the statement is converted into an expression:
    if [[e]]stmt = stmt, then [[e]]exp = function(){ stmt; }()    (1)
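The two translation patterns can be tried out directly in JavaScript. The fragments below are written by hand to mirror the scheme and are not compiler output; the explicit `return` statements are added so that the fragments run as ordinary JavaScript, and `letResult` and `choose` are hypothetical names.

```javascript
// SML: let val x = 3 + 4 in x * x end
// The binding becomes the application of a function to the bound value.
const letResult = (function (x) {
  return x * x;
})(3 + 4);

// SML: 10 + (if c then 1 else 2)
// The conditional is naturally a JavaScript statement; wrapping it in an
// immediately applied function turns it into an expression that can be
// used as an operand.
function choose(c) {
  return (
    10 +
    (function () {
      if (c) {
        return 1;
      } else {
        return 2;
      }
    })()
  );
}
```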
Tail Calls
Optimizing Straight Tail Calls
A straight tail call is a tail call to the nearest enclosing function with a tail call context containing no function abstractions.
Example SML code:
  fun sum (n,acc) = if n ...

  ... ''c) -> (''b,''c,'k) arr
  val >>> : (''b,''c,'k)arr * (''c,''d,'k)arr -> (''b,''d,'k)arr
  val fst : (''b,''c,'k)arr -> (''b*''d,''c*''d,'k)arr
  (* derived combinators *)
  val snd : (''b,''c,'k)arr -> (''d*''b,''d*''c,'k)arr
  val *** : (''b,''c,'k)arr * (''d,''e,'k)arr -> (''b*''d,''c*''e,'k) arr
  val &&& : (''b,''c,'k)arr * (''b,''d,'k)arr -> (''b,''c*''d,'k)arr
  end
Notice
- The ARROW signature specifies combinators for creating basic arrows and for composing arrows.
- Specifically, we model behavior transformers and event stream transformers as arrows.
- The 'k's are instantiated either to B (behavior) or to E (events).
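How the ARROW combinators fit together can be seen in a deliberately simplified JavaScript model in which an arrow is just a function on plain values; the Rwp library instead transforms behaviors and event streams. Since JavaScript has no user-defined infix operators, `compose`, `par`, and `fanout` are hypothetical names standing for `>>>`, `***`, and `&&&`, and pairs are modelled as two-element arrays.

```javascript
// Simplified model: an arrow from b to c is just a function b -> c.
const arr = (f) => f;

// >>> : run f, then feed its result to g.
const compose = (f, g) => (x) => g(f(x));

// fst / snd : apply f to one component of a pair, pass the other through.
const fst = (f) => ([b, d]) => [f(b), d];
const snd = (f) => ([d, b]) => [d, f(b)];

// *** : apply f and g to the two components of a pair.
const par = (f, g) => ([b, d]) => [f(b), g(d)];

// &&& : feed the same input to f and g and pair the results.
const fanout = (f, g) => (b) => [f(b), g(b)];

// Parse two strings, add the numbers, and convert the sum back to text.
const toInt = arr((s) => parseInt(s, 10));
const addFields = compose(
  compose(par(toInt, toInt), arr(([x, y]) => x + y)),
  arr(String)
);
```

In this model `addFields(["3", "4"])` evaluates to the string `"7"`. The derived combinators could also be defined from `arr`, `compose`, and `fst` alone, which is presumably why the signature marks them as derived.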
Other Uses
The Rwp library: Building Basic Behaviors and Event Streams
  signature RWP = sig
    type B type E              (* kinds: Behaviors (B) and Events (E) *)
    type ('a,'k)t
    type 'a b = ('a, B)t
    type 'a e = ('a, E)t
    include ARROW where type ('a,'b,'k)arr = ('a,'k)t -> ('b,'k)t
    val timer     : int -> Time.time b
    val textField : string -> string b
    val mouseOver : string -> bool b
    val mouse     : unit -> (int*int) b
    val pair      : ''a b * ''b b -> (''a * ''b) b
    val merge     : ''a e * ''a e -> ''a e
    val delay     : int -> (''a,''a,B)arr
    val calm      : int -> (''a,''a,B)arr
    val fold      : (''a * ''b -> ''b) -> ''b -> ''a e -> ''b e
    val click     : string -> ''a -> ''a e
    val changes   : ''a b -> ''a e
    val hold      : ''a -> ''a e -> ''a b
    val const     : ''a -> ''a b
    val flatten   : ''a b b -> ''a b
    val insertDOM : string -> string b -> unit
  end
Example: Adding the Content of Fields
  open Rwp
  infix *** &&& >>>
  val _ = print(" Add Content of Fields " ^ " + " ^ " = ?")
  val si_t : (string,int,B)arr = arr (Option.valOf o Int.fromString)
  val form = pair( textField "a", textField "b" )
  val t = (si_t *** si_t) >>> (arr op +) >>> (arr Int.toString)
  val _ = insertDOM "c" (t form)
Notice
- t takes a behavior of pairs of strings (the two field contents), converts both to integers, adds them, and returns the sum as a string behavior.
Example: Reporting the Mouse Position
  val _ = print ("Mouse Position" ^ "?" ^ "?" ^ "?")
  val t : (int*int,string,B) arr =
    arr (fn (x,y) => ("[" ^ Int.toString x ^ "," ^ Int.toString y ^ "]"))
  val bm = mouse()
  val t10 : (int*int,int*int,B) arr =
    arr (fn (x,y) => (x div 10 * 10, y div 10 * 10))
  val bm2 = (t10 >>> t) bm
  val _ = insertDOM "mouse0" (t bm)
  val _ = insertDOM "mouse1" (calm 400 bm2)
  val _ = insertDOM "mouse2" (delay 400 bm2)
Notice
- calm waits for the underlying behavior to be stable.
- delay shifts the underlying behavior in time.
Implementation Issues
- Behaviors and event streams are implemented using “listeners”:
    type ('a,'k) t = {nid : int,
                      listeners : (LId.t * ('a->unit)) list ref,
                      current : 'a ref option}
- Behaviors (of type ('a,B)t) always have a current value, whereas event streams do not.
- Installing a behavior b in the DOM tree involves adding a listener to b that updates the element using Js.innerHTML.
- The implementations of calm and delay make use of Js.setTimeout.
- The implementation of textField makes use of Js.installEventHandler.
- The implementation of mouse makes use of Js.onMouseMove.
- A heap is used to guarantee that when a node changes value, every other node is updated at most once.
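The listener mechanism can be sketched in a few lines of JavaScript. The field names `listeners` and `current` follow the record type on the slide; everything else (`behavior`, `addListener`, `send`, `lift`, `insertInto`, and the plain object standing in for a DOM element) is a hypothetical simplification. In particular, the sketch propagates changes naively and omits the heap that keeps each node from being updated more than once.

```javascript
// Hypothetical listener-based behaviors. A behavior always has a current
// value and a list of listeners to notify when that value changes.
function behavior(initial) {
  return { listeners: [], current: initial };
}

function addListener(b, f) {
  b.listeners.push(f);
}

// Set a new current value and notify every listener.
function send(b, v) {
  b.current = v;
  for (const f of b.listeners) f(v);
}

// Lift a plain function to a behavior transformer (the role of arr):
// the output behavior listens to the input and recomputes on each change.
function lift(f) {
  return (b) => {
    const out = behavior(f(b.current));
    addListener(b, (v) => send(out, f(v)));
    return out;
  };
}

// Stand-in for insertDOM: keep an element's innerHTML in sync with b.
function insertInto(element, b) {
  element.innerHTML = b.current;
  addListener(b, (v) => {
    element.innerHTML = v;
  });
}

// A text-field-like source, a derived behavior, and a fake DOM element.
const field = behavior("1");
const elem = { innerHTML: "" };
insertInto(elem, lift((s) => "value: " + s)(field));
send(field, "42");
```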
Related Work and Conclusion
Related Work
- The Google Web Toolkit project (GWT).
- The Scm2Js and Hop projects by Loitsch and Serrano, TFL’2007.
- The Links project. Wadler et al. 2006.
- The AFAX F# project by Syme and Petricek, 2007.
- The ML5 project by Murphy, Crary, and Harper, 2007.
- O’Browser by Canou, Balat, and Chailloux, ML’2008.
- The js_of_ocaml project by Vouillon, 2011.
Related Reactive Programming Work
- The Flapjax language and JavaScript library by Shriram Krishnamurthi et al.
- John Hughes. Generalising Monads to Arrows. Science of Computer Programming 37. Elsevier 2000.
- The Fruit Haskell library by Courtney and Elliott.