An Expression Processor: A Case Study in Refactoring Haskell Programs
Christopher Brown (1), Huiqing Li (2), and Simon Thompson (2)
1 School of Computer Science, University of St. Andrews, UK. [email protected]
2 School of Computing, University of Kent, UK. {H.Li,S.J.Thompson}@kent.ac.uk
Abstract. Refactoring is the process of changing the structure of a program while preserving its behaviour. This behaviour preservation is crucial so that refactorings do not introduce any bugs. Refactoring is aimed at increasing code quality, programming productivity and code reuse. Refactoring has been practised manually by programmers for as long as programs have been written; however, with the advent of refactoring tools, refactoring can be performed semi-automatically, allowing refactorings to be performed (and undone) easily. In this paper, we briefly describe a number of refactorings implemented in the Haskell Refactorer, HaRe. In addition to this, we also implement a simple expression processor to demonstrate how some of the refactorings implemented in HaRe can be used to aid programmers in developing Haskell software.
1 Introduction
Often programmers write a first version of a program without paying full attention to programming style or design principles [1]. Having written a program, the programmer will realise that a different approach would have been much better, or that the context of the problem has changed. Refactoring tools provide software support for modifying a program into a better program, thus avoiding the expense of re-starting from scratch. Refactoring is the process of changing the internal structure of a program, while preserving its behaviour. The term refactoring was first introduced by Opdyke in his PhD thesis in 1992 [2], and the concept goes at least as far back as the fold/unfold system proposed by Burstall and Darlington in 1977 [3], although, arguably, the fold/unfold system was more about algorithm change than structural change. The key aspect of refactoring —in contrast to general program transformations, such as genetic programming [4]— is the focus on purely structural changes rather than changes in program behaviour. The Haskell Refactorer, HaRe, is the result of the combined effort of the Refactoring Functional Programs project at the University of Kent [5, 6] by Li, Reinke, Thompson and Brown. HaRe provides refactorings for the full Haskell 98
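To give a flavour of the kind of behaviour-preserving change such a tool mechanises, consider generalising a definition over a constant. The following before/after pair is a hand-written illustration in the spirit of HaRe's "generalise definition" refactoring, not actual tool output:

-- before: the separator is hard-wired into the definition
format :: [String] -> [String]
format = map (++ "\n")

-- after generalising: the constant becomes a parameter, and every
-- call site is updated to supply the old value, preserving behaviour
formatSep :: String -> [String] -> [String]
formatSep sep = map (++ sep)

-- old call:  format xs        new call:  formatSep "\n" xs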
Hygienic Macros for ACL2
Carl Eastlund and Matthias Felleisen
Northeastern University, Boston, MA, USA
{cce,matthias}@ccs.neu.edu
Abstract. ACL2 is a theorem prover for a purely functional subset of Common Lisp. It inherits Common Lisp’s unhygienic macros, which are used pervasively to eliminate repeated syntactic patterns. The lack of hygiene means that macros do not automatically protect the producers or consumers of macros from accidental variable capture. This paper demonstrates how this lack of hygiene interferes with theorem proving. It then explains how to design and implement a hygienic macro system for ACL2. An evaluation of the ACL2 code base shows the impact of this hygienic macro system on existing libraries and practices.
1 Unhygienic Macros Are Not Abstractions
ACL2 [1, 2] is a formal verification system that combines a first-order functional subset of Common Lisp with a first-order theorem prover over a logic of total functions. It has been used to model and verify large commercial hardware and software artifacts. ACL2 supports functions and logical statements over numbers, strings, symbols, and s-expressions. Here is a sample program:

(defun double (x) (+ x x))

(defthm double⇒evenp
  (implies (integerp x) (evenp (double x))))

The defun form defines double, a function that adds its input to itself. The defthm form defines double⇒evenp, a conjecture stating that an integer input to double yields an even output. The conjecture is implicitly universally quantified over its free variable x. ACL2 validates double⇒evenp (blessing it as a theorem) using the definition of double and axioms about implies, integerp, and evenp. From Common Lisp, ACL2 inherits macros, which provide a mechanism for extending the language via functions that operate on syntax trees. According to Kaufmann and Moore [1], “one can make specifications more succinct and easy to grasp . . . by introducing well-designed application-specific notation.” Indeed, macros are used ubiquitously in ACL2 libraries: there are macros for pattern matching; for establishing new homogeneous list types and heterogeneous structure types, including a comprehensive theory of each; for defining quantified claims using Skolemization in an otherwise (explicitly) quantifier-free logic; and so on.
Evaluating Call By Need on the Control Stack
Stephen Chang, David Van Horn, and Matthias Felleisen
PLT & PRL, Northeastern University, Boston, MA 02115
Abstract. Ariola and Felleisen’s call-by-need λ-calculus replaces a variable occurrence with its value at the last possible moment. To support this gradual notion of substitution, function applications—once established—are never discharged. In this paper we show how to translate this notion of reduction into an abstract machine that resolves variable references via the control stack. In particular, the machine uses the static address of a variable occurrence to extract its current value from the dynamic control stack.
1 Implementing Call-by-need
Following Plotkin [1], Ariola and Felleisen characterize the by-need λ-calculus as a variant of β: (λx.E[x]) V = (λx.E[V]) V, and prove that a machine is an algorithm that searches for a (generalized) value via the leftmost-outermost application of this new reduction [2]. Philosophically, the by-need λ-calculus has two implications:

1. First, its existence says that imperative assignment isn’t truly needed to implement a lazy language. The calculus uses only one-at-a-time substitution and does not require any store-like structure. Instead, the by-need β suggests that a variable dereference is the resumption of a continuation of the function call, an idea that Garcia et al. [3] recently explored in detail by using delimited control operations to derive an abstract machine from the by-need calculus. Unlike traditional machines for lazy functional languages, Garcia et al.’s machine eliminates the need for a store by replacing heap manipulations with control (stack) manipulations.

2. Second, since by-need β does not remove the application, the binding structure of programs—the association of a function parameter with its value—remains the same throughout a program’s evaluation. This second connection is the subject of our paper. This binding structure is the control stack, and thus we have that in call-by-need, static addresses can be resolved in the dynamic control stack.

Our key innovation is the CK+ machine, which refines the abstract machine of Garcia et al. by making the observation that when a variable reference is in focus, the location of the corresponding binding context in the dynamic control
Graphical and Incremental Type Inference: A Graph Transformation Approach
Silvia Clerici, Cristina Zoltan, and Guillermo Prestigiacomo
Universitat Politècnica de Catalunya, Barcelona, Spain
Abstract. We present a graph grammar based type inference system for a totally graphic development language. NiMo (Nets in Motion) can be seen as a graphic equivalent to Haskell that acts as an on-line tracer and debugger. Programs are process networks that evolve giving total visibility of the execution state, and can be interactively completed, changed or stored at any step. In such a context, type inference must be incremental. During net construction or modification only type safe connections are allowed. The user visualizes the type information evolution and, in case of conflict, can easily identify the causes. Though based on the same ideas, the type inference system has significant differences from its analogues in functional languages. Process types are a non-trivial generalization of functional types to handle multiple outputs, partial application in any order, and curried-uncurried coercion. Here we present the elements to model graphical inference, the notion of structural and non-structural equivalence of type graphs, and a graph unification and composition calculus for typing nets in an incremental way.
1 Introduction
The data flow view of lazy functional programs as process networks was first introduced in [1]. The graphic representation of functions as processes and infinite lists as non-bounded channels helps to understand the program’s overall behaviour. The net architecture shows in a bi-dimensional way the chains of function compositions, exhibits the implicit parallelism, and back arrows give an insight into the recurrence relations between the new results and those already calculated. The graphic execution model that the net animation suggests was the starting point for the NiMo language design, whose initial version was presented in [2]. It was completely defined in terms of graph transformations and implemented in the graph transformation system AGG. This first prototype, NiMoAGG, showed the feasibility of having a graphical equivalent for Miranda or Haskell that could also be executable in a totally graphic way. A small set of graphic primitives allows representing and handling higher order, partial application, non-strict evaluation, and type inference with parametric polymorphism. Since the net is the code but also its computation graph, users have total visibility of the execution internals according to a comprehensible model. Partially defined nets can be executed, dynamically completed or modified, and stored at any step, thus allowing incremental development on the fly. Also, execution steps can be undone, so the system acts as an on-line tracer and debugger where everything can be dynamically modified. In the current version of NiMo even
Data-Driven Detection of Catamorphisms: Towards Problem-Specific Use of Program Schemes for Inductive Program Synthesis
Martin Hofmann
Faculty of Information Systems and Applied Computer Science, University of Bamberg. [email protected]
Abstract. Inductive Program Synthesis or Inductive Programming (IP) is the task of generating (recursive) programs from an incomplete specification, such as input/output (I/O) examples. All known IP algorithms can be viewed as search in the space of all candidate programs, with consequently exponential complexity. To constrain the search space and guide the search, program schemes are traditionally used, usually given a priori by an expert user. Consequently, all further given data is interpreted w.r.t. this schema, which then almost exclusively decides success or failure, depending on whether it fits the data or not. Instead of trying to fit the data to a given schema indiscriminately, we propose to utilise knowledge about data types to choose and fit a suitable schema to the data! Recursion operators associated with data type definitions are well known in functional programming, but less so in IP. We propose to exploit universal properties of type morphisms which may be detected in the given I/O examples. This technique allows us to introduce catamorphisms as generic recursion schemes on arbitrary inductive data types in our analytical inductive functional programming system Igor II.
1 Introduction
Inductive Program Synthesis or Inductive Programming (IP) is the study of automatically synthesising (recursive) programs. Contrary to deductive program synthesis, which relies on a complete and formalised specification, IP uses incomplete specifications such as input/output (I/O) examples or execution traces of the desired program’s behaviour. Usually, it focuses on the synthesis of declarative (logic, functional, or functional logic) programs. The aims of IP are manifold. On the one hand, research in IP provides better insights into the cognitive skills of human programmers. On the other hand, powerful and efficient IP systems can enhance software systems in a variety of domains—such as automated theorem proving and planning—and offer novel approaches to knowledge based software engineering such as model-driven software development or test-driven development, as well as end user programming support in the XSL domain [1]. So it should be clear that to “automagically” create a whole system is far too daring, but generating parts of software such as modules or functions is quite a realistic task.
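To make the connection concrete, the catamorphism on lists is ordinary foldr, and its universal property is exactly the kind of regularity a detector can look for in I/O examples. The following is a generic Haskell illustration of the idea, not Igor II code:

-- the list catamorphism: replace (:) by f and [] by z
cata :: (a -> b -> b) -> b -> [a] -> b
cata f z []     = z
cata f z (x:xs) = f x (cata f z xs)

-- I/O examples such as  len [] = 0  and  len [a,b,c] = 3  are consistent
-- with the list catamorphism  len = cata (\_ n -> 1 + n) 0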
Experimental Mathematics in Haskell: on Pairing/Unpairing Functions and Boolean Evaluation
Paul Tarau (1) and Brenda Luderman (2)
1 Department of Computer Science and Engineering, University of North Texas. [email protected]
2 ACES CAD. [email protected]
Abstract. By using Haskell as our mathematical meta-language, we explore natural number encodings of boolean functions and logic circuit representations. A fresh look at ordered decision trees and binary decision diagrams in terms of functions on natural numbers is provided. Using pairing and unpairing functions on natural number representations of truth tables, we derive an encoding for Ordered Binary Decision Trees (OBDTs) with the unique property that its boolean evaluation faithfully mimics its structural conversion to a natural number through recursive application of an unpairing function. We use this result to derive ranking and unranking functions for OBDTs and reduced OBDTs. Finally, generalizations to arbitrary variable orderings for OBDT encodings and to Multi-Terminal OBDTs are described. The paper is organized as a literate Haskell program (code available at http://logic.csci.unt.edu/tarau/research/2009/fOBDT.hs). Keywords: ordered binary decision trees (OBDTs), ranking/unranking functions for OBDTs, encodings of boolean functions, pairing/unpairing functions, computational mathematics and functional programming
1 Introduction
This paper is an exploration with functional programming tools of the relation between pairing/unpairing and ranking/unranking functions and Ordered Binary Decision Trees (OBDTs), as well as their connection to boolean evaluation. Its main theoretical contribution is showing that we can construct a pairing function that mimics the boolean evaluation of an ordered binary decision tree. This result leads to algorithms for ranking and unranking functions of OBDTs, and generalizations to arbitrary variable order OBDTs and Multi-Terminal OBDTs. The paper is organized as follows: Section 2 overviews Ordered Binary Decision Trees (OBDTs). Section 3 introduces pairing/unpairing functions acting directly on bitlists. Section 4 introduces a novel OBDT encoding (based on unpairing functions) and discusses the
Typing Coroutines
Konrad Anton and Peter Thiemann
Institut für Informatik, Universität Freiburg. {anton,thiemann}@informatik.uni-freiburg.de
Abstract. A coroutine is a programming construct between function and thread. It behaves like a function that can suspend itself arbitrarily often to yield intermediate results and to get new inputs before returning a result. This facility makes coroutines suitable for implementing generator abstractions. Languages that support coroutines are often untyped or they use trivial types for coroutines. This work supplies the first type system with dedicated support for coroutines. The type system is based on the simply-typed lambda calculus extended with effects that describe control transfers between coroutines.
1 Introduction
A coroutine is a programming construct between function and thread. It can be invoked like a function, but before it returns a value (if ever) it may suspend itself arbitrarily often to return intermediate results and then be resumed with new inputs. Unlike with preemptive threading, a coroutine does not run concurrently with the rest of the program, but rather takes control until it voluntarily suspends to either return control to its caller or to pass control to another coroutine. Coroutines are closely related to cooperative threading, but they add value because they are capable of passing values into and out of the coroutine and they permit explicit switching of control. Coroutines were invented in the 1960s as a means for structuring a compiler [4]. They have received a lot of attention in the programming community and have been integrated into a number of programming languages, for instance Simula 67 [5], BETA, CLU [11], Modula-2 [19], Python [17], and Lua [16], and Knuth finds them convenient in the description of algorithms [8]. Coroutines are also straightforward to implement in languages that offer first-class continuations (e.g., Scheme [7]) or direct manipulation of the execution stack (e.g., assembly language, Smalltalk). The main uses of coroutines are the implementation of compositions of state machines as in Conway’s seminal paper [4] and the implementation of generators. A generator enumerates a potentially infinite set of values with successive invocations. The latter use has led to renewed interest in coroutines and to their inclusion in mainstream languages like C# [14], albeit in restricted form as generators.
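One minimal way to model the construct in a typed functional setting (our own sketch, not the calculus developed in the paper) is a resumption type: applied to an input, a coroutine either yields an output together with its own continuation, or finishes with a result.

-- a coroutine consuming inputs i, yielding outputs o, returning r
newtype Coroutine i o r = Coroutine { resume :: i -> Step i o r }
data Step i o r = Yield o (Coroutine i o r) | Done r

-- a generator that yields a running total until fed 0
totals :: Int -> Coroutine Int Int ()
totals acc = Coroutine $ \i ->
  if i == 0 then Done ()
            else Yield (acc + i) (totals (acc + i))

The types already hint at why a dedicated type system is useful: the input, output, and result types of each suspension point must agree across every transfer of control.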
Monad Factory: Type-Indexed Monads
Mark Snyder and Perry Alexander
The University of Kansas, Information and Telecommunication Technology Center
2335 Irving Hill Rd, Lawrence, KS 66045
{marks,alex}@ittc.ku.edu
Abstract. Monads provide an enormously useful capability to pure languages in simulating side-effects, but implementations such as the Monad Transformer Library [1] in Haskell prohibit heterogeneous applications of those side-effects, such as threading through two different states, without some work-around. Monad Factory provides a straightforward solution for opening the non-proper morphisms by indexing monads at both the type level and the term level, allowing ‘copies’ of the monads to be created and simultaneously used. This expands monads’ applicability and reduces the amount of boilerplate code we need for monads to work together, and yet we use them nearly identically to non-indexed monads. Key words: monads, Haskell, type-level programming
1 Introduction
Programming with monads in Haskell provides a rich set of tools to the pure functional programmer, but their homogeneous nature sometimes proves unsatisfactory. Due to functional dependencies in the Monad Transformer Library (MTL), an individual monad can only be used in one way in a fragment of code, such as using the State monad to store a particular state type. When a programmer wants to use a monad in a heterogeneous fashion, some hack or work-around is necessary, such as using record structures or playing games with lifting. While monad transformers allow us to combine the effects of different monads, we cannot directly use them to combine the effects of a particular monad multiple times. The problem grows as code from various sources begins to interact: when using existing code, a monad’s usage might be “reserved” for some orthogonal use; should you edit that code, even if you can? Even when we give in and re-implement a common monad, we must provide instances for a host of other common monads in defining the transformer, including an instance relating to the copied monad; these details require more knowledge about monads than simply using them. We seek a more adaptable process for capturing monadic behavior, and expect some streamlining, re-usability, and a less error-prone process than the ad-hoc options that are commonly used. In particular, we expect to write fewer instances for transformer inter-operability. This paper introduces type-indexed monads, which allow multiple distinct instances of a monad to coexist, via explicit annotations. A simple type class
Static Balance Checking for First-Class Modular Systems of Equations
John Capper and Henrik Nilsson
Functional Programming Laboratory, School of Computer Science, University of Nottingham, United Kingdom
{jjc,nhn}@cs.nott.ac.uk
Abstract. Characterising a problem in terms of a system of equations is common to many branches of science and engineering. Due to their size, such systems are often described modularly by composition of individual equation system fragments. Checking the balance between the number of variables (unknowns) and equations is a common approach to early detection of mistakes that might render such a system unsolvable. However, current approaches to modular balance checking are quite restrictive. This paper investigates a more flexible approach that in particular makes it possible to treat equation system fragments as true first-class entities. The central idea is to record balance information in the type of an equation fragment. This information can then be used to determine if individual fragments are well formed, and if composing fragments preserves this property. The type system presented in this paper is developed in the context of Functional Hybrid Modelling (FHM). However, the key ideas are in no way specific to FHM, but should be applicable to any language featuring a notion of modular systems of equations. Key words: Systems of equations; equation-based, non-causal modelling; first-class components; equation-variable balance; structural analysis; linear constraints; refinement types.
1 Introduction
Systems of equations, also known as simultaneous equations, are abundant in science and engineering. Applications include modelling, simulation, optimisation, and more. The systems of equations are often parametrised, describing not just a specific problem instance, but a set of problems. Their size and complexity are frequently such that numerical methods and computers are required to solve them. The equations thus need to be turned into programs that can be used to solve for various problem instances. Such programs can be written manually, but a more expedient route is often to transcribe the equations into a high-level language, e.g. a modelling language, thus making it possible to automatically translate them into a program that tries to compute a solution given specific
Ever-Decreasing Circles: a Skeleton for Parallel Orbit Calculations in Eden
Christopher Brown and Kevin Hammond
School of Computer Science, University of St. Andrews, UK. {chrisb,kh}@cs.st-andrews.ac.uk
Abstract. The Orbit algorithm is widely used in symbolic computation, allowing the exploration of a solution space given some initial starting values and a number of mathematically-defined generators. In this paper, we consider how the Orbit algorithm can be encoded in Haskell, and thence how a parallel skeleton can be developed to cover a variety of Orbit calculations, using the Eden parallel dialect of Haskell. We report on a set of performance results that demonstrate the potential of the skeleton to allow a simple but effective implementation of the Orbit algorithm for shared-memory parallel machines.
1 Introduction
This paper describes a new parallel skeleton for a class of algorithms that is widely used in symbolic computation, namely Orbit calculations. An algorithmic skeleton [1] is a higher-order function that abstracts over a generalised parallel computation. More specifically, a parallel skeleton abstractly describes the (parallelisable) structure of an algorithm, in the form of a higher-order function. The computational details that implement the actual algorithm are passed as (functional) arguments to this skeleton. As with the higher-order functions in e.g. the Haskell Prelude, parallel skeletons thus aim to provide a transparent abstraction for the applications programmer, hiding details of the parallel implementation within the definition of the higher-order function. In this paper, we concentrate on implementing a new Orbit skeleton in the Eden [2] parallel dialect of Haskell [3], and use it to solve some large symbolic computations. Informally, an Orbit is an algorithm that computes the transitive closure for a given set of inputs and a list of generator operations; a sequential sketch is given after the list below. We will give a more formal and precise description in Section 2. The main contributions of this paper are as follows:

– we design and implement a new parallel skeleton for the Orbit algorithm using the Eden parallel primitives;
– we test our Orbit implementation on some simple benchmarks and demonstrate that our parallel implementation of the orbit skeleton can achieve near-linear speedups on a commercial off-the-shelf multicore machine; and
– we show that our parallel implementation adds very little overhead over our sequential implementation.
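For concreteness, a sequential orbit computation in plain Haskell might look as follows. This is a sketch of the algorithm being parallelised, under our own naming, and not the Eden skeleton itself:

import qualified Data.Set as Set

-- close a set of seed values under a list of generator functions
orbit :: Ord a => [a -> a] -> [a] -> Set.Set a
orbit gens seeds = go (Set.fromList seeds) seeds
  where
    go seen []     = seen
    go seen (x:xs) =
      let new = [ y | g <- gens, let y = g x, not (y `Set.member` seen) ]
      in go (foldr Set.insert seen new) (new ++ xs)

The parallel versions distribute the worklist across processes; the challenge for any parallel version is sharing the 'seen' set efficiently without serialising the whole computation.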
Hume to FPGA
Abdallah Al Zain (1), Wim Vanderbauwhede (2), and Greg Michaelson (3)
1 Heriot-Watt University, Edinburgh, Scotland, EH14 4AS, UK. [email protected]
2 University of Glasgow, Glasgow, Scotland, G12 8QQ, UK. [email protected]
3 Heriot-Watt University, Edinburgh, Scotland, EH14 4AS, UK. [email protected]
Abstract. Hume is a novel language in the functional tradition, oriented to systems requiring strong guarantees that resource bounds are met. To facilitate resource assurance, Hume enforces a separation of coordination and computation concerns, and deploys an abstract machine intermediary between implementations and analyses. These core design decisions also enable a high degree of portability across architectures and suit Hume well to multi-processor implementations. This paper considers how Hume may be implemented on FPGAs via concurrent abstract machines. Initial results from experimental implementations are discussed and the design of a novel FPGA architecture tailored to Hume coordination is presented.
Keywords: FPGA; embedded system; Hume.
1 Introduction
For some little time now, it appears that further advances in single CPU performance have been checked, and that increases in processor speed for commodity platforms are being sought primarily through multiple cores. Indeed, much is made of soon deploying tens, and indeed hundreds, of cores in one processor, with thread allocation and scheduling controlled transparently by compilers and run-time systems. However, making effective use of multiple cores will still be bound by well known limitations of shared memory multi-processor systems. Essentially, except for very specific algorithms processing very specific patterns of data, shared memory multi-processor performance tails off markedly beyond around 16 processors. It then becomes more effective to build distributed memory assemblies of shared memory nodes, with all the attendant complexities of scheduling and balancing activities, and optimising inter-processor communication, across as well as within nodes. Field programmable gate arrays (FPGAs) hold considerable promise for avoiding the constraints on von Neumann architectures, by offering the prospect of tailoring low level platforms to higher level algorithms, programs and systems.
ComputErl – Erlang-based Framework for Many Task Computing
Michał Ptaszek (1,2) and Maciej Malawski (1)
1 Institute of Computer Science AGH, al. Mickiewicza 30, 30-059 Kraków, Poland
2 Erlang Solutions Ltd., London, United Kingdom
[email protected], [email protected]
Abstract. This paper shows how the Erlang programming language can be used for creating a framework for distributing and coordinating the execution of many task computing problems. The goals of the proposed solution are (1) to disperse the computation into many tasks, (2) to support multiple well-known computation models (such as master-worker, map-reduce, pipeline), (3) to exploit the advantages of Erlang for developing an efficient and scalable framework and (4) to build a system that can scale from a small to a large number of tasks with minimum effort. We present the effects of our work on designing, implementing and testing the ComputErl framework. Preliminary experiments with benchmarks as well as real scientific applications show promising scalability on a computing cluster. Keywords: many task computing, Erlang, grid, distributed computing, parallelism
1 Introduction
In modern times, when the magnitude of data that needs to be processed on a daily basis is often far too large for a single workstation, the importance of taking advantage of machines that form a cluster or computing grid is growing. In most cases grid systems are aimed at performing coarse-grained computations that last for a relatively long time. The typical usage is to employ a big number of loosely coupled workstations to perform a highly specified, number-crunching and computationally intensive job. Erlang, as a functional programming language focusing on concurrency, distribution and robustness [1], has established itself as a tool that allows programmers to build highly scalable systems. However, although Erlang has never had a strong position in the computational world, it has been used several times as a highly-scalable middleware layer responsible for coordination and message transport (e.g., http://www.heroku.com, http://www.facebook.com/notes.php?id=9445547199) as well as a tool acting as a key-value storage [2]. One of the main goals for this work was to prove that Erlang is capable of handling massive-scale computation coordination. We specifically focus on fine-grained computational tasks in the so-called many task computing model [3], which is gaining importance.
Gozer: A Dynamic Object-Oriented Lisp for the JVM
Jason Madden
RiskMetrics Group, 201 David L. Boren Blvd, Suite 300, Norman, OK, USA. [email protected]
Abstract. The Gozer language is a highly dynamic Lisp dialect designed for the rapid development of complex scripts that can easily exploit a distributed environment as well as local parallelism. Although fundamentally object-oriented, it supports multiple programming paradigms, including functional, imperative, object-oriented and generic. Gozer runs on the Java virtual machine and incorporates ideas from languages such as Common Lisp, Scheme, and Java, among others. It is in production usage at RiskMetrics Group.
1 Introduction
The Gozer language is a Lisp dialect that could be described as a “scripting language” due to its support for interactive development, rapid prototyping, and tight integration with existing Java libraries. Its primary influence is Common Lisp, but it includes elements from other languages including Clojure, Groovy, Java, and Scheme. The Gozer language is executed by a custom bytecode-based Gozer virtual machine (GVM) and runtime layered on top of the JVM (Java virtual machine); this implementation supports its distributed and parallel features. This paper will provide an overview of the Gozer language with comments on its relation to other languages. Following a brief description of the history and motivations for Gozer, the remainder of the paper will focus on the language itself, beginning with the syntax before progressing to evaluation, with particular attention paid to the condition system to highlight the flexibility of the language and its control flow constructs, and ending with the integration of dynamic object-oriented features into the base functional language.
1.1 History and Motivation
Beginning in 2003, RiskMetrics developed on the Java platform a (proprietary) distributed, message-passing computing environment called BlueBox, based on a service-oriented architecture in which services communicate by exchanging well-defined XML messages. In 2006, it was determined that a way to quickly coordinate existing services for the implementation of complex business processes called workflows was needed, and development of the workflow system that would ultimately become the Gozer Workflow System began [1]. There were several requirements for the language that workflows would be written in. These included the ability to rapidly develop new workflows, effortless integration with BlueBox services and Java libraries, easy modification of existing workflows and adaptation to system changes (without human intervention, where possible), simple (preferably transparent) scaling across the distributed system, and simple (again, preferably transparent) fault-tolerance. These requirements were originally met by describing workflows in an abstract XML document (which was “executed” by a simple tree-walking interpreter whose execution state, a primitive continuation, could be persisted and migrated between JVMs for fault-tolerance and distribution) but it soon became apparent that
Modeling Tumor Invasion: Assessing Erlang for the Modeling of Molecular/Cellular Dynamics
Tim Ashley (1), Dee Wu (2), Robert Watkins, and Henry Neeman (1)
1 University of Oklahoma
2 University of Oklahoma Health Sciences Center
Abstract. Modeling nano- and mesoscale particles can be well-suited for evaluation by a functional programming (FP) language. In particular, many biological processes, including those for drug delivery and tumor evaluation, are of interest to the medical community. However, much work concerns the use of continuum mechanics models that may oversimplify real world biophysical relationships, applying only an approximation at a coarse scale of measurement. We investigate the use of Erlang to model the molecular/cellular dynamics of a tumor. FP is well suited to model generational data structures that are ultimately scalable with system resources, and can maintain multiple functional state representations for processes. We apply pattern-matching rules and store the state of our world using list and tuple data structures. However, the incorporation of both space and generational data as a large state model can be a challenging computational problem. Thus, to model complex physics interactions from multiple domain spaces, we apply a recursive approach to provide an estimation of the full model which can incorporate the hybridized information from both the spatial and generational processes.
Key words: Molecular/Cellular Dynamics, Erlang, Multiscale Space, Hybrid Processing
1 Introduction
There have been attempts to provide multiscale models for tumor invasion in the literature [1]. However, mathematical models for cellular biology can be complex in terms of both modeling the equations of motion and the state information within the cell groups. Early efforts for cell modeling revolved around continuum based models. These continuum models approximate the large scale interaction between groups of cells, and much sophisticated numerical and computational machinery is available for modeling continuum scale systems. Unfortunately, these models are approximate because they are focused on larger spatial and temporal scales than individual cell group dynamics. As an option, it may also be possible to model discretely the behavior of individual cells. This is also often a challenge, as there is an extremely high computational cost associated with performing this form of analysis. Recently, it has been proposed to incorporate
HaskHOL: A Haskell Hosted Domain Specific Language Representation of HOL Light
Evan Austin and Perry Alexander
The University of Kansas, Information and Telecommunication Technology Center
2335 Irving Hill Rd, Lawrence, KS 66045
{ecaustin,alex}@ittc.ku.edu
Abstract. Traditionally, members of the higher-order logic (HOL) theorem proving family have been implemented in the Standard ML programming language or one of its derivatives. This paper presents a description of a recently initiated project intended to break with tradition and implement a lightweight HOL theorem prover library, HaskHOL, as a Haskell hosted domain specific language. The goal of this work is to provide the ability for Haskell users to reason about their code directly without having to transform it or otherwise export it to an external tool. The paper also presents a verification technique leveraging popular Haskell tools QuickCheck and Haskell Program Coverage to increase confidence that the logical kernel of HaskHOL is implemented correctly. Key words: Haskell, HOL, HOL Light, Theorem Prover
1 Introduction
Modern higher-order logic (HOL) theorem provers have a rich history dating back to when Michael Gordon first modified Cambridge LCF, a system based on Robin Milner’s Logic for Computable Functions, in the late 1980s [10]. Starting with HOL90, a reimplementation of the first stable release of a HOL system (HOL88), these theorem provers have all shared more than their logical basis; they were all implemented in Standard ML or one of its derivatives. This is a trend that has continued to this day, leaving users of other functional programming languages with few to no native representations of a HOL system. HaskHOL aims to correct this deficiency for Haskell by providing a hosted domain specific language (DSL) representation of a lightweight HOL theorem prover that users can leverage to reason about their code without having to leave the Haskell universe. The design and implementation of HaskHOL is heavily influenced by HOL Light, a popular member of the HOL theorem proving family developed by John Harrison, sharing its logical kernel and data type representations [12]. The HOL Light system was selected as the basis of HaskHOL because it has a much simpler logical kernel when compared to other HOL provers while still maintaining comparable proving power and demonstrating an impressive track record of successful verifications of industrial problems.
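For readers unfamiliar with HOL Light, its term language has just four constructors, which is much of what makes the kernel small. A Haskell rendering of that shape (our approximation for illustration, not HaskHOL's actual definitions) looks like:

data HOLType = TyVar String
             | TyApp String [HOLType]   -- e.g. TyApp "fun" [a, b]

data HOLTerm = Var   String HOLType     -- variable with its type
             | Const String HOLType     -- constant with its type
             | Comb  HOLTerm HOLTerm    -- application
             | Abs   HOLTerm HOLTerm    -- abstraction: bound Var and body

All proof steps ultimately construct values of a protected theorem type from terms like these, so the trusted core stays small.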
Implicitly Heterogeneous Multi-Stage Programming for FPGAs
Fulong Chen (1), Rajat Goyal (2), Edwin Westbrook (3), and Walid Taha (3)
1 Department of Computer Science, Anhui Normal University, Wuhu, Anhui 241000, China. [email protected]
2 Integrated M.Tech, Mathematics and Computing, Indian Institute of Technology, New Delhi 110016, India. [email protected]
3 Department of Computer Science, Rice University, Houston, Texas 77005, USA. {emw4,taha}@rice.edu
Abstract. Previous work on semantics-based multi-stage programming (MSP) language design focused on homogeneous and heterogeneous software designs. In homogeneous software design, the source and the target software programming languages are the same. In heterogeneous software design, they are different software languages. This paper proposes a practical means of circuit design by providing specialized offshoring translations from subsets of the source software programming language to subsets of the target hardware description language (HDL). This approach avoids manually writing code to specify the circuit for a given algorithm. To illustrate the proposed approach, we design and implement a translation to a subset of Verilog suitable for numerical and logical computation. Through the translator, programmers can specify abstract algorithms in high level languages and automatically convert them into circuit descriptions in low level languages. Key words: Multi-Stage Programming, Offshoring Translation, Circuit Design, Verilog
1 Introduction
Multi-stage programming (MSP) languages allow the programmer to use abstraction mechanisms such as functions, objects, and modules. In homogeneous MSP language design [1], the source and the target software programming languages are the same. In heterogeneous design [2], they are different software languages. Previous work on implicitly heterogeneous multi-stage programming has implemented two target software languages, C and Fortran, in MetaOCaml. However, the conversion from software programming languages to hardware description languages (HDLs) is not supported.
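The flavour of such a translation can be conveyed by a toy deep embedding of expressions with a printer to Verilog-style syntax. This is a deliberately minimal sketch of the offshoring idea, with invented names; the paper's translator handles a far richer source and target subset:

data Exp = Lit Int | Inp String
         | Add Exp Exp | BAnd Exp Exp

toVerilog :: Exp -> String
toVerilog (Lit n)    = show n
toVerilog (Inp s)    = s
toVerilog (Add a b)  = "(" ++ toVerilog a ++ " + " ++ toVerilog b ++ ")"
toVerilog (BAnd a b) = "(" ++ toVerilog a ++ " & " ++ toVerilog b ++ ")"

-- toVerilog (Add (Inp "x") (Lit 1))  evaluates to  "(x + 1)"

The real work in an offshoring translation is restricting the source subset so that every accepted program maps to well-formed, synthesisable HDL.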
The Internals and Externals of Kansas Lava (Extended Abstract)
Andy Gill, Tristan Bull, Andrew Farmer, Garrin Kimmell, and Ed Komp
Information Technology and Telecommunication Center
Department of Electrical Engineering and Computer Science
The University of Kansas, 2335 Irving Hill Road, Lawrence, KS 66045
{andygill,tbull,anfarmer,kimmell,komp}@ittc.ku.edu
Abstract. In this extended abstract, we overview the design and implementation of our latest version of Kansas Lava. Driven by the needs of implementing telemetry circuits, we have made a number of recent improvements to both the external API and the internal representations used. We have retained our dual shallow/deep representation of signals in general, but now have a number of externally visible abstractions for combinatorial, sequential, and enabled signals. We introduce these abstractions, as well as our new abstractions for memory and memory updates. Internally, we found the need to represent unknown values inside our circuits, so we made aggressive use of type functions to lift our values in a principled and regular way. We discuss this design decision, how it unfortunately complicates the internals of Kansas Lava considerably, and how we mitigate this complexity.
1 Introduction
Kansas Lava is a modern implementation of a hardware description language that uses functions to express hardware components, and leverages the abstractions in Haskell to build complex circuits. Lava, the given name for a family of Haskell-based hardware description libraries, is an idiomatic way of expressing hardware in Haskell which allows for simulation and synthesis to hardware. In this paper, we explore the internal and external representation of a Signal in Kansas Lava, and how different representations of signal-like concepts work together in concert. We have been using Kansas Lava for about a year to aid in the development of high-performance, high-rate forward error correction codes, targeting a high-end FPGA development board. Guided by the experiences of writing solutions to various encoding and decoding algorithms, we have made a number of improvements and changes to our earlier design.
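The dual shallow/deep representation mentioned above can be caricatured as a pair: a stream of values for simulation next to a syntax tree for synthesis. The following simplified sketch conveys the idiom; it is not Kansas Lava's actual Signal type:

-- shallow component: the simulated sequence of values
-- deep component:    a netlist fragment, later rendered as VHDL
data Signal a = Signal [a] AST

data AST = Entity String [AST]   -- a named primitive applied to inputs
         | Pad String            -- an input port

and2 :: Signal Bool -> Signal Bool -> Signal Bool
and2 (Signal xs ex) (Signal ys ey) =
  Signal (zipWith (&&) xs ys) (Entity "and2" [ex, ey])

Every primitive updates both halves at once, so one circuit description can be run as Haskell or reified to a netlist.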
What’s the Matter with Kansas Lava?
Andrew Farmer, Garrin Kimmell, and Andy Gill
Information Technology and Telecommunication Center
Department of Electrical Engineering and Computer Science
The University of Kansas, 2335 Irving Hill Road, Lawrence, KS 66045
{anfarmer,kimmell,andygill}@ku.edu
Abstract. Kansas Lava is a functional hardware description language implemented in Haskell. In the course of attempting to generate ever larger circuits, we have found the need to effectively test and debug the internals of Kansas Lava. This includes confirming both the simulated behavior of the circuit and its hardware realization via generated VHDL. In this paper we share our approach to this problem, and discuss the results of these efforts.
1 Introduction

1.1 What is Kansas Lava?
Kansas Lava is an effort to create a modern implementation of the Lava design pattern that allows direct (Chalmers style) specification of circuits [1]. There are two concrete types in Kansas Lava: Seq and Comb, which represent sequential and combinatorial values, respectively. Combinatorial values exclude the notion of a clock, whereas sequential values encode a series of values over time. Both are instances of the Signal type class, over which most primitives are defined. This allows the user to write circuits that work on both types of input, pretending there is a single unified type, Signal. As an example:

halfAdder a b = (carry,sum)
  where carry = and2 a b
        sum   = xor2 a b

Notice that the halfAdder circuit we just defined can be used with both types of input values:

ghci> halfAdder (constComb True) (constComb True)
(T,F)
ghci> let x = toSeq $ cycle [False,False,True,True]
ghci> let y = toSeq $ cycle [False,True]
ghci> halfAdder x y
(F :~ F :~ F :~ T :~ ..., F :~ T :~ T :~ F :~ ...)
Functional Video Games in CS1
Marco T. Morazán
Seton Hall University, South Orange, NJ, USA. [email protected]
Abstract. Over the past decade enrollments in Computer Science programs have drastically dropped while demand for computer scientists in the job market has increased. The reason for this disconnect is, in part, due to the perception new potential students have of programming as a dull activity requiring endless hours of coding in front of a monitor, with little social interaction and no creativity. The question, then, is how we can capture the imagination of new students and pique their interest in a way that gets them excited, while at the same time giving them a solid foundation in computer programming and Computer Science. This position article describes the proposed solution that is being implemented at Seton Hall University using video game programming, a subset of the Scheme programming language, and Felleisen et al.’s textbook How to Design Programs. The article briefly describes the first-year programming curriculum and illustrates how to get students interested in programming through the development of a Space-Invaders-like game. Emphasis is placed on the use of a functional language as the language of choice for the first programming course.
1 Introduction
Over the past decade enrollments in Computer Science programs have drastically dropped, by up to 70% in some countries [10]. According to CRA’s most recent Taulbee Survey in the United States and Canada, the number of Computer Science and Computer Engineering newly declared majors has dropped from a high of around 24,000 in the year 2000 to under 14,000 in the year 2008 [11]. In addition, the production of Bachelor’s degrees dropped from a high of over 20,000 in 2002 to under 12,000 in 2009. The Taulbee Survey also suggests that retention rates need to be improved. For example, in 2004 there were about 16,000 newly declared majors and, four years later, in 2008 there were under 12,000 Bachelor’s degrees produced. This represents a retention rate under 75%. The drop in enrollment is occurring while demand for computer scientists in the job market increases. According to recent occupational employment projections for 2008-2018, computer and mathematical occupations are expected to grow by 22.2% [6]. This rate of growth is over twice as high as the average for all occupations. Among the fastest growing occupations are computer software engineers, with demand for application developers expected to increase by 34% and demand for systems software developers expected to increase by 30.4%. The data
Reasoning About DrScheme Programs in ACL2
Melissa Wiederrecht, Christopher MacLellan, and Ruben Gamboa
University of Wyoming, Department of Computer Science, Laramie, WY
Abstract. Beginning programmers need to learn more than the syntax of programming languages. They also need to learn how to reason about the programs they write. Thus we believe that beginners will benefit from tools that help them understand their programs, just as they already benefit from IDEs that help them to build and debug their programs. This paper describes a project aimed at automating some of the techniques required to reason about programs in Beginning Student Language (BSL), the first language in DrScheme’s How to Design Programs curriculum [4]. The automation is based on the theorem prover ACL2.
1 Introduction
Beginning programming students have a much larger job in front of them than mastering the syntax of their first programming language. These students need to learn how to think like programmers. But reasoning about programs involves sophisticated techniques from logic, which are usually at the levels of graduate students or advanced undergraduates, certainly not freshmen. So what can help them to fill the gap between a student’s first intuition as to how a program should be written and what is in fact a correct solution to a given problem? We propose that a mechanical theorem prover, written fresh, automated, extended, and embellished with pedagogical apparatus, could be used to provide students with a tool at their fingertips that would grant them instant feedback about the correctness of their programs and what they might possibly do to improve them. Writing theorem provers is hard! So the most convenient solution to this problem is to take an existing mechanical theorem prover, such as ACL2 [7], and modify it to our liking. However, existing theorem provers were designed for researchers, not for students. For example, a beginning student could probably manage to write factorial in a perfectly reasonable manner in Lisp like this:

(defun fact (n)
  (if (= n 0)
      1
      (* n (fact (- n 1)))))

A more forward-thinking student may write it in the following form instead:

(defun fact (n)
Efficient Bijective Gödel Numberings for Term Algebras
Paul Tarau
Department of Computer Science and Engineering, University of North Texas. [email protected]
Abstract. We introduce a Gödel numbering algorithm that encodes/decodes elements of a term algebra as unique natural numbers. In contrast with Gödel’s original encoding and various alternatives in the literature, our encoding has the following properties: a) it is bijective; b) natural numbers always decode to syntactically valid terms; c) it works in linear time in the bitsize of the representations; d) the bitsize of our encoding is within a constant factor of the syntactic representation of the input. The algorithm can be applied to derive compact serialized representations for various formal systems and programming language constructs. The paper is organized as a literate Haskell program available from http://logic.cse.unt.edu/tarau/research/2010/fgoedel.hs. Keywords: natural number encodings of terms, bijective Gödel numberings, computational mathematics in Haskell, ranking/unranking functions, bijective base-k encodings
1 Introduction
A ranking/unranking function defined on a data type is a bijection to/from the set of natural numbers (denoted N through the paper). When applied to formulas or proofs, ranking functions are usually called Gödel numberings as they originated in the arithmetization techniques used in the proof of Gödel’s incompleteness results [1, 2]. In Gödel’s original encoding [1], given that primitive operation and variable symbols in a formula are mapped to exponents of distinct prime numbers, factoring is required for decoding, which is therefore intractable for formulas of non-trivial size. As this mapping is not a surjection, there are also codes that do not decode to syntactically valid formulas. This key difference also applies to alternative Gödel numbering schemes (like Gödel’s beta-function), while ranking/unranking functions, as used in combinatorics, are bijective mappings. Besides codes associated to formulas, a wide diversity of common computer operations, ranging from data compression and serialization to data transmissions and cryptographic codes, are essentially bijective encodings between data types. They provide a variety of services ranging from free iterators and random objects to data compression and succinct representations. Tasks like serialization and persistence are facilitated by simplification of reading or writing operations without the need of special purpose parsers.
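As a small taste of the machinery involved, bijective base-2 numeration gives every natural number a unique digit string over {1,2}, so that, unlike ordinary binary, no two strings denote the same number and decoding can never fail. This standard construction (one building block such encodings often use, not the paper's full algorithm) is a few lines of Haskell:

-- natural number <-> digit list over {1,2}; every list is a valid code
toBij2 :: Integer -> [Integer]
toBij2 0 = []
toBij2 n = let (q, r) = (n - 1) `divMod` 2 in (r + 1) : toBij2 q

fromBij2 :: [Integer] -> Integer
fromBij2 = foldr (\d acc -> d + 2 * acc) 0

-- round trip: fromBij2 (toBij2 n) == n  for every n >= 0
-- e.g. map toBij2 [0..4]  ==  [[],[1],[2],[1,1],[2,1]]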
Using McErlang to Verify an Erlang Process Supervision Component∗
David Castro (1), Clara Benac Earle (2), Lars-Åke Fredlund (2), Victor M. Gulias (1), and Samuel Rivas (3)
1 MADS Group, Computer Science Department, University of A Coruña, Spain. {dcastrop,gulias}@udc.es
2 Babel Group, School of Computer Science, Universidad Politécnica de Madrid (UPM), Spain. {cbenac,lfredlund}@fi.upm.es
3 LambdaStream Servicios Interactivos S.L., Ronda de Outeiro 33 Entlo., A Coruña, Spain. [email protected]
Abstract. We present a case-study in which a tool for model checking programs written in Erlang, McErlang, was used to verify a complex concurrent component. The component is an alternative implementation of the standard supervisor behaviour of Erlang/OTP. This implementation, in use at the company LambdaStream, was checked against several safety and liveness properties. In one case, McErlang found an error.
1 Introduction
Developing reliable concurrent software is a hard task given the inherent nondeterministic nature of concurrent systems. A technique which is often used to check that a concurrent program fulfils a set of desirable properties is model checking [12]. In model checking, in theory, all the states of a concurrent system are systematically explored. Erlang [6] is a functional programming language that is used by several companies worldwide. One such company is LambdaStream, which is dedicated to improving their software development methodology, as shown by their participation in the European research project ProTest (http://www.protest-project.eu/). Thanks to this project, a fruitful collaboration has been established between LambdaStream, the University of A Coruña and the Universidad Politécnica de Madrid. One result of the collaboration is the verification, using the McErlang model checker, of a process supervision component developed by LambdaStream. Although the supervisor has been used in several products, and was well tested, we did find a discrepancy between its documentation and the implementation of the component.
∗ This work has been partially supported by the following projects: ProTest (FP7-ICT-2007-1 215868), DESAFIOS10 (TIN2009-14599-C03-00), and PROMETIDOS (P2009/TIC-1465).
Capturing Functions and Catching Satellites
Andy Gill and Garrin Kimmell
Information Technology and Telecommunication Center
Department of Electrical Engineering and Computer Science
The University of Kansas, 2335 Irving Hill Road, Lawrence, KS 66045
{andygill,kimmell}@ku.edu
Abstract. The 2009 ICFP programming contest problem required contestants to control virtual satellites that obey basic physical laws. The orbital physics behavior of the system was simulated via a binary provided to contestants which ran on top of a simple virtual machine. Contestants were required to implement the virtual machine along with a controller program to manipulate the satellites’ behavior. In this paper, we describe the modeling of the simulation environment, with a focus on the compilation and testing infrastructure for the generated binaries for this virtual machine. This infrastructure makes novel use of an implementation of a deeply embedded Domain Specific Language (DSL) within Haskell. In particular, with the use of IO-based observable sharing, it was straightforward for a function to be both an executable specification and a portable implementation.
1 Introduction
Organizing the ICFP contest presented a challenge for the students and faculty at the University of Kansas. As the Computer System Design Laboratory, we wanted to set a challenging controller-based problem, but how do we provide an interesting simulation environment for this controller that could be used with any possible computer language, and on any possible system? The environment must be interactive (take input, generate output), and perhaps contain and encode hidden challenges and puzzles. The architecture we chose was to provide a binary that encoded the executable specification of a model behavior, and require contestants to write a small virtual machine for this binary. This solution raised the issue of how we should write our implementation of the simulation, and how we should generate the reference binary. Rather than write an assembler or compile from a high level language, we chose to experiment with compiling a high-level specification, via a small custom Domain Specific Language (DSL) written for this specific problem. In particular, we used the recently developed IO-based observable sharing [6] and aggressive use of Haskell overloading to enable the sharing of a purely functional model of satellite behavior for both testing and code generation.
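The "one function, two uses" idea can be conveyed with a tiny overloading-based sketch. This is our own illustration in finally-tagless style; the contest infrastructure itself used a deep DSL with IO-based observable sharing:

class Expr r where
  lit :: Double -> r
  add :: r -> r -> r

-- interpretation 1: run the specification directly
instance Expr Double where
  lit = id
  add = (+)

-- interpretation 2: reify the same definition as code (here, a string)
newtype Code = Code String
instance Expr Code where
  lit n = Code (show n)
  add (Code a) (Code b) = Code ("(" ++ a ++ " + " ++ b ++ ")")

-- one definition serves as both test oracle and code generator
spec :: Expr r => r
spec = add (lit 1) (add (lit 2) (lit 3))

Observable sharing plays the role that the Code instance cannot: it lets reification recover where the Haskell-level definition shared subterms, so the generated binary does not duplicate work.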
Functional Programmers: Get Them When They are Young
David P. Miller
School of AME & School of CS, University of Oklahoma. [email protected]
http://faculty-staff.ou.edu/M/David.P.Miller-1/
Abstract. With a renewed emphasis on technology education, schools and government agencies are using kid-attractive activities such as robotics competitions to draw middle and high school students to STEM-oriented majors and careers in general, and computer science and programming in particular. These activities tend to use simplified imperative programming languages that emphasize changes to state, loops and jumps as the primary programming techniques. This paper explains why these languages are used and where they came from. The paper also describes the Botball® program, a software-oriented activity for middle and high school students, and explores what characteristics a functional programming language would need in order to be adopted in Botball and similar programs. Keywords: beginning languages, STEM education, robotics
1 Introduction
It is widely believed that the first programming language one learns influences programming habits and styles for life. That helps account for the frequent, heated language battles over CS-1, though those decisions are often made as much for reasons of the students’ future employability as for pedagogy [6]. As part of a general STEM education push, NSF and DARPA are supporting educational programs involving robots and video games as a way to draw younger students into programming. This is occurring both at the K-12 and undergraduate levels [21]. Unfortunately, the programming languages most widely supported by these funding agencies for this purpose are not only imperative languages, but often ones that lack basic capabilities such as recursion, eliminating even the illustration of many good programming practices – and making the subsequent adoption of those practices by these students all the more difficult in later years. Many existing functional programming languages (e.g., Haskell, Yampa [9], GRL [8], LISP and Scheme) have been used to control robots. However, none of these are set up for novice programmers, and more importantly, none are set up for easy integration into public school IT environments whose security restrictions could put most DoD contractors to shame.
Testing with Functional Reference Implementations
Pieter Koopman and Rinus Plasmeijer
Institute for Computing and Information Sciences (ICIS), Radboud University Nijmegen, the Netherlands
{pieter,rinus}@cs.ru.nl
Abstract. This paper discusses our approach to testing programs that determine which candidates are elected in the Scottish Single Transferable Vote (STV) elections. Due to the lack of testable properties of STV elections, we implemented a reference implementation in a pure functional programming language. Our tests revealed issues in the law regulating these elections as well as in the programs implementing the rules. Functional programming languages turned out to be an excellent tool for building reference implementations.
1 Introduction
We were recently asked, for the second time, to test election software to be used in Scottish local elections with a specific implementation of a Single Transferable Vote (STV) system [7, 8]. In such an STV system each voter can indicate a sequence of candidates on her voting ballot. When a candidate does not need the vote fully to be elected, or is eliminated in the voting process, the vote is transferred (partly) to the next candidate on this ballot. The exact rules to be followed are specified very operationally in the law for these elections. The law states how a human being should determine the result of the election by sorting ballots and transferring ballots from one pile to another pile with a specific weight. See Section 2 for more details. For a real election there are a large number of ballots, and the STV system can need a large number of stages (we have seen numbers up to 100) to decide which candidates are elected. To compute the election results fast and accurately it is necessary to use a computer program that implements the law and determines which candidates are elected based on all given ballots. Since such a program determines which candidates will be elected, it had better be correct. Since there is no way to check the election results, one has to trust such a program. In order to improve confidence in this software we were asked to perform black-box tests of this system with a given test suite. As authors of the model-based test tool G∀st [4, 5] we initially planned to state a sufficiently strong set of logical properties and test these properties with G∀st. G∀st is a model-based test system that automatically tries to find behaviour of the implementation under test (iut) that is not allowed by the specification. A
Every Animation Should Have a Beginning, a Middle, and an End
Kevin Matlage and Andy Gill
Information Technology and Telecommunication Center
Department of Electrical Engineering and Computer Science
The University of Kansas, 2335 Irving Hill Road, Lawrence, KS 66045
{kmatlage,andygill}@ku.edu
Abstract. Animations are sequences of still images chained together to tell a story. Every story should have a beginning, a middle, and an end. We argue that this advice leads to a simple and useful idiom for creating an animation Domain Specific Language (DSL). We introduce our animation DSL, and show how it captures the concept of beginning, middle, and end inside a Haskell applicative functor we call Active. We have an implementation of our DSL inside the image generation accelerator, ChalkBoard, and we use our DSL on an extended example, animating a visual demonstration of the Pythagorean Theorem.
1 Introduction
Our earliest attempts at animation using ChalkBoard [1], our image generation accelerator, started off pretty rudimentary. This paper describes our endeavor to construct a useful abstraction for these ChalkBoard animations, as well as animations in general. Our early attempts revolved around trying to create images that were dependent on arguments passed to the function that created them. We could then use a simple loop, calling the function repeatedly with slightly changing arguments in order to create an animation. While this solution did create animations successfully, it was by no means a very sophisticated one. It required an argument for each aspect of an image that we wanted to change (or reuse of the same argument), and so a sequence of values needed to be generated for each argument every time an animation was created. This solution was also very intertwined with each specific animation that was created. Every animation had to be built from scratch, without much in the way of reusable animation code. This lack of abstraction seemed unnecessary, as the drawing functions and how to use those functions over time are two inherently different things. At least the basics of an animation language seemed like they should be abstracted from the image creation language, even if more complex combinators needed to be created in order to accomplish certain tasks that are repeated often. We therefore wanted an animation language that would allow for this abstraction, but that would also lend itself well towards creating useful combinators
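One plausible shape for such an abstraction, sketched here as the general idiom rather than the Active type the paper goes on to define, pairs a duration with a time-indexed value; the Applicative instance then runs animations in parallel, which is exactly the kind of reusable combinator the loop-based approach lacked:

-- an animation: how long it lasts, and its value at each time
data Active a = Active Rational (Rational -> a)

instance Functor Active where
  fmap f (Active d g) = Active d (f . g)

instance Applicative Active where
  pure a = Active 0 (const a)
  Active d1 f <*> Active d2 g = Active (max d1 d2) (\t -> f t (g t))

Sequencing (the beginning, middle, and end of the title) would be one more combinator that offsets the second animation's time by the first one's duration.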