Electronic Notes in Theoretical Computer Science (Preliminary Versions)
15th Workshop on Functional and (Constraint) Logic Programming (WFLP'06)
Madrid, Spain, November 16-17, 2006
Guest Editor: Francisco J. López Fraguas
Contents

Preface . . . v

S. Escobar, J. Meseguer (Invited Speaker) and P. Thati
Narrowing and Rewriting Logic: from Foundations to Applications . . . 1

P. Padawitz (Invited Speaker)
Expander2: Program verification between interaction and automation . . . 29

M. Hanus
Reporting Failures in Functional Logic Programs . . . 49

R. Caballero, C. Hermanns and H. Kuchen
Algorithmic Debugging of Java Programs . . . 63

B. Brassel
A Framework for Interpreting Traces of Functional Logic Computations . . . 77

P. H. Sadeghi and F. Huch
The Interactive Curry Observation Debugger COOiSY . . . 93

D. Cheda, J. Silva and G. Vidal
Static Slicing of Rewrite Systems . . . 109

C. Ochoa and G. Puebla
A Study on the Practicality of Poly-Controlled Partial Evaluation . . . 123

R. Caballero and Y. García
Implementing Dynamic-Cut in Toy . . . 137

R. Berghammer and S. Fischer
Implementing Relational Specifications in a Constraint Functional Logic Language . . . 153

S. Fischer
Lazy Database Access with Persistent Predicates . . . 167

C. M. Segura and C. Torrano
Using Template Haskell for Abstract Interpretation . . . 183

V. Nogueira and S. Abreu
Temporal Contextual Logic Programming . . . 197

S. Estévez, A. J. Fernández, T. Hortalá, M. Rodríguez and R. del Vado Vírseda
A Fully Sound Goal Solving Calculus for the Cooperation of Solvers in the CFLP Scheme . . . 211

R. González del Campo and F. Sáenz
Programmed Search in a Timetabling Problem over Finite Domains . . . 227

E. J. Gallego, J. Mariño and J. M. Rey
Disequality Constraints in Sloth . . . 241
Preface

This volume contains preliminary versions of the papers presented at the 15th Workshop on Functional and (Constraint) Logic Programming (WFLP'06), which was held in Madrid, Spain, on November 16-17, 2006. The definitive version of the proceedings will appear as an issue of the series Electronic Notes in Theoretical Computer Science (ENTCS, Elsevier, http://www.elsevier.nl/locate/entcs).

The aim of the workshop is to bring together researchers sharing a common interest in functional programming and (constraint) logic programming, as well as their integration. It promotes a cross-fertilizing exchange of ideas and experiences among researchers and students from the different communities interested in the foundations, applications, and combinations of high-level, declarative programming languages and related areas. Previous WFLP editions have been organized in Tallinn (2005), Aachen (2004), Valencia (2003), Grado (2002), Kiel (2001), Benicassim (2000), Grenoble (1999), Bad Honnef (1998), Schwarzenberg (1997, 1995, 1994), Marburg (1996), Rattenberg (1993), and Karlsruhe (1992).

On this occasion 18 papers were submitted to WFLP'06. After a careful review process, with at least three reviews for each paper and a subsequent in-depth discussion, the Program Committee selected 14 papers for presentation at the workshop. In addition to the regular papers, the scientific program of WFLP'06 also included two invited talks by José Meseguer (University of Illinois at Urbana-Champaign) and Peter Padawitz (University of Dortmund).

I would like to thank all the people who contributed to WFLP'06: the contributing authors; the PC members and the additional reviewers for their great effort in the review process; the invited speakers for their willingness to attend WFLP'06 and to prepare complete versions of their talks; the Organizing Committee for their continuous help; and, last but not least, the sponsoring institutions for their financial and logistic support.

The next WFLP will be held in Paris. This will be an excellent opportunity for the WFLP community to meet again and, as Henri IV said some centuries ago, 'Paris vaut bien une messe' ('Paris is well worth a mass').
Madrid, 20 October 2006
Francisco J. López Fraguas
Program Committee

Sergio Antoy (Portland State University, USA)
Rafael Caballero (Universidad Complutense de Madrid, Spain)
Agostino Dovier (Università di Udine, Italy)
Rachid Echahed (Institut IMAG, France)
Santiago Escobar (Universidad Politécnica de Valencia, Spain)
Moreno Falaschi (Università di Siena, Italy)
Michael Hanus (Christian-Albrechts-Universität zu Kiel, Germany)
Frank Huch (Christian-Albrechts-Universität zu Kiel, Germany)
Tetsuo Ida (University of Tsukuba, Japan)
Herbert Kuchen (Westfälische Wilhelms-Universität Münster, Germany)
Francisco J. López-Fraguas, chair (Universidad Complutense de Madrid, Spain)
Wolfgang Lux (Westfälische Wilhelms-Universität Münster, Germany)
Mircea Marin (University of Tsukuba, Japan)
Julio Mariño (Universidad Politécnica de Madrid, Spain)
Juan J. Moreno-Navarro (Universidad Politécnica de Madrid, Spain)
Germán Vidal (Universidad Politécnica de Valencia, Spain)
Additional reviewers

Marco Comini, Lars-Åke Fredlund, Emilio Gallego, Sebastian Fischer, Evelina Lamma, Ginés Moreno, Carla Piazza, Gianfranco Rossi, Josep Silva, Alicia Villanueva

Organizing Committee

Rafael Caballero, Sonia Estévez, Isabel Pita, Juan Rodríguez, Carlos Romero, Jaime Sánchez, Rafael del Vado
Sponsoring Institutions

Ministerio de Educación y Ciencia: Grants TIN2005-09207-C03-03 'MERIT-FORMS' and TIN2006-26891-E
Comunidad de Madrid: Grant S-0505/TIC/0407 'PROMESAS-CAM'
Universidad Complutense de Madrid
Vicerrectorado de Investigación
Facultad de Informática
Departamento de Sistemas Informáticos y Computación
WFLP 2006
Narrowing and Rewriting Logic: from Foundations to Applications

Santiago Escobar (Universidad Politécnica de Valencia, Spain)
José Meseguer (University of Illinois at Urbana-Champaign, USA)
Prasanna Thati (Carnegie-Mellon University, USA)
Abstract Narrowing was originally introduced to solve equational E-unification problems. It has also been recognized as a key mechanism to unify functional and logic programming. In both cases, narrowing supports equational reasoning and assumes confluent equations. The main goal of this work is to show that narrowing can be greatly generalized, so as to support a much wider range of applications, when it is performed with rewrite theories (Σ, E, R), where (Σ, E) is an equational theory, and R is a collection of rewrite rules with no restrictions. Such theories axiomatize concurrent systems, whose states are equivalence classes of terms modulo E, and whose transitions are specified by R. In this context, narrowing is generalized from an equational reasoning technique to a symbolic model checking technique for reachability analysis of a, typically infinite, concurrent system. We survey the foundations of this approach, suitable narrowing strategies, and various applications to security protocol verification, theorem proving, and programming languages. Keywords: Narrowing, Rewriting Logic, Maude, Reachability, Equational Reasoning, Security protocols
1 Introduction

1.1 Why Rewriting Logic
Logic programming is a parametric idea: it is parameterized by the computational logic one chooses as the basis of one's programming language [42]. The more expressive the logic, the wider the range of applications one can naturally support without having to corrupt the language's declarative semantics. This poses the interesting challenge of finding more expressive computational logics without losing good efficiency; that is, without falling into the Turing tar pits of general theorem proving. Rewriting logic [43] is a computational logic that can be efficiently implemented and that widens quite substantially the range of applications naturally supported by
declarative programming. It generalizes both equational logic and Horn logic, and furthermore supports a declarative programming style for object-oriented systems and for general distributed programming [44]. In fact, it is a very general logical and semantic framework, in which a wide range of logics and models of computation can be faithfully represented [40]. For the purposes of this paper it may be enough to sketch out two ideas: (i) how rewriting logic combines equational logic and traditional term rewriting; and (ii) what the intuitive meaning of a rewrite theory is all about.
A rewrite theory is a triple R = (Σ, E, R) with Σ a signature of function symbols, E a set of Σ-equations of the form t = t′, and R a set of Σ-rewrite rules of the form l → r (in general, rewrite rules can be conditional [43], but we treat here the simpler, unconditional case). Therefore, the logic's atomic sentences are of two kinds: equations, and rewrite rules. Equational theories and traditional term rewriting systems then appear as special cases. An equational theory (Σ, E) can be faithfully represented as the rewrite theory (Σ, E, ∅); and a term rewriting system (Σ, R) can likewise be faithfully represented as the rewrite theory (Σ, ∅, R).
Of course, if the equations of an equational theory (Σ, E) are confluent, there is another useful representation, namely, as the rewrite theory (Σ̃, ∅, R_E), where Σ̃ = Σ ∪ {≈, true} and R_E = E⃗ ∪ {x ≈ x → true}, with E⃗ the rewrite rules obtained by orienting the equations E. By confluence we then have the equivalence:

    (Σ, E) ⊢ t = t′   ⇔   (Σ̃, ∅, R_E) ⊢ t ≈ t′ →∗ true
Much work in rewriting techniques and in functional logic programming has traditionally centered around this equivalence. But by implicitly suggesting that rewrite rules are just an efficient technique for equational reasoning, this equivalence can easily prevent us from seeing that rewrite rules can have a much more general non-equational semantics. This is the whole raison d'être of rewriting logic. In rewriting logic a rewrite theory has two complementary readings: one computational, and the other logical. Computationally, a rewrite theory R = (Σ, E, R) axiomatizes a concurrent system, whose states are E-equivalence classes, and whose atomic transitions are specified by the rules R. Logically, R axiomatizes a logical inference system, whose formulas are Σ-expressions satisfying structural axioms E, and whose inference rules are precisely the rules in R. The inference system of rewriting logic [43] then allows us to answer the same question in two complementary readings: (i) can we reach state [t′]E from state [t]E? and (ii) can we derive formula [t′]E from formula [t]E?
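To give a concrete feel for the computational reading, the following Maude system module is an illustrative sketch (the module, sort, operator, and rule names are ours, not taken from [43] or [44]): its states are multisets of coins and items, the structural axioms E (associativity, commutativity, and identity of multiset union) are declared as operator attributes, and the rules R specify the atomic transitions.

mod VENDING-MACHINE is
  sorts Coin Item State .
  subsorts Coin Item < State .
  op null : -> State .
  op __ : State State -> State [assoc comm id: null] .  --- structural axioms E
  ops dollar quarter : -> Coin .
  ops apple cake : -> Item .
  rl [buy-cake]  : dollar => cake .                      --- transition rules R
  rl [buy-apple] : dollar => apple quarter .
  rl [change]    : quarter quarter quarter quarter => dollar .
endm

A reachability question, such as whether two apples can be obtained from two dollars, can then be asked with the command search dollar dollar =>* apple apple S:State ., which explores the transitions modulo the multiset axioms.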
1.2 Narrowing as Symbolic Reachability Analysis
Of course, questions (i) and (ii) above are the same question, namely, the reachability question. Rewriting logic gives us a complete inference system [43] to derive, for a given rewrite theory R, all valid universally quantified reachability formulas (∀x⃗) t →∗ t′. But an equally important problem is being able to derive all valid existentially quantified reachability formulas (∃x⃗) t →∗ t′. Why is answering such existential reachability questions important? Because if we could, we would have
a very powerful symbolic model checking technique, not in the limited BDD-based finite-state sense, but in the much more general sense of model checking infinite state systems. Indeed, in the formula (∃x⃗) t →∗ t′, t represents a typically infinite set of initial states, and t′ represents a typically infinite set of target states. The model checking question is then whether from some initial state in t we can reach some state in t′. For example, t′ may describe a set of attacks on a security protocol; then such attacks exist iff (∃x⃗) t →∗ t′ is valid.
Here is where narrowing comes in. Proposed originally as a method to solve equational goals (∃x⃗) t = t′ [23,33,35], it was soon recognized as a key mechanism to unify functional and logic programming [26,29]. But even in that original setting we can reinterpret narrowing as a technique to answer reachability questions. That is, narrowing allows us to convert the question (∃x⃗) t = t′ in (Σ, E) into the reachability question (∃x⃗) t ≈ t′ →∗ true in the rewrite theory (Σ̃, ∅, R_E). But the converse is definitely not true: when we interpret rewrite rules as transitions in a system, reachability questions do not have an equational counterpart: we may be able to reach a state b from a state a, but it may be impossible to return to a from b. That is, reachability is definitely not symmetric. The whole point of a rewrite theory (Σ, E, R) is to distinguish equality between states, axiomatized by E, and reachability between states, axiomatized by R, as totally different relations.
The goal, then, is to generalize narrowing from an equation solving technique for confluent equational theories (Σ, E) to a symbolic reachability analysis technique for arbitrary rewrite theories (Σ, E, R), whose rules R may typically fail to be confluent, and may often not terminate. In this way, we obtain a useful new technique, first suggested in [44,14], to model check infinite state systems. In this sense, narrowing complements other existing techniques for analyzing such systems, including model checking for suitable subclasses, e.g., [7,9,17,24], abstraction techniques, e.g., [10,38,28,36,54], tree-automata based reachability analyses, e.g., [25,50], and theorem proving, e.g., [52,51].
Note that narrowing now has to happen modulo the equations E. A particularly nice situation, on which we focus, is when the equations E decompose as a disjoint union E = ∆ ⊎ B, with B having a finitary unification algorithm, and with ∆ confluent and terminating modulo B. Under appropriate coherence [60] conditions discussed in Section 3, the theory (Σ, E, R) becomes semantically equivalent to the much more tractable theory (Σ, B, ∆⃗ ∪ R).
In the fullest possible generality, narrowing is a sound but not necessarily complete method to solve reachability goals (∀x⃗) t →∗ t′. However, in Sections 5 and 6 we show that: (i) it is complete in the weaker sense of finding all normalized solutions; (ii) it is complete in the strong sense for wide classes of theories of practical interest; and (iii) completeness in the solvability sense can be recovered for arbitrary rewrite systems using back-and-forth narrowing. Efficiency of narrowing by means of clever strategies is another important concern. In Section 6.2 we report on two research directions we have been advancing in this area.
On the one hand, our goal has been to generalize to arbitrary rewrite systems the extension from lazy rewriting strategies [55,4,5] to lazy narrowing strategies for functional logic programming provided by Antoy, Echahed, and Hanus with their needed narrowing [6,5]. This is an optimal demand-driven strategy that
lazily narrows only those outermost positions that are strictly necessary, while also generating optimal unifiers. Needed narrowing was improved, via a more refined notion of demandedness, by the natural narrowing strategy proposed by S. Escobar [18,19]. However, these lazy narrowing strategies are complete only under certain strong assumptions, such as that the rewrite rules are left-linear and constructor-based. These assumptions, while reasonable in a functional (logic) setting, are quite restrictive in the more general reachability setting we are interested in. In recent work [21], we have proposed a generalization of natural narrowing to a reachability setting where the rewrite rules can be non-left-linear and non-constructor-based. This generalization is strictly more efficient than needed narrowing when specialized to the functional (logic) setting, and it is complete in the weak sense that it is guaranteed to find all R-normalized solutions. On the other hand, a second, quite different strategy idea, first suggested by C. Meadows [41], centers upon using term grammars to drastically cut down the narrowing search space. Although we illustrate this technique in the context of security protocol verification, in which it first arose and where we are further extending it in collaboration with Meadows [20], we believe that it will have a much wider applicability in practice to general narrowing-based symbolic model checking.
1.3 From Foundations to Applications
As already mentioned, the whole point of having a more general computational logic is to support a wider range of applications. In Section 7 we try to give a flavor for several new applications that our proposed generalization of narrowing to rewrite theories makes possible, including: (i) new security protocol verification methods; (ii) more efficient theorem proving techniques; and (iii) more expressive and efficient programming language techniques.
2 Background
We assume some familiarity with term rewriting and narrowing; see [57,43] for missing definitions. Given a binary relation ⇒ ⊆ T × T on a set T of elements, we say that an element a ∈ T is ⇒-irreducible (or is a normal form w.r.t. ⇒) if there is no element b ∈ T such that a ⇒ b. We denote the transitive closure of ⇒ by ⇒+, and the transitive and reflexive closure by ⇒∗. We say that the relation ⇒ is terminating if there is no infinite sequence a1 ⇒ a2 ⇒ · · ·. We say that ⇒ is confluent if whenever a ⇒∗ b and a ⇒∗ c, there exists an element d such that b ⇒∗ d and c ⇒∗ d. We say that ⇒ is convergent if it is confluent and terminating.
An order-sorted signature Σ is defined by a set of sorts S, a partial order relation of subsort inclusion ≤ on S, and an (S∗ × S)-indexed family of operations {Σw,s}(w,s)∈S∗×S. We denote f ∈ Σw,s by f : w → s. We define a relation ≡ on S as the smallest equivalence relation generated by the subsort inclusion relation ≤. We assume that each equivalence class of sorts contains a top sort that is a supersort of every other sort in the class. Formally, for each sort s we assume that there is a sort [s] such that s ≡ s′ implies s′ ≤ [s]. (In the order-sorted specifications discussed in this paper we will sometimes leave this top sort and its associated operators implicit, in the sense that an order-sorted signature can always be conservatively completed to one satisfying our requirements.) Furthermore, for each f : s1 × . . . × sn → s
we assume that there is also an f : [s1 ] × . . . × [sn ] → [s]. We require the signature Σ to be sensible, i.e., whenever we have f : w → s and f : w0 → s0 with w, w0 of equal length, then w ≡ w0 implies s ≡ s0 . A Σ-algebra is defined by an S-indexed family of sets A = {As }s∈S such that s ≤ s0 implies As ⊆ As0 , and for each function f : w → s with w = s1 × . . . × sn a function fAw,s : As1 × . . . × Asn → As . Further, we require that subsort overloaded operations agree, i.e., for each f : w → s and (a1 , . . . , an ) ∈ Aw we require fAw,s (a1 , . . . , an ) = fA[w],[s] (a1 , . . . , an ), where if w = s1 × . . . × sn , then [w] = [s1 ] × . . . × [sn ]. We assume a family X = {Xs }s∈S of infinite sets of variables such that s 6= s0 implies Xs ∩ Xs0 = ∅, and all variables in X are different from any constant symbols in Σ. We use uppercase letters X, Y, W, . . . to denote variables in X . We denote the set of ground Σ-terms and Σ-terms of sort s by TΣs and TΣ (X )s , respectively. More generally, we write TΣ for the Σ-algebra of ground terms over Σ, and TΣ (X ) for the Σ-algebra of terms with variables from X . We use lowercase letters t, s, u, v, w, . . . to denote terms in TΣ (X ). Var (t) denotes the set of variables in t ∈ TΣ (X ). A term is linear if each variable in the term occurs at a single position. We denote the linearized version of a term t by t. We use a finite sequence of positive integers, called a position, to denote an access path in a term. We use lowercase letters p, q to denote positions in a term. For t ∈ TΣ (X ), Pos(t) denotes the set of positions in t, and Pos Σ (t) denotes the set of non-variable positions in t. Given a position p and a set P of positions, we define p.P = {p.q | q ∈ P } and just write p.q for p.{q}. The root of a term is at the empty position . The subterm of t at position p is denoted by t|p and t[s]p is the term t with the subterm at position p replaced by s. A substitution is an S-sorted mapping σ : X → TΣ (X ) which maps a variable of sort s to a term of sort s, and which is different from the identity only for a finite subset Dom(σ) of X . A substitution σ with Dom(σ) = {X1 , . . . , Xn } is usually denoted as σ = [t1 /X1 , . . . , tn /Xn ]. The identity substitution is denoted by id, i.e., Dom(id) = ∅. We denote the homomorphic extension of σ to TΣ (X ) also by σ. The set of variables introduced by σ is Ran(σ) = ∪X∈Dom(σ) Var (σ(X)). A substitution σ is called a renaming if it is a bijective mapping of variables to new variables that preserves the sorts strictly, i.e., for each X ∈ Xs , σ(X) ∈ (Xs \ Dom(σ)) and σ(X) 6= σ(Y ) for any two different variables X, Y ∈ Dom(σ). A term t is called a renamed version of another term s if there is a renaming σ such that t = σ(s). The restriction of a substitution σ to a set of variables V is defined as σ|V (X) = σ(X) if X ∈ V ; and σ|V (X) = X otherwise. For substitutions σ, ρ such that Dom(σ) ∩ Dom(ρ) = ∅ we define their composition as (σ ◦ ρ)(X) = ρ(σ(X)) for each variable X in X . We say that a substitution σ is away from a set of variables V if Ran(σ) ∩ V = ∅. A Σ-equation is an expression of the form t = t0 , where t, t0 ∈ TΣ (X )s for an appropriate sort s. 
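As a small worked illustration of this notation, take an illustrative signature with binary symbols f and g, a unary symbol h, and constants a, b (none of these come from the paper). For t = f(g(a, X), b) we have Pos(t) = {ε, 1, 2, 1.1, 1.2} and PosΣ(t) = {ε, 1, 2, 1.1}; moreover t|1 = g(a, X) and t[b]1.2 = f(g(a, b), b). For the substitution σ = [h(Y)/X] we have Dom(σ) = {X}, Ran(σ) = {Y}, and σ(t) = f(g(a, h(Y)), b); composing with ρ = [a/Y] gives (σ ◦ ρ)(X) = ρ(σ(X)) = h(a).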
Order-sorted equational logic has a sound and complete inference system E ⊢Σ (see [45]) inducing a congruence relation =E on terms t, t′ ∈ TΣ(X): t =E t′ if and only if E ⊢Σ t = t′; where, under the assumption that all sorts S in Σ are non-empty, i.e., ∀s ∈ S : TΣs ≠ ∅, the inference system E ⊢Σ can treat universal
quantification in an implicit way. The E-subsumption preorder E holds between t, t0 ∈ TΣ (X ), denoted t E t0 (meaning that t0 is more general than t), if there is a substitution σ such that t =E σ(t0 ); such a substitution σ is said to be an E-match from t to t0 . For substitutions σ, ρ and a set of variables V we define σ|V =E ρ|V if σ(x) =E ρ(x) for all x ∈ V , and σ|V E ρ|V if there is a substitution η such that σ|V =E (ρ ◦ η)|V . We write e e0 when E is empty, i.e., for e ∅ e0 . An E-unifier for a Σ-equation t = t0 is a substitution σ such that σ(t) =E σ(t0 ). For Var (t) ∪ Var (t0 ) ⊆ W , a set of substitutions CSUE (t = t0 , W ) is said to be a complete set of unifiers of the equation t =E t0 away from W if: (i) each σ ∈ CSUE (t = t0 , W ) is an E-unifier of t =E t0 ; (ii) for any E-unifier ρ of t =E t0 there is a σ ∈ CSUE (t = t0 , W ) such that ρ|V E σ|V and V = Var (t) ∪ Var (t0 ); (iii) for all σ ∈ CSUE (t = t0 , W ), Dom(σ) ⊆ (Var (t) ∪ Var (t0 )) and Ran(σ) ∩ W = ∅. An E-unification algorithm is complete if for any equation t = t0 it generates a complete set of E-unifiers. Note that this set needs not be finite. A unification algorithm is said to be finitary and complete if it always terminates after generating a finite and complete set of solutions. A rewrite rule is an expression of the form l → r, where l, r ∈ TΣ (X )s for an appropriate sort s. The term l (resp. r) is called the left-hand side (resp. right-hand side) of the rule l → r. In this paper we allow extra variables in righthand sides, i.e., we do not impose the usual condition Var (r) ⊆ Var (l). We will make explicit when extra variables are not allowed. An (unconditional) order-sorted rewrite theory is a triple R = (Σ, E, R) with Σ an order-sorted signature, E a set of Σ-equations, and R a set of rewrite rules. We write R−1 for the reversed rules of R, i.e., R−1 = {r → l | l → r ∈ R}. We call R linear (resp. left-linear, right-linear) if for each rule l → r ∈ R, l and r are linear (resp. l is linear, r is linear). Given R = (Σ, ∅, R), we might assume that Σ is defined as the disjoint union Σ = C ] D of symbols c ∈ C, called constructors, and symbols f ∈ D, called defined symbols, where D = {root(l) | l → r ∈ R} and C = Σ − D. A pattern is a term f (l1 , . . . , lk ) where f ∈ D and li ∈ TC (X ), for 1 ≤ i ≤ k. A rewrite system R = (C ] D, ∅, R) is constructor-based if every left-hand side of a rule in R is a pattern. p We define the one-step rewrite relation →R on TΣ (X ) as follows: t →R t0 (or →R if p is not relevant) if there is a position p ∈ Pos Σ (t), a (possibly renamed) rule l → r in R such that Var (t) ∩ (Var (r) ∪ Var (l)) = ∅, and a substitution σ such that t|p = σ(l) and t0 = t[σ(r)]p . Note that during a rewrite step extra variables in the right-hand side of the rewrite rule being used may be automatically instantiated with arbitrary substitutions. The relation →R/E for rewriting modulo E is defined as =E ◦ →R ◦ =E , i.e., t →R/E t0 if there are w, w0 ∈ TΣ (X ) such that t =E w, w →R w0 , and w0 =E t0 . Note that →R/E induces a relation on E-equivalence classes, namely, [t]E →R/E [t0 ]E iff t →R/E t0 . We say R = (Σ, E, R) is terminating (resp. confluent, convergent) if the relation →R/E is terminating (resp. confluent, convergent). For substitutions σ, ρ and a set of variables V we define σ|V →R ρ|V if there is X ∈ V such that σ(X) →R ρ(X) and for all other Y ∈ V we have σ(Y ) = ρ(Y ). The relation →R/E on substitutions is defined as =E ◦ →R ◦ =E . 
A substitution σ is called R/E-normalized if σ(X) is →R/E-irreducible for all X.
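As a small illustration of the difference between →R and →R/E (an illustrative Maude sketch, not taken from the paper): with E consisting only of commutativity of f, the term f(a, b) is not →R-reducible with the rule f(b, X) → X, but it is →R/E-reducible, since f(a, b) =E f(b, a) →R a. In Maude, declaring the axiom as an equational attribute makes rules apply modulo it:

mod REWRITING-MODULO is
  sort S .
  ops a b : -> S .
  op f : S S -> S [comm] .   --- the axiom set E: commutativity of f
  var X : S .
  rl f(b, X) => X .          --- a rule in R
endm

Here the command rewrite f(a, b) . yields a, because matching is performed modulo the comm attribute.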
3 Narrowing
Since E-congruence classes can be infinite, →R/E-reducibility is undecidable in general. Therefore, we "implement" R/E-rewriting by a combination of rewriting using oriented equations and rules. We assume that E is split into a set of (oriented) equations ∆ and a set of axioms B, i.e., E = ∆ ⊎ B, and is such that:
(i) B is regular, i.e., for each t = t′ in B, we have Var(t) = Var(t′), and sort-preserving, i.e., for each substitution σ, we have σ(t) ∈ TΣ(X)s if and only if σ(t′) ∈ TΣ(X)s.
(ii) B has a finite and complete unification algorithm and ∆ ∪ B has a complete (but not necessarily finite) unification algorithm.
(iii) For each t = t′ in ∆ we have Var(t′) ⊆ Var(t).
(iv) ∆ is sort-decreasing, i.e., for each t = t′ in ∆, each s ∈ S, and each substitution σ, σ(t′) ∈ TΣ(X)s implies σ(t) ∈ TΣ(X)s.
(v) The rewrite rules ∆⃗ obtained by orienting the equations in ∆ are confluent and terminating modulo B, i.e., the relation →∆⃗/B is convergent modulo B.
Definition 3.1 (R ∪ ∆, B-rewriting) We define the relation →R,B on TΣ(X) as t →R,B t′ if there is a p ∈ PosΣ(t), l → r in R such that Var(t) ∩ (Var(r) ∪ Var(l)) = ∅, and a substitution σ such that t|p =B σ(l) and t′ = t[σ(r)]p. The relation →∆,B is similarly defined by considering the oriented rewrite rules ∆⃗ obtained from the equations in ∆. We define →R∪∆,B as →R,B ∪ →∆,B.
Note that, since B-matching is decidable, →∆,B, →R,B, and →R∪∆,B are decidable. R ∪ ∆, B-normalized (and similarly R, B- or ∆, B-normalized) substitutions are defined in a straightforward manner. The idea is to implement →R/E (on terms and goals) using →R∪∆,B. For this to work, we need the following additional assumptions.
(vi) We assume that →∆,B is coherent with B [35], i.e., ∀t1, t2, t3 we have that t1 →+∆,B t2 and t1 =B t3 imply ∃t4, t5 such that t2 →∗∆,B t4, t3 →+∆,B t5 and t4 =E t5.
(vii) We assume that (a) →R,B is E-consistent with B, i.e., ∀t1, t2, t3 we have that t1 →R,B t2 and t1 =B t3 imply ∃t4 such that t3 →R,B t4 and t2 =E t4; and (b) →R,B is E-consistent with →∆,B, i.e., ∀t1, t2, t3 we have that t1 →R,B t2 and t1 →∗∆,B t3 imply ∃t4, t5 such that t3 →∗∆,B t4 and t4 →R,B t5 and t5 =E t2.
The following lemma links →R/E with →∆,B and →R,B.
Lemma 3.2 [47] Let R = (Σ, ∆ ∪ B, R) be an order-sorted rewrite theory with properties (i)–(vii) assumed above. Then t1 →R/E t2 if and only if t1 →∗∆,B →R,B t3 for some t3 =E t2. Thus t1 →∗R/E t2 if and only if t1 →∗R∪∆,B t3 for some t3 =E t2.
Narrowing generalizes rewriting by performing unification at non-variable positions instead of the usual matching. The essential idea behind narrowing is to symbolically represent the transition relation between terms as a narrowing relation
between terms. Specifically, narrowing instantiates the variables in a term by a B-unifier that enables a rewrite modulo B with a given rule and a term position.
Definition 3.3 (R ∪ ∆, B-narrowing) The R ∪ ∆, B-narrowing relation on TΣ(X) is defined as t ⇝^σ_{R∪∆,B} t′ (or ⇝^σ if R ∪ ∆, B is understood) if there is p ∈ PosΣ(t), a rule l → r in R ∪ ∆ such that Var(t) ∩ (Var(l) ∪ Var(r)) = ∅, and σ ∈ CSU_B(t|p = l, V) for a set V of variables containing Var(t), Var(l), and Var(r), such that t′ = σ(t[r]p).
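A standard instance of such a decomposition E = ∆ ⊎ B is the exclusive-or theory. The following Maude sketch is illustrative (the module is ours; the variant attribute and the variant unify command mentioned below are features of Maude releases later than the one contemporary with this paper): B consists of the associativity and commutativity of the operator, and ∆ consists of the cancellation equations, oriented from left to right, which are convergent and coherent modulo B.

fmod EXCLUSIVE-OR is
  sort Elem .
  op 0 : -> Elem .
  op _*_ : Elem Elem -> Elem [assoc comm] .   --- B: AC axioms (we write * for xor)
  ops a b c : -> Elem .
  vars X Y : Elem .
  eq X * X = 0 [variant] .       --- Delta, oriented left to right
  eq X * 0 = X [variant] .
  eq X * X * Y = Y [variant] .   --- AC extension needed for coherence modulo B
endfm

With this split, B-unification (here AC-unification) is finitary, while ∆ ∪ B-unification can be obtained by narrowing with the oriented equations modulo B, e.g., via the variant unify command in recent Maude versions.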
4 Narrowing Reachability Goals
First, we recall the definition of reachability goals provided in [47].
Definition 4.1 (Reachability goal) Given an order-sorted rewrite theory R = (Σ, E, R), we define a reachability goal G as a conjunction of the form t1 →∗ t1′ ∧ . . . ∧ tn →∗ tn′, where for 1 ≤ i ≤ n we have ti, ti′ ∈ TΣ(X)si for appropriate sorts si. The empty goal is denoted by Λ. We assume that conjunction ∧ is associative and commutative, so that the order of conjuncts is irrelevant. We say that the ti are the sources of the goal G, while the ti′ are the targets. We define Var(G) = ∪i (Var(ti) ∪ Var(ti′)). A substitution σ is an R-solution of G (or just a solution for short, when R is clear from the context) if σ(ti) →∗R/E σ(ti′) for 1 ≤ i ≤ n.
Definition 4.2 (R/E-rewriting on goals) We define the rewrite relation on goals as follows.

    (Reduce)     G ∧ t1 →∗ t2   →R/E   G ∧ t1′ →∗ t2     if t1 →R/E t1′
    (Eliminate)  G ∧ t →∗ t     →R/E   G
The relations →R∪∆,B , →R,B , and →∆,B are lifted to goals and substitutions in a similar manner. Lemma 4.3 [47] σ is a solution of a reachability goal G if and only if σ(G) →∗R/E Λ. Given an (unconditional) order-sorted rewrite theory R, we are interested in finding a complete set of R-solutions for a given goal G. Definition 4.4 (Complete set of solutions on goals) A set Γ of substitutions is said to be a complete set of R-solutions of G if •
Every substitution σ ∈ Γ is an R-solution of G, and
•
For any R-solution ρ of G there is a substitution σ ∈ Γ such that σ|Var (G) E ρ|Var (G) .
The narrowing relation on terms is extended to reachability goals by narrowing only the left-hand sides of the goals, while the right-hand sides only accumulate substitutions. The idea is to repeatedly narrow the left-hand sides until each left-hand side unifies with the corresponding right-hand side. The composition of the unifier with all the substitutions generated (in the reverse order) gives us a solution of the goal.
Definition 4.5 (R ∪ ∆, B-narrowing on goals) We define the narrowing
relation on goals as follows.

    (Narrow)  G ∧ t1 →∗ t2   ⇝^σ_{R∪∆,B}   σ(G) ∧ t1′ →∗ σ(t2)    if t1 ⇝^σ_{R∪∆,B} t1′
    (Unify)   G ∧ t1 →∗ t2   ⇝^σ_{R∪∆,B}   σ(G)                   if σ ∈ CSU_{∆∪B}(t1 = t2, Var(G))

We write G ⇝^{σ}∗_R G′ if there is a sequence of derivations G ⇝^{σ1}_R · · · ⇝^{σn}_R G′ such that σ = σn ◦ σn−1 ◦ . . . ◦ σ1. Similarly, ∆, B-narrowing and R ∪ ∆, B-narrowing relations are defined on terms and goals, as expected.
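As a small illustration of the two inference rules (an illustrative example with ∆ = B = ∅; the rules and constants are ours, not from [47]), let R = {a → b, f(b) → c} over a single sort and consider the goal G = f(X) →∗ c:

    f(X) →∗ c   ⇝^{[b/X]}_{R∪∆,B}   c →∗ c      (Narrow, at the root with f(b) → c)
    c →∗ c      ⇝^{id}_{R∪∆,B}      Λ           (Unify, since id ∈ CSU∅(c = c, {X}))

The composed substitution [b/X] is a solution of G, since f(b) →R c.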
5 Soundness and Weak Completeness
Let us recall that ∆, B-narrowing is known to give a sound and complete procedure for ∆ ∪ B-unification [35]. Soundness of narrowing as a procedure for solving reachability goals follows easily from [35].
Theorem 5.1 (Soundness) [47] If G ⇝^{σ}∗_{R∪∆,B} Λ, then σ is a solution of G.
The completeness of narrowing as a procedure for solving equational goals critically depends on the assumption that the equations are confluent, an assumption that is no longer reasonable in our more general reachability setting, where the meaning of a rewrite is changed from an oriented equality to a transition or an inference, so that rewrite theories can specify and program concurrent systems and logical inference systems. In this general setting, confluence and termination are not reasonable assumptions, and are therefore dropped. As a result, narrowing is no longer a complete procedure for solving reachability goals, in that it may fail to find certain solutions.
The idea behind proving weak completeness is to associate with each R ∪ ∆, B-rewriting derivation an R ∪ ∆, B-narrowing derivation. It is possible to establish such a correspondence only for R/E-normalized substitutions, and hence the weakness in completeness.
Theorem 5.2 (Weak completeness) [47] Let R be a set of rewrite rules with no extra variables in the right-hand side, ρ be an R/E-normalized solution of a reachability goal G, and let V be a finite set of variables containing Var(G). Then there is σ such that G ⇝^{σ}∗_{R∪∆,B} Λ and σ|V ⊑E ρ|V.
Narrowing is complete only with respect to R/E-normalized solutions, as shown by the following example.
Example 5.3 Consider R = (Σ, ∅, R), where Σ has a single sort, constants a, b, c, d, and a binary function symbol f, and R has the following three rules:

    a → b        a → c        f(b, c) → d
The reachability goal G : f(x, x) →∗ d has σ = {a/x} as a solution. But G has neither a trivial solution nor a narrowing derivation starting from it. We can provide a general procedure which builds a narrowing tree starting from G to find all R/E-normalized solutions.
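A Maude rendering of Example 5.3 makes the missed solution concrete (an illustrative sketch; the sort name S is ours):

mod EXAMPLE-5-3 is
  sort S .
  ops a b c d : -> S .
  op f : S S -> S .
  rl a => b .
  rl a => c .
  rl f(b, c) => d .
endm

The command search f(a, a) =>* d . confirms that the instance given by σ = {a/x} is reachable, via f(a, a) → f(b, a) → f(b, c) → d, whereas no narrowing step applies to f(x, x): its proper subterms are variables and f(x, x) does not unify with any left-hand side.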
Theorem 5.4 [47] Let R be a set of rewrite rules with no extra variables in the right-hand side. For a reachability goal G, let V be a finite set of variables containing Var(G), and let Γ be the set of all substitutions σ such that G ⇝^{σ}∗_{R∪∆,B} Λ. Then Γ is a complete set of solutions of G with respect to R/E-normalized solutions.
Nodes in this tree correspond to goals, while edges correspond to one-step R ∪ ∆, B-narrowing derivations. There are two issues to be addressed here: (i) Since there can be infinitely long narrowing derivations, the tree has to be traversed in a fair manner to cover all possible narrowing derivations. (ii) The ∆ ∪ B-unification algorithm invoked at each node of a tree for R/E can in general return an infinite set of unifiers. By assumption (ii) in Section 3, ∆ ∪ B-unification has a complete but not necessarily finite unification algorithm. However, the narrowing steps themselves of the general procedure above use only B-unification, which is finitary, and thus the enumeration of ∆ ∪ B-unifiers should be interleaved in a fair manner with the expansion of the narrowing tree. If the right-hand sides of G are strongly →∆,B-irreducible, i.e., all their instances by →∆,B-normalized substitutions are →∆,B-irreducible, then ∆ ∪ B-unification can be replaced by B-unification.
While this general procedure gives us weak completeness, for it to be useful in practice we need effective narrowing strategies that drastically cut down the search space by expanding only relevant (or necessary) parts of the narrowing tree. We discuss this topic in Section 6.2.
6 Completeness and Computational Efficiency Issues

6.1 Completeness
The crucial reason for losing completeness is that, by definition, narrowing can be performed only at non-variable positions, and therefore cannot account for rewrites that occur within the solution, i.e., under variable positions. (One could of course generalize the definition of narrowing to allow narrowing steps at variable positions, but that would make the narrowing procedure very inefficient since, in general, we would have to perform arbitrary instantiations of variables.) Such "under-the-feet" rewrites can have non-trivial effects if the rewrite rules or the reachability goal are non-linear and the rules are not confluent.
A natural question to ask is whether the simple narrowing procedure described above is complete for specific classes of rewrite theories. In [47] we have identified several useful classes of rewrite theories for which the naive narrowing procedure can find all solutions, and have applied these results to verify safety properties of cryptographic protocols. One such class is that of so-called "topmost" rewrite theories, which includes: (i) most object-oriented systems; (ii) a wide range of Petri net models; and (iii) many reflective distributed systems [46]. Another such class is one where the rewrite rules are right-linear and the reachability goal is linear.
In [58], we establish a completeness result of a much broader scope by generalizing narrowing to back-and-forth narrowing for solving reachability goals. Back-and-forth narrowing is complete in the solvability sense, i.e., it is guaranteed to
find a solution when there is one. The back-and-forth procedure is very general, in the sense that there are absolutely no assumptions on the given rewrite system R = (Σ, ∅, R). In particular, the rewrite rules in R need not be left or right linear, or confluent, or terminating, and can also have extra variables in the right-hand side. In back-and-forth narrowing, we: •
generalize the basic narrowing step through linearization of the term being narrowed, and
•
use a combination of forward and backward narrowing with this generalized relation.
Specifically, we account for under-the-feet rewrites by defining an extended narrowing step that is capable of "skipping" several such rewrites and capturing the first rewrite that occurs at a non-variable position. This is achieved by linearizing a term before narrowing it with a rule. The intermediate under-the-feet rewrites that have thus been skipped will be accounted for by extending the reachability goal with appropriate subgoals. For example, consider the reachability goal ∃x. f(x, x) →∗ d in Example 5.3 again. We: (i) linearize the term f(x, x) to, say, f(x1, x2), (ii) narrow the linearized term with the rule f(b, c) → d and the unifier {b/x1, c/x2}, and (iii) extend the reachability goal with the subgoals x →∗ b and x →∗ c. This gives us the new reachability goal ∃x. d →∗ d ∧ x →∗ b ∧ x →∗ c.
However, linearization alone is not enough, in general, to regain completeness: we also need to use the "back-and-forth" idea. For example, consider a goal ∃x⃗. t →∗ t′, where the solution σ is such that in any rewrite sequence σ(t) →∗ σ(t′) none of the rewrites occur at non-variable positions of t. But observe that if at least one of these rewrites occurs at a non-variable position in t′, then we can narrow the right side t′ in the backward direction, i.e., using R−1, to obtain a simpler goal. For instance, in the goal ∃x. d →∗ d ∧ x →∗ b ∧ x →∗ c above, backward narrowing gives us the goal ∃x. d →∗ d ∧ x →∗ a ∧ x →∗ a, which has the unifier (solution) {a/x}. In general, backward narrowing might in turn enable forward narrowing steps using R on the left-hand side, and so on, until we reach a point where all the rewrites occur under variable positions of both the left-hand and right-hand sides. In this case, however, the left-hand and right-hand sides are unifiable, and we are therefore done. For the simple example considered above, however, note that just backward narrowing with R−1, even without any linearization, gives us the solution as follows: d ⇝^{id} f(b, c) ⇝^{id} f(a, c) ⇝^{id} f(a, a). But in [58] we present examples showing that a combination of forward and backward narrowing is indeed necessary, in that neither direction is complete by itself.
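The backward direction of this example can be explored with ordinary search over the reversed rules R−1 (an illustrative Maude sketch; names as in the module for Example 5.3):

mod EXAMPLE-5-3-REVERSED is
  sort S .
  ops a b c d : -> S .
  op f : S S -> S .
  rl b => a .            --- reverse of a => b
  rl c => a .            --- reverse of a => c
  rl d => f(b, c) .      --- reverse of f(b, c) => d
endm

Here the command search d =>* f(X:S, X:S) . reaches the state f(a, a), matching the non-linear pattern with X = a, i.e., the solution {a/x} of the original goal.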
6.2 Narrowing Strategies
An important problem for narrowing to be effective in practice is to devise strategies that improve its efficiency. Otherwise, one could quickly face a combinatorial explosion in the number of possible narrowing sequences. When several narrowing derivations are possible for the same solution, the question is whether there is a preferred strategy and whether a standardization result is possible.
We have been working on two strategy approaches to explore the search space in a smart manner: (i) the natural narrowing strategy [21]; and (ii) the grammar-based narrowing strategy for cryptographic protocol verification [20].

6.2.1 Natural narrowing
In recent work [21], we have proposed a narrowing strategy, called natural narrowing, which allows rewrite theories R = (Σ, ∅, R) that can be non-left-linear, non-constructor-based, non-terminating, and non-confluent. This strategy improves all the previous state-of-the-art narrowing strategies within the functional logic setting, and it is complete in the weak sense that it is guaranteed to find all R-normalized solutions. We give the reader an intuitive feeling for how natural narrowing works.
Example 6.1 [21] Consider the following rewrite system for proving equality (≈) of arithmetic expressions built using modulus or remainder (%), subtraction (−), and minimum (min) operations on natural numbers.

    (1) M % s(N) → (M − s(N)) % s(N)        (5) min(0, N) → 0
    (2) (0 − s(M)) % s(N) → N − M           (6) min(s(N), 0) → 0
    (3) M − 0 → M                           (7) min(s(N), s(M)) → s(min(M, N))
    (4) s(M) − s(N) → M − N                 (8) X ≈ X → true
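For readers who want to experiment, the rules above can be entered almost verbatim as a Maude system module; the following sketch is ours, with tt and _~_ standing for the paper's true and ≈, and the Nat constructors 0 and s declared explicitly:

mod EQUALITY-MIN-MOD is
  sorts Nat Answer .
  op 0 : -> Nat .
  op s : Nat -> Nat .
  op _%_ : Nat Nat -> Nat .
  op _-_ : Nat Nat -> Nat .
  op min : Nat Nat -> Nat .
  op _~_ : Nat Nat -> Answer .              --- stands for the paper's equality symbol
  op tt : -> Answer .                       --- stands for the paper's true
  vars M N X : Nat .
  rl M % s(N) => (M - s(N)) % s(N) .        --- (1)
  rl (0 - s(M)) % s(N) => N - M .           --- (2)
  rl M - 0 => M .                           --- (3)
  rl s(M) - s(N) => M - N .                 --- (4)
  rl min(0, N) => 0 .                       --- (5)
  rl min(s(N), 0) => 0 .                    --- (6)
  rl min(s(N), s(M)) => s(min(M, N)) .      --- (7)
  rl X ~ X => tt .                          --- (8), non-left-linear
endm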
Note that this rewrite system is not left-linear because of rule (8), and it is not constructor-based because of rule (2). Furthermore, note that it is neither terminating nor confluent due to rule (1). Consider the term t = ∞ % min(X, X−0) ≈ ∞ % 0, where the subterm ∞ represents an expression that has a remarkably high computational cost, and the following two narrowing sequences. Only these two are relevant, while evaluation of the subterm ∞ may run into problems. First, the following sequence leading to true, which starts by unifying subterm t|1.2 with left-hand side (lhs) (5):

    ∞ % min(X, X−0) ≈ ∞ % 0   ⇝^{[X↦0]}   ∞ % 0 ≈ ∞ % 0   ⇝^{id}   true

Second, the following sequence not leading to true, which starts by reducing subterm t|1.2.2 with lhs (3) and which instantiates variable X early:

    ∞ % min(X, X−0) ≈ ∞ % 0   ⇝^{[X↦s(X')]}   ∞ % min(s(X'), s(X')) ≈ ∞ % 0   ⇝^{id}   ∞ % s(min(X', X')) ≈ ∞ % 0
The key points to achieve these optimal evaluations are:
(i) (Demanded positions). This notion is relative to a left-hand side (lhs) l and determines which positions in a term t should be narrowed in order to be able to match l at the root position. For the term ∞ % min(X, X−0) ≈ ∞ % 0 and lhs X ≈ X, only the subterm min(X, X−0) is demanded.
(ii) (Failing term). This notion is relative to a lhs l and stops further wasteful narrowing steps. Specifically, the last term ∞ % s(min(X', X')) ≈ ∞ % 0 of the second sequence above fails w.r.t. lhs (8), since the subterm s(min(X', X')) is demanded by (8) but there is a clash between the symbols s and 0.
(iii) (Most frequently demanded positions). This notion determines those demanded positions w.r.t. non-failing lhs's that are demanded by the maximum number of rules and that cover all such non-failing lhs's. It provides the optimality
properties. If we look closely at lhs's (5), (6), and (7) defining min, we can see that the first argument is demanded by the three rules, whereas the second argument is demanded only by (6) and (7). Thus, the subterm X at the first argument of min in the term ∞ % min(X, X−0) ≈ ∞ % 0 is the most frequently demanded position. Note that this subterm is a variable; this motivates the last point.
(iv) (Lazy instantiation). This notion relates to an incremental construction of unifiers without the explicit use of a unification algorithm. This is necessary in the previous example, since the subterm min(X, X−0) does not unify with lhs's (6) and (7). However, we can deduce that narrowing at subterm X−0 is only necessary when the substitution [X ↦ s(X')], inferred from (6) and (7), has been applied. Thus, we construct the appropriate substitutions [X ↦ 0] and [X ↦ s(X')] early in order to reduce the search space.

6.2.2 A Grammar-based Strategy for Protocol Verification
In work of Escobar, Meadows, and Meseguer [20], the grammar techniques used by Meadows in her NRL Protocol Analyzer (NPA) [41] have been placed within the general narrowing setting proposed in this paper and have been implemented in Maude, in what we call the Maude-NPA tool. For narrowing to be a practical tool for such protocol analysis we need efficient strategies that drastically cut down the search space, since protocols typically have an infinite search space and are highly non-deterministic.
The NRL Protocol Analyzer [41] is a tool for the formal specification and analysis of cryptographic protocols that has been used with great effect on a number of complex real-life protocols. One of its most interesting features is that it can be used not only to prove or disprove authentication and secrecy properties using the standard Dolev-Yao model [16], but also to reason about security in the face of attempted attacks on low-level algebraic properties of the functions used in a protocol. Maude-NPA's ability to reason well about these low-level functionalities is based on its combination of symbolic reachability analysis using narrowing modulo algebraic axioms E, together with its grammar-based techniques for reducing the size of the search space. On the one hand, unification modulo algebraic properties (e.g., encryption and decryption, concatenation and deconcatenation), performed as narrowing with a finite convergent (i.e., confluent and terminating) set of rewrite rules in which the right-hand side of each rule is either a subterm of the left-hand side or a ground term [15], allows the tool to represent behavior which is not captured by the usual Dolev-Yao free algebra model. On the other hand, techniques for reducing the size of the search space by using inductively defined co-invariants describing states unreachable to an intruder allow us to start with an infinite search space and reduce it in many cases to a finite one, thus freeing us from the requirement to put any a priori limits on the number of sessions.
Example 6.2 Consider a protocol with only two operations, encryption, represented by e(K, X), and decryption, represented by d(K, X), where K is the key and X is the message. Suppose that each time an honest principal A receives a message X, it outputs d(k, X), where k is a constant standing for a key shared by
all honest principals. We can denote this by a protocol rule X → d(k, X) in R. Encryption and decryption usually satisfy the following cancellation properties E: d(K, e(K, X)) = X and e(K, d(K, X)) = X. In order to keep the example simple, we assume that the intruder does not perform operations, so no extra intruder rules are added to R.
Suppose now that we want to find out how an intruder can learn a term m that it does not know initially. The NPA uses backwards search, so we ask what rules could produce m, and how. According to the honest principal rule X → d(k, X) and the property d(K, e(K, X)) = X, the intruder can learn m only if it previously knows e(k, m). That is, we consider the rule application e(k, m) → d(k, e(k, m)), where d(k, e(k, m)) =E m. We then ask the Maude-NPA how the intruder can learn e(k, m), and we find that this can only happen if the intruder previously knows e(k, e(k, m)). We see a pattern emerging, which suggests a set of terms belonging to the following formal tree language L:

    L ↦ m        L ↦ e(k, L)

We can verify the co-invariant stating that intruder knowledge of any member of L implies previous knowledge of some member of L; it is therefore impossible for an intruder to learn any member of L from an initial state in which it does not know any messages. This defines a backwards narrowing strategy where we discard any protocol state in which the intruder has to learn some term in the grammar L, since this would lead to a useless backwards search path. For more details see [20] and Section 7.1.
7 Applications
In this section we show how narrowing can be used as a unified mechanism for programming and proving. We consider the following scenarios:
(i) The use of narrowing reachability analysis in security protocol verification (Section 7.1).
(ii) Various uses of narrowing in theorem proving, e.g., for improving equational unification, for inductive theorem proving, and for automatic, inductionless induction theorem proving (Section 7.2).
(iii) Within functional logic programming, the use of narrowing in broader classes of programs not supported by current functional logic programming languages, e.g., narrowing modulo AC axioms, removing left-linearity or constructor-based requirements, non-confluent systems, etc. (Section 7.3).
(iv) Many formal techniques use some form of narrowing. For example, inductionless induction and partial evaluation do so. These techniques can yield better results when more efficient narrowing strategies are used (Sections 7.2.3 and 7.3.1).
The grammar-based strategy of Section 6.2.2 supports scenario (i), whereas the natural narrowing strategy of Section 6.2.1 supports scenarios (ii), (iii), and (iv).
7.1 Security Protocol Verification
Verification of many security protocol properties can be formulated as solving reachability problems. For instance, verifying the secrecy property of a protocol amounts to checking whether the protocol can reach a state where an intruder has discovered a data item that was meant to be a secret. In this section we show how the strong completeness of narrowing for topmost rewrite theories, together with the grammar-based strategy explained in Section 6.2.2, can be exploited to get a generic and complete procedure for the analysis of such security properties modulo algebraic properties of the cryptographic functions.
Example 7.1 Consider the well-known Needham-Schroeder protocol [48] that uses public keys to achieve authentication between two parties, Alice and Bob. The protocol is specified in Maude according to the framework described in [20]. (The Maude syntax is so close to the corresponding mathematical notation for defining rewrite theories as to be almost self-explanatory. The general point to keep in mind is that each item: a sort, a subsort, an operation, an equation, a rule, etc., is declared with an obvious keyword: sort, subsort, op, eq (or ceq for conditional equations), rl (or crl for conditional rules), etc., with each declaration ended by a space and a period. Indeed, a rewrite theory R = (Σ, ∆ ∪ B, R) is defined with the signature Σ using keyword op, equations in ∆ using keyword eq, axioms in B using keywords assoc, comm and id:, and rules in R using keyword rl. Another important point is the use of "mix-fix" user-definable syntax, with the argument positions specified by underbars; for example: if_then_else_fi.)

fmod PROTOCOL-SYMBOLS is
  --- Importing Auxiliary Sorts: Msg, Fresh, Strands, ...
  protecting DEFINITION-PROTOCOL-RULES .
  --- Sort Information
  sorts Name Nonce Key Enc .
  subsort Name Nonce Enc Key < Msg .
  subsort Name < Key .
  --- Encoding operators for public/private encryption
  op pk : Key Msg -> Enc .
  op sk : Key Msg -> Enc .
  --- Nonce operator
  op n : Name Fresh -> Nonce .
  --- Intruder's name
  op i : -> Name .
  --- Associativity operator
  op _;_ : Msg Msg -> Msg .
  *** Encryption/Decryption Cancellation Algebraic Properties
  eq pk(Ke:Key, sk(Ke:Key, Z:Msg)) = Z:Msg .
  eq sk(Ke:Key, pk(Ke:Key, Z:Msg)) = Z:Msg .
endfm

mod PROTOCOL-STRANDS-RULES is
  protecting PROTOCOL-SYMBOLS .
  var SS : StrandSet .        var K : IntruderKnowledge .
  vars L ML L1 L2 : SMsgList .
  vars M M1 M2 : Msg .        vars A B : Name .
  var Ke : Key .              var r : Fresh .
  var N : Nonce .
  *** General rule: Accept input message
  rl [ L1 | -(M), L2 ] & SS & {M inI, K}
    => [ L1, -(M) | L2 ] & SS & {M inI, K} .
  *** General rule: Accept output message
  rl [ L1 | +(M), L2 ] & SS & {M !inI, K}
    => [ L1, +(M) | L2 ] & SS & {M inI, K} .
  *** General rule: Accept output message
  rl [ L1 | +(M), L2 ] & SS & K
    => [ L1, +(M) | L2 ] & SS & K .
  *** Dolev-Yao Intruder Rules
  rl [ -(M1), -(M2) | +(M1 ; M2) ] & SS & {(M1 ; M2) !inI, K}
    => SS & {(M1 ; M2) inI, K} .
  rl [ -(M1 ; M2) | +(M1), +(M2) ] & SS & {M1 !inI, K}
    => SS & {M1 inI, K} .
  rl [ -(M1 ; M2) | +(M2), +(M1) ] & SS & {M2 !inI, K}
    => SS & {M2 inI, K} .
  rl [ -(M) | +(sk(i, M)) ] & SS & {sk(i, M) !inI, K}
    => SS & {sk(i, M) inI, K} .
  rl [ -(M) | +(pk(Ke, M)) ] & SS & {pk(Ke, M) !inI, K}
    => SS & {pk(Ke, M) inI, K} .
  rl [ nil | +(A) ] & SS & {A !inI, K}
    => SS & {A inI, K} .
  *** Initiator
  rl [ nil | +(pk(B, A ; n(A, r))), -(pk(A, n(A, r) ; N)), +(pk(B, N)) ]
     & SS & {pk(B, A ; n(A, r)) !inI, K}
    => SS & {pk(B, A ; n(A, r)) inI, K} .
  rl [ +(pk(B, A ; n(A, r))), -(pk(A, n(A, r) ; N)) | +(pk(B, N)) ]
     & SS & {pk(B, N) !inI, K}
    => SS & {pk(B, N) inI, K} .
  *** Responder
  rl [ -(pk(B, A ; N)) | +(pk(A, N ; n(B, r))), -(pk(B, n(B, r))) ]
     & SS & {pk(A, N ; n(B, r)) !inI, K}
    => SS & {pk(A, N ; n(B, r)) inI, K} .
endm
In this Maude specification, a nonce, i.e., a random number sent by one principal to another to ensure confidentially, is denoted by n(A,r), where A is the name of the principal and r is the randomly generated number. Concatenation of two messages is denoted by the operator ; , e.g., n(A,r);n(B,r’). Encryption of a message M with the public key of principal A is denoted by pk(A,M), e.g., pk(A,n(S,r);S). Encryption of a message with a private key is denoted by sk(A,M), e.g., sk(A,n(S,r);S). The name of the intruder is fixed and denoted by constant i. The only secret key operation the intruder can perform is sk(i,m) for a known message m. The protocol is described using strands [22]. A state of the protocol is a set of strands (with & the associative and commutativity union operator) and the intruder knowledge at that point, which is enclosed within curly brackets and contains two kinds of facts: positive knowledge facts, denoted by (m inI), and negative knowledge facts, denoted by (m !inI). A strand denotes the sequence of input messages (denoted by −(M ) or M − ) and output messages (denoted by +(M ) or M + ) that a principal performs. Different sessions of the same protocol can be run in parallel just by having different strands in the set of strands. The protocol is described as the following set of strands [20]: (i) [pk(B, A; n(A, r))+ , pk(A, n(A, r); Z)− , pk(B, Z)+ ] This strand represents principal A initiating the protocol by sending his/her name and a nonce, both encrypted with B’s public key, to B in the first message. Then A receives B’s response and sends a final message consisting of the rest of the message received from B. (ii) [pk(B, A; W )− , pk(A, W ; n(B, r0 ))+ , pk(B, n(B, r0 ))− ] This strand represents principal B receiving A’s first message, checking that it is the public key encryption of A’s name concatenated with some value W , and then sending to A the concatenation of that value W with B’s own nonce, encrypted with A’s public key. Then, B receives the final message from A and verifies that the final message that it receives has B’s nonce encrypted with B’s public key. together with the intruder capabilities to concatenate, deconcatenate, encrypt and 16
decrypt messages according to the Dolev-Yao attacker’s capabilities [16]: (iii) (iv) (v) (vi)
[M1− , M2− , (M1 ; M2 )+ ] Concatenation of two messages into a message. [(M1 ; M2 )− , M1+ , M2+ ] Extraction of two concatenated messages. [M − , pk(Y, M )+ ] Encryption of a message with a public key. [M − , sk(i, M )+ ] Encryption of a message with the intruder’s secret key.
All these strands give rise to backwards rewrite rules describing their effect on the intruder’s knowledge as shown in the above Maude specification. There are also three general rules for accepting input and output messages. As explained in [20], we then perform a backwards narrowing reachability analysis, i.e., we provide a reachability goal from a final state pattern describing an attack, like · · · & [m1 , m2 , . . . , mk | nil] & · · · & {(m01 inI), . . . , (m0n inI)}
to an initial state pattern of the form · · · & [nil | m1 , m2 , . . . , mk ] & · · · & {(m01 !inI), . . . , (m0n !inI), . . . , (m0n+j !inI)}
where the initial state may contain more strands and more terms m0 to be known in the future (m0 !inI) in the intruder knowledge than the final state, since they may have been added during the backwards narrowing process. Consider, for example, the following final attack state pattern (where Z is a variable of sort Msg, and A, B are variables of sort Name): [ pk(B, A; Z)− , pk(A, Z; n(B, r0 ))+ , pk(B, n(B, r0 ))− | nil ] & { (n(B, r0 ) inI) }
which represents a situation where B has completed the expected communication with someone (i.e., A) and the intruder has learned B's nonce. For this insecure goal state, the reachability analysis returns several possible solutions, including the following initial state corresponding to Lowe's attack [39] (note the new strands and the new terms in the intruder knowledge):

[ nil | pk(i, A; n(A, r))+, pk(A, n(A, r); n(B, r'))−, pk(i, n(B, r'))+ ] &
[ nil | pk(i, A; n(A, r))−, (A; n(A, r))+ ] &
[ nil | (A; n(A, r))−, pk(B, A; n(A, r))+ ] &
[ nil | pk(B, A; n(A, r))−, pk(A, n(A, r); n(B, r'))+, pk(B, n(B, r'))− ] &
[ nil | pk(i, n(B, r'))−, n(B, r')+ ] &
[ nil | n(B, r')−, pk(B, n(B, r'))+ ] &
{ (n(B, r') !inI), (pk(i, n(B, r')) !inI), (pk(A, n(A, r); n(B, r')) !inI),
  (pk(i, A; n(A, r)) !inI), ((A; n(A, r)) !inI), (pk(B, A; n(A, r)) !inI),
  (pk(B, n(B, r')) !inI) }
Note that, in order to define an effective mechanism to find the previous attack, we have to detect and avoid many irrelevant paths (usually infinite in depth). For instance, we should avoid the following infinite backwards narrowing sequence generated by the Dolev-Yao strand for deconcatenation shown above:

[. . . , m− | . . .]
⇝ [. . . | m−, . . .] & [(M1; m)− | m+, M1+]
⇝ [. . . | m−, . . .] & [nil | (M1; m)−, m+, M1+] & [(M2; (M1; m))− | (M1; m)+, M2+]
⇝ [. . . | m−, . . .] & [nil | (M1; m)−, m+, M1+] & [nil | (M2; (M1; m))−, (M1; m)+, M2+]
      & [(M3; (M2; (M1; m)))− | (M2; (M1; m))+, M3+]
⇝ · · ·
which shows that the intruder learnt a message m1; · · · ; mn; m in a previous state that he/she is decomposing to learn m. Indeed, this useless search and many similar
ones are avoided using a grammar-based strategy (see Section 6.2.2). Furthermore, for many protocols backwards narrowing with a grammar-based strategy terminates, allowing full verification of security properties.
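To summarize the notation used in this example, the message and strand syntax can be rendered as a small datatype. This is our own Haskell-style rendering for illustration only, not the actual Maude specification used in the analysis:

type Name = String                     -- principal names; the intruder is "i"

data Msg = Nonce Name Int              -- n(A,r): a nonce generated by a principal
         | PName Name                  -- a principal name sent as data
         | Cat Msg Msg                 -- m1 ; m2 : concatenation
         | Pk Name Msg                 -- pk(A,m): encryption with A's public key
         | Sk Name Msg                 -- sk(A,m): encryption with A's private key

data SignedMsg = Out Msg | In Msg      -- output (M+) and input (M−) messages

type Strand = [SignedMsg]              -- the message sequence of one principal

data Fact = InI Msg | NotInI Msg       -- positive (m inI) / negative (m !inI) knowledge

A protocol state then pairs a multiset of such strands with a set of knowledge facts.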
7.2 Theorem Proving
We consider three aspects related to theorem proving where narrowing reachability analysis using a convergent equational theory E is relevant:

• equational unification problems, i.e., solving (∃x̄) t(x̄) = t'(x̄),
• inductive theorem proving, i.e., solving E ⊢ind (∃x̄) t = t', and
• automatic proof by inductionless induction of goals E ⊢ind (∀x̄) t = t'.
7.2.1 Equational unification

Narrowing was originally introduced as a complete method for generating all solutions of an equational unification problem, i.e., for goals F of the form

(∃x̄) t1(x̄) = t'1(x̄) ∧ . . . ∧ tn(x̄) = t'n(x̄)

in free algebras modulo a set E of convergent equations [23,33,35]. As already pointed out in the Introduction, solving E-unification problems such as the goal F above is equivalent to solving by narrowing the reachability goal G = (∃x̄) t1 ≈ t'1 →* true ∧ . . . ∧ tn ≈ t'n →* true in the rewrite theory R_E = (Σ̃, ∅, R_E), where Σ̃ = Σ ∪ {≈, true} and R_E consists of the rules obtained by orienting the equations E, together with the extra rule x ≈ x → true. Even in this traditional setting, our techniques can be useful. For example, many convergent equational theories fail to satisfy the left-linearity and constructor-based requirements; but no such restrictions apply to natural narrowing.

7.2.2 Inductive theorem proving

The just-described reduction of existential equality goals to reachability goals has important applications to inductive theorem proving. Specifically, it is useful in proving existentially quantified inductive theorems such as E ⊢ind (∃x̄) t = t' in the initial model, i.e., in checking whether the minimal Herbrand model of E satisfies (∃x̄) t = t'. As is well known (see, e.g., [27]),

E ⊢ (∃x̄) t = t'  ⇔  E ⊢ind (∃x̄) t = t'

and therefore narrowing is an inductive inference method. An effective narrowing strategy, such as natural narrowing, can provide a very effective semidecision procedure for proving such inductive goals, because it will detect failures to unify, stopping with a counterexample instead of blindly expanding the narrowing tree. In particular, an effective narrowing strategy can be added to inductive provers such as Maude's ITP [11]. In cases where equational narrowing is ensured to terminate (see [49]), an effective narrowing strategy can also be used to prove universal inductive goals of the form E ⊢ind (∀x̄) C ⇒ t = t', with C a conjunction of equations, because we can reduce proving such a goal to first solving E ⊢ind (∃x̄) C by narrowing, and then proving E ⊢ind (∀x̄) σ(t) = σ(t') for each of the solutions σ found for C.
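As a small worked instance of this reduction (our example, using the Peano addition rules x + 0 → x and x + s(y) → s(x) + y, which also appear in Example 7.2 below), consider the existential goal (∃z) z + s(0) = s(s(0)). Its reachability encoding and one successful narrowing derivation are:

\[
  z + s(0) \approx s(s(0)) \;\rightarrow^{*}\; \mathit{true}
\]
\[
  z + s(0) \approx s(s(0))
  \;\leadsto\; s(z) + 0 \approx s(s(0))
  \;\leadsto\; s(z) \approx s(s(0))
  \;\leadsto_{\{z \mapsto s(0)\}}\; \mathit{true}
\]

The computed answer z = s(0) is indeed a solution, since s(0) + s(0) rewrites to s(s(0)) + 0 and then to s(s(0)).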
7.2.3 Inductionless Induction

An effective narrowing strategy can also be useful in proving universal inductive theorems of the form E ⊢ind (∀x̄) t = t', even in the case where equational narrowing is not guaranteed to terminate. Specifically, an effective narrowing strategy can be fruitfully integrated in automatic techniques that use narrowing for proving or refuting such goals, such as inductionless induction [12]. Inductionless induction aims at automatically proving universal inductive theorems without using an explicit induction scheme, as in implicit induction techniques such as [8,53]. It simplifies the task by using classical first-order theorem provers which are refutation-complete and saturation-based, and a dedicated technique to ensure (in-)consistency. Given a set C of equations that are to be proved or refuted (also understood as conjectures) in the initial model of a set of equations E, the technique considers a (first-order) axiomatization A of the initial model that represents the "negative" information about inequalities in the model. The key fact that is exploited is that the conjectures C are inductive theorems if and only if C ∪ A ∪ E is consistent. This consistency check is in turn performed in two stages 9 : (i) first the logical consequences of C using E are computed, and (ii) the consistency of each such consequence with A is checked using a refutationally complete theorem prover. The conjectures are true if and only if there are no logical consequences that are inconsistent with A. The relevant point in this inductionless induction technique is that the deductions from C are computed by a (not very restricted) version of narrowing called superposition, with some additional tactics to eliminate redundant (or irrelevant) consequences. Specifically, the conjectures in C are narrowed using the oriented equations E to obtain the logical consequences. The consistency of each such non-redundant consequence with A is checked as before, until superposition returns no more non-redundant consequences. The point is that a narrowing strategy can be used instead of superposition (under some restrictions not discussed here) to compute a smaller set of deductions, which can increase the chances of termination of the procedure above, without loss of soundness.

We briefly recall a few more details about the inductionless induction method (see [12] for additional details). We assume below that ≻ is a reduction ordering which is total on ground terms, i.e., a relation which is irreflexive, transitive, well-founded, total, monotonic, and stable under substitutions. A well-known reduction ordering is the recursive path ordering, based on a total ordering ≻Σ (called a precedence) on Σ. The following is the inference rule defining the superposition strategy (note that l = r is symmetric).
Superposition

      l = r      c[s]p
      -----------------    if σ = mgu(l, s), s is not a variable, σ(r) ⊁ σ(l),
          σ(c[r]p)            l = r ∈ E, and c[s]p ∈ C.
Given a ground equation c, C≺c is the set of ground instances of equations in C that are strictly smaller than c in this ordering. A ground conjecture c is redundant in a set of conjectures C if E ∪ A ∪ C≺c ⊢ c. A non-ground conjecture is redundant
9 Under suitable assumptions about E and A (see [12]).
if all its ground instances are. An inference is redundant if one of its premises or its conclusion is redundant in C.

Example 7.2 Consider the problem of coding the rather simple inequality x + y + z + w ≤ h for natural numbers x, y, z, w, h, borrowed from [18], which is specified in the following Maude module:

mod SOLVE is
  sort Nat .
  op 0 : -> Nat .
  op s : Nat -> Nat .
  vars X Y Z W H : Nat .
  op _+_ : Nat Nat -> Nat .
  rl X + 0 => X .
  rl X + s(Y) => s(X) + Y .
  *** declaration and rewrite rules of the Boolean operator
  *** encoding the inequality test (rewriting to true or false)
endm
W eq x’ y’}) (part of evaluating function eq in the program) is responsible for the only Failed node in the graph.
There is only one detail of Figure 1 left to explain. Some node labels are headed by numbers like 1:Z or 2:Failed. These numbers denote the so-called path of a computation. (For convenience, in Figure 1 nodes with the same path also have the same color. The only exceptions are failure nodes, which are always red.) Each
computation has a path, starting with the empty path for main. This path is extended whenever a non-deterministic branching occurs. Each branch gets a different number and thus, different paths mean that the nodes belong to different branches of the computation. The original tracing semantics [9] non-deterministically computes two graphs for the above example, as shown in Figure 2. It can be seen immediately
Fig. 2. The two Graphs of Example 2.1 produced by the Semantics of [9]
that it is much more economical to produce a single graph, which is an overlay of all the graphs produced by the original semantics. The connection between the graphs can be seen immediately when considering the paths. We call a path p equal or smaller than a path q, with the usual notation p ≤ q, if p is a prefix of q. When we take the set of all paths attached to nodes in the overlay graph, each path of this set which is maximal with respect to ≤ corresponds to a graph produced by the original semantics. For each maximal element m of this set, the corresponding graph can be obtained by taking only those nodes whose attached path q satisfies q ≤ m. An example can be obtained by comparing Figures 1 and 2.

In [7] we described how to transform a given Flat Curry program such that during its execution a file is written by side effects. The generated file contains an encoded version of the graphs introduced above. This encoded version has to be translated back into a more declarative structure, which is described in the next subsections.
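The prefix ordering on paths and the projection of the overlay graph onto one maximal path can be captured directly. The following is a small sketch of ours (not part of the tracing framework), where the overlay is simply given as a list of node references paired with their paths:

type Path = [Int]      -- as introduced in Section 2.1 below

-- p `pathLeq` q holds iff p is a prefix of q
pathLeq :: Path -> Path -> Bool
pathLeq []     _      = True
pathLeq (_:_)  []     = False
pathLeq (x:xs) (y:ys) = x == y && pathLeq xs ys

-- keep exactly those nodes whose path lies below the maximal path m
projectBranch :: Path -> [(Int, Path)] -> [Int]
projectBranch m nodes = [ r | (r, q) <- nodes, q `pathLeq` m ]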
2.1 Representation of Trace Graphs
In lazy functional (logic) languages, graphs can be constructed in an elegant way. First, sharing already introduces directed acyclic graphs. For instance, both arguments of the tuple introduced by (let x=e in (x,x)) physically refer to the same memory address at run time. But cyclic graphs can also be constructed where recursive let expressions are allowed. For instance, the expression (let ones=1:ones in ones) introduces at run time a structure with a cyclic reference in the heap. Representing graphs in this way has some advantages:
• Following edges in the graph is an operation with constant cost.
• Programming by pattern matching is possible.
• Unreferenced parts of the graph can be detected by garbage collection.
Therefore, we can represent trace graphs with the simple structure:

type Path = [Int]
data TraceGraph = Nil Path
                | Node Int Path TraceGraph [TraceGraph] TraceInfo

The graph consists of nodes (Node) and leaves (Nil). Leaves represent subexpressions which were not evaluated during program execution. Each node of the graph has a reference of type Int (to allow node identification) and a path which is represented by a list of integers (cf. the discussion above). Note that leaves also have paths in order to support language implementations which do not feature sharing of evaluations across non-deterministic branches. In such an implementation, subexpressions might be evaluated in one branch but stay unevaluated in another and, thus, a leaf might belong to one specific path only. In addition to reference and path, nodes also have a parent node and a list of successor nodes. (There is always a single parent, but there may be more than one successor, cf. Figure 1 above.) In addition, each node has some special information which represents what kind of node it is:

data CaseMode  = Flex | Rigid
data TraceInfo = App String [[TraceGraph]]
               | Or
               | Case CaseMode TraceGraph
               | Free Int [TraceGraph]
               | Failed

Note that application nodes (App) contain a list of lists of trace graphs. This is because in different computation branches the arguments of an application node might point to different expressions. 9 For example, the node labeled eq in Figure 1 has two different pointers in its second argument. This eq node is represented as

Node 1 [] (Node 0 [] (Nil []) (App "main" []))
  (App "eq" [[Node 3 [] (Node 2 ... (Case Flex (Node 3 ...))) (App "add" [...])],
             [Node 9 [1] (Node 8 ...) (App "Z" []),
              Node 14 [2] (Node 13 ...) (App "Z" [])]])
The "..." are not only there to shorten the example. Because of the cycles in the structure it is impossible to give a complete term representation. For instance, in the run-time heap, the argument node of the flexible case (Case Flex (Node 3 ...)) is identical to the first argument of eq, (Node 3 ...), as can be seen in Figure 1.
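For illustration, simple selectors over this structure can be written by plain pattern matching (these small helpers are ours, not part of the framework):

nodeRef :: TraceGraph -> Maybe Int
nodeRef (Nil _)          = Nothing
nodeRef (Node r _ _ _ _) = Just r

nodePath :: TraceGraph -> Path
nodePath (Nil p)          = p
nodePath (Node _ p _ _ _) = p

isFailed :: TraceGraph -> Bool
isFailed (Node _ _ _ _ Failed) = True
isFailed _                     = False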
2.2 Implementation of Trace Graph Building
During the execution of the transformed program a file is generated, in which all parts of the graph are encoded by numbers, called references. There are two separate spaces of references, one for the successor and parent relation and one for argument pointers. Such a trace is a sequence of pieces of information of the following three kinds:

data TraceItem = Successor Int Int
               | RedirectArg Int Path Int
               | TNode Int Path Int ItemInfo
9 This also happens only if there is no sharing of evaluations across non-deterministic branches.
type Trace = [TraceItem]

Successor i j: The node with reference j is a successor of the node with reference i.

RedirectArg p ar nr: Each application node with argument reference ar belonging to the computation of path p should be replaced by a reference to the node with number nr.

TNode r p par info: The node with number r belongs to the computation of path p and has the node with number par as parent. The kind of the node (application, failure, free variable or case, cf. above) is then given in the info part, which will not be considered in the following.

If we assume a data structure to associate integer keys with data elements, like a search tree, hash table, array or similar, with the following interface:

data Mapping a = ...
lookup :: Mapping a -> Int -> a
insert :: Int -> a -> Mapping a -> Mapping a
empty  :: Mapping a

then the building of the graph as a cyclic data structure can be implemented as a function manipulating three of these search structures: 1) a mapping of node references to the list of their successor references, 2) a mapping of argument references to the list of their corresponding node references and their paths, and 3) a mapping of node references to trace nodes. (The structure of trace nodes was defined in Section 2.1.)

type Maps = (Mapping [Int], Mapping [(Int,Path)], Mapping TraceGraph)

traceToCycGraph :: Trace -> TraceGraph
traceToCycGraph tr = let (_,_,ns) = cycle tr (empty,empty,empty)
                     in lookup ns mainReference

cycle :: Trace -> Maps -> Maps
cycle [] maps = maps
cycle (Successor x y:xs) (sMap,aMap,nMap) =
  cycle xs (insert x y sMap,aMap,nMap)
cycle (RedirectArg v p ref:xs) (sMap,aMap,nMap) =
  cycle xs (sMap,insert v (ref,p) aMap,nMap)
cycle (TNode ref path par info:xs) (sMap,aMap,nMap) =
  let maps = cycle xs (sMap,aMap,insert ref node nMap)
      (sMap2,aMap2,nMap2) = maps
      sucs = map (lookup nMap2) (lookup sMap2 ref)
      node = Node ref path (lookup nMap2 par) sucs (buildInfo maps info)
  in maps

The rules for Successor and RedirectArg only add information to the maps. The last rule contains the recursive let which adds the information of the current trace node to the node map. The elements of these trace nodes depend on the call to cycle on the thus updated map. This ties the loop and makes sure that the result
of cycle is a cyclic structure in the heap which directly resembles the trace graph. Such an elegant definition is only possible in lazy languages. There are, however, drawbacks to this technique: the definition can only work efficiently if the whole trace fits into memory, which is not to be expected for all applications we would like to be able to debug. Therefore, there is an alternative implementation to build the trace graph, which represents the graph as a potentially infinite term. Each node is retrieved on demand from the trace file by side effects. This is comparable to lazy file access by the Curry standard function readFile. The access to parents, successors or arguments is not possible in constant time, as it involves some kind of binary search on the file for each access. But as there are no cycles in the graph, the degree of heap referencing is much lower and therefore trace nodes can become garbage much more often. This ensures that the program will only have parts of the trace graph in memory at any moment. Advantages and disadvantages of the two alternative implementations can be summarized as follows:
                        Cyclic Graph          Infinite Graph
Access to successor     in constant time      in logarithmic time
Access to value         in constant time      linear in chain length
Processed nodes         not always garbage    always garbage
Application             normal traces         huge traces
It remains to be evaluated where the border between “normal” and “huge” is.
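The Mapping interface used by traceToCycGraph above is left abstract. A minimal instantiation, assuming Haskell's Data.Map, could look as follows (a sketch of ours; lookups are partial and fail on missing keys):

import Prelude hiding (lookup)
import qualified Data.Map as Map

newtype Mapping a = Mapping (Map.Map Int a)

empty :: Mapping a
empty = Mapping Map.empty

insert :: Int -> a -> Mapping a -> Mapping a
insert k v (Mapping m) = Mapping (Map.insert k v m)

lookup :: Mapping a -> Int -> a
lookup (Mapping m) k = m Map.! k    -- assumes the key is present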
3 A Framework for Interpreting Trace Graphs
The basic idea for providing a simple yet versatile interface to programming views on the traced program executions is to represent the trace as a sequence of computation steps. These steps are categorized into a) single steps, b) subcomputations, and c) branchings. A single step might be further distinguished to be an unfolding, the binding of a variable, the suspension of a computation or perhaps some representation of a side effect. Value-oriented techniques are then characterized by subcomputations to the most evaluated form, whereas operation-oriented tools feature subcomputations to head normal form only. These different kinds of subcomputations can be seen as interpreting the program trace in the light of different evaluation strategies: computing the most evaluated form is like employing a strict strategy, while stopping at head normal form is lazy evaluation. For debugging, the main idea is that it is much easier to understand the execution of a program if it is evaluated with a simple strategy. It is therefore better to understand a strict evaluation of the program than a lazy one. This is just another way of saying that value-oriented approaches (cf. Section 1.2) try to show the results as if they were evaluated strictly. Of course, evaluating the given expression in a fully strict manner is not going to work, as the expression might contain potentially infinite structures. Therefore, strict evaluation is generalized to what we call strict evaluation with oracle. Beside
the unusual name, the basic idea should be familiar from denotational semantics. The semantics of a potentially infinite structure like the one denoted by (repeat 1) for the definition repeat x = x : repeat x is a set of values whose least upper bound is the infinite value, rather than that infinite value itself:

⟦repeat 1⟧ = {⊥, 1 : ⊥, 1 : 1 : ⊥, . . .}

A "strict semantics with oracle" can be understood as a non-deterministic choice of one element of this set as the result of the evaluation of (repeat 1). For debugging we choose exactly that element which corresponds to how far the expression was evaluated during the traced program execution. If, for instance, we have traced main = take 2 (repeat 1), we choose 1 : 1 : ⊥ as the semantics of (repeat 1). This means in particular that we can have different choices, should (repeat 1) be called in different contexts during the program's execution.
3.1 Representation of Computations
As mentioned above, computations are categorized into three basic kinds of steps, as shown in Figure 3. Simple steps denote, for instance, a function unfolding or the
Fig. 3. The three kinds of Steps
binding of a free variable, forks denote the non-deterministic branchings induced by logic search, and short cuts embed subcomputations, i.e., reductions inside the given term. Each computation is terminated when it produces a value. For reasons developed in the next subsection, we also need to represent invalid computations and to augment each value with a computation state. Thus, we have:

data Computation step state = Deadend
                             | Goal state
                             | Step step (Computation step state)
                             | Fork [Computation step state]
                             | Sub (Computation step ()) (Computation step state)

There is good reason to have the content of a single step as a type variable. Many views can be formulated without any knowledge of what these steps consist of, as long as there is a way to represent them. Therefore we can have different definitions of a step depending on the strategy we want to represent and the detail level we would like to include. As an example of what a single step consists of, we might
define:

type Narrowing state = Computation NarrowingStep state
data NarrowingStep   = Unfold Term
                     | Bind Int Term
                     | Fail
data Term = Term String [Term]
          | Var Int Term
          | Unevaluated

This is enough for value-oriented tools, whereas operation-oriented tools might need to include more information, like suspending goals.

Example 3.1 The evaluation of main in Example 2.1 can be represented as follows, where the value Unevaluated is abbreviated as _:

Step (Unfold (Term "main" [])) (
 Step (Unfold (Term "eq" [Term "add" [Var 1 _,Var 1 _],Term "Z" []])) (
  Sub (Step (Unfold (Term "add" [Var 1 _,Var 1 _]))
        (Fork [Step (Bind 1 (Term "Z" []))
                (Step (Unfold (Term "add" [Term "Z" [],Term "Z" []]))
                  (Step (Unfold (Term "Z" [])) (Goal ())))
              ,Step (Bind 1 (Term "S" [_]))
                (Step (Unfold (Term "add" [Term "S" [_],Term "Z" []]))
                  (Step (Unfold (Term "S" [_])) (Goal ())))]))
      (Fork [Step (Unfold (Term "eq" [Term "Z" [],Term "Z" []]))
              (Step (Unfold (Term "True" [])) (Goal ()))
            ,Step (Unfold (Term "eq" [Term "S" [_],Term "Z" []]))
              (Step Fail (Goal ()))])))
which can be shown to the user in different ways, for instance in the form of two independent proof trees, cf. also Figure 2:

main                      main
eq (add _A _A) Z          eq (add _A _A) Z
 /add _A _A                /add _A _A
 |_A\Z                     |_A\S _
 |add Z Z                  |add (S _) (S _)
 \Z                        \S _
eq Z Z                    eq (S _) Z
True                      FAIL
3.2 Generating Computations
The definition of computations above allows us to generate, combine and process computations in a monadic programming style. Computations are a combination of list and state monads. As is well known, the list monad is very expressive for non-determinism, and a state monad is useful to abstract from information which has to be updated regularly during computations. In our case, this information includes, for instance, the path for which a given subgraph has to be interpreted (cf. the discussion of the path concept above). The introduction of dead ends has the purpose of making interpretations satisfy the additional axioms of monad plus, see below. This is also very helpful when implementing interpretations. When dead ends are added, we have to exchange the original constructors Step, Fork and Sub by constructing functions step, fork and sub, which make sure that dead ends eliminate a whole sub-way up to the next fork:

step :: a -> Computation a b -> Computation a b
step x w = if noDeadend w then Step x w else Deadend

sub :: Computation a () -> Computation a b -> Computation a b
sub x w = if noDeadend x && noDeadend w then Sub x w else Deadend
The function to construct forks makes sure that each fork has at least two sub-ways:

fork :: [Computation a b] -> Computation a b
fork ws = mkFork (filter noDeadend ws)
 where mkFork []       = Deadend
       mkFork [x]      = x
       mkFork (x:y:xs) = Fork (x:y:xs)

Relative to these constructing functions, the following functions on ways satisfy the monadic axioms:

return           = Goal
(Step x w) >>= b = step x (w >>= b)
(Fork ws)  >>= b = fork (map (>>= b) ws)
(Sub d w)  >>= b = sub d (w >>= b)
Deadend    >>= _ = Deadend
(Goal o)   >>= b = b o

Computations also satisfy the additional axioms of monad plus:

mzero                = Deadend
mplus Deadend     w  = w
mplus (Goal o)    w  = if noDeadend w then w else Goal o
mplus (Step x w1) w2 = step x (mplus w1 w2)
mplus (Fork ws)   w2 = fork (map (flip mplus w2) ws)
mplus (Sub d w1)  w2 = sub d (mplus w1 w2)
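In a Haskell setting, these operations could also be packaged as the usual class instances. The following is a sketch of ours (the paper defines the operations directly, since Curry provides no type classes), reusing the Computation type and the constructing functions step, fork and sub from above:

bindC :: Computation s a -> (a -> Computation s b) -> Computation s b
bindC (Step x w) b = step x (w `bindC` b)
bindC (Fork ws)  b = fork (map (`bindC` b) ws)
bindC (Sub d w)  b = sub d (w `bindC` b)
bindC Deadend    _ = Deadend
bindC (Goal o)   b = b o

instance Functor (Computation s) where
  fmap f m = m `bindC` (Goal . f)

instance Applicative (Computation s) where
  pure      = Goal
  mf <*> mx = mf `bindC` \f -> fmap f mx

instance Monad (Computation s) where
  (>>=) = bindC

mzero and mplus can be wrapped analogously in an Alternative/MonadPlus instance.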
It is straightforward to ensure that the monadic laws for these definitions indeed hold with respect to the constructing functions. The huge advantage of this technique lies in the way it allows us to abstract from the details of both the non-determinism and the manipulation of the state. For instance, if we interpret a given node of the trace graph, we can proceed like this:

interpretNodes :: [TraceGraph] -> State -> Narrowing State
interpretNodes [Node _ _ _ successors info] =
  interpretInfo info >>= interpretNodes successors

We do not have to care about whether the interpretation of the successors yields a deterministic sequence of steps or whether there will be forks in the result. The operator (>>=) automatically makes sure that the interpretation of the successor is added to all branches when necessary. Likewise, if we wish to make sure that the node we interpret is compatible with the current path (which is part of the state), we can define:

interpretNodes :: [TraceGraph] -> State -> Narrowing State
interpretNodes [Node _ p _ successors info] =
  ensurePath p >>= interpretInfo info >>= interpretNodes successors

ensurePath :: Path -> State -> Narrowing State
ensurePath p st = if p

observe :: Observer a -> String -> a -> a

The function observe behaves as an identity on its third argument. Additionally, it generates, as a hidden side effect, a trace file representing the evaluated part of the observed data. To distinguish different observations from each other, observe takes a label as its second argument. After program termination (including run-time errors and aborts), all observations are presented to the user, with respect to their different labels. Finally, observe demands an observer as its first argument, which defines the specific observation behavior for the type of the value observe is applied to. For each predefined type τ such an observer is defined as oτ. For example, for expressions of type Int the observer oInt should be used, and for [Int] the observer oList oInt. Note that observers for polymorphic type constructors
(e.g., []) are functions taking as many arguments as the type constructor has. The explicit annotation of the observer for each type is necessary since Curry, in contrast to Haskell, does not provide type classes, which hide these observers from the user in HOOD. However, there is also a benefit of these explicit annotations: it is possible to use different observers for the same type, which allows selective masking of substructures in large observed data structures, e.g., by the predefined observer oOpaque [1] which presents every data structure by the symbol #.

As a small example, we consider a function which computes all sublists of a given list (here with elements of type Int):

sublists :: [Int] -> [[Int]]
sublists xs = let (ready,extend) = sublists' xs
              in ready++extend

sublists' :: [Int] -> ([[Int]],[[Int]])
sublists' []     = ([[]],[[]])
sublists' (x:xs) = let (ready,extend) = sublists' xs
                   in (ready++extend,[x]:map (x:) extend)

The idea is to distinguish lists which are already closed sublists from lists which may still be extended with the current list element x. Unfortunately, this program contains a little bug. sublists [1,2,3] yields:

[[],[],[3],[3],[2],[2,3],[2,3],[1],[1,2],[1,2,3],[1,2,3]]

Some elements occur twice in the result. To find this bug, we first observe the pair of results of sublists' in sublists and obtain:

sublists xs = let (ready,extend) = observe (oPair (oList (oList oInt))
                                                  (oList (oList oInt)))
                                           "result" (sublists' xs)
              in ready++extend

result
------
([[],[],[3],[3],[2],[2,3],[2,3]],[[1],[1,2],[1,2,3],[1,2,3]])

The bug seems to result from the first pair component, because the replication appears here. Hence, we observe this component within the right-hand side of sublists' and obtain:

sublists' (x:xs) = let (ready,extend) = sublists' xs
                   in (observe (oList (oList oInt)) "first component"
                               (ready++extend),
                       [x]:map (x:) extend)

first component
---------------
[[],[]]
[[],[],[3],[3]]
[[],[],[3],[3],[2],[2,3],[2,3]]
This observation still shows the bug, but does not help to locate it, since we cannot distinguish the values of ready and extend. A better observation point would have been the result of sublists' during the recursion. Hence, we again change the source code and add an observer at another place:

sublists' (x:xs) = let (ready,extend) = observe (oPair (oList (oList oInt))
                                                       (oList (oList oInt)))
                                                "sublists'" (sublists' xs)
                   in (ready++extend, [x]:map (x:) extend)

sublists'
---------
([[]],[[]])
([[],[]],[[3],[3]])
([[],[],[3],[3]],[[2],[2,3],[2,3]])

In the second line of this observation, we see that the bug is located in the second pair component. Thinking about this observation, we see that the expression [x]:map (x:) extend adds the list [3] twice, since the empty list is contained in extend. The bug is located in the base case, which should be corrected to:

sublists' [] = ([[]],[])

Observing data structures can help find a bug. However, a program consists of functions, and it is often more interesting to observe functions, which COOSY provides as well. Observers for functions can be constructed by means of the right-associative operator:

(~>) :: Observer a -> Observer b -> Observer (a -> b)

In our example, we could have used a functional observer to observe the recursive calls of sublists':

sublists' (x:xs) = let (ready,extend) = observe (oList oInt ~>
                                                 oPair (oList (oList oInt))
                                                       (oList (oList oInt)))
                                                "sublists'" sublists' xs
                   in (ready++extend, [x]:map (x:) extend)

sublists'
---------
[] -> ([[]],[[]])
[3] -> ([[],[]],[[3],[3]])
[2,3] -> ([[],[],[3],[3]],[[2],[2,3],[2,3]])

In this observation it is also possible to detect the bug, and in practice it is often easier to find bugs by observing functions. However, in larger programs it is still an iteration of adding and removing observers to find the location of a bug, similar to the debugging session sketched for the sublists example. A tool which supports the
Fig. 1. A tree of all program expressions in the main window
programmer in adding and removing observers is desired.
3 Tree Presentation of a Program
COOiSY is a small portable Curry program that provides a graphical debugging interface for Curry programs. It uses the meta-programming library of Curry [3] and presents the whole program as a tree which contains the functions, data structures and all defined subexpressions of the program that may need to be observed for finding bugs. By means of Curry's Tcl/Tk library [5,9] we provide convenient access to this tree. By default, all functions of a program (module) are available. On selection of a corresponding rule, the user can access the right-hand side of a function definition and descend into the tree representing all its subexpressions. On the other hand, for a concise presentation, local definitions and all subexpressions are initially hidden and can be opened on demand by the user. She/he can also select and deselect arbitrary expressions for being observed. Let us consider the following simple program that reverses a list:

reverse :: [Int] -> [Int]
reverse []     = []
reverse (x:xs) = reverse xs ++ [x]

The function reverse is defined by two rules, which we present as reverse(1) and reverse(2) to the user. Each of these rules can be selected for observation. All
expressions within each of these rules have to be presented in the tree. In the right-hand side of the first rule the only expression is the empty list; in other words, only the expression [] is presented to the user. The second rule contains three function calls: (++), reverse and (:). Each function takes two expressions as parameters. Furthermore, every function call itself and all partial applications are represented (see Figure 1). An expression is selected by a simple mouse-click; COOiSY then automatically adds the necessary observe calls to the source code, as described in the following section, and automatically loads the changed program in a new PAKCS shell, in which the programmer can pose queries to the program. The progress of the observations is then shown automatically in separate viewers, which are distinguished by the different labels they belong to.
4 Automatic Observations in COOiSY
In this section we show how COOiSY helps programmers during debugging by automatically adding the necessary observers to selected expressions and functions.
4.1 Observing Functions
The most important feature of a convenient observation tool is the observation of top-level functions. In Curry, these functions can be defined by one or more rules and may behave non-deterministically. The idea of observing such a function is that every call to this function is observed. The easiest way to realize this behavior is to add a wrapper function which adds the observation to the original function. In our example from Section 2, an observation of all calls to sublists' can be obtained as follows:

sublists' = observe (oList oInt ~> oPair (oList (oList oInt))
                                         (oList (oList oInt)))
                    "sublists'" helpSublists'
  where
    helpSublists' []     = ([[]],[[]])
    helpSublists' (x:xs) = let (ready,extend) = sublists' xs
                           in (ready++extend,[x]:map (x:) extend)

Note that we reuse the original function name for the wrapper function. By leaving the recursive calls in the right-hand sides of the original function definition unchanged, we guarantee that COOiSY observes each application of sublists'. This technique can also be applied to locally defined functions and is provided by our tool.
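For instance, for the reverse function of Section 3 the generated wrapper would follow the same scheme. The following is our own rendering of that scheme; the code actually generated by COOiSY may differ in details:

reverse :: [Int] -> [Int]
reverse = observe (oList oInt ~> oList oInt) "reverse" helpReverse
  where helpReverse []     = []
        helpReverse (x:xs) = reverse xs ++ [x]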
4.2 Observing Data Types
The most problematic part of using COOSY (especially for beginners) is the definition of observers for newly introduced data types, although COOSY provides useful
abstractions for this task. For every user-defined data type, corresponding observers have to be defined in order to observe values of this type. Our tool provides an automatic derivation of these observers, not only for data types defined in the program, but also for data types which are imported from other modules. We sketch the idea by means of an example. Consider the data type for natural numbers:

data Nat = O | S Nat

It defines two constructors, the constructor O :: Nat with arity 0 and the constructor S :: Nat -> Nat with arity 1. The observer for each type τ (e.g., Int) should be available as a function oτ (e.g., oInt). Hence, we define an observer oNat. COOSY already provides generic observers o0, o1, o2, o3, . . ., by which the observer oNat can easily be defined as follows:

oNat :: Observer Nat
oNat O     = o0 "O" O
oNat (S x) = o1 oNat "S" S x

For polymorphic data types an observer needs observers for the polymorphic arguments as well, like oList. The construction should become clear from the following example:

data Tree a b = Branch a b [Tree a b] [Tree a b]

oTree :: Observer x1 -> Observer x2 -> Observer (Tree x1 x2)
oTree oa ob (Branch x1 x2 x3 x4) =
  o4 oa ob (oList (oTree oa ob)) (oList (oTree oa ob))
     "Branch" Branch x1 x2 x3 x4

In this way, generic observers for all data structures defined in the program are generated and added automatically to the program. These observers are used for observations of functions and expressions over values of these types. The same method is applied to imported data types, so that the observers for the data types of all imported modules can be generated and automatically imported into the program. However, polymorphism brings up a problem: how can polymorphic functions be observed? Such a function can be used in different type instantiations. Hence, the only observer we can assign to its polymorphic arguments is oOpaque.
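As a further illustration of the derivation scheme (our example, not output of the tool), the observer generated for the standard Maybe type would be:

oMaybe :: Observer a -> Observer (Maybe a)
oMaybe _  Nothing  = o0 "Nothing" Nothing
oMaybe oa (Just x) = o1 oa "Just" Just x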
4.3 Observing Expressions
Sometimes it is not sufficient to observe the functions defined in a program; observations of subexpressions in the right-hand sides of rules can become necessary to find a bug. Therefore, a user can also select (sub-)expressions from the right-hand sides of function definitions. For this purpose COOiSY provides a tree representation of the whole program, in which the user can select arbitrary (sub-)expressions of the program to be observed. Corresponding calls of observe are added automatically, as presented in Section 2. The type of each selected expression is inferred and a corresponding observer is generated. COOiSY also automatically generates labels for the observers, which helps the programmer to later identify the observations. Top-level functions are simply
labeled with their name. Local functions are labeled with a colon-separated list of the function names leading to the declaration of the observed function. Finally, expressions are labeled with the corresponding function name and a string representation of the selected expression.
4.4 Observing Imported Modules
COOiSY supports adding observations to different modules of a project. When the user selects functions or expressions of a module to be observed, a new module is generated which contains the observer calls. Since observer calls in (even indirectly) imported modules must also be executed, COOiSY can check for each imported module whether an observer version is available and use it for execution.
5 Design of COOiSY
With the advance of modern computer technology, distributed programming is becoming more and more popular. Instead of storing huge amounts of data redundantly in many places, one uses a client/server architecture in which, typically, many clients are connected to many servers. For the communication between a client and a server in Curry, TCP communication can be used. In this section we briefly review the client/server architecture that COOiSY uses to show the observation steps in separate viewing tools.
5.1 Architecture
Originally in COOSY, each time the computation made progress, the information about observed values was recorded as events in a separate trace file. There are two kinds of events to distinguish unevaluated expressions from failed or non-terminated computations: Demand and Value. A demand event shows that a value is needed for the computation; a value event shows that the computation of a value has succeeded [1]. These events were shown with a textual visualization in the viewer of COOSY. In COOiSY, instead, we use a socket establishing a connection between the main window of the tool and the observed application (the PAKCS system). All events of the observed application are sent to COOiSY's main window. Each event contains the label of the observation it belongs to. Using this architecture we can also
Fig. 2. A socket connecting the observed application and the main window
present observations in a single-step mode, which helps beginners to understand the evaluation order of a computation. In this mode, the client waits for an acknowledging message from the server before the computation continues; by pressing the forward button, the user triggers this acknowledgement, and the next message is sent to the server (see Figure 2). The messages received by the server are forwarded to the trace windows, each of which shows
all observations of a particular label. When started, each trace window creates a socket and waits to receive the events from the main window, see Figure 3.
Fig. 3. Sockets to connect the main window and the trace windows
In Section 2 we have seen that for each observed function or expression a label is needed to match the observed values with the observed expressions. These labels group the progress of the execution for each observed expression into separate trace windows. Each of these windows, which is named with the corresponding label, receives messages from the main window through a socket. By sending events, the observed values are then shown to the programmer with a textual visualization in the related trace window. The programmer may conveniently arrange these windows on her/his screen and even close observations she/he is no longer interested in.
5.2 Surfing Observation
Originally in COOSY, the information about the observed expressions was recorded as a list of events in a trace file, which was intended to be inspected after the program execution. That means the programmer could observe the evaluation of selected expressions only after the execution had terminated. Our aim in the new version (COOiSY) is to also show intermediate steps of the evaluation, with the possibility of forward and backward stepping. For this purpose we use TCP communication and change the list data structure to a tree structure which is stored in a dynamic predicate [6]. Dynamic predicates are similar to external functions, whose code is not contained in the program but is dynamically computed/extended, like the meta-predicates assert and retract in Prolog. In COOSY, the generation of the observations for each observation label contained in the trace file works in a bottom-up manner. For a continuous update of observed data terms this algorithm is of no use, since in each step the whole observed data structure would have to be re-constructed. We need a top-down algorithm which allows extensions in all possible leaf positions of the presented data structure. For instance, during the computation non-evaluated arguments (represented by underscores) may flip into values, but values within a data structure will not change anymore. However, we must consider the non-determinism within Curry, by which values may later be related to different non-deterministic computations. Our new representation of events is stored in the following dynamic predicate:
TreesTable :: [([Index],EvalTree)] -> Dynamic
TreesTable = dynamic

data EvalTree = Open Int
              | Value Arity String Index [EvalTree]
              | Demand ArgNr Index [EvalTree]
              | Fun Index [EvalTree]
              | LogVar Index [EvalTree]

type Index = Int
type ArgNr = Int
type Arity = Int
The dynamic predicate TreesTable is a kind of global state, accessible and modifiable within the whole program. The indices represent all nodes occurring in the corresponding evaluation tree (EvalTree), with respect to the order in which they were added. This is necessary since the evaluation order is not statically fixed. In Section 5 we have seen that events are sent via a socket connection from the main window of COOiSY to each trace window. Each event contains a logical parent showing in which order values are constructed. Hence, within one non-deterministic branch the index list [Index] is extended in its head position whenever the evaluation tree is extended, usually in an Open leaf. If part of a value is used in more than one non-deterministic computation, then the logical parent indicates which part of an evaluation tree is shared between two non-deterministic computations. We only consider the subtree consisting of the nodes from the index list up to the logical parent of the new event. This subtree, with the corresponding indices, is copied as an additional evaluation tree to the global TreesTable.

As an example we consider the following simple program that performs a non-deterministic computation:

add :: Int -> Int -> Int
add x y = x + y

main = add (0?1) 1

The expression (0?1) yields either 0 or 1. That means the function main offers two different results, 0+1 and 1+1. For the first result, after selecting the function add to be observed, ten events are sent from the observed application to the main window. The first event is a Demand that is stored in the above defined dynamic tree with index 0 as:

[([0], Demand 0 0 [(Open 1)])]

The second received message is a Fun event with the logical parent 0 that should be substituted in the Open subtree of its parent:

[([1,0], Demand 0 0 [Fun 1 [(Open 1),(Open 2)]])]

This Fun event is stored as a node with two subtrees representing the argument and the result of the corresponding function. Functions are represented in curried form,
i.e., the result of a function with arity two is again a function. After adding the remaining eight events to the evaluation tree we obtain

[([9,8,7,6,5,4,3,2,1,0] ,
        Demand0
           |
          Fun1
         /    \
    Demand5    Demand2
       |          |
    Value6       Fun3
       |        /    \
      "0"  Demand7    Demand4
              |          |
           Value8      Value9
              |          |
             "1"        "1"        )]

which is shown to the user as {\0 1 -> 1}. The next incoming event is a value event with the logical parent 5. This index does not occur in the head position of the index list. Hence, we detect a non-deterministic computation. The observed value of this computation shares the nodes with indices 0 to 5 with the first tree. Hence, we copy this part and extend it with the new event 10, which means the same function is called with another argument (1) in this non-deterministic branch. After adding three further events we obtain

([13,12,11,10,5,4,3,2,1,0] ,
        Demand0
           |
          Fun1
         /    \
    Demand5    Demand2
       |          |
    Value10      Fun3
       |        /    \
      "1"  Demand11   Demand4
              |          |
           Value12     Value13
              |          |
             "1"        "2"        )

which is shown to the user as {\1 1 -> 2}. This method helps us to provide fast pretty printing for each intermediate step of the observations in the trace windows. Furthermore, we can present shared parts of the evaluation trees in the second presentation in a lighter color, which helps to understand non-deterministic computations. Figure 4 shows the last four steps of the example. While the underscore represents a non-evaluated expression, the exclamation mark stands for an initiated but not yet finished computation. By storing the number of incoming events in a list, we can also perform backward and forward stepping through the observations presented in one observation window, by filtering the TreesTable with respect to a subset of the considered indices.
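The selection of the shared part of an index list can be sketched as follows (our illustration): since indices are stored newest-first, the nodes shared with a new event whose logical parent is p are exactly those from p downwards.

sharedIndices :: Int -> [Int] -> [Int]
sharedIndices p = dropWhile (/= p)

-- for the example above:
-- sharedIndices 5 [9,8,7,6,5,4,3,2,1,0]  ==  [5,4,3,2,1,0]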
Fig. 4. A trace window
For removing the evaluated trees from the TreesTable, we have defined a clear button in each trace window. Furthermore, when the observed program is restarted, the TreesTable is cleared automatically.
6 Executed Parts of the Program
In some cases programmers prefer to follow the order of the program execution, instead of observing functions, to see the program behavior during the execution. Furthermore, knowing which parts of the program have been executed is valuable information for the programmer, because it restricts the possible locations of a bug to the executed parts. Observers should only be added to executed code. COOiSY provides such a feature, which can also be useful for testing small separate functions of a program and for focusing the observation on a small part of the program. Another nice feature of constantly showing the executed parts of a program is that, in case the program yields No more solutions, the last marked expression usually shows where the computation finally failed. In many cases, the last marked expression determines the reason for an unexpected program failure or run-time error. To keep the result view of our tool small (cf. Figure 5), we take the following artificial program as an example:

test :: [Int] -> [Int]
test xs = bug xs ++ okay xs

bug :: [Int] -> [Int]
bug [] = []

okay :: [Int] -> [Int]
okay xs = xs

The function bug represents a failing computation which might be much more complex in a real application. COOiSY's presentation of the execution of test [1] is shown in Figure 5. The program is again represented as a tree (Section 3), in which executed parts are marked green and the last executed expression is marked red. We can see that the function okay is never applied. The bug may either be located in the application of bug to xs or within the function bug. Furthermore, we can see
that the program finally failed when the function bug was applied to xs.
Fig. 5. Marking the executed part of the program
The viewer also shows how many times the executed functions have been called. For a non-terminating computation, this information can be helpful to find the non-terminating recursion. For marking expressions, COOiSY adds calls to the function markLineNumber, applied to the position of the current expression in a flat tree of the Curry program:

markLineNumber :: String -> Int -> a -> a

To distinguish the expressions of imported modules from those of the main module, the function takes the name of the current module as its first argument. The second argument is the position of the current expression in a flat tree representing the whole program, and the third argument is the executed expression, on which markLineNumber behaves as an identity function. When this function is executed, the first and second argument are sent as a message from the executed application to the main window of COOiSY (Section 5). In the main window process, the message triggers the marking of the corresponding expression in the viewer, besides showing the observation steps, with the ability of backward and forward stepping over the marked expressions. This technique is a light-weight implementation of program slicing as defined in [8]. Furthermore, it will be interesting to investigate how this kind of slicing can be used to improve debugging, as done in [2].
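To illustrate the transformation, an instrumented version of the test function from above could look like the following sketch. The module name and the flat-tree positions used here are hypothetical; the actual numbering is internal to COOiSY:

test :: [Int] -> [Int]
test xs = markLineNumber "Main" 1
            (markLineNumber "Main" 2 (bug xs)
             ++ markLineNumber "Main" 3 (okay xs))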
7 Related Work
COOiSY is an observational debugger for the functional logic language Curry which extends Gill's idea (HOOD) [4] of observing the data structures of program expressions. It is an improvement of COOSY, which covers all aspects of modern functional logic languages such as lazy evaluation, higher-order functions, non-deterministic search, logical variables, concurrency and constraints. COOiSY offers a comfortable graphical user interface that helps the user to conveniently and automatically observe the evaluation of arbitrary program expressions. It displays the observation steps as a comprehensive summary, based on pretty printing.
The graphical visualization of HOOD (GHOOD) [10] also uses Gill's idea of observing the expressions of a program. In contrast to HOOD's comprehensive summary as a textual visualization, GHOOD offers a graphical visualization based on a tree-layout algorithm which displays the structure of the selected expression of a program as a tree. However, in GHOOD observers also have to be added manually, which still means more effort than using COOiSY. For obtaining a suitable overview of large programs, GHOOD offers a graphical visualization instead of textual information. This is nice for educational purposes. However, for real applications the textual representation seems more appropriate, and we decided to keep COOSY's textual representation within COOiSY. As an improvement, we present the trace for each selected expression in a separate window, which the user can conveniently move or even close within his graphical environment. Also related to debugging lazy languages is the Haskell tracer Hat [11]. It is based on tracing the whole execution of a program, combined with different viewing tools supporting users in conveniently analyzing the recorded trace. Although Hat is a powerful debugging tool, there are also some disadvantages of Hat compared to observation-based debugging:
• Hat is restricted to a subset of Haskell. Extensions of Haskell cannot be covered easily, and Hat cannot be used at all to analyze programs using such extensions.
• During the execution, large trace files are generated, which may considerably slow down the use of the tracer for debugging real applications.
These disadvantages do not hold for COOiSY, which is still light-weight and works independently of Curry extensions (at least for those parts of a program not using the extension). On the other hand, having the whole program as a data structure in COOiSY, some more global information, as in Hat, can be computed (like the line information discussed in Section 6). However, COOiSY is supposed to stay a light-weight and easy-to-use debugger.
8 Conclusion
Sometimes it is hard to figure out what caused an unexpected output or program failure. A well-implemented, easy-to-use debugger is needed to help the programmer find the position of the error in the program quickly and easily. We have extended the Curry Object Observation System [1] to a new version, the Curry Object Observation Interactive System, which provides a comfortable graphical interface. It helps the programmer to observe data structures or functions of arbitrary expressions of her/his program in order to find bugs. Using COOiSY is very simple and should be accessible to beginners, which we want to investigate in our next lectures about declarative programming. Distributed programming helps us to send the information about the observed expressions through a socket and to show each computed expression in a trace window in parallel. The trace windows separate the display of the observation steps for the selected expressions and offer an understandable result for programmers. The information about observed expressions/functions is collected in each trace window, and the ability of going forward and backward over the collected information is provided
for the programmer. The programmer does not need to add annotations to her/his program to observe the desired expressions; these annotations are added automatically by COOiSY. A tree containing all program expressions (i.e., global and local functions, patterns, variables and all subexpressions) is provided for the programmer. Each selection in this tree causes COOiSY to write the annotations into an extra file automatically, without changing the original program. Larger projects consisting of different modules are supported as well. For future work, we want to improve the observation of polymorphic functions by generating specialized versions for each usage of an observed polymorphic function. Furthermore, we plan to investigate how our tool can also be used as a platform for other development tools for Curry, such as refactoring, test environments and program analysis. Another possible line of future work results from the fact that our tool holds a lot of meta-information about the debugged programs. Hence, it could be possible to add observations to many (or all) program functions automatically and to derive information about the connections between different observations, which may improve debugging.
References

[1] B. Braßel, O. Chitil, M. Hanus, and F. Huch. Observing functional logic computations. In Proc. of the Sixth International Symposium on Practical Aspects of Declarative Languages (PADL'04), pages 193–208. Springer LNCS 3057, 2004.
[2] Olaf Chitil. Source-based trace exploration. In Clemens Grelck, Frank Huch, Greg J. Michaelson, and Phil Trinder, editors, Implementation and Application of Functional Languages, 16th International Workshop, IFL 2004, LNCS 3474, pages 126–141. Springer, March 2005.
[3] M. Hanus et al. PAKCS: The Portland Aachen Kiel Curry System, 2004.
[4] Andy Gill. Debugging Haskell by observing intermediate data structures. Electr. Notes Theor. Comput. Sci., 41(1), 2000.
[5] M. Hanus. A functional logic programming approach to graphical user interfaces. In PADL '00: Proceedings of the Second International Workshop on Practical Aspects of Declarative Languages, pages 47–62, London, UK, 2000. Springer-Verlag.
[6] M. Hanus. Dynamic predicates in functional logic programs. In Journal of Functional and Logic Programming, volume 5. EAPLS, 2004.
[7] M. Hanus. Curry: An integrated functional logic language, 2006.
[8] C. Ochoa, J. Silva, and G. Vidal. Lightweight Program Specialization via Dynamic Slicing. In Proc. of the Workshop on Curry and Functional Logic Programming (WCFLP 2005), pages 1–7. ACM Press, 2005.
[9] John K. Ousterhout. Tcl and the Tk Toolkit. Addison Wesley Longman, Inc., 1998.
[10] Claus Reinke. GHood – Graphical Visualisation and Animation of Haskell Object Observations. In Ralf Hinze, editor, ACM SIGPLAN Haskell Workshop, Firenze, Italy, volume 59 of Electronic Notes in Theoretical Computer Science, page 29. Elsevier Science, September 2001. Preliminary proceedings have appeared as Technical Report UU-CS-2001-23, Institute of Information and Computing Sciences, Utrecht University. Final proceedings to appear in ENTCS.
[11] Malcolm Wallace, Olaf Chitil, Thorsten Brehm, and Colin Runciman. Multiple-view tracing for Haskell: a new Hat. In Ralf Hinze, editor, Preliminary Proceedings of the 2001 ACM SIGPLAN Haskell Workshop, pages 151–170, Firenze, Italy, September 2001. Universiteit Utrecht UU-CS-2001-23. Final proceedings to appear in ENTCS 59(2).
107
108
WFLP 2006
Static Slicing of Rewrite Systems
1
Diego Cheda2 Josep Silva2 Germ´an Vidal2 DSIC, Technical University of Valencia Camino de Vera S/N, 46022 Valencia, Spain
Abstract Program slicing is a method for decomposing programs by analyzing their data and control flow. Slicingbased techniques have many applications in the field of software engineering (like program debugging, testing, code reuse, maintenance, etc). Slicing has been widely studied within the imperative programming paradigm, where it is often based on the so called program dependence graph, a data structure that makes explicit both the data and control dependences for each operation in a program. Unfortunately, the notion of “dependence” cannot be easily adapted to a functional context. In this work, we define a novel approach to static slicing (i.e., independent of a particular input data) for first-order functional programs which are represented by means of rewrite systems. For this purpose, we introduce an appropriate notion of dependence that can be used for computing program slices. Also, since the notion of static slice is generally undecidable, we introduce a complete approximation for computing static slices which is based on the construction of a term dependence graph, the counterpart of program dependence graphs. Keywords: Program slicing, rewrite systems
1
Introduction
Program slicing [13] is a method for decomposing programs by analyzing their data and control flow. Roughly speaking, a program slice consists of those program statements which are (potentially) related with the values computed at some program point and/or variable, referred to as a slicing criterion. In imperative programming, slicing criteria are usually given by a pair (program line, variable). Example 1.1 Consider the program in Figure 1 to compute the number of characters and lines of a text. A slice of this program w.r.t. the slicing criterion (12, chars) would contain the black sentences (while the gray sentences are discarded). This slice contains all those parts of the program which are necessary to compute the value of variable chars at line 12. 1
This work has been partially supported by the EU (FEDER) and the Spanish MEC under grant TIN200509207-C03-02, by the ICT for EU-India Cross-Cultural Dissemination Project ALA/95/23/2003/077-054, by LERNet AML/19.0902/97/0666/II-0472-FA and by the Vicerrectorado de Innovaci´ on y Desarrollo de la UPV under project TAMAT ref. 5771. 2 Email: {dcheda,jsilva,gvidal}@dsic.upv.es
This paper is electronically published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs
Cheda, Silva, Vidal
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
lineCharCount(str) i:=1; lines:=0; chars:=0; while (i | c(πk ) where c/k ∈ C is a constructor symbol of arity k ≥ 0, ⊥ denotes a subexpression of the value whose computation is not relevant and > a subexpression which is relevant. Slicing patterns are similar to the liveness patterns which are used to perform dead code elimination in [7]. Basically, they are abstract terms that can be used to denote the shape of a constructor term by ignoring part of its term structure. For instance, given the constructor term C(A, B), we can use (among others) the following slicing patterns >, ⊥, C(>, >), C(>, ⊥), C(⊥, >), C(⊥, ⊥), C(A, >), C(A, ⊥), C(>, B), C(⊥, B), C(A, B), depending on the available information and the relevant fragments of C(A, B). Given a slicing pattern π, the concretization of an abstract term is formalized by means of function γ, so that γ(π) returns the set of terms that can be obtained from π by replacing all occurrences of both > and ⊥ by any constructor term. This usually leads to an infinite set, e.g., γ(C(A, >)) = γ(C(A, ⊥)) = {C(A, A), C(A, B), C(A, D), C(A, C(A, A)), C(A, C(A, B)), . . .}. Definition 3.9 (slicing criterion) Given a program P, a slicing criterion for P is a pair (f, π) where f is a function symbol and π is a slicing pattern. Now, we introduce our notion of static slice. Basically, given a slicing criterion (f, π), a program slice is given by the set of program positions of those terms in the program that affect the computation of the relevant parts—according to π—of the possible values of function f . Formally, Definition 3.10 (slice) Let P be a program and (f, π) a slicing criterion for P. Let Pπ be the set of positions of π which do not address a symbol ⊥. Then, the slice of P w.r.t. (f, π) is given by the following set of program positions: [
{(k, w) | (k, w) ∈ P, tP
f
v|q , v ∈ γ(π) and q ∈ Pπ }
Observe that a slice is a subset of the program positions of the original program that uniquely identifies the program (sub)terms that belong to the slice. Example 3.11 Consider again the program of Example 3.5 and the slicing pattern (main, C(>, ⊥)). •
The concretizations of C(>, ⊥) are γ(C(>, ⊥)) = {C(A, A), C(A, B), C(A, D), C(B, A), C(B, B), C(B, D), C(D, A), C(D, B), C(D, D), . . .}.
•
The set of positions of C(>, ⊥) which do not address a symbol ⊥ are Pπ = {Λ, 1}.
•
Clearly, only the value C(D, B) ∈ γ(C(>, ⊥)) is computable from main.
•
Therefore, we are interested in the program positions of those terms such that either C(D, B) (i.e., C(D, B)|Λ ) or D (i.e., C(D, B)|1 ) depend on them.
•
The only computations from main to a concretization of C(>, ⊥) (i.e., to a term 115
Cheda, Silva, Vidal
from γ(C(>, ⊥))) are thus the following:
D1 : main{} →Λ,R1 C{(R1,Λ)} (f{(R1,1)} (A{(R1,1.1)} ), g{(R1,2)} (B{(R1,2.1)} )) →1,R2 C{(R1,Λ)} (D{(R2,Λ)} , g{(R1,2)} (B{(R1,2.1)} )) →2,R3 C{(R1,Λ)} (D{(R2,Λ)} , B{(R3,Λ),(R1,2.1)} ) D2 : main{} →Λ,R1 C{(R1,Λ)} (f{(R1,1)} (A{(R1,1.1)} ), g{(R1,2)} (B{(R1,2.1)} )) →2,R3 C{(R1,Λ)} (f{(R1,1)} (A{(R1,1.1)} ), B{(R3,Λ),(R1,2.1)} ) →1,R2 C{(R1,Λ)} (D{(R2,Λ)} , B{(R3,Λ),(R1,2.1)} ) In this example, it suffices to consider only one of them, e.g., the first one. Now, in order to compute the existing dependences, we show the possible suffixes of D1 together with their associated subreductions (here, for clarity, we ignore the program positions):
Suffix : C(f(A), g(B)) →1,R2 C(D, g(B)) →2,R3 C(D, B) Subreductions : • C(f(•), g(B)) →2,R3 C(f(•), B) C(•, g(B))
→2,R3 C(•, B)
C(f(A), g(•)) →1,R2 C(D, g(•)) →2,R3 C(D, •) C(f(A), •)
→1,R2 C(D, •)
Suffix : C(D, g(B))
→2,R3 C(D, B)
Subreductions : • C(•, g(B))
→2,R3 C(•, B)
C(D, g(•))
→2,R3 C(D, •)
C(D, •) Suffix : C(D, B) Subreductions : • C(•, B) C(D, •)
Therefore, we have the following dependences (we only show the program positions of the root symbols, since these are the only relevant program positions for 116
Cheda, Silva, Vidal
f f
C
g
B
B
C
A
A
B
B
Fig. 4. Tree terms of f(C(A, B)) and f(C(A, B), g(B, B))
computing the slice): · From the first suffix and its subreductions: C{(R1,Λ)} (f(A), g(B))
main
C(D, B)
A{(R1,1.1)}
main
D
f{(R1,1)} (A)
main
D
· From the second suffix and its subreductions: C{(R1,Λ)} (D, g(B))
main
C(D, B)
D{(R2,Λ)}
main
D
· From the third suffix and its subreductions: C{(R1,Λ)} (D, B)
main
C(D, B)
D{(R2,Λ)}
main
D
Therefore, the slice of the program w.r.t. (main, C(>, ⊥)) returns the following set of program positions {(R1, Λ), (R1, 1), (R1, 1.1), (R2, Λ)}. Clearly, the computation of all terms that depend on a given constructor term is undecidable. In the next section, we present a decidable approximation based on the construction of a graph that approximates the computations of a program.
4
Term Dependence Graphs
In this section, we sketch a new method for approximating the dependences of a program which is based on the construction of a data structure called term dependence graph. We first introduce some auxiliary definitions. Definition 4.1 (Tree term) We consider that terms are represented by trees in the usual way. Formally, the tree term T of a term t is a tree with nodes labeled with the symbols of t and directed edges from each symbol to the root symbols of its arguments (if any). For instance, the tree terms of the terms f(C(A, B)) and f(C(A, B), g(B, B)) are depicted in Fig. 4. We introduce two useful functions that manipulate tree terms. First, function Term from nodes to terms is used to extract the term associated to the subtree 117
Cheda, Silva, Vidal (R1,/\) main
(R1,1)
(R3,/\)
(R2,/\)
C
f
D
g
x
(R1,2) f
g
A
B
(R1,1.1)
A
x
(R1,2.1)
Fig. 5. Term dependence graph of the program in Example 3.5
whose root is the given node of a tree term: n if n has no children in T Term(T, n) = n(Term(T, n )) if n has k children n in T k k Now, function Term abs is analogous to function Term but replaces inner operationrooted subterms by fresh variables: n if n has no children in T Term abs (T, n) = n(Term 0 (T, n )) if n has k children n in T abs
Term
0
abs (T, n)
=
x
k
k
if n is a function symbol, where x is a fresh variable
Term abs (T, n) otherwise Now, we can finally introduce the main definition of this section. Definition 4.2 (Term dependence graph) Let P be a program. A term dependence graph for P is built as follows: (i) the tree terms of all left- and right-hand sides of P belong to the term dependence graph, where edges in these trees are labeled with S (for Structural); (ii) we add an edge, labeled with C (for Control), from the root symbol of every left-hand side to the root symbol of the corresponding right-hand side; (iii) finally, we add an edge, labeled with C, from every node n of the tree term Tr of the right-hand side of a rule to the node m of the tree term Tl of a left-hand side of a rule whenever Term abs (Tr , n) and Term(Tl , m) unify. Intuitively speaking, the term dependence graph stores a path for each possible computation in the program. A similar data structure is introduced in [1], where it is called graph of functional dependencies and is used to detect unsatisfiable computations by narrowing [11]. Example 4.3 The term dependence graph of the program of Example 3.5 is shown in Fig. 5. 5 Here, we depict C arrows as solid arrows and S arrows as dotted arrows. Observe that only the symbols in the right-hand sides of the rules are labeled with program positions. 5
For simplicity, we make no distinction between a node and the label of this node.
118
Cheda, Silva, Vidal
Clearly, the interest in term dependence graphs is that we can compute a complete program slice from the term dependence graph of the program. Usually, the slice will not be correct since the graph is an approximation of the program computations and, thus, some paths in the graph would not have a counterpart in the actual computations of the program. Algorithm 1 Given a program P and a slicing criterion (f, π), a slice of P w.r.t. (f, π) is computed as follows: (i) First, the term dependence graph of P is computed according to Def. 4.2. For instance, we start with the term dependence graph of Fig. 5 for the program of Example 3.5. (ii) Then, we identify in the graph the nodes N that correspond to the program positions Pπ of π which do not address the symbol ⊥; for this purpose, we should follow the path from f to its possible outputs in the graph. For instance, given the slicing criterion (main, C(>, ⊥)) and the term dependence graph of Fig. 5, the nodes N that correspond to the program positions of C(>, ⊥) which do not address the symbol ⊥ are shown with a bold box in Fig. 6. (iii) Finally, we collect • the program positions of the nodes (associated with the right-hand side of a program rule) in every C-path—i.e., a path made of C arrows—that ends in a node of N , • the program positions of the descendants M of the above nodes (i.e., all nodes which are reachable following the S arrows) excluding the nodes of N , and • the program positions of the nodes which are reachable from M following the C arrows, and its descendants. Therefore, in the example above, the slice will contain the following program positions: • the program positions (R1, Λ), (R1, 1), (R2, Λ) associated with the paths that end in a node with a bold box; • the program positions of their descendants, i.e., (R1, 1.1); • and no more program positions, since there is no node reachable from the node labeled with A. Trivially, this algorithm always terminates. The completeness of the algorithm (i.e., that all the program positions of the slice according to Definition 3.10 are collected) can be proved by showing that all possible computations can be traced using the term dependence graph. However, the above algorithm for computing static slices is not correct since there may be C-paths in the term dependence graph that have no counterpart in the computations of the original program. The following example illustrates this point.
119
Cheda, Silva, Vidal (R1,/\) main
(R1,1)
(R3,/\)
(R2,/\)
C
f
D
g
x
(R1,2) f
g
A
B
(R1,1.1)
A
x
(R1,2.1)
Fig. 6. Slice of the program in Example 3.5 main
g
g
f
f
A
A
B
A B
Fig. 7. Term dependence graph of the program in Example 4.4
Example 4.4 Consider the following program: (R1)
main → g(f(A))
(R2)
g(B) → B
(R3)
f(A) → A
The associated term dependence graph is shown in Fig. 7. From this term dependence graph, we would infer that there is a computation from main to B while this is not true.
5
Related Work and Discussion
The first attempt to adapt PDGs to the functional paradigm has been recently introduced by Rodrigues and Barbosa [10]. They have defined the functional dependence graphs (FDG), which represent control relations in functional programs. However, the original aim of FDGs was the component identification in functional programs and thus they only consider high level functional program entities (i.e., the lowest level of granularity they consider are functions). In a FDG, a single node often represents a complex term (indeed a complete function definition) and, hence, the information about control dependences of its subterms is not stored in the graph. Our definition of term dependence graph solves this problem by representing terms as trees and thus considering a lower level of granularity for control dependences between subterms. As mentioned before, our term dependence graph shares many similarities with the loop checks of [1]. Roughly speaking, [1] defines a directed graph of functional dependencies as follows: for every rule l → r, there is an R-arrow from l to every subterm of r (where inner arguments are replaced by fresh variables); also, u-arrows 120
Cheda, Silva, Vidal
are added from every term in the right-hand side of an R-arrow to every term in the left-hand side of an R-arrow with which it unifies. In this way, every possible computation path can be followed in the directed graph of functional dependencies. Later, [2] introduced the computation of similar relations (the so called dependency pairs) to analyze the termination of term rewriting systems. As for future work, we plan to formally prove the completeness of the slices computed by Algorithm 1. We also want to identify and define more dependence relations in the term dependence graph in order to augment its precision w.r.t. Definition 3.6. Then, we want to extend the framework to cover higher-order features. Finally, we plan to implement the slicing algorithm and integrate it in a Curry slicer [9] to perform static slicing of functional and functional logic programs.
References [1] M. Alpuente, M. Falaschi, M.J. Ramis, and G. Vidal. Narrowing Approximations as an Optimization for Equational Logic Programs. In J. Penjam and M. Bruynooghe, editors, Proc. of PLILP’93, Tallinn (Estonia), pages 391–409. Springer LNCS 714, 1993. [2] T. Arts and J. Giesl. Termination of term rewriting using dependency pairs. Theoretical Computer Science, 236(1-2):133–178, 2000. [3] F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, 1998. [4] J. Ferrante, K.J. Ottenstein, and J.D. Warren. The Program Dependence Graph and Its Use in Optimization. ACM Transactions on Programming Languages and Systems, 9(3):319–349, 1987. [5] J. Field and F. Tip. Dynamic Dependence in Term Rewriting Systems and its Application to Program Slicing. Information and Software Technology, 40(11-12):609–634, 1998. [6] D.J. Kuck, R.H. Kuhn, D.A. Padua, B. Leasure, and M. Wolfe. Dependence Graphs and Compiler Optimization. In Proc. of the 8th Symp. on the Principles of Programming Languages (POPL’81), SIGPLAN Notices, pages 207–218, 1981. [7] Y.A. Liu and S.D. Stoller. Eliminating Dead Code on Recursive Data. Programming, 47:221–242, 2003.
Science of Computer
[8] C. Ochoa, J. Silva, and G. Vidal. Dynamic Slicing Based on Redex Trails. In Proc. of the ACM SIGPLAN 2004 Symposium on Partial Evaluation and Program Manipulation (PEPM’04), pages 123– 134. ACM Press, 2004. [9] C. Ochoa, J. Silva, and G. Vidal. Lighweight Program Specialization via Dynamic Slicing. In Workshop on Curry and Functional Logic Programming (WCFLP 2005), pages 1–7. ACM Press, 2005. [10] N. Rodrigues and L.S. Barbosa. Component Identification Through Program Slicing. In Proc. of Formal Aspects of Component Software (FACS 2005). Elsevier ENTCS, 2005. [11] J.R. Slagle. Automated Theorem-Proving for Theories with Simplifiers, Commutativity and Associativity. Journal of the ACM, 21(4):622–642, 1974. [12] F. Tip. A Survey of Program Slicing Techniques. Journal of Programming Languages, 3:121–189, 1995. [13] M.D. Weiser. Program Slicing. IEEE Transactions on Software Engineering, 10(4):352–357, 1984.
121
122
WFLP 2006
A Study on the Practicality of Poly-Controlled Partial Evaluation Claudio Ochoa and Germ´an Puebla School of Computer Science Technical University of Madrid Madrid, Spain {claudio,german}@fi.upm.es
Abstract Poly-controlled partial evaluation (PCPE) is a flexible approach for specializing logic programs, which has been recently proposed. It takes into account repertoires of global control and local control rules instead of a single, predetermined, combination. Thus, different global and local control rules can be assigned to different call patterns, obtaining results that are hybrid in the sense that they cannot be obtained using a single combination of control rules, as traditional partial evaluation does. PCPE can be implemented as a search-based algorithm, producing sets of candidate specialized programs (many of them hybrid), instead of a single one. The quality of each of these programs is assessed through the use of different fitness functions, which can be resource aware, taking into account multiple factors such as run-time, memory consumption, and code size of the specialized programs, among others. Although PCPE is an appealing approach, it suffers from an inherent blowup of its search space when implemented as a search-based algorithm. Thus, in order to be used in practice, and to deal with realistic programs, we must be able to prune its search space without losing the interesting solutions. The contribution of this work is two-fold. On one hand we perform an experimental study on the heterogeneity of solutions obtained by search-based PCPE, showing that the solutions provided behave very differently when compared using a fitness function. Note that this is important since otherwise the cost of producing a large number of candidate specializations would not be justified. The second contribution of this work is the introduction of a technique for pruning the search space of this approach. The proposed technique is easy to apply and produces a considerable reduction of the size of the search space, allowing PCPE to deal with a reasonable number of benchmark programs. Although pruning is done in a heuristic way, our experimental results suggest that our heuristic behaves well in practice, since the fitness value of the solutions obtained using pruning coincide with the fitness value of the solution obtained when no pruning is applied. Keywords: Partial Evaluation, Control Strategies, Resource Awareness, Program Optimization, Pruning Techniques
1
Introduction
The aim of partial evaluation (PE ) is to specialize a program w.r.t. part of its input, which is known as the static data[11]. The quality of the code generated by partial evaluation greatly depends on the control strategy used. Unfortunately, the existence of sophisticated control rules which behave (almost) optimally for all programs is still far from reality. Poly-controlled partial evaluation [15] (PCPE ) attempts to cope with this problem by employing a set of global and local control rules instead of a predetermined combination (as done in traditional partial evaluation algorithms). This allows using different global and local control rules for This paper is electronically published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs
Ochoa and Puebla
different call patterns (atoms). Thus, PCPE can produce specialized programs that are not achievable by traditional partial evaluation using any of the considered local and global control rules in isolation. In [15], two algorithms for implementing PCPE were introduced. One of them uses a function called pick to decide a priori which (global and local) control strategies are to be applied to every atom. The second one applies a number of pre-selected control rules to every atom, generating several candidate specializations, and decides a posteriori which specialization is the best one by empirically comparing the final configurations (candidate specializations) using a fitness function, possibly taking into account factors such as size of the specialized program and time- and memory-efficiency of such a specialized program. Since choosing a good Pick function can be a very hard task, and in the need of a proof of concept of the idea of PCPE, we have implemented the second algorithm (leaving the first one for future work), although this algorithm is less efficient in terms of size of the search space. Among the main advantages of PCPE we can mention: It can obtain better solutions than traditional PE: In [15], preliminary experiments showed that PCPE produced hybrid solutions with better fitness value than any of the solutions achievable by traditional PE, for a number of different resource-aware fitness functions. Hybrid solutions are not achievable by traditional partial evaluation, since different global and local control rules are applied to different call patterns. It is a resource-aware approach: in traditional PE, existing control rules focus on time-efficiency by trying to reduce the number of resolution steps which are performed in the residual program. Other factors such as the size of the compiled specialized program, and the memory required to run the residual program are most often neglected—some relevant exceptions being the works in [4],[3]—. In addition to potentially generating larger programs, it is well known that partial evaluation can slow-down programs due to lower level issues such as clause indexing, cache sizes, etc. PCPE, on the other hand, makes use of resource aware fitness functions to choose the best solution from a set of candidate solutions. It is more user-friendly: existing partial evaluators usually provide several global and local control strategies, as well as many other parameters (global trees, computation rules, etc.) directly affecting the quality of the obtained solution. For a novice user, it is extremely hard to find the right combination of parameters in order to achieve the desired results (reduction of size of compiled code, reduction of execution time, etc.). Even for an experienced user, it is rather difficult to predict the behavior of partial evaluation, especially in terms of space-efficiency (size of the residual program). PCPE allows the user to simultaneously experiment with different combinations of parameters in order to achieve a specialized program with the desired characteristics. It performs online partial evaluation: as opposed to other approaches (e.g. [3]), PCPE performs online partial evaluation, and thus it can take advantage of the great body of work available for online partial evaluation of logic programs. Unfortunately, PCPE is not the panacea, and it has a number of disadvantages. The main drawback of this approach is that, when implemented as a search-based 124
Ochoa and Puebla
algorithm, its search space suffers from an inherent exponential blowup since given a configuration, the number of successors can be as high as the number of combinations of local and global control rules considered. As a direct consequence, the specialization time of PCPE is higher than its PE counterpart. After getting acquainted for the first time with the basic idea of poly-controlled partial evaluation, probably two questions come up immediately to our mind: (i) does PCPE provides a wide range of solutions? I.e., is the set of obtained solutions heterogeneous enough to offer us a wide set of candidate solutions to choose from? (ii) is PCPE feasible in practice? I.e., since there is an exponential blowup of the search space, is it possible to perform some pruning in order to deal with realistic programs without losing the interesting solutions? Throughout this work we address these two questions, providing some experimental results to help us justify our allegations.
2
Background
We assume some basic knowledge on the terminology of logic programming. See for example [12] for details. Very briefly, an atom A is a syntactic construction of the form p(t1 , . . . , tn ), where p/n, with n ≥ 0, is a predicate symbol and t1 , . . . , tn are terms. The function pred applied to atom A, i.e., pred(A), returns the predicate symbol p/n for A. A clause is of the form H ← B where its head H is an atom and its body B is a conjunction of atoms. A definite program is a finite set of clauses. A goal (or query) is a conjunction of atoms. Two terms t and t0 are variants, denoted t ≈ t0 , if there exists a renaming ρ such that tρ = t0 . We denote by {X1 7→ t1 , . . . , Xn 7→ tn } the substitution σ with σ(Xi ) = ti for all i = 1, . . . , n (with Xi 6= Xj if i 6= j) and σ(X) = X for any other variable X, where ti are terms. A unifier for a finite set S of simple expressions is a substitution θ if Sθ is a singleton. A unifier θ is called most general unifier (mgu) for S, if for each unifier σ of S, there exists a substitution γ such that σ = θγ. 2.1
Basics of Partial Evaluation in LP
Partial evaluation of LP is traditionally presented in terms of SLD semantics. We briefly recall the terminology here. The concept of computation rule is used to select an atom within a goal for its evaluation. Definition 2.1 A computation rule is a function R from goals to atoms. Let G be a goal of the form ← A1 , . . . , AR , . . . , Ak , k ≥ 1. If R(G) =AR we say that AR is the selected atom in G. The operational semantics of definite programs is based on derivations [12]. Definition 2.2 [derivation step] Let G be ← A1 , . . . , AR , . . . , Ak . Let R be a computation rule and let R(G) =AR . Let C = H ← B1 , . . . , Bm be a renamed 125
Ochoa and Puebla
apart clause in P . Then G0 is derived from G and C via R if the following conditions hold: θ = mgu(AR , H) G is the goal ← θ(A1 , . . . , AR−1 , B1 , . . . , Bm , AR+1 , . . . , Ak ) 0
As customary, given a program P and a goal G, an SLD derivation for P ∪ {G} consists of a possibly infinite sequence G = G0 , G1 , G2 , . . . of goals, a sequence C1 , C2 , . . . of properly renamed apart clauses of P , and a sequence θ1 , θ2 , . . . of mgus such that each Gi+1 is derived from Gi and Ci+1 using θi+1 . A derivation step can be non-deterministic when AR unifies with several clauses in P , giving rise to several possible SLD derivations for a given goal. Such SLD derivations can be organized in SLD trees. A finite derivation G = G0 , G1 , G2 , . . . , Gn is called successful if Gn is empty. In that case θ = θ1 θ2 . . . θn is called the computed answer for goal G. Such a derivation is called failed if it is not possible to perform a derivation step with Gn . In partial evaluation, SLD semantics is extended in order to also allow incomplete derivations which are finite derivations of the form G = G0 , G1 , G2 , . . . , Gn and where no atom is selected in Gn for further resolution. This is needed in order to avoid (local) non-termination of the specialization process. Also, the substitution θ = θ1 θ2 . . . θn is called the computed answer substitution for goal G. An incomplete SLD tree possibly contains incomplete derivations. In order to compute a partial evaluation (PE) [11], given an input program and a set of atoms (goals), the first step consists in applying an unfolding rule to compute finite incomplete SLD trees for these atoms. Then, a set of resultants or residual rules are systematically extracted from the SLD trees. Definition 2.3 [unfolding rule] Given an atom A, an unfolding rule computes a set of finite SLD derivations D1 , . . . , Dn (i.e., a possibly incomplete SLD tree) of the form Di = A, . . . , Gi with computer answer substitution θi for i = 1, . . . , n whose associated resultants are θi (A) ← Gi . Therefore, this step returns the set of resultants, i.e., a program, associated to the root-to-leaf derivations of these trees. The set of resultants for the computed SLD tree is called a partial evaluation for the initial goal (query). The partial evaluation for a set of goals is defined as the union of the partial evaluations for each goal in the set. We refer to [8] for details. In order to ensure the local termination of the PE algorithm while producing useful specializations, the unfolding rule must incorporate some non-trivial mechanism to stop the construction of SLD trees. Nowadays, well-founded orderings (wfo) [2,13] and well-quasi orderings (wqo) [16,9] are broadly used in the context of on-line partial evaluation techniques (see, e.g., [6,10,16]). In addition to local termination, an abstraction operator is applied to properly add the atoms in the right-hand sides of resultants to the set of atoms to be partially evaluated. This abstraction operator performs the global control and is in charge of guaranteeing that the number of atoms which are generated remains finite. This is done by replacing atoms by more general ones, i.e., by losing precision in order to guarantee termination. The abstraction phase yields a new set of atoms, some 126
Ochoa and Puebla
of which may in turn need further evaluation and, thus, the process is iteratively repeated while new atoms are introduced.
3
Poly-Controlled Partial Evaluation
Traditional algorithms for partial evaluation (PE) of logic programs (LP) are parametric w.r.t. the global control and local control rules 1 . In these algorithms, once a specialization strategy has been selected, it is applied to all call patterns in the residual program. However, it is well known that several control strategies exist which can be of interest in different circumstances. It is indeed a rather difficult endeavor to find a specialization strategy which behaves well in all settings. Thus, rather than considering a single specialization strategy, at least in principle one can be interested in applying different specialization strategies to different atoms (call patterns). Unfortunately, this is something which existing algorithms for PE do not cater for. Poly-controlled partial evaluation (PCPE) [15] fills this gap by allowing the use of a set of specialization strategies instead of a predetermined one. 3.1
A Search-Based Poly-Controlled Partial Evaluation Algorithm
Algorithm 1 shows a search-based poly-controlled partial evaluation algorithm. In this algorithm, a configuration Confi is a pair hSi , Hi i s.t. Si is the set of atoms yet to be handled by the algorithm and Hi is the set of atoms already handled by the algorithm. Indeed, in Hi not only we store atoms Ai but also the result A0i of applying global control to such atoms and the unfolding rule U nf old which has been used to unfold Ai , i.e., members of Hi are tuples of the form hAi , A0i , U nf oldi. We store U nf old in order to use exactly such unfolding rule during the code generation phase. Correctness of the algorithm requires that each A0i is an abstraction of Ai , i.e., Ai = A0i θ. Algorithm 1 employs two auxiliary data structures. One is Confs, which contains the configurations which are currently being explored. The other one is Sols, which stores the set of solutions currently found by the algorithm. As it is well known, the use of different data structures for Confs provides different traversals of the search space. In our implementation of this algorithm in CiaoPP [7], we have used both a stack and a queue, traversing the search space in a depth-first and a breadth-first fashion, respectively. Given a set of atoms S which describe the potential queries to the program, the initial configuration is of the form hS, ∅i. In each iteration of the algorithm, a configuration hSi , Hi i is popped from Confs (line 6), and an atom Ai from Si is selected (line 7). Then, several combinations of global control (Abstract ∈ G) and local control (U nf old ∈ U) rules, respectively, are applied (lines 11 and 12). Each application builds an SLD-tree for A0i , a generalization of Ai as determined by Abstract, using the corresponding unfolding rule Unfold. Once the SLD-tree τi is computed, the leaves in its resultants, i.e., the atoms in the residual code for A0i are collected by the function leaves (line 14). Those atoms in leaves(τi ) which are not a variant of an atom handled in previous iterations of the algorithm are added to the set of atoms to be considered (Si+1 ) and pushed on Confs. We use 1
From now on, we call any combination of global and local control rules a specialization strategy.
127
Ochoa and Puebla
Algorithm 1 Search-Based Poly-Controlled Partial Evaluation Algorithm Input: Program P Input: Set of atoms of interest S Input: Set of unfolding rules U Input: Set of generalization functions G Output: Set of partial evaluations Sols 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22:
H0 = ∅ S0 = S create(Confs); Confs = push(hS0 , H0 i, Confs) Sols = ∅ repeat hSi , Hi i = pop(Confs) Ai = Select(Si ) Candidates = {hAbstract, Unfoldi | Abstract ∈ G, Unfold ∈ U} repeat Candidates = Candidates − {hAbstract, Unfoldi} A0i = Abstract(Hi , Ai ) τi = Unfold(P, A0i ) Hi+1 = Hi ∪ {hAi , A0i , Unfoldi} Si+1 = (Si − {Ai }) ∪ {A ∈ leaves(τi ) | ∀ hB, , i ∈ Hi+1 . B 6≡ A} if Si+1 =∅ then Sols = Sols ∪ {Hi+1 } else push(hSi+1 , Hi+1 i,Confs) end if until Candidates = ∅ i=i+1 until empty stack(Confs)
B ≡ A to denote that B and A are variants, i.e., they are equal modulo variable renaming. The process terminates when the stack of configurations to handle is empty, i.e. all final configurations have been reached. The specialized program corS responds to hA,A0 ,U nf oldi∈Hn resultants(A0 , U nf old), where the function resultants is parametric w.r.t. the unfolding rule. Note that in this algorithm, once an atom Ai is abstracted into A0i , code for A0i will be generated, and it will not be abstracted any further no matter which other atoms are handled in later iterations of the algorithm. As a result, the set of atoms for which code is generated are not guaranteed to be independent. Two atoms are independent when they have no common instance. However, the pairs in H uniquely determine the version used at each program point. Since code generation produces a new predicate name per entry in H, independence is guaranteed, and thus the specialized program will not produce more solutions than the original one. As mentioned in [15], one could think of a similar algorithm deciding a priori a control strategy to be applied to each atom. This algorithm would be more similar to the traditional PE algorithm, employing possibly different control rules for differ128
Ochoa and Puebla : - module (_ ,[ rev /2] ,[]). : - entry rev ([ _ , _ | L ] , R ).
Input query
#solutions
rev(L,R) rev([ |L],R) rev([ , |L],R) rev([ , , |L],R) rev([ , , , |L],R) rev([1|L],R) rev([1,2|L],R)
rev ([] ,[]). rev ([ H | L ] , R ) : rev (L , Tmp ) , app ( Tmp ,[ H ] , R ). app ([] , L , L ). app ([ X | Xs ] ,Y ,[ X | Zs ]) : app ( Xs ,Y , Zs ).
6 48 117 186 255 129 480
(b)
(a)
Fig. 1. The nrev example and the number of solution generated by PCPE
ent atoms. Unfortunately, it is not clear how this decision can be made, so instead Algorithm 1 generates several candidate partial evaluations and then decides a posteriori which specialized program to use. Clearly, generating all possible candidate specialized programs is more costly than computing just one. However, selecting the best candidate a posteriori allows to make much more informed decisions than selecting it a priori. 3.2
Exponential Blowup of the Search Space
Given that Algorithm 1 allows different combinations of specialization strategies, given a configuration, there are several successor configurations. This can be interpreted as, given G={A1 , . . . , Aj } and U={U1 , . . . , Ui }, there is a set of transA formation operators TUA11 , . . . , TUAi1 , . . . , TUij . Thus, in the worst case, given a set of unfolding rules U = {Unfold1 , . . . , Unfoldi }, and a set of abstraction functions G = {Abstract1 , . . . , Abstractj }, there are i × j possible combinations. As already mentioned, this represents an inherent exponential blowup in the size of the search space, and it makes the algorithm impractical for dealing with realistic programs. Of course, several optimizations can be done to the base algorithm shown above, in order to deal with this problem. A first obvious optimization is to eliminate equivalent configurations which are descendants of the same node in the search tree. I.e., it is often the case that given a configuration Conf there are more than one TUA 0 0 and TUA0 with (A, U ) 6= (A0 , U 0 ) s.t. TUA (Conf ) = TUA0 (Conf ). This optimization is easy to implement, not very costly to execute, and reduces search space significantly. However, even with this optimization, a simple experiment shows the magnitude of this problem. Let us consider the program in Listing 1(a), which implements a naive reverse algorithm. In this experiment, let us choose the set of global control rules G={dynamic, hom emb}. The hom emb global control rule is based on homeomorphic embedding [8,9] and flags atoms as potentially dangerous (and are thus generalized) when they homeomorphically embed any of the previously visited atoms at the global control level. Then, dynamic is the most abstract possible global control rule, which abstracts away the value of all arguments of the atom and replaces them with distinct variables. Also, let us choose the set of local control rules U={one step, df hom emb as}. The rule one step is the simplest possible unfolding rule which always performs just one unfolding step for any atom. Finally, df hom emb as is an 129
Ochoa and Puebla
unfolding rule based on homeomorphic embedding. More details on this unfolding rule can be found in [14]. It can handle external predicates safely and can perform non-leftmost unfolding as long as unfolding is safe (see [1]) and local (see [14]). In CiaoPP [7], the description of initial queries (i.e., the set of atoms of interest S in Algorithm 1 ) is obtained by taking into account the set of predicates exported by the module, in this case rev/2, possibly qualified by means of entry declarations. For example, the entry declaration in Listing 1(a) is used to specialize the naive reverse procedure for lists containing at least two elements. Table (b) of Figure 1 shows the number of candidate solutions generated by Algorithm 1 (eliminating equivalent configurations in the search tree), for several entry declarations. As can be observed in the table, as the length of the list provided as entry grows, the number of candidate solutions computed quickly grows. Furthermore, if the elements of the input list are static, then the number of candidates grows even faster, as can be seen in the last two rows in Table 1, where we provide the first elements of the list. From this small example, it is clear that, in order to be able to cope with realistic Prolog programs, it is mandatory to reduce the search space. In Section 5 we propose a technique to do so.
4
Heterogeneity of PCPE Hybrid Solutions
As mentioned before, Algorithm 1 produces a set of candidate solutions. Of these, a few of them are pure, in the sense that they can be obtained via traditional PE (i.e., they apply the same control strategy to all atoms in the residual program), and the rest are hybrid, in the sense that they apply different specialization strategies to different atoms. In this section, we try to determine how heterogeneous are the fitness values of the different solutions obtained by PCPE. 4.1
Choosing Adequate Sets of Global and Local Control Rules
The question of whether the solutions obtained by PCPE are heterogeneous w.r.t. their fitness values depends, in a great deal, on the particular choice of specialization strategies to be used, as well as on the arity of the sets G and U of control rules. We can expect that by choosing control rules different enough, the candidate solutions will be also very different, and viceversa. To see this, think for a moment that we choose U = {det, lookahead} where both det and lookahead are purely determinate [6,5]—i.e., they select atoms matching a single clause head—, the difference being that lookahead uses a ”look-ahead” of a finite number of computation steps to detect further cases of determinacy [6]. Given that both rules are based on determinate unfolding, and this is considered a very conservative technique, it is highly probable that this particular choice of local control rules will not contribute to finding heterogeneous solutions. A better idea will be then to choose one unfolding rule that is conservative, and another one that is aggressive. An example of an aggressive local control rule would be one performing non-leftmost unfolding. The same reasoning can be done when selecting the global control rules, we could select one rule that is very precise—while guaranteeing termination—, and a very imprecise global control rule. 130
Ochoa and Puebla
Benchmark
Input Query
example pcpe permute nrev advisor relative ssuply transpose
main( , ,2, ) permute([1,2,3,4,5,6],L) rev([ , , , |L],R) what to do today( , , ) relative(john,X) ssupply( , , ) transpose([[ , , , , , , , , ], , ], )
overall
speedup Mean St Dev
Vers
Fitness
27 70 255 14 61 31 154
1.56 1.31 1.09 1.68 18.01 5.15 2.62
0.87 1.15 0.66 1.31 3.45 1.84 0.87
0.21 0.48 0.15 0.67 4.84 1.82 0.30
Diam 0.99 1.16 0.51 0.97 16.37 4.72 2.13
87.4
4.49
1.45
1.21
3.83
Table 1 PCPE statistics over different benchmarks (speedup)
4.2
Heterogeneity of the Fitness of PCPE Solutions
Once we select an appropriate set of control rules for PCPE, we need to determine whether the fitness of the solutions we obtain are heterogeneous. With this purpose, we have ran some experiments over a set of benchmarks and different fitness functions, in order to collect statistical facts such as Standard Deviation and Diameter that can help us to determine how different are the obtained solutions. In our experiments, as mentioned in Section 3, we have used a set of global control rules G={dynamic, hom emb} and a set of local control rules U={one step, df hom emb as}. Besides, we used different fitness functions already introduced in [15]. For reasons of space, we will show some of the results obtained when using the following fitness functions: speedup compares programs based on their time-efficiency, measuring run-time speedup w.r.t. the original program. When using this fitness function, the user needs to provide a set of run-time queries with which to time the execution of the program. Such queries should be representative of the real executions of the program 2 . This fitness function is computed as speedup=Torig /Tspec , where Tspec is the execution time taken by the specialized program to run the given run-time queries, and Torig the time taken by the original program. reduction compares programs based on their space-efficiency, measuring reduction of size of compiled bytecode w.r.t. the original program. It is computed as reduction=(Sorig − Sempty )/ (Sspec − Sempty ), where Sspec is the size of the compiled bytecode of the specialized program, Sorig is the size of the compiled bytecode of the original program, and Sempty is the size of the compiled bytecode of an empty program. In Table 1 we can observe, for a number of benchmarks, the collected statistics when using speedup [15] as a fitness function. As mentioned before, the number of versions obtained is tightly related to several factors, such as the number and kind of control rules used, as well as the initial input queries used to specialize each program. For this particular experiment, PCPE generated a mean of 87 candidate solutions per benchmark. In most cases we can observe that both the fitness of the 2
Though the issue of finding representative run-time queries is an interesting research topic in its own right, it is out of the scope of this paper to automate such process.
131
Ochoa and Puebla
best solution and the mean fitness are over 1, meaning that a speedup is achieved when comparing the obtained solutions w.r.t. the original program. In some cases, the mean speedup is below 1, indicating that many of the solutions are bad and get a slowdown w.r.t. the original program. Let us take transpose, for example. In this particular benchmark, we can see that most of the 154 final solutions are slower than the original program, meaning that it is easy to specialize this program with different control strategies and obtain a solution that runs slower than the original program. Note however, that the best solution obtained by PCPE is 2.62 faster than the original program. In order to answer our initial question, i.e., whether does PCPE provide a wide range of solutions, the columns we are interested in looking at are St Dev and Diameter. St Dev stands for standard deviation, and measures how spread out the values in a data set are. Diameter measures the difference of fitness among (any of) the best solution(s) when compared to (any of) the worst solution(s). Note that many of the solutions found by PCPE can have the same fitness value. Values closer to 0 in St Dev would indicate that most solutions are similar and their fitness value is similar to the mean fitness value. However, the mean St Dev is 1.21, showing that in general solutions are spread out, i.e., they are different when compared against each other, even though very little static information is provided to the PCPE algorithm (as shown in the column Input Query of Table 1). This fact is evident when we look at the fitness of the different solutions in a graphical way. In Fig. 2 we can observe, for the nrev benchmark, as defined in Listing 1(a), how the fitness of all solutions are quite distributed across the mean value. We have chosen this benchmark because it is the one with the lowest Standard Deviation value, and with the highest number of versions obtained. Also, we can see that many solutions share the same fitness value, and that in some way they are grouped together, indicating that it should be possible to find ways to collapse those solutions into one, pruning in this way the search space. Regarding the Diameter column, we can observe that the mean diameter is 3.83, indicating that there is an important difference between the worst and the best solutions. These preliminary results are encouraging, showing that PCPE is capable of obtaining several heterogeneous solutions, most of them not being achievable by traditional partial evaluation. Similar results have been obtained for other fitness functions (not shown here due to lack of space). Though it is clear we need to prune the search space in order to make this approach practical, we should do it with care, in order to not to prune the good solutions.
5
Pruning the Search Space: SPRS Heuristic
In spite of the possibility of eliminating redundant configurations and non-promising branches, it is worthwhile to explore in practice the use of poly-controlled partial deduction with more restrictive capabilities in order to reduce the cost of exploring the search space. For instance, rather than allowing all possible combinations of specialization strategies for different atoms in a configuration, we can restrict ourselves to configurations which always use the same specialization strategy for all atoms which correspond to the same predicate. This restriction will often signifi132
Ochoa and Puebla
1.1 solution 0.64 1
0.9
0.8
0.7
0.6
0.5
0.4 0
50
100
150
200
250
300
Fig. 2. PCPE solutions for nrev
cantly reduce the branching factor of our algorithm since, handling of an atom Ai will become deterministic as soon as we have previously considered an atom for the same predicate in any configuration which is an ancestor of the current one in the search space, i.e., it is compulsory to use exactly the same specialization strategy as before. We call this approach SPSR, standing for Same Predicate, Same Rules. We will refer to configurations which satisfy this restriction as consistent, and as inconsistent to those which do not. Though this simplification may look too restrictive at first sight, it is often the case in practice that there exists a specialization strategy which behaves well for all atoms which correspond to the same predicate, in the context of a given program. We will modify Algorithm 1 in such a way that only consistent configurations are further processed. For this we need to store for every atom in every configuration the global control rule used to generalize such an atom. We now provide a formal definition of consistent configurations w.r.t. to the SPSR heuristic. Definition 5.1 [consistent configuration] given a configuration Conf = hS, Hi, we say that Conf is consistent iff ∀hA1 , A01 , G1 , U1 i ∈ H, ∀hA2 , A02 , G2 , U2 i ∈ H, pred(A1 ) = pred(A2 ) ⇒ (G1 = G2 ∧ U1 = U2 ) Note that the definition of consistent configuration can be applied to intermediate configurations (not only to final ones). Thus, if a given configuration Conf is inconsistent, it will be pruned, i.e., it will not be pushed on Confs. By doing this we are pruning not only this configuration, but also all the successor configurations that would have been generated from it. This means that early pruning will achieve significant reductions of the search space. 133
Ochoa and Puebla
Benchmark example pcpe permute nrev advisor relative ssuply transpose overall
Heur Versions
Fitness PCPE CS
orig spsr orig spsr orig spsr orig spsr orig spsr orig spsr orig spsr
27 27 70 9 255 9 14 8 61 11 31 31 154 6
1.56 1.60 1.31 1.29 1.03 1.06 1.68 1.66 18.01 17.96 5.15 5.13 2.62 2.54
orig spsr
87.4 14.4
4.49 4.44
PE
hd
1.37
hd
1.06
hd
1.03
hd
1.55
hd
15.30
hd
5.15
hd
2.60 4.01
Mean
St Dev
Diameter
0.87 0.86 0.91 1.02 0.64 0.71 1.21 1.49 3.45 8.00 1.52 1.53 0.87 1.08
0.21 0.23 0.48 1.01 0.15 0.19 0.67 0.86 4.84 9.36 1.82 1.82 0.30 0.57
0.99 1.11 1.16 1.01 0.51 0.55 0.97 1.06 16.37 16.95 4.72 4.51 2.13 1.60
1.35 2.09
1.21 2.01
3.83 3.82
Table 2 Comparison of search-pruning alternatives(speedup)
6
Experimental Results
Since the SPSR heuristic prunes the search space in a blind way, i.e., without making any evaluation of the candidates being pruned, there is a possibility of pruning the optimal solutions. In order to determine if this is the case, we have extended the experiments shown in Sec. 4, adding the results obtained when applying the SPSR heuristic to the example programs. In Table 2, we show the number of versions obtained by PCPE, the fitness value of both the optimal solution(s) obtained by PCPE, and the best solution obtained by traditional PE (together with the control strategy CS used to obtain such value 3 ), the mean value of all solutions, their standard deviation and their diameter, when using speedup as a fitness function. We compare in all cases the values obtained by the original PCPE approach (in row orig under colum Heur) versus the values obtained by PCPE when pruning its search space by means of the SPSR heuristics (in row spsr). As shown in the table, the search space is significantly reduced when applying SPSR, and the mean number of versions is reduced from 87 candidate solutions to only 14. However, there are some benchmarks for which no pruning of the search space is achieved, as is the case of example pcpe and ssupply. This is due to the fact that these programs contain very few atoms in their candidate specializations, and all of such configurations are consistent, satisfying the SPSR restriction. In our experiments, when pruning is done, the St Dev grows, indicating that we are pruning solutions sharing the same fitness value. By looking at the fitness values, we can presume that the best solution is preserved, in spite of performing a blind pruning (the slight difference between fitness values of orig and spsr is probably due to noise when measuring time). Note that, in most cases, PCPE outperforms 3 We use the following notation for denoting pairs of control rules: ho={hom emb,one step}, hd={hom emb,df hom emb as}, do={dynamic,one step}, dd={dynamic,df hom emb as}
134
Ochoa and Puebla
Benchmark example pcpe permute nrev advisor relative ssuply transpose overall
Heur Versions Sols
Fitness PCPE CS
orig spsr orig spsr orig spsr orig spsr orig spsr orig spsr orig spsr
27 27 70 9 255 9 14 8 61 11 31 31 154 6
1 1 6 1 3 1 1 1 2 1 1 1 5 1
1.22 1.22 1.15 1.15 0.98 0.98 1.69 1.69 1.17 1.17 11.26 11.26 0.98 0.98
orig spsr
87.4 14.4
2.71 1.00
2.63 2.63
PE
hd
1.15
do
0.98
do
0.98
hd
1.68
do
0.98
hd
11.26
do
0.98 2.57
Mean
St Dev
Diameter
0.82 0.82 0.61 0.63 0.32 0.55 1.03 0.94 0.67 0.80 1.61 1.61 0.39 0.63
0.19 0.19 0.27 0.34 0.15 0.25 0.34 0.38 0.25 0.28 1.79 1.79 0.19 0.26
0.82 0.82 1.15 1.15 0.79 0.71 1.41 1.41 1.04 1.04 10.32 10.32 0.75 0.70
0.77 0.85
0.45 0.49
2.32 2.30
Table 3 Comparison of search-pruning alternatives(reduction)
traditional PE. Interestingly, it is clear that for these benchmarks the best strategy for PE is hd. We can observe also that the mean fitness is higher when pruning is performed, which could indicate that bad solutions are pruned away. In Table 3 we show the same information as above, but for the reduction fitness function. We have also added an extra column Sols showing the number of best solutions found by PCPE (note that this column does not make any sense when time-efficiency is measured, because this measurement is subject to noise). By looking at the fitness value, we can see that the best solution is preserved, in spite of performing a blind pruning. But according to the Sols column, we are pruning away the redundant best solutions, and leaving only one of them. Clearly, the number of versions pruned by SPSR does not depend on the fitness function used, since the fitness function is used after generating all solutions in order to determine which candidates are the best ones. With regard to the fitness value, it is interesting to note that the strategy do, i.e., dynamic as a global control and one step as a local control, produces a program that is very similar to the original one (probably having some variable and predicate renaming). This means that in situations where the original program has few predicates, it is difficult to obtain a residual program smaller than the original program. This is reflected in the benchmarks permute, nrev, relative and transpose, where the best control strategy is do and the fitness value is close to 1. However, note that PCPE still obtains better solutions in the cases of permute and relative, clearly through a hybrid solution. It is also interesting to see that the diameter is preserved most of times, indicating that both the best and worst solutions are preserved. However, in nrev and transpose the diameter decreases a bit, and since the best solution is preserved, this means we are pruning the worst solutions in these cases. In summary, SPSR seems to be a very interesting pruning technique, since it significantly reduces the search space of PCPE, it seems to preserve the best solu135
Ochoa and Puebla
tions (at least for the tested benchmarks), and can allow us to use PCPE in order to attack more interesting benchmarks, and also to provide more static information to the algorithm. It remains as future work to develop other techniques for pruning the search space in PCPE, that can ensure that the optimal solution is preserved. Acknowledgments. This work was funded in part by the Information Society Technologies program of the European Commission, Future and Emerging Technologies under the IST15905 MOBIUS project, by the Spanish Ministry of Education under the TIN2005-09207 MERIT project, and by the Madrid Regional Government under the S-0505/TIC/0407 PROMESAS project.
WFLP 2006
Implementing Dynamic-Cut in TOY 1

R. Caballero 2   Y. García-Ruiz 3
Departamento de Sistemas Informáticos y Programación
Universidad Complutense de Madrid, Madrid, Spain
Abstract
This paper presents the integration of the optimization known as dynamic cut within the functional-logic system TOY. The implementation automatically detects deterministic functions at compile time, and includes in the generated code the test for detecting at run-time the computations that can actually be pruned. The outcome is a much better performance when executing deterministic functions including either or-branches in their definitional trees or extra variables in their conditions, with no serious overhead in the rest of the computations. The paper also proves the correctness of the criterion used for detecting deterministic functions w.r.t. the semantic calculus CRWL.

Keywords: determinism, functional-logic programming, program analysis, programming language implementation.
1  Introduction
Nondeterminism is one of the characteristic features of Logic Programming shared by Functional-Logic Programming. It allows elegant algorithm definitions, increasing the expressiveness of programs. However, this benefit has an associated drawback, namely the lack of efficiency of the computations. There are two main reasons for this:
- The complexity of the search engine required by nondeterministic programs, which slows down the execution mechanism.
- The possible occurrence of redundant subcomputations during a computation.
In the Logic Programming language Prolog, the second point is partially solved by introducing a non-declarative mechanism, the so-called cut. Programs using cuts are much more efficient, but at the price of becoming less declarative. In the case of Functional-Logic Programming the situation is somewhat alleviated by the demand driven strategy [2,8], which is based on the use of definitional trees
1 This work has been funded by the projects TIN2005-09207-C03-03 and S-0505/TIC/0407.
2 Email: [email protected]
3 Email: [email protected]
[1,8]. Given any particular program function, the strategy uses the structure of the left-hand sides of the program rules in order to reduce the number of redundant subcomputations. The implementation of modern Functional-Logic languages such as TOY [9] or Curry [6] is based on this strategy. Our proposal also relies on the demand driven strategy, but introduces a safe and declarative optimization to further improve the efficiency of deterministic computations. This optimization is the dynamic cut, first proposed by Rita Loogen and Stephan Winkler in [10]. In [4,3] the same ideas were adapted to a setting including non-deterministic functions and a demand driven strategy, showing by means of examples the efficiency of the optimization. However, in spite of being well-known and accepted as an interesting optimization, the dynamic cut had not been implemented in any real system up to now. In this paper we present this implementation in the functional-logic system TOY (available at http://toy.sourceforge.net). The dynamic cut considers two special fragments of code:
(i) Rules with existential variables in the conditions.
(ii) Sets of overlapping rules occurring in deterministic functions.
As we will explain in Section 3, computations involving these fragments of code can be safely pruned if certain dynamic conditions are fulfilled. A key point of the optimization is detecting deterministic functions. The information about deterministic functions is required not only at compile time but also at run-time, when it is used for checking dynamically if the cut must take place in a particular computation. As previous works [10,4,3] have shown, this dynamic test is necessary for ensuring the correctness of the cut, i.e. that the optimization does not affect the set of solutions of any goal. The determinism analysis performed by the system follows the well-known criterion of non-ambiguity already introduced in [10]. From the theoretical point of view, the novelty of this paper w.r.t. previous work is that we have formally proved the correctness of such a criterion w.r.t. the semantic calculus CRWL, proposed as a suitable logical foundation for Functional-Logic Programming in [5]. Of course, completeness cannot be established because determinism is an undecidable property [13]. For that reason we also allow the user to annotate explicitly some functions as deterministic. The paper is organized as follows. The next section introduces the non-ambiguity criterion for detecting deterministic functions and the correctness theorem. Section 3 shows by means of examples the cases where the optimization will be applied. Section 4 presents the steps followed during the implementation of the dynamic cut in TOY, and Section 5 finishes with some conclusions.
2  Detecting Deterministic Functions in Functional-Logic Programs
This section proves the correctness of the non-ambiguity condition used for detecting deterministic functions w.r.t. the semantic calculus CRWL [5].
2.1  The CRWL calculus
CRWL is an inference system consisting of the following inference rules:

BT  Bottom:
        e → ⊥

RF  Reflexivity:
        X → X

DC  Decomposition:
        e1 → t1  ...  em → tm
        ─────────────────────────          if c ∈ DC^n ∪ FS^(n+1), m ≤ n, ti ∈ CTerm⊥
        c e1 ... em → c t1 ... tm

FA  Function Application:
        e1 → t1  ...  en → tn    C    r → a    a a1 ... ak → t
        ────────────────────────────────────────────────────────      if t ≠ ⊥, k ≥ 0, (f t1 ... tn → r ⇐ C) ∈ [R]⊥
        f e1 ... en a1 ... ak → t

JN  Join:
        e1 → t    e2 → t
        ─────────────────          if t ∈ CTerm
        e1 == e2
The notation [R]⊥ in rule FA represents the set of all the possible instances of program rules, where each particular instance is obtained from some function defining rule in R by some substitution of (possibly partial) terms in place of variables. See [5] for a detailed description of this and related calculi.

2.2  Deterministic Functional-Logic Functions
Before defining and characterizing deterministic functions we need to establish briefly some basic notions and terminology. We refer to [5] for more detailed definitions. We assume a signature Σ = ⟨DC, FS⟩, where DC and FS are ranked sets of constructor symbols resp. function symbols. Given a countably infinite set V of variables, we build CTerms (using only variables and constructors) and Terms (using variables, constructors and function symbols). We extend Σ with a special nullary constructor ⊥, obtaining a new signature Σ⊥, and we will write Term⊥ and CTerm⊥ (partial terms) for the corresponding sets of terms in this extended signature. A TOY program P is composed of data type declarations, type aliases, infix operators, function type declarations and a set of defining rules for function symbols. Each defining rule for a function f ∈ FS has a left-hand side f t1 ... tn, a right-hand side r, and an optional condition C:

        f t1 ... tn  →  r  ⇐  C
where t1 ... tn must be linear CTerms and C must consist of finitely many (possibly zero) joinability statements e1 == e2 with e1, e2 ∈ Term. A natural approximation ordering ⊑ for partial terms can be defined as the least partial ordering over Term⊥ satisfying the following properties:
•  ⊥ ⊑ t, for all t ∈ Term⊥
•  X ⊑ X, for every variable X
•  if t1 ⊑ s1, ..., tn ⊑ sn, then c t1 ... tn ⊑ c s1 ... sn, for all c ∈ DC^n and ti, si ∈ CTerm⊥.
A partially ordered set (poset for short) with bottom is a set S equipped with a partial order ⊑ and a least element ⊥ (w.r.t. ⊑). D ⊆ S is a directed set iff for all x, y ∈ D there exists z ∈ D such that x ⊑ z, y ⊑ z. A subset A ⊆ S is a cone iff ⊥ ∈ A and for all x ∈ A and y ∈ S, y ⊑ x implies y ∈ A. An ideal I ⊆ S is a directed cone. The program semantics is defined by the semantic calculus CRWL presented in
[5]. CRWL (Constructor Based ReWriting Logic) is a theoretical framework for the lazy functional logic programming paradigm. Given any program P, CRWL proves statements of the form e → t with e ∈ Term⊥ and t ∈ CTerm⊥. We denote by P ⊢CRWL e → t that the statement e → t can be proved in CRWL w.r.t. P. The intuitive idea is that t is a valid approximation of e in P. The denotation of any e ∈ Term⊥, written [[e]], is defined as: [[e]] = {t ∈ CTerm⊥ | P ⊢CRWL e → t}. Now we are ready to present the formal definition of deterministic function in our setting.

Definition 2.1 (Deterministic Functions) Let f be a function defined in a program P. We say that f is a deterministic function iff [[f t̄n]] is an ideal for every t̄n s.t. ti is a CTerm⊥ for all i = 1 ... n. We call a function non-deterministic if it does not fulfill the previous definition.

The intuitive idea behind a deterministic function is that it returns at most one result for any arbitrary ground parameters [7]. In addition, in a lazy setting, whenever a function returns some value t it is expected to return all the less defined terms s ⊑ t as well. The previous definition of deterministic function takes this idea into account. Consider for instance the following small program:

data pair = pair int int
f 1 = pair 1 2
g 1 = 1
g 1 = 2
Using CRWL it can be proved that [[f 1]] = {⊥, pair ⊥ ⊥, pair 1 ⊥, pair ⊥ 2, pair 1 2}, [[f t]] = {⊥} if t ≠ 1, [[g 1]] = {⊥, 1, 2}, and [[g t]] = {⊥} if t ≠ 1. Then g is a non-deterministic function because for the parameter 1 the set {⊥, 1, 2} is not an ideal, in particular because it is not directed: taking x = 1, y = 2, it is not possible to find z ∈ {⊥, 1, 2} s.t. x ⊑ z, y ⊑ z. On the other hand, it is easy to check that f is a deterministic function.

2.3  Non-ambiguous functions
Definition 2.1 is only a formal definition and cannot be used in practice. In [4] an adaptation of the non-ambiguity condition of [11] is presented, which we will use as an easy mechanism for the effective recognition of deterministic functions. Although not all deterministic functions are non-ambiguous, the non-ambiguity criterion will be enough for detecting several interesting deterministic functions.

Definition 2.2 (Non-ambiguous functions) Let P be a program defining a set of functions G. We say that F ⊆ G is a set of non-ambiguous functions if every f ∈ F verifies:
(i) If f t̄n = e ⇐ C is a defining rule for f, then var(e) ⊆ var(t̄) and all function symbols in e belong to F.
(ii) For any pair of variants of defining rules for f, f t̄n = e ⇐ C and f t̄′n = e′ ⇐ C′, one of the following two possibilities holds:
  (a) Left-hand sides do not overlap, that is, the terms (f t̄n) and (f t̄′n) are not unifiable.
  (b) If θ is the m.g.u. of f t̄n and f t̄′n, then eθ ≡ e′θ.

In [3,4] the inclusion of the set of non-ambiguous functions in the set of deterministic
functions was claimed. Here, thanks to the previous formal definition, we will be able to prove the result. Before that we need some auxiliary lemmata. The proofs of these results are tedious but straightforward using induction on the structure of the CRWL-proofs, and are not included for the sake of space. The first two lemmata establish substitution properties that will play an important role in the proof. The lemmata use the symbol CSubst for the set of all c-substitutions, which are mappings θ : V → CTerm, and the notation CSubst⊥ for the set of all partial c-substitutions θ : V → CTerm⊥, defined analogously. We write tθ for the result of applying the substitution θ to the term t.

Lemma 2.3 Let t ∈ CTerm, s ∈ CTerm⊥ be such that t ⊑ s. Then there exists a substitution θ ∈ CSubst⊥ verifying tθ = s.

Lemma 2.4 Let t, t′ ∈ CTerm be such that: 1) t, t′ are linear, 2) var(t) ∩ var(t′) = ∅ and 3) there exists γ = m.g.u.(t, t′). Let s ∈ CTerm⊥ be a term and θ, θ′ ∈ CSubst⊥ such that tθ ⊑ s, t′θ′ ⊑ s. Then there exists a substitution θ″ s.t. tγθ″ = t′γθ″ = s.

Lemma 2.5 Let P be a program and e ∈ Term⊥. Then:
i) Let t, t′ ∈ CTerm⊥ be such that P ⊢CRWL e → t and t′ ⊑ t. Then P ⊢CRWL e → t′.
ii) Let θ ∈ CSubst⊥ be s.t. P ⊢CRWL eθ → t. Then P ⊢CRWL eθ′ → t for all θ′ s.t. θ ⊑ θ′.
iii) Let ēn be s.t. ei ∈ Term⊥ for all i = 1 ... n and s.t. P ⊢CRWL e ēn → t, and let a ∈ Term⊥ be such that e ⊑ a. Then P ⊢CRWL a ēn → t.
iv) [[e]] is a cone.

Now we are ready to prove that non-ambiguous functions are deterministic.

Theorem 2.6 Let P be a program and f be a non-ambiguous function defined in P. Then f is deterministic.

Proof. In order to check that f is a deterministic function, we must prove that [[f t̄n]] is an ideal, i.e.:
- [[f t̄n]] is a cone, by Lemma 2.5 item iv).
- [[f t̄n]] is a directed set. We prove a more general result: consider e ∈ Term⊥ and suppose that all the function symbols occurring in e correspond to non-ambiguous functions. Then [[e]] is a directed set.
Let t, t′ ∈ CTerm⊥ be such that (R1): P ⊢CRWL e → t and (R2): P ⊢CRWL e → t′. We prove that there exists s ∈ CTerm⊥ s.t.: a) t ⊑ s, b) t′ ⊑ s and c) P ⊢CRWL e → s, by induction on the depth l of a CRWL-proof for e → t.

l = 0. Three possible CRWL inference rules:
• BT. Then t = ⊥ and, defining s = t′, we have: a) ⊥ ⊑ s, b) t′ ⊑ s and c) P ⊢CRWL e → s (by (R2)).
• RF. Then the proof for (R1) must be of the form X → X, and hence e = X and t = X. Then t′ can only be X or ⊥ (otherwise no CRWL inference could be applied and (R2) would not hold). We define s as X and then: a) t ⊑ X, b) t′ ⊑ X, c) P ⊢CRWL e → s by (R1).
• DC. Then e = c, t = c, with c ∈ DC^0. Then t′ must be either c or ⊥. In any case, defining s as c, the result holds.
l > 0. There are three possible inference rules applied at the first step of the proof:
• DC. Then e = c e1 ... em, t = c t1 ... tm with c ∈ DC^n ∪ FS^(n+1), m ≤ n. Analogously t′ = c t′1 ... t′m, and the first inference rules of any proof for (R1) and (R2) must be of the form:

    (R1):  e1 → t1  ...  em → tm              (R2):  e1 → t′1  ...  em → t′m
           ─────────────────────────                 ───────────────────────────
           c e1 ... em → c t1 ... tm                 c e1 ... em → c t′1 ... t′m

  The proofs for P ⊢CRWL ei → ti and P ⊢CRWL ei → t′i have a maximum depth of l − 1. Therefore, by induction hypothesis, there exists si ∈ CTerm⊥ satisfying ti, t′i ⊑ si and P ⊢CRWL ei → si for all 1 ≤ i ≤ m. Then, defining s = c s1 ... sm, t ⊑ s and t′ ⊑ s hold, and P ⊢CRWL e → s with a proof starting with a DC inference.
• JN. Very similar to the previous case.
• FA. Then e is of the form f ēn with ei ∈ CTerm⊥ for i = 1 ... n. Moreover, n is greater than or equal to the program arity of f. Hence an FA inference must also have been applied at the first step of any proof of (R2). In each case a suitable instance (I1) and (I2) must have been used. We call θ and θ′ the substitutions associated to the first and to the second instance respectively, θ, θ′ ∈ CSubst⊥. The first inference step of each proof will be of the following form:

    (1):   e1 → t1θ, ..., ek → tkθ,  Cθ,  rθ → a,  a ek+1 ... en → t
           ───────────────────────────────────────────────────────────
                        f e1 ... ek ek+1 ... en → t

    (2):   e1 → t′1θ′, ..., ek → t′kθ′,  C′θ′,  r′θ′ → a′,  a′ ek+1 ... en → t′
           ──────────────────────────────────────────────────────────────────────
                        f e1 ... ek ek+1 ... en → t′

  with (k ≥ 0), t, t′ ≠ ⊥ and the rule instances:

    I1 : (f t1 ... tk → r ⇐ C)θ ∈ [R]⊥          I2 : (f t′1 ... t′k → r′ ⇐ C′)θ′ ∈ [R]⊥

  Now we consider separately two cases: a) I1 and I2 correspond to the same program rule, and b) each instance corresponds to a different program rule. The first case is easy to check and does not rely on the non-ambiguity criterion. For the sake of space we only include the proof of case b). Assume that I1, I2 are instances of two different program rules. By the non-ambiguity criterion there exists γ = m.g.u.(f t̄k, f t̄′k), i.e. tiγ = t′iγ for i = 1 ... k and rγ = r′γ. Writing ui for tiγ = t′iγ, the rule instances can be seen as (f u1 ... uk → r″ ⇐ Cγ) and (f u1 ... uk → r″ ⇐ C′γ). Now we must look for some s ∈ CTerm⊥ such that: a) t ⊑ s, b) t′ ⊑ s and c) P ⊢CRWL f ēn → s for some substitution θ″. The proof of c) can be of one of these two forms:

    (4):   e1 → u1θ″, ..., ek → ukθ″,  Cγθ″,  r″θ″ → a″,  a″ ek+1 ... en → s
           ──────────────────────────────────────────────────────────────────
                        f e1 ... ek ek+1 ... en → s

    (5):   e1 → u1θ″, ..., ek → ukθ″,  C′γθ″,  r″θ″ → a″,  a″ ek+1 ... en → s
           ───────────────────────────────────────────────────────────────────
                        f e1 ... ek ek+1 ... en → s

  We observe that γ unifies the heads and fuses the right-hand sides, but it does not relate C and C′. We consider form (4) (form (5) is analogous). From
the premises of (1) and (2) we know that P ⊢CRWL ei → tiθ and P ⊢CRWL ei → t′iθ′ for i = 1 ... k. By induction hypothesis there exists si ∈ CTerm⊥ s.t.: a) tiθ ⊑ si, b) t′iθ′ ⊑ si, and c) P ⊢CRWL ei → si. Since ti, t′i are unified by γ, we can apply Lemma 2.4. Then there exist substitutions θi, which we can restrict to the variables in ui, s.t. uiθi = si. (u1, ..., uk) is a linear tuple because (t1, ..., tk) and (t′1, ..., t′k) are both linear. Then we can define a substitution θ″ as:

    θ″(X) = θi(X)   if X ∈ var(ti, t′i) for some i, 1 ≤ i ≤ k
    θ″(X) = θ(X)    otherwise
ensuring that there exist CRWL-proofs of ei → uiθ″ for all i ∈ {1, ..., k} in (4) (this is because uiθi = uiθ″). Checking that the rest of the premises of (4) also have CRWL-proofs requires similar arguments.  □

The non-ambiguity condition characterizes a set of functions F as deterministic. This is because the value of a function may depend on other functions, and in general this dependence can be mutual. In practice the implementation starts with an empty set F of non-ambiguous functions, adding at each step to F those functions that satisfy the definition and that only depend on functions already in F. This is done until a fix-point for F is reached.

Although most of the deterministic functions that occur in a program are non-ambiguous as well, there are some functions which are not detected. This happens for instance for the function f of the following example:

f 1 = 1
f 1 = g 1
g 1 = 1

It would be useful to use additional determinism criteria, such as those based on abstract interpretation proposed in [12], but the detection of deterministic functions would still be incomplete. For that reason the system allows the programmer to mark deterministic functions by annotating them with --> instead of =, as in the following example:

f 1 --> 1
f 1 --> g 1
g 1 = 1

which indicates that f is deterministic. The non-annotated functions like g will be analyzed following the non-ambiguity criterion.
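The fix-point computation described above can be pictured with a small Curry-style sketch. The sketch is ours, not the actual TOY analysis; it assumes that each function has already been paired with the list of functions called in its right-hand sides and with a flag telling whether its rules pass the purely syntactic checks of Definition 2.2.

-- Hypothetical input: (function name, functions called in its rules,
--                      syntactic checks of Definition 2.2 passed?)
nonAmbiguous :: [(String,[String],Bool)] -> [String]
nonAmbiguous candidates = fix []
 where
  -- add every candidate whose called functions are already known to be
  -- non-ambiguous (or the candidate itself, to allow recursion), until stable
  fix fs | fs' == fs = fs
         | otherwise = fix fs'
   where fs' = [ f | (f,calls,ok) <- candidates, ok, all (`elem` (f:fs)) calls ]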
3  Pruning Deterministic Computations
In this section we briefly present the two different situations where the dynamic cut can be introduced.

3.1  Deterministic Functions Defined through Overlapping Program Rules
Sometimes deterministic functions can be defined in a natural way by using overlapping rules. Consider for instance the two programs of Figure 1. Both programs contain functions for computing arithmetic using Peano's representation. The function toNat is used for easily converting positive numbers of type int to their Peano representation. The only difference between P1 and P2 is the method for multiplying numbers. The function multi at P2, which we have called 'classical', reduces the first argument before each recursive call until it becomes zero. The method multi of P1, which we have called 'parallel', reduces both arguments before the recursive call. Observe that the first two rules of multi in P1 are overlapping. However, it is easy to check that it is a non-ambiguous and hence a deterministic function.
% P1: 'Parallel' multiplication
data nat = zero | s nat

add zero Y        = Y
add (s X) Y       = s (add X Y)

multi zero B      = zero
multi A zero      = zero
multi (s X) (s Y) = s (add X (add Y (multi X Y)))

power N zero      = s zero
power N (s M)     = multi N (power N M)

odd zero          = false
odd (s zero)      = true
odd (s (s N))     = odd N

toNat N = if (N==0) then zero else s (toNat (N-1))

% P2: 'Classical' multiplication
data nat = zero | s nat

add zero Y        = Y
add (s X) Y       = s (add X Y)

multi zero Y      = zero
multi (s X) Y     = add Y (multi X Y)

power N zero      = s zero
power N (s M)     = multi N (power N M)

odd zero          = false
odd (s zero)      = true
odd (s (s N))     = odd N

toNat N = if (N==0) then zero else s (toNat (N-1))

Fig. 1. Two methods for multiplying

multi (toNat X) (toNat Y)
  X        Y        P1     P2
  0        100000   0      0
  0        50000    0      0
  100      1000     2.7    2.7
  400      400      4.1    4.1
  1000     100      4.9    3.5
  50000    0        0      ∞
  100000   0        0      ∞

odd (power zero (toNat N)), without dynamic cut
  N       P1     P2
  10^4    0.7    0
  10^5    6.1    0
  10^6    60.0   0
  10^7    ∞      0

odd (power zero (toNat N)), with dynamic cut
  N       P1     P2
  10^4    0      0
  10^5    0      0
  10^6    0      0
  10^7    0      0

Fig. 2. Comparative tables
The first table of Figure 2 shows the time 4 required for computing the first answer for goals of the form multi (toNat X) (toNat Y) == R in both programs. The symbol ∞ means that the system has run out of memory for the goal. From this data it is clear that the parallel multi of P1 behaves better than its classical counterpart of P2. The reason is that in P1 the computation of multi reduces the two arguments simultaneously, saving both time and space. However, this kind of 'parallel' definition is not used very often in Functional-Logic Programming because programmers know that overlapping rules can produce unexpected behaviors due to the backtracking mechanism. Indeed, using P1 a goal like multi zero zero == R has two solutions, both giving R the value zero, instead of only one as expected (and as the program P2 does). Such redundant computations can affect the efficiency of other computations. The central table of Figure 2 contains the time required by both programs for checking if the N-th power of zero is odd without the dynamic cut optimization. The goal returns no in both cases as expected, but we observe that now P1 behaves rather worse than P2, even running out of memory for large enough numbers. This is because the subgoal power zero (toNat N) needs to compute N multiplications, and in P1 this means N redundant computations. Thus, using P1 without dynamic cut, the goal odd (power zero (toNat N)) will check N times if zero is odd, while in P2 this is done only once. The dynamic cut solves this situation, detecting that multi in P1 is a deterministic function and cutting the possibility of using the second rule of multi if the first one has succeeded producing a result (and satisfying some conditions explained below). The third table, at the right of Figure 2, has been obtained after activating the dynamic cut. The problem of the redundant computations has been solved. It is worth pointing out that the data of the first table do not change after activating the optimization, because all the goals considered produce only one answer, and the dynamic cut optimization only has effect on the second and subsequent answers.

4 All the results are displayed in seconds, obtained on a computer at 2.13 GHz with 1 Gb of RAM.

data nucleotides = adenine | guanine | cytosine | thymine

compatible adenine thymine  = true
compatible thymine adenine  = true
compatible guanine cytosine = true
compatible cytosine guanine = true

dna [ ] [ ]         = true
dna [N1|R1] [N2|R2] = true ⇐= compatible N1 N2, dna R1 R2

dnaPart S1 S2 L = true ⇐= part P1 S1 L, part P2 S2 L, dna P1 P2

part X Y L = true ⇐= (U ++ X) ++ V == Y, length X == L

Fig. 3. Detecting DNA strands

3.2  Existential variables in conditions
Consider now the program of Figure 3. It includes a simple representation of DNA molecules, which are built by two chains of nucleotides. The nucleotides of the two strands are connected in compatible pairs, defined in the program through the function compatible. The function dna detects if its two input parameters represent two strands that can be combined in a DNA molecule. Function dnaPart checks if the two input sequences S1 and S2 contain some subsequences P1 and P2 of length L that can occur associated in a DNA molecule. This function relies on the function part, which checks if the parameter X is a sublist of length L of the list Y. The functions ++ and length represent respectively the concatenation of lists and the number of elements in a list. Consider the following session in the system TOY:

Toy> dnaPart (repeat 1000 adenine) (repeat 1000 thymine) 5
yes.
Elapsed time: 844 ms.
more solutions? y
yes.
Elapsed time: 40390 ms.
The goal dnaPart (repeat 1000 adenine) (repeat 1000 thymine) 5 asks if in two strands of 1000 nucleotides of adenine and thymine, respectively, it is possible to find two subsequences of 5 nucleotides, one from each strand, which can occur associated in a DNA molecule. The answer given by the system after 0.8 seconds is yes (actually all the subsequences of n elements of the first strand are compatible with all the subsequences of n elements of the second strand). If the user asks for a second answer, the same redundant answer yes is obtained after more than 40 seconds. The second answer is useless because it doesn't provide new information, and greatly affects the efficiency. It can be argued that there is no point in asking for a second
answer after the first, but this situation can occur as a subcomputation of a bigger computation and cannot be avoided in general. Examining the code we easily find the source of the redundant computation: the condition of function part includes two existential variables U and V. When the user asks for more solutions the backtracking mechanism looks for new values of the variables satisfying the conditions. But this is unnecessary because the rule has already returned true and cannot return any new value. The dynamic cut will avoid this redundant computation. Here is the same goal running after activating the dynamic cut optimization in TOY:

Toy> dnaPart (repeat 1000 adenine) (repeat 1000 thymine) 5
yes.
Elapsed time: 844 ms.
more solutions? y
no.
Elapsed time: 0 ms.
Now the system detects automatically that there are no more possible solutions after the first one, reducing the 40 seconds to 0. The interested reader can find more experimental results in [4]. The experiments in that paper were carried out by manually introducing the code for the dynamic cut, before the optimization was part of the system. However, the results have been confirmed by the current implementation.

3.3  Dynamic conditions for the cut
From the previous examples one could think that the cut can be introduced safely in the code of the functions multi and part without taking into account any run-time test. But the cut also depends on dynamic conditions. There are two situations that must be taken into account before applying the cut:

i) Variable bindings. Consider the goal multi X zero == R, with X a logical variable. Using the program P1 of Figure 1 this goal produces two answers: { X ↦ zero, R ↦ zero } and { R ↦ zero }. The first answer is obtained using the first rule for multi and the second answer through the second rule. Introducing a cut after the first answer would be unsafe; the second answer is not redundant, but gives new information w.r.t. the first one. As it includes no binding for X, it can be interpreted as 'for every X, the equality multi X zero == zero holds', and therefore subsumes the first answer.

ii) Non-deterministic functions computed. Suppose we include a new function zeroOrOne in the program P1 of Figure 1 defined as:

zeroOrOne = zero
zeroOrOne = s zero

Then a goal like multi zeroOrOne (s zero) == R will return two answers: { R ↦ zero } and { R ↦ s zero }. Introducing the cut after the first answer would again be unsafe. But in this case it is not because it prevents the use of the second rule, but because it would avoid the backtracking of the non-deterministic function zeroOrOne that leads to the application of the third rule of multi, yielding the second answer.

Therefore the cut must not take place if, after obtaining the first result of the deterministic function, any of the variables in the input arguments has been bound or a non-deterministic function has been computed. As we will see in the following section, the implementation generates a dynamic test for checking these conditions
before introducing the cut.
4  Implementing the Dynamic Cut
4.1  Compiling programs into Prolog
The TOY compiler transforms TOY programs into Prolog programs following ideas described in [8]. A main component of the operational mechanism is the computation of head normal forms (hnf) for expressions. The translation scheme can be divided into three phases:
1) Higher-order TOY programs are translated into programs in first-order syntax.
2) Function calls f(e1, ..., en) occurring in the first-order TOY program rules are replaced by Prolog terms of the form susp(f(e1, ..., en), R, S) called suspensions. The logical variable S is a flag which is bound to a concrete value, say hnf, once the suspension is evaluated. R contains the result of evaluating the function call. Its value is meaningful only if S==hnf holds.
3) Finally the Prolog clauses are generated, adding code for strict equality and hnf (to compute head normal forms). Each n-ary function f is translated into a Prolog predicate f(X1, ..., Xn, H). When computing a hnf for an unevaluated suspension susp(f(X1,...,Xn),R,S), a call f(X1,...,Xn,H) will occur in order to obtain in H the desired head normal form.
We are particularly interested in the third phase (code generation), since it will be affected by the introduction of dynamic cuts. Before looking more closely at this phase we need to introduce briefly our notation for definitional trees.

4.2  Definitional Trees in TOY
Before generating the code for any function the compiler builds its associated definitional tree. In our setting the definitional tree dt of a function f can be of one of the following three forms:
• dt(f) = f(t̄n) → case X of ⟨c1(X̄m1) : dt1; ...; ck(X̄mk) : dtk⟩, where X is the variable at position u in f(t̄n) and c1 ... ck are constructor symbols, with dti a definitional tree for i = 1 ... k.
• dt(f) = f(t̄n) → or ⟨dt1 | ... | dtk⟩, with dti a definitional tree for i = 1 ... k.
• dt(f) = f(t̄n) → try (r ⇐ C), with f t̄n = r ⇐ C corresponding to an instance of a program rule for f.
In each case we say that the tree has a case/or/try node at the root, respectively. A more precise definition, together with the algorithm that produces a definitional tree from a function definition, can be found in [8]. The only difference is that we do not allow 'multiple tries', i.e. try nodes including several program rules, replacing them by or nodes with multiple try child nodes, one for each rule included in the initial multiple try. The tree obtained by this modification is obviously equivalent and will be more suitable for our purposes. As an example of a definitional tree, consider again the definition of function multi in the program P1 of Figure 1.
Its definitional tree, denoted dt(multi), is defined in TOY as:

dt(multi) =
  multi(A,B) → or ⟨
      multi(A,B) → case A of
        ⟨ zero : multi(zero, B) → try (zero)                                        % 1st rule
        ; s(X) : multi(s(X), B) → case B of
            ⟨ s(Y) : multi(s(X), s(Y)) → try (s (add X (add Y (multi(X,Y))))) ⟩     % 3rd rule
        ⟩
    | multi(A,B) → case B of
        ⟨ zero : multi(A, zero) → try (zero) ⟩                                      % 2nd rule
    ⟩
4.3  Definitional trees with cut
From the definitional tree dt of each function the TOY system generates a definitional tree with cut, dtc. Definitional trees with cut have the same structure as usual definitional trees. The only difference is that they rename some or and try nodes as orCut and tryCut, respectively. We define a function Γ transforming a definitional tree dt into its corresponding definitional tree with cut straightforwardly by distinguishing cases depending on the root node of dt:
• Γ( f(t̄n) → case X of ⟨c1(X̄m1) : dt1; ...; ck(X̄mk) : dtk⟩ ) = f(t̄n) → case X of ⟨c1(X̄m1) : Γ(dt1); ...; ck(X̄mk) : Γ(dtk)⟩
• Γ( f(t̄n) → or ⟨dt1 | ... | dtk⟩ ) = f(t̄n) → orCut ⟨Γ(dt1) | ... | Γ(dtk)⟩, if f is deterministic.
• Γ( f(t̄n) → or ⟨dt1 | ... | dtk⟩ ) = f(t̄n) → or ⟨Γ(dt1) | ... | Γ(dtk)⟩, if f is non-deterministic.
• Γ( f(t̄n) → try (r ⇐ C) ) = f(t̄n) → tryCut (r ⇐ C), if some existential variable occurs in C (i.e. some variable occurs in C but not in the rest of the program rule).
• Γ( f(t̄n) → try (r ⇐ C) ) = f(t̄n) → try (r ⇐ C), if no existential variable occurs in C.
For instance the dt of function multi displayed above is transformed into the following definitional tree with cut (denoted dtc(multi)):

dtc(multi) =
  multi(A,B) → orCut ⟨
      multi(A,B) → case A of
        ⟨ zero : multi(zero, B) → try (zero)                                        % 1st rule
        ; s(X) : multi(s(X), B) → case B of
            ⟨ s(Y) : multi(s(X), s(Y)) → try (s (add C (add D (multi(C,D))))) ⟩     % 3rd rule
        ⟩
    | multi(A,B) → case B of
        ⟨ zero : multi(A, zero) → try (zero) ⟩                                      % 2nd rule
    ⟩
Notice that the only difference corresponds to the root, which has been transformed into an orCut node because multi is a deterministic function.

4.4  Generating the code
Now we can describe the function prolog(f, dtc) which generates the code for a function f from its definitional tree with cut dtc. The function definition depends on the node found at the root of dtc. There are five possibilities:

Case 1. dtc = f(s̄) → case X of ⟨c1(X̄m1) : dtc1; ...; cm(X̄mk) : dtcm⟩. Then:
    prolog(g, dtc) = {g(s̄, H) :− hnf(X, HX), g′(s̄σ, H).} ∪ prolog(g′, dtc1) ∪ ... ∪ prolog(g′, dtcm)

where σ = X/HX and g′ is a new function symbol. The first call to hnf ensures that the position indicated by X is already in head normal form, and therefore can be used in order to distinguish the different alternatives.

Case 2. dtc = f(s̄) → or ⟨dtc1 | ... | dtcm⟩. Then:

    prolog(g, dtc) = {g(s̄, H) :− g1(s̄, H).} ∪ ... ∪ {g(s̄, H) :− gm(s̄, H).} ∪ prolog(g1, dtc1) ∪ ... ∪ prolog(gm, dtcm)

where g1, ..., gm are new function symbols. In this case each new function symbol represents one of the non-deterministic choices.

Case 3. dtc = f(s̄) → orCut ⟨dtc1 | ... | dtcm⟩. Then:

    prolog(g, dtc) = {g(s̄, H) :− varlist(s̄, Vs), g′(s̄, H), (checkvarlist(Vs), ! ; true).}
                   ∪ {g′(s̄, H) :− g1(s̄, H).} ∪ ... ∪ {g′(s̄, H) :− gm(s̄, H).}
                   ∪ prolog(g1, dtc1) ∪ ... ∪ prolog(gm, dtcm)

where g′, g1, ..., gm are new function symbols. Observe the differences with Case 2:
• A new function g′ is used as an intermediate auxiliary function between g and the non-deterministic choices.
• g starts by calling a predicate varlist. This predicate, whose definition is tedious but straightforward, returns in its second parameter Vs a list containing all the logical variables in the input parameters, including those used as flags for detecting the evaluation of suspensions of non-deterministic functions.
• After g′ succeeds, i.e. after an or-branch has produced a result, the test for the dynamic cut is performed. This test, represented by the predicate checkvarlist, checks whether any of the variables in the list produced by varlist has been bound. That would mean that either an input logical variable has been bound or a non-deterministic function has been evaluated; in any of these cases the cut is avoided. Otherwise the dynamic cut, which is implemented as an ordinary Prolog cut, is safely performed. The definition of checkvarlist is simple:

  checkVarList([ ]).
  checkVarList([X|Xs]) :- var(X), \+ varInList(X,Xs), checkVarList(Xs).

  The literal \+ varInList(X,Xs) fails if the variable X occurs again in the list, thereby detecting bindings among variables of the list.
Case 4. dtc = try (e ⇐ l1 == r1, ..., ln == rn). Then:

    prolog(g, dtc) = { g(s̄, H) :− equal(l1, r1), ..., equal(ln, rn), hnf(e, H). }

If all equalities in the conditions are satisfied the program rule returns the head
normal form of its right-hand side e.

Case 5. dtc = tryCut (e ⇐ l1 == r1, ..., ln == rn). Then:

    prolog(g, dtc) = { g(s̄, H) :− varlist((s̄, e), Vs), equal(l1, r1), ..., equal(ln, rn), (checkvarlist(Vs), ! ; true), hnf(e, H). }

This case is similar to the case of orCut. The main difference is that here we also collect the possible new variables of the right-hand side, because if the condition binds any of them the cut must be discarded.
4.5  Examples
Now we show the Prolog code generated by TOY for some of the function examples presented throughout the paper:

• Prolog code for function part of Figure 3:

  part(A, B, C, true) :-
      varList([A, B, C], Vs),
      equal(susp(++, [susp(++, [D,A]), J]), B),
      equal(susp(length, [A]), C),
      (checkVarList(Vs), ! ; true).

  This corresponds to the implementation of a tryCut node. In this example varList only looks for variables and non-deterministic functions in the parameters A, B and C, because the right-hand side of this rule is the ground term true.

• Prolog code for function multi of Figure 1:

  multi(A, B, H)  :- varList([A,B], Vs), multi'(A, B, H), (checkVarList(Vs), ! ; true).
  multi'(A, B, H)    :- hnf(A, F), multi'_1(F, B, H).
  multi'(A, B, zero) :- hnf(B, zero).
  multi'_1(zero, B, zero).
  multi'_1(s(X), B, s(susp(add,[X,susp(add,[Y,susp(multi,[X,Y])])]))) :- hnf(B, s(Y)).

  The code of this example corresponds to the implementation of an orCut node. The two branches are represented here by the two clauses for multi' (corresponding to function g′ in Case 3 of the previous subsection). The cut is introduced if the first alternative, which corresponds to a case node with two possibilities, succeeds.
5  Conclusions
In this paper we have presented the implementation of the dynamic cut optimization in the Functional-Logic system TOY. The optimization dramatically improves the efficiency of the computations in the situations explained in the paper. Moreover, we claim that in practice it allows the use of some elegant and expressive function definitions that up to now were disregarded due to their inefficiency. The cut is introduced automatically by the system following these steps:
(i) The deterministic functions of the program are detected using the non-ambiguity criterion. The correctness of the criterion is ensured by Theorem 2.6. The user can also indicate explicitly that a function is deterministic.
(ii) The definitional tree associated with each program function is examined. The or nodes occurring in deterministic functions are labeled during this process as or-cut nodes. Also, the try nodes corresponding to program rules including existential variables in the conditions are labeled as try-cut nodes.
(iii) During the code generation the system will generate the dynamic cut code for or-cut and try-cut nodes. However, the cut will only be performed if the dynamic conditions explained in Subsection 3.3 are fulfilled.
We think that a similar scheme might also be used for incorporating the dynamic cut into the Prolog-based implementations of the Curry language [6]. Currently the dynamic cut must be turned on in TOY by typing the command /cut at the prompt. However, we have checked that the optimization produces almost no overhead in the cases where it cannot be applied, and we plan to provide it activated by default in future versions of the system.
References [1] Antoy, S., Definitional trees, in: Int. Conf. on Algebraic Logic Programming (ALP’92), number 632 in LNCS (1992), pp. 143–157. [2] Antoy, S., R.Echahed and M. Hanus, A needed narrowing strategy, Journal of the ACM 47 (2000), pp. 776–822. [3] Caballero, R. and F. L´ opez-Fraguas, Dynamic-cut with definitional trees, in: Proceedings of the 6th International Symposium on Functional and Logic Programming, FLOPS 2002, number 2441 in LNCS (2002), pp. 245–258. [4] Caballero, R. and F. L´ opez-Fraguas, Improving deterministic computations in lazy functional logic languages, Journal of Functional and Logic Programming 2003 (2003). [5] Gonz´ alez-Moreno, J., M. Hortal´ a-Gonz´ alez, F. L´ opez-Fraguas and M. Rodr´ıguez-Artalejo, An approach to declarative programming based on a rewriting logic, The Journal of Logic Programming 40 (1999), pp. 47–87. [6] Hanus, M., Curry: An Integrated Functional Logic Language (version 0.8.2. march 28, 2006), Available at: http://www.informatik.uni-kiel.de/ curry/papers/report.pdf (2006). [7] Henderson, F., Z. Somogyi and T. Conway, Determinism analysis in the mercury compiler (1996). URL citeseer.ist.psu.edu/henderson96determinism.html [8] Loogen, R., F. L´ opez-Fraguas and M. Rodr´ıguez-Artalejo, A demand driven computation strategy for lazy narrowing, in: Int. Symp. on Programming Language Implementation and Logic Programming (PLILP’93), number 714 in LNCS (1993), pp. 184–200. [9] F. L´ opez-Fraguas and J. S´ anchez-Hern´ andez, Toy: a multiparadigm declarative system, in: Int. Symp. RTA’99, number 1631 in LNCS (1999), pp. 244–247. [10] Loogen, R. and S. Winkler, Dynamic detection of determinism in functional-logic languages, in: Int. Symp. on Programming Language Implementation and Logic Programming (PLILP’91), number 528 in LNCS (1991), pp. 335–346. [11] Loogen, R. and S. Winkler, Dynamic detection of determinism in functional logic languages, in: J. Maluszynski and M. Wirsing, editors, Programming Language Implementation and Logic Programming: Proc. of the 3rd International Symposium PLILP’91, Passau, Springer, Berlin, Heidelberg, 1991 pp. 335–346. [12] Pe˜ na, R. and C. Segura, Non-determinism analyses in a parallel-functional language, Journal of Logic Programming 2004 (2005), pp. 67–100. [13] Sawamura, H. and T. Takeshima, Recursive Unsolvability of Determinacy, Solvable Cases of Determinacy and Their Applications to Prolog Optimization, in: Proceedings of the Symposium on Logic Programming, 1985, pp. 200–207.
WFLP 2006
Implementing Relational Specifications in a Constraint Functional Logic Language

Rudolf Berghammer and Sebastian Fischer 1
Institut für Informatik, Universität Kiel
Olshausenstraße 40, 24098 Kiel, Germany
Abstract We show how the algebra of (finite, binary) relations and the features of the integrated functional logic programming language Curry can be employed to solve problems on relational structures (like orders, graphs, and Petri nets) in a very high-level declarative style. The functional features of Curry are used to implement relation algebra and the logic features of the language are combined with BDD-based solving of boolean constraints to obtain a fairly efficient implementation of a solver for relational specifications. Keywords: Functional programming, constraint solving, Curry, relation algebra, relational specifications
1  Introduction
For many years, relation algebra has widely been used by mathematicians and computer scientists as a convenient means for problem solving. Its use in Computer Science is mainly due to the fact that many datatypes and structures (like graphs, hyper-graphs, orders, lattices, Petri nets, and data bases) can be modeled via relations, problems on them can be specified naturally by relation-algebraic expressions and formulae, and problem solutions can benefit from relation-algebraic reasoning and computations. A lot of examples and references to relevant literature can be found, e.g., in [17,4,6,13]. In fortunate cases a relational specification is executable as it stands, i.e., is an expression that describes an algorithm for computing the specified object. Then we have the typical situation where a tool like RelView [2] for mechanizing relational algebra is directly applicable. But in large part relational specifications are non-algorithmic as they implicitly specify the object to be computed by a set of properties. Here the standard approach is to intertwine a certain program development method with relation-algebraic calculations to obtain an algorithm (a 1
Email: {rub,sebf}@informatik.uni-kiel.de
This paper is electronically published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs
Berghammer and Fischer
relational program) that implements a relational specification (see [15,1] for some typical examples). Rapid prototyping at the specification level is not applied. As a consequence, specification errors usually are discovered at later stages of the program development. To avoid this drawback, we propose an alternative approach to deal with relational specifications (and programs). We formulate them by means of the functional logic programming language Curry [8]. Concretely, this means that we use the operational features of this language for implementing relation algebra. On that score our approach is similar to [16,11], the only implementations of relation algebra in a functional language we are aware of. But, exceeding [16,11], we use the logical problem-solving capabilities of Curry for formulating relational specifications and nondeterministically searching for solutions. As we will demonstrate, this allows to prototype a lot of implicit relational specifications. To enhance efficiency, we employ a boolean constraint solver that is integrated into the Curry language. The integration of constraint solving over finite domains and real numbers into functional logic languages has been explored in [12,7]. We are not aware of other approaches that integrate boolean constraints into a functional logic language or combine them with relation algebra to express constraints over relations. The implementation of relation algebra in Curry enables to formulate relational programs within this language. In respect thereof, we even can do more than RelView since Curry is a general purpose language in contrast to the rather restricted language of RelView. The remainder of this paper is organized as follows. Sections 2 and 3 provide some preliminaries concerning relation algebra and the programming language Curry. In Section 4 we show how the functional features of Curry can be employed for elegantly implementing the constants and operations of relation algebra and, based on this, the logical features of the language can be employed for directly expressing relational problem specifications. Some examples for our approach are presented in Section 5, where we also report on results of practical experiments. Section 6 contains concluding remarks.
2
Relation-algebraic Preliminaries
In the following, we first introduce the basics of relation algebra. Then we show how specific relations, viz. vectors and points, can be used to model sets. For more details concerning relations, see, e.g., [17,4]. 2.1
Relation Algebra
We write R : X ↔ Y if R is a relation with domain X and range Y , i.e., a subset of X × Y . If the sets X and Y of R’s type X ↔ Y are finite and of size m and n, respectively, we may consider R as a boolean m × n matrix. Since a boolean matrix interpretation is well suited for many purposes, in the following we often use matrix terminology and notation. Especially we speak about rows and columns and write Rx,y instead of hx, yi ∈ R or x R y. We assume the reader to be familiar with the basic operations on relations, viz. RT (inversion, transposition), R (complement, negation), R ∪ S (union, join), R ∩ S (intersection, meet), and RS (composition, multiplication, denoted by juxtaposition), the predicate R ⊆ S (inclusion), and the 154
Berghammer and Fischer
special relations O (empty relation), L (universal relation), and I (identity relation). 2.2
Modeling of Sets
There are some relation-algebraic possibilities to model sets. In this paper we will use (row) vectors, which are relations v with v = Lv. Since for a vector the domain is irrelevant, we consider in the following mostly vectors v : 1 ↔ X with a specific singleton set 1 := {⊥} as domain and omit in such cases the first subscript, i.e., write vx instead of v⊥,x . Such a vector can be considered as a boolean matrix with exactly one row, i.e., as a boolean row vector or a (linear) list of truth values, and represents the subset {x ∈ X | vx } of X. A non-empty vector v is said to be a point if v T v ⊆ I, i.e., v is a non-empty functional relation. This means that it represents a singleton subset of its domain or an element from it if we identify a singleton set with the only element it contains. Hence, in the boolean matrix model a point v : 1 ↔ X is a boolean row vector in which exactly one component is true.
3
Functional Logic Programming with Curry
The functional logic programming language Curry [8,10] aims at integrating different declarative programming paradigms into a single programming language. It can be seen as a syntactic extension of Haskell [14] with partial data structures and a different evaluation strategy. The operational semantics of Curry is based on lazy evaluation combined with a possible instantiation of free variables. On ground terms the operational model is similar to lazy functional programming, while free variables are nondeterministically instantiated like in logic languages. Nested expressions are evaluated lazily, i.e., the leftmost outermost function call is selected for reduction in a computation step. If in a reduction step an argument value is a free variable and demanded by an argument position of the left-hand side of some rule, it is either instantiated to the demanded values nondeterministically or the function call suspends until the argument is bound by another concurrent computation. Binding free variables is called narrowing; suspending calls on free variables is called residuation. Curry supports both strategies because which of them is right depends on the intended meaning of the called function. 3.1
Datatypes and Function Declarations
Curry supports algebraic datatypes that can be defined by the keyword data followed by a list of constructor declarations separated by the symbol “|”. For example, the following two declarations introduce the predefined datatypes for boolean values and polymorphic lists, respectively, where the latter usually is written as [a]: data Bool = True | False data List a = [] | a : List a Later we will also use the following two functions on lists: any :: (a -> Bool) -> [a] -> Bool null :: [a] -> Bool 155
Berghammer and Fischer
The first function returns True if its second argument contains an element that satisfies the given predicate and the second function checks whether the given list is empty. Type synonyms can be declared with the keyword type. For example, the following definition introduces matrices with entries from a type a as lists of lists:

type Matrix a = [[a]]

Curry functions can be written in prefix or infix notation and are defined by rules that are evaluated nondeterministically. The four declarations

not :: Bool -> Bool
not True  = False
not False = True

(&&), (||) :: Bool -> Bool -> Bool
True  && b = b
False && _ = False
True  || _ = True
False || b = b

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)

introduce three well-known boolean combinators (conjunction and disjunction as infix operations) and list concatenation (also as an infix operation).
3.2  Nondeterministic Search
The logic features of Curry can be employed to nondeterministically search for solutions of constraints. Constraints are represented in Curry as values of the specific type Success. The always satisfied constraint is denoted by success and two constraints can be combined into a new one with the following concurrent conjunction operator:

(&) :: Success -> Success -> Success

To constrain a boolean expression to be satisfied one can use the following function that maps the boolean constant True to success and is undefined for the boolean constant False:

satisfied :: Bool -> Success
satisfied True = success

Based on this function, the following function takes a predicate and nondeterministically computes a solution for the predicate using narrowing:

find :: (a -> Bool) -> a
find p | satisfied (p x) = x  where x free
Here the part of the rule between the two symbols “|” and “=” is called a guard and must be satisfied to apply the rule. Furthermore, the local declaration where x free declares x to be an unknown value. Based on this, the function find can be used to solve boolean formulae. For example, given the definition

one :: (Bool,Bool) -> Bool
one (x,y) = x && not y || not x && y

for the boolean formula (x ∧ ¬y) ∨ (¬x ∧ y), the call (find one) evaluates to (True,False) or (False,True). This means that the formula holds iff x is assigned to the truth value true and y is assigned to the truth value false, or x is assigned to false and y is assigned to true.
3.3  Boolean Constraint Solving
Boolean formulae can be solved more efficiently using binary decision diagrams [5]. Therefore, the PAKCS [9] implementation of Curry contains a specific library CLPB that provides Constraint Logic Programming over Booleans based on BDDs. In this library, boolean constraints are represented as values of type Boolean. There are two constants, viz. the always satisfied constraint and the never satisfied constraint:

true :: Boolean
false :: Boolean

Besides these constants, the library CLPB exports a lot of functions on boolean constraints. For example, there are the following nine functions corresponding to the boolean lattice structure of Boolean, where the meaning of the function neg and the operations (.&&), (.||), (.==), and (./=) is obvious and the remaining operations denote the comparison relations on Boolean, with the constant false being defined strictly smaller than the constant true:

neg :: Boolean -> Boolean
(.&&), (.||), (.==), (./=), (.<), (.<=), (.>), (.>=) :: Boolean -> Boolean -> Boolean

Decisive for the applications we will discuss later is the CLPB function

satisfied :: Boolean -> Success

that nondeterministically yields a solution if the argument is a satisfiable boolean constraint by possibly instantiating free variables in the constraint. This function is far more efficient than the narrowing-based version presented in Section 3.2.
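As a usage sketch (ours; it only combines the operations listed above), the formula from Section 3.2 can be restated with the CLPB combinators and solved with satisfied. Whether one call already enumerates all solutions or additional labelling is needed depends on the library, so the sketch is meant only to illustrate how the combinators are composed:

import CLPB

-- (x ∧ ¬y) ∨ (¬x ∧ y) as a boolean constraint
oneB :: Boolean -> Boolean -> Boolean
oneB x y = (x .&& neg y) .|| (neg x .&& y)

-- solving instantiates the free variables x and y to satisfying values
solveOneB :: (Boolean,Boolean)
solveOneB | satisfied (oneB x y) = (x,y)  where x,y free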
4
Implementation of Relation Algebra
In this section we sketch an implementation of relation algebra over finite binary relations in the Curry language. We represent relations as boolean matrices. This allows us to employ the higher-order features of Curry for an elegant formulation of the relation-algebraic operations, predicates, and constants. As we will also
demonstrate, relational constraints can be integrated seamlessly into Curry because we can use free variables to represent unknown parts of a relation. Based on this, the nondeterministic features of Curry permit us to formulate the search for unknown relations that satisfy a given predicate in a natural way.

4.1
Functional Combinators
The functional features of Curry serve well to implement relation algebra. Relations can easily be modeled as an algebraic datatype, and relational operations, predicates, and constants can be defined as Curry functions over this datatype. We only consider relations with finite domain and range. As already mentioned in Section 2.1, such a relation can be represented as a boolean matrix. Guided by the type of matrices introduced in Section 3.1, we define the following type for relations:

type Rel = [[Boolean]]

Note that we use the type Boolean instead of Bool for the matrix elements. This allows us to apply the more efficient constraint solver satisfied of Section 3.3 for solving relational problems. Furthermore, note that in our implementation a vector corresponds to a list which contains exactly one list. The dimension (i.e., the number of rows and columns, respectively) of a relation can be computed using the function

dim :: Rel -> (Int,Int)

and an empty relation, universal relation, and identity relation of a certain dimension can be specified by the functions

O, L :: (Int,Int) -> Rel
I :: Int -> Rel

The definitions of these three functions are straightforward and, therefore, omitted. Next, we consider the inverse of a relation. In the matrix model of relation algebra it can be computed by transposing the corresponding matrix, and in Curry this transposition looks as follows:

inv :: Rel -> Rel
inv xs | any null xs = []
       | otherwise   = map head xs : inv (map tail xs)

Here the predefined function map :: (a -> b) -> [a] -> [b] maps a function over the elements of a list, and the predefined functions head and tail compute the head and the tail of a non-empty list, respectively. The complement of a relation can be easily computed by negating the matrix entries. In Curry this can be expressed by the following declaration, where the function map has to be used twice since we have to map over a matrix, which is a list of lists:

comp :: Rel -> Rel
comp = map (map neg)
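For concreteness, the following sketch shows one possible way the omitted functions dim, O, L, and I could be defined (this is our illustration, not the authors' code; the names O, L, and I follow the paper's notation, although in an actual Curry program function names must start with a lowercase letter):

dim :: Rel -> (Int,Int)
dim xs = (length xs, length (head xs))

O, L :: (Int,Int) -> Rel
O (m,n) = replicate m (replicate n false)   -- all entries false
L (m,n) = replicate m (replicate n true)    -- all entries true

I :: Int -> Rel
I n = [ [ if i == j then true else false | j <- [1..n] ] | i <- [1..n] ]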
Union and intersection are implemented using boolean functions that combine the matrices element-wise using disjunction and conjunction, respectively:

(.|.), (.&.) :: Rel -> Rel -> Rel
(.|.) = elemWise (.||)
(.&.) = elemWise (.&&)

Here the function elemWise is defined as

elemWise :: (a -> b -> c) -> [[a]] -> [[b]] -> [[c]]
elemWise = zipWith . zipWith

using the predefined functions

(.) :: (b -> c) -> (a -> b) -> (a -> c)
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]

for function composition and list combination, respectively. Finally, relational composition can be implemented as multiplication of boolean matrices. A corresponding Curry function looks as follows:

(.*.) :: Rel -> Rel -> Rel
xs .*. ys = [ [ foldr1 (.||) (zipWith (.&&) xrow ycol) | ycol <- inv ys ] | xrow <- xs ]

Here we use a list comprehension and the predefined function foldr1 :: (a -> a -> a) -> [a] -> a that combines all elements of the given list (second argument) with the specified binary operator (first argument).

4.2
Relational Constraints
In Section 2.1 we introduced one more basic combinator on relations, namely relational inclusion. It differs from the other constructions because it does not compute a new relation but yields a truth value. For the applications we have in mind, we understand relational inclusion as a boolean constraint over relations, i.e., a function that takes two relations of the same dimension and yields a value of type Boolean. Its Curry implementation is rather straightforward by combining the already used functions foldr1 and elemWise with the predefined function concat :: [[a]] -> [a] that concatenates a list of lists into a single list:

(.<=.) :: Rel -> Rel -> Boolean
xs .<=. ys = foldr1 (.&&) (concat (elemWise (.<=) xs ys))

Equality of relations can then be expressed as mutual inclusion:

(.==.) :: Rel -> Rel -> Boolean
xs .==. ys = xs .<=. ys .&& ys .<=. xs

Unknown relations are represented as matrices of free variables. The following function takes a dimension and a relational predicate and nondeterministically computes a relation of that dimension satisfying the predicate:

find :: (Int,Int) -> (Rel -> Boolean) -> Rel
find d p | satisfied (p rel) = rel
  where rel = freeRel d

freeRel :: (Int,Int) -> Rel
freeRel (m,n) = map (map freeVar) (replicate m (replicate n ()))
  where freeVar () = let x free in x

The predefined function replicate :: Int -> a -> [a] computes a list of given length that contains only the specified element. The function find is the key to many solutions of relational problems using Curry since it takes a predicate over a relation and nondeterministically computes solutions for the predicate. A generalization of find to predicates over more than one relation is obvious, but for the problems we will consider in this paper this simple version suffices.
5
Applications and Results
Now, we present some example applications. We also report on the results of our practical experiments with the PAKCS implementation of Curry on a PowerPC G4 processor running at 1.33 GHz with 768 MB DDR SDRAM main memory. Unfortunately, the current PAKCS system is not able to use more than 256 MB of main memory, which turned out to be a limitation for some examples.

5.1
Least Elements
In the following first example we present an application of our library that does not rely on the logic features of Curry. We implement the relational specification of a least element of a set with regard to an ordering relation. This specification is not given as a predicate but as a relation-algebraic expression. Let R : X ↔ X be an ordering relation on the set X and v : 1 ↔ X be a vector that represents a subset V of X (we write ¬S for the complement of a relation S and Sᵀ for its transpose). Then the vector v ∩ ¬(v · ¬Rᵀ) : 1 ↔ X is either empty or a point. In the latter case it represents the least element of V with regard to R, since the equivalence

(v ∩ ¬(v · ¬Rᵀ))x ⟺ vx ∧ ¬∃ y : vy ∧ ¬Rᵀy,x ⟺ vx ∧ ∀ y : vy → Rx,y ⟺ x ∈ V ∧ ∀ y : y ∈ V → Rx,y
holds for all x ∈ X. Based on the relational specification v ∩ ¬(v · ¬Rᵀ), in Curry the least element of a set/vector v with regard to an ordering relation R can be computed by the following function:

leastElement :: Rel -> Rel -> Rel
leastElement R v = v .&. comp (v .*. comp (inv R))

The body of this function is a direct translation of the relational specification into the syntax of Curry.

5.2
Linear Extensions
As an example for a relational specification given as a predicate, we consider linear extensions of an ordering relation. A relation R : X ↔ X is an ordering relation on X if it is reflexive, transitive, and antisymmetric. It is well-known how to express these properties relation-algebraically. Reflexivity is described by I ⊆ R, transitivity by RR ⊆ R, and antisymmetry by R ∩ Rᵀ ⊆ I. Hence, we immediately obtain the Curry-predicate

ordering :: Rel -> Boolean
ordering R = refl R .&& trans R .&& antisym R
  where refl r    = I (fst (dim r)) .<=. r
        trans r   = (r .*. r) .<=. r
        antisym r = (r .&. inv r) .<=. I (fst (dim r))

An ordering relation is linear if any two elements are comparable, i.e., if R ∪ Rᵀ = L holds, and a linear extension of an ordering R is a linear ordering that contains R. Using the function find of Section 4.2, a linear extension of R can be computed nondeterministically, and all linear extensions can be collected:

linearExtension :: Rel -> Rel
linearExtension R = find (dim R) ext
  where ext s = ordering s .&& ((s .|. inv s) .==. L (dim s)) .&& (R .<=. s)

allLinearExtensions :: Rel -> [Rel]
allLinearExtensions R = findall (=:= linearExtension R)

Here (=:=) :: a -> a -> Success is the built-in constraint equality, which is partially applied to obtain a predicate on relations, and findall :: (a -> Success) -> [a] encapsulates the search and returns a list of all values that satisfy the given predicate. We use it despite its deficiencies discussed in [3] because it suffices for our purposes. If the given ordering relation represents the dependencies of tasks in a distributed system, each linear extension represents a possible scheduling expressed as a relation of type Rel. To rate a scheduling with regard to some quality factor, it would be more convenient to represent it as an ordered list of tasks. The conversion is accomplished by the following function that relies on the predefined function sortBy :: (a -> a -> Bool) -> [a] -> [a] that sorts a list according to an ordering predicate. The function evaluate :: Boolean -> Bool converts between boolean constraints and values of type Bool, and (xs !! n) selects the n-th element of the list xs.

linearOrderToList :: Rel -> [Int]
linearOrderToList R = sortBy leq (take (length R) [1..])
  where leq x y = evaluate (R !! (x-1) !! (y-1))
narrowing (sec)            k=3     k=4     k=5     k=6     k=7
n=2                        0.04    0.1     0.15    0.37    0.96
n=3                        0.1     0.19    0.49    1.36    3.59
n=4                        0.22    0.39    1.19    3.32    8.85

constraint solving (sec)   k=5     k=10    k=20    k=30    k=40
n=2                        0.05    0.09    0.3     0.64    1.16
n=3                        0.07    0.12    0.48    1.1     1.96
n=4                        0.13    0.22    0.87    1.97    3.62

Fig. 2. Narrowing vs. constraint solving
For the example depicted in Figure 1 there are 16 possible schedulings. We can rate a schedule by the time tasks have to wait for others: for each task we accumulate the run times of all tasks that are scheduled before it and compute the sum of all these delays. If we assign a run time of 2 time units to tasks 2 and 4 and a run time of 1 time unit to all other tasks, the list [1,3,2,5,4,6,7] represents an optimal scheduling.

5.3
Maximal Cliques
As another example for a relational specification given as a predicate, we consider specific sets of nodes of a graph. The adjacency matrix of a graph g is a relation R : X ↔ X on the set X of nodes of g. For our example we restrict ourselves to undirected graphs, i.e., we assume the relation R to be irreflexive (R ⊆ ¬I) and symmetric (R = Rᵀ). A subset C of X is called a clique if for all x, y ∈ C, x ≠ y implies Rx,y. If x ≠ y and Rx,y are even equivalent for all x, y ∈ C, then C is a maximal clique. Similar to the simple calculation in Section 5.1 we can show that a vector v : 1 ↔ X represents a maximal clique of the undirected graph g iff v · ¬(R ∪ I) = ¬v. Hence, a maximal clique of an undirected graph can be specified in Curry as follows, which exactly reflects the relational specification:

maxclique :: Rel -> Rel -> Boolean
maxclique R v = v .*. comp (R .|. I (fst (dim R))) .==. comp v

As a consequence, a single maximal clique or even the set of all maximal cliques of an undirected graph can be computed via

maximalClique :: Rel -> Rel
maximalClique R = find (1, fst (dim R)) (maxclique R)

using the function find similar to Section 5.2.
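As a small usage illustration (this example is ours and not taken from the paper; graph3 and cliques3 are hypothetical names), consider a path graph with three nodes 1, 2, 3 and edges {1,2} and {2,3}, encoded with the CLPB constants true and false:

graph3 :: Rel
graph3 = [ [false, true , false]
         , [true , false, true ]
         , [false, true , false] ]

cliques3 :: [Rel]
cliques3 = findall (=:= maximalClique graph3)

Here cliques3 evaluates to the two vectors representing the maximal cliques {1,2} and {2,3}.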
(Figure: run time in seconds plotted against the number of nodes (0 to 140), for n = 2, 3, 4, for both the narrowing-based and the constraint-based implementation.)
Fig. 3. Computing maximal cliques
5.4
Discussion
Of course, with regard to efficiency, our approach to executing relational specifications cannot compete with specific algorithms for the problems we have considered. It should be pointed out that our intention is not to support the implementation of highly efficient algorithms. We rather strive for automatic evaluation of relational specifications with minimal programming effort and reasonable performance for small problem instances. Therefore, we compared our approach to a narrowing-based implementation that does not rely on a constraint solver but uses the hand-coded function satisfied introduced in Section 3.2. The results of our experiments show that using a constraint solver significantly increases the performance while it preserves the declarative formulation of programs. To compare the constraint-based implementation with the narrowing-based one, we especially used the last example and computed maximal cliques in undirected graphs of different size. Problem instances that are published on the Web and generally used to benchmark specific algorithms for computing cliques could not be solved by our approach with reasonable effort. Therefore, for our benchmarks we generated our own problem instances as the disjoint union of n complete (loopless) graphs with k nodes each, and searched for all maximal cliques in these specific graphs. The run times of our benchmarks for different values of k and n are given in the two tables depicted in Figure 2. Checking correctness was easy since in each case a maximal clique consists of the set of k nodes of a copy of the complete graphs we started with, and there are exactly n maximal cliques. The run time of the constraint-based implementation increases moderately compared to the narrowing-based implementation. To visualize this difference more clearly, the results of the tables are depicted graphically in Figure 3. We could not compute cliques in larger graphs because the constraint solver turned out to be very
memory consuming and fails with a resource error for larger problem instances. The instances that can be solved are solved reasonably fast – the maximal cliques of a graph with 160 nodes are computed in less than 4 seconds. Note that, conceptually, the huge number 2^160 = 1461501637330902918203684832716283019655932542976 of sets of nodes has to be checked in a graph with 160 nodes.
6
Conclusion
In this paper we have demonstrated how the functional logic programming language Curry can be used to implement relation algebra and to prototype relational specifications. We have used the functional features of Curry for elegantly implementing relations and the most important operations on them. Then the execution of explicit specifications corresponds to the evaluation of expressions. For the execution of implicit specifications we employed a boolean constraint solver available in the PAKCS system, which proved to be head and shoulders above a narrowing-based approach. Without presenting an example, it should be clear that our approach also allows the formulation of general relational algorithms (like the computation of the transitive closure R+ of R as the limit of the chain O ⊆ fR(O) ⊆ fR(fR(O)) ⊆ . . ., where fR(X) = R ∪ XX) as Curry programs. By implementing a solver for relational specifications using Curry, we described an application of the integration of different programming paradigms. The involved paradigms are the following:
• Relation algebra – to formulate specifications.
• Functional programming – to express relations as an algebraic datatype and relational combinators as functions over this datatype.
• Constraint solving – to efficiently solve constraints over relations.
• Free variables and built-in nondeterminism – to express unknown relations and different instantiations in a natural way.
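Returning to the transitive-closure computation mentioned above, a minimal sketch of such a relational algorithm in terms of the library combinators could look as follows (our illustration, not the authors' code; transClosure and iter are hypothetical names, and the number of iterations is simply chosen large enough to reach the limit of the chain):

transClosure :: Rel -> Rel
transClosure r = iter (fst (dim r)) r
  where iter n x = if n == 0 then x else iter (n - 1) (r .|. (x .*. x))

Starting from R, each step computes fR(X) = R ∪ XX, so after at most as many steps as the relation has rows the result no longer changes and equals R+.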
Using our library, relational specifications can be checked in a high-level declarative style with minimal programming effort. We have demonstrated that different programming paradigms can benefit from each other. Functional programming can be made more efficient using constraint solving facilities, and constraint programming can be made more readable by the abstraction mechanisms provided by functional programming languages. Especially, higher-order functions and algebraic datatypes serve well to implement constraint generation on a high level of abstraction. Functional logic languages allow for a seamless integration of functional constraint generation and possibly nondeterministic constraint solving with instantiation of unknown values. Since the underlying constraint solver uses BDDs to represent boolean formulae, constraints over relations are also represented as BDDs. Unlike RelView, we do not represent relations as BDDs but use a matrix representation. For future work we plan to investigate whether the ideas behind the BDD representation of relations employed in RelView can be combined with the BDD representation of relational constraints. Such a combination could result in a more efficient implementation for two reasons: Firstly, applications that use our library to implement relational
algorithms where many relational expressions need to be evaluated benefit because operations on relations can be implemented more efficiently on BDDs than on matrices. Secondly, even in applications where a specification only has to be evaluated once before it is instantiated by the constraint solver, we can benefit if the BDD-based representation of relations uses less memory than the matrix representation. As another topic for future work, we plan to consider a slightly different interface to our library that hides the dimensions of relations. Specifying dimensions of relations is tedious and error-prone. They could be handled explicitly in the datatype for relations and propagated by the different relational combinators. The challenge will be to ensure correct guessing of unknown relations without extra specifications by the programmer.
References
[1] Berghammer, R. and T. Hoffmann, Relational depth-first-search with applications, Information Sciences 139 (2001), pp. 167–186.
[2] Berghammer, R. and F. Neumann, RelView – An OBDD-based Computer Algebra system for relations, in: Proc. Int. Workshop on Computer Algebra in Scientific Computing (2005), pp. 40–51.
[3] Braßel, B., M. Hanus and F. Huch, Encapsulating non-determinism in functional logic computations, Journal of Functional and Logic Programming 2004 (2004).
[4] Brink, C., W. Kahl and G. Schmidt, editors, "Relational Methods in Computer Science," Advances in Computing, Springer, 1997.
[5] Bryant, R., Graph-based algorithms for boolean function manipulation, IEEE Transactions on Computers C-35 (1986), pp. 677–691.
[6] de Swart, H., E. Orlowska, G. Schmidt and M. Roubens, editors, "Theory and Applications of Relational Structures as Knowledge Instruments," Lecture Notes in Computer Science 2929, Springer, 2003.
[7] Fernández, A., M. Hortalá-González and F. Sáenz-Pérez, Solving combinatorial problems with a constraint functional logic language, in: Proc. 5th Int. Symposium on Practical Aspects of Declarative Languages (PADL 2003) (2003), pp. 320–338.
[8] Hanus, M., The integration of functions into logic programming: From theory to practice, Journal of Logic Programming 19&20 (1994), pp. 583–628.
[9] Hanus, M. et al., PAKCS: The Portland Aachen Kiel Curry System (version 1.7.1), available at http://www.informatik.uni-kiel.de/~pakcs/ (2003).
[10] Hanus, M. et al., Curry: An integrated functional logic language (version 0.8.2), available at http://www.informatik.uni-kiel.de/~curry (2006).
[11] Kahl, W., Semigroupoid interfaces for relation-algebraic programming in Haskell, in: R. A. Schmidt, editor, Relations and Kleene Algebra in Computer Science, Lecture Notes in Computer Science 4136, 2006, pp. 235–250.
[12] Lux, W., Adding linear constraints over real numbers to Curry, in: Proc. 5th Int. Symposium on Functional and Logic Programming (FLOPS 2001) (2001), pp. 185–200.
[13] MacCaull, W., M. Winter and I. Düntsch, editors, "Proc. Int. Seminar on Relational Methods in Computer Science," Lecture Notes in Computer Science 3929, Springer, 2006.
[14] Peyton Jones, S., editor, "Haskell 98 Language and Libraries—The Revised Report," Cambridge University Press, 2003.
[15] Ravelo, J., Two graph-algorithms derived, Acta Informatica 36 (1999), pp. 489–510.
[16] Schmidt, G., A proposal for a multilevel relational reference language, Journal of Relational Methods in Computer Science 1 (2004), pp. 314–338.
[17] Schmidt, G. and T. Ströhlein, "Relations and Graphs – Discrete Mathematics for Computer Scientists," Springer, 1993.
Lazy Database Access with Persistent Predicates ?

Sebastian Fischer
Institut für Informatik
Universität Kiel
Olshausenstraße 40, 24098 Kiel, Germany
Abstract

Programmers need mechanisms to store application-specific data that persists across multiple program runs. To accomplish this task, they usually have to deal with storage-specific code to access files or relational databases. Functional logic programming provides a natural framework for transparent persistent storage through persistent predicates, i.e., predicates with externally stored facts. We extend previous work on persistent predicates for Curry by lazy database access. Results of a database query are only read as much as they are demanded by the application program. We also present a type-oriented approach to convert between database and Curry values, which is used to implement lazy access to persistent predicates based on a low-level lazy database interface.

Keywords: Curry, database access, dynamic predicates, laziness, persistence
1
Introduction
Programming languages need mechanisms to store data that persists among program executions. Internal data needs to be saved and recovered, and external data has to be represented and manipulated by an application. For instance, web applications often read data stored on a database server and present it to the user in a structured way. Relational databases are typically used to efficiently access a large amount of stored data. In a relational database, data is stored in tables that can be divided into rows and columns. From the logic programming point of view, a database table can be seen as a specification of a predicate, storing the predicate's facts in its rows. In previous work [7,5,4], we developed an approach to database access in Curry where database tables are seen as dynamic specifications of special predicates. The specification is dynamic because it may change at run time. Hence, such predicates are called dynamic predicates. Dynamic predicates whose facts are stored persistently, e.g., in a database, are called persistent predicates.

? This work has been partially supported by the DFG under grant Ha 2457/5-1.
A first attempt at a prototypical implementation of our approach turned out to be inefficient for large result sets because these were parsed completely when a query was executed. Therefore, we developed a lazy implementation that does not read unused results. This implementation consists of two parts:
• we develop a low-level lazy database interface, and
• we implement lazy access to persistent predicates based on this interface.
The most interesting point of the second part is concerned with data conversion. We present a type-oriented approach to convert between the values stored in a database and Curry data terms. This paper mainly describes practical work. Although we slightly modify the interface of our library, we do not introduce conceptual novelties. Nevertheless, it is remarkable that we could employ high-level declarative techniques to achieve quite technical goals. As a result, we get a concise implementation that is both portable and maintainable. The remainder of this paper is structured as follows: in Section 1.1 we motivate persistent predicates by reflecting on related work on database access in declarative programming languages. In Section 2 we present the interface of our database library. We discuss a low-level implementation of lazy database access in Section 3. We sketch a type-oriented approach to conversion between database and Curry values in Section 4 that is used to implement lazy access to persistent predicates based on the low-level interface. Finally, Section 5 contains concluding remarks.

1.1
Related Work
The notion of persistent predicates is introduced in [3], where a database-based and a file-based Prolog implementation are provided. Persistent predicates enable the programmer to store data that persists from one execution to the next and is stored transparently, i.e., the program's source code need not be changed when the storage mechanism changes. However, the implementation presented in [3] has two major drawbacks. Firstly, access to persistent predicates is implemented using side effects. This is a problem because, in this approach, the behavior of a program with calls to persistent predicates depends on the order of evaluation. Secondly, [3] does not support transactions. Databases are often used in the context of web applications where potentially many processes access the database concurrently. Therefore, transactions are a key feature for a practical implementation of a database library. Both problems are solved in [7], where database access is only possible inside the IO monad [11] and a transaction concept is provided. Eliminating side effects is especially essential in the context of functional logic programming languages, which are based on sophisticated evaluation strategies [1]. [4] extends the library presented in [7] by a database implementation of persistent predicates. We improve this implementation by means of lazy database access and slightly simplify its interface. Section 2 recapitulates the interface of our library and mentions differences to the version presented in [4]. A combinator library for Haskell, which is used to construct database queries with relational algebra, is provided by [9]. It allows for a syntactically correct and type-safe implementation of database access. The authors provide a general
approach to embedding domain-specific languages into higher-order typed languages and apply it to relational algebra to access relational databases. While [9] syntactically integrates a domain-specific language into the Haskell programming language, persistent predicates transparently integrate database access into a familiar programming paradigm.
2
The Database Library
In our approach, persistent predicates are defined by the keyword persistent, since their definition is not part of the program but externally stored. The only information given by the programmer is a type signature and a string argument to persistent identifying the storage location. The predicate defined below stores (finitely many) prime numbers in the table primes in the database currydb:

prime :: Int -> Dynamic
prime persistent "db:currydb.primes"

The storage location is prefixed with "db:" to indicate that it is a database table. After the colon, the database and the table are given, separated by a period. The result type of persistent predicates is Dynamic, which is conceptually similar to Success (the result type of constraints and ordinary predicates in Curry). Dynamic predicates are distinguished from other predicates to ensure that the functions provided to access them are only used for dynamic predicates, and not for ordinary ones. Conceptually, the type Dynamic should be understood as a datatype with a single value – just like Success is a datatype with the single value success. Internally, the datatype stores information on how to access the externally stored facts. This information is introduced by the predicate specifications and the combinators presented in Section 2.2.

2.1
Basic Operations
The basic operations for persistent predicates stored in a database are assertion to insert new facts, retraction to delete them, and queries to retrieve them. Because the definition of persistent predicates changes over time, access to them is only possible inside the IO monad to provide an explicit order of evaluation. To manipulate the facts of a persistent predicate, the operations

assert :: Dynamic -> IO ()
retract :: Dynamic -> IO ()

are provided. Conceptually, these functions modify a global knowledge base as a side effect: assert inserts new facts into the knowledge base and retract removes them. The arguments of assert and retract must not contain free variables, and thus, only assertion and retraction of ground facts are allowed. If the arguments of a database predicate are not ground, a call to assert or retract suspends until they are. Note that, currently, Curry only supports concurrency of constraints via the concurrent conjunction operator

(&) :: Success -> Success -> Success
An extension of Curry similar to Concurrent Haskell [10] that supports concurrent i/o actions could make use of this feature for synchronization. We could define an alternative version of retract that does not suspend on partially instantiated arguments but deletes all matching facts from the database without propagating the resulting bindings to the program. These bindings would then be encapsulated in the call to retract. However, this behavior can be achieved using getDynamicSolutions defined below and, more importantly, the implementation of retract would become inefficient if the encapsulation were done internally for all calls to retract, i.e., also for those that do not involve free variables. Moreover, a similar alternative does not seem to exist for the implementation of assert. Representing unknown parts as null-values seems to be appropriate at first glance. However, the information about which variables are identical is lost if all free variables are represented as null-values; thus, suspension or failure seem to be more practical for retract and the only reasonable options for assert. We chose suspension to support possible extensions of Curry concerned with concurrency. A query to a persistent predicate can have multiple solutions computed nondeterministically. To encapsulate search, the function

getDynamicSolutions :: (a -> Dynamic) -> IO [a]

takes a dynamic predicate abstraction and returns a list of all values satisfying the abstraction, similar to getAllSolutions for predicates with result type Success. The function

getDynamicSolution :: (a -> Dynamic) -> IO (Maybe a)

can be used to query only one solution. Note that this function would not be necessary if getDynamicSolutions were lazy. But with a strict implementation, all solutions are computed in advance even if the program demands only the head of the result list. Unfortunately, not all Curry implementations support lazy encapsulated search, and we can provide it only for predicates that are stored in a database. Also note that only encapsulated access to a dynamic predicate is provided. Dynamic predicates cannot be used in guards like ordinary predicates and, thus, cannot cause nondeterministic behavior of the program.
2.2
Combining Persistent Predicates
Often information needs to be queried from more than one persistent predicate at once, or a query has to be restricted by a boolean condition. To combine several persistent predicates, we provide two different forms of conjunction. One combines two values of type Dynamic, similarly to the function (&) for ordinary constraints. The other one combines a Dynamic predicate with a boolean condition:

(<>) :: Dynamic -> Dynamic -> Dynamic
(|>) :: Dynamic -> Bool -> Dynamic

These combinators can be employed to construct Dynamic abstractions that resemble typical database queries. For example, the abstraction
\(x,y) -> prime x <> prime y |> x+2 == y

resembles the SQL query

SELECT tab1.prime, tab2.prime
FROM primes AS tab1, primes AS tab2
WHERE tab1.prime + 2 = tab2.prime

The translation into SQL relies on very few primitive features present in SQL. Simple select-statements with a where-clause suffice to query the facts of a persistent predicate – also if it involves complex conjunctions. Since there are no combinators to define predicates with aggregating arguments – like the sum, average, minimum or maximum of another argument – we do not need such features of SQL. Also, nested queries and having- or order-by-clauses are not generated by our implementation. We do not provide aggregation features because it is unclear how to perform assert on a predicate with aggregating arguments. Aggregation is only reasonable for queries and thus has to be coded explicitly. Null-values can be accessed as Nothing if the corresponding argument is of a Maybe type. If it is not, null-values are represented as free variables. A detailed transformation scheme, including a mechanism to transform boolean conditions attached to persistent predicates into efficient SQL queries, is discussed in [5,4].
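To illustrate how such an abstraction is actually executed (hypothetical code, not part of the paper; twinPrimes is a name chosen here), the twin-prime query from above can be run with getDynamicSolutions from Section 2.1:

twinPrimes :: IO [(Int,Int)]
twinPrimes = getDynamicSolutions (\(x,y) -> prime x <> prime y |> x+2 == y)

The resulting i/o action yields the list of all pairs of stored primes that differ by two.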
2.3
Transactions
Since changes made to the definition of persistent predicates are instantly visible to other programs employing the same predicates, transactions are required to declare atomic operations. As database systems usually support transactions, the provided functions rely on the database's transaction support:

transaction :: IO a -> IO (Maybe a)
abortTransaction :: IO a

The function transaction is used to start transactions. The given i/o action is performed atomically, and Nothing is returned if the i/o action fails or is explicitly aborted with abortTransaction. Nested transactions are not supported and lead to a run-time error. We simplified the interface of [4] by eliminating the function transactionDB that takes a database name to perform the transaction in. Now, transactions are specified with the function transaction regardless of whether they employ database predicates or not, and the transaction is performed in all databases known to the current process. Whether this is a performance penalty depends on the implementation of transactions in the involved database systems. Usually, a transaction that does not touch any tables does not block other processes that access the database. Actually, one process will typically not access different database systems, so the simplified interface will rarely cause any performance overhead.
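For illustration (a hypothetical example using the prime predicate of Section 2; addTwinPrimes is our name), an update that must be applied atomically can be wrapped in transaction:

addTwinPrimes :: Int -> IO (Maybe ())
addTwinPrimes n = transaction (assert (prime n <> prime (n+2)))

Either both facts become visible to other programs or, if the transaction fails or is aborted, none of them does.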
3
Lazy Database Access
Although complex restrictions can be expressed using the conjunction combinators (<>) and (|>) presented in the previous section, database queries may still have large result sets. If the programmer accesses only parts of the results, it is an unnecessary overhead to retrieve all results from the database. The implementation presented in [4] communicates with the database via standard i/o and always parses the complete result set before providing it to the application program. This turned out to be inefficient for large result sets. Thus, we developed an alternative implementation. The implementation presented in this paper does not query the complete results when getDynamicSolutions is called. Instead, it only retrieves a handle which is used to query the results when they are demanded. The advantage of this approach is obvious: a call to getDynamicSolutions causes only a negligible delay because it only retrieves a handle instead of the whole result set from the database. Moreover, results that are not demanded by the application are not queried from the database. Thus, a delay is only caused for reading results that are consumed by the program – no results are queried in advance. If results are read on demand, it is important that they are independent of when they are demanded. Especially, results must not be affected by table updates that happen between the query and the consumption of the results. This property is ensured by the database system. Conceptually, a snapshot of the database is created when the query yields a handle, and all results correspond to this snapshot. Hence, the fact that a query is read lazily does not affect the corresponding set of results. Relational database systems allow us to retrieve result sets of queries incrementally via an API. In this section we show how we access this API from a Curry program. One possibility to access the database API is to use external functions. However, implementing them is a complex task and, more importantly, external functions need to be re-coded for every Curry implementation, so it is a good idea to avoid them wherever possible.

3.1
Curry Ports
Java supports a variety of database systems. Curry supports distributed programming using ports [6], and it is possible to port this concept to Java and write distributed applications that involve both Curry and Java programs. So we can access all database systems supported by Java in Curry if we can communicate with a Java program using Curry. We implemented database access using ports to get a high-level, maintainable, and portable Curry implementation that benefits from the extensive support for different database systems in Java. We will not discuss how ports and database access are implemented in Java, but focus on the Curry part of our implementation. We will concentrate on ports and how we use them to model a lazy database interface. A port is a multiset of messages which is constrained to hold exactly the elements of a specified list. There is a predicate

openPort :: Port a -> [a] -> Success

that creates a port for messages of type a. Usually, openPort is called with free
variables as arguments, and the second argument is instantiated by sending messages to the first argument. A client can send a message to a port using

send :: a -> Port a -> Success

Since the message may contain free variables that can be bound by the server, there is no need for a receive function on ports: if the client needs to receive an answer from the server, it can include a free variable in its request and wait for the server to bind this variable. To share a port between different programs, it can be registered under a global name accessible over the network. The i/o action

openNamedPort :: String -> IO [a]

opens a globally accessible port and returns the list of messages that are sent to the port. The i/o action

connectPort :: String -> IO (Port a)

returns the port that is registered under the given name. The last two functions destroy type-safety of port communication because their return values are polymorphic. The predicates openPort and send ensure that only type-correct messages are sent to or received from a port. With openNamedPort and connectPort, however, it is possible to send type-incorrect messages to a port. The function openNamedPort creates a stream of messages of unspecified type, and the type of messages that can be sent to a port created by connectPort is also unspecified. If ports are used for communication over a network, this communication is no longer type-safe. Therefore, we have to carefully establish type-correct message exchange ourselves. The user of our library is not concerned with these issues because the ports-based interface is not exported.
3.2
Lazy Interface to the Database
In this section we describe the messages that are used to communicate with the Java program that implements database access and the functions that use these messages to implement a lazy database interface. This interface is not intended for application programs but is only used internally for our implementation of persistent predicates. For the communication with the Java program we need a datatype for the messages that are sent via ports and a datatype for the values that are stored in a database table. A Curry process must be able to open and close database connections and send insert-, delete-, or commit-statements. However, the most interesting messages with regard to lazy database access are those concerned with queries and result retrieval. As mentioned earlier, a handle must be returned as the result of a query. Furthermore, it must be possible to check whether there are more results corresponding to a handle and, if so, to query another row of the result set. Hence, we define the following datatypes:
data DBMessage = Open String DBHandle
               | Update DBHandle String
               | Query DBHandle String ResultHandle
               | EndOfResults DBHandle ResultHandle Bool
               | NextRow DBHandle ResultHandle [DBValue]
               | Close DBHandle

type DBHandle = Int
type ResultHandle = Int

data DBValue = NULL | BOOLEAN Bool | INT Int | FLOAT Float
             | CLOB String | TIME ClockTime

type Connection = (Port DBMessage, DBHandle)

A connection consists of a port of type (Port DBMessage) and an integer of type DBHandle. The datatype DBValue wraps values of different SQL column types. SQL supports a variety of different column types and we chose a reasonably small representation as an algebraic datatype. To obtain a small representation, we represent values of different SQL types as values of the same Curry type. For example, values of type DOUBLE, FLOAT, REAL, . . . are all represented as wrapped Float values in Curry. SQL supports the special datatypes DATE, TIME and DATETIME for date and time values, which are represented as wrapped values of type ClockTime – the standard datatype for representing time values in Curry. Although subsuming column types is an abstraction, it is detailed enough for transparent database access in Curry. A central part of our implementation is the definition of the datatype DBMessage. The defined messages serve the following purposes:
• (Open spec db) opens a new connection to the specified database, and db is instantiated with an integer representing the connection.
• (Update db sql) performs an SQL statement sql in the database represented by db without returning a result.
• (Query db sql result) is used for queries that return a result. Note that only an integer that represents the result set is returned.
• (EndOfResults db result empty) checks whether the result set is empty,
• (NextRow db result row) queries one row from a non-empty result set, and
• (Close db) closes the specified connection.
As a simple example for the communication with a database server over ports using this datatype, consider the following definition:

endOfResults :: Connection -> ResultHandle -> Bool
endOfResults (p,db) r | send (EndOfResults db r empty) p = ensureNotFree empty
  where empty free
The call to ensureNotFree suspends until its argument is bound and is used to wait for the answer returned from the server. The function endOfResults and the NextRow message can be employed to define a function

query :: Connection -> String -> [[DBValue]]

that performs an SQL query and returns a list of all rows in the result set lazily. The key idea is to delay the query for the actual rows until they are demanded by the program. The function lazyResults, which takes a connection and a result handle and lazily returns a list of rows, is implemented as follows:

lazyResults :: Connection -> ResultHandle -> [[DBValue]]
lazyResults (p,db) r
  | endOfResults (p,db) r = []
  | otherwise = send (NextRow db r row) p &> (ensureNotFree row : lazyResults (p,db) r)
  where row free

Each row is queried on demand by sending a NextRow message, and the result set is empty if endOfResults returns True. Note that we demand the contents of each row when the corresponding constructor (:) of the result list is demanded. Since every call to endOfResults advances the result pointer to the next row, we may not be able to demand the contents of previous rows later. Finally, the function query can be implemented using the Query message and the function lazyResults:

query :: Connection -> String -> [[DBValue]]
query (p,db) sql | send (Query db sql resultHandle) p
  = lazyResults (p,db) (ensureNotFree resultHandle)
  where resultHandle free

Note that the presented functions are not i/o actions. Although i/o is performed by the Java application we communicate with, the communication itself is done with send constraints outside the IO monad. Therefore, the presented functions can be considered unsafe and need to be used with care. Demand-driven database access cannot be implemented without this kind of unsafe feature. Recall that it is, e.g., not possible to implement a lazy readFile operation without unsafe features in the IO monad. Note that the presented functions are not part of our database library. They are only used internally in the implementation of persistent predicates. The function query can be used to define a lazy version of getDynamicSolutions to retrieve all solutions of a persistent predicate abstraction on demand. Its implementation has to consider predicates that are combined from heterogeneous parts. Predicates stored in a database can be freely combined with others stored in files or those with facts held in main memory. The details are out of the scope of this paper. However, an interesting aspect of these details is how to convert between the datatype DBValue and Curry values automatically. An approach to this problem that employs type-oriented database specifications is discussed in Section 4. Beforehand, we consider
an example that demonstrates lazy database access with persistent predicates:

main = do assert (foldr1 (<>) (map prime [3,5,7])) (p:ps)

person :: Name -> YearOfBirth -> Dynamic
person = persistent2 "jdbc:mysql://localhost/currydb persons"
                     (cons2 Name (string "last") (string "first"))
                     (int "born")

This declaration states that person is a persistent predicate with 2 arguments, stored in the database currydb of a MySQL server on the local machine, in the table persons. The table persons has 3 columns. The first two columns – last and first – store the name of a person, i.e., the first argument of person. The third column – born – stores the year of birth, i.e., the second argument of person. The specifications
cons2 Name (string "last") (string "first")
int "born"

resemble the structure of the argument types of person. The first resembles a constructor with two string arguments and the second represents an integer. The string arguments to the primitive specifications string and int denote the column names where the corresponding values are stored. Declarations like the one shown above are generated automatically from the provided type information. However, less intuitive column names would be selected by this automatic transformation. The presented specifications serve different purposes. Firstly, they store information about the corresponding column names and SQL column types. Secondly, they store functions to convert between database and Curry values. We define a datatype DBSpec for these specifications as follows:

data DBSpec a = DBSpec [String] [String] (ReadDB a) (ShowDB a)
type ReadDB a = [DBValue] -> (a,[DBValue])
type ShowDB a = a -> [DBValue] -> [DBValue]

In fact, these types are a bit more complicated in the actual implementation. However, the presented types are sufficient for this description. The first two components of a DBSpec store the column names and types. A function of type (ReadDB a) is a parser that takes a list of database values and returns a value of type a along with the remaining unparsed database values. A function of type (ShowDB a) takes a value of type a and a list of database values and extends this list with the representation of the given value as database values. We can define the primitive combinator int presented above as follows:

int :: String -> DBSpec Int
int name = DBSpec [name] ["INT"] rd sh
  where rd (NULL  : xs) = (let x free in x, xs)
        rd (INT n : xs) = (n, xs)
        sh n xs = (INT (ensureNotFree n) : xs)

The parser for integers reads one column from the list of database values and returns a free variable if it is a null-value. The show function extends the given database values with an integer value and suspends on free variables. Database specifications for other primitive types, viz. string, float, bool and time, can be defined similarly. Complex datatypes can be represented by more than one column. Recall the specification for the name of a person introduced above:

cons2 Name (string "last") (string "first")
The combinator cons2 can be defined as follows:

cons2 :: (a -> b -> c) -> DBSpec a -> DBSpec b -> DBSpec c
cons2 cons (DBSpec nsa tsa rda sha) (DBSpec nsb tsb rdb shb)
  = DBSpec (nsa++nsb) (tsa++tsb) rd sh
  where Cons a b = cons a b
        rd = rda />= \a -> rdb />= \b -> ret (Cons a b)
        sh (Cons a b) = sha a . shb b

This combinator takes a binary constructor cons as first argument. The subsequent arguments are database specifications corresponding to the argument types of the provided constructor. The name and type information of the provided specifications is merged into the new specification, i.e., the arguments of the constructor are stored in subsequent columns of a database table. Finally, the read and show functions are constructed from the read and show functions for the arguments. We use the predefined function composition (.) :: (b->c) -> (a->b) -> (a->c) to define the show function and monadic parser combinators for the read function:

(/>=) :: ReadDB a -> (a -> ReadDB b) -> ReadDB b
rda />= f = uncurry f . rda

ret :: a -> ReadDB a
ret a xs = (a,xs)

The function uncurry :: (a->b->c) -> (a,b) -> c transforms a binary function into a function on pairs. The definition of the show function may be confusing at first glance: it matches a value (Cons a b) where Cons is a locally defined function equal to the constructor cons provided as first argument. Apart from constructors, in Curry also defined function symbols can be used in pattern declarations [2]. This allows us to define type-based combinators for arbitrary datatypes instead of only for specific ones. We provide similar combinators cons1, cons3, cons4, . . . for constructors of different arity. The presented combinators allow for a concise declaration of database specifications that are used to convert between database and Curry values. The declarations are introduced automatically when a program is loaded. However, the programmer can also introduce them himself if he wants to control the column names, e.g., if he wants to access existing database tables. The generated converters are used internally to implement lazy access to persistent predicates based on the low-level lazy database interface presented in Section 3.2. The idea of type-oriented combinators seems to be applicable in a variety of applications. They bring the flavor of generic programming to a language without specific generic programming features. We plan to explore this connection in more detail in the future.
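As an illustration of how these combinators compose for user-defined types (a hypothetical example, not from the paper; Address, addressSpec and company are names chosen here), a record-like argument type can be described by nesting the primitive specifications:

data Address = Address String Int   -- street name and house number

addressSpec :: String -> DBSpec Address
addressSpec prefix = cons2 Address (string (prefix ++ "_street")) (int (prefix ++ "_number"))

company :: String -> Address -> Dynamic
company = persistent2 "jdbc:mysql://localhost/currydb companies" (string "name") (addressSpec "addr")

The Address value is thereby spread over two columns whose names are derived from the given prefix.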
5
Conclusions
We described a lazy implementation of a functional logic database library for Curry. The library is based on persistent predicates, which allow for transparent access to externally stored data, i.e., without storage-specific code. We extend [4] with an implementation of lazy database access and simplified the declaration of transactions by discarding the function transactionDB. We present an implementation of lazy database access that is both portable and maintainable because it is implemented in Curry using the concept of ports [6] and not integrated into the run-time system using external functions. Moreover, our implementation supports a variety of database systems because it benefits from the extensive support for different database systems in Java. Using ports, we implemented a low-level lazy database interface for Curry. Based on this, we developed type-oriented converter specifications to implement a lazy version of getDynamicSolutions that encapsulates results of a Dynamic abstraction lazily. Values that are not demanded by the application program are not queried from the database in advance. Although we do not introduce conceptual novelties concerning our database library, we demonstrate that quite technical implementation goals – viz. laziness, i.e., efficiency – can be achieved using high-level programming techniques. Functional logic programming is powerful enough to transparently and efficiently integrate database programming into its programming paradigm using functional logic programming techniques.
References
[1] Antoy, S., R. Echahed and M. Hanus, A needed narrowing strategy, Journal of the ACM 47 (2000), pp. 776–822.
[2] Antoy, S. and M. Hanus, Declarative programming with function patterns, in: Proceedings of the International Symposium on Logic-based Program Synthesis and Transformation (LOPSTR'05) (2005), pp. 6–22.
[3] Correas, J., J. Gómez, M. Carro, D. Cabeza and M. Hermenegildo, A generic persistence model for (C)LP systems (and two useful implementations), in: Proc. of the Sixth International Symposium on Practical Aspects of Declarative Languages (PADL'04) (2004), pp. 104–119.
[4] Fischer, S., A functional logic database library, in: WCFLP '05: Proceedings of the 2005 ACM SIGPLAN Workshop on Curry and Functional Logic Programming (2005), pp. 54–59.
[5] Fischer, S., "Functional Logic Programming with Databases," Master's thesis, Kiel University (2005), available at http://www.informatik.uni-kiel.de/~mh/lehre/diplom.html.
[6] Hanus, M., Distributed programming in a multi-paradigm declarative language, in: Proc. of the International Conference on Principles and Practice of Declarative Programming (PPDP'99) (1999), pp. 376–395.
[7] Hanus, M., Dynamic predicates in functional logic programs, Journal of Functional and Logic Programming 2004 (2004).
[8] Hanus, M., Type-oriented construction of web user interfaces, in: Proc. of the 8th International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming (PPDP'06) (2006), pp. 27–38.
[9] Leijen, D. and E. Meijer, Domain specific embedded compilers, in: Proceedings of the 2nd Conference on Domain-Specific Languages (DSL'99) (1999), pp. 109–122.
[10] Peyton Jones, S., A. Gordon and S. Finne, Concurrent Haskell, in: Proc. 23rd ACM Symposium on Principles of Programming Languages (POPL'96) (1996), pp. 295–308.
[11] Wadler, P., How to declare an imperative, ACM Computing Surveys 29 (1997), pp. 240–263.
Using Template Haskell for Abstract Interpretation

Clara Segura 1,2
Departamento de Sistemas Informáticos y Programación
Universidad Complutense de Madrid
Madrid, Spain
Carmen Torrano 3
Departamento de Sistemas Informáticos y Programación
Universidad Complutense de Madrid
Madrid, Spain
Abstract

Metaprogramming consists of writing programs that generate or manipulate other programs. Template Haskell is a recent extension of Haskell, currently implemented in the Glasgow Haskell Compiler, giving support to metaprogramming at compile time. Our aim is to apply these facilities in order to statically analyse programs and transform them at compile time. In this paper we use Template Haskell to implement an abstract interpretation-based strictness analysis and a let-to-case transformation that uses the results of the analysis. This work shows the advantages and disadvantages of the tool for incorporating new analyses and transformations into the compiler without modifying it.

Keywords: Meta-programming, Template Haskell, abstract interpretation, strictness analysis.
1
Introduction
Metaprogramming consists of writing programs that generate or manipulate other programs. Template Haskell [17,18] is a recent extension of Haskell, currently implemented in the Glasgow Haskell Compiler [12] (GHC 6.4.1), giving support to metaprogramming at compile time. Its functionality is obtained from the library package Language.Haskell.TH. It has been shown to be a useful tool for different purposes [6], like program transformations [7] or the definition of an interface for Haskell with external libraries (http://www.haskell.org/greencard/). Especially interesting is the implementation of a compiler for the parallel functional language Eden [15] without modifying GHC.
1 Work partially supported by the Spanish project TIN2004-07943-C04.
2 Email: [email protected]
3 Email: [email protected]
(Figure: the GHC compilation pipeline – Haskell Code, Abstract Syntax, Desugarer, Core Syntax, Simplifier, CoreToStg – with the new pass added at the level of the abstract syntax tree.)
Fig. 1. GHC compilation process with new analyses and transformations
Using such an extension, a program written by a programmer can be inspected and/or modified at compile time before proceeding with the rest of the compilation process. Our aim is to apply these metaprogramming facilities in order to statically analyse programs and transform them at compile time. This will allow us on the one hand to quickly implement new analyses defined for functional languages and on the other hand to incorporate these analyses into the compiler without modifying it. In Figure 1 we show a scheme of GHC compilation process. Haskell code is desugared into a simpler functional language called Core. Analyses and transformations in GHC take place at Core syntax level, which are summarized as a simplifier phase. In order to add new analyses and transformations it would be necessary to modify the compiler. However, by using Template Haskell these can be incorporated at the level of Haskell syntax without modifying GHC. In Figure 1 this is added as a new pass at the level of the abstract syntax tree. In particular, languages like Eden [5] can benefit from these facilities. Eden is a parallel extension of Haskell whose compiler is implemented on GHC [3]. Several analyses have been theoretically defined for this language [14,11,4] but they have not been incorporated to the compiler because this involves the modification of GHC, once for each new analysis we could define, which seems unreasonable. Using Template Haskell new analyses and/or transformations could be first prototyped and then incorporated to the compilation process without directly modifying the internals of the compiler. In this paper we explore the usefulness of Template Haskell for these purposes by implementing an abstract interpretation based strictness analysis and a let-to-case transformation that uses the results of the analysis. These are well-known and already solved problems, which allows us to concentrate on the problems arising from the tool. In Section 2 we describe those features of Template Haskell used in later sections. In Section 3 we give an introduction to abstract interpretation, and describe the strictness analysis and the let-to-case transformation. Section 4 describes their implementation using Template Haskell and shows some examples. Finally, in Section 5 we conclude and discuss the improvements to the tool that could make it more useful.
These are well-known and case transformation that uses the allows results us of the analysis. These are problems well-known and already solved problems, which to concentrate on the arising already solved us those to concentrate the problems arising from the tool. problems, In Sectionwhich 2 we allows describe features ofonTemplate Haskell used 2 we describe features to of Template Haskell used from the sections. tool. In In Section 3 we give anthose introduction abstract interpretation, in later Section and describe the In strictness and let-to-case to transformation. Section 4 in later sections. Section analysis 3 we give anthe introduction abstract interpretation, 4 and describe theimplementation strictness analysis the let-to-case describes their usingand Template Haskell transformation. and shows some Section examples. describes their implementation using Template and shows to some Finally, in Section 5 we conclude and discuss Haskell the improvements theexamples. tool that could make it more5useful. Finally, in Section we conclude and discuss the improvements to the tool that could make it more useful.
2 Template Haskell
Template Haskell is a recent extension of Haskell for compile-time meta-programming. This extension allows the programmer to observe the structure of the code of a program and either transform that code, generate new code from it, or analyse its properties. In this section we summarize the facilities offered by the extension. The code of a Haskell expression is represented by an algebraic data type Exp, and similarly are represented each of the syntactic categories of a Haskell program, like declarations (Dec) or patterns (Pat). In Figure 2 we show parts of the definitions of these data types, which we will use later in Section 4.
data Exp = LitE Lit                            -- literal
         | VarE Name                           -- variable
         | ConE Name                           -- constructor
         | LamE [Pat] Exp                      -- lambda abstraction
         | AppE Exp Exp                        -- application
         | CondE Exp Exp Exp                   -- conditional
         | LetE [Dec] Exp                      -- let expression
         | CaseE Exp [Match]                   -- case expression
         | InfixE (Maybe Exp) Exp (Maybe Exp)  -- primitive op.
         ...

data Match = Match Pat Body [Dec]              -- pat -> body where decs

data Pat = VarP Name                           -- variable
         | ConP Name [Pat]                     -- constructor
         ...

data Body = NormalB Exp                        -- just an expression
          ...

data Dec = ValD Pat Body [Dec]                 -- v = e where decs
         | FunD Name [Clause [Pat] Body [Dec]] -- f p1 ... pn = e where decs
Fig. 2. Data types representing Haskell syntax
A quasi-quotation mechanism allows one to represent templates, i.e. Haskell programs, at compile time. Quasi-quotations are constructed by placing brackets, [| and |], around concrete Haskell syntax fragments, e.g. [|\x->x|]. This mechanism is built on top of a monadic library. The quotation monad Q encapsulates meta-programming features such as fresh name generation. It is an extension of the IO monad. The usual monadic operators bind, return and fail are available, as well as the do-notation [19]. The function runQ makes the abstract syntax tree inside the Q monad available to the IO monad, for example for printing. This is everything we need to know about the quotation monad for our purposes. The translation of quoted Haskell code makes available its abstract syntax tree as a value of type ExpQ, where type ExpQ = Q Exp; e.g. [|\x->x|]::ExpQ. Library Language.Haskell.TH makes available syntax construction functions built on top of the quotation monad. Their names are similar to the constructors of the algebraic data types, e.g. lamE :: [PatQ] -> ExpQ -> ExpQ. For example, we can build the expression [|\x->x|] also by writing lamE [varP (mkName "x")] (varE (mkName "x")), where mkName :: String -> Name. Evaluation can happen at compile time by means of the splice notation $. It evaluates its content (of type ExpQ) at compile time, converts the resulting abstract syntax tree into Haskell code and inserts it in the program at the location of its invocation. As an example, [|\x->$qe|] evaluates qe at compile time and the result of the evaluation, a Haskell expression qe', is spliced into the lambda abstraction giving [|\x->qe'|]. We will use in Section 4 the quasi-quotation mechanism in order to analyse and transform Haskell programs, and the splicing notation in order to do this at compile time. A pretty printing library Language.Haskell.TH.PprLib will be useful in order to visualize the results of our examples. There are other features of Template Haskell we are not using here; the interested
reader may look at [17,18] for more details.
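As a small illustration of the facilities just described (this example is ours, not the paper's, and is written against a present-day template-haskell API; the module and function names are invented for the sketch), the following program builds the identity abstraction both by quotation and with the syntax construction functions, splices a quoted expression at compile time, and uses runQ to print an abstract syntax tree:

{-# LANGUAGE TemplateHaskell #-}
module QuoteDemo where  -- hypothetical module name

import Language.Haskell.TH

-- the identity function written as a quotation ...
identityQ :: ExpQ
identityQ = [| \x -> x |]

-- ... and the same expression built with the syntax construction functions
identityC :: ExpQ
identityC = lamE [varP (mkName "x")] (varE (mkName "x"))

-- the splice $(...) runs the quotation at compile time and inserts the generated code here
double :: Int -> Int
double = $( [| \x -> x + x |] )

-- runQ makes the syntax tree inside the Q monad available in the IO monad
main :: IO ()
main = do
  e <- runQ identityQ
  print e               -- shows the Exp value, something like LamE [VarP x_0] (VarE x_0)
  putStrLn (pprint e)   -- its pretty-printed form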
3 Strictness Analysis and let-to-case transformation
3.1 Motivation
Practical implementations of functional languages like Haskell use a call-by-need parameter passing mechanism. A parameter is evaluated only if it is used in the body of the function; once it has been evaluated to weak-head normal form, it is updated with the new value so that subsequent accesses to that parameter do not evaluate it from scratch. The implementation of this mechanism builds a closure or suspension for the actual argument, which is updated when evaluated. The same happens with a variable bound by a let expression: a closure is built and it is evaluated and subsequently updated when the main expression demands its value. Strictness analysis [9,1,20,2] detects parameters that will be evaluated by the body of a function. In that case the closure construction can be avoided and its evaluation can be done immediately. This means that call-by-need is replaced by call-by-value. The same analysis can be used to detect those variables bound by a let expression that will be evaluated by the main expression of the let. Such variables can be immediately evaluated, so that the let expression can be transformed into a case expression without modifying the expression semantics [16]. This is known as let-to-case transformation:

let x = e in e′ ⇒ case e of x → e′

Notice that this transformation assumes a strict semantics for the case expression. The Core case expression is strict in the discriminant, but a Haskell case with a unique variable pattern alternative is lazy. As our analysis and transformation happen at the Haskell level we would not obtain the desired effect with the previous transformation. Additionally it can even be incorrect from the point of view of the types, because let-bound variables are polymorphic while case-bound ones are monomorphic. For example, the expression

let x = [ ] in case x of [ ] → (1 : x, 'a' : x)

is type correct as x has a polymorphic type [a], which means that the types of the two occurrences of x in the tuple may be different instances of it, i.e. [Int] and [Char]. However its transformed version is not type correct, because x is monomorphic, and the types of the two occurrences are not unifiable. However we can use Haskell's polymorphic function seq :: a -> b -> b to obtain the desired effect while maintaining the types. It evaluates its first argument to weak head normal form and then returns its second argument as result. Consequently, our transformation is the following:

let x = e in e′ ⇒ let x = e in x ‘seq‘ e′
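As a small illustration (ours, not taken from the paper; the module and function names are invented), the following pair of definitions shows the shape of this transformation on a concrete binding, assuming the analysis has established that the body is strict in z; both versions compute the same value and differ only in when z is evaluated:

module LetToCaseDemo where  -- hypothetical module name

before :: Int -> Int
before x = let z = x * x in z + z            -- z is bound lazily as a closure

after :: Int -> Int
after x = let z = x * x in z `seq` (z + z)   -- z is forced to weak head normal form first

main :: IO ()
main = print (before 3, after 3)             -- (18,18): same results, different evaluation order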
3.2 Strictness Analysis by Abstract Interpretation
Strictness analysis can be done by using abstract interpretation [10]. This technique can be considered as a non-standard semantics in which the domain of values is replaced by a domain of value descriptions, and where each syntactic operator is given a non-standard interpretation allowing to approximate at compile time the
e → c                               { constant }
  | v                               { variable }
  | e1 op e2                        { primitive operator }
  | if e1 then e2 else e3           { conditional }
  | λb.e                            { first-order lambda }
  | C e1 . . . en                   { constructor application }
  | e1 e2                           { function application }
  | let v1 = e1 . . . vn = en in e  { let expression }
  | case e of alt1 . . . altn       { case expression }

alt → C b1 . . . bn → e
    | b → e

Fig. 3. A first-order subset of Haskell
run-time behavior with respect to the property being studied. Mycroft [9] gave for the first time an abstract interpretation based strictness analysis for a first-order functional language. Later, Burn et al. [1] extended it to higher order programs and Wadler [20] introduced the analysis of data types. Peyton Jones and Partain [13] described how to use signatures in order to make abstract interpretation more efficient. We show here an abstract interpretation based strictness analysis for expressions of a first-order subset of Haskell with data types, whose syntax is shown in Figure 3. For the moment, this analysis is enough for our purposes. In Section 5 we discuss the extension of the analysis to higher order and in general to full Haskell. Notice that for flexibility reasons we allow lambda abstractions as expressions, but we restrict them to be first-order lambda abstractions, i.e. the parameter is a variable b that can only be bound to a zeroth order expression. As the language is first-order the only places where lambda abstractions are allowed are function applications and right hand sides of let bindings. Function and constructor applications must be saturated. Let bindings may be recursive. Notice that if we lift the previously mentioned restrictions we have a higher-order subset of Haskell. This is the reason for our definition. Case expressions may have at most one default alternative (b → e). The basic abstract values are ⊥ and >, respectively representing strictness and ”don’t know” values, where ⊥ ≤ >. Operators u and t are respectively the greatest lower bound and the least upper bound. In order to represent the strictness of a function in its different arguments we use abstract functions over basic abstract values a. For example λa1 .λa2 .a1 u a2 represents that the function is strict in both arguments, and λa1 .λa2 .a1 represents that it is strict in its first argument but that we do not know anything about the second one. In Figure 4 we show the interpretation of each of the language expressions, where ρ represents an abstract environment assigning abstract values to variables. The environment ρ + [v → av] either extends environment ρ if variable v had no assigned 187
[[c]] ρ = >
[[v]] ρ = ρ(v)
[[e1 op e2]] ρ = [[e1]] ρ u [[e2]] ρ
[[if e1 then e2 else e3]] ρ = [[e1]] ρ u ([[e2]] ρ t [[e3]] ρ)
[[λb.e]] ρ = λa.[[e]] (ρ + [b → a])
[[C e1 . . . en]] ρ = >
[[e1 e2]] ρ = [[e1]] ρ [[e2]] ρ
[[let v1 = e1 . . . vn = en in e]] ρ = [[e]] ρ′
    where ρ′ = fix f
          f = λρ.ρ + [v1 → [[e1]] ρ, . . . , vn → [[en]] ρ]
[[case e of b → e′]] ρ = [[e′]] (ρ + [b → a])
    where a = [[e]] ρ
[[case e of alt1 . . . altn]] ρ = a u (a1 t . . . t an)    (n > 1)
    where a = [[e]] ρ
          ai = [[alti]] ρ a
[[C b1 . . . bn → e]] ρ a = [[e]] (ρ + [b1 → a, . . . , bn → a])
[[b → e]] ρ a = [[e]] (ρ + [b → a])

Fig. 4. A strictness analysis by abstract interpretation
abstract value, or updates the abstract value of v if it had. The interpretation is standard so we only give some details. Primitive binary operators, like + or ∗, are strict in both arguments so we use the u operator. The abstract value of a constructor application is > because constructors are lazy. This means, for example, that function λx.x : [ ] is not considered strict in its first argument. Notice that in the lists abstract domain we have safely collapsed the four-valued abstract domain of Wadler [20] into a two-valued domain, where for example ⊥ : ⊥, [1, ⊥, 2] and [1, 2, 3] are abstracted to >, and only ⊥ is abstracted to ⊥. In the three examples it is safe to evaluate the list to weak head normal form. In a case expression the variables bound by the case alternatives inherit the abstract value of the discriminant. When there is only a default alternative the case is lazy, otherwise it is strict in the discriminant. As we have used first-order abstract functions as abstract values, function application can be easily interpreted as abstract function application. To interpret a let expression we need a standard fixpoint calculation as it may be recursive.
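As a minimal self-contained sketch (ours; the paper's own representation of abstract values appears in Section 4), the two-point domain and the operators u and t of Figure 4 can be coded directly, which also gives a quick way of experimenting with the equations:

data Abs = Bot | Top deriving (Eq, Show)  -- Bot: surely demanded; Top: don't know

-- greatest lower bound (the u operator): Bot if either argument is Bot
inf :: Abs -> Abs -> Abs
inf Bot _ = Bot
inf _   b = b

-- least upper bound (the t operator): Top if either argument is Top
sup :: Abs -> Abs -> Abs
sup Top _ = Top
sup _   b = b

-- e.g. the abstract version of \x -> \y -> x + y is \a1 a2 -> inf a1 a2,
-- which returns Bot whenever either argument is Bot: strict in both arguments.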
3.3 Signatures
Abstract interpretation based analyses of higher order functions are expensive. Signatures [13] can be used in order to improve their efficiency although they imply losing some precision in the analysis. We use them in our implementation as we
are interested in analyses for full Haskell. Strictness basic signatures are ⊥ and >. Signatures for functions of n arguments are n-tuples of signatures (s1 , . . . , sn ) indicating whether the function is strict in each of its arguments. For example, (⊥, >, ⊥) is the signature of a function with three arguments that is strict in the first and the third arguments. The strictness signature of a function is obtained by probing it with n combinations of arguments. Component si is calculated by applying the function to the combination in which the ith argument is given the value ⊥ and the rest of them are given the value >. For example, the signature of function λx.λy.λz.x + y, (⊥, ⊥, >), is obtained by applying the function to (⊥, >, >), (>, ⊥, >) and (>, >, ⊥). When considering higher order, functions must be probed with signatures of the appropriate functional types. For example in λf.λx.f 3 + x, the first argument is a function, so it has to be probed with ((⊥, ⊥), >) and ((>, >), ⊥) giving (⊥, ⊥), as expected. In Section 5 we will discuss the problems encountered in this case, when trying to extend the analysis.
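Continuing the sketch above (again ours, not the paper's implementation), the probing scheme for a three-argument function can be written directly, reusing Abs and inf from the previous sketch:

-- signature of a three-argument abstract function, obtained by probing it with
-- the three combinations described in the text
signature3 :: (Abs -> Abs -> Abs -> Abs) -> [Abs]
signature3 f = [ f Bot Top Top   -- strict in the first argument?
               , f Top Bot Top   -- strict in the second?
               , f Top Top Bot ] -- strict in the third?

-- signature3 (\x y _ -> inf x y) evaluates to [Bot, Bot, Top], the counterpart of
-- the signature (bottom, bottom, top) given above for \x -> \y -> \z -> x + y.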
4 Implementation using Template Haskell
In this section we describe the implementation of the strictness analysis and the corresponding transformation using Template Haskell. Given a Haskell expression e the programmer wants to evaluate, this is the module he/she has to write:

module Main where
import Strict
import System.IO
import Language.Haskell.TH

main = putStr (show $(transfm [| e |]))
Module Strict contains the transformation function and the strictness analysis. First we quote the Haskell expression in order to be able to inspect the abstract syntax tree; then we modify such tree using function transfm, defined below. We use $ to execute the transformation at compile time. These small modifications could be even completely transparent to the programmer if we generate them automatically. If we want the new pass to do more things we just have to modify function transfm.
4.1 Strictness Analysis Implementation
The analysis is carried out by function strict :: Exp -> Env -> AbsVal which, given an expression and a strictness environment, returns the abstract value of the expression. Abstract values are represented using a data type AbsVal:

data StrictAnnot = Bot | Top deriving (Show, Eq)
data AbsVal = B StrictAnnot | F [StrictAnnot] | FB Int

The basic annotations are B Bot, to represent strictness, and B Top to represent the "don't know" value. The abstract value of a function with n arguments is approximated through a signature of the form F [b1, b2, ..., bn] where each
strict :: Exp -> Env -> AbsVal
strict (VarE s) rho = getEnv s rho
strict (LitE l) rho = B Top
strict (InfixE (Just e1) e (Just e2)) rho =
  if (isCon e) then (B Top)
               else inf (strict e1 rho) (strict e2 rho)
strict (CondE e1 e2 e3) rho =
  inf (strict e1 rho) (sup (strict e2 rho) (strict e3 rho))

Fig. 5. Strictness Analysis Implementation-Basic Cases

strict (LamE ((VarP s):[]) e) rho =
  let B b = strictaux e (addEnv (s, B Bot) rho)
  in case (strict e (addEnv (s, B Top) rho)) of
       B b1 -> F (b:[])
       F bs -> F (b:bs)

strictaux :: Exp -> Env -> AbsVal
strictaux (LamE ((VarP s):[]) e) rho = strictaux e (addEnv (s, B Top) rho)
strictaux e rho = strict e rho

Fig. 6. Strictness Analysis Implementation-Lambda Expressions
bi indicates whether the function is strict in the ith argument. The special FB n value is the abstract value of a completely undefined function with n arguments, that is, the bottom of the functional abstract domain, which is useful in several places. The transformation function calls this function, but if we want to prove the prototype with examples we can write the following: main = putStr (show $(strict2 [| e |] empty)) where e is a closed expression we want to analyse, empty represents the empty strictness environment, and function strict2 is defined as follows: strict2 :: ExpQ -> Env -> ExpQ strict2 eq rho = do {e Exp just converts an abstract value into an expression. Notice that the analysis is carried out at compile time and that we have defined strict2 as a transformation from a expression to another expression representing its abstract value. This is because the compile time computations happen inside the quotation monad, so both the argument and the result of strict2 must be of type ExpQ. We use the do-notation in order to encapsulate strict into the monadic world. Function strict is the actual strictness analysis defined by case distinction over the abstract syntax tree, we need to remember the Exp data type definition (shown in Figure 2) and the restrictions of our language (explained in the previous section). In Figure 5 we show the interpretation of constants, primitive operators, variables and conditional expressions, as shown in the previous section. We have to be careful with infix operators because some constructors like lists : are infix. We distinguish them using function isCon, which we do not show here. Operator inf calculates the greatest lower bound and sup the least upper bound, and getEnv gets from the environment the abstract value of a variable. In Figure 6 we show the interpretation of a lambda abstraction. Its value is a signature F [b1, ..., bn], being n the number of arguments, obtained by probing 190
strict (ConE cons) rho = B Top
strict (AppE (ConE cons) e) rho = B Top
strict (AppE e1 e2) rho =
  if (isCon e1) then B Top
                else absapply (strict e1 rho) (strict e2 rho)

absapply :: AbsVal -> AbsVal -> AbsVal
absapply (FB n) a
  | n == 1 = B Bot
  | n > 1  = FB (n-1)
absapply (F (h:tl)) (B b)
  | null tl   = B x
  | x == Top  = F tl
  | otherwise = FB (length tl)
  where x = sups h b

Fig. 7. Strictness Analysis Implementation-Applications
the function with several combination of arguments, as we explained in Section 3.3. We start probing the function with the first argument. First, we give it the value B Bot and the auxiliary function strictaux gives the rest of the arguments the value B Top. Then we give it the value B Top and recursively probe with the rest of the arguments. In such a way we obtain all the combinations we wish. In Figure 7 we show the interpretation of both constructor and function applications. From the point of view of the language they are the same kind of expression, so we use again function isCon to distinguish them. If it is a function application, absapply carries out the abstract function application. The abstract value FB n represents the completely undefined function so it returns B Bot when completely applied and FB (n-1) when there are remaining arguments to be applied to. When a signature F [b1, ..., bn] is applied to an abstract value B b we need to know whether it is the last argument. If that is the case we can return a basic value, otherwise we have to return a functional value. The resulting abstract value depends on both b1 and b. If b1 is Top the function is not necessarily strict in its first argument, so independently of the value of b we can return B Top if it was the last argument or continue applying the function to the rest of the arguments by returning the rest of the list. The same happens if b is Top as head xs was obtained by giving the first argument the value Bot: we have lost information and the only thing we can say is ”we don’t know” and consequently either return B Top or continue applying the function. If neither b1 nor b is Top (i.e. when the least upper bound sups returns Bot) then the function is strict in its first argument, which is undefined, so we can return B Bot independently of the rest of the arguments. However if there are arguments left we return the completely undefined function FB (n-1). In Figure 8 we show the interpretation of a let expression. Auxiliary function strictdecs carries out the fixpoint calculation. Function splitDecs splits the left hand sides (i.e. the bound variables) and the right hand sides of the declarations. The initial environment init is built by extending the environment with the new variables bound to an undefined abstract value of the appropriate type, done by extendEnv. Function combines updates the environment with the new abstract values in each fixpoint step; it also returns a boolean value False when the environment does not change and consequently the fixpoint has been reached. Finally, in Figure 9 we show the interpretation of a case expression. Function 191
strict (LetE ds e) rho = strict e (strictdecs ds rho)

strictdecs :: [Dec] -> Env -> Env
strictdecs [ ] rho = rho
strictdecs ds rho =
  let (varns, es) = splitDecs ds
      init = extendEnv rho varns
      f = \rho' -> let aes     = map (flip strict rho') es
                       triples = zipWith triple varns aes
                   in combines rho' triples
      fix g (env, True)  = fix g (g env)
      fix g (env, False) = env
  in fix f (init, True)

Fig. 8. Strictness Analysis Implementation-Let Expressions

strict (CaseE e ms) rho =
  let se = strict e rho
      l  = caseaux ms se rho
      sl = suplist l
  in if (nostrict ms) then sl else (inf se sl)

caseaux :: [Match] -> AbsVal -> Env -> [AbsVal]
caseaux ms se rho = map (casealt se rho) ms

casealt :: AbsVal -> Env -> Match -> AbsVal
casealt abs rho m = case m of
  Match (InfixP (VarP h) con (VarP tl)) (NormalB e) [] ->
    let rho' = addEnvPat abs [VarP h, VarP tl] rho in strict e rho'
  Match (ConP con ps) (NormalB e) [] ->
    let rho' = addEnvPat abs ps rho in strict e rho'
  Match (VarP x) (NormalB e) [] ->
    let rho' = addEnvPat abs ((VarP x):[]) rho in strict e rho'

Fig. 9. Strictness Analysis Implementation-Case Expressions
nostrict returns true if it is a lazy case expression. The first two branches of casealt correspond to constructor pattern matches (either infix or prefix) and the third one to the variable alternative. Function suplist calculates the least upper bound of the alternatives, and casealt interprets each of the alternatives. The variables bound by the case alternatives inherit the abstract value of the discriminant, which is done by function addEnvPat.

Example 4.1 Given the expression \x -> \y -> 3 * x, the analysis returns F [Bot, Top], as expected; i.e. the function is strict in the first argument.

Example 4.2 Another example with a case expression is the following one:

\x -> \z -> case 1:[] of
              []   -> x
              y:ys -> x + z

The result is F [Bot, Top] as expected, telling us that the function is strict in the first argument but maybe not in the second one, although we know it is. Notice the loss of precision. This is because the analysis is static, but not because of the implementation.

Example 4.3 The use of signatures in the implementation implies a loss of precision with respect to the analysis shown in Section 3. For example, function
transf :: Exp -> Env -> Exp
transf (LetE ds e) rho =
  if (isRecorFun ds)
  then let (vs, es) = splitDecs ds
           rho' = foldr addEnvtop rho vs
           es'  = map (flip transf rho') es
           ds'  = zipWith makeDec ds es'
           te'  = transf e rho'
       in LetE ds' te'
  else case (head ds) of
         ValD (VarP x) (NormalB e') [] ->
           let te'    = transf e' rho
               te     = transf e (addEnv (x, B Top) rho)
               ds'    = ValD (VarP x) (NormalB te') [] : []
               lambda = LamE ((VarP x):[]) te
               F bs   = strict lambda rho
           in if (head bs) == Bot
              then LetE ds' (InfixE (Just (VarE x))
                                    (VarE (mkName "Prelude:seq"))
                                    (Just te))
              else LetE ds' te

Fig. 10. Transformation of a let expression
\ x -> \ y -> \ z -> if z then x else y has abstract value λa1 .λa2 .λa3 .a3 u (a1 t a2 ) but the implementation would assign it the signature F [Top, Top, Bot], which is indistinguishable from the abstract value λa1 .λa2 .λa3 .a3 . Function \ x -> \ y -> \ z -> z has the same signature.
4.2 Transformation implementation
The let-to-case transformation has been developed in a similar way. We want the transformation function to be applied not only to the main expression at top level but also, when possible, to all its subexpressions. For example, function \ x -> let z = 3 in x + z can be transformed to \ x -> let z = 3 in z ‘seq‘ (x + z). But then, even when the main expression is closed, subexpressions may have free variables. Consequently, we need a strictness environment, initially empty, carrying the abstract values of the free variables: transfm e = transf2 e empty transf2 :: ExpQ -> Env -> ExpQ transf2 eq rho = do {e (let a_1 = 1 in a_1 Prelude:seq (a_1 GHC.Num.+ 3)) GHC.Num.* (let y_2 = 2 in y_2 Prelude:seq (y_2 GHC.Num.+ x_0))
5 Conclusions and Future Work
In this paper we have studied how to use Template Haskell in order to incorporate new analyses and transformations to the compiler without modifying it. We have presented the implementation of a strictness analysis and a subsequent let-to-case transformation. The source code can be found at http://dalila.sip.ucm.es/miembros/clara/publications.html. These are well-known problems, which has allowed us to concentrate on the difficulties and limitations of using Template Haskell for our purposes, see the discussion below. As far as we know, this is the first time that Template Haskell has been used for developing static analyses. There are some compiling tools available for GHC (see http://www.haskell.org/libraries/#compilation) which are useful to write analyses prototypes, but our aim is to use the results of the analyses and to continue with the GHC’s compilation process. Analyses and transformations are usually done over a simplified language where the syntactic sugar has disappeared: Core in GHC. Currently, those researchers interested in writing just a new simplifier pass, can only do it by linking their 194
code into the GHC executable, which is not trivial. In http://www.haskell.org/ghc/docs/latest/html/users guide/ext-core.html a (draft) formal definition for Core is provided with the aim of making Core fully usable as a bi-directional communication format. At the moment it is only possible to dump to a file the Core code obtained after the simplifier phase in such external format. The analysis has been developed for a first-order subset of Haskell. This has been relatively easy to define. The only difficulty here is the absence of a properly commented documentation of the library. The analysis could be extended to higherorder programs. We have not done this for the moment for the following reason. When analysing higher order functions, it is necessary to probe functions with functional signatures, which we have to generate, as we explained in Section 3.3. In order to generate such signatures we need to know how many arguments the function has, which in the first order case was trivial (we just counted the lambdas) but not in the higher order case due to partial applications. If we had types available in the syntax tree, it would be trivial again. In this analysis the probing signatures are quite simple; if the argument function has n arguments then the probing signature is FB n. But in other analyses, like non-determinism analysis [14], probing signatures are more complex and types are fundamental to generate them properly. Although there is a typing algorithm for Template Haskell [8], the type information is not kept in the syntax tree. We could of course develop our own typing algorithm but it would be of no help for other users if it is not integrated in the tool. This would be very useful also to do type-based analyses, which we plan to investigate. Using Template Haskell for analyses and transformations has several disadvantages. First, the analysis and transformation must be defined for full Haskell. Defining the analysis for Core would make sense if it were possible to control in which phase of the compiler we want to access the abstract syntax tree, and for the moment this is not the case. If the analysis is defined for a subset of Haskell, like ours, it would be necessary to study the transformations done by GHC’s desugarer in order to determine how to analyse the sugared expressions. An analysis at the very beginning of the compilation process is still useful when we want to give information to the user about the results of the analysis. In that case we want to reference the original variables written by him/her, which are usually lost in further phases of the compiler. Notice that in our examples variables are indexed but they still maintain the original string name. The desugarer however generates fresh variables unknown for the programmer. Second, we can profit only of those analyses whose results are used by a subsequent transformation. The results of the analysis cannot be propagated to further phases of the compiler, which would be affected by them. Examples of this situation is the non-determinism analysis [14] whose results are used to deactivate some transformations done by the simplifier, or the usage analysis which affects to the STG code generated by the compiler [21]. However it is useful for developing abstract interpretation based analyses whose results can be used to transform Haskell code, and incorporate easily such transformation to the compilation process.
References [1] G. L. Burn, C. L. Hankin, and S. Abramsky. The Theory of Strictness Analysis for Higher Order Functions. In Programs as Data Objects, volume 217 of LNCS, pages 42–62. Springer-Verlag, October 1986. [2] T. P. Jensen. Strictness Analysis in Logical Form. In R. J. M. Hughes, editor, Functional Programming Languages and Computer Architecture, volume 523 of LNCS, pages 352–366. Springer-Verlag, New York, 1991. [3] U. Klusik, Y. Ortega-Mall´ en, and R. Pe˜ na. Implementing Eden - or: Dreams Become Reality. In Selected Papers of the 10th International Workshop on Implementation of Functional Languages, IFL’98, volume 1595 of LNCS, pages 103–119. Springer-Verlag, 1999. [4] U. Klusik, R. Pe˜ na, and C. Segura. Bypassing of Channels in Eden. In P. Trinder, G. Michaelson, and H.-W. Loidl, editors, Trends in Functional Programming. Selected Papers of the 1st Scottish Functional Programming Workshop, SFP’99, pages 2–10. Intellect, 2000. [5] R. Loogen, Y. Ortega-Malln, R. Pe˜ na, S. Priebe, and F. Rubio. Patterns and Skeletons for Parallel and Distributed Computing. F. Rabhi and S. Gorlatch (eds.), chapter Parallelism Abstractions in Eden, pages 95–128. Springer-Verlag, 2002. [6] I. Lynagh. Template Haskell: A report from the field. ian.lynagh/papers/), 2003.
(http://web.comlab.ox.ac.uk/oucl/work/
[7] I. Lynagh. Unrolling and Simplifying Expressions with Template Haskell. ox.ac.uk/oucl/work/ian.lynagh/papers/), 2003. [8] I. Lynagh. Typing Template Haskell: Soft Types. ian.lynagh/papers/), 2004.
(http://web.comlab.
(http://web.comlab.ox.ac.uk/oucl/work/
[9] A. Mycroft. Abstract Interpretation and Optimising Transformations for Applicative Programs. Phd. thesis, technical report cst-15-81, Dept Computer Science, University of Edinburgh, December 1981. [10] F. Nielson, H. R. Nielson, and C. Hankin. Principles of Program Analysis. Springer-Verlag, 1999. [11] R. Pe˜ na and C. Segura. Sized Types for Typing Eden Skeletons. In T. Arts and M. Mohnen, editors, Selected papers of the 13th International Workshop on Implementation of Functional Languages, IFL’01, volume 2312 of LNCS, pages 1–17. Springer-Verlag, 2002. [12] S. L. Peyton Jones, C. V. Hall, K. Hammond, W. D. Partain, and P. L. Wadler. The Glasgow Haskell Compiler: A Technical Overview. In Joint Framework for Inf. Technology, Keele, DTI/SERC, pages 249–257, 1993. [13] S. L. Peyton Jones and W. Partain. Measuring the Effectiveness of a Simple Strictness Analyser. In Glasgow Workshop on Functional Programming 1993, Workshops in Computing, pages 201–220. Springer-Verlag, 1993. [14] R. Pea and C. Segura. Non-determinism Analyses in a Parallel-Functional Language. Journal of Functional Programming, 15(1):67–100, 2005. [15] S. Priebe. Preprocessing Eden with Template Haskell. In Generative Programming and Component Engineering: 4th International Conference, GPCE 2005, volume 3676 of LNCS, pages 357–372. Springer-Verlag, 2005. [16] A. L. M. Santos. Compilation by Transformation in Non-Strict Functional Languages. PhD thesis, Glasgow University, Dept. of Computing Science, 1995. [17] T. Sheard and S. Peyton Jones. 37(12):60–75, 2002.
Template meta-programming for Haskell.
SIGPLAN Notices,
[18] S. Peyton Jones T. Sheard. Notes on Template Haskell Version 2. (http://research.microsoft.com/ ˜simonpj/tmp/notes2.ps), 2003. [19] P. Wadler. Monads for functional programming. In J. Jeuring and E. Meijer, editors, Advanced Functional Programming. LNCS 925. Springer-Verlag, 1995. [20] P. L. Wadler and R. J. M. Hughes. Projections for Strictness Analysis. In G. Kahn, editor, Proceedings of Conference Functional Programming Languages and Computer Architecture, FPCA’87, volume 274 of LNCS, pages 385–407. Springer-Verlag, 1987. [21] K. Wansbrough and S. L. Peyton Jones. Once Upon a Polymorphic Type. In Conference Record of POPL ’99: The 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 15–28, San Antonio, Texas, January 1999. ACM Press.
WFLP 2006
Temporal Contextual Logic Programming

Vitor Nogueira
Universidade de Évora and CENTRIA FCT/UNL, Portugal
Salvador Abreu
Universidade de Évora and CENTRIA FCT/UNL, Portugal
Abstract
The importance of temporal representation and reasoning is well known not only in the database community but also in the artificial intelligence one. Contextual Logic Programming [17] (CxLP) is a simple and powerful language that extends logic programming with mechanisms for modularity. Recent work not only presented a revised specification of CxLP together with a new implementation for it but also explained how this language could be seen as a shift into the Object-Oriented Programming paradigm [2]. In this paper we propose a temporal extension of such language called Temporal Contextual Logic Programming. Such extension follows a reified approach to temporal qualification, which besides the acknowledged increased expressiveness of reification allows us to capture the notion of the time of the context. Together with the syntax of this language we also present its operational semantics and an application to the management of workflows.

Keywords: Temporal, logic, contexts, modular.
1
Introduction and Motivation
Contextual Logic Programming [17] (CxLP) is a simple and powerful language that extends logic programming with mechanisms for modularity. Recent work not only presented a revised specification of CxLP together with a new implementation for it but also explained how this language could be seen as a shift into the ObjectOriented Programming paradigm [2]. Finally, CxLP was shown to be a powerful language in which to design and implement Organizational Information Systems [3]. Temporal representation and reasoning is a central part of many Artificial Intelligence areas such as planning, scheduling and natural language understanding. Also in the database community we can see that this is a growing field of research [12,9]. Although both communities have several proposals for working with time, it still 1 2
remains as challenging problem. For instance, there is no standard for temporal SQL or commercial DBMS that goes far beyond the traditional implementation of time. To characterize a temporal reasoning system we can consider besides the ontology and theory of time, the ”pure” temporal forms along with the logical forms. Although the first two issues are out of the scope of this paper, they were not neglected since they have been the subject of previous work [19,16], where we proposed a theory of time based on an ontology that considers points as the primitive units of time and the structure for the temporal domain was discrete and linear. Nevertheless, also intervals and durations were easily represented because the paradigm used to represent the temporal forms was Constraint Logic Programming. The last issue, the logical form of the reasoning system, is the subject of this work. Adding a temporal dimension to CxLP results in a language that besides having all the expressiveness acknowledged to logic programming, allow us easily to establish connections to common sense notions because of its contextual structure. In this article we will introduce the language Temporal Contextual Logic Programming (TCxLP) along its operational semantics and discuss the application to the case workflow management systems. The remainder of this article is structured as follows. In Sects. 2 and 3 we briefly overview Contextual Logic Programming and Many–Sorted First–Order Logic, respectively. Section 4 discusses some temporal reasoning options followed and Sect. 5 presents the syntax and operational semantics of the proposed language, Temporal Contextual Logic Programming. Its application to the management of workflows is shown in Sect. 6. In Sect. 7 we establish some comparisons with similar formalisms. Conclusions and proposals for future work follow.
2
An Overview of Contextual Logic Programming
For this overview we assume that the reader is familiar with the basic notions of Logic Programming. Contextual Logic Programming (CxLP) [17] is a simple yet powerful language that extends logic programming with mechanisms for modularity. In CxLP a finite set of Horn clauses with a given name is designated by unit. The vocabulary of Contextual Logic Programming contains sets of variables, constants, function symbols and predicate symbols, from which terms and atoms are constructed as usual. Also part of the vocabulary is a set of unit names. More formally, we have that each unit name is associated to a unit described by a pair hLu , Cu i consisting of a label Lu and clauses Cu . The unit label Lu is a term u(v1 , . . . , vn ), n ≥ 0, where u is the unit name and v1 , . . . , vn are distinct variables denominated unit’s parameters. We define a unit designator as any instance of a unit label. In [2] we presented a new specification for CxLP, which emphasizes the OOP aspects by means of a stateful model, allowed by the introduction of unit arguments. Using the syntax of GNU Prolog/CX, consider a unit named employee to represent some basic facts about University employees: Example 2.1 CxLP unit employee 198
:-unit(employee(NAME, POSITION)).

name(NAME).
position(POSITION).
item :- employee(NAME, POSITION).

employee(bill, teachingAssistant).
employee(joe, associateProfessor).

The main difference between the code of example 2.1 and a regular logic program is the first line that declares the unit name (employee) along with two unit arguments (NAME, POSITION). Unit arguments help avoid the annoying proliferation of predicate arguments, which occur whenever a global structure needs to be passed around. A unit argument can be interpreted as a “unit global” variable, i.e. one which is shared by all clauses defined in the unit. Therefore, as soon as a unit argument gets instantiated, all the occurrences of that variable in the unit are replaced accordingly. For instance if the variable NAME gets instantiated with bill we can consider that the following changes occur:

:-unit(employee(bill, POSITION)).

name(bill).
item :- employee(bill, POSITION).

Consider another unit baseSalary that besides some facts has a rule to calculate the employees base salary: multiply an index by a factor that depends on the employee position. For instance, if the position is teaching assistant, then the base salary is 10 (index) ∗ 200 (factor) = 2000.

Example 2.2 CxLP unit baseSalary

:-unit(baseSalary(S)).

item :- position(P), position_factor(P, F), index(I), S is I*F.

position_factor(teachingAssistant, 200).
position_factor(associateProfessor, 400).
index(10).

We can see that there is no clause for predicate position/1 in this unit. Although it will be explained in detail below, for now we can consider that the definition for this predicate will be obtained from the context. A set of units is designated as a contextual logic program. With the units above we can build the program: P = {employee, baseSalary}. Since in the same program we could have two or more units with the same name
but different arities, to be more precise besides the unit name we should also refer its arity i.e. the number of arguments. Nevertheless, most of the times when there is no ambiguity, we omit the arity of the units. If we consider that employee and baseSalary designate sets of clauses, then the resulting program is given by the union of these sets. For a given CxLP program, we can impose an order on its units, leading to the notion of context. Contexts are implemented as lists of unit designators and each computation has a notion of its current context. The program denoted by a particular context is the union of the predicates that are defined in each unit. Moreover, we resort to the override semantics to deal with multiple occurrences of a given predicate: only the topmost definition is visible. To construct contexts, we have the context extension operation given by the operator :> . The goal U :> G extends the current context with unit U and resolves goal G in the new context. For instance to obtain the employees information we could do: | ?- employee(N, P) :> item. N = bill P = teachingAssistant ? ; N = joe P = associateProfessor In this query we extend the initial empty context [] 3 with unit employee obtaining context [employee(N, P)] and then resolve query item. This leads to the two solutions above. Units can be stacked on top of a context; as an illustration consider the following query: | ?- employee(bill, _) :> (item, baseSalary(S) :> item). In this goal we start by adding the unit employee/2 to the empty context resulting in context [employee(bill, )]. The first call to item matches the definition in unit employee/2 and instantiates the remaining unit argument. The context then becomes [employee(bill,teachingAssistant)]. After baseSalary/1 being added, we are left with the context [baseSalary(S), employee(bill,teachingAssistant)]. The second item/0 goal is evaluated and the first matching definition is found in unit baseSalary. In the body of the rule for item we find position(P) and since there is no rule for this goal in the current unit (baseSalary), a search in the context is performed. Since employee is the topmost unit that has a rule for position(P), this goal is resolved in the (reduced) context [employee(bill, teachingAssistant)]). In an informal way, we can say that we ask the context for the position for which we want to calculate the base salary. Variable P is instantiated to teachingAssistant and computation of goal position factor(teaching3
In the GNU Prolog/CX implementation the empty context contains all the standard Prolog predicates such as =/2.
Assistant, F), index(I), S is F*I is executed in context [baseSalary(S), employee(bill, teachingAssistant)]. Using the clauses of unit baseSalary we get the final context [baseSalary(2000), employee(bill, teachingAssistant)] and the answer S = 2000.
3
Many–Sorted First–Order Logic
For self–containment reasons in this section we will briefly present Many–Sorted First–Order Logic (MSFOL). For a more detailed discussion see for instance [14]. Many–Sorted First-Order Logic can be regarded as a ’typed version’ of First– Order Logic (FOL), that results from adding to the FOL the notion of sort. Although MSFOL is a flexible and convenient logic, it still preserves the properties of FOL. In MSFOL besides predicate and function symbols, there exists sort symbols A, B, C. Each function f has an associated sort sort(f ) of the form A1 × . . . × Aarity(f ) → A and each predicate symbol P has an associated sort sort(P ) of the form A1 × . . . × Aarity(P ) . Likewise, each variable has an associated sort. These sorts have to be respected in order to build only well–formed formulas. An MSFOL interpretation M consists of: •
a domain DA for each sort A
•
for each function symbol f a function I(f ) : DA1 ×. . .×DAarity(f ) → DA , matching the sort of f
•
for each predicate symbol P a function I(P ) : DA1 ×. . .×DAarity(P ) → B, matching the sort of P .
The satisfiability for MSFOL interpretations is defined as expected, i.e. the quantification of variables in clauses becomes sort–dependent.
4
Temporal Reasoning Issues
In this section we discuss several temporal reasoning options that we followed in our approach. Namely, the temporal qualification and temporal ontology. 4.1
The Model of Time
To define the model of time we need to define not only the time ontology but also the time topology. By time ontology we mean the class or classes of objects time is made of (instants, intervals, durations, . . . ) and the time topology is related to the properties of sets of the objects defined, namely: •
discrete, dense or continuous
•
bounded or unbounded
•
linear, branching, circular or with a different structure
•
are all individuals comparable by the order relation (connectedness)
•
are all individuals equal (homogeneity)
•
is it the same to look at one side or to the other (symmetry)
4.2
Temporal Qualification
Temporal qualification is by itself a very prolific field of research. For an overview on this subject see for instance [18]. Besides modal logic proposals, from a first–order point of view we can consider the following methods for temporal qualification: temporal arguments, token arguments, temporal reification and temporal token reification. Although no method is clearly superior to the others, we decided to follow a temporal reification to units designators. Because besides assigning a special status to time, temporal reification has the advantage of allowing to quantify over propositions. The major critics made to reification is that such approach requires a sort structure to distinguish between terms that denote real objects of the domain (terms of the original object language) and terms that denote propositional objects (propositions of the original object language). One major issue in every temporal theory is deciding what sort of information is subject to change. In the case of Contextual Logic Programming the temporal qualification could be performed at the level of clauses, units or even contexts. In order to be as general as possible we decided to qualify units, more specifically, units designators. This way we can also qualify: •
clauses: by considering units with just one clause;
•
contexts: by considering contexts containing a single unit.
Moreover, this way temporal qualification is twofolded: it is static when we are considering units and it is dynamic when those units are part of a context.
4.3
Temporal Ontology
From an ontological point of view we can classify the temporal relations into a number of classes such as fluents, events, etc. Normally, each of these classes has associated a theory of temporal incidence. For instance the occurrence of an event over an interval is solid (if it holds over a interval it does not hold on any interval that overlaps it) whereas fluents hold in a homogeneous way (if it holds in an interval then it holds in each part of the interval). Our theory of temporal incidence will be encoded by means of conditions in the operational semantics and to be as expressive as possible unit designators can be considered as events or as fluents according to the context, i.e. the current context will specify if they must hold in a solid or homogeneous way (see inference rule Reduction of page 206).
5
Temporal Contextual Logic Programming
In this section we present our temporal extension of CxLP, called Temporal Contextual Logic Programming (TCxLP). We start by providing the syntax of this language that can be regarded as a two–sorted CxLP where one of sort is for temporal elements and the other for non–temporal elements. Then we give the operational semantics by means of a set of inference rules which specify computations. 202
Nogueira and Abreu
5.1
Syntax
Temporal Contextual Logic Programming is a two–sorted CxLP with the sorts T and N T , where the former sort stands for temporal sort and the later for non– temporal sort. It is convenient to move to many–sorted logics since it naturally allows to distinguish between time and non–time individuals. There is a constant now of the temporal sort that stands for the current time and one new operator (::) to obtain the time of the context. In Temporal Contextual Logic Programming each unit is described by a triple hLu , Cu , Tu i where the element Tu is the temporal qualification. The unit temporal qualification Tu is a set of holds(ud, t) where ud is a unit designator for u and t a term of the temporal sort. To illustrate the concepts above, consider the table taken from [5] that represents information about Eastern Europe history, modeling the independence of various countries (to simplify we presented just the excerpt related to Poland) where each row represents an independent nation and its capital: Year
       Timeslice
1025   { indep(Poland, Gniezno) }
...
1039   { indep(Poland, Gniezno) }
1040   { indep(Poland, Cracow) }
...

Table 1 Eastern European history: abstract temporal database
For this example we can consider the unit indep where the label is:

:- unit(indep(Country, Capital)).

the temporal qualification is:

holds(indep(poland, gniezno), time(1025, 1039)).
holds(indep(poland, cracow), time(1040, 1595)).

and the clauses are:

country(Country).
capital(Capital).
item :- holds( indep(Country, Capital) ).
unit designator,
•
term of the sort T . With the unit above we can build temporal contexts like:
time(1400, 1600) :> indep(poland, C) :> item. In this context C stands for name of all the capitals of (independent) Poland 203
Nogueira and Abreu
between 1400 and 1600. In this section we are going to use λ to denote the empty context, t for a term of the temporal sort, ud for unit designator, u to represent the set of predicate symbols that are defined in unit u and C to denote contexts. Moreover, we may specify a context as C = e.C 0 where e is the topmost element and C 0 is the remaining context (C 0 is the supercontext of C). 5.2
Operational Semantics
As usual in logic programming, we present the operational semantics by means of derivations. For self–containment reasons, we explain briefly what is a derivation. Such explanation follows closely the one in [15]. Derivations are defined in a a declarative style, by considering a derivation relation and introducing a set of inference rules for it. A tuple in the derivation relation is written as U, C ` G[θ] where U is a set of units, C a temporal context, G a goal and θ a substitution. Since the set of units remains the same for a derivation, we will omit U in the definition of the inference rules. Each inference rule has the following structure: Antecedents {Conditions Consequent The Consequent is a derivation tuple, the Antecedents are zero, one or two derivation tuples and Conditions are a set of arbitrary conditions. The inference rules can be interpreted in a declarative or operational way. In the declarative reading we say that the Consequent holds if the Conditions are true and the Antecedents hold. From a operational reading we get that if the Conditions are true, to obtain the Consequent we must establish the Antecedents. A derivation is a tree such that: (i) any node is a derivation tuple (ii) in all leaves the goal is null (iii) the relation between any node and its children is that between the consequent and the antecedents of an instance of an inference rule (iv) all clause variants mentioned in these rule instances introduce new variables different from each other and from those in the root. The operation of the temporal contextual logic system is as follows: given a context C and a goal G the system will try to construct a derivation whose root is C ` G [θ], giving θ as the result substitution, if it succeeds. The substitution θ is called the computed answer substitution. We may now enumerate the inference rules which specify computations. We will present just the rules for the basic operators since the remaining can be obtained from these ones: for instance, the extension U :> G can be obtained by combining the inquiry with the switch as in :> C, [U|C] :< G. Together with each rule we will also present its name and a corresponding number. Moreover, the paragraph after each rule gives an informal explanation of how it works. 204
Nogueira and Abreu
Null goal (1)
C ` ∅[] The null goal is derivable in any context, with the empty substitution as result. Conjunction of goals C ` G1 [θ] Cθ ` G2 θ[σ] C ` G1 , G2 [θσdvars(G1 , G2 )] To derive the conjunction derive one conjunct first, and then the other in the same context with the given substitutions. The notation δdV stands for the restriction of the substitution δ to the variables in V . Since C may contain variables in unit designators or temporal terms that may be bound by the substitution θ obtained from the derivation of G1 , we have that θ must also be applied to C in order to obtain the updated context in which to derive G2 θ. (2)
Context inquiry n
(3)
θ = mgu(X, C) C ` :> X[θ] In order to make the context switch operation (4) useful, there needs to be an operation which fetches the context. This rule recovers the current context C as a term and unifies it with term X, so that it may be used elsewhere in the program. Context switch C 0 ` G[θ] C ` C 0 :< G[θ] The purpose of this rule is to allow execution of a goal in an arbitrary context, independently of the current context. This rule causes goal G to be executed in context C 0 . (4)
Time inquiry: empty context (5)
λ ` :: now[]
Time inquiry: temporal element
(6)
sort(t) = T
sort(t0 ) = T tC ` :: t0 [θ] θ = mgu(t, t0 )
The two rules above state that the time of a context is now if the context is empty (5) or is given by the first element of the context, if such element is of the temporal sort (6). From the combination of these rules with the one for 205
Nogueira and Abreu
the context traversal (8) we get that the time of a context is represented by the ”first” or topmost temporal element of such context (or now if there is no explicit mention of time). Therefore, also for the time enquiry operator we resort to an overriding semantics. Reduction
(7)
H ← G1 , G 2 · · · Gn ∈ u θ = mgu(G, H) uCθ ` (G1 , G2 · · · Gn )θ[σ] holds(uϕ, t0 ) ∈ Tu uC ` G[θσdvars(G)] uC ` :: t uC ` intersects(t, t0 )
The clauses of topmost unit (u) can be applied in a context (uC) if there is at least one unit designator (uϕ) that holds (holds(uϕ, t0 )) at the time of the context (uC ` :: t and uC ` intersects(t, t0 )). In a informal way we can say that when a goal has a definition in the topmost unit in the context (first two conditions) and such unit (instantiation) can be applied in the time of the context (last three conditions), then it will be replaced by the body of the matching clause, after unification. The reader might have noticed that not only the time (t) is obtained from the context but also the definition of predicate intersects/2. This way we can have not only different temporal elements (points, intervals, etc) but also different ontologies. For instance, if t and t’ are time points and the definition of intersects in the context is a synonym for equal then we can consider unit designators as events. On the other hand if t and t’ are intervals and the definition of intersects in the context is a synonym for intervals overlapping then we can consider unit designators as fluents. Finally, we can also have a combination of both (events and fluents) approaches. Context traversal: C ` G[θ] eC ` G[θ] When none of the previous rules applies, remove the top element of the context, i.e. resolve goal G in the supercontext. (8)
5.2.1 Application of the rules It is rather straightforward to check that the inference rules are mutually exclusive, leading to the fact that given a derivation tuple C ` G[θ] only one rule can be applied.
6
Application: Management of Workflow Systems
Workflow management systems (WfMS) are software systems that support the automatic execution of workflows. Although time is an important resource for them, 206
Nogueira and Abreu
Fig. 1. The Student Enrolment process model: initial proposal (left) and refinement (right)
the time management offered by most of these systems must be handled explicitly and is rather limited. Therefore, automatic management of temporal aspects of information is an important and growing field of research [8,6,7,13]. Such management can defined not only at the conceptual level (for instance changes defined over a schema) but also at run time (for instance workload balancing among agents). The example used to illustrate the application of our language to workflows is based on the one given in [7]. The reason to use an existing example is two folded: not only we consider that such example is an excellent illustration of the temporal aspects in a WfMS but also will allow us to give a more precise comparison to the approach of those authors. For that consider the process of enrollment of graduate students applying for PhD candidate positions. In the first proposal of the process model, from September 1st , 2003, any received application leads to an interview of the applicant (see workflow on the left of Fig. 1). After September 30th , 2003, the process model was refined and the applicants CVs must be analyzed first: only applicants with an acceptable CV will be interviewed (see workflow on the right of Fig 1). One process of the above workflow is selecting the successor(s) of a completed task. Since for the example given there is a refinement of the workflow process, such selection must depend of the time. For instance, if the completed task is “Receive Application” and the current date is the 4th of September of 2003 then the next task must be “Interview”. But if the current date is after September 30th , 2003, then the next task must be “AnalyzeCV”. To represent such process consider the following unit next. Please notice that besides the unit label and clauses, now we have the temporal qualification of the units designators. The first temporal qualification states that for the student enrolment the next task after receiving and application is doing an interview, but this is only valid between ’03-09-01’ and ’03-09-30’. % L_next: label :- unit(next(SchemaName, TaskName, NextTask)). % C_next: clauses item :- holds(next(SchemaName, TaskName, NextTask), _). 207
Nogueira and Abreu
% T_next: temporal qualification holds(next(studentEnrollment, receiveApplication, interview), time(’03-09-01’, ’03-09-30’)). holds(next(studentEnrollment, interview, r1), time(’03-09-01’, inf)). holds(next(studentEnrollment, rejectApplication, end_flow), time(’03-09-01’, inf)). holds(next(studentEnrollment, acceptApplication, end_flow), time(’03-09-01’, inf)). holds(next(studentEnrollment, rejectAndThank, end_flow), time(’03-10-01’, inf)). holds(next(studentEnrollment, receiveApplication, analyzeCV), time(’03-10-01’, inf)). holds(next(studentEnrollment, analyzeCV, r2), time(’03-10-01’, inf)). With the unit above and assuming an homogeneous approach (see 4.3), consider the goal: ?- time(’03-09-04’) :> next(studentEnrollment, receiveApplication, N) :> item. N = interview I.e., at September 4th , 2003, the next task after receiving an application is an interview. The same query could be done without the explicit time: ?- next(studentEnrollment, receiveApplication, N) :> item. N = analyzeCV Remembering that if nothing is said about the time then we assume we are in the present time (after September 30th , 2003) and according to the refined workflow, the next task must be to analyze the CV. In the goals above our main focus was on the temporal aspects of the problem, leaving aside the modular aspects. Nevertheless we can consider a slightly more elaborated version of the problem where we have another temporally qualified unit called worktask with the name of the tasks and the role required by the agent for the execution: unit( worktask(SchemaName, TaskName, Role). A variant of the query above that besides the next task (N) would also find out a (valid) role for an agent to perform such task, could be stated as: ?- next(studentEnrollment, receiveApplication, N) :> (item, worktask(studentEnrollment, N, R) :> item). N = analyzeCV R = comitteeMember
7
Comparison With Other Approaches
One language that has some similarities with ours is Temporal Prolog of Hrycej [11]. Although such language provided a very interesting temporal extension of Prolog, it was restricted to Allen’s temporal constraint model [4] for reasoning about time intervals and their relationships. Moreover, the predicate space was flat, i.e. it had 208
Nogueira and Abreu
no solution to the problem of modularity. Its not a novelty the use of many–sorted logic to represent time (along other concepts). For instance to represent and reason about policies in [10] we find sorts for principals, actions and time. Moreover, there is an interesting notion of environment that can be regarded as a counterpart for context. Nevertheless since handling policies is the main focus, time is treated in a rather vague way. Combi and Pozzi [8,6,7] have a very interesting framework for temporal workflow management systems. Their proposal is more “database oriented” and therefore presents the advantages and disadvantages known towards logical approaches. For instance, their queries are far more verbose (see trigger findSuccessor ), not only because we use logical variables but also because contexts allow us to make some facts implicit or in the context.
8
Conclusions and Future Work
In this paper we presented a temporal extension of CxLP that can be regarded as a two–sorted CxLP. Although we aimed that such extension could be as minimal as possible we also wanted to be as expressive as possible, leading to the notion of units whose applicability depends of the time of the context, i.e. temporally qualified units. Although we presented the operational semantics we consider that to obtain a more solid foundation there is still need for declarative approach together with its soundness and completeness proof. To our understanding the best way to prove the usefulness of this language is by means of application, and for that purpose we chose the management of workflow systems. Besides this example, we are currently applying this language to the legislation field, namely to represent and reason about the evolution of laws. As mentioned in Sect. 7, using TCxLP as a language for specifying policies seems as fruitfully area of research. Finally, it is our goal to show that this language can act as the backbone for construction and maintenance of temporal information systems. Therefore, the evolution of this language or its integration with others such as ISCO [1] is one of the current lines of research.
References [1] Salvador Abreu. Isco: A practical language for heterogeneous information system construction. In Proceedings of INAP’01, Tokyo, Japan, October 2001. INAP. [2] Salvador Abreu and Daniel Diaz. Objective: In minimum context. In Catuscia Palamidessi, editor, ICLP, volume 2916 of Lecture Notes in Computer Science, pages 128–147. Springer, 2003. [3] Salvador Abreu, Daniel Diaz, and Vitor Nogueira. Organizational information systems design and implementation with contextual constraint logic programming. In IT Innovation in a Changing World – The 10th International Conference of European University Information Systems, Ljubljana, Slovenia, June 2004. [4] J .F. Allen. Maintaining knowledge about temporal intervals. cacm, 26(11):832–843, nov 1983. [5] M. H. Boehlen, J. Chomicki, R. T. Snodgrass, and D. Toman. Querying TSQL2 databases with temporal logic. Lecture Notes in Computer Science, 1057:325–341, 1996. [6] Carlo Combi and Giuseppe Pozzi. Temporal conceptual modelling of workflows. In Il-Yeol Song, Stephen W. Liddle, Tok Wang Ling, and Peter Scheuermann, editors, ER, volume 2813 of Lecture Notes in Computer Science, pages 59–76. Springer, 2003.
209
Nogueira and Abreu [7] Carlo Combi and Giuseppe Pozzi. Architectures for a temporal workflow management system. In SAC ’04: Proceedings of the 2004 ACM symposium on Applied computing, pages 659–666, New York, NY, USA, 2004. ACM Press. arquivos/p659-combi.pdf. [8] Carlo Combi and Giuseppe Pozzi. Task scheduling for a temporalworkflow management system. Thirteenth International Symposium on Temporal Representation and Reasoning (TIME’06), 0:61– 68, 2006. [9] Chris Date and Hugn Darwen. Temporal Data and the Relational Model. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002. [10] Joseph Y. Halpern and Vicky Weissman. Using first-order logic to reason about policies. In 16th IEEE Computer Security Foundations Workshop (CSFW-16 2003), 30 June - 2 July 2003, Pacific Grove, CA, USA, pages 187–201. IEEE Computer Society, 2003. [11] Tomas Hrycej. A temporal extension of prolog. J. Log. Program., 15(1-2):113–145, 1993. [12] Gad Ariav Ilsoo Ahn, Don Batory, James Clifford, Curtis E. Dyreson, Ramez Elmasri, Fabio Grandi, Christian S. Jensen, Wolfgang K¨ afer, Nick Kline, Krishna Kulkarni, T. Y. Cliff Leung, Nikos Lorentzos, John F. Roddick, Arie Segev, Michael D. Soo, and Suryanarayana M. Sripada. The TSQL2 Temporal Query Language. Kluwer Academic Publishers, 1995. [13] Elisabetta De Maria, Angelo Montanari, and Marco Zantoni. Checking workflow schemas with time constraints using timed automata. In Robert Meersman, Zahir Tari, Pilar Herrero, Gonzalo M´ endez, Lawrence Cavedon, David Martin, Annika Hinze, George Buchanan, Mar´ıa S. P´ erez, V´ıctor Robles, Jan Humble, Antonia Albani, Jan L. G. Dietz, Herv´ e Panetto, Monica Scannapieco, Terry A. Halpin, Peter Spyns, Johannes Maria Zaha, Esteban Zim´ anyi, Emmanuel Stefanakis, Tharam S. Dillon, Ling Feng, Mustafa Jarrar, Jos Lehmann, Aldo de Moor, Erik Duval, and Lora Aroyo, editors, OTM Workshops, volume 3762 of Lecture Notes in Computer Science, pages 1–2. Springer, 2005. [14] K. Meinke and J. V. Tucker, editors. Many-sorted logic and its applications. John Wiley & Sons, Inc., New York, NY, USA, 1993. [15] Lu´ıs Monteiro and Ant´ onio Porto. A Language for Contextual Logic Programming. In K.R. Apt, J.W. de Bakker, and J.J.M.M. Rutten, editors, Logic Programming Languages: Constraints, Functions and Objects, pages 115–147. MIT Press, 1993. [16] Vitor Beires Nogueira, Salvador Abreu, and Gabriel David. Towards temporal reasoning in constraint contextual logic programming. In Proceedings of the 3rd International Workshop on Multiparadigm Constraint Programming Languages MultiCPL’04 associated to ICLP’04, Saint–Malo, France, September 2004. [17] Ant´ onio Porto and Lu´ıs Monteiro. Contextual logic programming. In Giorgio Levi and Maurizio Martelli, editors, Proceedings 6th Intl. Conference on Logic Programming, Lisbon, Portugal , 19–23 June 1989, pages 284–299. The MIT Press, Cambridge, MA, 1989. [18] H. Reichgelt and L. Vila. Handbook of Temporal Reasoning in Artificial Intelligence, chapter Temporal Qualification in Artificial Intelligence. Foundations of Artificial Intelligence, 1. Elsevier Science, 2005. [19] Vitor Nogueira. A Temporal Programming Language for Heterogeneous Information Systems. In G. Gupta M. Gabbrielli, editor, Proceedings of the 21st Intl.Conf. on Logic Programming (ICLP’05), number 3668 in LNCS, pages 444–445, Sitges, Spain, October 2005. Springer.
210
WFLP 2006
A Fully Sound Goal Solving Calculus for the Cooperation of Solvers in the CF LP Scheme S. Est´evez Mart´ına,1 A. J. Fern´andezb,2 M.T. Hortal´a Gonz´aleza,1 M. Rodr´ıguez Artalejoa,1 R. del Vado V´ırsedaa,1 a b
Departamento de Sistemas Inform´ aticos y Computaci´ on Universidad Complutense de Madrid
Departamento de Lenguajes y Ciencias de la Computaci´ on Universidad de M´ alaga
Abstract The CF LP scheme for Constraint Functional Logic Programming has instances CF LP (D) corresponding to different constraint domains D. In this paper, we propose an amalgamated sum construction for building coordination domains C, suitable to represent the cooperation among several constraint domains D1 , . . . , Dn via a mediatorial domain M. Moreover, we present a cooperative goal solving calculus for CF LP (C), based on lazy narrowing, invocation of solvers for the different domains Di involved in the coordination domain C, and projection operations for converting Di constraints into Dj constraints with the aid of mediatorial constraints (so-called bridges) supplied by M. Under natural correctness assumptions for the projection operations, the cooperative goal solving calculus can be proved fully sound w.r.t. the declarative semantics of CF LP (C). As a relevant concrete instance of our proposal, we consider the cooperation between Herbrand, real arithmetic and finite domain constraints. Keywords: Cooperative Goal Solving, Constraints, Functional-Logic Programming, Lazy Narrowing.
1
Introduction
The scheme CF LP for Constraint Functional Logic Programming, recently proposed in [11], continues a long history of attempts to combine the expressive power of functional and logic programming with the improvements in performance provided by domain specific constraint solvers. As the well-known CLP scheme [9], CF LP has many possible instances CF LP (D) corresponding to different specific constraint domains D given as parameters. In spite of the generality of the approach, the use of one fixed domain D is an important limitation, since many practical problems involve more than one domain. 1 2
Author partially supported by projects TIN2005-09207-C03-03 and S-0505/TIC0407. Author partially supported by projects TIN2004-7943-C04-01 and TIN2005-08818-C04-01.
This paper is electronically published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs
´vez, Ferna ´ ndez, Hortala ´ , R. Artalejo, del Vado Este
A solution to this practical problem in the CLP context can be found in the concept of solver cooperation [5], an issue that is raising an increasing interest in the constraint community. In general, solver cooperation aims at overcoming two problems: a lack of declarativity of the solutions (i.e., the interaction among solvers makes it easier to express compound problems) and a poor performance of the systems (i.e., the communication among solvers can improve the efficiency of the solving process). This paper presents a proposal for coordinated programming in the CF LP scheme as described in [11]. We introduce coordination domains as amalgamated sums of the various domains to be coordinated, along with a mediatorial domain which supplies special communication constraints, called bridges, used to impose equivalences among values of different base types. Building upon previous works [2,10,15], we also describe a coordinated goal solving calculus which combines lazy narrowing with the invocation of the cooperating solvers and two kinds of communication operations, namely the creation of bridges and the projection of constraints between different constraint stores. Projection operations are guided by existing bridges. Using the declarative semantics of CF LP , we have proved a semantic result called full soundness, ensuring soundness and local completeness of the goal solving calculus. In order to place our proposal for solver cooperation in context, we briefly discuss main differences and similarities with a limited selection of related proposals existing in the literature. E. Monfroy [14] proposed the system BALI (Binding Architecture for Solver Integration) that facilitates the specification of solver cooperation as well as integration of heterogeneous solvers via a number of cooperations primitives. Monfroy’s approach assumes that all the solvers work over a common store, while our present proposal requires communication among different stores. Also, Mircea Marin [12] developed a CF LP scheme that combines Monfroy’s approach to solver cooperation with a higher-order lazy narrowing calculus somewhat similar to [10,15] and the goal solving calculus presented in this paper. In contrast to our proposal, Marin’s approach allows for higher-order unification, which leads both to greater expressivity and to less efficient implementations. Moreover, the instance of CF LP implemented by Marin and others [13] combines four solvers over a constraint domain for algebraic symbolic computation, while the instance we are currently implementing deals with the cooperation among Herbrand, finite domain and real arithmetic constraints. Recently, P. Hofstedt [7,8] proposed a general approach for the combination of various constraint systems and declarative languages into an integrated system of cooperating solvers. In Hofstedt’s proposal, the goal solving procedure of a declarative language is viewed also as a solver, and cooperation of solvers is achieved by two mechanisms: constraint propagation, that submits a constraint belonging to some domain D to its constraint store, say SD ; and projection of constraint stores, that consults the contents of a given store SD and deduces constraints for another domain. Projection, as used in this paper, differs from Hofstedt’s projection in the creation and use of bridges; since Hofstedt’s propagation corresponds to our goal solving rules for placing constraints in stores and invoking constraint solvers. 
Hofstedt also proposes the construction of combined computation domains, similar to our coordination domains. The lack of bridges in Hofstedt’s approach corresponds to the lack of mediatorial domains 212
´vez, Ferna ´ ndez, Hortala ´ , R. Artalejo, del Vado Este
within her combined domains. In different places along the paper we will include comparisons to Hofstedt’s approach; see especially Table 5 in Section 5. The structure of the paper is as follows: Section 2 introduces the basic notions of constraint domains and solvers underlying the CF LP scheme. Section 3 describes the constructions needed for coordination in our setting, namely coordination domains, bridges and projections. Programs, goals, the lazy narrowing calculus for cooperative goal solving (with a typical example), and the full soundness result are described in Section 4. Section 5 summarizes conclusions and future work.
2
Constraint Domains and Solvers in the CF LP Scheme
In this section, we recall the essentials of the CF LP (D) scheme [11], which serves as a logical and semantic framework for lazy Constraint Functional Logic Programming (briefly CF LP ) over a parametrically given constraint domain D. The proper choice of D for modeling the coordination of several constraint domains will be discussed in Section 3. As a main novelty w.r.t. [11], the current presentation of CF LP (D) includes now an explicit treatment of a Milner-like polymorphic type system in the line of previous work in Functional Logic Programming [4]. 2.1
Signatures and Constraint Domains
S We assume a universal signature Σ = hT C, DC, DF i, where T C = n∈N T C n , S S DC = n∈N DC n and DF = n∈N DF n are families of countably infinite and mutually disjoint sets of type constructor, data constructor and defined function symbols, respectively. We also assume a countable set TVar of type variables. Types τ ∈ T ypeΣ have the syntax τ ::= α | C τ1 . . . τn | (τ1 , . . . , τn ) | τ → τ 0 , where α ∈ TVar and C ∈ T C n . By convention, C τ n abbreviates C τ1 . . . τn , “→” associates to the right, τ n → τ abbreviates τ1 → · · · → τn → τ , and the set of type variables occurring in τ is written TVar(τ ). A type τ is called monomorphic iff TVar(τ ) = ∅, and polymorphic otherwise. Types C τ n , (τ1 , . . . , τn ) and τ → τ 0 are used to represent constructed values, tuples and functions, respectively. A type without any occurrence of “→” is called a datatype. Each n-ary c ∈ DC n comes with a principal type declaration c :: τ n → C αk , where n, k ≥ 0, α1 , . . . , αk are pairwise different, τi are datatypes, and TVar(τi ) ⊆ {α1 ,. . . , αk } for all 1 ≤ i ≤ n. Also, each n-ary f ∈ DF n comes with a principal type declaration f :: τ n → τ , where τi , τ are arbitrary types. For the sake of semantic considerations, we assume a special data constructor (⊥ :: α) ∈ DC 0 , intended to represent an undefined data value that belongs to every type. 3 Intuitively, a constraint domain provides specific data elements, along with certain primitive functions operating upon them. Following this idea, and extending the formal approach of [11] with a type system, we consider domain specific signatures Γ=hBT, P F i disjoint from Σ, where BT is a family of base types (such as int for integer numbers or real for real numbers) and P F is a family of primitive function symbols, each one with an associated principal type declaration p :: 3
In concrete programming languages such as T OY [1] and Curry [6], data constructors and their principal types are introduced by datatype declarations, the principal types of defined functions can be either declared or inferred, and ⊥ does not textually occur in programs.
213
´vez, Ferna ´ ndez, Hortala ´ , R. Artalejo, del Vado Este
τ1 →. . .→τn →τ (shortly, p :: τ n →τ ), where τ1 , . . ., τn and τ are datatypes. The number n is called arity of p, and the set of n-ary symbols in P F is noted as P F n . A constraint domain over a specific signature Γ (in short, Γ-domain) is a structure D=h{UdD }d∈BT , {pD }p∈P F i, where each d ∈ BT is interpreted as a non-empty D or R=U D ; and interpretations set UdD of base elements of type d, as e.g. Z=Uint real D p of primitive function symbols behave as explained in Subsection 2.3 below. 2.2
Extended Types, Expressions, Patterns and Substitutions over a Domain D
Given a Γ-domain D, extended types τ ∈ T ypeΣ,Γ over Γ have the syntax τ ::= α | d | C τ1 . . . τn | τ → τ 0 | (τ1 , . . . , τn ) where d ∈ BTΓ . Obviously, T ypeΣ ⊆ T ypeΣ,Γ . Given a countable infinite set Var of data variables disjoint from TVar, Σ and Γ, expressions over D e ∈ ExpD , have the syntax e ::= X | u | h | (e e1 ), where X ∈ Var, S u ∈ U D =def d∈BTΓ UdD , and h ∈ DCΣ ∪ DFΣ ∪ P FΓ . Note that (e e1 ) - not to be confused with the pair (e, e1 ) - stands for the application operation which applies the function denoted by e to the argument denoted by e1 . Following usual conventions, we assume that application associates to the left, and we abbreviate (e e1 . . . en ) as (e en ). Expressions without repeated variable occurrences are called linear, variablefree expressions are called ground and expressions without any occurrence of ⊥ are called total. Patterns over D are special expressions t ∈ P atD whose syntax is defined as t::=X | u | (c tm ) | (f tm ) | (p tm ), where X∈Var, u∈U D , c∈DCΣn with m≤n, f ∈DFΣn with m bool yearHours [] M R = true yearHours [T|Ts] M R = true (M-R), workerHours T #< (M+R), yearHours Ts M R Durations for each time slot are represented by the function duration, instead of the array timeSlotDuration[timeSlots] indexed by constraint variables as used 233
´lez-del-Campo and Sa ´enz-Pe ´rez Gonza
in OPL. In addition, we implement two versions of this function for comparing its readability and performance in order to analyse the trade-off between such factors. The first implementation is shown below and uses arithmetical constraint operators: duration:: int -> int duration T = m0_0 T #+ m1_24 T #+ m2_14 T #+ m3_14 T#+ m4_17 T #+ m5_13 T #+ m6_6 T #+ m7_6 T #+ m8_0 T We show a case of the functions involved in duration, which are intended to compute the duration of a given time slot: m4_17:: int -> int m4_17 T = 17#*T#*(T #- 1)#*(T #- 2)#*(T #3)#*(5#- T)#* (6 #- T)#*(7 #- T)#*(8 #- T)#/576 The second implementation involves two non-existing propositional constraint operators in T OY version 2.1.0, namely implication (#=>) and disjunction (#\/), so that we have implemented them into the system. duration:: int -> int duration T = D (D #= 24)) #\/ ((T #= 2) #=> (D#= 14))) #\/ (((T #= 3) #=> (D #= 14)) #\/ ((T #= 4) #=> (D #= 17)))) #\/ ((((T #= 5) #=> (D #= 13)) #\/ ((T #= 6) #=> (D #= 6))) #\/ (((T #= 7) #=> (D #= 6)) #\/ ((T #= 8) #=> (D #= 0))))) #\/ ((T #= 0) #=> (D #= 0)) The generation of the seed in T OY is similar to OPL, but we make the elements of the list [V|Vs] to be assigned to suitable values. The list [X|Xs] of lists of finite domains variables is assigned to the seed list. In the following code fragment, remove V List removes V from its second argument (List). fromXuntilY V W generates a list with all values between V and W. The reflection function fd min V returns the minimum value in the domain of V, whereas fd max V returns the maximum. rest V W removes the value W from the domain of the decision variable V. generate list X V generates a list of values including the value V and all the values in the domain variable X, assumed that maybe V is not a feasible assignment for X. The first element of the generated list is the value of the seed for X. try V [W|Ws] tries, by backtracking, to label the decision variable V with every value W of its second argument. my search [X|Xs] [V|Vs] tries to assign each value V in the list, which is in its first argument, to each corresponding decision variable X, which is in its second argument. ++ is the list concatenation operator. rest :: int -> int -> [int] rest X V = remove V (fromXuntilY (fd_minX) (fd_max X)) 234
´lez-del-Campo and Sa ´enz-Pe ´rez Gonza
generate_list :: int -> int -> [int] generate_list X V = [V] ++ rest X V try :: int -> [int] -> bool try X [V|Vs] = true bool my_search [] [] = true my_search [X|Xs] [V|Vs] = true let x free in elem x [1 ,2 ,3] =:= True > Suspended
If == had a flexible implementation we would get only one result, because of the absence of negative answers: > > > >
let x free in elem x [1 ,2 ,3] =:= True success { x 7→ 1} Try more ( y / n ) ? y no solutions
The intended behavior would be more on the line of: > > > > > > >
let x free in elem x [1 ,2 ,3] =:= True success { x 7→ 1} Try more ( y / n ) ? y success { x 7→ 2} Try more ( y / n ) ? y success { x 7→ 3} no solutions
In order to overcome this and trying to reach a more orthogonal operator set we are proposing a new set that can be seen in table 2. With our proposal we have: •
Two rigid operators on Bool. These operators already exist in Curry although we have renamed them.
•
The flexible version of the two previous operators. Those operators are new.
•
Finally, the operators returning Success. The disequality operator is new in Curry and is the key element in this work. 244
˜o, Rey Gallego, Marin
op. name
type
flexible?
notes
==
a → a → Bool
yes
/=
a → a → Bool
yes
defined as not.(==)
=:=
a → a → Success
yes
no changes
=/=
a → a → Success
yes
the new operator
===
a → a → Bool
no
old rigid equality
/==
a → a → Bool
no
the negation of previous one
Table 2 Our proposal for equality operators in Curry.
Computation step for a single expression: Eval[[ei ]] ⇒ D Eval[[e1 &e2 ]] ⇒ replace(e1 &e2 , i, D)
i ∈ {1, 2}
Eval[[ei ]] ⇒ D Eval[[c(e1 , . . . , en )]] ⇒ replace(c(e1 , . . . , en ), i, D) Eval[[f (e1 , . . . , en )]]T ⇒ D Eval[[f (e1 , . . . , en )]] ⇒ D
i ∈ {1, . . . , n}
if T is a definitional tree for f with fresh variables
Computation step for an operation-rooted expression e: Eval[[e]]rule(l=r) ⇒ {; id[]σ(r)}
if σ is a substitution with σ(l) = e
Eval[[e]]T1 ⇒ D1 Eval[[e]]T2 ⇒ D2 Eval[[e]]or (T1 , T2 ) ⇒ D1 ∪ D2 Eval[[e]]branch(π, p, r, T1 , . . . , Tk ) ⇒ 8 if e|p = c(e1 , . . . , en ), pat(Ti )|p = c(x1 , . . . , xn ) and Eval[[e]]Ti ⇒ D > >D < ∅ if e|p = c(. . . ) and pat(Ti ) 6= c(. . . ), i = 1, . . . , k Sk {; σi []σi (e)} if e|p = x, r = flex , and σi = {x 7→ pat(Ti )|p } > i=1 > : replace(e, p, D) if e|p = f (e1 , . . . , en ) and Eval[[e|p ]] ⇒ D Derivation step for a disjunctive expression: Eval[[e]] ⇒ {γ1 ; σ1 []e1 , . . . , γn ; σn []en } {γ; σ[]e} ∪ D ⇒ clean({γ1 ∧ σ1 (γ); σ1 ◦ σ[]e1 , . . . , γn ∧ σn (γ); σn ◦ σ[] en }) ∪ D
Fig. 1: Operational semantics of Curry
3
Operational semantics
In this section we present an operational semantics that allows computing with disequality constraints. The presentation tries to be a minimal extension to that present in the Curry draft. This way, changes can be easily located. Essentially, there are two changes. First, the mechanism for accumulating answers is enhanced to include constraints in solved form in addition to answer substitutions. This extension is generic, i.e. largely independent of the constraint system in use. Secondly, specific rules for simplifying disequality constraints are included. The main execution cycle is described by the rules in Fig. 1, that introduces the derivability relation D1 ⇒ D2 on pairs of disjunctive expressions. 245
˜o, Rey Gallego, Marin
Eval[[ei ]] ⇒ D Eval[[e1 =:=e2 ]] ⇒ replace(e1 =:=e2 , i, D)
if ei = f (t1 , . . . , tn ), i ∈ {1, 2}
Eval[[c(e1 , . . . , en )=:=c(e01 , . . . , e0n )]] ⇒ {; id[]e1 =:=e01 & . . . &en =:=e0n } Eval[[c(e1 , . . . , en )=:=d(e01 , . . . , e0m )]] ⇒ ∅ Eval[[x=:=e]] ⇒ D Eval[[e=:=x]] ⇒ D
if c 6= d or n 6= m
if e is not a variable
Eval[[x=:=y]] ⇒ {; {x 7→ y}[]success}
Eval[[x=:=c(e1 , . . . , en )]] ⇒ {; σ[]y1 =:=σ(e1 )& . . . &yn =:=σ(en )}
Eval[[x=:=c(e1 , . . . , en )]] ⇒ ∅
if x ∈ / cv(e1 , . . . , en ), σ = {x 7→ c(y1 , . . . , yn )}, y1 , . . . , yn fresh variables
if x ∈ cv(c(e1 , . . . , en ))
Fig. 2: Solving equational constraints Disjunctive expressions represent (fragments of) the fringe of a search tree. Formally, they are multisets of answer expressions of the form γ; σ[]e, where γ is a constraint, 8 σ a substitution and e a Curry expression. An answer expression γ; σ[]e is solved when e is a data term and γ is solved and consistent. We will use to denote a trivial constraint. The computation of an expression e suspends if there is no D such that Eval[[e]] ⇒ D. A constraint expression is solvable if it can be reduced to success. As can be seen in Fig. 1, reduction of terms rooted by user-defined function symbols is guided by an overloaded version of Eval[[]] that takes a Curry expression and a definitional tree as arguments. For details on these, and other aspects of the semantics which are largely orthogonal to the questions discussed here – conditional rules, higher-order features, freezing, etc – the reader is referred to [3]. The disjunctive behavior of disjunctive expressions is partly captured by the last rule in Fig. 1, which expresses how answers are accumulated. Observe that the combination of answers – both substitutions and new constraints – with the accumulated constraint might introduce inconsistency or perhaps constraints not in solved form. This is why a call to the auxiliary function clean is needed. Its definition depends on the actual constraint system and will be presented later for the disequality case. The other half of this disjunctive behavior is captured by the auxiliary function replace, that inserts a disjunctive expression into a position in a term, giving another disjunctive expression as result: replace(e, p, {γ1 ; σ1 []e1 , . . . , γn ; σn []en }) = {γ1 ; σ1 []σ1 (e)[e1 ]p , . . . , γn ; σn []σn (e)[en ]p } Figure 2 shows the rules for solving equational constraints. These are practically 8 Here, the word constraint refers to formal constraints i.e. internal representations of constraint formulae, opposed to Curry constraint expressions.
246
˜o, Rey Gallego, Marin
Eval[[ei ]] ⇒ D Eval[[e1 =/=e2 ]] ⇒ replace(e1 =/=e2 , i, D)
if ei = f (t1 , . . . , tn ), i ∈ {1, 2}
Eval[[c(e1 , . . . , en )=/=c(e01 , . . . , e0n )]] ⇒ {; id[]e1 =/=e01 , . . . , ; id[]en =/=e0n } Eval[[c(e1 , . . . , en )=/=d(e01 , . . . , e0m )]] ⇒ success Eval[[x=/=e]] ⇒ D Eval[[e=/=x]] ⇒ D
if c 6= d or n 6= m
if e is not a variable
Eval[[x=/=y]] ⇒ {x 6= y; id[]success}
if range(x) is infinite
Eval[[x=/=y]] ⇒ {x 6= y ∧ x ∈ range(x) ∧ y ∈ range(y); id[]success}
if range(x) is finite
Eval[[x=/=cj (e1 , . . . , en )]] ⇒ {; σ1 []success, . . . , ; σj []σj (x)=/=cj (e1 , . . . , en ), . . . , ; σk []success} if x ∈ / cv(e1 , . . . , en ), σi = {x 7→ ci (yi1 , . . . , yi
Eval[[x=/=c(e1 , . . . , en )]] ⇒ success
ar(ci ) )},
yuv fresh variables
if x ∈ cv(c(e1 , . . . , en ))
Fig. 3: Solving disequality constraints identical to those in the Curry draft and are included here mainly to reveal the symmetries and dualities w.r.t. the rules for solving disequality constraints, shown in Fig. 3. Observe that the auxiliary function cv , such that cv (e) collects the variables in e not inside a function call is used to implement an occurs check. Although the theory of disequality constraints is well established, the actual choice of solved forms may vary with regard to implementation considerations. Following [5], we have chosen to avoid explicit disjunctive constraints and instead we carry those alternatives over the search tree of the Curry semantics, i.e. disjunctive constraints are distributed over disjunctive expressions. This way, solved disequality constraints – those appearing in the left hand of answer expressions – amount to conjunctions of disequations between distinct variables, in the case where those variables range over infinite sets of values. When the variables can only take a finite set of values, solved forms are extended with the corresponding constraints for domain consistency. As we have said before, accumulating answers may corrupt the constraint store either by making it inconsistent or not solved. The task of tidying everything up – perhaps moving part of the constraint information back to the expression store – is the responsibility of function clean: clean(∅)
=∅
clean({γ; σ[]e} ∪ D)
= clean(D)
if γ inconsistent
clean({e1 6= e2 ∧ γ; σ[]e} ∪ D) = clean({γ; σ[]e1 =/=e2 &>e} ∪ D) if e1 or e2 nonvars clean({γ; σ[]e} ∪ D)
= {γ; σ[]e} ∪ clean(D) 247
otherwise
˜o, Rey Gallego, Marin
4
Implementation details
We will consider two cases: Implementation for infinite types. In types with an infinite number of instances handling disequality is easier as we can guarantee that a disequality between an instance – likely partial – of such a type and a new free variable is always satisfiable. Support for constraints among variables of infinite types is already present in T OY and in the M¨ unster Curry Compiler. Extending our implementation to correctly handle finite types. In finite types, testing consistency of constraints is harder, as there are disequality chains where one runs out of possible instances for variables. The implementation has been done using Ciao Prolog and its attributed variables library. Regarding Sloth, it is a new library which plugs into the current module system supplying the =/= operator. 4.1
Implementation for infinite types
The basic technique used is to attach to each disequality constrained variable an attribute, which contains the set of disallowed instantiations for that variable: DiseqAttribute = ’ $de_store ’( Var , List ) )
where List containts all the terms that Var should be different from. The system can just assume that each constrained variable must have its correspoding attribute, so the implementation should hook into the compiler in two ways: •
The disequality operator itself.
•
Unification of constrained variables, both with terms or with other constrained variables.
which nicely maps to the semantics of attributed variables. Attaching attributes Disequality constraints are added when the execution path tries to reduce an expression whose head is the =/= operator to HNF. The implementation of this operator is fully native – we mean fully written in Prolog – using the standard hooks present in Sloth for external libraries. The first action to be performed by the operator is to evaluate its arguments to HNF, then select the applicable case: both arguments are variables, only one is a variable or neither are. In addition, we will use a predicate for adding constraints to the variables’ store: add_to_store ( Var , L ) : ( get_attribute ( Var , ’ $de_store ’( Var , List ) ) → append ( List , L , LNew ) , upd ate _att ribu te ( Var , ’ $de_store ’( Var , LNew ) ) ; att ach _att ribu te ( Var , ’ $de_store ’( Var , L ) ) ).
248
˜o, Rey Gallego, Marin
Then, given the structure of our store, the case when both arguments are variables becomes trivial: diseq_vars (A , B ) : add_to_store (A , [ B ]) , add_to_store (B , [ A ]) .
as is the one for a variable and an instantiated term: % % Term is in HNF . diseq_one_var ( Var , Term ) : add_to_store ( Var , [ Term ]) .
The final case is when both arguments are not variables, so we need to check their parameters, if they have any: diseq_spine ( Term1 , Term2 ) : Term1 =.. [ C1 | L1 ] , Term2 =.. [ C2 | L2 ] , ( \+ C1 = C2 → true ; C1 = C2 , diseq_or ( L1 , L2 ) ). diseq_or ([ A | AL ] ,[ B | BL ]) : ( diseq (A , B ) ; diseq_or ( AL , BL ) ).
In the code above we profit from our representation of Curry data constructors as Prolog terms.
Unification hooks Unification of constrained variables has two different cases: •
The variable is being unified with a term instantiated to at least HNF form.
•
The variable is being unified with another constrained variable.
Ciao Prolog provides the multifile predicates verify_attribute/2 for the first item and combine_attributes/2 for the second. Unification with a term Unification with a term just checks that the set of accumulated disequality constraints in our constraint store holds, and then proceeds to unify the var with the term: ve rify _att rib ute ( ’ $de_store ’( Var , List ) , Term ) : diseq_and ( Term , List ) , det ach _att ribu te ( Var ) , Var = Term .
The instantiation of Var will also instantiate all copies of the variable present in other constraint stores. This non-trivial detail is possible thanks to Ciao Prolog implementation of attributed variables, which allows us to store the real variables in the attributes. diseq_and will just verify – by calling the main diseq predicate – that all the elements in the list are different from the term to unify with. This has a very important effect, as it will create the new needed disequality constraints in the case Term would be partially instantiated, or our constraint store contained partially instantiated constraints. 249
˜o, Rey Gallego, Marin
Unification between variables When dealing with disequality between two already constrained variables, our new constraint store will be the union of their respective constraint stores, if there doesn’t exist a previous disequality between the unifying variables: c o m b i n e _a t t r i b u t e s ( ’ $de_store ’( V1 , L1 ) , ’ $de_store ’( V2 , L2 ) ) : \+ contains_ro ( V1 , L2 ) , % doesn ’t i n s t a n t i a t e vars union ( L1 , L2 , NewL ) , det ach _att ribu te ( V1 ) , d e t a c h _ a t t r i b u t e ( V2 ) , V1 = V2 , att ach _att ribu te ( V1 , ’ $de_store ’( V1 , NewL ) .
It should be noted that like in the previous case, the union/3 predicate will unify the non-instantiated terms present in the constraint stores. This is precisely what we are looking for, as any constraint attached to such terms will be combined. The behavior of backtracking is solved as Ciao Prolog fully supports backtracking attributed variables, so it is not an issue, indeed it greatly helps our implementation, as when unification of a constrained variable needs to be undone, all the operations will be rolled back, including reattaching the previous attributes.
4.2
Implementation for finite types
As mentioned in the operational semantics, when the terms to be constrained do belong to a finite data type, the number of available instantiations for a variable is bound. This way, when dealing with the disequality case we must be proactive checking the constraint store consistency. The following example > let a ,b , c free in a =/= b & b =/= c & c =/= a & a =:= True
would give a wrong answer under the previous implementation, given that it is assumed we have an infinite number of instantiations for variables, but in this case is not true, as the variables a, b, c can be only be instantiated to two different values, thus making the above constraint unsatisfiable. At the implementation level, our view is to handle this situation as a FD problem, getting an important leverage from mature FD solvers. So our constraint store is extended with a FD variable: DiseqAttribute = ’ $de_store ’( Var , Fd , List ) )
being Fd the FD variable associated to Var. For this scheme to work, we assume the existence of the following two predicates for obtaining type meta-information: type ( Term , Type , Range )
which returns the type and the range of any arbitrary translated Curry term, including variables. index ( Term , IndexList )
which returns the index i of Term, i ⊆ range(type(T erm)) as a list. How this metainformation is obtained will be discussed in section 4.5. 250
˜o, Rey Gallego, Marin
The modified constraint solver We will detail here only the parts of the solver affected by this extension, starting with the store handling operation: add_to_store ( Var , Fd , L ) : ( get_attribute ( Var , ’ $de_store ’( Var , FdOld , List ) ) → append ( List , L , LNew ) , Fd = FdOld , consistent ( Fd ) , % Needed only in some FD solvers upd ate _att ribu te ( Var , ’ $de_store ’( Var , Fd , LNew ) ) ; att ach _att ribu te ( Var , ’ $de_store ’( Var , Fd , L ) ) ).
For the two variables case, we have to correctly constrain their associated FD variables: diseq_vars (A , B ) : ( type (A , flat , Range ) → get_fd_var (A , FdA ) , % Will return a fresh var if A has no FD var get_fd_var (B , FdB ) , FdA in 1.. Range , FdB in 1.. Range , FdA . < >. FdB ; true % A is non - flat ), add_to_store (A , FdA , [ B ]) , add_to_store (B , FdB , [ A ]) .
Constraining a variable to be different from a term is achieved in this case by constraining its associated FD var: % % Term is in HNF . diseq_one_var ( Var , Term ) : ( type ( Term , flat , _ ) → get_fd_var (A , FdA ) , index ( Term , TIndex ) , c o n s t r ai n t _ f d _ l i s t ( FdA , TIndex ) ; true ), add_to_store ( Var , FdA , Term ) .
where constraint_fd_list(Fd, List) forces Fd to be different from all the elements in List. When both arguments are instantiated, we don’t need to change the behavior of the solver.The unification is sightly modified to care about the new FD variables in the store: Unification with a term The new case introduced by the flat terms is taken care of by the remap_term/2 predicate, which checks that the term is in the variable domain and maps any residual constraint into the possible subterms. ve rify _att rib ute ( ’ $de_store ’( Var , Fd , List ) , Term ) : ( type ( Term , flat , Range ) → remap_term ( Term , Fd ) ; diseq_and ( Term , List ) , ), det ach _att ribu te ( Var ) , Var = Term .
Unification between variables The new added case is very simple, we only have to constraint our FD variables to be equal. c o m b i n e _a t t r i b u t e s ( ’ $de_store ’( V1 , Fd1 , L1 ) , ’ $de_store ’( V2 , Fd2 ,
251
˜o, Rey Gallego, Marin L2 ) ) : (
type ( Term , flat , Range ) → Fd1 .=. Fd2
; \+ contains_ro ( V1 , L2 ) , union ( L1 , L2 , NewL )
% doesn ’t i n s t a n t i a t e vars
), upd ate _att ribu te ( V1 , ’ $de_store ’( V1 , NewL ) .
4.3
The FD solver
As the reader can see, the use of the FD solving library is fairly limited, as we only use the disequality (..) predicate. This means that maybe using a full-featured FD solver is overkill for this application, but our hope is to profit from advanced arc-consistency algorithm found in this kind of constraint libraries so a considerable speedup can happen. 4.4
A combined approach example
To illustrate some of the gains from this mixed approach, we will showcase an example using both strategies. Let’s use the query: > let x free in [ x ] =/= [ True ] & [ x ] =/= [ False ]
Then our execution trace will look like: > > > >
( x :[]) =/= ( True :[]) & . . . { spine case } success & ( x :[]) =/= ( False :[]) { x =/= True } ( x :[]) =/= ( False :[]) { x =/= True ∧ x =/= False } fail { inconsistent FD store }
The execution first uses the normal method for non-flat types – in this case the list type – but when it finds a flat variable can use that information to always return correct answers. 4.5
Type meta-data handling
An obvious problem of this approach is that it needs knowledge about term and variable types at runtime. The first step is to perform static analysis on the types so we can determine which types are flat and what others not. Definition 4.1 [Type Graph] A type graph G for a type T is informally defined as the directed graph with types as nodes and edges from T 1 to T 2 when T 1 contains T 2. Type variables have the special type P ol. Definition 4.2 [Finite types] A type T is finite when its associated graph G has no cycles and no leaf node is in P ol. Then we define range of T as Πt∈N |T C(t)|. Definition 4.3 [Polymorphic types] A type T is polymorphic when its associated graph G has a leaf with in P ol and has no cycles. We call the set of all leaves in P ol P Level(T ). Definition 4.4 [Infinite types] A type T is infinite when its associated graph G has one or more cycles. Once this analysis is performed, we attach to each term e a finiteness annotation which will be: 252
˜o, Rey Gallego, Marin •
fixed (r) if type(e) is finite, where r = range(type(v)).
•
poly(P ) if type(e) is polymorphic, where P = P Level(T ).
•
infinite if type(e) is infinite.
With this information attached, we can easily determine the information needed for the predicate type/3, which is emitted at compile time. The last step is to obtain the information given by the index/2 predicate. The naive approach currently used is to instantiate all possible terms and assign to each one a numerical index. However, it should be noted that more sensible approaches do exist. Meta-data implementation in Sloth There are different approaches to try when implementing this kind of runtime typing. In Sloth, we have opted for the simplest one, that is to convert every term in a tuple (t, a), where t was the original term and a is its finiteness annotation. This approach allows to propagate finiteness annotations in the same moment terms are unified, so it comes handy avoiding function specialization for flat types. We are undertaking a complete redesign of this area, obviously to lower the overhead caused for the double unification that we currently use. Given the fact that terms instantiated to some degree already carry their type information 9 , the only difficult case would be the variables’ one, and using attributed variables to solve this problem seems sensible.
5
Experimental results
We present some preliminary experimental results just as a sample of what our system can do. We also have written more tests for the disequality implementation, which can be found in the compiler distribution. The chosen problem is the coloring of a square map, where each cell has 4 neighbors and a cell cannot have the same color as any of them. The corresponding Curry program can be seen in figure 4. The disequality operator to use has been abstracted as an argument, so we can use the same code for all the tests. The first test has been performed using the old == operator and narrowing, which implies that we have to instantiate the map first, the second one allowing residuation whereas the third one uses the new disequality constraint operator. Our motivation for benchmarking is not proving big performance enhancements, but to check that there is no major slowdown going on with the new disequality system. This seems to be true and indeed the disequality library used here lacks fine tuning and we are also planning to use a much more improved FD solver. Anyways, comparing a system with disequality to a equality-only one is not easy, as the main advantage that disequality supports brings us is a far more expressive language. The results for a random map which has a proportion of 90% free variables among its elements are shown in table 3. As a matter of reference, we present the 9
This is a property of our name handling in the compiler
253
˜o, Rey Gallego, Marin
data Color = Red | Green | Yellow | Blue diff x y color Red color Blue
= ( x == y ) =:= False diff_c x y = x =/= y = success color Yellow = success color Green = success
= success
foldr_c l = foldr (&) success l coloring :: ( Color → Color → Success ) → [[ Color ]] → Success coloring _ [] = success coloring f ( l : ls ) = c_lists f l ls & map_c_list f ( l : ls ) c_lists :: ( Color → Color → Success ) → [ Color ] → [[ Color ]] → Success c_lists _ _ [] = success c_lists f l ( x : xs ) = foldr_c ( map ( uncurry f ) ( zip x l ) ) & c_lists f x xs map_c_list :: ( Color → Color → Success ) → [[ Color ]] → Success map_c_list f l = foldr_c ( map ( c_list f ) l ) c_list :: ( Color → Color → Success ) → [ Color ] → Success c_list f [x] = success c_list f ( x1 : x2 : xxs ) = f x1 x2 & c_list f ( x2 : xxs ) p_naive map = constraint map & coloring diff map p_reseq map = coloring diff map & constraint map p_diseq map = coloring diff_c map & constraint map constraint l = foldr_c ( map (\ ll → foldr_c ( map color ll ) ) l )
Fig. 4: The map coloring problem. System
Problem
timenormal
timeres
timediseq
Sloth
Coloring
> 100 sec.
20 ms.
24 ms.
T OY
Coloring
> 100 sec.
n.a.
20 ms.
Table 3 Benchmark results for the coloring problem with a 5x5 map
results for the same algorithm using T OY 10 , although direct comparison of two systems is meaningless, as they use different Prolog compilers, etc. . .
6
Conclusions and future work
We have extended Curry’s operational semantics to allow disequality constraints in a similar spirit to the one presented in[5]. A first implementation in the Sloth system has been described and is already available from our download site. So far, we believe that implementation results are promising, and we hope that practice with our prototype can help in improving the support for constraints in the Curry standard. There are some specific aspects of our implementation that probably need some polishing, namely the interaction with the underlying finite domain solver and the static analyses needed to detect finite types and for tagging all variables with their types. Another issue to consider in the near future, among others, is the interaction with the type classes extension. It would be desirable to restrict the types for which equality/disequality operators can be applied, analogously to the Eq standard type 10 The
code used for T OY can be found in the compiler tarball.
254
˜o, Rey Gallego, Marin
class in Haskell. At the time of writing this version (July 2006) type classes support for Curry lives in a different development branch, although a merge will happen shortly.
References [1] Puri Arenas-S´ anchez, Ana Gil-Luezas, and Francisco Javier L´ opez-Fraguas. Combining lazy narrowing with disequality constraints. In PLILP, pages 385–399, 1994. [2] Emilio Jes´ us Gallego Arias and Julio Mari˜ no. An overview of the Sloth2005 Curry System: System Description. In WCFLP ’05: Proceedings of the 2005 ACM SIGPLAN workshop on Curry and functional logic programming, pages 66–69, New York, NY, USA, 2005. ACM Press. [3] Michael Hanus, Sergio Antoy, Herbert Kuchen, Francisco J. L´ opez-Fraguas, Wolfgang Lux, Juan Jos´ e Moreno-Navarro, and Frank Steiner. Curry: An Integrated Functional Logic Language, 0.8.2 edition, April 2006. Editor: Michael Hanus. [4] M. Hermenegildo, F. Bueno, M. Garc´ıa de la Banda, and G. Puebla. The CIAO multi-dialect compiler and system: An experimentation workbench for future (C)LP systems. In Proceedings of the ILPS’95 Workshop on Visions for the Future of Logic Programming, Portland, Oregon, USA, december 1995. Available from http://www.clip.dia.fi.upm.es/. [5] Herbert Kuchen, Francisco Javier L´ opez-Fraguas, Juan Jos´ e Moreno-Navarro, and Mario Rodr´ıguezArtalejo. Implementing a lazy functional logic language with disequality constraints. In JICSLP, pages 207–221, 1992. [6] Julio Mari˜ no and Jos´ e Mar´ıa Rey. The implementation of Curry via its translation into Prolog. In Kuchen, editor, 7th Workshop on Functional and Logic Programming (WFLP98), number 63 in Working Papers. Westf¨ alische Wilhelms-Universit¨ at M¨ unster, 1998.
255