A Decidable Second-Order Semantic Matching in Isabelle

Hui Shi, Burkhart Wolff
FB 3 Mathematik-Informatik, Universität Bremen
{shi, [email protected]
Abstract
We present a complete matching algorithm and an efficient implementation in the theorem prover Isabelle for solving a class of second-order semantic matching problems, where the equational theory with respect to which the semantic matching is performed has a convergent pattern rewrite system. For a certain class of semantic matching problems this algorithm can even be used as a decision procedure. A number of interesting examples illustrate the application of our approach in the semantic matching of schematic rules in theorem proving and formula manipulation, where the schematic aspects can be presented by defined functions.
1 Introduction

The problem of matching and unification modulo a set of equations (i.e. semantic matching and unification) has been extensively studied over the last 10 years, and various algorithms have been discovered. In particular, techniques based on narrowing have been developed and proved complete [7, 14, 25] in solving unification problems associated with convergent rewrite systems [4, 16, 20]. Although semantic matching and semantic unification are a common and attractive field for decision procedures, they have rarely been used in program synthesis and program transformation or for large-scale theorem proving in systems like Isabelle [24] or HOL [9]. It is particularly natural to consider them in Isabelle, since Isabelle incorporates schematic variables (or meta variables). The congruence relations induced by equational theories can be interpreted as syntactic similarity that makes schematic rules applicable to a class of "similar" formulas; hence, they may be an alternative to a tactic in some cases and allow proofs for a schematic formula inside an object logic. We also found them very attractive in the field of program transformations [18].

As an example, let us consider an informal schematic rule, "take a sequence of statements containing a special one $a$, convert the statements via a function $F$, and replace $a$ with $b$":

$$D_1; \ldots; D_{i-1}; a; D_{i+1}; \ldots; D_n \iff F(D_1); \ldots; F(D_{i-1}); b; F(D_{i+1}); \ldots; F(D_n)$$

for arbitrary $i$ and $n$. Such a rule could be represented formally as the following rule in Isabelle:
This work is partially supported by the BMBF project UniForM.
    [[ ?DS @ [a] @ ?DS' ]] = [[ (map ?F ?DS) @ [b] @ (map ?F ?DS') ]],

where [[ ]] is some semantic interpretation function, @ and map are the usual concatenation and mapping functions on lists, and the question mark is a prefix for meta variables in the Isabelle syntax. Now, matching the left-hand side of the rule against some subgoal, say [c,a,d,a], means that a suitable substitution for ?DS and ?DS' has to be found such that the instance of the left-hand side is equal to the subgoal modulo the usual equations defining @. In fact, our approach is also capable of semantically matching the whole equation. In this case, we have to consider the definition of the higher-order function map and find a substitution for the higher-order variable ?F too. Functions like @ and map are considered as pattern generating functions and will be called matching combinators in this paper. The use of these functions provides a kind of control abstraction and facilitates the reuse of development rules and their verification.

It is well known that semantic unification or matching is complex and inefficient in general; there are often too many solutions even to simple unification problems, and a combination of equations usually requires a new unification algorithm. These obstacles hindered the incorporation of general equational unification procedures into program synthesis, transformation systems, and theorem proving systems. In this paper, we introduce a restricted class of functions which are defined by orthogonal, terminating pattern rewrite systems [20, 23]. Since second-order matching is decidable and Huet and Lang's [13] second-order matching algorithm has been well studied, we consider only the second-order case. Both @ and map belong to this class. For this class of functions, we present a simple and efficient matching algorithm to solve their semantic matching problems.
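To make the introductory matching problem concrete, the following minimal Python sketch enumerates the substitutions for ?DS and ?DS' that make ?DS @ [a] @ ?DS' equal to [c,a,d,a]. It is only an illustration of what a match modulo the equations for @ looks like, not the algorithm of this paper; the function name `match_around` and the use of Python lists for object-level lists are assumptions made for the example.

```python
def match_around(subject, pivot):
    """Return every (ds, ds_prime) with ds + [pivot] + ds_prime == subject."""
    return [(subject[:i], subject[i + 1:])
            for i, item in enumerate(subject) if item == pivot]


# The subgoal [c,a,d,a] matched against ?DS @ [a] @ ?DS'.
matches = match_around(["c", "a", "d", "a"], "a")
for ds, ds_prime in matches:
    print("?DS =", ds, "  ?DS' =", ds_prime)
```

Each occurrence of a in the subject yields one match, which is why the same rule can apply to a goal in more than one way.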
The main contributions of this paper can be summarised as follows:
- Identifying a class of decidable and practically useful semantic matching problems.
- Presenting a simple and efficient implementation, and integrating it into several tactics on top of the Isabelle core.
- Demonstrating a number of useful schematic rules through examples, which are covered by our approach.

The rest of this paper is organized as follows. Section 2 introduces the basic definitions and notations that will be used in this paper. Section 3 defines a class of pattern rewrite systems and presents a semantic matching algorithm as a set of transformation rules. An efficient implementation in Isabelle and some test results are given in Section 4. In Section 5 several applications of our method in theorem proving are demonstrated through examples. We conclude in Section 6.
2 Preliminaries

In this section we briefly review the relevant basic notations, terminology and results for the simply typed $\lambda$-calculus, pattern rewrite systems, and semantic unification and matching. For surveys of these areas, see [4, 16, 15, 11, 20].

We use $T_0$ to denote the set of base types and $T$ the set of types, and write $\overline{\tau_n} \to \tau$ for $(\tau_1 \to \cdots (\tau_n \to \tau))$, where $\tau$ is assumed to be in $T_0$. The order $O(\tau)$ of a type $\tau$ is defined as $O(\tau) = 1$ for $\tau \in T_0$, and $O(\overline{\tau_n} \to \tau) = \max\{O(\tau_i) \mid 1 \le i \le n\} + 1$ otherwise.
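The definition of the order of a type can be sketched directly in Python. The classes `Base` and `Fun` and the example types below are hypothetical names introduced only for this illustration; `Fun` models a type $\overline{\tau_n} \to \tau$ with at least one argument type and a base result type.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:
    """A base type from T_0."""
    name: str


@dataclass(frozen=True)
class Fun:
    """A function type with argument types `args` (n >= 1) and a base result type."""
    args: Tuple["Type", ...]
    result: Base


Type = Union[Base, Fun]


def order(t: Type) -> int:
    """Order of a type: 1 for base types, otherwise 1 + the maximal argument order."""
    if isinstance(t, Base):
        return 1
    return max(order(a) for a in t.args) + 1


nat, nat_list = Base("nat"), Base("list")
succ_type = Fun((nat,), nat)                     # nat -> nat
map_type = Fun((succ_type, nat_list), nat_list)  # (nat -> nat) -> list -> list
```

Under this definition a free variable such as the F in map F Xs has a type of order 2, so a term containing it is second-order, while the constant map itself has a type of order 3.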
For each $\tau \in T_0$, there are two denumerable sets of function symbols and variables, respectively, which are also called atoms. The set of terms of type $\tau \in T$ is defined by the usual construction of application and abstraction. The term $(\ldots(a\,s_1)\ldots s_n)$ may be written as $a\,s_1 \ldots s_n$ or $a\,\overline{s_n}$, and $\lambda x_1.\cdots\lambda x_k.t$ as $\lambda\overline{x_k}.t$. The set of all bound variables in a syntactic object $O$ is denoted by $BV(O)$ and that of all free variables by $FV(O)$. $t$ is a ground term if $FV(t) = \emptyset$. A term is said to be linear iff every free variable occurs at most once in it. We use $X$, $Y$, $Z$, $F$ and $G$ to denote free variables, and $x$, $y$, $z$ to denote bound variables. A term is said to be first-order (or second-order) if the types of its free variables are at most of order 1 (or 2, respectively).

A precise formalism for describing subterms is obtained through the notion of positions, which are the paths from the root to the subterm in Dewey decimal notation. We only briefly review the notation; details can be found in [4]. The positions in a term $t$ are denoted by $O(t)$. The letters $p$ and $q$ stand for positions. The subterm of $t$ at position $p$ is denoted by $t|_p$ and the replacement of the subterm of $t$ at position $p$ with $s$ is denoted by $t[s]_p$. If $p \in O(t)$, then $BV(t, p)$ denotes the set of all bound variables on the path from the root of $t$ to the position $p$.

Terms are only compared modulo $\alpha$-conversion. If $t$ is a $\beta$-normal form, then $t$ must be of the form $\lambda\overline{x_k}.a\,\overline{s_n}$ with $a$ an atom and each $s_i$ a $\beta$-normal form. The atom $a$ and the term $a\,\overline{s_n}$ are called the head and the body of $t$, respectively. A $\beta$-normal form $\lambda\overline{x_n}.a\,\overline{s_n}$ is long iff $a\,\overline{s_n}$ is of a base type and each $s_i$ is long. We only consider long $\beta$-normal forms in this paper. A term $t$ in $\beta$-normal form is called a (higher-order) pattern if every free occurrence of a variable $F$ is in a subterm $F\,\overline{u_n}$ of $t$ such that $\overline{u_n}$ is $\eta$-equivalent to a list of distinct bound variables.

A substitution is a mapping from variables to terms, and is written as $\{x_1 \mapsto s_1, \ldots, x_n \mapsto s_n\}$ or $\{\overline{x_n \mapsto s_n}\}$; its domain and range are $D(\theta) = \{x_1, \ldots, x_n\}$ and $I(\theta) = FV(s_1) \cup \cdots \cup FV(s_n)$, respectively. We use $\sigma$, $\theta$ and $\delta$ to denote substitutions. The composition of two substitutions $\sigma$ and $\theta$, denoted as $\sigma\theta$, is defined as $\sigma\theta(x) = \theta(\sigma(x))$. Given a set $W$ of variables, we say that two substitutions $\sigma$ and $\theta$ are equal over $W$, denoted as $\sigma =_W \theta$, iff $\sigma(x) = \theta(x)$ for all $x \in W$. We say $\sigma$ is more general than $\theta$ over $W$, denoted as $\sigma \le_W \theta$, iff there exists some $\delta$ such that $\sigma\delta =_W \theta$. Without loss of generality, we assume any substitution $\theta$ to be idempotent, i.e. $D(\theta) \cap I(\theta) = \emptyset$. A $\{\overline{x_k}\}$-lifter of a term $t$ away from $W$ is a substitution $\rho = \{F \mapsto (\pi F)\,\overline{x_k}\}$, where $\pi$ is a renaming such that $D(\pi) = FV(t)$, $I(\pi) \cap W = \emptyset$, and $\pi F$ has the type $\overline{\tau_k} \to \tau$ if each $x_i$ is of the type $\tau_i$ ($1 \le i \le k$) and $F$ of the type $\tau$.

A term $s$ unifies with a term $t$ in a theory $E$ if $E \models \theta s = \theta t$ for some substitution $\theta$. Semantic unification is the process of finding all such substitutions. If $E$ is an empty theory and $s$, $t$ are patterns, then there exists at most one most general unifier (short: mgu) $\theta$ such that $\theta s = \theta t$ [21]. Matching is a special case of unification, where $t$ is treated as a ground term.

A rewrite rule is an oriented pair $l \to r$ such that $l$ is not a free variable, $l$ and $r$ are of the same base type, and $FV(l) \supseteq FV(r)$. A pattern rewrite rule is a rewrite rule whose left-hand side is a pattern. A pattern rewrite system is a set of pattern rewrite rules. A rewrite rule is called left-linear iff its left-hand side is linear. A rewrite system is called left-linear iff all its rewrite rules are left-linear. A rewrite system $R$ is called regular iff $FV(l) = FV(r)$ for every rule $l \to r \in R$. A function symbol $f$ is said to be a defined function with respect to a rewrite system $R$ if there exists a rule in $R$ of the form
$f\,\overline{t_n} \to r$; otherwise $f$ is called a constructor. In the sequel, we use $f$, $g$ to denote defined functions, and $c$ to denote constructors. A rewrite system is called second-order iff each rule is a pair of second-order terms. In this paper, we are only interested in second-order pattern rewrite systems. Whenever we say "pattern rewrite system" in the following, we mean "second-order pattern rewrite system".

Let $l_1 \to r_1$ and $l_2 \to r_2$ be two rules in a pattern rewrite system. They are overlapping if there exists a position $p \in O(l_1)$ such that the head of $l_1|_p$ is not a variable free in $l_1$, and $\lambda\overline{x_k}.l_1|_p$ and $\lambda\overline{x_k}.\rho l_2$ are unifiable, where $\{\overline{x_k}\} = BV(l_1, p)$ and $\rho$ is a $\{\overline{x_k}\}$-lifter of $l_2$ away from $FV(l_1)$. Otherwise, they are non-overlapping. A pattern rewrite system $R$ is non-overlapping iff any two rules of $R$ are non-overlapping. A pattern rewrite system is orthogonal iff it is left-linear and non-overlapping. An orthogonal pattern rewrite system is confluent [23, 20].

A matching pair is an ordered pair of terms $s \stackrel{\triangleleft}{=} t$. A matching problem $\Gamma$ is a set of matching pairs. We are only interested in second-order matching problems, where all terms are second-order. We may write $\{\overline{s_n \stackrel{\triangleleft}{=} t_n}\}$ for the matching problem $\{s_1 \stackrel{\triangleleft}{=} t_1, \ldots, s_n \stackrel{\triangleleft}{=} t_n\}$. $FV(\Gamma) = FV(s_1) \cup \cdots \cup FV(s_n)$.
3 A Second-Order Semantic Matching Algorithm

In this section, we introduce a class of defined functions called matching combinators and present a complete matching algorithm to solve matching problems associated with such functions. We also use this algorithm as a decision procedure for a restricted class of matching combinators called acyclic matching combinators.
3.1 Definitions
Definition 1 (Matching combinators) Let $R$ be a regular, orthogonal, and terminating pattern rewrite system, and $f$ a function defined by $R$. $f$ is said to be a matching combinator (short: combinator) if for every $l \to r \in R$ the following statements hold:
- the rule is constructor-based, i.e., $l$ has the form $f\,\overline{t_n}$ such that each $t_i$ ($1 \le i \le n$) contains no defined functions;
- any other defined function in $r$ is a matching combinator.
The following rules define some useful matching combinators:

$0 + Y \to Y$, $\quad (s\,X) + Y \to s(X + Y)$
$\mathit{nil} \mathbin{@} Ys \to Ys$, $\quad (X :: Xs) \mathbin{@} Ys \to X :: (Xs \mathbin{@} Ys)$
$\mathit{map}\ F\ \mathit{nil} \to \mathit{nil}$, $\quad \mathit{map}\ F\ (X :: Xs) \to (F\,X) :: (\mathit{map}\ F\ Xs)$
$\mathit{foldr}\ F\ \mathit{nil}\ Z \to Z$, $\quad \mathit{foldr}\ F\ (X :: Xs)\ Z \to F\ X\ (\mathit{foldr}\ F\ Xs\ Z)$

Fay [7] and Hullot [14] showed that narrowing is a complete method for solving equations in the theory defined by a confluent and terminating first-order rewrite system. Prehofer's work [25] gives a framework for solving pattern rewrite systems by lifting the general first-order notion of narrowing to patterns without loss of completeness. Instead of using an existing narrowing strategy, such as basic narrowing or innermost narrowing, we introduce an alternative narrowing strategy: structural narrowing. It is particularly appropriate for solving matching problems associated with matching combinators and makes the implementation easy and efficient.
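Read as a functional program, each matching combinator above has one clause per rewrite rule, selected by the outermost constructor of one argument. The following Python sketch mirrors the rules for +, @, map and foldr under an assumed term encoding (Peano numerals as nested tuples, lists as nested cons cells); the encoding and the helper names `s`, `cons` and `from_list` are illustrative assumptions, not part of the paper.

```python
ZERO = "0"
NIL = "nil"


def s(n):
    """Successor constructor."""
    return ("s", n)


def cons(x, xs):
    """List constructor X :: Xs."""
    return ("::", x, xs)


def plus(m, n):
    # 0 + Y -> Y ;  (s X) + Y -> s (X + Y)
    return n if m == ZERO else s(plus(m[1], n))


def append(xs, ys):
    # nil @ Ys -> Ys ;  (X :: Xs) @ Ys -> X :: (Xs @ Ys)
    return ys if xs == NIL else cons(xs[1], append(xs[2], ys))


def map_(f, xs):
    # map F nil -> nil ;  map F (X :: Xs) -> (F X) :: (map F Xs)
    return NIL if xs == NIL else cons(f(xs[1]), map_(f, xs[2]))


def foldr(f, xs, z):
    # foldr F nil Z -> Z ;  foldr F (X :: Xs) Z -> F X (foldr F Xs Z)
    return z if xs == NIL else f(xs[1], foldr(f, xs[2], z))


def from_list(items):
    """Build a cons-list term from a Python list."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result


one, two = s(ZERO), s(s(ZERO))
three = foldr(plus, from_list([one, two]), ZERO)
```

Evaluation runs these rules from left to right on ground terms; semantic matching asks the converse question, namely which instances of the free variables make a term rewrite to a given ground term.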
Definition 2 (Structural narrowing) Let $t$ be a pattern $\lambda\overline{x_k}.s$, where the head of $s$ is not a free variable. $t$ is structurally narrowable into $t'$ with a pattern rewrite rule $l \to r$ and a substitution $\theta$ if $\rho$ is a $\{\overline{x_k}\}$-lifter of $l$ away from $FV(t)$, $\theta = mgu(t, \lambda\overline{x_k}.\rho l)$, and $t' = \theta(\lambda\overline{x_k}.\rho r)$.
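Since $l$ is a pattern and $\rho$ has the form $\{\overline{X_m \mapsto F_m\,\overline{x_k}}\}$, where $X_i$ ($1 \le i \le m$) are the free variables in $l$ and $F_i$ ($1 \le i \le m$) are free variables, $\lambda\overline{x_k}.\rho l$ is also a pattern. The most general unifier of $t$ and $\lambda\overline{x_k}.\rho l$ exists if they are unifiable. In fact, structural narrowing is a lazy narrowing strategy: narrowing always occurs at the outermost position of a term. We give some examples to show the idea.

$$\mathit{foldr}\ {+}\ (X_1 :: (X_2 :: \mathit{nil}))\ 0 \;\leadsto_{\mathit{foldr}\ F\ (X :: Xs)\ Z \,\to\, F\ X\ (\mathit{foldr}\ F\ Xs\ Z)}\; X_1 + (\mathit{foldr}\ {+}\ (X_2 :: \mathit{nil})\ 0).$$

The term $X_1 + (\mathit{foldr}\ {+}\ (X_2 :: \mathit{nil})\ 0)$ is structurally normalized.

$$\lambda x_1, x_2.\,\mathit{map}\ G\ (x_1 :: (x_2 :: \mathit{nil})) \;\leadsto_{\mathit{map}\ F\ (X :: Xs) \,\to\, (F\,X) :: (\mathit{map}\ F\ Xs)}\; \lambda x_1, x_2.\,(G\,x_1) :: (\mathit{map}\ G\ (x_2 :: \mathit{nil})),$$

where the $\{x_1, x_2\}$-lifter of $\mathit{map}\ F\ (X :: Xs)$ is $\rho = \{F \mapsto F'\,x_1\,x_2,\ X \mapsto X'\,x_1\,x_2,\ Xs \mapsto Xs'\,x_1\,x_2\}$, and the most general unifier is $\{F' \mapsto \lambda x_1, x_2.G,\ X' \mapsto \lambda x_1, x_2.x_1,\ Xs' \mapsto \lambda x_1, x_2.\,x_2 :: \mathit{nil}\}$.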
3.2 Matching Algorithm
In our semantic matching algorithm, Huet and Lang's second-order matching algorithm [13] is used to solve matching pairs containing no matching combinators; it always produces a finite complete set of minimal matches for second-order matching problems. The matching algorithm is presented as the following six transformation rules on pairs $(\theta, \Gamma)$ of substitutions $\theta$ and matching problems $\Gamma$. For an input pair $(\{\}, \Gamma_0)$, the algorithm
- succeeds when there is a sequence of transformations terminating with $(\theta, \{\})$, in which case $\theta$ is a match of the initial problem $\Gamma_0$;
- fails when there exists no sequence of transformations that terminates with a pair of the above form. In the case of failure, $\Gamma_0$ is not matchable.

In the sequel, it is assumed that every term is automatically normalized.

(SO-Matching) solves a matching pair through second-order matching. If $t$ is a second-order term containing no matching combinators,

$$(\theta, \{t \stackrel{\triangleleft}{=} s\} \cup \Gamma) \Longrightarrow (\theta\theta', \theta'\Gamma),$$

where $\theta'$ is a minimal match of $t$ and $s$.

(Decomposition) breaks a matching pair down into simpler ones. If $a$ is a function symbol or a bound variable,

$$(\theta, \{\lambda\overline{x_k}.a\,\overline{s_n} \stackrel{\triangleleft}{=} \lambda\overline{x_k}.a\,\overline{t_n}\} \cup \Gamma) \Longrightarrow (\theta, \{\overline{\lambda\overline{x_k}.s_n \stackrel{\triangleleft}{=} \lambda\overline{x_k}.t_n}\} \cup \Gamma).$$

(Narrowing) realizes structural narrowing of a matching pair. If all $t_i$ ($1 \le i \le n$) contain no matching combinators,

$$(\theta, \{\lambda\overline{x_k}.f\,\overline{t_n} \stackrel{\triangleleft}{=} \lambda\overline{x_k}.s\} \cup \Gamma) \Longrightarrow (\theta\theta', \{\theta'(\lambda\overline{x_k}.\rho r) \stackrel{\triangleleft}{=} \lambda\overline{x_k}.s\} \cup \theta'\Gamma),$$

where $l \to r \in R$, $\rho$ is a $\{\overline{x_k}\}$-lifter of $l$ away from $FV(\lambda\overline{x_k}.f\,\overline{t_n})$, and $\theta' = mgu(\lambda\overline{x_k}.f\,\overline{t_n}, \lambda\overline{x_k}.\rho l)$.

(Abstraction) abstracts a matching pair. If the head of $t$ is a matching combinator and $p \in O(t)$ is a position such that the head of $t|_p$ is also a matching combinator,

$$(\theta, \{t \stackrel{\triangleleft}{=} s\} \cup \Gamma) \Longrightarrow (\theta, \{t[X]_p \stackrel{\triangleleft}{=} s,\ \lambda\overline{x_k}.t|_p \stackrel{\triangleleft}{=} \lambda\overline{x_k}.X\} \cup \Gamma),$$

where $X$ is a new free variable and $BV(t, p) = \{\overline{x_k}\}$.

As usual, $\Longrightarrow^*$ denotes the reflexive-transitive closure of $\Longrightarrow$. The first three transformation rules are very similar to those of a standard semantic matching algorithm. The use of structural narrowing requires the additional transformation rule (Abstraction), which ensures the completeness of the algorithm. Let $\{(Xs \mathbin{@} Ys) \mathbin{@} Zs \stackrel{\triangleleft}{=} s\}$ be a matching problem. The rule (Narrowing) cannot be applied to it directly. Applying (Abstraction), we get the new problem $\{X \mathbin{@} Zs \stackrel{\triangleleft}{=} s,\ Xs \mathbin{@} Ys \stackrel{\triangleleft}{=} X\}$, to which (Narrowing) can now be applied.
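The interplay of (Narrowing) and (Abstraction) can be illustrated by a deliberately simplified, first-order Python sketch that handles only the combinator @ over ground Python lists. It is not the implementation described in this paper: there are no bound variables, lifters or second-order matching, and the term encoding, the function name `solve` and the fresh-variable naming scheme `_X0, _X1, ...` are assumptions made for the example.

```python
import itertools

_fresh = itertools.count()  # supply of fresh variable names (assumed not to clash)


def solve(term, target, subst=None):
    """Yield substitutions (dicts) under which `term` equals the ground list `target`.

    A term is a Python list (ground), a string (free variable),
    or a tuple ("@", left, right).
    """
    subst = dict(subst or {})
    if isinstance(term, list):
        if term == target:
            yield subst
    elif isinstance(term, str):
        if term not in subst:
            yield {**subst, term: target}
        elif subst[term] == target:
            yield subst
    else:
        _, left, right = term
        if isinstance(left, tuple):
            # (Abstraction): name the nested combinator with a fresh variable,
            # solve the outer pair first, then the abstracted pair.
            fresh = f"_X{next(_fresh)}"
            for outer in solve(("@", fresh, right), target, subst):
                outer = dict(outer)
                value = outer.pop(fresh)
                yield from solve(left, value, outer)
        else:
            # (Narrowing): a split at i stands for i uses of the cons rule
            # of @ followed by one use of the nil rule.
            for i in range(len(target) + 1):
                for partial in solve(left, target[:i], subst):
                    yield from solve(right, target[i:], partial)


flat = list(solve(("@", "X", "Y"), [1, 2]))
nested = list(solve(("@", ("@", "Xs", "Ys"), "Zs"), [1, 2]))
```

In this sketch the abstracted pair is always solved after the outer pair has fixed the value of the fresh variable, which already anticipates the ordering of matching pairs discussed in Section 4.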
Theorem 1 (Completeness) Let $\Gamma_0$ be a given matching problem in which all defined functions are matching combinators of a pattern rewrite system $R$. If $\theta$ is a match of $\Gamma_0$ w.r.t. $R$, then there exists a sequence of transformations $(\{\}, \Gamma_0) \Longrightarrow^* (\theta', \{\})$ such that $\theta' \le_{FV(\Gamma_0)} \theta$. □

An important lemma for proving this theorem is given in the appendix. In addition, our matching algorithm produces only minimal matches for a given matching problem. To prove the following theorem, we use the fact that combinators are defined by sets of non-overlapping rules.
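Theorem 2 (Minimality) Let $\Gamma_0$ be a given matching problem in which all defined functions are matching combinators of a pattern rewrite system $R$. If $(\{\}, \Gamma_0) \Longrightarrow^* (\theta, \{\})$ and $(\{\}, \Gamma_0) \Longrightarrow^* (\theta', \{\})$ are two different sequences of transformations, then $\theta' \not\le_{FV(\Gamma_0)} \theta$.

Proof. The only two rules to be considered are (SO-Matching) and (Narrowing). Applications of the other rules have no influence upon matches. Let $t \stackrel{\triangleleft}{=} s$ be a matching pair such that $t$ is a second-order term containing no matching combinators; then applying the rule (SO-Matching) produces a minimal match of the pair. Let $\lambda\overline{x_k}.f\,\overline{t_n} \stackrel{\triangleleft}{=} \lambda\overline{x_k}.s$ be a matching pair such that all $t_i$ ($1 \le i \le n$) contain no matching combinators, and let $l_i \to r_i$ ($i = 1, 2$) be two pattern rewrite rules that can be applied to narrow it. If $\rho_1$ and $\rho_2$ are $\{\overline{x_k}\}$-lifters of $l_1$ and $l_2$ away from $FV(\lambda\overline{x_k}.f\,\overline{t_n})$, $\theta_1 = mgu(\lambda\overline{x_k}.f\,\overline{t_n}, \lambda\overline{x_k}.\rho_1 l_1)$ and $\theta_2 = mgu(\lambda\overline{x_k}.f\,\overline{t_n}, \lambda\overline{x_k}.\rho_2 l_2)$, then $\theta_1 \not\le_{FV(\lambda\overline{x_k}.f\,\overline{t_n})} \theta_2$, since $l_1 \to r_1$ and $l_2 \to r_2$ are non-overlapping.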
3.3 Termination
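It is well known that semantic matching is undecidable in general, even for some simple first-order convergent rewrite systems [1, 5]. For instance, from the term $\mathit{foldr}\ (\lambda x, y.y)\ X\ Z$ we can obtain the following infinite narrowing sequence:

$$\mathit{foldr}\ (\lambda x, y.y)\ X\ Z \;\leadsto_{\mathit{foldr}\ F\ (X_1 :: XS_1)\ Z_1 \,\to\, F\ X_1\ (\mathit{foldr}\ F\ XS_1\ Z_1)}\; \mathit{foldr}\ (\lambda x, y.y)\ XS_1\ Z_1 \;\leadsto_{\mathit{foldr}\ F\ (X_2 :: XS_2)\ Z_2 \,\to\, F\ X_2\ (\mathit{foldr}\ F\ XS_2\ Z_2)}\; \mathit{foldr}\ (\lambda x, y.y)\ XS_2\ Z_2 \;\leadsto\; \cdots$$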
So, if we apply the matching algorithm to matching problems like $\{\mathit{foldr}\ (\lambda x, y.y)\ X\ Z \stackrel{\triangleleft}{=} s\}$, it will not terminate. To cope with this problem, we introduce the concept of an acyclic matching combinator. First, we define a class of non-decreasing matching combinators, which modifies the definition in [5] and lifts it to the second-order case.
Definition 3 (Subterm property) A subterm property $P$ is a measure for ground terms in long $\beta$-normal form along with a well-founded total ordering $\succ$, which compares values of $P$, such that for any term $s$, $p \in O(s)$, and $BV(s, p) = \{\overline{x_k}\}$, $P(\lambda\overline{x_k}.s|_p) \not\succ P(s)$.

Definition 4 (Non-decreasing matching combinators) Let $P$ be a suitable property along with $\succ$. A matching combinator $f$ is defined to be non-decreasing if, whenever $\lambda\overline{x_k}.f\,\overline{t_n}$ can be rewritten to a ground normal form $t$, where all $t_i$ ($1 \le i \le n$) are ground normal forms, then $P(\lambda\overline{x_k}.t_i) \not\succ P(t)$. Otherwise $f$ is called decreasing with respect to $P$ (short: decreasing).
With respect to the depth of constructors in a term, together with the greater-than relation over the natural numbers, all of the combinators above except foldr are non-decreasing.
Definition 5 (Acyclic matching combinators) Let $P$ be a subterm property. A matching combinator $f$ defined by $R$ is called acyclic (with respect to $P$) if each $f\,\overline{l_n} \to r \in R$ satisfies the following conditions:
- the top-level symbol of $r$ is a variable or a constructor, and
- there is no matching combinator in $r$ that is nested below any free variable or any decreasing combinator with respect to $P$.
By this definition, all of the matching combinators above except foldr are acyclic; foldr is not, since in the right-hand side of its second rule foldr occurs below the free variable $F$.
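The two syntactic conditions of Definition 5 are easy to check mechanically on the right-hand side of a rule. The Python sketch below does so for the recursive rules of +, @, map and foldr given in Section 3.1. Terms are encoded as nested tuples `(head, arg1, ..., argn)`; this encoding, the function name `is_acyclic_rhs` and the convention that the set of decreasing combinators is supplied by the caller (here assumed empty) are assumptions of the example, and the semantic side of the definition (whether a combinator is decreasing with respect to $P$) is not checked.

```python
def is_acyclic_rhs(rhs, free_vars, combinators, decreasing=frozenset()):
    """Check the two conditions of Definition 5 on one right-hand side."""

    def ok(term, blocked):
        # `blocked` is True once we are below a free variable
        # or below a decreasing combinator.
        head, *args = term
        if blocked and head in combinators:
            return False
        blocked = blocked or head in free_vars or head in decreasing
        return all(ok(arg, blocked) for arg in args)

    # Condition 1: the top-level symbol is a variable or a constructor.
    # Condition 2: no combinator below a free variable or a decreasing combinator.
    return rhs[0] not in combinators and ok(rhs, False)


COMBINATORS = {"+", "@", "map", "foldr"}
VARS = {"F", "X", "Xs", "Y", "Ys", "Z"}

RHS = {
    "+":     ("s", ("+", ("X",), ("Y",))),                    # s (X + Y)
    "@":     ("::", ("X",), ("@", ("Xs",), ("Ys",))),         # X :: (Xs @ Ys)
    "map":   ("::", ("F", ("X",)), ("map", ("F",), ("Xs",))),  # (F X) :: (map F Xs)
    "foldr": ("F", ("X",), ("foldr", ("F",), ("Xs",), ("Z",))),  # F X (foldr F Xs Z)
}
verdict = {name: is_acyclic_rhs(rhs, VARS, COMBINATORS) for name, rhs in RHS.items()}
```

Only the foldr rule is rejected, because its recursive call appears as an argument of the free variable F.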
Theorem 3 (Termination) For any matching problem $\Gamma$ in which all defined functions are acyclic matching combinators of a pattern rewrite system $R$, the matching algorithm always terminates. □

A proof of this theorem can be found in the appendix. Although foldr is not an acyclic matching combinator, we will show that our algorithm terminates for many matching problems containing foldr; see Sections 4 and 5.
4 An Efficient Implementation in Isabelle

In this section, we describe an efficient implementation of the matching algorithm for the theorem prover Isabelle, written in Standard ML of New Jersey. Our implementation is a pure extension of Isabelle and does not change the core of the prover. The term data structures of Isabelle are also used here. Terms in a matching problem always have ground types. A term is a constant, a free variable, a bound variable, an abstraction, or an application. To avoid the renaming of bound variables, de Bruijn's name-free representation is used in the implementation.

Theoretically, the algorithm described in Section 3 provides a decision procedure for matching problems associated with acyclic matching combinators. The first direct implementation of our matching algorithm turned out to be very disappointing in terms of efficiency, even for simple matching problems. For instance, it took about an hour to compute all the possible trees of height 4. A combinatorial investigation shows that there are 651 such trees. For practical use, additional techniques to improve the algorithm are indispensable. After applying several techniques, the algorithm improved considerably: all 651 trees of height 4 are produced within 6 minutes.

The most important technique we applied is dependency analysis, which is motivated by the partial ordering for first-order unification problems from Martelli and Montanari [19]. One source of inefficiency is the random selection of the matching pair to which a transformation rule is applied. In particular, the use of (Abstraction) may introduce new free variables. See the following example:

$$(\{\}, \{\mathit{foldr}\ {+}\ X\ Z \stackrel{\triangleleft}{=} s(0)\})$$
$$\Longrightarrow_{\text{(Narrowing)}} (\{X \mapsto Y :: YS,\ F \mapsto {+}\}, \{Y + (\mathit{foldr}\ {+}\ YS\ Z) \stackrel{\triangleleft}{=} s(0)\})$$
$$\Longrightarrow_{\text{(Abstraction)}} (\{X \mapsto Y :: YS,\ F \mapsto {+}\}, \{Y + X_1 \stackrel{\triangleleft}{=} s(0),\ \mathit{foldr}\ {+}\ YS\ Z \stackrel{\triangleleft}{=} X_1\})$$
Solving $\mathit{foldr}\ {+}\ YS\ Z \stackrel{\triangleleft}{=} X_1$ first, without knowing $X_1$, can be very inefficient because of many useless narrowing steps. Instead, matching pairs whose right terms contain no free variables that depend upon other pairs, such as $Y + X_1 \stackrel{\triangleleft}{=} s(0)$, should be considered first. A matching pair $t \stackrel{\triangleleft}{=} s \in \Gamma$ is called solvable if there is no $t' \stackrel{\triangleleft}{=} s' \in \Gamma$ such that $FV(s) \cap FV(t') \neq \emptyset$. In an improved implementation of the matching algorithm, we require that the rules (SO-Matching) and (Narrowing) can only be applied to solvable matching pairs. Through dependency analysis, the matching process is easy to control and unnecessary substitutions are avoided.

Now, we outline the implementation of our matching algorithm. The central point is the construction of a matching tree for any given matching problem. Each node is labelled by a pair $(\theta, \Gamma)$; the root is labelled by $(\{\}, \Gamma_0)$, where $\Gamma_0$ is the given matching problem. Three main functions are used:
- simplification: decomposing and abstracting matching pairs according to the rules (Decomposition) and (Abstraction);
- second-order matching: computing matches for matching pairs without matching combinators;
- solution: narrowing matching pairs and solving them independently, where the matching process is called recursively.
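The solvability condition used in the dependency analysis can be sketched in a few lines of Python. Each matching pair is represented only by a label and the free-variable sets of its left and right terms; this representation and the function name `solvable` are assumptions made for the illustration.

```python
def solvable(problem):
    """Return the labels of the solvable pairs of a matching problem.

    `problem` is a list of (label, FV(left term), FV(right term)) triples.
    A pair is solvable if no left term in the problem shares a free
    variable with its right term.
    """
    return [label for label, _, right_vars in problem
            if all(right_vars.isdisjoint(left_vars) for _, left_vars, _ in problem)]


# The problem obtained after (Narrowing) and (Abstraction) in the example above.
problem = [
    ("Y + X1 = s(0)", {"Y", "X1"}, set()),
    ("foldr + YS Z = X1", {"YS", "Z"}, {"X1"}),
]
ready = solvable(problem)
```

Only the first pair is selected: the right term of the second pair contains X1, which is still to be determined by the first pair.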
If there exists a leaf labelled by $(\theta, \Gamma')$ with $\Gamma' \neq \{\}$, simplify it first using the simplification function, then choose a solvable matching pair and apply the second-order matching function or the solution function to solve it independently. After that, new nodes are constructed. This process is continued until each leaf is labelled either by a pair of the form $(\theta, \{\})$ or by a pair to which no function can be applied. All $\theta$'s of the labels $(\theta, \{\})$ form the complete set of matches for the given matching problem $\Gamma_0$.

To get an overview of the efficiency of our implementation, a number of tests have been run on a SUN SPARC-Station-20 with 64 megabytes of memory. We have measured the execution time T in seconds (without garbage collection time) using the ML system function check-timer, and the number of minimal matches M, on more than 40 examples.

    No.  matching pair                                                       T (s)     M
     1   X @ Y = [1,2]                                                        0.47     3
     2   X @ Y = [1,2,3]                                                      0.70     4
     3   X @ Y = [1,2,3,4]                                                    1.10     5
     4   X @ Y = [1,2,3,4,5]                                                  1.49     6
     5   ht(node X Y) = 1                                                     0.36     1
     6   ht(node X Y) = 2                                                     1.17     3
     7   ht(node X Y) = 3                                                     9.98    21
     8   ht(node X Y) = 4                                                   356.18   651
     9   foldr f [X1,X2] 0 = f 0 (f 1 0)                                      0.45     1
    10   foldr f [X1,X2,X3] 0 = f 0 (f 1 (f 2 0))                             0.64     1
    11   foldr f [X1,X2,X3,X4] 0 = f 0 (f 1 (f 2 (f 3 0)))                    0.93     1
    12   foldr f [X1,X2,X3,X4,X5] 0 = f 0 (f 1 (f 2 (f 3 (f 4 0))))           1.22     1
    13   foldr f XS 0 = f 0 (f 1 0)                                           0.76     1
    14   foldr f XS 0 = f 0 (f 1 (f 2 0))                                     1.15     1
    15   foldr f XS 0 = f 0 (f 1 (f 2 (f 3 0)))                               1.70     1
    16   foldr f XS 0 = f 0 (f 1 (f 2 (f 3 (f 4 0))))                         2.58     1
    17   foldr + [X1,X2] 0 = 2                                                1.06     3
    18   foldr + [X1,X2] 0 = 3                                                1.51     4
    19   foldr + [X1,X2] 0 = 4                                                2.15     5
    20   foldr + [X1,X2] 0 = 5                                                3.40     6

The above table gives some test results, from which we can draw the following conclusions. In the table, [ and ] and the numerals 0, 1, 2, ... are used as shorthand for lists and natural numbers, and f is any constructor.
- For matching pairs with first-order matching combinators like @, which do not depend on other matching combinators, the number of minimal matches increases linearly w.r.t. the size of the right terms, and so does the time needed to compute these matches. This can be seen from the first 4 examples. The use of this kind of matching combinator is unproblematic.
- Matching pairs from the second group contain the combinator ht on trees, which is defined through the combinator max on the natural numbers, whose result depends only upon some of its arguments. See the following rules:

  $\mathit{max}\ 0\ Y \to Y$, $\quad \mathit{max}\ (s\,X)\ 0 \to s\,X$, $\quad \mathit{max}\ (s\,X)\ (s\,Y) \to s(\mathit{max}\ X\ Y)$
  $\mathit{ht}\ \mathit{empty} \to 0$, $\quad \mathit{ht}(\mathit{node}\ X\ Y) \to s(\mathit{max}\ (\mathit{ht}\ X)\ (\mathit{ht}\ Y))$

  Although ht is acyclic, both the number of matches and the time consumed grow very rapidly w.r.t. the size of the right terms. In practice, one should avoid using this kind of matching combinator.
- The use of matching combinators like foldr depends on the instantiation of higher-order variables. Although foldr is not acyclic, tests 9 to 20 show that matching pairs containing it can still be solved efficiently enough for practical use. Furthermore, the matching pairs from 9 to 16 have only one minimal match.
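The rapid growth of the number of matches for ht can be reproduced by a small counting argument, sketched here in Python. A tree of height at most h is either empty or a node whose two subtrees both have height at most h - 1, which gives a simple recurrence; the function names are chosen for this example only.

```python
def trees_up_to(height):
    """Number of trees (empty or node X Y) whose height is at most `height`."""
    if height == 0:
        return 1  # only the empty tree
    return 1 + trees_up_to(height - 1) ** 2


def trees_of_height(height):
    """Number of trees whose height is exactly `height`."""
    if height == 0:
        return 1
    return trees_up_to(height) - trees_up_to(height - 1)


counts = [trees_of_height(h) for h in range(1, 5)]
print(counts)
```

The counts for heights 1 to 4 agree with column M of tests 5 to 8, including the 651 trees of height 4; the number of trees roughly squares with each additional level.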
5 Applications
The semantic matching facilities reside in an SML functor SEM_MATCH that can be loaded on top of Isabelle/HOL¹. Its interface is small:

    signature SEM_MATCH =
    sig
      exception SEM_MATCHER of string
      val sem_matches   : thm list * term * term -> (term * term) list list
      val sem_rtac      : thm list -> thm -> int -> tactic
      val sem_etac      : thm list -> thm -> int -> tactic
      val sem_match_tac : thm list -> int -> tactic
    end;

sem_matches (thms, t1, t2) produces the list of all matches of t1 and t2 with respect to the theory given by the set of theorems thms. It is useful for tactical programming.

sem_rtac thms thm i performs resolution of subgoal i with thm modulo the set of theorems thms. The subgoal must not contain any meta variables; otherwise an exception is raised. With the back() mechanism ([24], p. 9), the user may interactively switch to other possible substitutions. sem_etac is the analogue of sem_rtac for elimination resolution.

sem_match_tac thms i assumes subgoal i to be an equation. It attempts to construct a substitution, propagates it into the proof state and concludes with a simplification of subgoal i, which should eliminate this subgoal. With back(), other substitutions are chosen.
Example 1
We refer to our introductory example. Let Prog be an Isabelle theory for a toy programming language defining a type prog with constants a, b and c and a semantic function [[ ]] :: prog list => bool. Moreover, let the following theorem stmt_exch_map hold in this theory:

    - stmt_exch_map;
    val it = "[[ ?DS @ [a] @ ?DS' ]] = [[ (map ?F ?DS) @ [b] @ (map ?F ?DS') ]]" : thm
¹ We expect only minor changes for an adaptation to other object logics of Isabelle.
Forward reasoning converts stmt_exch_map into a kind of schematic rule:

    - val aux = stmt_exch_map RS iffD2;
    val aux = "[[ (map ?F ?DS) @ [b] @ (map ?F ?DS') ]] ==> [[ ?DS @ [a] @ ?DS' ]]";

Now, we may want to know if our program yields true. The proof can proceed as follows:

    - goal Prog.thy "[[ [c,a,d,a] ]]";
    - by(sem_rtac [append_Nil, append_Cons] aux 1);
    Level 1
     1. [[ (map ?F [c]) @ [b] @ (map ?F [d,a]) ]]
    - back();
     1. [[ (map ?F [c,a,d]) @ [b] @ (map ?F []) ]]
Moreover, in order to demonstrate the power of our semantic matching, we take the following goal, which can be solved in two steps:

    - goal Prog.thy "[[ [c,a,d,a] ]] = [[ [c,b,c,c] ]]";
    - by(sem_rtac [append_Nil, append_Cons, map_Nil, map_Cons] stmt_exch_map 1);
    Level 1
     [[ [c] @ [a] @ [d,a] ]] = [[ (map (%x.c) [c]) @ [b] @ (map (%x.c) [d,a]) ]]
    - by(simp_tac (HOL_ss addsimps [append_Nil, append_Cons, map_Nil, map_Cons]) 1);
    No subgoals!
Example 2
The following is a one-step proof, which uses semantic matching to solve a goal containing the matching combinator foldr. Otherwise, one would have to guess a suitable substitution and then simplify.

    - goal Prog.thy "(foldr (op +) [?X,?Y] 0) = Suc(Suc(Suc(Suc 0)))";
    - by(sem_match_tac [add_0, add_Suc, foldr_Nil, foldr_Cons] 1);
    No subgoals!
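The substitutions among which the tactic (together with back()) can choose in this example are exactly the ways of writing 4 as ?X + ?Y over the natural numbers. The following Python sketch enumerates them; it uses Python integers in place of Suc-terms and brute-force enumeration in place of narrowing, and the function name `sums` is an assumption of the example.

```python
def sums(n_vars, total):
    """All tuples of `n_vars` natural numbers that add up to `total`."""
    if n_vars == 0:
        return [()] if total == 0 else []
    return [(first,) + rest
            for first in range(total + 1)
            for rest in sums(n_vars - 1, total - first)]


# foldr (op +) [?X, ?Y] 0 = Suc(Suc(Suc(Suc 0))), i.e. ?X + (?Y + 0) = 4
solutions = sums(2, 4)
print(solutions)
```

There are five such substitutions, in agreement with test 19 in the table of Section 4.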
Example 3
With this example, we illustrate how the equational approach can provide a direct solution to the ellipsis problem in rules such as the following:

$$((\phi_1 \wedge \ldots \wedge \phi_n) \to \psi) \Longrightarrow (\phi_1 \to (\ldots \to (\phi_n \to \psi) \ldots)),$$

for arbitrary $n$. The rule can be represented formally as the following rule stmt_foldr in Isabelle with the matching combinator foldr:

    (foldr (op &) ?PS True) --> ?P ==> foldr (op -->) ?PS ?P

where --> and & are the logical connectives in HOL. We can carry out the following proof:

    - val [prem] = goal HOL.thy "[| (a & b & c) --> d |] ==> a --> b --> c --> d";
    - by(sem_rtac [foldr_Nil, foldr_Cons] stmt_foldr 1);
    Level 1
     1. (foldr (op &) [a,b,c] True) --> d
    - by(simp_tac (HOL_ss addsimps [foldr_Nil, foldr_Cons]) 1);
    Level 2
     2. (a & b & c) --> d
    - by(simp_tac (HOL_ss addsimps [prem]) 1);
    No subgoals!
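As a plausibility check of the schematic rule stmt_foldr, the following Python sketch evaluates both sides over Booleans for all truth assignments and small list lengths. This is a brute-force truth-table test, not the inductive proof of the rule; the helper names `foldr`, `implies`, `conj` and `rule_holds` are assumptions of the example.

```python
from functools import reduce
from itertools import product


def foldr(f, xs, z):
    """Right fold: foldr f [x1, ..., xn] z = f x1 (... (f xn z))."""
    return reduce(lambda acc, x: f(x, acc), reversed(xs), z)


def implies(p, q):
    return (not p) or q


def conj(p, q):
    return p and q


def rule_holds(n):
    """Check (foldr (&) PS True --> P) ==> foldr (-->) PS P for all lists PS of length n."""
    for *ps, p in product([False, True], repeat=n + 1):
        premise = implies(foldr(conj, ps, True), p)
        conclusion = foldr(implies, ps, p)
        if premise and not conclusion:
            return False
    return True


checked = all(rule_holds(n) for n in range(5))
```

For a list of length 3 the two folds expand to (a & b & c & True) --> d and a --> (b --> (c --> d)), the shapes that appear in the proof above.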
This example reveals an interesting shift from the meta-language Isabelle/SML to the object language HOL. Of course, the same effect could have been achieved by a tactical program involving tacticals like REPEAT. But then there would be no means left to reason about it formally. stmt_foldr, however, can be proven by induction over lists, which exactly represents the effect of REPEAT on the object-logical side.

From the examples above we draw the following conclusions:
- Matching combinators are useful for defining rule schemata, which represent a kind of control abstraction and facilitate the reuse of development rules and their verification.
- The matching algorithm is the key to defining semantic matching tactics, which support automatic equation solving and increase the productivity of theorem proving in such situations.
- In some cases schematic rules are an object-level equivalent of tactical programs represented at the meta level.
6 Conclusion

Second-order semantic matching problems associated with matching combinators are particularly interesting in functional programming, program transformations, and theorem proving. In this paper, we have presented a complete matching algorithm to solve this class of semantic matching problems. A class of decidable semantic matching problems has been obtained by restricting matching combinators to acyclic ones. A simple and efficient implementation, together with a number of tests and examples, sheds some light on the applicability of our approach in theorem proving and program manipulation. To our knowledge, there is almost no existing work which integrates semantic matching or unification into large-scale theorem proving systems.
6.1 Related Work
In [5], Dershowitz, Mitra and Sivakumar presented a decision procedure for semantic matching for first-order convergent rewrite systems. Our work has lifted the approach to the second-order case and considered a class of convergent and non-overlapping pattern rewrite systems from Nipkow [23, 20].

Prehofer's work [25] gives a framework for solving higher-order equations by lifting the general first-order notion of narrowing to patterns without loss of completeness. The decidability of higher-order semantic matching problems has not been discussed there.

An important difference between our work and other related work on semantic matching or unification is that we are interested not only in theoretical results, such as completeness, minimality and termination, but also in a simple and efficient implementation. To meet this goal, we have used the constructor-based property of matching combinators, introduced an alternative lazy narrowing strategy, structural narrowing, and proved that structural narrowing together with the abstraction of nested matching combinators is a complete strategy for solving semantic matching problems associated with matching combinators.
Compared with basic narrowing (see [14]), structural narrowing is simpler and more efficient for solving semantic matching problems associated with matching combinators. Outermost narrowing is also a lazy narrowing strategy, but it is incomplete, see [6]. There are some other improved narrowing strategies, e.g. [2, 10]. Although most of them were introduced in the context of logic and functional programming languages, they consider arbitrary convergent rewrite systems and do not make use of the properties of constructor-based systems, even though such systems are usually used to define functions.
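To make the idea of narrowing against a constructor-based definition concrete, the following Python sketch solves one small first-order semantic matching problem: find all X and Y such that append(X, Y) equals a given ground list. It is a deliberately simplified illustration and not the structural narrowing procedure of this paper. The function name `match_append` and the use of Python lists for constructor terms are assumptions made for the example.

```python
def match_append(target):
    """Yield every (X, Y) such that append(X, Y) equals the ground list `target`.

    Assumed rewrite rules for append (constructor-based, non-overlapping):
        append([],     ys) -> ys
        append([x|xs], ys) -> [x | append(xs, ys)]
    """
    # Narrowing with the first rule: bind X := [], the pair reduces to Y = target.
    yield [], list(target)
    # Narrowing with the second rule: bind X := [x|X'].  The narrowed term has a
    # constructor at its head, so it is decomposed against the head of `target`,
    # and the remaining pair append(X', Y) = tail is solved recursively.
    if target:
        head, tail = target[0], target[1:]
        for xs, ys in match_append(tail):
            yield [head] + xs, ys

if __name__ == "__main__":
    for x, y in match_append([1, 2, 3]):
        print(x, y)
```

Each recursive call works on a strictly shorter ground list, so the enumeration terminates, and a target of length n has n + 1 matches.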
6.2 Future Work
The combination of unification and matching problems is rather complex in general, even in the first-order case. Our reason for considering orthogonal rewrite systems is that they combine easily, at least in the first-order case. Although there are still no similar results about the compositionality of higher-order rewrite systems, we hope that the class of pattern rewrite systems which are orthogonal and constructor-based can be combined easily. If this is the case, the combination of matching problems becomes very easy even for second-order matching combinators.

To make semantic matching more attractive in practice, we are going to improve the efficiency of our matching algorithm further. For example, some results presented in [26, 3] on improving higher-order pattern unification and second-order matching, which are frequently used in our matching procedure, can also improve our implementation.

The number of matches of a given matching problem may be very large. Usually only a few of them, or even only one, are interesting in practice. There are often context-related conditions for applying a rule to a given goal. If we can define this kind of context condition properly and take it into account during the matching process, it is possible to reduce the number of matches and to speed up the matching process significantly for some matching problems.

Until now, we have only considered simply typed terms. In theorem proving or program manipulation, ML-style polymorphism is often used to shift type-checking in many-sorted object languages to the meta level. In [22], Nipkow described an extension of Huet's higher-order unification from the simply typed λ-calculus to one with type variables. Despite its incompleteness, the extension has been adopted in both λProlog and Isabelle. Although the good theoretical properties of our algorithm will not carry over to the polymorphic case, we shall integrate the polymorphic unification algorithm from Isabelle into our implementation.
References

[1] A. Bockmayr: A Note on a Canonical Theory with Undecidable Unification and Matching Problem. In Journal of Automated Reasoning, Vol. 3, 379-381, 1987.
[2] A. Bockmayr, S. Krischer and A. Werner: An Optimal Narrowing Strategy for General Canonical Systems. In Conditional Term Rewriting Systems, LNCS 656, 1993.
[3] R. Curien, Z. Qian and H. Shi: Efficient Second-Order Matching. To appear in the 7th International Conference on Rewriting Techniques and Applications, 1996.
[4] N. Dershowitz and J.-P. Jouannaud: Rewrite Systems. In: Handbook of Theoretical Computer Science, Elsevier, 1990.
[5] N. Dershowitz, S. Mitra and G. Sivakumar: Decidable Matching for Convergent Systems. In: Proc. of 11th International Conference on Automated Deduction, LNCS 607, 1992.
[6] R. Echahed: On Completeness of Narrowing Strategies. In CAAP'88, 89-101, LNCS 299, 1988.
[7] M. Fay: First-Order Unification in an Equational Theory. In Proc. 4th Workshop on Automated Deduction (1979), 161-167.
[8] J. H. Gallier, W. Snyder: Complete Sets of Transformations for General E-Unification. LNCS 256 (1987).
[9] M. J. C. Gordon and T. F. Melham: Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press, 1993.
[10] M. Hanus: Integration of Functions into Logic Programming: A Survey. MPI-Report MPI-I-94-201, Max-Planck-Institut für Informatik, Saarbrücken, 1994. To appear in Journal of Logic Programming.
[11] J. Hindley and J.P. Seldin: Introduction to Combinators and λ-Calculus. Cambridge University Press, 1986.
[12] B. Hoffmann and B. Krieg-Brückner (eds.): PROgram Development by SPECification and TRAnsformation, The PROSPECTRA Methodology, Language Family, and System. LNCS 680, 1993.
[13] G. Huet and B. Lang: Proving and Applying Program Transformations Expressed with Second-Order Patterns. Acta Informatica 11, (1978).
[14] J.-M. Hullot: Canonical Forms and Unification. In Proc. 5th Workshop on Automated Deduction (1980), 318-334.
[15] J.-P. Jouannaud and C. Kirchner: Solving Equations in Abstract Algebras: A Rule-Based Survey of Unification. In J.-L. Lassez and G. Plotkin, editors, Computational Logic: Essays in Honor of Alan Robinson, MIT Press, Cambridge, MA, 1991.
[16] J. W. Klop: Term Rewriting Systems. In Abramsky, S., Gabbay, Dov M. and Maibaum, T.S.E. (eds): Handbook of Logic in Computer Science, Chapter 1. Oxford Science Publications, (1992).
[17] Kolyang, T. Santen and B. Wolff: Correct and User-Friendly Implementations of Transformation Systems. To appear in: Proceedings of the Formal Methods Europe, Oxford, 1996.
[18] B. Krieg-Brückner, J. Liu, H. Shi and B. Wolff: Towards Correct, Efficient and Reusable Transformational Developments. In Broy, M., Jähnichen, S. (eds): KORSO, Correct Software by Formal Methods, LNCS 1009, 1995.
[19] A. Martelli and U. Montanari: An Efficient Unification Algorithm. ACM Transactions on Programming Languages and Systems, Vol 4, No. 2, (1982), 258-282.
[20] R. Mayr and T. Nipkow: Higher-Order Rewrite Systems and their Confluence. To appear in Theoretical Computer Science.
[21] D. Miller: A Logic Programming Language with Lambda-Abstraction, Function Variables, and Simple Unification. In Schroeder-Heister, P. (ed.): Extensions of Logic Programming, LNCS 475, 253-281, 1991.
[22] T. Nipkow: Higher-Order Unification, Polymorphism and Subsorts. In S. Kaplan and M. Okada, editors, Proc. 2nd Int. Workshop Conditional and Typed Rewriting Systems, LNCS 516, 1990.
[23] T. Nipkow: Orthogonal Higher-Order Rewrite Systems are Confluent. In M. Bezem and J.F. Groote, editors, Proc. Int. Conf. Typed Lambda Calculi and Applications, 306-317, LNCS 664, 1993.
[24] Lawrence C. Paulson: Isabelle: A Generic Theorem Prover. LNCS 828, Springer-Verlag, 1994.
[25] C. Prehofer: Solving Higher-Order Equations. In Proc. of 9th IEEE Symposium on Logic in Computer Science, 1994.
[26] Z. Qian: Linear Unification of Higher-Order Patterns. In: Proc. of TAPSOFT'93, Gaudel, M.-C. and Jouannaud, J.-P. (eds.), LNCS 668, 391-405, 1993.
[27] H. Shi: Extended Matching with Applications to Program Transformation. Ph.D. Dissertation, Universität Bremen, 1994.
7 Appendix

Proof of Theorem 1.
The central point of the completeness proof is that the algorithm always chooses a solvable pair to obtain some intermediate solutions. Applications of the rules (SO-Matching) and (Narrowing) may produce new solvable pairs, which are then used to obtain new solutions, until no pairs are left. A complete proof of this theorem can be found in [27]. In the following, we prove an important lemma used in the proof.
Lemma 1 Let $\Gamma$ be a solvable matching problem and $(\sigma; \Gamma) \Longrightarrow (\sigma'; \Gamma')$. Then $\Gamma'$ contains at least one solvable matching pair as long as $\Gamma' \neq \{\}$.

Proof. We write $t \stackrel{\triangleleft}{=} s \sqsupset t' \stackrel{\triangleleft}{=} s'$ to denote that the matching pair $t \stackrel{\triangleleft}{=} s$ is dependent on the pair $t' \stackrel{\triangleleft}{=} s'$, i.e., $FV(s) \cap FV(t') \neq \emptyset$. If we can prove that the dependency relation $\sqsupset$ is well-founded on $\Gamma'$, then there is a set of pairs that do not depend on any other pair, i.e. solvable matching pairs. Using induction on transformation sequences, the lemma can then be proved. Let $(\sigma_1; \Gamma_1) \Longrightarrow (\sigma_2; \Gamma_2)$ be a transformation step, and let $\sqsupset$ be well-founded on $\Gamma_1$. To prove that $\sqsupset$ is well-founded on $\Gamma_2$, we consider the following cases:
- If the rule applied is (Decomposition), then $\sqsupset$ is obviously well-founded on $\Gamma_2$, since the rule does not introduce new dependency relations between matching pairs.
- If the rule (Abstraction) is applied to $t \stackrel{\triangleleft}{=} s$, two new pairs $t[X]_p \stackrel{\triangleleft}{=} s$ and $\lambda\overline{x_k}.t|_p \stackrel{\triangleleft}{=} \lambda\overline{x_k}.X$ will be introduced. Since $X$ is a new free variable, the only new dependency relation is $\lambda\overline{x_k}.t|_p \stackrel{\triangleleft}{=} \lambda\overline{x_k}.X \sqsupset t[X]_p \stackrel{\triangleleft}{=} s$. So $\sqsupset$ is also well-founded on $\Gamma_2$.
- If the rule applied is (SO-Matching) or (Narrowing), some dependency relations will be removed, and no new relations will be introduced. □
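The dependency argument in the proof of Lemma 1 can be illustrated with a small Python sketch. It models a matching pair only by the free variables of its two sides and computes which pairs depend on no other pair. The class `Pair`, the functions `depends_on`, `solvable` and `solve_order`, and the variable names in the example are all hypothetical, and the sketch assumes that solving a pair instantiates every free variable of its pattern side.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pair:
    name: str
    fv_t: frozenset   # free variables of the pattern side t
    fv_s: frozenset   # free variables of the target side s

def depends_on(p, q):
    """p depends on q iff FV(s_p) and FV(t_q) share a variable."""
    return p != q and bool(p.fv_s & q.fv_t)

def solvable(pairs):
    """Pairs that depend on no other pair of the problem."""
    return [p for p in pairs if not any(depends_on(p, q) for q in pairs)]

def solve_order(pairs):
    """Repeatedly pick a solvable pair until none is left."""
    pending, order = list(pairs), []
    while pending:
        ready = solvable(pending)
        if not ready:
            raise ValueError("dependency relation is not well-founded")
        p = ready[0]
        order.append(p.name)
        pending.remove(p)
        # Assumption: solving p binds all of FV(t_p), so these variables
        # no longer occur free in the remaining pairs.
        pending = [Pair(q.name, q.fv_t - p.fv_t, q.fv_s - p.fv_t) for q in pending]
    return order

if __name__ == "__main__":
    # The two pairs produced by (Abstraction): the abstracted subterm pair has
    # the fresh variable X on its target side and must wait for the outer pair.
    outer = Pair("t[X]_p = s", frozenset({"X", "F"}), frozenset())
    inner = Pair("t|_p = X", frozenset({"G"}), frozenset({"X"}))
    print(solve_order([inner, outer]))
```

In the example the outer pair is chosen first because its target side is ground, which mirrors the single new dependency introduced by (Abstraction).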
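Proof of Theorem 3. The central point of the proof is the definition of an ordering over matching pairs. For any term $t$, let $|t|$ denote the depth of nested occurrences of matching combinators in $t$. Let $P$ be a subterm property along with a well-founded total ordering $\succ$. The ordering $\succ_P$ is defined on matching pairs as follows: $t \stackrel{\triangleleft}{=} s \succ_P t' \stackrel{\triangleleft}{=} s'$ iff $P(s) \succ P(s')$, or $P(s) \approx P(s')$ and $|s'| < |s|$, where $<$ is the less-than relation on the natural numbers. Obviously, $\succ_P$ is also a well-founded total ordering.

Let $t \stackrel{\triangleleft}{=} s$ be a matching pair such that $t$ is a term containing no combinators nested below any decreasing combinator or free variable and $s$ is a ground normal form. In the following, we show that any transformation sequence beginning with $(\{\}; \{t \stackrel{\triangleleft}{=} s\})$ terminates. With this result the theorem can be proved.

Proof. The interesting case is the one in which $t$ has the form $\lambda\overline{x_k}.f(\overline{t_n})$ and $f$ is a combinator.

$$
\begin{array}{ll}
 & (\{\};\ \{\lambda\overline{x_k}.f(\overline{t_n}) \stackrel{\triangleleft}{=} s\}) \\
\Longrightarrow_{(Abstraction)} & (\{\};\ \{\lambda\overline{x_k}.f(\overline{t'_n}) \stackrel{\triangleleft}{=} s,\ \lambda\overline{y_{i_m}}.s_m \stackrel{\triangleleft}{=} \lambda\overline{y_{i_m}}.X_m\}) \\
\Longrightarrow_{(Narrowing)} & (\sigma;\ \{t' \stackrel{\triangleleft}{=} s,\ \sigma(\lambda\overline{y_{i_m}}.s_m) \stackrel{\triangleleft}{=} \sigma(\lambda\overline{y_{i_m}}.X_m)\})
\end{array}
$$

where $\lambda\overline{x_k}.f(\overline{t'_n})$ is structurally narrowable to $t'$ with a rule $l \rightarrow r \in R$ and a substitution $\sigma$. To solve $t' \stackrel{\triangleleft}{=} s$, we should consider two cases.
- The head of $t'$ is a variable. Since $r$ contains no matching combinators nested below any free variable, $t'$ is a pure term and the pair can be solved by (SO-Matching).
- The head of $t'$ is a constructor. The rule (Decomposition) can be applied to it, which transforms the matching pair to a set of pairs $\{\lambda\overline{x_k}.r_p \stackrel{\triangleleft}{=} \lambda\overline{x_k}.s_p\}$, where each $s_i$ is a proper subterm of $s$, so $t \stackrel{\triangleleft}{=} s \succ_P \lambda\overline{x_k}.r_i \stackrel{\triangleleft}{=} \lambda\overline{x_k}.s_i$.

Thus, by applying the inductive hypothesis we can assume that all matches for $t' \stackrel{\triangleleft}{=} s$ can be found in finite time. Let $\theta$ be one of these matches. We now show $t \stackrel{\triangleleft}{=} s \succ_P \theta(\sigma(\lambda\overline{x_k}.s_i)) \stackrel{\triangleleft}{=} \theta(\sigma(\lambda\overline{x_k}.X_i))$, $1 \leq i \leq m$. The following two cases should be considered:
- Function $f$ is decreasing. By assumption there is no defined function below it, so $m = 0$.
- Function $f$ is non-decreasing. Then $P(s) \succ P(\theta(\sigma(X_i)))$ or $P(s) \approx P(\theta(\sigma(X_i)))$. Additionally, $|\theta(\sigma(\lambda\overline{x_k}.s_i))| < |s|$, since for any $X$, $\theta(X)$ contains no matching combinators. □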
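The ordering on matching pairs used in the proof of Theorem 3 compares the target sides lexicographically, first by the property P and then by combinator nesting depth. The Python sketch below shows one possible instance. Several choices are assumptions made only for illustration: terms are nested tuples, the set `COMBINATORS` is hypothetical, P is instantiated with term size (the proof leaves P abstract), and the equivalence on P-values is plain equality.

```python
COMBINATORS = {"map", "foldr"}   # hypothetical set of matching combinators

def depth(t):
    """|t|: maximal nesting depth of combinator occurrences in t."""
    if isinstance(t, str):           # variables and constants are strings
        return 0
    head, *args = t                  # compound term: (head, arg1, ..., argn)
    below = max((depth(a) for a in args), default=0)
    return below + (1 if head in COMBINATORS else 0)

def size(t):
    """Stand-in for the subterm property P: the number of symbols in t."""
    if isinstance(t, str):
        return 1
    return 1 + sum(size(a) for a in t[1:])

def greater(pair1, pair2, P=size):
    """(t, s) is above (t', s') iff P(s) > P(s'), or P(s) == P(s') and |s'| < |s|."""
    (_, s1), (_, s2) = pair1, pair2
    return P(s1) > P(s2) or (P(s1) == P(s2) and depth(s2) < depth(s1))

if __name__ == "__main__":
    s = ("cons", "a", ("cons", "b", "nil"))
    sub = s[2]                       # a proper subterm of s
    print(greater(("t", s), ("r", sub)))
    print(depth(("map", "F", ("foldr", "G", "e", "xs"))))
```

With term size as P, replacing the target by a proper subterm always yields a strictly smaller pair, which is the situation created by (Decomposition) in the proof.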