Optimization of Equational Programs Using Partial Evaluation

Robert Strandh†
Département d'Informatique
Université de Bordeaux 1

David Sherman
Department of Computer Science
University of Chicago

Irene Durand
Département d'Informatique
Université de Bordeaux 1

find a redex in the term; replace the redex by the corresponding right-hand side. The first phase is accomplished by the pattern matcher, and the second by the simple creation of a new subterm and a replacement. To avoid having to retraverse the term graph after replacement, one generally has to restart the pattern matcher in a state that takes into account which parts of the term graph remain the same after replacement. This technique was used by Hoffmann and O'Donnell in the early versions of the Equational Programming System [HO82] [O'D85] [HOS85]. Although the traditional approach carefully avoids retraversal of the term graph, it is less successful when it comes to handling right-hand sides. We argued in [Str88] that most of the execution time was spent building right-hand sides. Much of the right-hand side structure created in the course of execution is used solely to drive further pattern-matching, and does not appear in the final output. An approach is needed that avoids creating this strictly intermediate structure. Wadler addressed this need for the case of linear terms in so-called treeless form in [Wad88]; in this work, he exhibited a program transformation that avoided construction of tree structures for functions that could be computed with bounded internal storage. In [Str89] we defined the class of forward-branching programs (FB). We showed that programs in FB have the nice property that outermost evaluation can be preserved while doing innermost stabilization. This means that every part of the subject term that we need to inspect can be reduced to head-normal form before being inspected.
This surprising property means that we can do ordinary recursive descent using the stack architecture of a traditional machine instead of keeping track of complicated state transitions. This gives us the advantage of high execution speed in combination with the advantages of the semantics of outermost evaluation. 

Abstract



We describe an application of partial evaluation to the optimization of Equational Logic programs. Our method treats the right-hand sides of reduction rules as partial input to subsequent reduction steps, allowing us to produce specialized forms of the rewriting system that are more efficient in space and time. Our implementation of Equational Logic Programming compiles source equations into a term-rewriting system that is subsequently compiled into assembly code. The term-rewriting system is implemented by an intermediate code called EM code, and we perform our program optimizations on this intermediate form. Use of EM code allows us to perform the optimizing transformations with a succinct set of rewrite rules. One interesting feature of our method is that the partial evaluation can be carried out by straightforward general-purpose program optimizations, and does not require complicated interpreters or semantic analysis.

1 Introduction

The traditional rewrite engine of an equational program operates on a global term represented as a graph, usually a directed acyclic graph. The purpose of the engine is to reduce the term graph to normal form, i.e., until no more rewrite rules can be applied. To accomplish its task, it iterates the following two steps:

* Partially funded by NSF grants CCR 8805503, CDA 8822657.
† Partially funded by ESPRIT.


2 Optimization of Equational Programs

Programs in FB have a corresponding pattern-matching automaton, called the index tree in [Str89]. A decision procedure for FB, augmented with actual generation of the index tree, is presented in [DS90]. From the index tree, we generate code in an intermediate language called EM code [SS90]. That code is then further processed to create code for the physical machine. A further important advantage of innermost stabilization is that we can easily predict pattern-matching moves immediately following replacement by a right-hand side: we simply run the new right-hand side through the positive arcs to predict the automaton moves. Predicting these moves at compile time means that we don't have to perform them at run time. Doing so can often save us the trouble of actually constructing the right-hand side. We showed in [Str88] how to transform the intermediate code before generating machine code from it, using a technique related to partial evaluation [BAOE76] [Har77] [JSS85]. An algorithm was given in [Str88] to perform the partial evaluation, but it used complicated program and data structures. Work by Bondorf [Bon89] used similar techniques to improve constructor-based term-rewriting systems, with the specific application of producing good compilers from interpretive language specifications. In this paper, we use a substantially improved version of the intermediate language, presented in [SS90]. The advantage of the new intermediate language is that it has a more precise definition and stricter formal properties. These formal properties enable us to show that our transformations preserve the semantics of the program, as well as to perform more general optimizations. In this paper we present an application of partial evaluation to the optimization of Equational Logic programs, expanding on previous results and introducing an elegant approach to the problem.
The rest of the paper has the following structure: we give an overview of the optimizations needed to achieve an efficient treatment of right-hand sides; we briefly present the intermediate language, EM code, defined in [SS90]; we translate the optimizations to transformations on EM code; we show how to implement the transformations as local rewrite rules on the EM code program; and we finally discuss the effectiveness of the method, as well as methods to prove its correctness and its termination.

Hoffmann and O'Donnell [HO82] [O'D85] [HOS85] assumed that the main difficulty in creating an efficient programming system based on Equational Programming was to create an efficient pattern matcher. They reasoned that the creation of right-hand sides was trivial. The focus of their research was thus on pattern-matching techniques. Given an efficient pattern matcher, we found that the creation of right-hand sides was now the most time-consuming operation. Based upon experience from the implementation, we argued in [Str88] that we must avoid the creation of explicit structure whenever we can. The key observation is how the term-rewriting engine for an Equational Program proceeds when reducing a term to head-normal form. It uses the pattern matcher to find a redex, performs the appropriate reduction, and restarts the pattern matcher on the result. This intermediate term is subsequently used for further matching, and much of that matching is performed on structure that was created in previous reductions rather than structure that is part of the input. We would like to produce programs that avoid explicitly creating this strictly intermediate structure. The heart of our approach is an application of partial evaluation to the pattern-matching automaton. The term constructed as the result of a reduction is used as partial input to a new invocation of the automaton, and is used to create a specialized version of the automaton that is used in place of the usual iterative call to the start state. The final states of the original automaton are replaced with specialized subautomata that produce the same output as several invocations of the original, but more quickly. The next step was to realize that the automaton need only construct term structure that is actually returned. Code for constructing the subterm is emitted only in cases where an actual physical term is needed, either when the automaton returns or when the subterm is needed in a call.
Creation of terms or parts of terms that are not needed, or rather that are needed only to perform internal state transitions, can be eliminated altogether. Oftentimes it is the uppermost part of a term that is not needed; the result is that only the needed subterms are actually constructed. Effectively, strictly intermediate structure is encoded into the state of the specialized automaton instead of being explicitly constructed. An example of this kind of optimization to the pattern-matching automaton is shown in figure 1.
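As a concrete illustration, the basic find-redex/replace loop can be sketched in a few lines of Python. The term representation and the naive normalization strategy are assumptions of this sketch (the actual system compiles a pattern-matching automaton rather than interpreting rules); the rules are the rev/addend system of figure 1.

```python
# A minimal sketch of the two-step rewrite loop: find a redex, replace it by
# the corresponding right-hand side, repeat until no rule applies.
# Terms are nested tuples; the rules are the rev/addend system of figure 1.

NIL = ("nil",)

def cons(x, y):
    return ("cons", x, y)

def rewrite_step(t):
    """Return the reduct of t if t is a redex at the root, else None."""
    if t[0] == "rev":
        arg = t[1]
        if arg[0] == "nil":                  # rev[()] -> ()
            return NIL
        if arg[0] == "cons":                 # rev[cons[x;y]] -> addend[x;rev[y]]
            return ("addend", arg[1], ("rev", arg[2]))
    if t[0] == "addend":
        x, l = t[1], t[2]
        if l[0] == "nil":                    # addend[x;()] -> cons[x;()]
            return cons(x, NIL)
        if l[0] == "cons":                   # addend[x;cons[y;z]] -> cons[y;addend[x;z]]
            return cons(l[1], ("addend", x, l[2]))
    return None

def normalize(t):
    """Reduce t to normal form: rewrite the root, then the children, repeat."""
    while True:
        r = rewrite_step(t)
        if r is not None:
            t = r
            continue
        t2 = (t[0],) + tuple(normalize(c) for c in t[1:])
        if t2 == t:
            return t
        t = t2
```

For example, `normalize(("rev", cons(("a",), cons(("b",), NIL))))` yields the reversed list. Note how every intermediate addend term is built explicitly and then immediately taken apart by the next match; this is exactly the strictly intermediate structure that the optimizations described below eliminate.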












[Figure 1 (diagram): the original pattern-matching automaton for the rules rev[()] = (), rev[cons[x;y]] = addend[x;rev[y]], addend[x;()] = cons[x;()], and addend[x;cons[y;z]] = cons[y;addend[x;z]]; and the automaton after partial evaluation, which adds specialized final states for rev[cons[x;()]] and rev[cons[x;cons[y;z]]], producing cons[x;()] and addend[x;addend[y;rev[z]]] directly.]

Figure 1: Partial evaluation of a pattern-matching automaton.


Term Traversal

down[R1; R2; n]              Place the n-th child of the node pointed to by R1 in R2.
branch[R1; branchlist]       Inspect the symbol in the node pointed to by R1 and branch to a label according to branchlist.

Term Construction

replace[R2; R1]              Indicate a reduction from R1 to R2.
build[R1; sym; (c1 ... cn)]  Build a new node with symbol sym and children (c1, ..., cn), and return its address in R1.

Program Control

call[label; R1; arguments]   Perform a recursive call to the instruction tree labeled label, returning the result in R1. The values of the registers in arguments are made available as inputs to the recursive environment.
return[R1]                   Return from a call to an EM code program, returning the node in R1.

Table 1: EM code instructions

3 EM code

in [O'D85] and [Str88], and represent them in intermediate code in much the same way as was done in the latter work. Basically, each state of the automaton is represented by a piece of EM code that performs the operations necessary to get to the next state. For non-final states, these operations involve fetching parts of the subject term into registers (down, and call for innermost stabilization), and inspecting them to see which state to enter next (branch). For final states, these operations involve constructing the right-hand side of the rule (build) and replacing the redex with the right-hand side (replace). Each final state ends with a return instruction, and each final state that doesn't represent a failure of pattern-matching precedes the return instruction with a recursive call to the start state to perform the next step in the reduction sequence. Figure 2 shows a typical EM code program.

3.1 Instructions of EM code

EM code is an intermediate code for describing the pattern-matching and term construction operations of term-rewriting systems (see [SS90]). An EM code program consists of a symbol table, a set of registers, a start state S0 , a recursion stack, and a forest of (possibly labeled) instructions. The instructions can be divided into three categories: term traversal, term construction, and program control. These instructions are summarized in table 1. Flow of control in an EM code program starts in the root of an instruction tree and proceeds in a path to a leaf. Along the way it can make recursive calls to other instruction trees, but calls always resume at the following instruction. The branch instruction is the only one with out-degree greater than one, and represents a choice that the program can make. Only one branch of a branch instruction is followed in a given execution.
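To make these semantics concrete, here is a small Python interpreter for the six instructions of table 1. The node representation, register naming, and tuple encoding of instructions are all assumptions of this sketch; the real system compiles EM code to assembly rather than interpreting it.

```python
# Illustrative interpreter for the six EM code instructions of table 1.
# A program is a dict mapping labels to instruction lists; nodes are
# mutable dicts so that replace can update the subject graph in place.

def make_node(sym, children=()):
    return {"sym": sym, "children": list(children)}

def run(program, label, regs):
    """Execute the instruction tree at `label`, returning the node named
    by the final return instruction."""
    for instr in program[label]:
        op = instr[0]
        if op == "down":              # down[r1; r2; n]: fetch the n-th child
            _, r1, r2, n = instr
            regs[r2] = regs[r1]["children"][n - 1]
        elif op == "branch":          # branch[r1; branchlist]
            _, r1, branchlist = instr
            target = branchlist.get(regs[r1]["sym"], branchlist["default"])
            return run(program, target, regs)
        elif op == "build":           # build[r1; sym; (c1 ... cn)]
            _, r1, sym, cs = instr
            regs[r1] = make_node(sym, [regs[c] for c in cs])
        elif op == "replace":         # replace[r2; r1]: r1 reduces to r2
            _, r2, r1 = instr
            regs[r1]["sym"] = regs[r2]["sym"]
            regs[r1]["children"] = regs[r2]["children"]
        elif op == "call":            # call[label; r1; arguments]
            _, lbl, r1, args = instr
            inner = {"r%d" % i: regs[a] for i, a in enumerate(args)}
            regs[r1] = run(program, lbl, inner)
        elif op == "return":          # return[r1]
            return regs[instr[1]]

# A one-rule program that rewrites a term with root f to g:
program = {
    "s0": [("branch", "r0", {"f": "s1", "default": "s2"})],
    "s1": [("build", "r1", "g", ()), ("replace", "r1", "r0"), ("return", "r1")],
    "s2": [("return", "r0")],
}
result = run(program, "s0", {"r0": make_node("f")})
```

Flow of control follows the description above: execution proceeds from the root of an instruction tree toward a leaf, with branch the only instruction of out-degree greater than one.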

4 Optimizations of the Intermediate Code

3.2 Representing Equational Programs

Before returning to partial evaluation of Equational Programs, we consider in this section some optimizations of EM code programs. These optimizations will

We use EM code to represent the pattern-matching automaton and the right-hand side constructor. We use pattern-matching automata similar to those discussed

prove to be very important for the implementation of partial evaluation that we describe in section 5. One of the biggest advantages of EM code is that it is a good compromise between simple semantics and expressiveness. EM code contains six simple instructions, yet it can express pattern-matching automata at a reasonably high level. As a result, we can perform important optimizations by simple transformations of the EM code program. These transformations come in two flavors:



4.1.2 Rebuilding

We often have a situation where a subterm that has already been built is rebuilt again by the EM code program. As Equational Programming is referentially transparent, we can use a single copy for both purposes. For example:

build[r2; f; (r0 r1)]
...
build[r3; f; (r0 r1)]
build[r4; g; (r2 r3)]

Obvious transformations: transformations that are done because they obviously give better code. These transformations include pruning of branches that are never used, elimination of instructions that create registers that are never used, and so forth.

Here, the subterm with f as root and the contents of r0 and r1 as children is built twice. The instruction sequence should be changed into the following:

build[r2; f; (r0 r1)]
...
build[r4; g; (r2 r2)]
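A sketch of how this rebuilding transformation might be implemented, using value-numbering over a straight-line instruction sequence. The tuple encoding of instructions is an assumption of this sketch; a real pass would also have to respect control flow and register redefinition.

```python
# Detect duplicate build instructions and reuse the register that already
# holds the subterm; later references to the eliminated register are
# rewritten to the surviving one.

def share_rebuilds(instrs):
    seen = {}    # (sym, children) -> register already holding that subterm
    alias = {}   # eliminated register -> surviving register
    out = []
    for op, reg, sym, children in instrs:
        children = tuple(alias.get(c, c) for c in children)
        if op == "build":
            key = (sym, children)
            if key in seen:
                alias[reg] = seen[key]   # drop the duplicate build
                continue
            seen[key] = reg
        out.append((op, reg, sym, children))
    return out

# The example from the text: r3 rebuilds the same subterm as r2.
optimized = share_rebuilds([
    ("build", "r2", "f", ("r0", "r1")),
    ("build", "r3", "f", ("r0", "r1")),
    ("build", "r4", "g", ("r2", "r3")),
])
```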

Policy transformations: transformations that we do because we think the result will be better, but mostly because we believe that more opportunities for obvious transformations will appear. In this category, we have inlining and delayed construction.

4.1.3 Unnecessary Inspection of Input

An EM code program can contain a build instruction that is subsequently followed by a test for what was built. This situation very commonly occurs when a call has been replaced by a copy of the called code, that is, when we inline code. For example:

The obvious transformations are general, and are useful for all EM code programs. The policy transformations are justified by arguments about EM code programs generated from Equational Logic programs.

build[r1; f; (r0)]
...
branch[r1; ((f L0) (g L1) ...)]
L0: ...
L1: ...

4.1 Obvious Transformations

4.1.1 Useless Instructions

Perhaps the most obvious transformation is to eliminate instructions that create values in registers that are subsequently never used. A typical case involves unnecessary build instructions. Although the original program usually does not contain any unnecessary builds, they may appear as the result of other optimizations. For example:

Here we can eliminate all branches except one, since we know that r1 contains an f; the branch can be replaced by the code at L0:

build[r1; f; (r0)]
...
...

build[r1; a; ()]
build[r2; b; ()]
build[r3; f; (r1)]
return[r3]

Notice that this type of transformation often makes other instructions useless, thereby creating more opportunities for optimization.
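This branch specialization can be sketched as follows: knowing the root symbol from a preceding build, the branch is resolved at compile time and the instructions of the selected arm are spliced in. The encoding (instruction tuples, a label table) is an assumption of this sketch, which also ignores register redefinition.

```python
# Resolve branches whose tested register was just built with a known symbol,
# splicing in the instructions of the selected label.

def prune_branches(instrs, labels):
    known = {}   # register -> root symbol established by a build
    out = []
    for instr in instrs:
        if instr[0] == "build":
            _, reg, sym, _ = instr
            known[reg] = sym
            out.append(instr)
        elif instr[0] == "branch" and instr[1] in known:
            _, reg, branchlist = instr
            target = branchlist.get(known[reg], branchlist["default"])
            out.extend(labels[target])    # the chosen arm replaces the branch
        else:
            out.append(instr)
    return out

# The example from the text: r1 is known to contain an f.
optimized = prune_branches(
    [("build", "r1", "f", ("r0",)),
     ("branch", "r1", {"f": "L0", "g": "L1", "default": "L2"})],
    {"L0": [("return", "r1")], "L1": [], "L2": []},
)
```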

Here, a subterm is built into r2 and then not used. The corresponding build instruction can simply be eliminated. Notice that instructions other than build can be useless: for instance, a down instruction that puts the child of some subterm in a register that is subsequently not used. In fact, any instruction that generates a value in a register that is subsequently never used can be eliminated. Oftentimes such eliminations create the opportunity for more eliminations.

4.1.4 Unnecessary Traversal of Input

Frequently this case occurs: a subterm is built from parts located in registers, and then subsequently taken apart again. For example:

build[r1; f; (r0)]
...
down[r1; r2; 1]
branch[r2; (...)]

5

Here we can eliminate the down instruction:

Every final state of the automaton is a candidate for replacement with a specialized form of the automaton. These final states all contain a tail-recursive call to the start state of the automaton to continue reduction of the result term. Our method is to replace the call with the body called, in this case the full automaton. States and registers are renamed in the copy as appropriate, with the exception that the input registers are renamed to match the actual parameters of the call. We then perform the optimizations described in section 4. The build instructions that appear just before the recursive call in the original state are propagated into the body of the call, where they can be used to specialize branch instructions and replace unnecessary down instructions. Any build instructions that are not needed for the term returned as a result from the automaton are eliminated as useless instructions, so strictly intermediate structure is never actually constructed. An example of this transformation is shown in figures 2-4. Figure 2 shows the EM code program produced from the following equational program:

build[r1; f; (r0)]
...
branch[r0; (...)]

Such an elimination often renders other instructions unnecessary, giving us opportunities for further optimizations.
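This traversal elimination might be sketched as follows: when the children of a register are known from a build, a down on it becomes a simple register renaming applied to the following instructions. Again, the instruction encoding is an assumption of this sketch.

```python
# Replace down instructions on registers whose children are known from a
# build by a register substitution in the following instructions.

def subst_elt(x, subst):
    if isinstance(x, str):
        return subst.get(x, x)
    if isinstance(x, tuple):
        return tuple(subst.get(c, c) for c in x)
    return x

def eliminate_downs(instrs):
    children = {}   # register -> child registers, known from a build
    subst = {}      # register written by an eliminated down -> known child
    out = []
    for instr in instrs:
        instr = tuple(subst_elt(x, subst) for x in instr)
        if instr[0] == "build":
            _, reg, _, cs = instr
            children[reg] = cs
            out.append(instr)
        elif instr[0] == "down" and instr[1] in children:
            _, src, dst, n = instr
            subst[dst] = children[src][n - 1]   # reuse the known child
        else:
            out.append(instr)
    return out

# The example from the text: the down on r1 is removed and the branch
# inspects r0 directly.
optimized = eliminate_downs([
    ("build", "r1", "f", ("r0",)),
    ("down", "r1", "r2", 1),
    ("branch", "r2", ("...",)),
])
```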


4.2 Policy Transformations

Policy transformations are much harder to justify than obvious ones. These transformations also involve heuristic decisions based on the structure of typical EM code programs. We have based these heuristics upon our experience with Equational Programming, but they probably hold in general. Currently, we have two policy transformations:

- Recursive calls to the pattern matcher should sometimes be replaced by a copy of the pattern matcher. This copy will then be specialized according to known contents of machine registers.

- Explicit creation of subterms should be delayed as long as possible, hopefully forever. This corresponds to pushing build instructions as close to the leaves of the EM code program as possible (hopefully off the end).

Whereas the second transformation is local, i.e. operates by interchanging two adjacent instructions, the first one is not: it requires a call instruction to be replaced by an entire tree of instructions.

square(x) = multiply(x,x)
    where x is in integer_numerals
end where;

cube(x) = multiply(x, square(x));



This example shows one application of partial evaluation, in state s3. We have thus turned the nontrivial problem of performing partial evaluation of Equational Programs into a simple set of transformations on the EM code program. In the next section, we implement these transformations with an elegant set of rewrite rules, and discuss some practical issues.

5 Partial Evaluation through Term-rewriting

6 Implementing the Rules Using Local Term-Rewriting

The beauty of our approach is that we use the optimizations described in section 4 to perform the partial evaluation described in section 2. The object of the partial evaluation described in section 2 was to produce a specialized automaton that avoids construction of intermediates and avoids reinspection of "known" structure. These are exactly the optimizations described in sections 4.1.1, 4.1.3, and 4.1.4. Specialization is done by pruning branch instructions; intermediates are not constructed when their entire use is for specialization and traversal; and reinspection is avoided because unneeded down instructions are removed. The optimization in section 4.1.2 is not used directly for partial evaluation, although the subterm sharing that it provides can make it easier to perform the other optimizations.

In this section, we describe how to implement the transformations with local rewrite rules. We have a sample implementation that demonstrates the utility and practicality of the method. One further advantage of our method is that the soundness of the transformations is easy to prove based on the semantics of the instructions. It remains to be proven that our method is sufficient for all of the optimizations that we want to do.

6.1 Knowledge About the Term

To implement our method with simple term-rewriting rules, we make explicit certain knowledge about the state of the pattern matcher and propagate it through the EM code program. Such knowledge is of three kinds:

bind knowledge, bind[rx; σ; (r1 ... rn)]: register rx contains a subterm, the root symbol of which is σ and whose children are (r1 ... rn);

partial bind knowledge, bind[rx; σ; ⊥]: register rx contains a subterm, the root symbol of which is σ and whose children are unknown;

equivalence knowledge, equiv[rx; ry]: register rx contains a subterm equivalent to the one contained in ry.

Our initial knowledge is generated by instructions in the program: a build instruction generates bind knowledge; a replace instruction generates equivalence knowledge; a branch instruction generates partial bind knowledge. We can also derive knowledge from the interaction of bind and equivalence knowledge, but we have not fully investigated the ways in which this can occur. One interaction that we must be concerned with is the fact that a replace instruction destroys any bind knowledge that we might have had about the replaced node, substituting equivalence knowledge about that node.

s0: branch[r0; (("multiply" s4)("cube" s3)("square" s2)(default s1))]
s1: return[r0]
s3: down[r0; r1; 1]
    build[r2; "square"; (r1 r1)]
    build[r3; "multiply"; (r1 r2)]
    call[s0; r4; (r3)]
    replace[r4; r0]
    return[r4]
s4: down[r0; r8; 1]
    down[r0; r9; 2]
    build[r10; integer; ( r8*r9 )]
    replace[r10; r0]
    return[r10]
s2: down[r0; r5; 1]
    branch[r5; (("integer" s6)(default s5))]
s5: return[r0]
s6: build[r6; "multiply"; (r5 r5)]
    call[s0; r7; (r6)]
    replace[r7; r0]
    return[r7]

Figure 2: EM code program

s0:   branch[r0; (("multiply" s4)("cube" s3)("square" s2)(default s1))]
s1:   return[r0]
s3:   down[r0; r1; 1]
      build[r2; "square"; (r1 r1)]
      build[r3; "multiply"; (r1 r2)]
s100: branch[r3; (("multiply" s104)("cube" s103)("square" s102)(default s101))]
s101: replace[r3; r0]
      return[r3]
s103: down[r3; r101; 1]
      build[r102; "square"; (r101 r101)]
      build[r103; "multiply"; (r101 r102)]
      call[s0; r104; (r103)]
      replace[r104; r3]
      replace[r104; r0]
      return[r104]
s104: down[r3; r108; 1]
      down[r3; r109; 2]
      build[r110; integer; ( r108*r109 )]
      replace[r110; r0]
      return[r110]
s102: down[r3; r105; 1]
      branch[r105; (("integer" s106)(default s105))]
s105: replace[r3; r0]
      return[r3]
s106: build[r106; "multiply"; (r105 r105)]
      call[s0; r107; (r106)]
      replace[r107; r3]
      replace[r107; r0]
      return[r107]
s4:   down[r0; r8; 1]
      down[r0; r9; 2]
      build[r10; integer; ( r8*r9 )]
      replace[r10; r0]
      return[r10]
s2:   down[r0; r5; 1]
      branch[r5; (("integer" s6)(default s5))]
s5:   return[r0]
s6:   build[r6; "multiply"; (r5 r5)]
      call[s0; r7; (r6)]
      replace[r7; r0]
      return[r7]

Figure 3: After expanding the call in s3.
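The knowledge records of section 6.1 can be made concrete with a small pass that derives facts from a straight-line instruction sequence. Treating each non-default branch arm as producing a partial bind fact is a simplification (that knowledge is really only valid inside the arm), and the instruction encoding is an assumption of this sketch.

```python
# Derive bind, partial bind, and equivalence knowledge from an EM code
# fragment, modeling the interaction in which a replace destroys bind
# knowledge about the replaced node.

def derive_knowledge(instrs):
    facts = []
    for instr in instrs:
        op = instr[0]
        if op == "build":                 # build generates bind knowledge
            _, reg, sym, cs = instr
            facts.append(("bind", reg, sym, tuple(cs)))
        elif op == "branch":              # each arm yields partial bind knowledge
            _, reg, branchlist = instr
            for sym in branchlist:
                if sym != "default":
                    facts.append(("partial-bind", reg, sym))
        elif op == "replace":             # replace destroys bind knowledge about
            _, r2, r1 = instr             # r1, substituting equivalence knowledge
            facts = [f for f in facts if not (f[0] == "bind" and f[1] == r1)]
            facts.append(("equiv", r2, r1))
    return facts

facts = derive_knowledge([
    ("build", "r2", "f", ("r0", "r1")),
    ("replace", "r2", "r0"),
])
```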

6.2 Local Transformations

We perform the latter three obvious transformations, which avoid rebuilding, unnecessary inspection, and unnecessary traversal, with three simple rewrite rules.

s0:   branch[r0; (("multiply" s4)("cube" s3)("square" s2)(default s1))]
s1:   return[r0]
s3:   down[r0; r1; 1]
      build[r2; "square"; (r1 r1)]
s104: build[r110; integer; ( r1*r2 )]
      replace[r110; r0]
      return[r110]
s4:   down[r0; r8; 1]
      down[r0; r9; 2]
      build[r10; integer; ( r8*r9 )]
      replace[r10; r0]
      return[r10]
s2:   down[r0; r5; 1]
      branch[r5; (("integer" s6)(default s5))]
s5:   return[r0]
s6:   build[r6; "multiply"; (r5 r5)]
      call[s0; r7; (r6)]
      replace[r7; r0]
      return[r7]

Figure 4: After optimization.

instruction: since the following instruction does not depend upon a result constructed by the first, the first can move past without changing the effect of the program. If these rules are applied bottom-up from the return instructions, then all useless instructions will migrate down to the bottom and fall off of the instruction tree. These rules assume that all knowledge has first been propagated by the above rules. Under certain conditions, replace instructions can also migrate. This is important in our application because build instructions can get stuck behind them and therefore not be delayed as long as actually possible. Exactly how replace instructions can safely move is a topic of current research. If all subject terms are acyclic, it is likely that replace instructions can always migrate to just before the return instructions. The remaining transformation is the policy transformation of code inlining, or beta-expansion. This method was outlined in [Str88]. Whether this can be regarded as a local transformation is open to interpretation.

The fourth rule is to take into account the interaction between bind knowledge and replace instructions.
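The code-migration idea of section 6.2 (an instruction that creates a register not used by its successor switches places with it, so useless instructions migrate toward the end and fall off) can be sketched for a straight-line sequence. Only build and down instructions migrate in this sketch, since replace and return have observable effects; the instruction encoding is an assumption.

```python
# Sink build/down instructions toward their first use; instructions whose
# result is never used fall off the end and are discarded.

def created(instr):
    op = instr[0]
    if op == "build":
        return instr[1]
    if op == "down":
        return instr[2]
    return None

def used(instr):
    op = instr[0]
    if op == "build":
        return set(instr[3])
    if op == "down":
        return {instr[1]}
    if op in ("branch", "return"):
        return {instr[1]}
    if op == "replace":
        return {instr[1], instr[2]}
    return set()

def sink_useless(instrs):
    out = []
    pending = []                        # creators still waiting for a user
    for instr in instrs:
        needs = used(instr)
        still = []
        for p in pending:               # flush creators this instruction uses
            if created(p) in needs:
                out.append(p)
            else:
                still.append(p)
        pending = still
        if created(instr) is not None:
            pending.append(instr)       # let it migrate toward its first use
        else:
            out.append(instr)
    return out                          # creators left in `pending` were useless

# The useless-build example from section 4.1.1: r2 is never used.
optimized = sink_useless([
    ("build", "r1", "a", ()),
    ("build", "r2", "b", ()),
    ("build", "r3", "f", ("r1",)),
    ("return", "r3"),
])
```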

6.3 Completeness

bind[ri; σ; (r1 ... rn)]      bind[ri; σ; (r1 ... rn)]
build[rj; σ; (r1 ... rn)]  →  equiv[ri; rj]

bind[ri; σ; ∗]                                       bind[ri; σ; ∗]
branch[ri; ((default l0)(s1 l1) ... (sm lm))]     →  the code after label li, where si = σ

bind[ri; ∗; (r1 ... rn)]      bind[ri; ∗; (r1 ... rn)]
down[ri; rj; k]            →  equiv[rj; rk]

It would be desirable to show that our set of rewrite rules is complete, that is, that it is able to handle all of the optimization that we need. Unfortunately, we do not know whether this is the case. Only extensive experimentation with numerous real programs as input will give us an idea as to whether additional rules are needed. It is clear, however, that intermediate terms have to be constructed in certain cases in order to avoid extremely large intermediate programs.

6.4 Practical Issues

An obvious practical issue is that the bodies of inlined call instructions can be quite large, and optimizing them can take a lot of time. Furthermore, we need to be careful that we do not optimize equivalent subprograms over and over. A characteristic of the Equational Logic Programming system that we have ignored up to this point is that it uses lazy evaluation. In fact, the body of the inlined call is produced incrementally, and pruned branches are never produced, let alone optimized. The second problem is addressed by other additions to the system, specifically common subterm sharing and, on the more sophisticated side, the dynamic sharing of equivalent terms described in [She90]. One additional idea would be to perform inlining with the specialized automaton instead of the original, but this idea requires a much more careful analysis, as well as deadlock prevention.

bind[ri; ∗; ∗]       replace[rt; ri]
replace[rt; ri]   →  equiv[rt; ri]

These rules are embodied in a simple equational logic program that operates upon EM code programs represented as terms. The first obvious transformation, eliminating useless instructions, and the policy transformation of delayed construction are performed by code migration rewrite rules. For each EM code instruction in the program, compare it with the following instruction. If the first instruction is a build, down, or call instruction and creates a register used in the second instruction, it stops moving. Otherwise, it switches places with the following

We are watching both of these issues carefully, and hope soon to have a quantifiable idea of the cost of our method at compilation time.

we don't in fact do any extra reductions. This is one reason why using partial evaluation is superior to arbitrarily composing the rules in the term-rewriting system to skip intermediates. We know that the time is quantitatively less, but it is interesting to ask whether the running time is qualitatively less. This question can only be answered by empirical evidence with real programs. An earlier implementation of this technique in [Str88] ran comparably to compiled Lisp and within a factor of two of C for a non-lazy, numerically-intensive problem.¹ The amount of work actually saved depends greatly upon the application. While only a bounded amount of work is saved by each partial evaluation step, this effect can have a large impact on highly repetitive programs. If we chose to do further partial evaluation steps in the example of figure 4, for example, we would produce special cases for deeper and deeper nesting of squares without an unnecessary type check operation in every step; over the course of an execution we save quite a lot of work. We do see larger programs, of course; effectively, we have traded program size for speed. In the worst case we can see a quadratic increase in the size of the program code, although on the practical side we have never seen an increase of more than a linear factor. Containing the size of the program is probably a heuristic choice, in that partial evaluation steps that increase the program size by a great deal are not pruning their branches very much, and thus are probably not saving us that much in execution speed.

7 Evaluation

7.1 Termination Properties

As with any procedure that manipulates programs in a nontrivial way, we have to be concerned with termination. Since inlining of procedure calls is possible, we risk ending up in an infinite computation while performing these transformations. To avoid such infinite computations we need termination criteria that decide when to stop the inlining. Such criteria take the form of heuristics that must absolutely be pessimistic, that is, that make sure that the replacement procedure always halts. We are currently experimenting with different kinds of heuristics. A method that seems to work well on real programs is to mark each instruction with a level. Initially every instruction is marked with 0. Inlining of the initial program (which contains all level 0 instructions) by replacing a call at level i produces instructions at level i+1. We simply never generate instructions with a level greater than some fixed n. Practical experiments have shown that n = 3 is a good tradeoff between the size of the expanded program and the desire to eliminate as many intermediate terms as possible. To show termination, it suffices to notice that there is a fixed number of call instructions to expand in the first place, and that each inline step destroys a call instruction at level i and produces a finite number of calls at level i+1.
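The level-marking criterion can be sketched directly: each instruction of the initial program is at level 0, expanding a call at level i produces a copy of the callee at level i+1, and calls at the cutoff level are left in place, which guarantees termination even for recursive programs. Renaming of states and registers is omitted from this sketch, and the encoding is assumed.

```python
# Expand call instructions up to a fixed level cutoff (n = 3 in the paper's
# experiments); deeper calls are kept as ordinary calls.

def inline_with_levels(program, label, level=0, cutoff=3):
    """program maps labels to instruction lists; returns a flat expansion."""
    out = []
    for instr in program[label]:
        if instr[0] == "call" and level < cutoff:
            out.extend(inline_with_levels(program, instr[1], level + 1, cutoff))
        else:
            out.append(instr)
    return out

# A self-recursive state: expansion stops after `cutoff` levels.
expanded = inline_with_levels({"s0": [("work",), ("call", "s0")]}, "s0", cutoff=2)
```

Termination follows because each recursive expansion strictly increases the level, which is bounded by the cutoff, and each program body is finite.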

8 Correctness

We need to develop a full proof of the correctness and usefulness of our transformations. The following are some ideas along these lines. The proof of the soundness of our partial evaluation method is straightforward. The specialized automaton represents the original term-rewriting system, augmented with derived rules that follow from the original rules by composition. One can thus show that normal forms are preserved. Showing that our implementation method implements this partial evaluation is equally straightforward. Attaching a copy of the full automaton certainly gives a correct result: it is nothing more than inlining. Pruning every branch and removing every traversal for which we already know the answer follows directly from the knowledge propagation described in section 6, and the correctness follows from the semantics of EM code programs.

7.2 Time/Space Tradeoff

The space required for the execution of a particular reduction to normal form has clearly decreased. A given partial evaluation step always replaces two reduction steps with one, and performs no more constructions than those two steps would separately. Note that we only remove constructions, and then only when they are provably unnecessary. The running time for the execution of a reduction sequence will decrease, by a similar argument: a given reduction step in the specialized automaton does no more than the steps in the original from which it was constructed. Again, we only remove work, and then only when it is provably redundant. We must be careful, though, as we may be forcing the program to do two steps when the lazy semantics of the language would only require one. But we only expand calls that would have to be executed anyway, that is, calls necessary for the current reduction to head-normal form, so

¹ That is, a problem in which the unique features of Equational Logic programming are not exercised, and which exercises areas in which our implementations have historically been weak.


One approach to showing that delaying construction is correct is to consider the output of an EM code program, where the output is the returned value plus the side-effects that occur in the subject term. We have not worked out the details, but the simplicity of the transformation rules should be of great assistance. What remains to be shown is that the combination of these transformations produces the optimizations that we want. A possible proof that our rules produce all of the desired transformations would be: first, to show that our initial knowledge is complete; second, to show that each rule correctly takes advantage of the knowledge and propagates the appropriate old and new knowledge; and third, to show that the useful transformations can be expressed entirely in terms of knowledge propagation. Critical to such an approach is that all useful transformations can be done using only local knowledge.

9 Conclusions and Further Work

We have presented a novel application of partial evaluation to the problem of optimizing programs implemented with term-rewriting systems. The main idea is to regard intermediate steps in the reduction sequence as partial input to the next pattern-matching step. This partial input is used to produce a specialized form of the pattern-matcher that avoids construction and reinspection of strictly intermediate terms.

We implement this partial evaluation by performing general-purpose optimizations upon a copy of the pattern-matching automaton. This optimized automaton is then used in place of the more general one. We perform these optimizations with an elegant set of local rewriting rules that operate upon a term representation of the automaton. The simplicity of these rules is made possible by our use of an intermediate language called EM code.

Our method improves upon previous work, notably [Str88], by being more general, easier to implement, and easier to prove sound. We believe that this method will produce better code than any other optimization method that we have considered so far. We are currently working to make the optimizer as fast as the experimental version described in [Str88].

There is much more work to do on this topic. Most importantly, we need to know whether all of the optimizations that we want to perform are expressible in our method. We also need to work out carefully the details of optimizations based upon builtin and other arithmetic operations, optimizations that are necessary to produce very high-quality assembly code. We also need a complete proof of correctness and a qualitative time and space analysis. The final test, of course, is to run real programs optimized with this method. We are currently integrating our optimizations into a new release of the Equational Logic Programming system, which can produce highly portable C as well as assembly code for real machines. It will be interesting to compare the quality of the resulting code with compiled Lisp and C.

References

[BAOE76] L. Beckman, A. Haraldsson, O. Oskarsson, and E. Sandewall. A partial evaluator, and its use as a programming tool. Artificial Intelligence, (7):319–357, 1976.

[Bon89] Anders Bondorf. A self-applicable partial evaluator for term-rewriting systems. In International Joint Conference on the Theory and Practice of Software Development (TAPSOFT), volume 2. Springer-Verlag, 1989.

[DS90] Irène Durand and Robert Strandh. A decision procedure for forward-branching equational programs. Technical Report 05-90, GRECO Programmation, 1990.

[Har77] A. Haraldsson. A Program Manipulation System Based on Partial Evaluation. PhD thesis, Department of Mathematics, Linköping University, Linköping, 1977.

[HO82] Christoph Hoffmann and Michael J. O'Donnell. Programming with equations. ACM Transactions on Programming Languages and Systems, pages 83–112, 1982.

[HOS85] C. Hoffmann, M. J. O'Donnell, and R. Strandh. Programming with equations. Software, Practice and Experience, 1985.


[JSS85] N. D. Jones, P. Sestoft, and H. Søndergaard. An experiment in partial evaluation: The generation of a compiler generator. Technical report, Institute of Datalogy, University of Copenhagen, Copenhagen, 1985.

[O'D85] Michael J. O'Donnell. Equational Logic as a Programming Language. MIT Press, 1985.

[She90] David J. Sherman. Lazy directed congruence closure. Technical Report 90-028, University of Chicago Department of Computer Science, 1990.

[SS90] David J. Sherman and Robert I. Strandh. An abstract machine for efficient implementation of term rewriting. Technical Report 90-012, University of Chicago Department of Computer Science, 1990.

[Str88] Robert I. Strandh. Compiling Equational Programs into Efficient Machine Code. PhD thesis, Johns Hopkins University, Baltimore, Maryland, 1988.

[Str89] Robert I. Strandh. Classes of equational programs that compile into efficient machine code. In Proceedings of the Third International Conference on Rewriting Techniques and Applications, 1989.

[Wad88] Philip Wadler. Deforestation: Transforming programs to eliminate trees. In Second European Symposium on Programming. Springer-Verlag, 1988.
