Term Graph Rewriting

Jan Willem Klop*

CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands, and Vrije Universiteit, Department of Mathematics and Computer Science, de Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands

Abstract. We discuss some aspects of term graph rewriting based on systems of recursion equations. This is done for first-order signatures as well as lambda calculus. Also relations with infinitary rewriting are discussed.
0. Introduction. In this paper we will discuss in an informal way some aspects of term graph rewriting. We will indicate why this subject falls in the scope of higher-order rewriting, and thus in the scope of the present workshop. Our discussion will be loosely structured by a numbered sequence of keywords and phrases, thereby following a talk given at the workshop. In the theory of rewriting, term graph rewriting is a relatively new development, prompted by the actual practice in functional language implementations where subterm sharing is a matter of routine. There are several approaches to a theoretical foundation of graph rewriting, of which an important one is that based on category theory and single or double push-outs. We will not discuss this approach here, but instead refer to [SPvE93] where several references to the categorical treatment can be found.

1. The Equational Approach to Term Graph Rewriting. We will advocate the equational approach, which starts from term graphs as systems of recursion equations. An example of such a system is $\langle \alpha \mid \alpha = F(\beta, \gamma),\ \beta = G(\alpha),\ \gamma = H(\alpha, \alpha) \rangle$, where $\alpha$ is the 'root' variable, $G$ a unary function symbol, and $F$ and $H$ are binary function symbols from a first-order signature. We use the terminology 'term graph' since the graphs that we are concerned with look locally the same as terms (first-order terms or, later on, lambda terms). This means that in pictures like Fig. 1, the outgoing arcs of a node are in fact ordered from left to right; this is only suggested in the pictures but not made explicit. The nodes in a term graph are locations that in figures are literally filled with operator symbols; nodes will in general have names $\alpha, \beta, \gamma, \ldots$, but we will also admit unnamed nodes. (Thus, also ordinary terms are covered in our treatment; and term graphs may have parts that are 'term-like'.) As Fig.
1 shows, our term graphs may contain cycles. Term graphs as just introduced are already studied in [CKV74] under the name 'systems of fixed-point equations'; they are a simple form of recursive program schemes. We will use the phrases 'recursion system', 'term graph', 'system of recursion equations' interchangeably. It is understood that the recursion variables (or node names) are subject to renaming ($\alpha$-conversion) just as in $\lambda$-calculus. So $\langle \alpha \mid \alpha = F(\alpha) \rangle$ is identical to $\langle \beta \mid \beta = F(\beta) \rangle$.

* This work was partially supported by ESPRIT Working Group 6345 Semagraph, and by ESPRIT BRA 6454 CONFER.
Fig. 1.
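To make the equational reading concrete, the following is a minimal Python sketch of a recursion system as a data structure, together with renaming of recursion variables and a check for cycles. The encoding is an assumption made for illustration only and is not taken from the paper: a recursion variable is a string, an operator application is a tuple whose first component is the operator symbol, and a system is a pair of a root variable and a dictionary of equations. The names `rename_system` and `has_cycle` are hypothetical.

```python
# Hypothetical encoding (not from the paper):
#   recursion variable  -> str, e.g. "a"
#   operator application -> tuple (symbol, arg1, ..., argk), e.g. ("F", "a")
#   recursion system     -> (root_variable, {variable: right-hand side})

def rename(term, mapping):
    """Rename recursion variables inside a single term."""
    if isinstance(term, str):
        return mapping.get(term, term)
    return (term[0],) + tuple(rename(arg, mapping) for arg in term[1:])

def rename_system(system, mapping):
    """Rename recursion variables throughout a system (root, left- and right-hand sides)."""
    root, eqs = system
    return (mapping.get(root, root),
            {mapping.get(v, v): rename(t, mapping) for v, t in eqs.items()})

def variables(term):
    """Set of recursion variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= variables(arg)
    return found

def has_cycle(system):
    """True if some named node can reach itself by following arcs."""
    _, eqs = system

    def reachable(start):
        seen, todo = set(), list(variables(eqs[start]))
        while todo:
            v = todo.pop()
            if v not in seen:
                seen.add(v)
                if v in eqs:
                    todo.extend(variables(eqs[v]))
        return seen

    return any(v in reachable(v) for v in eqs)

# The two systems that the text identifies up to renaming.
loop_a = ("a", {"a": ("F", "a")})
loop_b = ("b", {"b": ("F", "b")})
```

With this encoding, `rename_system(loop_a, {"a": "b"})` yields `loop_b`, which mirrors the identification of the two systems up to renaming, and `has_cycle(loop_a)` holds because the single node points back to itself.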
2. Basic Transformations: Substitution, Copying, Garbage Collection. Having introduced recursion systems as in (1), certain operations on these systems suggest themselves naturally. The first is substitution, denoted by $\to_s$; it consists of replacing an occurrence of a recursion variable by its right-hand side. E.g., $\langle \alpha \mid \alpha = F(\alpha) \rangle \to_s \langle \alpha \mid \alpha = F(F(\alpha)) \rangle$. Here the last term graph has an unnamed node, namely the one containing the second occurrence of $F$. Another basic operation is that of copying, denoted by $\to_c$. It consists of creating copies $\alpha', \alpha'', \ldots$ of a node $\alpha$ and using these new names accordingly. E.g., $\langle \alpha \mid \alpha = F(\alpha) \rangle \to_c \langle \alpha \mid \alpha = F(\alpha'),\ \alpha' = F(\alpha''),\ \alpha'' = F(\alpha') \rangle$. More precisely, a copying step is the inverse of a step in which some recursion variables are identified, or collapsed together; identifying $\alpha, \alpha', \alpha''$ yields again $\langle \alpha \mid \alpha = F(\alpha) \rangle$. A third basic operation is naming, denoted by $\to_n$, consisting of giving a name (a recursion variable) to a previously unnamed subexpression. Thus $\langle \alpha \mid \alpha = F(F(\alpha)) \rangle \to_n \langle \alpha \mid \alpha = F(\beta),\ \beta = F(\alpha) \rangle$. A transformation that is even more 'a priori' than the operations $\to_c$, $\to_s$, $\to_n$ is that of garbage collection, notation $\to_{gc}$, consisting of removing parts that are inaccessible from the root. E.g.,
$\langle \alpha \mid \alpha = F(\beta),\ \gamma = F(\alpha) \rangle \to_{gc} \langle \alpha \mid \alpha = F(\beta) \rangle$.
(The last system has a free variable $\beta$. In the corresponding picture there will be an empty node with name $\beta$. Cf. Fig. 3.) There are some simple relations between $\to_c$, $\to_s$, $\to_n$. E.g.: $\to_c \cup \to_s \;=\; \to_c \circ \to_n^{-1}$.
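The substitution, naming and garbage collection steps described in this section can be sketched in Python as follows. This is an illustrative sketch under assumptions that are not part of the paper: recursion variables are strings, operator applications are tuples headed by the operator symbol, a system is a pair of a root variable and a dictionary of equations, and an occurrence is addressed by a hypothetical `path` of 0-based argument positions. Copying is left out to keep the sketch short.

```python
# Hypothetical encoding (not from the paper): variables are str,
# applications are tuples (symbol, arg1, ..., argk),
# a system is (root_variable, {variable: right-hand side}).

def variables(term):
    """Set of recursion variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= variables(arg)
    return found

def subterm(term, path):
    """Subterm at a path of 0-based argument positions."""
    for i in path:
        term = term[i + 1]  # index 0 holds the operator symbol
    return term

def replace_at(term, path, new):
    """Copy of term with the subterm at path replaced by new."""
    if not path:
        return new
    i = path[0] + 1
    return term[:i] + (replace_at(term[i], path[1:], new),) + term[i + 1:]

def substitute(system, var, path):
    """Substitution step: unfold the variable occurrence at path in var's equation."""
    root, eqs = system
    occurrence = subterm(eqs[var], path)
    assert isinstance(occurrence, str) and occurrence in eqs
    new_eqs = dict(eqs)
    new_eqs[var] = replace_at(eqs[var], path, eqs[occurrence])
    return (root, new_eqs)

def name(system, var, path, fresh):
    """Naming step: give the fresh variable to the unnamed subterm at path."""
    root, eqs = system
    sub = subterm(eqs[var], path)
    assert path and not isinstance(sub, str) and fresh not in eqs
    new_eqs = dict(eqs)
    new_eqs[var] = replace_at(eqs[var], path, fresh)
    new_eqs[fresh] = sub
    return (root, new_eqs)

def garbage_collect(system):
    """Garbage collection: drop equations not reachable from the root."""
    root, eqs = system
    seen, todo = set(), [root]
    while todo:
        v = todo.pop()
        if v not in seen:
            seen.add(v)
            if v in eqs:  # free variables have no equation
                todo.extend(variables(eqs[v]))
    return (root, {v: t for v, t in eqs.items() if v in seen})

loop = ("a", {"a": ("F", "a")})
unfolded = substitute(loop, "a", (0,))      # a = F(F(a))
named = name(unfolded, "a", (0,), "b")      # a = F(b), b = F(a)
collected = garbage_collect(("a", {"a": ("F", "b"), "c": ("F", "a")}))
```

The three results correspond to the substitution, naming and garbage collection examples in the text; in `collected`, the variable `"b"` stays free because it has no equation.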
A further remark on naming: nodes may be unnamed, but only when they have in-degree (number of incoming arrows) 1. If the in-degree is 2 or more, then that node must have a name; otherwise the system could not be written down. To see this, note that, intuitively, there is an isomorphism between the term graph as picture and the term graph as expression in the linear format that we introduced. In particular, the picture has the same number of occurrences of function symbols as the corresponding expression. Now consider the term graph $\langle \alpha \mid \alpha = F(\beta, \beta),\ \beta = C \rangle$, and especially the picture corresponding to it. The node named $\beta$ has in-degree 2. Suppose that this name $\beta$ is removed. Then there is no way of writing down a term graph expression for this graph picture while maintaining the isomorphism between expression and picture that we strive for. This discussion touches on something that is vital for term graph rewriting: namely, how many times parts of a graph are allowed to be used. A formal calculus dealing with this issue of 'unique resources' is developed in [BS93].
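The in-degree condition above can be checked mechanically. The following Python sketch counts, for each recursion variable, how often it occurs in the right-hand sides of a system, and reports the variables whose names cannot be dropped. The encoding (strings for variables, tuples headed by an operator symbol for applications, a pair of root and equation dictionary for a system) and the function names `in_degrees` and `must_be_named` are assumptions made for illustration, not notation from the paper.

```python
from collections import Counter

# Hypothetical encoding (not from the paper): variables are str,
# applications are tuples (symbol, arg1, ..., argk),
# a system is (root_variable, {variable: right-hand side}).

def occurrences(term):
    """List of variable occurrences in a term, repeats included."""
    if isinstance(term, str):
        return [term]
    found = []
    for arg in term[1:]:
        found.extend(occurrences(arg))
    return found

def in_degrees(system):
    """Number of incoming arcs of each named node, counted over all equations."""
    _, eqs = system
    count = Counter()
    for rhs in eqs.values():
        count.update(occurrences(rhs))
    return count

def must_be_named(system):
    """Variables with in-degree 2 or more; their names cannot be removed."""
    return {v for v, n in in_degrees(system).items() if n >= 2}

# The example from the text: the shared node has two incoming arcs.
shared = ("a", {"a": ("F", "b", "b"), "b": ("C",)})
```

For `shared`, the node `"b"` has in-degree 2, so `must_be_named(shared)` contains it; inlining `"b"` would duplicate the occurrence of `C` and break the correspondence between picture and expression.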
3. Black Hole. Term graphs as introduced thus far have the form $\langle \alpha_0 \mid \alpha_0 = t_0, \ldots, \alpha_n = t_n \rangle$ where the $\alpha_i$ are pairwise different recursion variables and the $t_i$ are terms over the first-order signature at hand; the $t_i$ should start with an operator symbol, i.e., they are not recursion variables. So equations $\alpha = \beta$ in the right-hand side of the $\langle\ \mid\ \rangle$-construct are not allowed. However, such equations may arise after rewriting, which has not yet been introduced. For instance, consider CL (Combinatory Logic) with its collapsing rewrite rule $I x \to x$. Applied to $\langle \alpha \mid \alpha = I(\alpha) \rangle$ we may rewrite this to