Path Rewriting and Combined Word Problems

Camillo Fiorentini and Silvio Ghilardi∗

Technical Report n. 250-00
Department of Computer Science, Università degli Studi di Milano
via Comelico 39, 20135 Milano – Italy
Email: {fiorenti, ghilardi}@dsi.unimi.it

May 2000

Abstract. We give an algorithm solving combined word problems (over not necessarily disjoint signatures) based on rewriting of equivalence classes of terms. The canonical rewriting system we introduce consists of a few transparent rules and is obtained by applying the Knuth-Bendix completion procedure to presentations of pushouts among categories with products. It applies to pairs of theories which are both constructible over their common reduct (on which we do not make any special assumption).



∗ Work carried out within the MURST project "Logica".


1 Introduction

An essential problem in automated deduction consists in integrating theorem provers which are able to perform separate tasks. In the field of equational logic, this leads in particular to the following question: suppose you are able to solve the word problems for theories T1, T2; can you solve the word problem for T1 ∪ T2? Better, can you design an algorithm taking as input two arbitrary algorithms for the word problems of T1 and T2 and realizing a decision procedure for the word problem of T1 ∪ T2? In case T1, T2 have disjoint signatures the positive answer has been known for a long time [12], although it was only more recently rediscovered within the automated deduction community (see e.g. [11]). In the general case, combining decidable word problems may lead to undecidability, even if we suppose that T1, T2 are both conservative over their common reduct T0. To see this, consider the following example. Let T0 be the theory of join-semilattices with zero (i.e. of commutative idempotent monoids) and let T1 be the theory of Boolean algebras. As T2 we take the theory of semilattice-monoids, which are algebras having both a monoid and a join-semilattice-with-zero structure and which satisfy the further equation:

(x1 ∨ · · · ∨ xn) ◦ (y1 ∨ · · · ∨ ym) = ⋁_{i=1}^{n} ⋁_{j=1}^{m} (xi ◦ yj).
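This distributivity axiom can be checked concretely: the free T2-algebras can be represented as finite sets of lists of generators, with join as union and ◦ as pairwise concatenation (a representation also used below). A minimal Python sketch; the encoding is our own illustrative choice, not from the paper:

```python
# Elements of the free T2-algebra: frozensets of tuples (lists of
# generators).  Join is union, the monoid product is pairwise
# concatenation, so the product distributes over join by construction.
from functools import reduce

def join(a, b):
    return a | b

def prod(a, b):                       # pairwise concatenation of lists
    return frozenset(s + t for s in a for t in b)

xs = [frozenset({('a',)}), frozenset({('b',), ('a', 'b')})]
ys = [frozenset({('c',)}), frozenset({()})]

# (x1 v x2) o (y1 v y2)  ==  (x1 o y1) v (x1 o y2) v (x2 o y1) v (x2 o y2)
lhs = prod(reduce(join, xs), reduce(join, ys))
rhs = reduce(join, [prod(x, y) for x in xs for y in ys])
assert lhs == rhs
```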

T2 clearly has a decidable word problem (its free algebras are finite sets of lists of the generators), and so does T1. The union theory (which we prefer to denote by T1 +T0 T2) corresponds to the 'distributive linear logic' of [8] and falls within the undecidability results of [1]. Clearly something must be assumed in order to have a positive solution to combined word problems; in the literature it is usually assumed that T1, T2 share a set of constructors (we prefer the terminology 'they are both constructible over T0'). There are various definitions of constructors and, depending on such definitions, there are results of varying strength. The main papers on the subject are [5] and [3, 4]: the second has a weaker definition and consequently a stronger result. Our definition is again weaker (see Section 10 for details) and, more importantly, it covers natural mathematical examples and does not make any strong assumption on T0 (in [5] T0 is assumed to be free, in [3, 4] to be collapse-free).¹ [5] and [3, 4] use quite different methods: in [3, 4] the combined decision algorithm is obtained through a refutation technique manipulating equations according to certain non-deterministic rules. As such it has the advantage of being more flexible, although it does not provide normal forms. On the contrary, [5] (and the similar method of [11] for the disjoint case) directly manipulates terms by abstracting and collapsing alien subterms, and the suggested algorithm follows a complex and rigidly

¹ Recall that an equational theory is said to be collapse-free iff it cannot prove equations of the kind x = t, where x is a variable and t is a non-variable term.


preassigned procedure. Our method is more similar to that of [5] (in the sense that it manipulates terms), but has the same flexibility advantages as the method of [3, 4]. The idea is simple: we build a canonical rewriting system which is able to normalize paths of mixed pure terms. The realization of such a plan looks very hard at first glance: terms from combined signatures are quite unreliable datatypes, basically because they can compose, decompose and even collapse in many uncontrolled and overlapping ways. However, we shall put this complex combinatorics under control within the framework provided by the categorical approach to equational logic; this approach goes back to the classical pioneering paper of F.W. Lawvere [9] on functorial semantics.² Basically, equational theories are identified with categories with products, so that in our situation we need to manipulate presentations of pushouts among such categories. We get a first general and simple presentation of these pushouts in Section 3 by means of two-sided rewrite rules. To this presentation we apply, in Section 5, the Knuth-Bendix completion procedure and get the desired rewriting system, under a 'constructors' hypothesis on our theories. This constructor hypothesis is formulated within a categorical framework in Section 5 by means of (weak) factorization systems and translated into symbolic terms in Section 10: roughly speaking, Ti is said to be constructible over T0 iff there is a class Ei of terms (including variables and closed under renamings) in the signature Ωi of Ti such that any Ωi-term t(x1, . . . , xn) decomposes uniquely (up to provable identity) as u(v1, . . . , vk), where the vi(x1, . . . , xn) are (always up to provable identity) distinct terms from Ei and u is a k-minimized term in the signature Ω0 of T0 (a term u(x1, . . . , xk) is said to be k-minimized iff it is not provably identical to any term in which only variables coming from a proper subset of {x1, . . . , xk} occur).
Examples are provided in Section 10 (a typical example is the case of commutative rings with unit, which are constructible over abelian groups). We briefly describe here the rewriting system R we obtain. R consists of only four rules (for technical reasons concerning 'colours' of terms, two of these rules are 'duplicated'). The first rule (called the composition rule) simply allows composing equally coloured consecutive (equivalence classes of) terms. The second rule (called the ε-extraction rule) minimizes terms by 'moving left' projections (i.e. n-tuples of distinct variables). The third rule (called the µ-extraction rule) 'moves right' the second component of the above mentioned factorization of terms. The fourth rule (called the products rule) is suggested by the completion procedure and has the following meaning: any projection (i.e. any tuple of distinct variable terms) appearing in an internal position of a path of pure terms represents a 'hole', and the normalization process is supposed to fill such a hole by 'moving right' genuine terms (i.e. terms which are not projections). The complete table of rules of R is given at the end of Section 5.

² We recall that there is another quite interesting category-theoretic approach to universal algebra, namely the monads approach (which has also been significantly used in questions related to rewriting, see e.g. [10]).


Although R is a quite simply described system, the confluence proof requires lengthy work, because all critical pairs must be examined. This leads to a large amount of detail, all consisting of elementary computations (in fact, once the technical tools are appropriately settled, the single cases are treated in the most natural way). The paper is organized as follows: in Section 2 we recall the necessary background from functorial semantics; in Section 3 we get a first presentation of pushouts among Lawvere categories. In Sections 4-5 we apply the completion procedure and get the appropriate rewriting system R. In Section 6 we provide local confluence and termination for a simple subsystem R0 of R. In Section 7 a third rewriting system, called R+, is introduced (R+ is equivalent to R; it normalizes more slowly but is easier to manage); in addition, useful technical facts are collected. In Section 8, R+ is proved to be locally confluent, whereas in Section 9 termination of both R and R+ is established. Finally, the equivalence between R and R+ and the canonicity of the former are obtained. Section 10 provides examples of constructible theories and of normalizations of paths of terms; a comparison with the results of [3, 4] is made at the end of the paper. Sections 6-9 can be skipped on a first reading by readers mostly interested in our results (and less interested in their proofs). This technical report is fully detailed and self-contained. We only assume a certain familiarity with rewriting (for some unexplained notions readers may consult [2]).

2 A short summary in functorial semantics

We recall that a category with finite products C is a category in which for every finite list of objects X1, . . . , Xn (n ≥ 0) there is an object X1 × · · · × Xn and there are arrows

π^{X1,...,Xn}_{Xi} : X1 × · · · × Xn −→ Xi

(to be denoted simply as πXi or πi) enjoying the following universal property:

• for every object Z, for every n-tuple of arrows αi : Z −→ Xi (i = 1, . . . , n) there is a unique arrow α : Z −→ X1 × · · · × Xn such that α ◦ πi = αi for all i = 1, . . . , n³ (such α is usually indicated by ⟨α1, . . . , αn⟩).

The definition includes the cases n = 0 and n = 2: in fact, these two cases are sufficient for the general case n ≥ 0. We can thus equivalently give the definition in the following way: a category C is said to have finite products iff

• there is a terminal object, namely an object 1 such that for every object X there is just one arrow X −→ 1 (such an arrow is denoted ⟨ ⟩ or ⟨ ⟩X);

³ Composition of arrows · −α→ · −β→ · in a category is denoted α ◦ β in this paper (contrary to the more frequent notation β ◦ α). We think that directly following 'arrow pictures' looks better for the purposes of this paper.

4

• for every pair of objects X1, X2 there are an object X1 × X2 and arrows π1 : X1 × X2 −→ X1, π2 : X1 × X2 −→ X2, such that for every object Z and for every pair of arrows α1 : Z −→ X1, α2 : Z −→ X2, there is a unique arrow ⟨α1, α2⟩ : Z −→ X1 × X2 such that ⟨α1, α2⟩ ◦ π1 = α1 and ⟨α1, α2⟩ ◦ π2 = α2.

In categories with finite products, we also use the standard abbreviation α1 × · · · × αn to denote (for α1 : X1 −→ Y1, . . . , αn : Xn −→ Yn) the arrow ⟨π1 ◦ α1, . . . , πn ◦ αn⟩. In this paper, by 'category' we always mean 'category with finite products' and by 'functor' we always mean 'finite-products-preserving functor'. Checking complex identities in categories with products might be a little painful if the above definition is used; however, for general reasons it is sufficient to check such identities in the category Set of sets (basically, this is due to the faithfulness of the Yoneda embedding and to the fact that products are computed componentwise in presheaves). For instance, in order to check the identity

⟨α ◦ β, γ⟩ = ⟨α, γ⟩ ◦ (β × 1),

where α : X −→ Y, β : Y −→ Z1, γ : X −→ Z2,⁴ one can always assume that the data involved are just sets and functions and observe that for every x ∈ X

(⟨α ◦ β, γ⟩)(x) = ⟨β(α(x)), γ(x)⟩ = (β × 1)(⟨α(x), γ(x)⟩) = (⟨α, γ⟩ ◦ (β × 1))(x).

We shall meet many such identities in the paper, but we shall never justify them explicitly; we simply assume the reader can check by himself that they are 'true in Set'. An (equational) theory T = ⟨Ω, Ax⟩ is just an ordinary signature Ω endowed with a set of pairs of terms ('the axioms' of T). We use letters t, u, v, . . . for terms and letters x1, x2, . . . for variables; t(x1, . . . , xn) means that the term t contains at most the variables x1, . . . , xn. The notation t(u1/x1, . . . , un/xn) (or simply t(ui/xi), or again t(u1, . . . , un)) is used for substitutions; when we write t(u/xi) we mean t(x1/x1, . . . , u/xi, . . . , xn/xn). Notations like ⊢T t1 = t2 refer to some sound and complete deduction system (e.g. equational logic). Deciding ⊢T t1 = t2 is just the (uniform) word problem for T. In order to avoid irrelevant cases, we shall always assume that our theories T satisfy the following two requirements:

• Ω always contains a constant symbol c0 (this is harmless, because adding a free constant - if needed - does not change the nature of word problems);

• T is non-degenerate, namely ⊬T x1 = x2.

Given a signature Ω and a category C, an Ω-interpretation I in C consists of the following data:

- an object A in C (called the support of I);

⁴ Identities are generically denoted 1; in some cases, we may use a subscript for their domain if we want to make it evident (in the present case, we should have written 1_{Z2}).
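Identities of this kind are mechanical to verify pointwise in Set; here is a minimal Python sketch of the check just performed, using the paper's diagrammatic composition convention (all helper names and sample functions are ours):

```python
# Pointwise check in Set of <alpha∘beta, gamma> = <alpha, gamma> ∘ (beta × 1),
# where "f ∘ g" is diagrammatic: first f, then g.

def comp(f, g):                     # diagrammatic composition f ∘ g
    return lambda x: g(f(x))

def pair(f, g):                     # the pairing <f, g>
    return lambda x: (f(x), g(x))

def times(f, g):                    # the product arrow f × g on pairs
    return lambda p: (f(p[0]), g(p[1]))

ident = lambda z: z

alpha = lambda x: x + 1             # alpha : X -> Y   (sample data)
beta = lambda y: 2 * y              # beta  : Y -> Z1
gamma = lambda x: x * x             # gamma : X -> Z2

lhs = pair(comp(alpha, beta), gamma)
rhs = comp(pair(alpha, gamma), times(beta, ident))
assert all(lhs(x) == rhs(x) for x in range(10))
```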


- for every f ∈ Ωn (Ωn is the set of function symbols having arity n), an arrow I(f) : A^n −→ A (notice that for n = 0, we have I(f) : 1 −→ A).

Given such an interpretation I and a term t, we can define I^n(t) : A^n −→ A (to be denoted simply as I(t) if confusion does not arise), for every n such that x1, . . . , xn is a list containing all the variables occurring in t, as follows:

- I(xi) = πi;
- I(f(t1, . . . , tm)) = ⟨I(t1), . . . , I(tm)⟩ ◦ I(f).

An Ω-interpretation I is a model of T = ⟨Ω, Ax⟩ (or is an internal T-algebra in C) iff for every (t1, t2) ∈ Ax, we have I(t1) = I(t2).⁵ Not only do categories give models of theories, they can also be used as theories. This is a basic point in categorical logic, which leads in our case to the notion of Lawvere category. Basically this is nothing but a one-sorted (finite products) category. Formally, a Lawvere category is a category having objects {X^n}n≥0, in which X^n (endowed with specified projections πi : X^n −→ X) is the product of X with itself n times.⁶ In our context (see below) πi will be the (equivalence class of) the variable xi. We fix the following convention about a Lawvere category: arrows X^n −→ X^m of the kind ⟨πi1, . . . , πim⟩ (where i1, . . . , im ≤ n) are called

• (pure) projections iff the i1, . . . , im are all distinct (in this case we must have m ≤ n);
• diagonals iff {i1, . . . , im} includes {1, . . . , n} (in this case we must have m ≥ n);
• renamings iff i1, . . . , im are just a permutation of {1, . . . , n} (in this case we must have n = m).

In order to have a clearer picture, consider the category Sf having as objects the finite sets of the kind n = {1, . . . , n} and as arrows all functions (this is the skeleton category of finite sets); for every Lawvere category T, we have a functor S : Sf^op −→ T associating X^n with n and ⟨πh(1), . . . , πh(m)⟩ with every function h : n ←− m.
Now an arrow in T is a (pure) projection iff it is the S-image of an injective function, it is a diagonal iff it is the S-image of a surjective function, and it is a renaming iff it is the S-image of a bijection.

⁵ Strictly speaking, one should show that this does not depend on the list x1, . . . , xn (which includes all variables occurring in t1, t2) chosen in order to apply I. Indeed it is so: in fact, a simple inductive argument shows that I^{n+1}(ti) differs from I^n(ti) by left composition with the n-tuple of projections ⟨π1, . . . , πn⟩. Now notice that any arrow of the kind A^n −⟨π1,...,πn,α⟩→ A^{n+1} gives the identity once composed on the right with such an n-tuple of projections (one can take as α anything, e.g. I^n(t1) : A^n −→ A). Thus I^{n+1}(t1) = I^{n+1}(t2) holds iff I^n(t1) = I^n(t2) holds (just take left composition with the above mentioned arrows).

⁶ Of course, this implies that X^0 is equal to the terminal object 1 and that X^{n1+n2} is the product of X^{n1} and X^{n2} with obvious tuples of πi's as projections.
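The three classes of arrows above are easy to recognize from the index tuple (i1, . . . , im) alone; a small sketch (the function names and 1-based encoding are our own):

```python
# Classifying arrows <pi_{i1},...,pi_{im}> : X^n -> X^m of a Lawvere
# category by their index tuple idx = (i1,...,im), with 1-based indices.

def is_projection(idx, n):
    """All indices distinct (forcing m <= n)."""
    return len(set(idx)) == len(idx) and all(1 <= i <= n for i in idx)

def is_diagonal(idx, n):
    """Indices cover {1,...,n} (forcing m >= n)."""
    return set(idx) >= set(range(1, n + 1)) and all(1 <= i <= n for i in idx)

def is_renaming(idx, n):
    """A permutation of {1,...,n} (forcing m = n)."""
    return sorted(idx) == list(range(1, n + 1))

# <pi_2, pi_1> : X^2 -> X^2 is a renaming, hence both projection and diagonal
assert is_renaming((2, 1), 2) and is_projection((2, 1), 2) and is_diagonal((2, 1), 2)
# <pi_1, pi_1> : X^1 -> X^2 is a diagonal but not a projection
assert is_diagonal((1, 1), 1) and not is_projection((1, 1), 1)
```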


Lawvere categories are essentially in one-to-one correspondence with equational theories (we said 'essentially' because two equational theories differing only in the choice of the language and of the axioms are collapsed onto the same 'invariant' Lawvere category). We need in this paper only one side of this correspondence, which we now explain. Let T = (Ω, Ax) be a theory; we build a Lawvere category T in the following way. We take as arrows X^n −→ X^m the m-tuples of equivalence classes of terms containing at most the variables x1, . . . , xn (equivalence is intended as provable identity in T); equivalence classes of variables are the specified projections and composition is substitution. Explicitly, this means that the composition of ⟨{t1}, . . . , {tm}⟩ : X^n −→ X^m and ⟨{u1}, . . . , {ur}⟩ : X^m −→ X^r is the r-tuple of terms in the variables x1, . . . , xn given by ⟨{u1(ti/xi)}, . . . , {ur(ti/xi)}⟩ : X^n −→ X^r. Let us now examine models; given a model I of a theory T in a category C, we can associate with it a functor

(1)    FI : T −→ C

in the following way. If A is the support of I, FI(X^n) = A^n; if ⟨{t1}, . . . , {tm}⟩ is an arrow in T, FI(⟨{t1}, . . . , {tm}⟩) = ⟨I(t1), . . . , I(tm)⟩. Vice versa, given a functor F : T −→ C, we can associate with it the model IF with support F(X) given by

(2)    IF(f) = F({f(x1, . . . , xn)})

for every f ∈ Ωn. The two correspondences (1) and (2) are inverses of each other, thus we can identify models with functors. Functors can also be used to deal with syntactic interpretations; we shall consider only the special kinds of syntactic interpretations which matter for our purposes. Suppose we are given two theories T0 = (Ω0, Ax0) and T1 = (Ω1, Ax1) such that Ω0 ⊆ Ω1 and Ax0 ⊆ Ax1. Such data induce a functor

(3)    I1 : T0 −→ T1

associating equivalence classes of terms with themselves (more precisely, equivalence class of t in T0 with equivalence class of t in T1 ). When T1 is a conservative extension of T0 (i.e. when Ω0 -terms are provably equal in T0 iff they are provably equal in T1 ) we write T0 ⊆ T1 for short. Notice that T1 is a conservative extension of T0 iff the functor I1 is faithful (i.e. injective on arrows). Moreover, the restriction of a T1 -model to a T0 -model becomes composition on the left with I1 (whenever models are seen as functors under the correspondence (1)-(2)).
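The composition-as-substitution described above can be sketched directly in code; the term encoding (('x', i) for the variable xi, (f, arg1, ..., argk) otherwise) is an illustrative assumption of ours:

```python
# Composition in a Lawvere category is substitution: composing
# <{t1},...,{tm}> : X^n -> X^m with <{u1},...,{ur}> : X^m -> X^r gives
# <{u1(ti/xi)},...,{ur(ti/xi)}> : X^n -> X^r.

def subst(term, args):
    """Substitute args[i-1] for the variable xi throughout term."""
    if term[0] == 'x':
        return args[term[1] - 1]
    return (term[0],) + tuple(subst(a, args) for a in term[1:])

def compose(ts, us):
    """Diagrammatic composition of the tuple ts followed by us."""
    return tuple(subst(u, ts) for u in us)

# composing the renaming <x2, x1> : X^2 -> X^2 with <f(x1, x2)> : X^2 -> X^1
swap = (('x', 2), ('x', 1))
f12 = (('f', ('x', 1), ('x', 2)),)
assert compose(swap, f12) == (('f', ('x', 2), ('x', 1)),)
```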


3 Basic Equations

We now fix our main data for the paper: we have three theories T0 = (Ω0, Ax0), T1 = (Ω1, Ax1), T2 = (Ω2, Ax2) such that T1 and T2 are conservative extensions of T0 and Ω0 = Ω1 ∩ Ω2; taking the (non-disjoint) union of signatures and axioms we get a further theory which we call T1 +T0 T2. We suppose we are able to solve the word problems for T1, T2; in general, as explained in the introduction, this is not enough to solve the word problem for T1 +T0 T2 too,⁷ but we may look for sufficient conditions yielding a positive solution. The category T1 +T0 T2 can be built as usual, by using terms; however we want to characterize it intrinsically in terms of T0, T1, T2. For this it is sufficient to look at its models. Let C be a category and let I1, I2 be models of T1, T2 in C restricting to the same model of T0; from these data it is possible to build a unique model I of T1 +T0 T2 in C restricting to I1, I2: the support is the common support of I1, I2 and the interpretations of function symbols can simply be joined (as they agree on Ω0). The axioms Ax1 ∪ Ax2 will all be true (as they involve only terms belonging to a single Ti). Translating everything in terms of functors, we have that T1 +T0 T2 enjoys the following universal property: for every category C, for every pair of functors F1 : T1 −→ C and F2 : T2 −→ C such that I1 ◦ F1 = I2 ◦ F2, there exists a unique functor F : T1 +T0 T2 −→ C such that J1 ◦ F = F1 and J2 ◦ F = F2 (here I1 : T0 −→ T1, I2 : T0 −→ T2, J1 : T1 −→ T1 +T0 T2, J2 : T2 −→ T1 +T0 T2 are the functors coming from syntactic expansions as in (3)). Otherwise said, T1 +T0 T2 is just the pushout of T1, T2 over T0.⁸ This purely categorical property uniquely characterizes T1 +T0 T2.
The next step consists in a direct description of a category (isomorphic to) T1 +T0 T2, using the above mentioned universal property: for this description we no longer use terms, but a more algebraic notion, namely mixed paths of arrows from T1, T2. To make the notation simpler, we act as if the functors I1, I2 (which are faithful) were just inclusions. Formally, a path K : X^n −→ X^m is a nonempty list K = α1, . . . , αk of arrows coming from either T1 or T2 (or both) such that

(i) the domain of α1 is X^n;
(ii) the codomain of αk is X^m;
(iii) for every i = 1, . . . , k − 1, the codomain of αi is equal to the domain of αi+1.

Paths are just words (with 'typing' restrictions). Equivalence relations on paths (stable under right and left concatenation) can be introduced by two-sided rewrite rules. The plan is quite simple: identify such rules, then orient and complete them into a canonical rewrite system (after all, the situation is very similar to string-rewriting systems for monoid presentations). In the remaining part of the paper, we make the following conventions:

• we shall use letters α, β, . . . for arrows from T1 ∪ T2, letters α¹, β¹, . . . for arrows from T1, letters α², β², . . . for arrows from T2 and letters α⁰, β⁰, . . . for arrows from T0; notice that an arrow like α¹ may happen to come from T0, but not vice versa;

• instead of indicating types (i.e. objects of Lawvere categories) with X^n, X^m, . . . we may use letters Y, Z, U, . . . if the knowledge of the exponent does not matter; the letter X, however, can only indicate X¹;

• roman letters are used to indicate arrows having codomain X: a¹, for instance, stands for an arrow in T1 (which might belong to T0 too) having domain some Y = X^n, but whose codomain can only be X = X¹.

Next, we give the main definitions for path rewriting. Let S be a set of pairs of paths; we write

(i) K ⇒S K′ (or simply K ⇒ K′, leaving S understood from the context) iff K = K1, L, K2 and K′ = K1, R, K2 for some pair ⟨L, R⟩ ∈ S;

(ii) K ⇔S K′ (or simply K ⇔ K′) iff K = K1, L, K2 and K′ = K1, R, K2 for some pair ⟨L, R⟩ such that either ⟨L, R⟩ ∈ S or ⟨R, L⟩ ∈ S;

(iii) K ⇒∗S K′ (or simply K ⇒∗ K′) for the reflexive-transitive closure of ⇒S;

(iv) K ⇔∗S K′ (or simply K ⇔∗ K′) for the least equivalence relation containing ⇒S.

Clearly ⇔∗ is the least stable equivalence relation extending S.

⁷ Notice that T1 +T0 T2 might not be a conservative extension of T1, T2: for instance, both Boolean algebras and Heyting algebras are conservative over distributive lattices with 0 and 1, but putting the two theories together one gets again the theory of Boolean algebras, which is obviously not conservative over Heyting algebras.

⁸ In relevant contexts, a 2-dimensional pushout should be considered instead: it corresponds to the theory of pairs of models of T1, T2 endowed with an isomorphism between their respective reducts. 2-dimensional aspects could be incorporated, with some further work, into the approach of this paper.
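The one-step relation ⇒S and its transitive closure can be sketched naively as follows; paths are tuples of opaque arrow labels, and the rule set is illustrative (the exhaustive closure computed here is finite because each sample rule shortens the path):

```python
# A rule (L, R) rewrites any contiguous occurrence of the sublist L
# inside a path to R; `reachable` computes the =>* closure of {path}.

def one_step(path, rules):
    """Yield all K' with K => K' in one step."""
    for L, R in rules:
        for i in range(len(path) - len(L) + 1):
            if path[i:i + len(L)] == L:
                yield path[:i] + R + path[i + len(L):]

def reachable(path, rules):
    """All paths reachable from `path` (finite for length-decreasing rules)."""
    seen, todo = {path}, [path]
    while todo:
        for nxt in one_step(todo.pop(), rules):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

# composing two equally coloured consecutive arrows, abstractly:
rules = [(('a1', 'b1'), ('a1b1',))]
assert ('a1b1', 'c2') in reachable(('a1', 'b1', 'c2'), rules)
```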
Pairs ⟨L, R⟩ ∈ S will be directly written as L ⇒ R and called rules of S; alternatively, they might be written as L ⇔ R (and called basic equations of S), but in that case we tacitly assume that S is symmetric, i.e. that S contains ⟨R, L⟩ whenever it contains ⟨L, R⟩ (in which case e.g. the relations ⇒ and ⇔ obviously coincide). The next theorem accomplishes our first goal ('finding appropriate basic equations'):


Theorem 3.1 Let P be given by the following two kinds of pairs of paths:

αⁱ, βⁱ ⇔ αⁱ ◦ βⁱ    (i = 1, 2)

1 × α², α¹ × 1 ⇔ α¹ × 1, 1 × α²

(where in the last pair we have α¹ : Y1 −→ Z1 and α² : Y2 −→ Z2, and so 1 × α² : Y1 × Y2 −→ Y1 × Z2 and α¹ × 1 : Y1 × Z2 −→ Z1 × Z2).

We have that T1 +T0 T2 is isomorphic to the Lawvere category having as arrows the equivalence classes of paths under the relation ⇔∗P.

Proof. Let P be the category having {X^n}n≥0 as objects and as arrows X^n −→ X^m the equivalence classes (wrt ⇔∗) of paths of domain X^n and codomain X^m. Composition of {K} and {L} is {K, L}. The identity of X^n turns out to be just {1_{X^n}}. We first show that P has finite products. X^0 = 1 is obviously terminal; in fact any path K : Y −→ 1 is equivalent to the singleton path ⟨ ⟩Y by iterated applications of the first basic equation of P (the last member of K must be some ⟨ ⟩Z, so it composes with the last-but-one member giving again something of the same kind, etc.). Given objects Y1 = X^{n1}, Y2 = X^{n2}, we take Y1 × Y2 (i.e. X^{n1+n2}) as binary product and {πY1}, {πY2} as projections (here πY1, πY2 are obviously the projections in T0). Let us now take two paths K1, K2 of domain Z and codomains Y1, Y2, respectively. Suppose for instance that

K1 = α1, . . . , αr        K2 = β1, . . . , βs.

Let ⟨K1, K2⟩ be the path

Z −⟨1Z, 1Z⟩→ Z × Z −1Z × K2→ Z × Y2 −K1 × 1Y2→ Y1 × Y2

where 1Z × K2 is (1Z × β1), . . . , (1Z × βs) (K1 × 1Y2 is defined analogously). We show that {⟨K1, K2⟩} enjoys the universal property for pairs. In fact

⟨1Z, 1Z⟩, (1Z × K2), (K1 × 1Y2), πY1 ⇔∗ K1

by successive applications of the first basic equation of P (we have (αr × 1Y2) ◦ πY1 = πdom(αr) ◦ αr, etc., so we finally get ⟨1Z, 1Z⟩, (1Z × K2), πdom(α1), K1 ⇔∗ K1, because dom(α1) = Z and, for every j, (1Z × βj) ◦ πZ = πZ). Similarly

⟨1Z, 1Z⟩, (1Z × K2), (K1 × 1Y2), πY2 ⇔∗ K2

(by the same passages in a different order). Now let K be another path from Z into Y1 × Y2 such that K, πY1 ⇔∗ K1 and K, πY2 ⇔∗ K2. We must have K = K′, ⟨γ1, γ2⟩, for some ⟨γ1, γ2⟩ : U −→ Y1 × Y2; so K′, γ1 ⇔∗ K1 and K′, γ2 ⇔∗ K2. From this, a glance at the shape of our basic equations⁹ yields (K′ × 1Y2), (γ1 × 1Y2) ⇔∗ (K1 × 1Y2) and (1Z × K′), (1Z × γ2) ⇔∗ (1Z × K2). Consequently

⟨K1, K2⟩ ⇔∗ ⟨1Z, 1Z⟩, (1Z × K′), (1Z × γ2), (K′ × 1Y2), (γ1 × 1Y2).

We only have to show that this last path is equivalent to K = K′, ⟨γ1, γ2⟩. If K′ = δ1, . . . , δl, by repeated applications of the second basic equation (the first basic equation is also used, e.g. in contracting (1 × δj), (δj × 1) into δj × δj), we have that

⟨1Z, 1Z⟩, (1Z × K′), (1Z × γ2), (K′ × 1Y2), (γ1 × 1Y2) ⇔∗ ⟨1Z, 1Z⟩, (K′ × K′), (γ1 × γ2)

(where K′ × K′ is (δ1 × δ1), (δ2 × δ2), . . . , (δl × δl)). Finally, observe that ⟨1Z, 1Z⟩ ◦ (δ1 × δ1) = δ1 ◦ ⟨1cod(δ1), 1cod(δ1)⟩, etc., hence repeated applications of the first basic equation yield

⟨K1, K2⟩ ⇔∗ K′, ⟨1U, 1U⟩, γ1 × γ2 ⇔ K′, ⟨γ1, γ2⟩,

as wanted. In order to check that P is isomorphic to T1 +T0 T2, we show it enjoys the related universal property. The functors

F1 : T1 −→ P        F2 : T2 −→ P

associating with αⁱ the equivalence class {αⁱ} obviously commute with the inclusions I1 : T0 −→ T1 and I2 : T0 −→ T2. Now let Gi : Ti −→ C (i = 1, 2) be such that I1 ◦ G1 = I2 ◦ G2. There is in fact a unique functor G : P −→ C such that F1 ◦ G = G1 and F2 ◦ G = G2: it is the functor associating with {α1^{i1}, . . . , αk^{ik}} the arrow Gi1(α1^{i1}) ◦ · · · ◦ Gik(αk^{ik}). This definition is forced by the conditions F1 ◦ G = G1 and F2 ◦ G = G2 and is well defined because the basic equations of P express identities holding in any category with finite products. This completes the proof of the theorem. □

⁹ For the case of the second basic equation, one needs identities like 1Y × (δ × 1Z) = 1Y × δ × 1Z = (1Y × δ) × 1Z, which hold in Lawvere categories (in fact, if e.g. Y = X^n, Z = X^m and δ = ⟨d1, . . . , d_{k2}⟩ : X^{k1} −→ X^{k2}, then unravelling the definitions the three members are all equal to ⟨π1, . . . , πn, π ◦ d1, . . . , π ◦ d_{k2}, π_{n+k1+1}, . . . , π_{n+k1+m}⟩, where π = ⟨π_{n+1}, . . . , π_{n+k1}⟩). The point is that in Lawvere categories the finite product structure is freely generated (actually by one object); this is usual for categories coming from syntactic calculi; however, in the general context of an arbitrary category with products such identities hold only up to (coherent) isomorphisms.


In the applications, we should keep in mind that the isomorphism of categories between T1 +T0 T2 and P is the unique expansion to the signature Ω1 ∪ Ω2 of the models F1 : T1 −→ P, F2 : T2 −→ P associating with αⁱ the equivalence class {αⁱ}. This means the following: given an Ω1 ∪ Ω2-term t, the universal model (isomorphism) U : T1 +T0 T2 −→ P interprets it as the equivalence class of any path obtained by expressing t as an iterated composition of terms which are pure, i.e. which are either Ω1- or Ω2-terms. Such a path (called a splitting path for t) can be effectively computed from t in many ways (possibly yielding different paths, but in any case ⇔∗-equivalent ones); one might for instance adopt the usual abstraction of alien subterms, or alternatively make use of the following simply described inductive procedure (which applies to any tuple t1, . . . , tn of terms having variables included in some fixed list x1, . . . , xm):

- if t1, . . . , tn are all Ω1-terms or all Ω2-terms, a splitting path is the singleton path ⟨{t1}, . . . , {tn}⟩ having domain X^m and codomain X^n (recall that arrows in T1, T2 are equivalence classes of terms under provable identity in the corresponding theory);

- otherwise, we have e.g. that ti = f(u1, . . . , uk); a splitting path K of t1, . . . , ti−1, u1, . . . , uk, ti+1, . . . , tn is given (we apply multiset induction on term complexities) and it has codomain X^{n−1+k}, so we can take K, ⟨{x1}, . . . , {f(xi, . . . , xi+k−1)}, . . . , {xn−1+k}⟩ as a splitting path for t1, . . . , tn.

It is now clear how we can deal with word problems: to decide whether t and u are T1 +T0 T2-equal, it is sufficient to split them into paths K and L according to one of the above mentioned procedures and then check whether K ⇔∗ L holds or not. Of course, this will become convenient only after turning our basic equations into a canonical rewriting system. In any case, let us see an example.
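Before the worked example, the inductive splitting procedure just described can be put in code; the sample signatures and the term encoding (('x', i) for the variable xi, (f, args...) otherwise, with unlisted symbols treated as Ω2 for simplicity) are our own illustrative assumptions:

```python
# Split a tuple of terms over the combined signature into a path of
# pure tuples by repeatedly peeling off the head symbol of a term.

OMEGA1 = {'f'}                      # assumed Omega_1-only symbols
OMEGA2 = {'g'}                      # assumed Omega_2-only symbols

def colours(t):
    if t[0] == 'x':
        return set()
    own = {1} if t[0] in OMEGA1 else {2}   # unknown heads count as Omega_2
    for a in t[1:]:
        own |= colours(a)
    return own

def pure(terms):
    """True iff the whole tuple lives in a single signature."""
    return len(set().union(*(colours(t) for t in terms))) <= 1

def split(terms):
    """Return a splitting path (list of pure tuples) for `terms`."""
    if pure(terms):
        return [terms]
    i = next(j for j, t in enumerate(terms) if t[0] != 'x')
    f, args = terms[i][0], terms[i][1:]
    path = split(terms[:i] + args + terms[i + 1:])  # symbol count shrinks
    k, n = len(args), len(terms)
    top = tuple(('x', j + 1) for j in range(i)) \
        + ((f,) + tuple(('x', i + 1 + j) for j in range(k)),) \
        + tuple(('x', i + k + 1 + j) for j in range(n - 1 - i))
    return path + [top]

# f(g(x1)) splits into the pure layers g(x1) then f(x1):
assert split((('f', ('g', ('x', 1))),)) == [(('g', ('x', 1)),), (('f', ('x', 1)),)]
```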
Example. Let us prove the well-known fact from elementary algebra that it is not possible to endow a given distributive lattice with 0 and 1 with two different Boolean algebra structures (the complement, in case it exists, is unique). Let T0 be the theory of distributive lattices with 0 and 1 and let T1, T2 both be the theory of Boolean algebras. We show that

⊢_{T1 +T0 T2} ¬1 x1 = ¬1 x1 ∧ ¬2 x1


(here ¬i is the complement in Ti; what we prove is ¬1 x1 ≤ ¬2 x1). We take X −¬1 x1→ X as splitting path of ¬1 x1 (we usually drop brackets in the examples; to be precise we should write X −{¬1 x1}→ X); as splitting path of ¬1 x1 ∧ ¬2 x1 we take

X −⟨¬1 x1, x1⟩→ X² −x1 ∧ ¬2 x2→ X.

Notice we could have used

X −⟨x1, ¬2 x1⟩→ X² −¬1 x1 ∧ x2→ X

instead: indeed the two paths are ⇔∗-equivalent (just expand ⟨¬1 x1, x1⟩, x1 ∧ ¬2 x2 to ⟨x1, x1⟩, ⟨¬1 x1, x2⟩, ⟨x1, ¬2 x2⟩, x1 ∧ x2, then apply the second basic equation and contract once again). We need to show that ¬1 x1 ⇔∗ ⟨¬1 x1, x1⟩, x1 ∧ ¬2 x2. For this, let us consider the path

K = ⟨¬1 x1, x1⟩, ⟨x1, x2, x1 ∧ x2⟩, x1 ∧ (¬2 x2 ∨ x3).

We have

K ⇔ ⟨¬1 x1, x1⟩, x1 ∧ (¬2 x2 ∨ (x1 ∧ x2)) ⇔ ¬1 x1

because {x1 ∧ (¬2 x2 ∨ (x1 ∧ x2))} = {x1}. On the other hand

K ⇔ ⟨¬1 x1, x1, x1 ∧ ¬1 x1⟩, x1 ∧ (¬2 x2 ∨ x3) ⇔
⇔ ⟨¬1 x1, x1⟩, ⟨x1, x2, 0⟩, x1 ∧ (¬2 x2 ∨ x3) ⇔
⇔ ⟨¬1 x1, x1⟩, x1 ∧ ¬2 x2.

In conclusion, we have ¬1 x1 ⇔∗ K ⇔∗ ⟨¬1 x1, x1⟩, x1 ∧ ¬2 x2, as wanted. □

Notice that in the above example we sometimes replaced terms from Ωi (i = 1, 2) with terms from Ω0, when we realized this was possible. Such passages are indispensable in order to activate the first basic equation (which applies to consecutive equally coloured terms), but they might not be effective. The additional hypotheses we shall make on our data in order to be able to orient and complete the basic equations into a canonical rewrite system will also be sufficient to make such passages effective.

13

4

Orientation

Before beginning orientation and completion, we make a modification to our 'datatypes', due to the fact that we do not want to bother distinguishing paths which are mere alphabetic variants of each other. Formally, the involved definitions are the following (let K be the path Y1 −α1→ · · · −αk→ Yk+1 and let L be the 'parallel' path Y1 −β1→ · · · −βk→ Yk+1, with αi, βi equally coloured and having the same domain and codomain):

• K is said to be a ρ-renaming of L (where ρ = {ρi : Yi −→ Yi}_{1 ≤ i ≤ k+1} is a list of renamings)^10 iff the squares

          αi
     Yi -----> Yi+1
     |           |
  ρi |           | ρi+1
     v           v
     Yi -----> Yi+1
          βi

commute for i = 1, . . . , k (otherwise said, we have βi = ρi⁻¹ ◦ αi ◦ ρi+1 for all i); we write L = ρ(K) in order to express that L is (the) ρ-renaming of K;

• K is said to be the ρ-alphabetic variant of L (where ρ = {ρi : Yi −→ Yi}_{1 ≤ i ≤ k+1} is a list of renamings) iff it is the ρ-renaming of L and moreover ρ1 = 1_{Y1} and ρk+1 = 1_{Yk+1} (the reason for this definition is that variables in internal equivalence classes of terms in a path are considered bound).

Example. For every permutation σ on the n-element set, the path K1, ⟨a1, . . . , an⟩, α, K2 is an alphabetic variant of the path K1, ⟨aσ(1), . . . , aσ(n)⟩, ⟨πσ⁻¹(1), . . . , πσ⁻¹(n)⟩ ◦ α, K2 (here K1, K2 might be empty). Thus applying alphabetic variants allows permuting the components of an arrow in a path (provided such arrow is not in last position).

Example. The path

W −K1→ Y × Z × U −⟨α, πZ⟩→ V × Z −K2→ T

^10 Recall from Section 2 that a renaming X^n −→ X^n in a Lawvere category is an n-tuple of projections of the kind ⟨πσ(1), . . . , πσ(n)⟩, where σ is a permutation on {1, . . . , n}.

is an alphabetic variant of the path

W −K1 ◦ ⟨πY, πZ, πU⟩→ Y × U × Z −⟨⟨πY, πZ, πU⟩ ◦ α, πZ⟩→ V × Z −K2→ T

(here only K2 might be empty and K1 ◦ ⟨πY, πZ, πU⟩ is K1 with its last arrow composed with ⟨πY, πZ, πU⟩). Thus applying alphabetic variants allows assuming that certain projections (located not in first position) project, say, on the last components of their domains.

The content of the last two examples will be frequently and tacitly used within the technical sections of the paper. We shall apply rewriting on equivalence classes of paths modulo 'being an alphabetic variant of'. This needs some additional conventions on our rules, because we want to have the following property (making the rewriting process easily manageable): if K rewrites to L, then any alphabetic variant of K rewrites to some alphabetic variant of L. In addition, the notation of certain rules may be awkward in case we do not stipulate anything about their alphabetic variants. Consequently, we stipulate that the renaming of any rule is always tacitly supposed to be available as a rule: by this, we mean that if K ⇒ K′ is a rule, then ρ(K) ⇒ ρ′(K′) is also a rule, for any lists ρ, ρ′ of renamings such that the first and last components of ρ, ρ′ are respectively equal.^11 A consequence of the above stipulation is that the normal forms we eventually obtain will be unique only up to alphabetic variants. Checking whether two paths are alphabetic variants of each other, in case we know they are both in normal form, does not substantially affect efficiency, given the particular structure of normal forms (we shall return to that in Section 10).

Before going on, we need another preliminary indispensable decision about our datatypes. As evidenced also in the example at the end of Section 3, terms like f(t1, t2), where f ∈ Ω0 and where ti(x1) is a pure Ωi-term, have (at least) two different splitting paths, namely

X −⟨t1(x1), x1⟩→ X² −f(x1, t2(x2))→ X    and    X −⟨x1, t2(x1)⟩→ X² −f(t1(x1), x2)→ X.

Our final aim is that of having (uniqueness of) normal forms for paths, so we must decide once and for all which one has to be considered in normal form. This choice is clearly conventional, but it has to be made one way or another: we choose the former path. This leads to the following notion: say that a path is well-coloured iff it has the form K, α² (where K is possibly empty). This means that the last arrow in a well-coloured path must come from T2 (which does not exclude it might come from T0 as well). We modify our basic equations so that we need to consider only well-coloured paths. For a path K : Y −→ Z, let K⁺ be the well-coloured path K, 1Z.

^11 We shall of course always deal with rules K ⇒ K′ such that K and K′ agree on domains and codomains. Thus, our convention says that ρ(K) ⇒ ρ′(K′) is a rule in case K ⇒ K′ is a rule, ρ = {ρ1, . . . , ρn}, ρ′ = {ρ′1, . . . , ρ′m} and ρ1 = ρ′1 and ρn = ρ′m.
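The definition of alphabetic variant is directly checkable by brute force on small instances. The following Python sketch is entirely our own (the paper contains no such program): arrows are modelled as functions on tuples, renamings as coordinate permutations, and the commuting squares are tested on a user-supplied set of sample points rather than proved symbolically.

```python
from itertools import permutations, product

def alphabetic_variant(K, L, dims, test_points):
    """Brute-force check that paths K, L (lists of functions on tuples)
    are alphabetic variants: search for permutations rho_i of the
    intermediate coordinates (identity at both ends) such that every
    square commutes, i.e. rho_{i+1}(alpha_i(x)) == beta_i(rho_i(x))."""
    k = len(K)
    cands = [[tuple(range(dims[0]))]]                    # rho_1 = identity
    cands += [list(permutations(range(d))) for d in dims[1:k]]
    cands += [[tuple(range(dims[k]))]]                   # rho_{k+1} = identity

    def app(perm, xs):                                   # renaming = reordering
        return tuple(xs[p] for p in perm)

    for rho in product(*cands):
        if all(app(rho[i + 1], K[i](xs)) == L[i](app(rho[i], xs))
               for i in range(k) for xs in test_points[i]):
            return True
    return False
```

The search is exponential in the path length, but normal forms (Section 10) are short enough that this kind of check is, as claimed above, not a practical bottleneck.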

Let us reformulate our basic equations as follows:

(E1)1  α¹, β¹, γ ⇔ α¹ ◦ β¹, γ
(E1)2  α², β² ⇔ α² ◦ β²
(E2)   1 × α₂, α₁ × 1, β ⇔ α₁ × 1, 1 × α₂, β.

These new equations do not allow rewriting a well-coloured path into a non-well-coloured path; notice also that the 'interchange basic equation' 1 × α₂, α₁ × 1 ⇔ α₁ × 1, 1 × α₂ now does not apply anymore in the last position of a path. As we said, from now on we shall only consider well-coloured paths, subject to the new basic equations (E1)i, (E2).^12 There is no loss in that because for well-coloured paths K, L, we have K ⇔∗ L (according to the old basic equations) iff K ⇔∗ L (according to the new basic equations). In fact, one side is trivial; for the other side, let us consider a ⇔-chain like K = K0 ⇔ K1 ⇔ · · · ⇔ Kn = L obtained according to the old basic equations. We thus have K⁺ = K0⁺ ⇔ K1⁺ ⇔ · · · ⇔ Kn⁺ = L⁺ according to the new basic equations; now two applications of (E1)2 yield K ⇔ K⁺ and L ⇔ L⁺ because K, L are well-coloured. Thus K ⇔∗ L holds by using the new equations too.

The obvious orientations of (E1)1, (E1)2 are

(Rc1)  α¹, β¹, γ ⇒ α¹ ◦ β¹, γ
(Rc2)  α², β² ⇒ α² ◦ β².

Orientation of (E2) depends on the colour of β. In case β has colour 2, we orient it as follows (supposing α₂ has colour 2 too):

(Rp2)∗  1 × α₂², α₁ × 1, β² ⇒ α₁ × 1, (1 × α₂²) ◦ β²

where the second member has been reduced by a further (Rc2)-rewrite step. In case β has colour 2, there are no other relevant cases. If α₁, α₂ both have colour 1, the two members are joinable by (Rc1) and the equation can be deleted.^13 If α₂ has colour 1 and α₁ has colour 2, we do not need to add the rule

α₁² × 1, 1 × α₂¹, β² ⇒ 1 × α₂¹, (α₁² × 1) ◦ β²

because this is simply a renaming of (Rp2)∗ and our convention about renamings automatically includes it. Notice that (Rp2)∗ applies also in case β ∈ T0 (the fact that β has colour 2 does not prevent it from belonging to T0). In case β ∈ T1 \ T0, both members of (E2) cannot occur in the last position of a well-coloured path; taking this fact into account, the appropriate oriented rule is

(Rp1)∗  1 × α₂¹, α₁ × 1, β¹, γ ⇒ α₁ × 1, (1 × α₂¹) ◦ β¹, γ.

Although, strictly speaking, we do not need such a rule in case β ∈ T0 (because orientation in this case is like in (Rp2)∗), we allow its use in this case too. During the completion process of the next section, the rules (Rpi)∗ will be removed, whereas the rules (Rci) (called composition rules) are permanent (whenever a rule is deleted during completion, we always mark its name with a ∗).

Let us summarize the content of this section. We call R∗ the rewriting system given by the rules (Rc1), (Rc2), (Rp1)∗, (Rp2)∗. R∗ is our starting rewriting system: this system is sound and complete for our purposes (deciding path equivalence according to the system P of Theorem 3.1), because the above discussion shows that

Lemma 4.1 For well-coloured paths K1, K2, we have K1 ⇔∗_P K2 iff K1 ⇔∗_{R∗} K2.

^12 Of course, this means also that, when computing the splitting path of a term, an identity should be added at the end in case the top symbol of the term has the wrong colour.
^13 We tolerate the use of (Rp2)∗ in case α₁, α₂ both have colour 2. As a general philosophy, we prefer not to put provisos on applications of rules, unless needed. So, for (Rp2)∗ (and (Rp1)∗ below), the only proviso is that the first and third arrows of the first member must be equally coloured.
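The asymmetry between (Rc1), which needs a trailing arrow, and (Rc2), which may also fire in last position, can be illustrated with a toy program. The following Python sketch is ours and purely schematic: arrows are labelled only by colour (with colour 0 standing for shared T0-arrows) and composition is kept symbolic.

```python
def mergeable(c1, c2):
    # colour 0 (shared) arrows belong to both theories
    return c1 == c2 or c1 == 0 or c2 == 0

def merged_colour(c1, c2):
    return c1 if c1 != 0 else c2

def apply_rc(path):
    """Exhaustively apply (Rc1)/(Rc2) to a path of (colour, name) arrows:
    adjacent compatible arrows are composed, but a colour-1 pair is left
    alone in last position (only (Rc2) may fire there)."""
    path = list(path)
    changed = True
    while changed:
        changed = False
        for i in range(len(path) - 1):
            (c1, a), (c2, b) = path[i], path[i + 1]
            if not mergeable(c1, c2):
                continue
            c = merged_colour(c1, c2)
            if c == 1 and i + 1 == len(path) - 1:
                continue                  # (Rc1) needs a trailing arrow
            path[i:i + 2] = [(c, f"({a};{b})")]
            changed = True
            break
    return path
```

In particular a path ending with two colour-1 arrows is left untouched, mirroring the fact that the composition rules preserve well-colouredness.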

5

Completion

System R∗ is clearly inadequate because it is far from being confluent, so we shall modify it by using Knuth-Bendix style completion as a heuristic guide. Let us recall some general notions concerning a rewrite system S (these notions can be formulated within the context of abstract rewrite systems as in [2]). System S is said to be:

• terminating iff there are no infinite reduction sequences K1 ⇒S K2 ⇒S · · · Ki ⇒S · · ·;
• confluent iff K ⇒∗S K1 and K ⇒∗S K2 imply that K1, K2 are joinable (i.e. that there exists K0 such that K1 ⇒∗S K0 and K2 ⇒∗S K0);
• locally confluent iff K ⇒S K1 and K ⇒S K2 imply that K1, K2 are joinable;
• canonical iff it is terminating and confluent, iff (by Newman's Lemma) it is terminating and locally confluent.

It can be shown (see [2]) that in a canonical rewriting system S the relation K1 ⇔∗S K2 holds iff K1 and K2 have the same normal form, which is moreover unique (a normal form for L is some L′ such that L ⇒∗S L′ and there is no L′′ such that L′ ⇒S L′′). In order to prove canonicity of our path rewriting systems, we show that they are locally confluent and terminating; local confluence is, in its turn, easily reduced to the fact that critical pairs are all joinable. We recall that a critical pair is any pair of paths obtained as follows:

            L1, M, L2
           /         \
          v           v
     R1, L2          L1, R2

where M is non-empty and L1, M ⇒ R1 and M, L2 ⇒ R2 are both rules of the system. In addition, there are also critical pairs induced by rules of the kind L1, M, L2 ⇒ R1 and M ⇒ R2 (where L1, M, L2 are all non-empty). In all these cases, we say that such rules superpose. In case some critical pairs are not joinable, the obvious thing to do is to enrich the system by adding such oriented critical pairs as new rules. In addition, experience shows that it is better also to simplify - if possible - rules which are generated by the procedure. The following operations concerning a rewrite system S are in order:

(i) we can add to S a set of new rules {Li ⇒ Ri}i such that (Li, Ri) or (Ri, Li) is a critical pair generated by rules in S;
(ii) we can divide the rules of S into two disjoint groups S′ ∪ S′′ and remove all rules in S′′ in case we realize that the left and right members of such rules are joinable in S′;
(iii) we can divide the rules of S into two disjoint groups S′ ∪ S′′ and replace any rule L ⇒ R in S′′ by some L ⇒ R′ such that R ⇒∗_{S′} R′.^14

Clearly, if S′ results from S after a finite number of applications of (i)-(iii), then S′ is equivalent to S (in the sense that we have K1 ⇔∗S K2 iff K1 ⇔∗_{S′} K2). If we are lucky, we can produce in this way a canonical rewrite system S′ starting from a given S. Notice that the above completion procedure - as it is formulated here - only has heuristic value (it cannot be fully mechanized, as each step in (i)-(iii) may consist of infinitely many operations).

Let us now apply completion to R∗. The first obvious superposition in R∗ is obtained by considering rules (Rc1) and (Rc2) as in the fork below.

^14 We shall not need left member simplification steps.
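The shape of operations (i)-(ii) can be fixed by a naive completion loop. The following Python sketch is ours and is instantiated on ordinary string rewriting (with a deliberately naive critical-pair computation) rather than on paths; it is meant only as an illustration of the loop, not of the path systems studied in this paper.

```python
def normalize(rules, s):
    """Rewrite s with the given string-rewriting rules until none applies."""
    changed = True
    while changed:
        changed = False
        for l, r in rules:
            if l in s:
                s = s.replace(l, r, 1)
                changed = True
    return s

def critical_pairs(rules):
    """Very naive: only overlaps where one left member occurs inside another."""
    pairs = []
    for l1, r1 in rules:
        for l2, r2 in rules:
            for i in range(len(l1)):
                if l1[i:i + len(l2)] == l2:
                    pairs.append((r1, l1[:i] + r2 + l1[i + len(l2):]))
    return pairs

def orient(a, b):
    # orientation heuristic: rewrite the longer string to the shorter one
    return (a, b) if len(a) > len(b) else (b, a)

def complete(rules, max_rounds=10):
    rules = set(rules)
    for _ in range(max_rounds):
        new = set()
        for a, b in critical_pairs(rules):
            an, bn = normalize(rules, a), normalize(rules, b)
            if an != bn:                     # operation (i): add oriented pair
                new.add(orient(an, bn))
        for rule in list(rules):             # operation (ii): drop joinable rules
            rest = rules - {rule}
            l, r = rule
            if normalize(rest, l) == normalize(rest, r):
                rules = rest
        if not new - rules:
            return rules
        rules |= new
    return rules
```

As stressed above, a real completion run need not terminate; the `max_rounds` cut-off only makes the sketch safe to execute.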

              αⁱ, β⁰, αʲ
             /          \
      (Rcⁱ) v            v (Rcʲ)
   αⁱ ◦ β⁰, αʲ        αⁱ, β⁰ ◦ αʲ

(with i ≠ j and possibly with a γ appended everywhere to the right in case j = 1). Any naive global orientation of these critical pairs in one sense or in the other would immediately cause infinite rewriting. Orientation from left to right, αⁱ ◦ β⁰, αʲ ⇒ αⁱ, β⁰ ◦ αʲ, would produce for instance (let α have codomain Y and let β have domain Y):

α, β = α ◦ ⟨1Y, 1Y⟩ ◦ π1, β
     ⇒ α ◦ ⟨1Y, 1Y⟩, π1 ◦ β
     = α ◦ ⟨1Y, 1Y⟩ ◦ ⟨1_{Y×Y}, 1_{Y×Y}⟩ ◦ π1′, π1 ◦ β
     ⇒ · · ·

where π1 : Y × Y −→ Y and π1′ : (Y × Y) × (Y × Y) −→ Y × Y are first projections. The critical pairs

(CP)  ⟨αⁱ, β⁰ ◦ αʲ ; αⁱ ◦ β⁰, αʲ⟩

will be differently oriented, depending on the nature of the arrow β⁰. In case the signature Ω0 is empty, the solution is the following couple of rules:

(Rpr)∗  αⁱ, π ◦ αʲ ⇒ αⁱ ◦ π, αʲ
(Rdi)∗  αⁱ ◦ δ, αʲ ⇒ αⁱ, δ ◦ αʲ

where π is any strict projection and δ is any strict diagonal^15 (a projection - resp. diagonal - is strict iff it is not a renaming). Given that any β⁰ factors as π ◦ δ, where π is a projection and δ is a diagonal, this pair of rules is sufficient to make all critical pairs (CP) joinable, if Ω0 is empty. In our case we cannot suppose Ω0 to be empty; however, we can try to identify two different classes of arrows in T0 forming a factorization system: arrows in the first class will be associated 'to the left' and arrows in the second class 'to the right', as in the case in which Ω0 is empty. There is a standard notion of factorization system in category theory (see [6]), but such a notion is too strong in the present context, so we weaken it. Let C be any category; by a weak factorization system in C, we mean a pair of classes of arrows (E, M) from C such that:

(i) both E and M contain identities and are closed with respect to left and right composition with arrows in E ∩ M;

^15 Recall from Section 2 that we call projections (resp. diagonals) arrows X^n −→ X^m which are m-tuples of distinct πi (resp. m-tuples of πi including all of π1, . . . , πn).

(ii) for every α ∈ C, there are ε ∈ E, µ ∈ M such that α = ε ◦ µ;
(iii) whenever we have a commutative square

          ε1
     Y0 -----> Y1
     |           |
  ε2 |           | µ1
     v           v
     Y2 -----> Y
          µ2

with ε1, ε2 ∈ E, µ1, µ2 ∈ M, there is a unique ρ ∈ E ∩ M such that ε2 ◦ ρ = ε1 and ρ ◦ µ1 = µ2 (this condition says that the factorization given by (ii) is essentially unique).

From the above axioms it follows that the arrows ρ ∈ E ∩ M are invertible (because they have two trivial factorizations, namely ρ ◦ 1 and 1 ◦ ρ, hence...); such arrows will be just renamings in our cases. Notice that we do not ask for E, M to be closed under composition, nor that they contain isomorphisms and are 'orthogonal' to each other (like in standard factorization systems).

Main Example. For any equational theory T = (Ω, Ax), the corresponding Lawvere category T always has a weak factorization system (E, M) (which we call the standard weak factorization system for T):

• arrows in E are just the projections;
• arrows in M are those α such that, whenever α = ε ◦ α′ (with ε ∈ E), ε must be a renaming.

The factorizations α = α_ε ◦ α_µ (with α_ε ∈ E, α_µ ∈ M) are obtained as follows. Let ~t(x1, . . . , xn) be a tuple of terms containing at most the variables x1, . . . , xn; say that this tuple is n-minimized iff for no i = 1, . . . , n we have ⊢T ~t = ~t(c0/xi).^16 Now we have that the m-tuple of terms ~t is n-minimized iff the arrow α : X^n −→ X^m belongs to M, where α is the vector of the equivalence classes of terms represented by the m components of ~t (if, say, ~t = ⟨t1, . . . , tm⟩, then α is ⟨{t1}, . . . , {tm}⟩). Suppose in fact, on one side, that we have α = ε ◦ α′, where ε : X^n −→ X^k is the tuple ⟨πi1, . . . , πik⟩; as such a tuple is a strict projection, the ij are all distinct and some s = 1, . . . , n is missed. Let α′ be formed by the equivalence classes represented by the terms ~t′(x1, . . . , xk); the relation α = ε ◦ α′ means that ⊢T ~t(x1, . . . , xn) = ~t′(xi1/x1, . . . , xik/xk). Replacing xs by the constant c0, we get

⊢T ~t(c0/xs) = ~t′(xi1, . . . , xik),   hence   ⊢T ~t(c0/xs) = ~t,

^16 Notations like ⊢T ~u = ~v, for ~u = ⟨u1, . . . , um⟩ and ~v = ⟨v1, . . . , vm⟩, mean that ⊢T ⋀_{j=1}^m uj = vj. Recall that in Section 2 we assumed that there is at least one ground term c0 in our signatures.

contrary to the fact that ~t is n-minimized. Conversely, if ~t is not n-minimized, we have ⊢T ~t(c0/xs) = ~t for some s, hence α admits a factorization through the proper projection ⟨π1, . . . , πs−1, πs+1, . . . , πn⟩. We have thus established that α : X^n −→ X^m belongs to M iff it is represented by some n-minimized vector of terms.

Let now α be arbitrary; how can we get the factorization α = α_ε ◦ α_µ, where α_ε ∈ E and α_µ ∈ M? This is easy: take any vector of terms in the equivalence classes of α containing a minimal set of variables: if such a vector is ~t(xi1, . . . , xik), then the factorization is α = ⟨πi1, . . . , πik⟩ ◦ β, where β represents the vector of terms ~t(x1, . . . , xk). Notice that this process is effective in case the word problem for T is solvable: one takes any ~t representing α and then goes on by replacing variables in it by c0; the procedure stops when only terms not provably equal to ~t can be obtained.

Next we show that the above factorization is unique up to a renaming. Suppose we have a commutative square in T

           ε2
     X^m -----> X^l
     |            |
  ε1 |            | µ2
     v            v
     X^k -----> X^n
           µ1

with ε1, ε2 ∈ E and µ1, µ2 ∈ M. For the sake of simplicity, we can apply a suitable renaming to X^m so that we have ε1 = ⟨π1, . . . , πk⟩ (i.e. ε1 projects on the first k components) and ε2 = ⟨πj1, . . . , πjl⟩; now µ1, µ2 must be represented by k-, l-minimized vectors of terms ~t1, ~t2. The commutativity of the square says that we have ⊢T ~t1(x1, . . . , xk) = ~t2(xj1, . . . , xjl). By minimization, we must have {1, . . . , k} = {j1, . . . , jl} (otherwise, one can 'reduce' variables in t1 or t2 by replacing them with c0); this means also that k = l. Now the renaming ⟨πj1, . . . , πjl⟩ : X^k −→ X^k fills the 'bottom-top' diagonal of the above square (and is the unique such), as wanted.

Using the above described standard weak factorization system (which we conveniently call (E0, M0)) available in T0, we can replace rule (Rdi)∗ by the following one:

(Rµ)∗  αⁱ ◦ µ, αʲ ⇒ αⁱ, µ ◦ αʲ    (µ ∈ M0)

which is stronger than (Rdi)∗ because diagonals are always in M0 (they cannot be factored through a strict projection or, to say it differently, they are always represented by minimized vectors of terms). As rule (Rpr)∗ is kept, the effect of rules (Rpr)∗ and (Rµ)∗ is that all critical pairs (CP) are now joinable (because β⁰ can be factored as β⁰_ε ◦ β⁰_µ, with β⁰_ε ∈ E0 and β⁰_µ ∈ M0). Apart from evident problems concerning the effectiveness of applicability of rule (Rµ)∗, this is still bad because it may once again produce infinite rewriting. This is especially evident in case T0 is not collapse-free. Suppose for instance we have a collapsing equation like f(x1, x1) = x1 in T0; then if we start with the path x1, t(x1) (where t(x1) is any term), we can decompose x1 as ⟨x1, x1⟩ ◦ f(x1, x2), thus producing the following rewrite steps:

⟨x1, x1⟩ ◦ f(x1, x2), t(x1) ⇒ ⟨x1, x1⟩, t(f(x1, x2)) ⇒ x1, ⟨x1, x1⟩ ◦ t(f(x1, x2))

yielding again x1, t(x1). We cannot overcome this problem without postulating anything (after all, combined solvable word problems might be unsolvable...). We shall postulate that there is a canonical way of extracting M0-components from terms in Ωi (of course, we shall also have to assume that such extraction can be done in an effective way, see Section 10).^17 This extra assumption will restrict rule (Rµ)∗ (or, to put it differently, will allow the completion/simplification process to get rid of undesired instances of rule (Rµ)∗). We first need to come back once again to the abstract categorical framework. Let C be a subcategory of C′ and let (E, M) be a weak factorization system in C. A weak factorization system (E′, M) in C′ (notice that M is the same!) is said to be a left extension of (E, M) iff the following hold:

• E′ ∩ C = E;
• if ε1, ε2 ∈ E and if ε ∈ E′, then ε1 ◦ ε ∈ E′ and ε ◦ ε2 ∈ E′ (whenever the compositions make sense).

Notice that this implies that E - not necessarily E′ - is closed under composition. Let us say that Ti is constructible over T0 iff in Ti there is a left extension (Ei, M0) of the standard weak factorization system (E0, M0) of T0.

Assumption. We assume that T1, T2 are both constructible over T0.

We postpone to Section 10 a symbolic translation of this assumption as well as the analysis of some examples (and counterexamples). For the moment, let us underline that, as an effect of the above assumption, any arrow αⁱ now admits two factorizations, namely:

^17 The assumption of [3, 4] may be seen as the stronger requirement that there is a maximal way of extracting M0-components (such a stronger requirement is incompatible with the existence of collapsing equations in T0).

• it can be factored as αⁱ_ε ◦ αⁱ_m according to the standard weak factorization system (E0, Mi) of Ti (we recall that here E0 is formed by the arrows which are projections, whereas Mi is formed by the arrows represented by minimized - in the sense of the theory Ti - vectors of terms);

• it can be factored as αⁱ_e ◦ αⁱ_µ according to the left extension (Ei, M0) of the standard weak factorization system of T0 (here the class Ei is axiomatically given by the above Assumption, whereas M0 is the class of arrows from T0 represented by minimized vectors of terms - in the sense of the theory T0).^18

Rules (Rpr)∗, (Rµ)∗ are so restricted:

(Rε)  αⁱ, αʲ ⇒ αⁱ ◦ αʲ_ε, αʲ_m
(Rµ)  αⁱ, αʲ ⇒ αⁱ_e, αⁱ_µ ◦ αʲ

called the ε-extraction and µ-extraction rules, respectively.^19 Let us call R0 the rewriting system formed by the rules (Rci), (Rε), (Rµ); in Section 6 we shall prove that

Theorem 5.1 R0 is canonical.

As an effect of the above Theorem, rules (Rpr)∗, (Rdi)∗, (Rµ)∗ are all canceled during the completion/simplification process (because their members are joinable in R0); we shall nevertheless artificially postpone the cancellation of rules (Rpr)∗ and (Rdi)∗ to the end of the completion, because we shall make further use of them in order to identify the good superposition/simplification steps needed to treat the remaining rule (Rpi)∗ (which causes some further confluence problems). In fact, in order to finish our completion process we only need to identify a couple of very specific superpositions yielding the right modification of the rule (Rpi)∗:

(Rpi)∗  1 × α₂ⁱ, α₁ʲ × 1, βⁱ ⇒ α₁ʲ × 1, (1 × α₂ⁱ) ◦ βⁱ

(recall that in case i = 1, there is an extra arrow to the right of both members).^20 Let us consider the path

Y2 −⟨γⁱ, 1_{Y2}⟩→ Y1 × Y2 −1_{Y1} × α₂ⁱ→ Y1 × Z2 −α₁ʲ × 1_{Z2}→ Z1 × Z2 −βⁱ→ U

giving rise to the superposition (among rules (Rcⁱ) and (Rpⁱ)∗)

^18 These vectors of terms are also n-minimized in the sense of Ti, given that Ti is conservative over T0.
^19 It goes without saying that such rules do not apply in case of trivial factorizations (i.e. when αʲ_ε - resp. αⁱ_µ - are, up to a renaming, just identities). We allow applying the rule also in case i = j (although in principle this is not needed).
^20 Recall from Section 4 that we allow j to be different from or equal to i.

              ⟨γⁱ, 1⟩, 1 × α₂ⁱ, α₁ʲ × 1, βⁱ
             /                          \
      (Rcⁱ) v                            v (Rpⁱ)∗
 ⟨γⁱ, α₂ⁱ⟩, α₁ʲ × 1, βⁱ        ⟨γⁱ, 1⟩, α₁ʲ × 1, (1 × α₂ⁱ) ◦ βⁱ

The related critical pair is oriented as follows:

(Rp0)∗  ⟨γⁱ, α₂ⁱ⟩, α₁ʲ × 1, βⁱ ⇒ ⟨γⁱ, 1⟩, α₁ʲ × 1, (1 × α₂ⁱ) ◦ βⁱ.

We need another final superposition (among (Rp0)∗ and (Rdi)∗): consider the path (here αʲ : Y1 × Z −→ Y2)

Y −⟨γⁱ, δⁱ, δⁱ⟩→ Y1 × Z × Z −αʲ × 1_Z→ Y2 × Z −βⁱ→ U

and the superposition

              ⟨γⁱ, δⁱ, δⁱ⟩, αʲ × 1_Z, βⁱ
             /                        \
     (Rdi)∗ v                          v (Rp0)∗
 ⟨γⁱ, δⁱ⟩, ⟨αʲ, π_Z⟩, βⁱ       ⟨γⁱ, δⁱ, 1_Y⟩, αʲ × 1_Y, (1_{Y2} × δⁱ) ◦ βⁱ

where we used the fact that ⟨γⁱ, δⁱ, δⁱ⟩ = ⟨γⁱ, δⁱ⟩ ◦ (1_{Y1} × ∆_Z) (∆_Z is the diagonal ⟨1_Z, 1_Z⟩) and the fact that (1_{Y1} × ∆_Z) ◦ (αʲ × 1_Z) = ⟨αʲ, π_Z⟩ (π_Z is the projection Y1 × Z −→ Z). We are near the end of the completion process; we first reduce the second component of the above critical pair by using two (Rpr)∗-reduction steps:^21 suppose that δⁱ : Y −→ Z has factorization

Y −δⁱ_ε→ Y′ −δⁱ_m→ Z,

then we have

⟨γⁱ, δⁱ, 1_Y⟩, αʲ × 1_Y, (1_{Y2} × δⁱ) ◦ βⁱ
  ⇓
⟨γⁱ, δⁱ, 1_Y⟩, αʲ × δⁱ_ε, (1_{Y2} × δⁱ_m) ◦ βⁱ
  ⇓
⟨γⁱ, δⁱ, δⁱ_ε⟩, αʲ × 1_{Y′}, (1_{Y2} × δⁱ_m) ◦ βⁱ.

^21 This reduction is important: without it, we may have problems in the termination proof.

So our final products rule is

(Rpⁱ)  ⟨γⁱ, δⁱ⟩, ⟨αʲ, π_Z⟩, βⁱ ⇒ ⟨γⁱ, δⁱ, δⁱ_ε⟩, αʲ × 1_{Y′}, (1_{Y2} × δⁱ_m) ◦ βⁱ

(recall we have an extra arrow to the right in both members in case i = 1). Putting types, the first member is

(I)  Y −⟨γⁱ, δⁱ⟩→ Y1 × Z −⟨αʲ, π_Z⟩→ Y2 × Z −βⁱ→ U

whereas the second member is

(II)  Y −⟨γⁱ, δⁱ, δⁱ_ε⟩→ Y1 × Z × Y′ −αʲ × 1_{Y′}→ Y2 × Y′ −(1_{Y2} × δⁱ_m) ◦ βⁱ→ U.

We add a proviso for this rule: δⁱ ∉ E0 (that is, δⁱ cannot be a projection). The reason for this last proviso is that, in case δⁱ is a projection, the second member of (Rpⁱ) can be rewritten to the first by using rule (Rdi)∗ (thus causing termination problems). In fact, in case δⁱ is a projection, we have δⁱ = δⁱ_ε and δⁱ_m = 1_Z, hence the second member is ⟨γⁱ, δⁱ, δⁱ⟩, αʲ × 1_Z, βⁱ and we can rewrite it as follows:

⟨γⁱ, δⁱ, δⁱ⟩, αʲ × 1_Z, βⁱ = ⟨γⁱ, δⁱ⟩ ◦ (1_{Y1} × ∆_Z), αʲ × 1_Z, βⁱ ⇒ ⟨γⁱ, δⁱ⟩, ⟨αʲ, π_Z⟩, βⁱ

thus getting the first member.

To finish, we observe that rules (Rpi)∗ and (Rp0)∗ can be removed, because their members become joinable in the rewrite system R obtained by adding (Rpⁱ) to R0. We check this for the former rule, leaving the latter (which is treated in a very similar way) to the reader. The first member of (Rpi)∗ is

Y1 × Y2 −⟨π_{Y1}, π_{Y2} ◦ αⁱ⟩→ Y1 × Z2 −⟨π_{Y1} ◦ α₁ʲ, π_{Z2}⟩→ Z1 × Z2 −βⁱ→ U

whereas the second member is

Y1 × Y2 −α₁ʲ × 1_{Y2}→ Z1 × Y2 −(1_{Z1} × αⁱ) ◦ βⁱ→ U.

Applying a (Rpr)∗-rewrite step to the second member, we get (suppose that αⁱ factors as Y2 −αⁱ_ε→ Y2′ −αⁱ_m→ Z2):

(∗)  α₁ʲ × 1_{Y2}, (1_{Z1} × αⁱ) ◦ βⁱ ⇒ α₁ʲ × αⁱ_ε, (1_{Z1} × αⁱ_m) ◦ βⁱ.

Let us now operate on the first member by successively using rules (Rpⁱ), (Rpr)∗, (Rcʲ) as follows (to apply (Rpⁱ), notice that π_{Y2} ◦ αⁱ = (π_{Y2} ◦ αⁱ_ε) ◦ αⁱ_m, so by uniqueness this is the factorization of π_{Y2} ◦ αⁱ in the standard weak factorization system of Ti):^22

⟨π_{Y1}, π_{Y2} ◦ αⁱ⟩, ⟨π_{Y1} ◦ α₁ʲ, π_{Z2}⟩, βⁱ
  ⇓ (Rpⁱ)
⟨π_{Y1}, π_{Y2} ◦ αⁱ, π_{Y2} ◦ αⁱ_ε⟩, (π_{Y1} ◦ α₁ʲ) × 1_{Y2′}, (1_{Z1} × αⁱ_m) ◦ βⁱ
  =
⟨π_{Y1}, π_{Y2} ◦ αⁱ, π_{Y2} ◦ αⁱ_ε⟩, ⟨π_{Y1}, π_{Y2′}⟩ ◦ (α₁ʲ × 1_{Y2′}), (1_{Z1} × αⁱ_m) ◦ βⁱ
  ⇓ (Rpr)∗
⟨π_{Y1}, π_{Y2} ◦ αⁱ, π_{Y2} ◦ αⁱ_ε⟩ ◦ ⟨π_{Y1}, π_{Y2′}⟩, α₁ʲ × 1_{Y2′}, (1_{Z1} × αⁱ_m) ◦ βⁱ
  =
⟨π_{Y1}, π_{Y2} ◦ αⁱ_ε⟩, α₁ʲ × 1_{Y2′}, (1_{Z1} × αⁱ_m) ◦ βⁱ
  ⇓ (Rcʲ)
α₁ʲ × αⁱ_ε, (1_{Z1} × αⁱ_m) ◦ βⁱ

as in (∗). In conclusion, we have obtained the rewriting system R described by Table 1 (in the last two rules of the Table, Z, Y′ and Y2 are the codomains of δⁱ, δⁱ_ε and αʲ, respectively, as in the fully displayed paths (I) and (II) above).^23 Recall that renamings of rules of R are available as rules of R. Such a convention can however be slightly simplified, given that the above rules are all closed under the operation of composing the first (or last) arrow in each member with the same single renaming. Thus we can merely stipulate that if L ⇒ R is a rule, then L′ ⇒ R′ is also a rule, where L′ is any alphabetic variant of L and R′ is any alphabetic variant of R. The content of the present section can be summarized as follows (recall that R is obtained from R∗ by a few completion steps):

Lemma 5.2 For well-coloured paths K1, K2, we have K1 ⇔∗_R K2 iff K1 ⇔∗_{R∗} K2.

^22 If αⁱ is a projection, rule (Rpⁱ) does not apply; however, in this case 1_{Z1} × αⁱ_m is the identity, αⁱ_ε = αⁱ and a single (Rcʲ)-rewrite step reduces the first member as in (∗).
^23 Notice the following subtlety concerning rule (Rc1) (a similar observation applies to rule (Rp1) too): the paths α¹, β⁰ and α¹ ◦ β⁰ are both well-coloured in case the composed arrow α¹ ◦ β⁰ collapses to an arrow from T0. In this case, rule (Rc1) does not apply, but the two paths are nevertheless joinable by eliminating peaks from the following ⇔∗_R-chain (according to the instructions given in the confluence proof): α¹ ◦ β⁰ ⇐ α¹ ◦ β⁰, 1 ⇐ α¹, β⁰, 1 ⇒ α¹, β⁰. The point behind that lies in the above mentioned properties of left extensions of factorization systems: if α¹ ◦ β⁰ collapses, then (α¹_e ◦ (α¹_µ ◦ β⁰)_ε) ◦ (α¹_µ ◦ β⁰)_m is, by uniqueness, the factorization, in T0 as well as in T1, of the arrow α¹ ◦ β⁰; hence we have α¹, β⁰ ⇒ α¹_e, α¹_µ ◦ β⁰ ⇒ α¹_e ◦ (α¹_µ ◦ β⁰)_ε, (α¹_µ ◦ β⁰)_m ⇒ (α¹_e ◦ (α¹_µ ◦ β⁰)_ε) ◦ (α¹_µ ◦ β⁰)_m = α¹ ◦ β⁰, where the last passage is now correct (it is an (Rc2)-step applied to arrows from T0).
(Rc1 ) α1 , β 1 , γ ⇒ α1 ◦ β 1 , γ (Rc2 ) α2 , β 2 ⇒ α2 ◦ β 2 (Rε )

α, β ⇒ α ◦ βε , βm

(Rµ ) α, β ⇒ αe , αµ ◦ β 1 ) ◦ β1, θ (Rp1 ) hγ 1 , δ 1 i, hα, πZ i, β 1 , θ ⇒ hγ 1 , δ 1 , δε1 i, α × 1Y 0 , (1Y2 × δm where δ 1 6∈ E0 2 ) ◦ β2 (Rp2 ) hγ 2 , δ 2 i, hα, πZ i, β 2 ⇒ hγ 2 , δ 2 , δε2 i, α × 1Y 0 , (1Y2 × δm where δ 2 6∈ E0

Table 1: The system R

In Section 9 we shall prove our main result, namely that Theorem 5.3 R is canonical.

6

Local confluence, I

In this section we will prove the canonicity of the system R0 which, we recall, is the system described by Table 2.

(Rc1 ) α1 , β 1 , γ ⇒ α1 ◦ β 1 , γ (Rc2 ) α2 , β 2 ⇒ α2 ◦ β 2 (Rε )

α, β ⇒ α ◦ βε , βm

(Rµ ) α, β ⇒ αe , αµ ◦ β

Table 2: The system R0 We begin by showing that R0 is locally confluent: we single out all critical pairs arising from superpositions between the rules of R0 and we prove that they are

27

joinable. Most of the cases can be reduced to the critical pairs treated in the following lemma. Lemma 6.1 The paths αi ◦ σ 0 , β j and αi , σ 0 ◦ β j are joinable in R0 . Proof. Let αei and αµi be the e/µ components of αi and let us consider the following commutative diagram, where ε1 ◦ µ corresponds to the ε/µ factorization in T0 of j αµi ◦ σ 0 and ε2 ◦ δm is the ε/m factorization of µ ◦ β j in Tj . αi HH

σ0

-

βj

-

6 αei HHH αµi HH j

-

6

6j

µ

δm

-

ε1

ε2

-

Since αei ◦ ε1 belongs to Ei (recall the definition of left extensions of factorization j systems) and δm belongs to Mj , we have (up to a renaming): (αi ◦ σ 0 )e = αei ◦ ε1

(αi ◦ σ 0 )µ = µ

(αµi ◦ σ 0 ◦ β j )ε = ε1 ◦ ε2

j (αµi ◦ σ 0 ◦ β j )m = δm

We can do the following rewriting steps: αi ◦ σ 0 , β j

⇒Rµ

αei ◦ ε1 , µ ◦ β j

⇒R ε

j αei ◦ ε1 ◦ ε2 , δm

αi , σ 0 ◦ β j

⇒Rµ

αei , αµi ◦ σ 0 ◦ β j

⇒R ε

j αei ◦ ε1 ◦ ε2 , δm

and this proves the lemma.

a

In subsequent sections we proceed with a systematic analyses of the cases. To simplify the exposition, we treat (Rc1 ) and (Rc2 ) together; however some applications of (Rc1 ) may require an additional arrow λ to the right (we put it within round brackets).

6.1

Superposition between (Rci ) and (Rcj ) αi , β, γ j , (λ) ¡

@

¡

@

(Rci ) ¡ ¡

¡ ª

αi ◦ β, γ j , (λ)

j

@ (Rc ) @ @ R

αi , β ◦ γ j , (λ)

If i = j, we can rewrite both members to αi ◦ β ◦ γ i . Otherwise, necessarily β belongs to T0 and we can apply Lemma 6.1.

28

6.2

Superpositions between (Rci ) and (Rµ )

CASE 1 αi , β i , (λ) ¡

¡ (Rµ ) ¡ ¡ ¡ ¡ ª ¡

αei , αµi ◦ β i , (λ)

@

@ @ (Ri ) @ c @ @ @ R - αi ◦ β i , (λ) (Rci )

CASE 2 α, β i , γ i , (λ) ¡

¡

@

@ i @ (Rc ) @ @ R

(Rµ ) ¡ ¡

¡ ª

αe , αµ ◦ β i , γ i , (λ)

α, β i ◦ γ i , (λ)

¡ ¡ ¡ (Rµ )

@ @ i

(Rc ) @

@

@ R

¡

¡ ª

αe , αµ ◦ β i ◦ γ i , (λ) CASE 3 αi , β i , γ ¡

¡

(Rµ ) ¡ ¡

¡ ª

αi , βei , βµi ◦ γ

@

@ i @ (Rc ) @ @ R

αi ◦ β i , γ = i i (α ◦ βe ) ◦ βµi , γ

(Rci ) ?

αi ◦ βei , βµi ◦ γ By Lemma 6.1, with σ 0 = βµi .

29

6.3

Superpositions between (Rci ) and (Rε )

CASE 1 αi , β i , (λ) ¡

¡

@

@

(Rε ) ¡

@ (Ri ) @ c @ @ @ R - αi ◦ β i , (λ)

¡

¡

¡ ¡ ª

i , (λ) αi ◦ βεi , βm

(Rci )

CASE 2 α, β i , γ i , (λ) ¡

¡

@

@ i @ (Rc ) @ @ R

(Rε ) ¡ ¡

¡ ª

i , γ i , (λ) α ◦ βεi , βm

α, β i ◦ γ i , (λ) =

i ◦ γ i ), (λ) α, βεi ◦ (βm

(Rci ) ?

i ◦ γ i , (λ) α ◦ βεi , βm

By Lemma 6.1, with σ 0 = βεi . CASE 3 αi , β i , γ ¡

¡

(Rε ) ¡ ¡

¡ ª

α i , β i ◦ γ ε , γm

@

@ i @ (Rc ) @ @ R

αi ◦ β i , γ

@ @ i

¡ ¡ ¡ (Rε )

(Rc ) @

@

@ R

¡

¡ ª

α i ◦ β i ◦ γε , γ m

30

6.4

Superposition between (Rµ ) and itself α, β, γ ¡

¡

@

(Rµ ) ¡ ¡ ª

@

@ (Rµ ) @ @ R

¡

αe , αµ ◦ β, γ = αe , (αµ ◦ βe ) ◦ βµ , γ

α, βe , βµ ◦ γ (Rµ ) ?

αe , αµ ◦ βe , βµ ◦ γ By Lemma 6.1, with σ 0 = βµ .

6.5

Superpositions between (Rµ ) and (Rε )

CASE 1 The path α, β rewrites by (Rµ) to αe, αµ ◦ β = αe, (αµ ◦ βε) ◦ βm and by (Rε) to α ◦ βε, βm = αe ◦ (αµ ◦ βε), βm. Joinability follows by Lemma 6.1, with σ′ = αµ ◦ βε.

CASE 2 The path α, β, γ rewrites by (Rµ) to αe, αµ ◦ β, γ and by (Rε) to α, β ◦ γε, γm. The first member rewrites by (Rε) and the second by (Rµ) to the common path αe, αµ ◦ β ◦ γε, γm.

CASE 3 The path α, β, γ rewrites by (Rµ) (applied to β, γ) to α, βe, βµ ◦ γ = α, βε ◦ (βm)e, (βm)µ ◦ γ and by (Rε) to α ◦ βε, βm, γ. The second member rewrites by (Rµ) to α ◦ βε, (βm)e, (βm)µ ◦ γ. For the first member we use the fact that the diagram β = βε ◦ βm = βε ◦ (βm)e ◦ (βm)µ is commutative. Thus, reasoning as usual (by uniqueness of factorizations, up to a renaming), we can state that βe = βε ◦ (βm)e and βµ = (βm)µ. We can apply Lemma 6.1, with σ′ = βε.


6.6

Superposition between (Rε ) and itself

The path α, β, γ rewrites by (Rε) to α ◦ βε, βm, γ and (applying (Rε) to β, γ) to α, β ◦ γε, γm = α, βε ◦ (βm ◦ γε), γm. The first member rewrites by (Rε) to α ◦ βε, βm ◦ γε, γm; joinability then follows by Lemma 6.1, with σ′ = βε. We can conclude:

Theorem 6.2 R0 is locally confluent.

It remains to show the termination of R0. This result is a consequence of Theorem 9.8; however, here we give a direct proof, which uses less machinery. We need a complexity measure for paths which decreases with each application of our rules. To this aim, we define:

µ(αi) = 0 if αi ∈ Ei, and µ(αi) = 1 otherwise;
ε(αi) = 0 if αi ∈ Mi, and ε(αi) = 1 otherwise.

Let K be the path α1, . . . , αn. We define:

µ(K) = ⟨µ(α1), . . . , µ(αn)⟩        ε(K) = ⟨ε(αn), . . . , ε(α1)⟩

(notice that µ(K) = µ(K′) and ε(K) = ε(K′) hold in case K and K′ are alphabetic variants of each other). Finally, we introduce the following order relation ≻ between paths K, L:

• K ≻ L if and only if either (i) or (ii) holds:
(i) |K| > |L| (where |K| denotes the length of K);
(ii) |K| = |L| and ⟨µ(K), ε(K)⟩ >l ⟨µ(L), ε(L)⟩ (where >l denotes the lexicographic order between n-tuples of integers).

It is a standard fact that


Lemma 6.3 ≻ is a terminating transitive relation.

Now we prove that ≻ is a stable relation.

Lemma 6.4 K ≻ K′ implies L, K, R ≻ L, K′, R (for all L, R).

Proof. If |K| > |K′| we immediately have L, K, R ≻ L, K′, R. Suppose now |K| = |K′|; we have to show that ⟨µ(L), µ(K), µ(R), ε(R), ε(K), ε(L)⟩ >l ⟨µ(L), µ(K′), µ(R), ε(R), ε(K′), ε(L)⟩. This follows from the fact that either µ(K) >l µ(K′), or µ(K) = µ(K′) and ε(K) >l ε(K′). □

Lemma 6.5 If L ⇒R0 L′, then L ≻ L′.

Proof. Since ≻ is stable, it is sufficient to analyze the following three cases.

(1) αi, βi ⇒(Rci) αi ◦ βi. We have |αi, βi| > |αi ◦ βi|, hence αi, βi ≻ αi ◦ βi.

(2) α, β ⇒(Rµ) αe, αµ ◦ β. The two paths have the same length; moreover µ(α) = 1 and µ(αe) = 0 (otherwise there is no way to apply (Rµ)). This implies that ⟨µ(α), µ(β), ε(β), ε(α)⟩ >l ⟨µ(αe), µ(αµ ◦ β), ε(αµ ◦ β), ε(αe)⟩, from which α, β ≻ αe, αµ ◦ β follows.

(3) α, β ⇒(Rε) α ◦ βε, βm. Clearly |α, β| = |α ◦ βε, βm|. Moreover:
- µ(α) ≥ µ(α ◦ βε). In fact, if µ(α) = 0, then α ∈ Ei (i = 1, 2), which implies α ◦ βε ∈ Ei, hence µ(α ◦ βε) = 0.
- µ(β) ≥ µ(βm). Suppose that µ(βm) = 1, that is βm = (βm)e ◦ µ, with µ different from the identity. Since β = (βε ◦ (βm)e) ◦ µ and βε ◦ (βm)e belongs to Ek, µ is also the µ-component of β, and this means that µ(β) = 1.
- ε(β) > ε(βm). Otherwise we could not apply (Rε).

We get ⟨µ(α), µ(β), ε(β), ε(α)⟩ >l ⟨µ(α ◦ βε), µ(βm), ε(βm), ε(α ◦ βε)⟩, and this proves the lemma. □


Since ≻ is a terminating transitive relation, the following theorem is proved.

Theorem 6.6 R0 is terminating.

This concludes the proof of Theorem 5.1.²⁴
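The termination measure above is concrete enough to be executed. Here is a minimal Python sketch (not from the paper: an arrow is modelled only by the two flags the measure needs, encoded as a dictionary of our own devising):

```python
# Toy model of the termination measure of Section 6: an arrow carries a
# "mu" flag (0 iff it lies in Ei) and an "eps" flag (0 iff it lies in Mi);
# a path is a list of such arrows.

def mu(path):
    # µ(K) = <µ(a1), ..., µ(an)>
    return tuple(a["mu"] for a in path)

def eps(path):
    # ε(K) = <ε(an), ..., ε(a1)>  (note the reversed order)
    return tuple(a["eps"] for a in reversed(path))

def measure(path):
    # K ≻ L iff |K| > |L|, or |K| = |L| and <µ(K), ε(K)> >lex <µ(L), ε(L)>;
    # for equal lengths, comparing the concatenated flag tuples
    # lexicographically is exactly the comparison of <µ(K), ε(K)>.
    return (len(path), mu(path) + eps(path))

def decreases(K, L):
    # check K ≻ L
    return measure(K) > measure(L)

# A (Rµ)-step α, β => αe, αµ ◦ β turns µ(α) = 1 into µ(αe) = 0, so the
# measure strictly decreases even though the path length is unchanged.
alpha, beta = {"mu": 1, "eps": 1}, {"mu": 1, "eps": 1}
alpha_e = {"mu": 0, "eps": 1}          # αe lies in Ei
alpha_mu_beta = {"mu": 1, "eps": 1}    # αµ ◦ β (worst case for the flags)
assert decreases([alpha, beta], [alpha_e, alpha_mu_beta])

# A (Rci)-step α, β => α ◦ β shortens the path, so it decreases too.
assert decreases([alpha, beta], [{"mu": 1, "eps": 1}])
```

Since the order on measures is a lexicographic product of well-founded orders, every rewriting sequence must terminate, which is exactly the content of Theorem 6.6.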

7

The system R+

Proving local confluence of R directly leads to unnecessary complications; this is why we prefer to introduce another system (which we call R+) and prove local confluence of the latter. In Section 9 we shall prove termination of both R and R+ and then make a more precise comparison between R and R+: from this comparison, canonicity of R follows immediately. In order to introduce R+ we first consider slight modifications of rules (Rµ) and (Rpi). Rule (Rµ) is enlarged as follows:

(Rµ)+    ⟨α, β⟩, γ ⇒ ⟨αe, β⟩, (αµ × 1) ◦ γ

(notice that in case the vector β is empty, we get the ordinary (Rµ)-rule). Rules (Rpi) are on the other hand restricted so that only 1-component arrows are ‘moved to the right’ (let us call (Rpi)+ the rules obtained by this restriction). In conclusion, we let R+ be the rewriting system of Table 3. It should be noticed that (as for R) also in R+ alphabetic variants of the above rules are available as rules. For instance, rule (Rpi)+ has the following alphabetic variant:

Y −⟨γ1i, di, γ2i⟩→ Y1 × X × Y2 −⟨α1j, πX, α2j⟩→ Z1 × X × Z2 −βi→ U

⇓

Y −⟨γ1i, di, diε, γ2i⟩→ Y1 × X × Y′ × Y2 −⟨π ◦ α1j, πY′, π ◦ α2j⟩→ Z1 × Y′ × Z2 −(1Z1 × dim × 1Z2) ◦ βi→ U

(where a further arrow must be inserted to the right in case i = 1, where Y′ is the codomain of diε and where π is the projection Y1 × X × Y′ × Y2 −→ Y1 × X × Y2). Other alphabetic variants are possible, e.g. by permuting the components of ⟨γ1i, di, diε, γ2i⟩. Such alphabetic variants will sometimes be used during confluence proofs. In the remaining part of this section we collect useful technical facts. We first analyze the relationship between the old and new µ-extraction rules.

²⁴ Notice that all results in this section depend only on the definition of left extensions of weak factorization systems. As such, they can be used to handle pushouts (for faithful and bijective-on-objects functors) in Cat, the category of all small categories. Recalling that monoids are just one-object categories, Theorem 5.1 specializes to a little result in pure string rewriting.


(Rc1)    α1, β1, γ ⇒ α1 ◦ β1, γ

(Rc2)    α2, β2 ⇒ α2 ◦ β2

(Rε)     α, β ⇒ α ◦ βε, βm

(Rµ)+    ⟨α, β⟩, γ ⇒ ⟨αe, β⟩, (αµ × 1) ◦ γ

(Rp1)+   ⟨γ1, d1⟩, ⟨α, πX⟩, β1, θ ⇒ ⟨γ1, d1, d1ε⟩, α × 1, (1 × d1m) ◦ β1, θ    where d1 ∉ E0

(Rp2)+   ⟨γ2, d2⟩, ⟨α, πX⟩, β2 ⇒ ⟨γ2, d2, d2ε⟩, α × 1, (1 × d2m) ◦ β2    where d2 ∉ E0

Table 3: The system R+

Lemma 7.1 If K ⇒ K′ by a single (Rµ)+-step, then there is K″ such that K′ rewrites to K″ by (at most) 2 (Rµ)+-rewrite steps and K rewrites to K″ by a single (Rµ)-rewrite step.

Proof. We have the following three (Rµ)+-rewrite steps:

⟨α, β⟩, γ ⇒ ⟨αe, β⟩, (αµ × 1) ◦ γ ⇒ ⟨αe, βe⟩, (αµ × βµ) ◦ γ ⇒ ⟨αe, βe⟩e, ⟨αe, βe⟩µ ◦ (αµ × βµ) ◦ γ

We need only to show that ⟨αe, βe⟩e = ⟨α, β⟩e and ⟨αe, βe⟩µ ◦ (αµ × βµ) = ⟨α, β⟩µ. Let us consider the factorization

⟨α, β⟩ = ⟨α, β⟩e ◦ ⟨α, β⟩µ : Y −→ Y′ −→ Z1 × Z2

and let us put ⟨α, β⟩µ = ⟨σ, τ⟩. We have (⟨α, β⟩e ◦ σε) ◦ σµ = αe ◦ αµ, hence (by uniqueness of factorization)

⟨α, β⟩e ◦ σε = αe   and   αµ = σµ

and similarly

⟨α, β⟩e ◦ τε = βe   and   βµ = τµ.


Thus

(∗)    ⟨α, β⟩ = ⟨α, β⟩e ◦ (⟨σε, τε⟩ ◦ (αµ × βµ)).

The arrow ⟨σε, τε⟩ ◦ (αµ × βµ) belongs to M0, as it is equal to ⟨σ, τ⟩ = ⟨α, β⟩µ; so if we factorize ⟨σε, τε⟩ as ε ◦ µ and then µ ◦ (αµ × βµ) as ε′ ◦ µ′, we get that ε ◦ ε′ is the identity (being equal to the first component of the ε/µ-factorization of an arrow in M0, namely ⟨σε, τε⟩ ◦ (αµ × βµ)). This can happen only if ε itself (which is a projection) is in fact the identity (up to a renaming); we have thus established that ⟨σε, τε⟩ belongs to M0, which means that

(∗)′    ⟨σε, τε⟩ is a diagonal

(this is clear, as σε, τε are both projections). From ⟨α, β⟩e ◦ σε = αe and ⟨α, β⟩e ◦ τε = βe, we get ⟨αe, βe⟩ = ⟨α, β⟩e ◦ ⟨σε, τε⟩. As the first component is in E i and the second component is in M0, we get by uniqueness of factorization ⟨αe, βe⟩e = ⟨α, β⟩e and

(∗)″    ⟨αe, βe⟩µ = ⟨σε, τε⟩

which gives the claim (combined with ⟨α, β⟩µ = ⟨σε, τε⟩ ◦ (αµ × βµ), coming from (∗)). □

The above Lemma guarantees that there is no need in the local confluence proof to compute superpositions between rule (Rµ)+ and other rules ((Rµ)+ itself included): it is sufficient to compute superpositions between (Rµ) and other rules.²⁵ Using (Rµ)+ instead of (Rµ) allows us to apply a less restrictive rule during confluence proofs; this makes some passages shorter (the only small price we pay is that we shall need to prove termination of (Rµ)+ too). The next Corollary will be used in Section 9 and is a slightly more accurate reformulation of what comes from the proof of Lemma 7.1 (recall that, according to (∗)′ and (∗)″, the third step was in fact a (Rdi)∗-step, where (Rdi)∗ is the diagonalization rule we met in Section 5):

Lemma 7.2 Let (Rµ)+1 be the following special case of rule (Rµ)+:

(Rµ)+1    ⟨a, α⟩, β ⇒ ⟨ae, α⟩, (aµ × 1) ◦ β.

If K ⇒ K′ by a single (Rµ)- or (Rµ)+-rewrite step, then K rewrites to K′ by using a finite number of (Rµ)+1-rewrite steps followed by a single (Rdi)∗-rewrite step.

²⁵ If K ⇒ K′ and K ⇒ K″ give rise to the critical pair (K′, K″) and, say, K ⇒ K′ is a (Rµ)+-step, we can find K0 such that K′ ⇒∗R+ K0 and the pair (K0, K″) is a critical pair generated by rule (Rµ) (instead of rule (Rµ)+).


In words: the e/µ-factorization of ⟨a1, . . . , an⟩ is obtained by taking the componentwise e/µ-factorizations and then applying a diagonalization step. The following fact is very useful:

Lemma 7.3 If ⟨e1i, . . . , eni⟩ ∈ E i, the eji are pairwise distinct.

Proof. As E i is closed under composition with projections, all the eji are in E i. Let ⟨ej1i, . . . , ejmi⟩ be a list of distinct arrows containing exactly all the arrows among e1i, . . . , eni. By the previous Lemma, ⟨ej1i, . . . , ejmi⟩ ∈ E i. According to the definition of ⟨ej1i, . . . , ejmi⟩, there is a diagonal δ such that ⟨ej1i, . . . , ejmi⟩ ◦ δ = ⟨e1i, . . . , eni⟩. As δ ∈ M0, by uniqueness of e/µ-factorizations, δ is a renaming (thus showing the claim). □

Corollary 7.4 αi ∈ E i iff the components of αi are pairwise distinct and all belong to E i.

A consequence of the above results is that e/µ-factorizations are stable under certain pullbacks, in the sense of the following:

Lemma 7.5 If α : Y1 −→ Y2 has factorization αe ◦ αµ, then for every Z, α × 1Z has factorization (αe × 1Z) ◦ (αµ × 1Z).

Proof. It is sufficient to show that the components of πY1 ◦ αe cannot be equal to the components of πZ. This is clear: otherwise we would have in our theories provable equations of the kind t = xi, where t is a term not containing the variable xi; this cannot be, otherwise (after making the term t ground by a substitution, if you like) we would obtain degeneration, i.e. that all terms are provably equal. □

We now show that also rule (Rpi) can be roughly achieved by finitely many (Rpi)+-rewrite steps. Let us use the notation K & L in order to express that there is K′ such that K ⇒∗R+ K′ and K′ ⇔∗R0 L.

Lemma 7.6 Let L be the path

L = Y −⟨γi, δi⟩→ Y1 × Z −⟨α, πZ⟩→ Y2 × Z −βi→ U −(λ)→ V

(where the arrow λ is missing in case i = 2) and let R, R′ be the following two paths:

R = ⟨γi, δi, δεi⟩, α × 1Y′, (1Y2 × δmi) ◦ βi, (λ)
R′ = ⟨γi, δi, 1Y⟩, α × 1Y, (1Y2 × δi) ◦ βi, (λ)

(where we supposed that Y′ is the codomain of δε). We have:


(i) R ⇔∗R0 R′;
(ii) if δi ∈ T0, then L ⇔∗R0 R;
(iii) in the general case, L & R (and consequently L & R′).

Proof. (i) was proved at the end of Section 5 (it was actually used as a simplification step during completion). (ii) is easy, because we can move (1 × δmi) to the left in R by ⇔∗R0-equivalence:

⟨γi, δi, δεi⟩, α × 1, (1 × δmi) ◦ βi, (λ) ⇔∗R0 ⟨γi, δi, δi⟩, α × 1, βi, (λ) ⇔∗R0 L.

(iii) is proved by induction on the number of components of δ. If such a number is 1, there is nothing to prove (because either (Rpi)+ or (ii) applies). So suppose it is bigger than 1. If δ ∈ T0, we have just proved a stronger claim; otherwise L and R (up to an alphabetic variant) are

(1)    Y −⟨γ, δ, d⟩→ Y1 × Z × X −⟨α, πZ, πX⟩→ Y2 × Z × X −β→ U −(λ)→ V

and

(2)    Y −⟨γ, δ, d, ⟨δ, d⟩ε⟩→ Y1 × Z × X × Y′ −α × 1Y′→ Y2 × Y′ −(1Y2 × ⟨δ, d⟩m) ◦ β→ U −(λ)→ V

respectively (with d ∉ T0). To the former we can apply a (Rpi)+-rewrite step, thus getting

(3)    Y −⟨γ, δ, d, dε⟩→ Y1 × Z × X × Y0″ −⟨α, πZ⟩ × 1Y0″→ Y2 × Z × Y0″ −(1Y2 × 1Z × dm) ◦ β→ U −(λ)→ V

(where we called Y0″ the codomain of dε). By the induction hypothesis, there is a path K″ such that (3) ⇒∗R+ K″ and K″ ⇔∗R0 (4), where (4) is the path (let Y0′ be the codomain of δε):

(4)    Y −⟨γ, δ, d, δε, dε⟩→ Y1 × Z × X × Y0′ × Y0″ −α × 1Y0′ × 1Y0″→ Y2 × Y0′ × Y0″ −(1Y2 × δm × dm) ◦ β→ U −(λ)→ V

As ⟨γ, δ, d, δε, dε⟩ is equal to ⟨γ, δ, d, 1Y⟩ ◦ (1Y1 × 1Z × 1X × ⟨δε, dε⟩), we can move 1Y1 × 1Z × 1X × ⟨δε, dε⟩ to the right by ⇔∗R0-equivalence, thus getting the path

(5)    Y −⟨γ, δ, d, 1Y⟩→ Y1 × Z × X × Y −α × 1Y→ Y2 × Y −(1Y2 × ⟨δ, d⟩) ◦ β→ U −(λ)→ V

which we know from (i) to be ⇔∗R0-equivalent to (2). In conclusion, we have

(1) ⇒R+ (3) ⇒∗R+ K″ ⇔∗R0 (4) ⇔∗R0 (5) ⇔∗R0 (2)

thus showing the claim. □


We need a final Lemma for the next Section:

Lemma 7.7 We have R1 & R2, where R1, R2 are the paths

R1 = Y × Z1 −⟨α1i, α2i⟩ × 1→ Y1 × Y2 × Z1 −πY1 × βj→ Y1 × Z2 −γi→ W −(λ)→ V
R2 = Y × Z1 −1 × βj→ Y × Z2 −(α1i × 1) ◦ γi→ W −(λ)→ V

(λ is missing in case i = 2).

Proof. Applying (an alphabetic variant of) Lemma 7.6(iii), we have that (writing π²Z1 for the projection onto the second occurrence of Z1):

R1 = ⟨πY ◦ α1i, πY ◦ α2i, πZ1⟩, ⟨πY1, πZ1 ◦ βj⟩, γi, (λ)
& ⟨1Y×Z1, πY ◦ α1i, πY ◦ α2i, πZ1⟩, 1Y×Z1 × (πZ1 ◦ βj), ((πY ◦ α1i) × 1Z2) ◦ γi, (λ)
= ⟨πY, πZ1, πY ◦ α1i, πY ◦ α2i, πZ1⟩, 1Y×Z1 × (πZ1 ◦ βj), ⟨πY, πZ2⟩ ◦ (α1i × 1Z2) ◦ γi, (λ) ⇔∗R0
(see Figure 1)
⟨πY, πZ1, πY ◦ α1i, πY ◦ α2i, πZ1⟩, ⟨πY, π²Z1⟩ ◦ (1Y × βj), (α1i × 1Z2) ◦ γi, (λ) ⇔∗R0
⟨πY, πZ1, πY ◦ α1i, πY ◦ α2i, πZ1⟩ ◦ ⟨πY, π²Z1⟩, 1Y × βj, (α1i × 1Z2) ◦ γi, (λ)
= ⟨πY, πZ1⟩, 1Y × βj, (α1i × 1Z2) ◦ γi, (λ)
= 1Y×Z1, 1Y × βj, (α1i × 1Z2) ◦ γi, (λ) ⇔∗R0
1Y × βj, (α1i × 1Z2) ◦ γi, (λ) = R2

as wanted. □

8

Local confluence, II

In this section we prove that R+ is locally confluent. In order to show the confluence of a pair of paths (R1, R2), we shall use the following schema: we find L1, L2 such that R1 & L1, R2 & L2 and L1 ⇔∗R0 L2. Canonicity of R0 (which was proved in Section 6) then guarantees that R1, R2 are joinable. Throughout this section we shall mention arrows γ, d, α, β, θ, λ, whose domains and codomains are fixed as follows:


Figure 1: Commutative diagram: the square with top arrow 1Y×Z1 × (πZ1 ◦ βj) : Y × Z1 × Y1 × Y2 × Z1 −→ Y × Z1 × Z2, vertical arrows ⟨πY, π²Z1⟩ and ⟨πY, πZ2⟩, and bottom arrow 1Y × βj : Y × Z1 −→ Y × Z2.

Y −⟨γ, d⟩→ Y1 × X −⟨α, πX⟩→ Y2 × X −β→ U −θ→ V −λ→ T

We also assume that d factorizes in ε/m-components as d = dε ◦ dm, with dε : Y −→ Y′ and dm : Y′ −→ X.

We first analyze some situations which occur very frequently during the local confluence proof.

Lemma 8.1 Let Ki (i ∈ {1, 2}) be the following path:

Ki = ⟨γi, di, diε⟩, αj × 1Y′, (1Y2 × dim) ◦ β′ ◦ θi, (λ)

(where λ lacks in case i = 2). Then:

(i) The path Ki′ = ⟨γi, di⟩, (⟨αj, πX⟩ ◦ β′)e, (⟨αj, πX⟩ ◦ β′)µ ◦ θi, (λ) is joinable with Ki in R+.

(ii) The path Ki″ = ⟨γi, di⟩, ⟨αj, πX⟩ ◦ β′, θi, (λ) is joinable with Ki in R+.

Proof. (ii) is trivially reduced to (i) (just apply (Rµ) in Ki″ to decompose ⟨αj, πX⟩ ◦ β′). To prove (i), we have to factorize the arrow ⟨αj, πX⟩ ◦ β′ in e/µ-components. We first factorize ⟨αj, πX⟩: by Lemmas 7.2 and 7.3, such a factorization is obtained by first factorizing αj in e/µ-components and then diagonalizing with πX in case πX appears among the components of αej. We have to distinguish whether πX is among the components of αej or not.

Case 1: πX is among the components of αej, hence αj has the following factorization in e/µ-components:


αej = ⟨ᾱj, πX⟩ : Y1 × X −→ S × X,   αµj : S × X −→ Y2.

Then:

⟨αj, πX⟩ = ⟨ᾱj, πX, πX⟩ ◦ (αµj × 1X) = ⟨ᾱj, πX⟩ ◦ [(1S × ∆X) ◦ (αµj × 1X)]

where ∆X : X −→ X × X is a diagonal. We have two subcases, depending on whether πX appears in the ε-component of (1S × ∆X) ◦ (αµj × 1X) ◦ β′ or not.

Subcase 1.1: let us assume that (1S × ∆X) ◦ (αµj × 1X) ◦ β′ has the following factorization in T0:

(1S × ∆X) ◦ (αµj × 1X) ◦ β′ = (πS′ × 1X) ◦ µ,   with πS′ × 1X : S × X −→ S′ × X.

It follows that ⟨αj, πX⟩ ◦ β′ = ⟨ᾱj, πX⟩ ◦ (πS′ × 1X) ◦ µ; since ⟨ᾱj, πX⟩ ◦ (πS′ × 1X) belongs to Ej, by the uniqueness of decomposition we have:

(⟨αj, πX⟩ ◦ β′)e = ⟨ᾱj, πX⟩ ◦ (πS′ × 1X) = ⟨ᾱj ◦ πS′, πX⟩,   (⟨αj, πX⟩ ◦ β′)µ = µ

It follows that Ki′ = ⟨γi, di⟩, ⟨ᾱj ◦ πS′, πX⟩, µ ◦ θi, (λ). We can apply Lemma 7.6(iii) (in fact, if i = 1 the arrow λ belongs to the path) and we obtain Ki′ & L1, where

L1 = ⟨γi, di, 1Y⟩, (ᾱj ◦ πS′) × 1Y, (1S′ × di) ◦ µ ◦ θi, (λ)

Let us consider Ki. We first observe that αj × 1Y′ can be decomposed in e/µ-components as (αej × 1Y′) ◦ (αµj × 1Y′) by Lemma 7.5; therefore an application of (Rµ) yields

Ki ⇒R+ ⟨γi, di, diε⟩, αej × 1Y′, (αµj × 1Y′) ◦ (1Y2 × dim) ◦ β′ ◦ θi, (λ)
= ⟨γi, di, diε⟩, ⟨π ◦ ᾱj, πX, πY′⟩, (αµj × 1Y′) ◦ (1Y2 × dim) ◦ β′ ◦ θi, (λ)


where π : Y1 × X × Y′ −→ Y1 × X. We can apply Lemma 7.6(iii) on ⟨di, diε⟩ and we get:

Ki & ⟨γi, di, diε, 1Y⟩, (π ◦ ᾱj) × 1Y, (1S × ⟨di, diε⟩) ◦ (αµj × 1Y′) ◦ (1Y2 × dim) ◦ β′ ◦ θi, (λ)
= ⟨γi, di, diε, 1Y⟩, (π ◦ ᾱj) × 1Y, (1S × ⟨di, diε⟩) ◦ (1S × 1X × dim) ◦ (αµj × 1X) ◦ β′ ◦ θi, (λ)
= (see Figure 2)
⟨γi, di, diε, 1Y⟩, (π ◦ ᾱj) × 1Y, (1S × di) ◦ (1S × ∆X) ◦ (αµj × 1X) ◦ β′ ◦ θi, (λ)
= ⟨γi, di, diε, 1Y⟩, (π ◦ ᾱj) × 1Y, (1S × di) ◦ (πS′ × 1X) ◦ µ ◦ θi, (λ)
= ⟨γi, di, diε, 1Y⟩, (π × 1Y) ◦ (ᾱj × 1Y), (πS′ × 1Y) ◦ (1S′ × di) ◦ µ ◦ θi, (λ) ⇔∗R0
⟨γi, di, diε, 1Y⟩ ◦ (π × 1Y), (ᾱj × 1Y) ◦ (πS′ × 1Y), (1S′ × di) ◦ µ ◦ θi, (λ)
= ⟨γi, di, 1Y⟩, (ᾱj ◦ πS′) × 1Y, (1S′ × di) ◦ µ ◦ θi, (λ)

which coincides with L1, and this proves (i).

Figure 2: Commutative diagram: (1S × ⟨di, diε⟩) ◦ (1S × 1X × dim) = (1S × di) ◦ (1S × ∆X), both composites S × Y −→ S × X × X being equal to 1S × ⟨di, di⟩.

Subcase 1.2: now (1S × ∆X) ◦ (αµj × 1X) ◦ β′ has the following factorization in T0:

(1S × ∆X) ◦ (αµj × 1X) ◦ β′ = (πS ◦ πS′) ◦ µ,   with πS ◦ πS′ : S × X −→ S′.

We consequently have

(⟨αj, πX⟩ ◦ β′)e = ⟨ᾱj, πX⟩ ◦ πS ◦ πS′ = ᾱj ◦ πS′,   (⟨αj, πX⟩ ◦ β′)µ = µ

In the present subcase we do not need to manipulate Ki′; moreover, by manipulating Ki as in the previous case, we get

Ki & ⟨γi, di, diε, 1Y⟩, (π ◦ ᾱj) × 1Y, (1S × di) ◦ (1S × ∆X) ◦ (αµj × 1X) ◦ β′ ◦ θi, (λ)
= ⟨γi, di, diε, 1Y⟩, (π ◦ ᾱj) × 1Y, (1S × di) ◦ πS ◦ πS′ ◦ µ ◦ θi, (λ)
= ⟨γi, di, diε, 1Y⟩, (π ◦ ᾱj) × 1Y, πS ◦ πS′ ◦ µ ◦ θi, (λ) ⇔∗R0
⟨γi, di, diε, 1Y⟩, ((π ◦ ᾱj) × 1Y) ◦ πS ◦ πS′, µ ◦ θi, (λ)
= ⟨γi, di, diε, 1Y⟩, ⟨πY1, πX⟩ ◦ ᾱj ◦ πS′, µ ◦ θi, (λ) ⇔∗R0
⟨γi, di⟩, ᾱj ◦ πS′, µ ◦ θi, (λ)

which coincides with Ki′, and this proves (i).

Case 2: suppose now that πX does not belong to αej, namely αj has the following factorization in e/µ-components:

αj = αej ◦ αµj,   with αej : Y1 × X −→ S and αµj : S −→ Y2.

This implies that:

⟨αj, πX⟩e = ⟨αej, πX⟩,   ⟨αj, πX⟩µ = αµj × 1X

We need to factorize (αµj × 1X) ◦ β′ in T0: again, we have two subcases, depending on whether πX appears in the ε-component or not.

Subcase 2.1: let (αµj × 1X) ◦ β′ factorize as (πS′ × 1X) ◦ µ, with πS′ × 1X : S × X −→ S′ × X and µ : S′ × X −→ U.

Reasoning as in Case 1, it follows that:

(⟨αj, πX⟩ ◦ β′)e = ⟨αej ◦ πS′, πX⟩,   (⟨αj, πX⟩ ◦ β′)µ = µ

We have

Ki′ = ⟨γi, di⟩, ⟨αej ◦ πS′, πX⟩, µ ◦ θi, (λ) & ⟨γi, di, 1Y⟩, (αej ◦ πS′) × 1Y, (1S′ × di) ◦ µ ◦ θi, (λ)    (L1′)

The arrow αj × 1Y′ decomposes in e/µ-components as (αej × 1Y′) ◦ (αµj × 1Y′) (see Lemma 7.5); thus, by (Rµ), Ki rewrites to

⟨γi, di, diε⟩, αej × 1Y′, (αµj × 1Y′) ◦ (1Y2 × dim) ◦ β′ ◦ θi, (λ) ⇔∗R0 (by Lemma 7.6(i))
⟨γi, di, 1Y⟩, αej × 1Y, (1S × diε) ◦ (αµj × 1Y′) ◦ (1Y2 × dim) ◦ β′ ◦ θi, (λ)
= ⟨γi, di, 1Y⟩, αej × 1Y, (1S × diε) ◦ (1S × dim) ◦ (αµj × 1X) ◦ β′ ◦ θi, (λ) ⇔∗R0
⟨γi, di, 1Y⟩, αej × 1Y, (1S × di) ◦ (αµj × 1X) ◦ β′ ◦ θi, (λ)
= ⟨γi, di, 1Y⟩, αej × 1Y, (1S × di) ◦ (πS′ × 1X) ◦ µ ◦ θi, (λ) ⇔∗R0
⟨γi, di, 1Y⟩, (αej × 1Y) ◦ (πS′ × 1Y), (1S′ × di) ◦ µ ◦ θi, (λ)

which coincides with (L1′), and this concludes Subcase 2.1.

Subcase 2.2: let (αµj × 1X) ◦ β′ factorize as (πS ◦ πS′) ◦ µ, with πS ◦ πS′ : S × X −→ S′ and µ : S′ −→ U.

We have

(⟨αj, πX⟩ ◦ β′)e = αej ◦ πS′,   (⟨αj, πX⟩ ◦ β′)µ = µ

Reasoning as in Subcase 2.1, we have

Ki ⇔∗R0 ⟨γi, di, 1Y⟩, αej × 1Y, (1S × di) ◦ (αµj × 1X) ◦ β′ ◦ θi, (λ)
= ⟨γi, di, 1Y⟩, αej × 1Y, (1S × di) ◦ πS ◦ πS′ ◦ µ ◦ θi, (λ) ⇔∗R0
⟨γi, di, 1Y⟩, ⟨πY1, πX⟩ ◦ αej ◦ πS′, µ ◦ θi, (λ) ⇔∗R0
⟨γi, di⟩, αej ◦ πS′, µ ◦ θi, (λ)

which coincides with Ki′. □

Lemma 8.2 Let Kj (j ∈ {1, 2}) be the following path:

Kj = ⟨γi, di, diε⟩, αj × 1Y′, (1Y2 × dim) ◦ β′, θj, (λ)

(where λ lacks in case j = 2). Then the path Ki‴ = ⟨γi, di⟩, ⟨αj, πX⟩ ◦ β′ ◦ θj, (λ) is joinable with Kj in R+.

Proof. Here we cannot apply the products rule on Ki‴; therefore we have to act on Kj, and thus we have to decompose (1Y2 × dim) ◦ β′ in e/µ-components. Suppose that the e/µ-components of dim are δi : Y′ −→ S and µ : S −→ X, i.e. dim = δi ◦ µ.


Then, by Lemma 7.5:

(1Y2 × dim)e = 1Y2 × δi,   (1Y2 × dim)µ = 1Y2 × µ

We decompose (1Y2 × dim) ◦ β′ as follows: factorize (1Y2 × µ) ◦ β′ : Y2 × S −→ V in ε/µ-components as (πY2′ × πS′) ◦ ν, so that

(1Y2 × dim) ◦ β′ = (1Y2 × δi) ◦ (πY2′ × πS′) ◦ ν.

Since 1Y2 × δi belongs to Ei, we can state that

((1Y2 × dim) ◦ β′)e = (1Y2 × δi) ◦ (πY2′ × πS′) = πY2′ × (δi ◦ πS′),   ((1Y2 × dim) ◦ β′)µ = ν

By (Rµ), we have

Kj ⇒R+ ⟨γi, di, diε⟩, αj × 1Y′, πY2′ × (δi ◦ πS′), ν ◦ θj, (λ)


Lemma 7.7 yields (by a &-step):²⁶

⟨γi, di, diε⟩, 1Y1 × 1X × (δi ◦ πS′), ((αj ◦ πY2′) × 1S′) ◦ ν ◦ θj, (λ) ⇔∗R0
⟨γi, di, diε⟩, 1Y1 × 1X × δi, (1Y1 × 1X × πS′) ◦ ((αj ◦ πY2′) × 1S′) ◦ ν ◦ θj, (λ) ⇔∗R0
⟨γi, di, diε ◦ δi⟩, (αj × 1S) ◦ (πY2′ × πS′) ◦ ν ◦ θj, (λ)
= ⟨γi, di, diε ◦ δi⟩, (αj × 1S) ◦ (1Y2 × µ) ◦ β′ ◦ θj, (λ)
= ⟨γi, di, diε ◦ δi⟩, (1Y1 × 1X × µ) ◦ (αj × 1X) ◦ β′ ◦ θj, (λ) ⇔∗R0
⟨γi, di, diε ◦ δi⟩ ◦ (1Y1 × 1X × µ), (αj × 1X) ◦ β′ ◦ θj, (λ)
= ⟨γi, di, di⟩, (αj × 1X) ◦ β′ ◦ θj, (λ) ⇔∗R0
⟨γi, di⟩, ⟨αj, πX⟩ ◦ β′ ◦ θj, (λ)

which coincides with Ki‴. □

Let us now prove local confluence of R+ . To this aim, by Section 6 results, it suffices to study the superpositions between the rule (Rpi )+ and the other rules, itself included (see also the observation following the proof of Lemma 7.1).

8.1

Superpositions between (Rpj )+ and (Rci )

CASE 1 We have a path of four arrows θ1, θ2, θ3, θ4 and we apply (Rci) on θ1, θ2 and (Rpj)+ on θ2, θ3, θ4 (clearly, if the rule applied is (Rp1)+, we have to add an arrow λ to the path). Let us first suppose i ≠ j; in such a case θ2 must belong to T0:

²⁶ We have a projection πY2′ : Y2 −→ Y2′, hence αj must be a pair (of vectors), whose component having codomain Y2′ is obviously αj ◦ πY2′.


The path ζi, ⟨γ′, d′⟩, ⟨α, πX⟩, βj, (λ) rewrites by (Rci) to ζi ◦ ⟨γ′, d′⟩, ⟨α, πX⟩, βj, (λ) and by (Rpj)+ to ζi, ⟨γ′, d′, dε′⟩, α × 1Y′, (1Y2 × dm′) ◦ βj, (λ). In this case the two members of the critical pair are ⇔∗R0-equivalent by Lemma 7.6(ii).

If i = j, the path ζi, ⟨γi, di⟩, ⟨α, πX⟩, βi, (λ) rewrites by (Rci) to ζi ◦ ⟨γi, di⟩, ⟨α, πX⟩, βi, (λ) and by (Rpi)+ to ζi, ⟨γi, di, diε⟩, α × 1Y′, (1Y2 × dim) ◦ βi, (λ).

By applying Lemma 7.6(iii) to the first member (where ζi ◦ ⟨γi, di⟩ = ⟨ζi ◦ γi, ζi ◦ di⟩) we have a &-step to the path (let W be the domain of ζi):

(L1)    ⟨ζi ◦ γi, ζi ◦ di, 1W⟩, α × 1W, (1Y2 × ζi ◦ di) ◦ βi, (λ)

By applying (Rci) to the second member, where α × 1Y′ is ⟨π ◦ α, πY′⟩ (with π : Y1 × X × Y′ −→ Y1 × X), we get

⟨ζi ◦ γi, ζi ◦ di, ζi ◦ diε⟩, ⟨π ◦ α, πY′⟩, (1Y2 × dim) ◦ βi, (λ) & (Lemma 7.6(iii))
⟨ζi ◦ γi, ζi ◦ di, ζi ◦ diε, 1W⟩, (π ◦ α) × 1W, (1Y2 × ζi ◦ diε) ◦ (1Y2 × dim) ◦ βi, (λ)
= ⟨ζi ◦ γi, ζi ◦ di, ζi ◦ diε, 1W⟩, (π × 1W) ◦ (α × 1W), (1Y2 × ζi ◦ diε ◦ dim) ◦ βi, (λ) ⇔∗R0
⟨ζi ◦ γi, ζi ◦ di, ζi ◦ diε, 1W⟩ ◦ (π × 1W), α × 1W, (1Y2 × ζi ◦ di) ◦ βi, (λ)
= ⟨ζi ◦ γi, ζi ◦ di, 1W⟩, α × 1W, (1Y2 × ζi ◦ di) ◦ βi, (λ)

which coincides with (L1).


For future reference let us mark the following fact we established during the above proof:27 Lemma 8.3 Paths ζ i ◦ hγ i , di i, hα, πX i, β i , (λ) ζ i ◦ hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ) are joinable in R+ . Let us go on by examining other superpositions. CASE 2 We have a path of three arrows θ1 , θ2 , θ3 , we apply (Rcj ) on θ1 , θ2 and (Rpi )+ on the whole path. If i = j everything trivially compose; otherwise θ1 must belong to T0 . Therefore we have: hγ 0 , d0 i, hαj , πX i, β i , (λ) ¡

¡

¡ (Rcj )¡ ¡

hγ 0 , d0 i



¡ ¡ ª

¡

@

@

@

@ (Ri )+ @ p @ @ @ @ R

¡

hαj , πX i, β i , (λ)

hγ 0 , d0 , d0ε i, αj × 1Y 0 , (1Y2 × d0m ) ◦ β i , (λ)

The two members are ⇔∗R0 -equivalent by Lemma7.6(ii). CASE 3 We have a path of three arrows θ1 , θ2 , θ3 and we apply (Rcj ) on θ2 , θ3 and (Rpi )+ on the whole path. Again everything compose if i = j; otherwise θ3 must belong to T0 . Moreover as i = 1 or j = 1, we need a fourth arrow θ4 (θ4 , in its turn, must be followed in a well-coloured path 28 by a further arrow λ in case θ4 belongs to T1 \T0 ). We have: 27

The Lemma comes from the fact that the first step we applied to second member was a (Rci )-step. Of course, only well-coloured paths occur in our rewriting, so we are justified in limiting ourselves to such paths. 28


hγ i , di i, hαj , πX i, β 0 , θ, (λ) ¡

¡

@

@

¡

@

¡ (Rcj )¡

@ (Ri )+ @ p @ @ @ R @

¡

¡

¡ ¡ ª

hγ i , di i, hαj , πX i

◦ β 0 , θ, (λ)

hγ i , di , diε i, αj × 1Y 0 , (1Y2 × dim ) ◦ β 0 , θ, (λ)

If θ ∈ Tj \T0 , we compose hαj , πX i ◦ β 0 with θ and then apply Lemma 8.2. If θ ∈ Ti \T0 , we compose (1Y2 × dim ) ◦ β 0 with θ and the confluence immediately follows by Lemma 8.1(ii). If θ ∈ T0 , we can in any case apply one of the two previous solutions (because either i or j must be 2, hence lack of λ does not matter). CASE 4 We have a path of four arrows θ1 , θ2 , θ3 , θ4 and we apply (Rcj ) on θ3 , θ4 and (Rpi )+ on θ1 , θ2 , θ3 . Suppose j = i; that is: hγ i , di i, hα, πX i, β i , θi , (λ) ¡

¡

(Rci )¡¡ ¡

¡ ¡ ª

hγ i , di i, hα, πX i, β i

¡

@

@

@

@ (Ri )+ @ p @ @ @ @ R

¡

◦ θi , (λ)

hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , θi , (λ)

Then we can reduce both first member (by (Rpi )+ ) and second member (by (Rci )+ ) to the path hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i ◦ θi , (λ). Suppose that i 6= j; in this case θ3 ∈ T0 and we have hγ i , di i, hα, πX i, β 0 , θj , (λ) ¡ ¡

¡

@

@

@

¡ (Rcj )¡

@ (Ri )+ @ p @ @ @ R @

¡

¡

¡ ¡ ª

hγ i , di i, hα, πX i, β 0

◦ θj , (λ)

hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β 0 , θj , (λ)


If α ∈ Ti , we can trivially apply (Rci ) to the second member and then get first member by ⇔∗R0 -steps. The relevant case is when α ∈ Tj : here we can rewrite first member by (Rcj ) to hγ i , di i, hα, πX i ◦ β 0 ◦ θj , (λ) and then we apply Lemma 8.2.

8.2

Superpositions between (Rpi )+ and (Rµ )

CASE 1 ζ, hγ i , di i, hα, πX i, β i , (λ) ¡

¡

@

@

¡

@

(Rµ ) ¡ ¡ ¡ ¡ ¡ ¡ ª

@ (Ri )+ @ p @ @ @ R @

ζe , ζµ ◦ hγ i , di i, hα, πX i, β i , (λ)

ζ, hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ)

where ζ factorizes as follows W ζe@

ζ

R @

-Y µζ ¡ ¡ µ

Z By applying Lemma 7.6(iii) to first member (where ζµ ◦ hγ i , di i = hζµ ◦ γ i , ζµ ◦ di i), we obtain, through a &-step: ζe , hζµ ◦ γ i , ζµ ◦ di , 1Z i, α × 1Z , (1Y2 × ζµ ◦ di ) ◦ β i , (λ)

(L1 )

Let us apply (Rµ ) to the second member; we get ζe , hζµ ◦ γ i , ζµ ◦ di , ζµ ◦ diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ) = ζe , hζµ ◦ γ i , ζµ ◦ di , ζµ ◦ diε i, hπ ◦ α, πY 0 i, (1Y2 × dim ) ◦ β i , (λ) π

with Y1 × X × Y 0 −→ Y1 × X. We can apply again Lemma 7.6(iii) and get, by a &-step


ζe , hζµ ◦ γ i , ζµ ◦ di , ζµ ◦ diε , 1Z i, (π ◦ α) × 1Z , (1Y2 × ζµ ◦ diε ) ◦ (1Y2 × dim ) ◦ β i , (λ) = ζe , hζµ ◦ γ i , ζµ ◦ di , ζµ ◦ diε , 1Z i, (π ◦ α) × 1Z , (1Y2 × ζµ ◦ di ) ◦ β i , (λ) ⇔∗R0 ζe , hζµ ◦ γ i , ζµ ◦ di , ζµ ◦ diε , 1Z i ◦ (π × 1Z ), α × 1Z , (1Y2 × ζµ ◦ di ) ◦ β i , (λ) = ζe , hζµ ◦ γ i , ζµ ◦ di , 1Z i, α × 1Z , (1Y2 × ζµ ◦ di ) ◦ β i , (λ) which coincides with (L1 ). CASE 2 hγ i , di i, hα, πX i, β i , (λ) ¡

¡

¡

@

@

@

(Rµ ) ¡ ¡

η i , hσ 0 , s0 i

¡ ¡ ª

¡

¡

◦ hα, πX i, β i , (λ)

@ (Ri )+ @ p @ @ @ R @

hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ)

where we suppose hγ i , di i to have the following factorization in components e/µ hγ i , di i

Y @

η i@

@ R @

¡

- Y1 × X µ ¡ ¡hσ 0 , s0 i ¡

Z We apply (Rµ )+ on the second member to the component hγ i , di i of hγ i , di , diε i and we obtain hη i , diε i, (hσ 0 , s0 i × 1Y 0 ) ◦ (α × 1Y 0 ), (1Y2 × dim ) ◦ β i , (λ) = hη i , diε i, (hσ 0 , s0 i

◦ α) × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ) ⇔∗R0

hη i , 1Y i, (hσ 0 , s0 i ◦ α) × 1Y , (1Y2 × diε ) ◦ (1Y2 × dim ) ◦ β i , (λ) = hη i , 1Y i, (hσ 0 , s0 i ◦ α) × 1Y , (1Y2 × di ) ◦ β i , (λ) 53

(L2 )

We need to factorize s0 in components ε/µ in T0 . s0

Z 0 × Z 00

-X ¡ µ ¡µ

@

πZ 00@

@ @ R

¡

Z 00

¡

ηi

1 where Z 0 × Z 00 = Z. This implies that η i has the form hη1i , η2i i, where Y −→ Z 0 and

ηi

2 Y −→ Z 00 . By applying (Rµ )+ on the first member to the arrow hσ 0 , s0 i ◦ hα, πX i = hhσ 0 , s0 i ◦ α, s0 i, in order to decompose s0 , we obtain:

hη1i , η2i i, hhσ 0 , s0 i ◦ α, πZ 00 i, (1Y2 × µ) ◦ β i , (λ) which, by Lemma 7.6(iii), becomes (through a &-step) hη1i , η2i , 1Y i, (hσ 0 , s0 i ◦ α) × 1Y , (1Y2 × η2i ) ◦ (1Y2 × µ) ◦ β i , (λ) = hη i , 1Y i,

(hσ 0 , s0 i

◦ α) × 1Y , (1Y2 × η2i ◦ µ) ◦ β i , (λ)

(L1 )

Since η2i ◦ µ = η i ◦ πZ 00 ◦ µ = η i ◦ s0 = di , we can conclude that (L1 ) coincides with (L2 ). CASE 3 hγ i , di i, hα, πX i, θi , (λ) ¡

¡ (Rµ ) ¡ ¡ ¡ ¡ ¡ ª ¡

hγ i , di i, hα, πX ie , hα, πX iµ ◦ θi , (λ)

¡

@

@

@

@ (Ri )+ @ p @ @ @ @ R

hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ θi , (λ)

Confluence is an immediate application of Lemma 8.1(i) (taking as β 0 the identity). CASE 4


hγ i , di i, hα, πX i, β i , θ ¡

(Rµ ) ¡

¡

@

@

¡

@

@ (Ri )+ @ p @ @ @ @ R

¡

¡

¡

¡ ¡ ª

hγ i , di i, hα, πX i, βei , βµi

hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , θ

◦θ

It suffices to apply (Rpi )+ to first member and the confluence immediately follows by Lemma 6.1 (with σ 0 = βµi ).

8.3

Superpositions between (Rpi )+ and (Rε )

CASE 1 ζ, hγ i , di i, hα, πX i, β i , (λ) ¡ ¡

@

¡

@

@

@ (Ri )+ @ p @ @ @ @ R

(Rε ) ¡ ¡

¡ ¡

ζ◦

¡ ¡ ª

ε, h¯ γ i , d¯i i, hα, πX i, β i , (λ)

ζ, hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ)

where we suppose hγ i , di i to have the following factorization in ε/m-components hγ i , di i Y1 × X

Y @

µ i ¯i ¡ γ ,d i ¡ h¯

ε@

R @



¡

Let us suppose that d¯i has the following ε/m-factorization d¯i

Y˜ @ d¯iε @

@ R

-

X

µ ¡ ¡ d¯im

Y˜ 0


¡

Then, by the uniqueness of decompositions, since ε ◦ d¯iε ◦ d¯im = di , we have: ε ◦ d¯iε = diε

d¯im = dim

Y˜ 0 = Y 0

We apply (Rpi )+ to the first member and we obtain: ζ ◦ ε, h¯ γ i , d¯i , d¯iε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ) ⇔∗R0 ζ, ε ◦ h¯ γ i , d¯i , d¯iε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ) ζ, hε ◦

γ¯ i , ε

= i i ¯ ¯ ◦ d , ε ◦ dε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ) =

ζ, hγ i , di , diε i, α

× 1Y 0 , (1Y2 × dim ) ◦ β i , (λ)

which coincides with the second member. CASE 2 hγ i , di i, hα, πX i, β i , (λ) @

¡

@

¡

@

¡

(Rε ) ¡ ¡ ¡ ¡

¡ ¡ ª

hγ i , di i ◦ ε, h¯ α, hi, β i , (λ)

@ (Ri )+ @ p @ @ @ R @

hγ i , di , diε i, α × 1Y 0 , (1Y2 × dim ) ◦ β i , (λ)

where we have factorized ⟨α, π_X⟩ : Y1 × X → Y2 × X in ε/m-components as ⟨α, π_X⟩ = ε ∘ ⟨ᾱ, h⟩, with ε : Y1 × X → Z. On the other hand, let h = h_ε ∘ h_m. Since ε ∘ h_ε ∘ h_m = π_X, by uniqueness of factorizations, h_m must coincide with 1_X. Therefore h = h_ε is the projection²⁹ on X, hence (up to renaming) we have:

    Z = Y_1' × X        ε = π_{Y_1'} × 1_X

²⁹ As ε ∘ h_ε = π_X, we have that h_ε composed on the left with a projection is π_X: it follows that h_ε itself must be the projection onto X (with domain Z).

Thus the first member coincides with

    ⟨γ^i ∘ π_{Y_1'}, d^i⟩, ⟨ᾱ, π_X⟩, β^i, (λ)

which, by (R_p^i)+, becomes

    ⟨γ^i ∘ π_{Y_1'}, d^i, d^i_ε⟩, ᾱ × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, (λ)
    =  ⟨γ^i, d^i, d^i_ε⟩ ∘ (ε × 1_{Y'}), ᾱ × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, (λ)
    ⇔*_{R_0}  ⟨γ^i, d^i, d^i_ε⟩, (ε × 1_{Y'}) ∘ (ᾱ × 1_{Y'}), (1_{Y2} × d^i_m) ∘ β^i, (λ)
    =  ⟨γ^i, d^i, d^i_ε⟩, (ε ∘ ᾱ) × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, (λ)

and, since ε ∘ ᾱ = α, the last path coincides with the second member.

CASE 3. We have the peak

    ⟨γ^i, d^i⟩, ⟨α, π_X⟩, β^i, (λ)

which rewrites by (R_ε) to

    ⟨γ^i, d^i⟩, ⟨α, π_X⟩ ∘ β^i_ε, β^i_m, (λ)

and by (R_p^i)+ to

    ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, (λ)

We have to factorize β^i. Suppose first that 'X belongs to the codomain of β^i_ε', that is, β^i : Y2 × X → U factorizes as

    β^i_ε = π_{Y_2'} × 1_X : Y2 × X → Y_2' × X        β^i_m : Y_2' × X → U

Then the first member coincides with

    ⟨γ^i, d^i⟩, ⟨α ∘ π_{Y_2'}, π_X⟩, β^i_m, (λ)

which is rewritten by (R_p^i)+ as

    ⟨γ^i, d^i, d^i_ε⟩, (α ∘ π_{Y_2'}) × 1_{Y'}, (1_{Y_2'} × d^i_m) ∘ β^i_m, (λ)
    ⇔*_{R_0}  ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (π_{Y_2'} × 1_{Y'}) ∘ (1_{Y_2'} × d^i_m) ∘ β^i_m, (λ)
    =  ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (1_{Y2} × d^i_m) ∘ (π_{Y_2'} × 1_X) ∘ β^i_m, (λ)
    =  ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, (λ)

which coincides with the second member.

If 'X does not belong to the codomain of β^i_ε', that is, β^i_ε = π_{Y2} ∘ π_{Y_2'}, then the second member coincides with

    ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (1_{Y2} × d^i_m) ∘ π_{Y2} ∘ π_{Y_2'} ∘ β^i_m, (λ)
    =  ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, π_{Y2} ∘ π_{Y_2'} ∘ β^i_m, (λ)
    ⇔*_{R_0}  ⟨γ^i, d^i, d^i_ε⟩, (α × 1_{Y'}) ∘ π_{Y2} ∘ π_{Y_2'}, β^i_m, (λ)
    =  ⟨γ^i, d^i, d^i_ε⟩, ⟨π_{Y1}, π_X⟩ ∘ α ∘ π_{Y_2'}, β^i_m, (λ)
    ⇔*_{R_0}  ⟨γ^i, d^i, d^i_ε⟩ ∘ ⟨π_{Y1}, π_X⟩, α ∘ π_{Y_2'}, β^i_m, (λ)
    =  ⟨γ^i, d^i⟩, α ∘ π_{Y_2'}, β^i_m, (λ)

which, by the fact that α ∘ π_{Y_2'} is the same as ⟨α, π_X⟩ ∘ π_{Y2} ∘ π_{Y_2'}, coincides with the first member.

CASE 4.

We have the peak

    ⟨γ^i, d^i⟩, ⟨α, π_X⟩, β^i, θ

which rewrites by (R_ε) to

    ⟨γ^i, d^i⟩, ⟨α, π_X⟩, β^i ∘ θ_ε, θ_m

and by (R_p^i)+ to

    ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, θ

It suffices to apply (R_p^i)+ to the first member; the confluence then follows immediately by Lemma 6.1 (with σ' = θ_ε).

8.4  Superpositions between (R_p^i)+ and (R_p^j)+

CASE 1. Here we have three arrows θ1, θ2, θ3 and we apply both rules to the whole path; the case i ≠ j is trivial (it implies that θ1 and θ3 belong to T_0, so that everything composes). Let us suppose i = j. The arrow θ2 (up to an alphabetic variant) must be of the form ⟨α, π^1_X, π^2_X⟩. We have in principle two cases (to be treated in a very similar way), depending on whether π^1_X and π^2_X are the same projection or not.

SUBCASE 1.1. The peak

    ⟨γ^i, d^i⟩, ⟨α, π_X, π_X⟩, β^i, (λ)

rewrites by (R_p^i)+ in two ways, to K_1 and to K_2, where the path is

    Y --⟨γ^i,d^i⟩--> Y1 × X --⟨α,π_X,π_X⟩--> Y2 × X × X --β^i--> V

and Y --d^i_ε--> Y' --d^i_m--> X corresponds to the ε/m factorization of d^i. The two members are:

    K_1 = ⟨γ^i, d^i, d^i_ε⟩, ⟨α, π_X⟩ × 1_{Y'}, (1_{Y2} × 1_X × d^i_m) ∘ β^i, (λ)
        = ⟨γ^i, d^i, d^i_ε⟩, ⟨⟨π_{Y1}, π_X⟩ ∘ α, π_X, π_{Y'}⟩, (1_{Y2} × 1_X × d^i_m) ∘ β^i, (λ)
    K_2 = ⟨γ^i, d^i, d^i_ε⟩, ⟨⟨π_{Y1}, π_X⟩ ∘ α, π_{Y'}, π_X⟩, (1_{Y2} × d^i_m × 1_X) ∘ β^i, (λ)

By applying (R_p^i)+ on both members with respect to π_X, we get the path

    ⟨γ^i, d^i, d^i_ε, d^i_ε⟩, ⟨⟨π_{Y1}, π_X⟩ ∘ α, π^1_{Y'}, π^2_{Y'}⟩, (1_{Y2} × d^i_m × d^i_m) ∘ β^i, (λ)

where π^1_{Y'} and π^2_{Y'} project on the first and on the second Y' respectively.

SUBCASE 1.2.


The peak

    ⟨γ^i, d^i, c^i⟩, ⟨α, π^1_X, π^2_X⟩, β^i, (λ)

rewrites by (R_p^i)+ in two ways, to K_1 and to K_2, where the path is

    Y --⟨γ^i,d^i,c^i⟩--> Y1 × X × X --⟨α,π^1_X,π^2_X⟩--> Y2 × X × X --β^i--> V

and Y --d^i_ε--> Y' --d^i_m--> X, Y --c^i_ε--> Y'' --c^i_m--> X correspond to the ε/m factorizations of d^i, c^i. The two members are:

    K_1 = ⟨γ^i, d^i, c^i, c^i_ε⟩, ⟨π ∘ α, π^1_X, π_{Y''}⟩, (1_{Y2} × 1_X × c^i_m) ∘ β^i, (λ)
    K_2 = ⟨γ^i, d^i, d^i_ε, c^i⟩, ⟨π ∘ α, π_{Y'}, π^2_X⟩, (1_{Y2} × d^i_m × 1_X) ∘ β^i, (λ)

where π denotes in both cases the projection from the corresponding domain onto Y1 × X × X. By applying (R_p^i)+ on both members with respect to the suitable projection on X, we get the same path, namely

    ⟨γ^i, d^i, d^i_ε, c^i, c^i_ε⟩, ⟨π ∘ α, π_{Y'}, π_{Y''}⟩, (1_{Y2} × d^i_m × c^i_m) ∘ β^i, (λ)

CASE 2. Here we have a four-arrow path; (R_p^i)+ is applied to the first three arrows and (R_p^j)+ to the last three. We have the peak

    ⟨γ^i, d^i⟩, ⟨α^j, c^j, π_X⟩, ⟨β^i, π^1_X⟩, θ^j, (λ)

which rewrites by (R_p^i)+ to K_1 and by (R_p^j)+ to K_2, where the path is

    Y --⟨γ^i,d^i⟩--> Y1 × X --⟨α^j,c^j,π_X⟩--> Y2 × X × X --⟨β^i,π^1_X⟩--> U × X --θ^j--> V

and π^1_X : Y2 × X × X → X is the projection on the first X.³⁰ We also assume that d^i and c^j have the following factorizations:

³⁰ It cannot be the projection on the second X, otherwise the proviso for rule (R_p^j)+ is violated.

    d^i : Y → X factorizes as Y --d^i_ε--> Y' --d^i_m--> X
    c^j : Y1 × X → X factorizes as Y1 × X --c^j_ε--> Z --c^j_m--> X

It follows that

    K_1 = ⟨γ^i, d^i, d^i_ε⟩, ⟨α^j, c^j⟩ × 1_{Y'}, (1_{Y2} × 1_X × d^i_m) ∘ ⟨β^i, π^1_X⟩, θ^j, (λ)
    K_2 = ⟨γ^i, d^i⟩, ⟨α^j, c^j, π_X, c^j_ε⟩, β^i × 1_Z, (1_U × c^j_m) ∘ θ^j, (λ)

The first member can be written as follows

    ⟨γ^i, d^i, d^i_ε⟩, ⟨⟨π_{Y1}, π_X⟩ ∘ α^j, ⟨π_{Y1}, π_X⟩ ∘ c^j, π_{Y'}⟩, ⟨(1_{Y2} × 1_X × d^i_m) ∘ β^i, π_X⟩, θ^j, (λ)

We can apply (R_p^j)+ with respect to the projection π_X (we point out that X is the codomain of ⟨π_{Y1}, π_X⟩ ∘ c^j, which factorizes, in ε/m components, as (⟨π_{Y1}, π_X⟩ ∘ c^j_ε) ∘ c^j_m). This yields

    ⟨γ^i, d^i, d^i_ε⟩, ⟨⟨π_{Y1}, π_X⟩ ∘ α^j, ⟨π_{Y1}, π_X⟩ ∘ c^j, π_{Y'}, ⟨π_{Y1}, π_X⟩ ∘ c^j_ε⟩, ((1_{Y2} × 1_X × d^i_m) ∘ β^i) × 1_Z, (1_U × c^j_m) ∘ θ^j, (λ)      (L_1)

By applying (R_p^i)+ to the second member with respect to the projection π_X, we get

    ⟨γ^i, d^i, d^i_ε⟩, ⟨⟨π_{Y1}, π_X⟩ ∘ α^j, ⟨π_{Y1}, π_X⟩ ∘ c^j, π_{Y'}, ⟨π_{Y1}, π_X⟩ ∘ c^j_ε⟩, (1_{Y2} × 1_X × d^i_m × 1_Z) ∘ (β^i × 1_Z), (1_U × c^j_m) ∘ θ^j, (λ)      (L_2)

Since ((1_{Y2} × 1_X × d^i_m) ∘ β^i) × 1_Z coincides with (1_{Y2} × 1_X × d^i_m × 1_Z) ∘ (β^i × 1_Z), the paths L_1 and L_2 coincide.

CASE 3. Here we have a five-arrow path; (R_p^i)+ is applied to the first three components and (R_p^j)+ to the last three. We distinguish the subcases i ≠ j and i = j.

SUBCASE 3.1. The third arrow must belong to T_0, hence we have the peak

    ⟨θ^i, c^i⟩, ⟨η, π_X⟩, ⟨γ', d'⟩, ⟨α, π_X⟩, β^j, (λ)

By applying (Rpi )+ to second member with respect to the projection πX , we get hγ i , di , diε i , hhπY1 , πX i ◦ αj , hπY1 , πX i ◦ cj , πY 0 , hπY1 , πX i ◦ cjε i , (1Y2 × 1X × dim × 1Z ) ◦ (βi × 1Z ) , (1U × cjm ) ◦ θj , (λ) (L2 ) Since ((1Y2 × 1X × dim ) ◦ β i ) × 1Z coincides with (1Y2 × 1X × dim × 1Z ) ◦ (βi × 1Z ), paths L1 and L2 coincide. CASE 3 Here we have a five-arrows path, (Rpi )+ is applied to the first three components and (Rpj )+ to the last three components. We distinguish the subcases i 6= j and i = j. SUBCASE 3.1 The third arrow must belong to T0 , hence we have hθi , ci i, hη, πX i, hγ 0 , d0 i, hα, πX i, β j , (λ) ¡

@

¡

@

(Rpi )+ ¡ ¡

L1

j

@ (Rp )+ @ @ R

¡ ª

L2

61

where the path is

    W --⟨θ^i,c^i⟩--> W1 × X --⟨η,π_X⟩--> W2 × X --⟨γ',d'⟩--> Y1 × X --⟨α,π_X⟩--> Y2 × X --β^j--> U

Therefore L_1 and L_2 are as follows (let W', Y' be the codomains of c^i_ε, d'_ε):

    L_1 = ⟨θ^i, c^i, c^i_ε⟩, η × 1_{W'}, (1_{W2} × c^i_m) ∘ ⟨γ', d'⟩, ⟨α, π_X⟩, β^j, (λ)
    L_2 = ⟨θ^i, c^i⟩, ⟨η, π_X⟩, ⟨γ', d', d'_ε⟩, α × 1_{Y'}, (1_{Y2} × d'_m) ∘ β^j, (λ)

Applying (R_p^i)+ to L_2, one gets

    ⟨θ^i, c^i, c^i_ε⟩, η × 1_{W'}, (1_{W2} × c^i_m) ∘ ⟨γ', d', d'_ε⟩, α × 1_{Y'}, (1_{Y2} × d'_m) ∘ β^j, (λ)

By an ⇔*_{R_0}-step, we get

    ⟨θ^i, c^i, c^i_ε⟩, η × 1_{W'}, (1_{W2} × c^i_m), ⟨γ', d', d'_ε⟩, α × 1_{Y'}, (1_{Y2} × d'_m) ∘ β^j, (λ)

which is ⇔*_{R_0}-equivalent to L_1 by Lemma 7.6(ii).

SUBCASE 3.2. We have the peak

    ⟨θ^i, c^i⟩, ⟨η, π_X⟩, ⟨γ^i, d^i⟩, ⟨α, π_X⟩, β^i, (λ)

which rewrites by (R_p^i)+ in two ways, to L_1 and to L_2.

where the path is

    W --⟨θ^i,c^i⟩--> W1 × X --⟨η,π_X⟩--> W2 × X --⟨γ^i,d^i⟩--> Y1 × X --⟨α,π_X⟩--> Y2 × X --β^i--> U

Therefore L_1 and L_2 are as follows (let W', Y' be the codomains of c^i_ε, d^i_ε):

    L_1 = ⟨θ^i, c^i, c^i_ε⟩, η × 1_{W'}, (1_{W2} × c^i_m) ∘ ⟨γ^i, d^i⟩, ⟨α, π_X⟩, β^i, (λ)
    L_2 = ⟨θ^i, c^i⟩, ⟨η, π_X⟩, ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, (λ)

Applying (R_p^i)+, L_2 rewrites to

    ⟨θ^i, c^i, c^i_ε⟩, η × 1_{W'}, (1_{W2} × c^i_m) ∘ ⟨γ^i, d^i, d^i_ε⟩, α × 1_{Y'}, (1_{Y2} × d^i_m) ∘ β^i, (λ)

which is confluent with L_1 by Lemma 8.3 (take ζ^i to be 1_{W2} × c^i_m). We have thus completed the proof of the following:

Theorem 8.4  R+ is locally confluent.

9  Termination

In order to show termination of R and of R+, we shall associate with our paths certain commutative labelled trees. Such trees are represented as terms built up from the countable set of variables {x_i}_{i≥1} by using four³¹ constructors f_i (i ∈ {0,1}²) of type TermMultiset → Term. R-trees (or, briefly, trees) are inductively defined as follows:

• x_i is an R-tree for every i ≥ 1;
• if {T_1, ..., T_n} is a multiset of R-trees and i ∈ {0,1}², then f_i(T_1, ..., T_n) is an R-tree.

As a next step, we introduce a relation > among our trees; we have T_1 > T_2 iff one of the following two conditions is satisfied:

• T_1 is f_i(T_1', ..., T_n') and T_2 is f_j(T_1'', ..., T_k'') and {T_1', ..., T_n'} >_m {T_1'', ..., T_k''};
• T_1 is f_i(T_1', ..., T_n') and T_2 is f_j(T_1', ..., T_n') and i > j (in the lexicographic sense).

Some comments are in order. First, >_m is the multiset extension of >; secondly, the definition is by induction on the height h(T_1) of the tree T_1. It is easily seen that T_1 > T_2 implies h(T_1) ≥ h(T_2).³² In the following, we use ≥ for the reflexive closure of >. We have the following easy

Lemma 9.1  > is a transitive and terminating relation.

Proof. For transitivity, let us show that

    T_1 > T_2 > T_3

implies T1 > T3

by induction on h(T_1) + h(T_2) + h(T_3). We have two cases: (i) suppose that T_1 > T_2 holds by the first clause, so that T_1 is f_i(T_1', ..., T_n'), T_2 is f_j(T_1'', ..., T_k'') and {T_1', ..., T_n'} >_m {T_1'', ..., T_k''}; T_1 > T_3 follows from the fact that >_m is transitive (as > is transitive on trees of lower height, by the induction hypothesis); (ii) suppose that T_1 > T_2 holds by the second clause, so that T_1 is f_i(T_1', ..., T_n'), T_2 is f_j(T_1', ..., T_n') and i > j; if T_2 > T_3 holds by the first clause, then T_1 > T_3 holds by the same clause; if it holds by the second clause, then T_1 > T_3 holds by transitivity of lexicographic orders.

³¹ Actually only three such constructors will be really used (f_{⟨0,1⟩} is useless).
³² h(T) is defined as follows: variables have height 1, and f_i(T_1, ..., T_n) has height 1 + max(h(T_1), ..., h(T_n)).

For termination, suppose we have a chain T_1 > T_2 > · · · > T_i > · · ·. We show that this is impossible by induction on h(T_1). As >_m is terminating (by the inductive hypothesis on >), the first clause cannot be used infinitely many times; so, from a certain T_k on, only the second clause applies, which is absurd, as that clause can be applied consecutively at most 3 times. □

As our trees are represented as terms, it makes sense to speak about substitutions. Substitutions are compatible with > in the following sense:

Lemma 9.2  Let a sequence {T_i}_{i≥1} of trees be given and let T', T'' be such that T' > T''; we then have T'(T_i/x_i) > T''(T_i/x_i).

Proof. Immediate. □
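Since the relation > on R-trees is defined by structural recursion, it can be checked mechanically. The following Python sketch (our illustration, not part of the paper's formalism; the encoding of trees as nested tuples is ours) implements the two clauses of the definition, together with the multiset extension >_m computed in the usual Dershowitz-Manna style (cancel common elements, then every element of the right remainder must be dominated by some element of the left remainder):

```python
from collections import Counter

# R-trees: ("x", i) is the variable x_i; ("f", label, children) is
# f_label applied to a multiset of subtrees, with label a pair in {0,1}^2.

def canon(t):
    """Canonical form: sort children recursively, so that trees with the
    same multiset of children get equal representations."""
    if t[0] == "x":
        return t
    _, label, children = t
    return ("f", label, tuple(sorted(canon(c) for c in children)))

def multiset_gt(ms, ns):
    """Multiset extension >_m of gt (Dershowitz-Manna)."""
    m, n = Counter(map(canon, ms)), Counter(map(canon, ns))
    common = m & n                      # cancel shared elements first
    m, n = m - common, n - common
    if not m:
        return False
    # every remaining right element must be dominated by some left element
    return all(any(gt(x, y) for x in m) for y in n)

def gt(t1, t2):
    """The order > on R-trees (the two clauses of the definition)."""
    if t1[0] == "x" or t2[0] == "x":
        return False                    # only f-nodes are comparable
    _, i, cs1 = t1
    _, j, cs2 = t2
    if multiset_gt(cs1, cs2):           # first clause
        return True
    # second clause: same multiset of children, lexicographically bigger label
    return Counter(map(canon, cs1)) == Counter(map(canon, cs2)) and i > j
```

For instance, on this encoding f_⟨1,1⟩(x_1, x_2) > f_⟨0,0⟩(x_1, x_2) holds by the second (lexicographic) clause, and wrapping both sides in a common context preserves the inequality via the first (multiset) clause, in agreement with Lemmas 9.1 and 9.2.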

Let us now turn to our paths. First, we need a definition. For an arrow α^i, let us put

    e(α^i) = 0 if α^i ∈ E_0, and 1 otherwise
    m(α^i) = 0 if α^i ∈ E_i, and 1 otherwise
    χ(α^i) = ⟨m(α^i), e(α^i)⟩

Lemma 9.3  For every arrow α and for every ε ∈ E_0, we have χ(ε ∘ α) = χ(α) (whenever the composition makes sense).

Proof. If e(α) = 0 then clearly e(ε ∘ α) = 0 too; vice versa, if e(ε ∘ α) = 0, then the two ε/m factorizations (ε ∘ α) ∘ 1 = (ε ∘ α_ε) ∘ α_m of ε ∘ α must be equal, so that α_m is the identity; hence α = α_ε, that is, α ∈ E_0. The proof of m(α) = 0 iff m(ε ∘ α) = 0 is similar. □

For a path K : Y → Z and for β' : Z → V, let K ∘ β' be the path obtained by composing the last arrow of K with β' (that is, if K = K', α, then K ∘ β' is K', α ∘ β'). With a path K : X^n → X (resp. L : X^n → X^m), we now associate an R-tree T(K) (resp. a multiset of R-trees T(L)) as follows (the definition is by induction on the lengths |K|, |L| of K and L):

    T(a) = f_{χ(a)}(x_{i_1}, ..., x_{i_k}),    if a_ε = ⟨π_{i_1}, ..., π_{i_k}⟩;
    T(⟨a_1, ..., a_m⟩) = {T(a_1), ..., T(a_m)};
    T(K', a) = f_{χ(a)}(T(K' ∘ a_ε));
    T(L', ⟨a_1, ..., a_m⟩) = {T(L', a_1), ..., T(L', a_m)}.


Lemma 9.4  Let L : Y → X^n and K : X^n → X^m. We have that T(L, K) = T(K)(T(L_1)/x_1, ..., T(L_n)/x_n), where L_1 = L ∘ π_1, ..., L_n = L ∘ π_n.

Proof. The claim is shown by induction on the length |K| of K. If the length is 1, then K is just α = ⟨a_1, ..., a_m⟩; if (a_j)_ε is ⟨π_{i_j(1)}, ..., π_{i_j(k_j)}⟩, we have

    T(L, α) = {f_{χ(a_j)}(T(L_{i_j(1)}), ..., T(L_{i_j(k_j)}))}_{j=1,...,m} = T(α)(T(L_1)/x_1, ..., T(L_n)/x_n).

If the length is greater than 1, then K is K', α (for α = ⟨a_1, ..., a_m⟩), so that

    T(L, K', α) = {f_{χ(a_j)}(T(L, K' ∘ (a_j)_ε))}_{j=1,...,m} = {f_{χ(a_j)}(T(K' ∘ (a_j)_ε)(T(L_i)/x_i))}_{j=1,...,m}

by the inductive hypothesis; on the other hand,

    T(K', α)(T(L_i)/x_i) = {f_{χ(a_j)}(T(K' ∘ (a_j)_ε))}_{j=1,...,m}(T(L_i)/x_i) = {f_{χ(a_j)}(T(K' ∘ (a_j)_ε)(T(L_i)/x_i))}_{j=1,...,m}

and the two members are equal by the inductive definition of substitution. □

Lemma 9.5  Let K' = ρ(K) for a list of renamings whose first component is the identity;³³ we have T(K) = T(K').

Proof. We first collect some easy facts. Fix any path L : Y → Z and a renaming ρ : Z → Z. We have: (i) T(L) = T(L ∘ ρ); (ii) for every α = ⟨a_1, ..., a_n⟩ : Z → X^n and for every i = 1, ..., n, T(L, ρ ∘ a_i) = T(L ∘ ρ, a_i): in fact,

    T(L, ρ ∘ a_i) = f_{χ(ρ∘a_i)}(T(L ∘ (ρ ∘ a_i)_ε)) = f_{χ(a_i)}(T(L ∘ ρ ∘ (a_i)_ε)) = T(L ∘ ρ, a_i)

by uniqueness of factorization and Lemma 9.3; (iii) for every L', T(L, ρ ∘ α, L') = T(L ∘ ρ, α, L'), by (ii) and Lemma 9.4. Now let K = α_1, ..., α_k, K' = α_1', ..., α_k' and let ρ = {1 = ρ_0, ρ_1, ..., ρ_k} (recall we have ρ_{i-1} ∘ α_i' = α_i ∘ ρ_i for all i). We have

    T(K) = T(K ∘ ρ_k) = T(α_1, ..., ρ_{k-1} ∘ α_k') = T(α_1, ..., α_{k-1} ∘ ρ_{k-1}, α_k') = · · · = T(K')

as wanted. □

³³ See Section 4 for these concepts.

Notice that the above Lemma yields in particular that T(K) = T(K') in case K' is an alphabetic variant of K: this is important, as we rewrite on equivalence classes of paths modulo alphabetic variants. Moreover, the above Lemma (which will be tacitly used many times during the termination proof) makes it possible to replace K with any ρ(K) (where ρ has the identity as first component) when computing T(K): this allows moving certain arrows to the last position in a tuple of arrows, assuming that certain projections located in an internal position of a path project on the last components, etc. (see the Examples of Section 4).

Lemma 9.6  Let δ = ⟨d_1, ..., d_n⟩ : X^m → X^n be an arrow which is not in E_0 (i.e. it is not a projection); suppose that δ_ε = ⟨π_{i_1}, ..., π_{i_k}⟩ : X^m → X^k. We have that T(δ, 1_{X^n}) > T(δ_ε, 1_{X^k}).

Proof. We have

    T(δ_ε, 1_{X^k}) = {f_{⟨0,0⟩}(f_{⟨0,0⟩}(x_s))}_{s=i_1,...,i_k}
    T(δ, 1_{X^n}) = {f_{⟨0,0⟩}(f_{χ(d_j)}(x_{i_j(1)}, ..., x_{i_j(l_j)}))}_{j=1,...,n}

where we supposed that (d_j)_ε = ⟨π_{i_j(1)}, ..., π_{i_j(l_j)}⟩. Now the elements of the former multiset are all distinct, and for every s = i_1, ..., i_k there is a j such that s is among i_j(1), ..., i_j(l_j) (otherwise π_s would be missing in δ_ε). This means in particular that for such s, j we have f_{⟨0,0⟩}(x_s) ≤ f_{χ(d_j)}(x_{i_j(1)}, ..., x_{i_j(l_j)}) (where this inequality is strict in case the same j corresponds to different s). Consequently the former multiset is less than or equal to the latter. It is indeed strictly less: since δ is not in E_0, either some χ(d_j) is not ⟨0,0⟩, or some projection among ⟨π_{i_1}, ..., π_{i_k}⟩ appears at least twice in δ. In both cases, this is a sufficient reason for the latter multiset to be bigger. □

For a path K = α_1, ..., α_k, we define c(K) to be the vector

    ⟨T(α_1, ..., α_k), T(α_1, ..., α_{k-1}), ..., T(α_1)⟩

and for paths K, L we put K > L iff c(K) > c(L), where the second member refers to the lexicographic extension of >_m. The next Lemma says that c is 'almost stable by concatenation' as a complexity measure:

Lemma 9.7  Let K : X^m → X^n and K' : X^m → X^n be two paths such that K > K' (notice that they agree on domains and codomains); then

(i) for every path L having codomain X^m, we have L, K > L, K';

(ii) suppose that K = K_0, ⟨a_1, ..., a_n⟩, K' = K_0', ⟨a_1', ..., a_n'⟩ and that T(K_0, a_i) ≥ T(K_0', a_i') holds for all i = 1, ..., n; then for every path R having domain X^n, we have K, R > K', R.

Proof. Claim (i) directly follows from Lemmas 9.4 and 9.2. Let us show (ii). Here it is sufficient to prove that if T(K_0, a_i) ≥ T(K_0', a_i') holds for every i = 1, ..., n, then T(K, b) ≥ T(K', b) holds for every b : X^n → X (this yields the claim, by induction on the length of R, because the complexity measure of a path is the vector of trees associated with the left segments of the path itself). Supposing that b_ε is ⟨π_{i_1}, ..., π_{i_k}⟩, we have

    T(K, b) = f_{χ(b)}(T(K_0, a_{i_1}), ..., T(K_0, a_{i_k}))
    T(K', b) = f_{χ(b)}(T(K_0', a_{i_1}'), ..., T(K_0', a_{i_k}'))

hence T(K, b) ≥ T(K', b) as wanted. □
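Concretely, the comparison of the measures c(K), c(L) is lexicographic on variable-length vectors with length as the principal parameter, componentwise by the multiset extension >_m. This can be sketched as follows in Python (our illustration; the componentwise strict order is passed as a parameter and instantiated on integers for brevity):

```python
from collections import Counter

def multiset_gt(ms, ns, gt):
    """Dershowitz-Manna multiset extension of the strict order gt:
    cancel common elements; the left remainder must be nonempty and
    must dominate every element of the right remainder."""
    m, n = Counter(ms), Counter(ns)
    common = m & n
    m, n = m - common, n - common
    if not m:
        return False
    return all(any(gt(x, y) for x in m) for y in n)

def measure_gt(v, w, gt):
    """Lexicographic extension of >_m to variable-length vectors of
    multisets, with length as the principal parameter (longer vectors
    are bigger); this is the shape of the comparison c(K) > c(L)."""
    if len(v) != len(w):
        return len(v) > len(w)
    for mv, mw in zip(v, w):
        if Counter(mv) != Counter(mw):
            return multiset_gt(mv, mw, gt)
    return False  # the vectors are equal
```

In particular, a length-reducing rewrite step always decreases such a measure, which is exactly how the rules (R_c^i) are handled in the termination proof below.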

Theorem 9.8  R and R+ are terminating.

Proof. If we have K ⇒ K' by the rules (R_c^i), then K > K' always holds, because such rules are length-reducing (recall that in lexicographic orders on variable-length vectors, length is the principal parameter). According to the above Lemma, it is sufficient to show that for every other rule L ⇒ R of R+ ∪ R we have both

    (1)  T(L ∘ π_i) ≥ T(R ∘ π_i)

for every i = 1, ..., n (here X^n is the common codomain of L, R) and

    (2)  c(L) > c(R)

Notice that any (R_ε)-rewrite step is a special case of a (Rpr)*-rewrite step, where (Rpr)* is the rewrite rule

    (Rpr)*   α, ε ∘ β ⇒ α ∘ ε, β

(here ε is any strict projection). Moreover, we know from Lemma 7.2 that any (R_µ)- or (R_µ)+-rewrite step is a composition of a finite number of (R_µ)+1- and (Rdi+1)*-rewrite steps, where (R_µ)+1 is (any alphabetic variant of)

    (R_µ)+1   ⟨α, a⟩, β ⇒ ⟨α, a_e⟩, (1 × a_µ) ∘ β

and (Rdi+1)* is (any alphabetic variant of)

    (Rdi+1)*   ⟨α, a, a⟩, β ⇒ ⟨α, a⟩, (1 × ∆_X) ∘ β

Consequently, it is sufficient to prove (1) and (2) for the rules (Rpr)*, (R_µ)+1, (Rdi+1)* and (R_p^i).

Proof of (1) for rule (Rpr)*: α, ε ∘ β ⇒ α ∘ ε, β. Let b be any component of β; as (ε ∘ b_ε) ∘ b_m is the factorization of ε ∘ b, we have (taking into account Lemma 9.3):

    T(α, ε ∘ b) = f_{χ(b)}(T(α ∘ ε ∘ b_ε)) = T(α ∘ ε, b)

as wanted.

Proof of (2) for rule (Rpr)*: by the previous point, we have T(α, ε ∘ β) = T(α ∘ ε, β); however T(α) > T(α ∘ ε), because the projection is strict.

Notice that the fact established above, that T(α, ε ∘ β) and T(α ∘ ε, β) are componentwise equal, yields (together with Lemma 9.4) the following important information, to be used in the sequel: let us write K ⇒*_ε K' to express that K' is obtained from K by a sequence of (Rpr)*-rewrite steps; we have that³⁴

    (∗)  K ⇒*_ε K'  implies  T(K) = T(K')

Proof of (1) for rule (R_µ)+1: the first member of the rule is

    X^n --⟨α,a⟩--> Z × X --β--> U

whereas the second member is (let a_e = ⟨e_1, ..., e_k⟩)

    X^n --⟨α,e_1,...,e_k⟩--> Z × X^k --(1×a_µ)∘β--> U

Let b be any component of the vector β; we first suppose that b_ε is the identity and then reduce to this case. If b_ε is the identity, we have

    T(⟨α, a⟩, b) = f_{χ(b)}(T(α) ∪ {T(a)})
    T(⟨α, e_1, ..., e_k⟩, (1 × a_µ) ∘ b) ≤ f_{χ((1×a_µ)∘b)}(T(α) ∪ {T(e_j)}_{j=1,...,k})

(we put ≤ here because we do not know what ((1 × a_µ) ∘ b)_ε is, so we assumed the worst case: it is the identity). We need to prove that T(a) > T(e_j) for all j = 1, ..., k (then the first clause of the definition of the order among our trees applies). Suppose that a factors as follows

³⁴ (∗) is shown as follows: suppose that K ⇒_ε K' (in one step); then K = S', L, S'' and K' = S', R, S'', where L and R are the left and right sides of a (Rpr)*-rule. We have that T(L, S'') = T(R, S''), because such multisets of trees are obtained by replacing variables in the same multiset of trees by equal trees (notice that T(L) and T(R) are not only equal, but also componentwise equal); the same happens to T(S', L, S'') and T(S', R, S'') (only a further substitution is operated to get them).

    X^{m+n'} --a_ε--> X^m --a_m--> X        (a = a_ε ∘ a_m)

where n = m + n' and a_ε is ⟨π_1, ..., π_m⟩. We have T(a) = f_{χ(a)}(x_1, ..., x_m). Now observe that each e_j factors through a_ε (in fact, we have a = a_ε ∘ ((a_m)_e ∘ (a_m)_µ), hence, by uniqueness of factorizations, ⟨e_1, ..., e_k⟩ = a_e = a_ε ∘ (a_m)_e), so that T(e_j) ≤ f_{χ(e_j)}(x_1, ..., x_m). As χ(a) = ⟨1, 1⟩ and χ(e_j) = ⟨0, −⟩,³⁵ we have T(a) > f_{χ(e_j)}(x_1, ..., x_m) by the second clause in our definition of the order among trees.

Let us now consider the case in which b_ε is not the identity; we apply ⇒*_ε-rewriting to both members (being sure that the corresponding trees do not change, by (∗)). We have two subcases. In the first subcase, Z = Z' × Z'' (consequently α is split as α', α'') and b_ε is the projection Z' × Z'' × X → Z'' × X. We have ⟨α', α'', a⟩, b ⇒*_ε ⟨α'', a⟩, b_m for the first member and ⟨α', α'', a_e⟩, (1 × a_µ) ∘ b ⇒*_ε ⟨α'', a_e⟩, (1 × a_µ) ∘ b_m for the second member, thus reducing to the above special case (now (b_m)_ε is the identity). In the second subcase, b_ε is the projection Z' × Z'' × X → Z''. In this case, both members ⇒*_ε-rewrite to the path Y --α''--> Z'' --b_m--> X.

Proof of (2) for rule (R_µ)+1: by the previous point, the multiset of trees corresponding to the first member of the rule is greater than or equal to the multiset corresponding to the second member. In case they are equal, we need to compare T(α, a) and T(α, a_e); as we saw above, the former is greater as a multiset, because for every component e_j of a_e we have T(a) > T(e_j).

Proof of (1) for rule (Rdi+1)*: the first member of the rule is

    Y --⟨α,a,a⟩--> Z × X × X --β--> U

whereas the second member is

    Y --⟨α,a⟩--> Z × X --(1×∆_X)∘β--> U

³⁵ Of course the rule does not apply in case the first component of χ(a) is 0, because in such a case a would have the trivial e/µ factorization a ∘ 1_X.

Let b be any component of the vector β; we first suppose that b_ε is the identity and then reduce to this case. If b_ε is the identity, we have

    T(⟨α, a, a⟩, b) = f_{χ(b)}(T(α) ∪ {T(a), T(a)})

which is trivially bigger than

    f_{χ((1×∆_X)∘b)}(T(α) ∪ {T(a)}) ≥ T(⟨α, a⟩, (1 × ∆_X) ∘ b)

by the first clause in the definition of the order for trees. If b_ε is not the identity, let Z = Z' × Z'' (consequently α splits as α', α'') and let b_ε be either i) ⟨π_{Z''}, π^1_X, π^2_X⟩, or ii) ⟨π_{Z''}, π^1_X⟩, or iii) ⟨π_{Z''}, π^2_X⟩, or finally iv) π_{Z''}. In the last three cases both members have a ⇒*_ε-rewriting to the same path (which is ⟨α'', a⟩, b_m for ii)-iii) and α'', b_m for iv)), so the corresponding trees are equal by (∗). In the first case, the first member ⇒*_ε-rewrites to ⟨α'', a, a⟩, b_m, whereas the second member ⇒*_ε-rewrites to ⟨α'', a⟩, (1 × ∆_X) ∘ b_m, thus reducing to the special case considered above.

Proof of (2) for rule (Rdi+1)*: by the previous point, the multiset of trees corresponding to the first member of the rule is greater than or equal to the multiset corresponding to the second member. In case they are equal, we need to compare T(α, a, a) and T(α, a): the former is clearly bigger.

Proof of (1) for rule (R_p^i): we recall that the first member of (R_p^i) is

    Y --⟨γ,δ⟩--> Y1 × Z --⟨α,π_Z⟩--> Y2 × Z --β--> U

whereas the second member is

    Y --⟨γ,δ,δ_ε⟩--> Y1 × Z × Y' --α×1_{Y'}--> Y2 × Y' --(1_{Y2}×δ_m)∘β--> U

(with an extra arrow to the right in case i = 1). This rule is subject to the proviso that δ cannot be a projection. Let b be any component of β; we first assume that b_ε is the identity (and then reduce to this case). We have that

    T(⟨γ, δ⟩, ⟨α, π_Z⟩, b) = f_{χ(b)}(T(⟨γ, δ⟩, ⟨α, π_Z⟩)) = f_{χ(b)}(T(⟨γ, δ⟩, α) ∪ T(δ, 1_Z))

where ∪ refers to multiset union (notice that we used (∗) above in the omitted intermediate passages). We do not know what ((1 × δ_m) ∘ b)_ε is: let us take the worst case (it is the identity) and proceed as follows, using (∗) again:

    T(⟨γ, δ, δ_ε⟩, α × 1, (1 × δ_m) ∘ b) ≤ f_{χ((1×δ_m)∘b)}(T(⟨γ, δ, δ_ε⟩, α × 1)) = f_{χ((1×δ_m)∘b)}(T(⟨γ, δ⟩, α) ∪ T(δ_ε, 1_{Y'}))

This tree is indeed smaller than f_{χ(b)}(T(⟨γ, δ⟩, α) ∪ T(δ, 1_Z)) (by the first clause of the definition of the tree order): in fact, by Lemma 9.6 we have T(δ, 1_Z) > T(δ_ε, 1_{Y'}).

Let us now turn to the general case (b_ε may not be the identity). In such a case, let us transform both

    Y --⟨γ,δ⟩--> Y1 × Z --⟨α,π_Z⟩--> Y2 × Z --b--> X

and

    Y --⟨γ,δ,δ_ε⟩--> Y1 × Z × Y' --α×1_{Y'}--> Y2 × Y' --(1_{Y2}×δ_m)∘b--> X

by ⇒*_ε-rewriting and then apply (∗). Suppose we have Y2 = Y2' × Y2'' and Z = Z' × Z'' (consequently δ and α are also split as δ', δ'' and α', α'', respectively); let b factor as

    Y2' × Y2'' × Z' × Z'' --b_ε--> Y2'' × Z'' --b_m--> X

where b_ε is the obvious projection. We then have for the first member

    ⟨γ, δ⟩, ⟨α, π_Z⟩, b ⇒*_ε ⟨γ, δ', δ''⟩, ⟨α'', π_{Z''}⟩, b_m

Let us also split δ_m : Y' → Z' × Z'' as θ', θ'' (as a consequence, from ⟨δ', δ''⟩ = δ = δ_ε ∘ δ_m, we have in particular δ_ε ∘ θ'' = δ''); an analogous transformation on the second member gives

    ⟨γ, δ, δ_ε⟩, α × 1, (1 × δ_m) ∘ b ⇒*_ε ⟨γ, δ', δ'', δ_ε⟩, α'' × 1, (1 × θ'') ∘ b_m

Let us now factorize θ'' = θ''_ε ∘ θ''_m; from δ_ε ∘ θ'' = δ'', by uniqueness of factorizations, we get δ''_ε = δ_ε ∘ θ''_ε and δ''_m = θ''_m; thus, by further ⇒*_ε-rewrite steps, we get

    ⟨γ, δ', δ'', δ_ε⟩, α'' × 1, (1 × θ'') ∘ b_m ⇒*_ε ⟨γ, δ', δ'', δ''_ε⟩, α'' × 1, (1 × δ''_m) ∘ b_m

Now

    ⟨γ, δ', δ''⟩, ⟨α'', π_{Z''}⟩, b_m    and    ⟨γ, δ', δ'', δ''_ε⟩, α'' × 1, (1 × δ''_m) ∘ b_m

are the first and second members of a (R_p^i)-rewrite rule and (b_m)_ε is the identity. We can thus reduce to the above particular case, except that now there is no guarantee that δ'' is not a projection: this further case has to be considered separately. However, in such a case 1 × δ''_m is the identity, δ'' = δ''_ε, and all we need is to prove that the trees corresponding to the paths

    Y --⟨γ,δ',δ''⟩--> Y1 × Z' × Z'' --⟨α'',π_{Z''}⟩--> Y2'' × Z''
    Y --⟨γ,δ',δ'',δ''⟩--> Y1 × Z' × Z'' × Z'' --α''×1--> Y2'' × Z''

are the same. Indeed they are both equal to T(⟨γ, δ', δ''⟩, α'') ∪ T(δ'', 1_{Z''}) (again by (∗)).

Proof of (2) for rule (R_p^i): by the previous point, the multiset of trees corresponding to the first member of the rule is greater than or equal to the multiset corresponding to the second member. This does not prevent them from being equal, in some cases; hence, we compare the trees corresponding to

    Y --⟨γ,δ⟩--> Y1 × Z --⟨α,π_Z⟩--> Y2 × Z

and to

    Y --⟨γ,δ,δ_ε⟩--> Y1 × Z × Y' --α×1_{Y'}--> Y2 × Y'

The former is T(⟨γ, δ⟩, α) ∪ T(δ, 1_Z) whereas the latter is T(⟨γ, δ⟩, α) ∪ T(δ_ε, 1_{Y'}): as δ cannot be a projection, Lemma 9.6 applies, showing that the former is greater. □

From the results of the previous section, we immediately get:

Corollary 9.9  R+ is canonical. □

We now compare the rewrite systems R+ and R: it will turn out that they are essentially the same, hence in particular the canonicity of R will follow.

Lemma 9.10  If K ⇒*_{R+} K', then there exists K'' such that K' ⇒*_{R+} K'' and K ⇒*_R K''.

Proof. The statement is proved by noetherian induction on K (with respect to the order > among paths which has been used in the termination proof). If K = K', the statement is trivial; otherwise we have, for some K₀, K ⇒_{R+} K₀ ⇒*_{R+} K'. Now there is K₀' such that K₀ ⇒*_{R+} K₀'

and K ⇒_R K₀' (if the ⇒_{R+}-step is done by a rule different from (R_µ)+ this is trivial; otherwise apply Lemma 7.1). As R+ is confluent, there exists K₀'' such that K₀' ⇒*_{R+} K₀'' and K' ⇒*_{R+} K₀''. As K₀' < K (any kind of rewrite step decreases complexity, as we saw in the termination proof), we can apply the inductive hypothesis to K₀', yielding K'' such that

    K' ⇒*_{R+} K₀'' ⇒*_{R+} K''    and    K ⇒_R K₀' ⇒*_R K''

as wanted. □

Lemma 9.11  If K ⇒*_R K', then K ⇔*_{R+} K'.

Proof. The statement is again proved by noetherian induction on K. The only relevant case is when we have K ⇒_R K' by a single (R_p^i)-rewrite step, which is covered by Lemma 7.6(iii). □

We can finally complete the proof of Theorem 5.3. As we know from Theorem 9.8 that R is terminating, we only have to prove its confluence. Suppose we have K ⇒*_R K' and K ⇒*_R K''. Then K' ⇔*_{R+} K'' by Lemma 9.11; as R+ is canonical, K' and K'' both ⇒*_{R+}-rewrite to their common normal form N. By Lemma 9.10, there are N', N'' such that

    N ⇒*_{R+} N',   K' ⇒*_R N'        N ⇒*_{R+} N'',   K'' ⇒*_R N''

However, N is in R+-normal form, hence N' = N = N'' is a path to which both K' and K'' ⇒*_R-reduce. □

10  Examples and Related Work

In this Section we illustrate our results in concrete cases. First, in Section 5 we gave a definition of constructibility for theories referring to their associated Lawvere categories. Now we give a useful, equivalent, purely symbolic definition:

Proposition 10.1  A theory T' = ⟨Ω', Ax'⟩ is constructible over a theory T = ⟨Ω, Ax⟩ iff T' is a conservative extension of T and there exists a class E' of Ω'-terms such that:

(i) E' contains the variables and is closed under renamings of terms;

(ii) for every Ω'-term t(x_1, ..., x_n) there exist a k-minimized Ω-term u(x_1, ..., x_k) and pairwise distinct (with respect to provable identity in T') Ω'-terms v_1(x_1, ..., x_n), ..., v_k(x_1, ..., x_n) belonging to E' such that

    ⊢_{T'} t = u(v_1, ..., v_k);

(iii) whenever u, u' are k (resp. k')-minimized Ω-terms and we have ⊢_{T'} u(v_1, ..., v_k) = u'(v_1', ..., v_{k'}') for pairwise distinct (wrt T'-provability) terms v_1, ..., v_k ∈ E' and pairwise distinct (wrt T'-provability) terms v_1', ..., v_{k'}' ∈ E', then k = k' and there is a permutation σ acting on the k-element set,³⁶ such that

    ⊢_{T'} v'_{σ(i)} = v_i  (i = 1, ..., k)    and    ⊢_T u' = u(x_{σ(i)}/x_i).

Proof. If T 0 is constructible over T , in T0 there is a left extension (E 0 , M) of the standard weak factorization system (E, M) of T. In order to find E 0 fulfilling the above requirements it is sufficient to take the set of terms t(x1 , . . . , xn ) such that the equivalence class of t (seen as an arrow X n −→ X in T0 ) belongs to E 0 . To see that (i)(iii) hold, we only have to show that if an arrow from T like he1 , . . . , em i : X n −→ X m belongs to E 0 , then the ei are pairwise distinct and, vice versa, that if all ei belong to E 0 and are pairwise distinct, then he1 , . . . , en i belongs to E 0 ; these facts follow from Corollary 7.4. Vice versa, suppose that a class E 0 of Ω0 -terms fulfilling the above requirements is given. We define a left extension (E 0 , M) of the standard weak factorization system (E, M) of T by taking as E 0 the set of arrows he1 , . . . , em i : X n −→ X m such that the ei are represented by distinct (up to provable identity in T 0 ) terms in E 0 . First notice that, if α = ha1 , . . . , an i ∈ E 0 ∩ T, then α ia an n-tuple of distinct projections by an immediate application of (iii) to (the symbolic meaning of) the commutativity of the squares Y

ai X 1X

(ai )ε ?

Z=X

? -X

(ai )µ

We can easily factorize arrows a having codomain X (just apply (ii) to find ae and aµ ). To factorize arrows ha1 , . . . , am i : X n −→ X m , it is sufficient to factorize each ai as hei1 , . . . , eiki i ◦ µi and then ‘diagonalize’ as follows: let he1 , . . . , es i be any list of the distincts elements of {eij } and let δ be a diagonal such that he1 , . . . , es i ◦ δ = he11 , . . . , emkm i. We factorize ha1 , . . . , am i as he1 , . . . , es i ◦ (δ ◦ (µ1 × · · · × µm )). Notice that δ ◦ (µ1 × · · · × µm ) is still represented by a minimized vector of terms: in fact,Pfor every i, as ei1 , . . . , eiki are all distinct, δ composed with the projection from X ki onto the domain of µi is a projection πi : X s −→ X ki (hence the i-th component of 36

Footnote 36: Such σ is clearly unique, given that the vi are distinct.


δ ◦ (µ1 × ⋯ × µm) is π_i ◦ µ_i); moreover, the projections {π_i} altogether cannot miss any component of X^s (by the very definition of the list ⟨e1, …, es⟩). To show uniqueness of factorizations, suppose we have a commutative square

    α1 ◦ µ1 = α2 ◦ µ2,    where α1 : Z → Y2, µ1 : Y2 → X^n, α2 : Z → Y1, µ2 : Y1 → X^n,

with α1, α2 ∈ E′ and µ1, µ2 ∈ M. The α1, α2 are lists formed by distinct components (by the definition of the class E′); let us first show that each component a of α1 appears as a component of α2 too (and vice versa, so that α1 and α2 differ only by a renaming). As µ1 is represented by a minimized vector of terms, there is a component s of µ1 such that α1 ◦ s_ε contains a; if s is the i-th component of µ1, let r be the corresponding i-th component of µ2. By commutativity of the square, we have (α1 ◦ s_ε) ◦ s_µ = (α2 ◦ r_ε) ◦ r_µ; by (iii), there is a renaming ρ such that α1 ◦ s_ε ◦ ρ = α2 ◦ r_ε and ρ ◦ r_µ = s_µ. The former shows that a is a component of α2. We have thus established that α1, α2 differ only by a renaming, i.e. that there is a renaming ρ such that α1 ◦ ρ = α2. Now ρ ◦ µ2 = µ1 immediately follows from the commutativity of the above square and from the following
Claim. If α ∈ E′ and τ, σ ∈ T0, then α ◦ τ = α ◦ σ implies τ = σ.
The Claim is obvious in case σ, τ are projections, because the components of α are distinct. In the general case, it is sufficient to prove the Claim for σ, τ having codomain X; in that case, from (α ◦ σ_ε) ◦ σ_µ = (α ◦ τ_ε) ◦ τ_µ we have (by (iii)) (α ◦ σ_ε) ◦ ρ = α ◦ τ_ε and ρ ◦ τ_µ = σ_µ for a renaming ρ. As σ_ε ◦ ρ and τ_ε are projections, we just saw (this is the above-mentioned obvious case) that σ_ε ◦ ρ = τ_ε, hence σ = σ_ε ◦ σ_µ = σ_ε ◦ ρ ◦ τ_µ = τ_ε ◦ τ_µ = τ, as required.

□
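The 'diagonalize' step used in the proof above has a simple concrete reading once arrows are represented by lists of terms. The Python sketch below (our own encoding, not the paper's: terms are plain strings, and the function name is ours) builds the master list ⟨e1, …, es⟩ of distinct E′-terms and, for each component, the projection π_i selecting its own E′-terms from the master list.

```python
def diagonalize(factorizations):
    """Combine per-component e/mu factorizations of a tuple arrow.

    Each entry of `factorizations` is the list of E'-terms
    e_i1, ..., e_ik_i of one component (pairwise distinct within
    one entry).  Returns the deduplicated master list <e_1,...,e_s>
    and, for each component, the tuple of positions (the projection
    pi_i) recovering its own E'-terms from the master list.
    """
    master = []
    projections = []
    for e_list in factorizations:
        idx = []
        for e in e_list:
            if e not in master:
                master.append(e)
            idx.append(master.index(e))
        projections.append(tuple(idx))
    return master, projections

# two components whose factorizations share the E'-term "x2"
master, projs = diagonalize([["x1*x2", "x2"], ["x2", "f(x1)"]])
```

As in the proof, the projections π_i jointly cannot miss any component of X^s: every index of the master list occurs in some π_i.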

We say that T′ is effectively constructible over T iff it is constructible over T and moreover, for every term t, terms u, v1, …, vk satisfying (ii) above are provided by a total recursive function. As an immediate corollary to our main Theorem 5.3, we have:
Theorem 10.2 Suppose that T1, T2 are both effectively constructible over T0 and that the word problems for T1, T2 are solvable; then the word problem for T1 +T0 T2 is solvable too.
Proof. By Theorem 3.1, Lemmas 4.1 and 5.2, and Theorem 5.3, it is sufficient to show that applicability of the rules of R is effective whenever a path is given as a list of terms representing their respective equivalence classes (in order to be able to compare


normal forms, we also need to check that it is effectively recognizable whether two paths are alphabetic variants of each other). For rules (Rci) we need to be able to recognize whether a certain arrow αi comes from T0: this happens iff α_ε ∈ E0 (by uniqueness of ε/µ factorization and by the fact that E0 ⊆ Ei), a fact which is effective by appealing to the solvability of the word problem for Ti (see footnote 37). For rule (Rε) we already observed in Section 5 that ε-extraction is effective in case the word problem is decidable. For rule (Rµ), one just uses effective constructibility, together with the fact that the ε/µ factorization of ⟨a1, …, an⟩ can be reduced to the ε/µ factorization of its components, see Lemma 7.2. Finally, in order to apply rules (Rpi) (and to check the relative proviso) it is sufficient to be able to recognize projections, a fact which reduces once again to the solvability of the input word problems.
Lastly, we show that it is effectively recognizable whether two paths are alphabetic variants of each other. In case they are both in normal form (which is the relevant case), there is a quick procedure for that. First, for α1, …, αk to be an alphabetic variant of β1, …, βk′ we need k = k′; secondly, as the components of α1 and β1 are distinct (because the paths are in normal form and (Rµ) does not apply), the renaming ρ1 such that α1 ◦ ρ1 = β1 is easily computed, provided it exists; at this point, we recursively need to check whether ρ1⁻¹ ◦ α2, …, αk is an alphabetic variant of β2, …, βk, and so on. □
Example. Commutative rings with unit are constructible over abelian groups. In fact, terms t(x1, …, xn) in the theory of abelian groups can be represented as homogeneous linear polynomials in the indeterminates x1, …, xn with integer coefficients (they are minimized iff no coefficient is zero); terms in the theory of commutative rings with unit can be represented as arbitrary polynomials with integer coefficients.
The class E′ needed for constructibility is formed by the monic monomials (1 included): in fact, every integer polynomial can be uniquely expressed as a linear combination (with non-zero integer coefficients) of distinct monic monomials. □
Example. Let T be the theory of join-semilattices with zero and let T′ be the theory of semilattice-monoids we met in the Introduction. T′ is constructible over T: the class E′ is given by the terms of the form x_{i1} ◦ ⋯ ◦ x_{ik} (for k ≥ 0). □
Example. The theory of abelian groups endowed with an endomorphism f is constructible over the theory of abelian groups: the class E′ is given by the terms of the form f^n(xi) (for n ≥ 0). □
Example. Differential rings (i.e. rings endowed with a differentiation operator ∂ satisfying the usual laws for derivatives of sums and products) are constructible over commutative rings with unit: the class E′ is given by the terms of the form ∂^k xi (for k ≥ 0). □
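The first example above can be made concrete with a few lines of Python (a sketch under our own encoding, not the paper's notation): an integer polynomial is a map from monic monomials — sorted tuples of variables, with () standing for 1 — to coefficients, and the ε/µ factorization simply separates the list of distinct monic monomials from the integer linear combination.

```python
def factorize(poly):
    """Split a commutative-ring term into its E'-part (the distinct
    monic monomials with non-zero coefficient) and its abelian-group
    part (the corresponding integer coefficients).  The linear
    combination sum_i coeffs[i] * y_i is minimized exactly because
    every listed coefficient is non-zero."""
    monomials = [m for m, c in sorted(poly.items()) if c != 0]
    coeffs = [poly[m] for m in monomials]
    return monomials, coeffs

# (x1 + x2)^2 - x2^2 expands to x1^2 + 2*x1*x2 (+ 0*x2^2)
poly = {("x1", "x1"): 1, ("x1", "x2"): 2, ("x2", "x2"): 0}
monomials, coeffs = factorize(poly)
```

Uniqueness of the factorization corresponds to the zero-coefficient monomials being discarded and the remaining monomials being pairwise distinct.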

Footnote 37: Clearly, if a term t represents a : X^n → X, then a is a projection iff t collapses to (i.e., is provably equal to) a variable xi (for some i = 1, …, n); a similar observation applies to vectors of terms.


Notice that in the above examples the smaller theory is not collapse-free. Additional examples of a different nature can be found in [3, 4]. In order to build counterexamples, a useful tool is the following Proposition (clearly inspired by [3, 4]):
Proposition 10.3 If T′ is constructible over T, then the T-reduct of any free T′-algebra is a free T-algebra (on a bigger set of generators).
Proof. Let F_{T′}(G) be the free T′-algebra on the set G of generators; we show that its T-reduct is free over the set of elements of the form u(g1, …, gn), where u(x1, …, xn) ∈ E′ and g1, …, gn are distinct elements from G. Clearly, the claim follows from the case in which G is finite. To have a quick proof, we translate everything into the terminology of functorial semantics. Let (E, M) be the standard weak factorization system of T and let (E′, M) be its left extension to T0. For any functor F having domain T0, let us call |F| its restriction to T; for any type Y, let E′(Y, X) be T0(Y, X) ∩ E′. Fix a type Y and a T-algebra A : T → Set; we need to find a bijective natural correspondence between set-theoretic functions

    N̄ : E′(Y, X) → A(X)

and natural transformations

    N : |T0(Y, −)| → A.

Given N, let N̄ be the restriction of N_X to E′(Y, X) in the domain. Conversely, if N̄ is given, we define, for every Z and α : Y → Z,

    N_Z(α) = A(α_µ)(N̄(e1), …, N̄(ek))

where α_ε = ⟨e1, …, ek⟩. In order to prove naturality of the N so defined, take ν : Z → Z′ in T; we need to show the commutativity of the square with horizontal arrows N_Z : |T0(Y, Z)| → A(Z) and N_{Z′} : |T0(Y, Z′)| → A(Z′) and vertical arrows |T0(Y, ν)| and A(ν). We have

    A(ν)(N_Z(α)) = A(ν)(A(α_µ)(N̄(e1), …, N̄(ek))) = A(α_µ ◦ ν)(N̄(e1), …, N̄(ek)).

On the other hand, let α_µ ◦ ν factorize into ε/µ-components as follows:


    α_µ ◦ ν = π ◦ µ,    where π = ⟨π_{i1}, …, π_{im}⟩ : X^k → X^m and µ : X^m → Z′ is in M.

We have

    N_{Z′}(|T0(Y, ν)|(α)) = N_{Z′}(α ◦ ν) = N_{Z′}(α_ε ◦ π ◦ µ)
        = A(µ)(N̄(e_{i1}), …, N̄(e_{im}))
        = A(µ)(A(π)(N̄(e1), …, N̄(ek)))
        = A(π ◦ µ)(N̄(e1), …, N̄(ek))
        = A(α_µ ◦ ν)(N̄(e1), …, N̄(ek)).

Bijectivity and naturality of the correspondence N ↔ N̄ are immediate.
a

Counterexample. Boolean algebras are not constructible over join-semilattices with zero. In fact, the free join-semilattice with zero over an infinite set G of generators is just the set of finite subsets of G; in this algebra, the strict part of the partial order associated with the join is clearly terminating. This is not so, however, in the countably generated free Boolean algebra, which is atomless. □
Counterexample. Modal algebras (also K4-modal algebras, interior algebras, diagonalizable algebras, etc.) are not constructible over Boolean algebras: in fact, in such varieties, finitely generated free algebras are atomic and infinite (see footnote 38), whereas free Boolean algebras are either finite or atomless. □
Let us now give examples of normalization through our rewriting system R. In order to apply normalization to paths of equivalence classes of terms, the algebraic notation for rules must be converted into ordinary symbolic notation. This is not difficult (all the needed information is contained in Section 2 above), however some care is needed. Suppose e.g. you want to apply the products rule to the path

    X^3 --⟨t, u⟩--> X^2 --⟨v, x2⟩--> X^2 --w--> X

First, u(x1, x2, x3) has to be minimized (this is the factorization δ = δ_ε ◦ δ_m in the Table of rules of Section 5). Suppose it minimizes as u′(x1, x3); the pair of projections ⟨x1, x3⟩ stays in first position, whereas u′(x1, x2) is moved to the third position. However, the term moved to the last position for composition with w(x1, x2) (the arrow 1 × δ_m of

Footnote 38: These are well-known results. For a proof making use of normal forms, see [7].


the Table of rules) requires a renaming away from x1, and consequently it is the pair ⟨x1, u′(x2, x3)⟩. Thus the products rule rewrite step produces

    X^3 --⟨t, u, x1, x3⟩--> X^4 --⟨v, x3, x4⟩--> X^3 --w(x1, u′(x2, x3))--> X
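One can check symbolically that the two paths above represent the same composite term. In the Python sketch below (an encoding of ours, not the paper's: terms are variables 'xi' or tuples (op, args…), and composition is substitution in diagrammatic order), u is taken to be u′(x1, x3), in accordance with its assumed minimization.

```python
def subst(term, args):
    """Substitute args[i-1] for variable 'xi' in a term; a term is
    either a variable string 'x1', 'x2', ... or a tuple
    (op, arg1, ..., argk)."""
    if isinstance(term, str):
        return args[int(term[1:]) - 1]
    op, *ts = term
    return (op, *[subst(s, args) for s in ts])

def compose(*arrows):
    """Compose tuple-arrows left to right (diagrammatic order):
    each arrow is a tuple of terms in the variables of its domain."""
    result = arrows[0]
    for arr in arrows[1:]:
        result = tuple(subst(s, result) for s in arr)
    return result

x1, x2, x3, x4 = "x1", "x2", "x3", "x4"
u = ("u'", x1, x3)            # u(x1,x2,x3) minimized as u'(x1,x3)
t = ("t", x1, x2, x3)

# original path:  X^3 --<t,u>--> X^2 --<v,x2>--> X^2 --w--> X
before = compose((t, u), (("v", x1, x2), x2), (("w", x1, x2),))
# rewritten path: X^3 --<t,u,x1,x3>--> X^4 --<v,x3,x4>--> X^3 --w(x1,u'(x2,x3))--> X
after = compose((t, u, x1, x3),
                (("v", x1, x2), x3, x4),
                (("w", x1, ("u'", x2, x3)),))
```

Both composites evaluate to the term w(v(t, u), u), as expected.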

In the examples below we consider the following theories:

    T0 = abelian groups with period 2
    T1 = Boolean rings
    T2 = T0 plus an idempotent endomorphism f (i.e., such that f(f(x1)) = f(x1))
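Terms of T2 have a handy normal form that is used tacitly in the computations below: since the group has period 2 and f is an additive idempotent endomorphism, every T2-term equals a GF(2)-combination of the variables xi and of the f(xi). A small Python sketch of this representation (our own encoding, not from the paper):

```python
class T2Term:
    """Normal form for T2 = (abelian groups with period 2) plus an
    idempotent endomorphism f: every term is a GF(2)-combination of
    the x_i and the f(x_i), since f is additive and f(f(x)) = f(x).
    `a` and `b` are the sets of indices with coefficient 1."""
    def __init__(self, a=(), b=()):
        self.a, self.b = frozenset(a), frozenset(b)
    def __add__(self, other):
        # symmetric difference = addition of coefficients mod 2
        return T2Term(self.a ^ other.a, self.b ^ other.b)
    def f(self):
        # f(sum a_i x_i + sum b_i f(x_i)) = sum (a_i + b_i) f(x_i)
        return T2Term((), self.a ^ self.b)
    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

x1, x2 = T2Term(a={1}), T2Term(a={2})
# f(x1 + x2 + f(x2)) = f(x1) + f(x2) + f(f(x2)) = f(x1)
lhs = (x1 + x2 + x2.f()).f()
```

This computation mirrors the simplification f(x1 + x2 + f(x2)) = f(x1) appearing in the first normalization example below.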

We leave it to the reader to check that T1, T2 are both constructible over T0.
Example. Let us consider the following instance of the word problem in the theory T1 +T0 T2:

    f(x1 · x2 + x2 + f(x2)) =? f(x1 · x2)

Let us rewrite a splitting path of the first member in R.

    X^2 --⟨x1, x2, f(x2)⟩--> X^3 --x1·x2 + x2 + x3--> X --f(x1)--> X

      ⇓ (Rµ)  [the T1-arrow factorizes with ε-part ⟨x1·x2, x2, x3⟩; its µ-part x1 + x2 + x3 merges with f(x1) into the T2-arrow f(x1 + x2 + x3)]

    X^2 --⟨x1, x2, f(x2)⟩--> X^3 --⟨x1·x2, x2, x3⟩--> X^3 --f(x1 + x2 + x3)--> X

      ⇓ (Rp2)

    X^2 --⟨x1, x2, f(x2), x2⟩--> X^4 --⟨x1·x2, x2, x4⟩--> X^3 --f(x1 + x2 + f(x3))--> X

      ⇓ (Rε)  [the now unused component f(x2) of the first arrow is dropped]

    X^2 --⟨x1, x2, x2⟩--> X^3 --⟨x1·x2, x2, x3⟩--> X^3 --f(x1 + x2 + f(x3))--> X

      ⇓ (Rc1)

    X^2 --⟨x1·x2, x2, x2⟩--> X^3 --f(x1 + x2 + f(x3))--> X

      ⇓ (Rµ)  [ε-part ⟨x1·x2, x2⟩; the µ-part ⟨x1, x2, x2⟩ merges with the T2-arrow, giving f(x1 + x2 + f(x2))]

    X^2 --⟨x1·x2, x2⟩--> X^2 --f(x1 + x2 + f(x2))--> X

      ⇓ (Rε)  [in T2, f(x1 + x2 + f(x2)) = f(x1) + f(x2) + f(f(x2)) = f(x1), so the component x2 is unused]

    X^2 --x1·x2--> X --f(x1)--> X

where the last path corresponds to the splitting path of the term f(x1 · x2). □
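At the Boolean-ring level, the computations in these examples can be checked mechanically: modulo T1, every term is a multilinear polynomial over GF(2) (x·x = x and x + x = 0). In the Python sketch below (our encoding; y1, y2 stand for the opaque T2-terms f(x1), f(x2)) we verify the Boolean-ring identity y1·y2 + y1·(y1 + y2) = y1 underlying the next example.

```python
def mul(p, q):
    """Multiply multilinear GF(2) polynomials: a polynomial is a set
    of monomials, each monomial a frozenset of generators (so that
    x*x = x); toggling membership adds coefficients mod 2."""
    out = set()
    for m in p:
        for n in q:
            out ^= {m | n}
    return out

def add(p, q):
    # sum of polynomials = symmetric difference of monomial sets
    return p ^ q

y1 = {frozenset({"y1"})}
y2 = {frozenset({"y2"})}
# y1*y2 + y1*(y1 + y2), with y_i standing for f(x_i)
lhs = add(mul(y1, y2), mul(y1, add(y1, y2)))
```

Indeed lhs reduces to y1, i.e. to f(x1), which is what the path rewriting below computes step by step.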


Example. Let us consider the following instance of the word problem for T1 +T0 T2:

    f(x1) · f(x2) + f(x1) · (f(x1) + f(x2)) =? f(x1)

We rewrite the first member as follows.

    X^2 --⟨f(x1), f(x2), f(x1)+f(x2)⟩--> X^3 --⟨x1·x2, x1·x3⟩--> X^2 --x1+x2--> X

      ⇓ (Rµ)  [ε-part ⟨f(x1), f(x2)⟩; the µ-part ⟨x1, x2, x1+x2⟩ merges with the T1-arrow: ⟨x1, x2, x1+x2⟩ ◦ ⟨x1·x2, x1·x3⟩ = ⟨x1·x2, x1·(x1+x2)⟩]

    X^2 --⟨f(x1), f(x2)⟩--> X^2 --⟨x1·x2, x1·(x1+x2)⟩--> X^2 --x1+x2--> X

      ⇓ (Rµ)  [in T1, x1·(x1+x2) = x1·x1 + x1·x2 = x1 + x1·x2, so the ε-part is ⟨x1·x2, x1⟩; the µ-part ⟨x1, x1+x2⟩ merges with x1+x2, giving x1 + x1 + x2 = x2]

    X^2 --⟨f(x1), f(x2)⟩--> X^2 --⟨x1·x2, x1⟩--> X^2 --x2--> X

      ⇓ (Rε)  [⟨x1·x2, x1⟩ ◦ x2 = x1]

    X^2 --⟨f(x1), f(x2)⟩--> X^2 --x1--> X

      ⇓ (Rc2)

    X^2 --f(x1)--> X

where the last path coincides with the splitting path of the second term of the problem. □
We now make a comparison with results from [3, 4]. Let T = (Ω, Ax) and T′ = (Ω′, Ax′) be equational theories such that T′ is a conservative extension of T. Let G be the set of Ω′-terms r such that ⊬_{T′} r = t for all Ω′-terms t with top symbol in Ω. Notice that G ≠ ∅ iff V ⊆ G, where V is the set of variables. Moreover, G is empty in case T is not collapse-free. We say that T′ is BT-constructible over T iff the following hold:

(I) V ⊆ G (hence T is collapse-free);


(II) for every Ω′-term t, there are an Ω-term s and a vector ~r of terms in G such that ⊢_{T′} t = s(~r);
(III) for every pair s1, s2 of Ω-terms and for every pair of vectors ~r1, ~r2 of terms from G, we have ⊢_{T′} s1(~r1) = s2(~r2) iff ⊢_T s1(~z1) = s2(~z2), where ~z1, ~z2 are fresh vectors of variables abstracting ~r1, ~r2, so that two terms in ~r1, ~r2 are abstracted by the same variable iff they are provably equal in T′.
We now show that if T′ is BT-constructible over T, then T′ is constructible over T (in our sense). We use Proposition 10.1 above, taking E′ = G. Let us first show uniqueness of factorizations. Suppose we have a k-minimized Ω-term u and a k′-minimized Ω-term u′ and that ⊢_{T′} u(v1, …, vk) = u′(v′1, …, v′k′) for pairwise distinct (w.r.t. T′-provability) terms v1, …, vk ∈ G and pairwise distinct (w.r.t. T′-provability) terms v′1, …, v′k′ ∈ G. Let w1, …, ws be the terms which are common to the lists v1, …, vk and v′1, …, v′k′. For simplicity, let us also rearrange such lists as

    v1, …, vk = w1, …, ws, r1, …, rl

    and v′1, …, v′k′ = w1, …, ws, r′1, …, r′l′.

Then, applying (III), we get ⊢_T u(x1, …, xs, y1, …, yl) = u′(x1, …, xs, z1, …, zl′), which is impossible (unless l = l′ = 0, yielding what we need) because u and u′ are minimized: in fact, replacing e.g. all the yi by a ground term c we would get ⊢_T u(x1, …, xs, c, …, c) = u′(x1, …, xs, z1, …, zl′) = u(x1, …, xs, y1, …, yl), contrary to the fact that u is minimized.
Showing the existence of factorizations is a little more tricky, because the requirements in (II) above look more liberal than those in Proposition 10.1(ii) (it is not required that s be minimized, nor that the ~r be distinct (up to provability) and contain at most the variables of the original t). We progressively refine the factorization in (II). First, if we have (let ~r = r1, …, rk) ⊢_{T′} t = s(r1, …, rk) for non-distinct ri, then we can identify variables in s(x1, …, xk) and correspondingly reduce the list r1, …, rk to a list formed by distinct elements. If furthermore s(x1, …, xk) is not minimized, then we can minimize it and remove the corresponding ri from the list. Thus we have obtained a factorization of t(x1, …, xn) as s(r1, …, rk), where s is k-minimized and the ri are pairwise distinct (up to provability in T′) terms of G. Suppose that the ri contain additional variables, say that they contain variables from x1, …, xn, ~y; we have (let ~x = x1, …, xn)

    ⊢_{T′} t(~x) = s(r1(~x, ~y), …, rk(~x, ~y))


Let ~z be fresh renamings of the ~y; we get ⊢_{T′} t(~x) = s(r1(~x, ~z), …, rk(~x, ~z)), hence ⊢_{T′} s(r1(~x, ~y), …, rk(~x, ~y)) = s(r1(~x, ~z), …, rk(~x, ~z)). We can now apply (III) to this situation; as s is minimized, we must have ⊢_{T′} ri(~x, ~y) = ri(~x, ~z) for all i = 1, …, k (possibly up to a permutation). Replacing all the ~y by ground terms ~c (we may use the same ground term for all of them), we get ⊢_{T′} ri(~x, ~c) = ri(~x, ~z) = ri(~x, ~y). Now ri(~x, ~c) is provably equal to ri(~x, ~y); hence, as the latter is in G, so is the former (G is closed under provable identity, by its very definition). For the same reason, all the ri(~x, ~c) are pairwise distinct (with respect to provable identity in T′), because so are the ri(~x, ~y). We finally get

    ⊢_{T′} t(~x) = s(r1(~x, ~c), …, rk(~x, ~c)),

which is a factorization matching all the requirements of Proposition 10.1(ii).
Summing up, the difference between the definition of constructibility of [3, 4] and ours lies in the fact that we do not need any specific definition for the class E′ of terms used in factorizations.
The factorization-refinement technique we used above for the comparison with BT-constructibility is interesting in itself. Combining it with the proof of Proposition 10.3, it is not difficult to get the following third characterization of constructibility:
Proposition 10.4 Let T′ be a conservative extension of T. Then T′ is constructible over T iff the T-reduct of any free T′-algebra F_{T′}(G′) is a free T-algebra over a set of generators G such that
• G′ ⊆ G;
• G is invariant under the T′-isomorphisms of F_{T′}(G′) which are extensions of a bijection on the set of free generators G′.
To finish, let us mention some possible directions for future research. Of course, there is the problem of extending our results to combined unification.
Secondly, one may try to generalize combined word problems to the case in which the definition of constructibility is relative to a weak factorization system of the smaller theory which may not be the standard one (that is, the class E0 is allowed to be larger than the


class of projections). Results from Section 6 are still valid; however, it is not clear what happens with critical pairs arising from superpositions with the products rule. Such enlargements of the definition of constructibility are important because they could cover additional mathematically relevant examples. Finally, although quite difficult, it would be essential to be able to deal with theories extending T1 +T0 T2 with further axioms. In principle, as our combination algorithm is obtained through rewriting, one may try to apply some form of Knuth-Bendix completion to get decision procedures in such situations too.

References
[1] H. Andréka, Á. Kurucz, I. Németi, I. Sain, and A. Simon. Causes and remedies for undecidability in arrow logics and in multi-modal logics. In Arrow Logic and Multi-Modal Logic, pages 63–99. CSLI Publications, Stanford, CA, 1996.
[2] F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, Cambridge, 1998.
[3] F. Baader and C. Tinelli. Deciding the word problem in the union of equational theories. Technical Report UIUCDCS 98-2073, Department of Computer Science, University of Illinois at Urbana-Champaign, 1998.
[4] F. Baader and C. Tinelli. Deciding the word problem in the union of equational theories sharing constructors. In Rewriting Techniques and Applications (Trento, 1999), pages 175–189. Springer, Berlin, 1999.
[5] E. Domenjoud, F. Klay, and C. Ringeissen. Combination techniques for non-disjoint equational theories. In Proc. CADE-12, volume 814 of Springer LNAI, pages 48–64, 1996.
[6] P. J. Freyd and G. M. Kelly. Categories of continuous functors. I. Journal of Pure and Applied Algebra, 2:169–191, 1972.
[7] S. Ghilardi. An algebraic theory of normal forms. Annals of Pure and Applied Logic, 71(3):189–245, 1991.
[8] S. Ghilardi and G. C. Meloni. Modal logics with n-ary connectives. Z. Math. Logik Grundlag. Math., 36(3):193–215, 1990.
[9] F. W. Lawvere. Functorial semantics of algebraic theories. Proc. Nat. Acad. Sci. U.S.A., 50:869–872, 1963.
[10] C. Lüth and N. Ghani. Monads and modular term rewriting. In Category Theory and Computer Science (Santa Margherita Ligure, 1997), pages 69–86. Springer, Berlin, 1997.


[11] T. Nipkow. Combining matching algorithms: the regular case. Journal of Symbolic Computation, 12(6):633–653, 1991.
[12] D. Pigozzi. The join of equational theories. Colloquium Mathematicum, 30:15–25, 1974.
