Post Correspondence Problem for morphisms with unique blocks

Vesa Halava | Tero Harju | Juhani Karhumäki | Michel Latteux

TUCS Technical Report No 677, April 2005

Vesa Halava

Department of Mathematics and TUCS - Turku Centre for Computer Science University of Turku FIN-20014 Turku, Finland. [email protected]

Tero Harju

Department of Mathematics and TUCS - Turku Centre for Computer Science University of Turku, FIN-20014 Turku, Finland. [email protected]

Juhani Karhumäki

Department of Mathematics and TUCS - Turku Centre for Computer Science University of Turku, FIN-20014 Turku, Finland. [email protected] Supported by the Academy of Finland under grant 44087.

Michel Latteux

Université des Sciences et Technologies de Lille Bâtiment M3, 59655 Villeneuve d’Ascq Cédex, France. [email protected]

Abstract

In the Post Correspondence Problem (PCP) an instance (h, g) consists of two morphisms h and g, and the problem is to determine whether or not there exists a word w such that h(w) = g(w). Here we prove that the PCP is decidable for instances with unique blocks and that the infinite PCP is decidable for instances with unique continuation in the construction of the solution. These results establish a new, larger class of decidable instances of the PCP, which includes the class of marked instances.

Keywords: Post Correspondence Problem; infinite Post Correspondence Problem; morphism; words; blocks

TUCS Laboratory Discrete Mathematics for Information Technology Laboratory

1 Introduction

In the Post Correspondence Problem (PCP, for short), we are given two morphisms h, g : A∗ → B∗, where A and B are finite alphabets, and we are asked whether or not there exists a nonempty word w ∈ A∗ such that h(w) = g(w). The pair (h, g) is called an instance of the PCP and a word w ∈ A+ is a solution of the instance (h, g) if h(w) = g(w). The set of all solutions, E(h, g) = E(I) = {w ∈ A+ | h(w) = g(w)}, is called the equality set of the instance I = (h, g). The PCP is undecidable in this general form (see [12]).

The borderline between decidable and undecidable sets of instances has been investigated on several occasions by restricting the instances of the PCP. For example, it is an easy exercise to show that the unary PCP, where the domain alphabet has only one letter, is decidable. An instance (h, g) of the PCP, where h, g : A∗ → B∗, is binary if |A| = 2. It was proved in [1] that the PCP is decidable for binary instances; see also [5] or [6] for a somewhat simpler proof. On the other hand, the PCP is undecidable for instances with domain alphabets A satisfying |A| ≥ 7 (see [11]).

Another known borderline between decidability and undecidability is provided by marked and prefix morphisms. A morphism h : A∗ → B∗ is said to be marked if the images h(a) and h(b) of any two different letters a, b ∈ A begin with different letters. The problem where the instances are pairs of marked morphisms is called the marked PCP (consisting of marked instances). It was proved in [9] that the marked PCP is decidable. On the other hand, it was shown in [13] that the PCP is undecidable for instances of prefix morphisms. A morphism is called prefix if no image of a letter is a prefix of the image of another letter.

In this paper we study instances of the PCP which are not necessarily marked, but which can be reduced to instances of the marked PCP. This reduction is due to a uniqueness condition satisfied by the original instance.
We also study the infinite PCP, that is, the infinite solutions of instances (h, g) which satisfy the condition of unique continuation. Two (finite) words u and v are said to be comparable if one is a prefix of the other. Let ω = a_1 a_2 ··· be an infinite word over A, where a_i ∈ A for each index i = 1, 2, .... Note that h(ω) = g(ω) if the morphisms h and g agree on ω, that is, if h(u) and g(u) are comparable for all finite prefixes u of ω. We also say that such an infinite word ω is an infinite solution of the instance I = (h, g). The problem whether or not a given instance of the PCP has an infinite solution is naturally called the infinite PCP, or ωPCP for short. It was shown by Ruohonen [13] that there is no algorithm to determine whether a general instance of the PCP has an infinite solution.

It was proved in [3] that the ωPCP is decidable for marked instances of the PCP. Later, using this result, it was shown in [7] that the ωPCP is decidable for all binary instances. Recently, it was proved in [4] that the ωPCP is undecidable for instances of size 9.

We shall now fix some notation. The empty word is denoted by ε. The length of a word u is denoted by |u|. A word u ∈ A∗ is said to be a prefix of a word v ∈ A∗, denoted by u ≤ v, if v = uw for some w ∈ A∗. Also, if u ≠ ε and w ≠ ε in v = uw, then u is a proper prefix of v, and this is denoted by u < v. Recall that u and v are comparable, denoted by u ⋈ v, if u ≤ v or v ≤ u. The longest common prefix of the words u and v is denoted by u ∧ v. If v = uw, then we also write u = vw⁻¹ and w = u⁻¹v. A word u ∈ A∗ is said to be a suffix of a word v ∈ A∗ if v = wu for some w ∈ A∗. If u ≠ ε and u ≠ v, then the suffix u is proper. Finally, a morphism h is called non-erasing if h(w) = ε implies that w = ε, i.e., no image of a letter is empty for h.
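The notions fixed above can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: letters are single characters, a morphism is represented as a dictionary from letters to their images, and the function names as well as the two-letter sample instance are hypothetical and are not taken from the report.

```python
def image(m, w):
    """Apply the morphism m (dict: letter -> word) to the finite word w."""
    return "".join(m[c] for c in w)

def is_prefix(u, v):
    """u <= v in the notation above: u is a prefix of v."""
    return v.startswith(u)

def comparable(u, v):
    """u and v are comparable if one of them is a prefix of the other."""
    return is_prefix(u, v) or is_prefix(v, u)

def common_prefix(u, v):
    """Longest common prefix of u and v."""
    n = 0
    while n < min(len(u), len(v)) and u[n] == v[n]:
        n += 1
    return u[:n]

def is_solution(h, g, w):
    """w is a solution of (h, g) if w is nonempty and h(w) = g(w)."""
    return len(w) > 0 and image(h, w) == image(g, w)

def is_marked(m):
    """m is marked if the images of distinct letters begin with distinct letters."""
    firsts = [img[:1] for img in m.values()]
    return "" not in firsts and len(set(firsts)) == len(firsts)

def agree_on_prefixes(h, g, w):
    """Check that h(u) and g(u) are comparable for every prefix u of the finite word w."""
    return all(comparable(image(h, w[:n]), image(g, w[:n]))
               for n in range(len(w) + 1))

# Hypothetical sample instance (an assumption made for illustration only).
h = {"a": "ab", "b": "b"}
g = {"a": "a", "b": "bb"}
```

With this sample instance, `is_solution(h, g, "ab")` holds because both images equal `abb`, and both morphisms are marked in the sense defined above.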

2 Unique block instances

The basic result on which we build the results of this paper is the following; see [9] or [2] for proofs.

Theorem 1. The PCP is decidable for marked instances of any size.

The basic idea of the proofs is the concept of blocks. We aim at a decision method for the PCP, but we start with the following simpler problem:

Problem 1. Given an instance I = (h, g) of the marked PCP, where h, g : A∗ → B∗, and a letter b ∈ B, do there exist x, y ∈ A+ such that h(x) = g(y) and b ≤ h(x)?

We do not look for solutions of h(x) = g(x) here, but only of h(x) = g(y), and we additionally require that h(x) begins with a specific letter b. This problem is known to be decidable for two morphisms in general, the reasoning being that the languages h(A∗) ∩ bB∗ and g(A∗) ∩ bB∗ are regular, and there exist such words x and y if and only if

(h(A∗) ∩ bB∗) ∩ (g(A∗) ∩ bB∗) ≠ ∅,    (1)

and the emptiness problem is decidable for regular languages.

If h(u) = g(v) and h(u′) ≠ g(v′) for all ε < u′ ≤ u and ε < v′ ≤ v with (u′, v′) ≠ (u, v), and b ≤ h(u), then the pair (u, v) is called a block, or a block for the letter b, and u and v are called the block words. Denote by S_b(h, g), or S_b for short, the set of all blocks for the letter b. If (u, v) is a solution of the equation h(x) = g(y), then there exist decompositions u = u_1 u_2 ··· u_k and v = v_1 v_2 ··· v_k of u and v such that (u_i, v_i) ∈ S_{b_i} for some b_i ∈ B, for i = 1, ..., k. Thus taking u = w = v, where w is a solution of the marked instance (h, g), there exists a block decomposition of w,

w = u_1 u_2 ··· u_k = v_1 v_2 ··· v_k,    (2)

where (u_i, v_i) ∈ S_{b_i} for some b_i ∈ B, for i = 1, ..., k. This means that each solution is a concatenation of blocks. For a marked instance (h, g) the block for every letter is unique, and, therefore, every solution has a unique block decomposition; see [9] or [2] for proofs.

Let us now define the first property of unique continuation:

UC1. The instance (h, g), where h, g : A∗ → B∗, of the PCP is called a unique block instance if, for every letter a ∈ A,
1. the block (au, v) is unique, and
2. the block (u, av) is unique,
if they exist.

Note that in (UC1) we assume that every letter a ∈ A is the first letter of at most one block word for h and of at most one block word for g. Clearly, the condition (UC1) implies that h and g are non-erasing. Assume that an instance I satisfies (UC1) and that (u, v) is a block with a ≤ u for a letter a ∈ A. Then we denote β(a) = (u, v).

Example 1. We give an example of a non-marked unique block instance. The instance

      a     b      c      d
h     ab    abc    cccc   b
g     a     babc   cc     bbc

has unique blocks; the blocks are (ab, ab), (c, cc) and (db, b). For example, h(adc^i) ⋈ g(adc^j) and h(aac^i) ⋈ g(abc^j) for all i and j, but there is no block of the form (adc^i, adc^j) or (aac^i, abc^j). Clearly, w = ab is a solution of the PCP and ω = adc^ω is a solution of the ωPCP for this instance.

Note that the instances satisfying the condition that |S_b| ≤ 1 for all b ∈ B form a subclass of the instances satisfying (UC1). We leave the details to the reader.

Now the condition (UC1) ensures the reduction to the marked PCP.

Theorem 2. The PCP is decidable for unique block instances.

Proof. Assume that I = (h, g), where h, g : A∗ → B∗, is a unique block instance. Now define the instance I′ = (h′, g′) of the marked PCP, where h′, g′ : C∗ → A∗ and C = {a ∈ A | β(a) exists}.

We set, for a ∈ C and the (unique) block β(a) = (u, v),

h′(a) = u and g′(a) = v.

Clearly, h′ and g′ are marked by condition (UC1). Note that a ≤ h′(a) for all a ∈ C. We prove that I′ has a solution if and only if I has. Indeed, assume that I has a solution w, and let u_1 u_2 ··· u_k = w = v_1 v_2 ··· v_k be its block decomposition, where (u_i, v_i) = β(a_i). The block decomposition is unique, since the first block (u_1, v_1) is unique for the letter a_1 ≤ w, and so on. Now, for w′ = a_1 a_2 ··· a_k,

h′(w′) = u_1 u_2 ··· u_k = w = v_1 v_2 ··· v_k = g′(w′),

and w′ is a solution of I′. For the other direction, assume that I′ has a solution w′ = a_1 a_2 ··· a_k. Now w = h′(w′) = g′(w′) is a solution of I, since by the definition of I′,

h(h′(w′)) = h(u_1 u_2 ··· u_k) = g(v_1 v_2 ··· v_k) = g(g′(w′)),

where (u_i, v_i) = β(a_i) for letters a_i ∈ C, and h(u_i) = g(v_i) for all i. The claim follows by Theorem 1.

The proof of Theorem 2 uses exactly the same technique as the algorithm for the marked PCP in [9]. Indeed, for an instance I = (h, g) of the marked PCP, the successor I′ = (h′, g′) is built as in the proof of Theorem 2. This successor construction is iterated and a successor sequence I^(i) = (h^(i), g^(i)) is defined. The conclusion is that this sequence is ultimately periodic, that is, there exist numbers n and d such that I^(i) = I^(i+d) for all i ≥ n. Finally, there exists a solution beginning with a letter a for I if and only if h^(i)(a) = a = g^(i)(a) for all i ≥ n. See [9] for a more detailed study and proofs.

Note that for an instance I of the marked PCP the letters for which a minimal solution exists can be detected, and the minimal solution for each letter is unique, that is,

E_min(I) = E(I) \ E(I)² = {w_1, w_2, ..., w_k | w_i is the minimal solution for the letter a_i}.

Now by the proof of Theorem 2, we obtain

Corollary 1. Let I = (h, g) be a unique block instance of the PCP and assume that the domain alphabet is A. The following sets can be effectively found:

1. S = {a ∈ A | there exists a solution w for I, a ≤ w},
2. E_min(I), which is a finite marked set.

Marked here means that every element of E_min(I) begins with a different letter.

In order to make use of Theorem 2, we must be able to detect the unique block instances. Therefore, we need to prove

Theorem 3. It is decidable whether or not an instance (h, g) of the PCP is a unique block instance.

Proof. We establish a procedure for deciding whether or not an instance is a unique block instance. Let I = (h, g), where h, g : A∗ → B∗, be an instance of the PCP. For a letter a ∈ A, construct the minimal deterministic finite automaton for the regular language H_a = h(aA∗) ∩ g(A∗). This can be done by the usual tricks in the theory of finite automata, by first defining automata for the languages h(aA∗) and g(A∗), and then using the construction in [10] for intersection. The minimal automaton can be found with the so-called Method of Quotients in [10]. Let 𝒜 be the resulting automaton. Now I satisfies condition 1 in (UC1) for the letter a only if 𝒜 is not branching, that is, there is a unique path from the initial state to the final state in 𝒜, reading a word w_a ∈ H_a. Indeed, then H_a = w_a∗. We still need to check that the word au such that h(au) = w_a is unique, but this can be checked simply by trying all possible coverings of w_a by images of letters under h. Now, condition 1 in (UC1) is satisfied only if au is unique. We still need to decide whether or not condition 2 holds, but this can be done similarly to the uniqueness check of au, for all w_a; that is, we check that the words v such that g(v) = w_a are unique. Finally, we need to check that all these (now unique) words v for the words w_a begin with different letters.

Now the property (UC1) does not help in the ωPCP, since no reduction to the marked infinite PCP can be established. Indeed, a unique block instance can have an infinite solution which has no block decomposition and is still not ultimately periodic.

For the ωPCP we need a stronger unique continuation condition in order to obtain decidability.
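Returning to Example 1 and the construction in the proof of Theorem 2, the following Python sketch checks the three blocks listed in the example against the definition of a block and builds the instance I′ = (h′, g′) from them. It is only an illustration: letters are assumed to be single characters, the blocks are taken from Example 1 rather than computed, and the helper names `image`, `is_block`, `is_marked` and `successor` are hypothetical.

```python
def image(m, w):
    """Apply the morphism m (dict: letter -> word) to the word w."""
    return "".join(m[c] for c in w)

def is_block(h, g, u, v):
    """(u, v) is a block if h(u) = g(v) and no pair of nonempty prefixes
    (u', v') other than (u, v) itself satisfies h(u') = g(v')."""
    if not u or not v or image(h, u) != image(g, v):
        return False
    for i in range(1, len(u) + 1):
        for j in range(1, len(v) + 1):
            if (i, j) != (len(u), len(v)) and image(h, u[:i]) == image(g, v[:j]):
                return False
    return True

def is_marked(m):
    """Images of distinct letters begin with distinct letters."""
    firsts = [img[:1] for img in m.values()]
    return "" not in firsts and len(set(firsts)) == len(firsts)

def successor(blocks):
    """Build (h', g') with h'(a) = u and g'(a) = v for each block (u, v), a <= u."""
    h1, g1 = {}, {}
    for u, v in blocks:
        a = u[0]
        assert a not in h1, "the block for each first letter must be unique"
        h1[a], g1[a] = u, v
    return h1, g1

# The instance and the blocks of Example 1.
h = {"a": "ab", "b": "abc", "c": "cccc", "d": "b"}
g = {"a": "a", "b": "babc", "c": "cc", "d": "bbc"}
blocks = [("ab", "ab"), ("c", "cc"), ("db", "b")]

h1, g1 = successor(blocks)   # the marked instance I' over C = {a, c, d}
w1 = "a"                     # a solution of I'
w = image(h1, w1)            # the corresponding solution of I, here "ab"
```

In this sketch neither h nor g is marked, while both h′ and g′ are, and the solution a of I′ is mapped to the solution ab of I mentioned in Example 1.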

3 Unique continuation instances

UC2. The instance (h, g), where h, g : A∗ → B∗, of the PCP is called a unique continuation instance if, for u, v ∈ A∗ with h(u) < g(v) or g(v) < h(u), there exists at most one letter x such that h(ux) ⋈ g(v) or h(u) ⋈ g(vx), respectively.

First of all, we have the following.

Theorem 4. It is decidable whether or not an instance of the PCP is a unique continuation instance.

Proof. Let I = (h, g), where h, g : A∗ → B∗, be an instance of the PCP. We define the following procedure called Continuation. The input is (a, h, g) for a ∈ A:

(1) Set i = 1, x_1 = a and y_1 = ε, S_g = S_h = ∅.
(2) If h(x_i) = g(y_i), then return Unique, case 1.
(3) Else if g(y_i) < h(x_i), then if s_i = g(y_i)⁻¹h(x_i) ∈ S_h, return Unique, case 2. Else set S_h := S_h ∪ {s_i}. If the letter b such that g(y_i b) ⋈ h(x_i) is unique, then set x_{i+1} = x_i, y_{i+1} = y_i b and i = i + 1, GOTO (2). If no such b exists, return No block. Else return Not unique.
(4) Else if h(x_i) < g(y_i), then if s_i = h(x_i)⁻¹g(y_i) ∈ S_g, return Unique, case 2. Else set S_g := S_g ∪ {s_i}. If the letter b such that h(x_i b) ⋈ g(y_i) is unique, then set y_{i+1} = y_i, x_{i+1} = x_i b and i = i + 1, GOTO (2). If no such b exists, return No block. Else return Not unique.

Now the instance satisfies condition (UC2) if and only if, for all a ∈ A, the procedure Continuation returns Unique for both inputs (a, h, g) and (a, g, h).

The following theorem follows, since a unique continuation instance is a unique block instance. Indeed, the procedure Continuation can be transformed into a procedure which also determines the unique block, simply by returning the pair (x_i, y_i) if the procedure stops in command line (2). Note that the words s_i in the procedure Continuation are called suffix overflows.

Theorem 5. The PCP is decidable for unique continuation instances.

The difference between unique block instances and unique continuation instances is that for a unique block instance we may find several letters b for an overflow in command lines (3) and (4), but the equality h(x_i) = g(y_i) is eventually satisfied for only one choice of the letter b. In a unique continuation instance, on the other hand, the choice of the next letter is always deterministic according to the overflow.
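The command lines (1) to (4) of the procedure Continuation can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: letters are assumed to be single characters, both morphisms are dictionaries over the same alphabet, the function returns the current pair (x_i, y_i) together with the verdict, and the two small test instances `h1, g1` and `h2, g2` are hypothetical. The instance `h_ex, g_ex` is the one from Example 1.

```python
def image(m, w):
    """Apply the morphism m (dict: letter -> word) to the word w."""
    return "".join(m[c] for c in w)

def comparable(u, v):
    """One of u, v is a prefix of the other."""
    return u.startswith(v) or v.startswith(u)

def continuation(a, h, g):
    """Follow command lines (1)-(4) for the input (a, h, g).
    Returns (verdict, x, y) for the pair (x, y) reached when stopping."""
    alphabet = sorted(h)
    x, y = a, ""                          # (1)
    seen_h, seen_g = set(), set()         # the sets S_h and S_g of overflows
    while True:
        hx, gy = image(h, x), image(g, y)
        if hx == gy:                      # (2)
            return ("unique, case 1", x, y)
        if hx.startswith(gy):             # (3): g(y) is a proper prefix of h(x)
            s = hx[len(gy):]              # suffix overflow of h
            if s in seen_h:
                return ("unique, case 2", x, y)
            seen_h.add(s)
            cands = [b for b in alphabet if comparable(image(g, y + b), hx)]
            if not cands:
                return ("no block", x, y)
            if len(cands) > 1:
                return ("not unique", x, y)
            y += cands[0]
        else:                             # (4): h(x) is a proper prefix of g(y)
            assert gy.startswith(hx)      # invariant: the images stay comparable
            s = gy[len(hx):]              # suffix overflow of g
            if s in seen_g:
                return ("unique, case 2", x, y)
            seen_g.add(s)
            cands = [b for b in alphabet if comparable(image(h, x + b), gy)]
            if not cands:
                return ("no block", x, y)
            if len(cands) > 1:
                return ("not unique", x, y)
            x += cands[0]

# Hypothetical test instances (assumptions for illustration).
h1, g1 = {"a": "ab", "b": "b"}, {"a": "a", "b": "bb"}    # block (ab, ab) for a
h2, g2 = {"a": "ab", "b": "bb"}, {"a": "a", "b": "bb"}   # overflow b repeats
# The instance of Example 1.
h_ex = {"a": "ab", "b": "abc", "c": "cccc", "d": "b"}
g_ex = {"a": "a", "b": "babc", "c": "cc", "d": "bbc"}
```

For the instance of Example 1 and the letter a, the sketch stops with the verdict "not unique": after the overflow b, both g(b) = babc and g(d) = bbc continue the comparison. In this sketch, termination relies on the observation that every overflow is a suffix of the image of a single letter, so only finitely many overflows can occur before one repeats.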
Using this determinism we are able to prove that the ωPCP is decidable for unique continuation instances. The proof uses the idea of the proof of the next theorem, which was proved in [3].

Theorem 6. The ωPCP is decidable for marked instances.

We are ready to prove our main theorem on the ωPCP.

Theorem 7. The ωPCP is decidable for unique continuation instances.

Proof. Assume that I = (h, g), where h, g : A∗ → B∗, is an instance of the ωPCP, and I is a unique continuation instance. First of all, assume that E(I) = ∅, since for any nonempty w ∈ E(I), w^ω is an infinite solution and we are done. The infinite solutions of I are of two types: either they have a block decomposition, or they do not have a block decomposition. We prove that the infinite solutions of both types can be detected. Note that we say that an infinite solution ω ∈ A^ω has a block decomposition if

ω = u_1 u_2 ··· = v_1 v_2 ···,    (3)

and (u_i, v_i) is a block for some letter a_i for i = 1, 2, ....

Assume that I′ = (h′, g′) is constructed as in the proof of Theorem 2. It is easy to prove that there exists an infinite solution ω with the block decomposition (3) for I if and only if ω′ = a_1 a_2 ··· is an infinite solution of the instance I′ of the marked ωPCP. Since the marked ωPCP is decidable (see [3]), the solutions with block decomposition can be detected. Note that, for any infinite solution ω′ of the instance I′, h′(ω′) is an infinite solution of I.

The harder case is that of the infinite solutions without block decomposition. Assume that ω is such an infinite solution, that is,

ω = u_1 u_2 ··· u_k ω_1 = v_1 v_2 ··· v_k ω_2,    (4)

where (u_i, v_i) are blocks for i = 1, ..., k, ω_1, ω_2 ∈ A^ω, and k is maximal. The maximality here means that ω_1 and ω_2 do not have block words as prefixes. We shall describe a procedure which detects these infinite solutions.

Since there is no block as a prefix of ω_1 and ω_2, it is necessary that they both begin with letters for which no block exists. Otherwise, by unique continuation, there would be a whole block. Clearly, such letters are those for which the procedure Continuation returns "Unique, case 2". In this case, a suffix overflow appears again, and, since the instance is a unique continuation instance, the overflows necessarily appear cyclically. Assume that (x_i, y_i) = (u, u′) is the pair at which the first repeated overflow appears for the first time, and (x_j, y_j) = (uv, u′w) is the pair at which the same overflow appears for the second time. It is immediate that h(uv^ω) = g(u′w^ω). Therefore, ω_1 = uv^ω and ω_2 = u′w^ω for some letters b ≤ u and c ≤ u′ such that, for any block (x, y), b is not a prefix of x and c is not a prefix of y.

We conclude that all possible ω_1 and ω_2 can be determined. For each letter that disappears, that is, for which there is no block, we construct the words ω_1 and ω_2 if they exist. Note that we also check whether or not ω_1 = ω_2, which would immediately imply that ω_1 is an infinite solution.
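The checks on ultimately periodic words used at this point, such as h(uv^ω) = g(u′w^ω) or ω_1 = ω_2, can be carried out on finite prefixes. The following Python sketch is one possible way to do this and is not taken from the report: it relies on the standard observation that two ultimately periodic words with preperiod lengths p_1, p_2 and period lengths q_1, q_2 are equal if and only if they agree on their first max(p_1, p_2) + lcm(q_1, q_2) letters. The function names are hypothetical, letters are assumed to be single characters, and the morphisms are assumed to be non-erasing.

```python
from math import gcd

def image(m, w):
    """Apply the morphism m (dict: letter -> word) to the word w."""
    return "".join(m[c] for c in w)

def up_prefix(u, v, n):
    """First n letters of the ultimately periodic word u v^omega (v nonempty)."""
    assert v, "the period must be nonempty"
    word = u
    while len(word) < n:
        word += v
    return word[:n]

def up_equal(u1, v1, u2, v2):
    """Decide whether u1 v1^omega = u2 v2^omega by comparing prefixes of
    length max(|u1|, |u2|) + lcm(|v1|, |v2|)."""
    p = max(len(u1), len(u2))
    q = len(v1) * len(v2) // gcd(len(v1), len(v2))
    n = p + q
    return up_prefix(u1, v1, n) == up_prefix(u2, v2, n)

def images_agree(h, g, u, v, u2, w):
    """Check h(u v^omega) = g(u2 w^omega) for non-erasing h and g."""
    return up_equal(image(h, u), image(h, v), image(g, u2), image(g, w))

# The instance of Example 1 and its infinite solution a d c^omega.
h = {"a": "ab", "b": "abc", "c": "cccc", "d": "b"}
g = {"a": "a", "b": "babc", "c": "cc", "d": "bbc"}
```

For the instance of Example 1, `images_agree(h, g, "ad", "c", "ad", "c")` confirms that h and g agree on the infinite word adc^ω. The bound on the prefix length is chosen for simplicity and is not claimed to be optimal.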

What we still need to prove is that, for all a_1, the block part in (4) can be effectively found. Note that we know that the number k is finite, since, otherwise, there would be an infinite solution with block decomposition beginning with a_1, which would already have been detected in the first case. We construct the words u_1 ··· u_k and v_1 ··· v_k using the same idea as in the procedure Continuation. Clearly, there exists a solution (4) if and only if either (u_1 ··· u_k)⁻¹(v_1 ··· v_k) = cz or (v_1 ··· v_k)⁻¹(u_1 ··· u_k) = bz for some word z, according to whether |u_1 ··· u_k| < |v_1 ··· v_k| or not.

Our algorithm works as Continuation for (a_1, h, g), but the sequence (x_i, y_i) is constructed so that at each step we check that x_i ⋈ y_i. If at some step i the words x_i and y_i are not comparable, there is no infinite solution for a_1. The same holds if Continuation returns "No block", that is, no next letter for x_i or y_i exists and h(x_i) ≠ g(y_i). Now if Continuation stops in the case h(x_i) = g(y_i) (and x_i ⋈ y_i), we have that x_i = y_i dz or y_i = x_i dz for some letter d and word z. In the first case, if there is a block word for g beginning with d, we set x_{i+1} = x_i and y_{i+1} = y_i d, and continue according to the procedure Continuation. Otherwise, there is no block word for g beginning with d, and x_i = u_1 ··· u_k and y_i = v_1 ··· v_k. In the other case, where y_i = x_i dz, we reason similarly. Note that dz ≠ ε, since E(I) = ∅. Since there is no infinite solution with block decomposition, this algorithm necessarily stops.

Finally, we need to check whether or not for the letter d there exist infinite words ω_1 = uv^ω and ω_2 = u′w^ω as above, and that u_1 u_2 ··· u_k uv^ω = v_1 v_2 ··· v_k u′w^ω. These words are ultimately periodic, and therefore this check can be done in a trivial way.

The structure of the solutions of the marked infinite PCP was studied in [8], and the infinite solutions of the unique continuation instances have the same structure.
Indeed, it was proved that the set of infinite solutions of a marked instance I is of the form

E_min(I)^ω ∪ E_min(I)∗(P ∪ F),    (5)

where P is a finite set of ultimately periodic words, and F is a finite set of morphic images of fixed points of D0L systems. In the proof of Theorem 7 it was proved that the solutions with block decomposition are morphic images of solutions of a marked instance, and the solutions without block decomposition are ultimately periodic. Since the morphic images of the ultimately periodic words in P are ultimately periodic and the morphic images of the elements of F are again of the type of F, we obtain the following theorem.

Theorem 8. The infinite solutions of an instance with the unique continuation property have the structure of (5).

Finally, we give another uniqueness property for the instances of the PCP.

UC3. The instance (h, g), where h, g : A∗ → B∗, of the PCP is called a unique equality continuation instance if, for u ∈ A∗ and a, b ∈ A, h(ua) ⋈ g(ua) and h(ub) ⋈ g(ub) imply that either h(u) = g(u) or a = b.

We leave the following two questions as open problems: Is the PCP decidable for unique equality continuation instances? Is it decidable whether or not an instance of the PCP satisfies the property (UC3)? Note that for a unique equality continuation instance I, E_min(I) is finite and marked as well, and the set of infinite solutions of I has the structure of (5).

References

[1] A. Ehrenfeucht, J. Karhumäki, and G. Rozenberg. The (generalized) Post Correspondence Problem with lists consisting of two words is decidable. Theoret. Comput. Sci., 21:119–144, 1982.
[2] V. Halava. The Post Correspondence Problem for Marked Morphisms. PhD thesis, Department of Math., Univ. of Turku. TUCS Dissertations no. 37, 2002.
[3] V. Halava and T. Harju. Infinite solutions of the marked Post Correspondence Problem. In J. Karhumäki, W. Brauer, H. Ehrig and A. Salomaa, editors, Formal and Natural Computing, volume 2300 of Lecture Notes in Comput. Sci., pages 57–68. Springer-Verlag, 2002.
[4] V. Halava and T. Harju. Infinite Post Correspondence is Undecidable for Instance of Size 9. Submitted, also as TUCS Tech. Report 6??, 2005.
[5] V. Halava, T. Harju, and M. Hirvensalo. Generalized Post correspondence problem for marked morphisms. Internat. J. Algebra Comput., 10(6):757–772, 2000.
[6] V. Halava, T. Harju, and M. Hirvensalo. Binary (generalized) Post Correspondence Problem. Theoret. Comput. Sci., 276:183–204, 2002.
[7] V. Halava, T. Harju, and J. Karhumäki. Decidability of the binary infinite Post Correspondence Problem. Discrete Appl. Math., 130:521–526, 2003.
[8] V. Halava, T. Harju, and J. Karhumäki. The Structure of Infinite Solutions of Marked and Binary Post Correspondence Problems. Theory Comput. Syst., to appear.
[9] V. Halava, M. Hirvensalo, and R. de Wolf. Marked PCP is decidable. Theoret. Comput. Sci., 255(1-2):193–204, 2001.
[10] M. V. Lawson. Finite Automata. Chapman & Hall, 2004.
[11] Y. Matiyasevich and G. Sénizergues. Decision problems for semi-Thue systems with a few rules. Theoret. Comput. Sci., 330(1):145–169, 2005.
[12] E. Post. A variant of a recursively unsolvable problem. Bull. of Amer. Math. Soc., 52:264–268, 1946.
[13] K. Ruohonen. Reversible machines and Post's correspondence problem for biprefix morphisms. Elektron. Informationsverarb. Kybernet. (EIK), 21(12):579–595, 1985.

Lemminkäisenkatu 14 A, 20520 Turku, Finland | www.tucs.fi

University of Turku • Department of Information Technology • Department of Mathematics

Åbo Akademi University • Department of Computer Science • Institute for Advanced Management Systems Research

Turku School of Economics and Business Administration • Institute of Information Systems Sciences

ISBN 952-12-1527-5 ISSN 1239-1891
