This is the preprint submitted to IJFCS. It was accepted on 14 Apr 2017.

ON M-EQUIVALENCE AND STRONG M-EQUIVALENCE FOR PARIKH MATRICES

GHAJENDRAN POOVANANDRAN AND WEN CHEAN TEH

Abstract. Strong M-equivalence was introduced as an order-independent alternative to M-equivalence for Parikh matrices. This paper further studies the notions of M-equivalence and strong M-equivalence. Certain structural properties of M-equivalent ternary words are presented and then employed to (partially) characterize pairs of ternary words that are ME-equivalent (i.e. obtainable from one another by certain elementary transformations). Finally, a sound rewriting system for determining strong M-equivalence is obtained for the ternary alphabet.

1. Introduction

The Parikh matrix mapping [8] is an extension of the classical Parikh mapping [10]. Parikh matrices are widely investigated due to their applicability in studying subword occurrences in words (for example, see [9, 12, 13]). However, not every word is uniquely determined by its Parikh matrix. Two words are said to be M-equivalent if and only if they share the same Parikh matrix. This gives rise to the injectivity problem, which asks for a natural characterization of M-equivalent words. Despite receiving significant interest (for example, see [?, 1–7, 14–21]), the problem has a satisfactory solution only for the binary alphabet [1, 5, 7], and it remains elusive even for the ternary alphabet.

One defining property of Parikh matrices is their dependency on the ordering of the underlying alphabet. Consequently, two words may be M-equivalent with respect to one ordered alphabet but not M-equivalent with respect to another ordered alphabet with the same underlying alphabet. Thus, the notion of strong M-equivalence was introduced in [19] to eradicate this undesirable property. As in the case of M-equivalence, a natural characterization of strong M-equivalence remains an open problem.

In this paper, we develop new results pertaining to M-equivalence and strong M-equivalence. The remainder of this paper is structured as follows. Section 2 provides the basic terminology and preliminaries. Section 3 presents a structural characterization of M-equivalent ternary words. Furthermore, M-equivalent ternary words that are obtainable by certain elementary transformations are partially characterized. In Section 4, a sound rewriting system is developed to determine when two ternary words are strongly M-equivalent. Our conclusion follows after that.

2010 Mathematics Subject Classification. 68R15, 68Q45, 05A05.
Key words and phrases. Parikh matrices, injectivity problem, M-equivalence, strong M-equivalence.


2. Subwords and Parikh Matrices

Our exposition in this section is largely self-contained, and the reader is referred to [11] for a comprehensive treatment of formal languages.

Let $\Sigma$ be a (finite) alphabet. The set of all words over $\Sigma$ is denoted by $\Sigma^*$, and $\lambda$ is the unique empty word. In the literature, we often deal with alphabets that have a total ordering assigned to them (i.e. ordered alphabets). For example, if $a_1 < a_2 < \cdots < a_s$, then we may write $\Sigma = \{a_1 < a_2 < \cdots < a_s\}$. Conversely, if $\Sigma = \{a_1 < a_2 < \cdots < a_s\}$ is an ordered alphabet, we shall term $\{a_1, a_2, \ldots, a_s\}$ the underlying alphabet. For convenience, we shall frequently abuse notation and use $\Sigma$ to denote both the ordered alphabet and its underlying alphabet. For $\Sigma = \{a_1 < a_2 < \cdots < a_s\}$, we denote by $a_{i,j}$ the word $a_i a_{i+1} \cdots a_j$ for every $1 \le i \le j \le s$.

Suppose $\Gamma \subseteq \Sigma$. The projective morphism $\pi_\Gamma : \Sigma^* \to \Gamma^*$ is defined by
\[
\pi_\Gamma(a) =
\begin{cases}
a, & \text{if } a \in \Gamma, \\
\lambda, & \text{otherwise.}
\end{cases}
\]
We may write $\pi_{a,b}$ for $\pi_{\{a,b\}}$.

Definition 2.1. Suppose $\Sigma$ is an alphabet and $v, w \in \Sigma^*$.
(1) We say that $v$ is a scattered subword (or simply subword) of $w$ if and only if there exist $x_1, x_2, \ldots, x_n, y_0, y_1, \ldots, y_n \in \Sigma^*$, with some of them possibly being empty, such that $v = x_1 x_2 \cdots x_n$ and $w = y_0 x_1 y_1 \cdots y_{n-1} x_n y_n$.
(2) We say that $v$ is a factor of $w$ if and only if there exist $x, y \in \Sigma^*$ such that $w = xvy$.

For $v, w \in \Sigma^*$, the number of occurrences of the word $v$ as a subword of $w$ is denoted by $|w|_v$. Two occurrences of $v$ as a subword of $w$ are considered different if they differ in the position of at least one letter. For example, $|ababac|_{abc} = 3$. By convention, $|w|_\lambda = 1$ for all $w \in \Sigma^*$.

For $k \ge 1$, we denote by $\mathcal{M}_k$ the set of all $k \times k$ upper triangular matrices with nonnegative integral entries and unit diagonal. Clearly, $\mathcal{M}_k$ constitutes a monoid under matrix multiplication.

Definition 2.2. Suppose $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ is an ordered alphabet. The Parikh matrix mapping with respect to $\Sigma$, denoted by $\Psi_\Sigma$, is the morphism $\Psi_\Sigma : \Sigma^* \to \mathcal{M}_{k+1}$ defined by $\Psi_\Sigma(a_q) = (m_{i,j})_{1 \le i,j \le k+1}$, where $m_{i,i} = 1$ for all $1 \le i \le k+1$, $m_{q,q+1} = 1$, and all other entries of the matrix $\Psi_\Sigma(a_q)$ are zero. Any matrix of the form $\Psi_\Sigma(w)$ for $w \in \Sigma^*$ is termed a Parikh matrix.

Theorem 2.3. [8] Suppose $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ is an ordered alphabet and $w \in \Sigma^*$. The matrix $\Psi_\Sigma(w) = (m_{i,j})_{1 \le i,j \le k+1}$ has the following properties:
(1) $m_{i,j} = 0$ for all $1 \le j < i \le k+1$;
(2) $m_{i,i} = 1$ for all $1 \le i \le k+1$;
(3) $m_{i,j+1} = |w|_{a_{i,j}}$ for $1 \le i \le j \le k$.
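The subword counts $|w|_v$ that appear in Theorem 2.3 can be computed mechanically. The following is a minimal sketch, not taken from the paper: the function name `count_subword` is a hypothetical helper, and the dynamic-programming scheme is a standard way to count scattered-subword occurrences.

```python
def count_subword(w: str, v: str) -> int:
    """Return |w|_v, the number of occurrences of v as a scattered subword of w.

    Hypothetical helper for illustration; not part of the paper.
    """
    # ways[j] = number of ways to embed the prefix v[:j] into the part of w read so far.
    # The empty prefix embeds in exactly one way, matching the convention |w|_lambda = 1.
    ways = [1] + [0] * len(v)
    for letter in w:
        # Traverse j from right to left so that the current letter of w
        # extends only embeddings that ended strictly before it.
        for j in range(len(v), 0, -1):
            if v[j - 1] == letter:
                ways[j] += ways[j - 1]
    return ways[len(v)]


print(count_subword("ababac", "abc"))  # 3, as in the example after Definition 2.1
print(count_subword("ababac", ""))     # 1, by the convention for the empty word
```

The count runs in time proportional to $|w| \cdot |v|$, so it avoids enumerating occurrences one by one.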


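Example 2.4. Consider the ordered alphabet $\Sigma = \{a < b < c\}$. Then, the Parikh matrix of the word $bacab$ with respect to $\Sigma$ can be computed as follows:
\[
\begin{aligned}
\Psi_\Sigma(bacab) &= \Psi_\Sigma(b)\,\Psi_\Sigma(a)\,\Psi_\Sigma(c)\,\Psi_\Sigma(a)\,\Psi_\Sigma(b) \\
&= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\cdots
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix} 1 & 2 & 2 & 0 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\end{aligned}
\]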

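The computation in Example 2.4 can be reproduced with a short sketch. This is an illustration under stated assumptions rather than the authors' code: `parikh_matrix` is a hypothetical helper, the ordered alphabet is passed as a string whose letters are listed in increasing order, and the matrix product is carried out incrementally instead of forming each generator matrix.

```python
def parikh_matrix(w: str, alphabet: str) -> list[list[int]]:
    """Return the Parikh matrix of w with respect to the ordered alphabet.

    `alphabet` lists the letters in increasing order, e.g. "abc" for {a < b < c}.
    Hypothetical helper for illustration; not part of the paper.
    """
    size = len(alphabet) + 1
    index = {letter: q for q, letter in enumerate(alphabet)}
    # Start from the identity matrix, which is the Parikh matrix of the empty word.
    m = [[int(i == j) for j in range(size)] for i in range(size)]
    for letter in w:
        q = index[letter]
        # Right-multiplying by the generator matrix of this letter
        # (identity plus a single 1 just above the diagonal in row q)
        # adds column q to column q + 1 (0-based indices).
        for i in range(size):
            m[i][q + 1] += m[i][q]
    return m


for row in parikh_matrix("bacab", "abc"):
    print(row)
# [1, 2, 2, 0]
# [0, 1, 2, 1]
# [0, 0, 1, 1]
# [0, 0, 0, 1]
```

Two words can then be tested for M-equivalence with respect to a given ordering by comparing the matrices returned for the same `alphabet` argument.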
For the ordered alphabet $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$, the dual ordered alphabet (denoted by $\Sigma^o$) is $\Sigma^o = \{a_k < a_{k-1} < \cdots < a_1\}$.

Theorem 2.5 (Duality of M-equivalence). [16] Suppose $\Sigma$ is an ordered alphabet and $w, w' \in \Sigma^*$. If $\Psi_\Sigma(w) = \Psi_\Sigma(w')$, then $\Psi_{\Sigma^o}(w) = \Psi_{\Sigma^o}(w')$.

3. Properly M-equivalent ternary words

The following notions were first formally established in [7].

Definition 3.1. Suppose $\Sigma$ is an ordered alphabet and $w, w' \in \Sigma^*$. We say that $w$ and $w'$ are M-equivalent, denoted by $w \equiv_M w'$, if and only if $\Psi_\Sigma(w) = \Psi_\Sigma(w')$.

Note that in earlier works on the injectivity problem (see [1, 3, 4]), the term amiable is used instead of M-equivalent.

The following rules are named Rule E1 and Rule E2, as they are elementary in deciding whether two words are M-equivalent. Suppose $\Sigma = \{a_1 < a_2 < \cdots < a_s\}$ and $w, w' \in \Sigma^*$.

Rule E1. If $w = x a_i a_j y$ and $w' = x a_j a_i y$ for some $x, y \in \Sigma^*$ and $|i - j| \ge 2$, then $w \equiv_M w'$.

Rule E2. If $w = x a_j a_{j+1} y a_{j+1} a_j z$ and $w' = x a_{j+1} a_j y a_j a_{j+1} z$ for some $x, y, z \in \Sigma^*$ such that $|y|_{a_{j-1}} = |y|_{a_{j+2}} = 0$, then $w \equiv_M w'$.

Definition 3.2. Suppose $\Sigma$ is an ordered alphabet and $w, w' \in \Sigma^*$.
(1) We say that $w$ is 1-equivalent to $w'$, denoted by $w \equiv_1 w'$, if and only if $w'$ can be obtained from $w$ by finitely many applications of Rule E1.
(2) We say that $w$ is elementarily matrix equivalent (ME-equivalent) to $w'$, denoted by $w \equiv_{ME} w'$, if and only if $w'$ can be obtained from $w$ by finitely many applications of Rule E1 and Rule E2.

Theorem 3.3. [1, 5] Suppose $\Sigma$ is an ordered binary alphabet and $w, w' \in \Sigma^*$. Then, $w$ and $w'$ are M-equivalent if and only if $w'$ can be obtained from $w$ by finitely many applications of Rule E2.

Unlike the case of the binary alphabet, ME-equivalence does not suffice to characterize M-equivalence for an arbitrary alphabet. For instance, one can see that the M-equivalent words $abcccaba$ and $cbcaaabc$ are not ME-equivalent


with respect to the ordered alphabet $\{a < b < c\}$. This raises the question: to what extent does ME-equivalence characterize M-equivalence? In the spirit of answering this question in detail (for the ternary alphabet), we define the following notion, which was first introduced in [14].

Definition 3.4. Suppose $\Sigma$ is an ordered alphabet. Two M-equivalent words $w, w' \in \Sigma^*$ are properly M-equivalent if and only if $w$ is not ME-equivalent to $w'$.

The following result shows that for the ordered alphabet $\{a < b < c\}$, there exist infinitely many pairs of properly M-equivalent words with $n$ occurrences of $abc$ as a subword in each of them, for every $n \ge 1$.

Proposition 3.5. With respect to $\Sigma = \{a < b < c\}$, the words $w = c^{m-1} b^m a^n b c a^{n(m-1)} b$ and $w' = b c^{m-1} a^n b c b^m a^{n(m-1)}$, where $n \ge 1$ and $m \ge 2$, are properly M-equivalent.

Proof. Observe that
\[
\Psi_\Sigma(w) = \Psi_\Sigma(w') =
\begin{pmatrix}
1 & mn & n(m+1) & n \\
0 & 1 & m+2 & m+1 \\
0 & 0 & 1 & m \\
0 & 0 & 0 & 1
\end{pmatrix},
\]
thus $w \equiv_M w'$. Consider $w'' \in \Sigma^*$ such that $w'' \equiv_1 w'$; then $w''$ is of the form $b v b c b^m a^{n(m-1)}$ for some $v \in \{a, c\}^*$. Observe that Rule E2 cannot be applied anywhere on $w''$. Thus $w$ cannot be obtained from $w'$ by finitely many applications of Rule E1 and Rule E2, and we conclude that $w$ is not ME-equivalent to $w'$. $\square$

Suppose $\Sigma = \{a < b < c\}$ and $w, w' \in \Sigma^*$. It was shown in [19, Theorem 23] that if $|w|_{abc} = |w'|_{abc} = 0$, then $w \equiv_M w'$ if and only if $w \equiv_{ME} w'$. On the other hand, for the case of $|w|_{abc} = |w'|_{abc} = 1$, more complex characterizations of M-equivalence and ME-equivalence have been provided in [19, Theorem 26]. In the remaining part of this section, we develop a partial generalization of those results for the case of $|w|_{abc} = |w'|_{abc} \in \{2, 3\}$.

Definition 3.6. Suppose $\Sigma$ is any alphabet and $v, w \in \Sigma^*$. The $v$-core of $w$, denoted by $\operatorname{core}_v(w)$, is the unique subword $w'$ of $w$ such that $w'$ is the subword of the shortest length which satisfies $|w'|_v = |w|_v$.

Remark 3.7. The uniqueness of $\operatorname{core}_v(w)$ has been proven in [22], where $\operatorname{core}_v(w)$ was originally defined differently from the version provided here.

Proposition 3.8. Suppose $\Sigma = \{a < b < c\}$ and $w \in \Sigma^*$ with $|w|_{abc} \ge 1$. Then, $w \equiv_1 u \operatorname{core}_{abc}(w) v$ for some unique $u \in \{b, c\}^*$ and $v \in \{a, b\}^*$.

Proof. Let $b_F$ and $b_L$ denote the first and the last $b$ in $\operatorname{core}_{abc}(w)$ respectively. Then $w$ is of the form $w_1 w_2 b_F w_3 b_L w_4 w_5$ for some $w_1 \in \{b, c\}^*$, $w_2 \in \{a, c\}^*$, $w_3 \in \Sigma^*$, $w_4 \in \{a, c\}^*$, and $w_5 \in \{a, b\}^*$. Furthermore, $\operatorname{core}_{abc}(w) = a^p b_F w_3 b_L c^q$ where $p = |w_2|_a$ and $q = |w_4|_c$. (In case $|\operatorname{core}_{abc}(w)|_b = 1$, we have $b_F = b_L$ and $w_3 = \lambda$.) By applications of Rule E1 on $w_2$ and $w_4$, we can form from $w$ the 1-equivalent word $u a^p b_F w_3 b_L c^q v$ for some $u \in \{b, c\}^*$ and $v \in \{a, b\}^*$. To see that $u$ and $v$ are unique, note that any word of the form $W_1 b_F w_3 b_L W_2$ that is 1-equivalent to $w$ satisfies the property that $W_1 \equiv_1 w_1 w_2$ and $W_2 \equiv_1$


$w_4 w_5$. Assume $w \equiv_1 u \operatorname{core}_{abc}(w) v$ and $w \equiv_1 u' \operatorname{core}_{abc}(w) v'$. By transitivity of 1-equivalence, $u a^p \equiv_1 u' a^p$ and thus $\pi_{b,c}(u a^p) = \pi_{b,c}(u' a^p)$. Since $u, u' \in \{b, c\}^*$, it follows that $u = u'$. Similarly, it can be shown that $v = v'$. $\square$

Theorem 3.9. Suppose $\Sigma = \{a < b < c\}$ and $w, w' \in \Sigma^*$. Let $u, u' \in \{b, c\}^*$ and $v, v' \in \{a, b\}^*$ be the unique words such that $w \equiv_1 u \operatorname{core}_{abc}(w) v$ and $w' \equiv_1 u' \operatorname{core}_{abc}(w') v'$. Let $\alpha = |u|_b - |u'|_b$, $\beta = |v|_b - |v'|_b$, and $\gamma = |\operatorname{core}_{abc}(w)|_b - |\operatorname{core}_{abc}(w')|_b$. Then, $w$ and $w'$ are M-equivalent if and only if   0 α+γ 0 0 0 and Ψ{b
