On various classes of infinite words obtained by iterated mappings


JEAN-JACQUES PANSIOT (1)

ABSTRACT. We define sets of infinite words generated by various classes of iterated mappings. We show that every infinite word generated by an extended tag system can also be generated by an ε-free tag system. We give a full inclusion graph and several closure properties for the sets of infinite words considered. We investigate some extensions of tag systems using iterated sequential mappings.

1. Introduction and preliminaries

In this paper we study various classes of infinite words obtained by iteration. These words have been studied for a long time [Thue], especially for their unavoidable regularities [Lothaire, chap. 2], [Berstel 84]. We deal mostly with inclusion and closure properties of the classes we consider. Some results are given without proof, since they appear (in French) in a previous paper [Pansiot 83]. We assume that the reader is familiar with basic notions from formal language theory [Hopcroft and Ullman], [Rozenberg and Salomaa].

The simplest way to generate an infinite word is to iterate a morphism. Let $g : X^* \to X^*$ be a morphism prolongable in $x_0 \in X$, that is, such that $g(x_0) = x_0 u$ with $u \in X^+$. Then, for $i \ge 1$, we have

$$g^{i+1}(x_0) = g^i(x_0)\, g^i(u)$$

(1) J.-J. PANSIOT, Université Louis-Pasteur, Centre de Calcul de l'Esplanade, 7, rue René Descartes, 67084 Strasbourg Cedex, France. This text was typeset by the Laboratoire de Typographie Informatique of the Université Louis-Pasteur, Strasbourg, using the STRATEC software; the input file was then processed by software installed on the SM 90 computer. The graphical output device used is a TOSHIBA P 1351 printer, with a resolution of 180 dots per inch.


and the sequence $(g^i(x_0))_{i \ge 0}$ defines a unique word, in general infinite, denoted by $g^\omega(x_0)$. Note that this way of generating words can be seen both as sequential (add $g^i(u)$ at the end) or parallel (replace each letter $x$ by $g(x)$).

One can classify morphisms by the length of images of letters [Ehrenfeucht et al]. A morphism $g$ is uniform (with modulus $m$) if $|g(x)| = m$ for all $x$. If $m$ equals 1, $g$ is literal (or a coding). If $|g(x)| \ge 2$ for all $x$, $g$ is called (everywhere) growing. Finally, if $|g(x)| \ge 1$ for all $x$, $g$ is ε-free (non-erasing). We denote by $M_u^\omega$, $M_g^\omega$, $M_\varepsilon^\omega$, $M^\omega$ the sets of infinite words generated by iterating morphisms that are respectively uniform, growing, ε-free or arbitrary.

This way of generating infinite words is quite rigid, because there is nothing equivalent to non-terminals for grammars. As a consequence, the different classes mentioned above are not closed even under very simple operations like removing the first letter. Because of this, more powerful methods have been introduced.
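The definitions above can be illustrated with a minimal Python sketch. It assumes that letters are single characters and that a morphism is stored as a dictionary from letters to strings; the names `iterate_morphism`, `classify` and `THUE` are illustrative choices, not notation from the paper.

```python
def iterate_morphism(g, x0, min_len):
    """Return a prefix (at most min_len letters) of g^omega(x0).

    g maps each letter (a one-character string) to its image.
    g must be prolongable in x0: g(x0) = x0 u with u non-empty.
    """
    if not g[x0].startswith(x0) or len(g[x0]) < 2:
        raise ValueError("g is not prolongable in x0")
    word = x0
    while len(word) < min_len:
        nxt = "".join(g[c] for c in word)  # parallel view: replace x by g(x)
        if nxt == word:                    # the limit word is finite
            break
        word = nxt
    return word[:min_len]


def classify(g):
    """Return the set of properties of g among those defined in the text."""
    lengths = [len(image) for image in g.values()]
    props = set()
    if min(lengths) >= 1:
        props.add("e-free")
        if len(set(lengths)) == 1:
            props.add("uniform")
    if min(lengths) >= 2:
        props.add("growing")
    if max(lengths) == 1 and min(lengths) == 1:
        props.add("literal")
    return props


# The Thue morphism on two letters, used later in the paper.
THUE = {"a": "ab", "b": "ba"}
print(iterate_morphism(THUE, "a", 16))  # abbabaabbaababba
print(sorted(classify(THUE)))
```

An erasing morphism such as `{"a": "acb", "b": "bca", "c": ""}` gets an empty set of properties from `classify`, which corresponds to the "arbitrary" case.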

A tag system [Cobham] is a 5-tuple $T = \langle X, Y, x_0, g, h\rangle$ where $X$ and $Y$ are finite alphabets, $x_0 \in X$, $g : X^* \to X^*$ is a morphism prolongable in $x_0$, and $h : X^* \to Y^*$ is a coding. The infinite word generated by $T$ (if it exists) is $h(g^\omega(x_0))$. We define uniform, growing, ε-free and arbitrary tag systems, depending on the morphism $g$. The corresponding sets of infinite words are denoted by $C(M_u^\omega)$, $C(M_g^\omega)$, $C(M_\varepsilon^\omega)$ and $C(M^\omega)$. Cobham [Cobham] has given important results concerning the first set.

One attempt to get bigger sets of infinite words is to consider extended tag systems $\langle X, Y, x_0, g, h\rangle$ where $h$ is an arbitrary morphism. We will see that we cannot get new infinite words in this way, and this remains true if $h$ is replaced by a deterministic generalized sequential mapping (d.g.s.m.). A d.g.s.m. is a 6-tuple $D = \langle X, Y, S, s_0, \delta, \sigma\rangle$ where $X$ and $Y$ are finite input and output alphabets, $S$ is a finite set of states, $s_0$ is the initial state, $\delta : S \times X \to S$ is the (deterministic) next-state function and $\sigma : S \times X \to Y^*$ is the output function.

A more drastic step is to consider words generated by iterated d.g.s.m. Assume $D$ is a d.g.s.m. such that $X = Y$ and $\sigma(s_0, x_0)$ starts with $x_0$; then the sequence

$$u_0 = x_0,\ u_1,\ \ldots,\ u_i = \sigma(s_0, u_{i-1}),\ \ldots$$

defines a word, in general infinite, denoted by $D^\omega(x_0)$. This word can also be seen as a fixed-point of $D$ [Bleuzen]. With this device we can construct much more complex words.

In section 2 we give a hierarchy of the various sets of infinite words obtained by extended tag systems. In section 3 we consider closure properties of these sets. Finally in section 4 we give some results concerning iterated d.g.s.m.
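As a rough illustration of how a d.g.s.m. acts on a finite word, the following Python sketch stores the next-state and output functions as dictionaries keyed by (state, letter). The function name `run_dgsm` and the example machine, which erases every third letter of a word over {a, b}, are hypothetical choices made for this sketch and do not come from the paper.

```python
def run_dgsm(delta, sigma, s0, word):
    """Apply a d.g.s.m. to a finite word and return the output word.

    delta[(state, letter)] is the next state,
    sigma[(state, letter)] is the (possibly empty) output string.
    """
    state, out = s0, []
    for x in word:
        out.append(sigma[(state, x)])
        state = delta[(state, x)]
    return "".join(out)


# Hypothetical example: the state counts positions modulo 3 and the
# letter read in state 2 is erased (so this d.g.s.m. is erasing).
ALPHABET = "ab"
delta = {(s, x): (s + 1) % 3 for s in range(3) for x in ALPHABET}
sigma = {(s, x): ("" if s == 2 else x) for s in range(3) for x in ALPHABET}

print(run_dgsm(delta, sigma, 0, "abbabaabb"))  # ababab
```

A morphism is the special case with a single state, which is one way to see why d.g.s.m. are at least as expressive as morphisms.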


2. A hierarchy of tag systems

We say that two (extended) tag systems are equivalent if they generate the same infinite word. In this section we consider the following problem: given an (extended) tag system $T = \langle X, Y, x_0, g, h\rangle$, is it possible to find an equivalent tag system $T' = \langle X', Y, x'_0, g', h'\rangle$ of a simpler (more restricted) kind, for example with $h'$ literal, or with $g'$ growing or uniform, and so on?

Two basic tools for constructing a tag system $T'$ equivalent to $T$ are speed-up [Rozenberg and Salomaa] and alphabet compression. In the first case we consider $T' = T^i = \langle X, Y, x_0, g^i, h\rangle$, and it is clear that $T$ and $T^i$ are equivalent. Alphabet compression is a more powerful transformation where $X'$ corresponds to a set of words $U$ over $X$. More precisely, let $T = \langle X, Y, x_0, g, h\rangle$ be an extended tag system, and $U = \{u_0, u_1, \ldots, u_k\}$ a finite set of words over $X$ such that $u_0 = g^j(x_0)$ for some $j \ge 0$. Assume that for all $u$ in $U$, $g(u) \in U^*$ and $g(u_0) \in u_0 U^+$. Let $X' = \{[u_0], \ldots, [u_k]\}$ be a new alphabet. We define a morphism $g' : X'^* \to X'^*$ as follows. For all $u$ in $U$ we choose a factorization

$$g(u) = u_{i_1} u_{i_2} \cdots u_{i_p}.$$

For $u_0$ we choose $u_{i_1} = u_0$ (this is possible by hypothesis). Then

$$g'([u]) = [u_{i_1}][u_{i_2}] \cdots [u_{i_p}].$$

The morphism $h' : X'^* \to Y^*$ is given by $h'([u]) = h(u)$. From these definitions we have

$$h'(g'^{\omega}([u_0])) = h(g^{\omega}(x_0)),$$

hence $T$ and $T' = \langle X', Y, [u_0], g', h'\rangle$ are equivalent.

The main result of this section is the following:

THEOREM 2.1. -- For every extended tag system $T = \langle X, Y, x_0, g, h\rangle$, where $g$ and $h$ are arbitrary morphisms, one can construct an ε-free tag system $T' = \langle X', Y, x'_0, g', h'\rangle$ equivalent to $T$, where $g'$ is ε-free and $h'$ literal.

The tag system $T'$ can be effectively constructed from $T$ by a series of speed-ups and alphabet compressions, as shown in [Pansiot 83].

EXAMPLE. --

Consider the morphism $g : a \to acb,\ b \to bca,\ c \to \varepsilon$. We have

$$S_1 = g^\omega(a) = acbbcabcaacbbcaacb\ldots$$

This word can be obtained from the famous infinite word of Thue [Thue]

$$S_0 = abbabaabbaababba\ldots,$$


by inserting the letter $c$ after every other letter of $S_0$. Let $u_0 = acb$ and $u_1 = bca$, $X' = \{[u_0], [u_1]\}$. Since $g(u_0) = u_0 u_1$ and $g(u_1) = u_1 u_0$, the morphism

$$g' : [u_0] \to [u_0][u_1], \quad [u_1] \to [u_1][u_0]$$

is the Thue morphism on two letters generating $S_0$ (up to a renaming of letters). Define the morphism $h'$ by $[u_0] \to acb$, $[u_1] \to bca$. The extended tag system $T' = \langle X', Y, [u_0], g', h'\rangle$ is ε-free. Using another alphabet compression, we will get a uniform tag system. Let

$$X'' = \{[u_0]_1, [u_0]_2, [u_0]_3, [u_1]_1, [u_1]_2, [u_1]_3\}$$

and define $g'' : X''^* \to X''^*$ by

$$[u_0]_1 \to [u_0]_1 [u_0]_2, \quad \ldots$$

and $h'' : X''^* \to Y^*$ by

$$[u_0]_1 \to a, \quad [u_0]_2 \to c, \quad \ldots$$

Then

$$h''(g''^{\omega}([u_0]_1)) = acbbcabca\ldots = S_1.$$

Note that $T'' = \langle X'', Y, [u_0]_1, g'', h''\rangle$ is a uniform tag system of modulus 2 generating $S_1$, hence equivalent to an erasing tag system. A result similar to THEOREM 2.1 holds for growing morphisms, and can be stated as follows.
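The example can be checked mechanically on finite prefixes. The Python sketch below recomputes a prefix of $S_1$ in three ways: from the erasing morphism $g$, from the ε-free system $T'$, and from a uniform system of modulus 2. The source text only preserves the first rule of $g''$ and the first two rules of $h''$, so the remaining entries of `G2` and `H2` are an assumption: they are the positional completion in which the three letters of a block $[u]_1 [u]_2 [u]_3$ are sent to the six letters of its image under $g'$. The symbol names (`u0`, `u0_1`, ...) and helper names are illustrative.

```python
def prefix(g, start, n):
    """Prefix (list of at most n symbols) of the word obtained by iterating g on start."""
    word = [start]
    while len(word) < n:
        nxt = [y for x in word for y in g[x]]
        if nxt == word:  # no further growth
            break
        word = nxt
    return word[:n]


def image(h, word):
    """Apply the morphism h (symbol -> string over Y) to a list of symbols."""
    return "".join(h[x] for x in word)


# Erasing morphism g of the example: a -> acb, b -> bca, c -> empty word.
G = {"a": ["a", "c", "b"], "b": ["b", "c", "a"], "c": []}
s1_direct = "".join(prefix(G, "a", 18))

# e-free extended tag system T' obtained by alphabet compression.
G1 = {"u0": ["u0", "u1"], "u1": ["u1", "u0"]}
H1 = {"u0": "acb", "u1": "bca"}
s1_from_t1 = image(H1, prefix(G1, "u0", 6))

# Uniform system of modulus 2; every rule except the first one of G2 and
# the first two of H2 is an ASSUMED completion (see the lead-in).
G2 = {
    "u0_1": ["u0_1", "u0_2"], "u0_2": ["u0_3", "u1_1"], "u0_3": ["u1_2", "u1_3"],
    "u1_1": ["u1_1", "u1_2"], "u1_2": ["u1_3", "u0_1"], "u1_3": ["u0_2", "u0_3"],
}
H2 = {"u0_1": "a", "u0_2": "c", "u0_3": "b", "u1_1": "b", "u1_2": "c", "u1_3": "a"}
s1_from_t2 = image(H2, prefix(G2, "u0_1", 18))

print(s1_direct)  # acbbcabcaacbbcaacb
print(s1_direct == s1_from_t1 == s1_from_t2)
```

Agreement on a finite prefix is of course only a sanity check, not a proof of equivalence.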

THEOREM 2.2. -- For every extended tag system $T = \langle X, Y, x_0, g, h\rangle$ where $g$ is growing and $h$ ε-free, one can construct a tag system $T' = \langle X', Y, x'_0, g', h'\rangle$ equivalent to $T$ and with $g'$ growing and $h'$ literal.

From these results, extended tag systems and erasing morphisms are no more powerful than non-erasing tag systems. The next theorem gives a full inclusion graph for all classes of tag systems introduced in section 1.


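THEOREM 2.3. -- The following (strict) inclusion graph holds, where two sets that are not linked by a sequence of inclusions are incomparable.

$$\begin{array}{ccccccc}
C(M_u^\omega) & \subset & C(M_g^\omega) & \subset & C(M_\varepsilon^\omega) & = & C(M^\omega) \\
\cup & & \cup & & \cup & & \cup \\
M_u^\omega & \subset & M_g^\omega & \subset & M_\varepsilon^\omega & \subset & M^\omega
\end{array}$$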

The equality $C(M_\varepsilon^\omega) = C(M^\omega)$ is a consequence of THEOREM 2.1. All other weak inclusions are straightforward. It remains to be shown that these inclusions are strict and that there are no other inclusions. For this we need infinite words that are in one of these sets but not in some others. A useful notion is the subword complexity (or complexity for short) of an infinite word $S$ or a language $L$. It is the function $f_S$ where $f_S(n)$ is the number of distinct subwords of $S$ of length $n$. Ehrenfeucht, Lee and Rozenberg [Ehrenfeucht et al] have studied the subword complexities of various classes of D0L-languages. Their results can be applied to infinite words generated by iterated morphisms, and can be summarized as follows.

THEOREM 2.4.

a) All words in $M^\omega$ have a complexity bounded by $cn^2$ for some $c$, and there exists a word $S_2$ in $M_\varepsilon^\omega$ with complexity at least $dn^2$ for some $d > 0$.

b) All words in $M_g^\omega$ have a complexity bounded by $cn \log n$ for some $c$, and there exists a word $S_3$ in $M_g^\omega$ with complexity at least $dn \log n$ for some $d > 0$.

c) All words in $M_u^\omega$ have a complexity bounded by $cn$ for some $c$, and there exists a word in $M_u^\omega$ with complexity at least $dn$ for some $d > 0$ (for example $S_0$ from Thue).

For more details concerning subword complexities of infinite words see [Pansiot 84]. From THEOREM 2.4 we see that $S_2$ is in $M_\varepsilon^\omega$ but neither in $M_g^\omega$ nor in $C(M_g^\omega)$. Similarly $S_3$ can be neither in $M_u^\omega$ nor in $C(M_u^\omega)$. It can be shown that $S_1$ as defined previously is in $M^\omega$ but not in $M_\varepsilon^\omega$. Finally the infinite word of Aršon $S_4$ [Aršon] is in $C(M_u^\omega)$ but cannot be generated by iterated morphism [Berstel 79]. From all these properties, THEOREM 2.3 holds.
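To make the notion of subword complexity concrete, here is a small Python sketch that counts the distinct subwords of each length in a finite prefix of the word $S_0$ of Thue. The function names are illustrative. Counting on a finite prefix can only give a lower bound on $f_S(n)$, since a subword may first occur beyond the prefix; for the small lengths used here the prefix is long enough that the values are the true ones.

```python
def thue_prefix(n):
    """Prefix of length n of the Thue word, by iterating a -> ab, b -> ba."""
    rules = {"a": "ab", "b": "ba"}
    word = "a"
    while len(word) < n:
        word = "".join(rules[c] for c in word)
    return word[:n]


def complexity(word, n):
    """Number of distinct subwords (factors) of length n occurring in word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


w = thue_prefix(1024)
values = [complexity(w, n) for n in range(1, 5)]
print(values)  # [2, 4, 6, 10]
```

For instance the value 6 for length 3 reflects the fact that `aaa` and `bbb` never occur in the Thue word.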

3. Closure properties for tag systems

Closure properties by different classes of mappings play an important role in formal language theory. We have the following results.

THEOREM 3.1.

a) The sets $M_u^\omega$, $M_g^\omega$, $M_\varepsilon^\omega$, $M^\omega$ are not closed by morphisms, even literal ones.

b) $C(M_u^\omega)$ is closed by uniform morphisms, but not by growing ones.

c) $C(M_g^\omega)$ is closed by ε-free morphisms, but not by arbitrary ones.


d) $C(M^\omega)$ is closed by arbitrary morphisms.

Proof. -- Part a) is a consequence of THEOREM 2.3. The closure of $C(M_u^\omega)$ by uniform morphisms is given in [Cobham]. To prove that $C(M_u^\omega)$ is not closed by growing morphisms, it is sufficient to find a word $S_5$ generated by an extended tag system $T = \langle X, Y, x_0, g, h\rangle$ such that $g$ is uniform, $h$ growing, but $S_5$ is not in $C(M_u^\omega)$. The following morphisms have the required property [Pansiot 83]:

$$g : x_0 \to x_0 y, \quad y \to z y, \quad z \to z z$$

and

$$h : x_0 \to 10, \quad y \to 10, \quad z \to 0000.$$

They define the infinite word

$$S_5 = 1\,0\,1\,0^5\,1\,0^{13}\,1 \ldots 1\,0^{2^{i+1}-3}\,1 \ldots$$

This proves part b). The set $C(M_g^\omega)$ is closed by ε-free morphisms (this is THEOREM 2.2), but it cannot be closed by arbitrary morphisms, since the closure of $M_u^\omega$, or even $M_g^\omega$, by arbitrary morphisms is equal to $C(M^\omega)$. This settles part c). Finally, $C(M^\omega)$ is closed by arbitrary morphisms by THEOREM 2.1.

Since the biggest class of tag systems, $C(M^\omega)$, is closed by morphisms, one can ask whether or not it is closed by d.g.s.m. The rather surprising result is that $C(M^\omega)$ is closed by arbitrary d.g.s.m., even erasing ones. In fact applying a d.g.s.m. of some kind (uniform, growing, etc.) gives no more power than applying a morphism of the same kind.

THEOREM 3.2. -- The sets $M_u^\omega$, $M_g^\omega$, $M_\varepsilon^\omega$, $M^\omega$, $C(M_u^\omega)$, $C(M_g^\omega)$, $C(M^\omega)$ are closed by d.g.s.m. of some kind (literal, uniform, growing, ε-free, erasing) if and only if they are closed by morphisms of the same kind. Therefore THEOREM 3.1 still holds if the word morphism is replaced by d.g.s.m.

In [Pansiot 83] we give an effective construction of a tag system equivalent to a d.g.s.m. composed with a tag system. The main idea is that the sequence of states taken by a d.g.s.m. applied to an infinite word generated by iterated morphism is quite regular, and can be encoded in the word itself. Note that many operations on words are best thought of as applying a d.g.s.m., for example erasing one letter every 3 letters, or changing every other occurrence of a letter into another letter, and so on. Therefore THEOREM 3.2 gives a good idea of the kind of infinite words that can be constructed by tag systems.
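The word $S_5$ used in the proof of THEOREM 3.1 b) can be recomputed from the two morphisms $g$ and $h$ given there. The Python sketch below writes `x` for the letter $x_0$ (a naming choice made for this sketch), iterates $g$ six times, applies $h$, and extracts the lengths of the runs of 0 between consecutive occurrences of 1.

```python
G = {"x": "xy", "y": "zy", "z": "zz"}      # uniform morphism of modulus 2
H = {"x": "10", "y": "10", "z": "0000"}    # growing morphism

word = "x"
for _ in range(6):
    word = "".join(G[c] for c in word)     # word is now g^6(x), of length 64

s5_prefix = "".join(H[c] for c in word)

# Runs of 0 strictly between two consecutive 1's: drop the empty piece
# before the first 1 and the incomplete run after the last 1.
gaps = [len(run) for run in s5_prefix.split("1")[1:-1]]
print(gaps)  # [1, 5, 13, 29, 61, 125]
```

The run lengths agree with the exponents $2^{i+1} - 3$ for $i = 1, \ldots, 6$, so the gaps between consecutive 1's roughly double at each step.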

An infinite word $S'$ is obtained by finite modification of an infinite word $S$ if $S'$ can be constructed from $S$ by a finite number of insertions and deletions


of letters. In [Pansiot 81] it is shown that any finite modification of the infinite word of Thue, $S_0$, cannot be generated by iterated morphism. Hence none of the four sets $M_u^\omega$, $M_g^\omega$, $M_\varepsilon^\omega$, $M^\omega$ is closed by finite modification. It is known [Cobham] that $C(M_u^\omega)$ is closed by finite modification, and from THEOREM 3.2 it can be seen that $C(M_g^\omega)$ and $C(M^\omega)$ are also closed. Similarly $C(M_u^\omega)$, $C(M_g^\omega)$ and $C(M^\omega)$ are closed under q-block compression and periodic deletion; those are just particular cases of d.g.s.m.

A more complex operation is the product of two infinite words

$$S = s_0 s_1 s_2 \ldots s_i \ldots \quad \text{and} \quad T = t_0 t_1 t_2 \ldots t_i \ldots$$

denoted by

$$S \times T = (s_0, t_0)(s_1, t_1)(s_2, t_2) \ldots (s_i, t_i) \ldots$$

From [Cobham], if $S$ and $T$ can be obtained by uniform tag systems of same modulus, then so is $S \times T$. But in general, $C(M_u^\omega)$ is not closed by product. It can be shown [Pansiot 83] that $M_u^\omega$, $M_g^\omega$, $M_\varepsilon^\omega$, $M^\omega$ are not closed by product. It remains an open question whether or not $C(M^\omega)$ and $C(M_g^\omega)$ are closed by product, although we conjecture they are not.

4. Iterated d.g.s.m.

The set of infinite words obtained by (extended) tag systems, $C(M^\omega)$, is quite restricted since these words have at most quadratic complexity. A natural extension is to consider iterated d.g.s.m. of various kinds: uniform, growing, non-erasing or erasing. The infinite word generated by a growing d.g.s.m. $D$ starting with $x_0$ is also the unique fixed-point of $D$ starting with $x_0$. In [Bleuzen], fixed-points of d.g.s.m. are studied for their complexities. The main result is the following:

THEOREM 4.1. -- Let $S$ be an infinite word generated by iterating a uniform d.g.s.m. of modulus $m$ on $k$ letters, with $s$ states; then the complexity $f_S$ of $S$ verifies

$$f_S(n) < k^2 s m\, n^{1 + (\log s / \log m)}.$$

Moreover one can construct a uniform d.g.s.m. such that $f_S(n) \ge$ [...]

From this result, iterated uniform d.g.s.m. are strictly more powerful than tag systems, but the complexity remains polynomial. In contrast, ε-free d.g.s.m. can generate infinite words with exponential complexity, as can be seen in the following example.


EXAMPLE. -- Consider the d.g.s.m. $D = \langle X = \{0, 1, \#\}, X, \{a, b, c\}, c, \delta, \sigma\rangle$ defined by

$$\delta : \begin{array}{lll}
a, 0 \to b & b, 0 \to b & \\
a, 1 \to a & b, 1 \to b & \\
a, \# \to a & b, \# \to a & c, \# \to a
\end{array}$$

$$\sigma : \begin{array}{lll}
a, 0 \to 1 & b, 0 \to 0 & \\
a, 1 \to 0 & b, 1 \to 1 & \\
a, \# \to 1\# & b, \# \to \# & c, \# \to \#1\#
\end{array}$$

The infinite word $S_6$ generated by $D$ starting at $\#$ is

$$S_6 = \#1\#01\#11\#001\# \ldots \#u_i\# \ldots$$

where $u_i$ is the binary representation of the integer $i$ (most significant bit at right). Therefore $S_6$ has complexity at least $2^n$. Obviously arbitrary d.g.s.m. generate infinite words with at most exponential complexity. Therefore the big jump in complexity occurs between uniform and ε-free d.g.s.m. The next theorem shows that in fact growing d.g.s.m. also generate words of polynomial complexity.

THEOREM 4.2. -- Let $S$ be an infinite word generated by iterating the d.g.s.m. $D = \langle X, X, S, s_0, \delta, \sigma\rangle$ where $\sigma$ is growing. The subword complexity of $S$ verifies

$$f_S(n) < 2 k^2 s\, m_2\, n^{\log(s m_2) / \log m_1},$$

where $m_1$ and $m_2$ are lower and upper bounds for the length of $\sigma(s, x)$ for $s \in S$ and $x \in X$.

Note that if $D$ is uniform of modulus $m$, then $m_1 = m_2 = m$ and the upper bound above is the same as the bound of THEOREM 4.1, up to a constant factor.
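The d.g.s.m. $D$ of the example generating $S_6$ can be simulated directly. The Python sketch below encodes the next-state and output functions as dictionaries (as read from the example's tables; the state $c$ is only ever used on the first letter, so its transitions on 0 and 1 are left undefined) and iterates $u_i = \sigma(c, u_{i-1})$ from $u_0 = \#$. The helper names `step` and `iterate` are illustrative.

```python
DELTA = {("a", "0"): "b", ("a", "1"): "a", ("a", "#"): "a",
         ("b", "0"): "b", ("b", "1"): "b", ("b", "#"): "a",
         ("c", "#"): "a"}
SIGMA = {("a", "0"): "1", ("a", "1"): "0", ("a", "#"): "1#",
         ("b", "0"): "0", ("b", "1"): "1", ("b", "#"): "#",
         ("c", "#"): "#1#"}


def step(word, s0="c"):
    """One application of D to a finite word, starting in state s0."""
    state, out = s0, []
    for x in word:
        out.append(SIGMA[(state, x)])
        state = DELTA[(state, x)]
    return "".join(out)


def iterate(x0, steps):
    """Compute u_steps, where u_0 = x0 and u_i = sigma(c, u_{i-1})."""
    u = x0
    for _ in range(steps):
        u = step(u)
    return u


u8 = iterate("#", 8)
blocks = u8.strip("#").split("#")
print(iterate("#", 4))  # #1#01#11#001#
print(blocks)
```

State `a` behaves like a carry: it flips 1 to 0 until it meets a 0 (or the end marker), so each block between two `#` is the previous block incremented by one, written with the least significant bit first.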

Proof. -- Let $u_0$ be a subword of length $n$ of $S$. Then $u_0$ is a subword of $\sigma(s_1, u_1)$ for some state $s_1$ and some subword $u_1$. Moreover if we choose $u_1$ of minimal length, $m_1(|u_1| - 2) < |u_0|$, that is $|u_1| \le \frac{|u_0| - 1}{m_1} + 2$. If we repeat this process with $u_1, u_2, \ldots$ we get $u_t$ satisfying

$$|u_t| \le \frac{n - 1}{m_1^t} + 2 + \frac{1}{m_1} + \cdots + \frac{1}{m_1^{t-1}}.$$

Now for

$$t_0 = \frac{\log n}{\log m_1} + 1$$


we have $|u_{t_0 - 1}| \le 3$, and for $t > t_0$, we have $|u_t|$