Valuations and Unambiguity of Languages, with Applications to Fractal Geometry

Henning Fernau
Lehrstuhl Informatik für Ingenieure und Naturwissenschaftler, Universität Karlsruhe (TH), D-76128 Karlsruhe

Ludwig Staiger
Lehrstuhl für Informatik II, RWTH Aachen, Ahornstraße 55, D-52056 Aachen

September 22, 1994

Abstract

Valuations, that is, morphisms from $(\Sigma^*, \cdot, e)$ to $((0,\infty), \cdot, 1)$, are a simple generalization of Bernoulli morphisms (distributions, measures) as introduced in [11, 5]. This paper shows that valuations are not only useful within the theory of codes, but also when dealing with ambiguity, especially in regular expressions and contextfree grammars, or for defining outer measures on the space of $\omega$-words which are of some importance for the theory of fractals. These connections yield new formulae to determine the Hausdorff dimension of fractal sets (especially in Euclidean spaces) defined via regular expressions and contextfree grammars. Furthermore, we generalize the classical notion of the entropy of a formal language. This paper is an enhanced version of the one presented at ICALP'94 [17].

Contents

1 Introduction
2 Valuations and Unambiguity
    2.1 Simple Properties of Valuations
    2.2 Unambiguous Operations
    2.3 Unambiguous Regular Expressions
    2.4 Unambiguous Contextfree Grammars
3 $\psi$-Entropy of Languages
    3.1 The $\psi$-Entropy of Regular Languages
    3.2 The $\psi$-Entropy of the Submonoid
4 $\omega$-Languages and Hausdorff Dimension
    4.1 Metric Properties of the Space $(\Sigma_n^\omega, \rho_\psi)$
    4.2 Hausdorff Dimension in $(\Sigma_n^\omega, \rho_\psi)$
    4.3 Hausdorff Dimension of $\omega$-Languages
5 IIFS and Fractal Geometry
    5.1 Iterated Function Systems
    5.2 Calculating Dimensions
    5.3 Some Fractals
6 Conclusions

1 Introduction

Unambiguity is an old theme in formal language theory. In many applications, like compilers, text processors, etc., one is interested in describing words in a unique manner. In this paper, we treat three kinds of unambiguity:

• unambiguous language operations
• unambiguous regular expressions
• unambiguous contextfree grammars

We will characterize these unambiguities with the help of so-called valuations. Our results also permit the calculation of the valuation of a language given by a specific finite description from the finite description itself, circumventing the struggle with the (in general) infinite languages.

On the other hand, fractal geometry is now a budding branch of mathematics with a variety of possible applications [23, 12]. One of the main problems encountered in that field is the determination of the Hausdorff dimension of fractals. We show that, for certain types of fractals described by formal languages, this calculation may be simplified considerably by using the results on unambiguous regular expressions and unambiguous contextfree grammars derived in the second section. Surprisingly, we can give conditions in terms of formal language theory that considerably simplify arguments in topology and measure theory (the mathematical backbones of fractal geometry). In such a way, results from formal language theory (and algebra in a broader sense) contribute to the theory of fractals. The bridges between formal language theory and fractal geometry we are going to build are:

• Languages specifying iterated function systems, thus describing fractals.
• Valuation dimension; this entity corresponds to the similarity dimension encountered with iterated function systems.
• Entropy with respect to a valuation; this entity corresponds to the Besicovitch-Taylor index encountered with iterated function systems [41].
• Special metrizations (induced by valuations) of spaces of $\omega$-words; Hausdorff measure and Hausdorff dimension within these spaces are directly related to these entities within Euclidean spaces.

There have been other approaches using formal languages in order to describe fractals, or, more generally, pictures. Closely related to our work are [44, 29]. Probably the most prominent example is L systems [30]. Interestingly, there are close connections to our language-based approach (see our comments below and the section on controlled iterated function systems in [30]). Additionally, we also find hypergraph-based ideas like collage grammars [19] and cellular automata [45, 40, 42]. One could also mention map 0L systems [30] and chain code picture languages [39].

Further material on the topic is contained in other works of the authors [14, 13, 16, 15, 26, 34, 35, 36, 37].

Conventions: $\subseteq$ and $\subset$ denote inclusion and strict inclusion, respectively. $\mathbb{N}$ is the set of nonnegative integers. The cardinality of a set $S$ is denoted by card $S$. $\Sigma_n = \{1, \dots, n\} \subset \mathbb{N}$ denotes our standard alphabet. $\Sigma$ (without subscript) denotes some at most countable alphabet. A language $L$ is a subset of the word monoid $\Sigma^*$ generated by the alphabet $\Sigma$, where $e$ is the neutral element of the monoid, called the empty word. Mostly, the monoid operation called catenation is just denoted by juxtaposition, sometimes made explicit using $\cdot$ between the words. The monoid generated by the language $L \subseteq \Sigma^*$ is denoted by $L^*$, and the semigroup generated by $L$ is denoted by $L^+$. We also consider $\omega$-languages $F$ over the alphabet $\Sigma$, i.e. sets of one-sided infinite words, $F \subseteq \Sigma^\omega$ for short. In general, if $L \subseteq \Sigma^*$, $L^\omega = \{v_0 \cdot v_1 \cdot v_2 \cdots : \forall i \in \mathbb{N}\,(v_i \in L \setminus \{e\})\}$. Further notions and notations are introduced throughout the text body.

2 Valuations and Unambiguity

We call a monoid morphism $\psi$ mapping from $(\Sigma_n^*, \cdot, e)$ to $((0,\infty), \cdot, 1)$ a valuation. Any valuation can be extended to languages $L \subseteq \Sigma_n^*$ by defining $\psi(L) = \sum_{w \in L} \psi(w)$. As an example, consider the valuation $\psi_n$ defined by $\psi_n(a) = 1/n$ for every $a \in \Sigma_n$. Proofs of the results mentioned in this section can be found in [14, 13, 16].
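The definition can be made concrete with a minimal sketch. It assumes the valuation is written $\psi$, that words are represented as tuples of letters from $\{1, \dots, n\}$, and that the language is finite; the helper names `make_valuation` and `valuation_of_language` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import prod


def make_valuation(letter_values):
    """Build a valuation on words from its values on the letters.

    A word is a tuple of letters; the empty tuple plays the role of e.
    The morphism property psi(uv) = psi(u) * psi(v) holds by construction.
    """
    def psi(word):
        return prod((letter_values[a] for a in word), start=Fraction(1))
    return psi


def valuation_of_language(psi, language):
    """Extend psi to a finite language by summing over its words."""
    return sum((psi(w) for w in language), Fraction(0))


# The uniform valuation psi_n(a) = 1/n from the text, here for n = 3.
n = 3
psi_n = make_valuation({a: Fraction(1, n) for a in range(1, n + 1)})

u, v = (1, 2), (3,)
language = {(1,), (1, 2), (3, 3, 3)}
total = valuation_of_language(psi_n, language)
print(total)  # 1/3 + 1/9 + 1/27 = 13/27
```

Exact rational arithmetic is used so that the morphism property can be checked with equality rather than with a floating-point tolerance.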

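2.1 Simple Properties of Valuations

Almost by definition, we find that $(\Sigma_n^*, 2^{\Sigma_n^*}, \psi)$ is a measure space. Other properties exploit that $\psi$ is a morphism. We list some of them below. Let $K, L$ be languages over $\Sigma_n$ and $\{L_i\}_{i \in I}$ be an at most countable family of languages over $\Sigma_n$.

Determinacy: $\emptyset \neq K \iff 0 < \psi(K)$.
Monotonicity: $K \subseteq L \Rightarrow \psi(K) \le \psi(L) \le \psi(\Sigma_n^*)$.
($\sigma$-)Additivity: Let $\{L_i\}_{i \in I}$ be a family of pairwise disjoint languages over $\Sigma_n$. Then $\psi(\bigcup_{i \in I} L_i) = \sum_{i \in I} \psi(L_i)$.
($\sigma$-)Subadditivity: In general, $\psi(\bigcup_{i \in I} L_i) \le \sum_{i \in I} \psi(L_i)$.
Continuity: Let $(L_i)_{i \in \mathbb{N}}$ be an increasing chain. Then $\psi(\bigcup_{i \in \mathbb{N}} L_i) = \lim_{i \to \infty} \psi(L_i)$.
Subtractivity: $(\psi(K) < \infty \wedge K \subseteq L) \Rightarrow \psi(L \setminus K) = \psi(L) - \psi(K)$.
Multiplication law: $\psi(KL) \le \psi(K)\,\psi(L)$.
Power law: $(\forall m \in \mathbb{N})(\psi(L^m) \le (\psi(L))^m)$.
Star law: $\psi(L^*) = \psi(\bigcup_{m=0}^{\infty} L^m) \le \sum_{m=0}^{\infty} \psi(L^m)$; if $\psi(L) < 1$, then additionally $\psi(L^*) \le \frac{1}{1 - \psi(L)} < \infty$.

Observe that determinacy, subadditivity and multiplication law are quite similar to the properties required for real-valued valuations of fields [43].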

2.2 Unambiguous Operations

Recall that the product $KL$ is unambiguous iff any word in $KL$ is unambiguously decomposable, i.e. for any $w_1, w_1' \in K$ and $w_2, w_2' \in L$ with $w = w_1 w_2 = w_1' w_2'$, we have $w_1 = w_1'$ and $w_2 = w_2'$. The union $K \cup L$ is termed unambiguous iff $K$ and $L$ are disjoint. The star operation $C^*$, $C \subseteq \Sigma_n^+$, is called unambiguous iff $C$ is a code, i.e. for all $(v_1, \dots, v_k) \in C^k$ and for all $(u_1, \dots, u_m) \in C^m$, the equality $u_1 \cdots u_m = v_1 \cdots v_k$ implies $m = k$ and, for all $i \in \Sigma_k$, $u_i = v_i$. The next lemma is easily proved.

Lemma 1 Let $K, L, C \subseteq \Sigma_n^*$ and $\psi$ be a valuation with $\psi(K), \psi(L) < \infty$. Then:

• $\psi(KL) = \psi(K)\,\psi(L)$ iff the product $KL$ is unambiguous.
• $\psi(K \cup L) = \psi(K) + \psi(L)$ iff the union $K \cup L$ is unambiguous.
• If $\psi(C) < 1$, then $\psi(C^*) = \frac{1}{1 - \psi(C)}$ iff $C$ is a code.
• If $C$ is a code, then $\psi(C^*) = \sum_{i=0}^{\infty} (\psi(C))^i$.

Observe that, presupposing the unambiguity of the involved operators (products, unions, stars), we may calculate the valuation of a given language by calculating the valuations of the parts defining the language, and by interpreting concatenation as multiplication, union as addition, and star as an infinite sum. This observation will be exploited in the following.
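The star statement of Lemma 1 can be checked numerically on small examples. The following sketch assumes the valuation is written $\psi$ with $\psi(1) = \psi(2) = 1/3$, represents words as strings over the letters "1" and "2", and truncates the star at word length 20, so the sums are only approximations from below. The sets `code` and `non_code` and all function names are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import prod

VALUES = {"1": Fraction(1, 3), "2": Fraction(1, 3)}


def psi(word):
    """Valuation of a single word (a string of letters)."""
    return prod((VALUES[a] for a in word), start=Fraction(1))


def star_words(language, max_len):
    """All distinct words of language* whose length is at most max_len."""
    seen = {""}
    frontier = {""}
    while frontier:
        new_words = set()
        for w in frontier:
            for x in language:
                wx = w + x
                if len(wx) <= max_len and wx not in seen:
                    seen.add(wx)
                    new_words.add(wx)
        frontier = new_words
    return seen


def truncated_star_valuation(language, max_len=20):
    """Sum of psi over the words of language* up to length max_len."""
    return sum((psi(w) for w in star_words(language, max_len)), Fraction(0))


code = {"1", "21"}      # a prefix code, hence a code
non_code = {"1", "11"}  # not a code: 11 = 1 . 1

# Both languages have valuation 1/3 + 1/9 = 4/9 < 1.
code_total = truncated_star_valuation(code)
non_code_total = truncated_star_valuation(non_code)

print(float(code_total))      # close to 1 / (1 - 4/9) = 1.8
print(float(non_code_total))  # close to 1.5, strictly below 1.8
```

Only the code attains the value $1/(1 - \psi(C))$; for the ambiguous star each word is counted once although it has several factorizations, so the valuation stays strictly below the geometric series.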

2.3 Unambiguous Regular Expressions

In this part, we give conditions under which it is possible to calculate the valuation of a language given by a regular expression from the expression itself. This in turn enables us to characterize a certain class of regular expressions known as (strong) unambiguous expressions. We give a formal definition of regular and unambiguous regular expressions, since, following Brüggemann-Klein [7], we leave out the empty set in the usual definition of regular expressions.

Let $R_n \subseteq (\Sigma_n \cup \{(, ), \cup, *\})^*$ denote the class of regular expressions over $\Sigma_n$. $R_n$ is the smallest language over $\Sigma_n \cup \{(, ), \cup, *\}$ satisfying:

• $\Sigma_n^* \subseteq R_n$.
• If $R_1, R_2 \in R_n$, then $(R_1 R_2)$, $(R_1 \cup R_2)$, and $R_1^*$ lie in $R_n$.

As usual, the language $[R] \subseteq \Sigma_n^*$ described by some regular expression $R$ is defined recursively:

• For $w \in \Sigma_n^*$, let $[w] = \{w\}$.
• If $R_1, R_2 \in R_n$, then $[(R_1 R_2)] = [R_1][R_2]$, $[(R_1 \cup R_2)] = [R_1] \cup [R_2]$, and $[R_1^*] = [R_1]^*$.

Let $UR_n \subseteq (\Sigma_n \cup \{(, ), \cup, *\})^*$ denote the class of unambiguous regular expressions over $\Sigma_n$. $UR_n$ is the smallest language over $\Sigma_n \cup \{(, ), \cup, *\}$ satisfying:

• $\Sigma_n^* \subseteq UR_n$.
• If $R_1, R_2 \in UR_n$, then $(R_1 R_2) \in UR_n$, provided the corresponding language operation $[R_1][R_2]$ is unambiguous.
• If $R_1, R_2 \in UR_n$, then $(R_1 \cup R_2) \in UR_n$, provided the corresponding language operation $[R_1] \cup [R_2]$ is unambiguous.
• If $R_1 \in UR_n$, then $R_1^* \in UR_n$, provided the corresponding language operation $[R_1]^*$ is unambiguous.

Obviously, $UR_n \subset R_n$, and this inclusion is strict, as the example $e^* \in R_n \setminus UR_n$ shows. We develop a characterization of unambiguous regular expressions in terms of valuations in the following. To this end, we inductively define the valuation of a regular expression. Observe that, in this definition, we formally interpret language operations as numerical operations.

Let $\psi : \Sigma_n^* \to (0,\infty)$ be some valuation. The valuation $\psi_R$ of a regular expression $R \in R_n$ is a mapping $\psi_R : R_n \to [0,\infty]$ defined inductively as follows.

• If $w \in \Sigma_n^* \subseteq R_n$, then $\psi_R(w) = \psi(w)$.
• If $R_1, R_2 \in R_n$, then $\psi_R((R_1 R_2)) = \psi_R(R_1)\,\psi_R(R_2)$, $\psi_R((R_1 \cup R_2)) = \psi_R(R_1) + \psi_R(R_2)$ and $\psi_R(R_1^*) = \sum_{i \in \mathbb{N}} (\psi_R(R_1))^i$.

Note that we consider $[0,\infty]$ as a semiring extending the semiring $([0,\infty), +, \cdot, 1, 0)$, defining especially $0 \cdot \infty = \infty \cdot 0 = 0$. By the multiplication, subadditivity and star laws, we have immediately $\psi([R]) \le \psi_R(R)$ for any regular expression $R \in R_n$ and any valuation $\psi : \Sigma_n^* \to (0,\infty)$. By induction, Lemma 1 leads to the following corollary.

Corollary 2 Let $\psi : \Sigma_n^* \to (0,\infty)$ be some valuation and $R \in UR_n$. Then, $\psi_R(R) = \psi([R])$.

Considering the example $R = (1^* \cup 2^*)^* \in R_2$, $\psi(1) = \psi(2) = 1/3$, we find

$\psi_R(R) = \sum_{i=0}^{\infty} \Bigl( \sum_{j=0}^{\infty} (\tfrac{1}{3})^j + \sum_{j=0}^{\infty} (\tfrac{1}{3})^j \Bigr)^i = \infty$;

on the other hand $[R] = \Sigma_2^*$, yielding $\psi([R]) = \sum_{i=0}^{\infty} (\tfrac{2}{3})^i < \infty$. Hence, $R \notin UR_2$.

Theorem 3 Let $\psi : \Sigma_n^* \to (0,\infty)$ be some valuation and $R \in R_n$ such that $\psi_R(R) < \infty$. Then, $R$ is unambiguous iff $\psi_R(R) = \psi([R])$.

On the other hand, if we know two regular expressions $R_1, R_2 \in R_n$ describing the same language, then $R_1$ is ambiguous provided we have a valuation $\psi$ such that $\psi_R(R_1) > \psi_R(R_2)$.
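The inductive definition of the valuation of a regular expression translates directly into a recursive evaluator. The sketch below is only an illustration: the valuation is written $\psi$ with $\psi(1) = \psi(2) = 1/3$, expressions are encoded as nested tuples (an encoding chosen here, not the paper's syntax), and the ambiguous example of the text is assumed to read $(1^* \cup 2^*)^*$. The variant $(1^* 2^*)^*$ is evaluated as well, since both describe $\Sigma_2^*$.

```python
from fractions import Fraction
from math import inf, prod

VALUES = {"1": Fraction(1, 3), "2": Fraction(1, 3)}


def psi_R(expr):
    """Evaluate a regular expression numerically.

    Catenation becomes a product, union a sum, and star the geometric
    series sum_i x**i, which is 1/(1-x) for x < 1 and infinite otherwise.
    """
    kind = expr[0]
    if kind == "word":
        return prod((VALUES[a] for a in expr[1]), start=Fraction(1))
    if kind == "cat":
        return psi_R(expr[1]) * psi_R(expr[2])
    if kind == "union":
        return psi_R(expr[1]) + psi_R(expr[2])
    if kind == "star":
        x = psi_R(expr[1])
        return 1 / (1 - x) if x < 1 else inf
    raise ValueError(f"unknown node {kind!r}")


one, two = ("word", "1"), ("word", "2")
star_one, star_two = ("star", one), ("star", two)

ambiguous_union = ("star", ("union", star_one, star_two))  # (1* U 2*)*
ambiguous_cat = ("star", ("cat", star_one, star_two))      # (1* 2*)*
unambiguous = ("star", ("union", one, two))                # (1 U 2)*

print(psi_R(ambiguous_union))  # inf
print(psi_R(ambiguous_cat))    # inf
print(psi_R(unambiguous))      # 3
```

All three expressions describe $\Sigma_2^*$, whose valuation is $\sum_i (2/3)^i = 3$. Only the last expression reproduces that value, in line with Theorem 3 and with the remark that a strictly larger numerical value witnesses ambiguity.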

2.4 Unambiguous Contextfree Grammars

In this part, we examine contextfree grammars. The question is again: when is it possible to compute the valuation of a language numerically, directly from its finite description, here a contextfree grammar? From our preceding thoughts, the basic idea of transferring a contextfree grammar into a numeric system should be clear: think of a contextfree grammar as a system of equations involving catenation and union, interpret the language variables (nonterminals) as numerical variables, interpret the terminals by the given valuation, and finally interpret the catenation as multiplication and the union as addition.

In the following, we assume that the reader is familiar with some basics of formal power series, see e.g. [21]: If $R$ is a semiring (with zero 0 and one 1) and $\Sigma_n^*$ is a monoid, the set of formal power series, i.e. mappings $\Sigma_n^* \to R$, is denoted by $R\langle\langle \Sigma_n^* \rangle\rangle$. A series $s \in R\langle\langle \Sigma_n^* \rangle\rangle$ is written formally as $\sum_{w \in \Sigma_n^*} \sigma_w w$, where $\sigma_w = s(w)$. The support supp$(s)$ of a series $s$ equals $\{w : \sigma_w \neq 0\}$. The set of polynomials, i.e. formal power series having finite support, is written $R\langle \Sigma_n^* \rangle$. The characteristic series $\chi_L$ of a language $L \subseteq \Sigma_n^*$ is defined by $\chi_L = \sum_{w \in L} w$. We consider the semirings $\mathbb{N}$, $[0,\infty)$, and $[0,\infty]$ together with the usual addition and multiplication as semiring operations. Note that any valuation itself can be viewed as a formal power series in $[0,\infty]\langle\langle \Sigma_n^* \rangle\rangle$. Another connection between valuations and formal power series is the next.

Lemma 4 Any valuation $\psi : \Sigma_n^* \to (0,\infty)$ induces a semiring morphism

$\psi : [0,\infty]\langle\langle \Sigma_n^* \rangle\rangle \to [0,\infty]$ by $\sum_{w \in \Sigma_n^*} \sigma_w w \mapsto \sum_{w \in \Sigma_n^*} \sigma_w \psi(w)$.

To the set of productions $P$ of a contextfree grammar $G = (X, \Sigma_n, P, x_1)$, $X = \{x_1, \dots, x_m\}$, there corresponds a system of equations of the form

$x_i = p_i$, $1 \le i \le m$, $p_i \in \mathbb{N}\langle (\Sigma_n \cup X)^* \rangle$ (1)

where, for every $w \in (\Sigma_n \cup X)^*$, the coefficient of $w$ in $p_i$ lies in $\{0, 1\}$ and equals 1 iff $x_i \to w \in P$. It is well-known that $G$ is unambiguous iff the solution of the $\mathbb{N}$-algebraic system (1) is given by $(\chi_{L_1}, \dots, \chi_{L_m})$, where $\chi_{L_i}$ denotes the characteristic series of the language $L_i$, which in turn is generated by the original grammar $G$, taking as start symbol $x_i$. $\psi$ can be extended to a semiring morphism $\psi : ([0,\infty]\langle\langle \Sigma_n^* \rangle\rangle)\langle\langle X^* \rangle\rangle \to [0,\infty]\langle\langle X^* \rangle\rangle$. This provides us with a mathematically sound description of the numerical system corresponding to Eq. (1), namely $x_i = \psi(p_i)$. Elaborating the above observations, we obtain the next lemma.¹

Lemma 5 Let $\psi : \Sigma_n^* \to (0,\infty)$ be some valuation. Let $L_i \subseteq \Sigma_n^+$ be generated by an unambiguous contextfree grammar $G_i = (\{x_1, \dots, x_m\}, \Sigma_n, P, x_i)$ in Chomsky normal form. Let $x_i = p_i$ be the corresponding $\mathbb{N}$-algebraic system. Then, $(\psi(L_1), \dots, \psi(L_m))$ is a solution of the system of equations $x_i = \psi(p_i)$.

¹ The Chomsky normal form restriction in the following assertions simplifies the proofs contained in [14] but is not necessary.

Is it possible to calculate $\psi(L_i)$ directly from the numerical system $x_i = \psi(p_i)$? An answer is given in the next theorem, which is proved in [14].

Theorem 6 Let $\psi : \Sigma_n^* \to (0,\infty)$ be some valuation. Let $L$ be an $e$-free unambiguous contextfree language. Let $x_i = p_i$, $i = 1, \dots, m$, be an $\mathbb{N}$-algebraic system of equations in Chomsky normal form with coefficients in $\{0, 1\}$ such that the series $\chi_L$ is the first component of the uniquely determined solution $s = (s_1, \dots, s_m)$ of the equation system $x_i = p_i$. We assume furthermore that, without loss of generality, any variable $x_i \in X \setminus \{x_1\}$ is reachable from $x_1$ in some derivation.

• If $\psi(L) < \infty$ and if the corresponding system with valuation $x_i = \psi(p_i)$ has exactly one solution $b = (b_1, \dots, b_m) \in [0,\infty)^m$, then $b_1 = \psi(s_1) = \psi(L)$.
• If the corresponding system with valuation $x_i = \psi(p_i)$ has no solution in $[0,\infty)^m$, then $\psi(L) = \infty$.

On the other hand, it is possible to obtain a criterion for unambiguity of grammars using valuations.

Theorem 7 Let $G = (X, \Sigma_n, P, x_1)$ be a contextfree grammar in Chomsky normal form such that from $x_1$ any nonterminal $x_i \in X \setminus \{x_1\}$ is reachable, inducing the $\mathbb{N}$-algebraic system $x_i = p_i$ with the solution $s = (s_1, \dots, s_m)$. Let $\psi : \Sigma_n^* \to (0,\infty)$ be a valuation such that $\psi(\mathrm{supp}(s_1)) < \infty$. Assume that the corresponding system with valuation $x_i = \psi(p_i)$ has exactly one solution $b = (b_1, \dots, b_m)$ with $b_i = \psi(\mathrm{supp}(s_i))$ in $(0,\infty)^m$. Then, $G$ is unambiguous.

In Theorem 6, the case when the system with valuation $x_i = \psi(p_i)$ has more than one solution is left open. We tackle this problem in the following. In order to do this, we need some definitions and facts about complete partially ordered sets (cpo's), cf. [18]. A partial order $\sqsubseteq$ on a set $D$ is a reflexive, anti-symmetric and transitive binary relation. A subset $M \subseteq D$ is called directed if, for every finite subset $u \subseteq M$, there is an upper bound $x \in M$ for $u$. $(D, \sqsubseteq)$ is complete (a cpo), if (1) every directed subset $M \subseteq D$ has a least upper bound $\bigsqcup M$ and (2) there is a least element $\bot_D$ in $D$. If $(D, \sqsubseteq)$ is a partially ordered set, $(D^m, \sqsubseteq)$ is so, too, defining $a \sqsubseteq b \iff \forall i \in \Sigma_m\,(a(i) \sqsubseteq b(i))$. If $(D, \sqsubseteq)$ is a cpo, the $m$th power $(D^m, \sqsubseteq)$ is also complete. In our case, we consider the following three cpo's and the derived $m$th powers.

• $D_1 = ([0,\infty], \le)$
• $D_2 = (2^{\Sigma_n^*}, \subseteq)$ with least element $\emptyset$.
• $D_3 = ((\mathbb{N} \cup \{\infty\})\langle\langle \Sigma_n^* \rangle\rangle, \le)$ with $s \le t$ iff, for all words $w \in \Sigma_n^*$, $s(w) \le t(w)$.

Given cpo's $D$ and $E$, a function $f : D \to E$ is monotone if $f(x) \sqsubseteq f(y)$ whenever $x \sqsubseteq y$. If $f$ is monotone and $f(\bigsqcup M) = \bigsqcup f(M)$ for every directed $M$, then $f$ is said to be cpo-continuous. For example, any valuation $\psi$, viewed as a map from $(2^{\Sigma_n^*}, \subseteq)$ to $([0,\infty], \le)$, is cpo-continuous. The basic lemma we need is the next.

Lemma 8 If $D$ is a cpo and $f : D \to D$ is continuous, then there is a point $\mathrm{fix}(f) \in D$ such that $\mathrm{fix}(f) = f(\mathrm{fix}(f))$ and $\mathrm{fix}(f) \sqsubseteq x$ for any $x \in D$ such that $x = f(x)$. In other words, $\mathrm{fix}(f)$ is the least fixed point of $f$. Moreover, $\mathrm{fix}(f) = \bigsqcup \{f^k(\bot_D) : k \in \mathbb{N}\}$.

Observe that any contextfree grammar $G = (X, \Sigma_n, P, x_1)$, written as a system of equations $x_i = \tilde{p}_i$, may be viewed as a cpo-continuous mapping $\tilde{p} = (\tilde{p}_1, \dots, \tilde{p}_m)$ on $D_2^m$, where $\tilde{p}_j$ is applied to $(L_1, \dots, L_m)$ by substituting any occurrence of the variable $x_i$ in $\tilde{p}_j$ by the corresponding language $L_i$. Fortunately, the usual definition of a language (or a tuple of languages) derived from a contextfree grammar coincides with the least fixed point semantics just described. Similarly, the system of equations of formal power series obtained from a contextfree grammar has a least fixed point semantics which again coincides with the classical interpretation. Just note that our usual assumption that the corresponding grammar is $e$-free and in Chomsky normal form only serves to guarantee that the least solution does not have $\infty$ anywhere as a coefficient.

Moreover, the mappings $D_2 \to D_3$, $L \mapsto \chi_L$, and supp $: D_3 \to D_2$ are cpo-continuous. Furthermore, consider the sequences $L^{(0)} = (L_{0,1}, \dots, L_{0,m}) = (\emptyset, \dots, \emptyset)$, $L^{(1)} = (L_{1,1}, \dots, L_{1,m})$ with $L_{1,j} = \tilde{p}_j(L^{(0)})$, $L^{(2)}$, etc. on the language side, and $s^{(0)} = (s_{0,1}, \dots, s_{0,m}) = (0, \dots, 0)$, $s^{(1)} = (s_{1,1}, \dots, s_{1,m})$ with $s_{1,j} = p_j(s^{(0)})$, $s^{(2)}$, etc. on the side of formal power series, i.e. considering the set of equations $x_i = p_i$. We observe $L^{(k)} = \mathrm{supp}(s^{(k)})$ for every $k$ by induction, and hence by cpo-continuity, the language tuple generated by a contextfree grammar is just the support of the least solution of the corresponding equation of formal power series. The other `direction' $\chi_{L^{(k)}} = s^{(k)}$ is valid if and only if $G$ is unambiguous.

Finally, the system of equations with valuation $x_i = \psi(p_i)$ obtained from a system of equations $x_i = p_i$ of formal power series has a least fixed point semantics. By induction, it is easy to see that its approximating sequence coincides with $(\psi(s^{(k)}))_k$, where $s^{(k)}$ is defined as above, yielding the approximating sequence of the system of equations of formal power series. Since $\psi : D_3 \to D_1$ is cpo-continuous, $(\psi(s_1), \dots, \psi(s_m))$ is the least solution of the $D_1^m$-system $x_i = \psi(p_i)$, where $(s_1, \dots, s_m)$ is the least formal power series solution of the system $x_i = p_i$. We summarize our observation in the following theorem.

Theorem 9 Let $\psi : \Sigma_n^* \to (0,\infty)$ be some valuation. Let $L$ be an $e$-free unambiguous contextfree language. Let $x_i = p_i$, $i = 1, \dots, m$, be an $\mathbb{N}$-algebraic system of equations in Chomsky normal form with coefficients in $\{0, 1\}$ such that the series $\chi_L$ is the first component of the uniquely determined solution $s = (s_1, \dots, s_m)$ of the equation system $x_i = p_i$. If $(b_1, \dots, b_m)$ is the least solution of the $D_1^m$-system $x_i = \psi(p_i)$, then $b_i = \psi(s_i)$, since $(s_1, \dots, s_m)$ is the least solution of the $D_3^m$-system $x_i = p_i$. Especially, $b_1 = \psi(L)$.
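The least fixed point characterization suggests a simple numerical procedure: iterate the numeric system starting from the bottom element $(0, \dots, 0)$. The sketch below is a hedged illustration, not the authors' algorithm. It assumes the valuation is written $\psi$, uses the hypothetical example grammar $S \to AT \mid AB$, $T \to SB$, $A \to a$, $B \to b$ (an unambiguous grammar in Chomsky normal form for $\{a^k b^k : k \ge 1\}$), sets $\psi(a) = \psi(b) = 1/2$, and works with floating-point numbers and a fixed number of iterations.

```python
def least_solution(productions, terminal_values, iterations=200):
    """Approximate the least solution of x_i = psi(p_i) by Kleene iteration.

    productions maps each nonterminal to a list of right-hand sides:
    a 1-tuple (terminal,) or a 2-tuple (nonterminal, nonterminal).
    """
    x = {nt: 0.0 for nt in productions}  # bottom element (0, ..., 0)
    for _ in range(iterations):
        # All components are updated simultaneously from the previous tuple.
        x = {
            nt: sum(
                terminal_values[rhs[0]] if len(rhs) == 1
                else x[rhs[0]] * x[rhs[1]]
                for rhs in rules
            )
            for nt, rules in productions.items()
        }
    return x


productions = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}
terminal_values = {"a": 0.5, "b": 0.5}

solution = least_solution(productions, terminal_values)
print(solution["S"])
```

For this grammar the value can be cross-checked by hand: every word $a^k b^k$ has valuation $(1/4)^k$, so the language generated from $S$ has valuation $\sum_{k \ge 1} (1/4)^k = 1/3$, which is the limit the iteration approaches in the `S` component.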

3 $\psi$-Entropy of Languages

It was shown in [35], [36], [26] that the entropy of languages introduced by Chomsky and Miller (cf. [20]) is a useful tool for the calculation of the Hausdorff dimension of certain subsets of the Cantor space $\Sigma_n^\omega$ or of the Euclidean space $\mathbb{R}^d$. In this section we derive a generalization of this concept which, as we shall see in the subsequent sections, leads to similar calculation formulae for the Hausdorff dimension of subsets of $(\Sigma_n^\omega, \rho_\psi)$ and can be used to calculate the Hausdorff dimension of certain subsets of $\mathbb{R}^d$, thereby generalizing results of [1], [2], [25]. We call this generalization entropy of languages with respect to a valuation (short: $\psi$-entropy of a language). The aim of this section is to show that the properties of the entropy of languages mentioned in [33] and [36, Section 2] hold as well for this generalization.

Let $\psi : \Sigma_n^* \to (0,\infty)$ be a valuation. We call $\psi^s(L) := \sum_{w \in L} (\psi(w))^s$ the $s$-dimensional valuation of the language $L \subseteq \Sigma_n^*$. In particular, we allow valuations having $\psi(w) \ge 1$ for some words. For a fixed language $L \subseteq \Sigma_n^*$ we consider the $s$-dimensional valuation $\psi^s$ as a function $\psi^s(L) : [0,\infty) \to [0,\infty]$. We summarize some properties of the function $\psi^s(L)$.

Property 10 Let $L \subseteq \Sigma_n^*$ and $\psi : \Sigma_n^* \to (0,\infty)$ be a valuation, and let $\psi^s(L) < \infty$ for some $s \in [0,\infty)$. Then there is an $\alpha \in [0,\infty)$ such that $\psi^s(L) = \infty$ for $s < \alpha$, $\psi^s(L) < \infty$ for $s > \alpha$, and the function $\psi^s(L)$ is continuous on $(\alpha,\infty)$ and satisfies $\lim_{s \downarrow \alpha} \psi^s(L) = \psi^\alpha(L)$.² If, moreover, $\psi(w) < 1$ for all $w \in L$, then the function $\psi^s(L)$ is strictly decreasing and $\lim_{s \to \infty} \psi^s(L) = 0$.

² permitting the value $\infty$ for $\psi^\alpha(L)$

Before proceeding to the proof we would like to add two remarks.

Remark 1. The point $\alpha$ defined above is called a "change-over-point" of the function $\psi^s(L)$.

Remark 2. It is possible to construct valuations $\psi$ and languages $L$ such that $\psi(w) < 1$ for $w \in L$ and nevertheless $\psi^s(L) = \infty$ for all $s \in [0,\infty)$. In the sequel, however, we are not interested in such pathological cases. If $\psi(a) < 1$ for any $a \in \Sigma_n$, then there is a finite change-over-point of the function $\psi^s(L)$ for any $L \subseteq \Sigma_n^*$.

Proof. If $\psi^s(L) < \infty$ for some $s \in [0,\infty)$, then the set $\{w : \psi(w) \ge 1 \wedge w \in L\}$ is finite. Since our assertion is obvious for finite languages, we may split $L$ into a disjoint union $L = L' \cup L''$ where $L''$ is finite and contains $\{w : \psi(w) \ge 1 \wedge w \in L\}$. Thus in virtue of $\psi^s(L) = \psi^s(L') + \psi^s(L'')$ it remains to verify the assertion for functions $\psi^s(L')$ where $\emptyset \neq L' \subseteq \{w : \psi(w) < 1 \wedge w \in L\}$.

First observe that, once $\psi^s(L')$ is finite, it is strictly decreasing and positive. Thus $\alpha := \inf\{s : \psi^s(L') < \infty\}$. Now let $\sigma \ge \alpha$ such that $\psi^\sigma(L') < \infty$. As the infinite series $\psi^\sigma(L') = \sum_{w \in L'} \psi^\sigma(w)$ is approximated by its finite sums $\psi^\sigma(U_i)$ where $U_i := \{w : w \in L' \wedge |w| \le i\}$, for every $\delta > 0$ there is an $i \in \mathbb{N}$ such that $\delta > \psi^\sigma(L') - \psi^\sigma(U_i) = \psi^\sigma(L' \setminus U_i) \ge \psi^s(L') - \psi^s(U_i)$ for $s \ge \sigma$. Consequently the sequence of continuous functions $(\psi^s(U_i))_{i \in \mathbb{N}}$ uniformly converges to the function $\psi^s(L')$ on the interval $[\sigma,\infty)$, whence $\psi^s(L')$ is continuous on $[\sigma,\infty)$. Since $\psi^\sigma(L') < \infty$ for all $\sigma > \alpha$ this shows that $\psi^s(L')$ is continuous on $(\alpha,\infty)$. If, moreover, $\psi^\alpha(L') < \infty$, then $\psi^s(L')$ is continuous on $[\alpha,\infty)$.

Utilizing the same argument and the property that $\lim_{s \to \infty} \psi^s(U_i) = 0$ we can show that $\psi^s(L')$ tends to zero as $s$ approaches infinity.

Finally, it remains to show that $\lim_{s \downarrow \alpha} \psi^s(L) = \psi^\alpha(L) = \infty$ if $\psi^\alpha(L') = \infty$. As it was mentioned above, for finite $U$ the function $\psi^s(U)$ is continuous. Hence, $\psi^\alpha(U_i) = \lim_{\varepsilon \to 0} \psi^{\alpha+\varepsilon}(U_i)$. Taking suprema on both sides yields $\psi^\alpha(L') \le \lim_{\varepsilon \to 0} \psi^{\alpha+\varepsilon}(L')$, and the assertion follows.

Q.E.D.

The $\psi$-entropy of the language $L \subseteq \Sigma_n^*$, $H_L$, is defined as the "change-over-point" of the function $\psi^s(L)$:

$H_L := \inf\{s : s \ge 0 \wedge \psi^s(L) < \infty\}$.³ (2)

In particular, $H_L < \infty$ iff $\exists s\,(s \in (0,\infty) \wedge \psi^s(L) < \infty)$. Like the usual entropy of languages, our $\psi$-entropy also satisfies the following simple identities.

$H_{W \cup V} = H_{W \cdot V} = \max\{H_W, H_V\}$ if $W \cdot V \neq \emptyset$, and (3)
$H_L = 0$ if $L$ is finite. (4)

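3.1 The $\psi$-Entropy of Regular Languages

Now, we consider regular languages. Here we can characterize the languages having finite $\psi$-entropy. To this end we introduce the state of a subset $M \subseteq \Sigma_n^* \cup \Sigma_n^\omega$ derived from a word $w \in \Sigma_n^*$:

$M/w := \{p : p \in \Sigma_n^* \cup \Sigma_n^\omega \wedge w \cdot p \in M\}$ (5)

We call a set $M \subseteq \Sigma_n^* \cup \Sigma_n^\omega$ finite-state provided it has only a finite number of distinct states. It is well-known that a language $L \subseteq \Sigma_n^*$ is finite-state iff it is regular, whereas every regular $\omega$-language⁴ is finite-state but the converse does not hold (see e.g. [31]). Furthermore, we say that a word $w \in \Sigma_n^*$ is a prefix of a string $p \in \Sigma_n^* \cup \Sigma_n^\omega$ provided $p = w \cdot p'$ for some $p' \in \Sigma_n^* \cup \Sigma_n^\omega$, and we abbreviate this fact by $w \sqsubseteq p$. For a subset $M \subseteq \Sigma_n^* \cup \Sigma_n^\omega$ its set of finite prefixes is $A(M) := \{w : w \in \Sigma_n^* \wedge \exists p\,(w \cdot p \in M)\}$ and its set of subwords (infixes) is $T(M) := \{v : v \in \Sigma_n^* \wedge \exists p\,\exists w\,(w \in \Sigma_n^* \wedge w \cdot v \cdot p \in M)\}$.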

Property 11 If $L \subseteq \Sigma_n^*$ is a regular language and $\psi : \Sigma_n^* \to (0,\infty)$ is a valuation, then the following conditions are equivalent.

1. There is an $s \ge 0$ such that $\psi^s(L) < \infty$.

2. $\forall w, v\,(v \neq e \wedge L/w = L/w \cdot v \neq \emptyset \to \psi(v) < 1)$.

3. There are an $\ell \in \mathbb{N}$ and a positive constant $c < 1$ such that for all $u \in T(L)$ with $|u| \ge \ell$ it holds $\psi(u) \le c^{|u|}$.

³ Here we follow the convention $\inf \emptyset = \infty$.
⁴ Regular $\omega$-languages are defined as finite unions of sets of the form $W \cdot V^\omega$ where $W, V$ are regular languages.

Remark. Property 11(2) is just another formulation of the contracting cycles property of [25] and [2].

Proof. "1. → 2." If there is a word $v \in \Sigma_n^* \setminus \{e\}$ such that $\psi(v) \ge 1$ and there are words $w, u$ with $wvu \in L$ and $L/w = L/w \cdot v$, then $L \supseteq w v^* u$ and, therefore, $\psi^s(L) \ge \psi^s(w v^* u) = \psi^s(wu) \cdot \sum_{i \in \mathbb{N}} (\psi^s(v))^i = \infty$.

"2. → 3." Let the set of nonempty states of $L$, $\{L/w : w \in A(L)\}$, have $k$ elements. Then it is well-known that $u \in T(L)$ iff there is a word $u' \in A(L)$ such that $|u'| < k$ and $u' \cdot u \in A(L)$. Moreover, for every word $u' \in L$ of length $|u'| \ge k$ there is a factorization $u' = w \cdot v \cdot \hat{w}$ satisfying $0 < |v| \le k$ and $L/w = L/w \cdot v$. Consequently, $\psi(u') = \psi(w \cdot \hat{w}) \cdot \psi(v)$. Repeating this process of cutting out nonempty "cycles" $v$ of length $\le k$ we finally arrive at a family of words $w', v_1, \dots, v_m$ such that $|w'| < k$, $0 < |v_i| \le k$, $\emptyset \neq L/w_i = L/w_i \cdot v_i$ for some $w_i \in A(L)$ and

$\psi(u' \cdot u) = \psi(w') \cdot \psi(v_1) \cdots \psi(v_m)$.

Consequently, $\psi(v_i) < 1$; more precisely, let

$\hat{c} := \max\{ (\psi(v))^{1/|v|} : 1 \le |v| \le k \wedge \exists w\,(L/w = L/w \cdot v \neq \emptyset) \}$;

then $\hat{c} < 1$ and $\psi(v_i) \le \hat{c}^{|v_i|}$. This yields the inequality

$\psi(u) \le \frac{\psi(w')}{\psi(u')} \cdot \hat{c}^{|v_1| + \dots + |v_m|} \le \hat{c}^{|u| - k + 1} \cdot \frac{\max\{\psi(w) : |w| < k\}}{\min\{\psi(w) : |w| < k\}} \le c' \cdot \hat{c}^{|u|}$, for some $c' > 0$.

Now, first choosing $c$ such that $\hat{c} < c < 1$ and then $\ell$ large enough, the assertion follows immediately.

"3. → 1." If $L \subseteq \Sigma_n^*$ and $\psi(w) \le c^{|w|}$ for some positive $c < 1$ and all $w \in L$ with $|w| \ge \ell$, then the inequality $n \cdot c^s < 1$ holds for some $s \ge 0$, and this implies $\psi^s(L) \le \sum_{w \in L,\ |w| < \ell} \psi^s(w) + \sum_{i \ge \ell} n^i \cdot c^{si} < \infty$.

Q.E.D.

From the direction "1. → 2." of the preceding proof we obtain immediately the following.

Corollary 12 If L is a regular language and (w)s < 1 for all w 2 L nfeg, then s(L) < 1. We obtain a relation between the entropies of L, A(L) and T(L) for a regular language L.

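Property 13 If $L$ is regular, then $\psi^s(L) < \infty$ iff $\psi^s(A(L)) < \infty$ iff $\psi^s(T(L)) < \infty$.

Proof. $L \subseteq A(L) \subseteq T(L)$ and $\psi^s(T(L)) < \infty$ imply $\psi^s(L) < \infty$ and $\psi^s(A(L)) < \infty$. Conversely, let $\psi^s(L) < \infty$. Since $L$ is regular, there is a $k \in \mathbb{N}$ such that

$\forall v\,(v \in T(L) \to \exists w, \hat{w}\,(\hat{w} \in L \wedge w \cdot v \sqsubseteq \hat{w} \wedge |w|, |\hat{w}| - |w \cdot v| \le k))$.

Thus $A(L) \subseteq T(L) = L_{0,0} \cup \dots \cup L_{k,k}$ where $L_{i,j} := \{v : \Sigma_n^i\, v\, \Sigma_n^j \cap L \neq \emptyset\}$, and the assertion follows from the easily verified inequality $\psi^s(L_{i,j}) \cdot \min\{\psi(a)^{(i+j)s} : a \in \Sigma_n\} \le \psi^s(L)$.

Q.E.D.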

Corollary 14 If $L$ is regular, then $H_L = H_{A(L)} = H_{T(L)}$.

Next we give a method to compute the $\psi$-entropy for a nonempty regular language $L$. To this end let $\{L_1 = L, L_2, \dots, L_k\}$ be its set of nonempty states. Define the $\psi$-weighted $s$-dimensional adjacency matrix of $L$, $A_L^{\psi,s} = (a_{s,i,j})_{1 \le i,j \le k}$, as follows:

$a_{s,i,j} := \sum_{x \in \Sigma_n,\ L_i/x = L_j} (\psi(x))^s$.

Then $\psi^s(A(L) \cap \Sigma_n^\ell) = (1, 0, \dots, 0) \cdot (A_L^{\psi,s})^\ell \cdot \mathbb{1}$ where $\mathbb{1}$ is the all ones column vector. Let $\lambda_L(s) := \lim_{\ell \to \infty} \sqrt[\ell]{\|(A_L^{\psi,s})^\ell\|}$ be the spectral radius of the matrix $A_L^{\psi,s}$. According to Theorem 2 of [25]⁵ $\lambda_L$ is strictly decreasing, $\lambda_L(0) \ge 1$, and $\lim_{s \to \infty} \lambda_L(s) = 0$. Thus, if $\lambda_L(s) < 1$, the sum $\psi^s(A(L)) = (1, 0, \dots, 0) \cdot \sum_{\ell \in \mathbb{N}} (A_L^{\psi,s})^\ell \cdot \mathbb{1}$ converges and, if $\lambda_L(s) \ge 1$, it diverges. Consequently, $H_L = H_{A(L)} = H_{T(L)} = \alpha$ iff $\lambda_L(\alpha) = 1$. In particular, we have the following.

Corollary 15 $\psi^\alpha(L) = \infty$ for $\alpha = H_L$ if $L$ is an infinite regular language.
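The spectral radius criterion lends itself to a numerical computation of the entropy by bisection. The following sketch is an illustration under several assumptions: the valuation is written $\psi$ with $\psi(1) = \psi(2) = 1/2$, the example language is the set of words over $\{1, 2\}$ without the factor 22 (two nonempty states), and the spectral radius is obtained by plain power iteration, which is only justified here because the example matrix is primitive. The names `TRANSITIONS`, `adjacency`, `spectral_radius` and `entropy` are hypothetical.

```python
from math import log2, sqrt

# Nonempty states of the language "no factor 22": state 0 is the language
# itself, state 1 is reached after reading the letter 2.
TRANSITIONS = [(0, "1", 0), (0, "2", 1), (1, "1", 0)]
PSI = {"1": 0.5, "2": 0.5}


def adjacency(s, k=2):
    """psi-weighted s-dimensional adjacency matrix of the example."""
    a = [[0.0] * k for _ in range(k)]
    for i, letter, j in TRANSITIONS:
        a[i][j] += PSI[letter] ** s
    return a


def spectral_radius(matrix, iterations=200):
    """Power iteration; assumes a primitive nonnegative matrix."""
    k = len(matrix)
    v = [1.0] * k
    radius = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        radius = max(w)
        v = [x / radius for x in w]
    return radius


def entropy(lo=0.0, hi=10.0, steps=60):
    """Bisection for the point where the spectral radius equals 1."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if spectral_radius(adjacency(mid)) >= 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


h = entropy()
print(h)  # approximately log2 of the golden ratio, about 0.6942
```

For this example the matrix is $2^{-s}$ times the matrix with rows $(1, 1)$ and $(1, 0)$, whose spectral radius is the golden ratio, so the bisection result can be compared with $\log_2((1 + \sqrt{5})/2)$.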

3.2 The $\psi$-Entropy of the Submonoid

Next we consider the relation between the entropies of $L$ and $L^*$. As $H_L \le H_{L^*}$ and $H_{L^*} = \infty$ whenever $\psi(w) \ge 1$ for some $w \in L \setminus \{e\}$, we are interested only in cases when $H_L < \infty$ and $\psi(w) < 1$ for $w \in L \setminus \{e\}$. This implies also that $\psi(w) < 1$ for $w \in L^* \setminus \{e\}$. First we give some general bounds on the $\psi$-entropy of $L^*$.

Property 16 Let $e \notin L$, $\alpha = H_L < \infty$ and $\psi(w) < 1$ for all $w \in L \setminus \{e\}$. Then

1. $H_{L^*} \le \inf\{s : \psi^s(L) \le 1\}$ and,
2. if $L$ is a code and $\psi^\alpha(L) \ge 1$, then $H_{L^*}$ is the unique solution of the equation $\psi^s(L) = 1$.

Proof. 1. Since $\psi(w) < 1$ for all $w \in L$ the function $\psi^s(L)$ is strictly decreasing in $(\alpha,\infty)$. Consequently, if $\psi^s(L) < 1$, in view of the inequality $\psi^s(L^*) \le \sum_{i \in \mathbb{N}} (\psi^s(L))^i$ we have $\psi^s(L^*) < \infty$.

2. The additional claim that $\psi^s(L) = 1$ implies $H_{L^*} \ge s$ follows from the fact that for codes $L$ the identity $\psi^s(L^*) = \sum_{i \in \mathbb{N}} (\psi^s(L))^i$ holds.

Q.E.D.

Summarizing 1. and 2. for codes $C \subseteq \Sigma_n^*$ yields the formula

$H_{C^*} = \inf\{s : \psi^s(C) \le 1\}$. (6)

We obtain a condition sufficient for the inequality $H_{L^*} > H_L$.

⁵ More precisely, there is a $c$, $0 < c < 1$, such that for all $\varepsilon > 0$ the inequality $\lambda_L(s + \varepsilon) \le c^{\varepsilon} \cdot \lambda_L(s)$ holds.

Lemma 17 If $L$ is a finite union of $k$ codes which satisfies $\psi^\alpha(L) > k$ for $\alpha = H_L < \infty$, then $H_{L^*} > H_L$.

Proof. Let $L = C_1 \cup \dots \cup C_k$ where all $C_i$ are codes. Then there is an $i \in \mathbb{N}$ such that $\psi^\alpha(C_i) > 1$. Because $\psi^s(C_i)$ is continuous on $(\alpha,\infty)$, we have $\inf\{s : \psi^s(C_i) \le 1\} > \alpha$ and, therefore, $\alpha < H_{C_i^*} \le H_{L^*}$. Q.E.D.
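For a finite code whose words all have valuation below 1, formula (6) reduces the entropy of the generated submonoid to a one-dimensional root-finding problem. The sketch below, which assumes the valuation is written $\psi$ and passes the code simply as the list of the valuations of its words, solves $\psi^s(C) = 1$ by bisection. The function names and both example codes are illustrative choices.

```python
from math import log


def s_valuation(word_values, s):
    """s-dimensional valuation of a finite language given by its word values."""
    return sum(v ** s for v in word_values)


def star_entropy(word_values, steps=100):
    """Approximate inf{s : psi^s(C) <= 1} for a finite code C.

    Assumes every value lies strictly between 0 and 1, so that the
    s-dimensional valuation is strictly decreasing in s.
    """
    lo, hi = 0.0, 1.0
    while s_valuation(word_values, hi) > 1:
        hi *= 2
    for _ in range(steps):
        mid = (lo + hi) / 2
        if s_valuation(word_values, mid) > 1:
            lo = mid
        else:
            hi = mid
    return hi


# C = {1, 3} over the alphabet {1, 2, 3} with psi(a) = 1/3 for every letter.
cantor = star_entropy([1 / 3, 1 / 3])
print(cantor)  # approximately log 2 / log 3, about 0.6309

# C = {1, 21, 22} with psi(1) = 1/2 and psi(2) = 1/4 (a prefix code).
mixed = star_entropy([1 / 2, 1 / 8, 1 / 16])
print(s_valuation([1 / 2, 1 / 8, 1 / 16], mixed))  # approximately 1
```

The first example gives $2 \cdot 3^{-s} = 1$, that is $s = \log 2 / \log 3$, the familiar dimension of the Cantor middle-third set; this is offered only as a plausibility check of the sketch, not as a statement taken from the text.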



In connection with Corollary 15 we obtain the following.

Corollary 18 If $L \subseteq \Sigma_n^*$ is regular and a finite union of codes and $H_L < \infty$, then $H_{L^*} > H_L$.

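Next we consider the approximation of the $\psi$-entropy of $L^*$, $H_{L^*}$, via $H_{U^*}$ where $U$ is a finite subset of $L$. We are going to derive a result analogous to the theorem of [33]. There we used the real numbers $\gamma_m$ defined as the smallest (positive) roots of the equation $1 = \gamma_m + (\gamma_m)^m$.⁶ In the sequel we assume that there is a positive constant $c < 1$ such that every word $w \in L$ (and, hence also every $w \in L^*$) satisfies $\psi(w) \le c^{|w|}$. In other words, $L \subseteq V_{\psi,c}$ where $V_{\psi,c} := \{w : w \in \Sigma_n^* \wedge \psi(w) \le c^{|w|}\}$. Observe that $V_{\psi,c}^* \subseteq V_{\psi,c}$.

Theorem 19 Let $L$ be a nonempty subset of $V_{\psi,c}$. Then for $m \le \min\{|w| : w \in L \setminus \{e\}\}$ and $\varepsilon_m := \log_c \gamma_m$ we have

$\psi^s(L^*) \le \sum_{i \in \mathbb{N}} (\psi^s(L))^i \le \psi^{s - \varepsilon_m}(L^*)$

whenever $s \ge \varepsilon_m$.

Proof. As in [33] one obtains

$\psi^s(L^*) \le \sum_{i \in \mathbb{N}} (\psi^s(L))^i \le \sum_{w \in L^*} \gamma_m^{-|w|} \cdot (\psi(w))^s$.

Now $\psi(w) \le c^{|w|}$ implies $(\psi(w))^{\varepsilon_m} \le (c^{|w|})^{\varepsilon_m} = \gamma_m^{|w|}$ and, consequently, $\sum_{w \in L^*} \gamma_m^{-|w|} \cdot (\psi(w))^s \le \psi^{s - \varepsilon_m}(L^*)$.

Q.E.D.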

Corollary 20 Let $L \subseteq V_{\psi,c}$ for some $c < 1$, $e \notin L$ and $\min\{|w| : w \in L\} \ge m > 0$. Then

$0 \le \beta - H_{L^*} \le \varepsilon_m$ whenever $\psi^\beta(L) = 1$.

Proof. If $\psi^\beta(L) = 1$, then on the one hand $H_{L^*} \le \beta$ and on the other hand according to Theorem 19 $\psi^{\beta - \varepsilon_m}(L^*) = \infty$, that is, $H_{L^*} \ge \beta - \varepsilon_m$. Q.E.D.

We obtain the announced analogue to the theorem of [33].

⁶ It is well-known that $\gamma_m^{-\ell}$ upper-bounds the number of compositions (ordered partitions) of the number $\ell$ into parts not smaller than $m$, and it holds $0 < \gamma_m < \gamma_{m+1} < 1$ and $\lim_{m \to \infty} \gamma_m = 1$ (cf. [33]).

Theorem 21 Let L  V ;c for some c < 1. Then for every " > 0 there is a nite subset U  L such that HL ? HU < " : Proof. Let HL = . It suces to show that for every " > 0 there is a nite subset U  L such that ?2"(U ) = 1. If HL = , then ?"(L) = 1 for all " > 0. Now choose m 2 IN such that " > "m := logc m . Since ?"(L) = 1 there is a nitePsubset V  fw : w 2 L ^jwj  mg satisfying ?"(V ) > 1. Hence by Theorem 19, 1 = i2IN( ?"(V ))i  ?"?"m (V )  ?2"(V ): Finally we may choose U to be any nite subset of L satisfying V  U . Q.E.D. 







As a nal remark to this section we derive an upperbound to the -entropy of the languages V ;c where c < 1.

HV ;c  ? logc n for V ;c  n and c < 1. Proof. We have s(V ;c)  Pi2IN ni  csi = Pi2IN(n  cs)i < 1 if only n  cs < 1.

(7)

Q.E.D.

4 ω-Languages and Hausdorff Dimension

Now we apply our results on valuations of languages to the calculation of the Hausdor dimension in the spaces (!n ;  ) where  is the metric derived from the valuation : n ! (0; 1) in the following manner (

0 , if  =  , and  (; ) = min f (w) : w 2 A() \ A()g , otherwise.

(8)

The case when (a) = n(a) = n1 for a 2 n was investigated in detail in [36], here we generalize the results obtained there. Particular results for arbitrary valuations were obtained in [25] and [2].

4.1 Metric Properties of the Space (!n ;  ) First we need some properties of the metric  . It turns out that there is a crucial distinction between the behaviour of the metrics derived from various valuations , mainly depending on the fact whether (a) < 1 for all a 2 n or not. Observe that  satis es the ultrametric inequality

 (; )  maxf (;  );  (;  )g ;

(9)

because A() \ A() contains at least one of the sets A() \ A( ) or A() \ A( ). In contrast to the investigations in [36] the space (!n ;  ), however, need not be compact, and even in cases when it is compact the diameter of the balls in (!n ;  ) need not uniformly correspond to the length of the words de ning them.
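The ultrametric inequality (9) can be checked numerically on finite truncations of ω-words. The following is only a minimal sketch under stated assumptions: ω-words are replaced by equally long finite words, the valuation is given by a table of letter values, and the names `prefix_values`, `rho` and `EXAMPLE` are illustrative and do not occur in the paper. The example table is deliberately noncontractive (one letter has value above 1).

```python
from itertools import product

# Hypothetical letter values of a noncontractive valuation on the letters 1 and 2.
EXAMPLE = {"1": 0.5, "2": 1.5}

def prefix_values(word, letter_value):
    """Valuations of all prefixes of `word`, starting with the empty word (valuation 1)."""
    values = [1.0]
    for letter in word:
        values.append(values[-1] * letter_value[letter])
    return values

def rho(xi, eta, letter_value):
    """Distance of two equally long truncations, following Eq. (8):
    0 if they coincide, otherwise the least valuation of a common prefix."""
    if xi == eta:
        return 0.0
    common = 0
    while xi[common] == eta[common]:
        common += 1
    return min(prefix_values(xi[:common], letter_value))

def is_ultrametric(words, letter_value):
    """Check rho(x, z) <= max(rho(x, y), rho(y, z)) for all triples."""
    return all(
        rho(x, z, letter_value) <= max(rho(x, y, letter_value), rho(y, z, letter_value))
        for x, y, z in product(words, repeat=3)
    )

WORDS = ["".join(p) for p in product("12", repeat=4)]
```

For instance, `rho("2211", "2222", EXAMPLE)` is 1.0 because the empty word is the common prefix of least valuation, which illustrates that the distance need not correspond to the length of the common prefix when the valuation is not contractive.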

16 Closed balls (they are simultaneously open) with center  and radius  > 0 in (!n ;  ), IB () = f :  (; )  g, are characterized by words in n as follows. Denote by w (; ) the shortest pre x (provided it exists) w <  such that (w)  . (  (; ) >  , for all  6= , and IB() = fwg(; )  ! ,, ifotherwise. (10) n Two remarks are in order here. Remark 1. A point  2 !n such that  (; ) >  for some  > 0 and all  6=  is usually called an isolated point of (!n ;  ). If  is no isolated point, then w (; ) exists for every  > 0. Remark 2. Since the metric  satis es the ultrametric inequality Eq. (9), balls are simultaneously open and closed in (!n ;  ) and, moreover, every  2 IB() can be chosen to be the center of the ball, that is IB () = IB() which shows that its radius equals its diameter. For the diameter of the ball IB () we obtain, similarly to the case = n, the following. (

if  (; ) >  , for all  6= , and (11) diamIB() = 0 (w (; )) ,, otherwise. In particular, we have diamIB()  . Note that in contrast to the special case = n in virtue of the (possible) existence of isolated points in (!n ;  ) not all balls are subsets of the form w  !n and, vice versa, not all subsets of the form w  !n are balls. Consider e.g. 12  !2 in (!2 ;  ) where (1) < 1 and (2)  1. Remark. Since w  !n = Swv IB() for every 0 <  < minf (v) : v v wg, those subsets are always open. As closed sets are complements of open sets, we obtain that !-languages E  !n satisfying E = f : A()  A(E )g = !n n (n n A(E ))  !n (12) are closed in every space (!n ;  ). Therefore, we will refer to !-languages E  !n satisfying Eq. (12) as strongly closed . One easily observes that (!n ;  ) has no isolated points i (a) < 1 for all a 2 n . Valuations having this property will be called contractive . Since n is nite, for contractive valuations the space (!n ;  ) is compact and its balls are exactly the sets of the form w !n . Thus, we only have isolated points in !n in case (a)  1 for some a 2 n. The set of all isolated points can be written as II := f : inf f (w) : w < g > 0g : (13) It is an easy exercise to show that for noncontractive valuations we have II = n  a! i (a) = 1 and (b) < 1 for b 2 n n fag, and that otherwise II is uncountable.

4.2 Hausdor Dimension in (!n ;  )

In order to introduce the Hausdor dimension of subsets of (!n ;  ) we de ne the dimensional outer Hausdor measure induced by  . X : F  [ F ^ diam F < g (14) (diam F ) inf f  (F ) := lim i i i !0 i2IN

i2IN

17 Remark. It may happen that, due to the fact that F contains uncountably many isolated points, the set on the right hand side of Eq. (14) becomes empty. Then, following our convention inf ; = 1, we set  (F ) := 1. In that case we have  (F ) := 1 independently of the value of . Then the Hausdor dimension of F  !n in (!n ;  ) is de ned as dim( ) F := inf f :  (F ) = 0g = supf : = 0 _  (F ) = 1g : Here we mention that the Hausdor dimension is countably stable, that is, dim( )

[

i2IN

Fi = sup dim( ) Fi i2IN

(15)

In what follows we are mainly interested in the Hausdor dimension and measure of sets not containing isolated points (at least not uncountably many). Therefore we introduce the -fundamental set of (!n ;  ), IF , as follows. IF := !n n II = f : inf f (w) : w < g = 0g :

(16)

As the set of isolated points II is open, its complement IF is a closed subset of (!n ;  ). If IF 6= ;, it is, however, not strongly closed unless is contractive. One easily veri es the identity IF = IF =w for all w 2 n , that is, IF is a so-called one-state !-language, and there are only two strongly closed one-state !-languages contained in !n : !n itself and ;. For subsets F  IF we have the following relation between  and the valuation . inf f (L) : F  L  !n ^ 8w(w 2 L ! (w)  )g  (F ) := lim !0

(17)

Proof. On the one hand, ( (Sw))  (diamPw  !n ), so the inequality \" follows. On the other hand, let F  i2M Fi and i2M (diam Fi)   (F ) +  for some m  IN and  > 0. Without loss of generality we may assume F \ Fi = 6 ;. We consider two cases. If i := diam Fi  0, then Fi  IBi (i ) = f :  (i; )  ig for i 2 Fi, and fi g 6= Fi. According to Eq. (10), IBi (i ) = wi  !n for some wi 2 n. If diam Fi = 0, that is, Fi = fi g  IF , then we may nd a wi < i such that ( (wi))    2?(i+1) . Consequently, F  Si2M wi  !n and X (fwi : i 2 M g)  maxf(diam Fi) ; 0  2?(i+1) g   (F ) + 2   ; i2M

and the assertion follows, because  can be made arbitrarily small.

Q.E.D.

Next we derive some relations between the -entropy of languages and the Hausdor dimension of !-languages in the space (!n ;  ). First we get results analogous to Lemmas 3.8 and 3.10 of [36]. To this end we introduce the -limit of a language V  n :

V  := f :  2 !n ^ A() \ V is in niteg

(18)


Lemma 22 If (V ) < 1, then  (V  ) = 0. Proof. As in [36] we use the partition of V into V (i) := fv : v 2 V and v has exactly i pre xes in V g. Then V   V (i)  !n and, since (V ) < 1, (V (i)) tends to 0 as i approaches in nity.

Q.E.D.

Lemma 23 Let F  IF . Then  (F ) = 0 i there is a language L  n such that F  L and (L) < 1. Proof. Let  (F ) = 0. For = 0 we have  (F ) = card F . So F = ;, and we may choose L = ;. Let > 0. According to Eq. (17), for every i 2 IN we can nd a language Li such that F  Li  !n and (Li ) < nS?i . This in particular implies that jwj  i for all w 2 Li . Now it is easy to see that L := i2IN Li satis es F  L and (L) < 1. The other direction is proved in Lemma 22.

Q.E.D.

As immediate consequences of the de nition of the Hausdor dimension we get the following relations between the -entropy of languages and the Hausdor dimension of its -limits. dim( ) V   HV , and (19) ( ) ( )   dim F = inf fdim W : F  W g , if F  IF . (20)

4.3 Hausdor Dimension of !-Languages Utilizing the results of [2], [25] and Corollary 14 we can relate Hausdor dimension and measure of strongly closed nite-state subsets of !n and the -entropy of their pre x languages. First we draw a connection between nite-state !-languages contained in IF and languages of the form V ;c as introduced in Section 3.2 and we derive an estimate for Hausdor dimensions and measures of nite-state strongly closed subsets of IF .

Lemma 24

  IF , and this inclusion is proper if IF 6= ; and 1. If c < 1 then V ;c is not contractive. 2. For every nite-state and strongly closed !-language E  IF there are a positive c < 1 and an ` 2 IN such that E  fw : jwj = ` ^ (w)  c` g! .

Proof. 1. The rst part is immediate. From the additional assumption it follows that (a) < 1 and (b)  1 for some letters a; b 2 n. Depending on these values and c < 1 it is easy to construct a  2 fa; bg! such that inf f (w) : w < g = 0 but (w) > cjwj for all but nitely many w < . 2. If E is nite-state, its pre x language A(E ) is also nite-state, that is, a regular language. Let ; 6= A(E )=w = A(E )=w  v for some w; v with v = 6 e. Then w  v  A(E ) ! and, since E is strongly closed w v 2 E  IF whence (v) < 1. According to Property 11

19 it follows (v)  c` for some c < 1 and all v 2 T(E ) \ `n where ` is suciently large. Now the assertion follows from the obvious inclusion E  (T(E ) \ mn)! which holds for arbitrary m 2 IN n f0g.

Q.E.D.

Theorem 25 If F  IF is nonempty, nite-state and strongly closed, then HA (F ) = dim( )(F ) and, moreover, if = dim( )(F ), then  (F ) > 0.

Proof. In [2] and [25] it is shown that = dim( )(F ) is the solution of the equation A(F )(s) = 1, and that  (F ) > 0. The remaining assertion follows from our above consideration on the calculation of (A(L)).

Q.E.D.

Since U ! is nite-state and strongly closed if only U is nite, in view of the identity A(U ! ) = A(U ) and Corollary 14 the Hausdor dimension of any U !  IF is obtained as dim( ) U ! = HU  . This allows us to get via an approximation similar to the one in Theorem 21 a general formula for the Hausdor dimension of arbitrary !-powers L! .

Lemma 26 Let L  V ;c for some positive c < 1. Then dim( ) L! = dim( )(L) = HL . Proof. The inequality dim( ) L!  dim( )(L)  HL follows from the inclusion L!  



(L) and Eq. (19). In order to show the reverse inequality observe that in view of Theorem 21 we have HL  = supfHU  : U  L and U nite g = supfdim( ) U ! : U  L and U nite g  dim( ) L! .

Q.E.D.

Next we obtain a general bound on  (L! ) for = HL  .

Lemma 27 Let L  V ;c. Then  ((L))  1 for = HL . Proof. De ne L(i) := fw : w 2 L ^ jwj  i ^ 8v(v < w ^ jvj  i ! v 62 L)g. Then L(i) is a pre xcode contained in L satisfying (L)  L(i)  !n . Now on the one hand, following Property 16, H( L i ) = inf fs : s(L(i))  1g  HL . On the other hand, let := HL and i := H( L i ) . Then i (L(i))  1 and, therefore, 

i and (L)  L(i)  !n imply  ((L) )  lim infi!1 (L(i))  lim infi!1 i (L(i))  1, because the function s(L(i)) is continuous on [ i; 1). Q.E.D. 








Next we consider the strong closure of an !-language E de ned as cl(E ) := f : A()  A(E )g (21) The strong closure can be alternatively written as cl(E ) = (A(E )) , it is also known as the adherence of A(E ) (cf. [38], [22] or [6]). It is the smallest strongly closed !-language containing E , thus independently of the valuation it contains the smallest closed (with respect to the metric  ) subset of (!n ;  ) containing E . Utilizing Lemma 26 and Corollary 14 we obtain an estimate for the Hausdor dimension of the strong closure of an !-power of a regular language.


Corollary 28 If 0 < c < 1 and L  V ;c is a regular language, then dim( ) L! = dim( ) cl(L! ).

Moreover, we have the following.

Corollary 29 If c < 1 and L  V ;c is regular and a nite union of codes and = HL , then 0 <  (L! ) =  (cl(L! ))  1. Proof. Utilizing the inclusions cl(L! )  L! [ L  (A(L( )))  and A((A(L )) )  A (L), in virtue of Eq. (19) and the Corollaries 14 and 18 dim (A(L))  HA(L) < HL = 

dim( ) L! , the

equality  (L! ) =  (cl(L! ))

orem 25 and Lemma 27.



follows. The remaining inequalities are The-

Q.E.D.

Remark. Utilizing more involved calculations as carried out in [26, Theorem 6] one can show 0 <  ((L) ) =  (cl(L! ))  1 for arbitrary regular languages L  V ;c, but in the case of nonregular languages W one might even have dim( n) W ! < dim( n) cl(W ! ) (cf. [36, Examples 6.3 and 6.5]). Corollary 29 readily implies [17, Theorem 8].

5 IIFS and Fractal Geometry 5.1 Iterated Function Systems One of the most popular ways of describing fractals is iterated function systems (IFS) [3]. We restrict ourselves in the following to Euclidean spaces X  IRm equipped with the Euclidean distance E . Denoting the set of contracting similitudes f : X ! X by S (X ), we can describe an IFS F as a map F : n ! S (X ). We will sketch some well-known properties of IFS in the following, putting emphasis onto the connections with our theory developed in the last sections. An IFS F de nes a contractive valuation F : n ! (0; 1), where F (i) (for i 2 n) denotes the similarity factor of the similitude F (i). If w 2 +n , we can interpret w as a similitude F (w) 2 S (X ), where F is a semigroup morphism mapping (+n ; ) into (S (X ); ). Denoting by [m] the pre x of length m of  2 !n , we may consider the sequence (F ([m])(x))m for some given point x 2 X . It is well-known that the above sequence converges to some point which we denote by F (), since it is independent of the initial point x.  is also called an address of the point F (). The such-de ned map F : (!n ;  F ) ! (X; E ) is Lipschitz continuous. Following [24], we call the set AF = F (!n ) limit set of F . Given an IFS F : n ! S (X ), we may interpret a nite (m-element) language L = fw1; : : :; wmg  +n as an IFS FL : m ! S (X ); i 7! F (wi). We have AFL = F (L! ). Up to now, we restricted our attention to nite languages. What about in nite ones? When considering in nite languages, we are led to in nite IFS (IIFS) [44, 1, 16, 24] whose theory is more involved but analogous to the theory of IFS. Without going into detail here, we can still de ne a set described by an IIFS FL (based on the IFS F and the language L),

21 namely the limit set F (L! ).7 Observe that in the space (!n ;  n ), we may interpret any language L  +n as an (I)IFS de ning w : !n ! !n ; x 7! w  x. In this interpretation, the limit set of L is just L! . We de ne, for L  +n , the valuation dimension valdim (L) = inf fs > 0 : s(L)  1g:

(22)
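For a finite language the valuation dimension of Eq. (22) can be approximated by bisection, because the sum of the s-th powers of the word valuations decreases in s as long as every word has valuation below 1. The sketch below assumes exactly this situation; the function names and the dictionary of letter values are illustrative choices, not notation from the paper.

```python
import math

def word_valuation(word, letter_value):
    """Valuation of a word as the product of its letter values (morphism property)."""
    value = 1.0
    for letter in word:
        value *= letter_value[letter]
    return value

def valdim(words, letter_value, tol=1e-12):
    """Approximate inf{s > 0 : sum of (valuation of w)**s over w in words <= 1}."""
    values = [word_valuation(w, letter_value) for w in words]
    if any(v >= 1.0 for v in values):
        raise ValueError("this sketch needs every word to have valuation < 1")
    if len(values) <= 1:
        return 0.0  # the sum is already <= 1 for every s > 0
    lo, hi = 0.0, 1.0
    while sum(v ** hi for v in values) > 1.0:  # find an upper bracket
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(v ** mid for v in values) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical example: every letter has value 1/2.
HALF = {"1": 0.5, "2": 0.5, "3": 0.5, "4": 0.5}
```

With these letter values, `valdim(["1", "2", "3"], HALF)` approximates ln 3 / ln 2, and `valdim(["1", "22"], HALF)` approximates the binary logarithm of the golden ratio, since 2^(-s) + 4^(-s) = 1 there.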

Property 16 shows the close relation of valdim (L) and HL  . As we will see, the valuation dimension corresponds to the similarity dimension known from IFS theory. This motivates the introduction of this notion in this context. We denote the s-dimensional outer Hausdor measure on a Euclidean space (X; E ) by Hs, and the corresponding Hausdor dimension by dimH . For IFS, Moran's open set condition (OSC) is well-known as an assumption alleviating the determination of the Hausdor dimension of AF [3]: Provided there is an open bounded non-empty test set M  X such that F (i)(M )  M for any i 2 n , and that, furthermore, for any i; j 2 n , i 6= j , F (i)(M ) \ F (j )(M ) = ;, then, for = valdim F (n ), 0 < Hs (AF ) < 1, and = dimH (AF ). Generally, it is not trivial to nd a test set for a given (I)IFS F . But, if we knew that F ful lls OSC, (when) could we say something about FL? An answer to this question is given in the next theorem. To this end, we need two further notions [37]. We say that a language V  n is an OSC-code i there is a nonempty W  n such that (23) 8v(v 2 V ! v  W  !n  W  !n ) , and (24) 8v; v0(v; v0 2 V ^ v 6= v0 ! v  W  !n \ v0  W  !n = ;) are true. We will refer to the language W  n also as an OSC-witness for V . Note that

any OSC-code is a code, and any pre xcode is an OSC-code. In [37] it is shown that any regular code is an OSC-code. Observe further the correspondence with the Euclidean case: Interpreting V as an (I)IFS in the space !n , V satis es the OSC with open test set W  !n i V is an OSC-code with OSC-witness W .

Theorem 30 Let F = ('1; : : : ; 'n) where 'i: IRd ! IRd be an IFS satisfying the OSC, and let C  n be an OSC-code. Then FC is an (I)IFS which satis es also the OSC. Proof. Let F = ('1; : : : ; 'n) satisfy the OSC with test set M  IRd, and let W  n be an OSC-witness for C . Let FC = ('v )v2C , where 'v := 'v 'v` for v = v1  v`. De ne S 1

'w (M ). The set M 0 is nonempty and open, because all 'i are similitudes and M 0 := w2W M is nonempty and open, and moreover, 'v (M 0) = 'v ( S 'w (M )) = S 'u (M )  M 0 w2W u2vW for v 2 C . Now consider, for v; v0 2 C , v 6= v0, 'v

(M 0) \ '

v

0 0 (M )

= =

[

w2W [

!

'vw (M ) \

w;w0 2W

[

'v0w (M ) w2W ('vw (M ) \ 'v0w0 (M )) :

!

7When restricting our attention to compact sets, we should consider the closure of F (L! ) instead.

22 Since v  w  !n \ v0  w0  !n = ;, neither v  w v v0  w0 nor v0  w0 v v  w. Hence, we have a rst position where v  w and v0  w0 do not coincide, that is, u  i v v  w and u  j v v0  w0 where 1  i < j  n, and we have 'ui(M )  'vw (M ) and 'uj (M )  'v0w0 (M ). Now from the inclusion '`(M )  M (one part of the OSC) and the fact that 'i(M ) \ 'j (M ) = ; we readily obtain our assertion.

Q.E.D.

Together with [15, Theorem 3.11], we immediately obtain dimH (F (L! )) = valdim F (L) if the IFS F : n ! S (X ) satis es the OSC and L is an OSC-code. Our considerations from the previous sections together with Theorem 3 of [2], however, allow to strengthen the mentioned result and to generalize it to not necessarily contractive valuations. In [2] and [25] IFS have been generalized to systems F containing arbitrary similitudes. In order to guarantee the convergence of the sequence (F ([m])(x))m one has to restrict the set of admissible !-words . In [2, Theorem 3] it is shown that the mapping F : (E;  F ) ! (X; E ) is Lipschitz continuous whenever E is a strongly closed nite-state subset of IF F . In connection with this, the following generalization of the Open Set Condition for pairs (F ; E ) satisfying the above mentioned property is introduced. Let M be a nite set of open subsets of (X; E ). To every w 2 n we assign a set Mw 2 M. We say that the assignment is compatible with E i n [ i=1

Mw = ; ! w 62 A(E ) ; 'i(Mwi )  Mw ; and

(25) (26)

'i (Mwi) \ 'j (Mwj ) = ; ; for i 6= j :

(27)

We say that a pair (F ; E ) satis es the Generalized Open Set Condition (GOSC) i E is a nite-state strongly closed subset of IF F and there are a nite set of open subsets of (X; E ), M, and an assignment w 7! Mw 2 M compatible with E . Due to Eq. (25), for every nite-state strongly closed subset F  E the pair (F ; F ) satis es GOSC provided (F ; E ) satis es GOSC. Now Theorem 3 of [2] yields the following estimate of Hausdor dimension and measure.

Theorem 31 Let E be a nite-state strongly closed subset of IF such that (F ; E ) satis es GOSC. Then dimH F (E ) = HA (E) = dim( ) E and, moreover, H (F (E )) > 0 for F

F

F

= HA F(E) .

We proceed with the announced strengthening of the identity dimH (F (L! )) = valdim F (L).

Theorem 32 Let (X; E ) be a Euclidean space, F : n ! S (X ), E be a nite-state and strongly closed subset of IF , and let L  n such that L!  E . Assume the pair (F ; E ) satis es the GOSC. Then dimH (F (L! )) = dim( F ) L! , and provided L is a code, we have dimH (F (L! )) = valdim F (L).

23 Remark. An analogous theorem for IIFS satisfying the OSC (using the notion of topological pressure function) is given in [24, Theorem 3.15]. Confer also [17, Theorem 10]. Proof. Since F : E ! X is Lipschitz, clearly dimH (F (L! ))  dim( ) L! = HL F  valdim F (L). For each of the nite languages Lm = L \ fw 2 n : jwj  mg, the !-language L!m  E is nite-state and strongly closed, hence (F ; L!m) satis es the GOSC, and according to Theorem 31 we have dimH F (L!m) = HA F(Lm ) = HL Fm . Since (Lm)m2IN is an increasing chain of sets with Sm2IN Lm = L, by Theorem 21, limm!1 HL Fm = HL F which in turn equals dim( )(L! ) by Lemma 26. Hence, dimH (F (L! ))  supm2IN dimH (F (L!m )) = dim( )(L! ). The additional assertion in case L is a code follows from Eq. (6).

Q.E.D.

In case of a regular language L we can strengthen our result. Theorem 33 Let (X; E ) be a Euclidean space, F : n ! S (X ), and let L  n be a regular language such that F (w) < 1 for all w 2 L n feg. Then cl(L! )  IF F and dimH (F (cl(L! )) = dimH (F (L! )) = dim( F ) L! . If, moreover, L is a nite union of codes then Hs (F (L! )) = Hs (F (cl L! )) for s 2 [0; 1). Proof. If F (w) < 1 for all w 2 L n feg, then according to Corollary 12 and Property 11 we have F (w)  cjLwj for some cL < 1 and for all w 2 T(L) with jwj  ` for a suitably chosen ` 2 IN. In particular, A()  A(L! )  T(L) implies  2 IF F which proves our rst assertion. Since L is regular, dim( F ) L! = dim( F ) cl(L! ) according to Corollary 28, and the second assertion follows from Theorems 31 and 32. Now, let L be a nite union of codes and = dim( ) L! . Applying the fact that dim( )(A(L)) < = dim( ) L! utilized in the proof of Corollary 29, we have dimH (F ((A(L)) )) < and, therefore, Hs (F ((A(L)) )) = 0 for s  . In case s < we have obviously Hs (F (L! )) = Hs (F (cl L! )) = 1.

Q.E.D.

Remark. In [15, Remark 3.12], the question was raised whether requiring an OSC for each IFS-part Fn = ('1; : : :; 'n) of a given IIFS F = ('1; '2; : : :) is weaker than requiring an OSC for F itself. We can show the following here: If all Fn ful ll an OSC, then F itself does not necessarily satisfy an OSC. Proof. Consider as basic IFS F : 2 ! S (([0; 1]; E )) de ned by F (1)(x) = x=2 and F (2)(x) = x=2+1=2. It is clear that AF = [0; 1]. Consider the suxcode L = fw12jwj : w 2 2g (which is no OSC-code) from [37, Example 1]. Assume the IIFS FL satis es an OSC with test set M . Then, ?F1(M ) 6= ; is open in the topology of (!2 ; 2), and de nes ; 6= W  2 by W !2 = ?F1(M ). We show that W is an OSC-witness for L, contradicting [37]. Assume to the contrary that W is no OSC-witness. Then, condition (23) or (24) is violated. Now, assume there are w; w0 2 W and v; v0 2 L such that  2 v  w  !2 \ v0  w0  !2 6= ;. Then, F () 2 F (v)(M ) \ F (v0)(M ), contradicting that M is a test set. Finally, if v  W  !2 6 W  !2 for some v 2 L, then there is a  2 v  W  !2 such that  62 W  !2 . F () 2 F (v)(M ) implies F () 2 M , since M is a test set. Hence,  2 W  !2 = ?F1(M ), a contradiction.

Q.E.D.


5.2 Calculating Dimensions

For languages L given by unambiguous regular expressions or unambiguous contextfree grammars, we can use the results from Section 2 to evaluate the s-dimensional valuation of L. In combination with the last two theorems, this allows a simple calculation of the Hausdorff dimension of F(L^ω). In the case of regular languages, we just note that taking infima in the definition of the valuation dimension is not necessary.

Theorem 34 Let : n ! (0; 1) be a contractive valuation. Let L  +n be a regular e-free language and = valdim (L). Then, (L) = 1.

Proof. Let s = valdim(L) with respect to the given valuation. If the s-dimensional valuation of L were smaller than 1, then the s-dimensional valuation of L* would be bounded by the convergent geometric series formed by the powers of that value, and hence be finite, contradicting Corollary 15. Q.E.D.

In case of regular expressions, we obtain the following algorithm:

Input: an expression R ∈ UR_n describing a code [R] of nonempty words; a contractive valuation
Output: the Hausdorff dimension dim_H([R]^ω) = dim_H(cl [R]^ω)
Procedure: compute the s ∈ [0, ∞) for which the s-dimensional valuation of R equals 1

We have seen in Theorem 33 that we can apply the above procedure in order to determine the Hausdorff dimension of Euclidean sets, too. Alternatively, it is of course possible to employ the eigenvalue method sketched at the end of Subsection 3.1.

Languages given by unambiguous contextfree grammars might be treated similarly. Unfortunately, we are now forced either to take infima in the definition of valuation dimension or to prove that the t-dimensional valuation of the examined language L is indeed finite and greater than 1 for some t < s (where the dimension s is calculated assuming that the s-dimensional valuation of L equals 1). Namely, consider Kuich's example (see also [36, Example 6.3]) x1 = 1 + 2 x1 x1 x1 defining a prefix code L over the two letters 1 and 2. For the valuation assigning 1/2 to both letters, the s-dimensional valuation of L is infinite iff 2^(-s) > ∛4 / 3 = K ≈ 0.5291, and, for s0 = -ln K / ln 2 ≈ 0.9183, the s0-dimensional valuation of L equals 0.5 ∛4 < 1. An ad hoc approach 1 = 2^(-s) + 2^(-s) would wrongly yield s = 1. Note that the more cautious calculation x = 0.5 + 0.5 x^3 in the case s = 1 would lead to two possible solutions, x = 1 and x ≈ 0.6180, where the latter value perfectly fits into the picture delivered by the s0-dimensional valuation of L, which is ≈ 0.7937, where s0 < 1. In view of Theorem 9, we see that we made the mistake not to choose the minimal solution of the equation x = 0.5 + 0.5 x^3.
5.3 Some Fractals

We show how to apply our results in some examples in the following. As a basic IFS, we take the quadtree IFS F, which assigns to each of the four letters 1, 2, 3, 4 a similitude in S([0,1]^2). The effect of its four mappings is indicated in Figure 1: E.g., F(2) maps the unit square onto its lower right fourth. The words in the center of the subsquares are the prefixes of the addresses of its points. For example, any address of the point marked in Figure 1 starts with 11. Of course, A_F = [0,1]^2 is fairly uninteresting. We remark that V = (0,1)^2 may serve as a test set for OSC. Firstly, taking L = {1, 2, 3}, we get the well-known Sierpinski triangle. Since L is a code, we may set its s-dimensional valuation equal to 1, that is, solve 3 (0.5)^s = 1, delivering s = ln 3 / ln 2 ≈ 1.5850 as the Hausdorff dimension.
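The address map of the quadtree IFS can be sketched in a few lines. The text only fixes that F(2) maps the unit square onto its lower right fourth; the placement of the other three maps below (1 lower left, 3 upper left, 4 upper right) is an assumption read off the labels of Figure 1, and the names `OFFSETS`, `apply_letter` and `apply_word` are illustrative.

```python
# Assumed layout of the quadtree IFS: each map halves the square and shifts it.
OFFSETS = {
    "1": (0.0, 0.0),  # lower left (assumption)
    "2": (0.5, 0.0),  # lower right (stated in the text)
    "3": (0.0, 0.5),  # upper left (assumption)
    "4": (0.5, 0.5),  # upper right (assumption)
}

def apply_letter(letter, point):
    """Apply the similitude belonging to one letter (contraction ratio 1/2)."""
    dx, dy = OFFSETS[letter]
    x, y = point
    return (x / 2.0 + dx, y / 2.0 + dy)

def apply_word(word, point):
    """Apply F(w) = F(w1) o ... o F(wm); the map of the last letter acts first."""
    for letter in reversed(word):
        point = apply_letter(letter, point)
    return point
```

Applying a long prefix of an address to two different starting points, for example `apply_word("1324" * 15, (0.0, 0.0))` and `apply_word("1324" * 15, (1.0, 1.0))`, gives nearly the same result, which reflects that the limit point does not depend on the initial point.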

Figure 1: Quadtree (subsquare labels 1, 2, 3, 4 and, within subsquare 1, the labels 11, 12, 13, 14)

Figure 2: Sierpinski and its variants. Panels: (a) L, (b) [R], (c) M, (d) N

Secondly, consider the regular expression R = (4*3*)(1 ∪ 2). Taking y as an indeterminate, with valuation y for each of the four letters, the valuation of R factors along the unambiguous operations into the valuations of 4*, 3* and 1 ∪ 2, which gives 1/(1−y) · 1/(1−y) · 2y. Substituting y = 2^(-s) and setting this expression equal to 1, we obtain s ≈ 1.9000 as the Hausdorff dimension of F([R]^ω) (and of F(cl [R]^ω) as well).

Thirdly, we consider the language M = {1^i 2^i : i ∈ IN \ {0}} ∪ {1^i 3 : i ∈ IN}. M is generated by the unambiguous linear contextfree grammar given by the following equations (x1 is the start symbol):

x1 = 1 x2 2 + 12 + 1 x3 + 3
x2 = 1 x2 2 + 12
x3 = 1 x3 + 3

Taking y as an indeterminate, with valuation y for each of the four letters, and setting x1 = 1 in the numerical system, we get the following system, which we have to solve with some y ∈ (0, 1):

1 = y^2 x2 + y^2 + y x3 + y
x2 = y^2 x2 + y^2
x3 = y x3 + y

Hence, y = (−1 + √13)/6 ≈ 0.4343, delivering s ≈ 1.2034 as dimension. Since every choice of the parameter y uniquely determines a solution (x1, x2, x3), in view of Theorem 9, our approach letting x1 = 1 is justified. Finally, we can make a similar approach for N = M ∪ {2^i 3 : i ∈ IN}. We obtain ≈ 1.4073 as its dimension.

We like to remark that there is another but related approach connecting languages and fractals (based on an IFS F mapping the letters to similitudes in S(X)): Take an ω-language F and consider the (fractal) set F(F). For example, using regular (or finite-state strongly closed) ω-languages, we obtain in such a way a class of fractals known under different names: Generalized recurrent systems [9], graph directed constructions [25], recurrent IFS [4], MRFS [8], hierarchical IFS [28], see also [27]. By the well-known McNaughton theorem, a regular ω-language F can be represented in the form F = W_1 · V_1^ω ∪ … ∪ W_m · V_m^ω, where the V_i's are regular prefix codes. Hence, dim_H(F(F)) = max {dim_H(F(V_i^ω)) : i = 1, …, m}, where the latter dimensions can be computed easily presuming the V_i's are given by unambiguous regular expressions. In such a way, we obtain another way for determining the Hausdorff dimension of those fractals as well.
Our method is not restricted to the calculation of the Hausdorff dimension of regular ω-languages and their fractal counterparts in Euclidean space: It is well-known that also other ω-languages, e.g. contextfree ω-languages, are of the form F = W_1 · V_1^ω ∪ … ∪ W_m · V_m^ω (cf. [32]). (Here the languages W_i, V_i are not necessarily regular.) It is clear from the formulae derived in this and in the preceding sections that for fractals related to ω-languages of this shape the Hausdorff dimension can be calculated as soon as we are able to calculate the entropy of the corresponding languages with respect to the given valuation, thus leading to a problem related to formal language theory.
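The dimensions stated for the expression R and the language M in this subsection can be re-derived numerically. The following sketch assumes, as in the text, that every letter has valuation y = 2^(-s); the helper names `bisect_root`, `valuation_R`, `valuation_M` and `dimension` are illustrative. The valuation of M is obtained by first solving the two linear equations for x2 and x3 and then evaluating the right-hand side of the equation for x1.

```python
import math

def bisect_root(f, lo, hi, tol=1e-13):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def valuation_R(y):
    """Valuation of R = (4*3*)(1 u 2): two stars and a two-letter union."""
    return (1.0 / (1.0 - y)) * (1.0 / (1.0 - y)) * (2.0 * y)

def valuation_M(y):
    """Right-hand side of the x1 equation after solving for x2 and x3."""
    x2 = y * y / (1.0 - y * y)   # from x2 = y^2 x2 + y^2
    x3 = y / (1.0 - y)           # from x3 = y x3 + y
    return y * y * x2 + y * y + y * x3 + y

def dimension(valuation):
    """Solve valuation(y) = 1 on (0, 1) and convert y = 2**(-s) back to s."""
    y = bisect_root(lambda t: valuation(t) - 1.0, 1e-9, 1.0 - 1e-9)
    return y, -math.log2(y)
```

Calling `dimension(valuation_R)` yields y = 2 − √3 and s close to 1.9000, and `dimension(valuation_M)` yields y = (−1 + √13)/6 and s close to 1.2034.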

6 Conclusions

We presented results connecting formal language theory and fractal geometry. It is intriguing that notions like unambiguity and codes taken from the formal language side can be meaningfully interpreted in fractal geometry. On the other hand, we were led to the concept of a valuation by our research in the area of fractal geometry. It turned out that this concept is also interesting from the purely language theoretical point of view, especially when keeping in mind that, besides combinatorial arguments, concepts relating numbers to words and languages are rarely encountered in the theory of formal languages. There are other connections between formal language theory and fractals, e.g. questions of hierarchies of fractal descriptions inherited by language hierarchies [14, 16] and decidability issues [10]. We think it is promising to pursue further research connecting formal language theory with other parts of mathematics, such as fractal geometry.

References [1] L. M. Andersson. Recursive Construction of Fractals. PhD thesis, Helsinki: Suomalainen Tiedeakatemia, Aug. 1992. Annales Academiae Scientiarum Fennicae Series A, I. Mathematica, Dissertationes, 86. [2] C. Bandt. Self-similar sets 3. Constructions with so c systems. Monatshefte fur Mathematik, 108:89{102, 1989. [3] M. F. Barnsley. Fractals Everywhere. Boston: Academic Press, 1988. [4] M. F. Barnsley, J. H. Elton, and D. P. Hardin. Recurrent iterated function systems. Constructive Approximation, 5:3{31, 1989. [5] J. Berstel and D. Perrin. Theory of Codes. Pure and Applied Mathematics. Orlando: Academic Press, 1985. [6] L. Boasson and M. Nivat. Adherences of languages. Journal of Computer and System Sciences, 20:285{309, 1980. [7] A. Bruggemann-Klein. Regular expressions into nite automata. In LATIN'92, volume 483 of LNCS, pages 87{98, 1992. [8] K. C ulik, II and S. Dube. Methods for generating deterministic fractals and image processing. In LNCS: 464; IMYCS, pages 2{28. Springer, 1990. [9] F. M. Dekking. Recurrent sets: A fractal formalism. Technical Report 82-32, Technische Hogeschool, Delft (NL), 1982. [10] S. Dube. Fractal geometry, Turing machines and divide-and-conquer recurrences. RAIRO Informatique theorique et Applications/Theoretical Informatics and Applications, 28(3{4):405{423, 1994. [11] S. Eilenberg. Automata, Languages, and Machines, Volume A. Pure and Applied Mathematics. New York: Academic Press, 1974. [12] J. L. Encarnac~ao et al., editors. Fractal Geometry and Computer Graphics, Beitrage zur Graphischen Datenverarbeitung des ZGDV (Darmstadt). Springer-Verlag, 1992.

28 [13] H. Fernau. Valuations, regular expressions, and fractal geometry. Submitted for publication, Dec. 1993. [14] H. Fernau. Valuations of languages, with applications to fractal geometry. To appear in Theoretical Computer Science (1995), Volume 143, Sept. 1993 (submitted). [15] H. Fernau. In nite iterated function systems. Mathematische Nachrichten, 169, 1994. [16] H. Fernau. Iterierte Funktionen, Sprachen und Fraktale. Mannheim: BI-Verlag, 1994. [17] H. Fernau and L. Staiger. Valuations and unambiguity of languages, with applications to fractal geometry. In S. Abiteboul and E. Shamir, editors, Automata, Languages and Programming, 21st International Colloquium, ICALP 94, volume 820 of LNCS, pages 11{22, July 1994. ISBN: 3-540-58201-0. [18] C. A. Gunter and D. S. Scott. Semantic Domains, chapter 12, pages 633{674. Elsevier & MIT Press, 1992. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, Volume B, Formal Models and Semantics. [19] A. Habel, H.-J. Kreowski, and S. Taubenberger. Collages and patterns generated by hyperedge replacement. Languages of Design, 1:125{145, 1993. [20] W. Kuich. On the entropy of context-free languages. Information and Control (now Information and Computation), 16:173{200, 1970. [21] W. Kuich and A. Salomaa. Semirings, Automata, Languages, volume 5 of EATCS Monographs on Theoretical Computer Science. Berlin: Springer, 1986. [22] R. Lindner and L. Staiger. Algebraische Codierungstheorie; Theorie der sequentiellen Codierungen, volume 11 of Elektronisches Rechnen und Regeln. Berlin: AkademieVerlag, 1977. [23] B. Mandelbrot. The Fractal Geometry of Nature. New York: Freeman, 1977. [24] R. D. Mauldin and M. Urbanski. Dimensions and measures in in nite iterated function systems. Unpublished manuscript received in december, 1993. [25] R. D. Mauldin and S. C. Williams. Hausdor dimension in graph directed constructions. Transactions of the American Mathematical Society, 309(2):811{829, Oct. 1988. [26] W. Merzenich and L. 
Staiger. Fractals, dimension, and formal languages. RAIRO Informatique theorique et Applications/Theoretical Informatics and Applications, 28(3{ 4):361{386, 1994. [27] M. Nolle. Comparison of di erent methods for generating fractals. To appear in the Proceedings of the IMYCS'92, 1992. [28] H.-O. Peitgen, H. Jurgens, and D. Saupe. Fractals for the Classroom. Part One. Introduction to Fractals and Chaos. New York: Springer, 1992. [29] P. Prusinkiewicz and M. Hammel. Escape-time visualization method for languagerestricted iterated function systems. In Encarnac~ao et al. [12], pages 24{44.

[30] P. Prusinkiewicz and A. Lindenmayer. The Algorithmic Beauty of Plants. New York: Springer, 1990.

[31] L. Staiger. Finite-state ω-languages. Journal of Computer and System Sciences, 27:434-448, 1983.

[32] L. Staiger. Research in the theory of ω-languages. J. Inf. Process. Cybern. EIK (formerly Elektron. Inf.verarb. Kybern.), 23(8/9):415-439, 1987.

[33] L. Staiger. Ein Satz über die Entropie von Untermonoiden. Theoretical Computer Science, 61:279-282, 1988.

[34] L. Staiger. Quadtrees and the Hausdorff dimension of pictures. In A. Hübler et al., editors, "Geobild'89", Proceedings of the 4th Workshop on Geometrical Problems of Image Processing, volume 51 of Mathematical Research, pages 173-178, Georgenthal, 1989. Berlin: Akademie-Verlag.

[35] L. Staiger. Hausdorff dimension of constructively specified sets and applications to image processing. In Topology, Measures, and Fractals (C. Bandt, J. Flachsmeyer and H. Haase, eds.), Proceedings of the Conference on Topology and Measure VI, Warnemünde (Germany), August 1991, volume 66 of Mathematical Research, pages 109-120. Berlin: Akademie-Verlag, 1992.

[36] L. Staiger. Kolmogorov complexity and Hausdorff dimension. Information and Computation (formerly Information and Control), 103:159-194, 1993.

[37] L. Staiger. Codes, simplifying words, and open set condition. Technical Report 94-14, RWTH Aachen Fachgruppe Informatik, 1994. To appear in: Mathematical Linguistics and Related Topics (Gh. Păun, Ed.), The Publishing House of the Romanian Academy of Sciences, Bucharest.

[38] L. Staiger and K. Wagner. Automatentheoretische und automatenfreie Charakterisierungen topologischer Klassen regulärer Folgenmengen. Elektronische Informationsverarbeitung und Kybernetik (now J. Inf. Process. Cybern. EIK), 10(7):379-392, 1974.

[39] I. Sudborough and E. Welzl. Complexity and decidability for chain code picture languages. Theoretical Computer Science, 36:173-202, 1985.

[40] S. Takahashi. Self-similarity of linear cellular automata. Journal of Computer and System Sciences, 44:114-140, 1992.

[41] C. Tricot. Douze définitions de la densité logarithmique. Comptes rendus des séances de l'Académie des Sciences (Paris), série I, 293:549-552, Nov. 1981.

[42] F. v. Haeseler, H.-O. Peitgen, and G. Skordev. Linear cellular automata, substitutions, hierarchical iterated function systems and attractors. In Encarnação et al. [12], pages 3-23.

[43] B. L. van der Waerden. Algebra II, volume 23 of Heidelberger Taschenbücher. Berlin: Springer, 5th edition, 1967.

[44] K. Wicks. Fractals and Hyperspaces, volume 1492 of LNM. Berlin: Springer-Verlag, 1991.

[45] S. J. Willson. Cellular automata can generate fractals. Discrete Applied Mathematics, 8:91-99, 1984.
