Generating Classes of Languages by P Systems and Other Devices

Artiom ALHAZOV
Research Group on Mathematical Linguistics, Rovira i Virgili University
Pl. Imperial Tàrraco 1, 43005 Tarragona, Spain
E-mail: [email protected]
Institute of Mathematics and Computer Science, Academy of Sciences of Moldova
Str. Academiei 5, Chişinău, MD 2028, Moldova
E-mail: [email protected]
Abstract. The purpose of this paper is to give a finite description of the classes of languages (as opposed to the description of languages) by a generative device, like a P system. The definition of the generated class is inspired by the forbidding-enforcing systems. A more general definition of generating a class of languages by two devices is also given and some relationships between these two definitions are studied.
1
Introduction
This paper investigates a generative model for classes of languages, using P systems. One can define a class of languages by a P scheme that generates it. A P scheme is a P system in which the initial multisets are not specified. For every multiset assignment W, the corresponding P system Π(W) generates a language L(W) = L(Π(W)). The language class generated by the P scheme can then be defined as the set {L(W) | W ∈ I}. Here multiple words arise from non-determinism, and multiple languages arise from the choice of the initial multisets. The weak point of this approach is the need to define the set I of initial multisets. Can a class of languages instead be defined by a single P system, with multiple words generated by string replication and multiple languages by non-determinism? Such a definition follows.
2
Definitions
Consider P systems with string replication, see [1]. Applying a rule α → (β_1, tar_1)|| · · · ||(β_r, tar_r) to a string w = xαy (to the occurrence of α between x and y) means deleting one copy of w from the rule's region and adding xβ_k y to the region specified by tar_k, for every k between 1 and r. Let us define the class of languages C(Π) generated by such a P system Π as the set (not the union) of the languages generated along its computational paths, as formalized below.
Definition 2.1 Consider a P system Π with n membranes. At every moment, every region (including region 0, the environment) contains a finite multiset of strings. A configuration of Π is an (n + 1)-tuple M = (F_k)_{0≤k≤n} of the multisets F_k of strings in the regions. One step of computation is denoted by "⇒". Note that if M is a halting configuration, then we can write M ⇒ M.

Let D = {M_0, M_1, · · ·} be an (infinite) derivation (a sequence of configurations) of Π, i.e., M_0 is the initial configuration of Π and M_i ⇒ M_{i+1}, i ≥ 0. Here, M_i = (F_k^(i))_{0≤k≤n}, i ≥ 0. The language L(D) of the derivation D is defined as the limit language of the environment, i.e., L(D) = ∪_{i≥0} supp(F_0^(i)). (The union is used because supp(F_0^(i)) ⊆ supp(F_0^(i+1)), i ≥ 0, as the strings ejected into the environment remain there.)

Let us denote the set of all derivations in Π by 𝒟. The generated class of languages is defined as C = C(Π) = {L(D) | D ∈ 𝒟}. We denote by CSP the family of language classes generated by such P systems with string-objects. Language classes are compared modulo the empty language.
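The replication rule and the limit language of Definition 2.1 can be illustrated with a minimal Python sketch. The function names, the use of plain strings for targets, and the representation of each environment multiset as a `Counter` are assumptions made for illustration only; the sketch handles a finite prefix of a derivation rather than the infinite derivation of the definition.

```python
from collections import Counter


def apply_rule(w, pos, alpha, branches):
    """Apply alpha -> (beta_1, tar_1)||...||(beta_r, tar_r) to the occurrence
    of alpha that starts at index pos in w.

    Returns one (new_string, target) pair per branch; the original copy of w
    is considered consumed. All names here are hypothetical.
    """
    if not w.startswith(alpha, pos):
        raise ValueError("alpha does not occur at the given position")
    x, y = w[:pos], w[pos + len(alpha):]
    return [(x + beta + y, tar) for beta, tar in branches]


def derivation_language(environment_multisets):
    """Union of the supports of the environment multisets F_0^(i)
    over a finite prefix of a derivation."""
    language = set()
    for multiset in environment_multisets:
        language |= set(multiset)
    return language


# One replication step on the string abc, rewriting the b at index 1.
results = apply_rule("abc", 1, "b", [("bb", "here"), ("b", "out")])

# A finite prefix of environment contents along some derivation.
prefix = [Counter(), Counter({"abc": 1}), Counter({"abc": 1, "abbc": 1})]
language = derivation_language(prefix)
```

Because strings ejected into the environment stay there, the supports grow monotonically, which is why taking the union over the prefix is enough.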
3
Examples
Example 3.1 Π0: C(Π0) = {∅}, the class containing only the empty language.
Π0 = (∅, [1 ]1, ∅, ∅): one membrane, no objects, no rules.

Example 3.2 Π1: C(Π1) = {{a^n b^m c^n | n ≥ 1} | m ≥ 1}.
Region 2 (initial string abc): b → bb, b → (b, out)
Region 3: c → (cc, out)
Region 1: a → (aa, in3)||(a, out)

Example 3.3 Π2: C(Π2) = {{a^m b^n c^m | n ≥ 1} | m ≥ 1}.
Region 2 (initial string abc): b → abc, b → (b, out)
Region 1: b → bb||(b, out)
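The behaviour of Π2 from Example 3.3 can be traced with a short Python sketch. It assumes the following reading of the example: the inner region 2 starts with abc and holds the rules b → abc and b → (b, out), and the outer region 1 holds b → bb||(b, out). The function name `pi2_language` and the parameters `m` and `steps` are hypothetical; `m` fixes the nondeterministic choice made in the inner region, and `steps` truncates the infinite derivation.

```python
def pi2_language(m, steps):
    """Words that reach the environment in `steps` steps of the outer region,
    for the derivation of Pi_2 that applies b -> abc exactly m - 1 times.
    The region assignment is an assumed reading of Example 3.3."""
    # Inner region: abc, then b -> abc applied m - 1 times gives a^m b c^m.
    inner = "abc"
    for _ in range(m - 1):
        inner = inner.replace("b", "abc", 1)

    # b -> (b, out) moves the string, unchanged, to the outer region.
    outer = inner
    environment = set()

    # Outer region: b -> bb||(b, out) keeps a copy with one more b and
    # sends the unchanged copy out to the environment.
    for _ in range(steps):
        i = outer.index("b")
        environment.add(outer)
        outer = outer[:i] + "bb" + outer[i + 1:]
    return environment


sample = pi2_language(2, 3)
```

Each value of `m` selects one language of the class, and letting `steps` grow enumerates the words a^m b^n c^m of that language in order of n.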
Also, for any language L ∈ RE, CSP ∋ {{w} | w ∈ L} (the class of singleton languages formed by the words of L), as well as CSP ∋ {L} (the class consisting of the single language L). For the first result, just take a string P system generating L and consider it under the model of generating language classes. For the second, perform a deterministic parallel simulation of the rule applications (idea: use a shuttle E and replace a → α by Ea → aE||αE).
4
Generating important language classes
Theorem 4.1 ARB = P(Σ∗) ∈ CSP
Proof. Consider the following P system Π3 with C(Π3) = ARB.
Region 1 (initial string s): s → s||q, s → ε, q → aq for all a ∈ Σ, q → (ε, out)
This system produces languages with any number of any words. Inclusion in ARB is obvious. Conversely, for any language L ⊆ Σ∗, consider the lexicographic ordering. There is a derivation producing the words in that order, so L ∈ C(Π3). □

Theorem 4.2 FIN ∈ CSP
Proof. Consider the following P system Π4 with C(Π4) = FIN.
Region 1 (initial string sr): s → si, s → l, li → l||d, di → q, q → aq for all a ∈ Σ, q → (ε, out)
The system nondeterministically generates l i^n r for some n ≥ 0. For a fixed n, it nondeterministically produces any set of at most n words. □

Theorem 4.3 REG ∈ CSP
Proof. Consider the following P system Π5 = (O, µ, w1, · · · , w5, R1, · · · , R5):
O = O1 ∪ O2 ∪ {H, J, L, R, S, T, U, V, W, X, Y, Z, i},
O1 = {B, F, I, K} ∪ Σ,
O2 = {Z_a | a ∈ Σ},
µ = [1 [2 [3 ]3 [4 ]4 [5 ]5 ]2 ]1,
w3 = LUKHS; wk = ε, k ≠ 3,
R1 = {Ll → L | l ∈ O1 ∪ {H, Y}} ∪ {LR → (ε, out)},
R2 = {UK → (KU, in5), Ui → (U, in4), Y → (ε, out)},
R3 = {S → FS, S → HS, S → (R, out), T → IT, T → BS} ∪ {S → aIT | a ∈ Σ},
R4 = {Ul → lU | l ∈ O1 ∪ {H′, i}} ∪ {UH → (H′V, out)},
R5 = {Ul → lU | l ∈ O1 ∪ {i}} ∪ {Wl → lW | l ∈ O1 ∪ {H, i}}
  ∪ {UH′ → HU, UH → HV, VaI → Z_a aJ||aVI, WJJ → JWJ, WJI → ZJJ}
  ∪ {WJB → XIB, lZ_a → Z_a l, lZ → Zl} ∪ {lX → Xl | l ∈ O1 ∪ {H, I, i}}
  ∪ {LZ_a → aLW, LZ → LiW, JX → XI, LX → (LU, out), VF → (Y, out)}.

C(Π5) = REG. In region 3 a description of a finite automaton for any regular language is nondeterministically generated; the string then exits to region 2, and from that moment on the words of the corresponding language are generated in a deterministic and parallel way. Region 4 is used to look for the next state, region 5 makes the transitions, and region 1 performs the cleanup and outputs the results. □

Example 4.1 Consider the language (a∗ba)∗a∗, given by a finite automaton with 2 states and 3 transitions: start state q1; transitions q1 a → q1, q1 b → q2, q2 a → q1; final state q1. An example of its description would be LUKHFaIBbIIBHaIBR, where the segments aI, bII and aI encode the transitions, F indicates a final state, B separates the transitions and H separates the states.

We will now give a different, more general definition of generating a class of languages, and then indirectly prove that RE, CF, CS ∈ CSP.
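The description string of Example 4.1 can be reproduced by a small Python encoder. This is a sketch under an assumed reading of the example: the description starts with the prefix LUK, each state opens with H (followed by F if it is final), each transition is written as its symbol followed by the target state number in unary I's and closed by B, and the whole string ends with R. The function name `encode_automaton` and its parameters are hypothetical.

```python
def encode_automaton(num_states, transitions, final_states):
    """Encode a finite automaton with states 1..num_states as a description
    string in the style of Example 4.1 (assumed format, see lead-in).

    transitions: list of (source_state, symbol, target_state) triples.
    final_states: set of final state numbers.
    """
    parts = ["LUK"]
    for state in range(1, num_states + 1):
        parts.append("H")                      # H opens a new state
        if state in final_states:
            parts.append("F")                  # F marks a final state
        for source, symbol, target in transitions:
            if source == state:
                # symbol, target in unary, then B closes the transition
                parts.append(symbol + "I" * target + "B")
    parts.append("R")                          # R ends the description
    return "".join(parts)


# The automaton of Example 4.1 for (a*ba)*a*.
description = encode_automaton(
    2,
    [(1, "a", 1), (1, "b", 2), (2, "a", 1)],
    {1},
)
```

For the automaton of Example 4.1 this yields the string LUKHFaIBbIIBHaIBR given in the text, which supports the assumed reading of the encoding.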
5
Generating Classes by Two Devices
Consider two devices, a generative device DI and a generative device with input DU. More exactly, DI is a mechanism producing a language L(DI) = LI ⊆ Σ∗, later called the initial language, and DU is a mechanism that, given a word w ∈ Σ∗, produces a language L(DU, w) = Lw. Then C(DI, DU) = {Lw | w ∈ LI} is the class of languages generated by DI and DU. One can speak of a more general model, where one device generates a description of the other device; this model also makes sense. However, it requires a universal simulator of the second device given its description, while the definition we use does not. Finally, the family of the classes of languages generated by two formalisms MI and MU is defined as F(MI, MU) = {C(DI, DU) | DI ∈ MI, DU ∈ MU}.

For example, if the first formalism is regular grammars, or better finite automata viewed from the generative point of view, and the second formalism is finite transducers (both operating over subalphabets of Σ), then the family of generated language classes is F(gFA, FT). It is mentioned in [2] that there is no finite automaton A and finite transducer T such that {L(T, w) | w ∈ L(A)} = REG, i.e., REG ∉ F(gFA, FT).

Claim: REG ∈ F(gFA, 2FTc), where 2FTc means two-way-input finite transducers with a counter. Idea: a finite automaton generates the descriptions of all finite automata over some vocabulary Σ, with the states encoded in unary. A transducer then simulates the work of the described finite automaton: it chooses a transition, applies it using the counter to store the state, outputs the symbol, returns to the beginning of the description, and repeats the procedure. It can optionally halt on reaching a final state.
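The definition C(DI, DU) = {Lw | w ∈ LI} can be made concrete with a toy Python sketch. Both devices below are invented for illustration and are not taken from the paper: `d_initial` enumerates the words a^m, and `d_with_input` maps a word w to the finite language {w b^n | 1 ≤ n ≤ bound}. Since the real objects may be infinite, the sketch truncates the initial language to its first `limit` words.

```python
from itertools import islice


def d_initial():
    """Hypothetical initial device D_I: enumerates a, aa, aaa, ..."""
    m = 1
    while True:
        yield "a" * m
        m += 1


def d_with_input(w, bound=3):
    """Hypothetical device with input D_U: the language L_w, truncated
    to the words w b^n with 1 <= n <= bound."""
    return {w + "b" * n for n in range(1, bound + 1)}


def generated_class(initial_language, device_with_input, limit):
    """Finite approximation of C(D_I, D_U) = {L_w | w in L_I}, using only
    the first `limit` words of the initial language."""
    return {
        frozenset(device_with_input(w))
        for w in islice(initial_language, limit)
    }


toy_class = generated_class(d_initial(), d_with_input, 2)
```

The result is a set of languages, not their union: each word of the initial language contributes one element, and two words producing the same language contribute it only once.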
6
Back to P Systems and Turing Machines
Now let us consider MI = gTM and MU = TM, i.e., Turing machines generating languages. The difference is that in the first case the machine starts on a blank tape, while in the second case it starts with a word on the tape. We denote the family of classes of languages generated by Turing machines by F(gTM, TM) = CRE, and call it the family of recursively enumerable classes of languages.

Theorem 6.1 CRE ⊂ CSP
Proof. First, the inclusion is strict, because P(Σ∗) ∈ CSP by Theorem 4.1, while a class in CRE can contain at most countably many languages. We now proceed to the inclusion. Consider two Turing machines AI and AU, defining a class C = C(AI, AU). We will show a P system Π generating C. Consider a two-membrane structure. In the inner membrane, the first Turing machine AI is simulated by rewriting rules. The halting configurations correspond to state-symbol pairs having no associated commands in AI. We add rules that remove the head and move the string into the outer region. In the outer region, the work of AU is simulated in a deterministic parallel way: when the current state-symbol pair has several commands associated with it, the string is replicated the needed number of times and a different command is executed on each copy. For each state-symbol pair having no associated commands, we add rules removing the state and moving the string to the environment. Π generates C ∪ E, where E = ∅ if the work of AI is time-bounded and E = {∅} otherwise. □
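The deterministic parallel simulation used for AU in the proof of Theorem 6.1 can be sketched in Python as follows. This is an illustration of the idea only, not the P system construction itself: configurations are plain tuples, replication is modelled by producing one successor per applicable command, and halted strings are collected in a set standing for the environment. The names `step`, `parallel_run`, `BLANK` and the command format (new state, written symbol, head move of +1 or -1) are assumptions.

```python
BLANK = "_"


def step(config, commands):
    """All successors of one configuration (state, tape, head): one copy
    per command associated with the current state-symbol pair."""
    state, tape, head = config
    successors = []
    for new_state, new_symbol, move in commands.get((state, tape[head]), []):
        new_tape = tape[:head] + (new_symbol,) + tape[head + 1:]
        new_head = head + move
        if new_head < 0:                       # extend the tape to the left
            new_tape = (BLANK,) + new_tape
            new_head = 0
        elif new_head == len(new_tape):        # extend the tape to the right
            new_tape = new_tape + (BLANK,)
        successors.append((new_state, new_tape, new_head))
    return successors


def parallel_run(word, start_state, commands, max_steps):
    """Run all nondeterministic branches in lockstep for max_steps steps.
    Configurations with no applicable command are sent to the environment."""
    active = [(start_state, tuple(word) or (BLANK,), 0)]
    environment = set()
    for _ in range(max_steps):
        next_active = []
        for config in active:
            successors = step(config, commands)
            if successors:
                next_active.extend(successors)  # replicate, one copy per command
            else:
                environment.add("".join(config[1]).strip(BLANK))
        active = next_active
    return environment


# Toy machine: on each a, either keep it or rewrite it to b, moving right.
toy_commands = {("q", "a"): [("q", "a", 1), ("q", "b", 1)]}
outputs = parallel_run("aa", "q", toy_commands, 3)
```

Every branch is advanced in each step, so no nondeterministic choice remains once the input word is fixed, which mirrors the role of string replication in the outer region.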
Corollary 6.1 {CF, CS, RE} ⊂ CSP
It is enough to show {CF, CS, RE} ⊆ CRE. For that, make the first machine MI generate the descriptions of all type-2, type-1 or type-0 grammars, respectively. The second machine MU simulates the derivations. Thus, for any L ∈ RE there is a description w = w(L) with L(MU, w(L)) = L. The first machine supplies the language of descriptions of CF, CS or RE, and the second one then produces the corresponding classes. The existence of such machines can be justified by the Turing thesis.
7
Problems
We list here several questions.
1. Conjecture: REC ∉ CSP.
2. Give examples of formalisms MI, MU satisfying the conditions F(MI, MU) ⊆ P(REG) and F(MI, MU) ∋ REG.
3. Let ∪{L | L ∈ C} = L (*) for a P system Π0; does every subset of L belong to the union of the language classes generated by all P systems satisfying condition (*)?
4. What are the closure properties?
References
[1] Gh. Păun, Membrane Computing. An Introduction, Springer-Verlag, Berlin, Heidelberg, 2002.
[2] Gh. Păun, G. Rozenberg, A. Salomaa, DNA Computing. New Computing Paradigms, Springer, Berlin, Heidelberg, 1998.