Graph Automata for Linear Graph Languages
F.J. Brandenburg and K. Skodinis
University of Passau, 94032 Passau, Germany
e-mail: fbrandenb, [email protected]
Abstract. We introduce graph automata as devices for the recognition of linear graph languages. A graph automaton is the canonical extension of a finite state automaton recognizing a set of connected labeled graphs. It consists of a finite state control and a collection of heads, which search the input graph. In a move the graph automaton reads a new subgraph, checks some consistency conditions, changes states and moves some of its heads beyond the read subgraph. It proceeds such that the set of currently visited edges is an edge-separator between the visited and the yet undiscovered part of the input graph. Hence, the graph automaton realizes a graph searching strategy. Our main result states that finite graph automata recognize exactly the set of graph languages generated by connected linear NCE graph grammars.
1 Introduction
The theory of graph languages is based on generative devices, i.e., on graph grammars. A graph grammar consists of a finite set of productions, which are used to grow graphs or hypergraphs by repeated replacements of nodes or hyperedges. Various types of graph grammars have been introduced in the literature. They differ mainly in the form of the productions and in the embedding mechanisms, see, for example, [Cou90, EKR91, Eng89]. However, the dual to generative devices is still missing. There is no systematic approach to recognizing devices for graph languages. There are no graph automata which fit the major classes of graph grammars. This is a gap in the theory of graph languages. Here we take a first step to fill this gap.
In an early paper, Rosenfeld and Milgram [RM72] introduced web automata, which however have the computational power of Turing machines. Similarly, the approaches by Wu and Rosenfeld [WR79a, WR79b] and Remila [Rem94] on cellular automata have a high computational power in the range of linear bounded automata and have not been studied as duals to important types of graph grammars.
We consider linear graph languages. These are generated by linear NCE graph grammars. Linear graph grammars are special node replacement systems and are investigated in detail in [EL89]. They can be seen as the extension of linear context-free grammars from strings to graphs. The productions of an NCE graph grammar are triples $(A, R, C)$, where $A$ is a nonterminal node label, $R$ is the nonempty graph of the right-hand side and $C$ is the embedding relation. $C$ establishes edges between the nodes of $R$ and the former neighbours of the replaced node. The language of a graph grammar consists of the set of all terminal labeled graphs derivable from the axiom. Such a graph grammar is linear, if the right-hand side graphs have at most one nonterminal node. The theory of graph grammars and their languages has been explored in great detail and depth. We refer the reader to, e.g., [Bra95, Cou90, EKR91, Eng89, EL89, Nag79, RW86].
In this paper we introduce graph automata. They continue the line of finite automata and tree automata. These machines operate on strings and trees. Graph automata work on connected labeled graphs. A graph automaton is a multihead automaton with a finite set of states and a finite set of instructions. It uses its heads for a systematic search of the input graph. In a move, some heads are advanced and read a pre-defined subgraph of the remaining input graph. The automaton scans this subgraph for consistency and changes states. It is inherently nondeterministic. A graph automaton may choose among several next states. This can be made deterministic by the power set construction. However, it must choose the proper subgraph to be read, particularly at the start. A graph automaton accepts if after a sequence of consistent moves the input graph is completely scanned and it has reached a final state.
Although a graph automaton has a finite number of states and finitely many instructions, the number of heads is unbounded. It depends on the size and the structure of the input graph and is bounded from below by the edge-search number [BS91]. The heads are placed on nodes and edges, where they guard nodes and clear edges. At any time, the set of currently visited edges is an edge-separator between the visited and the yet undiscovered part of the input graph. In a move, the border of separating edges continuously moves beyond the read subgraph. Thus already cleared nodes and edges cannot be recontaminated. This amounts to a monotone search strategy, i.e., graph searching without recontamination [BS91, LaP93, MHGJP88]. Hence, graph automata are plans for monotone search strategies on graphs. The search strategies are special. They are given by a finite set of instructions and can be executed by nondeterministic finite state machines. Our main result states that graph automata are equivalent to linear graph grammars and recognize exactly the class of connected linear graph languages.
2 Basic notions
We assume that the reader is familiar with the basic notions from graph theory and from the theory of node-replacement graph grammars. We deal with undirected, connected, node labeled graphs. The approach can be extended to directed graphs with node and edge labels.
Definition 1. Let $\Sigma$ be a finite alphabet. A graph $g = (V, E, m)$ is a simple, undirected, node labeled graph and consists of a finite set of nodes $V$, a set of undirected edges $E$ without self-loops and multiple edges, and a node labeling function $m : V \to \Sigma$. Let $g = (V(g), E(g), m(g))$.
Our graph automata impose a direction on the edges $e = \{u, v\}$, such that $u$ is the "old" node and $v$ is the "new" node. They slide along $e$ from $u$ to $v$ and so they define edge-separators. Directed edges from $u$ to $v$ are written as pairs $e = (u, v)$.
Definition 2. Let $g = (V, E, m)$ be a graph and let $U \subseteq V$ be a subset of the nodes of $g$. The subgraph induced by $U$ is $g|_U = (U, D, n)$, where $D = \{\{u, v\} \in E \mid u, v \in U\}$ and $n(v) = m(v)$ for every node $v \in U$. Let $h = g|_U$. The complementary graph $g - h$ is the subgraph induced by $V - U$. Thus $g - h = g|_{V - U}$. The edge-separator of an induced subgraph $h = g|_U$ is the set of edges between $h$ and its complementary graph, $\mathrm{sep}(g, h) = \{(u, v) \mid \{u, v\} \in E,\ u \in U,\ v \in V - U\}$. These edges are directed from $U$ to $V - U$. Their sources in $U$ are called ports, i.e., $\mathrm{port}(g, h) = \{u \in U \mid (u, v) \in \mathrm{sep}(g, h)\}$. The end nodes in $V - U$ are the neighbours of $h$, $\mathrm{neigh}(g, h) = \{v \in V - U \mid \{u, v\} \in E \text{ and } u \in U\}$. For these notions we may identify an induced subgraph $g|_U$ with its defining set of nodes $U$.
The sets of separator edges and ports can easily be updated, if a subgraph $h$ is extended by another disjoint induced subgraph. Moreover, the separator edges and the ports uniquely determine the induced subgraph of a connected graph.
Lemma 3. Let $h$ and $h'$ be node disjoint induced subgraphs of a graph $g$, i.e., $h = g|_U$ and $h' = g|_{U'}$ with $U \cap U' = \emptyset$. Then the edge-separator and the ports of the subgraph $g|_{U \cup U'}$ induced by $U \cup U'$ are
$\mathrm{sep}(g, h \cup h') = \mathrm{sep}(g, h) \cup \mathrm{sep}(g, h') - \{\{u, u'\} \mid u \in U,\ u' \in U'\}$
and
$\mathrm{port}(g, h \cup h') = \mathrm{port}(g, h) \cup \mathrm{port}(g, h') - \{u \in V(g) \mid \{u, v\} \in E(g) \text{ implies } v \in U \cup U'\}$.
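The notions of Definition 2 and the update rule of Lemma 3 can be checked on a small instance. The following is a minimal sketch, assuming a graph given as a plain edge list; the function names, the helper `closed_nodes`, and the path graph used as input are illustrative choices and not part of the paper.

```python
def sep(edges, U):
    """Separator edges of the subgraph induced by U, directed from U outwards."""
    U = set(U)
    out = set()
    for a, b in edges:
        if a in U and b not in U:
            out.add((a, b))
        elif b in U and a not in U:
            out.add((b, a))
    return out


def port(edges, U):
    """Sources of the separator edges (nodes of U with a neighbour outside U)."""
    return {u for u, _ in sep(edges, U)}


def neigh(edges, U):
    """End nodes of the separator edges (nodes outside U adjacent to U)."""
    return {v for _, v in sep(edges, U)}


def closed_nodes(edges, nodes, U):
    """Nodes all of whose neighbours lie in U: the set subtracted in Lemma 3."""
    U = set(U)
    adj = {x: set() for x in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return {x for x in nodes if adj[x] <= U}


# Hypothetical input: the path 1-2-3-4-5 and two disjoint node sets.
nodes = {1, 2, 3, 4, 5}
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
U1, U2 = {1, 2}, {3}

# Edges between U1 and U2, in both directions, are removed from the separator.
crossing = {(a, b) for a in U1 for b in U2} | {(b, a) for a in U1 for b in U2}
lemma_sep = (sep(edges, U1) | sep(edges, U2)) - crossing
lemma_port = (port(edges, U1) | port(edges, U2)) - closed_nodes(edges, nodes, U1 | U2)
```

On this input both sides of Lemma 3 agree: the separator of the union consists of the single directed edge from node 3 to node 4, and node 3 is the only remaining port.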
Lemma 4. Let $g$ be a connected graph and $h = g|_U$ an induced subgraph of $g$. A node $v$ of $g$ is in $U$ iff $v$ is connected with a port $u \in \mathrm{port}(g, h)$ by a path which contains no separating edge from $\mathrm{sep}(g, h)$. A node $v$ belongs to the complementary graph $g - h$ iff $v$ is connected to some port by a path whose last edge is the only separator edge of that path. These characterizations of $h$ and $g - h$ can be generalized to an inside test such that $v \in U$ iff all paths from a port have an even number of separating edges.
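The first characterization of Lemma 4 suggests a simple way to recover the visited node set from the ports and the separator edges alone. The sketch below assumes an adjacency-dictionary representation and a connected input graph with at least one port; the function name `inside` and the six-node cycle are illustrative only.

```python
from collections import deque


def inside(adj, ports, separator):
    """Return the nodes reachable from a port without crossing a separator edge."""
    blocked = {frozenset(e) for e in separator}
    seen = set(ports)
    queue = deque(ports)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if frozenset((u, v)) in blocked or v in seen:
                continue
            seen.add(v)
            queue.append(v)
    return seen


# Hypothetical input: the cycle 1-2-3-4-5-6-1 with visited part U = {1, 2, 3}.
adj = {1: {2, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5, 1}}
separator = {(3, 4), (1, 6)}   # directed from the visited part outwards
ports = {1, 3}

visited = inside(adj, ports, separator)
unvisited = set(adj) - visited
```

Node 2 is recovered although it is not a port, because it is reached from the ports 1 and 3 without crossing a separator edge.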
Note that Lemma 4 holds if $g$ is connected. Only in this case do the separator edges uniquely describe the subgraph $h$ and its complementary subgraph $g - h$. Therefore we consider connected graphs. In every computation step our graph automaton places its edge-heads on the separator edges between the already visited part $h$ and the not yet visited part $g - h$ of the connected input graph $g$. The graph automaton does not need to store all nodes and edges of $h$. It is sufficient to place its edge-heads on the separator edges between $h$ and $g - h$. Next we review graph grammars from the NCE family of node replacement systems, see, for example, [Cou90, EKR91, Eng89, RW86].
Definition 5. A graph grammar is a tuple $GG = (N, T, P, S)$, where $N$ is the alphabet of nonterminal node labels, $T$ is the alphabet of terminal node labels, $S \in N - T$ is the axiom and $P$ is a finite set of productions of the form $p = (A, R, C)$, where $A \in N$ is the node label of the left-hand side, the right-hand side $R$ is a nonempty graph and the connection relation $C$ consists of pairs $(a, w)$ with $a \in N \cup T$ and $w \in V(R)$. According to their labels we speak of terminal and nonterminal nodes. For a node $w \in V(R)$ let $C^{-1}(w) = \{a \mid a \in N \cup T,\ (a, w) \in C\}$ denote the set of labels in the connection relation of $w$.
There is a natural and well-established graphic notation for the productions, which we shall use throughout. For $p = (A, R, C)$ draw the right-hand side graph $R$ with the nonterminal nodes as unit size squares, terminal nodes as points and (whenever possible) straight line edges. For the left-hand side draw a big rectangle with label $A$ around $R$. Finally, for every connection $(a, w) \in C$ draw a line from the node $w$ of $R$ to an $a$-labeled point outside the big rectangle. This concept helps in understanding the application of productions and the (visual) definition of derivations, see [Bra95, Hic94]. A derivation step means replacing a node $v$ with label $A$ by the right-hand side $R$ and establishing connections between the neighbours of $v$ and the nodes of $R$ as specified by $C$. The language $L(GG)$ generated by GG consists of all terminal graphs that can be derived from the axiom $S$. Instead of a formal definition we give an example for the generation of chains of the form $a^n b^n c^n$, $n \ge 1$. By production $p_3$, a new a-, b- and c-node is generated and each is connected to its left neighbour with the same label. Using $p_2$ the first a- and b-nodes are connected, and the final production $p_4$ connects the last b- and c-nodes.
Example 1. Let $GG = (N, T, P, S)$ be a graph grammar with $N = \{S, A\}$, $T = \{a, b, c\}$ and $P = \{p_1, p_2, p_3, p_4\}$. The productions of GG are shown in Fig. 1.
[Figure 1: drawings of the productions $p_1$, $p_2$, $p_3$, and $p_4$ of GG.]
Fig. 1. The productions of GG
A derivation is illustrated in Fig. 2.
[Figure 2: a derivation from the axiom S using $p_2$, $p_3$ (twice), and $p_4$.]
Fig. 2. A derivation of GG
In the following we restrict ourselves to linear graph languages and consider linear graph grammars in normal form for their generation. The normal form is important for the construction of an equivalent graph automaton. Graph automata have a built-in check for the conditions set by the normal form.
Definition 6. A graph grammar $GG = (N, T, P, S)$ is linear, if the right-hand side of every production has at most one nonterminal node. Then the connection relations consist of pairs $(a, w)$ with terminal node labels $a \in T$. GG is connected, if all graphs of $L(GG)$ are connected. GG is in normal form, if GG is chain-free, context consistent and neighbourhood preserving. Thus for every production $(A, R, C)$, $R$ does not solely consist of a nonterminal node, there is a context describing function $c : N \to P(T)$ with $c(A) = \{a \in T \mid S \Rightarrow^* g \text{ and } v \in V(g) \text{ with } m(g)(v) = A \text{ implies that there is a neighbour } u \in \mathrm{neigh}(g, v) \text{ with } m(g)(u) = a\}$, and for every application of some production $(A, R, C)$ to some node $v$ in $g$ with $S \Rightarrow^* g$ such that $g \Rightarrow g'$ we have $\mathrm{neigh}(g, v) = \mathrm{neigh}(g', R)$. Hence, $c$ records the labels of the neighbours of the nonterminal node $v$, and no edges are lost in a rewriting step. Engelfriet and Leih [EL89] and Rozenberg and Welzl [RW86] have shown that the normal form is no restriction for the generative power of linear and boundary graph grammars, respectively.
Lemma 7. It is decidable whether or not a linear graph grammar is connected. For every linear graph grammar GG there is a linear graph grammar GG' in normal form generating all connected graphs of $L(GG)$, i.e., $L(GG') = \{g \in L(GG) \mid g \text{ is connected}\}$.
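The derivation step described in this section can be made concrete with a small sketch. It assumes a production is stored as a tuple of left-hand label, node labels of $R$, edges of $R$, and connection relation $C$; the names `apply_production`, `derive`, and `path_word` are hypothetical. The four productions encode one reading of the prose description of Example 1; the drawings in Fig. 1 may differ in detail.

```python
import itertools


def apply_production(labels, edges, v, production, fresh):
    """Replace node v by the right-hand side R and embed R according to C."""
    lhs, r_labels, r_edges, C = production
    assert labels[v] == lhs
    rename = {w: next(fresh) for w in r_labels}          # fresh copies of R's nodes
    old_neighbours = [u for e in edges if v in e for u in e if u != v]
    new_labels = {u: l for u, l in labels.items() if u != v}
    new_labels.update({rename[w]: l for w, l in r_labels.items()})
    new_edges = {e for e in edges if v not in e}
    new_edges |= {frozenset((rename[x], rename[y])) for x, y in r_edges}
    for u in old_neighbours:                             # embedding step
        for a, w in C:
            if labels[u] == a:
                new_edges.add(frozenset((u, rename[w])))
    return new_labels, new_edges


ABC = {"a": "a", "b": "b", "c": "c"}
TO_A = [("a", "A"), ("b", "A"), ("c", "A")]              # edges to the nonterminal
SAME = {("a", "a"), ("b", "b"), ("c", "c")}              # connect equal labels
P1 = ("S", ABC, [("a", "b"), ("b", "c")], set())
P2 = ("S", {**ABC, "A": "A"}, [("a", "b")] + TO_A, set())
P3 = ("A", {**ABC, "A": "A"}, TO_A, SAME)
P4 = ("A", ABC, [("b", "c")], SAME)


def derive(n):
    """Derive the chain with n nodes of each label from the axiom S."""
    fresh = itertools.count(1)
    labels, edges = {0: "S"}, set()
    if n == 1:
        return apply_production(labels, edges, 0, P1, fresh)
    labels, edges = apply_production(labels, edges, 0, P2, fresh)
    for step in range(n - 1):
        v = next(u for u, l in labels.items() if l == "A")
        p = P3 if step < n - 2 else P4
        labels, edges = apply_production(labels, edges, v, p, fresh)
    return labels, edges


def path_word(labels, edges):
    """Read the labels along a path graph, starting at its a-labeled end."""
    adj = {u: set() for u in labels}
    for e in edges:
        x, y = tuple(e)
        adj[x].add(y)
        adj[y].add(x)
    cur = next(u for u in labels if len(adj[u]) == 1 and labels[u] == "a")
    word, prev = [], None
    while cur is not None:
        word.append(labels[cur])
        nxt = [w for w in adj[cur] if w != prev]
        prev, cur = cur, (nxt[0] if nxt else None)
    return "".join(word)
```

Under these assumed productions, `derive(3)` yields a path with nine nodes whose labels read `aaabbbccc` from the a-labeled end.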
3 Graph Automata
Before we give a formal definition of a graph automaton let us take a look at finite state automata and tree automata and explore some analogies. Recall the way a nondeterministic finite automaton $A$ works, and suppose that it makes no $\varepsilon$-moves. In each move $A$ reads a new symbol of the input string and changes states. The change of states is determined by the transition function. The automaton accepts, if the input string is completely scanned and it has reached a final state. If $A$ is a one-way automaton, then it reads every letter exactly once. It uses its read-head to separate the already scanned part of the input string from the part yet to be visited. The read-head plays the role of a separator.
Next reconsider a bottom-up tree automaton, see [GS84]. Initially, it marks each leaf of an input tree with a final state. In each computation step a new node $v$ with label $a$ will be marked with some state $s$, if its sons $v_1, v_2, \ldots, v_n$ were marked with the states $s_1, s_2, \ldots, s_n$, respectively, and there is an instruction $(s_1, s_2, \ldots, s_n, a) \to s$. The bottom-up tree automaton accepts an input tree $t$, if the root of $t$ is marked with an initial state. Again, the roots of the already marked subtrees are a separator. The scenario for top-down tree automata is similar, now starting at the root and finishing when all leaves are marked by a final state.
A graph automaton GA consists of a finite state control and a collection of edge-heads. The number of edge-heads is unbounded and depends on the edge-search number of the given input graph [BS91]. The edge-heads mark the current separator edges. The end nodes of the separator edges are the ports of the subgraph visited so far. In each step, the graph automaton reads a new subgraph according to some instruction and checks its compatibility. The compatibility is described by augmented graphs with node labels with three components. It includes a dynamic check for connectivity and neighbourhood preservation. Then the automaton changes states and moves its edge-heads beyond the read subgraph such that they occupy the new separator edges. Graph automata are nondeterministic in several respects. There is a choice of the next state and of the subgraph to be read, and there must be a proper initialization. Otherwise the computation will fail. This is designed such that a graph automaton can reconstruct the derivations of its associated connected linear graph grammar.
An augmented graph $h$ over the base-alphabet $T$ is a node labeled graph $h = (V, E, m)$ such that $m : V \to T \times P(T) \times \{0, 1\}$. For $i = 1, 2, 3$ let $m_i$ denote the projection of $m$ onto the $i$-th component. If $m(v) = (a, X, j)$, then $m_1(v) = a$ is the ordinary node label, $m_2(v) = X$ describes a set of node labels for neighbours of $v$ and $m_3(v) = j$ indicates the existence or nonexistence of other edges. A graph $g$ and an augmented graph $h$ are taken as isomorphic, if they are isomorphic on the ordinary node labels. Moreover, we identify a graph and its isomorphic copy.
Now we are ready to define graph automata and their computations.
Definition 8. A graph automaton $GA = (Q, T, \delta, q_0, F)$ consists of a finite set of states $Q$, the alphabet $T$ of node labels, the start state $q_0 \in Q$, the set of final states $F \subseteq Q$, and the transition function $\delta : Q \times \Gamma \times P(T) \to P(Q)$, where $\Gamma$ is a finite set of augmented graphs. Each such tuple $(q, \gamma, Y) \to q'$ is an instruction of GA.
Let $g = (V, E, m)$ be a connected graph, which is an input to GA. A configuration of GA on $g$ is a pair $K = (q, h)$, where $q \in Q$ is the current state and $h = g|_U$ is an induced subgraph of $g$. Since an induced subgraph $h = g|_U$ of a connected graph $g$ is completely determined by $h$, by the set of vertices $U$, by the edge-separators $S = \mathrm{sep}(g, h)$ and by the set of ports $R = \mathrm{port}(g, h)$, we may replace $h$ by any of $U$, $S$, or $R$. Furthermore, $S = \emptyset$ iff $R = \emptyset$ iff $h = \emptyset$ or $h = g$.
An instruction $(q, \gamma, Y) \to q'$ of a graph automaton defines a computation step on configurations $K \vdash K'$. Let $K = (q, k)$ and let $\gamma = (W, D, m)$ be an augmented graph, such that its projection onto the first component is $\gamma_1 = (W, D, m_1)$. Then the instruction $(q, \gamma, Y) \to q'$ is applicable to $K$ if the following holds: There is an induced subgraph of $g - k$, which is isomorphic to $\gamma_1$. The isomorphic copy of $\gamma_1 = (W, D, m_1)$ is new and is read in this computation step. Let $m(w) = (a, X, j)$ be the augmented node label of a new node $w \in W$.
1. For every node label $b$, $b \in X$ iff there is a port $u \in \mathrm{port}(g, k)$ with label $b$ and an edge $\{u, w\}$ between $u$ and $w$, and for every port $u' \in \mathrm{port}(g, k)$ with label $b$ there is an edge $e = \{u', w\}$ from $u'$ to $w$. Hence, if $e = \{u, w\}$ is an edge between a new node $w \in W$ and some port $u \in \mathrm{port}(g, k)$, then $m(g)(u) \in X$.
2. Moreover, $j = 0$ iff $\{w, w'\} \in E(g)$ implies $w' \in \mathrm{port}(g, k) \cup W$. I.e., $w$ is directly connected only with ports and with other nodes of the read subgraph.
3. Finally, for the third component of the instruction, $b \in Y$ implies that there is a port $u \in \mathrm{port}(g, k)$ with label $b$ and an edge between $u$ and some new node $z \in V(g) - (V(k) \cup W)$, and this holds for all $b$-labeled ports. Conversely, if $e = \{u, z\}$ is an edge from a port $u$ to some new node $z \in V(g) - (V(k) \cup W)$, then $e$ is registered in $Y$ by the label $m(u)$.
If the ports $\mathrm{port}(g, k)$ and the new set of nodes $W$ satisfy the conditions set by $X$, $j$ and $Y$, then it was legal to read the subgraph induced by $W$ and the instruction can be executed. Then $K \vdash K'$ where $K' = (q', k')$ and $k' = k \cup \gamma_1$. As usual, let $\vdash^*$ denote the transitive closure of $\vdash$ such that $K \vdash^* K'$ describes a computation of a graph automaton from configuration $K$ to configuration $K'$. The graph automaton halts, if there is no applicable instruction or if it has deactivated all its edge-heads. Then it may accept. GA starts in the initial state $q_0$ with all its edge-heads deactivated. Thus, the language accepted by final state and deactivation of all edge-heads is $L(GA) = \{g \mid g \text{ is connected},\ (q_0, \emptyset) \vdash^* (q, g) \text{ and } q \in F\}$.
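The three applicability conditions can be spelled out as a check on one candidate embedding of the augmented graph. The sketch below is an illustration under several assumptions: graphs are adjacency dictionaries, the embedding `phi` is supplied by the caller (the nondeterministic choice is not searched for), condition 1 is read as "all ports with a label in X are adjacent to w and no other port is", and the augmented graphs `gamma2` and `gamma4` follow the description of the chain example rather than the original drawings. All names are hypothetical.

```python
def ports(adj, visited):
    """Visited nodes that still have an unvisited neighbour."""
    return {u for u in visited if any(v not in visited for v in adj[u])}


def applicable(labels, adj, visited, gamma, Y, phi):
    """Check conditions 1-3 for instruction (q, gamma, Y) -> q' under embedding phi."""
    gamma_labels, gamma_edges = gamma
    W = set(phi.values())
    if len(W) != len(phi) or W & visited:
        return False                       # the copy must be new and injective
    for w1, (a, _, _) in gamma_labels.items():
        if labels[phi[w1]] != a:
            return False                   # ordinary labels must match
        for w2 in gamma_labels:
            has_edge = frozenset((w1, w2)) in gamma_edges
            if w1 != w2 and has_edge != (phi[w2] in adj[phi[w1]]):
                return False               # induced subgraph must match gamma_1
    P = ports(adj, visited)
    by_label = {}
    for u in P:
        by_label.setdefault(labels[u], set()).add(u)
    for w, (_, X, j) in gamma_labels.items():
        x = phi[w]
        if any(b not in by_label for b in X):
            return False                   # condition 1: a b-port must exist
        for b, us in by_label.items():
            linked = {u for u in us if x in adj[u]}
            if linked != (us if b in X else set()):
                return False               # condition 1: all b-ports or none
        only_ports_or_new = all(y in P or y in W for y in adj[x])
        if (j == 0) != only_ports_or_new:
            return False                   # condition 2
    rest = set(labels) - visited - W
    if any(b not in by_label for b in Y):
        return False                       # condition 3: a b-port must exist
    for u in P:
        if any(z in rest for z in adj[u]) != (labels[u] in Y):
            return False                   # condition 3: Y registers rest edges
    return True


# Hypothetical input: the chain a2-a1-b1-b2-c2-c1 (n = 2).
adj = {"a2": {"a1"}, "a1": {"a2", "b1"}, "b1": {"a1", "b2"},
       "b2": {"b1", "c2"}, "c2": {"b2", "c1"}, "c1": {"c2"}}
labels = {node: node[0] for node in adj}
E = frozenset()
gamma2 = ({"x": ("a", E, 1), "y": ("b", E, 1), "z": ("c", E, 1)},
          {frozenset(("x", "y"))})
gamma4 = ({"x": ("a", frozenset("a"), 0), "y": ("b", frozenset("b"), 0),
           "z": ("c", frozenset("c"), 0)}, {frozenset(("y", "z"))})

step1 = applicable(labels, adj, set(), gamma2, set(),
                   {"x": "a1", "y": "b1", "z": "c1"})
step2 = applicable(labels, adj, {"a1", "b1", "c1"}, gamma4, set(),
                   {"x": "a2", "y": "b2", "z": "c2"})
wrong = applicable(labels, adj, {"a1", "b1", "c2"}, gamma4, set(),
                   {"x": "a2", "y": "b2", "z": "c1"})
```

With the proper first choice both steps are applicable; after reading the wrong c-node first, the last instruction no longer fits, since the remaining b- and c-nodes are not adjacent.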
The operational view of a computation step $K \vdash K'$ by the instruction $(q, \gamma, Y) \to q'$ with $\gamma = (W, D, m)$ is as follows. The graph automaton is in state $q$. Its edge-heads visit the set of separating edges in direction from $k$ to the complementary graph $g - k$. The node-heads guard the ports $\mathrm{port}(g, k)$. Then GA reads an isomorphic copy $g|_W$ of $\gamma_1$ in the rest graph $g - k$. If there exist ports, then every node $w \in W$ is connected to some port by a path whose last and only edge is a separator edge. Thus some heads search the copy of $\gamma_1$ starting from separator edges. Furthermore, if $m(w) = (a, X, j)$ is the augmented node label of some node $w \in W$ and $X \neq \emptyset$, then the isomorphic copy of $w$ is directly connected to all ports $u$ whose label is in $X$, and for each such label $b \in X$ there exists such a port and such an edge. From the viewpoint of a port $u$ with label $b$, every separator edge $e = (u, w)$ is registered by the augmented label $b$ at $w$. If there is such a port, then all ports $u$ with this label are treated the same and are put into one class. If $j = 0$, then all edges incident with $w$ either are separator edges or are edges from $\gamma_1$. Otherwise, if $j = 1$, there is at least one edge from $w$ to a yet unvisited node of $g$. The graph automaton will clear the edges of the read subgraph and the separator edges between the ports of $\mathrm{port}(g, k)$ and the nodes from the read subgraph $\gamma_1$. Some edge-heads will be advanced to the edges between the nodes of $\gamma_1$ and the rest graph. The graph automaton moves some node-heads to those new nodes $w$ of $\gamma_1$ whose augmented label $m(w) = (a, X, j)$ has $j = 1$. These nodes become new ports. It removes node-heads from those old ports whose incident edges are all cleared. These are exactly the ports $u \in \mathrm{port}(g, k)$ whose label $m(u)$ is not recorded in the third component $Y$ of the instruction. If all these checks succeed, then the graph automaton can execute the instruction and it enters the next state $q'$. Example 2.
Let $GA = (Q, T, \delta, q_0, F)$ be a graph automaton with $Q = \{q_0, q_1, q_2, f\}$, $T = \{a, b, c\}$, $F = \{f\}$ and the instructions $(q_0, \gamma_1, \emptyset) \to \{f\}$, $(q_0, \gamma_2, \emptyset) \to \{q_1\}$, $(q_1, \gamma_3, \emptyset) \to \{q_1\}$, and $(q_1, \gamma_4, \emptyset) \to \{f\}$, where $\gamma_i = (U_i, E_i, m_i)$ for $1 \le i \le 4$ are the augmented graphs given below in Fig. 3.
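[Figure 3: drawings of the four augmented graphs on nodes $v_1$, $v_2$, $v_3$; the node labels appearing in the drawings are $(a, \emptyset, 0)$, $(b, \emptyset, 0)$, $(c, \emptyset, 0)$, $(a, \emptyset, 1)$, $(b, \emptyset, 1)$, $(c, \emptyset, 1)$, $(a, \{a\}, 1)$, $(b, \{b\}, 1)$, $(c, \{c\}, 1)$, $(a, \{a\}, 0)$, $(b, \{b\}, 0)$, and $(c, \{c\}, 0)$.]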
Fig. 3. The augmented graphs $\gamma_1$, $\gamma_2$, $\gamma_3$, and $\gamma_4$
On an input $a^n b^n c^n$ with $n \ge 1$, which should be drawn like an "S", GA can only apply the second instruction. It must read the first a-node and the first b-node in the upper left corner of the "S", which are connected by an edge, and it can read any c-node. However, if it does not read the first c-node in the lower corner of the "S", it will run into an error. If $n \ge 3$ and it picks a c-node in the middle, then after the next move there are two c-nodes which are ports, and each of these c-ports must be connected to the other c-nodes. Similarly, if GA picks the last c-node, which is connected to the b-node, there will be two c-nodes which are ports and which must be connected to the other c-nodes. If the proper nodes are read in the first step, then the graph automaton works deterministically and sweeps over the "S" from left to right, reading the next a-, b-, and c-nodes in its next move. Observe that the automaton GA is the associate of the linear graph grammar GG generating $\{a^n b^n c^n \mid n \ge 1\}$. A computation of the automaton is shown in Fig. 4.
[Figure 4: the input graph G with four a-, b-, and c-nodes each, and panels (a) to (d) showing successive configurations of GA on G.]
Fig. 4. A computation of graph automaton GA
Lemma 9. Let $GA = (Q, T, \delta, q_0, F)$ be a graph automaton and let $(q, h) \vdash (q', h')$ be a computation step by some instruction $(q, \gamma, Y) \to q'$. Then GA deactivates all its edge-heads if and only if $Y = \emptyset$ and $j = 0$ for every node $w \in V(\gamma)$ with augmented node label $(a, X, j)$.
We are going to simplify the graph automata. A graph automaton $GA = (Q, T, \delta, q_0, F)$ is quasi-deterministic, if $|\delta(p, \gamma, Y)| = 1$. Using the power set construction known for finite state automata we obtain:
Lemma 10. For every graph automaton $GA = (Q, T, \delta, q_0, F)$ there exists a quasi-deterministic graph automaton $GA' = (Q', T', \delta', q_0', F')$, such that $L(GA) = L(GA')$.
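The power set construction mentioned for Lemma 10 can be sketched as follows. This is an illustration only: augmented graphs are represented by opaque identifiers, the transition table is a dictionary, and undefined transitions are simply left out instead of being routed to a dead state (so every listed transition has exactly one successor). The extra state `q2` in the sample table is hypothetical and was added to make the input nondeterministic.

```python
def determinize(delta, q0, finals):
    """Subset construction over the state component of the instructions.

    delta maps (state, gamma_id, Y) to a set of next states; Y is a frozenset.
    """
    keys = {(g, Y) for (_, g, Y) in delta}
    start = frozenset([q0])
    new_delta, todo, seen = {}, [start], {start}
    while todo:
        S = todo.pop()
        for g, Y in keys:
            T = frozenset(q for p in S for q in delta.get((p, g, Y), ()))
            if T:
                new_delta[(S, g, Y)] = {T}     # exactly one successor set
                if T not in seen:
                    seen.add(T)
                    todo.append(T)
    new_finals = {S for S in seen if S & set(finals)}
    return new_delta, start, new_finals


E = frozenset()
delta = {
    ("q0", "g1", E): {"f"},
    ("q0", "g2", E): {"q1", "q2"},   # hypothetical nondeterministic choice
    ("q1", "g3", E): {"q1"},
    ("q2", "g3", E): {"q2"},
    ("q1", "g4", E): {"f"},
}
new_delta, start, new_finals = determinize(delta, "q0", {"f"})
```

The choice of the subgraph to be read is untouched by this construction; only the choice of the next state is removed.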
Although GA' has no choice for its next state, it is not deterministic in the classical sense. From the example above it is easy to see that a graph automaton must be initialized properly. This is the first place for nondeterminism. Also, there may be a choice of which subgraph should be read. This can easily be seen from a cycle with a-labeled nodes. For an illustration see our example with chains of the form $a^n b^n c^n$. In the first step the graph automaton must choose the nodes with labels a and b which are connected. On the a's and b's it proceeds deterministically and picks the next neighbours. However, it may choose any c but the last, if $n > 0$. But if it picks a c-node in the middle, there will be two c-ports after one step. Each of them must be connected to the next c-node, but these edges do not exist. Hence GA would fail and run into an error.
For our main result on the equivalence of graph automata and connected linear graph grammars the following observation is useful.
Definition 11. Let $p = (A, R, C)$ be a production of a linear graph grammar GG in normal form. The augmented graph $\gamma(p)$ associated with $p$ is the subgraph of $R$ induced by the terminal nodes $W$ of $R$, where these nodes have augmented labels $m(w) = (a, C^{-1}(w), j)$ with $a = m(R)(w)$ and $j = 1$ iff the nonterminal node $v$ of $R$ is directly connected to $w$ by an edge $\{v, w\}$. Moreover, the pair $(\gamma(p), C^{-1}(v))$ completely characterizes the production $p$, where $v$ is the nonterminal node of $R$. Conversely, if GG is in normal form, then $(\gamma(p), C^{-1}(v))$ can be constructed from $p$ and the context describing function. If $R$ is a terminal graph, then $j = 0$ for all augmented node labels and $Y = \emptyset$, where $Y = C^{-1}(v)$, and conversely. This one-to-one correspondence is the key to the equivalence of linear graph grammars and graph automata.
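The passage from a production to its augmented graph can be written down directly. The sketch below assumes the same tuple representation of productions as a left-hand label, node labels of $R$, edges of $R$, and connection relation $C$; it takes the second component of a terminal node $w$ to be the set of labels $b$ with $(b, w) \in C$, which is the reading used in the proof of Theorem 12. The function name and the two sample productions (one reading of $p_3$ and $p_4$ of the chain example) are hypothetical.

```python
def instruction_from_production(production, nonterminals, final_state="f"):
    """Build (A, gamma(p), Y, A') from a linear production p = (A, R, C)."""
    lhs, r_labels, r_edges, C = production
    nts = [w for w, l in r_labels.items() if l in nonterminals]
    v = nts[0] if nts else None                  # at most one nonterminal node
    r_edges = {frozenset(e) for e in r_edges}
    gamma_labels = {}
    for w, a in r_labels.items():
        if w == v:
            continue
        X = frozenset(b for b, x in C if x == w)             # labels connected to w
        j = 1 if v is not None and frozenset((w, v)) in r_edges else 0
        gamma_labels[w] = (a, X, j)
    gamma_edges = {e for e in r_edges if v not in e}         # terminal part of R
    Y = frozenset(b for b, x in C if x == v) if v is not None else frozenset()
    next_state = r_labels[v] if v is not None else final_state
    return lhs, (gamma_labels, gamma_edges), Y, next_state


SAME = {("a", "a"), ("b", "b"), ("c", "c")}
P3 = ("A", {"a": "a", "b": "b", "c": "c", "A": "A"},
      [("a", "A"), ("b", "A"), ("c", "A")], SAME)
P4 = ("A", {"a": "a", "b": "b", "c": "c"}, [("b", "c")], SAME)

i3 = instruction_from_production(P3, {"S", "A"})
i4 = instruction_from_production(P4, {"S", "A"})
```

For the terminal production the third components are all 0 and `Y` is empty, in line with the last remark of Definition 11.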
Theorem 12. For every connected linear graph grammar GG in normal form there is a graph automaton GA such that L(GG) = L(GA).
Proof. For $GG = (N, T, P, S)$ construct the associate graph automaton $GA = (Q, T, \delta, q_0, F)$. Let $Q = N \cup \{f\}$, where $f$ is a new state. Define $F = \{f\}$ and $q_0 = S$. Every production $p = (A, R, C)$ of GG is transformed one-to-one into an associate instruction $(A, \gamma(p), Y) \to A'$, such that $\gamma(p)$ is the augmented graph associated with $p$ and $A'$ is the label of the single nonterminal node $v'$ of $R$, or $A' = f$ if $R$ is a terminal graph. Let $Y = C^{-1}(v') = \{a \in T \mid (a, v') \in C\}$ and $Y = \emptyset$, if $v'$ does not exist.
It remains to prove by induction that derivations of GG translate one-to-one into computations of GA, and vice versa. Let $S \Rightarrow^* h \Rightarrow h' \Rightarrow^* g$ be a derivation of some graph $g \in L(GG)$, where $h \Rightarrow h'$ is obtained by applying a production $(A, R, C)$ to the nonterminal node $v$. Let $U$ be the set of terminal nodes of $h$ and $v$ its nonterminal node. Let $W$ be the set of terminal nodes of $R$, which is nonempty, and let $v'$ be its nonterminal node, which may not exist. Then $V(h) = U \cup \{v\}$ and $V(h') = U \cup W \cup \{v'\}$, or $V(h') = U \cup W$ if $h'$ is terminal. Let $K_0 \vdash^* K \vdash K' \vdash^* K_f$ be the associated computation of GA on $g$, such that $K = (A, k)$ is associated with $h$ iff the following invariant holds.
(*) The graphs $h$ and $k$ coincide on the terminal nodes, i.e., $k = h|_U$, where $U$ is the set of terminal nodes of $h$. If $h$ is not terminal, then $V(h) = U \cup \{v\}$, where $v$ is the nonterminal node of $h$. Then $m(v) = A$; otherwise $A = f$. Moreover, the neighbours of the nonterminal node coincide with the ports of $k$, $\{u \in V(h) \mid \{u, v\} \in E(h)\} = \mathrm{port}(g, k)$. Finally, the labels of these nodes are stored both in the context describing set $c(A)$ of the nonterminal $A$ and in the second components $X$ of the augmented node labels $m(w) = (a, X, j)$ of an applicable instruction.
This invariant holds for $S$ and $K_0$, since all relevant sets are empty. Suppose that (*) holds for $h$ and $K = (A, k)$. If the production $p = (A, R, C)$ is applied to the nonterminal node $v$ of $h$ such that $h \Rightarrow h'$, then $K \vdash K'$ by the instruction $(A, \gamma(p), Y) \to A'$, and $h'$ and $K'$ are associated. To see this, first observe that $(A, \gamma(p), Y) \to A'$ is applicable to $K$. Since GG is chain-free, $W$ and $R|_W$ are nonempty. Hence, $\gamma(p)$ is nonempty. Let $W = V(\gamma(p))$ up to isomorphism and augmented labels.
For $w \in W$ let $m(w) = (a, X, j)$ be its augmented label. Then $b \in X$ iff there is a port $u \in U$ with $m(u) = b$ and $\{u, w\} \in E(g)$ iff there is an edge $\{u, v\} \in E(h)$ and $(b, w) \in C$. This holds for all nodes $u \in U$ with label $b$ which are connected to the nonterminal node $v$, and these are the ports of $k$ with label $b$. Moreover, for $w \in W$ there is an edge $\{w, v'\} \in E(R)$ iff $j = 1$ in the augmented label $m(w) = (a, X, j)$. Finally, $R$ has a nonterminal node $v'$ iff there is an edge $\{u, v'\} \in E(h')$ with $u \in U$ iff $\{u, v\} \in E(h)$ and $(m(u), v') \in C$ iff $u \in \mathrm{port}(g, k)$ and $m(u) \in Y$.
Now, the application of $(A, \gamma(p), Y) \to A'$ to $K$ yields $K' = (A', k')$, where $V(k') = V(k) \cup W$, the nodes have the proper terminal labels, and
$E(k') = E(k) \cup E(R|_W) \cup \{\{u, w\} \mid u \in \mathrm{port}(g, k),\ m(w) = (a, X, j) \text{ and } m(u) \in X\}$.
This coincides with the set of edges of $h'$ between terminal nodes,
$E(h'|_{U \cup W}) = E(h|_U) \cup E(R|_W) \cup \{\{u, w\} \mid u \in U,\ w \in W,\ \{u, v\} \in E(h) \text{ and } (m(u), w) \in C\}$.
Finally,
$\mathrm{port}(g, k') = \{u \in \mathrm{port}(g, k) \mid m(u) \in Y\} \cup \{w \in W \mid m(w) = (a, X, j) \text{ and } j = 1\} = \{u \in V(h') \mid \{u, v'\} \in E(h')\}$.
Hence, the invariant (*) holds for $h'$ and $K'$. Conversely, by the same reasoning, if $K \vdash K'$ by the instruction $(A, \gamma(p), Y) \to A'$ and (*) holds for $K$ and $h$, then $h \Rightarrow h'$ by the application of $p = (A, R, C)$ to the nonterminal node $v$ of $h$, and (*) holds for $K'$ and $h'$. If (*) holds for a terminal graph $g$ and a configuration $K = (A, k)$, then $A = f$ and $k = g$. Hence, $L(GG) = L(GA)$.
Example 3. Consider the connected linear graph grammar GG in normal form of Example 1, which generates the language $a^n b^n c^n$. The equivalent graph automaton $GA = (Q, T, \delta, q_0, \{f\})$ associated with GG is the graph automaton GA of Example 2.
For the converse simulation there is again a one-to-one transformation from instructions to productions. Here, the application conditions for the instructions are translated into the context describing function for the productions. This prevents a misuse of productions. Moreover, disconnectivity must be excluded.
Theorem 13. For every graph automaton GA there is a linear graph grammar GG such that $L(GA) = L(GG)$. Moreover, GG is connected and in normal form.
Proof. Suppose that $GA = (Q, T, \delta, q_0, F)$ has only a single final state $f$, which is reached only at termination, when the edge-heads are deactivated. Furthermore, instructions are excluded which would disconnect a graph when they are translated into productions. This happens when there are no edges to ports, i.e., if for some instruction $(q, \gamma, Y) \to q'$ with $q' \neq t$, the set $Y$ and all sets $X$, where $(a, X, j)$ is the augmented label of the nodes of $\gamma$, are empty. Such instructions are deleted from GA.
Let $GG = (N, T, P, S)$. The set of nonterminals $N \subseteq Q \times P(T)$ consists of pairs of states and sets of terminal node labels and is constructed together with the productions. The second components realize the context describing function. Let $S = (q_0, \emptyset)$. For every instruction $(q, \gamma, Y) \to q'$ of GA there is an associate production $p = (A, R, C)$ of GG. Let $\gamma = (W, E, m)$ be an augmented graph with $m(w) = (a, X, j)$ for every node $w$. Define
$X(\gamma) = \{a \in T \mid a \in X \text{ and } w \in W\}$ and
$Z(\gamma) = \{a \in T \mid m(w) = (a, X, 1) \text{ for } w \in W\}$.
$X(\gamma)$ collects all node labels stored in the second components of the augmented node labels of the nodes of $\gamma$. $Z(\gamma)$ collects all node labels whose third component is set to one. Then $A = (q, X(\gamma) \cup Y)$. If $Z(\gamma) \cup Y \neq \emptyset$, then $R = (W', E', m')$ with $W' = W \cup \{v'\}$ for some new nonterminal node $v'$ and terminal nodes $W$, $m'(w) = m_1(w) \in T$ for $w \in W$, $m'(v') = (q', Z(\gamma) \cup Y)$, and $E' = E \cup \{\{w, v'\} \mid w \in W \text{ and } m(w) = (a, X, j) \text{ with } j = 1\}$. If $Z(\gamma) \cup Y = \emptyset$ and $q' = f$ is the final state, then there is no nonterminal node $v'$, and $W' = W$ and $E' = E$. Let $C = \{(b, w) \mid w \in W,\ m(w) = (a, X, j) \text{ and } b \in X\} \cup \{(b, v') \mid b \in Y\}$.
By construction, GG is chain-free, neighbourhood preserving and context consistent, where the context describing function is the projection onto the second components of the nonterminals. Moreover, GG is connected, since the instructions causing disconnectivity are deleted from GA. It remains to prove that computations of GA correspond one-to-one to derivations of GG. This follows along the lines of the proof of Theorem 12. Let $K_0 \vdash^* K \vdash K' \vdash^* K_f$ be a computation of GA on some connected graph $g$ with $K \vdash K'$ by the instruction $(q, \gamma, Y) \to q'$. Then there is an associated derivation $S \Rightarrow^* h \Rightarrow h' \Rightarrow^* g$ in GG with $h \Rightarrow h'$ by the production $p = (A, R, C)$, and conversely. If the invariant (*) holds for $K$ and $h$, then it does so for $K'$ and $h'$. Hence $L(GA) = L(GG)$.
Example 4. Let GA be the automaton of Example 2. The equivalent linear graph grammar GG associated with GA is shown in Fig. 5. The graph language of GG is exactly $\{a^n b^n c^n \mid n \ge 1\}$.
Combining these results we obtain the equivalence of connected linear graph grammars and graph automata.
Theorem 14. For graph languages of connected graphs the following are equivalent: (1) L is generated by a connected linear graph grammar. (2) L is accepted by a finite graph automaton.
[Figure 5: drawings of the productions $p_1$ to $p_4$ of the associated grammar, with nonterminal labels $(q_0, \emptyset)$ and $(q_1, \{a, b, c\})$.]
Fig. 5. The linear graph grammar associated with GA
Observe that finite graph automata are more powerful than finite state automata when they are used to recognize strings as chains of labeled graphs. Our running example $\{a^n b^n c^n \mid n \ge 1\}$ is a famous witness.
Regarding complexity, given a linear NCE graph grammar GG, the construction of an equivalent graph automaton GA takes polynomial space in the size of GG. Finally, consider the complexity of the membership problem for connected graphs generated by linear NCE graph grammars. The membership problem is known to be in NP, and in NL for linear graph grammars of bounded degree [EL89]. In fact, the membership problems are complete for these classes. These bounds are easy to see for graph automata. A graph automaton runs in linear time. Using a standard representation of graphs, it can be simulated in nondeterministic polynomial time, which shows the inclusion in NP. If the graph grammar has bounded degree, then the constructed graph automaton has a bounded number of edge-heads. Such automata can be simulated in logarithmic space, which shows the inclusion in NL.
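Recall that a graph automaton proceeds such that the currently visited edges form an edge-separator between the visited and the yet undiscovered part of the input graph. As a rough illustration of why the size of this separator matters for the number of edge-heads, the following Python sketch computes the separator size after each step of a given visiting order. The function name `separator_sizes` and the representation of a search as a caller-supplied node order are assumptions; a graph automaton chooses its moves itself and reads whole subgraphs per move.

```python
def separator_sizes(edges, order):
    """Count the edges between visited and unvisited nodes after each step.

    edges: iterable of (u, w) pairs of an undirected graph,
    order: sequence in which the nodes are visited (hypothetical input).
    """
    edges = list(edges)
    visited = set()
    sizes = []
    for v in order:
        visited.add(v)
        # An edge is in the separator iff exactly one endpoint is visited.
        cut = sum(1 for (u, w) in edges if (u in visited) != (w in visited))
        sizes.append(cut)
    return sizes
```

For a chain on four nodes, visiting the nodes from one end to the other keeps the separator at a single edge, whereas a less favourable order needs a larger separator at some step.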
4 Conclusion

Our main result shows that the graph automata introduced in this paper recognize exactly the graph languages of connected linear graph grammars. This is the first step towards a systematic study of various types of graph automata, which closely correspond to the major classes of node-replacement graph grammars. Canonical extensions are pushdown graph automata, which store sets of edge-separators on a stack and correspond to the boundary graph grammars. Another important issue is the restriction to graph automata with a bounded number of edge-heads, which should correspond to graph languages of bounded degree. Finally, we shall investigate deterministic graph automata, which may have a choice for the initialization and then proceed in a unique fashion. Last but not least, we are interested in types of graph automata which realize general graph searching strategies.
References

[BS91] D. Bienstock, P. Seymour. Monotonicity in graph searching. J. Algorithms 12 (1991), 239-245.
[Bra95] F.J. Brandenburg. Designing graph drawings by layout graph grammars. Proc. Workshop on Graph Drawing 94, LNCS 894 (1995), 416-427.
[Cou90] B. Courcelle. Graph rewriting: an algebraic and logic approach. Handbook of Theoretical Computer Science, Elsevier, Amsterdam (1990), 193-242.
[EKR91] H. Ehrig, H.J. Kreowski, G. Rozenberg. Proc. 4. Workshop on Graph Grammars and Their Application to Computer Science, LNCS 532 (1991).
[Eng89] J. Engelfriet. Context-free NCE graph grammars. Proc. FCT 89, LNCS 380 (1989), 148-161.
[EL89] J. Engelfriet, G. Leih. Linear graph grammars: power and complexity. Inform. Comput. 81 (1989), 88-121.
[GS84] F. Gecseg, M. Steinby. Tree Automata. Akadémiai Kiadó, Budapest (1984).
[Hic94] T. Hickl. Rechtwinkliges Layout von hierarchisch strukturierten Graphen. Dissertation, Universität Passau (1994).
[JRW86] D. Janssens, G. Rozenberg, E. Welzl. The bounded degree problem for NLC graph grammars is decidable. J. Comput. System Sci. 33 (1986), 415-422.
[Kau87] M. Kaul. Practical applications of precedence graph grammars. Proc. 3. Workshop on Graph Grammars and their Application to Computer Science, LNCS 291 (1987), 326-342.
[LaP93] A.S. LaPaugh. Recontamination does not help to search a graph. J. Assoc. Comput. Mach. 40 (1993), 224-245.
[MHGJP88] N. Megiddo, S.L. Hakimi, M.R. Garey, D.S. Johnson, C.H. Papadimitriou. The complexity of searching a graph. J. Assoc. Comput. Mach. 35 (1988), 18-44.
[Nag79] M. Nagl. Graph Grammatiken. Vieweg, Braunschweig (1979).
[PR69] J.L. Pfaltz, A. Rosenfeld. Web Grammars. Proc. Joint Intern. Conference on Artificial Intelligence, Washington, D.C. (1969), 609-619.
[Rem94] E. Remila. Fundamental study - Recognition of graphs by automata. Theor. Comput. Sci. 136 (1994), 291-332.
[RM72] A. Rosenfeld, D.L. Milgram. Web automata and web grammars. Machine Intelligence 7 (1972), 307-324.
[RW86] G. Rozenberg, E. Welzl. Boundary NLC graph grammars - Basic definitions, normal forms and complexity. Inform. Control 69 (1986), 136-167.
[WR79a] A. Wu, R. Rosenfeld. Cellular graph automata I. Inform. Control 42 (1979), 305-329.
[WR79b] A. Wu, R. Rosenfeld. Cellular graph automata II. Inform. Control 42 (1979), 330-353.