CLOSED PARTITION LATTICE AND MACHINE DECOMPOSITION

David Lee    Mihalis Yannakakis
Bell Laboratories
600 Mountain Avenue, RM 2C-424
Murray Hill, New Jersey 07974
E-mail: {lee,[email protected]
June 19, 2001

Abstract

Finite state machines are widely used to model systems in diverse areas. Often the modeling machines can be decomposed into smaller component machines, and this decomposition can facilitate the system design, implementation and analysis. Hartmanis and Stearns developed an elegant algebraic theory for machine decomposition that is based on the closed partition lattice of a machine. In this paper we study the computation of the closed partition lattice of finite state machines and its application to their decomposition. We present efficient algorithms for constructing the closed partition lattice and for machine decomposition.

Index Terms: finite state machine, machine decomposition, closed partition lattice.

Corresponding Author: David Lee


1 Introduction

A finite state machine contains a finite number of states and produces outputs on state transitions after receiving inputs. The function of a finite state machine is to compute the next state and output from the current state and input. Often the same function can be fulfilled jointly by two or more smaller machines, the component machines, which provide a decomposition of the finite state machine. Hartmanis and Stearns developed an elegant algebraic theory for this purpose [3].

Finite state machines have been widely used to model systems in diverse areas, including sequential circuits, some types of programs (in lexical analysis, pattern matching, etc.), and, more recently, communication protocols and network systems [6, 2, 7]. However, for most practical systems, the modeling finite state machines have a formidable size with a large number of states. The decomposition of such a machine into smaller component machines can facilitate the system design, implementation and analysis. For instance, a machine decomposition technique was proposed by Wang and Schwartz [11] for the fault management of communication protocols. The technique was further developed for detecting execution errors in communication protocols associated with error control codes [8]. However, several basic problems of machine decomposition have to be resolved before the technique can be applied effectively to practical systems. For instance, given a finite state machine, is it decomposable? If the answer is yes, then how can it be decomposed to meet practical needs, such as a required bound on the size of the component machines?

The original work of Hartmanis and Stearns studied the algebra of machine decomposition. They showed that the decomposition of a machine depends on its closed partition lattice. A partition on the states of a machine is closed if any input maps each block of the partition into a block. All the closed partitions of a finite state machine form a lattice, the closed partition lattice.
This lattice provides all the information for machine decomposition. The original work of Hartmanis and Stearns developed the basic underlying concepts and theory, but did not study the algorithmic aspects of the theory; i.e., it did not explicitly address algorithms and their complexity for the construction of the closed partition lattice and the decomposition of machines. In this paper we study efficient algorithms for machine decomposition. We first investigate the computation of closed partition lattices of finite state machines and then its application to designing efficient algorithms for

machine decomposition. In Section 2, we discuss the basic concepts of finite state machines and their closed partitions. In Section 3, we present our algorithm for the construction of closed partition lattices. In general, the number of closed partitions and the size of the lattice can be much larger (exponentially larger in the worst case) than the size of the finite state machine. In such cases, where the size of the output can exceed the size of the input, it is useful to measure the complexity of an algorithm as a function of both the size of the input (the given machine in this case) and the output (the lattice in this case). Our algorithm constructs the closed partition lattice in time linear in the size of the lattice and polynomial (quadratic) in the size of the machine. In Section 4, we describe the application to machine decomposition. To aid the readers and for completeness, we provide a brief description of machine decompositions, and then we discuss the implications of our algorithms for determining the existence of decompositions and constructing them.

2 Finite State Machine and its Closed Partitions

After a brief description of finite state machines, we outline the basics of closed partitions and the associated lattice operations, introduced in [3], and we present the related computations and their complexity.

Finite State Machine

A finite state machine has a finite number of states; it moves between states upon inputs and produces outputs. Formally, a finite state machine (FSM) M is a quintuple M = (I, O, S, δ, λ), where I is a finite set of input symbols, O is a finite set of output symbols, S is a finite set of states, δ is the state transition function mapping S × I to S, and λ is the output function mapping S × I to O. When the machine is in a current state s in S and receives an input a from I, it moves to the next state specified by δ(s, a) and produces an output given by λ(s, a). An FSM can be represented by a state table, which describes the next-state and output functions. An FSM can also be represented by a state transition diagram: a directed graph whose vertices correspond to the states of the machine and whose edges correspond to the state transitions; each edge is labeled with the input and output associated with the transition.

Example 1 Consider a multiplication table mod 6. The multiplicands are 0, 1, …, 5, represented by the states S = {s0, s1, …, s5}, respectively. The multipliers are 2 and 3, represented by the inputs I = {2, 3}, respectively. The products are recorded by the next states s_i, i = 0, 1, …, 5, and the outputs O = {0, 1, …, 5}. For instance, the entry in row s4 and column 2 indicates that 4 times 2 is 2 mod 6.

S     2      3
s0    s0/0   s0/0
s1    s2/2   s3/3
s2    s4/4   s0/0
s3    s0/0   s3/3
s4    s2/2   s0/0
s5    s4/4   s3/3

□

Given an FSM M = (I, O, S, δ, λ), denote the number of states, inputs, and outputs by n = |S|, p = |I|, and q = |O|, respectively. We extend the transition function δ and output function λ from input symbols to strings as follows. From an initial state s0, an input sequence x = a0 … ak takes the machine successively to states s_{i+1} = δ(s_i, a_i) with the final state δ(s0, x) = s_{k+1}, and produces an output sequence λ(s0, x) = o0 … ok, where o_i = λ(s_i, a_i), i = 0, 1, …, k. Also, we can extend the transition and output functions from a single state to a set of states Q: for an input sequence x, define δ(Q, x) = {δ(s, x) | s ∈ Q} and λ(Q, x) = {λ(s, x) | s ∈ Q}.

A partition π on the state set of an FSM M = (I, O, S, δ, λ) is a set {B1, …, Bk} of disjoint subsets of the state set S whose union is S; that is, ∪_{i=1}^k B_i = S and B_i ∩ B_j = ∅ for i ≠ j. The elements B_i of a partition are called blocks. If two states s and t are in the same block, we write s ≡ t (π). A partition π is closed if each input maps a block of π into a block; that is, δ(s, a) ≡ δ(t, a) (π) for all a ∈ I and s ≡ t (π). Note that this property is independent of the output behavior of machine M. (Closed partitions are called s.p. (substitution property) partitions in [3].)

Example 2 For the machine M of Example 1, π = {{s0, s2, s4}, {s1, s3, s5}} is a closed partition. For instance, δ({s0, s2, s4}, 2) = {s0, s2, s4} and δ({s0, s2, s4}, 3) = {s0} ⊆ {s0, s2, s4}. □
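As an illustration, the closedness condition can be phrased directly in code. The following is a minimal sketch (not the paper's code; the function names and the encoding of states as the integers 0–5 are my own choices), using the machine of Example 1:

```python
# The machine of Example 1 as a transition function, and a direct check of
# the closedness condition: delta(s, a) and delta(t, a) must lie in a
# common block whenever s and t share a block of pi.

def delta(s, a):
    return (s * a) % 6          # multiplication mod 6, as in Example 1

INPUTS = [2, 3]

def block_of(pi, s):
    # the unique block of partition pi containing state s
    return next(B for B in pi if s in B)

def is_closed(pi, delta, inputs):
    # pi is a list of disjoint sets of states covering the state set
    for B in pi:
        for a in inputs:
            image = {delta(s, a) for s in B}
            # closed iff the whole image lies inside one block
            if not image <= block_of(pi, next(iter(image))):
                return False
    return True

pi = [{0, 2, 4}, {1, 3, 5}]     # the closed partition of Example 2
print(is_closed(pi, delta, INPUTS))   # True
```

Note that the image of a block may land in a different block than the block itself; closedness only requires that it not straddle two blocks.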

We first discuss some standard properties and operations of partitions, not necessarily closed. Given partitions π and τ, π is greater than or equal to τ, denoted π ≥ τ, if each block of τ is contained in a block of π. Informally, π is a coarser partition than τ. Obviously, the relation is a partial order on the partitions, i.e., it is reflexive, antisymmetric, and transitive. With this partial order, the smallest partition is the zero partition 0, where each state is in a block by itself, i.e., each block is a singleton set, and the largest partition is the trivial partition 1 = {S}, where all the states are in one block.

Given partitions π and τ of S, the greatest lower bound (GLB), denoted by π · τ, is the partition such that for s and t in S, s ≡ t (π · τ) if and only if s ≡ t (π) and s ≡ t (τ). Note that π · τ is the largest partition which is smaller than π and τ, hence the name GLB. It is obvious that we can simply take the intersections of the blocks of π and τ to obtain π · τ: specifically, for each block of π we split it by its intersections with all the blocks of τ. From its computation, π · τ is often called the intersection of π and τ. Obviously, if π and τ are closed partitions then their intersection is also closed.

Before describing the computational aspects, we first briefly describe the representation of partitions. There are several natural ways. One way is to use an array A on the set of states to represent a partition π: assume that the states are numbered from 1 to n and the blocks are likewise numbered from 1 to the cardinality of the partition; then A[i] is the number of the block that contains state i. Another way is to use lists: each block is represented by a list of its elements, and a partition π is represented by a list of the blocks (i.e., it is a list of lists). In order to represent more succinctly partitions that have many singleton blocks, we can use a variant of the list representation where we include in the list only the nonsingleton blocks; states that are not included in the list are then deduced to belong to singleton blocks. A third way is to represent a block as a string of the states in the block (or their numbers), and to represent a partition as a string which is the concatenation of the block strings separated by a separator symbol, e.g. "|"; again, for succinctness we need only include the nonsingleton blocks. The array representation has the advantage that it is easy (i.e., O(1) time) to find the block of a given state, and to determine whether two states belong to the same block, while the list representation has the advantage that it is easy to list all the members of a block. It is straightforward to convert between the different representations in O(n) time using standard techniques. Since the algorithms we discuss below for operations on partitions take at least linear time, the representation of the given partitions is not important, i.e., the statements apply to any of the common representations.

The intersection of partitions can be computed easily in linear time. Assume for concreteness that both partitions π and τ are represented using arrays. Each state has a pair of block numbers, from π and τ respectively. Clearly, two states are in the same block of the intersection π · τ if and only if they have the same pair of block numbers. Hence the states can be partitioned using a lexicographic (bucket) sort of the pairs, which takes time O(n); see e.g. [1]. Therefore,

Proposition 1 Given two partitions π and τ of the states of an FSM, their GLB π · τ can be computed in time O(n), where n is the number of states. Furthermore, if both π and τ are closed partitions then so is π · τ. □

Example 3 Let π = {{s0, s3}, {s1, s5}, {s2, s4}} and τ = {{s0, s2, s4}, {s1, s3}, {s5}}. Numbering the blocks in each partition, the states have the pairs given in the following table.

S     pair
s0    (1, 1)
s1    (2, 2)
s2    (3, 1)
s3    (1, 2)
s4    (3, 1)
s5    (2, 3)

The lexicographic sorting of the pairs yields the following order, with the corresponding buckets (which are the blocks of π · τ): (1, 1): {s0}, (1, 2): {s3}, (2, 2): {s1}, (2, 3): {s5}, (3, 1): {s2, s4}.

□
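The pair-grouping idea behind Proposition 1 can be sketched as follows (the helper names are my own; Python's hashed dictionary stands in for the lexicographic bucket sort, so the O(n) bound here is expected rather than worst-case):

```python
# GLB of two partitions: label each state with its pair of block numbers
# and group states that share a pair, as in Example 3.

def glb(pi, tau, states):
    def labels(partition):
        # map each state to the index of its block
        lab = {}
        for i, B in enumerate(partition):
            for s in B:
                lab[s] = i
        return lab
    lp, lt = labels(pi), labels(tau)
    buckets = {}
    for s in states:
        buckets.setdefault((lp[s], lt[s]), set()).add(s)
    return list(buckets.values())

pi  = [{0, 3}, {1, 5}, {2, 4}]          # the partitions of Example 3
tau = [{0, 2, 4}, {1, 3}, {5}]
print(sorted(sorted(B) for B in glb(pi, tau, range(6))))
# [[0], [1], [2, 4], [3], [5]]
```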

Suppose that π and τ are closed partitions of S. Let π_i, i = 1, 2, be closed partitions with π_i ≥ π and π_i ≥ τ. Then the intersection π_1 · π_2 is closed, and π_1 · π_2 ≥ π and π_1 · π_2 ≥ τ. Taking the intersection of all the closed partitions that are greater than or equal to both π and τ, we obtain a closed partition, denoted by π + τ, with π + τ ≥ π and π + τ ≥ τ. It is the smallest closed partition which is greater than or equal to π and τ, and is called the least upper bound (LUB) (sometimes also called the sum) of π and τ [3].

In summary, the set of all the closed partitions of an FSM M forms a lattice L_M under the partition partial ordering, which contains the zero and trivial partitions 0 and 1; it is called the closed partition lattice of machine M. The lattice is typically represented by its Hasse diagram: a directed graph whose nodes are the elements of the lattice and whose edges are the pairs (π, τ) such that π > τ and there is no other element σ such that π > σ > τ. We discuss next the computation of the LUB and of partition closures; they are the basic operations for the construction of closed partition lattices.

Consistent Partitions

Let Γ = {B1, …, Bk} be a set of (non-empty) blocks of states, with ∪_{i=1}^k B_i = S. (This assumption is only for convenience; otherwise, we can add the block S − ∪_{i=1}^k B_i to Γ.) Blocks in Γ may not be disjoint and hence may not form a partition of S. A partition π is consistent with Γ if each block in Γ is a subset of a block of π. Obviously, the trivial partition 1 is consistent with Γ. On the other hand, the intersection of two consistent partitions is also consistent. Taking the intersection of all the partitions consistent with Γ, we obtain the smallest partition consistent with Γ, denoted by π_Γ. As an example, referring to the FSM of Example 1, if Γ = {{s0, s3}, {s1, s5}, {s2, s4}, {s0, s2}}, then the smallest partition consistent with Γ is π_Γ = {{s0, s2, s3, s4}, {s1, s5}}.

Computing the smallest partition consistent with a given set of blocks of states is a useful operation. Observe that if two blocks in Γ intersect each other then they must be contained in the same block of any consistent partition, and hence also of π_Γ. Consequently, after we merge two intersecting blocks of Γ and obtain a new set of blocks Γ′, the smallest consistent partition remains the same, i.e., π_Γ = π_Γ′. Consequently, we can repeatedly merge blocks that intersect each other until no two blocks intersect, and we obtain a partition, which is the smallest consistent partition π_Γ. Specifically, starting from a block B, we merge all the blocks that intersect B into a new block B′. Meanwhile we remove the merged blocks from Γ. We repeat the process and merge all the blocks that intersect B′ until no more merging is possible. We obtain a block that does not intersect any remaining block in Γ, and that gives a block of π_Γ. We repeat the same process for the remaining blocks in Γ and obtain π_Γ. One way to think of this process is as computing the connected components of the hypergraph H_Γ with node set S and set of


hyperedges Γ, or (equivalently), computing the components of the bipartite graph G_Γ with node set S ∪ Γ and edge set {(s, B_i) | s ∈ B_i}. It is easy to see that the blocks of the smallest consistent partition π_Γ are the sets of states that are in the same component of the hypergraph H_Γ or the graph G_Γ. It takes time linear in the number of edges and nodes to compute the connected components of the graph G_Γ [1]. There are n + k nodes and Σ_{i=1}^k |B_i| edges, and we have:

Proposition 2 Given a set of non-empty blocks of states Γ = {B1, …, Bk} with ∪_{i=1}^k B_i = S, the smallest partition consistent with Γ can be constructed in time O(Σ_{i=1}^k |B_i|), where |B_i| is the cardinality of block B_i. □
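The merging process behind Proposition 2 can be sketched with a union-find structure, which plays the role of the component computation on G_Γ (the names below are my own, not the paper's):

```python
# Smallest partition consistent with a set of blocks Gamma: states in
# overlapping blocks end up in the same union-find class, mirroring the
# connected components of the bipartite graph G_Gamma.

def smallest_consistent(gamma, states):
    parent = {s: s for s in states}
    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]   # path halving
            s = parent[s]
        return s
    def union(s, t):
        parent[find(s)] = find(t)
    for B in gamma:
        first = next(iter(B))
        for s in B:
            union(first, s)                  # collapse each block to one class
    blocks = {}
    for s in states:
        blocks.setdefault(find(s), set()).add(s)
    return list(blocks.values())

gamma = [{0, 3}, {1, 5}, {2, 4}, {0, 2}]     # the Gamma of the text
print(sorted(sorted(B) for B in smallest_consistent(gamma, range(6))))
# [[0, 2, 3, 4], [1, 5]]
```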

LUB of Closed Partitions

Given two closed partitions π and τ, we take the set Γ of all the blocks in π and τ, and compute the smallest partition π_Γ consistent with Γ. Obviously, π_Γ ≤ π + τ. To see that π_Γ is closed, note from its computation that if two states s, t are in the same block of π_Γ, then there is a sequence of states q1, …, qk with q1 = s and qk = t such that any two consecutive states q_i, q_{i+1} in the sequence are in the same block of π or τ. Since the two given partitions are closed, for any input a the corresponding next states δ(q_i, a) and δ(q_{i+1}, a) are also in the same block of π or τ. Thus, δ(s, a) and δ(t, a) are in the same block of π_Γ. Therefore, π_Γ is closed, and hence π_Γ = π + τ. Since both π and τ are partitions of the n states of S, it takes time O(n) to compute π_Γ = π + τ.

Proposition 3 The LUB of two closed partitions can be computed in time O(n). □
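A sketch of the LUB computation: collect the blocks of both partitions and merge intersecting blocks to a fixed point. For clarity this version is quadratic; the component-based method of the text achieves O(n). The two example partitions below are closed partitions of the Example 1 machine (verified by hand, not taken from the paper):

```python
# LUB (sum) of two closed partitions: the smallest partition consistent
# with the union of their block sets, computed by a naive fixed-point
# merge of intersecting blocks.

def lub(pi, tau):
    blocks = [set(B) for B in pi] + [set(B) for B in tau]
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                if blocks[i] & blocks[j]:
                    blocks[i] |= blocks.pop(j)   # merge overlapping blocks
                    merged = True
                    break
            if merged:
                break
    return blocks

pi  = [{0, 3}, {1, 4}, {2, 5}]       # closed for the Example 1 machine
tau = [{0, 2, 4}, {1, 3, 5}]         # the closed partition of Example 2
print(sorted(sorted(B) for B in lub(pi, tau)))  # [[0, 1, 2, 3, 4, 5]]
```

Here the sum is the trivial partition 1, showing that the LUB of two nontrivial closed partitions can be trivial.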

Partition Closure

Recall that a partition is closed if for each block B and input a, the image block δ(B, a) is contained in a block. Given a partition π, taking the intersection of all the closed partitions that are greater than or equal to π, we obtain the smallest closed partition π̄ ≥ π, called the closure of π. We now discuss the computation of the closure of a partition π. Suppose that there is a block B ∈ π and an input a such that δ(B, a) intersects more

than one block C_i, i = 1, …, k, of π. Since π̄ ≥ π and π̄ is closed, there is a block C of π̄ with δ(B, a) ⊆ C. Consequently, C intersects, and hence contains, each C_i, i = 1, …, k. Therefore, we merge all the intersected blocks C_i, and the resulting partition remains smaller than or equal to π̄. We repeat the same process for all the blocks and inputs until no merge is possible, and we obtain a closed partition π′ with π ≤ π′ ≤ π̄. Since π̄ is the smallest closed partition with π̄ ≥ π, we have π′ = π̄, and we have constructed the closure of π. The procedure is similar to that of computing a consistent partition; here we merge blocks that intersect an image block δ(B, a) instead. Note that for computing a consistent partition we examine each block once. However, for the construction of the closure we may examine a block, or its containing block after merging, more than once, whenever it intersects an image block. Therefore, a state of a block is examined repeatedly whenever the containing block is processed. Yet this is unnecessary, for the following reason. After a block B has been examined with all the inputs a, the image blocks δ(B, a) are contained in the merged blocks; from then on δ(B, a) will never intersect more than one block. Consequently, we can shrink block B into one state, and we do not have to examine each state of B again. The appendix contains pseudo-code (Algorithm 1) for the computation and the data structures for an efficient implementation. The following proposition gives its complexity. Recall that α(m, n) is the inverse Ackermann function, a very slowly growing function that is for all practical purposes upper-bounded by 4.

Proposition 4 Given a partition π of the states of a finite state machine, the closure of π can be computed in time O(pn α(pn, n)), where p is the number of inputs and n the number of states. □
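The closure computation can be sketched as a naive fixed-point merge; the worklist organization, the block shrinking, and the union-find structure that yield the O(pn α(pn, n)) bound of Proposition 4 are omitted, and the names are my own:

```python
# Closure of a partition: repeatedly merge all blocks that intersect a
# common image delta(B, a), until the partition is closed.

def closure(pi, delta, inputs):
    blocks = [set(B) for B in pi]
    changed = True
    while changed:
        changed = False
        for B in list(blocks):
            for a in inputs:
                image = {delta(s, a) for s in B}
                touched = [C for C in blocks if C & image]
                if len(touched) > 1:
                    # the image straddles several blocks: merge them
                    merged = set().union(*touched)
                    blocks = [C for C in blocks if not (C & image)] + [merged]
                    changed = True
                    break
            if changed:
                break
    return blocks

def delta(s, a):
    return (s * a) % 6                     # the machine of Example 1

# closure of the minimal partition putting s1 and s5 together
print(sorted(sorted(B) for B in closure([{1, 5}, {0}, {2}, {3}, {4}],
                                        delta, [2, 3])))
# [[0], [1, 5], [2, 4], [3]]
```

On input 2 the block {1, 5} maps to {2, 4}, which forces the merge of the singletons {2} and {4}; afterwards every image lies within a single block.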

3 The Closed Partition Lattice

We can construct the closed partition lattice of an FSM as follows. We start with the minimal closed partitions other than the zero partition, called the basis. Then we build the lattice bottom up.

Basis of the Closed Partition Lattice

The basis of a closed partition lattice consists of all the closed partitions, other than the zero partition, that do not properly contain any other closed partition except 0; i.e., they are the parents of the bottom element of the partition lattice. Informally, they are the minimal, non-trivial closed partitions. The basis plays an important role in constructing the closed partition lattice and in machine decomposition. We discuss its computation next.

A partition (not necessarily closed) is minimal if all its blocks are singletons except for one block that has two states; a minimal partition is characterized by its two-state block. Obviously, a basis element of the closed partition lattice is the closure of a minimal partition, but the converse is not true in general. To find all the basis elements, one approach is the following: first construct the closures of all the minimal partitions, and then delete those which are redundant or properly contain another constructed closed partition. This takes in general at least cubic time: there are O(n²) pairs of states, and generating the closed partition for each of them takes time proportional to pn α(pn, n). We give a quadratic algorithm in the following.

For a pair of states s_i, s_j, let π_ij denote the smallest closed partition in which s_i and s_j are in the same block; i.e., π_ij is the closure of the minimal partition whose only two-state block is {s_i, s_j}. Suppose that for some input a, we have δ(s_i, a) = s_p and δ(s_j, a) = s_q. Then s_p and s_q must be in the same block of π_ij, because π_ij is closed and s_i and s_j are in the same block. Since π_pq is the smallest closed partition with s_p and s_q in the same block, it follows that π_ij ≥ π_pq.

Lemma 1 Let π_ij and π_pq be the smallest closed partitions corresponding to the pairs of states {s_i, s_j} and {s_p, s_q}, respectively. If there is an input a such that δ(s_i, a) = s_p and δ(s_j, a) = s_q, then π_ij ≥ π_pq. □

Construct a directed graph G with all the unordered pairs of distinct states as nodes: {s_i, s_j}, i ≠ j. There is an edge from node {s_i, s_j} to node {s_p, s_q} if there is an input a such that δ(s_i, a) = s_p and δ(s_j, a) = s_q, or vice-versa, i.e., δ(s_i, a) = s_q and δ(s_j, a) = s_p. The graph G is called the implication graph.

Example 4 Figure 1 shows the implication graph for the finite state machine of Example 1. Each node in the figure corresponds to a pair of states {s_i, s_j} of the FSM and is labeled for clarity by the two indices ij. As an example, consider the node 15 in the figure, corresponding to the pair of states {s1, s5}. On input 2, state s1 transitions to s2 and state s5 transitions to s4; hence node 15 has an edge to 24. On input 3, both states s1, s5 transition to the same state s3; hence this input does not give rise to an edge in the implication graph, because the graph includes nodes corresponding only to pairs of distinct states. □

[Figure 1: Implication graph. Its nodes are the unordered state pairs 01, 02, 03, 04, 05, 12, 13, 14, 15, 23, 24, 25, 34, 35, 45.]
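Building the implication graph is straightforward; below is a sketch for the machine of Example 1 (the adjacency-set representation keyed by frozen state pairs is my own choice):

```python
# Implication graph G: nodes are unordered pairs of distinct states;
# there is an edge {si,sj} -> {sp,sq} whenever some input maps si, sj to
# the distinct states sp, sq.

from itertools import combinations

def implication_graph(states, inputs, delta):
    G = {frozenset(p): set() for p in combinations(states, 2)}
    for si, sj in combinations(states, 2):
        for a in inputs:
            sp, sq = delta(si, a), delta(sj, a)
            if sp != sq:                     # equal images give no edge
                G[frozenset((si, sj))].add(frozenset((sp, sq)))
    return G

def delta(s, a):
    return (s * a) % 6                       # the machine of Example 1

G = implication_graph(range(6), [2, 3], delta)
# node 15 has exactly one edge, to 24, as described in Example 4
print(G[frozenset((1, 5))] == {frozenset((2, 4))})   # True
```

Self-loops are possible: input 3 maps the pair {s0, s3} back to itself, so node 03 has an edge to itself.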

From Lemma 1, if there is a path from node {s_i, s_j} to node {s_p, s_q}, then the corresponding closed partitions satisfy π_ij ≥ π_pq, and, consequently, π_ij is not a basis element of the closed partition lattice unless π_ij = π_pq and π_pq is a basis element. Therefore, each basis element is generated by a pair {s_p, s_q} that belongs to a strongly connected component (SCC) of G without outgoing edges, a bottom SCC. The converse is, however, not true in general: the closure of a minimal partition of a node in a bottom SCC may not be a basis element. Note that all the nodes in an SCC generate the same closed partition, by Lemma 1. We first construct the closed partitions from all the bottom SCC's and then delete those which are either not basis elements or redundant. We show first how to construct, for any state pair {s_i, s_j}, the corresponding closed partition π_ij using the implication graph.

Lemma 2 For a pair of states {s_i, s_j}, let H[i, j] be the undirected graph on the set of states in which two states s_p, s_q are connected by an edge iff the implication graph contains a path from {s_i, s_j} to {s_p, s_q}. Then the closed partition π_ij corresponding to {s_i, s_j} is equal to the partition of the connected components of H[i, j].

Proof: If the implication graph contains a path from {s_i, s_j} to {s_p, s_q}, then s_p and s_q must be in the same block of the partition π_ij. It follows that each connected component of H[i, j] is contained in a block of π_ij. It suffices to prove that the connected component partition is closed. We need to show that for any two states s_p and s_q in a component (block) and for any input a, δ(s_p, a) and δ(s_q, a) are in the same component (block). If there is an edge (s_p, s_q) in H[i, j], then {s_p, s_q} is a node of G reachable from {s_i, s_j}, and either δ(s_p, a) = δ(s_q, a) or {δ(s_p, a), δ(s_q, a)} is also a node reachable from {s_i, s_j}, and hence {δ(s_p, a), δ(s_q, a)} is an edge of H[i, j]; in either case δ(s_p, a) and δ(s_q, a) are in the same component (block). Suppose there is no edge (s_p, s_q) in H[i, j]. Since s_p and s_q are in the same connected component, there is a path in the component from s_p to s_q: s_p, t_1, t_2, …, t_k, s_q. From the above argument, for any input a, δ(s_p, a) and δ(t_1, a) are in the same component, and δ(t_1, a) and δ(t_2, a) are in the same component; hence δ(s_p, a) and δ(t_2, a) are in the same component. Inductively, it follows that δ(s_p, a) and δ(s_q, a) are in the same component. The lemma follows. □
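Lemma 2 suggests a direct computation of π_ij: a BFS over the implication graph from {s_i, s_j} collects the edges of H[i, j], and a component pass yields the partition. A sketch (function names are mine):

```python
# pi_ij = connected-component partition of H[i,j], whose edges are the
# pairs reachable from {si, sj} in the implication graph (Lemma 2).

from collections import deque

def delta(s, a):
    return (s * a) % 6                    # the machine of Example 1

def pi_ij(i, j, states, inputs, delta):
    # BFS over the implication graph from node {i, j}
    start = frozenset((i, j))
    seen, queue = {start}, deque([start])
    while queue:
        si, sj = tuple(queue.popleft())
        for a in inputs:
            sp, sq = delta(si, a), delta(sj, a)
            if sp != sq:
                nxt = frozenset((sp, sq))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    # connected components of H[i,j]: one edge per reachable pair
    blocks = {s: {s} for s in states}
    for sp, sq in (tuple(p) for p in seen):
        bp, bq = blocks[sp], blocks[sq]
        if bp is not bq:
            bp |= bq
            for s in bq:
                blocks[s] = bp
    uniq = {id(B): B for B in blocks.values()}
    return list(uniq.values())

print(sorted(sorted(B) for B in pi_ij(1, 5, range(6), [2, 3], delta)))
# [[0], [1, 5], [2, 4], [3]]
```

For the pair {s1, s5}, the only reachable node besides itself is 24, so H[1, 5] has edges (s1, s5) and (s2, s4), matching the closure computed earlier.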

Note that all the nodes {s_i, s_j} of an SCC can reach the same set of nodes, and therefore they all have the same associated graph H[i, j]. From the previous lemma, for each bottom SCC C we can construct the graph H[C] associated with its nodes, and find its connected components, which yield the closed partition π[C] corresponding to the SCC C. Recall that each basis element is obtained from a bottom SCC. However, among all the closed partitions from the bottom SCC's, some are not basis elements and some are redundant, i.e., the same partition may be derived from more than one bottom SCC. We identify and remove them as follows. Construct a bit array indexed by the pairs of distinct states of the machine: {s_i, s_j}, i ≠ j. For each bottom SCC C, construct the associated graph H[C] and compute its connected components, yielding a partition π[C]. Sort the partitions π[C] from all the bottom SCC's in descending order of their number of blocks. We process them in this order. Suppose that we process a bottom SCC C with associated partition π[C]. We consider each nontrivial block of the partition and examine each pair of states {s_i, s_j} in the same block. We mark {s_i, s_j} in the index array if it has not been marked. However, if it has been marked, we discard C and move on to the next bottom SCC. We output the partitions π[C] of all the remaining bottom SCC's C.

Example 5 Consider the implication graph of the machine M in Example 1, shown in Figure 1. There are three bottom SCC's: A = {{s0, s3}}, B = {{s2, s4}}, and C = {{s0, s2}, {s0, s4}}. We construct the corresponding graphs H[A], H[B], H[C], and use them to compute the corresponding closed partitions π[A], π[B], and π[C], respectively. The graph H[A] has just one edge (s0, s3), and hence just one nontrivial connected component {s0, s3}; thus, π[A] = {{s0, s3}, {s1}, {s2}, {s4}, {s5}}. Similarly, H[B] has one nontrivial component {s2, s4}, hence π[B] = {{s2, s4}, {s0}, {s1}, {s3}, {s5}}. The graph H[C] has two edges (s0, s2), (s0, s4); thus, it has one nontrivial component {s0, s2, s4}, hence π[C] = {{s0, s2, s4}, {s1}, {s3}, {s5}}. In the ordering of the bottom SCC's, A and B come first and then C. Obviously, π[A] and π[B] provide two basis elements of the closed partition lattice and are output by the algorithm. However, when we process C we discover that the pair {s2, s4} has already been marked (it was marked when we processed B), hence we do not output the corresponding partition π[C]. Note that π[C] > π[B], and hence π[C] is not a basis partition. □

We now summarize the procedure for the construction of all the basis elements.

Procedure Basis

Input: An FSM M = (I, O, S, δ, λ).
Output: The basis elements of its closed partition lattice.

1. Construct the implication graph G of the FSM M: the graph has all the unordered pairs of distinct states {s_i, s_j}, i ≠ j, as nodes, and it has an edge from node {s_i, s_j} to node {s_p, s_q} if there is an input a such that δ(s_i, a) = s_p and δ(s_j, a) = s_q, or δ(s_i, a) = s_q and δ(s_j, a) = s_p.

2. Compute the strongly connected components (SCC's) of G.

3. For each bottom SCC C of G, do the following steps 3a and 3b.

3a. Construct an undirected graph H[C] on the states of the machine with edges corresponding to the nodes of C; i.e., for each pair of states {s_i, s_j} which is a node in C, the graph H[C] contains a corresponding edge connecting s_i and s_j.

3b. Find the connected components of H[C] and let π[C] be the state partition induced by the connected components; i.e., all the states (nodes) of a connected component yield a block of the closed partition π[C]. (Comment: For efficiency purposes, we only need to include in H[C] states that are incident to at least one edge, and accordingly, in the partition π[C] we only need to list explicitly the nonsingleton blocks, corresponding to the nontrivial components of H[C].)

4. Construct a bit array A indexed by the pairs of distinct states of the machine, {s_i, s_j}, i ≠ j, and initialize it to 0.

5. Sort the constructed partitions π[C] from all the bottom SCC's in descending order of their number of blocks; ties are broken arbitrarily.

6. Process the bottom SCC's in the above order from step 5, and for each bottom SCC C do the following.

6a. For each nonsingleton block of the partition π[C] and for each pair of states s_i, s_j in the same block, check the corresponding entry of the bit array A. If A[{s_i, s_j}] = 0 then set A[{s_i, s_j}] = 1; else, exit the processing of C and skip step 6b (i.e., discard this SCC C and proceed to the next bottom SCC).

6b. Add π[C] to the basis.
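The whole procedure can be sketched end to end for the machine of Example 1. This is a compact, unoptimized version: the recursive Kosaraju SCC pass and the helper names are my own choices, and the data-structure refinements behind the O(pn²) bound of Theorem 1 are omitted.

```python
from itertools import combinations

def delta(s, a):
    return (s * a) % 6                      # the machine of Example 1

STATES, INPUTS = range(6), [2, 3]

# Step 1: the implication graph G on unordered pairs of distinct states
nodes = [frozenset(p) for p in combinations(STATES, 2)]
G = {v: set() for v in nodes}
for v in nodes:
    si, sj = tuple(v)
    for a in INPUTS:
        sp, sq = delta(si, a), delta(sj, a)
        if sp != sq:
            G[v].add(frozenset((sp, sq)))

# Step 2: SCCs of G (Kosaraju: finish order, then sweep the reverse graph)
order, seen = [], set()
def dfs(v):
    seen.add(v)
    for w in G[v]:
        if w not in seen:
            dfs(w)
    order.append(v)
for v in nodes:
    if v not in seen:
        dfs(v)
R = {v: set() for v in nodes}
for v in nodes:
    for w in G[v]:
        R[w].add(v)
comp = {}
for v in reversed(order):
    if v not in comp:
        comp[v], stack = v, [v]
        while stack:
            for w in R[stack.pop()]:
                if w not in comp:
                    comp[w] = v
                    stack.append(w)
sccs = {}
for v in nodes:
    sccs.setdefault(comp[v], set()).add(v)
bottom = [C for C in sccs.values()
          if all(comp[w] == comp[v] for v in C for w in G[v])]

# Step 3: the partition pi[C] of each bottom SCC, via components of H[C]
def component_partition(C):
    blocks = {s: {s} for s in STATES}
    for sp, sq in (tuple(e) for e in C):
        bp, bq = blocks[sp], blocks[sq]
        if bp is not bq:
            bp |= bq
            for s in bq:
                blocks[s] = bp
    return {frozenset(B) for B in blocks.values()}

# Steps 4-6: sort by number of blocks (descending), mark state pairs
parts = sorted((component_partition(C) for C in bottom), key=len, reverse=True)
marked, basis = set(), []
for P in parts:
    pairs = [frozenset(p) for B in P if len(B) > 1
             for p in combinations(sorted(B), 2)]
    if any(p in marked for p in pairs):
        continue                            # redundant or not a basis element
    marked.update(pairs)
    basis.append(P)

for P in basis:
    print(sorted(sorted(B) for B in P if len(B) > 1))
```

On the Example 1 machine this outputs the two basis partitions by their nontrivial blocks, [[0, 3]] and [[2, 4]], and discards the partition of the third bottom SCC, as in Example 5.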

Lemma 3 The procedure computes correctly the basis elements of the closed partition lattice.

Proof: As mentioned earlier, each basis element of the closed partition lattice is a partition π_ij for some pair {s_i, s_j} in a bottom SCC. We need to show: (1) when processing a bottom SCC, if its partition is discarded, then it is not a basis element or an identical closed partition has been output previously; and conversely, (2) each partition that is output is a basis element and is unique, i.e., if the partition of a bottom SCC is not a basis partition or is identical to a previously output partition, then it is discarded.

(1) We use induction on the order of the processing of the bottom SCC's. Suppose a bottom SCC C is rejected because a pair {s_i, s_j} was already marked, say during the processing of a previous bottom SCC T. Since s_i and s_j are in the same block of π[C] and of π[T], it follows that π_ij ≤ π[C] and π_ij ≤ π[T]. There are two cases. (A) π[C] is not a basis element; it is correctly discarded, as claimed in (1). (B) π[C] is a basis closed partition. Then we must have π_ij = π[C], hence π[C] ≤ π[T]. By the ordering of the SCC's, the partition π[C] has at most as many blocks as π[T], which implies that the two partitions must be equal, π[C] = π[T]. By the induction hypothesis, either the algorithm has output π[T] or it has output an earlier partition equal to it.

(2) Suppose the algorithm outputs π[C] for some bottom SCC C. If the partition does not belong to the basis, then there is another basis partition


τ such that τ < π[C]. By (1), the partition τ is output by the algorithm, and, because of the ordering of the bottom SCC's, it must be output for some SCC T that is ordered before C. Since τ = π[T] is non-trivial and π[T] < π[C], there is a state pair {s_i, s_j} which is in the same block in π[T], and hence also in π[C]. The state pair {s_i, s_j} was marked when we processed T, and therefore we discard C when we examine the state pair {s_i, s_j} in π[C]. □

It takes time O(pn²) to construct the graph G and to compute its strongly connected components. For each bottom SCC C, we construct the corresponding graph H[C] and find its connected components, yielding the corresponding closed partition π[C], in time proportional to the number of edges of H[C], i.e., the number of nodes of C (provided that we only list the nontrivial blocks of the partition). Hence the total cost of constructing the partitions for all the bottom SCC's is O(n²). We sort the constructed closed partitions by their numbers of blocks in time linear in their number (hence time at most O(n²)), using a bucket sort for instance. The index array for marking has size O(n²). When we examine a closed partition from a bottom SCC, at each step we either mark an entry in the index array or we discard the closed partition under consideration; in the first case, we charge this step to the entry of the index array, and in the second case, we charge the step to the closed partition being discarded. Every entry of the index array is charged for at most one step (the step that marks it), and every discarded closed partition is charged for at most one step (the step that discards it). Since the size of the index array and the number of discarded closed partitions are bounded by n², it follows that the total number of steps is O(n²). In summary, we have the following theorem.

Theorem 1 Given a finite state machine, we can construct in time O(pn²) the basis of the closed partition lattice, where p is the number of inputs and n the number of states. □

Note that in the theorem we only list explicitly the nontrivial blocks of the basis partitions; otherwise the output by itself might need size n³ in the worst case, since the basis could have a quadratic number of partitions, and each partition would take space n to list.


Closed Partition Lattice

The number N of closed partitions of a finite state machine is often much larger than (potentially, exponential in) the number of states n of the machine. Thus, the complexity of an algorithm that generates the closed partition lattice has to be measured both in terms of the parameters of the input machine, i.e., the number of states n and the number of inputs p, and in terms of the size of the output, i.e., the number N of nodes and the number M of edges of the closed partition lattice. One way to generate all the closed partitions is the following [3]: first find all the smallest closed partitions π_ij corresponding to pairs of states; this forms a first set of closed partitions. Then take sums (LUB) of members of the first set to form a second set of partitions, take sums of members of the second set to form a third set, and so on. This process yields all the elements (nodes) of the closed partition lattice. In general, the complexity is quadratic in the number N of closed partitions; as we noted, N is typically much larger than n (often exponential in n). Furthermore, this process yields only the elements of the lattice (the set of closed partitions). If we also want to compute the edges of the lattice, the straightforward way would be to compare the closed partitions pairwise to determine the relation between them, and then compute the transitive reduction of the relation, which takes even more than quadratic time (the best current bound for transitive reduction is N^2.38, using fast matrix multiplication). We describe below another, more efficient method to compute all the closed partitions as well as the edges of the partition lattice; the complexity of the algorithm depends linearly on N and M. The closed partition lattice is constructed bottom up, starting from the zero partition, by repeatedly using the basis algorithm of the previous subsection (applied to a suitably modified finite state machine) to compute the parents of each closed partition in the lattice.
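The repeated-sums generation scheme above relies on computing the sum (LUB) of two partitions; for closed partitions of the same machine this sum is again a closed partition [3]. A minimal union-find sketch (Python; the representation of a partition as a list of blocks over states 0, ..., n−1 is ours, for illustration):

```python
def partition_sum(p1, p2, n):
    """Sum (LUB) of two partitions of {0, ..., n-1}: the finest
    partition in which every block of p1 and of p2 lies inside a
    single block.  Blocks are lists; singleton blocks may be omitted."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # merge the states of each given block into one class
    for part in (p1, p2):
        for block in part:
            for s in block[1:]:
                parent[find(block[0])] = find(s)

    groups = {}
    for s in range(n):
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())

# Example: sum of two pair partitions over 6 states
lub = partition_sum([[0, 3]], [[2, 4]], 6)
```

Iterating this operation over a current set of closed partitions yields the next set in the generation scheme; the union-find pass makes each single sum cost nearly linear in n.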
Let L be a set of closed partitions; initially it contains only the zero partition π_0, and at the end of the algorithm it will contain all the closed partitions. We have a queue Q that contains the closed partitions that have been found but not yet processed (we call them active). Initially, the zero partition is marked active and inserted into the queue Q. In the general step, we extract a partition from Q and process it to determine its parents in the lattice. We process the elements of Q as follows, until Q becomes empty. Take an element π from Q; it represents a closed partition. For this partition, we "shrink" all the states in each block into a single state and obtain a finite state machine M_π, which has the same input and output sets as the original finite state machine and has n' states, where n' = |π| is the number of blocks of the partition. We construct the basis elements λ_1, ..., λ_r of the closed partition lattice of machine M_π, as described in the previous subsection. For a basis element λ_i, each state is a block of states of the original machine M; thus λ_i induces a partition τ_i of the states of M. Obviously, τ_i is also a closed partition, and it is a parent of π in the lattice, i.e., there is no closed partition τ such that π < τ < τ_i. We examine all the elements τ_i, i = 1, ..., r. If τ_i is already in the lattice L, we make it a parent of π. Otherwise, we add τ_i to L, make it a parent of π, mark it active, and insert it into Q to be processed further. We then mark the processed partition π inactive. We will discuss later how to check efficiently whether an element τ_i is already in the lattice L.

When the queue Q becomes empty, we claim that we have constructed the complete lattice L. By the construction, each element of the lattice L is a closed partition and there are no redundant elements. We only need to show that each closed partition π is in L. Let π_0 < τ_1 < ... < τ_r = π be an arbitrary chain of closed partitions such that for any two consecutive elements τ_i < τ_{i+1} of the chain there is no closed partition τ with τ_i < τ < τ_{i+1}. Obviously, τ_1 is a basis element and is in L. Inductively, suppose that the closed partition τ_i is in the lattice. When τ_i is processed, all of its parents are constructed, i.e., all the closed partitions σ such that τ_i < σ with no closed partition strictly between them. Element τ_{i+1} must be one of them and hence is in the lattice.
Example 6 Figure 2 shows the closed partition lattice for the FSM of Example 1. Every node is labelled by a string of symbols that denotes the corresponding partition in a succinct notation listing only the nontrivial blocks, where each block is given by the indices of the states that it contains and blocks are separated by the separator symbol |. Thus, for example, the node labeled (03|24) represents the partition {{s_0, s_3}, {s_1}, {s_2, s_4}, {s_5}}. The algorithm in this example proceeds as follows. The queue Q is initialized with the zero partition, which is the bottom element of the lattice. We first extract it from Q and compute the basis partitions as in Example 3: π[03] = {{s_0, s_3}, {s_1}, {s_2}, {s_4}, {s_5}} and π[24] = {{s_2, s_4}, {s_0}, {s_1}, {s_3}, {s_5}}. The two partitions are the parents of the zero partition and are inserted into the queue Q. Next, one of them, say π[03], is extracted from Q and processed to determine its parents: we merge the states s_0 and s_3 in the FSM M and compute the basis elements of the resulting FSM M_[03], as described in subsection 3.1. That is, we construct the corresponding implication graph,

[Figure 2 here. The nodes of the lattice are (012345), (03|1245), (01234), (0234|15), (02345), (03|14|25), (03|124), (03|245), (03|15|24), (0234), (03|14), (03|25), (03|24), (03), (15|24), (024|135), (024|15), (024|13), (024|35), (024), (24), and ().]

Figure 2: The closed partition lattice of machine M
find the bottom strongly connected components, and process them in the appropriate order. Figure 3 shows the implication graph for M_[03]. We remark that it is not necessary to do this computation from scratch: the implication graph for M_[03] can be obtained from the implication graph of M by removing the node 03 (and its incident edges) and merging the nodes representing the pairs {s_0, s_i}, {s_3, s_i} for all i ≠ 0, 3; that is, merging nodes 01 and 13 into one node, labeled 031 in Figure 3, merging nodes 02 and 23 into 032, nodes 04 and 34 into 034, and nodes 05 and 35 into node 035. The implication graph has four bottom strongly connected components: {14}, {24}, {25} and {032, 034}. Processing them in this order, we find that only the first three yield basis elements for M_[03], namely the partitions π[03|14] = {{s_0, s_3}, {s_1, s_4}, {s_2}, {s_5}}, π[03|24] = {{s_0, s_3}, {s_1}, {s_2, s_4}, {s_5}} and π[03|25] = {{s_0, s_3}, {s_1}, {s_2, s_5}, {s_4}}. These partitions are the parents of the partition π[03]. Since the three partitions are new, they are inserted into Q for further processing. We can construct the rest of the lattice proceeding bottom up in a similar manner, using a basis computation at each node of the lattice to determine its parents.

[Figure 3 here. The nodes of the implication graph of M_[03] are 12, 14, 15, 24, 25, 45, 031, 032, 034, and 035.]

Figure 3: The implication graph of machine M_[03]

□

We discuss now the implementation and the complexity of the algorithm. To process each element extracted from Q, it takes time O(pn²) to construct all its parent elements, add them to L, and insert them into Q if necessary, ignoring for the time being the cost of checking whether a constructed parent element is already in the lattice. For the closed partition lattice L of N elements, it takes time O(pn²N) to complete this process. We now show that it takes time O(n) to check whether a newly constructed element is already in the lattice L.

Each lattice element, i.e., a closed partition, is a partition of the n states of the machine. For simplicity, we use the integers 1, ..., n for the states. We can write each partition as a string over the alphabet that includes the numbers 1, ..., n and a separator |. To obtain a unique encoding for each partition, we sort the numbers (states) in a block in increasing order, and we sort the blocks by their leading integer (state). We can omit the singleton blocks for succinctness and record just the nontrivial blocks; the trivial blocks can then be inferred. For example, the partition {{s_0, s_3}, {s_1, s_4}, {s_2}, {s_5}} is represented by the string 03|14. (We used this succinct notation in the previous example to represent the partitions.) We build a trie to store the strings that encode the partitions that have been constructed in the lattice. Recall that a trie is a data structure, in the form of a rooted tree, used for representing a set of strings. The edges of the tree are labelled by letters of the alphabet, where edges out of the same node are labelled with distinct letters. There is exactly one leaf for each string in the given set, and each string is equal to the concatenation of the labels on the edges of the path from the root to the corresponding leaf. In our case, the root node of the trie has at most n children, for the strings with leading symbol 1, ..., n, respectively. Generally, each nonleaf node at the r-th level has at most n + 1 children, corresponding to the possible next symbols 1, ..., n, |. The trie has depth less than 2n. As usual, nodes that have only one child can be eliminated (merged with their parent), and the trie can be stored in compact form (called a compacted trie) with edges labeled by strings rather than single letters. Given an element (string), we can walk down the trie according to the symbols of the string from left to right, as follows.
From the root we follow the edge whose label matches the first symbol of the string, leading to a node at the first level of the trie (a child of the root). Inductively, at the r-th step we follow the edge whose label matches the r-th symbol of the string. If at any moment the required child node is not in the trie, then we add it to the trie and proceed from there. If we arrive at an existing leaf node where an identical string has been installed, the element is already in the lattice. Otherwise, it is a newly added element. While walking down the trie, having taken r − 1 steps from the root to a node v, with the r-th symbol of the string in hand, node v may or may not have an edge whose label matches the r-th symbol. A sequential scan of all the existing children of v would take time O(n). If we sort the children of v, then the check takes time O(log n). We can instead use an (n + 1)-vector at each node to record its children, and then it takes constant time to check each symbol match. However, the

space required is O(n) for each node, and the total space required is O(nN). To process each active element of the lattice, it takes time O(pn²) to construct all the parent elements, and time O(n) to process each parent element constructed. The total time for the first process is O(pn²N), where N is the number of elements in the lattice. On the other hand, the number of parent elements of a node is the number of incoming edges of the node. Therefore, the total time for the second process is O(nM), where M is the number of edges in the lattice. In summary:

Theorem 2 The closed partition lattice of a finite state machine can be constructed in time O(pn²N + nM), where p and n are the number of inputs and states of the machine, and N and M are the number of elements and edges of the lattice. □
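The canonical encoding and the trie membership test used in the proof of Theorem 2 can be sketched as follows (Python; a plain nested-dict trie rather than the compacted variant, and assuming single-character state symbols as in the paper's 1, ..., n alphabet):

```python
def encode(blocks):
    """Canonical string for a partition: nontrivial blocks only,
    states sorted within a block, blocks ordered by leading state."""
    nb = sorted(sorted(b) for b in blocks if len(b) > 1)
    return '|'.join(''.join(str(s) for s in b) for b in nb)

class Trie:
    """Stores partition encodings; insert() reports prior membership."""
    def __init__(self):
        self.root = {}

    def insert(self, s):
        """Install string s; return True iff it was already present."""
        node = self.root
        for ch in s:
            node = node.setdefault(ch, {})   # add missing child nodes
        seen = '$' in node                   # '$' marks an installed string
        node['$'] = True
        return seen
```

A dict per node plays the role of the (n + 1)-vector of children discussed above, giving a constant-time match per symbol at the price of O(n) potential space per node.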

4 Machine Decomposition

The theory of closed partitions and their lattice was developed in [3] for the purpose of decomposing finite state machines into smaller machines. We sketch the application in this section; for more details we refer to [3]. Consider a closed partition π of a machine M. Ignoring the output behavior, we can construct an image (or quotient) machine M_π from the partition π as follows. Machine M_π has the same inputs I as M, and each block of π is a state of M_π. Let B be a state of M_π and a an input in I. Select a state s ∈ B and let δ(s, a) ∈ C ∈ π. Define the next state function of M_π by δ_π(B, a) := C. Since π is a closed partition, the next state function δ_π is well defined. Given a closed partition π which is not the zero partition, the image machine M_π performs only part of the computation of M, since it only keeps track of the block of π that contains the current state. If there is another machine M' that performs the remainder of the computation, i.e., it keeps track of the current state knowing the current block, then the combined information of the two machines M_π and M' fulfills the function of M. In other words, the function of M is decomposed into that of the two machines M_π and M', resulting in a decomposition of machine M. If the two component machines M_π and M' operate in parallel, i.e., they do not depend on each other, then they provide a parallel decomposition. If they operate serially, i.e., the operation of one machine depends on the other, then they provide a serial decomposition. Often the joint behavior of the component machines properly includes that of machine M, and M is a submachine of the composite machine. Specifically, an FSM M = (S, I, O, δ, λ) is a submachine of an FSM M' = (S', I', O', δ', λ') if I ⊆ I', O ⊆ O', and there is a one-to-one mapping φ: S → S', an embedding, such that for every state s in S and input a in I, δ'(φ(s), a) = φ(δ(s, a)) and λ(s, a) = λ'(φ(s), a).
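The image machine construction just described can be sketched as follows (Python; the dict-based machine encoding and the function name are ours, for illustration; blocks are frozensets so that they can serve as states):

```python
def image_machine(states, inputs, delta, partition):
    """Image (quotient) machine M_pi of a closed partition: the states
    are the blocks of pi, and delta_pi(B, a) is the block containing
    delta(s, a) for any s in B, well defined because pi is closed."""
    block_of = {s: frozenset(b) for b in partition for s in b}
    q_states = set(block_of.values())
    q_delta = {}
    for b in q_states:
        for a in inputs:
            s = next(iter(b))                 # any representative of B
            target = block_of[delta[(s, a)]]
            # sanity check that the partition really is closed
            assert all(block_of[delta[(t, a)]] == target for t in b)
            q_delta[(b, a)] = target
    return q_states, q_delta
```

This is also exactly the "shrink" step used at each node in the lattice construction of the previous section.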

Serial Decomposition

Informally, we can connect two FSM's M' and M'' in sequence so that machine M'' takes the outputs of machine M' as inputs. We take all possible combinations of states of the two machines as states, the inputs of machine M' as inputs, and the outputs of machine M'' as outputs, and we obtain an FSM which models the joint behavior of the two machines in serial order. Formally, a serial connection of two FSM's M' = (S', I', O', δ', λ') and M'' = (S'', I'', O'', δ'', λ'') with I'' = O' is an FSM M = (S, I', O'', δ, λ) := M' → M'' with: (1) states S = S' × S''; (2) input set I' and output set O''; (3) state transition function δ((s, t), a) := (δ'(s, a), δ''(t, λ'(s, a))); and (4) output function λ((s, t), a) := λ''(t, λ'(s, a)), where s ∈ S', t ∈ S'', and a ∈ I'. Informally, in state s machine M' receives an input a, moves to the next state δ'(s, a) and produces an output λ'(s, a), which is an input to machine M''. Upon receiving input λ'(s, a) from M', machine M'' moves from state t to the next state δ''(t, λ'(s, a)) and produces an output λ''(t, λ'(s, a)). The next states of machines M' and M'' are δ'(s, a) and δ''(t, λ'(s, a)), which together form the next state of machine M, and the output of machine M is that of machine M'': λ''(t, λ'(s, a)).

Given an FSM M, we want to construct two serially connected machines M' → M'' that contain M as a submachine. Specifically, the machine M' → M'' is a serial decomposition of M if M is a submachine of M' → M'', denoted by M ⊑ M' → M''; M' and M'' are called the component machines of the decomposition. Obviously, a machine may have more than one serial decomposition. A decomposition is non-trivial if both component machines have fewer states than M. As shown in [3], a finite state machine has a non-trivial serial decomposition if and only if it has a non-trivial closed partition on its set of states. The proof of this theorem is constructive, i.e., we can compute a serial decomposition of a machine from a closed partition. Suppose an FSM M = (S, I, O, δ, λ) of n states has a non-trivial closed partition π. Since π is non-trivial, it has r < n blocks and its largest block has k < n states. Let τ be a k-block partition on S, not necessarily closed, such that π ∩ τ = π_0, the zero partition. For instance, τ can be constructed by labeling the states of each block B_i of π by 1, ..., n_i, n_i ≤ k, and then placing all the states with the same label in one block of τ. Note that for each state s there is a unique pair of blocks B_π ∈ π and B_τ ∈ τ such that {s} = B_π ∩ B_τ. Let M' be the image machine M_π of the closed partition π, and let its output be the pair of its current state and the input. Specifically, M' = ({B_π}, I, O' = {B_π} × I, δ_π, λ' = e), where {B_π} is the set of blocks of partition π, which forms the state set of machine M', and e is the identity mapping. Let M'' = ({B_τ}, I'' = O', O, δ'', λ''), where {B_τ} is the set of blocks of partition τ, which forms the state set of machine M''. Since π ∩ τ = π_0, for B_π ∈ π and B_τ ∈ τ, the intersection B_π ∩ B_τ is either a singleton state or empty. If the intersection is a singleton state, let δ(B_π ∩ B_τ, a) ∈ B'_τ. Define the state transition function δ''(B_τ, (B_π, a)) := B'_τ, and the output function λ''(B_τ, (B_π, a)) := λ(B_π ∩ B_τ, a). Otherwise, we leave the state transition and output functions undefined, and hence machine M'' is incompletely specified. Informally, M'' computes the block of τ that contains the current state of M. Since the states of M' are blocks of π and the input to M'' is a block of π together with an element of I, the current state of M and its input are known to M'', and thus M'' can compute the block of τ which contains the next state of M, together with the corresponding output.

are many possible serial decompositions. For example, we can let  = fs0  s2 s4  s1 s3  s5g, be the partition into even and odd states, and take  = fs0  s1  s2  s3  s4  s5 g. Partition  is not closed, but this is not important. The FSM M has two states ("even" and "odd") and M has three states. Given an input 2 or 3, M determines whether the next state is even or odd, it moves appropriately, and passes to M the input and the old state (odd or even). With this information and its current state, machine M can determine the appropriate next state and output. For example, if M is in state s0  s1 and receives input (odd,2) from M , then it knows that the 23

composite machine M must be in state s1  hence M moves to state s2  s3 and outputs 2.

2

Conversely, a nontrivial serial decomposition of a machine induces a nontrivial closed partition of its states. Suppose M = (S, I, O, δ, λ) has a non-trivial serial decomposition M_1 → M_2, where M_1 = (S_1, I, O_1, δ_1, λ_1) and M_2 = (S_2, I_2, O_2, δ_2, λ_2). Let φ: S → S_1 × S_2 be the embedding. We partition the state set S by the non-empty inverse image blocks of φ: π := {B = φ^(-1)({s_1} × S_2) : s_1 ∈ S_1}. Since |π| ≤ |S_1| < n, it is a non-trivial partition, and it can be shown that π is also closed. Thus, an FSM has a non-trivial serial decomposition if and only if it has a proper closed partition, and this is true if and only if it has a proper basis element. From Theorem 1, this can be determined in time O(pn²):

Corollary 1 We can determine whether a given finite state machine has a non-trivial serial decomposition in time O(pn²), where p is the number of inputs and n the number of states. □

Of course, if all we want is to determine the existence of a nontrivial serial decomposition, then we do not need to construct the whole basis, and there may well be a faster method. We leave this as an open problem.

Closed partitions can be used to construct serial decompositions into two components. We can consider serial connections of more than two machines in the same way. Paths in the closed partition lattice can be used to construct serial decompositions into multiple components. Suppose that there is a path in the lattice from the unit partition π_1 to the zero partition π_0: π_1 > τ_r > τ_{r-1} > ... > τ_1 > π_0, and suppose that τ_i has n_i blocks and that its largest block contains k_i states, i = 1, ..., r. Let M_i be the image machine of τ_i, and set τ_0 := π_0, so that M_0 = M. For each i, the coarser partition τ_i induces a closed partition on the image machine M_{i-1}, and we have a serial decomposition M_{i-1} ⊑ M_i → M'_i, where the tail machine M'_i has at most k_i states, i = 1, ..., r. Substituting each head machine by its own decomposition, we obtain

M ⊑ M_r → M'_r → ... → M'_1.

The total number of states of the component machines is at most n_r + k_1 + ... + k_r. There are different paths in L_M from π_1 to π_0, and each path provides a serial decomposition. Note that we can also take a subset of the closed partitions on a path for a serial decomposition, and we can select a path based on an optimization criterion.

Parallel Decomposition

A parallel connection of two FSM's M_1 = (S_1, I_1, O_1, δ_1, λ_1) and M_2 = (S_2, I_2, O_2, δ_2, λ_2) is the machine M := M_1 ∥ M_2 with: (1) states S = S_1 × S_2; (2) input set I = I_1 × I_2; (3) output set O = O_1 × O_2; (4) state transition function δ((s_1, s_2), (a_1, a_2)) := (δ_1(s_1, a_1), δ_2(s_2, a_2)); and (5) output function λ((s_1, s_2), (a_1, a_2)) := (λ_1(s_1, a_1), λ_2(s_2, a_2)), where s_1 ∈ S_1, s_2 ∈ S_2, a_1 ∈ I_1, and a_2 ∈ I_2. We can consider parallel connections of more than two machines in the same way; for clarity we only discuss two-machine connections. Embedding a machine M = (S, I, O, δ, λ) into the parallel connection of two machines M_1, M_2 involves an input splitting function and an output merging function, defined as follows. An input splitting is a mapping α: I → I_1 × I_2, and an output merging is a mapping β: O_1 × O_2 → O. Intuitively, the function α splits an input a ∈ I into two inputs a_1 ∈ I_1 and a_2 ∈ I_2 for machines M_1 and M_2, respectively. We denote the projections of α to I_1 and I_2 by α_1 and α_2, respectively, and we have α_1(a) = a_1 and α_2(a) = a_2. The function β merges the outputs o_1 and o_2 of machines M_1 and M_2 into an output o ∈ O: β(o_1, o_2) = o. We denote the parallel connection M_1 ∥ M_2 with input splitting α and output merging β by ((α, I) → M_1 ∥ M_2 → (β, O)). It is an FSM with: (1) states S_1 × S_2; (2) inputs I; (3) outputs O; (4) next state function δ((s_1, s_2), a) := (δ_1(s_1, α_1(a)), δ_2(s_2, α_2(a))); and (5) output function λ((s_1, s_2), a) := β(λ_1(s_1, α_1(a)), λ_2(s_2, α_2(a))). The machines M_1 = (S_1, I_1, O_1, δ_1, λ_1) and M_2 = (S_2, I_2, O_2, δ_2, λ_2) provide a parallel decomposition of machine M = (S, I, O, δ, λ) if there is an input splitting function α: I → I_1 × I_2 and an output merging function β: O_1 × O_2 → O such that M is a submachine of ((α, I) → M_1 ∥ M_2 → (β, O)); M_1 and M_2 are called the component machines of the decomposition. Obviously, a machine may have more than one parallel decomposition. A decomposition is non-trivial if both component machines have fewer states than M.
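The parallel connection with input splitting and output merging can be sketched as follows (Python; the tuple-and-dict machine encoding and the function names are ours, for illustration):

```python
def parallel_connect(m1, m2, split, merge, inputs):
    """Parallel connection ((alpha, I) -> M1 || M2 -> (beta, O)).
    A component is (states, delta, lam); split is the input splitting
    alpha with split(a) = (a1, a2), and merge is the output merging
    beta, combining the two component outputs into one."""
    s1, d1, l1 = m1
    s2, d2, l2 = m2
    states = [(x, y) for x in s1 for y in s2]
    delta, lam = {}, {}
    for x, y in states:
        for a in inputs:
            a1, a2 = split(a)
            delta[((x, y), a)] = (d1[(x, a1)], d2[(y, a2)])
            lam[((x, y), a)] = merge(l1[(x, a1)], l2[(y, a2)])
    return states, delta, lam
```

With split the identity pair (α_1 = α_2 = e), as in the constructive proof in this section, both components read the same input symbol and run independently side by side.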
It is shown in [3] that a finite state machine M has a non-trivial parallel decomposition if and only if there exist two non-trivial closed partitions π and τ of M such that π ∩ τ = π_0. The proof is again constructive. Since π and τ are non-trivial closed partitions of machine M = (S, I, O, δ, λ), construct the two image machines M' = ({B_π}, I, O' = {B_π} × I, δ_π, λ' = e) and M'' = ({B_τ}, I, O'' = {B_τ} × I, δ_τ, λ'' = e), where e is the identity mapping. Define an input splitting function α: I → I × I with α_1 = α_2 = e. Define an output merging function β: ({B_π} × I) × ({B_τ} × I) → O as follows. Let B_π ∈ π and B_τ ∈ τ. Since π ∩ τ = π_0, the intersection B_π ∩ B_τ is either a singleton state or empty. If it is a singleton state B_π ∩ B_τ = {s}, then for each a ∈ I let β((B_π, a), (B_τ, a)) := λ(s, a). Otherwise, we leave β undefined, and hence the machine M' ∥ M'' is incompletely specified. Note also that the function β is defined only on the subdomain of ({B_π} × I) × ({B_τ} × I) where the two input symbols from I are identical. It can be shown that M is a submachine of M''' = (S''', I, O, δ''', λ''') := ((α, I) → M' ∥ M'' → (β, O)).

Conversely, from a non-trivial parallel decomposition of a machine M one can construct two closed partitions π and τ such that π ∩ τ = π_0.
Example 8 The FSM of Example 1 has many pairs of closed partitions, whose glb is the zero partition. For example, we can let  = fs0  s2  s4  s1  s3  s5 g, be again the partition into even and odd states, and take  = fs0  s3  s1  s4  s2  s5 g. The FSM M has two states and M has three states. An input, 2 or 3, is passed directly in the parallel decomposition to both machines, which move to the appropriate next state. In this case we can let the output function of each machine be simply the next state, and let the output merging function map a pair of outputs, i.e., a pair of states of M and M , or in other words a pair consisting of a block of  and a block of  , to the index of the (unique) state of M that is in the intersection of the two blocks. For example, consider input 2 and suppose M is in state s1  s3  s5 and M is in state s0  s3 . Then M moves to state s0  s2  s4 , which it also outputs, and M moves to state s0  s3 which it outputs. The output merging function combines the two outputs, (s0  s2  s4  s0  s3 ) = 0, which is the overall output of the parallel composition. 2

Thus, an FSM has a non-trivial parallel decomposition if and only if it has two closed partitions whose intersection is the zero partition π_0. We do not need to compute the whole lattice in order to determine whether this condition holds: it holds if and only if there are at least two elements in the basis of the closed partition lattice of the machine. Therefore, we have:

Corollary 2 We can determine whether a given finite state machine has a non-trivial parallel decomposition in time O(pn²), where p is the number of inputs and n the number of states. □

Again, just for the existence question we do not need the whole basis; we leave it as an open problem to find a faster method. However, the closed partition lattice and its basis give us all possible decompositions. The intersection of two closed partitions π and τ is the zero partition π_0 if and only if π_0 is their only common descendant, that is, if and only if no basis partition lies below both π and τ. We can find such a pair (in fact, all such pairs) for a parallel decomposition using L_M. We can examine the closed partition lattices of the component machines for further decomposition if so desired. In conjunction with serial decomposition, we can decompose a machine both serially and in parallel, resulting in a network of machines [3]. On the other hand, a given machine may admit more than one serial or parallel decomposition. The determination of the type of decomposition and the selection of the pair of component machines can be based on optimization criteria; for instance, we may want to construct a parallel decomposition such that the sum of the numbers of states of the two component machines is minimal. The closed partition lattice provides the information needed for such choices.
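The disjoint-ancestors test just described can be coded directly once, for each lattice element, the set of basis elements below it is known (Python sketch; the function name, the basis_below map, and the labels in the usage are illustrative):

```python
def parallel_pairs(partitions, basis_below):
    """All pairs of closed partitions whose glb is the zero partition:
    by the discussion above, exactly the pairs whose sets of basis
    elements lying below them in the lattice are disjoint."""
    pairs = []
    for i, p in enumerate(partitions):
        for q in partitions[i + 1:]:
            if basis_below[p].isdisjoint(basis_below[q]):
                pairs.append((p, q))
    return pairs

# Hypothetical lattice elements 'p', 'q', 'r' above basis elements b1, b2
pairs = parallel_pairs(['p', 'q', 'r'],
                       {'p': {'b1'}, 'q': {'b2'}, 'r': {'b1', 'b2'}})
```

The basis_below sets can be accumulated during the bottom-up lattice construction of Theorem 2, by propagating each node's set to its parents along the lattice edges.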

5 Conclusion

We have studied efficient algorithms for computing the closed partition lattice of a finite state machine, with applications to serial and parallel machine decomposition. State-splitting is an interesting technique that has been used successfully for machine minimization [6]. It could also be used for machine decomposition; see Chapter 5 in [3]. How much would state-splitting facilitate the decomposability of machines and reduce the size of the component machines, and what is the complexity? These are interesting research topics.

In principle, finite state machines appropriately model systems such as sequential circuits and the control portions of communication protocols. In practice, however, the usual system specifications include variables and operations based on variable values, and ordinary finite state machines are no longer powerful enough to model the physical systems succinctly. Extended finite state machines, which are finite state machines extended with variables and with transitions guarded by predicates and associated actions on the variable values, have been studied [7]. The general mathematical model of an extended finite state machine with unbounded variables has the same computing power as a Turing machine [5]; the EFSM model with bounded variables is equivalent to ordinary finite state machines, but it is much more succinct and is often used for practical systems. For the decomposition of extended finite state machines, an algebraic theory and efficient algorithms remain to be explored.

References

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman (1974) The Design and Analysis of Computer Algorithms, Addison-Wesley.

[2] D. Brand and P. Zafiropulo (1983) On communicating finite-state machines, JACM, vol. 30, no. 2, pp. 323-342.

[3] J. Hartmanis and R. E. Stearns (1966) Algebraic Structure Theory of Sequential Machines, Prentice-Hall.

[4] C. A. R. Hoare (1985) Communicating Sequential Processes, Prentice-Hall.

[5] J. E. Hopcroft and J. D. Ullman (1979) Introduction to Automata Theory, Languages, and Computation, Addison-Wesley.

[6] Z. Kohavi (1978) Switching and Finite Automata Theory, 2nd ed., McGraw-Hill.

[7] D. Lee and M. Yannakakis (1996) Principles and methods of testing finite state machines - a survey, Proceedings of the IEEE, vol. 84, no. 8, pp. 1089-1123, August.

[8] G. Noubir, B. Y. Choueiry and H. J. Nussbaumer (1996) Fault tolerant multiple observers using error control codes, Proc. ICNP, pp. 84-91.

[9] R. E. Tarjan (1975) Efficiency of a good but not linear set union algorithm, JACM, vol. 22, no. 2, pp. 215-225.

[10] R. Milner (1989) Communication and Concurrency, Prentice-Hall.

[11] C. Wang and M. Schwartz (1993) Fault detection with multiple observers, IEEE/ACM Trans. on Networking, vol. 1, no. 1, pp. 48-55.

Appendix

Algorithm 1 (PARTITION CLOSURE)

input: an FSM M = (S, I, O, δ, λ) and a partition π = {B_1, ...} of its states.
output: the closure π̄ of π.

1   initialize doubly linked lists of blocks ACTIVE ← π and INACTIVE ← ∅
2   initialize a list pend(B) ← B for each block B ∈ ACTIVE
3   while (ACTIVE ≠ ∅)
4       delete a block B from ACTIVE and insert it into INACTIVE
5       if (|pend(B)| > 1)
6           temp ← pend(B)
7           select a state s_B from B; pend(B) ← {s_B}
8           for each a in I
9               initialize a block B_a ← ∅; pend(B_a) ← ∅
10              for each state s in δ(temp, a)
11                  let C be the block that contains s
12                  delete C from ACTIVE or INACTIVE
13                  B_a ← B_a ∪ C; pend(B_a) ← pend(B_a) ∪ pend(C)
14              if (more than one block has been merged into B_a)
15                  insert B_a into ACTIVE
16              else
17                  insert B_a back into ACTIVE or INACTIVE
18  π̄ ← INACTIVE

The algorithm keeps in a list ACTIVE the blocks whose image blocks may intersect more than one block; the rest of the blocks are kept in INACTIVE. It is an invariant of the algorithm that at the beginning of each execution of the while-loop (line 3), for each input a and block B ∈ INACTIVE, the image δ(B, a) intersects at most one block (and hence the image is contained in one block). This property may be temporarily violated after line 4, which moves a block from the ACTIVE to the INACTIVE list, and the rest of the loop serves to restore the invariant. For each block B there is a list pend(B) that contains a subset of the states of B. The subset has the property that, at the beginning of each execution of the while-loop, for any state v of B − pend(B) there is a state v' of pend(B) whose transitions on all inputs go to the same blocks as those of v; that is, for every input a, the states δ(v, a) and δ(v', a) belong to the same block. Thus, for the purpose of checking whether a block B has transitions on some input to more than one block, it suffices to check only the states in pend(B). A block B is taken from the list ACTIVE at line 4 and moved to the INACTIVE list. If its pending subset pend(B) contains only one state, then for every input a the image δ(B, a) is contained in one block, and nothing needs to be done. If its pending subset pend(B) contains more than one state, then we process it in lines 5-17. We examine all the inputs and merge blocks as needed so that for each input the transitions from all states of pend(B) (and hence also of B) go to the same block. All the states are removed from the pending list of B, except for one state that is used to represent the rest of the states. Note that B itself may be merged with other blocks during this process; this is the reason that pend(B) is first copied to a temporary list temp. In more detail, the algorithm examines the image of the pending set of B for each input a at line 8. If the image intersects more than one block, then the intersected blocks are merged in lines 9-13. Specifically, for each input a we initialize a block B_a for merging the blocks that intersect the image block δ(B, a). If a state s in δ(B, a) is contained in a block C, then C must intersect δ(B, a), and we merge all these blocks C into B_a after deleting them from ACTIVE or INACTIVE.
If more than one block has been merged into B_a, we insert the new block B_a into ACTIVE for further processing in lines 14-15. Otherwise, δ(B, a) is contained in a single block C = B_a, and we return this block to ACTIVE or INACTIVE, whichever list it was deleted from. To distinguish the two cases of lines 14 and 16, we can simply keep a counter of the number of blocks merged into B_a. Having merged all the blocks that intersect δ(B, a) for each input a, the states of block B will continue to make transitions into the same block after any subsequent merges; hence we do not need to examine them again. This is the reason for deleting all of them but one from the pending set of B (or from the pending set of the current block that contains B, if B has been merged into another block during this processing). Finally, when the ACTIVE list becomes empty, there is no more merging
needed and the INACTIVE list contains the blocks of a closed partition. During the process, all the blocks of ACTIVE and INACTIVE together form a partition of S, since we only merge blocks. When the algorithm terminates, the blocks in INACTIVE provide a closed partition π̄, because ACTIVE is empty and no image block intersects more than one block.

We now show that π̄ is the closure of π. Suppose that π* is a closed partition such that π ≤ π*. The only occasion on which we change the partition is when we merge blocks C that intersect δ(B, a). Since π ≤ π* and π* is closed, there is a block C* ∈ π* such that δ(B, a) ⊆ C*; consequently, every such C also intersects C*. Since the current partition still refines π* (this holds initially because π ≤ π*, and inductively it is preserved by each merge), there is C′ ∈ π* with C ⊆ C′, and C′ intersects C*. Since π* is a partition, C′ = C* and hence C ⊆ C*. Therefore, in examining an image block δ(B, a), we only merge blocks that lie within a single block of π*. This shows that π̄ ≤ π* and, therefore, Algorithm 1 constructs the closure of π.

For an efficient implementation, we can use doubly linked lists for ACTIVE and INACTIVE for easy insertions and deletions; it takes constant time to insert or delete an element of such a list. The pending subset of each block is kept as a linked list, so merging two pending sets is a link concatenation, a constant-time operation. As for the blocks themselves, we keep them in a Union-Find data structure (i.e., trees with path compression; see, e.g., [9, 1]), which supports the operations of unioning (merging) two blocks and finding the current block that contains a given state. The number of unions is obviously at most n − 1 (every union decreases the number of blocks by one). We claim that the number of finds is less than 2pn, where p = |I| denotes the number of inputs. To see this, for each state s, charge to the state itself the first time that we determine the blocks containing its images δ(s, a) under the various inputs a.
Every subsequent time that we perform a find operation on the images of s, it must be the case that the block B that contained s was previously merged into another block with s selected as the representative in pend(B); charge these finds to that previous merge of B. Thus every find operation is charged either to a state or to a merge operation, so the total number of finds is less than 2pn. From the performance of the Union-Find data structure, it follows that the complexity of the algorithm is O(pn α(pn, n)), where α is the inverse Ackermann function, a slowly growing function that is upper-bounded by 4 for all practically arising values of n.
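As a concrete reference for the data structure just described, the following is a standalone union-find sketch with path compression and union by rank (the class and method names are ours); with these two heuristics, a sequence of m operations on n elements takes O(m α(m, n)) time.

```python
class DisjointSets:
    """Union-find forest with union by rank and path compression."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # first pass: locate the root
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # second pass: compress the path to point directly at the root
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx
        if self.rank[rx] < self.rank[ry]:     # attach the shallower tree
            rx, ry = ry, rx                   # under the deeper one
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx
```

In the algorithm, each state is an element, a block is a tree, and merging blocks at line 13 is a union; the pending lists are kept separately, since union-find trees do not support enumerating a set's members efficiently.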
