Functional Decomposition of Composite Finite State Machines
Partha S. Roop and A. Sowmya
Department of Artificial Intelligence, School of Computer Science and Engineering, The University of New South Wales, Sydney, NSW 2052, AUSTRALIA
proop, [email protected]
Phone : +61-2-9385-3980 Fax : +61-2-9385-1814
Abstract. Many of the multiway general decompositions of finite state machines (FSMs) proposed in the past are concerned with the cost reduction of the eventual logic level implementation. In this paper we propose a new method of decomposing a new FSM model called Composite Finite State Machines (CFSMs), which is ideal for use in a microprocessor based system design environment. Our algorithm partitions the CFSM of the design functionality into a set of interacting CFSMs such that the partitioned CFSMs represent the different sub-functions of the design specification. Unlike existing FSM decomposition schemes, our algorithm is bottom-up and is able to determine suitable devices from a design library to implement the partitioned sub-functions. It is an extension of the behavioural mapping procedure proposed in [12], which addressed the implementability question for a single design function by mapping it to a particular device; our algorithm instead performs a behavioural mapping between a design function and a set of suitable devices.
1 Introduction
Behaviour mapping is often an important task performed by a CAD system to determine the implementability of a design function by an available device. This task has been automated recently by a mapping algorithm [12] that finds a device in a design library to implement a given design function. However, this algorithm fails when the design function needs to be implemented by a set of devices instead of a single device. In this paper we extend this behavioural mapping algorithm so that the question of implementability of a multi-functional design specification can be addressed using a set of available devices.
For sequential circuits, the behaviours are often represented as FSMs and hence, to perform design, some form of FSM partitioning has to be adopted. One of the earliest decompositions, proposed by Hartmanis and Stearns [8], could discover both parallel and cascade decompositions using algebraic decomposition. However, parallel and cascade decompositions have limited use in design. To overcome these limitations, a general decomposition method was proposed by Devadas and Newton [5]. In a general decomposition, each partitioned machine has information about the current state of the others. Later research in FSM decomposition focussed on different cost-directed techniques [1, 17] to improve the general decomposition.
Another class of FSM decompositions focusses on the separation of the data and control aspects of a computation for various applications. Wolf [16] proposed the decomposition of an FSM into a data register and a control FSM. The total number of states in the data register and the control FSM was much smaller than that of the original FSM before the decomposition. This method was used for the efficient implementation of a pipelined state transition graph generated by high level synthesis. Fahmy et al. [6] have proposed a very similar strategy of decomposing a Real Time Acceptor (RTA) into two state machines called the control structure (CS) and the data structure (DS). The necessary and sufficient condition for the decomposition of a behaviour graph into CS and DS has been formalized and is used as the basis for the synthesis algorithms for RTAs.
In this work, we propose the decomposition of a new FSM model called composite finite state machines (CFSMs), introduced by Mitra et al. in [12]. CFSMs were proposed to model the behaviour of microprocessor peripheral devices. Most of these devices have internal registers and also support operations on these registers.
In a CFSM model, operations on these data registers are abstracted out to another set of FSMs called subsidiary machines (SMs), and the abstract functional behaviour of the device is modelled by a single FSM called the primary machine (PM). This separation of data from control helps in reducing the number of states of the functionality being modelled and thus makes the corresponding behaviour mapping between CFSMs tractable.
Our CFSM partitioning algorithm splits the CFSM on a functional basis. Various functional partitioning approaches have been proposed for system design because of their advantages over structural partitioning [14, 15]. One of the popular functional partitioning techniques, known as procedure exlining, was proposed by F. Vahid [13]. Procedure exlining "divides a large set of statements into several procedures, where each procedure performs a distinct computation". The procedure exlining idea is very similar to the program slicing techniques of software engineering [4]. Program slicing identifies a set of tightly coupled modules within a given program, where each tightly coupled module performs a single specific function. Program slicing is performed by adopting standard data dependency analysis procedures. Our CFSM partitioning algorithm is inspired by the program slicing idea in the sense that it tries to identify the independent computations inside a CFSM specification. The general decomposition algorithms [5] try to identify the repetitive computations by attempting subgraph isomorphism within a sequential FSM, whereas our algorithm identifies the distinct computations within a CFSM. Identification of distinct computations is done bottom-up using the behavioural mapping algorithm proposed in [12].
The organization of the paper is as follows : Section 2 describes the CFSM formalism and the behaviour mapping procedure of Mitra et al. [12], which is extended in this work. Section 3 discusses the basic idea behind the CFSM split procedure and then presents the algorithm. Section 4 presents results and conclusions.
2 The CFSM Formalism and Behavioural Mapping
2.1 The CFSM Formalism
The Composite Finite State Machine (CFSM) model was proposed to capture the behaviour of microprocessor peripheral devices. The basic motivation was that such devices have data registers and support operations on these registers, and that representing the behaviour of such a device as a single sequential FSM resulted in a huge number of states. In a CAD system it is often required to implement a given behaviour by some available device. Other methods of state reduction, such as the EFSM model [3] and statecharts [7], were unsuitable as they could not be used easily to test the implementability of one behaviour by another. CFSMs were proposed in [12] to reduce the number of states in the behaviour while still allowing the implementability of one behaviour by another to be determined, which is a basic task performed by any CAD system.
A CFSM consists of a number of constituent machines, operating concurrently and communicating with each other. One of these constituent machines is termed the primary machine (PM), and represents the abstract functional behaviour of the overall CFSM. The other constituent machines are called subsidiary machines (SMs), which handle the data and memory operations only. The PM is represented by a state transition graph (STG), whereas the SMs are represented by the names of the functionalities they perform. The operational semantics of every constituent machine of a CFSM is based on the synchrony hypothesis [2]. Every state transition arc in the PM is labelled by a triplet e[c]/a, where e is an event that triggers the transition, c is a set of guard conditions that enables the actual firing of the transition, and a is a set of actions that are executed by the PM before entering the next state. The SMs are represented in a parameterized form, where the names indicate the functionality and the parameters specify the names of the data inputs to the respective function. In this way, although data input events are not used directly by the PM, it can call the respective SM by specifying the name of the SM and the data arguments, thus accessing the results of the particular function on the specified data inputs.
Example 1 : Let us consider the behaviour of an up-counter with a maximum count of MAXCNT-1. This counter outputs the count value at the instant it is stopped. A conventional FSM representation would require MAXCNT + 1 states, including a state for the 0 count and a state for overflow. A representation of this counter in the CFSM formalism results in a PM having four states, whose STG is shown in Figure 1.
[Figure 1: state transition graph of the PM, with states n1 to n4 and arcs ARC1 to ARC3]
DATA INPUTS : CLK (0:1)
CONTROL INPUTS : START, STOP
DATA OUTPUTS : DATA (0:MAXCNT-1)
CONTROL OUTPUTS : OVFL (0:1)
SMs : S1 : no-of(CLK+); S2 : S1 >= MAXCNT
EDGE LABELLINGS :
ARC1 : START / init(S1), init(S2)
ARC2 : STOP / DATA = S1, OVFL = 0
ARC3 : S2 / OVFL = 1
Fig. 1. CFSM of an Up-Counter
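As a rough illustration of Example 1, the following Python sketch encodes the edge labellings of Figure 1 and interprets them for a short event sequence. The class name `Arc`, the function `run_up_counter`, the event name `CLK+` as an input token, and in particular the source and target states assigned to each arc are assumptions made for illustration; the figure's topology is not recoverable from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Arc:
    name: str
    source: str
    target: str
    event: str                 # e: event that triggers the transition
    guard: Optional[str]       # c: guard condition (None when absent)
    actions: Tuple[str, ...]   # a: actions executed before entering target


# Edge labellings taken from Figure 1; source/target states are ASSUMED.
UP_COUNTER_PM = [
    Arc("ARC1", "n1", "n2", "START", None, ("init(S1)", "init(S2)")),
    Arc("ARC2", "n2", "n3", "STOP", None, ("DATA = S1", "OVFL = 0")),
    Arc("ARC3", "n2", "n4", "S2", None, ("OVFL = 1",)),
]

# States that appear on the three labelled arcs (hypothetical topology).
PM_STATES = {s for arc in UP_COUNTER_PM for s in (arc.source, arc.target)}


def flat_state_count(maxcnt: int) -> int:
    """States of a conventional FSM: counts 0..MAXCNT-1 plus overflow."""
    return maxcnt + 1


def run_up_counter(events, maxcnt):
    """Interpret the PM; S1 counts CLK+ events, S2 tests S1 >= MAXCNT."""
    state, s1, outputs = "n1", 0, {}
    for ev in events:
        if state == "n1" and ev == "START":        # ARC1
            state, s1 = "n2", 0
        elif state == "n2" and ev == "CLK+":
            s1 += 1                                # subsidiary machine S1
            if s1 >= maxcnt:                       # S2 becomes true: ARC3
                state, outputs = "n4", {"OVFL": 1}
        elif state == "n2" and ev == "STOP":       # ARC2
            state, outputs = "n3", {"DATA": s1, "OVFL": 0}
    return state, outputs
```

The point of the sketch is that the number of PM states stays fixed while `flat_state_count` grows with MAXCNT; the counting itself lives in the subsidiary machines.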
2.2 Behavioural Mapping
In a CAD system, behaviour mapping is often required to implement desired behaviours by some available devices. The selection of the device that can implement the behaviour, and the required interfacing, is usually done by human experts. The interface consists of the transformations that may have to be performed on the inputs and outputs of the device. This analysis has been automated in [12] by a behavioural mapping algorithm that maps the behaviour of the design functionality, in CFSM form, to the behaviour of a suitable device in a device library, also in CFSM form. The algorithm also derives the specification of the interface automatically. It determines whether a design function's PM F can be implemented by the PM of a device D. If this implementation is possible, the algorithm determines the specification of the interface I such that D.I can implement F. The specification of I consists of the transformations that are required so that the inputs of F can be bound to those of D and the outputs of D can be mapped to those of F. A library of available transformation operators is maintained to achieve the required transformations.
The behaviour mapping is not necessarily surjective, because a general device is capable of exhibiting a wide range of functionalities, only a subset of which is needed to meet the requirements of F. The FSM equivalence algorithm of [10], though similar, cannot address the implementability question and is also inadequate to derive the device's interface. The mapping algorithm works by performing an exhaustive traversal of the function and device PMs, establishing equivalences between the edges of the function and those of the device as the traversal progresses. An edge of F is made equivalent to an edge of D if their edge labellings are equivalent and they do not violate the existing equivalences. The existing equivalences thus act as the specification of I. The mapping algorithm is successful if for every path in F from start state to terminal state there exists an equivalent path in D from start state to terminal state.
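The consistency check on equivalences described above can be sketched as follows. This is only a simplified illustration, not the algorithm of [12]: guards and transformation operators are omitted, signals are matched positionally, and the names `match_edge` and `Edge` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (event, actions); guards and transformation operators are omitted here.
Edge = Tuple[str, Tuple[str, ...]]


def match_edge(f_edge: Edge, d_edge: Edge,
               bindings: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the extended bindings if f_edge can be equated with d_edge.

    Returns None when the labellings differ in shape or when a function
    signal would have to be bound to two different device signals.
    """
    f_event, f_actions = f_edge
    d_event, d_actions = d_edge
    if len(f_actions) != len(d_actions):
        return None
    extended = dict(bindings)          # never mutate the caller's bindings
    for f_sig, d_sig in [(f_event, d_event), *zip(f_actions, d_actions)]:
        if extended.setdefault(f_sig, d_sig) != d_sig:
            return None                # conflicts with an existing binding
    return extended


first = match_edge(("START", ("OUT",)), ("LOAD", ("VAL",)), {})
conflict = match_edge(("START", ("OUT",)), ("LATCH", ("VAL",)), first)
```

Here `first` records START bound to LOAD and OUT bound to VAL, while `conflict` is `None` because START is already bound to a different device signal.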
3 Functional Partitioning of CFSMs
The behaviour mapping algorithm discussed in the previous section determines the implementation of a design function by a single device. However, if the design function cannot be implemented by any single device but requires the interconnection of a set of devices, this algorithm fails to find an implementation. An improved algorithm is presented in this paper that solves this problem : implementation of a multi-functional CFSM specification by a set of devices. The intuitive idea is that the algorithm partitions the multi-functional CFSM F into a set of primitive functions F1 to Fn such that for each CFSM Fi there exists a device Di in the database that can implement Fi. For partitioning F we do not adopt any standard FSM decomposition algorithm, because such algorithms tend to use subgraph isomorphism to identify repetitive computations. A given CFSM may not contain any isomorphic subgraphs and yet be multi-functional. Hence we adopt a bottom-up strategy, in which we try to identify subgraphs within the function behaviour that are isomorphic to a set of device behaviours. Before presenting the CFSM partitioning approach, we briefly discuss how multi-functional design specifications are obtained in the CFSM formalism. Subsequently, we present three types of edge equivalences, which are mainly a consequence of the multi-functionality of F and hence were not part of the edge equivalence procedures presented in [12].
3.1 Idea behind composed behaviour modelling
Suppose we have two devices or functions connected as shown in Figure 2(a). Since the lines X and X' of the two devices are tied together (X being an output line and X' being an input line), any signal transition on X also affects X' instantaneously. Let us assume that in the device behaviour of B there is an edge in which an action Y = 1 is performed whenever X' goes high. Indirectly, we can say that X triggers the generation of the action Y = 1 in B. Hence, if the edges corresponding to this situation in the two devices are as depicted in Figure 2(b), then the resulting composed edges in the function behaviour will be as in Figure 2(c). This transformation preserves the semantics of the original edges since, as in statechart semantics [9], a causal ordering between the actions in a micro-step is assumed (e.g., if actions A1 and A2 occur as a result of event e and A1 is listed before A2 on the transition arc, then even though they occur simultaneously in the same time step, A1's micro-step is said to precede that of A2).
[Figure 2]
(a) Interconnection of two devices with the output line X of Device A connected to the input line X’ of Device B (Device B has the output line Y)
(b) Edges of the two devices to be composed : e(c) / X=1 ; X’+ / Y=1
(c) Composition of edges in the composed CFSM : e(c) / X=1 ; ε / Y=1
Fig. 2. Simple Composition Rule

The above rule is not a composition rule for composing two CFSMs automatically; it is only a rule of thumb for the designer to compose arcs of two devices between which interconnections of the kind explained above exist. We would like to stress at this point that the composed functionality is not obtained automatically but is input to the system by the designer.
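Although the paper applies this rule by hand, its effect on a pair of edges can be illustrated with a small Python sketch. The function `compose`, the tuple layout `(event, guard, actions)` and the `wires` dictionary are assumptions introduced here for illustration only.

```python
EPS = "ε"  # the empty event used on composed arcs


def compose(edge_a, edge_b, wires):
    """Apply the rule of thumb of Figure 2 to one pair of edges.

    edge  = (event, guard, actions)
    wires = {output line of device A: input line of device B}
    Returns the two composed edges, or None when B's event is not
    driven by an action on A's edge.
    """
    _, _, actions_a = edge_a
    event_b, guard_b, actions_b = edge_b
    # Rising-edge events of B that are raised by actions "line=1" on A's edge.
    raised = {
        wires[line] + "+"
        for line, _, value in (a.partition("=") for a in actions_a)
        if value == "1" and line in wires
    }
    if event_b not in raised:
        return None
    # B's event is replaced by ε; the ordering of the arcs keeps causality.
    return [edge_a, (EPS, guard_b, actions_b)]


composed = compose(("e", "c", ("X=1",)), ("X'+", None, ("Y=1",)), {"X": "X'"})
```

With the edges of Figure 2(b) as input, `composed` holds the two edges of Figure 2(c).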
3.2 Types of device and function edge equivalences
1. Straight Mapping : This is identical to the mapping done between the function and device edge labellings in the behavioural mapping algorithm of [12]. This process tries to establish a correspondence between the e[c]/a labelling on a function arc and that of a corresponding device arc. The newly produced transformations (which are termed bindings) should not conflict with the existing bindings (bindings produced by previous edge-labelling equivalences).
2. Concatenate and Map : Straight mapping produces the equivalence between one device edge and a single function edge. However, due to the introduction of ε transitions, it may be necessary to map a given device edge to a set of function edges (consider the situation shown in Figure 3(A)). In order to facilitate the mapping in such situations, we concatenate the actions on successive transitions in the function CFSM (over at most as many transitions as there are actions on the device edge being mapped). After the concatenation, the mapping process is identical to straight mapping.
3. Deepen the Map : Suppose the device and function edges are as shown in Figure 3(B). Such a situation in the function behaviour can occur when the input line of one device is connected to the output line of the other. In this case, the action on one of the arcs of the composite function drives the events of the succeeding arcs, so the events of the succeeding arcs are made ε during behaviour modelling. To facilitate the mapping to a device edge in such a situation, the action on the preceding arc is made to appear as the event of a succeeding arc, and the actions on the succeeding transitions are then gathered into this arc (over at most as many transitions as there are actions on the device edge being mapped). After the deepening, the mapping process is identical to straight mapping.

[Figure 3]
(A) CONCATENATE AND MAP
(a) Device edge to be mapped : e(c) / X, Y, Z
(b) Function edges prior to concatenation : e1(c1) / X1 ; ε / Y1 ; ε / Z1
(c) Function edge after concatenation : e1(c1) / X1, Y1, Z1
(B) DEEPEN THE MAP
(a) Device edge to be mapped : A1 / A2, A3
(b) Function edges before deepening : e1(c1) / A1’ ; ε / A2’ ; ε / A3’
(c) Function edges after deepening : e1(c1) / A1’ ; A1’ / A2’, A3’
Fig. 3. Device and Function Edge Equivalences
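The two rewriting steps of Figure 3 can be sketched in Python as below. This is an illustrative reading of the text, not the authors' implementation: edges are `(event, guard, actions)` tuples, only unguarded ε arcs are merged after the first one, and the last action of the preceding arc is assumed to be the one that drives the succeeding arcs. The names `concatenate` and `deepen` are hypothetical.

```python
EPS = "ε"


def concatenate(path, n_actions):
    """Concatenate and Map: merge actions of successive ε arcs into one edge.

    `path` is a list of (event, guard, actions) function edges; merging
    stops once `n_actions` actions (the count on the device edge) are held.
    """
    event, guard, actions = path[0]
    merged = list(actions)
    for ev, g, acts in path[1:]:
        if ev != EPS or g is not None or len(merged) >= n_actions:
            break
        merged.extend(acts)
    return (event, guard, tuple(merged))


def deepen(path, n_actions):
    """Deepen the Map: the preceding arc's action becomes the event of the
    next arc, which then gathers the actions of the following ε arcs."""
    first, rest = path[0], path[1:]
    if not rest or rest[0][0] != EPS:
        return list(path)              # nothing to deepen
    driver = first[2][-1]              # ASSUMED: last action drives the next arc
    guard = rest[0][1]                 # keep the guard of the first ε arc
    gathered = list(rest[0][2])
    for ev, g, acts in rest[1:]:
        if ev != EPS or g is not None or len(gathered) >= n_actions:
            break
        gathered.extend(acts)
    return [first, (driver, guard, tuple(gathered))]


fig3a = [("e1", "c1", ("X1",)), (EPS, None, ("Y1",)), (EPS, None, ("Z1",))]
fig3b = [("e1", "c1", ("A1'",)), (EPS, None, ("A2'",)), (EPS, None, ("A3'",))]
after_concat = concatenate(fig3a, 3)   # device edge e(c) / X, Y, Z has 3 actions
after_deepen = deepen(fig3b, 2)        # device edge A1 / A2, A3 has 2 actions
```

In both cases the rewritten function edges have the same shape as the device edge, so the straight mapping of item 1 can be applied afterwards.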
3.3 The CFSM-Split Algorithm
The algorithm assumes the following :
- A set of device behaviours is already encoded and stored in a design library in CFSM form.
- The design functionality is also acquired in CFSM form.
- The design functionality being modelled is inherently sequential.
The algorithm iterates over a set of device behaviours and tries to map a given device behaviour onto a part of the functional behaviour. The number of iterations can be reduced if the designer has some idea of which set of devices may implement the function; otherwise the iterations have to be done for all the devices in the database. The iterations can start with any chosen device. The algorithm performs a depth-first traversal of the device CFSM, starting with its start node. For each traversed edge, an equivalent edge (or a set of edges) is searched for in the function CFSM. Whenever an equivalent edge is found, we try to join it with previously found equivalent edges to construct a new partitioned CFSM (sub-function). The equivalent edges are joined exactly in the manner in which the corresponding edges appear in the device (an isomorphic subgraph of the device CFSM is constructed). During the depth-first traversal, if we fail to find an equivalent edge for any particular edge of the device, that branch is abandoned and the search continues in a new branch (all the edges mapped to the abandoned branch are removed from the new sub-function CFSM being constructed, and these edges are subsequently maintained as un-mapped edges in the function CFSM). This process of constructing the new partitioned CFSM is depicted in Figure 4. The process of mapping continues until all the branches in the device CFSM have been tried. The partitioned sub-function will be NULL if there is no path from the start node to a terminal node in the device for which equivalent edges exist in the function CFSM.
[Figure 4: three state transition graphs. (a) CFSM of the FUNCTION (labels A to J); (b) CFSM of the DEVICE (labels a to e); (c) CFSM of the Partitioned SUB-FUNCTION (labels A, I, G)]
Fig. 4. Construction of Partitioned CFSM

If the partitioned sub-function is NULL, the device is rejected and all edges in the function mapped to some edges of this device are unmapped. Otherwise, we give this device behaviour and the sub-function behaviour as input to the behaviour mapping algorithm of [12] to obtain the necessary interface I1 such that D1.I1 can implement the sub-function F1. This process of partitioning continues until the function behaviour F has been partitioned into sub-functions F1 to Fn such that for each Fi there exists a device Di which implements Fi.
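The overall iteration can be summarised by the following Python sketch. It is a deliberately simplified reading of the description above and makes several assumptions that are not in the paper: the device STG is taken to be acyclic, edge equivalence is supplied as a predicate `equivalent(device_label, function_label)` rather than computed from e[c]/a labellings, the first equivalent unmapped function edge is taken without backtracking over alternatives, and the subsequent call to the behaviour mapping algorithm of [12] is left out. All names (`explore`, `cfsm_split`, the toy library) are hypothetical.

```python
def explore(device, state, terminals, free, equivalent):
    """Depth-first traversal of an (assumed acyclic) device STG.

    device: {state: [(edge label, next state), ...]}
    free:   set of function edge labels that are still un-mapped
    Returns (sub_function_edges, used_function_edges), or None when no
    path from `state` to a terminal state has equivalent function edges.
    """
    if state in terminals:
        return [], set()
    edges, used = [], set()
    for d_label, nxt in device.get(state, []):
        candidates = sorted(free - used)
        match = next((f for f in candidates if equivalent(d_label, f)), None)
        if match is None:
            continue                   # no equivalent edge: abandon branch
        tail = explore(device, nxt, terminals, free - used - {match}, equivalent)
        if tail is None:
            continue                   # branch fails later: its edges stay free
        tail_edges, tail_used = tail
        # Join the edges exactly as they appear in the device.
        edges += [(state, match, nxt)] + tail_edges
        used |= {match} | tail_used
    return (edges, used) if edges else None


def cfsm_split(library, function_edges, equivalent):
    """Try every device; return ({device: sub-function}, un-mapped edges)."""
    free = set(function_edges)
    partition = {}
    for name, (device, start, terminals) in library.items():
        result = explore(device, start, terminals, free, equivalent)
        if result is None:
            continue                   # sub-function is NULL: reject device
        partition[name], used = result
        free -= used
    return partition, free


# Toy data, invented for illustration (not the graphs of Figure 4).
library = {
    "dev1": ({"s0": [("a", "s1")], "s1": [("b", "s2")]}, "s0", {"s2"}),
    "dev2": ({"t0": [("c", "t1")]}, "t0", {"t1"}),
    "dev3": ({"u0": [("z", "u1")]}, "u0", {"u1"}),
}
partition, unmapped = cfsm_split(
    library, {"A", "B", "C"}, lambda d, f: d.upper() == f)
```

In this toy run `dev1` and `dev2` each receive a sub-function that mirrors their own graph, `dev3` is rejected, and no function edge is left un-mapped. A fuller version would also handle loops in the device STG and apply the concatenation and deepening steps of Section 3.2 inside the equivalence test.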
Interconnecting the Mapped Devices. Once the CFSM-split algorithm successfully terminates, the set of selected devices can be interconnected to specify the system architecture. This interconnection can be derived by looking at the bindings generated for the individual devices. An I/O line of the function, when bound to lines Y and Z of two different devices, specifies the interconnection required between the lines Y and Z. The working of the CFSM-split algorithm is depicted by the following example.
Example. Consider the CFSM of an analog to digital converter (ADC) using an up-counter, a digital to analog converter (DAC) and a comparator. The CFSM of this composite functionality is shown in Figure 5. Figure 6(a), Figure 6(b) and Figure 6(c) depict the behaviour of a down counter, a DAC and a comparator respectively. Along with each figure, the respective partitioned CFSM for the device and the bindings generated by behaviour mapping are shown. Figure 6(d) shows the system architecture generated by the proposed algorithm.
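The interconnection step can be illustrated with the bindings reported in Figure 6. The sketch below is only an assumed encoding: each device's bindings are stored as a dictionary from function line to device line, and the function name `interconnections` and the device keys are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def interconnections(bindings):
    """Derive wires from per-device bindings {function line: device line}.

    A function line bound in two different devices yields one wire
    between the corresponding device lines.
    """
    by_line = defaultdict(list)
    for device, table in bindings.items():
        for f_line, d_line in table.items():
            by_line[f_line].append((device, d_line))
    wires = []
    for f_line, ends in by_line.items():
        wires.extend((f_line, a, b) for a, b in combinations(ends, 2))
    return wires


# Bindings as listed in Figure 6 (function line -> device line).
bindings = {
    "down_counter": {"START": "LOAD", "OUT": "mem(INVAL)-VAL",
                     "NEQ": "PULSE", "EQ": "LATCH"},
    "dac": {"OUT": "DIGI-IN", "AOUT": "AN-OUT"},
    "comparator": {"AOUT": "I1", "NEQ": "NOTEQUAL", "EQ": "EQUAL"},
}
wires = interconnections(bindings)
```

For these bindings the sketch yields four wires: the transformed counter output to DIGI-IN, AN-OUT to I1, NOTEQUAL to PULSE, and EQUAL to LATCH. START is bound in only one device and therefore remains an external line.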
[Figure 5: state transition graph with arcs A1 to A6]
DATA INPUTS : DIN, INP1, INP2, CLK
DATA OUTPUTS : OUT, AOUT
CONTROL INPUTS : START, STOP
CONTROL OUTPUTS : NEQ, EQ
SMs : S1 : no-of(CLK=), S2 : Analog-val(DIN)
EDGE LABELLINGS :
A1 : START / init(S1), init(S2), OUT=S1
A2 : ε / AOUT=S2
A3 : ε (INP1≠INP2) / NEQ=1
A4 : ε / OUT=S1
A5 : ε (INP1=INP2) / EQ=1
A6 : ε / OUT=S1
Fig. 5. CFSM of ADC using Up-Counter, DAC and Comparator
[Figure 6]
(a) CFSM of a Down-Counter and the Partitioned Sub-Function
CFSM of a down counter producing the count on every pulse (arcs arc1 to arc4) :
DATA INPUTS : PULSE, INVAL
DATA OUTPUTS : VAL
CONTROL INPUTS : LOAD, LATCH
CONTROL OUTPUTS : UNDER-FLOW
SMs : S11 : number-of(PULSE-), S12 : mem(INVAL), S13 : S11 > S12
EDGE LABELLINGS :
arc1 : LOAD / init(S11), init(S12), init(S13), VAL=S12
arc2 : PULSE- / VAL=S12-S11
arc3 : LATCH / VAL=S12-S11
arc4 : S13 / UNDER-FLOW=1
CFSM of the sub-function, constructed using arcs from the ADC behaviour (arcs a1 to a3) :
EDGE LABELLINGS :
a1 : START / init(S1), OUT=S1
a2 : NEQ / OUT=S1
a3 : EQ / OUT=S1
BINDINGS GENERATED BY BEHAVIOURAL MAPPING : START=LOAD, OUT=mem(INVAL)-VAL, NEQ=PULSE, EQ=LATCH

(b) CFSM of a DAC and the Partitioned Sub-Function
CFSM of the DAC (arc a1) :
DATA INPUTS : DIGI-IN
DATA OUTPUTS : AN-OUT
CONTROL INPUTS : (none)
CONTROL OUTPUTS : (none)
SMs : S21 : Analog-Val(DIGI-IN)
EDGE LABELLINGS :
a1 : DIGI-IN / AN-OUT=S21
CFSM of the sub-function corresponding to the behaviour of the DAC (arc A1) :
EDGE LABELLINGS :
A1 : OUT / AOUT=S2
BINDINGS GENERATED BY BEHAVIOURAL MAPPING : OUT=DIGI-IN, AOUT=AN-OUT

(c) CFSM of a Comparator and the Partitioned Sub-Function
CFSM of the comparator (arcs arc1, arc2) :
DATA INPUTS : I1, I2
DATA OUTPUTS : (none)
CONTROL INPUTS : (none)
CONTROL OUTPUTS : EQUAL, NOTEQUAL
EDGE LABELLINGS :
arc1 : I1 (I1=I2) / EQUAL=1
arc2 : I1 (I1≠I2) / NOTEQUAL=1
CFSM of the sub-function corresponding to the comparator (arcs ARC1, ARC2) :
EDGE LABELLINGS :
ARC1 : AOUT (INP1=INP2) / EQ=1
ARC2 : AOUT (INP1≠INP2) / NEQ=1
BINDINGS GENERATED BY BEHAVIOURAL MAPPING : AOUT=I1, NEQ=NOTEQUAL, EQ=EQUAL

(d) System Architecture : block diagram of the DOWN COUNTER, the DAC and the comparator (COMP), with line labels LOAD, PULSE, INVAL, LATCH, mem(INVAL)-VAL, DIGI-IN, AN-OUT, I1, I2, EQUAL, NOTEQUAL, OUT, DIGITAL OUTPUT and ANALOG-INPUT
Fig. 6. Mapped Device Behaviours along with the Partitioned Sub-Functions and the System Architecture

4 Results and Conclusions
The following designs have been obtained using the CFSM-Split technique proposed in this paper :
1. ADC using up-counter, DAC and comparator
2. ADC using down-counter, DAC and comparator
3. A printer and microprocessor interface using handshaking (8255).
The main contributions of the paper are : i) the development of a new, bottom-up approach to partitioning FSMs on a functional basis using a set of device behaviours, and ii) the automatic derivation of the system architecture as a consequence of the functional partitioning and behavioural mapping. A new approach to composing CFSMs during behaviour modelling has also been proposed.
The CFSM partitioning algorithm is similar to the program slicing algorithm [4] in the sense that it collects the transitions belonging to a particular distinct computation from the function CFSM and reconstructs a new CFSM. In program slicing, too, different statements belonging to a particular sub-function are collected from different parts of a program and combined to give a program performing a single specific function. However, unlike program slicing, we cannot use standard data dependency procedures to identify distinct computations. Therefore, we have adopted a bottom-up strategy that compares the function with a set of device behaviours and constructs subgraphs isomorphic to some of these devices.
In a CAD system, behavioural mapping is often required to map a desired behaviour to a given device or a set of devices. The automation of this mapping from a function to a single device was proposed in [12]. In this paper, we have proposed a generalized algorithm that extends this work to the mapping of a design function to a set of devices. The interfaces required for each device and the necessary interconnections between the devices are also derived automatically. The present version of the algorithm is applicable only if the functionality being modelled is inherently sequential. However, in many of the practical problems considered in our domain, there exists a sequential flow of control between the devices.
Moreover, work is now in progress to remove this restriction, so that more general designs can be attempted. The algorithm also assumes that a given device is used only once in a given functionality. A slightly modified algorithm can be developed to overcome this restriction. The present algorithm derives a system architecture but does not address the question of optimising performance versus cost for the derived architecture. This problem can be addressed by extending the CFSM-Split algorithm using certain hardware/software codesign [11] techniques.
References
1. P. Ashar, S. Devadas, and A. R. Newton. A unified approach to decomposition and re-decomposition of sequential machines. In Proc. of 27th DAC, pages 601-606, 1990.
2. G. Berry and G. Gonthier. The Esterel synchronous programming language. Sci. Comput. Prog., 19:87-152, 1992.
3. K. T. Cheng and A. S. Krishnakumar. Automatic functional test generation using the extended finite state machine model. In Proc. 30th DAC, pages 86-91, 1993.
4. I. S. Chung and Y. R. Kown. An approach to partitioning programs on the functional basis and applications. Microprocessing and Microprogramming, 40:315-326, 1994.
5. S. Devadas and A. R. Newton. Decomposition and factorization of sequential finite state machines. IEEE Transactions on CAD, 8:1206-1217, 1989.
6. A. F. Fahmy and A. W. Biermann. Synthesis of real time acceptors. Journal of Symbolic Computation, 15:807-842, 1993.
7. D. Harel. Statecharts : a visual formalism for complex systems. Sci. Comput. Prog., 8:231-274, 1987.
8. J. Hartmanis and R. E. Stearns. Algebraic Structure Theory of Sequential Machines. Prentice Hall, 1966.
9. J. Hooman, S. Ramesh, and W. P. de Roever. A compositional semantics for statecharts. Theoretical Computer Science, 1992.
10. J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 1979.
11. R. S. Mitra, P. S. Roop, and A. Basu. An overview of MICKEY - a knowledge based hardware-software codesign framework for microprocessor-based systems. Sadhana - Academy Proceedings in Engineering Sciences, 1996.
12. R. S. Mitra, P. S. Roop, and A. Basu. A new algorithm for implementation of design functions by available devices. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 4(2):170-180, June 1996.
13. F. Vahid. Procedure exlining: A transformation for improved system and behavioral synthesis. In Proc. Int. Symp. System Synthesis, pages 84-89, 1995.
14. F. Vahid and D. D. Gajski. Closeness metrics for system-level functional partitioning. In Proc. of European DAC, pages 328-333, 1995.
15. F. Vahid, T. Dm Le, and Y. Hsu. A comparison of functional and structural partitioning. In Proc. of the Int. Symp. on System Synthesis, pages 121-126, La Jolla, CA, 1996.
16. Wayne Wolf. FSM decomposition for pipelined data. INTEGRATION, the VLSI journal, 15:117-131, 1993.
17. W. L. Yang, R. M. Owens, and M. J. Irwin. Lower bound study on interconnect complexity of decomposed finite state machines. IEE Proc. - Comput. Digit. Tech., 142(5), September 1995.