21st International Conference on VLSI Design
A Module Checking based Converter Synthesis Approach for SoCs Roopak Sinha, Partha S Roop and Samik Basu
Abstract: Protocol conversion involves the use of a converter to control communication between two or more protocols such that desired system-level specifications can be satisfied. We investigate this problem in a formal setting and propose, for the first time, a temporal logic based automatic solution to convertibility verification and synthesis. At its core, our technique is based on local module checking: it determines whether a converter exists and, if one does, generates it automatically. A number of key features distinguish our technique from all existing formal and/or informal approaches. Firstly, we handle both data and control mismatches using a single unifying module checking based solution. Secondly, the proposed approach uses temporal logic for the specification of correct behaviors (unlike earlier automaton based specifications), which is both an elegant and a natural way to express event ordering and data-matching requirements. Finally, we have experimented extensively with the examples available in existing literature to evaluate the applicability of our technique in a wide range of applications.
Fig. 1. However, the masters and slave have inherent control and data mismatches (explained in later sections), which prevent their integration into the AHB system. Protocol conversion, in this case, creates a converter for each master (shown in Fig. 1) such that the mismatches are eliminated.
Roopak Sinha is a PhD student at the Department of Electrical and Computer Engineering, University of Auckland, [email protected]
Partha S Roop is a Senior Lecturer at the Department of Electrical and Computer Engineering, University of Auckland, New Zealand. [email protected]
Samik Basu is an Assistant Professor at the Department of Computer Science, Iowa State University. [email protected]
1063-9667/08 $25.00 © 2008 IEEE DOI 10.1109/VLSI.2008.109
A system-on-a-chip (SoC) contains individual processing and peripheral components (called intellectual property or IP blocks) connected together using a common bus [8]. Components in a SoC are usually developed in isolation and may follow independent communication protocols. Therefore, when several components are interconnected, they may suffer from protocol mismatches [11]. Mismatches occur when the exchange of control signals and/or data between components is not consistent with the intended behaviour of their interaction [8], [15], [18] (leading to control and/or data mismatches). Resolving mismatches normally requires redesigning the mismatched components to achieve the desired system-level behaviour, which is usually a very expensive process. Due to this overhead, protocol conversion, a term broadly used to refer to techniques that resolve mismatches without requiring manual modification of components, has been studied extensively for over two decades [3], [8], [15], [16], [19]. Protocol conversion typically involves the automatic generation of extra glue-logic, called a converter, to control the communication between components in order to satisfy system-level behaviour. Consider the example of a SoC that uses the AMBA high-performance bus (AHB) [8] to connect two masters: a producer processor and a consumer processor. These two processors in the SoC [20] communicate using the slave RAM block to read/write shared data, as shown in
Index Terms— protocol mismatches, protocol conversion, module checking.
I. INTRODUCTION
Fig. 1. Protocol conversion overview.
A formal protocol conversion technique concerns itself with a range of issues. Firstly, the participating protocols, their interaction, and the specifications must be formally described. A protocol conversion technique must also be able to detect mismatches (mismatch detection) and have an algorithm to automatically generate converters if mismatches exist (converter generation). Additional issues include determining scope (the range of mismatches that can be handled), converter existence (whether a converter that resolves the mismatches exists), and converter correctness (whether a given converter indeed bridges the mismatches). The answers to the above questions differ between individual protocol conversion techniques.
Related Work. Existing protocol conversion techniques can be broadly categorized as informal or formal. Informal approaches like [2]–[4] and [17] lack mathematical rigor, have very restricted scope and focus mainly on converter generation without addressing the questions of converter correctness and existence. Formal approaches, like [8], [10] and [19], on the other hand, are based on mathematical techniques and proofs and solve protocol conversion within well-defined but restricted scopes that differ for each technique. The authors of [19] present a game-theoretic formulation to resolve control mismatches between protocols with only unidirectional communication and do not address data mismatches. The work in [8], [9] provides synchronous protocol automata to precisely model protocols; the solution, based on checking for a compatibility relation between protocols, can only handle a restricted set of data mismatches along with control mismatches. Additionally, model-checking based verification for proving the correctness of the synthesized converter is performed as an additional step. In [10], a hybrid approach to protocol conversion in SoC designs is proposed that combines simulation and formal verification. However, the questions of converter correctness and generation are not addressed comprehensively.
Protocol conversion seems superficially similar to the problem of controller synthesis in discrete event systems [1], [12], [22]. In this setting, controllers perform selective disabling of controllable events to control a given plant. However, in protocol conversion, converters may use additional techniques like event buffering and forwarding, and generation of extra control signals (described in later sections) in addition to disabling. Also, data mismatches require additional effort and cannot be addressed in the discrete-event setting.
Given the above summary of available protocol conversion techniques, it is evident that no single unifying approach to automatically resolve control and data mismatches in a formal setting exists. With the increasing use of SoCs, the lack of such a technique significantly increases design effort.
Our solution. This paper provides the first protocol conversion technique to handle control and data mismatches under a unifying framework. Its key features and contributions are as follows:
• Protocol representation, interaction and detection of mismatches: We use Synchronous Kripke Structures (SKS) to formally represent protocols and their interaction. SKS are finite state descriptions that precisely model the control and data behaviour of SoC protocols. The SKS description also leads to straightforward mismatch detection.
• Specifications: The temporal logic CTL is used to describe the intended control and data behaviour of the interaction between protocols. Temporal logics provide a natural way of writing such requirements succinctly and effectively. Although this paper presents CTL-based conversion, our approach can be extended to other temporal logics like LTL and CTL∗.
• Converter generation: We develop a module checking [14] based formulation for converter generation. Model checking is not used as it can only verify whether protocols satisfy given CTL properties. Using module checking, however, we can construct an environment (converter) which can control protocols to satisfy given properties. This is in fact the first known practical application of module checking.
• Converter representation and control: In our approach, converters are described using SKS. A converter controls protocols by using techniques such as disabling, buffering and event forwarding, and generation of extra control signals to bridge mismatches. Converters generated using our approach can, for the first time in the protocol conversion setting, deal with arbitrary data-widths between two protocols.
• Converter existence and correctness: Converters generated by our algorithm are guaranteed to bridge protocol mismatches and satisfy given specifications. Also, given a protocol pair and a set of constraints on conversion, if a converter cannot be generated by our technique, mismatches cannot be resolved by any converter having the same features as converters in our setting.
The rest of this paper is organized as follows. Section II provides the description of protocols and specifications. Section III describes how converters operate. Section IV describes how converters are automatically synthesized in our setting. Finally, section V provides results obtained from a wide range of protocol mismatch problems, with concluding remarks in section VI.
II. PROTOCOLS AND SPECIFICATIONS
A. Protocol description and interaction
We define Synchronous Kripke Structures for protocol description as follows.
Definition 1 (SKS): A Synchronous Kripke structure (SKS) is a finite state machine represented as a tuple $\langle AP, S, s_0, I, O, R, L, clk \rangle$ where AP is a set of propositions; S is a finite set of states with $s_0 \in S$ being the initial state; I is a finite non-empty set of inputs and O is a finite non-empty set of outputs; $R \subseteq S \times \{t\} \times B(I) \times 2^{O} \times S$ is the transition relation, where B(I) represents the set of all boolean formulas over I; and $L : S \to 2^{AP}$ is the state labelling function. Finally, the event t represents the ticking of the clock clk.
All transitions trigger with respect to the ticks of the clock clk and a boolean combination of inputs. A transition can therefore trigger with respect to the presence or absence of a single input a (represented as a or ¬a respectively), or a boolean combination of several inputs (e.g. ¬a ∧ b ∨ ¬c). In our current setting, the same clk is used to drive all protocols and converters. Hence we remove references to clk and its ticks from the SKS transitions described henceforth. Also, for transitions of the type $(s, t, b, o, s') \in R$, we use the shorthand $s \xrightarrow{b/o} s'$.
States of a SKS are labelled using atomic propositions. In addition, we use some atomic propositions that have an integer suffix to indicate data input or output over ports of specific widths. These are subsequently used by our algorithm to address data-width mismatches (illustrated in section IV).
Fig. 2 presents the SKS description of the various parts of the AMBA AHB-based SoC system presented in Fig. 1. The SKS PS for master 1 (producer) consists of 7 states with s0 as its initial state (Fig. 2(a)). In s0, the master keeps requesting bus access by emitting the HBUSREQ1 signal every tick. When it receives the grant signal HGRANT1, it moves to state s1, from where it chooses to perform either a single write operation or an incrementing burst operation. A burst transfer (transition to s2) is selected when the signal int, denoting an internal choice made by the protocol, is present, whilst a single write (s3) is performed when int is absent. To perform a burst operation, the protocol emits the relevant control signals to indicate an incrementing burst operation (and keeps requesting further bus access using the HBUSREQ1 output) and reaches state s2, where it writes a 16-bit data packet (denoted by the label DOut16) onto the AHB's data bus. It then waits for the HREADY signal, which signifies a successful read by the slave, while persistently requesting bus access. If HREADY is received before HGRANT1, implying that the master no longer has bus access, the protocol resets back to s0. However, if HGRANT1 is available when HREADY is read, the protocol moves to state s6 to write another 16-bit packet. When the protocol chooses to perform a single write, it writes 16-bit data in state s3 and then moves back to its initial state upon confirmation that the data has been received (HREADY). Fig. 2(b) shows the SKS PT for the bus-arbiter.
In its initial state t0, the arbiter awaits bus request signals HBUSREQ1 and HBUSREQ2 from masters 1 and 2 respectively, and grants access to the first requester. Once access has been granted, the arbiter waits for the completion of a transfer by awaiting the HREADY signal, and then moves back to its initial state. The SKS for the slave memory block PU is shown in Fig. 2(c). In its initial state u0, the slave waits for the signal HSELECT1 (its enable signal) to activate read/write options in state u1. In u1, if the write signal HWRITE has been activated by a bus-master, the memory reads 32-bit data from the AHB's data bus (u2). Otherwise, it writes 32-bit data to the AHB's data bus (u3). Finally, the SKS PV for master 2 (consumer) is shown in Fig. 2(d). From its initial state v0, it keeps requesting bus access by emitting the HBUSREQ2 signal. When the grant signal HGRANT2 is received, it moves to state v1, where it emits the control signals HADDR and HSINGLE (requesting a read from the slave) and moves to state v2. It then awaits HREADY before moving to state v3, where it reads a 16-bit packet from the AHB's data bus. More 16-bit packets can be read if the signal MORE is present; otherwise the protocol resets back to its initial state.
Fig. 2. SKS representation for various parts of a SoC.
We now define the interaction between protocols, called their parallel composition.
Definition 2 (Parallel Composition): Given two SKS $P_1 = \langle AP_1, S_1, s_{0_1}, I_1, O_1, R_1, L_1, clk \rangle$ and $P_2 = \langle AP_2, S_2, s_{0_2}, I_2, O_2, R_2, L_2, clk \rangle$, their parallel composition is the SKS $P_1 \| P_2 = \langle AP_{1\|2}, S_{1\|2}, s_{0_{1\|2}}, I_{1\|2}, O_{1\|2}, R_{1\|2}, L_{1\|2}, clk \rangle$ where $AP_{1\|2} = AP_1 \cup AP_2$; $S_{1\|2} = S_1 \times S_2$; $s_{0_{1\|2}} = (s_{0_1}, s_{0_2})$; $I_{1\|2} \subseteq I_1 \cup I_2$; $O_{1\|2} = O_1 \cup O_2$; and $L_{1\|2}((s_1, s_2)) = L_1(s_1) \cup L_2(s_2)$. Finally, the transition relation $R_{1\|2} \subseteq S_{1\|2} \times \{t\} \times B(I_{1\|2}) \times 2^{O_{1\|2}} \times S_{1\|2}$ is such that:
$$\frac{(s_1 \xrightarrow{b_1/o_1} s_1') \;\wedge\; (s_2 \xrightarrow{b_2/o_2} s_2')}{(s_1, s_2) \xrightarrow{b_1 \wedge b_2 \,/\, o_1 \cup o_2} (s_1', s_2')}$$
Intuitively, the parallel composition contains all possible states and transitions that can be reached by making simultaneous transitions from each protocol. For example, given the bus-arbiter PT and the memory slave PU , the parallel composition PT ||PU has the initial state (t0 , u0 ) which will have transitions to states
(t0 , u0 ), (t0 , u1 ), (t1 , u0 ) and (t1 , u1 ). Each transition of the combined state (t0 , u0 ) combines a transition of t0 (which can reach t0 and t1 ) and another transition of u0 (which can reach u0 and u1 ).
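As an illustration of Definition 2, the following Python sketch composes two small SKS and lists the successors of the composite initial state. It is a simplified model under stated assumptions, not the authors' implementation: guards are kept as opaque strings, the clock is implicit, the composite input set is taken to be the full union I1 ∪ I2, and the two fragments (`arbiter`, `slave`), including their guards and the labels Idle_t and Idle_u, are hypothetical stand-ins for the initial-state transitions of PT and PU.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class SKS:
    """Simplified SKS: guards are opaque strings and the clock is implicit."""
    ap: set
    states: set
    initial: object
    inputs: set
    outputs: set
    transitions: set  # tuples (source, guard, frozenset_of_outputs, target)
    labels: dict      # state -> set of atomic propositions


def compose(p1, p2):
    """Parallel composition: both protocols take one transition per tick."""
    transitions = set()
    for (s1, b1, o1, d1), (s2, b2, o2, d2) in product(p1.transitions, p2.transitions):
        # Guards are conjoined, outputs are united, states are paired.
        transitions.add(((s1, s2), f"({b1}) & ({b2})", o1 | o2, (d1, d2)))
    states = set(product(p1.states, p2.states))
    return SKS(
        ap=p1.ap | p2.ap,
        states=states,
        initial=(p1.initial, p2.initial),
        inputs=p1.inputs | p2.inputs,  # assumption: the full union is used
        outputs=p1.outputs | p2.outputs,
        transitions=transitions,
        labels={(a, b): p1.labels[a] | p2.labels[b] for a, b in states},
    )


def successors(p, state):
    """All states reachable from `state` in one tick."""
    return {dst for (src, _guard, _out, dst) in p.transitions if src == state}


# Hypothetical fragments: only the transitions leaving the initial states.
arbiter = SKS(
    ap={"Idle_t"}, states={"t0", "t1"}, initial="t0",
    inputs={"HBUSREQ2"}, outputs={"HGRANT2"},
    transitions={
        ("t0", "!HBUSREQ2", frozenset(), "t0"),
        ("t0", "HBUSREQ2", frozenset({"HGRANT2"}), "t1"),
    },
    labels={"t0": {"Idle_t"}, "t1": set()},
)
slave = SKS(
    ap={"Idle_u"}, states={"u0", "u1"}, initial="u0",
    inputs={"HSELECT1"}, outputs=set(),
    transitions={
        ("u0", "!HSELECT1", frozenset(), "u0"),
        ("u0", "HSELECT1", frozenset(), "u1"),
    },
    labels={"u0": {"Idle_u"}, "u1": set()},
)

system = compose(arbiter, slave)
print(sorted(successors(system, system.initial)))
# [('t0', 'u0'), ('t0', 'u1'), ('t1', 'u0'), ('t1', 'u1')]
```

The four successors match the example in the text: each composite transition pairs one transition of t0 with one transition of u0.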
Given the SKS descriptions of the various parts of a SoC system, conversion can be performed between various components, such as the bus-arbiter and a master, a slave and a master, or a master and the rest of the system. In this paper, we perform conversion between the consumer processor PV and the combined bus-arbiter and slave memory chip represented by PT ||PU (Fig. 2). The two SKS have certain inherent mismatches. The consumer master reads 16-bit data whilst the memory writes 32-bit data, leading to a data-width mismatch. Similarly, the sequence of control signals exchanged between them may result in executions contrary to the intended behaviour.
B. Specifications in CTL
In our setting, the temporal logic CTL is used to describe the desired control and data behaviour of the interaction between mismatched protocols. CTL is defined over a set of propositions using temporal and boolean operators as follows:
ϕ → p | ¬p | tt | ff | ϕ ∧ ϕ | ϕ ∨ ϕ | AXϕ | EXϕ | A(ϕ U ϕ) | E(ϕ U ϕ) | AGϕ | EGϕ
The semantics of a CTL formula ϕ, denoted by [[ϕ]]_M, is given in terms of the set of states of a SKS M that satisfy the formula. A state s ∈ S_M is said to satisfy a CTL formula ϕ, denoted by M, s |= ϕ, if s ∈ [[ϕ]]_M. Typically, we omit the M in [[ ]]_M if the context is clear. We also write M |= ϕ to indicate M, s0 |= ϕ. In this paper, we restrict ourselves to formulas where negations are applied to propositions only.
Control constraints. For the consumer processor PV and the combined arbiter-memory protocols represented by PT ||PU, the following CTL control constraints can be used:
[ϕ1] AGEFStart (see footnote 1): The consumer must always eventually receive bus access in order to read data from the memory.
[ϕ2] AGEFDOut32: The combined system should also allow the memory slave to write data.
[ϕ3] AG((Idlev ∧ Idlet) ⇒ A(¬Start U Opt)): The master cannot move further from its initial state before the arbiter grants it bus access.
[ϕ4] AG((Idleu ∧ Idlet) ⇒ A(¬DOut32 U Wait)): The memory cannot write data unless requested by the master.
Data constraints. In addition to control constraints, it is also essential to describe the correct data communication behaviour of the participating protocols. To restrict protocols such that no data underflow or overflow happens, we introduce data counters as follows. Given the data-widths N and M (which are integers) of the outputs and inputs respectively, we compute the minimum capacity needed for the communication medium (usually a buffer) between the two protocols. If N < M, then the minimum capacity must be N × f such that f is the smallest integer for which N × f ≥ M; otherwise the minimum capacity is N. This assumption ensures that there are enough preceding outputs before any input. While the minimum bound of the communication medium buffer can be computed as above, the maximum bound can be any value greater than the minimum bound. In our setting, we assume that the maximum bound of the communication medium buffer is LCM(N, M). Given a capacity K of the communication medium between these bounds, the maximum number of outputs possible when the medium is empty is x = ⌊K/N⌋, while the maximum number of inputs possible when the medium is full is y = ⌊K/M⌋. We use an auxiliary counter for every input/output pair such that the counter is incremented by y for every output and decremented by x for every input.
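The capacity bounds and counter steps described above can be worked through with a short Python sketch. The function names are hypothetical, and `trace_within_bounds` only checks a single event sequence against the range 0 to x × y; it is not a CTL check over all paths of a protocol. The example values assume 32-bit outputs and 16-bit inputs, as in the memory and consumer master of the AMBA example.

```python
from math import gcd


def capacity_bounds(n_out, m_in):
    """Minimum and maximum capacity of the communication medium."""
    if n_out < m_in:
        f = -(-m_in // n_out)  # smallest integer f with n_out * f >= m_in
        minimum = n_out * f
    else:
        minimum = n_out
    maximum = n_out * m_in // gcd(n_out, m_in)  # LCM(N, M)
    return minimum, maximum


def counter_steps(k, n_out, m_in):
    """x = outputs that fit into an empty medium, y = inputs served by a full one."""
    return k // n_out, k // m_in


def trace_within_bounds(events, x, y):
    """Check one sequence of 'out'/'in' events against 0 <= counter <= x * y."""
    counter = 0
    for event in events:
        counter += y if event == "out" else -x
        if not 0 <= counter <= x * y:
            return False
    return True


print(capacity_bounds(32, 16))    # (32, 32)
x, y = counter_steps(32, 32, 16)
print(x, y)                       # 1 2
print(trace_within_bounds(["out", "in", "in"], x, y))  # True
print(trace_within_bounds(["out", "out"], x, y))       # False
```

With these values, one 32-bit write raises the counter to 2 and each 16-bit read lowers it by 1, so a second write before both reads have happened leaves the permitted range.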
We then verify that the counter always remains between 0 and x × y using the CTL property AG(0 ≤ counter ≤ (x × y)).
Example. Consider the AMBA-based SoC example in Fig. 2. Data outputs from the arbiter-memory protocol PT ||PU are 32 bits wide while data inputs of the consumer master are 16 bits wide. Hence, N = 32 and M = 16. As N > M, the communication medium capacity needs to be at least 32 bits. As the bus connecting these protocols is the AMBA AHB, which has a data-bus of size 32, the protocols can possibly be handled in our setting (if the data-bus were narrower than 32 bits, our technique would fail). Given this 32-bit capacity of the communication medium, the maximum number of write operations possible when the medium is empty is x = ⌊K/N⌋ = 1. Similarly, the maximum number of read operations is y = ⌊K/M⌋ = 2. Given these values for x and y, we introduce a counter variable counter which is incremented by 2 (y) for each DOut32 and decremented by 1 (x) for each DIn16. To verify that the counter always remains between 0 and x × y = 2, we use the following property: ϕd ≡ AG(0 ≤ counter ≤ 2).
In addition to overflow/underflow prevention using the above CTL formula, stronger restrictions can be described. For example, the formula AG((counter = (x × y)) ⇒ A(¬DOut U (counter = 0))) requires that once the communication medium is completely full, all data on it is read completely before more data is added. Such constraints cannot be handled by existing protocol conversion techniques. Furthermore, multiple counters can be used, allowing conversion for protocols with multiple data-communication channels.
Footnote 1: EFp is an abbreviation for E(tt U p).
III. CONVERTERS: DESCRIPTION AND CONTROL
In the presented approach, converters are described using SKS.
Definition 3 (Converter): Given two protocols $P_1 = \langle AP_1, S_1, s_{0_1}, I_1, O_1, R_1, L_1, clk \rangle$ and $P_2 = \langle AP_2, S_2, s_{0_2}, I_2, O_2, R_2, L_2, clk \rangle$ with their parallel composition $P_1 \| P_2 = \langle AP_{1\|2}, S_{1\|2}, s_{0_{1\|2}}, I_{1\|2}, O_{1\|2}, R_{1\|2}, L_{1\|2}, clk \rangle$, a converter C for $P_1$ and $P_2$ is a SKS $\langle AP_C, S_C, c_0, I_C, O_C, R_C, L_C \rangle$ where $AP_C = \emptyset$; each $c \in S_C$ corresponds to some $s \in S_{1\|2}$, with the initial state $c_0$ corresponding to $s_{0_{1\|2}}$; $I_C \subseteq (O_1 \cup O_2)$ and $O_C \subseteq (I_1 \cup I_2)$. The transition relation $R_C \subseteq S_C \times \{t\} \times B(I_C) \times 2^{O_C} \times S_C$ is such that for any $c \in S_C$ corresponding to $s \in S_{1\|2}$, if $s \xrightarrow{b/o} s'$, then c can have a transition $c \xrightarrow{b'/o'} c'$ where $c'$ corresponds to $s'$, $b' = \bigwedge_{i=1}^{|o|} o_i$ (with $o_i \in o$) is the conjunction of all elements of o, and $o'$ is a set that satisfies the boolean formula b.
Converter states are not required to satisfy any temporal property ($AP_C = \emptyset$) as all desired propositional properties must be satisfied by the participating protocols.
A converter acts as an intermediary between the participating protocols. Converters read outputs generated by the participating protocols and produce outputs that form the inputs to the protocols. This basic strategy helps converters control protocols in the following ways (see footnote 2):
• Disabling: Converters may prevent signals emitted by one protocol from being visible to another protocol. This helps to disable transitions that may lead to faulty states and paths.
• Buffering and event forwarding: Converters contain buffers which can store signals emitted by one protocol so that they can be forwarded to another protocol at a later stage.
• Generation of missing control signals: A converter can generate signals which are required by a protocol but are not generated by any other protocol that it controls.
Footnote 2: Note that controllers in discrete-event systems [13] can only perform disabling.
Converters exercise lock-step control over protocols. Each converter state c corresponds to a single state s in the parallel composition of the participating protocols. A transition $s \xrightarrow{b/o} s'$ of s is allowed only when c has a matching transition of the form $c \xrightarrow{b'/o'} c'$, and both transitions are made simultaneously (relating c′ to s′). Of course, the inputs required by a transition might be present as outputs of the same transition, forwarded from the converter's buffers, and/or be generated by the converter itself.
Fig. 3 shows the converter for the consumer master and arbiter-slave protocols. In its initial state, it lets the consumer processor PV keep requesting bus access until a grant HGRANT2 is received from the arbiter. The grant is immediately provided to PV, enabling it to make a transition from its initial state v0 to v1, and the converter makes a transition from c0 to c1. The converter then receives the control signals from PV, which include the slave-memory enable signal HSELECT1, which is immediately conveyed to the slave. In the next tick, both the slave-memory and the consumer processor make transitions without requiring or emitting any signals. At this stage, the slave-memory makes a transition to u3, allowing it to write a 32-bit packet to the data-bus. After writing, the slave-memory emits HREADY, marking the end of the transaction. The converter passes this signal to the master, allowing it to read a 16-bit packet from the data-bus. The converter then generates the missing input MORE to enable the master to read another 16-bit packet, ensuring that the 32 bits written on the data-bus are read completely. In the next tick, the converter allows the master to reset back to its initial state.
Fig. 3. Converter for the arbiter-slave and consumer processor protocols.
IV. CONVERTER GENERATION ALGORITHM
The algorithm to automatically generate converters in our setting is based on module checking [14], also known as model checking for open systems. Model checking can merely check whether given protocols satisfy given CTL properties. Module checking, on the other hand, is used to construct an environment (converter) under which the given protocols satisfy given CTL formulas. We demonstrate the working of the algorithm by showing how the converter presented in Fig. 3 is obtained for the AMBA AHB based SoC example provided in Fig. 2.
Initialization. The main inputs to the algorithm are the SKS descriptions of two participating protocols (each one can be a composition of multiple protocols), a set of counters (one for each matching data input/output pair), and a set of CTL formulas to be satisfied. Additionally, each input/output of the protocols must be identified as either controllable or uncontrollable. Controllable protocol inputs can be disabled by a converter, as opposed to uncontrollable inputs, which can never be disabled. Uncontrollable inputs model signals that are generated internally by protocols or by other IPs that the converter has no control over. Furthermore, each controllable signal must be further classified as either buffered or non-buffered. Buffered signals are those that can only be presented to a protocol if they have previously been generated by another protocol (and hence buffered by the converter).
Non-buffered signals, on the other hand, may be generated by the converter without reading them from other protocols. Finally, the parallel composition of the participating protocols is computed and used during the tableau generation phase. To carry out converter generation between the consumer master protocol PV and the arbiter-memory slave pair PT ||PU, we provide the following inputs to the algorithm. PV and PT ||PU form the SKS descriptions of the participating protocols. We introduce one counter for the data communication between the slave and the master, and the input specification set contains the formulas ϕ1, ϕ2 and ϕd (see section II-B). Furthermore, signals like HBUSREQ2, HGRANT2, HREADY, HSELECT1 etc. that are emitted by one protocol and read by the other are marked as controllable and buffered. The input MORE to the master PV is marked controllable
and non-buffered because it is not emitted in PT ||PU. The input HBUSREQ1 to the arbiter, which is not provided by PV, is similarly marked controllable. Finally, the parallel composition PV ||(PT ||PU) is computed. Note that properties ϕ3 and ϕ4, which describe the correct control signal sequencing between the master and the arbiter-slave pair, are not included in the specification set. This is because these properties are handled implicitly by our algorithm. Consider, for example, the property ϕ3 (AG((Idlev ∧ Idlet) ⇒ A(¬Start U Opt))), which requires the master to move to the state labelled by Start (v1) only after the arbiter has granted access by moving to state t1. Now, the master can only move to v1 when the signal HGRANT2 is provided to it by the arbiter. Furthermore, the arbiter only emits this signal when it makes a transition to t1. Because HGRANT2 is marked as a buffered signal, any transition in the master using HGRANT2 is disabled by the converter until the signal has been read (and buffered) from the arbiter-slave pair. This achieves the behaviour intended by ϕ3, making its addition to the specification set redundant. ϕ4 is handled in a similar manner. Tableau Generation. Given the parallel composition of the participating protocols, the algorithm attempts to build a successful tableau for the parallel composition given the control and data constraints to be satisfied. A tableau is a graph that contains a finite number of nodes and edges. Each node NODE relates to a state in the parallel composition of the participating protocols, a set FS of CTL formulas, a set I of counter valuations, a set H of already visited nodes to ensure termination, and a set E of events that have been buffered at the node. The tableau generation starts by creating a root node which corresponds to the initial state of the parallel composition, the original set of CTL properties to be satisfied, a set of counter valuations where each counter is set to 0, and empty H and E sets.
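The components of a tableau node listed above can be captured in a small record type. The Python sketch below is only one possible encoding, with hypothetical field and function names; formulas are kept as opaque strings, and the record is immutable so that nodes can be stored in the visited set H.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableauNode:
    state: tuple          # state of the parallel composition, e.g. (v, t, u)
    formulas: frozenset   # FS: CTL formulas still to be satisfied
    counters: tuple       # I: counter valuations as (name, value) pairs
    visited: frozenset    # H: already visited nodes, used for termination
    buffered: frozenset   # E: events buffered at this node


def root_node(initial_state, formulas, counter_names):
    """Root node: every counter starts at 0, and H and E start empty."""
    return TableauNode(
        state=initial_state,
        formulas=frozenset(formulas),
        counters=tuple((name, 0) for name in counter_names),
        visited=frozenset(),
        buffered=frozenset(),
    )


root = root_node(("v0", "t0", "u0"), {"phi1", "phi2", "phi_d"}, ["counter"])
print(root.counters)  # (('counter', 0),)
```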
The above root node is passed to the main recursive function. Given such a NODE, its corresponding state s is checked against the formulas in FS. The function either returns the NODE, indicating that a converter for the given node can be achieved, or returns a false-node, indicating failure. If the recursive function returns successfully for the root node, the algorithm returns tt, implying that a converter can be automatically constructed. Given the parallel composition PV ||(PT ||PU), each tableau node relates to a state (v, t, u) in the composite system, a valuation for counter, and a set of CTL formulas. The root node corresponds to the initial state (v0, t0, u0) of PV ||(PT ||PU), the formula set {ϕ1, ϕ2, ϕd}, the set I where counter is set to 0, and empty H and E sets. The above root node is passed to the main function, which proceeds as follows. Each formula in FS ({ϕ1, ϕ2, ϕd}) is broken down into present-state and next-state commitments. Present-state commitments are those which must be satisfied by the state s corresponding to the current node. Next-state commitments are formulas of the type AX or EX which must be satisfied by the successors of s. The steps involved in breaking down the formula set FS for the root node of the consumer master and arbiter-slave example are shown in Tab. II. Consider how the formula ϕ1 = AGEFStart is processed (Tab. II, steps 1-3). The formula is initially broken down into the conjunction EFStart ∧ AXAGEFStart and the conjuncts are added back to FS. The formula EFStart is a present-state commitment whereas AXAGEFStart is a next-state commitment. EFStart is processed further to form the disjunction Start ∨ EXEFStart. The first disjunct Start is a proposition and is checked against the labels of (v0, t0, u0). As the state is not labelled with Start, the other disjunct EXEFStart must be satisfied by s. However, this formula is not broken down further because it is a next-state commitment. The other formulas are processed in a similar manner until FS only contains next-state commitments (see step 8, Tab. II).

TABLE I. SUCCESSORS OF (v0, t0, u0).
| Inputs/Outputs | State | Type |
|---|---|---|
| ¬(HBUSREQ1 ∧ HBUSREQ2) ∧ ¬HSELECT1/{HBUSREQ2} | (v0, t0, u0) | Enabled |
| ¬(HBUSREQ1 ∧ HBUSREQ2) ∧ HSELECT1/{HBUSREQ2} | (v0, t0, u1) | Disabled (buffering) |
| ¬(HBUSREQ1 ∧ HBUSREQ2) ∧ HSELECT1 ∧ HGRANT2/{} | (v1, t0, u1) | Disabled (buffering) |
| ¬(HBUSREQ1 ∧ HBUSREQ2) ∧ ¬HSELECT1 ∧ HGRANT2/{} | (v1, t0, u0) | Disabled (buffering) |
| HBUSREQ1 ∧ ¬HSELECT1/{HGRANT1, HBUSREQ2} | (v0, t1, u0) | Disabled (buffering) |
| HBUSREQ1 ∧ HSELECT1/{HGRANT1, HBUSREQ2} | (v0, t1, u1) | Disabled (buffering) |
| HBUSREQ1 ∧ HSELECT1 ∧ HGRANT2/{HGRANT1} | (v1, t1, u1) | Disabled (buffering) |
| HBUSREQ2 ∧ ¬HSELECT1 ∧ HGRANT2/{HGRANT1} | (v1, t1, u0) | Disabled (buffering) |
| HBUSREQ2 ∧ ¬HSELECT1/{HGRANT2, HBUSREQ2} | (v0, t1, u0) | Enabled |
| HBUSREQ2 ∧ HSELECT1/{HGRANT2, HBUSREQ2} | (v0, t1, u1) | Disabled (buffering) |
| HBUSREQ2 ∧ HSELECT1 ∧ HGRANT2/{HGRANT2} | (v1, t1, u1) | Disabled (buffering) |
| HBUSREQ2 ∧ ¬HSELECT1 ∧ HGRANT2/{HGRANT2} | (v1, t1, u0) | Enabled |

TABLE II. PROCESSING THE ROOT NODE
| Step | FS |
|---|---|
| 1 | {AGEFStart, ϕ2, ϕd} |
| 2 | {EFStart, AXAGEFStart, ϕ2, ϕd} |
| 3 | {Start ∨ EXEFStart, AXAGEFStart, ϕ2, ϕd} |
| ... | ... |
| 8 | {EXEFStart, EXEFDOut32, AXAGEFStart, AXAGEFDOut32, AXAG(0 ≤ counter ≤ 2)} |
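The breakdown of FS into present-state and next-state commitments can be sketched in Python as follows. This is a simplified illustration rather than the authors' algorithm: formulas are encoded as nested tuples (a hypothetical encoding), only the operators needed for ϕ1 and ϕ2 are handled (propositions, ∧, ∨, AX, EX, AG and EF), the counter property ϕd is left out, and the labels Idle_v, Idle_t and Idle_u assumed for (v0, t0, u0) are taken from the propositions used in ϕ3 and ϕ4.

```python
def next_commitments(formulas, labels):
    """Reduce a formula set to AX/EX commitments for one state.

    Returns the set of next-state commitments, or None when a present-state
    commitment cannot be satisfied by a state carrying `labels`.
    """
    pending = list(formulas)
    commitments = set()
    while pending:
        f = pending.pop()
        op = f[0]
        if op in ("AX", "EX"):
            commitments.add(f)                        # next-state: keep as is
        elif op == "prop":
            if f[1] not in labels:                    # present-state check
                return None
        elif op == "and":
            pending += [f[1], f[2]]
        elif op == "or":
            for disjunct in (f[1], f[2]):             # try each disjunct in turn
                rest = next_commitments(pending + [disjunct], labels)
                if rest is not None:
                    return commitments | rest
            return None
        elif op == "AG":                              # AG p  ==  p AND AX AG p
            pending += [f[1], ("AX", f)]
        elif op == "EF":                              # EF p  ==  p OR EX EF p
            pending.append(("or", f[1], ("EX", f)))
        else:
            raise ValueError(f"unsupported operator: {op}")
    return commitments


start, dout = ("prop", "Start"), ("prop", "DOut32")
phi1 = ("AG", ("EF", start))
phi2 = ("AG", ("EF", dout))

root_labels = {"Idle_v", "Idle_t", "Idle_u"}  # assumed labels of (v0, t0, u0)
for commitment in sorted(next_commitments({phi1, phi2}, root_labels), key=str):
    print(commitment)
```

For the root state, neither Start nor DOut32 is among the labels, so the result consists of EXEFStart, EXEFDOut32, AXAGEFStart and AXAGEFDOut32, which corresponds to step 8 of Tab. II without the counter formula.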
Once only next-state commitments remain in FS, the algorithm passes these commitments to the successors of s in the following manner. Firstly, all AX-formulas in FS are aggregated in the set FS AX whilst the set FS EX contains the remaining EX formulas of FS. Next, a conforming subset of the set of successors of s is computed. Given the state s with n successors, there are 2^n − 1 possible non-empty successor subsets. Some of these successors must always be enabled by the converter as the events triggering transitions from s to such successors are uncontrollable by the converter. Hence a conforming subset must contain all such successors. Similarly, some successors of s cannot be reached because the signals required to trigger transitions from s to such successors are not buffered in E. Hence, a conforming subset must not contain any such successors. Take for example the root node for the consumer master and arbiter-slave pair. After all present-state commitments are checked, FS AX and FS EX are computed. FS AX is the formula set {AGEFStart, AGEFDOut32 , AG(0 ≤ counter ≤ 2)} whilst FS EX is {EFStart, EFDOut32 } (see FS in Step 8, Tab. II). Next, a conforming subset Succ is selected. Although the state (v0 , t0 , u0 ) has a number of successors, only three can possibly be enabled (see Tab. I). This is because signals like HREADY, HBUSREQ2 and HGRANT2 are buffered signals and transitions involving the presence of such signals can only be triggered if the signals are present in E (which is empty for the root node) or are present as outputs in the corresponding transition (emitted at the same time as when they are required). Hence a conforming
subset of (v0 , t0 , u0 ) must not contain any disabled transitions and contain at least one enabled transition. Then, for each state s′ in the conforming subset Succ, a node NODE′ is formed (with NODE as its parent). All AX commitments contained in FS AX are passed to every NODE′ . Each formula in FS EX is distributed as a commitment to any one of the newly created nodes. I is updated to I′ by checking the labels of s′ (some labels may increment/decrement a counter), H′ is updated to include the node NODE (the parent of NODE′ ) and E′ contains all signals emitted in the transition from s to s′ along with any remaining signals in E (some buffered signals may have been used for the transition from s to s′ ). Then, the same recursive process described above is used to check if each NODE′ satisfies all commitments (all AX commitments and some EX commitments) passed to it. If any NODE′ returns a failure, we move to select a different distribution of the formulas in FS EX for the children nodes. If a distribution that allows the satisfaction of all future commitments is found, the node NODE′ is returned (signifying success). On the other hand, if no distribution of the formulas in FS EX returns success, another conforming subset is chosen. If no conforming subset that satisfies the future commitments of NODE under any possible distribution can be found, we return failure. If a successful tableau is generated by the above procedure, a converter is automatically extracted. A tableau node refers to a state s in the parallel composition of participating protocols, a specific valuation I of all counter variables, a set of CTL formulas FS and a set of buffered events E. The children of each successful tableau node represent the transitions that can be enabled by the converter in s (given specific I, FS and E). This information is stored in the converter which can then guide each state s of the protocol parallel composition such that given constraints are met. 
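The search space explored at each tableau node, described above, can be sketched as two small Python generators: one for the conforming successor subsets and one for the ways of distributing the EX commitments among the chosen children. This is an illustrative sketch only. The function names and the state names `s1` to `s4` are hypothetical, and the sets of uncontrollable and disabled successors are assumed to be disjoint and computed elsewhere.

```python
from itertools import combinations, product


def conforming_subsets(successors, uncontrollable, disabled):
    """Yield every non-empty successor subset that contains all
    uncontrollable successors and no disabled successor."""
    must = [s for s in successors if s in uncontrollable and s not in disabled]
    optional = [s for s in successors
                if s not in uncontrollable and s not in disabled]
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            subset = must + list(extra)
            if subset:                      # the empty subset never conforms
                yield subset


def ex_distributions(ex_formulas, children):
    """Yield every assignment of each EX commitment to exactly one child."""
    for choice in product(children, repeat=len(ex_formulas)):
        yield dict(zip(ex_formulas, choice))


# Hypothetical node: s1 cannot be blocked by the converter, and s4 needs a
# buffered signal that is not available.
subsets = list(conforming_subsets(["s1", "s2", "s3", "s4"],
                                  uncontrollable={"s1"}, disabled={"s4"}))
print(subsets)   # [['s1'], ['s1', 's2'], ['s1', 's3'], ['s1', 's2', 's3']]

dists = list(ex_distributions(["EFStart", "EFDOut32"], ["s1", "s2"]))
print(len(dists))  # 4
```

A tableau procedure in this style would try each subset in turn and, for each subset, each distribution, backtracking to the next candidate whenever a child node fails.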
For the consumer-master and arbiter-slave example, a successful tableau can be generated. The converter shown in Fig. 3 is generated automatically by processing the successful tableau in the manner described above.

Complexity. The complexity of the algorithm can be obtained from the number of recursive calls. It is of the order O(|I| × 2^|S| × 2^|FS_i| × 2^|E|), where |I| is the size of the counter set, |S| is the size of the state space of the parallel composition of participating protocols, |FS_i| is the size of the set of formulas to be satisfied by the initial state of the protocol composition, and |E| is the maximum size of the buffered signal set contained in the converter.

Converter existence, correctness and algorithm termination.
No. | Name | No. of states in P1||P2 | Properties | Result
1 | Handshake-serial [19] | 4 | No read before corr. write (ϕ11). Outputs a and b alternate along every path (ϕ12). | Success
1.1 | Handshake-serial (data-mismatch) | 12 | ϕ11, ϕ12. Data is eventually consumed before more is written. | Success
1.2 | 2-way communication | 12 | ϕ11, ϕ12 | Success
2 | ABP sender (8-bit data), NS receiver (8-bit data) [5] | 18 | Each output is eventually read (ϕ21). Another output allowed only after an input (ϕ22). | Success
2.1 | 16-bit ABP, 8-bit NS | 18 | ϕ21, ϕ22 | Success
2.2 | 8-bit ABP, 16-bit NS | 18 | ϕ21, ϕ22 | Success
3 | ABP receiver (8-bit data), NS sender (8-bit data) [5] | 24 | Each output is eventually read (ϕ31). Another output allowed only after an input (ϕ32). | Success
3.1 | 16-bit ABP, 8-bit NS | 24 | ϕ31, ϕ32 | Success
3.2 | 8-bit ABP, 16-bit NS; NS sender (16-bit data) | 24 | ϕ31, ϕ32 | Success
4 | Poll-End receiver (8-bit data), Ack-Nack sender (8-bit data) [21] | 6 | No overflow or underflow during data communication (ϕ41). | Success
4.1 | 8-bit poll-end, 16-bit Ack-Nack | 9 | ⇒ ϕ41 | Success
4.2 | 16-bit poll-end, 8-bit Ack-Nack | 9 | ⇒ ϕ41 | Failure
5 | 16-bit Master, 8-bit slave | 9 | Correct handshaking sequence. Each output is consumed before another output. | Success
6 | Mutex | 16 | Mutual exclusion always achieved | Success
7 | 8-bit Reader, 24-bit writer | 12 | Correct handshaking sequence (ϕ71). No data overflows or underflows (ϕ72). Both protocols reset after each transaction (ϕ73). | Success
7.1 | 5-bit Reader, 4-bit writer | 12 | ϕ71, ϕ72, ϕ73 | Success
7.2 | 2-bit Reader, 9-bit writer | 12 | ϕ71, ϕ72, ϕ73 | Success
8 | MCP missionaries | 30 | All missionary-cannibal pairs transported without loss | Success
9 | 4-bit ABP sender-modified receiver | 166432 | Sender can always eventually read data. | Success
10 | SoC Master-Slave | 15 | Each protocol resets after one transaction. | Success
11 | Pipeline-Handshake [7] | 24 | Pipeline can always read data in a transaction. | Success
12 | AMBA AHB-master and BVCI Target | 20 | Data is eventually read. | Success
13 | 16-bit Producer and AMBA AHB, 32-bit consumer, arbiter, 32-bit mem | 224 | Correct control signal sequencing. No data loss and fairness. | Success
14 | Producer, consumer, arbiter, memory on AMBA ASB | 224 | Correct control signal sequencing. No data mismatches and fairness. | Success
15 | Producer, consumer, arbiter, memory on AMBA APB | 224 | Correct control signal sequencing. No data mismatches and fairness. | Failure
16 | Modified producer, consumer, arbiter, memory on AMBA APB | 128 | Correct control signal sequencing. No data mismatches and fairness. | Success

TABLE III: BENCHMARKING RESULTS
The following theorem establishes that our approach comprehensively handles the questions of converter correctness and existence. It can also be proved that the given algorithm always terminates, even in the presence of bounded counters and recursive CTL formulas.

Theorem 1 (Sound and Complete): Given the protocol composition SKS P1||P2 = ⟨AP_{1||2}, S_{1||2}, s0_{1||2}, ΣI_{1||2}, ΣO_{1||2}, R_{1||2}, L_{1||2}, clk⟩ of two protocols P1 and P2, an initial set of counter valuations I, a set E of event buffering rules describing the bufferable signals, and a set of CTL formulas FS, a converter that can control the protocol composition to satisfy all properties in FS exists iff the tableau generation algorithm returns tt.

V. RESULTS

The conversion algorithm has been implemented in C/C++ by extending the NuSMV model checker [6]. Tab. III provides conversion results obtained from a number of mismatched protocol pairs that have been chosen to showcase the capabilities of the proposed approach. The first two columns describe the mismatched protocols and the third column contains the number of states in their parallel composition. The description of the CTL properties used for conversion is provided in the next column. The final column states the result of conversion (success/failure). Problems 1–4 are classical protocol conversion examples explored in earlier works on protocol conversion [5], [19], [21]. Our
approach is able to handle these examples in a similar manner to earlier works and can also handle variations that extend them. These variations, which involve bidirectional communication between protocols and data-width mismatches, cannot be handled by the earlier works that explored these problems. Problems 5–9 are synthetically created benchmarks which model commonly encountered protocol mismatches such as control sequencing. Problems 8 and 9 are well-known NuSMV examples that were modified to create control sequencing mismatches. For these examples, our approach was able to generate converters that ensure proper control sequencing and mutual exclusion. Furthermore, problem 7 (and its variants 7.1 and 7.2) demonstrates the ability of our algorithm to handle arbitrary data-width mismatches. No other existing technique can handle arbitrary data-widths. Finally, some SoC protocols are presented in problems 10–16. These problems model control and data mismatches that cannot be handled by other protocol conversion techniques. Problem 13 looks at matching the producer-master PS with the rest of the SoC system presented in this paper (Fig. 2). Problems 14–16 look at the same problem but use different AMBA buses, namely ASB (Advanced System Bus) and APB (Advanced Peripheral Bus).

In some cases our algorithm fails to generate a converter. Problem 4.2 fails because the 8-bit Ack-Nack sender is unable to write more than 8 bits of data per transaction while the poll-end receiver reads 16 bits, leading to underflows. Problem 15 fails because of the inability of the AMBA APB to allow burst transfers. When failures occur, the mismatches between the protocols cannot be bridged by any converter (that exercises the same control over protocols as in our setting) under the given inputs (protocols, properties, buffering and counter rules; see Theorem 1) and therefore the inputs need to be modified manually. Although our technique fails to generate a converter for some problems, it provides traces (counter-examples) that may help during component modification. Faulty traces returned by our algorithm can pinpoint which CTL formula and protocol state caused a failure. Failures caused by empty signal buffers or uncontrollable actions can also be revealed.

Fig. 4 shows how the number of states in the generated converter changed when the total number of reads and writes per transaction in the reader-writer protocol pair (problem 7) was varied. In the first sequence (sequential i/o), the data communication behaviour was constrained such that in each transaction, the writer must first write data to completely fill the communication medium (fixed at x × y, where x and y are the weights for reads and writes respectively) and the reader must then completely read all data. For this purpose, CTL properties of the form AG(TransactionBegin ⇒ A(¬Read U counter = (x × y))) and AG(counter = (x × y) ⇒ A(¬Write U (counter = 0 ∧ TransactionEnd))) were used. In the second case (interleaved i/o), the protocols were allowed to arbitrarily read and write data as long as counter boundaries were not breached. It was observed that as the total size of data to be exchanged was increased, the number of states in the converter also increased in an almost linear fashion. The number of converter states in the sequential i/o case never exceeded the interleaved case because of the stronger data constraints used. Converters were generated even when the data sizes were not integer multiples of each other.

Fig. 4. Reader-writer protocol pair (number of states in converter against total data transferred (x × y); series: sequential i/o and interleaved i/o).

These results demonstrate that we have developed the first formal technique for the SoC protocol conversion problem that can handle control mismatches, data-width mismatches and additional specification verification (CTL constraints) using a single unifying solution. We have also demonstrated, through a series of experiments and a practical SoC application based on the AMBA bus, that the proposed approach can be used in real SoC designs. Another advantage of this technique is the feedback obtained in case of a failure to match protocols.

VI. CONCLUSIONS

In this paper, we present a unifying approach towards performing conversion for protocols with control and data mismatches. Earlier works could only handle control or data mismatches separately, or handle both in a restricted manner. Our fully automated approach is based on CTL module checking, allowing temporal logic specifications for both control and data constraints. Converters synthesized in our setting are capable of buffering signals and can handle several protocol mismatch problems. We also present comprehensive experimental results to show the practical applicability of our approach. We are currently working on extending the proposed approach to deal with clock mismatches. We claim that the same algorithm can be extended with little effort to handle extensions to the logic, while clock mismatches can be represented and addressed by using multi-clock Synchronous Kripke Structures [20].

REFERENCES
[1] M Antoniotti. Synthesis and verification of discrete controllers for robotics and manufacturing devices with temporal logic and the ControlD system. PhD thesis, New York University, New York, 1995.
[2] G V Bochmann. Deriving protocol converters for communication gateways. IEEE Transactions on Communications, 38(9):1298–1300, September 1990.
[3] G Borriello and R H Katz. Synthesis and optimization of interface transducer logic. In Digest of Technical Papers of the IEEE International Conference on Computer-Aided Design, pages 274–277, 1987.
[4] F M Burg and N D Iorio. Networking of networks: Interworking according to OSI. IEEE Journal on Selected Areas in Communications, 7(7):1131–1142, September 1989.
[5] K L Calvert and S S Lam. Formal methods for protocol conversion. IEEE Journal on Selected Areas in Communications, 8(1):127–142, 1990.
[6] R Cavada, A Cimatti, G Keighren, E Olivetti, M Pistore, and M Roveri. NuSMV 2.1 User Manual, 2006.
[7] V D'Silva, S Ramesh, and A Sowmya. Bridge over troubled wrappers: Automated interface synthesis. In VLSID'04, 2004.
[8] V D'Silva, S Ramesh, and A Sowmya. Synchronous protocol automata: A framework for modelling and verification of SoC communication architectures. In DATE, pages 390–395, 2004.
[9] V D'Silva, S Ramesh, and A Sowmya. Synchronous protocol automata: A framework for modelling and verification of SoC communication architectures. IEE Proc. Computers & Digital Techniques, 152(1):20–27, 2005.
[10] S Gorai, S Biswas, L Bhatia, P Tiwari, and R S Mishra. Directed-simulation assisted formal verification of serial protocol and bridge. In Proceedings of the 43rd Annual Conference on Design Automation (DAC '06), pages 731–736, 2006.
[11] P Green. Protocol conversion. IEEE Transactions on Communications, 34(3):257–268, March 1986.
[12] S Jiang and R Kumar. Supervisory control of discrete event systems with CTL* temporal logic specifications. SIAM Journal on Control and Optimization, 44(6):2079–2103, 2006.
[13] R Kumar and S S Nelvagal. Protocol conversion using supervisory control techniques. In IEEE International Symposium on Computer-Aided Control System Design, pages 32–37, 1996.
[14] O Kupferman, M Y Vardi, and P Wolper. Module checking. Information and Computation, 164:322–344, 2001.
[15] S Lam. Protocol conversion. IEEE Transactions on Software Engineering, 14(3):353–362, 1988.
[16] F Maraninchi and Y Remond. Argos: An automaton-based synchronous language. Computer Languages, 27:61–92, 2001.
[17] S Narayan and D Gajski. Interfacing incompatible protocols using interface process generation. In Design Automation Conference, pages 468–473, 1995.
[18] K Okumura. A formal protocol conversion method. In ACM SIGCOMM 86 Symposium, pages 30–37, 1986.
[19] R Passerone, L de Alfaro, T A Henzinger, and A L Sangiovanni-Vincentelli. Convertibility verification and converter synthesis: Two faces of the same coin. In International Conference on Computer Aided Design (ICCAD), 2002.
[20] I Radojevic, Z Salcic, and P S Roop. McCharts and multiclock FSMs for modelling large scale systems. In Fifth ACM-IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE'2007), 2007. Accepted for publication.
[21] M Rajagopal and R E Miller. Synthesizing a protocol converter from executable protocol traces. IEEE Transactions on Computers, 40(4):487–499, 1991.
[22] P J G Ramadge and W M Wonham. The control of discrete event systems. Proceedings of the IEEE, 77(1):81–98, January 1989.