Compositional Verification of Middleware-Based Software Architecture Descriptions

Mauro Caporuscio, Paola Inverardi, Patrizio Pelliccione
Dipartimento di Informatica, Università dell'Aquila, I-67010 L'Aquila, Italy
{caporusc, inverard, pellicci}@di.univaq.it

Abstract

In this paper we present a compositional reasoning approach to verify middleware-based software architecture descriptions. We consider a typical present-day software development scenario, namely the development of a software application A on a middleware M. Our goal is to efficiently integrate verification techniques, such as model checking, into the software life cycle in order to improve the overall software quality. The approach exploits the structure imposed on the system by the software architecture in order to develop an assume-guarantee methodology that reduces property verification from global to local. We apply the methodology to a non-trivial case study, namely the development of a Gnutella system on top of the SIENA event-notification middleware.
1 Introduction

In this paper we present a compositional reasoning approach to verify middleware-based Software Architecture (SA) descriptions. We consider a typical present-day software development scenario, namely the development of a software application A on a middleware M [15]. Our goal is to efficiently integrate verification techniques into the software life cycle in order to improve the overall software quality. To this end we make two crucial choices that we will fully motivate later on. One concerns the software system abstraction we want to verify, i.e. the system software architecture. The other concerns the verification technique we decide to apply, i.e. compositional model checking. Both choices affect the applicability of the verification technique. In recent years model checking has gained popularity due to its increasing use for software system verification, even in industrial contexts [12, 23]. However, the application of model checking techniques is still hindered by
the state explosion problem. As remarked by Gerard Holzmann in [22], no paper was published on reachability analysis techniques without a serious discussion of this problem. State explosion occurs either in systems composed of (not too) many interacting components, or in systems where data structures assume many different values. The number of global states easily becomes enormous and intractable. To solve this problem, many methods have been developed by exploiting different approaches [10]. They can be logically classified into two disjoint sets. The first set, which we call Internal Methods, comprises algorithms and techniques used inside the model checker in order to efficiently represent transition relations between concurrent processes, such as Binary Decision Diagrams [3] (used for synchronous processes) and Partial Order Reduction [27] techniques (used for asynchronous processes). The second set, which we call External Methods, includes techniques that operate on the input of the model checker (the models) and can be used in conjunction with Internal Methods. This set contains Abstraction [9], Symmetry [14] and Compositional Reasoning [16, 33].

In particular, Compositional Reasoning aims at decomposing a global system property specification into local properties that hold on small sub-parts of the system. This decomposition is meaningful if we know that the conjunction of the local properties on the system sub-parts implies the global property on the entire system. This suggests the use of software architectures as suitable system abstractions on which to carry out Compositional Reasoning efficiently. The SA specification [1, 32] represents the first complete system description in the development life cycle. It provides both a high-level behavioral abstraction of components and of their interactions (connectors) and a description of the static structure of the system.
The aim of SA descriptions is twofold: on one side they force the designer to separate architectural concerns from other design ones, thus abstracting away many details. On the other, they allow
Proceedings of the 26th International Conference on Software Engineering (ICSE’04) 0270-5257/04 $20.00 © 2004 IEEE
for analysis and validation of architectural choices, both behavioral and quantitative, in order to obtain better software quality in an increasingly shorter time-to-market development scenario [2]. Since we are interested in the verification of behavioral properties, we concentrate on SA behavioral descriptions. To this end we adopt notations in current industrial practice, namely state-based machines and scenarios [31, 25]. While state-based machines, such as State Diagrams or Labeled Transition Systems (LTS), describe the behavior of components, scenarios, e.g. Message Sequence Charts (MSC) or Sequence Diagrams, identify how they interact. Behavioral properties are usually specified by using Temporal Logic formulas, and in particular Linear Temporal Logic (LTL) [29]. In an industrial context it is unfeasible to write temporal logic formulae by hand, as pointed out by Holzmann [22]. Following the graphical approach, we propose to describe LTL properties by means of scenarios. The set of properties that can be specified in this way represents an expressive subset of LTL properties.

Summarizing, our aim is to model check middleware-based SA with respect to a subset of LTL system properties by means of Compositional Reasoning. Our approach exploits the structure imposed on the system by the SA.

The rest of the paper is organized as follows. Section 2 gives an informal overview of the approach and of the application context. Section 3 recalls notions on Compositional Reasoning and on the other formalisms we use. Section 4 presents the foundations of our approach. Section 5 applies the compositional verification to a non-trivial case study, namely the development of a Gnutella [19] system on the SIENA event-notification middleware [5, 6]. Related work is discussed in Section 6, and conclusions and future work are presented in Section 7.
2 Overview

In this section we briefly introduce our approach to verifying middleware-based applications at the architectural level. We assume that we have a high-level SA description of an application A (as shown in Figure 1.a) that identifies the components and the connections among them. Figure 1.b describes, by means of an MSC, the behavior of the whole application, showing how the components interact in order to provide the required system services. State diagrams for the internal behavior of components are not always mandatory in our approach, since it is possible to synthesize them starting from the MSC descriptions [34] of the whole application. As highlighted in the Introduction, we also use the MSC notation to represent the properties of interest. We have defined an algorithm [8] which, given an MSC, provides the corresponding LTL formula, taking into account the message order and the arrow shape that characterizes the communication type (synchronous, asynchronous, deferred synchronous and so on). For example, Figure 2 depicts a property for the SA of Figure 1 in both notations: the MSC and the LTL formula. The idea of the translation is the following: for each path in the system model, after a query(q, id1) message (the first message in the MSC) a hits(h2, id2) message is eventually sent back (the second message in the MSC).

[Figure 1. SA of a generic application A: a) the application A, composed of Servent 1, Servent 2 and Servent 3; b) an MSC of the interaction among the Servents, with query(q, id1) and hits(h2, id2) messages.]

[Figure 2. MSC and the corresponding LTL formula. MSC: Servent 1 sends query(q, id1) to Servent 2, and hits(h2, id2) is sent back. LTL formula: [](query(q, id1) -> hits(h2, id2)).]

Then we assume that the high-level SA description satisfies the set of properties Z derived from the set of MSCs¹. Usually, in specifying the high-level SA, the designer does not know how interactions will actually be achieved. Nowadays the development of distributed applications often relies on a supporting middleware infrastructure which provides the required communication services [15]. In architectural terms this means that the high-level SA will be refined into a more detailed SA that presents additional components. These represent the interface components towards the middleware and allow the application components to access the services offered by the middleware. Referring to Figure 3.a, the middleware communicates through the interfaces with the application components Servent1, Servent2, Servent3. In this context, the designer's challenge is to understand whether the behavioral properties held by A are still valid. In fact, due to the introduction of M that offers services to the application, the property Z may be falsified by the new SA.

¹ In the following we will talk interchangeably of Z as a set of properties or as a single property, since we can always reduce a set of formulas to a single conjunctive one.

[Figure 3. Architectural Refinement: a) the application A (Servent 1, Servent 2, Servent 3) connected through the Servent 1, Servent 2 and Servent 3 Interfaces to the Middleware Core M inside the middleware infrastructure; b) an MSC among Servent 1, Servent 1 Interface, M Core, Servent 2 Interface and Servent 2, with connect, query(q, id1) and hits(h, id2) messages.]

One approach to face this challenge is to model check the whole refined SA, obtained by composing components and middleware in parallel, with respect to the behavioral application properties [17, 26]. This may be very expensive, and the verification can fail because of state explosion. Moreover, this approach is not intuitive from the designer's point of view, since the verification does not take advantage of the designer's implicit assumption that the middleware behaves properly. These considerations naturally lead us to investigate a compositional approach. The idea is to decompose the verification of the global property Z of A into the verification of a number of properties that hold locally on the architectural components. Then, the validity of the local properties should imply the validity of Z. Of course this is not generally true, and we need to make some restrictions in order to reach our goal. The architectural structure helps in this direction. Following the designer's intuition, we would like to establish some properties of the middleware once and for all and then simply assume that these properties hold. Then, since the application A satisfies the property of interest Z, we would like to reduce the verification to proving only that A correctly uses the middleware communication infrastructure.

On the formal side, the approach we propose makes use of the following tools. Application components are modeled with (Reference) LTS. The middleware behavior is described by a set of properties expressed as LTL formulas. The property of interest, i.e. the application global property Z, is described by using Message Sequence Charts and then automatically translated into LTL formulas. The core of our work is the theorem on "Architectural Decomposability" detailed in Section 4. Some automation has been introduced to support and to facilitate its applicability to real case studies. In particular, we have defined algorithms to split an application global property Z into sets of properties that are local to the application components, to the interfaces and to the middleware, respectively. The starting point of these algorithms is the MSC representing the property Z. The use of this diagrammatic notation allows the definition of only a subset of LTL properties. However, this subset appears to be sufficiently expressive from the designer's perspective. At the same time, this restriction allows the application of the algorithms mentioned above. It is worth noticing that the theorem can be extended by allowing a generic property Z, at the expense of reducing the automation of the generation of local properties.

Once the verification problem is reduced to checking the validity of local properties on suitable sub-systems, we use the CHARMY (CHecking ARchitectural Model consistencY) framework [24]. CHARMY is a tool that automatically translates LTS diagrams (used to model the considered subsystem) into Promela code (Promela being the specification language of the model checker Spin [23]) and MSCs (used to represent the properties of interest) into LTL formulae. The verification of the Promela model against the properties is performed by using the model checker Spin.
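The translation idea described above (after the first message of the MSC, the following messages are eventually exchanged in order) can be illustrated with a small sketch. This is not the algorithm of [8], which also accounts for the arrow shape and communication type; it is a simplified illustration under the assumption that an MSC is given as an ordered list of message labels treated as opaque proposition names. The function name `msc_to_ltl` and the Spin-like operators `[]` and `<>` are choices made for this example only.

```python
from typing import List


def msc_to_ltl(messages: List[str]) -> str:
    """Turn an ordered list of MSC message labels into a response-chain
    formula: always, the first message is eventually followed by the others
    in order. Operators: [] = always, <> = eventually (Spin-like syntax)."""
    if len(messages) < 2:
        raise ValueError("need a trigger message and at least one response")
    # Build the nested 'eventually' chain from the last message backwards.
    chain = f"<>({messages[-1]})"
    for label in reversed(messages[1:-1]):
        chain = f"<>({label} && {chain})"
    return f"[](({messages[0]}) -> {chain})"


if __name__ == "__main__":
    # Hypothetical encoding of the two messages of the MSC in Figure 2.
    print(msc_to_ltl(["query(q,id1)", "hits(h2,id2)"]))
```

For a two-message MSC the sketch yields a single response obligation; longer MSCs produce nested eventualities that preserve the message order.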
3 Background In this section we briefly introduce the notions we have used within the remainder of this paper.
3.1 Specification tools

3.1.1 Labeled Transition System

Different models are nowadays used to specify the behavior of a process, e.g. Labeled Transition Systems (LTS), Input/Output Transition Systems, Finite State Machines, etc. We start from the standard LTS definition to define our Reference LTS (RLTS). An LTS is defined as follows:

Definition: Labeled Transition System. A Labeled Transition System is a quintuple $(S, L, S_0, S_F, T)$, where $S$ is the set of states, $L$ is the set of distinguished labels denoting the LTS alphabet, $S_0 \in S$ is the initial state, $S_F \subseteq S$ is the set of final states and $T = \{ \xrightarrow{l} \subseteq S \times S \mid l \in L \}$ is the transition relation labeled with elements of $L$.

In our approach we use a notational variant of LTS called Reference LTS, whose definition follows:

Definition: Reference LTS. A Reference LTS (RLTS) is an LTS such that i) labels uniquely identify the architectural communication channels, i.e., a channel is denoted by its label and can be used only by a pair of components; ii) for each element $l \in L$, the operators ? and ! are defined with the following meaning: $?l$ [$!l$] identifies an input [output] operation on the channel $l$. Moreover, we use the special label $\tau_{act}$ to identify actions "act" that do not involve communication (internal actions).

Note that condition i) is not restrictive, since it is always possible to uniquely identify a channel between two components. An example of MSC is in Figure 2, while an example of RLTS is in Figure 4. The Servent waits for a query message and then, if it finds the requested file, sends its hits back.

[Figure 4. The Servent RLTS. Transition labels: ?query(q,id), query, [h==""], [h!=""]!hits(h,id).]

Let us now introduce two operations on RLTS that we use in the following.

Relabeling: The relabeling function is applied to the labels of a component that represent message exchanges. Formally, the relabeling function $f$ can be expressed as $f : C \times L \times L \rightarrow C$, where $C$ is a set of components and $L$ is a set of labels. The first set $L$ represents the actual labels of the component, while the second one represents the target labels. The mapping between the elements of these sets is position based, i.e. the $i$-th label of the first set is mapped to the $i$-th label of the second one.

Parallel Composition: Following [30], we can define the parallel composition $C_0 \| C_1$ of the components $C_0$ and $C_1$ by means of operational semantics transition rules. We have four rules: two representing the internal evolution of the components $C_0$ and $C_1$, respectively, and two representing the synchronization between them:

$$\frac{\langle c_0 \rangle \xrightarrow{\tau} \langle c_0' \rangle}{\langle c_0 \| c_1 \rangle \xrightarrow{\tau} \langle c_0' \| c_1 \rangle} \qquad \frac{\langle c_1 \rangle \xrightarrow{\tau} \langle c_1' \rangle}{\langle c_0 \| c_1 \rangle \xrightarrow{\tau} \langle c_0 \| c_1' \rangle}$$

$$\frac{\langle c_0 \rangle \xrightarrow{!a} \langle c_0' \rangle \quad \langle c_1 \rangle \xrightarrow{?a} \langle c_1' \rangle}{\langle c_0 \| c_1 \rangle \xrightarrow{a} \langle c_0' \| c_1' \rangle} \qquad \frac{\langle c_0 \rangle \xrightarrow{?a} \langle c_0' \rangle \quad \langle c_1 \rangle \xrightarrow{!a} \langle c_1' \rangle}{\langle c_0 \| c_1 \rangle \xrightarrow{a} \langle c_0' \| c_1' \rangle}$$

3.2 Compositional Reasoning

The key idea of Compositional Verification is to decompose the system specification into properties that describe the behavior of subsets of the system. In general, checking local properties over subsystems does not imply the correctness of the entire system. The problem is due to the existence of mutual dependencies among components. Components are usually designed with some assumptions about the behavior of their environment. Thus, in order to guarantee that a component satisfies its local properties, it is necessary that its environment satisfies some assumptions. This strategy is called assume-guarantee and was introduced by Amir Pnueli [33]. The notation used by Pnueli is the following:

$$\langle \varphi \rangle\, M\, \langle \psi \rangle$$

The common reading is "if the environment of $M$ satisfies $\varphi$, then $M$ in this environment satisfies $\psi$". Below we show the classical reasoning chain:

$$\frac{M \langle \varphi \rangle \qquad \langle \varphi \rangle\, M'\, \langle \psi \rangle}{M \cdot M' \langle \psi \rangle}$$

Its interpretation is: if $M$ satisfies $\varphi$ and $M'$, in an environment that satisfies $\varphi$, satisfies $\psi$, then $M \cdot M'$ satisfies $\psi$, where "$\cdot$" is a suitable composition operator. The notion of satisfaction mentioned above is the satisfaction of a temporal logic specification ($\models$). In the scope of this work $M$ and $M'$ model the behavior of components, while $\varphi$ and $\psi$ are LTL formulae.

In the Theorem "Architectural Decomposability" we will actually use the relation $\preceq$, also referred to as Simulation [30], to express the ability of one system to have "more behaviors" than another. Simulation is usually used to relate an implementation to a specification [20, 28]. For example, given an implementation $M$ and a specification $M'$, $M \preceq M'$ states that $M'$ can simulate $M$. To use the simulation relation in our context we recall that there is a correspondence between $\models$ and $\preceq$. This is achieved by using the Tableau construction, which maps an LTL formula $\varphi$ to the associated state transition system $T(\varphi)$. More precisely, given an LTL formula $f$, the Tableau $T$ for the path formula $f$ is a Kripke structure [10] that includes all paths that satisfy $f$. Intuitively, one can prove $M \models \varphi$ by checking whether $M \preceq T(\varphi)$. That is, the satisfaction of a formula by $M$ corresponds exactly to $M$ being simulated by the tableau for the formula.

4 Approach to Modular Model Checking

In this section we formally describe our approach to model check middleware-based applications. Our initial assumptions are the following: let $A = A_1, A_2, \ldots, A_n$ be an application composed of $n$ distributed components that satisfies a set of properties Z. Let us assume that the designer decides to implement the interactions through the services offered by a middleware M; the resulting system is called S.
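The system S is obtained by composing components, interfaces and middleware in parallel. As a rough illustration of the four RLTS composition rules recalled in Section 3.1, the following sketch computes the reachable transitions of the parallel composition of two components. The dictionary encoding, the `tau` prefix for internal actions, and the two toy Servent models are assumptions of this example, not artifacts of the paper or of CHARMY.

```python
from collections import deque
from typing import Dict, List, Tuple

# An RLTS is sketched as: state -> list of (label, next_state).
# Labels: "!a" (output on channel a), "?a" (input on channel a),
# or any label starting with "tau" (internal action).
RLTS = Dict[str, List[Tuple[str, str]]]


def compose(c0: RLTS, s0: str, c1: RLTS, s1: str):
    """Return the reachable transitions of c0 || c1 starting from (s0, s1)."""
    transitions = []
    seen = {(s0, s1)}
    queue = deque([(s0, s1)])
    while queue:
        p, q = queue.popleft()
        moves = []
        # Rules 1 and 2: each component evolves alone on internal actions.
        for label, p2 in c0.get(p, []):
            if label.startswith("tau"):
                moves.append((label, (p2, q)))
        for label, q2 in c1.get(q, []):
            if label.startswith("tau"):
                moves.append((label, (p, q2)))
        # Rules 3 and 4: an output !a synchronizes with the matching input ?a.
        for l0, p2 in c0.get(p, []):
            for l1, q2 in c1.get(q, []):
                if (l0[0] in "!?" and l1[0] in "!?"
                        and l0[0] != l1[0] and l0[1:] == l1[1:]):
                    moves.append((l0[1:], (p2, q2)))
        for label, target in moves:
            transitions.append(((p, q), label, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return transitions


# Toy models (hypothetical): Servent 1 asks, Servent 2 searches and answers.
SERVENT1: RLTS = {"s0": [("!query", "s1")], "s1": [("?hits", "s2")]}
SERVENT2: RLTS = {
    "t0": [("?query", "t1")],
    "t1": [("tau_query", "t2")],
    "t2": [("!hits", "t3")],
}

if __name__ == "__main__":
    for src, label, dst in compose(SERVENT1, "s0", SERVENT2, "t0"):
        print(src, label, dst)
```

In this encoding an unmatched input or output never fires on its own, which mirrors the fact that the rules only allow internal moves and matched synchronizations.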
We consider the architecture of S (see Figure 3.a), in which we identify: i) the application A; ii) the middleware core M; iii) a set of interfaces I, one for each component of A, each representing a bridge between A and M. Since M is a well-defined middleware, we know that M satisfies a given set of properties $P = \{p_1, p_2, \ldots, p_l\}$. The following notational conventions hold throughout the rest of the paper.

• By writing $M \models P$ we denote $M \models \psi$ where $\psi = p_1 \wedge p_2 \wedge \cdots \wedge p_l$.

• By writing $M \langle P \rangle$ we denote $M \langle \psi \rangle$ where $\psi = p_1 \wedge p_2 \wedge \cdots \wedge p_l$.

• Let $P^1 = \{p^1_1, \ldots, p^1_n\}$ and $P^2 = \{p^2_1, \ldots, p^2_m\}$ be two sets of LTL properties. By writing $T(P^1) \preceq T(P^2)$ we denote $T(\varphi) \preceq T(\psi)$ where $\varphi = p^1_1 \wedge \cdots \wedge p^1_n$ and $\psi = p^2_1 \wedge \cdots \wedge p^2_m$.

• When the context is not ambiguous, we denote a property expressed with an $MSC_{z_j}$ simply by $z_j$.

Before introducing the architectural decomposition theorem, let us informally introduce the steps the theorem is built on:

1. Let us consider a set of behaviors $Z = \{MSC_{z_1}, \ldots, MSC_{z_o}\}$. Z contains the global properties satisfied by the application A. Given the implementation S, we want to prove that S also satisfies Z.

2. For each $MSC_{z_j} \in Z$ we apply the Algorithm $A_1$ that, given $MSC_{z_j}$ as input, returns a set Q of sets $Q_i$, where $Q_i$ contains the local properties related to each $A_i$ appearing in $MSC_{z_j}$:

$$A_1 : MSC_{z_j} \rightarrow \{Q_1, \ldots, Q_n\}$$

and such that:

$$\forall i\; (A \models z_j) \Rightarrow (A_i \models Q_i)$$

This means that the algorithm $A_1$ decomposes the global property $z \in Z$ into properties that hold locally at each component involved in the interaction described by $MSC_{z_j}$.

3-4. Let us now consider the set $P = \{p_1, p_2, \ldots, p_l\}$ of standard properties of M, which comes with a set of constraints $V = \{v_1, \ldots, v_m\}$. V assesses the correct usage of M by the application A. This means that M satisfies P under the assumptions expressed by V. The set V is needed to ensure the validity of the properties P in M: the constraints specify how M should be used in order to work correctly. In particular, V must be satisfied by the interface components.

5. For each $MSC_{z_j} \in Z$ we apply the Algorithm $A_2$ that, given $MSC_{z_j}$ as input, returns a set H of sets $H_i$; $H_i$ contains the local properties related to the interaction between each $f(A_i) \| I_i$ and M. Here $f$ is a relabeling function (refer to Section 3) that forces $A_i$ to interact with $I_i$ rather than with other components:

$$A_2 : MSC_{z_j} \rightarrow \{H_1, \ldots, H_n\}$$

and such that:

$$\forall i\; ((A \models z_j) \Rightarrow (f(A_i) \| I_i \models H_i))$$

Informally, the sets $H_i$ translate high-level actions performed by A into the implementation-level actions performed by S. Referring to the case study described in the next section, this amounts to establishing that each doquery action performed at the Gnutella application level must be translated into a suitable publish message PUB at the SIENA level.

6. For each $MSC_{z_j} \in Z$ we apply the Algorithm $A_3$ that, given $MSC_{z_j}$ as input, extracts a set $P_d$ of behavioral properties that M must satisfy in order to satisfy $MSC_{z_j}$:

$$A_3 : MSC_{z_j} \rightarrow \{d_1, \ldots, d_t\}$$

and such that:

$$(A \models z_j) \Rightarrow (M \models P_d)$$

$P_d$ represents the set of expected behaviors of the middleware needed in order to satisfy $MSC_{z_j}$.

7. For each $MSC_{z_j} \in Z$ we apply the Algorithm $A_4$ in order to split $MSC_{z_j}$ into a conjunction of sub-properties that must hold locally on the system components. More precisely,

$$A_4 : MSC_{z_j} \rightarrow z^1 \wedge z^2 \wedge \cdots \wedge z^p$$

such that:

$$(A \models z_j) \Rightarrow (z_j \Rightarrow z^1 \wedge z^2 \wedge \cdots \wedge z^p)$$

$$\forall i\, (\exists k\, ((z^i \in Q_k) \vee (z^i \in H_k) \vee (z^i \in V) \vee (z^i \in P_d)))$$

This last step is needed in order to ensure the global consistency of all the previous steps. The interested reader can find more details about the algorithms in [8].

The Theorem Architectural Decomposability formalizes the steps introduced above and demonstrates that, under some assumptions, we can reduce global-property verification to local-property verification.
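The consistency condition of step 7 (every sub-property produced by $A_4$ must belong to some $Q_k$, some $H_k$, V or $P_d$) can be phrased as a simple membership check. The sketch below assumes that properties are compared syntactically as strings and that the sets are already available; the function name `uncovered` and the sample property strings are hypothetical and only illustrate the shape of the check, not the algorithms of [8].

```python
from typing import Dict, Iterable, List, Set


def uncovered(subprops: Iterable[str],
              Q: Dict[str, Set[str]],
              H: Dict[str, Set[str]],
              V: Set[str],
              Pd: Set[str]) -> List[str]:
    """Return the sub-properties that belong to no Q_k, no H_k, V or Pd."""
    local = set(V) | set(Pd)
    for family in (Q, H):
        for props in family.values():
            local |= props
    return [z for z in subprops if z not in local]


# Hypothetical, purely syntactic encoding of a few local properties.
Q = {"Servent2": {"recquery -> <>tau_query"}}
H = {"Servent1": {"doquery -> <>PUB(s_query)"}}
V = {"!SUB(id) U SienaConnect"}
PD = {"PUB(s_query) -> <>NOTIFY(s_query)"}

if __name__ == "__main__":
    z_parts = ["doquery -> <>PUB(s_query)", "PUB(s_query) -> <>NOTIFY(s_query)"]
    print(uncovered(z_parts, Q, H, V, PD))
```

An empty result means that every conjunct is accounted for by one of the local sets; any returned item points to a sub-property that no component, interface or middleware obligation covers.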
Theorem 1: Architectural Decomposability. Let $A = A_1 \ldots A_n$ be an application composed of $n$ components that satisfies a set of properties Z. Let I, M, V, P, $f$ be defined as above. Then, for each $z \in Z$ let:

• $Q = A_1(MSC_z)$
• $H = A_2(MSC_z)$
• $P_d = A_3(MSC_z)$
• $z^1 \wedge z^2 \wedge \cdots \wedge z^p = A_4(MSC_z)$

Under the following hypotheses:

1. $A \langle z \rangle$
2. $A_i \langle Q_i \rangle$, $\forall A_i \in A$
3. $I_i \langle V \rangle$, $\forall i$
4. $\langle V \rangle\, M\, \langle P \rangle$
5. $f(A_i) \| I_i \langle H_i \rangle$ for each component in A
6. $T(P_d) \preceq T(P)$: M contains all behaviors specified by the properties contained in $P_d$.
7. Algorithm $A_4$ decomposes $z$ into the following: $z = z^1 \wedge z^2 \wedge \cdots \wedge z^p$

Then we have $f(A) \| I \| M \langle z \rangle$.

Sketch of the proof: let us suppose that there exists a $z \in Z$ such that $z$ is not satisfied by $f(A) \| I \| M$. By hypothesis 7, $z = z^1 \wedge z^2 \wedge \cdots \wedge z^p$. Thus, each $z^i$ is a property held by a system subpart and matches one of these cases:

• $z^i$ is a property of a component $f(A_j)$: each property of a component $f(A_j)$ is described by the set $f(Q_j)$. From hypothesis 2 we obtain $f(A_j) \langle f(Q_j) \rangle$, and then the negation of a property in $f(Q_j)$ leads to a contradiction.

• $z^i$ is an interaction between a pair $f(A_j)$ and $I_j$: the interactions between each pair $A_j$ and $I_j$ are formalized by the properties $H_j$. The negation of an $H_j$ contradicts hypothesis 5.

• $z^i$ is a property of an interface $I_j$: each property of $I_j$ is described by the set V. Negating such a property contradicts hypothesis 3.

• $z^i$ is a property that describes the interaction between a generic interface $I_j$ and M. The negation of one of them falsifies hypothesis 4.

• $z^i$ is a desired property of the middleware; then it is contained in $P_d$. By Definition 1 the negation of $z^i$ implies the negation of hypothesis 6.

Then we can conclude that the thesis holds and, iterating the reasoning for each $z \in Z$, we can conclude

$$f(A) \| I \| M \langle Z \rangle$$

5 Implementing The Gnutella Protocol on top of a Publish/Subscribe Middleware

Gnutella [19] is one of the most famous peer-to-peer applications, providing file discovery and sharing across a network. Previous work by D. Heimbigner [21] illustrates how to exploit the capabilities of publish/subscribe middleware in order to provide Gnutella-like services. In this section we use this example to apply our approach, pointing out how an implementation of the Gnutella protocol based on the SIENA Publish/Subscribe Middleware [6, 7] respects the specification. The analysis we have conducted starts from the Gnutella Protocol specification, takes into account the desired scenarios (representing desired behaviors), designs the system by using the SIENA middleware and, finally, applies our approach to verify the desired scenarios on the final system. Due to space limits we only give a sketch of the case study; for a more comprehensive presentation please refer to [8].

The Gnutella architecture is composed of a number of nodes connected with each other. Each node, called a Servent, acts as client, server and router. As shown in Figure 5.a, we consider an instance of the system composed of three Servent components. Figure 5.b depicts the refined SA of the considered application:

1. Three Servents that can send and receive "queries" and "hits".

2. Three Servent Interfaces that connect the Servents with the middleware by suitably translating each message $m$ into a publication $PUB(m)$ or a subscription $SUB(m)$.

3. The communication between components is achieved by using the SIENA Publish/Subscribe middleware.

5.1 Applying the approach

In this subsection we apply our compositional methodology to the SIENA-Gnutella system. After the definition of
the SIENA properties, P, and of the Gnutella properties, Z, we check whether the system satisfies the hypotheses of Theorem 1.

SIENA properties: in the following we report some SIENA properties [4]. Obviously this middleware also satisfies other properties; here we have selected only those useful to understand the case study. In order to simplify the reading of the LTL formulae we introduce the $\otimes$ operator, defined as follows.

Definition: Xor Operator. Let $a$, $b$ be two LTL formulae. The $\otimes$ operator returns true if one, and only one, of the formulae evaluates to true. It is defined starting from the $\vee$ and $\wedge$ basic operators:

$$a \otimes b \equiv (a \vee b) \wedge \neg(a \wedge b)$$

1. When a client $Servent_0$ subscribes a filter ($SUB_{Servent_0}$) and a client $Servent_1$ publishes an event ($PUB_{Servent_1}$), then $Servent_0$ receives an event notification, identified by the $NOT_{Servent_0}$ message, only if $Servent_0$ has not performed an unsubscription ($UNS_{Servent_0}$).

$$((\neg PUB_{Servent_1}\ \mathcal{U}\ SUB_{Servent_0}) \Rightarrow \Diamond(UNS_{Servent_0} \otimes NOT_{Servent_0}))$$

2. The events received by a client $Servent_0$ and generated by the same source, e.g. $Servent_1$, maintain the publication order.

$$((\neg PUB_{Servent_1}(e_2)\ \mathcal{U}\ PUB_{Servent_1}(e_1)) \Rightarrow \Diamond((\neg NOT_{Servent_0}(e_2)\ \mathcal{U}\ NOT_{Servent_0}(e_1))))$$

where $PUB_{Servent_1}(e)$ means that $Servent_1$ published $e$ and, analogously, $NOT_{Servent_0}(e)$ indicates that $Servent_0$ receives the event $e$. Moreover, there exists a set V of constraints, as defined earlier in Section 4, that expresses how to correctly use the middleware. Due to space limits we present them in their instantiated form when proving Hypothesis 3 of Theorem 1 below.

[Figure 5. a) SA of Gnutella (a simple Gnutella network of Servent 1, Servent 2 and Servent 3), b) Gnutella Servents on top of SIENA (Servent 1, Servent 2 and Servent 3, each with its Servent Interface, connected to the Siena Core in the Pub/Sub infrastructure).]

Properties Z: due to space limits we describe the approach focusing on one property only. Figure 6 depicts the MSC for the selected property applied to the SIENA-Gnutella system; the MSC for the same property applied to a Gnutella system is depicted in Figure 2. These MSCs mean: if $Servent_1$ makes a query $q$ (the $doquery(q, id_1)$ message in Figure 6 and the $query(q, id_1)$ message in Figure 2) and $Servent_2$ contains the selected file $h_2$, then $Servent_1$ must receive a suitable hits (the $rechits(h_2, id_2)$ message in Figure 6 and the $hits(h_2, id_2)$ message in Figure 2).

[Figure 6. MSC for the property of interest. Lifelines: Servent1, Servent1 Interface, Siena Core, Servent2 Interface, Servent2. Messages and internal actions, in order: doquery(q,id1), QueryToPub, pub(s_query(q,id1)), core, notify(s_query(q,id1)), NotToQuery, recquery(q,id1), [h2!=""]dohits(h2,id2), HitsToPub, pub(s_hits(h2,id2),id1), core, notify(s_hits(h2,id2)), NotToHits, rechits(h2,id2).]

Theorem 1 hypotheses

Hypothesis 1: following the CHARMY approach, we have proved that the Gnutella system (without the SIENA middleware) satisfies Z.
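Hypothesis 1 is a model checking obligation over all paths of the model and is discharged with Spin through CHARMY. Purely to illustrate what the selected property Z demands of a single execution (every query is eventually answered by a hits), the sketch below evaluates that response pattern over one finite trace of events. The finite-trace reading, the event names, and the function `responds` are simplifying assumptions of this example and do not replace the exhaustive check.

```python
from typing import Sequence


def responds(trace: Sequence[str], trigger: str, response: str) -> bool:
    """True if every occurrence of `trigger` is followed (at the same or a
    later position) by an occurrence of `response` within the finite trace."""
    pending = False
    for event in trace:
        if event == trigger:
            pending = True   # an obligation is opened
        if event == response:
            pending = False  # all open obligations are discharged
    return not pending


if __name__ == "__main__":
    # Hypothetical run of the SIENA-Gnutella system, following Figure 6.
    run = ["doquery", "pub_query", "notify_query", "recquery",
           "dohits", "pub_hits", "notify_hits", "rechits"]
    print(responds(run, "doquery", "rechits"))
```

A trace that ends with an unanswered query is rejected, which is the finite-trace analogue of a counterexample that a model checker would report for the response property.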
query
Hypothesis 2: applying the algorithm A1 the obtained set Q = {Q1 , Q2 } is:
3. (N OT (s query(q, id)) ⇒ ♦recquery(q, id)). A query sent to the Servent is eventually followed by a query notification.
Q1 = {doquery(q, id1 ) ⇒ ♦rechits(h2 , id2 )}
4. (N OT (s hits(h, id)) ⇒ ♦rechits(h, id)). A query notification is eventually followed by a hits sent to the Servent.
and Q2 = {(recquery(q, id1 ) ⇒ ♦τquery ), ((τquery ∧ (h2 ! = “”)) ⇒ ♦dohits(h2 , id2 ))}
These properties are proved valid on the models f (Ai )||Ii by using C HARMY.
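Hypothesis 3: the constraints V are instantiated as follows:

1. $(\neg \mathit{SUB}_i(\mathit{id})) \;\mathcal{U}\; \mathit{SienaConnect}$. A servent cannot perform the id subscription before a SIENA connection.
2. $\Diamond\, \mathit{SUB}_i(\mathit{id})$. Each servent must eventually subscribe its id.
3. $(\neg \mathit{SUB}_i(\mathit{file})) \;\mathcal{U}\; \mathit{SUB}_i(\mathit{id})$. A servent cannot perform a file subscription before an id subscription.
4. $(\neg \mathit{NOT}_i(\mathit{s\_hits}(h, \mathit{id}), \mathit{dest})) \;\mathcal{U}\; \mathit{SUB}_i(\mathit{id})$. A servent is not able to receive an event notification before the id subscription.
5. $(\neg \mathit{PUB}_i(\mathit{s\_query}(q, \mathit{id}))) \;\mathcal{U}\; \mathit{SUB}_i(\mathit{id})$. A servent can publish a query only after the id subscription.
6. $(\neg \mathit{NOT}_i(\mathit{s\_query}(q, \mathit{id}))) \;\mathcal{W}\; \mathit{SUB}_i(\mathit{file})$. A servent is able to receive a query notification only after a file subscription.
7. $(\neg \mathit{UNSUB}_i(\mathit{file})) \;\mathcal{W}\; \mathit{SUB}_i(\mathit{file})$. A servent can delete a file subscription only if the file has previously been subscribed.

The check of satisfiability, performed using CHARMY, between each interface model and this set of properties returns a valid result.

Hypothesis 4: SIENA satisfies the properties P. The check is performed using CHARMY; interested readers can refer to [4].

Hypothesis 5: the algorithm A2 extracts, for each component, the interaction properties H between A and M:

1. $(\mathit{doquery}(q, \mathit{id}) \Rightarrow \Diamond\, \mathit{PUB}(\mathit{s\_query}(q, \mathit{id})))$. After a query operation, the query is eventually published.
2. $(\mathit{dohits}(h, \mathit{id}) \Rightarrow \Diamond\, \mathit{PUB}(\mathit{s\_hits}(h, \mathit{id}), \mathit{dest}))$. A hits operation is eventually followed by its publication.

No servent can receive two equal hits.

Using CHARMY we have proved that Servent1 satisfies the set of properties Q1 and that Servent2 satisfies the set of properties Q2.

Hypothesis 6: starting from the MSC in Figure 6, the algorithm A3 extracts the set Pd of desired SIENA properties:

\[
P_d = \{\, (\mathit{PUB}(\mathit{s\_query}(q, \mathit{id}_1)) \Rightarrow \Diamond\, \mathit{NOTIFY}(\mathit{s\_query}(q, \mathit{id}_1))),\;
(\mathit{PUB}(\mathit{s\_hits}(h_2, \mathit{id}_2), \mathit{id}_1) \Rightarrow \Diamond\, \mathit{NOTIFY}(\mathit{s\_hits}(h_2, \mathit{id}_2))) \,\}
\]

Unfortunately, Hypothesis 6 is not verified, because Pd contains behaviors not included in P. To solve this problem, SIENA must also satisfy the following additional properties:

\[
(\neg \mathit{PUB}_{\mathit{Servent1}}(\mathit{s\_query}(\mathit{file}, \mathit{id}_1))) \;\mathcal{U}\; \mathit{SUB}_{\mathit{Servent2}}(\mathit{file})
\]

and

\[
(\neg \mathit{PUB}_{\mathit{Servent1}}(\mathit{s\_hits}(\mathit{file}, \mathit{id}_1))) \;\mathcal{U}\; \mathit{SUB}_{\mathit{Servent2}}(\mathit{file})
\]

These properties state that the system requires a startup phase in which all components are initialized by subscribing for the events of interest; a consumer must be subscribed in order to receive such events. This shows that the verification induced by applying the theorem identifies a design flaw: the design did not take the initialization phase of the system into account.

Hypothesis 7: we apply the algorithm A4 to the MSC in Figure 6. The algorithm splits the global property into a conjunction of sub-properties. The result is the following:

\[
\begin{aligned}
& (\mathit{doquery}(q, \mathit{id}_1) \Rightarrow \Diamond\, \tau_{\mathit{QueryToPub}}) \\
\wedge\; & (\tau_{\mathit{QueryToPub}} \Rightarrow \Diamond\, \mathit{PUB}(\mathit{s\_query}(q, \mathit{id}_1))) \\
\wedge\; & (\mathit{NOTIFY}(\mathit{s\_hits}(h_2, \mathit{id}_2)) \Rightarrow \Diamond\, \tau_{\mathit{NotToHits}}) \\
\wedge\; & (\tau_{\mathit{NotToHits}} \Rightarrow \Diamond\, \mathit{rechits}(h_2, \mathit{id}_2)) \\
\wedge\; & (\mathit{PUB}(\mathit{s\_query}(q, \mathit{id}_1)) \Rightarrow \Diamond\, \tau_{\mathit{core}}) \\
\wedge\; & (\tau_{\mathit{core}} \Rightarrow \Diamond\, \mathit{NOTIFY}(\mathit{s\_query}(q, \mathit{id}_1))) \\
\wedge\; & (\mathit{PUB}(\mathit{s\_hits}(h_2, \mathit{id}_2), \mathit{id}_1) \Rightarrow \Diamond\, \tau_{\mathit{core}}) \\
\wedge\; & (\tau_{\mathit{core}} \Rightarrow \Diamond\, \mathit{NOTIFY}(\mathit{s\_hits}(h_2, \mathit{id}_2))) \\
\wedge\; & (\mathit{NOTIFY}(\mathit{s\_query}(q, \mathit{id}_1)) \Rightarrow \Diamond\, \tau_{\mathit{NotToQuery}}) \\
\wedge\; & (\tau_{\mathit{NotToQuery}} \Rightarrow \Diamond\, \mathit{recquery}(q, \mathit{id}_1)) \\
\wedge\; & (\mathit{dohits}(h_2, \mathit{id}_2) \Rightarrow \Diamond\, \tau_{\mathit{HitsToPub}}) \\
\wedge\; & (\tau_{\mathit{HitsToPub}} \Rightarrow \Diamond\, \mathit{PUB}(\mathit{s\_hits}(h_2, \mathit{id}_2), \mathit{id}_1)) \\
\wedge\; & (\mathit{recquery}(q, \mathit{id}_1) \Rightarrow \Diamond\, \tau_{\mathit{query}}) \\
\wedge\; & ((\tau_{\mathit{query}} \wedge (h_2 \neq \texttt{""})) \Rightarrow \Diamond\, \mathit{dohits}(h_2, \mathit{id}_2))
\end{aligned}
\]

Proceedings of the 26th International Conference on Software Engineering (ICSE’04) 0270-5257/04 $20.00 © 2004 IEEE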
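As an informal illustration, separate from the CHARMY-based checks used in this paper, the two shapes of property that appear in this section can be evaluated on a single finite event trace. The Python sketch below assumes a finite-trace reading of the operators, represents events as plain strings, and covers only the query path from Servent1 to Servent2; the function names (`holds_response`, `holds_precedence`, `check_chain`) and the event labels are hypothetical and are not part of CHARMY or SIENA.

```python
from typing import List, Sequence, Tuple

Trace = Sequence[str]


def holds_response(trace: Trace, trigger: str, response: str) -> bool:
    """Finite-trace reading of (trigger => <> response).

    True when every occurrence of `trigger` is followed later in the
    trace by an occurrence of `response`.
    """
    pending = False
    for event in trace:
        if event == trigger:
            pending = True
        elif event == response:
            pending = False
    return not pending


def holds_precedence(trace: Trace, blocked: str, enabler: str,
                     strong: bool = False) -> bool:
    """Finite-trace reading of (not blocked) W enabler, or U if strong.

    `blocked` must not occur before the first `enabler`. With
    strong=True the enabler is also required to occur (until);
    with strong=False it may never occur (weak until).
    """
    for event in trace:
        if event == enabler:
            return True
        if event == blocked:
            return False
    return not strong


# Sub-properties for the query path only (an assumption made to keep the
# example small; the hits path is analogous).
QUERY_CHAIN: List[Tuple[str, str]] = [
    ("doquery", "tau_QueryToPub"),
    ("tau_QueryToPub", "PUB(s_query)"),
    ("PUB(s_query)", "tau_core"),
    ("tau_core", "NOTIFY(s_query)"),
    ("NOTIFY(s_query)", "tau_NotToQuery"),
    ("tau_NotToQuery", "recquery"),
]


def check_chain(trace: Trace, chain: List[Tuple[str, str]]) -> bool:
    """Check the conjunction of all response sub-properties in `chain`."""
    return all(holds_response(trace, a, b) for a, b in chain)


# Servent2 subscribes before Servent1 publishes: the startup phase is respected.
good_trace = [
    "SUB_Servent2(file)", "doquery", "tau_QueryToPub", "PUB(s_query)",
    "tau_core", "NOTIFY(s_query)", "tau_NotToQuery", "recquery",
]

# Servent1 publishes before Servent2 has subscribed: the query is never notified.
early_publish = [
    "doquery", "tau_QueryToPub", "PUB(s_query)", "tau_core",
    "SUB_Servent2(file)",
]
```

On `good_trace` every local sub-property holds, and so does the end-to-end property `doquery => <> recquery`; on `early_publish` the precedence property fails and the chain breaks at the notification step, which mirrors the missing initialization phase discussed above. A single trace only illustrates the properties; it is no substitute for model checking all behaviors.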
Therefore, since all hypotheses have been verified, we can state that the Gnutella system on top of SIENA still satisfies the specifications.

6 Related Work

We can identify in the literature two categories of work that are closely related to our research. The first concerns the verification of middleware-based software architectures. The second encompasses the work done in the compositional reasoning research area.

Focusing on the first category, Garlan et al. [17] point out how hard event-based software is to reason about and test. That paper proposes a generic, parametric publish/subscribe model-checking framework. The framework decomposes the problem into two parts: (1) a reusable model that captures run-time event management and dispatch, and (2) components that are specific to the application being modeled. This line of work has been extended in [26] to allow the modeling of larger and more realistic systems. However, in this approach the verification of system models still requires a large quantity of resources (time and space). In both cases, in fact, the approach relies on the validation of the whole composite system, that is, the application and the middleware are both modeled and verified. We share with these approaches the problem domain, namely verifying middleware-based architectures, but our focus is on compositional verification.

Considering the second category, many works have recently been proposed on compositional and, in particular, assume-guarantee reasoning. Focusing only on the most recent ones, we can refer to [18, 11] and to [13]. In the first group of papers the authors present a novel framework for performing assume-guarantee reasoning in an incremental and fully automatic fashion. The second approach focuses on source code verification. In both cases the aim is to provide some kind of automation. This is the key issue in assume-guarantee reasoning, because its practical impact has been limited so far by the non-trivial human input required. In our approach we propose to factorize the verification of well-known parts of the system according to a decomposition driven by the software architecture structure. With the approaches mentioned above we share the need to automatically produce the assumptions, at the possible expense of reduced generality. However, our approach aims at providing a compositional verification framework that reflects the designers' choices and assumptions made at the architectural level.

7 Conclusion and future work

In this paper we presented an approach to compositionally model check middleware-based software architecture specifications. The proposed methodology makes use of standard artifacts currently in use in industrial settings, which allows an easy integration of our methodology into current industrial development processes. It is worth noting in this respect that the software architecture phase is always present in modern large software development and that the diffusion of UML has standardized the use of state diagrams and sequence diagrams as SA description formalisms.

We have applied our approach to a real system specification that communicates through a Publish/Subscribe middleware. In our experience, model checking Publish/Subscribe applications is particularly expensive in terms of both time and space. From this point of view, the main contribution of our work is in proposing a way to solve the state explosion problem at the level of software system architectural verification.

Our approach relies on the Architectural Decomposition Theorem. In order to make the theorem more easily applicable, we have provided a set of algorithms that automatically decompose the global property Z into sets of properties that must hold locally at the level of the system components. This degree of automation has been obtained at the price of requiring that Z can be expressed using Message Sequence Charts, a limitation that allows only a subset of LTL formulas to be considered.

Future work will investigate how to extend the algorithms for the automatic generation of local properties to a larger class of LTL global properties. Moreover, we intend to investigate how to embed the middleware properties and constraints into the Model Driven Architecture transformations. This will allow high-level SA representations to be translated correctly into low-level ones, simplifying the verification of the theorem's hypotheses.
Acknowledgments

The authors would like to acknowledge the Italian national project C.N.R SP4, which partly supported this work.
References

[1] L. Bass, P. Clements, and R. Kazman. Software Architecture in Practice. Addison-Wesley, Massachusetts, 1998.
[2] M. Bernardo and P. Inverardi, editors. Formal Methods for Software Architectures, volume 2804 of Lecture Notes in Computer Science, Bertinoro, Italy, Sept. 2003. Springer-Verlag.
[3] R. E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, C-35(8):677–691.
[4] M. Caporuscio, P. Inverardi, and P. Pelliccione. Formal analysis of architectural patterns. In Proceedings of the First European Workshop on Software Architecture (EWSA 2004), St. Andrews, Scotland, UK, May 2004.
[5] A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Achieving Scalability and Expressiveness in an Internet-Scale Event Notification Service. In Proceedings of the Nineteenth Annual ACM Symposium on Principles of Distributed Computing, pages 219–227, Portland, OR, July 2000.
[6] A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Design and Evaluation of a Wide-Area Event Notification Service. ACM Transactions on Computer Systems, 19(3):332–383, Aug. 2001.
[7] A. Carzaniga and A. L. Wolf. Content-based networking: A new communication infrastructure. In NSF Workshop on an Infrastructure for Mobile and Wireless Systems, number 2538 in Lecture Notes in Computer Science, pages 59–68, Scottsdale, Arizona, Oct. 2001. Springer-Verlag.
[8] D. Cecchini. Compositional Reasoning for Software Architecture: Gnutella-Siena. Technical report (in Italian), http://www.di.univaq.it/caporusc/pubblications, 2003.
[9] E. M. Clarke, O. Grumberg, and D. E. Long. Model Checking and Abstraction. ACM Transactions on Programming Languages and Systems, 16:1512–1542.
[10] E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. The MIT Press, 2001.
[11] J. M. Cobleigh, D. Giannakopoulou, and C. Pasareanu. Learning Assumptions for Compositional Verification. In Ninth International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2003), number 2619 in LNCS, Warsaw, Poland, April 2003. Springer.
[12] D. Compare, P. Inverardi, P. Pelliccione, and A. Sebastiani. Integrating model-checking architectural analysis and validation in a real software life-cycle. In the 12th International Formal Methods Europe Symposium (FME 2003), number 2805 in LNCS. Springer, September 2003.
[13] J. Dingel. Computer-Assisted Assume/Guarantee Reasoning with VeriSoft. In 25th International Conference on Software Engineering (ICSE 2003), Portland, Oregon, May 2003.
[14] E. A. Emerson and A. P. Sistla. Symmetry and model checking. In Courcoubetis, number 83, pages 463–478.
[15] W. Emmerich. Software engineering and middleware: a roadmap. In Proceedings of the Conference on The Future of Software Engineering (ICSE 2000) - Future of SE Track, pages 117–129, Limerick, Ireland, 2000. ACM Press.
[16] N. Francez. The Analysis of Cyclic Programs. PhD thesis, The Weizmann Institute of Science, 1976.
[17] D. Garlan, S. Khersonsky, and J. S. Kim. Model Checking Publish/Subscribe Systems. In Proceedings of the 10th International SPIN Workshop on Model Checking of Software (SPIN 03), Portland, Oregon, May 2003.
[18] D. Giannakopoulou, C. S. Pasareanu, and H. Barringer. Assumption generation for software component verification. In Proc. 17th IEEE Int. Conf. Automated Software Engineering 2002, September 2002.
[19] Gnutella Home Web Page. http://www.gnutella.com.
[20] O. Grumberg and D. E. Long. Model Checking and Modular Verification. ACM Transactions on Programming Languages and Systems, 16:846–872.
[21] D. Heimbigner. Adapting Publish/Subscribe Middleware to Achieve Gnutella-like Functionality. In 2001 ACM Symposium on Applied Computing (SAC 2001): Special Track on Coordination Models, Languages and Applications, Las Vegas, NV, March 2001.
[22] G. J. Holzmann. The logic of bugs. In FSE 2002, Foundations of Software Engineering, Charleston, SC, USA, 2002.
[23] G. J. Holzmann. The SPIN Model Checker: Primer and Reference Manual. Addison Wesley, 2003.
[24] P. Inverardi, H. Muccini, and P. Pelliccione. Charmy: A framework for model based consistency checking. Technical report, Department of Computer Science, University of L’Aquila, Jan. 2004.
[25] ITU-T Recommendation Z.120. Message Sequence Charts. ITU Telecommunication Standardisation Sector.
[26] J. Bradbury and J. Dingel. Evaluating and Improving the Automatic Analysis of Implicit Invocation Systems. In European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2003), Helsinki, Finland, September 2003. ACM Press.
[27] S. Katz and D. Peled. An efficient verification method for parallel and distributed programs. In Workshop on Linear Time, Branching Time and Partial Order Logics and Models for Concurrency, volume 354 of LNCS, pages 489–507. Springer, 1988.
[28] D. Long. Model Checking, Abstraction and Compositional Reasoning. PhD thesis, Carnegie Mellon University, 1993.
[29] Z. Manna and A. Pnueli. The temporal logic of reactive and concurrent systems. Springer-Verlag New York, Inc., 1992.
[30] R. Milner. An algebraic definition of simulation between programs. In The Second International Joint Conference on Artificial Intelligence, Sept. 1971.
[31] Object Management Group (OMG). Unified Modeling Language (UML) Version 1.5. http://www.omg.org/uml/, March 2003.
[32] D. E. Perry and A. L. Wolf. Foundations for the study of software architecture. In SIGSOFT Software Engineering Notes, volume 17, pages 40–52, Oct. 1992.
[33] A. Pnueli. In transition from global to modular temporal reasoning about programs. Logics and Models of Concurrent Systems, sub-series F: Computer and System Science:123–144, 1985. Springer-Verlag.
[34] S. Uchitel, J. Magee, and J. Kramer. From sequence diagrams to behaviour models. In WTUML: Workshop on Transformations in UML. Satellite event of the European Joint Conferences on Theory and Practice of Software (ETAPS01), Genova, Italy, 2001.