Starting from Message Sequence Chart for Software Architecture Early Performance Analysis
Antinisca Di Marco, Paola Inverardi
Dipartimento di Informatica, Università di L'Aquila
Via Vetoio 1, Coppito, Italy
adimarco,
[email protected]
Abstract
Software development requires designers to make decisions about functional and non-functional aspects of the system under construction from the early stages of the software life-cycle. There is still little automation to support non-functional decisions along the software life-cycle. In this work we focus on performance issues and consider the integration of early performance analysis into the software life-cycle. Building on our experience in software architecture performance analysis, we discuss the features required of the Message Sequence Charts notation to allow the automated generation of a performance model based on Queueing Networks.
1 Introduction
During the software development process, designers have to make decisions about the design of the system under construction from the early stages of the process. These decisions, mostly based on experience, may affect either functional or non-functional aspects of the software system, and the success or failure of the project may depend on them. Software specifications help designers to reason about the behavior of the software system and to produce a system that meets its functional requirements. As far as non-functional requirements are concerned, these models, as they are, are not sufficient: non-functional analysis and validation typically require additional information that software artifacts do not usually provide. Among non-functional aspects, we focus here on performance. In the last decade, research efforts have concentrated on integrating quantitative validation into the software development process in order to meet performance requirements. Several approaches have been proposed [4] that introduce methodologies for deriving performance models from software artifacts (more or less detailed) specified at different stages of the life-cycle.
Software performance is the process of predicting (at early phases of the life-cycle) and evaluating (at the end) whether the software system satisfies the user performance goals. From the software point of view, the process is based on the availability of software artifacts that describe suitable abstractions of the final software system. Requirements, software architectures, specification and design documents are examples of these artifacts. Since performance is a run-time attribute, performance analysis requires suitable descriptions of the software run-time behavior, which we refer to as dynamics.
In this context, we have defined a methodology [2] that, starting from a Software Architecture (SA) description given by means of Message Sequence Charts (MSC) [8], automatically derives a performance evaluation model based on a Queueing Network (QN) model [7]. From the QN evaluation, designers can gain insight into the quality of their decisions with respect to performance. Our methodology starts from a set of MSC that contains enough information to allow the generation of the QN model. Based on our approach, in this work we discuss the information content, the extensions and the constraints on the MSC notation that our methodology requires. From a broader point of view, we want to start discussing the adequacy of the MSC notation as a notation for describing dynamics for the early validation of performance requirements.
It is worth noting that in this work we focus on the (automatic) generation of the topology of a QN model starting from an architectural specification. We do not deal with QN parameterization, which requires knowledge of domain-specific information such as the system operational profile and the resource workload.
To this end, many efforts have been made to extend scenario notations to allow the specification of performance data. Recently a UML profile [6] has been defined to allow the embedding of performance-related data in UML diagrams.
Figure 1. Software Performance Analysis Process and MSC2QN methodology.
The paper is structured as follows: in the next section we outline the methodology, showing its macro steps through a simple case study. Section 3 discusses the information the MSC must provide, the constraints the MSC have to respect and the extensions of the MSC notation that the methodology requires. The last section presents conclusions.
2 From MSC to Queueing Network Methodology
In this section we summarize our approach for the automatic generation of a performance model. It starts from an SA specification described by MSC and generates a QN model. This approach is based on previous work [3] that builds a QN model starting from a finite state model (FSM) of the global system behavior. This first technique turned out to be inefficient because of its computational complexity and the possible FSM state space explosion when dealing with real projects. To overcome these drawbacks and to fully integrate software performance analysis into common-practice software life-cycles, we re-engineered the methodology, which now starts from an MSC-based SA description [2]. Our approach uses concepts and reasoning on MSC introduced in [1].
The left-hand side of Figure 1 sketches the software performance analysis process we use to validate an SA with respect to performance requirements. In the figure, ovals represent activities of the process whereas rectangles indicate the output of each process step. The software performance analysis process takes as input the MSC-based description of an SA, goes through our methodology (the MSC2QN methodology) and through the model evaluation, and finally provides feedback at the architectural level from the interpretation of the results. For the purposes of this paper we discuss the MSC2QN step.
Very briefly, we recall that the topology of a QN model is represented by a set of service centers, which are independent entities, suitably connected. The translation algorithm applied in our approach aims at deriving a QN model as close as possible to the SA model. The generated QN model represents the real level of concurrency of the system. In fact, the sequential behavior of software components that are strongly synchronized is represented by means of a single service center (see Figure 10 (b)). Independent components are instead associated with individual service centers. Service centers are then connected in the QN model depending on how the SA components interact with each other.
At the right-hand side of Figure 1 we outline the macro steps of the MSC2QN methodology, which are:
1. the Trace Analysis phase, which starts by encoding the MSC by means of regular expressions. These regular expressions are analyzed in pairs to find their common prefix and to identify interaction pairs that can give information on the real concurrency between SA components. These interaction pairs form the Interaction Sets.
2. the QN Generation step, which, by analyzing the Interaction Sets, identifies the components or sets of components that can be modelled as service centers of the final QN and identifies the interconnections among the service centers, i.e. the topology of the QN.
To illustrate our methodology we use a simple case study, a simplified version of the electronic commerce system in [5]. We suppose that there is a supplier that publishes his catalogue on the web. The supplier accepts customer orders, delivers the ordered items and maintains all the relevant data. He needs to maintain information on customers, on the catalogue and on the orders placed by his customers. The catalogue can be browsed by anybody, but only registered customers may buy items, by logging into the system and placing an order. Each registered customer has a cart where he can insert or delete items. The customer can place an order only if the cart is not empty. The system also allows the customer to monitor the order status and to confirm the delivery in order to permit the payment. In this paper, we consider a simplified version of this system. We focus on the customer
Figure 2. SA components of the Electronic Commerce System.
view by considering only the following functionalities: Login, BrowseCatalogue, BrowseCart, InsertItemCart, DeleteItemCart, BrowseOrderStatus, ConfirmDelivery, and Logout. In Figure 2 we show the Software Architecture of the system, where we identify some databases (Customer DB, Cart DB, Order DB, etc.) and four components (CustomerProcess, SupplierProcess, etc.). Each customer is associated with an individual CustomerProcess, and for each involved database there is a server through which components communicate with it. The interactions with these servers are asynchronous. All components can read data from a DB, whereas to write or update data on a DB a component needs the appropriate rights. For example, the CustomerProcess has full rights on the Customer DB; it can only read data from the Order DB, and to update this data it has to interact with the DeliveryOrderProcess component.
From Figure 3 to Figure 9 we report the scenarios we have considered to derive the QN model (Figure 11) of the identified subsystem. The scenarios are expressed by means of the Message Sequence Chart notation, extended as we discuss in Section 3. MSC provide information on the state of the involved components (i.e. the hexagons), on the subsystem input and output flows (i.e. the MSC gates), and on the types of the interactions (i.e. the arrow heads). From the considered MSC, the methodology generates the regular expressions reported in Table 1. The regular expressions are sequences of labels of the form S(C1, C2)c, each indicating an interaction between components C1 and C2. The type of the interaction is indicated by c ∈ {a, s}, where a and s mean asynchronous and synchronous communication respectively. In the regular expressions corresponding to the
Figure 3. MSC1: Login Scenario.
Figure 4. MSC3: Browse Catalog Scenario.
scenarios, CP denotes CustomerProcess and OS denotes OrderServer, the server for the Order DB. By analyzing the regular expressions, the methodology collects information on the sequences and types of communications among components, on the concurrency among components and on non-deterministic behaviors. During this step two types of Interaction Sets are built: Basic Interaction Sets (BIS) and Structured Interaction Sets (SIS). A BIS is built for each label in the regular expressions. It indicates the type of the communication and the SA components involved in it. Hence, the BIS corresponding to S(C1, C2)a is {(C1, C2)a}. Each SIS is instead composed of two interactions that appear in two different MSC and give information on the level of parallelism of the involved components in terms of concurrent behavior and/or non-deterministic behavior. These interactions determine the point where two
MSC           Trace/Regular Expression
MSC1          S(ENV, CI)s S(CI, ENV)s S(ENV, CI)s S(CI, CP)s S(CP, CustS)a S(CustS, CP)a S(CP, CI)s S(CI, ENV)s
MSC3          S(ENV, CI)s S(CI, CP)s S(CP, CS)a S(CS, CP)a S(CP, CI)s S(CI, ENV)s
MSC4, MSC6    S(ENV, CI)s S(CI, CP)s S(CP, CartS)a S(CartS, CP)a S(CP, CI)s S(CI, ENV)s
MSC5          S(ENV, CI)s S(CI, CP)s {S(CP, CartS)a S(CartS, CP)a}2 S(CP, CI)s S(CI, ENV)s
MSC12         S(ENV, CI)s S(CI, CP)s S(CP, DOP)s S(DOP, OS)a S(DOP, CP)s S(CP, CI)s S(CI, ENV)s
MSC13         S(ENV, CI)s S(CI, ENV)s S(ENV, CI)s S(CI, CP)s S(CP, CI)s S(CI, ENV)s
Table 1. Traces of considered MSC
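The labels of Table 1 are regular enough to be handled mechanically. The following is a minimal sketch, not the authors' tool, of how a flat trace might be parsed and how one Basic Interaction Set per label could be built. The names `Interaction`, `parse_trace` and `basic_interaction_sets` are hypothetical, and the flat `S(C1, C2)c` spelling is assumed to be the input format.

```python
import re
from typing import NamedTuple


class Interaction(NamedTuple):
    sender: str
    receiver: str
    kind: str  # "a" = asynchronous, "s" = synchronous


# One label looks like "S(CP, CustS)a"; whitespace inside is tolerated.
LABEL = re.compile(r"S\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*([as])")


def parse_trace(trace: str) -> list[Interaction]:
    """Turn a flat trace string into an ordered list of interactions."""
    return [Interaction(*m.groups()) for m in LABEL.finditer(trace)]


def basic_interaction_sets(trace: list[Interaction]) -> list[frozenset]:
    """One BIS per label: a singleton set holding that interaction."""
    return [frozenset({i}) for i in trace]


# Trace of MSC3 (Browse Catalog) as listed in Table 1.
msc3 = parse_trace(
    "S(ENV, CI)s S(CI, CP)s S(CP, CS)a S(CS, CP)a S(CP, CI)s S(CI, ENV)s"
)
msc3_bis = basic_interaction_sets(msc3)
```

Here `msc3[2]` is the asynchronous request from CP to the catalog server, and `msc3_bis[2]` is the corresponding singleton set {(CP, CS)a}.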
Figure 5. MSC4: Browse Cart Scenario.
Figure 6. MSC5: Delete Item from Cart Scenario.
Figure 7. MSC6: Insert Item Scenario.
Figure 8. MSC12: Confirm Delivery Scenario.
Figure 9. MSC13: Logout Scenario.
Figure 10. Example of Translation patterns. (a) Asynchronous Interaction. (b) Synchronous Interaction. (c) One-to-two communication.
execution traces (MSC temporal sequences) start to diverge after a common prefix. Structured Interaction Sets are created when the following cases occur:
• concurrent behavior of two components, i.e. when there exist two different execution traces, with a common prefix, showing a sequence of two interactions with symmetric ordering (referring to Figure 12 (a), the Structured Interaction Set is {(Comp1, Comp2)s, (Comp2, Comp3)s} and the common prefix is empty);
• non-deterministic behavior of components, i.e. when one-to-two, two-to-one, or alternative communication occurs. One-to-two communication takes place when two traces with a common prefix present two interactions with the same sender but different receivers. Two-to-one communication is the symmetric situation, i.e. the interaction pairs have different senders but the same receiver (referring to Figure 12 (b), the corresponding Structured Interaction Set is {(Comp3, Comp1)s, (Comp2, Comp1)s}ND and the common prefix is S(Comp1, Comp2)s). Finally, alternative communication occurs when two MSC, after a common prefix, show two different interactions with distinct senders and receivers. In this case, the two traces do not join again as happens in the concurrent behavior.
Referring to our case study and focusing on MSC3 and MSC12 of Figures 4 and 8, the SIS is {(CP, CS)a, (CP, DOP)s}ND, which indicates a one-to-two communication. The methodology aims at discovering the above situations since it derives from them the real degree of parallelism among SA components. Using this information, SA components are associated with the service centers of the QN model and the interconnections among these service centers are established.
The second macro step of the MSC2QN methodology is the generation of the QN model from the Interaction Sets resulting from the first macro step. In Figure 10 we show some translation patterns used during the generation of the QN model of Figure 11, which corresponds to our case study.
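As an illustration only, the two simplest patterns of Figure 10 can be written down as a mapping from a single interaction to a fragment of QN topology. The sketch below assumes, following Section 2, that a synchronous interaction yields one shared service center (pattern (b)) and an asynchronous one yields two connected service centers (pattern (a)). It deliberately ignores Structured Interaction Sets, pattern (c), queue capacities and the later restructuring step, all of which can change the final topology. `Interaction` and `translate_basic` are hypothetical names.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    sender: str
    receiver: str
    kind: str  # "a" = asynchronous, "s" = synchronous


def translate_basic(i: Interaction) -> dict:
    """Map one basic interaction to a fragment of QN topology.

    A service center is represented as a frozenset of component names.
    """
    if i.kind == "s":
        # Pattern (b): strongly synchronized components share one center.
        center = frozenset({i.sender, i.receiver})
        return {"centers": [center], "links": []}
    if i.kind == "a":
        # Pattern (a): independent components get one center each,
        # connected in the direction of the message.
        src, dst = frozenset({i.sender}), frozenset({i.receiver})
        return {"centers": [src, dst], "links": [(src, dst)]}
    raise ValueError(f"unknown interaction type: {i.kind!r}")


sync_fragment = translate_basic(Interaction("CI", "CP", "s"))
async_fragment = translate_basic(Interaction("CP", "CS", "a"))
```

Representing a service center as a set of component names keeps the merged case and the individual case uniform, which would make a later merging or restructuring pass easier to express.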
Figure 11. Queueing Network.
For the scope of this paper we do not detail the MSC2QN methodology further, since we are interested in the capabilities and features that the MSC notation must have to allow early performance analysis. The methodology requires that the MSC description contain state information for each component before interactions occur. This information on the component state allows the identification of concurrent interactions. Let us suppose that the methodology identifies a SIS composed of two interactions I1 and I2 having the same sender component. If this component reaches two states in conflict before executing I1 and I2 (conflict is a relation on component states that holds when, given a component, two states represent two mutually exclusive component behaviors), then the identified SIS does not give information on the degree of parallelism of the involved components but only models the information of a flow. For example, in the MSC1 and MSC13 scenarios the trace analysis step finds the SIS
Figure 12. Concurrent and Two-to-one Communications in MSC notation. (a) Concurrent Communications. (b) Two-to-one Communications.
{(CP, CustS)a, (CP, CI)s}ND, which is not considered further for concurrency analysis since the CP states in the two scenarios are in conflict. In this case the methodology introduces two classes of jobs leaving the CP service center, one that goes to the database servers and the DOP and another that goes back to CI (see Figure 11).
As a final remark, we highlight a crucial assumption that underlies the methodology in its current form. We assume that all the MSC of the SA description start from the same initial state of the software system. In other words, there is no hierarchy among MSC, and each one can, in practice, only be triggered by an external event. This constraint can be overcome by making use of HMSC [8].
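The pairwise analysis described in this section (find the common prefix of two traces, then classify the pair of interactions at which they diverge) can be sketched as follows. This is an illustrative reading of the case definitions, not the authors' implementation. The function names and the boolean flag `sender_states_conflict` are hypothetical. In particular, the state-conflict relation is assumed to be supplied by the caller, since it comes from the setting conditions in the MSC rather than from the traces.

```python
import re
from typing import NamedTuple, Optional


class Interaction(NamedTuple):
    sender: str
    receiver: str
    kind: str  # "a" = asynchronous, "s" = synchronous


LABEL = re.compile(r"S\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*([as])")


def parse_trace(trace: str) -> list[Interaction]:
    return [Interaction(*m.groups()) for m in LABEL.finditer(trace)]


def common_prefix_len(t1, t2) -> int:
    """Number of leading interactions shared by both traces."""
    n = 0
    for x, y in zip(t1, t2):
        if x != y:
            break
        n += 1
    return n


def structured_interaction_set(
    t1, t2, sender_states_conflict: bool = False
) -> Optional[tuple]:
    """Classify the first diverging pair of two traces, or return None."""
    k = common_prefix_len(t1, t2)
    if k >= len(t1) or k >= len(t2):
        return None  # one trace is a prefix of the other: nothing diverges
    i1, i2 = t1[k], t2[k]
    pair = frozenset({i1, i2})
    # Concurrent: the same two interactions occur in swapped order.
    if (len(t1) > k + 1 and len(t2) > k + 1
            and t1[k + 1] == i2 and t2[k + 1] == i1):
        return ("concurrent", pair)
    if i1.sender == i2.sender and i1.receiver != i2.receiver:
        # Same sender in conflicting states: the pair only describes a flow.
        if sender_states_conflict:
            return ("flow-only", pair)
        return ("one-to-two", pair)
    if i1.sender != i2.sender and i1.receiver == i2.receiver:
        return ("two-to-one", pair)
    if i1.sender != i2.sender and i1.receiver != i2.receiver:
        return ("alternative", pair)
    return None  # same sender and receiver, different type: not covered


msc1 = parse_trace("S(ENV, CI)s S(CI, ENV)s S(ENV, CI)s S(CI, CP)s "
                   "S(CP, CustS)a S(CustS, CP)a S(CP, CI)s S(CI, ENV)s")
msc3 = parse_trace("S(ENV, CI)s S(CI, CP)s S(CP, CS)a S(CS, CP)a "
                   "S(CP, CI)s S(CI, ENV)s")
msc12 = parse_trace("S(ENV, CI)s S(CI, CP)s S(CP, DOP)s S(DOP, OS)a "
                    "S(DOP, CP)s S(CP, CI)s S(CI, ENV)s")
msc13 = parse_trace("S(ENV, CI)s S(CI, ENV)s S(ENV, CI)s S(CI, CP)s "
                    "S(CP, CI)s S(CI, ENV)s")

sis_3_12 = structured_interaction_set(msc3, msc12)
sis_1_13 = structured_interaction_set(msc1, msc13, sender_states_conflict=True)
```

On the traces of Table 1, the MSC3/MSC12 pair is classified as one-to-two with the set {(CP, CS)a, (CP, DOP)s}, as in the text, while the MSC1/MSC13 pair is reported as flow-only once the caller states that the CP states are in conflict.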
3 The starting point: MSC notation and its needed extensions
We have chosen MSC as the SA description notation for several reasons:
• Except for small extensions, the standard MSC notation allows us to describe all the information we need in order to automatically generate a QN model from a software description, such as the types of interaction or the partial ordering of the interactions.
• Since the MSC model is not complete, the complexity of the approach decreases. We are more interested in the types of communication and in their ordering than in the complete behavior model.
• MSC is both a graphical and a textual notation. In fact, underlying the graphical notation of MSC there is the definition of a trace language, which simplifies the automation of the trace analysis step of our methodology.
Since MSC are a partial representation of the software system behavior, there are several approaches to synthesizing a global finite state system model out of a set of MSC. A single MSC then represents a valid path on this automaton. For the purposes of this paper we do not need to present the synthesis approach we assume. When necessary, we will only make explicit assumptions on the intended global model. As we have already said, we do not need to generate a complete model out of a set of MSC. The reference to the assumed global model is necessary only to justify choices that we make in the generation of the QN model from the MSC SA description.
To apply our methodology, the MSC description (i.e. the set of considered MSC) must have several features and properties. First of all, the set of MSC must be representative of the major system behaviors, and for each SA component there must exist at least one MSC describing its interactions with other components. If the set of MSC does not have these characteristics, we do not have enough information on how components interact with each other. The MSC describing the SA dynamics must satisfy the non-degeneracy property, i.e. arrows can only occur horizontally [1]. Synchronous and asynchronous interactions must have different representations. For this purpose we use the standard MSC notation (see Figure 13). To simplify the approach, we assume that the primitive basic communication between components is one-to-one. This assumption is not restrictive, since both multiple-sender and multiple-receiver interactions can be modelled by a set of one-to-one communications.
Figure 13. State Information and Interaction Types in MSC Notation. (op1 = Synchronous Interaction, op2 = Asynchronous Interaction)
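The requirement that every SA component appear in at least one MSC lends itself to a mechanical check. The sketch below assumes that traces are already available as sequences of (sender, receiver, type) triples. The function name `uncovered_components` and the component list used in the example are illustrative, not part of the methodology.

```python
def uncovered_components(components, traces):
    """Return the SA components that no trace mentions, sorted by name.

    `traces` maps an MSC name to a list of (sender, receiver, kind) triples.
    """
    seen = set()
    for trace in traces.values():
        for sender, receiver, _kind in trace:
            seen.update((sender, receiver))
    return sorted(set(components) - seen)


# Components of the customer view, abbreviated as in Table 1.
components = ["CI", "CP", "CustS", "CS", "CartS", "DOP", "OS"]

# Only the Login and Browse Catalog scenarios, for illustration.
traces = {
    "MSC1": [("ENV", "CI", "s"), ("CI", "ENV", "s"), ("ENV", "CI", "s"),
             ("CI", "CP", "s"), ("CP", "CustS", "a"), ("CustS", "CP", "a"),
             ("CP", "CI", "s"), ("CI", "ENV", "s")],
    "MSC3": [("ENV", "CI", "s"), ("CI", "CP", "s"), ("CP", "CS", "a"),
             ("CS", "CP", "a"), ("CP", "CI", "s"), ("CI", "ENV", "s")],
}

missing = uncovered_components(components, traces)
```

With only these two scenarios, the cart server, the delivery order process and the order server are reported as uncovered, which signals that further MSC (such as MSC4 and MSC12) are needed before the QN can be generated.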
As introduced in Section 2, MSC must contain state information about each component before each interaction. We represent the state information by means of the setting condition concept of the MSC notation. If a component is designed as a multi-threaded component, the state information for that component is a tuple of setting conditions, one for each thread. Figure 13 shows an example of an MSC containing such information, where lk represents a setting condition for a component. The setting condition of Comp3 is a pair since the component is composed of two threads.
The software system may receive service requests from external software components and may give back some kind of information. These external software components are called the environment. In a QN, the interactions of the system with its environment are modelled by input and output flows respectively, and the QN is called open. When instead there are no interactions between the software system and its environment, the corresponding QN is called closed. Our methodology can build both types of QN model. However, to generate an open QN, the approach needs information on the interactions with the environment. The MSC notation provides the gate feature to model this. Gates represent the interface with the environment. Any message attached to the MSC frame constitutes a gate. If the arrow starts from the MSC frame and ends on the lifeline of an object, it corresponds to an input flow. In the symmetric case, i.e. if the arrow starts from the lifeline of an object and ends on the MSC frame, it corresponds to an output flow. Figure 13 shows two gates, in_gate and out_gate, referring respectively to an input stream and to an output stream.
MSC permit us to represent many aspects of the SA dynamics that are useful for producing QN models from SA descriptions. However, some extensions of the notation must be defined. First of all, the definition of interaction must be extended. In general, interactions involve objects. In our approach, interactions occur between architectural components. Although the difference is mostly conceptual, in practice our extension allows designers to abstract from (possibly unknown) behavioral details. On the other hand, we lose expressiveness: for example, the dynamic creation of objects cannot be modelled. Secondly, we extend the MSC notation by introducing blocks of interactions that can occur many times, in order to represent iteration cycles, as shown in Figure 14.
Figure 14. MSC with a repeat block.
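Two of the notational points in this section have a direct textual counterpart in the traces of Table 1: a repeat block appears as `{ ... }n`, and gates appear as interactions with ENV. The sketch below, under those assumptions, unrolls repeat blocks and decides whether a set of traces calls for an open or a closed QN. `expand_repeats` and `qn_kind` are hypothetical helper names, and the `{ ... }n` spelling is taken from Table 1 rather than from the MSC standard.

```python
import re

# "{ block }n": a block of labels repeated n times, as in the MSC5 trace.
REPEAT = re.compile(r"\{([^{}]*)\}\s*(\d+)")
# Sender and receiver of one label, e.g. "S(ENV, CI)s".
PAIR = re.compile(r"S\(\s*(\w+)\s*,\s*(\w+)\s*\)")


def expand_repeats(trace: str) -> str:
    """Unroll every '{ block }n' into n copies of block, innermost first."""
    while True:
        expanded = REPEAT.sub(
            lambda m: " ".join([m.group(1).strip()] * int(m.group(2))),
            trace,
        )
        if expanded == trace:
            # Nothing left to unroll; normalize whitespace and stop.
            return " ".join(expanded.split())
        trace = expanded


def qn_kind(traces, env: str = "ENV") -> str:
    """Return 'open' if any trace exchanges a message with the environment."""
    for trace in traces:
        for sender, receiver in PAIR.findall(trace):
            if env in (sender, receiver):
                return "open"
    return "closed"


msc5 = ("S(ENV, CI)s S(CI, CP)s {S(CP, CartS)a S(CartS, CP)a}2 "
        "S(CP, CI)s S(CI, ENV)s")
msc5_flat = expand_repeats(msc5)
kind = qn_kind([msc5_flat])
```

Unrolling is only one possible treatment of a repeat block. Keeping the block and its count instead could be preferable if the repetition count is later used for parameterization, which this paper does not address.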
4 Conclusion
In this work we have summarized our methodology for software architecture performance analysis. Our approach generates a QN performance model from a software architecture description based on MSC. Based on our experience, we have started to discuss the adequacy of the MSC notation for the early performance validation of software architectures. We have pointed out the information the MSC description has to provide, the constraints on the SA description and the extensions of the MSC notation that allow the automatic generation of the QN topology (i.e. the set of service centers composing the model and their interconnections). In our discussion we did not consider the information required to parameterize the performance model, such as the system operational profile and the workload of the resources. Several extensions of scenario notations have recently been proposed to carry such additional performance information. As future work, besides continuing the validation of the methodology proposed so far, we want to analyze possible MSC extensions that allow automatic QN parameterization. From the additional information provided by extensions of MSC, such as operational profiles and resource workload, the methodology might define the service rates of the service centers, the arrival rate of the requests and the scheduling policy of the queues.
References
[1] R. Alur, K. Etessami, and M. Yannakakis. Inference of message sequence charts. In Proc. 22nd Int. Conf. on Softw. Eng. (ICSE2000), June 2000.
[2] F. Andolfi, F. Aquilani, S. Balsamo, and P. Inverardi. Deriving performance models of software architectures from message sequence charts. In Proceedings of the Second International Workshop on Software and Performance (WOSP00), September 2000.
[3] F. Aquilani, S. Balsamo, and P. Inverardi. Performance analysis at the software architecture design level. Performance Evaluation, 45(4), 2001.
[4] S. Balsamo, A. Di Marco, P. Inverardi, and M. Simeoni. Software performance: state of the art and perspectives. Technical Report of Sahara Project. Submitted for publication, 2002.
[5] H. Gomaa. Designing Concurrent, Distributed, and Real-Time Applications with UML. Addison-Wesley, 2000.
[6] Object Management Group. UML Profile for Schedulability, Performance, and Time. OMG document ptc/2002-03-02, http://www.omg.org/cgi-bin/doc?ptc/2002-03-02.
[7] E. Lazowska, J. Zahorjan, G. S. Graham, and K. C. Sevcik. Quantitative System Performance: Computer System Analysis Using Queueing Network Models. Prentice-Hall, Inc., Englewood Cliffs, 1984.
[8] ITU Telecommunication Standardization Sector. Message Sequence Charts, ITU-T Recommendation Z.120 (11/99). 1999.