Performance Evaluation 45 (2001) 81–105

Automatic derivation of software performance models from CASE documents

V. Cortellessa∗, A. D'Ambrogio, G. Iazeolla

Department of Computer Science, S&P, University of Roma "Tor Vergata", 110 Via di Tor Vergata, I-00133 Rome, Italy

Abstract

Lifecycle validation of software performance (i.e., prediction of the product's ability to satisfy the user performance requirements) is based on the automatic derivation of software performance models from CASE documents or rapid prototypes. This paper deals with the CASE document alternative. After a brief overview of existing automatic derivation methods, it introduces a method that unifies existing techniques based on CASE documents. The method is step-wise clear, can be used from the early phases of the software lifecycle, is oriented to distributed software, and can be easily incorporated into modern (e.g., UML-based) CASE tools. The method enables a software designer with no specific knowledge of performance theory to predict at design time the performance of various final product alternatives: the designer only needs to feed the CASE documents into the performance model generator. The paper presents an application case study dealing with the development of distributed software, where the method is used to predict the performance of the different distributed architectures the designer could select at preliminary design time to obtain the best performing final product. The method can be easily incorporated into modern object-oriented software development environments to encourage software designers to introduce lifecycle performance validation into their development best practices. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Software performance; Software validation; CASE tools

1. Introduction

Software quality is not an add-on feature of software products. In other words, it cannot be introduced by retrofit actions (when the product has already been developed), but has to be built in across the software development process [11].

Work partially supported by funds from the MURST project on Software Architectures and Languages to Coordinate Distributed Mobile Components, from the University of Roma "Tor Vergata" research on the Performance Validation of Advanced Systems, and from the University of Roma "Tor Vergata" CERTIA Research Center.
∗ Corresponding author.
E-mail addresses: [email protected] (V. Cortellessa), [email protected] (A. D'Ambrogio), [email protected] (G. Iazeolla).
0166-5316/01/$ – see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0166-5316(01)00036-0


Fig. 1. Strategy scheme for lifecycle validation of software performance [11].

Software performance is one of the main attributes of software quality. The approach of introducing performance by retrofit actions has been termed the fix-it-later approach in [3], where the many problems this approach creates are documented. Lifecycle validation of software performance is the process of predicting (at early phases of the lifecycle) and evaluating (at the end) the ability of the software product to satisfy the user performance goals.

In order to implement lifecycle validation of software performance, the strategy scheme in Fig. 1 has to be applied to each phase of the development lifecycle [11]. It applies to lifecycle artifacts. The term artifact is conventionally used to mean either the final software program or an intermediate version of it; the requirement document, the analysis document, the design documents, etc. are examples of artifacts. When applied to the ith phase, the scheme assumes that the previously validated (i − 1)th phase artifact is received in input. On its basis, a tentative phase(i) artifact is produced and validated in the PV block by comparing the predicted performance of the product with the user performance goals. In case of unacceptability, if cost-effective, a new tentative artifact is produced for better performance; otherwise (i.e., if it is not cost-effective to produce a new tentative artifact), a feedback loop to the user goals takes place, for goal revision.

The PV block of the strategy scheme in Fig. 1 consists of three main activities:


1. performance model production;
2. performance model evaluation;
3. comparison of evaluation results with user performance goals.

The first activity deals with the derivation of a performance model from existing development documents (e.g., analysis documents, design documents or source code). Considerable effort and large expertise, however, are required to manually translate standard software documents into a performance model, and this usually makes model development unattractive for software developers. Only a few methods have been developed for the automatic generation of performance models [4,8–10,12,14]. In some cases, as for example in [4,10,12], they do not give detailed step-wise algorithms for model generation. In other cases, as for example in [9], they are specific to the detailed design stage of the software lifecycle (while in many circumstances performance predictions at earlier development phases, such as the analysis and preliminary design phases, are important to system developers). In still other cases, as for example in [4,7,9,14], they do not explicitly model distributed software (e.g., they do not model the performance of software servers). In yet other cases, as for example in [8], they are based on a different approach, which makes use of executable rapid prototypes of the software product.

This paper gives a unified view of performance model generation methods based on non-executable software documents and includes a method that overcomes some difficulties of existing methods. In particular, it introduces a model generation method which:

• is step-wise clear from the algorithmic point of view;
• can be used from the early phases of the software lifecycle, such as analysis and preliminary design;
• does not require executable prototypes;
• includes features to also deal with distributed software;
• can be easily incorporated into modern (e.g., UML-based) CASE tools, among which System Architect from Popkin, Rose from Rational and COOL:Jex from Sterling.

This paper is organized as follows. Section 2 describes the general method for generating a performance model from software development documents. Section 3 gives details of the method application by use of a case study. The generated performance model can be of two types: the extended queuing network (EQN) type [1] or the layered queuing network (LQN) type [6]; in this paper, the LQN type is illustrated. Finally, in Section 4 the issues relevant to the method application are discussed, with details of the method complexity and size and an example of model use to obtain performance metric values.

2. Method outline

The assumption is made that the software is developed according to an object-oriented approach, by use of the UML notation [2,5]. The described method deals with the generation of a performance model from object-oriented analysis (OOA) documents, produced by use of a UML-based CASE tool. The method also assumes some documents are available at object-oriented design (OOD) time, when performance aspects of distributed software are also to be considered. The method is illustrated in Fig. 2. The top area refers to input data


Fig. 2. Method outline for model generation.

to the performance model generator. The top-left part relates to input documents available at analysis time (class diagram CD, set of sequence diagrams SD1 through SDn, platform data PD, and operational profile OP, which includes user workload data). In the case of distributed software, the top-right part is also used; it relates to input documents available at preliminary design time (software modules architecture (SW), client/server structure (C/S), and modules-platform mapping (MP)). The remaining blocks in Fig. 2 relate to the various steps of the performance model generation, which can be synthesized as follows:

Step 1: generation of the set of precedence graphs (PG1 through PGn), each graph relevant to a specific scenario (object interactions). Such graphs are also called scenario PGs;
Step 2: generation of the global precedence graph (global PG), which merges the scenario PGs together and represents the behavior of the whole system;
Step 3: generation of the extended flow graph (EFG). This generation is based on the global PG, the workload data (from the OP document), the SW modules architecture and the C/S structure. The EFG is a standard flow graph with extensions to represent client/server structures;

Step 4: generation of the so-called extended execution graph (EEG) on the basis of the platform configuration (PC) from the PD document (see Fig. 2), of the EFG, and of experience data and MP, as better discussed in Section 3.4;
Step 5: generation of the LQN performance model on the basis of the EEG and the resource capacity data (RC) from the PD document (see Fig. 2), as better seen in Section 3.5.
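The five steps above can be read as one transformation pipeline. The following sketch makes the data flow of Fig. 2 explicit; every function name is hypothetical, and each step is reduced to a stub that only tags its output (the real transformations are the subject of Section 3):

```python
# Hypothetical sketch of the five generation steps as one pipeline.
# Each function is a stub that tags its result, to make the data flow
# explicit; the actual transformations are described in Section 3.

def derive_scenario_pg(cd, sd):          # Step 1: one PG per sequence diagram
    return {'kind': 'scenario PG', 'scenario': sd}

def merge_pgs(pgs):                      # Step 2: global PG via the merge operation
    return {'kind': 'global PG', 'scenarios': [p['scenario'] for p in pgs]}

def derive_efg(gpg, op, sw, cs):         # Step 3: flow graph + software server blocks
    return {'kind': 'EFG'}

def derive_eeg(efg, pc, mp, experience): # Step 4: attach resource demand vectors
    return {'kind': 'EEG'}

def derive_lqn(eeg, rc):                 # Step 5: LQN tasks, entries and devices
    return {'kind': 'LQN'}

def generate_model(cd, sds, op, pd, sw=None, cs=None, mp=None, experience=None):
    """cd: class diagram; sds: sequence diagrams; op: operational profile;
    pd: platform data (PC and RC); sw/cs/mp: preliminary-design inputs."""
    pgs = [derive_scenario_pg(cd, sd) for sd in sds]
    gpg = merge_pgs(pgs)
    efg = derive_efg(gpg, op, sw, cs)
    eeg = derive_eeg(efg, pd.get('PC'), mp, experience)
    return derive_lqn(eeg, pd.get('RC'))
```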

The LQN model so obtained is then evaluated by use of an LQN evaluation tool [15]. Example results of such an evaluation are given in Section 4.2.

3. Method description

A constructive description of the method is given here by use of an application case that deals with the development of a distributed software system called mail system. The system gives the users the capability to download bulks of mail messages coming from external sources. A database stores log records of each user operation, and is updated by the system when the user closes the session [16]. There are two categories of users: administrators, who are allowed to download both the log records and the mail messages, and standard-users, who are allowed to download only mail messages. Two scenarios are thus defined for this system, each one referring to a particular category of users.

3.1. Inputs from the analysis phase artifacts

According to the OP document, the administrators are 10% of the total users and download one mail message per session, while the standard-users are 90% of the total users and download three mail messages per session.
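From this operational profile, the expected number of mail messages downloaded per session is a simple weighted average; the resulting figure of 2.8 reappears later (Table 7) as the mean number of visits to the read mail call:

```python
# Expected number of mail downloads per session under the stated OP:
# 10% administrators (1 message each) and 90% standard-users (3 each).
p_admin, p_std = 0.10, 0.90
msgs_admin, msgs_std = 1, 3

mean_msgs = p_admin * msgs_admin + p_std * msgs_std  # approx. 2.8
```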

Fig. 3. Platform configuration (PC) of the considered mail system.


Table 1 PC devices list Device 1 Device 2 Device 3 Device 4 Device 5 Device 6 Device 7 Device 8 Device 9

T UWS UWS-DB HWS-A HWSA-DB HSW-B HWSB-DB LAN WAN

Generic peripheral terminal of the user workstation CPU of the user workstation DB unit of the user workstation CPU of the HW server A DB unit of the HW server A CPU of the HW server B DB unit of the HW server B Local area network Wide area network

A mail system process is run for each user logged on the user workstation. A number N of users are simultaneously logged in, and their think time is 10 s on average. Each user must be authenticated in order to get access to the mail system.

It is assumed that, according to the PD document, the mail system PC is the one in Fig. 3, and consists of one user workstation, with a number of peripheral terminals operated by users, and two servers, HW server A and HW server B. The user workstation is connected to HW server A by the LAN and to HW server B by the WAN. The WAN also directs the external mail messages to HW server A or B according to design-time decisions (see Section 3.4). The user workstation and the hardware servers are each endowed with their own DB device, as better described in Table 1, which gives the complete set of PC devices in Fig. 3.

Additional artifacts from the object-oriented analysis phase are the CD and the set of SDs, on whose basis the PGs of Step 1 can be generated as described below.

3.2. Generation of precedence graphs

Precedence graphs (PGs) are used to identify the execution flow and the interaction among system actors. This section describes the generation of a PG from data provided by the OOA, more specifically from the CD and the set of SDs. This step of the method consists of two sub-steps:
• the generation of a scenario PG for each scenario (Section 3.2.1);
• the generation of a global PG for the overall system, obtained by merging all scenario PGs (Section 3.2.2).

3.2.1. Generation of the scenario PGs

The OOA provides the diagrams used for the generation of scenario PGs. Such diagrams are the CD and the set of SDs. The CD depicts the static structure of the system in terms of classes, with attributes and methods, and relationships between classes. Class methods are the main elements used for the scenario PG generation. Fig. 4 describes the CD of the considered mail system, consisting of four classes:
• the User class, which refers to the users of the mail system;
• the System class, with the mail session and verify user methods;
• the Mail Mgr class, which manages the Mail DB, with the read mail method to download a single mail message;


Fig. 4. Class diagram of the considered mail system.

• the History Mgr class, which manages the History DB, with the view log and update log methods.

While the CD refers to the overall system, SDs refer to a particular scenario. An SD shows the interaction in time sequence among objects, i.e., instances of each given class, in the scenario: the vertical dashed lines represent the lifelines of the objects (time progresses top-down) and the horizontal arrows represent control transfers (or object interactions) from the sender object to the receiving object. Each UML-type activation box [5] between an incoming arrow and the subsequent outgoing arrow of the SD graphically represents the code executed by each class method.

Fig. 5. SD for the standard-user scenario.


Fig. 6. SD for the administrator scenario.

Only two SDs are illustrated here, given in Figs. 5 and 6 for the standard-user scenario and the administrator scenario, respectively. In the administrator scenario, for example (see Fig. 6), the User object initially calls the mail session method of the System object. The System executes the method by first verifying the user rights and then by calling the view log method of the History Mgr object and the read mail method of the Mail Mgr object. Finally, the System returns control to the User object and, at the same time, performs a call to the update log method of the History Mgr object. Such a call is of asynchronous type, so there is no return arc to the System object.

It is easy to see that from each SD a precedence graph (PG), also called a "scenario PG", can be derived (see Figs. 7 and 8). This is done by introducing a PG node for each method call or method execution in the SD. PG nodes are vertically grouped into subsets, one for each actor corresponding to a specific SD object. Nodes are labeled U, S, M and H to denote that they belong to the User actor, the System actor, the Mail Mgr actor and the History Mgr actor, respectively. PG arcs give the transfer of control between PG nodes. Arcs with the call label represent synchronous calls, which expect a return arc. Arcs with the goto label represent asynchronous calls, which have no return arcs. Each PG node can be a standard node (representing a method CALL or a method EXECUTION) or a special node (SPLIT, LOOP, NOP, etc.). The correspondence between the method calls in Figs. 5 and 6 and the PG nodes in Figs. 7 and 8 is illustrated in Table 2, where the LOOP node S1 represents the three-times repetition of visits to nodes S3 and M0, corresponding to the sequence of three calls to the read mail method of the Mail Mgr object in Fig. 5, the SPLIT node S4 represents the simultaneous transfer of control to nodes U1 and H1, and the NOP node U1 is used to represent the user think time. The EXECUTION nodes (S0, M0, H0, H1) and the CALL nodes (U0, S2, S3, S5) are of immediate evidence.
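The node-introduction rule can be sketched as follows, assuming an SD is available as a flat list of messages; all data-structure and function names are hypothetical, and special nodes (LOOP, SPLIT, NOP) and return arcs are omitted for brevity:

```python
# Hypothetical sketch of the node-introduction rule: one CALL node on the
# sender side and one EXECUTION node on the receiver side for each message,
# with a 'call' arc (synchronous) or a 'goto' arc (asynchronous) between them.
from dataclasses import dataclass

@dataclass(frozen=True)
class PGNode:
    actor: str    # 'U', 'S', 'M' or 'H'
    kind: str     # 'CALL' or 'EXECUTION'
    method: str

def derive_pg(messages):
    nodes, arcs = [], []
    for sender, receiver, method, is_async in messages:
        call = PGNode(sender, 'CALL', method)
        execution = PGNode(receiver, 'EXECUTION', method)
        nodes += [call, execution]
        arcs.append((call, execution, 'goto' if is_async else 'call'))
    return nodes, arcs

# Fragment of the administrator scenario of Fig. 6:
msgs = [('U', 'S', 'mail session', False),
        ('S', 'H', 'view log', False),
        ('S', 'M', 'read mail', False),
        ('S', 'H', 'update log', True)]   # asynchronous: no return arc
nodes, arcs = derive_pg(msgs)             # 4 messages give 8 PG nodes
```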


Fig. 7. PG for the standard-user scenario of Fig. 5.

Fig. 8. PG for the administrator scenario of Fig. 6.


Table 2
SD–PG correspondence

U0  CALL of System:mail session
U1  NOP
S0  EXECUTION of System:verify user
S1  LOOP:3
S2  CALL of History Mgr:view log
S3  CALL of Mail Mgr:read mail
S4  SPLIT
S5  CALL of History Mgr:update log
M0  EXECUTION of Mail Mgr:read mail
H0  EXECUTION of History Mgr:view log
H1  EXECUTION of History Mgr:update log

Fig. 9. Merge operation.

3.2.2. Generation of the global PG

The global PG represents the behavior of the overall system and is obtained by combining the specific scenario PGs. Combination is performed by merging the specific PGs according to the following operation. Let PG(h) be the result of merging PG(i) and PG(j), in symbols:

PG(h) := PG(i) ⊕ PG(j)

where ⊕ denotes the merge operation. Then PG(h) is the precedence graph consisting of four sub-graphs (g1, g2, g3, g4) connected by a CASE node (see Fig. 9), with g1 being the entrance sub-graph to the CASE, g2 and g3 the alternative sub-graphs, and g4 the continue sub-graph of the CASE node. Sub-graphs g1 through g4 are defined as follows:
• g1 is the sub-graph (if any) common to the initial parts of PG(i) and PG(j) (for example the (U0, S0) sub-graph in Figs. 7 and 8);
• g2 and g3 are the immediately following sub-graphs in PG(i) and PG(j) which are structurally different¹ (for example g2 = (S1, S3, M0) in Fig. 7 and g3 = (S2, H0, S3, M0) in Fig. 8);

¹ By structurally different we mean that PG(i) and PG(j) differ by at least one node or one arc.


• g4 is the final sub-graph (if any) common to PG(i) and PG(j) (for example the (S3, S4, U1//H1) sub-graph in Figs. 7 and 8).

Based on the above defined merge operation, the global PG, denoted gPG, is obtained from the specific scenario PGs (PG(1), PG(2), . . . , PG(n)) according to the following algorithm:

Global PG generation algorithm
Begin
  gPG := PG(1)
  For i = 2, . . . , n ⇒ gPG := gPG ⊕ PG(i)
End

Fig. 10 shows the global PG that results from the merge of the specific scenario PGs of Figs. 7 and 8.

Fig. 10. Global PG.
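Under the simplifying assumption that a scenario PG is a flat node sequence (real PGs are graphs, so locating g1 through g4 requires sub-graph matching instead of prefix/suffix comparison), the merge operation ⊕ and the global-PG algorithm can be sketched as:

```python
# Simplified sketch of the merge operation and the global PG algorithm,
# treating each scenario PG as a flat node sequence.
from functools import reduce

def merge(pg_i, pg_j):
    """PG(h) := PG(i) (+) PG(j): common prefix g1, a CASE node over the
    structurally different middles g2 and g3, common suffix g4."""
    if pg_i == pg_j:
        return list(pg_i)
    k = 0                                 # g1: longest common prefix
    while k < min(len(pg_i), len(pg_j)) and pg_i[k] == pg_j[k]:
        k += 1
    s = 0                                 # g4: longest common suffix
    while (s < min(len(pg_i), len(pg_j)) - k
           and pg_i[-1 - s] == pg_j[-1 - s]):
        s += 1
    g1, g4 = pg_i[:k], pg_i[len(pg_i) - s:]
    g2, g3 = pg_i[k:len(pg_i) - s], pg_j[k:len(pg_j) - s]
    return g1 + [('CASE', g2, g3)] + g4

def global_pg(pgs):
    return reduce(merge, pgs)             # gPG := PG(1) (+) ... (+) PG(n)

# Abstract example: two PGs sharing a prefix and a suffix.
gpg = global_pg([['A', 'B', 'C', 'E'], ['A', 'B', 'D', 'E']])
# gpg == ['A', 'B', ('CASE', ['C'], ['D']), 'E']
```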


Table 3
Workload data

Number of users   N
Think time        10 s
% Administrators  10%
% Standard-users  90%

3.3. Generation of the EFG

According to Fig. 2, the EFG is the graph generated at Step 3 of the method, which prescribes the following inputs to the EFG generation:
1. the global PG obtained at Step 2 of the method;
2. the workload data (from the OP document);
3. the SW modules architecture;
4. the C/S structure.

Input 1 has been described in Section 3.2.2.

Input 2 describes the workload imposed by the users, consisting of:
2.1. the number of users,
2.2. the users' think time,
2.3. the percentage of each user category,
as illustrated in Table 3 for the example case study.

Input 3 is the SW modules architecture given by the structure chart generated at preliminary design time, as illustrated in Fig. 11 for the example case study. By comparing Figs. 4 and 11 it is seen that, in the example case study, a one-to-one mapping is assumed between the classes of the CD and the software modules in Fig. 11 (for more complex systems, various alternative mappings can be found), except for the User class, which is assumed not to denote a specific software component. Nevertheless, the global PG node U0 will be translated into an EFG block to be considered as the user think time block.

Input 4 is a description of which modules play the role of clients and which the role of servers. According to Table 4, in the example case study the assumption is that the System module plays the role of client, while the Mail Mgr and History Mgr modules play the role of servers (see below for the meaning of the multiplicity data).

Fig. 11. SW modules architecture.


Table 4
C/S structure

Module       Role    Multiplicity
System       Client  –
Mail Mgr     Server  2
History Mgr  Server  1
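The C/S structure can be kept as a small lookup table and consulted during the PG-to-EFG translation described next: nodes of client actors become basic blocks, while nodes of server actors become software server blocks carrying the listed multiplicity. A sketch with hypothetical names:

```python
# Hypothetical encoding of the C/S structure of Table 4, used when a global
# PG node is converted into an EFG block: client-actor nodes become basic
# blocks, server-actor nodes become software server blocks whose degree of
# multiplicity bounds the number of simultaneous users admitted.
CS_STRUCTURE = {
    'System':      ('client', None),
    'Mail Mgr':    ('server', 2),
    'History Mgr': ('server', 1),
}

def efg_block_for(actor):
    role, multiplicity = CS_STRUCTURE[actor]
    if role == 'server':
        return ('software server block', multiplicity)
    return ('basic block', None)
```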

An EFG is a graph semantically equivalent to the global PG but notationally different. It is an enrichment of the standard flow graph (FG) introduced in [3] for SPE. According to [3], an FG is a graph whose nodes, called blocks, are of various types: basic blocks, expanded blocks, repetition blocks, case blocks, split blocks, fork-join blocks, lock-free blocks and share blocks. The semantics of such nodes is of immediate evidence; the reader is referred to [3] for details. The EFG we introduce enriches the mentioned FG blocks with a so-called software server block, whose semantics is illustrated in Fig. 12, where the number inside the left diamond denotes the degree of multiplicity, in other words the number of simultaneous users admitted to the block. Each software server block may contain nested software server blocks.

The translation of a global PG into an EFG is performed by applying the following set of rules:
• each SPLIT node of the global PG is converted into an EFG split block;
• each LOOP node of the global PG is converted into an EFG repetition block;
• each CASE node of the global PG is converted into an EFG case block;
• each node of the global PG belonging to an actor (object) that, according to the C/S structure of Table 4, plays the role of client is converted into an EFG basic block (i.e., a plain rectangular block). Examples of such nodes are nodes U0, S0, S2, S3, S5 of the global PG;
• each node of the global PG belonging to an actor (object) that, according to the C/S structure of Table 4, plays the role of server is converted into an EFG software server block (i.e., a rectangular block with diamond sides). Examples of such nodes are nodes M0, H0, H1 of the global PG.

Fig. 13 gives a view of the EFG obtained from the global PG in Fig. 10. It is easy to recognize that the alternative branches outgoing from the CASE block have associated branching probabilities of values

Fig. 12. Semantics of the software server block.


Fig. 13. Extended flow graph (EFG).

0.1 and 0.9 for administrators and standard-users, respectively, as obtained from the user workload in Table 3. It is also easily seen that the blocks on the 0.1 branch correspond to nodes S2, H0, S3 and M0 of the global PG in sequence, while the blocks on the 0.9 branch correspond to global PG nodes S3 and M0 connected by the repetition block S1. Finally, the blocks following the CASE block correspond to nodes S4, S5 and H1 of the global PG, where S4 is the SPLIT node converted into a split block. The Mail Server multiplicity of 2 gives the server the capability to accommodate two simultaneous executions (e.g., two standard-users, or one standard-user and the administrator).

According to the nesting rule of the software server block, the outer thin-line software server block in Fig. 13 denotes the execution of the System mail session method and illustrates the fact that all included blocks play the role of server for the User client block U0. Since a System is run for each user


Table 5
Modules-platform mapping

Platform resource   MP1 alternative   MP2 alternative
User workstation    System            System
HW server A         History Mgr       Mail Mgr
HW server B         Mail Mgr          History Mgr

(according to the OP document described in Section 3.1), the degree of multiplicity of such a software server block is N, i.e., the number of users from Table 3.

3.4. Generation of the EEG

An EEG is derived directly from the EFG obtained above. It is a weighted graph in which each EFG block is associated with a resource demand vector

d = (d1, d2, . . . , dn)

where di is the number of elementary operations the block demands of device i of the PC. Example di's are the number of program statements executed by the CPU, the number of accesses to the DB, and the number of accesses to the LAN and/or WAN. As already outlined in Section 2 (see Fig. 2), the EEG is obtained at Step 4 of the method, on the basis of the PC (see Fig. 3), of the EFG (see Fig. 13), and of MP data and experience data.

Modules-platform mapping data (MP) are decisions on which software module to allocate on which PC device. Example mapping decisions are the two alternatives MP1 and MP2 described in Table 5. In MP1 the incoming mail messages are collected by the HWSB, while in MP2 they are collected by the HWSA. In the following, alternative MP1 will be assumed.

Experience data are estimations of the values taken by the components di, based on data from similar systems. Table 6 describes the results of such an estimation for the di of the various blocks illustrated in Fig. 13. Each row in Table 6 gives the resource demands d1 through d9 submitted by each block (U0 through H1) of the EFG. Each di is expressed in terms of a number of accesses (acc) or a number of statements (stat). Row S3, e.g., says that the demand from the call Mail Mgr.read mail block is for 10 UWS statements, one LAN access, one WAN access and zero elsewhere. The resulting EEG is the weighted graph illustrated in Fig. 14, where each block has the associated demand vector obtained from Table 6.

3.5. Derivation of the LQN model

According to [6], an LQN model is a high-level abstraction of queuing network models, introduced for modeling the contention for software servers and hardware devices. The basic LQN components are called entities. An LQN includes an entity for each software task (client or server) and an entity for each processor or hardware device (see Fig. 15).


Table 6
Resource demands

EFG block                          T      UWS     UWS-DB  HWSA    HWSA-DB  HWSB    HWSB-DB  LAN    WAN
                                   (acc)  (stat)  (acc)   (stat)  (acc)    (stat)  (acc)    (acc)  (acc)
(U0) Call System.mail session      1      5       0       0       0        0       0        0      0
(S0) System.verify user            0      100     1       0       0        0       0        0      0
(S2) Call History Mgr.view log     0      10      0       0       0        0       0        1      0
(S3) Call Mail Mgr.read mail       0      10      0       0       0        0       0        1      1
(S5) Call History Mgr.update log   0      10      0       0       0        0       0        1      0
(M0) Mail Mgr.read mail            0      0       0       0       0        150     1        0      0
(H0) History Mgr.view log          0      0       0       100     1        0       0        0      0
(H1) History Mgr.update log        0      0       0       200     2        0       0        0      0
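Each row of Table 6 is naturally represented as a sparse per-device map; for instance, row S3 (under the MP1 alternative assumed above) can be encoded as follows (a sketch; all names are illustrative):

```python
# Sparse encoding of one row of Table 6: row S3, the call Mail Mgr.read mail
# block, demands 10 UWS statements, one LAN access and one WAN access, and
# zero elementary operations on every other device.
DEVICES = ('T', 'UWS', 'UWS-DB', 'HWSA', 'HWSA-DB',
           'HWSB', 'HWSB-DB', 'LAN', 'WAN')

d_S3 = dict.fromkeys(DEVICES, 0)
d_S3.update({'UWS': 10, 'LAN': 1, 'WAN': 1})
```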

Software tasks in Fig. 15 are connected by directed edges that represent method calls, going from client tasks to server tasks. Such edges are labeled with numbers denoting the average number of calls. In a similar way, hardware devices are connected to software tasks by boldface directed edges called device calls, labeled with numbers denoting the number di of elementary operations the task demands of device i. Tasks may be labeled with numbers denoting the degree of multiplicity of the task; when omitted, the degree of multiplicity is assumed to be 1. The number values associated to directed edges are called LQN

Fig. 14. Extended execution graph (EEG).


Fig. 15. LQN components.

parameters. Entries are associated with each software task; an entry denotes the part of the task that provides a certain service (analogous to an object method). For more details the reader is referred to [6].

This section illustrates the generation of the LQN performance model of Step 5 of the method outlined in Section 2. The algorithm to generate an LQN model from an EEG consists of two steps:
• generation of the LQN,
• parameterization of the LQN,
described in Sections 3.5.1 and 3.5.2, respectively.

3.5.1. Generation of the LQN

In this step the generation of an LQN without parameters is illustrated for the example case study. An LQN model is generated from an EEG by applying the following set of rules:
• one software task is generated for each type of software server block (i.e., System, Mail Mgr, History Mgr). One software task is also generated for each type of client block (i.e., User);
• the degree of multiplicity of each server task is obtained from the degree of multiplicity of the corresponding software server block. For client tasks, the degree of multiplicity is obtained from the workload data (Table 3, see later);
• entries for each task are obtained by visiting the EEG: for each pair of client/server blocks (i.e., a calling basic block followed by a software server block), the name of the called method becomes an entry name in the server task and the name of the calling method becomes an entry name in the client task;
• method calls are obtained by drawing a directed edge from the calling entry (of the client task) to the called entry (of the server task);
• devices are derived from the platform configuration devices (Table 1);
• device calls are obtained by drawing boldface directed edges from each entry to each device for which a non-zero value is specified in the entry's resource demand vector (Table 6).

Fig. 16 describes the resulting LQN model for the example case study of Fig. 14. According to the above described set of rules, the following tasks, and entries for each task, have been identified and introduced in the LQN model:


Fig. 16. LQN model for alternative MP1.

• a User client task, with a dummy entry user entry, corresponding to the U0 block of the EEG, and with a degree of multiplicity of N, corresponding to the number of users in Table 3;
• a System server task, with a mail session entry, corresponding to the set of S0, S2, S3 and S5 blocks of the EEG, and with a degree of multiplicity of 10, as from Fig. 14;
• a History Mgr server task, with view log and update log entries, corresponding to the H0 and H1 blocks of the EEG, respectively, and with a degree of multiplicity of 1, as from Fig. 14;
• a Mail Mgr server task, with a read mail entry, corresponding to the M0 block of the EEG, and with a degree of multiplicity of 2, as from Fig. 14.

Method calls, devices and device calls are of immediate evidence.

3.5.2. Parameterization of the LQN

In this step the parameterization of the LQN is performed by associating values to the directed edges representing method calls and device calls. The value associated to a method call is called an MCP (method call parameter), while the value associated to a device call is called a DCP (device call parameter). The MCP gives the expected number of calls that the client task entry makes to the server task entry. The DCP gives the device demand of the task entry, expressed in different units (statements, bits, transactions) depending on the device type considered. DCPs for each entry are first evaluated as follows:
1. let Ni be the number of EEG blocks associated to each entry i of the LQN model;
2. let dij be the resource demand vector of the EEG block j (see Table 6) associated to LQN entry i, where j = 1, . . . , Ni;


Table 7
v_ij and d_ij for the example case study

LQN entry (i)  EEG block (j)  v_ij   T      UWS     UWS-DB  HWSA    HWSA-DB  HWSB    HWSB-DB  LAN    WAN
                                     (acc)  (stat)  (acc)   (stat)  (acc)    (stat)  (acc)    (acc)  (acc)
User entry     U0             1      1      5       0       0       0        0       0        0      0
Mail session   S0             1      0      100     1       0       0        0       0        0      0
               S2             0.1    0      10      0       0       0        0       0        1      0
               S3             2.8    0      10      0       0       0        0       0        1      1
               S5             1      0      10      0       0       0        0       0        1      0
Read mail      M0             2.8    0      0       0       0       0        150     1        0      0
View log       H0             0.1    0      0       0       100     1        0       0        0      0
Update log     H1             1      0      0       0       200     2        0       0        0      0

3. let v_ij be the mean number of visits to the EEG block j associated to LQN entry i, where j = 1, ..., N_i. Such a number represents the average number of times that an EEG block is passed through during one execution of the EEG, and it is easily obtained by visiting the EEG while counting the number of visits to each block, weighted by the branching probabilities. Refer to [3] for details about methods to derive the mean number of visits to blocks of an execution graph;
4. the resource demand vector associated to each LQN task entry i, denoted by D_i, is then obtained as:

   $D_i = \sum_{j=1}^{N_i} v_{ij}\, d_{ij}$

5. the elements of vector D_i are finally used to obtain DCP values for entry i. These are obtained by expressing such elements in proper units, as from Table 11.² Thus, the number of statements in D_i remains a number of statements in the corresponding DCP value, while the numbers of accesses to DB and LAN/WAN devices become numbers of transactions and bits, respectively. The number of accesses to the peripheral terminal device in D_i remains unchanged in the corresponding DCP value, since such a device is only used to represent the users' think time in LQN models and does not appear as a device in Table 11.

Table 7 gives the values of v_ij and d_ij for the example case study, while Table 8 gives the resulting D_i, which are finally mapped to DCP values in Table 9. DCPs for each entry are obtained by assuming one transaction for each access to DB-type devices and a mean number of bytes transferred for each access to LAN/WAN devices, in particular:
• 2500 bytes for mail messages;
• 4000 bytes for view log records;
• 250 bytes for update log records.

² It is understood that CPU statements are high-level language statements. In other words, it is assumed that the capacity takes into consideration the overhead introduced by the middleware and the operating system. In conformity with other authors, a CPU speed of 2–5 MIPS at machine level is assumed, with a ratio of high-level to machine instructions of 1/20 [3,13]. A similar assumption holds for DB requests and capacity. In conformity with other authors, an access time of 20–50 ms per request is assumed for I/O devices, with a ratio of DB requests to I/O requests of 1/1 [3,13].
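The computation of steps 3–5 can be sketched in Python. The v_ij and d_ij values below are copied from Table 7 (alternative MP1); the LAN unit conversion assumes each access of S2, S3 and S5 carries a view log record, a mail message and an update log record, respectively, as per the byte figures listed above. This is an illustrative sketch, not the actual model generator:

```python
# Sketch of steps 3-5: D_i = sum_j v_ij * d_ij, then unit conversion to DCPs.
# Device order in each demand vector:
#   T, UWS, UWSDB, HWSA, HWSADB, HWSB, HWSBDB, LAN, WAN
blocks = {
    # EEG block: (mean visits v_ij, resource demand vector d_ij), from Table 7
    "U0": (1.0, [1, 5,   0, 0,   0, 0,   0, 0, 0]),
    "S0": (1.0, [0, 100, 1, 0,   0, 0,   0, 0, 0]),
    "S2": (0.1, [0, 10,  0, 0,   0, 0,   0, 1, 0]),
    "S3": (2.8, [0, 10,  0, 0,   0, 0,   0, 1, 1]),
    "S5": (1.0, [0, 10,  0, 0,   0, 0,   0, 1, 0]),
    "M0": (2.8, [0, 0,   0, 0,   0, 150, 1, 0, 0]),
    "H0": (0.1, [0, 0,   0, 100, 1, 0,   0, 0, 0]),
    "H1": (1.0, [0, 0,   0, 200, 2, 0,   0, 0, 0]),
}
entries = {  # LQN entry -> EEG blocks it covers (Section 3.5.1)
    "user entry": ["U0"], "mail session": ["S0", "S2", "S3", "S5"],
    "read mail": ["M0"], "view log": ["H0"], "update log": ["H1"],
}

def demand(entry):
    """Resource demand vector D_i = sum over blocks of v_ij * d_ij (step 4)."""
    D = [0.0] * 9
    for b in entries[entry]:
        v, d = blocks[b]
        D = [x + v * y for x, y in zip(D, d)]
    return D

D_ms = demand("mail session")
# UWS statements: 1*100 + 0.1*10 + 2.8*10 + 1*10 = 139, as in Table 8;
# LAN accesses: 0.1 + 2.8 + 1 = 3.9; WAN accesses: 2.8.

# Step 5 unit conversion for the LAN device: each access carries a message
# (S2: view log record 4000 B, S3: mail message 2500 B, S5: update log 250 B).
lan_bits = 8 * (0.1 * 4000 + 2.8 * 2500 + 1 * 250)  # ~61200 bits, as in Table 9
```

Accesses to DB-type devices become transactions one-for-one, so those elements of D_i carry over unchanged into the DCP vector.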


Table 8
D_i for the example case study

LQN entry (i)  T (acc)  UWS (stat)  UWSDB (acc)  HWSA (stat)  HWSADB (acc)  HWSB (stat)  HWSBDB (acc)  LAN (acc)  WAN (acc)
User entry     1        5           0            0            0             0            0             0          0
Mail session   0        139         1            0            0             0            0             3.9        2.8
Read mail      0        0           0            0            0             420          2.8           0          0
View log       0        0           0            10           0.1           0            0             0          0
Update log     0        0           0            200          2             0            0             0          0

Table 9
DCP values for the example case study

LQN entry (i)  T (acc)  UWS (stat)  UWSDB (trans)  HWSA (stat)  HWSADB (trans)  HWSB (stat)  HWSBDB (trans)  LAN (bits)  WAN (bits)
User entry     1        5           0              0            0               0            0               0           0
Mail session   0        139         1              0            0               0            0               61200       56000
Read mail      0        0           0              0            0               420          2.8             0           0
View log       0        0           0              10           0.1             0            0               0           0
Update log     0        0           0              200          2               0            0               0           0

Fig. 17. Parametrized LQN for alternative MP1.


The resulting DCP values for the example case study are the device call labels (5, 139, 1, etc.) in Fig. 17.

MCPs are obtained, for each LQN entry, from the mean number of visits v_ij to each EEG calling block j associated with the LQN entry i. An EEG calling block is a basic block that represents a method call to a software server block. The mean number of visits to such a block corresponds to the number of calls from the LQN entry associated with the EEG calling block to the LQN entry associated with the EEG software server block. The resulting MCP values for the example case study are represented by the method call labels (1, 0.1, 2.8, 1) in Fig. 17. Their values are obtained from the v_ij values in Table 7 for blocks U0, S2, S3 and S5, respectively.

4. Application issues

This section gives a few details on the application of the method and on its size and complexity, and concludes with an example of model use.

4.1. Method size and complexity

It is of interest to the software engineer to have some details about the amount of work needed to translate the software project into a performance model. We shall assume the designer knows the size of his project, expressed in:
• the number of scenarios, each represented by a sequence diagram (SD) (see Section 3.2.1);
• a detailed view of each SD, including the number of interactions (horizontal arrows) and the number of class methods (UML-type activation boxes) (see Figs. 5 and 6);
• the total number of devices in the PC (see Section 3.1).

Let k be the number of scenarios, and let m_j be the number of methods and i_j the number of interactions in scenario j (j = 1, ..., k). Then

   $M = \sum_{j=1}^{k} m_j$

and

   $I = \sum_{j=1}^{k} i_j$

denote the total number of methods and interactions in the project, respectively. Besides, let r be the total number of devices.

It is easy to see that the global PG consists of O(M) nodes and O(I) arcs (see Sections 3.2.1 and 3.2.2), and thus its space complexity is O(M), since I and M are of the same order. On the other hand, O(M) also gives the time complexity for the generation of the global PG, since M methods in the SDs are to be visited to generate a corresponding number of PG nodes. In summary, O(M) is the time and space complexity of Steps 1 and 2 of the method (see Section 2). Step 3 is itself O(M) again, since according to Section 3.3 it is simply performed by a one-to-one translation of each global PG node into an EFG node. Step 4 requires somewhat more work, since it requires the production of an r-sized vector for each EFG node; it is thus again O(M), r being in general much lower than M.


Step 5 of the method finally requires again O(M) both in space and time. Indeed, let e denote the number of entries of the resulting LQN (see Section 3.5); such a number is nothing else than the total number of methods in the class diagram, which is obviously not larger than M. The space complexity of the LQN is by definition the space to accommodate e + r + c items, where r is the above-defined number of devices and c the number of LQN method calls plus device calls. It is easy to see that c is not larger than M + er, and thus the total LQN space complexity is O(M), as stated above. On the other hand, the time to perform the LQN generation (see Section 3.5.1) is the time to visit the O(M) nodes of the EEG and to generate, for each node, at most r device calls. Thus, the time complexity is again O(M). The further LQN parameterization of Section 3.5.2 can be performed in the course of the LQN generation itself, and thus does not increase the above time complexity of Step 5.

In conclusion, the amount of work the software engineer can expect the model generation algorithm to perform is of the same order of magnitude as the total number M of methods he can recognize in his project scenarios, and this is also the amount of space required to allocate the performance model.

4.2. Model use

Model evaluation is not the focus of this work. Nevertheless, in order to show the usefulness of the approach, in this section a few performance results are obtained that give a prediction of the performance of the mail system software project under development. As stated in Section 3.4, the generated LQN performance model in Fig. 17 relates to alternative MP1 (see Table 5). By introducing slight changes in the resource demands of columns LAN and WAN in Table 6, the procedure illustrated in Section 3.5 yields the similar LQN model for alternative MP2 (not illustrated here).
The MP1 and MP2 models can be used to predict which MP alternative (MP1 or MP2) will yield the shortest response time, i.e., the time to execute the System mail session illustrated in Fig. 14. The two models have been evaluated by using the LQNS tool [15] for several values of the number N of users (see Table 3), ranging from 2 to 10. For each given number of users, the average response time has been obtained (see Table 10).

Table 10
Average response time

No. of users  Response time (s), MP1  Response time (s), MP2
2             32.7                    22.0
3             49.4                    33.1
4             66.4                    44.2
5             83.8                    55.4
6             101.4                   66.6
7             119.0                   77.8
8             136.3                   89.1
9             154.4                   100.4
10            172.8                   111.9
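As a quick arithmetic check on Table 10, the MP1-to-MP2 response time ratio can be computed for every population size; the values below are copied from the table:

```python
# Response times (s) from Table 10, for N = 2..10 users.
users = list(range(2, 11))
mp1 = [32.7, 49.4, 66.4, 83.8, 101.4, 119.0, 136.3, 154.4, 172.8]
mp2 = [22.0, 33.1, 44.2, 55.4, 66.6, 77.8, 89.1, 100.4, 111.9]

# Ratio MP1/MP2 at each population size: MP2 is roughly 1.5x faster throughout.
ratios = [t1 / t2 for t1, t2 in zip(mp1, mp2)]
print(min(ratios), max(ratios))  # both close to 1.5
```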


Table 11
Resource capacity

Platform resource      Capacity
User workstation i     10^5 statements/s
HW server A            10^5 statements/s
HW server B            10^5 statements/s
LAN                    10^6 bits/s
WAN                    10^4 bits/s
User workstation DB    50 transactions/s
Mail DB                50 transactions/s
History DB             50 transactions/s
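As an illustration not spelled out in the paper, dividing the DCP values of Table 9 by the capacities of Table 11 gives rough per-device service demands for the mail session entry, which makes the dominance of the WAN visible despite its modest bit count:

```python
# Back-of-the-envelope sketch: service demand (s) = DCP / capacity.
# DCPs for the "mail session" entry from Table 9; capacities from Table 11.
dcp = {"UWS": 139, "UWSDB": 1, "LAN": 61200, "WAN": 56000}   # stat / trans / bits
capacity = {"UWS": 1e5, "UWSDB": 50, "LAN": 1e6, "WAN": 1e4} # per second

demand_s = {dev: dcp[dev] / capacity[dev] for dev in dcp}
# UWS: ~1.4 ms, UWSDB: 20 ms, LAN: ~61 ms, WAN: 5.6 s --
# the WAN demand dominates, consistent with the discussion in Section 4.2.
```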

As seen from Table 10, alternative MP2 performs 1.5 times better than alternative MP1. Such a result gives the software designer a quantitative evaluation of the relative importance of the History Mgr and Mail Mgr software modules from the performance point of view. In alternative MP2, the History Mgr is allocated to HWSB, which is reached by the WAN (see Fig. 3), while the Mail Mgr is reached by the LAN; and vice versa for alternative MP1. Since the Mail Mgr module is called more frequently than the History Mgr module (see Fig. 14), its relative importance from the performance point of view is higher. Indeed, in order to obtain a better performance (i.e., a lower response time) it is necessary to make the Mail Mgr, rather than the History Mgr, reachable by the LAN.

The tool also gives quantitative insights into the relative effects of the LAN and WAN. Indeed, despite the fact that the WAN capacity is 100 times lower than the LAN capacity (see Table 11), the impact on the response time is limited (a factor of only 1.5).

In conclusion, the software designer has been enabled to predict the impact of his design-time choices without being required to have any specific knowledge of queuing model theory. The analyst is only required to input his CASE documents into the model generator illustrated in Section 3, and then feed the obtained LQN model into the LQNS solver to obtain quantitative predictions.

5. Conclusions

The use of performance modeling requires large expertise and is thus usually unattractive to software developers. On the other hand, the prediction of software performance during the early phases of the lifecycle is of paramount importance to software validation activities, in other words, to assessing the ability of the final product to meet the user performance requirements. In order to enable software designers to introduce performance modeling into their current practice, methods are to be introduced for the automatic derivation of the performance model from CASE documents.

This paper has given a method that takes in input standard CASE software documents from the analysis and preliminary design phases and yields in output an LQN performance model ready for evaluation. The method is easy to use and can be effectively incorporated into modern object-oriented software development environments (UML-based, etc.) to encourage software designers to introduce lifecycle performance validation into their development best practices. It is presently available in prototype form and is being developed into a web version accessible to interested practitioners.


References

[1] S.S. Lavenberg, Computer Performance Modeling Handbook, Academic Press, New York, 1983.
[2] S.R. Schach, Classical and Object-oriented Software Engineering, WCB/McGraw-Hill, New York, 1999.
[3] C.U. Smith, Performance Engineering of Software Systems, Addison-Wesley, Reading, MA, 1992.
[4] C.U. Smith, L.G. Williams, Performance engineering evaluation of object-oriented systems with SPEED, in: R. Marie, et al. (Eds.), Computer Performance Evaluation Modelling Techniques and Tools, Lecture Notes in Computer Science, Vol. 1245, Springer, Berlin, 1997.
[5] G. Booch, J. Rumbaugh, I. Jacobson, Unified Modeling Language User Guide, Addison-Wesley, Reading, MA, 1997.
[6] J.A. Rolia, K.C. Sevcik, The method of layers, IEEE Trans. Softw. Eng. 21 (8) (1995) 689–700.
[7] C. Hrischuk, C.M. Woodside, J. Rolia, R. Iversen, Trace-based load characterization for generating software performance models, IEEE Trans. Softw. Eng. 25 (1) (1999) 122–135.
[8] G. Franks, A. Hubbard, S. Majumdar, D. Petriu, J. Rolia, C.M. Woodside, A toolset for performance engineering and software design of client–server systems, Perform. Eval. (special issue) 24 (1–2) (1996) 117–135.
[9] C.M. Woodside, A three-view model for performance engineering of concurrent software, IEEE Trans. Softw. Eng. 21 (9) (1995) 754–767.
[10] U. Herzog, A concept for graph-based process algebras, generally distributed activity times and hierarchical modelling, in: Proceedings of the Fourth Workshop on Process Algebra and Performance Modelling, Torino, Italy, July 1996.
[11] G. Iazeolla, A. D'Ambrogio, R. Mirandola, Software performance validation strategies, in: Performance'99 Conference, CRC Press, Boca Raton, FL, 1999.
[12] V. Cortellessa, R. Mirandola, Deriving a queuing network based performance model from UML diagrams, in: Proceedings of the Second International Workshop on Software Performance (WOSP2000), Ottawa, Canada, September 2000, pp. 58–70.
[13] C. Hrischuk, J. Rolia, C.M. Woodside, Automatic generation of a software performance model using an object-oriented prototype, in: Proceedings of the International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS'95), 1995, pp. 399–409.
[14] C.U. Smith, B. Wong, SPE evaluation of a client/server application, in: Proceedings of the Computer Measurement Group, Orlando, FL, December 1994.
[15] C.M. Woodside, S. Majumdar, J.E. Neilson, D.C. Petriu, J.A. Rolia, A. Hubbard, R.B. Franks, A guide to performance modelling of distributed client–server software systems with layered queuing networks, Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada, November 1995.
[16] A. D'Ambrogio, G. Iazeolla, M. Versari, Software performance model generation from non-executable design, Report RI.99.03, Laboratory for Computer Science, University of Roma, Tor Vergata, Roma, Italy, July 1999.

V. Cortellessa received his Laurea degree in computer science from the University of Salerno (Italy) in 1991 and his Ph.D. degree in computer engineering from the University of Roma at Tor Vergata (Italy) in 1995. Currently, he is a Research Assistant Professor at CEMR, West Virginia University (WV, USA), and he holds a graduate research fellowship at DISP, University of Roma at Tor Vergata (Italy). His research interests include performance modeling of software/hardware systems, software engineering, parallel simulation. He is a member of ACM.

A. D'Ambrogio is a researcher at the Department of Computer Science, Systems and Industrial Engineering, University of Roma at Tor Vergata (Italy). His research is in the fields of distributed object computing, web-based modeling and simulation, computer-supported cooperative work and software quality engineering. He is a member of ACM and IEEE.


G. Iazeolla is a Full Professor of Computer Science, Software Engineering Chair, Faculty of Engineering, University of Roma at Tor Vergata (Italy). His research is in the areas of software engineering and information system engineering in relation to system performance and dependability modeling and validation.
