Performance Engineering of Distributed Software Process Architectures

Greg Hills (1), Jerome Rolia (2) and Giuseppe Serazzi (3)

(1) School of Computer Science, Carleton University, Ottawa, ON, Canada, K1S 5B6
(2) Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada, K1S 5B6
(3) Dipartimento di Elettronica e Informazione, Politecnico di Milano, Italy

Abstract. An important goal of a system's development team is to provide a software structure that evolves gracefully with its workload's intensity and characteristics, and the technologies that support the system. We describe a computationally efficient technique that helps us recognize potential software bottlenecks in a distributed software system over a range of workload conditions. Using this technique, software changes needed to support the workload over time can be identified early. Support for these software changes can be planned in advance and built into the system's architecture. The engineering structures from the Reference Model for Open Distributed Processing (RM-ODP) are used as the basis for our software performance modelling. A case study is given that demonstrates how the proposed technique can be applied when implementing a distributed application in an environment such as the Open Software Foundation's (OSF) Distributed Computing Environment (DCE).

Keywords: bottlenecks, client-server, distributed applications, distributed systems, software architecture, software performance engineering

To appear in the proceedings of the 8th International Conference on Modelling Techniques and Tools for Computer Performance Evaluation '95, Heidelberg, Germany, September 1995.

1 Introduction

To benefit from more cost effective and flexible technologies, organizations are moving away from centralized systems and implementing distributed systems. The need to share information across organizational boundaries leads to open distributed processing systems. Since organizations rely on distributed systems to accomplish their everyday tasks and these applications can span organizational boundaries, understanding their performance behaviour is essential.

When designing a distributed application, both the software structure and the underlying hardware and software technologies that support the distributed system have an impact on performance. In this paper we consider how to study the impact of software structure and hardware technology on application performance. We describe the distributed application in terms of the engineering structures of the Reference Model for Open Distributed Processing (RM-ODP).

This allows us to model the application in terms of its individual components such as objects, operating system processes and nodes. In this way, our modelling environment provides enough abstraction to develop performance models for applications built using midware environments such as DCE [1] and the Object Management Group's (OMG) Common Object Request Broker Architecture [2] (CORBA).

Within a distributed application, software technologies can impose constraints or limits on the number of concurrently active customers that software resources (i.e. software components) such as objects, operating system processes and nodes are able to support. It should be noted that while hardware resources have a utilization less than or equal to one (100%), a software resource may have a utilization limit greater than one. For example, a critical section within an application would have a maximum utilization level of one: at most one customer is able to use the critical section at a time. On the other hand, a database subsystem may permit up to n concurrently executing transactions at a time; in this case the utilization limit is n. In DCE, the number of concurrent threads that support a server's interface is fixed when the process starts to accept calls. At the node level, there may be limits on the total number of concurrent processes or threads supported by the node. If these software technology constraints, i.e. utilization limits, are approached then software queueing delays arise or requests for service are lost. Software queueing delays can increase a request's response time and limit an application's ability to fully exploit its underlying hardware. Under these scenarios, customers queue at the software resources instead of queueing at devices. A software server is a bottleneck when it reaches its utilization limit.

Obviously, hardware can also limit the performance of a distributed application. The service rates of the hardware associated with the application impose limits on the throughput of the requests. Once the hardware resources become saturated, the distributed system has reached its maximum feasible throughput (based on a physical viewpoint only).

A scalable architecture for a distributed system should have the ability to support workloads that change in both intensity and characteristics, yet be able to fully reflect the performance gains offered by new technologies. In order to achieve software performance scalability we must recognize potential software bottlenecks under various workload scenarios and plan for the changes needed to avoid them as the system evolves. In this paper we describe an efficient approach to recognize potential software bottlenecks over a wide range of workload scenarios and show how this information can be used early in the development process.

First we consider how our approach relates to the Software Performance Engineering approach described in [3] and the Timebench toolset [4, 5]. In [3], execution graphs are created for the dominant request types and are used to relate the requests to their resource demands. The graphs are comprised of the components used by the requests and the branches and loops that connect them. The components are blocks of program statements and procedures that perform functions for the system. We borrow from this approach but deal with higher level software components. As in [3], Timebench views software components as blocks of program statements and associates them with operating system processes. Timebench is used to consider how to distribute the processes across a network. In this paper, we present a tool with an information model that is appropriate for object based systems. The lowest level software component we consider is an object's method (subprogram). In our case, this is appropriate since we focus on the configuration of operating system processes from objects and the distribution of the processes across a network rather than principles of algorithmic design in general.

In Section 2 we discuss a technique that can be used by a development team to identify potential software bottlenecks in a distributed application. Section 3 describes the Distributed Application System Sizer (DASS) tool. DASS has an information model based on the engineering structures of the reference model for ODP and helps us explore the impact of alternative strategies for deploying objects across a distributed system. For a given workload, DASS can be used to determine whether the utilization of any software resource approaches or exceeds its specified limit. To consider the software performance scalability question, we must evaluate the impact of many different workloads on software resource utilization. Instead of an exhaustive search to find the worst case resource utilizations for an experimental selection of workload scenarios, Section 4 selects a small set of cases that bound the utilization of the software components. Conclusions are offered in Section 5.

2 Performance Engineering

In general, the purpose of a sizing exercise is to ensure that the hardware provided with a system has sufficient capacity to meet the needs of its users and that loads are acceptably balanced across nodes. In this paper we also consider the impact of load on software resources. If the utilization of a software resource approaches its limit as specified in the system's software technology constraints, software queueing delays arise and will strongly affect the performance of the system. If we ignore these effects, the results of our usual sizing studies will be incorrect. Our purpose is to find a distribution, i.e. allocation, of application components that reduces the likelihood of such delays.

We have adopted the engineering structures of the Reference Model for Open Distributed Processing [7, 8] to model our distributed system. RM-ODP provides several levels of abstraction to partition the application's components across a distributed system. Figure 1 shows the relationships of its components, which include:

- Basic Engineering Objects (BEOs): these objects are the building blocks of the application. They contain the procedures or methods (subprograms) that perform the processing in the system;
- Clusters: groupings of related objects. Clusters are an abstraction that helps us in reconfiguring an application and have no other impact on performance;
- Capsules: the operating system processes required by the application; and
- Nodes: the stations in a distributed environment.

BEOs, clusters, and capsules are our software components.
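
To make the engineering structure concrete, the following sketch shows one way the three levels of allocation could be represented in code. It is our own illustration, not part of DASS or RM-ODP; all class and field names (including the utilization_limit attribute) are assumptions made for the example.

```python
# A minimal sketch of the RM-ODP engineering structures used in this paper.
# The names BEO, Cluster, Capsule and Node follow the text; everything else
# is an illustrative assumption.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BEO:                      # Basic Engineering Object
    name: str
    methods: List[str]          # method (subprogram) names

@dataclass
class Cluster:                  # grouping of related objects
    name: str
    beos: List[BEO] = field(default_factory=list)

@dataclass
class Capsule:                  # an operating system process
    name: str
    clusters: List[Cluster] = field(default_factory=list)
    utilization_limit: Optional[float] = None   # software technology constraint

@dataclass
class Node:                     # a station in the distributed environment
    name: str
    capsules: List[Capsule] = field(default_factory=list)

# The allocation choices (objects to clusters, clusters to capsules,
# capsules to nodes) are expressed simply by nesting:
inventory = BEO("Inventory", ["inventory"])
cluster1 = Cluster("Cluster 1", beos=[inventory])
capsule1 = Capsule("ServerProcess", clusters=[cluster1], utilization_limit=4.0)
node3 = Node("Node 3", capsules=[capsule1])
```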

Fig. 1. Engineering structures of RM-ODP

These engineering structures have sufficient abstraction to support the modelling of different distributed environments such as IBM's CICS [9] transaction processing environment, DCE and CORBA. A CICS application that resides on a mainframe is centered around one operating system process that executes user defined programs. This can be modelled using one capsule, with the user programs becoming the BEOs. In DCE and CORBA, application processes are the capsules and the objects' methods that provide services to other processes are the interfaces for these capsules. With this engineering structure, we are able to make the following design choices for allocation and hence distribution:

- the allocation of objects to clusters,
- the allocation of clusters to capsules, and
- the allocation of capsules to nodes.

Our approach to the performance evaluation of a system is to identify those requests expected to affect performance most [3] and to aggregate their resource consumption by software component. As a request travels through the system, it makes visits to methods within objects, and these methods make visits to other methods and demands on the system's resources. For the performance analysis we require as input the following visit ratios and service demands on a per request basis:

- the number of visits each object's methods make to other objects' methods (only the methods that are utilized in the request are counted), and
- the hardware resource demands made by each of these methods: the number of CPU instructions and the number of disk operations.

It should be noted that the allocation of the components and the number of visits between methods affect the communication costs of the distributed system. If methods that communicate frequently are placed in different capsules or on different nodes, the costs for the communications between these methods will be different. These varying costs must be taken into account when modelling a distributed system. This is discussed further in Section 3.1.

We also need hardware technology parameters and software technology constraints to calculate the resource demands of the various requests. These include:

- the CPU service rate,
- the average disk service times, and
- the utilization limits for the software resources (objects, clusters and capsules); these define our software technology constraints.

To conduct a performance study of a distributed system, the throughputs of each of the request types are also needed. In this paper we consider closed models; the choice of throughputs used in the analysis is discussed in Section 4.2. Given the throughput X_r of request type r and its demand D_ir at resource i, its utilization of resource i is given by the utilization law:

    U_ir = X_r * D_ir.    (1)

The resource can be either a hardware or a software resource. We define a workload scenario as a set of request types and their throughputs. For a workload scenario, the utilization law provides hardware and software utilization levels for the distributed system. Software resources that approach or exceed their maximum utilization level specified in the software technology constraints can therefore be identified. Section 3.2 introduces an example application's computational model and we show how its mapping onto a distributed system can be studied using DASS.

For a given hardware configuration and allocation of software components over the hardware, our task is to test whether the software technology constraints are satisfied over a full range of workload scenarios, i.e., with the different workload mixes and intensities. If not, we must re-configure the software components until the constraints are satisfied. An ad hoc approach to verifying the software technology constraints is to formulate an experimental design that checks many different combinations of request loads on the system. Unfortunately, a very large number of experiments are needed to gain confidence in the results. In Section 4.2 we discuss a more efficient technique that limits the number of test cases needed to identify potential software bottlenecks.
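
As a small illustration of how equation (1) and the software technology constraints can be checked for one workload scenario, consider the following sketch. The throughput and demand values, the limit, and the check_limits helper are illustrative assumptions of ours, not output from DASS.

```python
# Sketch: apply the utilization law U_ir = X_r * D_ir and flag software
# resources whose total utilization approaches or exceeds its limit.
# All numbers below are illustrative.

def utilizations(throughputs, demands):
    """throughputs: {request_type: X_r in requests/sec}
       demands:     {resource: {request_type: D_ir in sec per request}}
       returns      {resource: U_i = sum over r of X_r * D_ir}"""
    return {
        res: sum(throughputs.get(r, 0.0) * d for r, d in per_req.items())
        for res, per_req in demands.items()
    }

def check_limits(util, limits, margin=0.9):
    """Report resources whose utilization reaches margin * limit or more."""
    return [res for res, u in util.items()
            if res in limits and u >= margin * limits[res]]

throughputs = {"small": 20.0, "large": 8.0, "return": 10.0}      # X_r
demands = {                                                       # D_ir
    "capsule1": {"small": 0.05, "large": 0.30, "return": 0.04},
    "cpu3":     {"small": 0.018, "large": 0.038, "return": 0.0115},
}
limits = {"capsule1": 4.0}        # software technology constraint

u = utilizations(throughputs, demands)
print(u)                           # capsule1 utilization can exceed one
print(check_limits(u, limits))     # here capsule1 approaches its limit of 4
```

Note that the hardware resource in the example stays below a utilization of one, while the software resource legitimately exceeds one; only its specified limit matters.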

3 Performance Engineering Tool

In this section we describe the Distributed Application System Sizer (DASS) and show how it can be used to conduct a performance study. The tool stores and manages the information needed to estimate performance measures. The example shows how a distributed system can be modelled using DASS.

3.1 Distributed Application System Sizer

The information needed to perform an analysis of a distributed system has to be stored in a way that can be easily accessed and managed. This section describes the Distributed Application System Sizer (DASS), which can be used to store the data required to perform the analysis. DASS maintains the relationships between requests and the individual software components and helps an analyst study alternative configurations of an application. Each configuration for that application is defined as a separate model within DASS. DASS maintains the visit ratios and service demands for requests and the technology constraints for hardware and software resources. It also supports the engineering structures of RM-ODP to link BEOs to clusters, clusters to capsules and capsules to nodes. Since DASS is aware of which objects are on the same node, DASS is able to automatically manage the resource demands needed to support communication mechanisms [10]. If an object is moved, the resource demands needed to communicate with other objects are automatically updated.

Once a distributed application has been described, DASS provides the following performance measures:

- demands by request,
- demands by software components (methods, BEOs, clusters or capsules), and
- once a workload scenario is specified, utilization levels of software and hardware components.

A list of the software or hardware components that approach or exceed their utilization limits as defined by the software and hardware technology constraints is also produced.

Fig. 2. A capsule utilization graph produced by DASS

Figure 2 gives an example of a capsule utilization graph produced by DASS with a workload comprised of three request types. The utilization levels of a capsule are given for several workload scenarios. Some of the workloads consist of all three types of requests while others have only one or two request types. If the capsule shown in Figure 2 had a limiting utilization of five due to the software technology constraints, we would have to modify its software structure to satisfy this constraint under workload scenarios 11 and 12. This could be done by moving some of the objects from the overly utilized capsule to a less utilized capsule; a simple sketch of such a rebalancing step is given below.

The DASS tool integrates with other analytic tools, namely an asymptotic bounds solver [11, 15] and a Layered Queuing Model (LQM) solver [4, 12]. The bounds solver is used in this paper to find the asymptotic utilization of objects, clusters, capsules and devices over all possible ratios of request loads. The LQM solver can be used for analytic and simulation modelling of the system to estimate the average response times of requests and queue lengths at resources. The LQM solver's estimates take into account contention for software resources such as capsules as well as hardware devices. Next we consider a sample application that is used in our case study.
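
The reconfiguration step mentioned above, moving objects out of an over-utilized capsule, can be sketched as a simple greedy move. This is our own illustration of the idea, not DASS functionality, and the per-object utilization contributions and limits are made-up values.

```python
# Sketch: if a capsule exceeds its utilization limit under some workload
# scenario, move its heaviest object to the least-utilized other capsule.

def rebalance(capsules, limits):
    """capsules: {capsule: {object: utilization contribution}}
       limits:   {capsule: utilization limit}
       Moves one object out of the first over-limit capsule found."""
    for cap, objs in capsules.items():
        if sum(objs.values()) <= limits[cap]:
            continue
        obj = max(objs, key=objs.get)                 # heaviest object
        target = min((c for c in capsules if c != cap),
                     key=lambda c: sum(capsules[c].values()))
        capsules[target][obj] = objs.pop(obj)
        return f"moved {obj} from {cap} to {target}"
    return "no change needed"

capsules = {"capsule1": {"purchase": 2.1, "return": 0.9, "inventory": 2.4},
            "capsule2": {"warehouse": 1.2}}
limits = {"capsule1": 5.0, "capsule2": 3.0}
print(rebalance(capsules, limits))   # moves 'inventory' to capsule2
```

With these made-up values, the sketch happens to move the inventory object, which mirrors the difference between the two implementation models studied in Section 3.2.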

3.2 Example Application

To analyze the performance of a distributed application, we first consider its computational model. Figure 3 illustrates the computational model of a simple retail store. We have six objects in the system: client (user), purchase, return, credit, inventory and warehouse objects (U, P, R, C, I, W respectively), each containing one method with the same name.

Fig. 3. Computational model of a simple retail store

There are three types of requests which can be performed in the system: small purchase, large purchase and return merchandise. With a small purchase request a client buys an average of two items, with a large purchase request a client buys an average of ten items, and with a return merchandise request a client returns an average of one item.

During a purchase request, the purchase method calls the credit method and then calls the inventory method. The inventory method calls the warehouse method to update the warehouse database; the number of calls between the inventory and warehouse methods corresponds to the number of items being purchased. A return merchandise request executes in much the same way as a purchase request, the only difference being that there is no call to the credit method.

To move from the computational model to an implementation of the system, we need to decide how to distribute the software resources across the system. Also, for each request type, we must specify the visits between each of the methods and their resource demands. Figures 4 and 5 give two implementation models for the proposed system. In each of the models, the system has four nodes with one capsule per node. Nodes 1 and 2 are client nodes which submit the requests to the system. In the first model, node 3 has one capsule containing a cluster that includes the purchase, return and inventory objects, and a second cluster that includes the check credit object. Node 4 has one capsule containing a single cluster that includes the warehouse object. The second model is similar to the first, except that the inventory object is moved from the capsule on node 3 to the capsule on node 4.

Fig. 4. Implementation Model 1

Fig. 5. Implementation Model 2

Each type of request causes visits between methods, and in turn each of the methods in the application makes demands on the system's resources. Using these resource demands and the visits between methods for each type of request, the DASS tool calculates the demands on each of the different software resources and node devices by request type. Factors that affect the demands on system resources are the resource requirements of the methods and their need for communication. In this example, we assume that each communication between methods on different nodes requires 25000 CPU instructions per method and each communication between methods on the same node requires 1000 CPU instructions per method. Table 1 gives the resource requirements of each of the methods, Table 2 gives the service rates and service times of the node devices, and Table 3 lists the request types and the number of visits between methods for the requests. In this example, we assume the resource requirements for each method are the same for each request.

Method      CPU instructions   Disk I/Os
Purchase    55000              3
Return      55000              3
Inventory   35000              0
Credit      30000              2
Warehouse   15000              2

Table 1. Resource requirements

Node     CPU Service Rate (MIPS)   Disk Service Time (sec)
Node 1   5                         NA
Node 2   5                         NA
Node 3   10                        0.01
Node 4   5                         0.01

Table 2. Node service parameters

Request Type     Calling Method   Called Method   Visits
Small Purchase   Client           Purchase        1
                 Purchase         Credit          1
                 Purchase         Inventory       1
                 Inventory        Warehouse       2
Large Purchase   Client           Purchase        1
                 Purchase         Credit          1
                 Purchase         Inventory       1
                 Inventory        Warehouse       10
Return           Client           Return          1
                 Return           Inventory       1
                 Inventory        Warehouse       1

Table 3. Method visits

Table 4 shows the demands on the resources of nodes 3 and 4 aggregated by request type. Resource demands at nodes 1 and 2 are not considered. These nodes support user interfaces and can easily be upgraded by adding more user interface nodes to the network. Given the throughputs for each of the different request types, the utilizations of the software resources can be calculated using the utilization law (1). By modifying the allocation of software resources, the tool can be used to balance loads as the system's workload and underlying hardware change.

Model   Request Type     CPU3     CPU4    Disk3   Disk4
1       Small Purchase   0.018    0.016   0.02    0.013
1       Large Purchase   0.038    0.08    0.02    0.067
1       Return           0.0115   0.008   0.01    0.0067
2       Small Purchase   0.011    0.018   0.02    0.013
2       Large Purchase   0.011    0.042   0.02    0.067
2       Return           0.008    0.015   0.01    0.0067

Table 4. Resource demands by request type in seconds
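
The kind of aggregation behind Table 4 can be sketched roughly as follows for the CPU demands of a small purchase request under Model 1's placement. The accounting of communication overhead here (charging the fixed per-method cost on both the caller's and the callee's node for every visit) is a simplifying assumption of ours, so the numbers it produces will not reproduce the tool's output in Table 4 exactly.

```python
# Rough sketch of aggregating per-request CPU demand by node (seconds),
# in the spirit of Tables 1-4.  The communication-cost accounting is a
# simplifying assumption, not the tool's exact method.

METHOD_CPU = {"Purchase": 55000, "Return": 55000, "Inventory": 35000,
              "Credit": 30000, "Warehouse": 15000}
NODE_OF = {"Client": 1, "Purchase": 3, "Return": 3, "Inventory": 3,
           "Credit": 3, "Warehouse": 4}                    # Model 1 placement
MIPS = {1: 5e6, 2: 5e6, 3: 10e6, 4: 5e6}                   # instructions/sec
REMOTE_COMM, LOCAL_COMM = 25000, 1000                       # instr per method

# (caller, callee, visits) for a small purchase request (Table 3)
SMALL_PURCHASE = [("Client", "Purchase", 1), ("Purchase", "Credit", 1),
                  ("Purchase", "Inventory", 1), ("Inventory", "Warehouse", 2)]

def cpu_demand_by_node(calls):
    demand = {n: 0.0 for n in MIPS}
    for caller, callee, visits in calls:
        comm = LOCAL_COMM if NODE_OF[caller] == NODE_OF[callee] else REMOTE_COMM
        # method execution is charged to the callee's node
        demand[NODE_OF[callee]] += visits * METHOD_CPU[callee] / MIPS[NODE_OF[callee]]
        # communication cost charged per method, i.e. on both nodes (assumption)
        demand[NODE_OF[caller]] += visits * comm / MIPS[NODE_OF[caller]]
        demand[NODE_OF[callee]] += visits * comm / MIPS[NODE_OF[callee]]
    return demand

print(cpu_demand_by_node(SMALL_PURCHASE))   # per-node CPU seconds, Model 1
```

With these assumptions the sketch yields roughly 0.02 s on Node 3's CPU and 0.016 s on Node 4's CPU for a small purchase, close to the corresponding entries in Table 4.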

In doing the calculation of software resource utilizations, we note that an object's utilization includes its utilization of other objects. This is because a thread of control must be maintained by the calling object. Figure 6 illustrates the flow of control between objects in a return merchandise request. We assume that the inventory object is waiting while its requests are being processed by the warehouse object. Therefore, the resource demands of the warehouse object affect the utilization of the inventory object.

Fig. 6. Flow of control of a return request
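
This roll-up, in which an object's demand includes the demand of the objects it calls synchronously while it holds its thread of control, can be sketched as a small recursion. The call graph follows the return request loosely; the per-object demand values are illustrative assumptions only.

```python
# Sketch: an object's effective demand (and hence utilization) includes the
# demand of the objects it calls, because its thread of control is held
# while it waits for them.  Values below are illustrative only.

OWN_DEMAND = {"Return": 0.012, "Inventory": 0.004, "Warehouse": 0.005}  # sec
CALLS = {"Return": [("Inventory", 1)],          # (callee, visits)
         "Inventory": [("Warehouse", 1)],
         "Warehouse": []}

def rolled_up_demand(obj):
    """Own demand plus the rolled-up demand of every synchronous callee."""
    total = OWN_DEMAND[obj]
    for callee, visits in CALLS[obj]:
        total += visits * rolled_up_demand(callee)
    return total

for o in OWN_DEMAND:
    print(o, round(rolled_up_demand(o), 4))
# The Return object's demand includes Inventory's, which includes Warehouse's;
# multiplying by the request throughput gives the inclusive utilization.
```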

4 Workload Scenarios

Before we can estimate the utilization levels of software and hardware components we must select which workload scenarios (i.e. request types and their throughputs) must be considered. Section 4.1 describes techniques for estimating throughputs in closed models. Section 4.2 describes the asymptotic bound analysis technique that we employ to minimize the number of workload scenarios used in the performance study.

4.1 Workload Selection

As a first approach, we can base the scenarios on measurements from an existing system and/or on the system's quality of service requirements. Using DASS, we then compute the utilization measures for both software and hardware components under these anticipated workloads. This is a good way to compare alternative designs. However, these throughput levels will not necessarily help us understand a design's worst case software component utilizations. We would like to consider the worst case so that our chosen system design is less sensitive to assumptions about and/or changes to workload behaviour.

In general, the approach we take is to plan the allocation of objects across our distributed system so that the number of software bottlenecks is minimized. The ideal situation is to choose an allocation that avoids software bottlenecks completely. With this strategy, we look towards throughput bound analysis techniques for closed models that do not take into account software blocking. They give us the highest request throughputs and hence software component utilizations. The component utilizations can be compared with their limits. If a limit is exceeded, the software should be reconfigured to avoid the likelihood of a software bottleneck.

The question we address in this section is: what are the combinations of requests that cause the highest utilizations of software components? An ad hoc approach would be to randomly check many workload scenarios and search for maximum software component utilizations. This would of course be time consuming and would not provide us with much confidence in the results. However, since we assume that we have no software bottlenecks, the highest throughputs will be achieved when some combination of devices saturates. This point can be used to help us find a minimal set of workload scenarios that are of interest. In [11, 15] a bounds technique is provided that takes as input customer (request) class demands at devices. It calculates the ratios of populations that cause combinations of devices in the system to saturate. Given these population mixes and the devices that saturate, we calculate the asymptotic throughputs for the request classes. The population mixes and the corresponding request throughputs define our workload scenarios. The calculation of throughputs does not take into account software bottlenecks, but is appropriate with our approach.

Once the bottlenecking phenomena have been reduced, further bounding techniques [14] or mean value analysis techniques [12] that take into account software blocking could be considered. In [14] a bounds technique is provided that accepts as input the populations of the customer classes and the classes' demands at devices. It calculates the asymptotic bounds on the throughput of the classes, taking into account the software blocking due to software technology constraints. This is an appropriate approach for understanding the impact of software bottlenecks on throughputs and could be applied once the workload scenarios of most interest are known.

4.2 Asymptotic Bounds Analysis for Workload Selection

To start our software analysis, we assume that the software technology constraints are all satisfied and that there are no software bottlenecks. In this paper we associate each request type with a single customer class. Consider a workload scenario described by a vector β that defines the fraction of total customers that issue each type of request. For example, if we have a system with only two types of requests and β = (0.7, 0.3), 70% of the customers will issue type one requests and 30% will issue requests of the second type. Since we assume that there are no software bottlenecks, increasing the total number of customers, and hence the throughput, of a workload scenario will cause some subset of devices to saturate. In this paper, our purpose is to choose a set of β vectors that can be used as workload scenarios to test whether a system's software technology constraints are indeed satisfied.

To find the set of β vectors, we adopt an asymptotic bound analysis technique for closed multi-class models [11, 15]. The technique takes as input the resource demands of each request type at each device and creates a series of systems of linear equations that express the bounding relationships between the demands of each request type at the devices and the utilization of the devices. By setting the utilizations of selected devices to one, the solution of the system of linear equations gives the mixes of customer populations that cause the combination of devices to saturate. Figure 7 illustrates all possible workload mixes for a system with two types of requests and two devices. For a fixed number of requests in the system, all the possible request mixes are described by the line of equation β_1 + β_2 = 1. By considering the saturation of each combination of devices, the analysis technique finds the set of β vectors that are the crossover points of the device bottlenecks. These define our workload scenarios for the software analysis.

Note that the total throughput changes monotonically, either increasing or decreasing, between two consecutive crossover points if the request type mix is moving in a single device saturation sector. If the mix is in a common saturation sector, where more than one device saturates concurrently, the global throughput remains constant. Thus, since the total throughput only changes at these crossover points, the maximum asymptotic utilization of each software resource must occur at some subset of these workload scenarios. Therefore, it is no longer necessary to conduct an experimental search for the maximum utilization of software resources. This is an efficient method to direct a performance study for a distributed application.

Fig. 7. Bottleneck crossover points for a two request type/two device system
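
For the two-class case of Figure 7, the crossover mix at which both devices saturate together can be found from the condition that the aggregate demands at the two devices are equal. The sketch below illustrates this and the resulting asymptotic throughputs; it is a simplified two-class illustration of the bounding idea in [11, 15], with made-up per-class device demands, and is not the general algorithm.

```python
# Sketch (two request classes, two devices): find the population mix
# beta = (b1, b2), with b1 + b2 = 1, at which both devices saturate
# together, then the asymptotic per-class throughputs at that mix.
# Demands D[device] = (class 1, class 2) below are illustrative, in seconds.

D = {"dev1": (0.10, 0.02),     # class 1 is heavier on device 1
     "dev2": (0.03, 0.08)}     # class 2 is heavier on device 2

def crossover(d1, d2):
    """Solve b1*d1[0] + (1-b1)*d1[1] = b1*d2[0] + (1-b1)*d2[1] for b1."""
    num = d2[1] - d1[1]
    den = (d1[0] - d1[1]) - (d2[0] - d2[1])
    b1 = num / den
    return (b1, 1.0 - b1) if 0.0 <= b1 <= 1.0 else None

def asymptotic_throughputs(beta):
    """With no software bottleneck, the saturated device limits throughput:
       total X = 1 / max_k sum_r beta_r * D[k][r]; then X_r = beta_r * X."""
    dmax = max(sum(b * d for b, d in zip(beta, dev)) for dev in D.values())
    total = 1.0 / dmax
    return tuple(b * total for b in beta)

beta = crossover(D["dev1"], D["dev2"])
print("crossover mix:", beta)
print("asymptotic throughputs at crossover:", asymptotic_throughputs(beta))
```

With more classes and devices, the technique in [11, 15] considers each combination of devices that can saturate together, which leads to larger linear systems; the resulting crossover mixes and their asymptotic throughputs become the workload scenarios used below.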

The asymptotic throughputs of our workload scenarios are used by DASS to calculate the software resources' corresponding asymptotic utilizations. The utilizations can be compared with their limits as specified in the software technology constraints. Figure 8 shows the asymptotic capsule utilizations of the two alternative implementations of the simple retail store. The workload scenarios are found by applying the bounds analysis technique to each alternative. The first model has 12 workload cases, the second model has 11. Workload scenario 1 for Capsule 1 in Model 1 has its largest asymptotic utilization with contributions from all three request types [4], whereas Capsule 1 in Model 2 has its largest asymptotic utilization with workload scenario 6, which is comprised of only the large and small purchase request types.

Fig. 8. Simple retail store: capsule utilizations for implementation Models 1 and 2

For this example we assume that Capsules 1 and 2 have utilization limits of four and three respectively. Model 1 approaches its limit for Capsule 1 in scenarios 1, 5, 7, 9 and 11. Model 2's utilization level of Capsule 1 does not approach the limit. The maximum asymptotic throughputs of the large, return and small requests for Models 1 and 2 are (12.5, 87, 50) and (14.9, 66.7, 50) completions per second, respectively. These results indicate that both models provide equivalent maximum throughput for small requests, Model 1 is better suited for return requests, and Model 2 has the better throughput for large purchase requests. In general we prefer Model 2 since it never approaches its capsule utilization limits. However, it should be noted that in Model 1 the utilization limit of Capsule 1 is not approached when we have a large portion of return requests [5]. If we expect return requests to dominate the system, it could be reconfigured to Model 1 to achieve the higher throughput.

[4] Note: it is only slightly larger than several of the other scenarios.
[5] See workload scenarios 2, 4 and 6 of Model 1, Capsule 1.

In general, if resource utilization levels approach or are greater than their limits, we expect software queueing problems and possibly software bottlenecks. We can then use DASS to reconfigure the system. Options are to replicate or reallocate capsules, clusters, or objects, or to partition them. We iterate between DASS and the bounds analysis until a configuration is found that satisfies its throughput and software technology constraints over all workload conditions. If no single configuration satisfies all workload combinations, the system's software architecture should be chosen to support the changes needed to easily reconfigure the software to support the various workload conditions.
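
The iteration between the bounds analysis and DASS described above could be organized roughly as in the following sketch. The helper functions named here (generate_scenarios and software_utilizations) are hypothetical stand-ins for the bounds solver and the utilization-law calculation, not real DASS or solver APIs.

```python
# Sketch of the configuration loop: evaluate each candidate configuration
# against all bound-derived workload scenarios and return the first one
# whose software resources stay within their limits.

def find_feasible_configuration(configurations, limits,
                                generate_scenarios, software_utilizations):
    """configurations: iterable of candidate allocations of objects/clusters/
       capsules; limits: {software resource: utilization limit}.
       generate_scenarios(config) yields the crossover workload scenarios;
       software_utilizations(config, scenario) returns {resource: U}."""
    for config in configurations:
        ok = all(
            u <= limits.get(res, float("inf"))
            for scenario in generate_scenarios(config)
            for res, u in software_utilizations(config, scenario).items()
        )
        if ok:
            return config
    return None   # no single configuration satisfies all workload combinations
```

A return value of None corresponds to the case above in which the architecture must instead be designed to switch easily between configurations as the workload mix changes.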

5 Conclusion and Future Work

In this paper we consider the performance implications of the configuration of a distributed application. We introduce the idea of a software technology constraint that limits the concurrency permitted within a software resource and gives a simple definition of a software bottleneck. Using an asymptotic bounds analysis technique and a sizing tool called DASS, we are able to choose a relatively small set of test cases, which we call workload scenarios, to verify whether our software technology constraints are satisfied. This small set of test cases is sufficient to test the full range of workloads. An example is given to illustrate the technique.

Studying alternate implementation models of distributed systems allows us to find configurations that best support the system's anticipated workloads. The results of the study can be used to identify the changes in a system's implementation needed to switch between configurations that best support different workloads. Support for these changes should be built into the system's software architecture to help decrease future maintenance costs. The technique can also be used to explore the impact on software performance of order of magnitude changes in the service rates of hardware and software technologies. For example, a tenfold speedup in network performance may have a significant impact on strategies to partition software resources. A system that does not violate its software technology constraints for any workload scenario is described as software performance scalable. It has no software bottlenecks and is fully able to take advantage of performance gains offered by anticipated hardware technologies. By helping to study the impact of such changes on software performance behaviour, this work supports the study of software architectures for open distributed systems.

Future work includes investigating how well the asymptotic utilizations computed for software components bound their utilization under normal operating conditions. The demands that contribute to a component's utilization have not been inflated to account for queueing delays at devices and other software resources. These delays can be estimated using MVA techniques for distributed software [4, 12] once a reasonable configuration and distribution for the components has been found.

6 Acknowledgements

The authors gratefully acknowledge the funding provided for this work by the IBM Canada CORDS and MANDAS projects, the Natural Sciences and Engineering Research Council of Canada, and by the Italian MURST through 40% and 60% Projects.

References

1. Open Software Foundation, "Introduction to OSF DCE," Prentice Hall, 1992.
2. Object Management Group and X/Open, "The Common Object Request Broker: Architecture and Specification," Object Management Group and X/Open, Framingham, MA and Reading, Berkshire, UK, 1992.
3. C.U. Smith, "Performance Engineering of Software Systems," Addison-Wesley, August 1990.
4. G. Franks, A. Hubbard, S. Majumdar, D. Petriu, J. Rolia, C.M. Woodside, "A Toolset for Performance Engineering and Software Design of Client-Server Systems," SCE Technical Report SCE-94-14, Carleton University, Ottawa, Canada, June 1994. To appear in a special issue of the Performance Evaluation Journal.
5. R.J.A. Buhr, G.M. Karam, C.M. Woodside, R. Casselman, R.G. Franks, H. Scott, and D. Bailey, "TimeBench: a CAD Tool for Real-Time System Design," Proceedings of the 2nd International Symposium on Environments and Tools for Ada (SETA2), Washington D.C., January 1992.
6. K.A. Raymond, "Reference Model of Open Distributed Processing: a Tutorial," Open Distributed Processing, II (C-20
7. ISO/IEC JTC1/SC21/WG7 N885, "Reference Model for Open Distributed Processing - Part 1: Overview and Guide to Use," November 1993.
8. ISO/IEC 10746-2, "Basic Reference Model of Open Distributed Processing - Part 2: Descriptive Model," July 1993.
9. Jim Gray, Andreas Reuter, "Transaction Processing: Concepts and Techniques," Morgan Kaufmann Publishers, San Mateo, CA, 1993.
10. E. Pozzetti, V. Vetland, J.A. Rolia, G. Serazzi, "Characterizing the Resource Demands of TCP/IP," To appear in the Proceedings of the International Conference on High-Performance Computing and Networking (HPCN 95), Springer Verlag, May 1995.
11. G. Balbo, G. Serazzi, "Asymptotic Analysis of Multiclass Closed Queuing Networks: Common Bottlenecks," to appear in Performance Evaluation Journal, North Holland, 1995.
12. J.A. Rolia, "Software Performance Modelling," CSRI Technical Report 260, University of Toronto, Canada, January 1992.
13. C.E. Hrischuk, J. Rolia, C.M. Woodside, "Automatic Generation of a Software Performance Model Using an Object-Oriented Prototype," Proceedings of the International Workshop on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS'95), pp. 399-409.
14. C.M. Woodside, S. Majumdar, "Robust Bounds and Throughput Guarantees for General Closed Multiclass Queuing Networks," SCE Technical Report SCE-94-05, Carleton University, Ottawa, Canada, January 1994.
15. G. Balbo, G. Serazzi, "Asymptotic Analysis of Multiclass Closed Queuing Networks: Multiple Bottlenecks," Technical Report 93-094, Politecnico di Milano, EECS Dept., 1993.
