Performance Property Prediction Supporting Variability for Adaptive Mobile Systems

Gunnar Brataas∗, Shanshan Jiang
SINTEF ICT and ∗IDI, NTNU, Trondheim, Norway
[email protected] [email protected]

Roland Reichle, Kurt Geihs
University of Kassel, Germany
[email protected] [email protected]

ABSTRACT

A performance property prediction (PPP) method for component-based self-adaptive applications is presented. Such performance properties are required by an adaptation middleware for reasoning about adaptation activities. Our PPP method is based on the Structure and Performance (SP) framework, a conceptually simple, yet powerful performance modelling framework based on matrices. The main contribution of this paper is the integration of SP-based PPP into a comprehensive model- and variability-based adaptation framework for context-aware mobile applications. A meta model for the SP method is described. The framework is demonstrated using a practical example.

Categories and Subject Descriptors C.4 [Performance of Systems]: Modelling Techniques

General Terms Performance, Measurement

Keywords Autonomic computing, Mobile systems

1. INTRODUCTION

Mobile computing creates a need for context-aware, adaptive applications that can be reconfigured at run-time [16]. Utility functions express the rationale of an adaptation decision in a precise way, and are preferred over action or goal policies [8]. For applications on mobile devices that are intrinsically resource-constrained, application performance is crucial for user acceptance, and therefore performance properties will be an important part of a utility function. Current practical approaches to component-based performance property prediction (PPP) are often based on retrofitting, in which the best application variant is known for some basic context values and the component-based PPP model is then manually tuned to obtain the desired behaviour. However, this ad hoc approach falls short when components are developed independently, because they will then be specified non-uniformly. Complex relationships between context changes and dynamic adaptation decisions at run-time also make retrofitting difficult.

This article is based on a conceptually simple, yet powerful performance evaluation framework termed Structure and Performance (SP) [6]. In the MUSIC project [13], the possible configurations of a component-based application are specified by the application developer in an architectural variability model [15]. This variability model allows a compact specification of a possibly huge number of application variants and is used by the adaptation middleware to create and reason about the different application variants. The method presented is embedded into MUSIC's variability model. It should also be useful for other component-based adaptation middleware running on mobile devices.

The remainder of the paper is organised as follows: In Section 2, we briefly describe important concepts in the underlying adaptation framework. In Section 3, as a basis for the amalgamation of SP and the variability model, we describe a formal SP meta model. In Section 4, we illustrate MUSIC's variability model extended with our PPP approach, using a concrete application scenario. Also in Section 4, we describe our parameter capture approach and show how we validated the response time predictions. The applicability of our overall approach is discussed in Section 5. In Section 6, we relate our findings to the state of the art. Section 7 concludes and suggests directions for further work.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SPLC'11 August 21–26, 2011, Munich, Germany. Copyright 2011 ACM ISBN 978-1-4503-0789-5/11/08 ...$10.00.

2. ADAPTATION FRAMEWORK

In MUSIC [15], component-based applications may be reconfigured dynamically at run-time in order to react to context changes and to maintain the application utility in a dynamic environment. Fig. 1 shows that an application is modelled as a Component Type that can have different realisations. The context dependencies and the Quality of Service (QoS) properties of a certain realisation are described using Plans. Corresponding to the atomic and composite component types, there are Atomic and Composite Realisation Plans. An Atomic Realisation Plan describes an atomic component and contains a reference to the class that realises the component. The Composite Realisation Plan describes the internal structure of a composite component by specifying the Component Types that are involved and the connections between them. To create a possible variant, one of the plans of a Component Type is selected. If the plan is a Composite Realisation Plan, we proceed by recursively selecting one Realisation Plan for every involved Component Type. The recursion stops if an Atomic Realisation Plan is chosen. Thus, by resolving the variation options in a combinatorial fashion, we create different application variants. Because different realisations can be selected for a composite component, different variants may have very different architectural structures.

Figure 1: Conceptual component-based adaptation model.

Figure 3: Basic SP concepts (work, load, devolved work, service times, resource demands, response times).

Obviously, the MUSIC variability model approach bears some similarities to software product line techniques. However, MUSIC focuses on dynamic adaptation at run-time, while the engineering of software product lines primarily aims at off-line configuration activities. The utility of a component configuration is computed using a developer-defined utility function. Those parts of the application that are evaluated during planning are called variation points, i.e. Component Types that can have different realisations. A plan exhibits both requested properties (e.g., memory consumption, network bandwidth, monetary cost) and offered properties (e.g., throughput, response time, result accuracy) referring to the QoS model of the application. To support the adaptation reasoning process, property predictors help to estimate the offered and requested properties of the component that are associated with a plan. They predict the values of the non-functional properties of a component (or composition) for the given execution context. These predicted property values are input arguments to a normalised utility function that computes the expected


utility of an application variant for the given context. As illustrated in Fig. 2, the derivation of the utility of each application variant n for an application type A_i is based on how the user weights (using f_k) certain properties P_k. All possible application variants V_{i,n} consume resources R_{m,q}, where m indicates the resource type and q distinguishes several variants of each resource. For example, the clock speed of a CPU may vary, in which case the rate at which battery is consumed will also change. p_{n,k} and r_{n,m} describe the properties k and resource types m used by each variant n.
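As an illustration of this weighted utility computation, the following sketch compares two variants. The weights, variant names, and normalised property values are all invented for illustration; they are not taken from MUSIC or the TAE.

```python
# Hypothetical sketch of a normalised, weighted utility function.
# f[k] are user-supplied weights for properties P_k; the property values
# p_{n,k} are assumed already normalised to [0, 1]. All numbers invented.

def utility(weights, properties):
    """Weighted sum of normalised property values for one variant."""
    assert abs(sum(weights) - 1.0) < 1e-9  # weights are normalised
    return sum(f * p for f, p in zip(weights, properties))

f = [0.7, 0.3]                       # user cares mostly about response time
variants = {                         # [response-time score, accuracy score]
    "GUI+LocalTicket": [0.9, 0.6],
    "VoiceUI+RemoteTicket": [0.5, 0.8],
}
best = max(variants, key=lambda n: utility(f, variants[n]))
```

The adaptation middleware would evaluate such a function for every variant and select the maximum.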

3. SP META MODEL

We now describe elements of a formal meta model for the SP method to provide a clear foundation for the implementation of the PPP extension to the MUSIC MW (middleware). The meta model will also facilitate the use of SP in other contexts. The SP method [6] rests on the separation between work (what is done?) and load (how often is the work done?), as shown in Fig. 3. Work is devolved through levels of software components onto hardware resources. The end result of an SP model is the number of primitive hardware operations that are performed per top-level user operation. This devolved work is multiplied by service times (the time to execute one hardware resource operation) to obtain resource demands (the time to execute one top-level user operation). Resource demands are then combined with load and fed into a dynamic queueing model. In SP, work modelling is done without software contention, i.e. from a contention point of view, it is static.

An example SP diagram is shown in Fig. 4, where the five boxes at the bottom represent hardware resources and the three other boxes represent software components. This figure represents the modelled components in the TAE described in Section 4. In this diagram there is a cycle, because the Planner & Ticket Controller also devolves work on the GUI. In Section 4.4 we show that this cycle can be resolved.

Figure 2: Basic MUSIC concepts.

Figure 4: SP model with a cycle (software components: Graphical UI, Planner & Ticket Controller, Ticket Service; hardware resources: Client CPU, Client Disk, Network, Ticket Service Server CPU, Ticket Service Server Disk; link types: processing, storage, communication).

Figure 5: Meta model for SP.

The basic concept in the SP method [6] is that of the component. An SP component can be both a software component, and a hardware device or resource. MUSIC already used separate concepts for software components and hardware resources. To clarify the relationships in our meta model, which is presented in Fig. 5, we therefore introduce the concept of an abstract component, which is a generalisation of the resource and the component. In MUSIC, variability was achieved through the typing concept: a type can have a number of different realisations or, viewing it from the opposite perspective, a number of realisations is generalised/abstracted by a certain type. In this sense, an abstract component type can be realised by an abstract component, just as a resource type can be instantiated by a resource and a component type by a component. The links between abstract components are of three types: processing, storage, and communication. Graphically, it is customary to represent processing links as solid thin lines, and to use solid thick lines for storage links, as in Fig. 4. For communication links, we use dotted thin lines. Fig. 4 shows communication between the Planner & Ticket Controller (PTC) and the Ticket Server, represented by communication links from both these components to the network resource. Abstract components have one or more operations (or services), which they offer to superior components or users. Each operation in a component devolves work on operations on lower-level abstract components. How many operations on lower-level abstract components are required per offered operation is captured with the help of Complexity Matrices (CMs). As each link involves operations on lower-level abstract components, a link is always associated with a CM.
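As a concrete sketch of this devolution, consider two invented two-level matrices (the real TAE matrices appear in Section 4): the number of resource operations per top-level operation is the product of the CMs along the path, and multiplying by service times yields resource demands.

```python
import numpy as np

# Invented complexity matrices for a two-level example.
# C_top_mid[i, j]: calls to mid-level operation j per top-level operation i.
# C_mid_cpu[j, 0]: CPU operations per mid-level operation j.
C_top_mid = np.array([[1, 2],    # top op 0 uses mid op 0 once, mid op 1 twice
                      [0, 3]])   # top op 1 uses mid op 1 three times
C_mid_cpu = np.array([[4],       # mid op 0 costs 4 CPU units
                      [5]])      # mid op 1 costs 5 CPU units

devolved = C_top_mid @ C_mid_cpu       # CPU operations per top-level operation
service_time_ms = 1.0                  # assume 1 CPU unit takes 1 ms
resource_demand_ms = devolved * service_time_ms
# devolved == [[14], [15]]: 1*4 + 2*5 and 0*4 + 3*5
```

Chaining further matrix products in the same way devolves work through any number of component levels.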
Most often the elements in the CMs are constants, such as 0 or 1, or other integer or rational values. In the general case, matrix elements are polynomials with variables. These variables may reflect data and load dependencies: the amount of work involved in performing garbage collection may depend on the load on the system, or the amount of work required to store blocks on a disk may depend on the number of blocks to store. Table 1 in Section 4.2 shows two CMs for the TAE application.

A component will always use both processing and storage operations in inferior components or resources [6]. In addition, distributed components need communication operations. For storage operations, the SP framework also considers the amount of storage needed by each operation, for example, the number of disk blocks required to store tickets on the disk. The memory requirement of the software component itself is currently not part of the SP framework, but for our purposes it is required to determine which application variants are able to fit in a mobile device, which has limited storage capacity. Each component operates on one or more data structures, for example, tickets or documents for an application, tuples in a DBMS, blocks on a disk, packets and segments in a network, and even registers in a CPU. Data structures are orthogonal to operations. Therefore, orthogonal to the complexity matrix, the compactness matrix describes the relation between data structures along storage links. Compactness may also be used along communication links to represent the mapping between data structures at several levels in a communication structure. However, for communication, this information will normally also be represented by the complexity matrices; hence, it is optional. Given that we are not always interested in data structures, they may be left out, as indicated by the 0 cardinality in Fig. 5.

As shown in Fig. 6, extent denotes the total amount of data that has to be stored for each storage component, whereas limit denotes the total amount of data that it is possible to store at each component. Extent is orthogonal to devolved work. Using dataload, which is the total number of “items” to be stored for each top-level data structure [6], multiplied by compactness matrices, we can compute the extent on each abstract component, which must be less than or equal to the limit.
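The extent computation and limit check can be sketched as follows; the dataload, compactness values, and limit are invented for illustration.

```python
import numpy as np

# Invented numbers: dataload counts each top-level data structure;
# the compactness matrix maps top-level data structures to disk blocks.
dataload = np.array([100, 20])     # 100 tickets, 20 itineraries
compactness = np.array([[2],       # one ticket occupies 2 disk blocks
                        [5]])      # one itinerary occupies 5 disk blocks

extent = dataload @ compactness    # total blocks needed: 100*2 + 20*5 = 300
limit = np.array([1000])           # blocks available on the device
fits = bool((extent <= limit).all())
```

A variant whose extent exceeds the limit of any storage component can be discarded during adaptation reasoning.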
The relationship between storage resources is similar to the storage component relationships in Fig. 6, but with both CodeSize and TemporaryData removed. The extent of a running application depends not only on data, but also on the following two factors: 1) the code itself, which is, in principle, static, but may vary for each application variant, which again depends on the size of the code for the realisation of each individual component; and 2) temporary data that is required for computations, for example, in Java, data that is stored on the heap. Both permanent and temporary data vary during the lifetime of the application; for example, data structures expand or contract as computations are performed.

Figure 6: Relationships between component storage concepts in the meta model.

In this paper, we provide a PPP method which supports adaptation reasoning for self-adaptive applications. For such applications, adaptation is usually realised by providing a number of application variants which are evaluated according to the current operational context. The variant having the highest utility is then selected for configuration. Typically, a large number of different variants must be provided in order to offer enough reconfiguration possibilities for a number of different operational contexts. In the traditional SP view, this would mean that an SP model has to be created and evaluated for each variant. As described in Section 2, depending on the choice of concrete realisations for the component types (composite or atomic), the application structure, and therefore the SP model corresponding to the respective application variant, can differ substantially. It is not feasible to explicitly model SP diagrams for each of the numerous application variants. Therefore, in our approach SP models are implicitly generated during the evaluation of the application variant. These models are derived from the UML models of the composite realisations. Links are here defined by connectors between ports which provide or require certain interfaces, which in turn define operations. The kind of dependency, i.e. the distinction between higher and lower-level components, is defined by the link direction. A higher-level component uses the operations provided by an interface of a lower-level component.
Following this guideline, we simply search the links/paths for all top-level operations through all components of the application variant until we reach the interfaces/operations of a primitive component. Once we have collected all paths from the top-level operations to all involved primitive components, we have an implicit representation of an SP diagram, which we can recursively evaluate.
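Such a recursive evaluation might look like the following sketch. The component names, link structure, and costs are all invented for illustration; they are not taken from the TAE implementation.

```python
# A minimal sketch of recursively evaluating an implicitly represented
# SP diagram. Links point from higher-level to lower-level components;
# primitive entries are hardware resources with a per-operation cost.
links = {
    ("GUI", "BuyTicket"): [("PTC", "BuyT", 1)],
    ("PTC", "BuyT"): [("CPU", "op", 10), ("Disk", "block", 2)],
}
primitive_cost_ms = {("CPU", "op"): 1.0, ("Disk", "block"): 22.0}

def devolved_demand(component, operation):
    """Recursively sum the resource demand (ms) of one operation."""
    key = (component, operation)
    if key in primitive_cost_ms:        # reached a hardware resource
        return primitive_cost_ms[key]
    return sum(count * devolved_demand(child, op)
               for child, op, count in links[key])

demand = devolved_demand("GUI", "BuyTicket")   # 10*1.0 + 2*22.0 = 54.0
```

For diagrams with cycles, the recursion would additionally need the kind of cycle resolution discussed in Section 4.4.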

4. TRAVEL ASSISTANT EXAMPLE

The Travel Assistant Example (TAE) is a trial MUSIC application that supports a traveller with relevant travel information when using the RATP metro in Paris [13]. The TAE considers adaptation that would be useful for such a journey, for example, using a GUI or a speech-based user interface. We had access to the complete TAE Java source code, about 3000 lines.

4.1 TAE Variability Model

Fig. 7 shows the variability model for the TAE where component types are related through ports. The platform model is shown at the bottom in Fig. 7. Each port in the platform model is a provided port. At the top level, the Travel Assistant has one composite realisation, which involves six component types: Controller, UI, Ticket (Service), ItineraryPlanner, MapProvider, and TouristInfoProvider. The connector Controller —(o— Ticket means that the Ticket service provides interfaces/operations for the Controller. The user interface (UI) component type has two atomic realisations and one composite realisation that involves two other component types. Given that we have one atomic realisation for the VoiceUI and two atomic realisations for the Text2Speech component type, in total we have five different realisations for the UI component type. The Controller component type has three atomic realisations and there is one atomic realisation for the TouristInfoProvider component type. The Ticket, the ItineraryPlanner, and the MapProvider components each have two atomic realisations. We thus have five UI realisations, three Controller realisations, and two Ticket, two ItineraryPlanner, and two MapProvider realisations. For TouristInfoProvider, there is just one realisation. In total, we have 5+3+2+2+2+1 = 15 different atomic realisations. Each of these atomic realisations requires a complexity matrix. However, the different realisations may be combined freely, giving in total 5 × 3 × 2 × 2 × 2 × 1 = 120 variants for the Travel Assistant application. Each of these variants will correspond to a different SP diagram. Fig. 4 shows one of these diagrams. The complexity matrices corresponding to this particular realisation are analysed further below.
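The counting above can be reproduced directly (the realisation counts are taken from this section; the dictionary layout is only an illustration):

```python
from math import prod

# Realisation counts per component type in the TAE variability model.
realisations = {"UI": 5, "Controller": 3, "Ticket": 2,
                "ItineraryPlanner": 2, "MapProvider": 2,
                "TouristInfoProvider": 1}

atomic_realisations = sum(realisations.values())   # 15 complexity matrices
variants = prod(realisations.values())             # 120 application variants
```

The contrast between the sum (measurement effort) and the product (variants to reason about) is what makes explicit per-variant SP models infeasible.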

4.2 Defining the Components and Operations

Each of the components in Fig. 7 is related to one or more Java classes; for example, the GUI component is implemented by a package with several GUI-related classes that refer to different GUI pages and interfaces. The Planner and Ticket Controller (PTC) component is implemented by PlannerAndController.java, which uses ItineraryPlannerAndController.java implicitly via inheritance. The following atomic realisations were not implemented and are therefore empty: TouristInfoProvider, ItineraryPlanner (both realisations), and CompositeSpeechUI. In addition, MapProvider is an external service and has not been measured. Each Java method corresponds to an SP operation. Inside each method, different branches may be selected. We may model this as one average operation, or we may separate each branch. In addition, each branch may operate on different data structures. For example, the PTC component only has the method doAction(), which has three different branches internally for searching the itinerary, buying a ticket, and showing a page. The show page branch operates on eight different data structures, including the ticket list. In total, this gives 10 different PTC operations, as shown in the CM in Table 1. If we knew that the devolved work for these 10 different branches was equal, we could model them as one average operation, but we would first have to verify that they were equal. We have chosen to model all 10 branches explicitly.


Figure 7: Variability view for Travel Assistant Example (TAE).

4.3 Measurement Hardware and Set Up

We measured the TAE running on two laptops, because the required measurement tools were not available on handheld devices. It may be possible to convert these numbers to an actual mobile device, using some conversion factors reflecting the type and speed of the processors between measurement and deployment. One of the laptops was an HP COMPAQ 2510p running Windows XP SP2. It had an Intel Core 2 Duo U7600 CPU running at 1.26 GHz and 2 GB RAM. This HP laptop ran the main part of the Travel Assistant application; we thus refer to it as the client machine. The other laptop was a Dell Precision M60 running Windows XP SP3. It had an Intel Pentium 1.6 GHz CPU and 1 GB RAM. This machine ran the Ticket Service, Map Service, and Enhanced Itinerary Planner; we thus refer to it as the server.

The HP machine had a 100 GB hard disk that operated at 4800 RPM and had a block size of 0.5 KB, an average seek time of 15 ms, and a maximum disk-transfer rate of 100 MB/s [4]. For fewer than 100 blocks, the disk-transfer time was 0.5 ms and negligible. To get the average access time, we added on average half a rotation (7 ms) to the average seek time. For one access of fewer than 100 blocks, we would therefore require 7 ms + 15 ms = 22 ms. For all operations, we measured fewer than 100 disk blocks read or written. We did not measure any disk activity on the Dell machine; consequently, we did not compute the average disk access for this machine. For simplicity, we assumed that writing and reading blocks have the same service time. The two laptops were connected using a 100 Mbit/s network, which is so fast that for the message sizes in our example, we could simply ignore the network.

For measurements, we used the JProbe version 8.1 Java profiler tool [7] and Microsoft's software monitor tool perfmon.msc [12]. In JProbe, we used the total CPU time for resources, while for components we used the inherent CPU time. Such data was extracted from JProbe and fed into the CMs leading to the CPU, for example, the Client CPU column in Table 1. JProbe was not able to measure less than 1 ms, so some of the client CPU measurements are shown as zero. From the CPU service demand of an operation we can calculate the number of CPU instructions required for an operation. JProbe measures the number of method invocations for each method. This information was fed directly into all CMs that do not lead to resources.

The accuracy of the measurements for resource demands, i.e., CPU measurements, in the client and server laptops depends on the profilers. Given that the overall objective of our measurements was to get a feel for the work that is required, rather than getting very accurate measurements per se, we used average values and no confidence intervals. The recorded utilisation measured by perfmon.msc was averaged over the measurement cycle, and an averaged background utilisation, obtained in a similar way, was then subtracted to get the increase in utilisation for a single operation. We did not use an automatic load generator, but instead performed the tests manually. We used the GUI menu to perform the high-level operations repeatedly. For example, we performed the “BuyTicket” operation 10 times over a measurement cycle of 10 minutes. We used code instrumentation to measure the actual response time by recording the timestamp before and after the operations and calculating the difference between them.
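The utilisation arithmetic described above amounts to the standard service demand law, D = U × T / C. A sketch with invented numbers (not our actual measurements):

```python
# Deriving a per-operation service demand from perfmon-style utilisation
# averages, as described in Section 4.3. All numbers are invented.
measured_util = 0.020     # average CPU utilisation during the run (fraction)
background_util = 0.005   # averaged background utilisation (fraction)
cycle_s = 600             # 10-minute measurement cycle
operations = 10           # e.g. "BuyTicket" performed 10 times

busy_s = (measured_util - background_util) * cycle_s   # CPU-busy seconds
demand_ms = busy_s / operations * 1000                 # service demand per op
```

Such demands feed the CM elements that lead to hardware resources.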

4.4 Complexity Matrices

Table 1 shows the complexity specifications for the two complexity matrices of the GUI. The part of the table to the left of the vertical line is a CM with 20 rows and 10 columns, reflecting the relationship between the GUI and the Planner & Ticket Controller (PTC), termed C^GUI_PTC. The CM on the right also has 20 rows but only one column. This CM shows how many units of CPU each of the 20 top-level GUI operations consumes. This matrix is named C^GUI_CCPU (CCPU is the Client CPU). For the operations, we used the following abbreviations: I (Itineraries), SP (ShowPage), ST (ShowTicket), P (Purchase), T (Ticket), SGP (ShowGUIPage), and D (Display). We define one unit of client CPU operations and server CPU operations as the average number of CPU operations during 1 ms of operation on the client CPU and server CPU, respectively.

Table 1: Complexity matrices C^GUI_PTC and C^GUI_CCPU. The 10 columns of C^GUI_PTC are the PTC operations BuyT, SearchI, SP (TList), SP (TP), SP (UserTInfo), SP (IList), SP (SearchIMenu), SP (DI), SP (DMap), and SP (TPInfo); its first 10 rows form a 10 by 10 identity matrix and its last 10 rows are all zero.

GUI operation      | C^GUI_CCPU (Client CPU)
BuyTMenu           |   0
BuyT               |   0
STList             |   0
STInfo             |   0
IListMenu          |   0
SearchIMenu        |   0
DI                 |   0
SearchI            |   0
DMap               |   0
TPInfo             |   0
SGP (TList)        | 219
SGP (TP)           | 156
SGP (TInfo)        |  94
SGP (UTInfo)       | 250
SGP (DI)           | 234
SGP (DMap)         | 234
SGP (IList)        | 250
SGP (BuyTInfo)     | 234
SGP (SearchIMenu)  | 188
SGP (SearchI)      | 203

The CM between the PTC and the GUI (not shown) is a 10 by 20 matrix, where the leftmost 10 by 10 part is the zero matrix and the rightmost 10 by 10 part is the identity matrix. C^GUI_PTC consists of a 10 by 10 identity matrix for the 10 GUI operations used by the end user, stacked on top of the zero matrix for the 10 GUI operations used internally. For all the 10 end-user GUI operations, CPU consumption was measured as zero by JProbe, as described in Section 4.3. Using this information, we can see that the cycle between the GUI and the PTC is easy to resolve, because we only have to perform one iteration to find the work that is devolved by the GUI to its resources. We could have resolved the cycle by simply removing the last 10 rows from C^GUI_PTC, and adding each element in the first 10 rows of C^GUI_CCPU to the corresponding element in the last 10 rows of the same matrix, so that the first element in the new 10 by 1 C′^GUI_CCPU matrix would have the value 219, while the last row would have the value 203. Given that the 10 by 10 cycle-resolved C′^GUI_PTC matrix is an identity matrix, the GUI would then have a transparent relationship to the (hardware) resources that it is using indirectly, via the PTC.

Table 2: TAE Complexity matrix sizes.

Specification | # elements    | ≠ 0 | = 1
GUI/PTC       | 20 × 10 = 200 |  10 |  10
GUI/CCPU      | 20 ×  1 =  20 |  20 |   0
PTC/GUI       | 10 × 20 = 200 |  10 |  10
PTC/CCPU      | 10 ×  1 =  10 |  10 |   0
PTC/CDisk     | 10 ×  2 =  20 |   7 |   0
PTC/Network   | 10 ×  2 =  20 |   6 |   0
PTC/TS        | 10 ×  3 =  30 |   3 |   3
TS/Network    |  3 ×  2 =   6 |   6 |   0
TS/TSSCPU     |  3 ×  1 =   3 |   3 |   0
TS/TSSDisk    |  3 ×  2 =   6 |   0 |   0
SUM           |           515 |  75 |  23

Table 2 shows the sizes of all the CMs in the TAE. The 10 matrices contain in total 515 matrix elements, of which only 75, or 15 %, are non-zero. 23 of these 75 non-zero matrix elements are simply filled with the number 1. Therefore, more complex measurements are only required for 75 − 23 = 52 matrix elements. All of these 52 matrix elements stem from the use of hardware resources. Numbers which are known to be non-zero are reported as such even if the actual JProbe profiler reports them as zero. It is interesting to observe that the larger the matrices, the sparser they become. For example, both the 200-element GUI/PTC and PTC/GUI matrices are only 5 % full, whereas the six-element TS/Network matrix is 100 % full. Using JProbe, we were able to measure all the matrices except those that represent disk and network usage, i.e. the four matrices PTC/CDisk, PTC/Network, TS/Network, and TS/TSSDisk. These four matrices were measured using Microsoft's perfmon.msc. With this software monitor, we were only able to measure the number of resource operations per top-level operation. Due to the simple structure of the higher-level CMs in the TAE (some of them are identity matrices), we were easily able to split the measured composite CM into the required elementary CMs. For more complex cases, enough information may not be available for splitting a composite matrix into elementary CMs, and educated guessing may be required [18].
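A sketch of such a split with invented matrices: when the higher-level CM is the identity, the elementary CM simply equals the measurement, and a least-squares solve covers the general well-conditioned case. This is our illustration of the idea, not the procedure actually used in the measurements.

```python
import numpy as np

# Invented example: perfmon only observes the composite mapping from
# top-level operations to disk blocks, C_composite = C_high @ C_low.
C_high = np.eye(2)                      # higher-level CM happens to be identity
C_composite = np.array([[3.0],          # measured blocks per top-level op
                        [7.0]])

# With an identity C_high the elementary CM is just the measurement;
# otherwise least squares gives a best-effort split when C_high is known.
C_low, *_ = np.linalg.lstsq(C_high, C_composite, rcond=None)
```

When C_high is rank-deficient, the split is not unique, which is where the educated guessing mentioned above comes in.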

4.5 Measurement Validation

To get a feel for the accuracy of our modelling, we compared the measured and the predicted response times. The predicted response times were derived by summing the residence times for all the resources i using an open multi-class queueing network model excluding memory constraints [11]: R_r = Σ_i D_{r,i} / (1 − U_i^t), where the total utilisation U_i^t for device i was computed by adding the background utilisation U_i^b to the utilisations U_{r,i} for all the r top-level operations: U_i^t = U_i^b + Σ_r U_{r,i}. The utilisation U_{r,i} for each device i and each class r was derived by multiplying the resource demand D_{r,i}, described above, by the arrival rate: U_{r,i} = λ_r × D_{r,i}. Given that we are interested in comparing response times for alternative application variants, we used the same arrival rate for all application variants, where the distribution between the different classes, which reflect different top-level operations, mirrors the estimated or measured distribution of these classes. If we had been particularly interested in one top-level operation, we would not have needed this distribution and could have used a single-class queueing network.

Table 3 compares the response times as predicted by our PPP model with the measured response times. For the ItinerariesListMenu, abbreviated to IListMenu, there was a discrepancy of 1.7 %. The DisplayItinerary operation has a perfect match between measured and predicted response times. For the BuyTicket operation, the predicted response time was 18 % higher than the measured response time. The reason may be modelling inaccuracy in itself, but it may also be that some element of caching was present when the response time was measured, even if we tried to avoid this as far as possible. The background utilisations for the CPUs and disks on the two laptop devices are shown in Table 3. For the dual-core client CPU, we focused on the core that actually ran the code.
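The prediction formulas above can be sketched as follows; the demands, arrival rates, and background utilisations are invented, not the TAE measurements.

```python
# Sketch of the open multi-class response-time prediction:
# R_r = sum_i D_{r,i} / (1 - U_i), with U_i = U_i^b + sum_r lambda_r * D_{r,i}.
demands_ms = {                       # D[r][i]: demand of class r at device i
    "BuyT": {"clientCPU": 100.0, "clientDisk": 44.0},
    "IListMenu": {"clientCPU": 80.0, "clientDisk": 22.0},
}
arrival_per_s = {"BuyT": 0.1, "IListMenu": 0.1}
background_util = {"clientCPU": 0.0126, "clientDisk": 0.0036}

devices = background_util.keys()
util = {i: background_util[i] + sum(arrival_per_s[r] * demands_ms[r][i] / 1000
                                    for r in demands_ms)
        for i in devices}
response_ms = {r: sum(demands_ms[r][i] / (1 - util[i]) for i in devices)
               for r in demands_ms}
```

Because utilisations are low, the predicted response times stay close to the summed demands, as in Table 3.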

Table 3: Response Time Validation.

Operation  | R, measured (ms) | U, client CPU (%) | U, client disk (%) | U, server CPU (%) | U, server disk (%) | R, projected (ms) | R, discrepancy (%)
Background |        —         |       1.26        |        0.36        |       1.36        |        3.89        |         —         |         —
IListMenu  |       297        |       2.34        |        1.11        |       1.20        |        7.50        |        302        |        1.7
DI         |       281        |       3.44        |        0.75        |       1.13        |        3.56        |        281        |        0
BuyT       |       265        |       3.54        |        0.20        |       1.20        |        7.28        |        313        |       18.1

5. DISCUSSION

The applicability of our PPP approach depends on a trade-off between three factors: 1) model solver efficiency, 2) modelling and measurements costs, and 3) model accuracy.

Model solver efficiency. In MUSIC, we must compute the performance property for each application variant, or realisation, on a mobile device with scarce resources. In the TAE, there were 120 different realisations; in other applications, there may be even more. Hopefully, we will be able to reduce the number of realisations that must be considered. We may also increase efficiency by caching previous intermediate property prediction calculations. In addition, the multiplication of real numbers may be approximated by fixed-point integers. Our work devolution approach should definitely exploit the sparseness of our matrices, especially since the sparseness seems to increase with the matrix dimension, as indicated in the TAE.

Modelling and measurements costs (MMC) are determined by the number of individual measurements and the cost of performing each of them. MMC grows linearly with the number of atomic realisations, whereas the number of application variants grows exponentially with the number of variants for each component. In our example, we found that CMs that lead to components, rather than to resources, were easy to find using a profiler. It was also easy to estimate the relationships between all components and processing resources (CPUs). What turned out to be costly was measuring the CMs that lead to secondary storage and to networks, because these resources did not seem to be supported by profilers. It is good performance engineering practice to focus on characterising the consumption of bottleneck resources. However, because bottleneck resources may change depending on the selected variant, this may not always be possible.

Model accuracy generally improves the more closely the number of operations specified in the model matches the actual number, and it also generally increases when there is a direct relation between actual component methods and model component operations. However, in the TAE, we modelled one method with several operations, thereby increasing the granularity of the model even further. Generally, the more operations we model, the greater the potential accuracy of the model. Both the number of components and the number of CMs are defined by the variability model, which is created to cover the variability aspect and in most cases does not take any decrease of MMC or increase of accuracy into account.

Looking at these three factors, model solver efficiency and MMC both favour small models, while model accuracy favours large models. The important question then becomes: what degree of accuracy is required to distinguish between competing application variants? This is further work.
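The contrast between linear measurement cost and exponential variant growth can be made concrete. In this sketch the component names and per-component variant counts are hypothetical, chosen so that their product is 120, as in the TAE; the caching of intermediate predictions mentioned above is illustrated with a simple memoised placeholder function.

```python
# Sketch: measurement cost grows with the SUM of per-component variant
# counts, while the number of application variants grows with their PRODUCT.
# Component names and counts are hypothetical (product chosen to be 120,
# matching the number of realisations reported for the TAE).
from math import prod
from functools import lru_cache

variants_per_component = {"UI": 3, "Route": 4, "Map": 2, "Ticket": 5}

atomic_realisations = sum(variants_per_component.values())    # linear: 14
application_variants = prod(variants_per_component.values())  # exponential: 120

print(atomic_realisations, application_variants)  # 14 120

# Caching intermediate property predictions: each component variant's
# contribution is computed once and reused across all application variants.
@lru_cache(maxsize=None)
def component_demand(component: str, variant: int) -> float:
    # placeholder for an SP matrix calculation
    return 0.01 * (variant + 1)

def application_demand(selection: tuple) -> float:
    # selection is a tuple of (component, chosen variant) pairs
    return sum(component_demand(c, v) for c, v in selection)
```

With memoisation, evaluating all 120 application variants triggers only 14 component-level calculations, which is exactly the kind of saving run-time adaptation reasoning on a mobile device needs.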

6. RELATED WORK

We have described a component-based PPP framework for adaptation on mobile devices. It is suitable for applications of realistic size, as demonstrated by the TAE application. The Aura framework considers QoS properties in self-adaptive systems [17]. However, this framework is not component-based. An interesting method for performance analysis of autonomic systems that focuses on bottlenecks is described in [9]. However, this method is not applicable in a component-based setting. Ma et al. describe a comprehensive component-based framework for property prediction that is based on synchronous data flow (SDF) [10]. The data flow paradigm is based on states, and this paradigm suffers from state explosion for realistic examples. As described in Section 5, efficient algorithms are especially important in our setting of run-time adaptation reasoning on mobile devices. This requirement rules out solution methods that are known to be very accurate, but at the cost of efficiency. While our SP-based approach to performance modelling considers hardware contention, contention for software resources (threads, buffers, critical regions, or locks) is not taken into account. Software contention is supported by MARTE, a UML profile adopted by the OMG [14], which builds on LQN [2]. As part of the PLASTIC project, which also looked at adaptation for mobile devices, Di Marco & Mascolo outlined how LQNs can be generated using annotated UML diagrams as a basis [1]. In [3], as part of the Q-ImPrESS project, Grassi et al. describe a comprehensive approach based on

MARTE to handling both performance and availability requirements for self-adaptive systems. These authors first define a bridge model to extract the required design information and then generate an analysis model based on this bridge model. However, variability and the comparison of competing application variants are considered neither by Di Marco & Mascolo nor by Grassi et al. MUSIC might simply have used the MARTE / LQN paradigm. This would have required the incorporation of the LQN solver (LQNS) into our MW. The speed of property prediction in MUSIC would then depend on the resource consumption and memory requirements of the LQNS, which does not seem to be designed for a resource-constrained mobile device. By using our conceptually simple matrix-based approach, we have favoured speed over accuracy. Further research is required to determine how much faster our approach is than MARTE / LQN, and how much more accurate MARTE / LQN is, when used on a handheld device, which by definition has only one user. On a server with many different users, software contention is of course more important, and solver efficiency may no longer be a pressing issue.

Elements of a meta-model for SP were defined in a technical report [5]. Our meta-model is tailored to MUSIC, in which the concepts of (hardware) resources and (software) components were already established. As a result, we had to distinguish between the two concepts. We have added concepts for code size, data extent, devolved work and temporary data, and have made explicit the relationship between data limits on the one hand and code size, data extent, and temporary data on the other. Instead of having processing, storage, and communication components, we distinguish between types of link, because this makes the conceptual model more intuitive due to visual differences between the three types of link. The models displayed in Fig. 5 and in Fig. 6 are therefore new.

7. CONCLUSIONS AND FURTHER WORK

We have demonstrated how performance property prediction based on SP, a software performance evaluation framework, can be smoothly integrated into a model-based adaptation framework for context-aware, self-adaptive applications. To the best of our knowledge, there is no comparable solution that provides such a comprehensive and middleware-integrated approach to performance prediction for self-adaptive applications running on resource-constrained mobile devices.

Nevertheless, a number of challenging research questions remain. An analysis of the size and structure of the matrices in large examples is needed to get a better feeling for how sparse these matrices typically are. We need to obtain more practical experience with the model granularity trade-off between model solver efficiency, modelling and measurements costs, and accuracy. Another practical issue is the potential automation of parameter capture from profilers, as well as the availability of profilers for mobile devices. How to deal with cycles between components should also be explored. More experience with the storage dimension of the PPP framework is required: compactness matrices, temporary data, etc. Finally, service-oriented adaptation, a unique feature of the MUSIC framework, whereby a dynamically discovered service may replace a local application component if it increases the overall utility of the application, should also be incorporated into the PPP.

8. ACKNOWLEDGMENTS

We are grateful to our colleagues in the MUSIC project for their stimulating discussions; in particular, Svein Hallsteinsen and Jacqueline Floch from SINTEF ICT. This research was supported by the MUSIC project and Telenor ASA.

9. REFERENCES

[1] A. Di Marco and C. Mascolo. Performance Analysis and Prediction of Physically Mobile Systems. In WOSP, pages 129–132. ACM, 2007.
[2] G. Franks, T. Al-Omari, M. Woodside, O. Das, and S. Derisavi. Enhanced Modelling and Solution of Layered Queueing Networks. IEEE Transactions on Software Engineering, 35(2):148–161, 2009.
[3] V. Grassi, R. Mirandola, and E. Randazzo. Model-Driven Assessment of QoS-Aware Self-Adaptation. In SW Eng. for Self-Adaptive Systems, LNCS 5525, pages 201–222. Springer, 2009.
[4] HP. QuickSpecs, HP Compaq 2510p Notebook PC. http://h18000.www1.hp.com/products/quickspecs/12717_div/12717_div.PDF.
[5] P. H. Hughes. The IMSE Conceptual Model. Technical report, Esprit Project no. 2134: Integrated Modelling Support Environment - IMSE, 1989.
[6] P. H. Hughes and J. S. Løvstad. A Generic Model for Quantifiable Software Deployment. In Conf. on SW Eng. Advances, pages 22–22. IEEE, 2007.
[7] JProbe. www.quest.com/jprobe/.
[8] J. Kephart and R. Das. Achieving Self-Management via Utility Functions. IEEE Internet Computing, 11(1):40–48, 2007.
[9] M. Litoiu. A Performance Analysis Method for Autonomic Computing Systems. Transactions on Autonomous and Adaptive Systems, 2(1):1–29, 2007.
[10] H. Ma, I.-L. Yen, J. Zhou, and K. Cooper. QoS analysis for component-based embedded software: Models and methodology. Journal of Systems and Software, 79(6):859–870, 2006.
[11] D. A. Menascé, V. A. Almeida, and L. W. Dowdy. Performance by Design. Prentice Hall, 2004.
[12] Microsoft. Overview of performance monitoring. technet.microsoft.com/en-us/library/cc961845.aspx.
[13] MUSIC. http://ist-music.berlios.de/.
[14] OMG. UML Profile for MARTE. www.omgmarte.org/spec/MARTE/1.0, 2009.
[15] R. Rouvoy et al. MUSIC: Middleware Support for Self-Adaptation in Ubiquitous and Service-Oriented Environments. In SW Eng. for Self-Adaptive Systems, LNCS 5525, pages 164–182. Springer, 2009.
[16] M. Satyanarayanan. Pervasive Computing: Vision and Challenges. IEEE Personal Communications, 8(4):10–17, 2001.
[17] J. Sousa, V. Poladian, D. Garlan, B. Schmerl, and M. Shaw. Task-Based Adaptation for Ubiquitous Computing. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 36(3):328–340, 2006.
[18] M. Woodside, V. Vetland, M. Courtois, and S. Bayarov. Resource Function Capture for Performance Aspects of Software Components and Sub-systems. In Performance Engineering, LNCS 2047, pages 239–256. Springer, 2001.
