On performance-based software model refactoring

Davide Arcelli
Dipartimento di Ingegneria e Scienze dell'Informazione e Matematica (DISIM)
Università degli Studi dell'Aquila
L'Aquila, Italy
[email protected]

Abstract: Identifying and removing the causes of poor performance in software systems are complex problems due to the variety of factors to take into account. In the last few years, we have dedicated some effort to issues related to the detection and refactoring of performance problems in software models, with the aim of interpreting model-based performance analysis results and translating them into architectural feedback. In this paper we summarize our recent research results regarding: (i) issues that we have to cope with when working on the software side or the performance side to detect and remove performance problems, (ii) the role and the influence of thresholds in specifying, detecting and refactoring performance antipatterns occurring in a software model, and (iii) the definition of a process for refactoring a software model, based on performance analysis results and performance antipattern detection.

I. INTRODUCTION

Identifying and removing the causes of poor performance in software systems are complex problems due to the variety of factors to take into account. Similarly to other non-functional properties, performance is an emergent attribute of software, as it is the result of interactions among software components, underlying platforms, users and contexts [1]. The current approaches to these problems are mostly based on the skills and experience of software developers or, in the best cases, of performance analysts. Quite sophisticated profiling tools have been introduced for run-time performance monitoring [2], but it is well known that the cost of solving performance problems at run time is orders of magnitude larger than at early phases of the software lifecycle [3]. Therefore, it is very beneficial to deal with the detection and refactoring of performance problems in software models, with the aim of interpreting model-based performance analysis results and translating them into architectural feedback early in the lifecycle.

Hence, our recent research work concerned: (i) issues that we have to cope with when working on the software side or the performance side to detect and remove performance problems, (ii) the role and the influence of thresholds in specifying, detecting and refactoring performance antipatterns occurring in a software model, and (iii) the definition of a process for refactoring a software model, based on performance analysis results and performance antipattern detection.

In Figure 1 a round-trip Software Performance Engineering (SPE) process is schematically represented. The forward path starts from a software model, which is transformed into a performance model (e.g. [4]) that can be solved with common performance analysis techniques and tools to obtain performance indices [5], [6]. The backward path consists of a problem detection/solution step that processes the performance indices, in conjunction with the software artifact and/or the performance model, to detect and remove possible sources of performance problems. Hence, a set of refactoring actions that may apply to the software artifact and/or the performance model is obtained. The round-trip process is reiterated until satisfactory performance indices are obtained. Dotted arrows represent the option of working on the performance model to detect and remove performance problems. In this case a transformation from the performance model to the software model has to take place when satisfactory indices are obtained. Dotted-line arrows represent the option of working on the software model, where refactoring actions are applied. In both cases the forward path has to be run at each iteration to obtain the performance indices of a refactored (performance or software) model.

The remainder of the paper is organized as follows. Section II summarizes our findings regarding the peculiar aspects of working on either the software or the performance model side. Section III concerns working on the software model side. In particular, Section III-A summarizes our research work related to the role and the influence of thresholds in specifying, detecting and refactoring performance antipatterns occurring in the software model. Section III-B provides a brief illustration of our process for refactoring a software model, based on performance analysis results and performance antipattern detection. Each section ends with a brief discussion of the most challenging research topics and future work in the context of that section.¹

II. SOFTWARE OR PERFORMANCE MODEL SIDE?

In the last 5 years several approaches have appeared for the identification and removal of performance problems either in the software model [10], [9], [11] or in the performance model [12], [13]. Although these two categories of approaches nicely fit into the round-trip process of Figure 1, they work in different modeling environments, under different assumptions. Recently, we have highlighted the differences between these two categories of approaches in order to envisage the contexts where they can be more appropriately used [7]. For this goal we have considered two approaches that we have previously introduced [9], [12].

The first approach works on the software side and is based on the detection and solution, on the software model, of performance antipatterns. Antipatterns are used for "codifying" the knowledge and experience of analysts by means of the identification of a problem, i.e. a bad practice that negatively affects software performance, and a solution, i.e. a set of refactoring actions that can be carried out to remove it [9].

The second approach works on the performance side and is based on bidirectional model transformations between UML software models and Queueing Network (QN) performance models. A forward transformation is used to generate the performance model from an initial software model. The corresponding backward transformation is used to generate a new software model from a satisfactory performance model obtained by means of changes made by the analyst on the performance side [12].

We have applied both approaches to the same running example in the e-commerce domain in order to illustrate the differences in the obtained results. Thereafter, we have raised the level of abstraction and discussed the issues that we have to cope with when working on the software side or the performance side to detect and remove performance problems.

¹ Readers interested in more detailed discussions of the topics we deal with in this paper can refer to [7], [8], [9].

Fig. 1. Round-trip Software Performance Engineering.

Findings Summary

Comparing the two performance-based model refactoring approaches at work on the same example allowed us to highlight the peculiar aspects of working on either the software side or the performance side. Table I summarizes our findings on this topic.

TABLE I.

OVERVIEW OF THE COMPARED REFACTORING APPROACHES.

| Classification parameters | | AP-based approach (software side) | BT-based approach (performance side) |
|---|---|---|---|
| Human skills required | Design skills | medium | low |
| | Performance skills | low | high |
| Degree of automation | Problem detection | medium | high |
| | Problem solution | medium | high |
| Refactoring metrics | Number of actions | high | low |
| | Complexity | low, medium, high | low |
| | Performance gain predictability | low | medium |
| Scalability | Number of iterations | low, medium | medium, high |
| | Single-iteration complexity | O(forward) | O_bid(forth) + O_bid(back) |
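As a reading aid for the scalability rows of Table I, the round-trip process of Figure 1 can be sketched as a loop in which each iteration runs the forward path and, if the indices are not satisfactory, a detection/solution step. The following Python sketch is only illustrative and is not taken from [7], [9] or [12]: the dictionary-based `Model`, the function names, the single-queue formula used as solver, and the 25% demand reduction are all assumptions chosen to keep the example self-contained. It shows the variant in which refactoring is applied on the software model side.

```python
from typing import Callable, Dict, Tuple

Model = Dict[str, float]    # hypothetical stand-in for a software/performance model
Indices = Dict[str, float]  # performance indices, e.g. {"response_time": 4.0}


def round_trip(
    software_model: Model,
    transform: Callable[[Model], Model],
    solve: Callable[[Model], Indices],
    is_satisfactory: Callable[[Indices], bool],
    refactor: Callable[[Model, Indices], Model],
    max_iterations: int = 10,
) -> Tuple[Model, Indices, int]:
    """Iterate forward path + detection/solution until the indices are satisfactory."""
    model = software_model
    for iteration in range(1, max_iterations + 1):
        # Forward path: software model -> performance model -> performance indices.
        indices = solve(transform(model))
        if is_satisfactory(indices):
            return model, indices, iteration
        # Backward path: detect problems and apply refactoring actions.
        model = refactor(model, indices)
    raise RuntimeError("no satisfactory model found within max_iterations")


def solve_single_queue(pm: Model) -> Indices:
    """Toy solver (assumption): one open queueing station, R = D / (1 - lambda * D)."""
    utilization = pm["arrival_rate"] * pm["service_demand"]
    if utilization >= 1.0:
        raise ValueError("unstable model: utilization must be below 1")
    response_time = pm["service_demand"] / (1.0 - utilization)
    return {"utilization": utilization, "response_time": response_time}


def reduce_demand(model: Model, indices: Indices) -> Model:
    """Hypothetical refactoring action: cut the service demand by 25%."""
    return {**model, "service_demand": model["service_demand"] * 0.75}


final_model, final_indices, iterations = round_trip(
    {"arrival_rate": 1.0, "service_demand": 0.8},
    transform=dict,  # placeholder transformation: the toy models coincide
    solve=solve_single_queue,
    is_satisfactory=lambda idx: idx["response_time"] <= 2.0,
    refactor=reduce_demand,
)
print(iterations, final_indices)
```

In this toy run the first iteration violates the response-time requirement and the second one satisfies it. In a real setting the cost of each iteration would be dominated by the transformation and solution steps, which is what the single-iteration complexity row of Table I refers to.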

From Table I we can notice that: (i) the human role in SPE should not be completely removed, because the experience and skills of software designers and/or performance analysts cannot be fully embedded in automated processes; (ii) automation in the problem detection and solution step is traditionally very well supported on the performance side, where a whole theory of bottleneck identification and removal was introduced a few decades ago (e.g. [5]) and has been continuously refined by more recent results (e.g. [14]). On the software side, instead, a certain level of automation has been introduced only recently, for example based on antipatterns or on metaheuristics that search the solution space looking for changes that can improve the performance indices [11]; (iii) the richness of software model notations makes the number of alternative refactoring actions resulting from the detection and solution phase considerably higher on the software side than on the performance side; (iv) the effectiveness of refactoring actions is given by the tradeoff between the refactoring complexity (i.e. the distance between the original model and the refactored one) and the performance gain obtained from the refactoring. In most cases it is difficult to predict the performance gain of a refactoring action without actually solving a refactored performance model; (v) since the portfolio of refactoring actions that can be applied to a performance model is certainly more limited than the one available for a software model, more iterations may be needed on the performance side than on the software side in order to solve nested performance problems; (vi) a single loop of the round-trip process illustrated in Figure 1 is shorter in case of refactoring on the performance side than on the software side.

Challenging research topics and future works

With the increasing interest in refactoring approaches based on performance antipatterns and bidirectional transformations, we consider it very relevant to study the contexts where some techniques work better than others. The work in [7] is an initial step towards the comparison of such approaches.
Several future directions will be investigated: (i) our experience will be either consolidated or refuted by applying the approaches to a significant number of examples; (ii) it could be very interesting to study a mixed approach combining the ones we compared in [7]; (iii) the introduction of performance antipatterns on the performance model side might support the performance analyst in detecting and solving performance problems; (iv) finally, we plan to work on the introduction of measurement-based performance problem detection and solution at code level by means of monitoring-driven testing techniques for cloud applications [15].

III. WORKING ON THE SOFTWARE MODEL SIDE

Performance antipatterns are well-known bad design practices that lead to software products suffering from poor performance. A certain number of performance antipatterns have been defined and classified [16], [17], [18], [19], [10], and refactoring actions have also been suggested to remove them [20], [9]. Hence, in this section we summarize our main research work concerning the software model side and, in particular, (a) the role and the influence of thresholds in specifying, detecting and refactoring performance antipatterns occurring in the software model [8], and (b) the definition of a process for refactoring the latter, based on performance analysis results and performance antipattern detection [9].

A. Influence of numerical thresholds on the specification, detection and refactoring of performance antipatterns

A specific characteristic of performance antipatterns is that they contain numerical parameters that may represent thresholds referring to either performance indices (e.g. a device utilization) or design features (e.g. the number of interface operations of a software component). We have analyzed and highlighted the influence of such thresholds on the capability of detecting and refactoring performance antipatterns. In particular: (i) we have shown on a simple example how the set of detected antipattern instances may change while varying threshold values, and (ii) we have discussed the influence of thresholds on the complexity of refactoring actions.

Findings Summary

Table II contains a list of performance antipatterns. Each row represents a specific antipattern, which is characterized by three attributes: the antipattern name, the number of design thresholds, and the number of performance thresholds.

TABLE II.

OVERVIEW OF ANTIPATTERN THRESHOLDS.

| Antipattern | Design thresholds | Performance thresholds |
|---|---|---|
| Blob | 2 | 2 |
| Extensive Processing | 2 | 2 |
| Empty Semi Trucks | 2 | 1 |
| Excessive Dynamic Allocation | 2 | 1 |
| "Pipe and Filter" Architectures | 1 | 2 |
| Circuitous Treasure Hunt | 1 | 1 |
| Tower of Babel | 1 | 1 |
| Concurrent Processing Systems | 0 | 5 |
| One-Lane Bridge | 0 | 1 |
| The Ramp | 0 | 2 |
| Traffic Jam | 0 | 1 |
| More is Less | 0 | 0 |
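To make the role of thresholds concrete, the following Python sketch shows how the set of detected instances can change when threshold values vary. It is a deliberately simplified illustration and not the detection rule of any antipattern in Table II: the rule uses one design threshold and one performance threshold, and the `Component` type, the component names and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    name: str
    num_operations: int  # design feature (read from the software model)
    utilization: float   # performance index in [0, 1] (from the model solution)


def detect_instances(
    components: List[Component],
    max_operations: int,
    max_utilization: float,
) -> List[str]:
    """Toy rule: flag components exceeding both the design and the performance threshold."""
    return [
        c.name
        for c in components
        if c.num_operations > max_operations and c.utilization > max_utilization
    ]


# Hypothetical annotated model elements.
components = [
    Component("Catalog", num_operations=12, utilization=0.85),
    Component("Cart", num_operations=7, utilization=0.90),
    Component("Billing", num_operations=15, utilization=0.55),
]

# Vary both thresholds and observe how the detected set changes.
for max_ops in (5, 10):
    for max_util in (0.5, 0.8):
        detected = detect_instances(components, max_ops, max_util)
        print(f"ops>{max_ops}, util>{max_util}: {detected}")
```

With the loosest pair of thresholds all three components are flagged, whereas with the strictest pair only `Catalog` remains, which mirrors the sensitivity to threshold values discussed in this section.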

From Table II we can notice that: (i) some antipatterns include both design and performance thresholds, such as Blob, Extensive Processing, etc.; (ii) some antipatterns only include performance thresholds, such as Concurrent Processing Systems, One-Lane Bridge, etc.; (iii) finally, there is one antipattern (i.e. More is Less) without thresholds, because it relies on configuration parameters (database connections, web connections, etc.) detected by run-time software analysis.

Several more observations regarding antipattern refactoring derive from our experimentation: (i) if a refactoring action refers to a threshold related to a design feature (e.g. number of connections), we can ensure that its application leads to the removal of the antipattern instance; (ii) on the other hand, if a refactoring action refers to a threshold related to a performance index (e.g. hardware node utilization or throughput), we cannot ensure that its application leads to the actual removal of the antipattern instance; in fact, we need a further performance analysis step for the refactored model; (iii) since the specification of some antipatterns only contains thresholds related to performance indices, we think that it is more difficult to refactor such antipatterns than the ones that also refer to design features.

Challenging research topics and future works

It is certainly of great interest to extend the experiment reported in [8] to other performance antipatterns. For this goal, more complex examples shall be considered. Its extension to different types of antipatterns that use thresholds in their definitions, instead, needs to be carefully planned, because domain-specific characteristics could be exploited. As future work we also intend to introduce confidence values that may be associated with antipattern instances to quantify the probability that numerical threshold values support the actual antipattern presence. Finally, some fuzziness can be introduced in the evaluation of the threshold values, so as to make antipattern detection rules more flexible [21].

B. Antipattern-based software model refactoring

Starting from the definition of performance antipatterns given in [10] and from the techniques for detecting performance antipatterns in software architectural models introduced in [22] and [23], in [9] we have moved a step further by tackling the problem of removing performance antipatterns detected in an architectural model. We have introduced a model-based approach that allows formalizing the refactorings embedded into performance antipattern definitions. Our approach enables developers to focus on (potential) sources of performance problems and suggests how to refactor software models in order to remove them. To this end, we have adapted a Role-Based Modeling Language (RBML) [24] to represent: (i) antipattern problems as Source Role Models (SRMs), and (ii) antipattern solutions as Target Role Models (TRMs).
Once the occurring antipattern instances have been detected on the software model and one of them has been selected for removal, model refactoring consists in applying refactoring actions that can be expressed in terms of differences between the SRM referring to that antipattern instance and the corresponding TRM. Hence, a model-to-model transformation underlies each SRM-TRM pair in terms of differences between the SRM and the TRM. The approach has been applied to a case study in the e-commerce domain, whose experimental results demonstrate its effectiveness.

Findings Summary

We have shown that the solution step can be automated as well, to a certain extent, and that an adapted version of RBML, used in [24] for design pattern application, can be nicely adopted to define refactoring actions even for performance antipatterns to come. The benefit of using RBML is that the concept of role is very suitable to capture the heterogeneity of the knowledge (i.e. architectural model properties, performance indices, etc.) underlying the specification of antipatterns; hence we have defined SRM-TRM pairs for several existing antipatterns. Thus, SRM-TRM pairs represent new instruments in the hands of developers to achieve architectural model refactorings aimed at removing sources of performance problems. In [9] the SRM-TRM replacement is only implicitly defined. However, we have already observed in our experiments that the formalization of an SRM-TRM pair is an excellent starting point for the generation of the underlying transformation from an SRM to the corresponding TRM.

Challenging research topics and future works

The possibility of considering a set of design alternatives as candidates for the solution of performance problems is a step ahead compared to the means adopted today, which very often boil down to the skills and experience of performance analysts. Here we have shown that the solution step can be automated as well, to a certain extent, and that a Role-Based Modeling Language is a very promising instrument to address this problem.

Several main directions might be investigated in the future. First, we want to further validate our approach by introducing SRM-TRM pairs for other known performance antipatterns. The generation of the transformations underlying SRM-TRM pairs is not automated in [9]. Advanced model-driven techniques, such as model differences (e.g. [25], [26]), have been introduced in the literature to represent refactorings as difference models. They combine the advantages of declarative difference representations and enable the reconstruction of the final model by means of automated transformations. Hence they can be applied to our approach for generating those transformations. Besides, the application of refactoring actions (additions, removals, and modifications) in architectural models must be propagated in a consistent way through different views. A very interesting approach has been introduced in [27] to address consistency among views, and we are studying how to apply it in the software performance domain.
Finally, the combination of SRM-TRM pairs (or the combination of antipattern removal) can be a key factor for the success of our process, hence we intend to investigate the dependencies among antipatterns in order to define priority rules for the simultaneous solution of antipatterns.
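The idea that refactoring actions can be expressed as differences between an SRM and the corresponding TRM, in terms of additions, removals and modifications, can be sketched as follows. This Python fragment is only a conceptual illustration under strong simplifying assumptions: role models are flattened into dictionaries from element names to property dictionaries, the binding of roles to model elements that RBML provides is ignored, and all element names and values are hypothetical. It is not the formalization used in [9], nor the difference metamodels of [25], [26].

```python
from typing import Any, Dict, List, Tuple

RoleModel = Dict[str, Dict[str, Any]]  # element name -> properties (assumed encoding)


def diff(srm: RoleModel, trm: RoleModel) -> Tuple[RoleModel, List[str], RoleModel]:
    """Compute the refactoring actions that turn the SRM into the TRM."""
    additions = {name: props for name, props in trm.items() if name not in srm}
    removals = sorted(name for name in srm if name not in trm)
    modifications = {
        name: trm[name]
        for name in srm.keys() & trm.keys()
        if srm[name] != trm[name]
    }
    return additions, removals, modifications


def apply_diff(
    model: RoleModel,
    additions: RoleModel,
    removals: List[str],
    modifications: RoleModel,
) -> RoleModel:
    """Apply the actions to a model; elements outside the SRM are left untouched."""
    refactored = {name: props for name, props in model.items() if name not in removals}
    refactored.update(additions)
    for name, props in modifications.items():
        if name in refactored:
            refactored[name] = props
    return refactored


# Hypothetical SRM (problem) and TRM (solution) for one antipattern instance.
srm = {"Controller": {"operations": 12}, "LegacyCache": {"operations": 1}}
trm = {"Controller": {"operations": 6}, "Helper": {"operations": 6}}

# The software model contains the SRM elements plus an unrelated component.
software_model = {**srm, "Billing": {"operations": 3}}

additions, removals, modifications = diff(srm, trm)
refactored = apply_diff(software_model, additions, removals, modifications)
print(refactored)
```

Keeping the difference separate from its application is what would allow the same SRM-TRM pair to be reused on every detected instance of an antipattern, once the roles have been bound to concrete model elements.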

REFERENCES

[1] C. M. Woodside, G. Franks, and D. C. Petriu, “The future of software performance engineering,” in Workshop on the Future of Software Engineering (FOSE), 2007, pp. 171–187.
[2] K. Ramachandran, K. Fathi, and B. Rao, “Recent trends in systems performance monitoring & failure diagnosis,” in Industrial Engineering and Engineering Management (IEEM), 2010 IEEE International Conference on, Dec. 2010, pp. 2193–2200.
[3] H. Harreld, “NASA Delays Satellite Launch After Finding Bugs in Software Program,” April 20, 1998.
[4] C. M. Woodside, D. C. Petriu, D. B. Petriu, H. Shen, T. Israr, and J. Merseguer, “Performance by unified model analysis (PUMA),” in WOSP. ACM, 2005, pp. 1–12.
[5] E. D. Lazowska, J. Zahorjan, G. S. Graham, and K. C. Sevcik, Quantitative system performance: computer system analysis using queueing network models. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1984.
[6] G. Casale and G. Serazzi, “Quantitative system evaluation with java modeling tools,” in ICPE, 2011, pp. 449–454.
[7] D. Arcelli and V. Cortellessa, “Software model refactoring based on performance analysis: better working on software or performance side?” in FESCA, ser. EPTCS, B. Buhnova, L. Happe, and J. Kofron, Eds., vol. 108, 2013, pp. 33–47.
[8] D. Arcelli, V. Cortellessa, and C. Trubiani, “Influence of numerical thresholds on model-based detection and refactoring of performance antipatterns,” 2013. [Online]. Available: http://ppap.soccerlab.polymtl.ca/papers/Arcelli-et-alPerformanceAntiPatterns.pdf
[9] D. Arcelli, V. Cortellessa, and C. Trubiani, “Antipattern-based model refactoring for software performance improvement,” in QoSA, V. Grassi, R. Mirandola, B. Buhnova, and A. Vallecillo, Eds. ACM, 2012, pp. 33–42.
[10] V. Cortellessa, A. Di Marco, and C. Trubiani, “Performance antipatterns as logical predicates,” in ICECCS, 2010, pp. 146–156.
[11] A. Martens, H. Koziolek, S. Becker, and R. Reussner, “Automatically improve software architecture models for performance, reliability, and cost using evolutionary algorithms,” in Proceedings of the first joint WOSP/SIPEW international conference on Performance engineering, ser. WOSP/SIPEW ’10. New York, NY, USA: ACM, 2010, pp. 105–116.
[12] R. Eramo, V. Cortellessa, A. Pierantonio, and M. Tucci, “Performance-driven architectural refactoring through bidirectional model transformations,” in QoSA, V. Grassi, R. Mirandola, B. Buhnova, and A. Vallecillo, Eds. ACM, 2012, pp. 55–60.
[13] J. Xu, “Rule-based automatic software performance diagnosis and improvement,” in WOSP, A. Avritzer, E. J. Weyuker, and C. M. Woodside, Eds. ACM, 2008, pp. 1–12.
[14] G. Franks, D. C. Petriu, C. M. Woodside, J. Xu, and P. Tregunno, “Layered bottlenecks and their mitigation,” in QEST, 2006, pp. 103–114.
[15] CloudScale consortium, “The CloudScale project,” http://www.cloudscale-project.eu/.
[16] C. U. Smith and L. G. Williams, “Software performance antipatterns,” in Workshop on Software and Performance, 2000, pp. 127–136.
[17] C. U. Smith, “Software performance antipatterns: Common performance problems and their solutions,” in Int. CMG Conference, 2001, pp. 797–806.
[18] C. U. Smith and L. G. Williams, “New software performance antipatterns: More ways to shoot yourself in the foot,” in Int. CMG Conference. Computer Measurement Group, 2002, pp. 667–674.
[19] C. U. Smith and L. G. Williams, “More new software antipatterns: Even more ways to shoot yourself in the foot,” in Int. CMG Conference. Computer Measurement Group, 2003, pp. 717–725.
[20] P. A. Laplante and C. J. Neill, AntiPatterns: Identification, Refactoring and Management, C. Press, Ed., 2005.
[21] S. S. So, S. D. Cha, and Y. R. Kwon, “Empirical evaluation of a fuzzy logic-based software quality prediction model,” Fuzzy Sets Syst., vol. 127, no. 2, pp. 199–208, Apr. 2002. [Online]. Available: http://dx.doi.org/10.1016/S0165-0114(01)00128-2
[22] V. Cortellessa, A. Di Marco, R. Eramo, A. Pierantonio, and C. Trubiani, “Digging into UML models to remove performance antipatterns,” in ICSE Workshop Quovadis, 2010, pp. 9–16.
[23] C. Trubiani and A. Koziolek, “Detection and solution of software performance antipatterns in palladio architectural models,” in ICPE, 2011, pp. 19–30.
[24] R. B. France, S. Ghosh, E. Song, and D.-K. Kim, “A Metamodeling Approach to Pattern-Based Model Refactoring,” IEEE Software, vol. 20, no. 5, pp. 52–58, 2003.
[25] A. Cicchetti, D. Di Ruscio, and A. Pierantonio, “A Metamodel Independent Approach to Difference Representation,” Journal of Object Technology, vol. 6, no. 9, pp. 165–185, 2007.
[26] J. E. Rivera and A. Vallecillo, “Representing and Operating with Model Differences,” in TOOLS, 2008, pp. 141–160.
[27] A. Egyed, A. Demuth, A. Ghabi, R. E. Lopez-Herrejon, P. Mäder, A. Nöhrer, and A. Reder, “Fine-tuning model transformation: Change propagation in context of consistency, completeness, and human guidance,” in International Conference on Theory and Practice of Model Transformations (ICMT), 2011, pp. 1–14.