Visual Identification of Software Evolution Patterns

Andrejs Jermakovics

Marco Scotto

Giancarlo Succi

Free University of Bolzano-Bozen Piazza Domenicani, 3 I-39100 Bolzano, Italy ph. +39 0471 016135

Free University of Bolzano-Bozen Piazza Domenicani, 3 I-39100 Bolzano, Italy ph. +39 0471 016135

Free University of Bolzano-Bozen Piazza Domenicani, 3 I-39100 Bolzano, Italy ph. +39 0471 016130


ABSTRACT Software evolution plays a key role in the overall lifecycle of a software system. In this phase, software developers extend the capabilities and functionality of the system to meet new user requirements. However, the maintenance process can rapidly lead to “source code deterioration”. The ability to detect bad software evolution patterns early is an important opportunity to keep the application maintainable. In this paper we propose a combined visualization to identify software evolution patterns related to user requirements. The visualization shows the evolution metrics of a software system together with the implementation of its requirements. We also give some examples of how this visualization can help to identify some “common” evolution patterns.

Categories and Subject Descriptors D.2.8 Metrics

General Terms Measurement

Keywords Software evolution, evolution of requirements, software deterioration, software metrics

1. INTRODUCTION Software evolution [2] is a process that takes place only when the initial development of a software project was successful. The goal is to incorporate new user requirements in the application and/or adapt it to a new operating environment. This phase is of paramount importance because: 1) it accounts for a large part of the overall lifecycle costs; 2) failing to change software quickly means losing business opportunities. However, these maintenance activities can easily lead to a phenomenon called “software deterioration”. Prior literature [4, 8] has already discussed the problem of software deterioration, also called “design erosion” [11] or “software aging” [17]. The problem can be defined as follows: as a piece of software ages while it is being maintained, maintenance becomes more and more difficult. In several industrial cases, this problem has led to a redevelopment of systems from scratch.

An approach to deal with software deterioration is refactoring [10]. Refactoring is a controlled technique for improving the design of an existing code base. Fowler proposes a set of refactoring techniques which improve the readability or simplify the structure of a computer program without changing its behavior.

In this paper we propose an approach to visually identify software evolution patterns related to requirements. In particular, we propose a combined visualization showing the evolution of a software system together with the implementation of its requirements. Such a view can help project managers to keep the evolution process of a software system under control. It is of paramount importance to identify bad software evolution phenomena in order to take corrective actions such as refactoring. The idea is similar to the concept of a bad code smell, a term first coined by Kent Beck, which suggests when code should be refactored or the overall design should be reconsidered.

The paper is organized as follows: section 2 presents the proposed visualization and the evolution patterns we are able to identify; section 3 describes the tools and the techniques used to detect software evolution patterns; section 4 highlights the limitations of this work; finally, section 5 draws the conclusions.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. IWPSE'07, September 3-4, 2007, Dubrovnik, Croatia. Copyright 2007 ACM ISBN 978-1-59593-722-3/07/09...$5.00

2. VERSION-REQUIREMENT TIMELINE We propose a combined visualization showing the evolution of a software system together with the implementation of its requirements. It is designed to help identify patterns of software evolution. It consists of two separate visualizations aligned together (Figure 1). The top part is a Requirement Timeline showing requirements and the periods of their implementation. We describe it in more detail in section 3.1. The bottom part is a Version Timeline showing a bar chart of different versions of the system. Each bar corresponds to a version of the system at a specific date, and its height is determined by the software quality metrics discussed in section 3.2. The color is determined by software size measures (LOC, number of classes, etc.), ranging from white (smallest) to black (largest), an approach used in polymetric visualizations [14].
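The encoding of the Version Timeline bars can be illustrated with a short Python sketch. It is only a reading of the description above under stated assumptions: the text fixes the end points (white for the smallest version, black for the largest), while the linear interpolation in between, the function names and the tuple layout are hypothetical choices made for this example.

```python
def size_to_gray(size, smallest, largest):
    """Map a size measure to a hex gray level: white = smallest, black = largest.

    Linear interpolation between the two end points is an assumption;
    the paper only states the end points of the color range.
    """
    if largest == smallest:
        return "#ffffff"  # assumption: a degenerate range is drawn white
    fraction = (size - smallest) / (largest - smallest)
    fraction = min(1.0, max(0.0, fraction))  # clamp values outside the range
    level = round(255 * (1.0 - fraction))    # 255 = white, 0 = black
    return f"#{level:02x}{level:02x}{level:02x}"


def version_bars(versions):
    """Turn (date, quality, size) tuples into bar descriptions.

    Height comes from the quality value, color from the size measure.
    """
    sizes = [size for _, _, size in versions]
    smallest, largest = min(sizes), max(sizes)
    return [
        {"date": day, "height": quality,
         "color": size_to_gray(size, smallest, largest)}
        for day, quality, size in versions
    ]
```

A plotting library of choice could then draw one bar per dictionary, using `height` and `color` directly.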

Figure 1. Version-requirement timeline

Both views are aligned according to date, making it possible to see correlations between them. This shows what happened to the code during the implementation of each requirement and lets us visually recognize various patterns of software evolution. We describe and illustrate some of these patterns in the following sections.

2.1 Fast hack pattern A Fast hack [13] pattern in software evolution is characterized by careless coding. It can be recognized by declining software quality during the implementation of a requirement (Figure 2).

Figure 2. Fast hack pattern

One can see that during the Fast hack implementation little effort has been made (visible as the lighter color of the requirement bar), but the new code has caused the overall quality to decrease. This can happen, for example, when the developer does not respect the architecture, uses a bad object-oriented design or simply does “cowboy coding”.

2.2 Time pressure pattern Time pressure occurs when a requirement is implemented exceedingly quickly. It can be recognized by a large amount of effort (darker shade of the requirement bar) in an unusually short period of time (Figure 3). This is common, for example, when developers work overtime to meet a deadline.

Figure 3. Time pressure pattern

To detect this pattern, we use the size measure for the version bar height. The quality of the code does not necessarily decrease; however, the size of the system grows significantly, which is visible as an increase in the height of the version bars.

2.3 Refactoring pattern The Refactoring [10] pattern occurs when the internal structure of the code is improved. It can be recognized by an increase in software quality and an absence of requirement implementation. Figure 4 shows an excerpt from an evolution having this pattern.

Figure 4. Refactoring pattern

During the refactoring phase, most of the time is devoted only to code restructuring and no new requirements are being implemented. The size of the system may even decrease since, for example, duplicate code could be removed, and therefore one might observe a lightening of the version bars. The requirements following the occurrence of this pattern are expected to be easier to implement.
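The recognition cues given in sections 2.1 to 2.3 for the Fast hack, Time pressure and Refactoring patterns can be written down as simple rules. The following Python sketch is only an illustration of those cues: in this paper the patterns are identified visually by a person, and the `Interval` record, its field names and all threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    """One period of the timeline (hypothetical record, not from the paper)."""
    quality_change: float          # quality at the end minus quality at the start
    size_change: float             # relative size growth, e.g. 0.2 means +20%
    effort_hours: Optional[float]  # None when no requirement is being implemented
    duration_days: float


def classify(interval, low_effort=8.0, high_effort=80.0,
             short_days=5.0, big_growth=0.15):
    """Return a pattern name for one interval; all thresholds are assumptions."""
    if interval.effort_hours is None:
        # Refactoring: quality improves while no requirement is implemented.
        return "refactoring" if interval.quality_change > 0 else "none"
    if interval.quality_change < 0 and interval.effort_hours <= low_effort:
        # Fast hack: little effort, but overall quality declines.
        return "fast hack"
    if (interval.effort_hours >= high_effort
            and interval.duration_days <= short_days
            and interval.size_change >= big_growth):
        # Time pressure: much effort in a short period, significant size growth.
        return "time pressure"
    return "none"
```

In practice the thresholds would have to be calibrated per project, since what counts as little effort or an unusually short period depends on the team and on the requirements.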

2.4 Deterioration pattern Deterioration occurs when the structure of the software becomes progressively worse. It can be recognized by a decline in quality together with increasing requirement implementation times over a longer period (Figure 5).

Figure 5. Deterioration pattern

Software deterioration is inevitable [15]; however, decreasing quality is only half of the trouble. The real problem arises when it becomes more and more difficult to implement requirements, so that implementation takes longer [8]. This increases costs and makes it harder to meet deadlines.

2.5 Other software evolution patterns It is possible to detect other patterns, ranging from trivial ones like growth, reduction and idle to more significant ones like: 1) the copy&paste pattern, in which a requirement takes very little effort but the code grows a lot; 2) the reliance pattern, in which a requirement is implemented on top of other requirements: the requirements overlap, but one takes little effort while the others take a lot.

3. CURRENT SOLUTION

3.1 Requirement timeline A Requirement Timeline depicts the evolution of a project in terms of requirements. It shows the period of each requirement's implementation, with the date on the x-axis and individual requirements on the y-axis. The color represents the time that was spent on implementing the requirement.

The Requirement Timeline shows how the project has evolved by showing when each requirement was implemented. Additionally, it gives an overview of the duration of each requirement's implementation and of the invested effort.

In our current solution [12], data for the visualization is calculated automatically with information retrieval techniques and effort data from the PROM (PRO Metrics) tool [18]. During the development of the project all coding activities and their durations are recorded using non-invasive plug-ins for development environments. This gives us information on how much time was spent on particular classes in the source code. Then, using Latent Semantic Indexing (LSI) [7], each requirement is associated with the set of classes that implement it. Out of these, the most significant ones (those most similar to the requirement) are selected using a similarity threshold of 0.75. Since for each of these classes we have the total time and dates of development, we can calculate the total time and dates of development for each requirement. We select the minimum and maximum dates as starting and ending dates, and aggregate the development time of all developers. This technique allows recovering the period of each requirement's development as well as the total time spent.

3.2 Quality metrics Software quality is a composite property of many internal and external software attributes. There has been a lot of discussion on the meaning of software quality [3, 14]. Now, there is a general consensus [9] that software quality is a property defined by several small-scale and directly measurable attributes. In this research, we use complexity, coupling, and cohesion metrics, as defined by Chidamber and Kemerer [6] (Table 1); such measures are widely accepted by both practitioners and researchers and have been validated by several previous studies [1, 5].

Table 1. Quality metrics

Metric    Definition
CBO       Coupling Between Object classes
LCOM      Lack of Cohesion in Methods
RFC       Response For a Class
WMC       Weighted Methods per Class

4. LIMITATIONS Apart from the issues of quality measures, currently the visualization has some other limitations:

• Since this visualization is based on the LSI algorithm, it is also subject to its limitations. Namely, the linking of requirements to classes is not always precise and full semantics are not considered. A further discussion is out of the scope of this paper; we refer the reader to [7];

• When many requirements overlap (are implemented at the same time), it becomes difficult to identify and differentiate patterns. This problem, however, can be minimized by producing several visualizations for the same period, each having its own set of requirements;

• At present, the detection of patterns is not automatic and requires a person, and is therefore subjective;

• Requirement difficulty is not taken into account, which could result in imprecise identification. Some requirements may need more time just because they are more difficult to implement. This inaccuracy, however, could be reduced if difficulty is specified in some way;

• Our current quality measures focus only on code properties and not on overall design properties. Even if the code is of high quality, some requirements may be difficult to implement because the design is complicated.
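To make the recovery procedure of section 3.1 concrete, the following Python sketch derives the period and total effort of one requirement from per-class data. It is a minimal illustration under stated assumptions, not the implementation used in our tool: the LSI similarities are assumed to be computed elsewhere, the `ClassEffort` record and the function name are hypothetical, and treating a similarity exactly equal to 0.75 as selected is also an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ClassEffort:
    """Development activity recorded for one class (hypothetical record)."""
    first_date: date   # first day the class was worked on
    last_date: date    # last day the class was worked on
    hours: float       # time spent on the class, summed over all developers


def requirement_period(similarities, efforts, threshold=0.75):
    """Recover (start, end, total_hours) for one requirement.

    similarities: {class name: LSI similarity to the requirement},
                  assumed to be precomputed.
    efforts:      {class name: ClassEffort}.
    Returns None when no class reaches the threshold.
    """
    selected = [
        efforts[name]
        for name, similarity in similarities.items()
        if similarity >= threshold and name in efforts
    ]
    if not selected:
        return None
    start = min(effort.first_date for effort in selected)  # minimum date
    end = max(effort.last_date for effort in selected)     # maximum date
    total_hours = sum(effort.hours for effort in selected) # aggregated effort
    return start, end, total_hours
```

Each returned triple corresponds to one bar of the Requirement Timeline: the two dates give its horizontal extent and the total time gives its shade.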

5. CONCLUSIONS AND FUTURE WORK Although it is an approximation, we believe that this visualization provides the following benefits: 1) it allows seeing different patterns of evolution or recurring patterns. Thus we are able to identify these parts and can see how evolution is affected by them; 2) it allows studying evolution from a requirement perspective. Requirements are easier to comprehend since they are written in natural language, thus the whole evolution can be understood better. We can see how requirements impact evolution and vice versa; 3) it allows predicting the course of evolution. After realizing how different patterns affect evolution, we can make predictions about the upcoming patterns and the changes in quality. For example, we can infer that a series of Fast hacks can cause a Deterioration pattern to occur, thus calling for refactoring.

In the future, we plan to work on the identification of additional patterns and on automatic detection. Our final goal is to have the patterns detected and highlighted in the timeline (Figure 6). Thus we could read the whole “story” of evolution.

Figure 6. Evolution story

By having the two views together one can infer more information than by looking at each view separately. In this work we have shown how these views allow identifying patterns of software evolution. Although at present the visualization has a number of limitations, we have at least demonstrated that it can be useful and is worth exploring further. All in all, the strength of the visualization is not so much its novelty as its identification power and ease of reading.

6. REFERENCES
[1] Basili, V., Briand, L., and Melo, W. L. “A Validation of Object-Oriented Design Metrics as Quality Indicators”. IEEE Transactions on Software Engineering, 22(10): 267-271, 1996.
[2] Bennett, K. H., Rajlich, V. T. “Software Maintenance and Evolution: A Roadmap”. In A. Finkelstein (ed.) The Future of Software Engineering, ACM Press, 2000.
[3] Boehm, B. W., Brown, J. R., Kaspar, J. R. et al. “Characteristics of Software Quality”. TRW Series of Software Technology, Amsterdam, North Holland, 1978.
[4] Bosch, J. “Design & Use of Software Architectures: adopting and evolving a product-line approach”. Addison-Wesley Publishing Co., 2000.
[5] Briand, L., Wüst, J. “Modeling Development Effort in Object-Oriented Systems Using Design Properties”. IEEE Trans. on Software Engineering, 27(11): 963-986, 2001.
[6] Chidamber, S., Kemerer, C. F. “A metrics suite for object-oriented design”. IEEE Trans. on Software Engineering, 20(6): 476-493, June 1994.
[7] Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., and Harshman, R. “Indexing by Latent Semantic Analysis”. Journal of the American Society for Information Science, 41(6): 391-407, 1990.
[8] Eick, S. G., Graves, T. L., Karr, A. F., Marron, J. S., Mockus, A. “Does code decay? Assessing the evidence from change management data”. IEEE Transactions on Software Engineering, Volume 27, p. 1-12, Jan 2001.
[9] Fenton, N., Pfleeger, S. L. “Software Metrics: A Rigorous & Practical Approach”. PWS Publishing Company, Boston, 1997.
[10] Fowler, M. “Refactoring: Improving the Design of Existing Code”. Addison-Wesley, 1999.
[11] Van Gurp, J., Bosch, J. “Design Erosion: Problems & Causes”. Journal of Systems and Software, 61(2): 105-119, 2002.
[12] Jermakovics, A., Scotto, M., Sillitti, A., Succi, G. “Lagrein: Visualizing User Requirements and Development Effort”. 15th IEEE International Conference on Program Comprehension, 2007.
[13] Land, R. “Software Deterioration and Maintainability - A Model Proposal”. Second Conference on Software Engineering Research and Practice in Sweden, 2002.
[14] Lanza, M., Ducasse, S. “Polymetric Views - A Lightweight Visual Approach to Reverse Engineering”. IEEE Trans. on Software Engineering, 29(9): 782-795, 2003.
[15] Lehman, M. M., Ramil, J. F., Wernick, P. D., Perry, D. E., and Turski, W. M. “Metrics and Laws of Software Evolution - The Nineties View”. In Proc. of the 4th Int. Symposium on Software Metrics, 1997.
[16] McCall, J. A., Richards, P. K., and Walters, G. F. “Factors in Software Quality”. RADC TR-77-369, Vols I, II, III, US Rome Air Development Center Reports NTIS AD/A-049 014, 015, 055, 1977.
[17] Parnas, D. L. “Software Aging”. In Proceedings of the 16th International Conference on Software Engineering, IEEE Press, 1994.
[18] Scotto, M., Sillitti, A., Succi, G., Vernazza, T. “A non-invasive approach to product metrics collection”. Journal of Systems Architecture, 52(11): 668-675, 2006.
