Incremental effort prediction models in Agile Development using Radial Basis Functions

Raimund Moser (A), Witold Pedrycz (B), Giancarlo Succi (A)
(A) Free University of Bolzano, Italy; (B) University of Alberta, Canada
[email protected], [email protected], [email protected]

Abstract

One of the impediments to the wide dissemination of software estimation and measurement practices is the significant overhead these practices impose on the project and the development team. Despite significant investment in research, lightweight estimation of development effort is still an unsolved problem in software engineering. This study proposes a new, lightweight effort estimation model aimed at iterative development environments such as agile processes. The model is based on Radial Basis Functions. It is evaluated on two semi-industrial projects carried out with a customized version of Extreme Programming (XP). The results are promising and provide evidence that the proposed model can be built incrementally and from scratch for new projects, without resorting to historical data.

1. Introduction

Effort prediction has always been perceived as a major topic in software engineering. The reason is quite evident: many software projects run out of budget and schedule because the development effort was underestimated. Since the pioneering work of Putnam [17], Boehm [5] and Albrecht [2], there have been many attempts to construct software cost prediction models. An overview of current effort estimation techniques, their application in industry, and their drawbacks regarding accuracy and applicability can be found in [12]. The most prominent estimation approach comes in the form of the so-called COCOMO family of cost models [6]. While capturing the essence of project cost estimation in many instances, these models are not well suited to more recent methodologies of software development such as agile approaches and small development teams. Moreover, models such as COCOMO II depend quite heavily on many project-specific settings and adjustments, whose impact is difficult to assess, collect, and quantify [16]. What makes the situation even worse is the fact that in agile processes an effective collection of such metrics and the ensuing tedious calibration of the models are quite unrealistic.

As in other fields of software engineering, a major impediment to the development of effort estimation models is the scarcity of experimental data. Many case studies, surveys and experiments on effort prediction found in the literature suffer from one or more of the following drawbacks:
• Since data coming from industrial environments are very limited, different studies have used the same dataset (for example the COCOMO'81 or Kemerer dataset) for analysis and validation, raising concerns about the generalization capabilities and/or bias of the findings [14].
• Manually collected data are error prone and unreliable. Moreover, developers do not like to spend their time on activities other than development; therefore, asking them to trace their own development effort is likely to produce data of poor quality [11].
• Traditional effort estimation models are static, as they are built at one point in time using historical data. However, in agile development, with its fast plan-implement-release cycle, there is a risk that such models become outdated and cannot adapt to the high volatility of requirements, technologies, personnel or other factors.

As far as we know, no specific models have been developed for agile and iterative development processes. Only a few studies deal with the idea of updating or refining prediction models during project evolution and of using the effort of previous development phases as a predictor variable. Closest to our work is a recent study [20] that incorporates feedback cycles and the possibility of iterative refinement into a hybrid cost estimation model. MacDonell and Shepperd [13] use project effort of previous phases of development as a predictor for a simple regression model and show that it yields better results than expert opinion. However, both studies do not address the peculiarities of agile processes, and both use a different modeling approach.

In general, traditional effort estimation works as follows: some predictor variables are collected or estimated at the beginning of a project and fed into a model. The model, which is usually built upon historical data from similar projects, predicts the total development effort. This approach is reasonable for traditional, waterfall-like development processes, where common predictor variables such as function or feature points, software size, formal design specifications, design documents, etc. are known at the beginning of a project and typically do not change much throughout the project; it is not the case for agile development processes. In agile development, a project is realized in iterations and requirements usually change from one iteration to the next. At the end of each iteration, developers release the software to the customer, who may request new features as well as changes to or removal of already implemented functionality. At the beginning of the next iteration developers negotiate requirements with the customer and develop a plan (design document) for that iteration (in XP this process is referred to as the planning game [3]). Therefore, the standard predictor variables proposed in the literature, in particular those derived from design documents, are only known at the beginning of the next development iteration and not a priori for the whole project.

Being cognizant of the existing challenges outlined above, the key objectives of our study are as follows:
• We use a novel, non-invasive, tool-based approach for collecting fine-grained effort data in a close-to-industrial, XP-like environment.
• We propose a new type of prediction model, suited to iterative processes and referred to as an incremental model.
• We carry out a thorough experimental validation of a specific implementation of the incremental model using Radial Basis Functions (RBF).

Given these objectives, we aim to answer the following research question: In agile, iterative software development processes, are incremental effort prediction models efficient for iterative effort prediction, and do they perform better than traditional models? Our proposed incremental approach addresses a crucial point in effort estimation: in general, managers tend to be over-optimistic and over-confident in estimation and scheduling, and are normally reluctant to move away from initial estimates and schedules when progress slips [15]. An estimate should be dynamic, since more information becomes available as the project progresses.

The remainder of the paper is organized as follows. In Section 2 we propose and elaborate on the concept of incremental prediction models. Section 3 discusses the use of RBF models for effort prediction. In Section 4 we present a case study, and in Section 5 we discuss the limitations of this research. Finally, in Section 6 we draw the conclusions.

2. Incremental prediction models

There are crucial issues in the development and utilization of traditional, monolithic effort estimation models, as described earlier, that have to be clearly underlined. First, the choice of predictor variables is highly demanding, as at the beginning of the project we may not know which variables could prove to be good effort estimators. While we might be tempted to collect a large number of variables (in anticipation of compensating for this shortcoming), the process could be time consuming and costly, and in the end lead to overly complicated models. Second, in the construction of global models we rely on historical data and/or expert opinion. Given the unstable and highly non-stationary environment of software development, this may lead to models whose predictive capabilities are questionable. Moreover, the software industry is moving in a direction where projects are never completed but constantly evolve, with new updates and deliveries in response to market demands. In such a scenario it is not obvious when to freeze the project for model building purposes. Agile software development introduces yet another problem into traditional effort estimation: predictor variables are usually not known at the beginning of a project, but become available only at the beginning of each new iteration. Under these circumstances, a long-term effort estimation model that naturally relies on information available at the beginning of a project seems to be of limited applicability.

Taking into account the limitations pointed out above, we assume a different position by focusing on the incremental mode of model development. The main idea of an incremental prediction model is that it is built after each iteration instead of at the end of a project. Thus, it is able to adapt to any changes during development in a much smoother way than a global model. Moreover, we endow the incremental model with a dynamic character by using the effort reported in the previous iterations as an additional input. The incremental model operates only for iterative effort prediction, as it cannot be used to predict total development effort at the beginning of a project.

Figure 1: Incremental model of iterative effort prediction.

The essence of the incremental model (Figure 1) can be explained as follows:
• Model building: at the end of each iteration a new model is built, using as input the predictor variables of that iteration and the development effort of previous iterations.
• Iterative effort prediction: at the beginning of a new iteration the predictor variables are collected and fed, together with past effort, into the newly built incremental model. The output of the model is an estimate of the cumulative development effort from iteration one to the end of this iteration. A minimal sketch of this loop is given at the end of this section.

When comparing incremental models with global ones, we can highlight the essential differences summarized in Table 1.

Table 1: Comparison between features of incremental and global models.

Feature                                                                  Traditional model   Incremental model
Continuous model evolution and refinement                                No                  Yes
Historic data is needed for model building                               Yes                 No
Automatic adaptation to new business environments/changes of environment No                  Yes
Estimation of total project effort                                       Yes                 No
Estimation of effort for single development iterations                   No                  Yes

Clearly, with an incremental effort prediction model a company cannot estimate the total development effort at the beginning of a project. This may be an important issue for a company that is trying to decide whether or not to take on a given software project. However, in today's software industry companies often negotiate with customers which and how many features have to be developed for a given budget and period of time. In such scenarios an incremental model may help a software firm obtain a reliable estimate of the number of features that can be implemented within a single development iteration.
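The sketch below illustrates the loop described in the bullets above. It is a minimal illustration, not the implementation used in this study: the names (run_incremental_prediction, fit_model) and the data layout (one row per class, with the same classes tracked across iterations) are assumptions made for the example; any regression technique can be plugged in as fit_model (Section 3 uses an RBF network).

import numpy as np

def run_incremental_prediction(iterations, fit_model):
    # iterations: list of (metrics, effort) pairs, one per development
    # iteration; metrics has one row per class, effort one value per class
    # (cumulative coding effort up to the end of that iteration).
    # fit_model(X, y) returns a callable that predicts effort from X.
    model = None
    predictions = []
    for i, (metrics, effort) in enumerate(iterations):
        past_effort = iterations[i - 1][1] if i > 0 else np.zeros(len(effort))
        X = np.column_stack([metrics, past_effort])
        if model is not None:
            # Iterative effort prediction: the model built at the end of the
            # previous iteration estimates this iteration's cumulative effort.
            predictions.append(model(X))
        # Model building: rebuild the model at the end of this iteration so
        # that it is ready for the next one.
        model = fit_model(X, effort)
    return predictions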

3. An implementation using Radial Basis Functions (RBF) and design metrics

In the previous section we described the general operation mode of an incremental effort prediction model. For a real implementation we need, first, a mathematical model for mapping input variables to effort and, second, a set of predictor variables. Radial Basis Functions (RBF) provide a flexible way to generalize linear regression functions and exhibit properties that make them suitable for modeling software engineering data. An RBF network operates as follows: first, input data are mapped in a non-linear way using basis functions (we use Gaussians); then, the final response variable is built as a linear combination of the outputs of the basis functions with the network's weight vector. RBFs have been used for effort estimation with very promising results [18]. The difficulty with RBF models is the model selection process: RBFs are completely specified by 3m parameters, where m is the number of basis functions (we also use the terms receptive fields or neurons), which may be considerably high for fast-fluctuating target functions. Usually a modeler has to specify the type and number of radial basis functions and the center and spread of each basis function. In general, the performance of RBF models depends highly on the chosen network architecture (number of layers, number of neurons, activation function, number of receptive fields, spread, etc.). Since there is no general theory behind the structural optimization of the network topology, it is developed through a trial and error process. For each set of parameters that specifies the model we compute the leave-one-out cross-validation error [4]. We keep the set that produces the lowest error, and this topology of the network is deemed optimal: we start with one receptive field and add one at a time until the cross-validation error stops decreasing. We repeat this procedure for a range of spread parameters and keep the spread and number of receptive fields that yield the smallest overall cross-validation error. A sketch of this selection procedure is given at the end of this section.

As predictor variables we use the Chidamber and Kemerer (CK) suite of object-oriented design metrics [7]. The CK metrics have some properties which make them particularly attractive for the kind of prediction model we propose:

• They are widely known by practitioners and in the research community and have been validated by several other researchers.
• For the purpose of model building, the CK metrics can be extracted automatically from source code.

We do not use all six CK metrics as predictors, but exclude the NOC (number of children) and LCOM (lack of cohesion of methods) metrics. We exclude NOC because in both projects it is almost 0 for all classes and hence does not contribute to the variation of the effort data. As for LCOM, several researchers have questioned its meaning and the way it is defined by Chidamber and Kemerer [9].
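A minimal sketch of the RBF model and of the selection procedure described above follows. It is an illustration under stated assumptions rather than the exact implementation used in the study: in particular, the placement of the receptive-field centers (here simply the first m training points) and the candidate spread values are not prescribed above and are chosen only for the example; all names are illustrative.

import numpy as np

def design_matrix(X, centers, spread):
    # One Gaussian receptive field per center, plus a bias column.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.column_stack([np.exp(-d2 / (2.0 * spread ** 2)), np.ones(len(X))])

def fit_rbf(X, y, centers, spread):
    # The weight vector is the least-squares solution of Phi * w = y.
    Phi = design_matrix(X, centers, spread)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return lambda X_new: design_matrix(X_new, centers, spread) @ w

def loo_error(X, y, centers, spread):
    # Leave-one-out cross-validation error (mean squared error).
    errs = []
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        model = fit_rbf(X[mask], y[mask], centers, spread)
        errs.append((model(X[i:i + 1])[0] - y[i]) ** 2)
    return float(np.mean(errs))

def select_rbf(X, y, spreads, max_fields=10):
    # Start with one receptive field and add one at a time until the
    # cross-validation error stops decreasing; repeat for all candidate
    # spreads and keep the configuration with the smallest overall error.
    best_err, best_model = np.inf, None
    for spread in spreads:
        prev = np.inf
        for m in range(1, max_fields + 1):
            centers = X[:m]  # assumed center placement for this sketch
            err = loo_error(X, y, centers, spread)
            if err >= prev:
                break
            prev = err
            if err < best_err:
                best_err, best_model = err, fit_rbf(X, y, centers, spread)
    return best_model

In our setting, X would hold the selected CK metrics (plus past effort) per class and y the coding effort per class.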

4. Case study

The case study concerns two commercial software projects – we refer to them as project A and project B – developed at the VTT Technical Research Centre of Finland in Oulu, Finland. The programming language used in both projects was Java. Project A delivered a production monitoring application for mobile, Java-enabled devices. Project B delivered a project management tool for agile projects. In both projects the development process followed a tailored version of the Extreme Programming practices [1]: in project A two pairs of programmers (four people) and in project B three pairs of programmers (six people) worked for a total of eight weeks. The projects were divided into five iterations, starting with a 1-week iteration, followed by three 2-week iterations, and concluding with a final 1-week iteration. In project A, three of the four developers had an education equivalent to a BSc and limited industrial experience; the fourth developer was an experienced industrial software engineer. Since the team was exposed to the XP process for the first time, a brief training on the XP practices, in particular on the test-first method, was provided prior to the beginning of the project. In project B, four developers were 5th- to 6th-year university students and the two remaining were employees of VTT and, as such, experienced industrial software engineers.

4.1. Data collection process

In both case studies we used our in-house developed tool PROM [19] for automatic and non-invasive data collection. To collect all the needed product and process metrics with the PROM tool we adopted the following data collection procedure:

• Every day at midnight various source code metrics (among them the CK metrics) are extracted from the CVS repository.
• A plug-in for Eclipse (the IDE used by the developers) automatically collects the time spent coding on individual classes and methods.

These measures are integrated and stored in a data warehouse. An analysis tool can easily extract them from the data warehouse and use them for model building and statistical analysis.

4.2. Descriptive statistics of the data

For project A the total coding effort recorded by the PROM tool is about 305 hours. Project A has 1776 lines of code (counted as Java statements in the source code) divided into 30 classes. Project B has 3426 lines of code and 52 classes. The total coding effort for project B is about 664 hours. Table 2 shows descriptive statistics of the collected design metrics and coding effort for both projects.

Table 2: Descriptive statistics of the data (mean ± standard deviation).

            CBO         WMC          RFC          DIT        Effort (h)
Project A   9.8 ±7.1    17.8 ±22.9   25.7 ±19.7   2.3 ±1.2   10.2 ±16.2
Project B   14.9 ±10    14.6 ±15.7   36.6 ±30     2.5 ±1.1   12.75 ±15.3

Due to space constraints we do not report box-plots for the different variables: they show that the data have a few outliers and are highly skewed, two conditions which would be problematic for ordinary least squares regression models but are mitigated by RBF networks.

4.3. Results

Table 3 presents the results for effort prediction using an incremental RBF model: columns 2 to 5 report several criteria for measuring the accuracy of the estimated coding effort per class, and column 6 reports the relative error of the predicted total coding effort per iteration. For assessing prediction accuracy we employ four different criteria, since none of them is reliable when used in isolation [10]: two give the error relative to the true value (MRE) and to the estimate (MER), respectively; one criterion is the usual standard deviation (SD); the last criterion is the percentage of predictions with a relative error of less than 25% (PRED(25%)). In general, the accuracy of predicting total effort per iteration is higher than that of predicting development effort for a single class. This could be explained by the fact that, when summing over all classes, errors due to underestimation and overestimation partly cancel out. The prediction accuracy improves from iteration to iteration, indicating that the incremental model stabilizes during development and that prediction errors decrease and converge. Already at iteration 5 the model provides predictions that are accurate by software engineering standards (MRE around or less than 25%) and precise enough to be of real value for project management purposes [8]. Unfortunately, the two projects under scrutiny have only five development iterations; therefore, we cannot infer whether this trend continues and at which level of prediction accuracy a longer-lasting project would converge. This remains to be addressed in a future study.

Table 3: Prediction of effort per class and total effort per iteration with the use of incremental RBF models.

I    MMRE   MMER   SD    PRED(25%)   Total MRE
Project A
3    65%    59%    2.6   0%          54%
4    42%    66%    2.0   21%         33%
5    14%    14%    0.6   73%         6%
Project B
3    89%    98%    3.1   9%          81%
4    55%    111%   3.3   12%         54%
5    30%    37%    1.5   40%         16%

Legend: I – Development Iteration; MMRE – Median Magnitude of the Relative Error; MMER – Median Magnitude of the Error Relative to the Estimate; SD – Standard Deviation; PRED(25%) – Prediction at a 25% level; Total MRE – Magnitude of the Relative Error of the total development effort.
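For reference, the accuracy criteria of Table 3 can be computed as in the following sketch from the actual and predicted per-class coding effort of one iteration. Interpreting SD as the standard deviation of the prediction errors is our assumption, since the text above only calls it "the usual standard deviation"; all names are illustrative.

import numpy as np

def accuracy_criteria(actual, predicted):
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    abs_err = np.abs(actual - predicted)
    mre = abs_err / actual           # error relative to the true value
    mer = abs_err / predicted        # error relative to the estimate
    return {
        "MMRE": float(np.median(mre)),
        "MMER": float(np.median(mer)),
        "SD": float(np.std(actual - predicted)),  # assumed: SD of the errors
        "PRED(25%)": float(np.mean(mre < 0.25)),
        "Total MRE": float(abs(actual.sum() - predicted.sum()) / actual.sum()),
    }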

In order to compare our proposed incremental approach with the traditional, monolithic approach we proceed as follows, imitating a real-world scenario: we combine both data sets (from project A and project B) to build an RBF model, using as predictors the CK metrics extracted from the source code at project conclusion and as dependent variable the total coding effort. We then use this model to predict, in an iterative way, the coding effort for both projects. The result is that the predictions provided by the traditional model are in all cases, both at class and at system level, less accurate, sometimes by an order of magnitude, than the ones obtained with the incremental model: for project A the average relative error of estimating the total coding effort is around 280%, while for project B it is around 295%. These numbers indicate that traditional models fall far short of providing useful effort predictions for the projects under scrutiny. Overall, the results enable us to answer our research question: we can state that, for the projects under scrutiny, incremental models provide accurate effort estimates for later development iterations (MRE around or less than 25%) and perform far better than traditional models.
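The comparison described above can be sketched as follows. The frozen global model (trained once on the combined end-of-project data of both projects, for instance with the RBF selection routine sketched in Section 3) is passed in as model; the data layout is an assumption of the example.

import numpy as np

def per_iteration_relative_errors(model, iterations):
    # model: a fitted effort predictor, built once and never updated.
    # iterations: list of (ck_metrics, per_class_effort) pairs, one per
    # development iteration.
    errors = []
    for ck_metrics, effort in iterations:
        predicted_total = float(np.sum(model(ck_metrics)))
        actual_total = float(np.sum(effort))
        errors.append(abs(predicted_total - actual_total) / actual_total)
    return errors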

5. Limitations of this research

This work is a first step towards an understanding of effort prediction models in iterative, agile environments. Needless to say, in order to consolidate the findings of this study and transform them into usable models and recommendations for developers and managers, several replications are required. The proposed approach has been validated on two software projects in a particular XP environment, and all possible threats to external validity have to be considered carefully. In particular, we face the following threats:
• Generalizability of the settings of the case studies: since the development process is a specific version of XP, we cannot conclude that the results obtained from this study also hold in different XP or agile environments.
• Generalization with respect to subjects: the participants of the case studies are in part master students; it is questionable whether they represent the average software developer in industry. Thus, further investigation with a randomized sample of developers is needed to analyze to which degree our findings are biased by the selection of the subjects.
• Most of the participants of the study were exposed to XP for the first time. We do not control for the impact of a learning curve on our results. Whether experienced XP developers would have performed in the same way is left to a future study.

As for the construct validity of this research, there remain some important issues we have to be aware of and clarify in the future: the choice of the CK design metrics as predictor variables may have a crucial impact on the results obtained by incremental and global models. Other choices could favor one approach over the other and lead to different conclusions. In conclusion, we would like to stress that this study is based on data from two semi-industrial projects in an innovative development environment: the results of this research can form a baseline for future research and a starting point for a better understanding of effort prediction models in XP-like development processes.

6. Conclusions

In this study we propose a new approach to lightweight, iterative effort prediction. We have identified a number of reasons why the suitability of monolithic prediction models is limited when dealing with agile software development. We propose an incremental approach and apply it, using RBF networks, to two agile, semi-industrial development projects. The results we obtain are promising: incremental models are stable and convergent, in the sense that their prediction error decreases from iteration to iteration. They can be used right from the start of development without the need for historical data, and they improve prediction accuracy throughout project evolution thanks to their iterative nature. Intelligent data collection and analysis tools allow easy automation of the model building process. At the beginning of a development iteration, such models could be integrated into a planning game in which customers, developers, and managers obtain a first objective and independent cost estimate.

References

[1] P. Abrahamsson, A. Hanhineva, H. Hulkko, T. Ihme, J. Jäälinoja, M. Korkala, J. Koskela, P. Kyllönen, and O. Salo, "Mobile-D: An Agile Approach for Mobile Application Development", Proceedings of the 19th Annual ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA'04), Vancouver, British Columbia, Canada, 2004.
[2] A.J. Albrecht and J.E. Gaffney, "Software function, source lines of code, and development effort prediction", IEEE Transactions on Software Engineering, 9(6): 639-648, 1983.
[3] K. Beck, Extreme Programming Explained: Embrace Change, Addison-Wesley, 1999.
[4] C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1994.
[5] B.W. Boehm, Software Engineering Economics, Prentice-Hall, 1981.
[6] B.W. Boehm, B. Clark, E. Horowitz, R. Madachy, R. Shelby, and C. Westland, "Cost Models for Future Software Life Cycle Processes: COCOMO 2.0", Annals of Software Engineering, 1995.
[7] S. Chidamber and C.F. Kemerer, "A metrics suite for object-oriented design", IEEE Transactions on Software Engineering, 20(6): 476-493, 1994.
[8] S.D. Conte, H.E. Dunsmore, and V.Y. Shen, Software Engineering Metrics and Models, Benjamin/Cummings Publishing Company, Inc., 1986.
[9] S. Counsell, S. Swift, and J. Crampton, "The interpretation and utility of three cohesion metrics for object-oriented design", ACM Transactions on Software Engineering and Methodology, 15(2): 123-149, 2006.
[10] T. Foss, I. Myrtveit, and E. Stensrud, "A comparison of LAD and OLS Regression for Effort Prediction of Software Projects", Proceedings of the 12th European Software Control and Metrics Conference, 9-15, 2001.
[11] P.M. Johnson, H. Kou, J.M. Agustin, C. Chan, C.A. Moore, J. Miglani, S. Zhen, and W.E. Doane, "Beyond the Personal Software Process: Metrics collection and analysis for the differently disciplined", Proceedings of the 2003 International Conference on Software Engineering, Portland, Oregon, 2003.
[12] M. Jørgensen and M. Shepperd, "A Systematic Review of Software Development Cost Estimation Studies", IEEE Transactions on Software Engineering, 33(1): 33-53, 2007.
[13] S. MacDonell and M.J. Shepperd, "Using Prior-Phase Effort Records for Re-estimation During Software Projects", Proceedings of the Ninth International Software Metrics Symposium (METRICS'03), Sydney, Australia, 2003.
[14] C. Mair, M. Shepperd, and M. Jørgensen, "An Analysis of Data Sets Used to Train and Validate Cost Prediction Systems", Proceedings of the 1st International Workshop on Predictor Models in Software Engineering (PROMISE 2005), St. Louis, MO, USA, 2005.
[15] S. McConnell, "Avoiding classic mistakes", IEEE Software, pp. 111-112, 1996.
[16] T. Menzies, D. Port, Z. Chen, J. Hihn, and S. Stukes, "Validation methods for calibrating software effort models", Proceedings of the 27th International Conference on Software Engineering, St. Louis, MO, USA, 2005.
[17] L.H. Putnam, "A general empirical solution to the macro software sizing and estimation problem", IEEE Transactions on Software Engineering, 4(4): 345-381, 1978.
[18] M. Shin and A.L. Goel, "Empirical Data Modeling in Software Engineering Using Radial Basis Functions", IEEE Transactions on Software Engineering, 26(6): 567-576, 2000.
[19] A. Sillitti, A. Janes, G. Succi, and T. Vernazza, "Collecting, Integrating and Analyzing Software Metrics and Personal Software Process Data", Proceedings of the 29th EUROMICRO Conference, Antalya, Turkey, 2003.
[20] A. Trendowicz, J. Heidrich, J. Münch, Y. Ishigai, K. Yokoyama, and N. Kikuchi, "Development of a hybrid cost estimation model in an iterative manner", Proceedings of the 28th International Conference on Software Engineering, Shanghai, China, 2006.
