ACM SIGSOFT Software Engineering Notes
Page 1
May 2012 Volume 37 Number 3
Computational Intelligence in Software Cost Estimation: An Emerging Paradigm Tirimula Rao Benala
Satchidananda Dehuri
Rajib Mall
Department of Computer Science and Engineering,
Department of Information & Communication Technology
Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam, India
Fakir Mohan University, Vyasa Vihar, Balasore-756019, India.
Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Email:
[email protected]
E-mail:
[email protected]
E-mail:
[email protected]
ABSTRACT One of the key reasons for the failure of project estimation is the selection of inappropriate estimation models. Further, noisy data poses a challenge to building accurate estimation models. Software cost estimation (SCE) is therefore a challenging problem that has attracted many researchers over the past few decades. In recent times, the use of computational intelligence (CI) methodologies for software cost estimation has gained prominence. This paper reviews some of the commonly used CI techniques, analyzes their application in software cost estimation, and outlines the emerging trends in this area.
Categories and Subject Descriptors D.2.9 [Software Engineering]: Management – cost estimation.
General Terms Management, Measurement, Economics.
Keywords Software cost estimation, Computational Intelligence, software effort prediction, Neural Networks, Evolutionary Computation, and Fuzzy Logic.
1. INTRODUCTION The last two decades have witnessed a paradigm shift in the field of software engineering. Many software projects are now developed by geographically distributed teams that may span the globe, owing to the competitive open market. According to the Standish Group's report "CHAOS Summary 2009", only 32% of projects succeed; that is, they are delivered on time, within budget, and with the required features and functionality. With the growing size and complexity of software projects, the project manager's task is becoming very tough, as clients always look for high-quality software at a competitive price, delivered on time. In this context, accurate software cost estimation plays an important role in successful project completion. Software cost estimation involves software effort prediction, choosing appropriate software sizing techniques, ascertaining productivity figures for the project, and calculating the impact of project execution on software cost estimation. The basic activities in software effort and schedule estimation were presented in [35] and are shown in Figure 1.
Figure 1: Basic Software Project Estimation
1.1 Software Cost Estimation Models Software cost estimation techniques can be classified into the following six categories: parametric models, including COCOMO (Constructive Cost Model) [5, 13], SLIM (Software Life Cycle Management) [36], and SEER-SEM (Software Evaluation and Estimation of Resources–Software Estimating Model) [22]; expert judgment, including the Delphi technique [15] and work-breakdown-structure-based methods [23, 24, 49]; learning-oriented techniques, including machine learning methods [16, 33, 40] and analogy-based estimation [1, 14, 41]; regression-based methods, including ordinary least squares regression [9, 29] and robust regression [30]; dynamics-based models [31]; and composite methods [6, 32]. All estimation methods are based upon five core metrics, namely quality, quantity, time, cost, and productivity, introduced in [45] and depicted in Figure 2 as proposed by Harry M. Sneed [44], in which four factors are in an antagonistic relationship. The value of each factor is given by a square, with the area of the square (or, for better illustration, the circumference) representing the productivity of the developing organization. The productivity is considered invariant over the project duration.
DOI: 10.1145/180921.2180932
Figure 2: Sneed’s devil’s square [8] Determining the right time for estimation plays an important role in reducing the effort in the execution of an IT project. Figure 3 depicts an overview of possible estimation milestones, where milestones 1 to 7
http://doi.acm.org/10.1145/180921.2180932
have the following meanings: 1) the end of the feasibility study, 2) the project start, 3) the end of requirement analysis, 4–6) the end of IT design until the end of the project, and 7) the project postmortem. The testing effort should be reduced with time; otherwise, the requirement analysis milestone has to be revisited.
Figure 3: Milestones of Estimation in IT Projects [19]
1.2 Why Computational Intelligence in Software Cost Estimation? Software cost is determined by many factors that depend on how the software solution is achieved, namely: fresh development of an entire project from scratch, implementation and customization of a commercial off-the-shelf product, porting of software, migration of software, conversion projects, software maintenance projects, defect fixing, operational support, fixing odd behavior, software modification, functional expansion, agile software development projects, and web projects. Each factor has its own characteristics, which makes the whole cost estimation system a more complex, nonlinear, stochastic system and thus poses many problems and challenges for researchers and engineers. Even though traditional estimation theories have been evolving over the last several decades, they are still not considered satisfactory. Apart from this, software development is largely a human endeavor; therefore, human reactions to specific situations and human feelings should also be taken into account. SCE systems should thus be based on "intelligent" techniques, and one major attempt in this regard is the use of CI. Since CI emerged several decades ago, it has found wide acceptance in many areas of research. It is claimed as the successor of artificial intelligence (AI) and a way for future computing [50]. CI methodologies facilitate solving difficult problems. CI essentially simulates "intelligence" by the use of certain computational methods, which include artificial neural networks (ANNs), fuzzy systems (FS), evolutionary computation (EC) algorithms, and swarm intelligence (SI). A taxonomy of computational intelligence techniques is depicted in Figure 4. Individual techniques from these CI paradigms have been applied successfully to solve nonlinear, time-varying, correlated, discontinuous, complex, probabilistic real-world problems. The current trend, however, is to evolve hybrids of paradigms incorporating probabilistic techniques, since no single paradigm is superior to the others in all situations. In doing so, we capitalize on the respective strengths of the components of the hybrid CI system and eliminate the weaknesses of the individual components [11]. There are several reasons why CI methodologies can be an attractive tool in SCE systems. As mentioned earlier, SCE systems are large, complex, nonlinear, stochastic systems; therefore, it is hard to find optimal feature weighting and project selection in any cost estimation model. CI provides a feasible way to obtain either optimal or suboptimal solutions. Most CI methodologies do not require precise models; sometimes, no model is needed at all. In a broader definition [11], CI is the study of adaptive mechanisms that enable or facilitate intelligent behavior in complicated, uncertain, and changing environments. CI methodologies can be adapted to dynamic changes in project parameters, and SCE actions can be taken based on real-time datasets and historical reasoning. Researchers have done much work on applications of CI in the field of SCE. In this paper, we highlight some of the commonly used CI paradigms in SCE systems for the study of optimal feature weighting and project selection.
Figure 4: Computational Intelligence Taxonomy
2. COMPUTATIONAL INTELLIGENCE FOR SOFTWARE COST ESTIMATION Recent trends show that CI techniques are emerging as robust optimization techniques for solving highly nonlinear, discontinuous, correlated, and complex problems. As mentioned earlier, a wide variety of CI techniques are in use, including neural networks, evolutionary computation, and fuzzy logic, which have proven to be powerful tools for meeting the aforesaid objectives in SCE. Although numerous classical approaches have been proposed, certain difficulties are yet to be addressed, such as the non-normal characteristics (including skewness, heteroscedasticity, and excessive outliers) of software engineering datasets [37] and the increasing sizes of the datasets [42]. Strong non-normal characteristics lead to low prediction accuracy and high computational expense [17]. To alleviate these drawbacks, CI techniques can be employed to select promising data points. This paper is an attempt to explore the CI techniques that have been used in the field of software cost estimation.
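To make the recurring notions of feature weighting and project selection concrete, a minimal analogy-based estimator can be sketched as below. All attribute values, weights, and projects here are invented for illustration; the feature weights are exactly the quantities that the EC/SI techniques surveyed later are used to optimize.

```python
# Analogy-based effort estimation with feature weights (a minimal sketch).
# Each historical project is (attribute vector, effort in person-months).
import math

def weighted_distance(x, y, w):
    # Weighted Euclidean distance between two attribute vectors.
    return math.sqrt(sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y)))

def estimate_by_analogy(target, history, weights, k=2):
    # Pick the k most similar historical projects and average their effort.
    ranked = sorted(history, key=lambda p: weighted_distance(target, p[0], weights))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / k

# Hypothetical history: (size in FP, team size, language level) -> effort (PM).
history = [
    ([120, 3, 1], 14.0),
    ([300, 5, 2], 40.0),
    ([110, 2, 1], 12.0),
    ([500, 8, 2], 85.0),
]
weights = [1.0, 0.5, 0.2]   # hand-picked here; EC/SI techniques would tune these
print(estimate_by_analogy([115, 3, 1], history, weights, k=2))
```

With these invented weights, the two nearest neighbours of the target are the two small projects, so the estimate is their mean effort.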
2.1 Performance Metrics Suitable performance criteria are essential to measure the prediction accuracy of cost estimation models. In the literature, several quality metrics have been proposed to assess the performance of estimation methods. More specifically, Mean Magnitude of Relative Error (MMRE), PRED(0.25), and Median Magnitude of Relative Error (MdMRE) are three popular metrics. PRED(0.25) identifies cost
estimations that are generally accurate, while MMRE is not always reliable. Nevertheless, MMRE has become accepted as the de facto standard in the software cost estimation literature. In addition to the metrics mentioned above, several other metrics are available in the literature, such as Adjusted Mean Square Error (AMSE), Standard Deviation (SD), Relative Standard Deviation (RSD), and Logarithmic Standard Deviation (LSD). In the following, we explain the metrics.
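The three popular metrics have standard definitions: for project i, MRE_i = |actual_i − estimated_i| / actual_i; MMRE is the mean of the MRE values, MdMRE their median, and PRED(0.25) the fraction of projects with MRE ≤ 0.25. A small sketch (the effort values are hypothetical):

```python
# Standard accuracy metrics used in software cost estimation.
def mre(actual, estimated):
    # Magnitude of Relative Error for a single project.
    return abs(actual - estimated) / actual

def mmre(actuals, estimates):
    # Mean Magnitude of Relative Error.
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)

def mdmre(actuals, estimates):
    # Median Magnitude of Relative Error.
    errors = sorted(mre(a, e) for a, e in zip(actuals, estimates))
    n = len(errors)
    return errors[n // 2] if n % 2 else (errors[n // 2 - 1] + errors[n // 2]) / 2

def pred(actuals, estimates, level=0.25):
    # Fraction of projects whose MRE is within `level` (e.g. 25%).
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(1 for e in errors if e <= level) / len(errors)

actual = [100, 200, 400, 800]     # hypothetical effort in person-months
estimate = [110, 150, 420, 1000]
print(mmre(actual, estimate), mdmre(actual, estimate), pred(actual, estimate))
```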
2.2 Evolutionary Computation (EC) for Feature Weighting and Project Selection EC comprises nature-inspired techniques; it is essentially an umbrella of methods that include genetic algorithms (GAs) [20], genetic programming [26], evolutionary programming [12], evolution strategies [39, 46], differential evolution [47], and so on. They imitate natural processes such as natural evolution under the principle of survival of the fittest. These techniques have been used to solve the problem of selecting promising data points by simultaneous optimization of feature weights and project selection. Different EC algorithms have similar implementation and algorithmic characteristics. The general framework of EC consists of three fundamental operations and two optional operations. In an EC algorithm, the first step is initialization. The algorithm then enters evolutionary iterations with two operational steps, namely fitness evaluation and selection, and population reproduction and variation. The new population is then evaluated again, and the iterations continue until a termination criterion is satisfied. Besides these necessary steps, EC algorithms sometimes additionally perform an algorithm adaptation procedure or a local search (LS) procedure; EC algorithms with LS are known as memetic algorithms (MAs) [34]. The development and application of EC algorithms has been one of the fastest growing fields in computing science [51]. In this context, promising research results were reported in [27], where a GA was used for simultaneous optimization of feature weights and project selection. The fitness of an individual indicates the quality of the solution it represents. As mentioned in Section 2.1, of the three performance metrics we discussed, MMRE is the de facto standard for software cost estimation; the reciprocal of MMRE is often chosen as the fitness function in EC as well as SI [27].
EC and SI techniques are incorporated as part of a hybrid system, especially EC-ANN or SI-ANN, to fine-tune feature weights or project attributes and to select projects similar to the one under consideration; the effort of the target project is obtained by training, testing, and validating the selected similar projects, with optimized feature weights, in an ANN. Figure 5 shows a flow for predicting effort using evolutionary computation. After several epochs, the neural network finds the best model and predicts the final effort in Person-Months (PM).
Figure 5: EC based Effort Prediction
2.3 Artificial Neural Networks for Effort Prediction In this section, we first discuss a few basic concepts of artificial neural networks (ANNs). Subsequently, we discuss implementation aspects of ANNs.
2.3.1 Artificial Neural Networks Artificial Neural Networks (ANNs) are massively parallel and distributed information processing systems, composed of simple processing units (neurons), that have the intrinsic property of storing knowledge based on experience and making it available for use [18]. An ANN simulates the brain's information processing capacity and is one of the most significant CI technologies. There are essentially two kinds of ANN structures: acyclic or feed-forward networks and cyclic or recurrent networks. Feed-forward networks are further divided into single-layer feed-forward networks, which have no hidden units, and multi-layer feed-forward networks, which have one or more hidden units [38]. An ANN consists of a set of processing elements, also known as neurons or nodes, which are interconnected. It can be described as a directed graph in which each node performs a transfer function of the following form:

$y_i = f\left(\sum_{j} w_{ij} x_j - \theta_i\right)$,

where $y_i$ is the output of node $i$, $x_j$ is the $j$th input to the node, and $w_{ij}$ is the weight of the connection between nodes $i$ and $j$; $\theta_i$ is the threshold (or bias) of the node. Usually $f$ is nonlinear, such as a threshold, sigmoid,
or Gaussian function. Figure 6 depicts a possible network architecture that is suitable for software cost estimation. The inputs to this network are project attributes, namely unadjusted function points, development language, and project type, to name a few. At first, the system is initialized with random weights. The network then learns the relationships implicit in a set of data by adjusting the connection
weightings when presented with a combination of inputs and outputs known as the training set. A number of training algorithms can be used to train the network, each with its particular strengths; the most common learning algorithm used by software metrics researchers is back-propagation. Figure 6 illustrates a possible multi-layer neural-network effort prediction model, whose inputs constitute the project attributes, together known as a data point. Let us discuss how ANNs can be used in effort prediction.
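A toy back-propagation network of the kind just described can be sketched as follows. This is a didactic sketch, not the implementation used in the surveyed papers: the architecture (one hidden layer of three sigmoid units), learning rate, and the normalized training data are all invented for illustration.

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyMLP:
    """Single-hidden-layer perceptron trained by back-propagation."""
    def __init__(self, n_in, n_hidden, lr=0.5, seed=1):
        rnd = random.Random(seed)                      # fixed seed: reproducible run
        self.lr = lr
        self.w1 = [[rnd.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rnd.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        self.h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
                  for ws, b in zip(self.w1, self.b1)]
        return sigmoid(sum(w * h for w, h in zip(self.w2, self.h)) + self.b2)

    def train_step(self, x, target):
        y = self.forward(x)
        # Output-layer delta for squared error with a sigmoid output unit.
        delta_out = (y - target) * y * (1 - y)
        for j, h in enumerate(self.h):
            delta_h = delta_out * self.w2[j] * h * (1 - h)   # hidden-layer delta
            self.w2[j] -= self.lr * delta_out * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= self.lr * delta_h * xi
            self.b1[j] -= self.lr * delta_h
        self.b2 -= self.lr * delta_out

# Hypothetical training set: (normalized size, normalized team size) -> normalized effort.
data = [([0.1, 0.2], 0.15), ([0.4, 0.5], 0.45), ([0.9, 0.8], 0.85)]
net = TinyMLP(n_in=2, n_hidden=3)
for epoch in range(3000):
    for x, t in data:
        net.train_step(x, t)
print(net.forward([0.4, 0.5]))   # should approach the 0.45 training target
```

In practice the cost drivers would be normalized project attributes and the output de-normalized back into Person-Months.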
Figure 6: Multi-layer perceptron
2.3.2 Neural Network Structure The structure of the neural network is customized to the dataset under consideration. The process of building the ANN structure is as follows. The training process of an ANN is a nonlinear, unconstrained optimization problem, in which a search takes place for a minimum of the error function between the network output and the desired output. This cost function is traditionally the mean square error (MSE), the average of the instantaneous square error values over all training patterns presented to the ANN. Equation (4) defines the instantaneous square error, while Equation (5) computes the MSE. Here N is the number of neurons in the output layer, $d_k(t)$ is the desired output of neuron k when it is provided an input at time t, $y_k(t)$ is the output computed by neuron k for the same input at time t, and M is the number of training patterns [10].

$e(t) = \sum_{k=1}^{N} \left(d_k(t) - y_k(t)\right)^2$    (4)

$\mathrm{MSE} = \frac{1}{M} \sum_{t=1}^{M} e(t)$    (5)

2.3.3 Effort Prediction The feed-forward multi-layer network with back-propagation learning is the most commonly used structure in the field of software cost estimation. The network contains neurons arranged in layers, with each neuron connected to every neuron of the adjacent lower layer (see Fig. 6). The cost drivers or project attributes are fed as inputs at the input layer, propagate across subsequent layers of processing elements (neurons), and generate an effort estimate in Person-Months (PM) at the output layer. Each neuron computes an activation function and passes the resultant value to its output. The activation function is designed to meet two desired functionalities. First, it activates for the right input and deactivates when a wrong input is given. Second, it provides nonlinearity, so that complex nonlinear problems can be solved. Several activation functions are in popular use: the sigmoid function, the threshold function, and the logistic function. Since sigmoid functions are differentiable, which is an essential requirement of the weight-learning process, they are widely used for software cost estimation. The ANN follows a two-step process. In step 1, three-fold validation is employed for training the nonlinear adjustment (the ANN). This is followed by the prediction stage in step 2, in which a new project is presented to the trained hybrid system (EC-ANN or SI-ANN); the optimal class and its corresponding projects with optimal feature weights are presented to the ANN for final prediction. Figure 5 schematically presents this prediction mechanism.
2.4 Fuzzy Systems for Software Effort Prediction
2.4.1 Fuzzy Systems
A fuzzy system is a classical CI technology. The term fuzzy denotes imprecision or uncertainty. In contrast with "crisp logic", in which there are only two possible values, true or false, "fuzzy logic" reasons approximately, with a certain degree of truth or falsity. Fuzzy theory emerged from fuzzy sets, which culminated from the effort of Zadeh [52], who introduced the term fuzzy logic in his seminal work "Fuzzy sets", describing the mathematics of fuzzy set theory. The basic concepts of graduation and granulation form the core of fuzzy logic and are its principal distinguishing features. Graduated granulation (also known as fuzzy granulation) is a unique feature of fuzzy logic, inspired by the way in which humans deal with complexity and imprecision [53, 54, 55]. An instance of granulation is the concept of a linguistic variable, a concept coined by Zadeh [56]. A simple example of a linguistic variable is shown in Fig. 7. Today, the concept of a linguistic variable is used in almost all applications of fuzzy logic for solving complex real-world problems. Granulation may be interpreted as a form of information compression of variables and input/output relations. An important related concept is that of a granular value [58]. More precisely, consider a variable X which takes values in U, and let u be a value of X. Informally, if u is known precisely, then u is referred to as a singular (point) value of X. If X is not known precisely, but there is some information which constrains the possible values of u, then the constraint on u defines a granular value of X (Fig. 8). For example, if all that is known about u is that it is contained in an interval [a,b], then [a,b] is a granular value of X. A granular variable is a variable which takes granular values. In this sense, a linguistic variable is a granular variable which carries linguistic labels. It should be noted that a granular value of Age is not restricted to young, middle-aged, or old; for example, "not very young" is an admissible granular value of Age [56, 57, 60].
Figure 7: Granulation of Age; Young, middle-aged, and old are linguistic (granular) values of age [60]
Figure 8: Singular and Granular Values [60]
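The granulation of Age shown in Figure 7 can be sketched with simple trapezoidal membership functions. The breakpoints below are illustrative assumptions, not values taken from the paper; each linguistic value maps a crisp age to a degree of membership in [0, 1].

```python
# Trapezoidal membership functions for the linguistic variable Age,
# illustrating graduation (degrees of membership) and granulation
# (young / middle-aged / old as granular values).
def trapezoid(x, a, b, c, d):
    # Membership rises on [a, b], is 1 on [b, c], and falls on [c, d].
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Illustrative breakpoints (in years), chosen only for this sketch.
def young(x):       return trapezoid(x, -1, 0, 25, 40)
def middle_aged(x): return trapezoid(x, 25, 40, 55, 70)
def old(x):         return trapezoid(x, 55, 70, 120, 121)

for age in (20, 30, 45, 80):
    print(age, young(age), middle_aged(age), old(age))
```

An age of 30, for instance, belongs partly to "young" and partly to "middle-aged", which is exactly the graduated overlap Figure 7 depicts.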
2.4.2 Use of Fuzzy Systems for Software Cost Estimation Measurement in software engineering is challenging, as many software attributes are still qualitative rather than quantitative, such as portability, maintainability, reliability, migration of software, conversion projects, defect fixing, software modification, fixing odd behavior, functional expansion, etc. Their evaluation depends primarily on expert judgement. The qualitative issue is related to the scale type on which the attributes are measured; the scale types of measures were defined by Stevens in 1946 [43]: nominal, ordinal, interval, ratio, or absolute. The goal of this step is the characterization of a software project by a set of attributes. Selecting a set of attributes that accurately describes a software project is a complex task. The selection of attributes depends on the specific objective function (i.e., software effort prediction) of the particular CI technique. The problem is to identify the attributes that have a significant relationship with effort in a given environment. The general methodology followed by cost estimation researchers and practitioners is to test the correlation between effort and all the attributes for which data are available in the studied environment. This approach does not account for attributes that can largely affect the effort but for which no data have yet been recorded [21]. Briand et al. proposed a t-test procedure to select the set of attributes [7]. Shepperd et al. claimed that this is not a good solution because the procedure cannot model the potential interactions between software project attributes [42]; they consider that statistical methods cannot solve the selection problem in the software cost estimation field. There are two other criteria every relevant and independent attribute must obey: the attribute must be comprehensive, which implies that it is well defined, and the attribute must be operational, which implies that it is easy to measure. Incorporating fuzzy learning procedures into CI techniques appears to be a good way to solve the attribute selection problem.
2.4.3 Effort Prediction Fuzzy logic systems in popular use can be categorized into three types: pure fuzzy logic systems, Takagi-Sugeno fuzzy systems, and fuzzy logic systems with a fuzzifier and a defuzzifier. As most real-world problems produce crisp data as input and expect crisp data as output, the last type is the most widely used. The fuzzy logic system with fuzzifier and defuzzifier was first proposed by Mamdani, and it has been successfully applied to a variety of industrial processes and consumer products [59]; software effort prediction is no exception. The three main steps in applying fuzzy logic to effort prediction are: Step 1, Fuzzification: crisp inputs are converted into fuzzy (linguistic) values. Step 2, Fuzzy Rule-Based Inference: fuzzy logic systems use fuzzy IF-THEN rules; once all crisp input values are fuzzified into their respective linguistic values, the inference engine accesses the fuzzy rule base to derive linguistic values for the intermediate and output linguistic variables. Step 3, Defuzzification: the fuzzy output is converted into a crisp output. Attarzadeh et al. [2] proposed an adaptive software cost estimation model that incorporates Mamdani's fuzzy logic system to handle imprecision and
uncertainty in the software attributes of the COCOMO II model. It produced more accurate results than COCOMO II, with MMRE = 0.334110513 and PRED(25%) = 35. Ahmed et al. [3] proposed an innovative Type-2 Fuzzy Logic System (FLS) framework for handling imprecision and uncertainty, and evaluated the performance of a prediction system developed with it when size is provided as a precise but uncertain input. The architecture depicted in Figure 9 shows that the prediction system consists of two stages: nominal effort prediction and EAF (Effort Adjustment Factor) prediction. The outputs of the two stages are merged (multiplied) to produce the actual effort. This architecture is built on top of Ahmed et al.'s earlier architecture, which addresses imprecision in estimating the nominal and actual efforts [4, 48]. In their experiment, Ahmed et al. explicitly provided the experimental details for the nominal effort prediction stage, because uncertainty in size affects nominal effort prediction. The EAF prediction stage is simpler because no uncertainty is assumed in the cost drivers; therefore, one can easily build a separate FLS for each cost driver, as in Ahmed et al. [4].
Figure 9: Type-2 FLS based Effort Prediction System [3]
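The three-step fuzzify / infer / defuzzify pipeline described above can be sketched in a few lines. Everything concrete here is an assumption for illustration: the membership breakpoints, the single-antecedent rule base, and the weighted-average defuzzifier (a common simplification of centroid defuzzification); it is not the system of [2] or [3].

```python
# A minimal Mamdani-style fuzzifier -> rule base -> defuzzifier for effort
# prediction. Breakpoints and rule consequents are invented for this sketch.
def tri(x, a, b, c):
    # Triangular membership function peaking at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify_size(kloc):
    # Step 1: a crisp size becomes degrees of membership in linguistic values.
    return {
        "small":  tri(kloc, -1, 0, 50),
        "medium": tri(kloc, 20, 60, 100),
        "large":  tri(kloc, 80, 150, 400),
    }

# Step 2: rules of the form IF size is small THEN effort is low; each
# consequent fuzzy set is represented here only by its centre (in PM).
RULES = {"small": 20.0, "medium": 120.0, "large": 400.0}

def predict_effort(kloc):
    # Step 3: defuzzify by a weighted average of the rule consequents.
    degrees = fuzzify_size(kloc)
    num = sum(degrees[label] * RULES[label] for label in RULES)
    den = sum(degrees.values())
    return num / den if den else 0.0

print(predict_effort(40.0))   # a 40 KLOC project is partly small, partly medium
```

A 40 KLOC input fires the "small" rule at degree 0.2 and the "medium" rule at degree 0.5, so the crisp output lies between the two consequent centres.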
3. DATASETS AND THEIR VALIDATION There are ten widely used datasets in the field of software cost estimation, as shown in Table 1. Not all of the datasets in Table 1 can be directly applied to every CI technique; making a dataset ready for use with a particular technique is known as data validation. Proper data validation is more important than data analysis, as inconsistent outputs may confuse the estimator. There may be several reasons for this. Some of the attribute information collected may not be useful for particular techniques, and some techniques (e.g., OLS regression) may not work well if values are missing for particular attributes. For categorical attributes, a missing-value flag has to be assigned if more than 15% of the values are missing. Only the attributes that are known at the time of estimation are taken into account; for example, Person-Months (PM) should not be considered, as it is not known until the technique is applied and a solution is obtained in terms of PM.
Table 1: Overview of Software Effort Prediction Datasets
Sr No.  Dataset Name  No. of Attributes  No. of Observations  Resource URL
1       ESA           14                 131                  http://www.esa.int/esapub/bulletin/bullet87/greves87.htm
2       Experience    29                 627                  http://www.fisma.fi
3       ISBSG         14                 1160                 http://www.isbsg.org
4       USP05         14                 193                  http://www.promisedata.org/?p=48
5       Coc81         16                 63                   http://www.promisedata.org/?p=6
6       Cocnasa       16                 93                   http://www.promisedata.org/?p=35
7       EuroClear     11                 90                   http://www.euroclear.com
8       Desharnais    9                  81                   http://www.promisedata.org/?p=9
9       Maxwell       23                 62                   http://www.promisedata.org/?p=108
10      Albrecht      7                  24                   http://promisedata.org/?p=167
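The validation rules described before Table 1 can be sketched as a small check over a dataset: flag categorical attributes with more than 15% missing values and drop attributes that are unknown at estimation time. The record layout and attribute names below are hypothetical.

```python
# Data-validation sketch: flag categorical attributes with >15% missing
# values and keep only attributes known at estimation time.
def validate(records, categorical, known_at_estimation):
    n = len(records)
    usable, flagged = [], []
    for attr in records[0]:
        if attr not in known_at_estimation:
            continue                      # e.g. PM is only known afterwards
        missing = sum(1 for r in records if r[attr] is None)
        if attr in categorical and missing / n > 0.15:
            flagged.append(attr)          # needs a missing-value flag
        else:
            usable.append(attr)
    return usable, flagged

# Hypothetical records; "language" is categorical and half missing.
records = [
    {"fp": 120, "language": "cobol", "pm": 14.0},
    {"fp": 300, "language": None,    "pm": 40.0},
    {"fp": 110, "language": "java",  "pm": 12.0},
    {"fp": 500, "language": None,    "pm": 85.0},
]
usable, flagged = validate(records, categorical={"language"},
                           known_at_estimation={"fp", "language"})
print(usable, flagged)
```

Here "pm" is excluded because it is the quantity being predicted, and "language" is flagged because 50% of its values are missing.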
4. MINIMUM STATISTICS FOR RESULT ANALYSIS When conducting experiments with various techniques, we often want to compare the estimation accuracy of each technique against the others. This can be achieved with statistical methods that determine whether there is a significant difference between the techniques. The accuracy of a model can be assessed by four different means: common accuracy statistics (including MMRE, MdMRE, and PRED(0.25)), box plots, the Wilcoxon signed-rank test, and an accuracy segmentation table. The Kruskal-Wallis test is a robust non-parametric alternative to one-way independent-samples ANOVA and is an extension of the Wilcoxon test.
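The Wilcoxon signed-rank test mentioned above compares two techniques on paired per-project errors. A minimal sketch of the test statistic W (for p-values one would normally reach for `scipy.stats.wilcoxon`); the per-project MRE values below are invented:

```python
# Wilcoxon signed-rank statistic for paired comparison of two estimation
# techniques. Returns W = min(W+, W-), to be compared against critical values.
def wilcoxon_w(errors_a, errors_b):
    # Paired differences of per-project errors; rounding avoids spurious
    # floating-point tie breaking, and exact zeros are dropped.
    diffs = [round(a - b, 10) for a, b in zip(errors_a, errors_b)]
    diffs = [d for d in diffs if d != 0]
    ranked = sorted((abs(d), d) for d in diffs)
    # Assign ranks 1..n, averaging ranks over ties in |d|.
    ranks = {}
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            j += 1
        avg = (i + 1 + j) / 2.0          # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    w_plus = sum(ranks[k] for k, (_, d) in enumerate(ranked) if d > 0)
    w_minus = sum(ranks[k] for k, (_, d) in enumerate(ranked) if d < 0)
    return min(w_plus, w_minus)

# Hypothetical per-project MREs of two techniques on six projects.
mre_a = [0.10, 0.25, 0.05, 0.40, 0.30, 0.20]
mre_b = [0.15, 0.30, 0.10, 0.35, 0.45, 0.30]
print(wilcoxon_w(mre_a, mre_b))
```

A small W relative to the critical value for n pairs indicates a significant accuracy difference between the two techniques.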
5. DISCUSSION AND FUTURE WORK We have discussed a few important computational intelligence techniques that have been used for software cost estimation. CI techniques are generally considered promising because a large number of generalization factors are connected with software projects. Spurious or missing attributes can lead CI techniques to erroneous conclusions; therefore, researchers should clearly specify the testing and validation methodology followed for particular datasets when claiming that their method is superior to contemporary ones. Many CI techniques are still to be investigated for their effectiveness in the field of software cost estimation.
6. REFERENCES 1. Auer, M., Trendowicz, A., Graser, B., Haunschmid, E. & Biffl, S., (2006). Optimal project feature weights in analogy-based cost estimation: improvement and limitations. IEEE Transactions on Software Engineering, vol. 32, no. 2, pp.83–92. 2. Attarzadeh, I. & Ow, S. H., (2011). Improving estimation accuracy of the COCOMO II using an adaptive fuzzy logic model. In Proceedings of IEEE conference on Fuzzy Systems. June 27-30, Taipei, Taiwan. 3. Ahmed, M. A. & Muzaffar, Z., (2009). Handling Imprecision and Uncertainty in Software Development Effort Prediction: A Type-2 Fuzzy Logic based prediction. Information and Software Technology, vol.51, pp. 640-654. 4. Ahmed, M.A., Saliu, M.O., & Al-Ghamdi, J., (2005). Adaptive fuzzy logic based framework for software development effort prediction, Information and Software Technology, vol.47 (1), pp.31–48. 5. Boehm, B., (1981). Software Engineering Economics, Prentice Hall. 6. Boehm, B., Abts , C., Brown, A., Chulani, S., Clark, B., Horowitz, E., Madach, R., Briand, L., Emam, K. E., Surmann, D. & Wieczorek, I., (1999). An assessment and comparison of common software cost estimation modeling techniques. In Proceedings of the 21st International Conference on Software Engineering, Los Angeles, California, pp. 313–323. 7. Briand, L.C., Langley, T., & Wieczorek, I., (2000). A replicated assessment and comparison of common software cost modeling
techniques. In Proceedings of the 22nd International Conference on Software Engineering, pp. 377–386. 8. Bundschuh, M. & Dekkers, C. (2008). The IT Measurement Compendium: Estimating and Benchmarking Success with Functional Size Measurement. Springer, Berlin, 1st edition. 9. Costagliola, G., Ferrucci, F., Tortora, G., & Vitiello, G. (2005). Class point: an approach for the size estimation of object-oriented systems. IEEE Transactions on Software Engineering, vol. 31, no. 1, pp. 52–74. 10. Silva, D. G. e, Jino, M., & de Abreu, B. T. (2010). Machine learning methods and asymmetric cost function to estimate execution effort of software testing. In Proceedings of the Third IEEE International Conference on Software Testing, Verification and Validation. 11. Engelbrecht, A. P. (2007). Computational Intelligence: An Introduction, 2nd edition, New York: Wiley. 12. Fogel, L. J., Owens, A. J., & Walsh, M. J. (1966). Artificial Intelligence Through Simulated Evolution, John Wiley & Sons, New York, NY. 13. Huang, X. S., Ho, D., Ren, J., & Capretz, L. F. (2007). Improving the COCOMO model using a neuro-fuzzy approach. Applied Soft Computing Journal, vol. 7, no. 1, pp. 29–40. 14. Huang, S. J. & Chiu, N. H. (2006). Optimization of analogy weights by genetic algorithm for software effort estimation. Information and Software Technology, vol. 48, pp. 1034–1045. 15. Helmer, O. (1966). Social Technology. Basic Books, NY. 16. Heiat, A. (2002). Comparison of artificial neural network and regression models for estimating software development effort. Information and Software Technology, vol. 44, pp. 911–922. 17. Huang, Y. S., Chiang, C. C., Shieh, J. W., & Grimson, E. (2002). Prototype optimization for nearest-neighbor classification. Pattern Recognition, vol. 35, pp. 1237–1245. 18. Haykin, S. (1998). Neural Networks: A Comprehensive Foundation. Prentice Hall. 19. http://www.buchhandel.de/WebApi1/GetMmo.asp?MmoId=1008608&mmoType=PDF. 20. Holland, J. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI, USA. 21. Idri, A., Abran, A., & Khoshgoftaar, T. M. (2002). Estimating software project effort by analogy based on linguistic values. In Proceedings of the 8th IEEE Symposium on Software Metrics, 4–7 June, Ottawa, pp. 93–101. 22. Jensen, R. (1983). An improved macrolevel software development resource estimation model. In Proceedings of the 5th Conference of the International Society of Parametric Analysts, pp. 88–92. 23. Jørgensen, M. (2004). A review of studies on expert estimation of software development effort. The Journal of Systems and Software, vol. 70, pp. 37–60. 24. Jørgensen, M. (2004). Top-down and bottom-up expert estimation of software development effort. Information and Software Technology, vol. 46, pp. 3–16. 25. Kadoda, G., Cartwright, M., Chen, L., & Shepperd, M. (2000). Experiences using case-based reasoning to predict software project effort. EASE, Keele, UK, p. 23. 26. Koza, J. R. (1990). Genetic Programming: A Paradigm for Genetically Breeding Populations of Computer Programs to Solve Problems. Technical Report STAN-CS-90-1314, Stanford University Computer Science Department. 27. Li, Y. F., Xie, M., & Goh, T. N. (2009). A study of project selection and feature weighting for analogy based software cost estimation. Journal of Systems and Software, vol. 82, pp. 241–252. 28. Liang, T. & Noore, A. (2003). Multistage software estimation. In Proceedings of the 35th Southeastern Symposium on System Theory, 16–18 March, pp. 232–236. 29. Mendes, E., Mosley, N., & Counsell, S. (2005). Investigating Web size metrics for early Web cost estimation. Journal of Systems and Software, vol. 77, no. 2, pp. 157–172.
http://doi.acm.org/10.1145/180921.2180932
30. Miyazaki, Y., Terakado, K., Ozaki, K. & Nozaki, H. (1994). Robust regression for developing software estimation models. Journal of Systems and Software, vol. 27, pp. 3–16.
31. Madachy, R. (1994). A Software Project Dynamics Model for Process Cost, Schedule and Risk Assessment. Ph.D. Dissertation, University of Southern California.
32. MacDonell, S.G. & Shepperd, M.J. (2003). Combining techniques to optimize effort predictions in software project management. Journal of Systems and Software, vol. 66, pp. 91–98.
33. Oliveira, A.L.I. (2006). Estimation of software project effort with support vector regression. Neurocomputing, vol. 69, pp. 1749–1753.
34. Ong, Y.S., Lim, M.H. & Chen, X. (2010). Memetic computation: past, present and future. IEEE Computational Intelligence Magazine, vol. 5, no. 2, pp. 24–31.
35. Peters, K. (2000). Software project estimation: methods and tools, vol. 8, no. 2.
36. Putnam, L. & Myers, W. (1992). Measures for Excellence. Yourdon Press Computing Series.
37. Pickard, L., Kitchenham, B. & Linkman, S. (2001). Using simulated data sets to compare data analysis techniques used for software cost modeling. IEE Proceedings – Software, vol. 148, no. 6, pp. 165–174.
38. Russell, S. & Norvig, P. (2007). Artificial Intelligence: A Modern Approach. Prentice Hall.
39. Rechenberg, I. (1965). Cybernetic Solution Path of an Experimental Problem. Library Translation, vol. 1122, Royal Aircraft Establishment, Farnborough, Hants, UK.
40. Shin, M. & Goel, A.L. (2000). Empirical data modeling in software engineering using radial basis functions. IEEE Transactions on Software Engineering, vol. 26, no. 6, pp. 567–576.
41. Shepperd, M. & Schofield, C. (1997). Estimating software project effort using analogies. IEEE Transactions on Software Engineering, vol. 23, no. 12, pp. 736–743.
42. Shepperd, M. & Kadoda, G. (2001). Comparing software prediction techniques using simulation. IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 1014–1022.
43. Stevens, S.S. (1946). On the theory of scales of measurement. Science, vol. 103, pp. 677–680.
44. Sneed, H.M. (1987). Software Management. Müller GmbH, Cologne.
45. Putnam, L. & Myers, W. (2003). Five Core Metrics: The Intelligence Behind Successful Software Management. Dorset House Publishing, 1st edition.
46. Schwefel, H.P. (1965). Kybernetische Evolution als Strategie der experimentellen Forschung in der Strömungstechnik [Cybernetic evolution as a strategy for experimental research in fluid mechanics]. Master's Thesis, Technical University of Berlin, Germany.
47. Storn, R. & Price, K. (1995). Differential Evolution – A Simple and Efficient Adaptive Scheme for Global Optimization over Continuous Spaces. Technical Report, International Computer Science Institute, Berkeley.
48. Saliu, M., Ahmed, M. & Al-Ghamdi, J. (2004). Towards adaptive soft computing based software effort prediction. In: IEEE Annual Meeting of the Fuzzy Information Processing Society (NAFIPS), vol. 1, pp. 27–30.
49. Tausworthe, R.C. (1980). The work breakdown structure in software project management. Journal of Systems and Software, vol. 1, no. 3, pp. 181–186.
50. Venayagamoorthy, G.K. (2009). A successful interdisciplinary course on computational intelligence. IEEE Computational Intelligence Magazine, vol. 4, no. 1, pp. 14–23.
51. Zhang, J., Zhan, Z.-H., Lin, Y., Chen, N., Gong, Y.-J., Zhong, J.-H., Chung, H.S.H., Li, Y. & Shi, Y.-H. (2011). Evolutionary computation meets machine learning: a survey. IEEE Computational Intelligence Magazine, pp. 68–74.
52. Zadeh, L.A. (1965). Fuzzy sets. Information and Control, vol. 8, pp. 338–353.
53. Zadeh, L.A. (1979). Fuzzy sets and information granularity. In: Gupta, M., Ragade, R. & Yager, R. (Eds.), Advances in Fuzzy Set Theory and Applications, North-Holland Publishing Co., Amsterdam, pp. 3–18.
54. Zadeh, L.A. (1997). Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets and Systems, vol. 90, pp. 111–127.
55. Zadeh, L.A. (1998). Some reflections on soft computing, granular computing and their roles in the conception, design and utilization of information/intelligent systems. Soft Computing, vol. 2, pp. 23–25.
56. Zadeh, L.A. (1973). Outline of a new approach to the analysis of complex systems and decision processes. IEEE Transactions on Systems, Man and Cybernetics, SMC-3, pp. 28–44.
57. (a) Zadeh, L.A. (1975). The concept of a linguistic variable and its application to approximate reasoning – Part I. Information Sciences, vol. 8, pp. 199–249; (b) Zadeh, L.A. (1975). The concept of a linguistic variable and its application to approximate reasoning – Part II. Information Sciences, vol. 8, pp. 301–357; (c) Zadeh, L.A. (1975). The concept of a linguistic variable and its application to approximate reasoning – Part III. Information Sciences, vol. 9, pp. 43–80.
58. Zadeh, L.A. (2006). Generalized theory of uncertainty (GTU) – principal concepts and ideas. Computational Statistics & Data Analysis, vol. 51, pp. 15–46.
59. Zadeh, L.A. (2001). The future of soft computing. In: The 9th IFSA World Congress and 20th NAFIPS International Conference, Vancouver, Canada, pp. 217–228.
60. Zadeh, L.A. (2008). Is there a need for fuzzy logic? Information Sciences, vol. 178, pp. 2751–2779.