Optimizing Basic COCOMO Model Using Simplified Genetic ... - Core

11 downloads 0 Views 196KB Size Report
Ayush Nigam, Avinash Singh, Sharad Singh, Manjeet Choudhary,. Avinash Tiwari and Dharmender Singh Kushwaha. Motilal Nehru National Institute of ...
Available online at www.sciencedirect.com

ScienceDirect Procedia Computer Science 89 (2016) 492 – 498

Twelfth International Multi-Conference on Information Processing-2016 (IMCIP-2016)

Optimizing Basic COCOMO Model using Simplified Genetic Algorithm Rohit Kumar Sachan∗ , Ayush Nigam, Avinash Singh, Sharad Singh, Manjeet Choudhary, Avinash Tiwari and Dharmender Singh Kushwaha Motilal Nehru National Institute of Technology Allahabad, Allahabad 211 004, India

Abstract The estimation of software effort is an essential and crucial activity for the software development life cycle. In recent years, many researchers and software industries have given significant attention on the estimation of software effort. In industry, effort is used for planning, budgeting and development time calculation. Therefore a realistic effort estimation is required. Many researchers have proposed various models for software effort estimation, such as statistical models, algorithmic models, machine learning based models and nature inspired models in the past. In this research paper, a simplified genetic algorithm based model is proposed. A simplified genetic algorithm is used for optimizing the parameters of the basic COCOMO model. The proposed approach is applied on NASA software project dataset. Experimental results show better realistic estimation over the basic COCOMO. © 2016 The Authors. Published by Elsevier B.V. © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license Peer-review under responsibility of organizing committee of the Twelfth International Multi-Conference on Information (http://creativecommons.org/licenses/by-nc-nd/4.0/). Processing-2016 Peer-review under (IMCIP-2016). responsibility of organizing committee of the Organizing Committee of IMCIP-2016 Keywords:

COCOMO Model; Effort Estimation; Genetic Algorithm; Nature-inspired Algorithm; Optimization.

1. Introduction Software effort estimation (SEE) has been an essential and crucial activity for the software development life cycle (SDLC). Effort estimation is used for planning, budgeting and monitoring the activities of software development. It is also used for on time and within budget delivery of software. It is a method for computing the most realistic and reliable value of required effort for developing or developed project. It is calculated in term of person-month. This effort is used for project plans, project budgets, investment analysis, resource allocation schemes, pricing process, etc. According to Standish Group Report1 , in UK during the period of 2002–2003, out of 13522 projects only 33% projects were completed within time and budget, 82% projects were late, 43% projects were overrun their financial plan and 20% projects were cancelled. According to another report2, 70–80% software projects were above their estimated plan and that average overrun is about 30–40%. Hence, there is a requirement of a more realistic and reliable software effort estimation model. Several effort estimation methods have been proposed and improved by many researchers in the past. These methods are categorised into Expert judgement, Algorithmic method and Analogy based method. Expert judgment methods are ∗ Corresponding author. Tel.: +91-9999434710; Fax: +91-532-254-5341.

E-mail address: [email protected]

1877-0509 © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of organizing committee of the Organizing Committee of IMCIP-2016 doi:10.1016/j.procs.2016.06.107

493

Rohit Kumar Sachan et al. / Procedia Computer Science 89 (2016) 492 – 498

Table 1. Basic-COCOMO Models. Model Name Organic Model Semi-Detached Model Embedded Model

Project Size

Effort

Less than 50 KLOC

E = 2.4 (KLOC)1.05

50–300 KLOC

E = 3.0 (KLOC)1.12

Over 300 KLOC

E = 3.6 (KLOC)1.20

Work Breakdown Structure (WBS) method and Delphi technique3. Algorithmic methods are Constrictive Cost Model (COCOMO)4 , Function Point (FP)5 method, Software Life Cycle Management (SLIM)6 and Software Evaluation and Estimation of Resources-Software Estimating Model (SEER-SEM)7. The analogy based method is applied at a very early phase of SDLC. Early stage effort estimation methods are Improved Requirement Based Complexity (IRBC)8 , Requirement Based Software Development Effort Estimation (RBDEE)9, 10 and many others. Some researchers used Cognitive Information Complexity Measure (CICM)11, 12 for effort estimation. Currently, many issues have arisen regarding the applicability of these methods to solve the software effort estimation. Heuristic techniques are used to overcome the limitation of these methods and improve the applicability. Various heuristic optimization methods are used in optimization problems. These methods can be used in the software effort estimation also. These methods are Genetic algorithm13, Genetic programming14, Differential Evolution15, Particle Swarm Optimization16 and many others. This research paper provides a brief study of the Genetic Algorithms (GA) as an optimization algorithm. It is used for optimizing the parameter of the COCOMO model so that a more realistic effort can be estimated. The performance evaluation of the proposed method is analysed with the NASA software project dataset. The remaining research paper is organized as follows. Section 2 contains a concise description of background, proposed simplified genetic algorithm is explained in Section 3, Section 4 gives various evaluation criteria and dataset. Experimental results have been explained in Section 5. Finally, Section 6 concludes the proposed method with directions for future work. 2. Background The concise description of the COCOMO model and Genetic algorithm is discussed in this section. 2.1 The Constructive Cost Model (COCOMO) The Constructive Cost Model is a well documented and widely accepted algorithmic model for effort estimation. It was developed by Barry W. Boehm in 19814. Constructive Cost Model has three variants. These are basic, intermediate and detailed. The main parameter for the SEE is the size of the project. The size is represented in terms of lines of code (LOC) or a thousand-lines of code (KLOC). This model was built based on a historical dataset of 63 projects. The model defines mathematical equations for estimating effort, development time and the maintenance effort. The mathematical equation of a basic COCOMO model is defined in Equation 1 E = A × (KLOC) B

(1)

where KLOC is the size of the code (kilo-lines of code), E is the software effort computed in person-month and A, B are the COCOMO model parameters. The value of A and B depend on the model of a software project. COCOMO model has three types, depending upon project size. These models are Organic, Semi-detached and Embedded model17, 18. Table 1 shows the basic COCOMO models. The main issue with the COCOMO model is that it does not provide realistic effort in the current development environment. This limitation of the COCOMO model was overcome by exploration of the non-algorithmic technique like genetic algorithm or any other nature-inspired algorithms. Software effort estimation based on existing parameters

494

Rohit Kumar Sachan et al. / Procedia Computer Science 89 (2016) 492 – 498

does not always give precise result; due to that, we require the tuning of the parameters to get more accurate results. This paper optimizes the value of parameters A and B. 2.2 Genetic Algorithm (GA) The genetic algorithm is an evolutionary search algorithm. It belongs to the field of nature-inspired algorithms. In 1970s, John Holland and his collaborators proposed genetic algorithm at University of Michigan19. It is a global optimization method which is inspired from the abstraction of Darwinian evolution and natural selection of biological systems. GA20, 21 uses mathematical operators such as selection of fittest, crossover and mutation. The wide range of optimization problems have been successfully solved using genetic algorithms. 3. Proposed Method In this research paper, the parameters of the basic COCOMO model are optimized using the simplified genetic algorithm. This shall ensure that complexity of the proposed algorithm remains low. We use the crossover and selection operator for calculating new value of parameters A and B. The steps of the proposed simplified genetic algorithm are: 1. 2. 3. 4. 5. 6. 7.

Randomly generate an initial population of A and B. Calculate the fitness of each chromosome using fitness function. Select parent chromosome from population. Create new chromosomes by applying crossover for A after 8th bit to 17th bit and for B after 7th bit till 15th bit. Compute the fitness of new chromosomes. Select the next population by selecting the best chromosomes. Goto step 3, until max-generation is not reached else stop and return the best chromosome.

4. Evaluation Criteria and Data Set We propose to use the absolute difference between effort and estimated effort (Manhattan distance (MD)) as the fitness function for the proposed simplified genetic algorithm. The Manhattan distance is calculated by Equation 2  n   MD = |Efforti − Estimated Efforti | (2) i=1

The other evaluation criteria may be used as fitness function for evaluating the performance of the proposed model. They are22 : Mean Magnitude of Relative Error (MMRE): MMRE =

n 1  |Efforti − Estimated Efforti | n Efforti

(3)

i=1

Variance-Accounted-For (VAF):   var(Effort − Estimated Effort) × 100% VAF = 1 − var(Effort) Root Mean Square (RMS):

  n 1  (Efforti − Estimated Efforti )2 RMS = n i=1

(4)

(5)

495

Rohit Kumar Sachan et al. / Procedia Computer Science 89 (2016) 492 – 498

Table 2. The NASA Software Project Dataset. Project No.

KLOC

Methodology (ME)

Measured Effort

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

90.2000 46.2000 46.5000 54.5000 31.1000 67.5000 12.8000 10.5000 21.5000 3.1000 4.2000 7.8000 2.1000 5.000 78.6000 9.7000 12.5000 100.8000

30.0000 20.0000 19.0000 20.0000 35.0000 29.0000 36.0000 34.0000 31.0000 26.0000 19.0000 31.0000 28.0000 29.0000 35.0000 27.0000 27.0000 34.0000

115.8000 96.0000 79.0000 90.8000 39.6000 98.4000 18.9000 10.3000 28.5000 7.0000 9.0000 7.3000 5.0000 8.4000 98.7000 15.6000 23.9000 138.3000

Table 3. Simplified GA Parameter Setting.

Euclidian Distance (ED):

Operator

Type

Selection Mechanism Type of Crossover Type of Mutation Population Size Maximum Generation Domain of search for A Domain of search for B

Elitism Selection Two Point Binary Crossover NA 10 100 0:10 0.3:2

  n  ED = (Efforti − Estimated Efforti )2

(6)

i=1

Performance evaluation of the proposed work is tested on well-known and public NASA project dataset. The NASA dataset23 contains the size of project, Methodology (ME) and Measured effort of 18 software projects. These values are shown in Table 2. 5. Experimental Results The proposed experiment applies simplified GA for optimizing the value of A and B of the basic COCOMO model given in Equation 1. This model allows us to estimate effort for the software development of the 18 projects of NASA dataset. The parameters of simplified GA are set as Table 3. The fitness function is shown in Equation 2. The computed parameters can significantly simplify the estimation of the software effort for all projects. The computed values of effort are presented in Table 4. The model with the optimal set of parameter of A and B is presented in Equation 7. Figure 1 shows the measured effort comparison between NASA dataset, basic COMOMO and proposed simplified genetic algorithm. The result shows that effort estimated by the proposed simplified GA gives

496

Rohit Kumar Sachan et al. / Procedia Computer Science 89 (2016) 492 – 498 Table 4. Computed Effort Value by Basic COCOMO and Simplified GA.

Project No.

KLOC

NASA Effort

Basic COCOMO Effort

Simplified GA Effort

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

90.2000 46.2000 46.5000 54.5000 31.1000 67.5000 1 2.8000 10.5000 21.5000 3.1000 4.2000 7.8000 2.1000 5.000 78.6000 9.7000 12.5000 100.8000

115.8000 96.0000 79.0000 90.8000 39.6000 98.4000 18.9000 10.3000 28.5000 7.0000 9.0000 7.3000 5.0000 8.4000 98.7000 15.6000 23.9000 138.3000

271.1308 134.3028 135.2187 159.7450 88.6358 199.9770 34.8964 28.3439 60.1549 7.8730 10.8298 20.7448 5.2304 13.0055 234.6414 26.0808 34.0382 304.6805

114.8510 63.0152 63.3821 73.0839 44.1810 88.5476 19.9217 16.6782 31.7247 5.5821 7.3303 12.7740 3.9359 8.5715 101.5074 15.5325 19.5022 126.8900

Fig. 1. Effort Graph for NASA, COCOMO and Proposed Simplified GA.

Table 5. Computed Manhattan Distance (MD). Model Input Basic COCOMO Proposed Simplified GA

Model Output

Manhattan Distance (MD)

Effort Effort

48.8360 6.7125

better results as compared to actual NASA effort. The performance comparison of basic COCOMO and proposed simplified GA are shown in Table 5 and Fig. 2 E = 2.022817 × (KLOC)0.897183

(7)

Rohit Kumar Sachan et al. / Procedia Computer Science 89 (2016) 492 – 498

Fig. 2. Comparison of Manhattan Distance by COCOMO and Simplified GA.

6. Conclusions and Future Work This paper proposes a software effort estimation (SEE) model using a simplified form of genetic algorithm. The simplified GA is used for optimizing the parameters of the basic COCOMO model. This is computationally cheaper also. The performance analysis of the proposed model is analysed on the well-known NASA dataset. We have used the simplified genetic algorithm for developing a model for the software effort calculation. The Manhattan distance of proposed simplified GA is 6.7125 which is better than basic COCOMO. COCOMO with simplified GA tuned parameters gives an improved estimation compared to basic COCOMO. In the future, we will extend this work to optimize the effort estimation model provided by Alaa F. Sheta13 and intermediate COCOMO. References [1] S. Group, Chaos Uummary 2009, (2009). [2] K. Moløkken and M. Jørgensen, A Review of Software Surveys on Software Effort Estimation, In ISESE 2003. Proceedings International Symposium on Empirical Software Engineering, IEEE, pp. 223–230, (2003). [3] P. Suri and P. Ranjan, Article: Comparative Analysis of Software Effort Estimation Techniques, International Journal of Computer Applications, vol. 48(21), pp. 12–19, (2012). [4] B. W. Boehm, et al., Software Engineering Economics, Prentice-hall Englewood Cliffs (NJ), vol. 197, (1981). [5] J. E. Matson, B. E. Barrett and J. M. Mellichamp, Software Development Cost Estimation Using Function Points, IEEE Transactions on Software Engineering, vol. 20(4), pp. 275–287, (1994). [6] L. Putnam and W. Myers, Measures for Excellence, (1992). [7] R. Jensen, An Improved Macrolevel Software Development Resource Estimation Model, In 5th ISPA Conference, pp. 88–92, (1983). [8] A. Sharma and D. S. Kushwaha, Estimation of Software Development Effort from Requirements Based Complexity, Procedia Technology, vol. 4, pp. 716–722, (2012). [9] A. Sharma and D. S. Kushwaha, Applying Requirement Based Complexity for the Estimation of Software Development and Testing Effort, ACM SIGSOFT Software Engineering Notes, vol. 37(1), pp. 1–11, (2012). [10] A. Sharma, M. Vardhan and D. S. Kushwaha, A Versatile Approach for the Estimation of Software Development Effort Based on srs Document, International Journal of Software Engineering and Knowledge Engineering, vol. 24(01), pp. 1–42, (2014). [11] D. S. Kushwaha, Cicm: A Robust Method to Measure Cognitive Complexity of Procedural and Object-Oriented Programs, WSEAS Transactions on Computers, vol. 5(10), pp. 2348–2355, (2006). [12] D. S. Kushwaha, A. K. Misra, Software Test Effort Estimation, ACM SIGSOFT Software Engineering Notes, vol. 33(3), p. 6, (2008). [13] A. F. Sheta, Estimation of the Cocomo Model Parameters Using Genetic Algorithms for Nasa Software Projects, Journal of Computer Science, vol. 2(2), pp. 118–123, (2006). [14] A. F. Sheta and A. Al-Afeef, Software Effort Estimation for Nasa Projects Using Genetic Programming, Journal of Intelligent Computing, vol. 1(3), p. 147, (2010).

497

498

Rohit Kumar Sachan et al. / Procedia Computer Science 89 (2016) 492 – 498 [15] S. Aljahdali and A. F. Sheta, Software Effort Estimation by Tuning Coocmo Model Parameters Using Differential Evolution, In IEEE/ACS International Conference on Computer Systems and Applications (AICCSA), IEEE, p. 1–6, (2010). [16] C. Hari and P. P. Reddy, A Fine Parameter Tuning For Cocomo 81 Software Effort Estimation Using Particle Swarm Optimization, Journal of Software Engineering, vol. 5(1), pp. 38–48, (2011). [17] B. Boehm, C. Abts, B. Clark and S. Devnani-Chulani, Cocomo II Model Definition Manual, The University of Southern California, (1997). [18] O. Benediktsson, D. Dalcher, K. Reed and M. Woodman, Cocomo-Based Effort Estimation for Iterative and Incremental Software Development, Software Quality Journal, vol. 11(4), pp. 265–281, (2003). [19] J. H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, U Michigan Press, (1975). [20] C. Guo and X. Yang, A Programming Of Genetic Algorithm in Matlab 7.0, Modern Applied Science, vol. 5(1), p. 230, (2011). [21] A. Gall¸inn¸ina, O. Burceva and S. Parˇsutins, The Optimization of Cocomo Model Coefficients Using Genetic Algorithms, (2012). [22] S. Aljahdali and A. Sheta, Evolving Software Effort Estimation Models Using Multigene Symbolic Regression Genetic Programming, Int. J. Adv. Res. Artif. Intell., vol. 2, pp. 52–57, (2013). [23] J. W. Bailey and V. R. Basili, A Meta-Model for Software Development Resource Expenditures, In Proceedings of the 5th International Conference on Software Engineering, IEEE Press, pp. 107–116, (1981).