INTERNATIONAL JOURNAL of ACADEMIC RESEARCH
Vol. 6. No. 3. May, 2014
Zahra A. Dizaji, K. Khalilpour. Particle swarm optimization and chaos theory based approach for software cost estimation. International Journal of Academic Research Part A; 2014; 6(3), 130-135. DOI: 10.7813/2075-4124.2014/6-3/A.18 Library of Congress Classification: TK7885-7895
PARTICLE SWARM OPTIMIZATION AND CHAOS THEORY BASED APPROACH FOR SOFTWARE COST ESTIMATION
Zahra Asheghi Dizaji 1, Kamal Khalilpour 2
1 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, West Azerbaijan
2 Department of Mathematics, Faculty of Science, Mahabad Branch, Islamic Azad University, Mahabad (IRAN)
E-mails: [email protected], [email protected]
DOI: 10.7813/2075-4124.2014/6-3/A.18 Received: 16 Jan, 2014 Accepted: 08 May, 2014
ABSTRACT
Software development is now an essential need in many organizations. Producing good-quality, affordable software requires an accurate estimate of the time and cost needed to complete it, so cost estimation of software projects plays an important role in organizational productivity. Estimating the cost of a software project is usually difficult, partly because software projects are intangible and poorly understood at the beginning of production. Because the methods presented so far do not estimate cost accurately enough, new techniques are needed. In this paper, we estimate the cost of software projects using the Particle Swarm Optimization (PSO) algorithm and Tent mapping as a Chaos Optimization algorithm. We use the NASA datasets as training and testing sets, and the Mean Absolute Relative Error (MARE) to evaluate the performance of the proposed method. The results are compared with the intermediate COCOMO model proposed by Boehm (1981) and show that the proposed method reduces the MARE of the cost estimate to 0.0797%.
Key words: Particle Swarm Optimization Algorithm, Chaos Optimization Algorithm, Software Projects Cost Estimation, Meta-Heuristic Algorithms
1. INTRODUCTION
Cost estimation starts at the beginning of the production process and continues until the end of software development, and early estimates have little accuracy because adequate information is lacking. Cost estimation plays an important role in organizational productivity, and any error in it can cause large losses of money and time, so effective methods are needed to estimate the cost of software projects [1].
Several factors affect the cost estimation of a software project, such as the characteristics of the software being produced, human resources, available tools, project characteristics and user needs. As a result, no software project cost estimate has ever been one hundred percent accurate. However, because cost estimation has a significant impact on the success of software projects, we should try to increase its accuracy [2]. Estimating with high accuracy requires some initial information; the more accurate and abundant this information is, the more accurate the cost estimate becomes. Algorithmic methods for estimating time and cost evaluate the system on specified parameters and then compute the time and cost from fixed relationships. Because of the increasing volume and complexity of software projects, model-based methods are no longer an adequate answer for cost estimation. Today, metaheuristic algorithms are used to estimate the cost of software projects. Because metaheuristic algorithms search over many iterations with several agents, they are able to find optimal or near-optimal solutions. In this paper, we use the PSO algorithm and a hybrid of this algorithm with Tent mapping, as a Chaos Optimization algorithm, to estimate the cost of software projects. The paper is organized as follows:
Section 2: an overview of previous works.
Section 3: basic concepts (intermediate COCOMO model, Chaos Theory, PSO).
Section 4: the proposed method.
Section 5: evaluation of the proposed method.
Section 6: conclusion and future works.
2. PREVIOUS WORKS
With the increasing volume and complexity of software projects, model-based methods estimate the cost of software projects with low accuracy. Therefore, in recent years, several studies have been
carried out to use non-algorithmic methods such as machine learning algorithms as an alternative to model-based methods. In 2008, Raj Kiran, Carr, Ravi and Vinay Kumar presented two types of Wavelet Neural Network (WNN), using the Morlet function and the Gaussian function as transfer functions, as well as a Threshold Acceptance training algorithm for the wavelet neural network (TAWNN). These researchers compared the results of their proposed method with other methods:
Multilayer Perceptron (MLP)
Radial Basis Function Network (RBFN)
Multiple Linear Regression (MLR)
Dynamic Evolving Neuro-Fuzzy Inference System (DENFIS)
Support Vector Machine (SVM)
The comparison used the Mean Magnitude of Relative Error (MMRE) on datasets from IBM, IBMDPS and CF. According to the results, WNN-Morlet on the CF dataset and WNN-Gaussian on the IBMDPS dataset performed better than the other methods. The TAWNN method also outperformed all other presented methods except WNN [3].
In 2008, researchers [4] used a Simulated Annealing algorithm based on COCOMO to estimate the cost of software projects. They examined NASA datasets using two independent variables, Lines Of Code (LOC) and Methodology (ME), and a dependent effort variable (CE). They compared their results with those of a study conducted by A. F. Sheta, and the comparison indicated an improvement in accuracy for the proposed method.
In 2011, Hari and Prasad Reddy used a hybrid of PSO and Fuzzy Logic to estimate software project cost. Three models combining Fuzzy Logic with the PSO algorithm with inertia weight were presented. NASA datasets were used as training and testing sets. The results showed that Fuzzy Logic can reduce the uncertainty in project size, and the presented method performed better in terms of MARE, VARE and VAF [5].
In 2011, the authors of [6] used the PSO algorithm on datasets clustered by the k-means algorithm to estimate the cost of software projects. The COCOMO parameters produced by the PSO algorithm took different values for each cluster, and because the data were clustered, better results were expected. The COCOMO81 dataset was used and the results were compared with standard COCOMO, which indicated improved performance for the proposed method.
Z.A. Dizaji and F.S. Gharehchopogh used a chaos factor to improve the performance of the PSO algorithm. In their article, the Tent map, the Lorenz attractor and the Logistic map are used as Chaos Optimization Algorithms. 
These researchers used the hybrid of this algorithm with the chaos factor to predict road accidents according to accident type (damage, injury, death). The results showed improved performance of the proposed method, and in that paper Tent mapping performed better than the other mappings [7].
3. BASIC CONCEPTS
This section discusses the COCOMO model, the PSO algorithm and the Chaos Optimization Algorithm.
3.1. Intermediate COCOMO model
In the Intermediate COCOMO model, the type and size of the project are determined first; then the coefficients obtained from cost drivers such as product features, available tools, personnel features and project features are applied to the basic COCOMO parameters [8, 9]. The main problems of the COCOMO model and other algorithmic models are very high error, lack of precision in estimating the size of the project, and the lack of definite values for the constant parameters. In the Intermediate COCOMO model, the cost of a software project is estimated by formula (1) [3]:
$$\mathrm{Effort} = a \times (\mathrm{Size})^{b} \times \prod_{i} EM_i \qquad (1)$$
where the parameters a and b are constants whose values depend on the dataset, Size is the size of the project in thousand source lines of code (KSLOC), and EM_i are the effort multipliers (the cost-driver coefficients). The resulting effort is expressed in person-months [9]. COCOMO applies to three different classes of software projects:
Organic projects: relatively small projects done by experienced teams.
Semi-detached projects: medium-sized projects that are neither simple nor complex.
Embedded projects: projects with predetermined hardware and fixed operating constraints.
3.2. PSO Algorithm
The metaheuristic Particle (bird) Swarm Optimization algorithm was first proposed in 1995 by Eberhart and Kennedy [10]. PSO is a social search algorithm modeled on the social behavior of bird flocks. It is based on the principle that every particle adjusts its position in the search space according to the best position it has visited and the best position its neighbors have ever visited. Formula (2) is used to adjust the velocity of each particle and formula (3) is used to update its position:
$$v_{i,j}(t+1) = w\,v_{i,j}(t) + c_1 r_1(t)\,\big[pbest_{i,j}(t) - x_{i,j}(t)\big] + c_2 r_2(t)\,\big[gbest_{j}(t) - x_{i,j}(t)\big] \qquad (2)$$
$$x_{i,j}(t+1) = x_{i,j}(t) + v_{i,j}(t+1) \qquad (3)$$
where w is the inertia coefficient, which depending on the requirements can be constant, variable with the iteration number, or random [10, 11, 12]. The inertia coefficient ensures that particles that have reached the best response in the swarm do not stop but continue to move along their previous route [10, 13, 14]. This coefficient also plays a role similar to that of mutation in genetic algorithms [10, 12]. The coefficients c1 and c2 are learning coefficients and lie in the range [0, 2]. r1 and r2 are random numbers, usually drawn from a uniform distribution on [0, 1]. Pbest is the best answer ever found by the i-th particle, and Gbest is the best answer found by the whole swarm of particles.
3.3. Chaos Optimization Algorithm
Chaos theory was first introduced in the late nineteenth century by Henri Poincaré [15]. Chaotic systems are non-linear dynamic systems that are very sensitive to their initial conditions: small changes in the initial conditions lead to drastic changes later on. Chaotic systems appear to behave randomly. However, no random element is needed to create chaotic behavior, and even deterministic dynamical systems can behave chaotically. The mapping used in this article is the Tent mapping, which is applied using formula (4):
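$$x_{n+1} = \begin{cases} 2x_n, & 0 \le x_n \le \tfrac{1}{2} \\ 2(1 - x_n), & \tfrac{1}{2} \le x_n \le 1 \end{cases} \qquad (4)$$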
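As a rough illustration of the Tent mapping in Section 3.3, the sketch below iterates a tent map to produce a deterministic sequence in [0, 1]. It assumes the common symmetric form with breakpoint 0.5 and slope 2; the function names `tent_map` and `tent_sequence` are hypothetical and do not come from the paper.

```python
def tent_map(x: float) -> float:
    """One step of the symmetric tent map on [0, 1] (breakpoint 0.5 is an assumption)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    # Rising branch on the left half, falling branch on the right half.
    return 2.0 * x if x <= 0.5 else 2.0 * (1.0 - x)


def tent_sequence(x0: float, n: int) -> list:
    """Return the n iterates that follow the seed x0 (the seed itself is not included)."""
    values = []
    x = x0
    for _ in range(n):
        x = tent_map(x)
        values.append(x)
    return values
```

Calling `tent_sequence(0.3, 10)` gives ten values that can stand in for random coefficients. One practical caveat: in binary floating point this symmetric map is computed exactly, so a long orbit eventually lands on 0 and stays there; long runs usually need reseeding or a slightly different map.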
4. THE PROPOSED METHOD
Estimating the cost of software projects plays an important role in software development, and because algorithmic models estimate it with low accuracy, new techniques are needed. Algorithmic models do not define project-specific values for their constant parameters; mean values are used instead, so it is hard to obtain a reliable estimate. In this paper we therefore search for these values with the PSO algorithm and with a hybrid of PSO and the Chaos Optimization Algorithm. The project type is used for an initial classification. After this classification, the training sets are given to the proposed algorithms to find the values of the constant parameters. Once training is complete, the obtained values are applied to the testing sets; at each stage the best values are selected and the cost estimation is carried out.
4.1. PSO Algorithm
The first algorithm in the proposed method is the PSO algorithm, whose procedure is shown in table (1).
Table 1. PSO algorithm procedure
Inputs: NASA datasets (including the factors that affect the estimate, the project type and the number of projects in the dataset).
Outputs: values for the constant parameters of the COCOMO model and the classified data.
Step one: reading the data in the dataset.
Step two: separating the data into training and testing sets.
Step three: classifying the training and testing sets by project type.
Step four: calling the PSO algorithm for each set.
Step five: initializing the particles based on the constant parameters of the COCOMO model.
Step six: evaluating the fitness function. The position of each particle is updated according to the parameter settings and the fitness function is then evaluated. Here the fitness function is MARE, and the objective is to minimize MARE by selecting appropriate values from the specified range.
Step seven: finding Pbest. If fitness(p) is better than fitness(Pbest), then Pbest = p. Pbest for each particle is selected by comparing the estimates obtained from its current and previous values.
Step eight: finding Gbest, the best value among the Pbests. The particle with the lowest difference between the actual cost values and the values it estimates is selected.
Step nine: updating the velocity and position of each particle according to formula (2) and formula (3).
Step ten: repeating steps four to nine until the desired solution is reached.
Step eleven: taking the parameter values of Gbest as the optimum values.
Step twelve: finishing the optimization algorithm.
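The procedure in Table 1 can be sketched as follows, under stated assumptions. The search ranges, swarm size, iteration count, and the values of w, c1 and c2 are illustrative choices, and the toy projects are made-up numbers, not the NASA data. All function names are hypothetical. The sketch calibrates a single set of parameters (a, b) and leaves out the per-project-type classification.

```python
import random


def cocomo_effort(size_ksloc, eaf, a, b):
    """Formula (1): a * size^b * (product of the effort multipliers, passed as eaf)."""
    return a * size_ksloc ** b * eaf


def mare(actual, estimated):
    """Mean absolute relative error between actual and estimated efforts."""
    return sum(abs(act - est) / act for act, est in zip(actual, estimated)) / len(actual)


def fitness(params, projects):
    """MARE of the COCOMO estimates for (a, b) on rows of (size, eaf, effort)."""
    a, b = params
    actual = [effort for _, _, effort in projects]
    estimated = [cocomo_effort(size, eaf, a, b) for size, eaf, _ in projects]
    return mare(actual, estimated)


def pso_calibrate(projects, bounds, n_particles=20, n_iter=100,
                  w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for (a, b) minimising MARE; returns (best_params, best_mare, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step five: initialise particles inside the allowed parameter ranges.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, projects) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Formula (2): velocity update.
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # Formula (3): position update, clamped to the search range.
                lo, hi = bounds[j]
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lo), hi)
            fit = fitness(pos[i], projects)      # step six: evaluate MARE
            if fit < pbest_fit[i]:               # step seven: update Pbest
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:              # step eight: update Gbest
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


# Made-up toy projects (KSLOC, product of effort multipliers, effort),
# generated from a = 3.0 and b = 1.12; these are NOT the NASA data.
toy_projects = [(s, m, 3.0 * s ** 1.12 * m)
                for s, m in [(10, 1.0), (25, 0.9), (50, 1.1), (120, 1.2), (300, 0.85)]]

best_params, best_mare, history = pso_calibrate(
    toy_projects, bounds=[(1.0, 5.0), (0.8, 1.5)])
```

Clamping positions to the search range and updating Gbest as soon as a particle improves are design choices of this sketch. The `history` list records the best MARE after each iteration, which makes it easy to check that the search never gets worse.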
4.2. Hybrid of PSO and Chaos Optimization algorithms
The next algorithm in the proposed method is the hybrid of the PSO algorithm and the Chaos Optimization algorithm. The two algorithms are combined through formula (4), and the resulting velocity update is shown in formula (5):
$$v_{i,j}(t+1) = w\,v_{i,j}(t) + c_1 CM_1(t)\,\big[pbest_{i,j}(t) - x_{i,j}(t)\big] + c_2 CM_2(t)\,\big[gbest_{j}(t) - x_{i,j}(t)\big] \qquad (5)$$
The random values r1 and r2 of the PSO algorithm in formula (2) are replaced by values generated by the Chaos Optimization algorithm (Tent mapping) in formula (4); in the combined formula (5) these variables are CM1 and CM2.
4.3. Hybrid of PSO and PSO & Chaos
After obtaining the optimal solutions from the two algorithms discussed above, the proposed method selects the better of the two. The process is shown in figure (1).
Fig. 1. The proposed method procedure
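Sections 4.2 and 4.3 can be sketched together as below. This is an illustration under assumptions, not the authors' implementation: the tent map uses breakpoint 0.5 and is reseeded when its orbit reaches 0 or 1, the better solution is chosen by MARE on a held-out set, and the data are made-up. All names (`tent_stream`, `pso`, `cocomo_mare`, `proposed_method`) are hypothetical.

```python
import random


def tent_stream(rng):
    """Yield tent-map iterates; reseed from rng if the orbit reaches 0 or 1."""
    x = rng.random()
    while True:
        x = 2.0 * x if x <= 0.5 else 2.0 * (1.0 - x)
        if x <= 0.0 or x >= 1.0:
            x = rng.random()  # assumption: reseed to escape the fixed point at 0
        yield x


def pso(objective, bounds, next_coeff, rng,
        n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5):
    """PSO whose coefficients come from next_coeff(): random for (2), chaotic for (5)."""
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(dim):
                k1, k2 = next_coeff(), next_coeff()  # r1, r2 or CM1, CM2
                vel[i][j] = (w * vel[i][j]
                             + c1 * k1 * (pbest[i][j] - pos[i][j])
                             + c2 * k2 * (gbest[j] - pos[i][j]))
                lo, hi = bounds[j]
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lo), hi)
            fit = objective(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def cocomo_mare(params, projects):
    """MARE of a * size^b * eaf against the actual effort of each project."""
    a, b = params
    return sum(abs(effort - a * size ** b * eaf) / effort
               for size, eaf, effort in projects) / len(projects)


def proposed_method(train, test, bounds, seed=0):
    """Train PSO and chaos-PSO, then keep whichever has the lower MARE on test."""
    rng = random.Random(seed)
    chaos_rng = random.Random(seed + 1)
    stream = tent_stream(chaos_rng)

    def objective(p):
        return cocomo_mare(p, train)

    results = {
        "PSO": pso(objective, bounds, rng.random, rng)[0],
        "Chaos_PSO": pso(objective, bounds, lambda: next(stream), chaos_rng)[0],
    }
    scores = {name: cocomo_mare(params, test) for name, params in results.items()}
    winner = min(scores, key=scores.get)
    return winner, results[winner], scores


# Made-up rows (KSLOC, product of effort multipliers, effort); NOT the NASA data.
rows = [(s, m, 3.0 * s ** 1.12 * m)
        for s, m in [(10, 1.0), (25, 0.9), (50, 1.1), (120, 1.2), (300, 0.85), (80, 1.0)]]
train, test = rows[:4], rows[4:]
winner, params, scores = proposed_method(train, test, bounds=[(1.0, 5.0), (0.8, 1.5)])
```

Passing the coefficient source as a function keeps the two variants identical except for where r1, r2 or CM1, CM2 come from, which mirrors the difference between formula (2) and formula (5).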
5. DISCUSSION
In this paper, we have applied the proposed method to NASA datasets containing 60 projects. The data are divided into 80% for training and 20% for testing. To evaluate the results and compare them with the intermediate COCOMO model, MARE is used. This error is calculated by formula (6) and formula (7):
$$ARE_i = \frac{\left| \mathrm{Actual}_i - \mathrm{Estimated}_i \right|}{\mathrm{Actual}_i} \qquad (6)$$
$$MARE = \frac{1}{n} \sum_{i=1}^{n} ARE_i \qquad (7)$$
where Actual_i and Estimated_i are the actual and estimated effort of the i-th project and n is the number of projects.
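The evaluation setup described above can be sketched as follows. The shuffling before the split and the fixed seed are assumptions, since the paper does not say how the 80/20 division was made, and the effort values in the usage note are made-up. The function names are hypothetical.

```python
import random


def split_projects(projects, train_fraction=0.8, seed=0):
    """Shuffle the projects and split them into training and testing sets."""
    shuffled = list(projects)
    random.Random(seed).shuffle(shuffled)  # assumption: random split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def absolute_relative_errors(actual, estimated):
    """Per-project absolute relative error |actual - estimated| / actual."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return [abs(a - e) / a for a, e in zip(actual, estimated)]


def mare(actual, estimated):
    """Mean of the per-project absolute relative errors."""
    errors = absolute_relative_errors(actual, estimated)
    return sum(errors) / len(errors)
```

With 60 projects, `split_projects(range(60))` yields 48 training and 12 testing items. For made-up efforts `[100.0, 200.0, 400.0]` and estimates `[90.0, 220.0, 400.0]`, the per-project errors are 0.1, 0.1 and 0.0, so the MARE is about 0.067.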
Because the proposed method selects the better of the solutions produced by the other methods, it outperforms all other models considered in this paper, as shown in figure (2), figure (3), figure (4) and table (2).
Fig. 2. Comparison of PSO with the COCOMO model based on MARE on training data
Fig. 3. Comparison of PSO & Chaos with the COCOMO model based on MARE on training data
Fig. 4. Comparison of the proposed method with the COCOMO model based on MARE on test data
Table 2. MARE on test data
Model Name         MARE
COCOMO             0.2952
PSO                0.1506
Chaos_PSO          0.1153
Proposed Method    0.0797
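To make the values in Table 2 easier to compare, the short calculation below expresses each model's MARE as a relative reduction with respect to the COCOMO baseline. The numbers are copied from Table 2; the relative-reduction measure itself is an added illustration and is not reported in the paper.

```python
# MARE values on the test data, copied from Table 2.
table2 = {
    "COCOMO": 0.2952,
    "PSO": 0.1506,
    "Chaos_PSO": 0.1153,
    "Proposed Method": 0.0797,
}

baseline = table2["COCOMO"]

# Relative reduction of MARE with respect to the COCOMO baseline.
reduction = {
    name: (baseline - value) / baseline
    for name, value in table2.items()
    if name != "COCOMO"
}

for name, fraction in reduction.items():
    print(f"{name}: {fraction:.1%} lower MARE than COCOMO")
```

This gives reductions of about 49.0% for PSO, 60.9% for Chaos_PSO and 73.0% for the proposed method.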
6. CONCLUSION AND FUTURE WORK
Estimating the cost of software projects is one of the most important aspects of software project management, and it has received significant attention from researchers over the last 30 years. In this paper we have used the project type (Organic, Embedded, Semi-detached) to classify the dataset, and the PSO algorithm and a hybrid of PSO and the Chaos Optimization Algorithm to estimate cost. Because the proposed method selects the better of the solutions produced by these two algorithms, it is capable of finding good solutions. Datasets from 60 NASA projects have been used, and the results indicate that combining the PSO algorithm with the Chaos Optimization Algorithm improves performance. Both the PSO algorithm and the combined algorithm also improve the cost estimation of software projects compared with the COCOMO model, based on MARE. The MARE produced by the COCOMO model on the testing set is 0.2952%, and the proposed method reduces it to 0.0797% on the same data. It can therefore be concluded that the proposed method performs better than the COCOMO model and could be applied to other problems.
REFERENCES
1. Capers Jones, Estimating Software Costs, Tata McGraw-Hill Edition, 2007.
2. Ian Sommerville, Software Engineering, Library of Congress Cataloging-in-Publication Data, 2011.
3. K. Vinay Kumar, V. Ravi, Mahil Carr, N. Raj Kiran, Software development cost estimation using wavelet neural networks, The Journal of Systems and Software, vol. 81, pp. 1853–1867, January 2008.
4. Mitat Uysal, Estimation of the Effort Component of the Software Projects Using Simulated Annealing Algorithm, World Academy of Science, Engineering and Technology, vol. 41, pp. 258–261, 2008.
5. P.V.G.D. Prasad Reddy and CH.V.M.K. Hari, Fuzzy Based PSO for Software Effort Estimation, Communications in Computer and Information Science, vol. 147, pp. 227–232, 2011.
6. Tegjyot Singh Sethi, CH.V.M.K. Hari, B.S.S. Kaushal, and Abhishek Sharma, Cluster Analysis and PSO for Software Cost Estimation, Communications in Computer and Information Science, vol. 147, pp. 281–86, 2011.
7. Farhad Soleimanian Gharehchopogh, Zahra Asheghi Dizaji, A New Chaos Agent Based Approach in Prediction of the Road Accidents with Hybrid of PSO Optimization and Chaos Optimization Algorithms: A Case Study, International Journal of Academic Research, Vol. 6, No. 2, March 2014.
8. B.W. Boehm, Software Engineering Economics, Prentice-Hall, Englewood Cliffs, New Jersey, 1981.
9. T. Menzies, D. Port, Zh. Chen, J. Hihn, Validation Methods for Calibrating Software Effort Models, ICSE, ACM, USA, 2005.
10. Bilal Alatas, Erhan Akin, A. Bedri Ozer, Chaos embedded particle swarm optimization algorithms, ScienceDirect, pp. 1715–1734, September 2007.
11. N. Nedjah and L. M. Mourelle, Swarm Intelligent Systems, Springer, 2006.
12. Farhad Soleimanian Gharehchopogh, Isa Maleki, Seyyed Reza Khaze, A Novel Particle Swarm Optimization Approach for Software Effort Estimation, International Journal of Academic Research, Vol. 6, No. 2, March 2014.
13. R.L. Haupt and S.E. Haupt, Practical Genetic Algorithms, John Wiley & Sons, Inc., 2004.
14. Farhad Soleimanian Gharehchopogh, Z. Asheghi Dizaji, Z. Aghighi, Evaluation of Particle Swarm Optimization Algorithm in Prediction of the Car Accidents on the Roads: A Case Study, International Journal on Computational Science & Applications (IJCSA), Vol. 3, No. 4, pp. 1–12, August 2013.
15. Heinz-Otto Peitgen, Hartmut Jürgens, Dietmar Saupe, Chaos and Fractals, Springer Science and Business Media, United States of America, 2004.