A Fuzzy Logic Based Software Cost Estimation Model - SERSC

International Journal of Software Engineering and Its Applications Vol. 7, No. 2, March, 2013

A Fuzzy Logic Based Software Cost Estimation Model

Ziauddin1, Shahid Kamal2, Shafiullah Khan3 and Jamal Abdul Nasir4

1 COMSATS Institute of Information Technology, Vehari, Pakistan
2 FSKSM, Universiti Teknologi Malaysia
3 Kohat University of Science & Technology, Kohat, Pakistan
4 ICIT, Gomal University, D.I.Khan, Pakistan

[email protected], [email protected], [email protected], 4 [email protected]

Abstract

Software cost estimation is a challenging and onerous task. Estimation by analogy is one of the expedient techniques in software effort estimation field. However, the methodology utilized for the estimation of software effort by analogy is not able to handle the categorical data in an explicit and precise manner. Early software estimation models are based on regression analysis or mathematical derivations. Today’s models are based on simulation, neural network, genetic algorithm, soft computing, fuzzy logic modelling etc. This paper aims to utilize a fuzzy logic model to improve the accuracy of software effort estimation. In this approach fuzzy logic is used to fuzzify input parameters of COCOMO II model and the result is defuzzified to get the resultant Effort. Triangular fuzzy numbers are used to represent the linguistic terms in COCOMO II model. The results of this model are compared with COCOMO II and Alaa Sheta Model. The proposed model yields better results in terms of MMRE, PRED(n) and VAF.

Keywords: Fuzzy logic, Triangular Fuzzy Number, Membership function and Fuzziness, Software Effort Estimation

1. Introduction

Estimating the work effort and the schedule required to develop and/or maintain a software system is one of the most critical activities in managing software projects. The task is known as Software Cost Estimation [27]. During development, cost and time estimates are useful for the initial rough validation and for monitoring the project’s progress; after completion, they are useful for assessing project productivity, for example.

Different techniques are used in software cost estimation. Algorithmic models, also called parametric models, use mathematical equations to produce software estimates. LOC-based models such as [2, 6, 7, 8] are algorithmic models. Ali Idri and Laila Kjiri [9] proposed the use of fuzzy sets in the COCOMO-81 models [8]. Musilek P. and others [18] proposed the f-COCOMO model, using fuzzy sets. The methodology of fuzzy sets giving rise to f-COCOMO [18] is sufficiently general to be applied to other models of software cost estimation such as the function point method [14]. W. Pedrycz and others [19] found that the concept of information granularity, and fuzzy sets in particular, plays an important role in making software cost estimation models more user friendly. Harish Mittal and Pradeep Bhatia [17] used triangular fuzzy numbers for fuzzy logic sizing. Lima, O.S.J. and others [13] proposed the use of concepts and properties from fuzzy set theory to extend function point analysis to fuzzy function point analysis, using trapezoid-shaped fuzzy numbers for the linguistic variables of the function point analysis complexity matrices.

In this paper we propose triangular fuzzy numbers to represent the linguistic variables. The results can be optimised for a given application by varying the fuzziness of the triangular fuzzy numbers. To apply fuzzy logic, fuzzification is first done using triangular fuzzy numbers, the fuzzy output is then evaluated, and the estimate is obtained by the defuzzification technique given in this paper.

1.1. Fuzzy Logic

In 1965, Lotfi Zadeh formally developed multi-valued set theory and introduced the term fuzzy logic [5]. Fuzzy Logic (FL) starts with the concept of fuzzy set theory. It is a theory of classes with unsharp boundaries and is considered an extension of classical set theory. The membership μ of an element x in a classical set A, a subset of the universe X, is defined as follows:

$$\mu_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \notin A \end{cases}$$

A system based on FL has a direct relationship with fuzzy concepts (such as fuzzy sets, linguistic variables, etc.) and fuzzy logic. The popular fuzzy logic systems can be categorised into three types: pure fuzzy logic systems, Takagi and Sugeno’s fuzzy system, and fuzzy logic systems with fuzzifier and defuzzifier. Since most engineering applications produce crisp data as input and expect crisp data as output, the last type is the most widely used. The fuzzy logic system with fuzzifier and defuzzifier was first proposed by Mamdani, and it has been successfully applied to a variety of industrial processes and consumer products [16]. The three main steps of applying fuzzy logic to a model are:

Step 1: Fuzzification: It converts a crisp input to a fuzzy set.
Step 2: Fuzzy Rule-Based System: Fuzzy logic systems use fuzzy IF-THEN rules.
Fuzzy Inference Engine: Once all crisp input values are fuzzified into their respective linguistic values, the inference engine accesses the fuzzy rule base to derive linguistic values for the intermediate and the output linguistic variables.
Step 3: Defuzzification: It converts fuzzy output into crisp output.

The use of fuzzy sets in logical expressions is known as fuzzy logic. A fuzzy set is characterized by a membership function, which associates with each point in the fuzzy set a real number in the interval [0, 1], called the degree or grade of membership [3, 4, 20, 21]. The membership function may be triangular, trapezoidal, parabolic etc. Fuzzy numbers are special convex and normal fuzzy sets, usually with a single modal value, representing uncertain quantitative information. A fuzzy number is a quantity whose value is imprecise, rather than exact as in the case of ordinary single-valued numbers [4, 9, 20, 21, 23]. Any fuzzy number can be thought of as a function, called the membership function, whose domain is specified, usually the set of real numbers, and whose range lies in the closed interval [0, 1]. Each numerical value of the domain is assigned a specific grade of membership, where 0 is the smallest possible value of the membership function and 1 is the largest. In many respects fuzzy numbers depict the physical world more realistically than single-valued numbers. The curve in Figure 1a is a triangular fuzzy number, the curve in Figure 1b is a trapezoidal fuzzy number, and the curve in Figure 1c is a bell-shaped fuzzy number.
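As a rough illustration of the difference between crisp and fuzzy membership described above, the following Python sketch evaluates a crisp set, a trapezoidal membership function and a bell-shaped one. The function names, the Gaussian form chosen for the bell curve and all numeric parameters are assumptions made for illustration; they are not taken from this paper.

```python
import math


def crisp(x, low, high):
    """Classical set membership: 1 inside [low, high], 0 outside."""
    return 1.0 if low <= x <= high else 0.0


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def bell(x, centre, width):
    """Bell-shaped membership, modelled here as a Gaussian (an assumption)."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)


if __name__ == "__main__":
    # Hypothetical sample points on an arbitrary axis.
    for x in (2.5, 7.0, 12.0):
        print(x, crisp(x, 5, 10), trapezoidal(x, 0, 5, 10, 15), round(bell(x, 7, 2), 3))
```

The crisp function can only return 0 or 1, whereas the other two return intermediate grades near the boundaries, which is the property the fuzzy model relies on.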



Figure 1. Fuzzy Membership Functions

A triangular fuzzy number (TFN) is described by a triplet (α, m, β), where m is the modal value and α and β are the left and right boundaries respectively. The fuzziness of a TFN (α, m, β) is defined as:

$$F = \frac{\beta - \alpha}{2m}, \qquad 0 < F < 1$$

The higher the value of fuzziness, the fuzzier the TFN.

1.2. COCOMO II

The COCOMO I [7] model is a regression-based software cost estimation model, which was developed by Boehm in 1981 and is thought to be the most cited and the most plausible model among all traditional cost estimation models. COCOMO I was a stable model at that time. One of the problems with the use of COCOMO I today is that it does not match the development environment of the late 1990s. Therefore, in 1997, Boehm developed COCOMO II to solve most of the COCOMO I problems. Figure 2 shows the process of software schedule, cost, and manpower estimation in COCOMO II. The Post Architecture Model of COCOMO II uses several software attributes: 17 Effort Multipliers (EMs), 5 Scale Factors (SFs) and Software Size (SS), from which the Effort is estimated.

[Figure 2: the inputs EMs (17), SF (5) and SS (1) feed the COCOMO II Effort Estimation Model, which outputs Effort, Schedule and Personnel.]

Figure 2. COCOMO II Estimation Process

The formula for the process is given by:

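$$\mathrm{Effort\ (PM)} = A \times \mathrm{Size}^{\,B + 0.01 \sum_{j=1}^{5} SF_j} \times \prod_{i=1}^{17} EM_i$$

$$\mathrm{Schedule} = C \times \mathrm{Effort}^{\,D + 0.2 \times 0.01 \sum_{j=1}^{5} SF_j}$$

$$\mathrm{Personnel} = \mathrm{Effort} / \mathrm{Schedule}$$

where A = 2.94; B = 0.91; C = 3.67; D = 0.28. More details about the model can be found in [7].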
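The COCOMO II relations can be sketched in a few lines of Python. The sketch assumes the standard COCOMO II.2000 Post Architecture form, in which the size exponent is B + 0.01 ΣSF and the schedule exponent is D + 0.2 × 0.01 ΣSF, together with the constants A, B, C and D quoted above. The function names and the sample ratings are hypothetical and are not taken from the paper's data set.

```python
from math import prod

# Calibration constants quoted in the text.
A, B, C, D = 2.94, 0.91, 3.67, 0.28


def cocomo2_effort(size_kloc, scale_factors, effort_multipliers):
    """Effort in person-months: A * Size^(B + 0.01 * sum(SF)) * product(EM)."""
    exponent = B + 0.01 * sum(scale_factors)
    return A * size_kloc ** exponent * prod(effort_multipliers)


def cocomo2_schedule(effort_pm, scale_factors):
    """Schedule in months: C * Effort^(D + 0.2 * 0.01 * sum(SF)) (assumed standard form)."""
    exponent = D + 0.2 * 0.01 * sum(scale_factors)
    return C * effort_pm ** exponent


def personnel(effort_pm, schedule_months):
    """Average staffing level: Effort / Schedule."""
    return effort_pm / schedule_months


if __name__ == "__main__":
    # Hypothetical project: 20 KLOC, illustrative scale factors, all EMs nominal (1.0).
    sfs = [3.72, 3.04, 4.24, 3.29, 4.68]
    ems = [1.0] * 17
    effort = cocomo2_effort(20.0, sfs, ems)
    schedule = cocomo2_schedule(effort, sfs)
    print(round(effort, 1), round(schedule, 1), round(personnel(effort, schedule), 1))
```

Because the effort multipliers enter as a product, a single rating above or below 1.0 scales the whole estimate, while the scale factors act through the exponent and therefore matter more for large projects.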



2. Proposed Model

Our model is based on COCOMO II and fuzzy logic. COCOMO II takes a set of input software attributes, namely 17 Effort Multipliers (EMs), 5 Scale Factors (SFs) and one Size in KDLOC (SZ), and produces one output, Effort. The architecture of the model is shown in Figure 3.

[Figure 3: the inputs EMs (17), SF (5) and SZ (1) enter the Fuzzification module (EM-Fuzzification, SF-Fuzzification, SZ-Fuzzification); the Inference Engine, supported by the Rule Base and Database, passes its result to Defuzzification, which outputs EFFORT.]

Figure 3. Fuzzy Logic Model for Software Effort Estimation

2.1. Inputs

There are three sets of inputs:
• Size in KLOC
• 17 Effort Multipliers
• 5 Scale Factors

All these inputs are provided as crisp data and are directed to the fuzzification module.

2.2. Fuzzification

This module is divided into three sub-modules. The SZ-Fuzzification module converts the Size input to a fuzzy variable, the SF-Fuzzification module converts the Scale Factor inputs to fuzzy variables, and the EM-Fuzzification module converts the Effort Multiplier inputs to fuzzy variables. The terms Very Low, Low, Nominal, High, Very High and Extra High have been defined for each variable. We defined a fuzzy set for each linguistic value using a Triangular Membership Function. A Triangular Membership Function (TMF) is a three-point function [8], defined by minimum (α), maximum (β) and modal (m) values, that is, T(α, m, β), where α ≤ m ≤ β. We take each linguistic variable as a triangular fuzzy number, TFN (α, m, β), with α ≤ m and β ≥ m, whose membership function μ(x) is defined as:

$$\mu(x) = \begin{cases} 0, & x \le \alpha \\ \dfrac{x - \alpha}{m - \alpha}, & \alpha \le x \le m \\ \dfrac{\beta - x}{\beta - m}, & m \le x \le \beta \\ 0, & x \ge \beta \end{cases}$$

where m is the mean of input sizes.
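A minimal Python sketch of the triangular membership function is given below. The function names and the sample triplet are hypothetical. The fuzziness helper assumes the form F = (β − α) / (2m), which should be treated as an assumption rather than as this paper's confirmed definition.

```python
def tfn_membership(x, alpha, m, beta):
    """Membership grade of x in the triangular fuzzy number TFN(alpha, m, beta)."""
    if not alpha <= m <= beta:
        raise ValueError("expected alpha <= m <= beta")
    if x == m:
        return 1.0  # the modal value always has full membership
    if x <= alpha or x >= beta:
        return 0.0
    if x < m:
        return (x - alpha) / (m - alpha)  # rising edge
    return (beta - x) / (beta - m)  # falling edge


def tfn_fuzziness(alpha, m, beta):
    """Fuzziness of a TFN, assuming F = (beta - alpha) / (2 * m)."""
    return (beta - alpha) / (2.0 * m)


if __name__ == "__main__":
    # Hypothetical size interval centred on 20 KLOC.
    for size in (10, 15, 20, 25, 30):
        print(size, tfn_membership(size, 10, 20, 30))
    print("fuzziness:", tfn_fuzziness(10, 20, 30))
```

Widening the interval between α and β while keeping m fixed raises the fuzziness, so more neighbouring inputs receive a non-zero grade in the same set.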



The Fuzzy Set defined for Size is shown in Figure 4a. The Size range has been divided into small intervals to get a better fuzzy number. Figure 4b represents Fuzzy Set for Scale Factor RESL whereas Figure 4c represents Fuzzy Set for Effort Multiplier PERS. All the scale factors and effort multipliers have scaling range from Very Low to Extra High.

Figure 4a. Fuzzy Set for SIZE

Figure 4b. Fuzzy Set for Risk Resolution (RESL)

Figure 4c. Fuzzy Set for Personnel Capability (PERS)

The fuzzification process converts the crisp data to linguistic variables, which are passed to the Inference Engine.
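The conversion from a crisp rating to linguistic values can be sketched as follows in Python. The six term names follow the text, but the numeric scale and the triangle parameters in `TERMS` are hypothetical placeholders; the paper's actual fuzzy sets are those shown in Figures 4a to 4c.

```python
def tri(x, alpha, m, beta):
    """Triangular membership grade of x in TFN(alpha, m, beta)."""
    if x == m:
        return 1.0
    if x <= alpha or x >= beta:
        return 0.0
    if x < m:
        return (x - alpha) / (m - alpha)
    return (beta - x) / (beta - m)


# Hypothetical rating scale: one triangle per linguistic term, overlapping its neighbours.
TERMS = {
    "VL": (0, 1, 2),
    "LO": (1, 2, 3),
    "NOM": (2, 3, 4),
    "HI": (3, 4, 5),
    "VH": (4, 5, 6),
    "XH": (5, 6, 7),
}


def fuzzify(x, terms=TERMS):
    """Return the linguistic terms that x belongs to, with their membership grades."""
    grades = {label: tri(x, *abc) for label, abc in terms.items()}
    return {label: g for label, g in grades.items() if g > 0.0}


if __name__ == "__main__":
    print(fuzzify(2.25))  # a rating between Low and Nominal
```

A crisp value that falls between two modal values belongs partly to both neighbouring terms, and those grades are what the inference engine receives.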



2.3. Inference Engine

The inference engine operates on a set of fuzzy rules. The rules of the fuzzy model contain the linguistic variables related to the project. We have implemented this model using the FIS tool in MATLAB. A sample of the rules is presented below:

if (PREC is VL) then (EFFORT is XH)
if (PREC is LO) then (EFFORT is VH)
if (FLEX is VL) then (EFFORT is XH)

The proposed model uses 281 rules.

2.4. Defuzzification

This module converts the fuzzy output into crisp data. Defuzzification has been performed using the Centroid Method, which calculates the Centre of Gravity (COG) of the area under the curve as:

$$\mathrm{COG} = \frac{\sum_i \mu(x_i)\, x_i}{\sum_i \mu(x_i)}$$
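The rule evaluation and centroid defuzzification steps can be sketched as follows. This is a simplified Mamdani-style illustration in Python, not the authors' MATLAB FIS implementation. The output sets in `EFFORT_SETS`, the 0 to 110 output axis and the firing strengths are all hypothetical, and min-clipping with max-aggregation is an assumed choice of operators.

```python
def tri(x, alpha, m, beta):
    """Triangular membership grade of x in TFN(alpha, m, beta)."""
    if x == m:
        return 1.0
    if x <= alpha or x >= beta:
        return 0.0
    if x < m:
        return (x - alpha) / (m - alpha)
    return (beta - x) / (beta - m)


# Hypothetical output fuzzy sets for EFFORT on an arbitrary 0..110 axis.
EFFORT_SETS = {"VH": (50, 70, 90), "XH": (70, 90, 110)}


def centroid(xs, mus):
    """Discrete centre of gravity: sum(mu_i * x_i) / sum(mu_i)."""
    total = sum(mus)
    if total == 0:
        raise ValueError("no rule fired; the centroid is undefined")
    return sum(x * mu for x, mu in zip(xs, mus)) / total


def infer(firing):
    """Clip each output set at its rule's firing strength, aggregate with max, defuzzify."""
    xs = list(range(0, 111))
    mus = [
        max((min(strength, tri(x, *EFFORT_SETS[label]))
             for label, strength in firing.items()), default=0.0)
        for x in xs
    ]
    return centroid(xs, mus)


if __name__ == "__main__":
    # Suppose "PREC is VL" holds to degree 0.6 (rule output XH)
    # and "PREC is LO" holds to degree 0.4 (rule output VH).
    print(round(infer({"XH": 0.6, "VH": 0.4}), 2))
```

With a single symmetric output set the centroid coincides with its modal value; when several rules fire, the crisp result is pulled towards the sets with the larger firing strengths.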

3. Experimental Study

To evaluate the efficiency of our proposed model, we chose data from 30 projects developed in a local software house, ranging from Very Small to Large. We compared the model with two estimation methods, COCOMO II and the Alaa Sheta [1] estimation method, using MMRE, PRED(L) and VAF as criteria for assessment.

The Mean Magnitude of Relative Error, MMRE, is probably the most widely used evaluation criterion for assessing the performance of competing models. One purpose of MMRE is to help select the best model, as it measures the prediction accuracy of the competing models. MMRE is calculated as the average of the Magnitudes of Relative Error (MREs) of the individual estimates. The MRE and MMRE are calculated as:

$$\mathrm{MRE}_i = \frac{|E_i - E'_i|}{E_i} \times 100, \qquad \mathrm{MMRE} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{MRE}_i$$

Variance Accounted For (VAF) is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model. It is calculated as:

$$\mathrm{VAF}(\%) = \left(1 - \frac{\mathrm{var}(E - E')}{\mathrm{var}(E)}\right) \times 100$$

where E is the actual Effort and E’ is the calculated Effort.


Prediction PRED(L) is the probability of the model having a relative error less than or equal to L. It is calculated as:

PRED(L) = K / N

where K is the number of observations whose MRE is less than or equal to L and N is the total number of observations.

Table 1 shows the estimated efforts of all three models, together with the MRE of every estimate. Table 2 compares the models. The estimation results are shown graphically in Figures 5, 6 and 7. The comparison in Table 2 shows that the proposed model has better estimation accuracy than the other models, with an MMRE of 7.512% and a VAF of 98.77%, which indicates that similar accuracy can be expected if the model is tested in a different environment with different data. To further verify the prediction, we analysed the results at the 25%, 15%, 10% and 8% levels. For the proposed model, 63.33% of the estimates have an MRE of at most 8%, which is close to its average MRE and shows the stability of its estimates. The stability of the estimates of the proposed model can also be verified from Figures 5, 6 and 7. Figure 7 shows that there are only minute variations between the actual and estimated efforts.

Table 1. Estimated Efforts and MREs of all Three Models

| P.No | Size | Actual | COCOMO II | Alaa Sheta | Proposed | MRE COCOMO (%) | MRE Alaa Sheta (%) | MRE Proposed (%) |
|------|------|--------|-----------|------------|----------|----------------|--------------------|------------------|
| 1 | 18 | 301 | 214 | 243 | 293 | 28.90365 | 19.2691 | 2.657807 |
| 2 | 50 | 1063 | 841 | 931 | 984 | 20.88429 | 12.41769 | 7.431797 |
| 3 | 40 | 605 | 601 | 453 | 537 | 0.661157 | 25.12397 | 11.23967 |
| 4 | 22 | 243 | 282 | 181 | 201 | 16.04938 | 25.5144 | 17.28395 |
| 5 | 13 | 141 | 128 | 95 | 127 | 9.219858 | 32.62411 | 9.929078 |
| 6 | 12 | 112 | 131 | 91 | 121 | 16.96429 | 18.75 | 8.035714 |
| 7 | 3 | 16 | 15 | 13 | 16 | 6.25 | 18.75 | 0 |
| 8 | 11 | 91 | 78 | 65 | 82 | 14.28571 | 28.57143 | 9.89011 |
| 9 | 42 | 781 | 751 | 634 | 722 | 3.841229 | 18.82202 | 7.554417 |
| 10 | 16 | 233 | 241 | 201 | 249 | 3.433476 | 13.73391 | 6.866953 |
| 11 | 21 | 334 | 304 | 289 | 318 | 8.982036 | 13.47305 | 4.790419 |
| 12 | 52 | 982 | 1043 | 756 | 890 | 6.211813 | 23.01426 | 9.368635 |
| 13 | 23 | 305 | 361 | 300 | 281 | 18.36066 | 1.639344 | 7.868852 |
| 14 | 27 | 421 | 459 | 398 | 401 | 9.026128 | 5.463183 | 4.750594 |
| 15 | 9 | 45 | 48 | 36 | 41 | 6.666667 | 20 | 8.888889 |
| 16 | 15 | 189 | 197 | 178 | 192 | 4.232804 | 5.820106 | 1.587302 |
| 17 | 4 | 17 | 19 | 14 | 17 | 11.76471 | 17.64706 | 0 |
| 18 | 6 | 22 | 23 | 19 | 22 | 4.545455 | 13.63636 | 0 |
| 19 | 2 | 8 | 8 | 7 | 8 | 0 | 12.5 | 0 |
| 20 | 7 | 34 | 39 | 31 | 29 | 14.70588 | 8.823529 | 14.70588 |
| 21 | 13 | 84 | 97 | 72 | 79 | 15.47619 | 14.28571 | 5.952381 |
| 22 | 18 | 274 | 329 | 241 | 268 | 20.07299 | 12.0438 | 2.189781 |
| 23 | 24 | 403 | 398 | 365 | 379 | 1.240695 | 9.42928 | 5.955335 |
| 24 | 26 | 371 | 352 | 322 | 362 | 5.121294 | 13.20755 | 2.425876 |
| 25 | 61 | 1241 | 1459 | 871 | 1102 | 17.56648 | 29.81467 | 11.20064 |
| 26 | 29 | 677 | 605 | 561 | 638 | 10.63516 | 17.13442 | 5.760709 |
| 27 | 19 | 291 | 271 | 202 | 263 | 6.872852 | 30.58419 | 9.621993 |
| 28 | 10 | 110 | 128 | 94 | 119 | 16.36364 | 14.54545 | 8.181818 |
| 29 | 21 | 239 | 201 | 189 | 213 | 15.89958 | 20.9205 | 10.87866 |
| 30 | 12 | 145 | 168 | 156 | 189 | 15.86207 | 7.586207 | 30.34483 |

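Table 2. Comparison of the Models

| Model | MMRE | VAF | PRED(25) | PRED(15) | PRED(10) | PRED(8) |
|-------|------|-----|----------|----------|----------|---------|
| COCOMO II | 11.003% | 95.86% | 93.33% | 63.33% | 50% | 40% |
| Alaa Sheta | 16.838% | 93.90% | 80% | 53.33% | 20% | 10% |
| Proposed | 7.512% | 98.77% | 96.33% | 93.33% | 80% | 63.33% |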
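The assessment criteria defined in Section 3 can be computed with a short Python sketch such as the one below. The function names are hypothetical, and the sample data are only the first five projects of Table 1 (actual effort and the proposed model's estimate), so the printed values illustrate the calculation and are not expected to match the 30-project figures in Table 2. Population variance is used for VAF; since the same variance estimator appears in the numerator and denominator, the choice does not affect the ratio.

```python
from statistics import mean, pvariance


def mre(actual, estimated):
    """Magnitude of Relative Error, in percent."""
    return abs(actual - estimated) / actual * 100.0


def mmre(actuals, estimates):
    """Mean of the MREs over all projects, in percent."""
    return mean([mre(a, e) for a, e in zip(actuals, estimates)])


def pred(actuals, estimates, level):
    """PRED(L): percentage of projects whose MRE is at most `level` percent."""
    hits = sum(1 for a, e in zip(actuals, estimates) if mre(a, e) <= level)
    return 100.0 * hits / len(actuals)


def vaf(actuals, estimates):
    """Variance Accounted For, in percent: (1 - var(E - E') / var(E)) * 100."""
    residuals = [a - e for a, e in zip(actuals, estimates)]
    return (1.0 - pvariance(residuals) / pvariance(actuals)) * 100.0


if __name__ == "__main__":
    # Projects 1 to 5 of Table 1: actual effort and the proposed model's estimate.
    actual = [301, 1063, 605, 243, 141]
    proposed = [293, 984, 537, 201, 127]
    print("MMRE:", round(mmre(actual, proposed), 3))
    print("PRED(10):", pred(actual, proposed, 10))
    print("VAF:", round(vaf(actual, proposed), 2))
```

Lower MMRE and higher PRED(L) and VAF indicate a better model, which is the direction of comparison used in Table 2.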

Figure 5. COCOMO II Vs Actual Calculation of Effort



Figure 6. Alaa Sheta Vs Actual Calculation of Effort

Figure 7. Proposed Model Vs Actual Calculation of Effort

4. Conclusion

In this paper, we have presented a software effort estimation model based on fuzzy logic. A triangular membership function has been used to generate the linguistic fuzzy values. The same model can easily be developed using trapezoidal or Gauss-Bell membership functions. There is also still room for improvement in the fuzzification process: if it is optimized further, the number of fuzzy rules implemented in the inference engine should be reduced substantially.



References

[1] A. F. Sheta, “Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects”, Journal of Computer Science, vol. 2, no. 2, (2006), pp. 118-123.
[2] A. J. Albrecht, “Measuring application development productivity”, SHARE/GUIDE IBM Application Development Symposium.
[3] A. S. Andreou and E. Papatheocharous, “Software Cost Estimation using Fuzzy Decision Trees”, in 23rd IEEE/ACM International Conference on Automated Software Engineering, (2008), pp. 371-374.
[4] I. Attarzadeh and S. Hockow, “Improving the Accuracy of Software Cost Estimation Model Based on a New Fuzzy Logic Model”, World Applied Sciences Journal, vol. 8, no. 2, (2010), pp. 177-184.
[5] J. W. Bailey and Basili, “A Meta model for software development resource expenditure”, Proc. Intl. Conf. Software Engineering, (1981), pp. 107-115.
[6] V. R. Basili and K. Freburger, “Programming Measurement and Estimation in the Software Engineering Laboratory”, Journal of System and Software, vol. 2, (1981), pp. 47-57.
[7] B. Boehm, “Software Cost Estimation with COCOMO II”, Prentice Hall, (2000).
[8] B. Boehm, “Software Engineering Economics”, Englewood Cliffs, NJ, Prentice-Hall, (1981).
[9] A. Idri, A. Abran and L. Kjiri, “COCOMO cost model using Fuzzy Logic”, 7th International Conference on Fuzzy Theory & Technology, Atlantic, New Jersey, (2000).
[10] A. Idri, A. Abran and T. M. Khoshgoftaar, “Fuzzy Analogy: A new Approach for Software Cost Estimation”, 11th International Workshop in Software Measurements, Montreal, (2001) August 28-29, pp. 93-101.
[11] C. Jones, “Programming Languages Table”, Release 8.2, (1996) March.
[12] I. R. Khan, A. Alam and H. Anwar, “Efficient Software Cost Estimation using Neuro-Fuzzy Technique”, National Conference on Recent Developments in Computing and its Applications, (2009), pp. 376-381.
[13] O. S. J. Lima, P. P. Farias and A. D. Belchor, “A Fuzzy Model for Function Point Analysis to Development and Enhancement Project Assessments”, CLEI EJ, vol. , no. 2, (2002).
[14] J. E. Matson, B. E. Barrett and J. M. Mellichamp, “Software Development Cost Estimation Using Function Points”, IEEE Trans. on Software Engineering, vol. 20, no. 4, (1994), pp. 275-287.
[15] E. Mendes and N. Mosley, “Web Cost Estimation: An Introduction, Web engineering: principles and techniques”, Ch 8, (2005).
[16] A. Mittal, K. Parkash and H. Mittal, “Software cost estimation using fuzzy logic”, SIGSOFT Software Engineering Notes, vol. 35, no. 1, (2010), pp. 1-7.
[17] H. Mittal and P. Bhatia, “Optimization Criterion for Effort Estimation using Fuzzy Technique”, CLEI Electronic Journal, vol. 10, no. 1, (2007), pp. 2.
[18] P. Musílek, W. Pedrycz, G. Succi and M. Reformat, “Software Cost Estimation with Fuzzy Models”, ACM SIGAPP Applied Computing Review, vol. 8, no. 2, (2000), pp. 24-29.
[19] W. Pedrycz, F. F. Peters and S. Ramanna, “A Fuzzy Set Approach to Cost Estimation of Software Projects”, Proceedings of the 1999 IEEE Canadian Conference on Electrical and Computer Engineering, Shaw Conference Center, Edmonton, Alberta, Canada, (1999).
[20] P. V. G. D. Prasad Reddy, K. R. Sudha, P. Rama Sree and S. N. S. V. S. C. Ramesh, “Fuzzy Based Approach for Predicting Software Development Effort”, International Journal of Software Engineering (IJSE), vol. 1, no. 1, (2010), pp. 1-11.
[21] P. Reddy PVGD, “Particle Swarm Optimization In The Fine Tuning Of Fuzzy Software Cost Estimation Models”, International Journal of Software Engineering (IJSE), vol. 1, no. 2, (2010), pp. 12-23.
[22] R. S. Pressman, “Software Engineering; A Practitioner Approach”, McGraw-Hill International Edition, Sixth Edition, (2005).
[23] C. S. Reddy and K. Raju, “An Improved Fuzzy Approach for COCOMO's Effort Estimation using Gaussian Membership Function”, Journal of Software, vol. 4, no. 5, (2009), pp. 452-459.
[24] C. Yau and R. H. L. Tsoi, “Assessing the Fuzziness of General System Characteristics in Estimating Software Size”, IEEE, (1994), pp. 189-193.
[25] L. A. Zadeh, “From Computing with numbers to computing with words-from manipulation of measurements to manipulation of perceptions”, Int. J. Appl. Math. Computer Sci., vol. 12, no. 3, (2002), pp. 307-324.
[26] L. A. Zadeh, “Fuzzy Sets”, Information and Control, vol. 8, (1965), pp. 338-353.
[27] Z. Zia, A. Rashid and K. U. Zaman, “Software cost estimation for component based fourth-generation-language software applications”, IET Software, vol. 5, no. 1, (2011), pp. 103-110.


Authors

Ziauddin
Dr. Ziauddin is working as Assistant Professor in COMSATS Institute of Information Technology, Vehari. He has worked in Gomal University for 22 years. He has published more than 15 research papers in reputed international journals. His areas of expertise are Software Engineering, Software Process Improvement, Software Reliability Engineering, Software Development Improvement, Software Quality Assurance and Requirement Engineering. He is a Gold Medalist from Peshawar University in M.Sc Computer Science. He has a Diploma in Computer Forensics from the School of Education, Indiana University, USA.

Shahid Kamal
Shahid Kamal Tipu has been working as Assistant Professor in the Institute of Computing & Information Technology, Gomal University, D.I.Khan since 2000. Currently he is doing his Ph.D. in Malaysia. His areas of expertise are Information System Security and Software Quality. He has published his research in reputed international journals. He is a renowned software engineer in local industry.

Shafiullah Khan
Dr. Shafiullah Khan is Assistant Professor in the Institute of Information Technology, Kohat University of Science and Technology, Pakistan. He received his PhD from the School of Engineering and Design, Brunel University, UK. His research mainly focuses on wireless broadband network architecture, security and privacy, security threats and mitigating techniques.

Jamal Abdul Nasir
Jamal Abdul Nasir has been working as Assistant Professor in the Institute of Computing & Information Technology, Gomal University, D.I.Khan since 1988. Currently he is doing his Ph.D. in CIIT, Islamabad. His areas of expertise are Data Mining and Artificial Intelligence. He has published his research in reputed international journals. He is one of the pioneers of information technology in the country.
