Dronacharya Research Journal
Issue II
Jan-June 2010
ISSN No.:0975-3389
HANDLING IMPRECISION IN SOFTWARE ENGINEERING MEASUREMENTS USING FUZZY LOGIC

Dr. Pradeep Kumar Bhatia*
Professor, Department of Computer Science, Guru Jambheshwar University of Science & Technology, Hisar, Haryana, India
E-mail: [email protected]

Harish Kumar Mittal
Lecturer, Department of Information Technology, Vaish College of Engineering, Rohtak-124001, India
E-mail: [email protected]

Kevika Singla
Senior Software Engineer, Royal Bank of Scotland, Gurgaon-122001, India
E-mail: [email protected]
______________________________________________________________________________________________________________________________
ABSTRACT
The measurement of software is recognized as a fundamental topic in software engineering research and practice. Before any project is initiated, the time, cost and manpower it will require must be estimated to ensure its success. Software effort and quality estimation has become an increasingly important field because of the increasingly pervasive role of software in today's world. Traditional estimation approaches can have serious difficulties when used on software engineering data, which is usually scarce, incomplete, and imprecisely collected. Because fuzzy logic is well suited to handling such uncertainties, the emphasis here is on quantitative estimation of various software attributes using fuzzy techniques. As a rule of thumb, wherever decision making or human communication is involved in the development process, fuzzy logic can be used to improve software development processes and products.
Key words: Fuzzy Logic, Effort Estimation, Software Quality, Software Maintainability, Software Testing, Software Productivity.
__________________________________________________________________________________________________________________________
1. INTRODUCTION
Software metrics are measurements of the software development process and product that can be used to indicate the performance of the software product and to build software quality models. Before any project is initiated, the time, cost and manpower it will require must be estimated to ensure its success. Software effort and quality estimation has become an increasingly important field because of the increasingly pervasive role of software in today's world. Software cost estimation is the process of predicting the amount of effort required to build a software system and its duration. With the adoption of new technologies, the existing cost estimation formulae no longer give good estimates. Moreover, the estimated size of a project is a fuzzy number, yet many of these formulae do not take fuzziness into account. In order to develop high quality, reliable software, various quality attributes need to be quantified. The present study focuses mainly on software effort, quality and reliability assessment using fuzzy logic. The paper is divided into four sections. Section 2 gives some basics of fuzzy logic along with a variety of criteria for validating the accuracy of measurement techniques. Section 3 discusses some prevalent techniques for software engineering measurements using fuzzy logic. Section 4 concludes the paper and describes promising topics worthy of further research.
*Corresponding Author
2. RELATED TERMS
2.1 Fuzzy Logic
Fuzzy logic is a superset of conventional (Boolean) logic that has been extended to handle the concept of partial truth, i.e., truth values between completely true and completely false. It is based on fuzzy set theory, introduced in 1965 by Prof. Zadeh in the paper Fuzzy Sets [ZADE65], and is a methodology for solving problems that are too complex to be understood quantitatively. The use of fuzzy sets in logical expressions is known as fuzzy logic. A fuzzy set is characterized by a membership function, which associates with each point in the universe of discourse a real number in the interval [0, 1], called the degree or grade of membership. Fuzzy logic systems (FLS) are one of the main developments and successes of fuzzy sets and fuzzy logic. An FLS is a rule-based system that implements a nonlinear mapping between its inputs and outputs. The fuzzy logic process consists of the following steps:
• Input as a crisp number
• Fuzzification
• Fuzzy logic
• Defuzzification
• Crisp output
Fig. 1: Fuzzy Logic Process
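The five steps of the fuzzy logic process can be sketched in a few lines of Python. This is only an illustrative sketch: the fuzzy sets for module size, the effort levels, the three rules, and the weighted-average defuzzification are all assumptions chosen for the example and are not taken from any model surveyed in this paper.

```python
def triangular(x, a, m, b):
    """Grade of membership of x in a triangular fuzzy set (feet a and b, peak m)."""
    if x <= a or x >= b:
        return 0.0
    if x <= m:
        return (x - a) / (m - a)
    return (b - x) / (b - m)


# Hypothetical fuzzy sets for the crisp input (size in KLOC).
SIZE_SETS = {"small": (0, 10, 30), "medium": (10, 30, 50), "large": (30, 50, 70)}
# Hypothetical crisp output levels (effort in person-months).
EFFORT_LEVELS = {"low": 20.0, "moderate": 80.0, "high": 160.0}
# Hypothetical rule base: IF size is <key> THEN effort is <value>.
RULES = {"small": "low", "medium": "moderate", "large": "high"}


def fuzzify(size):
    """Fuzzification: map the crisp input to a grade in every input set."""
    return {label: triangular(size, *abc) for label, abc in SIZE_SETS.items()}


def infer(grades):
    """Fuzzy logic step: each rule fires with the grade of its antecedent."""
    strengths = {}
    for size_label, effort_label in RULES.items():
        fired = grades[size_label]
        strengths[effort_label] = max(strengths.get(effort_label, 0.0), fired)
    return strengths


def defuzzify(strengths):
    """Defuzzification: weighted average of the output levels."""
    total = sum(strengths.values())
    if total == 0:
        raise ValueError("input lies outside every fuzzy set")
    return sum(EFFORT_LEVELS[k] * w for k, w in strengths.items()) / total


def estimate_effort(size):
    """Crisp input -> fuzzification -> rules -> defuzzification -> crisp output."""
    return defuzzify(infer(fuzzify(size)))


print(estimate_effort(20))
```

A size of 20 KLOC belongs equally to the assumed "small" and "medium" sets, so the crisp output falls halfway between the "low" and "moderate" effort levels. A Mamdani-style system would instead clip output fuzzy sets and take the centroid of their union; the weighted average is used here only to keep the sketch short.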
2.2 Fuzzy Number
Fuzzy numbers are special convex and normal fuzzy sets, usually with a single modal value, representing uncertain quantitative information. A fuzzy number is a quantity whose value is imprecise, rather than exact as in the case of ordinary single valued numbers. Any fuzzy number can be thought of as a function, called the membership function, whose domain is specified, usually as the set of real numbers, and whose range lies in the closed interval [0, 1]. Each numerical value in the domain is assigned a specific grade of membership, where 0 is the smallest possible value of the membership function and 1 is the largest. In many respects fuzzy numbers depict the physical world more realistically than single valued numbers. Suppose that we are driving along a highway where the speed limit is 80 km/hr. We try to hold the speed at exactly 80 km/hr, but our car lacks cruise control, so the speed varies from moment to moment. If we note the instantaneous speed over a period of several minutes and then plot the result in rectangular coordinates, we may get a curve that looks like one of the curves shown in figure 2. However, there is no restriction on the shape of the curve. The curves in figure 2 show a triangular fuzzy number, a trapezoidal fuzzy number, and a bell shaped fuzzy number.
[Figure: membership grade µ(x), from 0 to 1, plotted against x for a triangular MF, a trapezoidal MF and a bell shaped MF]
Fig. 2: Membership Functions
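The three shapes named above can be written down directly as functions. The sketch below uses the usual textbook parameterizations; the function and parameter names are chosen for this example, and the numbers used for the highway speed are illustrative assumptions.

```python
def triangular_mf(x, a, m, b):
    """Triangular MF: 0 at the feet a and b, rising linearly to 1 at the peak m."""
    if x <= a or x >= b:
        return 0.0
    if x <= m:
        return (x - a) / (m - a)
    return (b - x) / (b - m)


def trapezoidal_mf(x, a, b, c, d):
    """Trapezoidal MF: 0 outside (a, d), 1 on the plateau [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def bell_mf(x, width, slope, center):
    """Generalized bell MF: 1 / (1 + |(x - center) / width| ** (2 * slope))."""
    return 1.0 / (1.0 + abs((x - center) / width) ** (2 * slope))


# "About 80 km/hr" as a triangular fuzzy number with assumed feet at 70 and 90.
for speed in (70, 75, 80, 85, 90):
    print(speed, triangular_mf(speed, 70, 80, 90))
```

Under these assumed parameters a speed of exactly 80 km/hr has grade 1, 75 km/hr and 85 km/hr have grade 0.5, and anything at or beyond 70 or 90 km/hr has grade 0.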
2.3 Criteria for validation of estimation models
The following criteria are frequently used by various researchers for validation of estimation models:

Variance Accounted For (VAF)
A model which gives a higher VAF is better than one which gives a lower VAF.

$$\mathrm{VAF}(\%) = \left(1 - \frac{\operatorname{var}(E - \hat{E})}{\operatorname{var}(E)}\right) \times 100 \tag{0.1}$$

where $E$ = measured value, $\hat{E}$ = estimated value, $f$ = frequency, and

$$\operatorname{var}(x) = \frac{\sum f\,(x - \bar{x})^{2}}{\sum f} \tag{0.2}$$

where $\bar{x}$ = mean of $x$.

Mean Absolute Relative Error (MARE)

$$\mathrm{MARE}(\%) = \frac{\sum f\,R_E}{\sum f} \times 100 \tag{0.3}$$

where the absolute relative error $R_E$ is

$$R_E = \frac{\lvert E - \hat{E} \rvert}{E} \tag{0.4}$$

A model which gives a lower Mean Absolute Relative Error is better than one which gives a higher Mean Absolute Relative Error.

Variance Absolute Relative Error

$$\operatorname{var} R_E\,(\%) = \frac{\sum f\,(R_E - \operatorname{mean} R_E)^{2}}{\sum f} \times 100 \tag{0.5}$$

A model which gives a lower Variance Absolute Relative Error is better than one which gives a higher Variance Absolute Relative Error.

Root Mean Square Error (RMSE)
RMSE is a frequently used measure of the differences between the values predicted by a model or estimator and the values actually observed from the thing being modeled or estimated. It is the square root of the mean square error, as shown in the equation below:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (E_i - \hat{E}_i)^{2}} \tag{0.6}$$

Prediction (n)
Prediction at level n, pred(n), is defined as the percentage of projects that have an absolute relative error less than n [EMIL05]. A model which gives a higher pred(n) is better than one which gives a lower pred(n).
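The criteria above are straightforward to compute once the measured and estimated values are available. The following sketch assumes that every project has frequency f = 1 and that the level n of pred(n) is given in percent; the function names and the three sample projects are made up for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance, i.e. equation (0.2) with every frequency f = 1."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def validation_criteria(measured, estimated, level=25):
    """Return VAF, MARE, VARE, RMSE and pred(level) for paired observations."""
    residuals = [e - e_hat for e, e_hat in zip(measured, estimated)]
    rel_errors = [abs(r) / e for r, e in zip(residuals, measured)]
    within = sum(1 for r in rel_errors if r * 100 < level)
    return {
        "VAF": (1 - variance(residuals) / variance(measured)) * 100,
        "MARE": mean(rel_errors) * 100,
        "VARE": variance(rel_errors) * 100,
        "RMSE": math.sqrt(mean([r ** 2 for r in residuals])),
        "pred": 100 * within / len(rel_errors),
    }


# Hypothetical measured and estimated efforts for three projects.
result = validation_criteria([100, 200, 400], [110, 180, 400])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions a model is preferred when it gives a higher VAF and pred(n) and a lower MARE, VARE and RMSE, which is how the criteria are read in the rest of this section.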
3. FUZZY LOGIC IN SOFTWARE ENGINEERING MEASUREMENTS
3.1 Fuzzy logic in Software Effort and Cost Estimation
Effective estimation of effort is one of the most challenging activities in software development. Software effort estimation is not an exact science. The effort estimation process involves a series of systematic steps that provide an estimate with acceptable risk. Various models have been derived by studying a large number of completed software projects from various organizations and applications, to explore how project size is mapped into project effort and project cost.
• Chuk Yau and Raymond H.L. Tsoi, in 1994 [CHUK94], introduced a Fuzzified Function Point Analysis (FFPA) model using triangular fuzzy numbers (TFN) to help software size estimators express their judgment. Through a case study of in-house software, the paper presents the experience of using FFPA to estimate software size and compares it with conventional FPA.
• J. Ryder, in 1998 [RYDE98], investigated the application of fuzzy modeling techniques to two of the most widely used software effort prediction models: the Constructive Cost Model and the Function Points model.
• W. Pedrycz and others [PEDR99] found that the concept of information granularity, and fuzzy sets in particular, plays an important role in making software cost estimation models more user-friendly.
• Ali Idri, Alain Abran and Laila Kjiri, in 2000 [IDRI00], proposed the use of fuzzy sets rather than classical intervals in the COCOMO'81 model. For each cost driver and its associated linguistic values, they defined the corresponding fuzzy sets using trapezoidal-shaped membership functions.
• P. Musilek, W. Pedrycz and others [MUSI00] fuzzified the basic COCOMO model at two different levels of detail and proposed the f-COCOMO model, using fuzzy sets. They claim that the methodology of fuzzy sets giving rise to f-COCOMO is sufficiently general to be applied to other models of software cost estimation, such as the function point method.
• Nonika Bajaj, Alok Tyagi and Rakesh Aggarwal, in 2006 [BAJA06], discussed the bottom-up approach to cost estimation using fuzzy logic. Trapezoidal fuzzy numbers were used to represent the various linguistic terms of bottom-up estimation.
• Harish Kumar Mittal and Pradeep Kumar Bhatia, in 2007 [HARI07], proposed two models: Model 1, effort estimation using fuzzy technique (without methodology), and Model 2, effort estimation using fuzzy technique (with methodology). Rather than using a single number, they took software size (KLOC) as a triangular fuzzy number. The estimated effort can be optimized for any application type by varying arbitrary constants of these models. The models were tested on ten NASA software projects against four criteria for the assessment of software cost estimation models. Compared with the existing leading models, the developed models were found to provide better estimates.
• Harish Kumar Mittal and Pradeep Kumar Bhatia, in 2007 [MITT07], introduced a rectified model based on function point analysis for effort estimation. Fuzzy function points are first evaluated, and the result is then defuzzified to obtain the crisp function points and hence the size estimate in person hours. The model was tested on published experimental data, and its results were compared with the conventional FP estimate.
• Avner Engel and Mark Last, in 2007 [ENGE07], modelled software testing costs and risks using fuzzy logic methodology. They estimated the quality cost occurring during the development of software for an avionic suite in a fighter aircraft and demonstrated that applying fuzzy logic methodology yields results comparable to estimations based on models using the probabilistic paradigm. Quality costs are defined as money spent on verification, validation and testing (VVT) plus all costs stemming from software and system failures. They also compared actual quality cost measurements with modelling, using the data of [ENGE03].
• Alaa F. Sheta and others, in 2008 [ALAA08], presented a fuzzy logic based model for effort estimation using the LOC approach. They used the particle swarm optimization (PSO) method for tuning the COCOMO parameters. The NASA, SEL dataset was used for validation of their model, with RMSSE as the evaluation criterion. One part of the data was used for tuning and the other for implementation.
• Harish Mittal and Pradeep Bhatia [MITT08] introduced a fuzzy logic based precise approach to quantify the cost of software testing and risks. Most VVT cost data and relevant parameters are not available in precise form, so fuzzy modelling has the distinct advantage of deriving realistic information from imprecise knowledge. The authors report better results than some earlier models, with a simpler calculation process. The fuzzy logic methodology used in the study is sufficiently general and can be applied to other areas of quantitative software engineering.
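Several of the models listed above treat software size as a triangular fuzzy number rather than a single value. The sketch below illustrates that general idea only and does not reproduce any of the cited models: it assumes a power-law effort equation E = a · size^b, uses the basic COCOMO organic-mode coefficients (a = 2.4, b = 1.05) as placeholder constants, approximates the resulting fuzzy effort by a triangle through its three characteristic points, and defuzzifies by the centroid of that triangle.

```python
def effort(size_kloc, a=2.4, b=1.05):
    """Crisp effort in person-months for a crisp size in KLOC (placeholder constants)."""
    return a * size_kloc ** b


def fuzzy_effort(size_tfn, a=2.4, b=1.05):
    """Map a triangular size (low, mode, high) to an approximate triangular effort.

    The effort equation is increasing in size, so the feet and the peak of the
    size map to the feet and the peak of the effort. The sides of the true
    image are slightly curved; treating it as a triangle is an approximation.
    """
    low, mode, high = size_tfn
    return (effort(low, a, b), effort(mode, a, b), effort(high, a, b))


def defuzzify_centroid(tfn):
    """Centroid of a triangular fuzzy number: the mean of its three points."""
    low, mode, high = tfn
    return (low + mode + high) / 3.0


# Hypothetical size estimate: "about 10 KLOC, certainly between 8 and 13".
size = (8.0, 10.0, 13.0)
effort_tfn = fuzzy_effort(size)
print("fuzzy effort:", tuple(round(e, 2) for e in effort_tfn))
print("crisp effort:", round(defuzzify_centroid(effort_tfn), 2))
```

Because the assumed size estimate is skewed towards larger values, the defuzzified effort comes out above the effort computed from the modal size alone, which is the kind of information a single-number estimate discards.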
3.2 Fuzzy Logic in Software Quality and Reliability Estimation
Quality, simplistically, means that a product should meet its specification. In order to develop a software quality prediction model, one must first identify the factors that strongly influence software quality and the number of residual errors. It is extremely difficult to identify the relevant quality factors accurately. Furthermore, their degree of influence is imprecise in nature. Because of its natural ability to model imprecise and fuzzy aspects of data and rules, fuzzy logic is an attractive alternative in such situations.
• Houari A. Sahraoui and others, in 2001 [SAHR01], provided an approach for quality estimation using fuzzy threshold values. They used a fuzzy logic based approach to investigate the stability of a reusable class library interface, using structural metrics as stability indicators.
• Zhiwei Xu, in his Ph.D. thesis at Florida Atlantic University in 2001 [ZHIW01], studied the use of fuzzy logic in software reliability engineering, applying fuzzy expert systems to early risk assessment, software quality models and software cost estimation. He used commercial software systems and the COCOMO database to demonstrate the usefulness of the concepts.
• Sun Sup So and others, in 2002 [SUN02], proposed a fuzzy logic based approach to predict error prone modules using inspection data. The proposed system was evaluated empirically on published inspection data. They claim that this approach offers advantages over others in several ways. First, the interpretation of much inspection data is fuzzy in nature, and the model provides a natural mechanism for such fuzzy data; the rules used in determining error-prone modules are fuzzy, too. Second, a prototype system can be developed without extensive empirical data, and the system's performance can be continuously tuned as more inspection data become available. Finally, using the system requires no extra cost to the development team, since the analysis is based on inspection data and is automated.
• K.K. Aggarwal and Yogesh Singh, in 2005 [AGGA05], explored the following metrics for estimation of software maintainability using fuzzy logic: average number of live variables (ALV), average life span of variables (ALS), average cyclomatic complexity (ACC) and comment ratio (CR). They used triangular fuzzy numbers (TFN) with the MATLAB Fuzzy Logic Toolbox and a Mamdani inference system. Their empirical results show that the integrated measure of maintainability obtained from this model correlates strongly with maintenance time.
• N. Raj Kiran and V. Ravi, in 2007 [KIRA07], developed ensemble models to forecast software reliability accurately. Various statistical techniques (multiple linear regression and multivariate adaptive regression splines) and intelligent techniques (back-propagation trained neural network, dynamic evolving neuro-fuzzy inference system and TreeNet) constitute the ensembles presented. Three linear ensembles and one non-linear ensemble were designed and tested. Based on experiments performed on software reliability data obtained from the literature, they observed that the non-linear ensemble outperformed all the other ensembles as well as the constituent statistical and intelligent techniques.
• Harish Kumar Mittal and Pradeep Kumar Bhatia, in 2008 [HARI08], presented a fuzzy logic based precise approach to quantify the quality of software. Software is assigned one of 10 quality grades on the basis of two metrics, inspection rate per hour and error density. Modules with a low grade are considered the most error prone, while those with a higher quality grade are considered satisfactory. Triangular fuzzy numbers are used for inspection rate and errors per KLOC. The fuzzy logic methodology used in the study is sufficiently general and can be applied to other areas of quantitative software engineering.
• Harish Kumar Mittal and Pradeep Kumar Bhatia, in 2009 [MITT09], proposed a model for predicting the maintainability of software based on the combined effect of various factors that affect maintainability. The model was validated on some software projects to check the usefulness of the approach. Lower values of maintainability indicate that the software needs improvement so that maintenance costs can be reduced. The fuzzy logic methodology used in the study is sufficiently general and can easily be applied to other areas of quantitative software engineering.
• Harish Kumar Mittal, Pradeep Kumar Bhatia and JP Mittal, in 2009 [HARI09], proposed a fuzzy logic based precise approach to quantify the productivity of software. Triangular fuzzy numbers are used for cyclomatic complexity density. The model was evaluated on published actual maintenance data for a small pilot project. The technique is quite general, however, and may be tested on medium and large projects and applied to other areas of quantitative software engineering.
4. CONCLUSION
Conventional estimation approaches can have serious difficulties when used on software engineering data that is scarce, incomplete and imprecisely collected. Although effort has been made to propose fuzzy based models, there is still vast scope for finding better ones. Better cost prediction and techniques for increasing software reliability are always needed. The large set of available alternatives makes it possible to identify the best software measurement approach using the validation criteria discussed in the paper.
REFERENCES
[1] [AGGA05] Aggarwal, K. K., Y. Singh & M. Puri, Measurement of Software Maintainability using a Fuzzy model, Journal of Computer Science, U.S.A, 1(4), pp 538-542, (2005).
[2] [ALAA08] Alaa F. Sheta, David Rine, Aladdin Ayesh, Development of software effort and schedule estimation models using Soft Computing Techniques, IEEE Congress on Evolutionary Computation 2008, pp 1283-1289, (2008).
[3] [BAJA06] Nonika Bajaj, Alok Tyagi, Rakesh Agarwal, Software estimation: a fuzzy approach, ACM SIGSOFT Software Engineering Notes, 31(3), pp 1-5, (2006).
[4] [CHUK94] Chuk Yau, Raymond H.L. Tsoi, Assessing the Fuzziness of General System Characteristics in Estimating Software Size, IEEE, pp 189-193, (1994).
[5] [EMIL05] E. Mendes, S. Counsell, N. Mosley, Towards Taxonomy of Hypermedia and Web Application Size Metrics, Proceedings of International Conference of Web Engineering (ICWE 2005), pp 110-123, (2005).
[6] [ENGE07] Avner Engel, Mark Last, Modeling software testing costs and risks using fuzzy logic paradigm, Journal of Systems and Software, v.80 n.6, pp 817-835, (2007).
[7] [HARI07] Harish Mittal, Pradeep Bhatia, Optimization Criterion for Effort Estimation using Fuzzy Technique, CLEI EJ, Vol. 10, Num. 1, Pap. 2, (2007).
[8] [HARI08] Harish Mittal, et al., Software Quality Assessment Based on Fuzzy Logic Technique, International Journal of Software Computing Applications (IJSCA), ISSN: 1453-2277, Issue 3, pp 105-112, (2008).
[9] [HARI09] Harish Mittal, Pradeep Bhatia, JP Mittal, Software Maintenance Productivity Assessment using Fuzzy Logic, ACM SIGSOFT SEN (accepted).
[10] [IDRI00] Ali Idri, Alain Abran and Laila Kjiri, COCOMO cost model using Fuzzy Logic, 7th International Conference on Fuzzy Theory & Technology, Atlantic, New Jersey, March (2000).
[11] [KIRA07] Raj Kiran N., Ravi V., Software Reliability Prediction by Soft Computing Techniques, J. Syst. Software, (2007).
[12] [MITT07] Harish Mittal, Pradeep Bhatia, A comparative study of conventional effort estimation and fuzzy effort estimation based on Triangular Fuzzy Numbers, International Journal of Computer Science & Security, ISSN: 1985-1533, Vol. 1, Issue 4, pp 36-47, (2007).
[13] [MITT08] Harish Mittal, Pradeep Bhatia, Estimation of Software Testing Costs and Risks, Proc. of the International Conference on Software Engineering Research & Practice (SERP'08), organized by WORLDCOMP'08, Las Vegas, Nevada, USA, (2008).
[14] [MITT09] Harish Mittal, Pradeep Bhatia, Software maintainability assessment based on fuzzy logic technique, ACM SIGSOFT Software Engineering Notes, ISSN: 0163-5948, Volume 34, Issue 3, May (2009).
[15] [MUSI00] Musílek, P., Pedrycz, W., Succi, G., & Reformat, M., Software Cost Estimation with Fuzzy Models, ACM SIGAPP Applied Computing Review, 8(2), pp 24-29, (2000).
[16] [PEDR99] Pedrycz W., Peters J.F., Ramanna S., A Fuzzy Set Approach to Cost Estimation of Software Projects, Proceedings of the 1999 IEEE Canadian Conference on Electrical and Computer Engineering, Shaw Conference Center, Edmonton, Alberta, Canada, (1999).
[17] [PRES05] Pressman, Roger S., Software Engineering: A Practitioner Approach, McGraw-Hill International Edition, Sixth Edition, (2005).
[18] [RYDE98] Ryder, J., Fuzzy modelling of software effort prediction, Information Technology Conference, 1998, IEEE, pp 53-56, 1-3 Sep (1998).
[19] [SAHR01] Houari A. Sahraoui, Mounir A. Boukadoum, and Hakim Lounis, Building Quality Estimation models with Fuzzy Threshold Values, L'Objet, 17(4), pp 535-554, (2001).
[20] [SUN02] So, S. S., Cha, S. D., and Kwon, Y. R., Empirical evaluation of a fuzzy logic-based software quality prediction model, Fuzzy Sets Syst., 127(2), pp 199-208, Apr. (2002).
[21] [ZHIW01] Zhiwei Xu, Taghi M. Khoshgoftaar, Fuzzy logic techniques for software reliability engineering, Florida Atlantic University, Boca Raton, FL, (2001).
[22] [ZADE65] Zadeh L.A., Fuzzy Sets, Information and Control, 8, pp 338-353, (1965).