International Journal of Software Engineering Research & Practices Vol.1, Issue 1, Jan, 2011
Comparative Study of Soft Computing Techniques for Software Quality Model
Deepak Gupta
Vinay Kr. Goyal
Harish Mittal
Student, Deptt. of C.S.E., SGVU, Jaipur
Prof., Deptt. of C.S.E., J.I.E.T., Jind
A.P., Deptt. of I.T., V.C.E., Rohtak
maintainability of software. In general, it is not possible for any system to be optimized for all potential attributes. A corresponding quality model is meant to define the critical and most significant quality attributes and show how they can be achieved. It is often impossible to measure software external quality attributes directly. External attributes such as maintainability, understandability, and complexity are affected by many different factors and there are no straightforward metrics for them. Rather, we have to measure some internal attribute of software (such as its size) and assume that a relation exists between what we can measure and what we want to know. Ideally, there should be a clear and validated relationship between the internal and the external software attributes. External attributes are visible to the stakeholders (e.g., customers, users, and development project managers) of the product; internal attributes concern the developer of the product. In general, stakeholders (other than the developers) of software products care only about external quality attributes, but it is the internal attributes—which deal largely with the structure of the software—that help developers achieve the external qualities. For example, the internal quality of verifiability is necessary for achieving the external quality of reliability. In many cases, however, the qualities are related closely, and the distinction between internal and external is not sharp.
Abstract: Soft computing refers to a consortium of computational methodologies. It promises to become a powerful means of obtaining solutions to problems quickly, yet accurately and acceptably. A software quality model identifies fault-prone modules and the number of errors in the software. Some existing quality models can predict fault-proneness with reasonable accuracy in certain contexts. The increasing demand for software quality requires more powerful modeling techniques for software quality estimation. There is a need to develop quality models, based on such techniques, that evaluate high-level quality characteristics with great accuracy. This paper presents a case study of different software quality estimation techniques used to build software quality models and compares the performance of these techniques. The techniques considered are Artificial Neural Network, Case-Based Reasoning, Regression Tree, Rule-Based System, Multiple Linear Regression and Fuzzy System. Our results reveal that Fuzzy and Rule-Based System techniques can provide a good solution for designing a software quality model.
Keywords: S/W Quality, Quality Model, ANN, CBR, RT, RBS, MLR, Fuzzy.
Introduction
Software quality is the degree to which software possesses a desired combination of attributes such as reliability, maintainability, efficiency, portability, usability and reusability. A quality model is a schema that better explains our view of quality. Software quality models provide such definitions along with means for prediction and assessment. A software quality model can be used to identify program modules that are likely to be defective. A software quality estimation model allows the software development team to track and detect potential software defects. Such quality models also help developers build better-quality programs. A number of well-known quality models are used in industry to build quality software. The aim of building a new model is to predict the fault labels (fault-prone or not fault-prone) of the modules for the next release of the software.
Quality Models: The term quality model is defined in [9] as “the set of characteristics and relationships between them, which provides the basis for specifying quality requirements and evaluating quality”. Two of the earliest quality models are due to McCall et al. [14] and Boehm et al. [4]. In these models, the characteristics are quality factors and quality criteria. Quality models can be divided into two categories, based on the approach used to build them [8].
Fixed-model approach: We assume that all important quality factors needed to monitor a project are a subset of those in a published model. To control and measure each attribute, we accept the model’s associated criteria and metrics. Then we use the data collected to determine the quality of the product.
Estimation models can take different forms depending on the building technique that is used. They can be mathematical models. They can also be rule sets or decision trees. Quality estimation includes estimating reliability as well as
model is to predict the operational quality of the software modules based on the known software metrics data. Currently many techniques are used to build and apply estimation models for real life software quality estimation.
Define-your-own-model approach: We accept the general philosophy that quality is composed of many attributes, but we do not adopt a given model’s characterization of quality. Instead, we meet with prospective users to reach a consensus on which quality attributes are important for a given product. Together we decide on a decomposition (possibly guided by an existing model) in which we agree on specific relationships between them.
Software Quality Estimation Techniques
Many software quality modeling techniques have been developed and used in real-life software quality estimation problems.
The Boehm and McCall models are typical examples of fixed quality models. Trendowicz and Punter [21] have done an excellent survey of different approaches to modeling quality. There are three main requirements for quality models, which are discussed below.
According to Khoshgoftaar et al. [10], there are six software quality estimation techniques:
A. Case-Based Reasoning: Case-Based Reasoning [2] is a method of storing observations, such as data about a project's specifications and the effort required to implement it, and then, when faced with a new observation, retrieving the stored observations closest to the new one and using the stored values to estimate the new value, in this case effort. Thus a case-based reasoning system has a pre-processor to prepare the input data, a similarity function to retrieve the similar cases, a predictor to estimate the output value, and a memory updater to add the new case to the case base if required.
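The components of a case-based reasoning system listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `CaseBase`, the Euclidean similarity function, the choice of averaging the k nearest cases, and the sample metric values are all hypothetical and are not taken from [2] or [10]. The pre-processor is omitted, so the features are used unscaled.

```python
import math

def euclidean(a, b):
    """Similarity function (assumed): a smaller distance means a more similar case."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class CaseBase:
    """Hypothetical minimal case base: stores (features, known value) pairs."""

    def __init__(self):
        self.cases = []

    def add(self, features, value):
        """Memory updater: store a solved case for later retrieval."""
        self.cases.append((tuple(features), value))

    def estimate(self, features, k=2):
        """Predictor: average the known values of the k nearest stored cases."""
        if not self.cases:
            raise ValueError("case base is empty")
        ranked = sorted(self.cases, key=lambda case: euclidean(case[0], features))
        nearest = ranked[:k]
        return sum(value for _, value in nearest) / len(nearest)

# Made-up cases: (lines of code, cyclomatic complexity) -> number of faults.
base = CaseBase()
base.add((100, 5), 2)
base.add((120, 6), 3)
base.add((900, 40), 20)

estimate = base.estimate((110, 5), k=2)
print(estimate)
```

In practice the pre-processor would normalise the metrics first, because otherwise the feature with the largest numeric range dominates the distance.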
Flexibility: Quality models should be flexible because quality is context dependent. Each company has its own characteristics, requirements and quality objectives, so quality models need to be flexible enough to be applicable across different companies. Similarly, different projects and processes have different quality requirements.
Reusability: Depending on the projects’ level of similarity, a quality model should support the reuse of measurement data as well as of quality characteristics and their relationships. Accuracy and efficiency improve if the quality model incorporates experience from past projects.
B. Regression Tree: A regression tree for software quality prediction is a collection of decision rules represented by an abstract tree model. A set of predictors is used to predict a response variable such as the number of faults. There are currently two tools that facilitate software quality modeling using regression trees: CART and S-Plus.
CART: CART is a statistical tool for tree-structured data analysis [5]. It provides two methods to build regression tree models, CART-LS (least squares) and CART-LAD (least absolute deviation).
S-Plus: A solution for advanced data analysis and statistical modeling, S-Plus is a tool hosting a set of data mining functions, including regression trees [7].
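The idea of a decision rule in a regression tree can be illustrated with a single least-squares split, which is the building block a tree applies recursively. The sketch below is an assumption-laden toy, not the CART or S-Plus implementation: the function names `sse`, `best_split` and `predict` and the sample data are hypothetical, and only one predictor and one split are considered.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimising total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct predictor values")
    return best[1:]

def predict(rule, x):
    """Apply the decision rule: predict the mean response of the matching leaf."""
    threshold, left_mean, right_mean = rule
    return left_mean if x <= threshold else right_mean

# Made-up data: lines of code per module and observed number of faults.
loc = [50, 80, 120, 400, 450, 500]
faults = [1, 0, 2, 9, 11, 10]

rule = best_split(loc, faults)
print(rule, predict(rule, 100), predict(rule, 480))
```

A full tree would repeat the same search inside each leaf until a stopping criterion is met; a least-absolute-deviation variant would replace the squared-error cost with absolute deviations from the median.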
Transparency: The quality model should be transparent, so that the relationships between the characteristics have a visible rationale. It should also allow an expert to intervene directly in the model structure to make any necessary modification.
Software Quality Estimation Models
Software quality models are useful tools for achieving the objectives of a software quality assurance initiative. A software quality estimation model allows the software development team to track and detect potential software defects relatively early in development, which is critical for many high-assurance systems.
C. Neural Networks: A neural network comprises a set of interconnected neurons, each having a transformation function that it applies to the weighted sum of its inputs to produce an output. In NN modeling, we determine a pattern of connections, a method of determining the interconnection weights, and a transformation function. The output of a neuron then becomes an excitatory (positive) or inhibitory (negative) input to other neurons in the network. The process continues until one or more outputs are generated.
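The description of a neuron as a transformation function applied to a weighted sum can be made concrete with a small forward pass. The following is a minimal sketch under assumptions: the sigmoid is chosen as the transformation function, the network has one hidden layer, and all weights, biases and inputs are made-up values rather than trained ones. Determining the weights (training) is not shown.

```python
import math

def sigmoid(z):
    """Assumed transformation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Apply the transformation function to the weighted sum of the inputs."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)

def forward(inputs, hidden_layer, output_unit):
    """Feed the inputs through one hidden layer, then a single output neuron."""
    hidden = [neuron(inputs, weights, bias) for weights, bias in hidden_layer]
    out_weights, out_bias = output_unit
    return neuron(hidden, out_weights, out_bias)

# Hypothetical weights: positive weights act as excitatory connections,
# negative weights as inhibitory ones.
hidden_layer = [([0.8, -0.4], 0.1), ([-0.3, 0.9], 0.0)]
output_unit = ([1.2, -0.7], 0.05)

score = forward([0.5, 0.2], hidden_layer, output_unit)
print(round(score, 3))
```

The pattern of connections here is fixed by the shapes of `hidden_layer` and `output_unit`; a different topology would only change those structures, not the `neuron` function.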
Software quality estimation models are generally of two types:
A classification model predicts the class membership of modules in two or more quality-based classes. Such models are initially trained with known software metrics and class membership data.
A prediction model estimates a quantitative quality factor (the dependent variable), such as the number of faults in a program module, using software metrics (the independent variables) collected during the development phases of the software life cycle as predictors.
D. Fuzzy Systems: A fuzzy logic system (FLS) is a rule-based system that implements a nonlinear mapping between its inputs and outputs. A fuzzy system is a mapping between linguistic terms, such as “very small”, attached to variables [18]. A fuzzy set is characterized by a membership function, which associates with each point in the fuzzy set a real number in the interval [0, 1], called the degree or grade of membership. The membership function may be triangular, trapezoidal, parabolic
A software quality model can be used to predict a quality factor such as the number of faults for a module or the number of code churns. The overall objective of software quality modeling is to seek the underlying relationship between the software metrics, and program. The aim of software quality estimation
Zhong et al. [22]: “Unsupervised Learning for Expert-Based Software Quality Estimation”. Current software quality estimation models often involve using supervised learning methods to train a software quality classifier or a software fault prediction model. In such models, the dependent variable is a software quality measurement indicating the quality of a software module by either a risk-based class membership or the number of faults. In reality, such a measurement may be inaccurate, or even unavailable. In such situations, the paper advocates the use of unsupervised learning techniques to build a software quality estimation system with the help of a human software engineering expert.
etc.
E. Rule-Based Systems: Rule-based systems have been used in very few cases for modeling software development [13]. A rule-based system is organized around a set of rules that are activated by facts present in the working memory and that, in turn, activate other facts, as shown in fig. 4. In this way chaining can occur, with one rule enabling another rule to fire.
F. Multiple Linear Regression: MLR is an equation in which the response variable is expressed in terms of the predictors. The general form of the equation is [3]:
$$\hat{y}_i = a_0 + a_1 x_{i1} + \dots + a_p x_{ip}$$
$$y_i = a_0 + a_1 x_{i1} + \dots + a_p x_{ip} + e_i$$
where $\{x_{i1}, \dots, x_{ip}\}$ are the values of the independent variables, $\{a_0, \dots, a_p\}$ are the parameters to be estimated, $\hat{y}_i$ is the dependent variable to be predicted, $y_i$ is the actual value of the dependent variable, and $e_i = y_i - \hat{y}_i$ is the error.
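To show how the parameters of the MLR equation might be estimated, the sketch below fits them by ordinary least squares using the normal equations. Least squares is an assumption here (the text only says the parameters are "to be estimated"), and the helper names `solve`, `fit_mlr` and `predict`, as well as the sample data, are hypothetical. Only the Python standard library is used.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(rows, ys):
    """Least-squares estimates [a0, a1, ..., ap] from the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 gives the intercept a0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

def predict(coeffs, row):
    """Compute the predicted value a0 + a1*x1 + ... + ap*xp for one observation."""
    return coeffs[0] + sum(a * x for a, x in zip(coeffs[1:], row))

# Made-up data generated from y = 1 + 2*x1 + 3*x2 with no noise.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]

coeffs = fit_mlr(rows, ys)
errors = [y - predict(coeffs, row) for row, y in zip(rows, ys)]  # e_i = y_i - predicted
print([round(a, 6) for a in coeffs])
```

Because the sample data contain no noise, the errors are zero up to rounding; with real software metrics they would not be, and their size is what accuracy measures summarise.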
Burgin et al. [6]: “Software Technological Roles, Usability, and Reusability”. Software reuse is an important and relatively new approach to software engineering. The aim of this paper is the development of a methodology and mathematical theory of software metrics for the evaluation of software reusability. In the second section, following the Introduction, it is demonstrated that reusability is a form of usability. This allows one to use experience in the development and utilization of software usability metrics for the development and utilization of software reuse metrics. In the third section, different types and classes of software metrics are explicated and compared, while in the fourth section, software metrics and their properties are studied in a formalized context. The research is oriented toward the advancement of software engineering and, in particular, the creation of more efficient reuse metrics.
LITERATURE REVIEW
Early software fault prediction is a proven technique for achieving high software reliability. Prediction models based on software metrics can predict the number of faults in software modules. Timely predictions from such models can be used to direct cost-effective quality enhancement efforts to modules that are likely to have a high number of faults.
Khoshgoftaar et al. [11]: “Detecting Outliers Using Rule-Based Modeling for Improving CBR-Based Software Quality Classification Models”. Several studies have shown that the accuracy of such models improves when outliers and data noise are removed from the training data set. This paper introduces a new approach called Rule-Based Modeling (RBM) for detecting and removing training data outliers in an effort to improve the accuracy of a Case-Based Reasoning (CBR) classification model. The authors evaluate the approach by comparing the classification accuracy of CBR models built with and without removing outliers from the training data set. It is demonstrated that applying the RBM technique to eliminate outliers significantly improves the accuracy of CBR-based software quality classification models.
K.K. Aggarwal et al. [1]: “Measurement of Software Maintainability Using a Fuzzy Model” proposed a fuzzy model for maintainability assessment, where maintainability is a measure of characteristics of software such as source code readability, documentation quality and cohesiveness among source code and documents. It is also seen that maintainability depends very much on the average number of live variables in a program and the average life span of variables. Presently there is no model that considers the effect of these two factors. Therefore, a model is proposed that integrates four factors, namely the average number of Live Variables (LV), the average Life Span (LS) of variables, the Average Cyclomatic Complexity (ACC) and the Comments Ratio (CR), and provides a measure of maintainability.
Quah and Thwin [20]: “Application of Neural Networks for Software Quality Prediction using Object-Oriented Metrics” presents the application of neural networks to software quality estimation using object-oriented metrics. Quality estimation includes estimating the reliability as well as the maintainability of software. Reliability is typically measured as the number of defects. Maintenance effort can be measured as the number of lines changed per class. Two neural network models are used: the Ward Neural Network and the General Regression Neural Network (GRNN). The GRNN model is found to predict more accurately than the Ward network model.
Mertik et al. [15]: “Estimating Software Quality with Advanced Data Mining Techniques” presented the use of an advanced data mining tool called Multi method for building a software fault prediction model. Multi method combines different aspects of supervised learning methods and can improve the accuracy of the generated prediction model. The authors demonstrate the use of the Multi method tool on real data from the Metrics Data Program (MDP) Repository. Their preliminary empirical results show the promising potential of this approach for predicting software quality in a software measurement and quality dataset.
TABLE 1
Khoshgoftaar et al. [12]: “Count Models for Software Quality Estimation”. Identifying which software modules are likely to be faulty is an effective technique for improving software quality. However, classification techniques such as the logistic regression model (LRM) cannot be used to predict the number of faults. In contrast, count models such as the Poisson regression model (PRM) and the zero-inflated Poisson (ZIP) regression model can be used for software quality estimation. It was observed that the ZIP model yielded better fault prediction accuracy than the PRM.
Seliya and Khoshgoftaar [19]: “Software Quality Analysis of Unlabeled Program Modules with Semi-supervised Clustering” proposed a constraint-based semi-supervised clustering scheme that uses the K-means clustering method as the underlying clustering algorithm for this problem. They showed that this approach works better than their previous unsupervised learning based prediction approach. However, this approach uses an expert’s domain knowledge to iteratively label clusters as fault-prone or not. Therefore, this model also depends on the capability of the expert.
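The difference between the two count models summarized for [12] above can be seen from their probability mass functions. The sketch below shows only the standard textbook distributions, not the regression models fitted in [12]: in a regression setting the rate (and possibly the zero-inflation probability) would be linked to software metrics, which is not shown. The function names and the parameter values `lam = 2.0` and `pi = 0.3` are assumptions chosen for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k faults under a Poisson model with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the module is a structural zero
    (fault-free by nature); otherwise its fault count follows Poisson(lam)."""
    if k == 0:
        return pi + (1.0 - pi) * poisson_pmf(0, lam)
    return (1.0 - pi) * poisson_pmf(k, lam)

# Hypothetical parameters: mean rate of 2 faults, 30% structural zeros.
lam, pi = 2.0, 0.3
for k in range(4):
    print(k, round(poisson_pmf(k, lam), 4), round(zip_pmf(k, lam, pi), 4))
```

The extra mass at zero is what lets a zero-inflated model describe data sets in which many modules have no recorded faults at all.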
Table 1 shows a comparison between the techniques [17] with respect to some desirable modeling attributes. No single technique is suited to all types of problems. Small data sets are problematic for all modeling techniques; however, by using expert knowledge as a supplement to the data, an accurate model can still be derived. Finally, the table covers each technique’s capability to include known information in a model, that is, to initialize a model with known facts and then use data to improve and refine it; this is supported foremost by fuzzy and rule-based systems.
Harish Mittal et al. [16]: “Software Quality Assessment Based on Fuzzy Logic Technique” proposed a precise fuzzy-logic-based approach to quantifying the quality of software. Software can be given quality grades on the basis of two metrics: inspection rate per hour and error density. With this approach, predicting the quality of software is straightforward, and precise quality grades can be given to any software. Triangular fuzzy numbers are used for inspection rate and errors per line of code. The fuzzy logic methodology used in the proposed study is sufficiently general to be applied to other areas of quantitative software engineering.
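A triangular fuzzy number of the kind mentioned above is defined by three breakpoints, and its membership grade can be computed directly. The following is a generic sketch, not the grading scheme of [16]: the function names, the linguistic terms "low", "medium" and "high", and all breakpoint values for error density are made up for illustration.

```python
def triangular(x, a, b, c):
    """Membership grade of x in the triangular fuzzy number (a, b, c), with a < b < c.
    The grade rises linearly from 0 at a to 1 at b, then falls back to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets for error density (breakpoints are illustrative only).
ERROR_DENSITY_SETS = {
    "low": (0.0, 2.0, 4.0),
    "medium": (2.0, 5.0, 8.0),
    "high": (6.0, 9.0, 12.0),
}

def grades(x):
    """Return the grade of membership of x in each linguistic term."""
    return {name: triangular(x, *abc) for name, abc in ERROR_DENSITY_SETS.items()}

result = grades(3.0)
print(result)
```

A value can belong to several sets at once with different grades, which is what allows a fuzzy system to express statements such as "somewhat low and slightly medium" instead of forcing a crisp class boundary.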
Conclusion: In this paper we have seen that a variety of techniques have been developed for software quality estimation, most of which are suited to either prediction or classification, but not both. We have discussed the issues in software quality estimation. Software reliability is becoming more and more important in the software industry, and various techniques are required to discover faults early in the development of software. Statistical models rely totally on historical data, so transparency is not present in these models. We investigated these problems in the existing models. We therefore need quality models that are flexible, transparent and reusable. After studying existing quality modeling techniques, we concluded that no single quality model copes with all of our requirements; however, several approaches meet some of them. We have studied RT, NN, RBS, MLR, CBR and fuzzy system modeling techniques for building a quality estimation model. From the comparison in the table, we conclude that no single technique fulfills all requirements. However, the combined use of techniques, such as fuzzy and rule-based systems, has resulted in more effective problem solving than each technique used individually and exclusively.
Analysis of Present Quality Modeling Techniques: Software metrics-based quality estimation models include those that provide a quality-based classification of program modules and those that provide a quantitative prediction of a quality factor for the program modules. Both algorithmic and non-algorithmic techniques have been used to build quality models. The algorithmic approach uses historical data to derive a functional relationship. Non-algorithmic approaches use expert judgment, probabilistic models and other soft computing techniques to approximate the functional relationship. Regression analysis is the most widely used algorithmic approach. Other algorithmic techniques include discriminant analysis and principal component analysis. Among non-algorithmic techniques, probabilistic and soft computing approaches are common. We looked at the different techniques from the perspectives of imprecision, uncertainty, transparency and generality. Generally speaking, software quality modeling techniques are suited to only one of these types of models, i.e., classification or prediction, but not both.
References
[1] Aggarwal, K.K., Singh, Y., Chandra, P. and Puri, M., “Measurement of Software Maintainability Using a Fuzzy Model”, Journal of Computer Sciences, 1(4), 538-542, 2005.
[2] Aha, D.W., “Case-Based Learning Algorithms”, Proceedings of the DARPA Case-Based Reasoning Workshop, Washington, DC, 147-158, 1991.
[3] Berenson, M.L., Levine, D.M. and Goldstein, M., Intermediate Statistical Methods and Applications: A Computer Package Approach, Englewood Cliffs, NJ: Prentice Hall, 1983.
[4] Boehm, B.W., Brown, J.R. and Kaspar, J.R., “Characteristics of Software Quality”, TRW Series of Software Technology, North Holland, 1978.
[5] Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J., Classification and Regression Trees (2nd ed.), Belmont, California: Wadsworth International Group, 1984.
[6] Burgin, M., Lee, H.K. and Debnath, N., “Software Technological Roles, Usability, and Reusability”, Proceedings of the IEEE International Conference on Information Reuse and Integration, 2004.
[7] Clark, L.A. and Pregibon, D., Tree-Based Models, Pacific Grove, California: Wadsworth International Group, 377-419, 1992.
[8] Fenton, N.E. and Pfleeger, S.L., Software Metrics: A Rigorous and Practical Approach, PWS Publishing Company, Boston, USA, 1997.
[9] ISO/IEC 14598 International Standard, Standard for Information Technology - Software Product Evaluation - Part 1: General Overview, 1999.
[10] Khoshgoftaar, T.M., Cukic, B. and Seliya, N., “Comparative Study of the Impact of Underlying Models on Module-Order Model Performances”, 8th IEEE International Symposium on Software Metrics, Boca Raton, Florida, USA, 161, 2002.
[11] Khoshgoftaar, T.M., Bullard, L.A. and Gao, K., “Detecting Outliers Using Rule-Based Modeling for Improving CBR-Based Software Quality Classification Models”, Case-Based Reasoning Research and Development, Springer-Verlag, 216-230, 2003.
[12] Khoshgoftaar, T.M. and Gao, K., “Count Models for Software Quality Estimation”, IEEE Transactions on Reliability, 56(2), 212-222, 2007.
[13] Lakhotia, A., “Rule-Based Approach to Computing Module Cohesion”, Proceedings of the 15th International Conference on Software Engineering, Los Alamitos, CA, 35-44, 1993.
[14] McCall, J.A., Richards, P.A. and Walters, G.F., “Factors in Software Quality”, US Rome Air Development Center Reports NTIS AD/A-049 014, 015, 055, 1977.
[15] Mertik, M., Lenic, M., Stiglic, G. and Kokol, P., “Estimating Software Quality with Advanced Data Mining Techniques”, Proceedings of the International Conference on Software Engineering Advances, IEEE Computer Society, Washington, DC, 2006.
[16] Mittal, H., Bhatia, P.K. and Goswami, P., “Software Quality Assessment Based on Fuzzy Logic Technique”, International Journal of Software Computing Applications, Issue 3, 105-112, 2008.
[17] Mittal, H., “Critical Analysis of Software Quality Estimation Techniques”, 2nd International Conference on Information Technology for Excellence, organized by MAIMT, Jagadhari, 2010.
[18] Munakata, T. and Jani, Y., “Fuzzy Systems: An Overview”, Comm. ACM, 37(3), 69-76, 1994.
[19] Seliya, N. and Khoshgoftaar, T.M., “Software Quality Analysis of Unlabeled Program Modules with Semi-supervised Clustering”, IEEE Transactions on Systems, Man and Cybernetics - Part A: Systems and Humans, 37(2), 201-211, 2007.
[20] Quah, T.S. and Thwin, M.M.T., “Application of Neural Networks for Software Quality Prediction using Object-Oriented Metrics”, IEEE Transactions on Neural Networks, 2003.
[21] Trendowicz, A. and Punter, T., “Quality Modeling for Software Product Lines”, 7th ECOOP Workshop on Quantitative Approaches in Object-Oriented Software Engineering, 2003.
[22] Zhong, S., Khoshgoftaar, T.M. and Seliya, N., “Unsupervised Learning for Expert-Based Software Quality Estimation”, Proceedings of the 8th International Symposium on High Assurance Systems Eng., Tampa, FL, 149-155, 2004.