
Proc. of the International Conference on Advanced Computing and Communication Technologies (ACCT 2011)

Software Quality Estimation Models: A Comparative Analysis

Deepak Gupta

Vinay Kr.Goyal

Harish Mittal

Student, Deptt. of CSE SGVU,Jaipur

Prof. ,Deptt. of C.S.E. J.I.E.T.,Jind

A.P., Deptt. of IT V.C.E., Rohtak

Estimation modeling techniques are considered part of the reliability and quality control “toolbox”. They can be used to calibrate models based on software metrics that predict either a quantitative value or a risk-based class membership as an indicator of the expected quality of software modules.

Abstract:

Software quality is defined as the degree to which a software component or system meets specified requirements and specifications. It has become one of the most important requirements in the development of systems. Previous research has shown that software quality models based on software metrics can yield predictions with useful accuracy. A software quality estimation model allows the software development team to track and detect potential software defects. Such models identify software sub-systems (modules, components, classes or files) that are likely to contain faults. Different researchers have proposed their own software quality prediction models to help measure the quality of software products, but every organization uses different quality prediction models based on its requirements. This paper presents a comparative evaluation of different software quality estimation models and analyzes their performance. The study also helps in selecting the best approach for reaching the desired quality levels and suggests using a soft computing approach for software quality prediction.

Keywords: S/W Quality, S/W Metrics, Quality models, S/W Quality Estimation Model

Timely software quality estimations during the development life cycle can be obtained with the aid of models, such as software quality prediction and software quality classification models. A software quality prediction model estimates a quantitative quality factor (the dependent variable), such as the number of faults in a program module, using predictors (independent variables) [4] collected during the development phase of the software life cycle. In contrast, a software quality classification model, also based on a set of predictors collected early in the software life cycle, predicts a module’s class membership, such as fault-prone (FP) or not fault-prone (NFP) [8]. Both types of models can be used as software quality prediction models in a broad sense; the difference lies only in what the model predicts: fault-proneness or the number of faults.

For both types of models, software metrics are treated as independent variables, and different techniques can be used to predict software quality. Estimation models can take different forms depending on the building technique that is used: they can be mathematical models, rule sets or decision trees. Some of the more recent software quality estimation techniques include Case-Based Reasoning, Fuzzy Logic [2], Neural Networks [2], Classification and Regression Trees, Support Vector Machines [18] and Bayesian Belief Networks [1].
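The distinction between a prediction model and a classification model can be illustrated with a minimal sketch. The data, the single metric (a complexity value), the least-squares fit and the threshold of 2.0 faults are all assumptions made for illustration; they are not taken from any of the surveyed studies, and real classification models are usually trained directly on class labels rather than derived by thresholding a fault-count prediction.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical training data: one software metric per module (independent
# variable) and the number of faults found in that module (dependent variable).
complexity = [2, 4, 6, 8, 10]
faults = [0, 1, 2, 3, 4]

slope, intercept = fit_line(complexity, faults)

def predict_faults(metric):
    """Prediction model: estimates a quantitative quality factor."""
    return slope * metric + intercept

def classify(metric, threshold=2.0):
    """Classification model: maps a module to FP or NFP (assumed threshold)."""
    return "FP" if predict_faults(metric) >= threshold else "NFP"
```

With this toy data, `predict_faults(12)` returns a fault count while `classify(12)` returns only a class label, which is the difference described above.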

Introduction

A software quality model identifies fault-prone modules and the number of errors in the software. A software quality estimation model allows the software development team to track and detect potential software defects. Software quality prediction helps to utilize resources better and to minimize errors, and so reduces cost through early assessment of faults. It also helps in planning more thorough tests for modules that might have defects. Typically, once a software module has been identified as fault-prone, it can be verified and tested more thoroughly. This increases the probability of identifying future bugs and reduces the risk of faults.

The aim is to propose a capable model to estimate the fault-proneness of software modules or classes. The main goal of the study reported in this paper is to identify the best software quality estimation model. Many organizations are involved in the study of software quality prediction techniques, and much work has been done to develop and refine models. Still, no study has identified a single quality estimation model that always outperforms the others. Despite the emergence of various software quality prediction models and techniques, their applicability and accuracy are still under research. More research is therefore required to analyze further results and comparisons.

Because of the high cost of correcting problems discovered by customers, the goal of our modeling is to identify fault-prone modules early in development. Software quality models are tools for focusing efforts to find faults.


Software Quality

Software quality is defined as the degree to which a software component or system meets specified requirements and specifications. Assessing software quality in the early stages of design and development is crucial, as it helps reduce effort, time and money. However, the task is difficult, since most software quality characteristics (such as maintainability, reliability and reusability) cannot be directly and objectively measured before the software product is deployed and used for a certain period of time.

B. ISO (International Organization for Standardization)

ISO (International Organization for Standardization) [5] is the world’s largest developer and publisher of international standards. ISO is a non-governmental organization that forms a bridge between the public and private sectors.

1) ISO 9001: ISO 9001 is an international quality management system standard applicable to organizations in all types of business. It takes a process-oriented approach to quality management; that is, it proposes designing, documenting, implementing, supporting, monitoring, controlling and improving processes.

Software Metrics

Effective management of any process requires quantification, measurement, and modeling. Software metrics [3] provide a quantitative basis for the development and validation of models of the software development process. Metrics can be used to improve the software product and the process by which it is developed. Software metrics may be broadly classified as:

Product Metrics: measure properties of the product, such as the complexity of the software design, the size of the final program, or the number of pages of documentation produced.

Process Metrics: measure the software development process, such as overall development time, the type of methodology used, or the average level of experience of the programming staff.

2) ISO 9126: ISO has also released the ISO 9126 standard, Software Product Evaluation: Quality Characteristics and Guidelines for their Use. ISO 9126 also includes functionality as a parameter, and identifies both internal and external quality characteristics of software products.

C. Capability Maturity Model (CMM)

CMM is a service mark owned by Carnegie Mellon University (CMU) and refers to a development model elicited from actual data. When it is applied to an existing organization’s software development processes, it allows an effective approach toward improving them.

Quality Models: A quality model is a schema that better explains our view of quality. Some existing quality models can predict fault-proneness with reasonable accuracy in certain contexts. Other quality models attempt to evaluate several quality characteristics but fail to provide reasonable accuracy, mainly because of a lack of data.

CMM maturity levels: Initial, Managed, Defined, Quantitatively Managed, Optimized.

Review of Existing Software Quality Estimation Models

Amasaki et al. [1], in “A Bayesian Belief Network for assessing the likelihood of fault content”, found that some projects generate poor-quality products and that the SRGM (software reliability growth model) fails to predict quality in such cases. Regression models can be used to identify such risky projects, but they suffer from a severe problem: highly correlated metrics cannot be used simultaneously, so the prediction results are not reliable. The authors therefore propose using a Bayesian Belief Network (BBN). A BBN can help to identify risky projects that may lead to poor-quality products and are hard to predict in advance. The BBN approach has two major parts: a directed acyclic graph (DAG) and a probability table.
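The two parts of a BBN can be made concrete with a small sketch. The node names, the graph structure and every probability below are hypothetical values chosen for illustration; they do not come from Amasaki et al. [1]. The DAG is stored as a parent list per node, the probability table gives P(node = True | parents), and inference is done by brute-force enumeration, which is only practical for very small networks.

```python
from itertools import product

# Part 1: the DAG, given as the parents of each node (hypothetical structure).
parents = {
    "complex_design": [],
    "weak_review": [],
    "poor_quality": ["complex_design", "weak_review"],
}

# Part 2: the probability tables, P(node=True | parent values).
# All numbers are made up for illustration.
cpt = {
    "complex_design": {(): 0.3},
    "weak_review": {(): 0.4},
    "poor_quality": {
        (True, True): 0.9,
        (True, False): 0.5,
        (False, True): 0.4,
        (False, False): 0.1,
    },
}

def joint(assignment):
    """Joint probability of one full True/False assignment to all nodes."""
    p = 1.0
    for node, pars in parents.items():
        p_true = cpt[node][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

def posterior(query, evidence):
    """P(query=True | evidence), summing over all consistent assignments."""
    nodes = list(parents)
    num = den = 0.0
    for values in product([True, False], repeat=len(nodes)):
        a = dict(zip(nodes, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue
        p = joint(a)
        den += p
        if a[query]:
            num += p
    return num / den
```

For example, `posterior("poor_quality", {"weak_review": True})` gives the probability of a poor-quality product once a weak review has been observed, and `posterior("complex_design", {"poor_quality": True})` reasons in the opposite direction, from outcome to cause.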

A. McCall’s Quality Model (1977)

McCall [11] presents a quality model that attempts to bridge the gap between users and developers by focusing on a number of software quality factors that reflect both the users’ views and the developers’ priorities. The McCall quality model has three major perspectives for defining and identifying the quality of a software product: product revision (ability to undergo changes), product transition (adaptability to new environments) and product operations (its operation characteristics).


Xing et al. [18], in “A novel method for early software quality prediction based on support vector machine”, suggest using Support Vector Machines (SVM) for software quality prediction. In their approach, a limited number of complexity metrics is used as the input vector; after this vector is mapped into a high-dimensional feature space, the SVM classifies the modules as faulty or non-faulty. SVM is a data classification technique that can generalize in high-dimensional spaces under small training sample conditions.

this approach in predicting software quality in a software measurement and quality dataset.

Khoshgoftaar et al. [10], in “Count Models for Software Quality Estimation”, note that identifying which software modules are likely to be faulty is an effective technique for improving software quality. However, classification techniques such as the logistic regression model (LRM) cannot be used to predict the number of faults. In contrast, count models such as the Poisson regression model (PRM) and the zero-inflated Poisson (ZIP) regression model can be used for this purpose. It was observed that the ZIP model yielded better fault prediction accuracy than the PRM.
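The difference between the two count models can be sketched through their probability mass functions. This is a simplified illustration with fixed parameters: `lam` (the Poisson rate) and `pi` (the probability of a structural zero) are assumed constants here, whereas in a regression model such as those studied in [10] they would be functions of the software metrics.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with rate lam."""
    return exp(-lam) * lam ** k / factorial(k)

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson count.

    With probability pi the module is a structural zero (no faults at all);
    otherwise its fault count follows a Poisson distribution with rate lam.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base
```

Compared with a plain Poisson model, the ZIP model puts extra probability mass on zero, which matches data sets in which many modules have no recorded faults.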

Khoshgoftaar et al. [9], in “Detecting Outliers Using Rule-Based Modeling for Improving CBR-Based Software Quality Classification Models”, introduce an approach called Rule-Based Modeling (RBM) for detecting and removing training data outliers in an effort to improve the accuracy of a Case-Based Reasoning (CBR) classification model. They evaluate the approach by comparing the classification accuracy of CBR models built with and without removing outliers from the training data set. They demonstrate that applying the RBM technique to eliminate outliers significantly improves the accuracy of CBR-based software quality classification models.

Seliya et al. [15], in “Software Quality Analysis of Unlabeled Program Modules with Semi-supervised Clustering”, proposed a constraint-based semi-supervised clustering scheme that uses the K-means method as the underlying clustering algorithm. They showed that this approach works better than their previous prediction approach based on unsupervised learning. However, the approach relies on an expert’s domain knowledge to iteratively label clusters as fault-prone or not, so the model also depends on the capability of the expert.

Quah et al. [16], in “Application of Neural Networks for Software Quality Prediction using Object-Oriented Metrics”, present the application of neural networks to software quality estimation using object-oriented metrics. Quality estimation includes estimating the reliability as well as the maintainability of software. Reliability is typically measured as the number of defects, and maintenance effort can be measured as the lines changed per class. Two neural network models are used: the Ward neural network and the General Regression Neural Network (GRNN). The GRNN model is found to predict more accurately than the Ward network model.
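As a rough illustration of how a GRNN produces a prediction, the sketch below implements its usual formulation as kernel-weighted averaging: each training module votes for its own target value, with a weight that decays with its distance from the query. The function name, the Gaussian spread `sigma` and the example data are assumptions for illustration and are not taken from [16].

```python
from math import exp

def grnn_predict(x, train_x, train_y, sigma=1.0):
    """Kernel-weighted average of training targets (GRNN-style prediction).

    x        -- metrics vector of the module to estimate
    train_x  -- metrics vectors of previously measured modules
    train_y  -- their observed targets (e.g. number of defects)
    sigma    -- assumed smoothing parameter; larger values average more widely
    """
    weights = [
        exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    # Note: if the query is extremely far from every training point, all
    # weights can underflow to zero; a production version should guard this.
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)
```

A query placed exactly halfway between two training modules receives the average of their targets, while a query close to one training module receives almost exactly that module's target.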

Mittal et al. [13], in “Software Quality Assessment Based on Fuzzy Logic Technique”, proposed a precise fuzzy logic based approach to quantify the quality of software. Software can be given quality grades on the basis of two metrics: inspection rate per hour and error density. Predicting software quality is straightforward with this approach, and precise quality grades can be given to any software. Triangular fuzzy numbers are used for the inspection rate and the errors per line of code. The fuzzy logic methodology used in the proposed study is sufficiently general and can be applied to other areas of quantitative software engineering.
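A triangular fuzzy number can be sketched as a membership function with three parameters. The fuzzy set labels and their breakpoints below are hypothetical and are not the values used by Mittal et al. [13]; the sketch only shows how a crisp metric value such as error density is turned into degrees of membership.

```python
def triangular(x, a, b, c):
    """Membership of x in the triangular fuzzy number (a, b, c), a < b < c.

    Membership is 0 outside (a, c), rises linearly to 1 at the peak b,
    and falls linearly back to 0 at c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets for error density (illustrative breakpoints only).
ERROR_DENSITY = {
    "low": (-1.0, 0.0, 5.0),
    "medium": (2.0, 6.0, 10.0),
    "high": (7.0, 12.0, 100.0),
}

def fuzzify(x, sets):
    """Degree of membership of x in each labeled fuzzy set."""
    return {label: triangular(x, *abc) for label, abc in sets.items()}
```

For instance, `fuzzify(4.0, ERROR_DENSITY)` assigns the value partly to "low" and partly to "medium"; a rule base would then combine such memberships for both input metrics into a quality grade.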

Zhong et al. [21], in “Unsupervised Learning for Expert-Based Software Quality Estimation”, observe that current software quality estimation models often use supervised learning methods to train a software quality classifier or a software fault prediction model. In such models, the dependent variable is a software quality measurement indicating the quality of a software module by either a risk-based class membership or the number of faults. In reality, such a measurement may be inaccurate, or even unavailable. For such situations, the paper advocates the use of unsupervised learning techniques to build a software quality estimation system.

Khoshgoftaar et al. [7], in “Predicting fault-prone software modules in embedded systems with classification trees”, used Classification and Regression Trees (CART) to predict fault-prone software components in embedded systems. The goal of their prediction was to improve the reliability of embedded systems. The model was applied to modules that were currently under development in order to predict which of them have a high risk of faults being discovered later on. They took software metrics as independent variables and treated the module class, fault-prone or not fault-prone, as the dependent variable.

Mertik et al. [12], in “Estimating Software Quality with Advanced Data Mining Techniques”, presented the use of an advanced data mining tool called Multi method for building a software fault prediction model. Multi method combines different aspects of supervised learning methods and can improve the accuracy of the generated prediction model. They demonstrate the use of the Multi method tool on real data from the Metrics Data Program (MDP) Repository. Their preliminary empirical results show the promising potential of

Yuan et al. [20], in “An application of fuzzy clustering to software quality prediction”, applied subtractive fuzzy clustering to classify modules as fault-prone or not fault-prone and to predict the number of faults. They cluster the data by computing a potential for each candidate cluster center and selecting the data point with the highest potential as the center. This process of selecting centers for subsequent clusters continues until the potential values of the remaining data points fall below a certain threshold. The rules derived from the resulting clusters are then used to predict the number of faults. They perform clustering on product as well as process metrics.
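The center-selection loop described above can be sketched as follows. This is a simplified version of subtractive clustering under stated assumptions: the radii `ra` and `rb`, the single stopping ratio `stop_ratio` and the Gaussian-style potential are conventional choices made here for illustration, not the settings reported in [20], and the sketch stops at finding centers without generating the prediction rules.

```python
from math import exp

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def subtractive_clustering(points, ra=1.0, rb=1.5, stop_ratio=0.15):
    """Return cluster centers chosen from the data points themselves."""
    alpha, beta = 4.0 / ra ** 2, 4.0 / rb ** 2
    # Initial potential: a point with many close neighbours scores high.
    potential = [
        sum(exp(-alpha * sq_dist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        peak = potential[best]
        # Stop once the best remaining potential falls below the threshold.
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[best])
        # Subtract potential around the new center so that the next center
        # is picked from a different region of the data.
        potential = [
            pot - peak * exp(-beta * sq_dist(p, points[best]))
            for p, pot in zip(points, potential)
        ]
    return centers
```

On data with two well-separated groups of modules, the loop returns one center per group and then stops, because the potential left around each chosen center is close to zero.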

Analysis of Quality Estimation Models: We have analyzed different quality prediction models, which are used to identify fault-prone modules and to predict the number of faults, and evaluated their performance. Some models define quality in one way and some in another: they may express quality as the total number of defects or the number of residual defects, or simply classify a module under construction as faulty or non-faulty.

Table 1: Comparison of Prediction Techniques

Study Parameters    NN             SVM                   CBR            BBN            Fuzzy logic   CT/RT
Time                High           High                  Low            High           High          Medium
Data Requirements   High           Low                   High           High           Low           Medium
Software Size       Medium, Large  Small, Medium, Large  Small, Medium  Medium, Large  Small         Medium, Large
Accuracy            Medium High    Medium                Medium         Medium         Medium        Medium

This table also shows the parameters or attributes on the basis of which these approaches have been compared.

According to the study by Khan et al. [6], BBN-based quality models predict quality in terms of residual defects. An important problem with existing BBN-based models, as identified in [1], is that the human factor in quality prediction has not yet been addressed, and ignoring it may lead to a loss of prediction accuracy.

Future Work: To improve the performance of quality prediction models, soft computing approaches (fuzzy logic and neural networks) can be applied because of their learning and/or reasoning capability. Clustering techniques give better results in this field and may also be used to enhance software quality prediction. Some soft computing techniques already use clustering, but there is still scope to improve them.

An SVM-based quality model is a classifier and expresses quality in terms of two classes: faulty and non-faulty modules. Its positive aspect is that it can work well even with a small amount of data by projecting the data into a high-dimensional space. A CBR-based model rests on a strong working hypothesis: “the current module will be as faulty as previous modules with similar software metrics”. This hypothesis restricts the CBR approach: the module whose quality is to be forecast must belong to the same domain as the previous modules; otherwise the prediction may be seriously affected.
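The CBR working hypothesis can be sketched as a nearest-neighbour lookup in a case library. The case library, the two-metric vectors, the Euclidean similarity measure and the choice of `k` are all hypothetical; a practical CBR system would also normalize the metrics so that no single metric dominates the distance.

```python
from math import dist

# Hypothetical case library: (metrics vector, faults observed) for
# previously developed modules from the same domain.
case_library = [
    ((10.0, 2.0), 0),
    ((12.0, 3.0), 1),
    ((40.0, 9.0), 5),
    ((45.0, 8.0), 7),
]

def predict_faults(metrics, k=2):
    """Average the fault counts of the k most similar past modules."""
    nearest = sorted(case_library, key=lambda case: dist(case[0], metrics))[:k]
    return sum(faults for _, faults in nearest) / k
```

A new module with metrics close to the two large past modules is predicted to be about as faulty as they were. If the new module came from a different domain, its nearest cases would still be returned, but their fault counts would no longer be a meaningful guide, which is the restriction noted above.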

Conclusion: In this paper we have compared and analyzed the performance of various software quality estimation models and the way they predict quality. Taking a broad view of “reliability”, we focus on software quality models that make timely predictions of reliability indicators on a module-by-module basis. We also criticized some of the approaches and identified their weak points. This comparative study may help in choosing an approach based on the desired goals and available resources. Many quality estimation models have been developed, most of which are suited to either prediction or classification, but not both. Given the set of criteria considered in Table 1, we conclude that a combined fuzzy-neuro system based model yields better results than the other models. We found problems in the existing quality models, and no single quality model copes with all of our requirements; a quality model that fulfills all requirements is therefore still needed. However, a model based on a combination of technologies solves problems more effectively than a model based on any single technology. We also conclude that ANN and FL are two key technologies in computational intelligence that have received growing attention for solving real-world, non-linear, complex problems. Comparing the approaches gives an answer on the effectiveness and efficiency of a soft computing approach.

In a CART-based model, regression trees (RT) can be used when the output value to be predicted is from the interval domain, while classification trees (CT) are used to predict the output class for an observation.

A neural network based quality model interprets quality in abstract terms and classifies a module as faulty or not faulty. ANNs [19] are characterized by their learning and self-organization capability. Training data plays an important role in predictions made with this approach, which may work better if combined with fuzzy logic. FLS-based models [14] also produce better results. FL has emerged as a mathematical tool to deal with the uncertainties associated with human cognition processes, and it may be enhanced when combined with genetic algorithms that automatically generate example data to reduce cost and improve accuracy [2]. Table 1 shows a comparison between the different approaches [17] with respect to some desirable modeling attributes. Each approach is accurate in certain scenarios, and a real comparison of their accuracy cannot be made, so it is difficult to say which one is better.


References

[1] Amasaki, S., Takagi, Y. and Kikuno, T., “A Bayesian belief network for assessing the likelihood of fault content”, Proceedings of the 14th IEEE International Symposium on Software Reliability Engineering, 2003.

[2] Baisch, E. and Liedtke, T., “Comparison of conventional approaches and soft-computing approaches for software quality prediction”, Proceedings of the IEEE International Conference on Systems, 1997.

[3] Fenton, N.E. and Pfleeger, S.L., Software Metrics: A Rigorous and Practical Approach, PWS Publishing Company, Boston, USA, 1997.

[4] Fenton, N.E. and Neil, M., “A Critique of Software Defect Prediction Models”, IEEE Transactions on Software Engineering, 1999.

[5] ISO/IEC 14598 International Standard, Standard for Information Technology - Software Product Evaluation - Part 1: General Overview, 1999.

[6] Khan, M.J., Awais, M.M. and Hussain, T., “Comparative Study of Various Artificial Intelligence Techniques to Predict Software Quality”, Proceedings of the IEEE, 2006.

[7] Khoshgoftaar, T.M. and Allen, E.B., “Predicting fault-prone software modules in embedded systems with classification trees”, Proceedings of the 4th IEEE International Symposium on High-Assurance Systems Engineering, IEEE Computer Society, 1999.

[8] Khoshgoftaar, T.M. and Seliya, N., “Analogy-Based Practical Classification Rules for Software Quality Estimation”, Empirical Software Engineering, 8(4), 2003.

[9] Khoshgoftaar, T.M., Bullard, L.A. and Gao, K., “Detecting Outliers Using Rule-Based Modeling for Improving CBR-Based Software Quality Classification Models”, Case-Based Reasoning Research and Development, Springer-Verlag, 216-230, 2003.

[10] Khoshgoftaar, T.M. and Gao, K., “Count Models for Software Quality Estimation”, IEEE Transactions on Reliability, 56(2), 212-222, 2007.

[11] McCall, J.A., Richards, P.A. and Walters, G.F., “Factors in Software Quality”, US Rome Air Development Center Reports NTIS AD/A-049 014, 015, 055, 1977.

[12] Mertik, M., Lenic, M., Stiglic, G. and Kokol, P., “Estimating Software Quality with Advanced Data Mining Techniques”, Proceedings of the International Conference on Software Engineering Advances, IEEE Computer Society, Washington, DC, 2006.

[13] Mittal, H., Bhatia, P.K. and Goswami, P., “Software Quality Assessment Based on Fuzzy Logic Technique”, International Journal of Software Computing Applications, Issue 3, 105-112, 2008.

[14] Mittal, H., “Critical Analysis of Software Quality Estimation Techniques”, 2nd International Conference on Information Technology for Excellence, organized by MAIMT, Jagadhari, 2010.

[15] Seliya, N. and Khoshgoftaar, T.M., “Software Quality Analysis of Unlabeled Program Modules with Semisupervised Clustering”, IEEE Transactions on Systems, Man and Cybernetics - Part A: Systems and Humans, 37(2), 201-211, 2007.

[16] Quah, T.S. and Thwin, M.M.T., “Application of Neural Networks for Software Quality Prediction using Object-Oriented Metrics”, IEEE Transactions on Neural Networks, 2003.

[17] Technical Report, LUMS, “A Survey of Measurement-Based Software Quality Prediction Techniques”, 2007.

[18] Xing, F., Guo, P. and Lyu, M.R., “A novel method for early software quality prediction based on support vector machine”, Proceedings of the 16th IEEE International Symposium on Software Reliability Engineering, 2005.

[19] Yang, Yao and Huang, “Early Software Quality Prediction Based on a Fuzzy Neural Network Model”, Proceedings of the IEEE International Conference on Natural Computation, 2007.

[20] Yuan, X., Khoshgoftaar, T.M., Allen, E.B. and Ganesan, K., “An application of fuzzy clustering to software quality prediction”, Proceedings of the IEEE Symposium on Application-Specific Systems and Software Engineering Technology, IEEE, 2000.

[21] Zhong, S., Khoshgoftaar, T.M. and Seliya, N., “Unsupervised Learning for Expert-Based Software Quality Estimation”, Proceedings of the 8th International Symposium on High Assurance Systems Engineering, Tampa, FL, 149-155, 2004.
