Software Quality Modeling using Metrics of Early Artifacts Chandan Kumar1, Dilip Kumar Yadav2 1,2
Department of Computer Applications National Institute of Technology, Jamshedpur, India-831014 1
[email protected],
[email protected]
reliability growth prediction modeling, and accurate assessment of reliability. The issue of finding a universal model for all possible software projects is yet to be solved. Selection of an exact model is very important in software reliability prediction because both the release date and the resource allocation assessment can be affected by the accuracy of prediction. The BBN is one of the methods for modeling systems that include underlying relationships among variables. The benefit of a BBN model is that it is not compulsory to enter values for all of the factors in order to get a prediction. In fact, for this model, reasonably exact predictions of the faults can be achieved typically by entering values for just the three or four most important factors recognized in the sensitivity analysis. Hence, this category of model can be used for competent decisionmaking and trade-off analysis during early development phases. In general BBN modeling, a Bayesian belief value to each unsure event is assigned, such as the belief value of “Cloudy” is assigned as 0.65, the belief value of “Jony is late” is 0.2, and so on. All these probabilities for the unsure events come from people᪃s personal judgments or domain experts that are determined by collecting empirical, historical or statistical data. The rest of the paper is ordered as follows: in section II, we present a concise analysis on Bayesian Belief Networks. In section III, we proposed a BBN model for the software development process, and then we evaluate the model by applying the authentic project data in section IV. Experiments and results are described in section V and conclusion is given in section VI.
ABSTRACT Software industries require reliability prediction for quality evaluation and resource planning. In early phase of software development, failure data is not accessible to conclude the reliability of software. However, early software fault prediction procedure provides a flexibility to predict the faults in early stage. In this paper, a software faults prediction model is proposed using BBN that focus on the structure of the software development process explicitly representing complex relationship of five influencing parameters (Techno-complexity, Practitioner Level, Creation Effort, Review Effort, and Urgency). In order to assess the constructed model, an empirical experiment has been performed, based on the data collected from software development projects used by an organization. The predicted fault ware found very near to the actual fault detected during testing. Keywords: Software Defect Prediction, Software Reliability, Bayesian Belief Network (BBN) 1. INTRODUCTION A software artifact can be released only after some prespecified reliability criterion has been met at the end of the development cycle. A number of logical models have been projected for assessing the reliability of a software system [1]. Data-driven approach and logical method are used to develop software reliability model. Normally, a good number of models consider fault detection process and data for analysis. In the early phase before testing, it is difficult to obtain accurate predictions. But early software defect prediction is useful for assessment of the software development process. In early software reliability prediction literature, there are numerous approaches and models studied like Causal Model (Bayesian net) [11, 12], ProPRED (Probabilistic Model) [13], ESRA (Early Software Reliability Assessment) approach [2, 3, 4], UML based software models [5], ERAT (Early Reliability Analysis Technique) [6], ERPM (Early Reliability Prediction Model) [7], Enhanced Model [8], Decision Tree Based Model in combination of K-means clustering as preprocessing technique [9]. There are no methods and approaches universal for all possible software projects. To develop a reliable software system, several issues are to be addressed. These include depth of reliable software, regular development methodologies, testing methods for reliability,
2. BAYESIAN BELIEF NETWORK (BBN) A Bayesian Belief Network (BBN) [10] represents underlying relationships of a system or dataset and provides a graphical representation of this causal structure through the use of Directed Acyclic Graphs (DAGs) with nodes and edges. The DAG representation then provides a framework for assumption and prediction. The nodes represent random variables with probability distributions, while edges represent weighted underlying relationships between the nodes. Each node has a probability of having a certain value. A directed edge exists from a parent to a child. Each child node has a conditional probability table based in parental values. An example of DAG is shown in Fig. 1, This represents an underlying relationship between
Ƈ7Ƈ
7)
3 random variables P, Q and R whereby there is a relationship PĺQ and PĺR but there is no direct relationship between Q and R. The existence of the underlying relationship Pĺ Q is represented by the arc/edge between the nodes P and Q while the strength of the relationship is represented by the conditional probability P (Q / P).
Be capable of used now on real, large-scale problems since computational tractability issues have been solved.
Fig. 2. Collecting facts in BBN 3.
BBN MODEL OF SOFTWARE DEVELOPMENT PROCESS Defining the BBN Model: To define the BBN model, there is a need to identify the range of information and data sources available, and identify risky variables. The five metrics considered in this model are described below.
Fig. 1. BBN underlying structure The probability table is assigned to each variable. The probability table is especially assigned to the dependent variable (here, Q, R and T are such dependent variables), and is named the conditional probability table (CPT). It is assumed that all variables are binary which take values “True” or “False” and values of some variables are unknown Fig. 2. The BBN approach provides quantified and auditable risk estimation and enables combination of multiple forms of data. This includes inflexible data such as test results, as well as individual data such as experience of staff. The most important advantages accessible by BBN are: 1) Best method for analysis under uncertainty. 2)
Prediction based on evidence propagation algorithm that dynamically updates the model given all current data.
3)
Can come together different data, individual beliefs and empirical data.
3.2. Practitioner level: (Experience + Product familiarity) This metric describes the experience of the software development team members and their familiarity with the software product to be developed. The programmers having less experience and less personal knowledge or information about the product features are more prone to errors and omissions.
including
4)
Will provide predictions even if there is partial or missing data.
5)
Visual analysis that makes every assumptions and evidence explicit and auditable to regulators and inspectors.
6)
3.1. Techno-complexity This metric describes the technology used for the software development and the complexity of software requirements. The increased complexity lead to misinterpretation about the product features.
3.3. Creation Effort Creation effort is assessed in terms of the person-hours. The increased effort spent in the creation of the software requirement specification document, design phase and in the coding phase introduces a smaller amount of faults in the software. 3.4. Review effort Review effort is assessed in terms of the person-hours. The better effort spent in the review of the software requirement
Achieve powerful "what-if" analysis to test sensitivity of conclusions.
Ƈ8Ƈ
specification document, design phase and in the coding phase extracts out the faults in the software.
software [14]. The specifications of projects are shown in Table 1-3.
3.5. Urgency: Percentage compression in time. Assuming that more number of faults are encountered if the time span for completion of the software is less. The architecture of the proposed model has been shown in Fig. 3.
Table 1. Specifications for Project 1
Fig. 3. BBN for Software Development Process 4. MODEL EVALUATION To evaluate our model we enter conditional probabilities to the various nodes of the BBN which are obtained with the help of experts working in the specified domain. The complied mode of BBN is shown in the Fig 4.
Sr. No.
Influencing Variables SRD
Design
Coding
1
Quality of artifacts High made+ Clarity of requirements
Medium
High
2
Effort spent for review High w.r.t creation effort
High
High
3
Effort spent for Medium creation w.r.t ideal phase effort
Medium
Medium
4
Experience practitioner of relevant stage
Medium
Medium
5
Technology/ Novelty High of the feature
High
High
6
Familiarity practitioner product family
of High with
High
Low
7
Urgency/standard time Low
Low
Low
8
Complexity application
Low
Low
9
Unplanned interruptions the work
NA
Low
Low
Medium
High
High
10
of Medium the
of Low
during
Detected Faults
Actual Defects
14
Table 2. Specifications for Project 2 Sr. Influencing Variables No.
SRD
Design
Coding
Quality of artifacts made+ Clarity of requirements
High
High
High
2
Effort spent for review w.r.t creation effort
High
High
High
3
Effort spent for creation w.r.t ideal phase effort
High
High
High
4
Experience of practitioner of the relevant stage
High
High
Low
1
Fig. 4. BBN in Compile Mode 5
5. EXPERIMENTS AND RESULTS Model is experimented with three different specifications of projects and gets the results with the help of Netica
Ƈ9Ƈ
Technology/ Novelty of the feature
Medium Medium
Medium
6
Familiarity of practitioner with product family
High
High
Low
7
Urgency/standard time
High
High
High
8
Complexity of application
High
High
High
9
Unplanned interruptions during the work
NA
High
Low
High
High
High
10 Detected Faults Actual Defects
1 1 14 15.48 2 2 18 21.49 3 3 8 13.48 The results above show better fault predicting accuracy of BBN model with around 81% correlation with the actual faults. 6. CONCLUSION The proposed model is developed using the qualitative value of five metrics (Techno-complexity, Practitioner Level, Creation Effort, Review Effort, and Urgency) of early artifacts of software development life cycle. Predicted number of defects is very near to actual defects detected in testing phase for three software projects shown in Table 2. The model is helpful during testing phase for planning the testing resources.
18
Table 3. Specifications for Project 3 Sr. Influencing Variables No.
SRD
Design
Coding
1
Quality of artifacts made+ Clarity of requirements
High
Medium
Medium
2
Effort spent for review w.r.t creation effort
High
Medium
High
3
Effort spent for creation w.r.t ideal phase effort
High
High
High
4
Experience of practitioner of the relevant stage
High
High
Medium
5
Technology/ Novelty of the feature
Low
Low
Low
6
Familiarity of practitioner with product family
High
High
Medium
7
Urgency/standard time
Low
Low
Low
8
Complexity of application
9
Unplanned interruptions during the work
10 Detected Faults
7. [1]
[2]
[3]
[4]
[5] Medium Medium
Medium
NA
Low
Low
High
Low
Medium
Actual Defects
[6]
8
[7] The actual faults and those obtained by implementing Bayesian Belief Networks for predicting the expected number of faults in 3 different projects before testing are illustrated in Table 4;
[8]
Table 4. Validation of the predicted Results S No
Projects
Actual number of Faults
Predicted number of faults
Ƈ 10 Ƈ
REFERENCES Amrit L. Goel, “Software Reliability Models: Assumptions, Limitations, and Applicability,” IEEE Transactions on Software Engineering, vol. SE-11, no. 12, 1985, pp. 1411-1423. D.K.Yadav, S.K.Charurvedi, R. B. Mishra, “Early Software Defects Prediction using Fuzzy Logic,” International Journal of Performability Engineering vol. 8, no. 4, 2012, pp. 399-408. Sirsendu Mohanta, Gopika Vinod, A. K. Ghosh, Rajib Mall, “An Approach for Early Prediction of Software Reliability,” ACM SIGSOFT Software Engineering Notes vol. 35(6), 2010, pp. 1-9. Sirsendu Mohanta, Gopika Vinod, and Rajib Mall, “A technique for early prediction of software reliability based on design metrics,” Int J Syst Assur Eng Manag, vol. 2(4), 2011, pp. 261–281. Vittorio Cortellessa, Harshinder Singh, and Bojan Cukic, “Early reliability assessment of uml based software models,” In WSOP: „02 Proceedings of the 3rd international workshop on Software and performance Pages 302-309, ACM, 2002 Rakesh Tripathi and Rajib Mall, “Early stage software reliability and design assessment,” In APSEC ᪃05: Proceedings of the 12th Asia-Pacific Software Engineering Conference, IEEE Computer Society, 2005. Carol Smidts, Martin Stutzke, Robert W. Stoddard, “Software Reliability Modeling: An Approach to Early Reliability Prediction,” IEEE Transactions on Reliability, vol. 47, 1998, pp. 268-178. K. Saravana Kumar, R. B. Misra, “An Enhanced Model for Early Software Reliability Prediction using Software Engineering Metrics,” In SSIRI '08: Second International Conference, IEEE Computer Society, 2008.
[9]
Parvinder S. Sandhu, Amanpreet S. Brar, Raman Goel, Jagdeep Kaur, Sanyam Anand, “A Model for Early Prediction of Faults in Software Systems,” In ICCAE ᪃10: Second International Conference, IEEE, 2010 [10] F.V. Jensen, an Introduction to Bayesian Networks. Springer-Verlag, 1996. [11] Fenton N, Krause P, Neil M, “Software measurement: uncertainty and causal modeling”, Software IEEE, vol. 19, 2002, pp. 116–122. [12] Fenton N, and Neil M, “A critique of software defect
prediction models”, IEEE Transactions on Software Engineering, vol. -25, no. 5, 1999, pp. 674-689. [13] Jie Ba, and Shujian Wu, “ProPRED: A Probabilistic Model for the Prediction of Residual Defects,” In MESA᪃12: IEEE/ASME International Conference Page 247-251. [14] Netica, Available at http://www.norsys.com.
Ƈ 11 Ƈ