International Journal of Software Engineering and Its Applications Vol. 9, No. 7 (2015), pp. 135-142 http://dx.doi.org/10.14257/ijseia.2015.9.7.14
Predicting Faults before Testing Phase using Halstead’s Metrics Rajni Sehgal1 and Dr. Deepti Mehrotra2 1, 2
Amity School of Engineering and Technology, Amity Noida 1
[email protected],
[email protected] Abstract
Software designers are motivated to use off-the-shelf software components for rapid application development. Such applications are expected to have high reliability as a result of deploying trusted components. This paper applies Halstead’s software science to predict faults before the testing phase in a component-based system. Faults are predicted for each individual component, and the reliability of each component is then measured from these faults so that only reliable components are reused.
Keywords: Faults, reliability, Component based software engineering
1. Introduction

Delivering a reliable software product within a stipulated time and budget is always a prime objective of the developer. Software reliability is defined as the probability of failure-free operation of a software system for a specified period of time in a specified environment [1]. With the increasing complexity of projects and tighter schedules, the testing phase has become more challenging. In practice, developers often ensure high software reliability only through software testing during late development stages. Detecting software faults at an early development stage improves the reliability and overall quality of the software. Halstead’s metrics, which aim to predict faults at an early stage of development, can therefore improve software reliability; the methodology is especially suitable for component-based systems, where components are assembled to fulfil a designed need. In the software engineering community, CBSE (Component Based Software Engineering) is becoming progressively more popular. The objective is to design components that can be easily installed and that cooperate well with existing system components. Frequent reuse makes components suitable for numerous applications. To make reuse more effective, however, the reliability of a component should be known to the user before the testing phase. If a reliable component is assembled into a newly developed system, the expected reliability of that system is also high. CBSE-based development raises new issues in software development research, such as safeguarding the reliability of the interaction between components and relating system failures to the failures of particular components. Component-based software reliability engineering can help us gather and analyze information that is pertinent to solving quality-related problems.
Component failure rates play a major role in the reliability assessment of the system as a whole. The problem we face is how to combine these pieces of the reliability puzzle. It is known that a software fault and the resulting error in one component can propagate to other interacting components, causing them to fail. Reliability measurements are also useful in supporting management of the software development process. For instance, attaining reliability estimates early in the
ISSN: 1738-9984 IJSEIA Copyright ⓒ 2015 SERSC
development process can help decide whether the software system is on track to meet its reliability goals and consequently increase management effectiveness. In this paper we discuss the prediction of the reliability of component assemblies where the components are reusable assets. CBSE is very helpful for improving efficiency. However, to make software reliable and reusable and to preserve its quality, it is necessary to measure the attributes of the software. In CBSE it is crucial to check quality at every phase before testing, so that mistakes are detected at an early stage of the software development lifecycle.
2. Related Work

Much work has been done on predicting faults before the testing phase; some of it is discussed here. Leslie Cheung, Roshanak Roshandel, et al. propose early prediction of software component reliability at the architectural level; their study uses a Markov model in three steps: determining states, determining transitions, and computing reliability [2]. Petar Popic, Dejan Desovski, et al. extend their previous work on Bayesian reliability prediction of component-based systems by introducing the error propagation probability into the model. They conclude that error propagation may have a significant impact on the system reliability prediction and, therefore, future architecture-based models should not ignore it [3]. Vittorio Cortellessa and Vincenzo Grassi analyze the reliability of a component-based system based on the error propagation probability, that is, the probability that an error arising somewhere in the system propagates to other components, possibly up to the system output [4]. Ralf H. Reussner and Heinz W. Schmidt propose a method based on a rich architecture definition language (RADL) to predict the reliability of component-based software systems. They show that RADL allows software architects to predict component reliability through compositional analysis of usage profiles and of environment component reliability [5]. Genaina Rodrigues and David Rosenblum present a novel automatic approach for predicting software system reliability. This approach involves the probability of component failure and scenario transition probabilities derived from an operational profile [6]. Thomas Zimmermann and Nachiappan Nagappan show how to use the complexity of a subsystem dependency graph to predict the number of failures at statistically significant levels [7]. T. L. Graves, A. F. Karr, J. S.
Marron explored the extent to which measurements from the change history are successful in predicting the distribution of fault incidences over modules. In general, process measures based on the change history are more useful in predicting fault rates than product metrics of the code. For instance, the number of times code has been changed is a better indication of how many faults it will contain than its length [8]. Arun Sharma et al. propose a linked-list-based dependency representation and implement it using a HashMap in Java. This representation can store the dependency along with other information, such as the provided and required interfaces of components and their types [9]. Kirti Seth et al. introduce a reliability model and a reliability analysis technique for architecture-based reliability evaluation of component-based systems [10]. Kirti Seth et al. focus on the four factors that have the strongest effect on CBSS reliability. Based on these four factors, they propose a new fuzzy-logic-based model for estimating CBSS reliability [11]. N. S. Gill and P. S. Grover discuss issues related to component-based development (CBD) and suggest a general definition of a software component based on several existing definitions [12]. Swapna S. Gokhale proposes a unifying framework for state-based models for architecture-based software reliability prediction. In state-based models, the application architecture is represented either as a discrete time Markov chain (DTMC) or as a continuous time Markov chain (CTMC) [13]. Yue Jiang et al. compare the performance of predictive models that use design-level metrics with those that use code-level metrics
and those that use both [14]. Nick J. Pizzi proposes, in a case study, an aggregation technique based on fuzzy integration that combines the predictive quantitative assessments from multiple classifiers [15]. Cagatay Catal surveys the software engineering literature on software fault prediction, covering both machine-learning-based and statistical approaches across 90 software fault prediction papers [16]. Yang Weimin and Li Longshu study the correlation of software metrics, focusing on the data sets used for software defect prediction. A rough set model is presented to reduce the attributes of these data sets [17]. Athanasios Tsakonas et al. investigate the capability of the genetic programming approach for producing solutions composed of decision rules [18]. Thomas J. Ostrand et al. developed a negative binomial regression model and used it to predict the expected number of faults in each file of the next release of a system. The predictions are based on the code of the file in the current release and on the fault and modification history of the file from previous releases [19].

2.1. Predicting Faults for Component based System

Deepak Panwar and Pradeep Tomar propose a new method to find the maximum number of faults early in the software development life cycle. Their methodology is Halstead’s software science, which they apply to a component-based system to predict faults before the testing phase [20]. Panwar, D., Tomar, P. and Gill, N. S. propose a new method to find faults before the testing phase. They propose a new metric that improves on Halstead’s software science by taking the attributes of component-based software engineering into account [21].
3. Methodology

(i) Predicting faults at the early development phase (before designing)

The reliability of a product depends on the number of faults in its components and also on the probability of usage. A larger number of faults does not necessarily mean that a component will fail; the reliability of a component also depends on how much it is used. At an early stage of software design, one can predict the likely length of the code and the usage of each component. From design models alone it is not possible to express reliability with exact figures. We therefore propose discrete levels combined through a Cartesian product. For example, Lr = {high, medium, low} is the set of reliability levels. To determine the reliability of a component we define Lv = {high, low}, where Lv is the set of volume levels. Faults and failures are more likely where the volume of the code is high, i.e., where the LOC count is larger, which results in a less reliable component. By assigning these volume levels (Lv) to the components, we know which components are highly fault-prone. The operational profile [23] “is a description of the usage of the system, showing which functions are mostly used.” This information is used to assign usage levels Lu to the components. These levels can be of various granularities; an example would be Lu = {high, medium, low}. When we know the usage of each component, we can analyze the probability that the faults in the component lead to a failure. Thus, the reliability of a component is a combination of its volume level (Lv) and its usage level (Lu). We refer to the mapping of the volume level and usage level to the reliability as the function fr:

$$f_r = \frac{1}{L_v} \times L_u \rightarrow L_r$$

The function simply maps all components with a high volume level to their usage level and all components with a low volume level to low
Table I. Predicting Reliability at Design Phase

| Volume | Usage | Reliability |
|--------|--------|-------------|
| High | High | Low |
| High | Medium | Low |
| High | Low | High |
| Low | High | Low |
| Low | Medium | Medium |
| Low | Low | Low |
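The mapping in Table I can be read as a simple lookup from a (volume level, usage level) pair to a reliability level. The following is a minimal sketch of that reading, not the authors' implementation; the names `RELIABILITY_TABLE` and `design_phase_reliability` are hypothetical, and the entries are transcribed from Table I exactly as printed.

```python
# Design-phase reliability lookup, transcribed from Table I as printed.
# Keys are (volume level, usage level); values are reliability levels.
# All names here are illustrative assumptions, not part of the paper.
RELIABILITY_TABLE = {
    ("high", "high"): "low",
    ("high", "medium"): "low",
    ("high", "low"): "high",
    ("low", "high"): "low",
    ("low", "medium"): "medium",
    ("low", "low"): "low",
}


def design_phase_reliability(volume_level: str, usage_level: str) -> str:
    """Return the reliability level Table I assigns to a (volume, usage) pair."""
    key = (volume_level.strip().lower(), usage_level.strip().lower())
    if key not in RELIABILITY_TABLE:
        raise ValueError(f"unknown level combination: {key}")
    return RELIABILITY_TABLE[key]


if __name__ == "__main__":
    # Look up one row of Table I.
    print(design_phase_reliability("High", "Medium"))
```

Keeping the table as data rather than as nested conditionals makes it easy to swap in different granularities for Lv, Lu and Lr.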
(ii) Predicting faults at the early development phase (after designing)

Step 1: A component-based software system is taken that has 20,000 LOC and 10 components. The components are interactive but independent.

Step 2: The volume of each of the 10 components is calculated using Halstead’s metrics:

$$V = N \log_2 (n_1 + n_2)$$

where N = N1 + N2, N1 is the number of operator occurrences, N2 is the number of operand occurrences, n1 is the number of unique operators, and n2 is the number of unique operands; n1 and n2 together constitute the vocabulary of the program.

Step 3: The faults for each of the 10 components are calculated using Halstead’s metrics:

$$B = \frac{V}{S_0}$$

where B is the number of faults in the program, V is the volume, and S0 is the mean number of mental discriminations (decisions); S0 = 3000.

Step 4: As faults lead to failures, the MTBF (Mean Time between Failures) needs to be calculated. MTBF is the calculated average time it will take for a system to fail, and it can be calculated as [22]

$$\mathrm{MTBF} = \frac{\text{Mission time} \times \text{Sample population}}{\text{Failures within mission time}}$$

where mission time is the time over which data is collected for the reliability calculation, sample population is the number of components taken for the study, and failures within mission time is the number of failures exhibited within the mission time.

Step 5: Reliability is the probability that a system will work without failures for a specific period of time. The value of reliability lies between 0 and 1 and can be calculated as [22]

$$\text{Reliability} = e^{-\text{planned service life}/\mathrm{MTBF}}$$

Service life is the amount of time a device is in service, or the expected length of operation before a device fails.
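Steps 2 to 5 can be sketched as four small functions. This is an illustrative sketch under stated assumptions, not the authors' tooling: the function names are hypothetical, the operator and operand counts in the usage example are invented for illustration, and the time units are whatever units the caller uses consistently.

```python
import math

# Mean number of mental discriminations, as given in Step 3.
S0 = 3000


def halstead_volume(N1: int, N2: int, n1: int, n2: int) -> float:
    """Step 2: V = N * log2(n1 + n2), with N = N1 + N2."""
    vocabulary = n1 + n2
    if vocabulary <= 0:
        raise ValueError("vocabulary (n1 + n2) must be positive")
    return (N1 + N2) * math.log2(vocabulary)


def predicted_faults(volume: float, s0: float = S0) -> float:
    """Step 3: B = V / S0."""
    return volume / s0


def mtbf(mission_time: float, sample_population: int, failures: int) -> float:
    """Step 4: MTBF = mission time * sample population / failures in mission time."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTBF")
    return mission_time * sample_population / failures


def reliability(planned_service_life: float, mtbf_value: float) -> float:
    """Step 5: R = exp(-planned service life / MTBF)."""
    if mtbf_value <= 0:
        raise ValueError("MTBF must be positive")
    return math.exp(-planned_service_life / mtbf_value)


if __name__ == "__main__":
    # Hypothetical counts for one component (not taken from the paper).
    v = halstead_volume(N1=40, N2=24, n1=10, n2=6)
    b = predicted_faults(v)
    # Hypothetical observation window: 6 time units, 10 components, 3 failures.
    m = mtbf(mission_time=6, sample_population=10, failures=3)
    r = reliability(planned_service_life=m, mtbf_value=m)
    print(v, b, m, r)
```

The functions are kept separate because Steps 2 and 3 need only static code counts, while Steps 4 and 5 need observed failure data.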
3.1. Experimental Setup

Halstead’s software science is used to predict the faults for a component-based system. For this study, a component-based system is taken that has 10 independent components and 20,000 LOC (lines of code). The system has a database, and every component of the system is connected to it. The system is developed with Microsoft .NET, which supports GUI-based development, and the database is designed in MySQL. Because the front end is built with GUI-based tools, it takes less effort to develop than the background of the system, i.e., the database code. Halstead’s software science is applied to the foreground as well as the background code of each of the 10 components to predict the faults. The results are shown in Table II. If a system is faulty it may fail, so the failures that occurred were recorded; they are shown in Table III.
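4. Observation and Results

Table II. Prediction of Faults for Different Components

| Component | Code | Volume | Faults | MTBF | Reliability |
|-------------|------------|----------|--------|---------|-------------|
| Component1 | Foreground | 137.25 | 0.044 | 0.05 | 0.984 |
| Component1 | Background | 3120.10 | 1.04 | 0.025 | 0.036 |
| Component2 | Foreground | 99.91 | 0.033 | 0.033 | 0.0820 |
| Component2 | Background | 3216.03 | 1.07 | 0.1 | 0.436 |
| Component3 | Foreground | 456.87 | 0.152 | 0.041 | 0.131 |
| Component3 | Background | 2547.18 | 0.84 | 0.045 | 0.158 |
| Component4 | Foreground | 87.56 | 0.029 | 0.0625 | 0.26 |
| Component4 | Background | 5536.28 | 1.85 | 0.025 | 0.036 |
| Component5 | Foreground | 369.25 | 0.122 | 0.031 | 0.07 |
| Component5 | Background | 37898.23 | 12.632 | 0.038 | 0.11 |
| Component6 | Foreground | 2141.07 | 0.69 | 0.033 | 0.080 |
| Component6 | Background | 13118.16 | 4.372 | 0.037 | 0.106 |
| Component7 | Foreground | 378.64 | 0.359 | 0.038 | 0.11 |
| Component7 | Background | 14668.22 | 4.889 | 0.0625 | 0.26 |
| Component8 | Background | 2548.21 | 0.849 | 0.0833 | 0.3678 |
| Component9 | Foreground | 1409. | 0.486 | 0.125 | 0.51 |
| Component9 | Background | 1140.87 | 0.380 | 0.047 | 0.170 |
| Component10 | Foreground | 1157.75 | 0.383 | 0.03125 | 0.074 |
| Component10 | Background | 1938.62 | 0.646 | 0.04166 | 0.165 |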
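As a hedged arithmetic check of Steps 3 and 5 against one row of Table II (Component8, background code), the tabulated values can be substituted into the formulas. The planned service life, written T here, and the time units are not stated in the paper, so the second line is an inference from the tabulated numbers rather than a reported value.

```latex
% Step 3 check for Component8 (background), with S_0 = 3000:
\[
B = \frac{V}{S_0} = \frac{2548.21}{3000} \approx 0.849
\]
% Step 5 check: the tabulated reliability 0.3678 is close to e^{-1},
% which under R = e^{-T/MTBF} corresponds to T equal to the MTBF.
\[
R = e^{-T/\mathrm{MTBF}}, \qquad e^{-1} \approx 0.3679 \approx 0.3678
\quad\Longrightarrow\quad T \approx \mathrm{MTBF} = 0.0833
\]
```

The fault value reproduces the tabulated 0.849. The reliability value for this row is consistent with a planned service life of about 0.0833 in the same units as the MTBF column.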
Table III. Number of Failures Exhibited within 6 Months

| Summary of Errors | Components |
|-------------------|------------|
| Splash Screen not showing | Component 1 |
| Splash Screen Story board not starting | Component 1 |
| Background worker not loading | Component 1 |
| Canvas not expanding dynamically | Component 2 |
| Incorrect dimensions of dynamic buttons | Component 2 |
| Incorrect background image | Component 2 |
| last service date format does not match database date format | Component 3 |
| Fuel Logs (Index Out of Bounds Exception) | Component 4 |
| Browse gallery API not working | Component 5 |
| Designer Failed on loading Graph | Component |
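The "failures within mission time" input of Step 4 is a per-component count, which can be tallied directly from Table III. The sketch below is illustrative only; the names `FAILURES` and `failures_per_component` are hypothetical, and the last row is stored with `None` because its component number is missing in the source.

```python
from collections import Counter

# (failure summary, component number) pairs transcribed from Table III.
# The component number of the last row is missing in the source, so it is None.
FAILURES = [
    ("Splash Screen not showing", 1),
    ("Splash Screen Story board not starting", 1),
    ("Background worker not loading", 1),
    ("Canvas not expanding dynamically", 2),
    ("Incorrect dimensions of dynamic buttons", 2),
    ("Incorrect background image", 2),
    ("last service date format does not match database date format", 3),
    ("Fuel Logs (Index Out of Bounds Exception)", 4),
    ("Browse gallery API not working", 5),
    ("Designer Failed on loading Graph", None),
]


def failures_per_component(records):
    """Count recorded failures per component; unnumbered rows are grouped under None."""
    return Counter(component for _, component in records)


if __name__ == "__main__":
    for component, count in failures_per_component(FAILURES).items():
        label = f"Component {component}" if component is not None else "Component (number missing)"
        print(label, count)
```

Each count could then be passed as the failure count in the Step 4 formula, together with the 6-month mission time from the table caption.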
4.1. Comparison between Volume and Faults for Foreground Code
Figure 1. Number of Faults Predicted in 10 Components in GUI Code

As the above figure shows, where the volume is high the number of faults is also high.

4.2. Comparison between Volume and Faults for Background Code

[Figure 2 plot: title "Background Faults"; labels "Fault", "CV" and "Component Volume"; components 1 to 10; vertical scale 0 to 40000.]
Figure 2. Prediction of Faults for 10 Different Components in Background Code

Comparing Figure 1 and Figure 2, the volume of the background code is greater than that of the GUI code, so faults are more likely to occur in the background code than in the GUI code, and the system is more likely to fail because of faults in the background code.

4.3. Comparison between MTBF and Reliability for Foreground Code
[Figure 3 plot: title "MTBF/Reliability"; series "Foreground MTBF" and "Foreground Reliability"; axis labels "Reliability" and "MTBF"; components 1 to 9; vertical scale 0 to 1.2.]
Figure 3. MTBF (mean time between failures) / Reliability for Foreground Code

This shows that as the MTBF increases, the reliability of the system also increases.

4.4. Comparison between MTBF and Reliability for Background Code
Figure 4. MTBF (mean time between failures) / Reliability for Background Code

This shows that as the MTBF increases, the reliability of the system also increases.
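The trend reported for Figures 3 and 4 also follows directly from the Step 5 formula. As a short derivation, writing M for the MTBF and T for the planned service life (symbols introduced here for brevity) and assuming T is held fixed:

```latex
\[
R(M) = e^{-T/M}, \qquad
\frac{dR}{dM} = \frac{T}{M^{2}}\, e^{-T/M} > 0
\quad \text{for } T > 0,\ M > 0
\]
```

Since the derivative is positive, reliability increases monotonically with MTBF for any fixed positive service life.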
5. Conclusion

Halstead’s software science predicts the faults of individual components. A system, however, is composed of components that interact through interfaces, and Halstead’s software science cannot predict the faults that arise when components interact. Several parameters therefore remain open, such as the effect on reliability when the requirements change, when the design changes, when the number of lines of code changes, or when functions of one component are called by another component. Taking all these parameters into account, a new metric can be proposed that can predict more faults than Halstead’s software science.
References

[1] J. D. Musa, A. Iannino and K. Okumoto, “Software reliability: measurement, prediction, application”, McGraw-Hill, Inc., New York, (1987).
[2] L. Cheung and R. Roshandel, “Early Prediction of Software Component Reliability at architectural level”, ACM, (2010).
[3] P. Popic and D. Desovski, “Error Propagation in the Reliability Analysis of Component based Systems”, IEEE, (2005).
[4] V. Cortellessa and V. Grassi, “A Modeling approach to analyze the Impact of Error Propagation on Reliability of Component Based System”, Springer, (2007).
[5] R. H. Reussner and H. W. Schmidt, “Reliability prediction for component-based software architectures”, Elsevier, (2003).
[6] G. Rodrigues and D. Rosenblum, “Using Scenarios to Predict the Reliability of Concurrent Component based Software System”, Springer, (2005).
[7] T. Zimmermann and N. Nagappan, “Predicting Subsystem Failures using Dependency Graph Complexities”, IEEE, (2007).
[8] T. L. Graves, A. F. Karr, J. S. Marron and H. Siy, “Predicting Fault Incidence Using Software Change History”, ACM, (2000).
[9] A. Sharma, P. S. Grover and R. Kumar, “Dependency analysis for component-based software systems”, ACM SIGSOFT, (2009).
[10] K. Tyagi, A. Sharma and A. Seth, “Minimum Spanning Tree-Based Approach for Reliability Estimation of CBSE-Based Software Applications”, IUP Journal of Computer Sciences, (2010) October.
[11] K. Tyagi, A. Sharma and A. Seth, “A rule-based approach for estimating the reliability of component-based systems”, Elsevier, (2012).
[12] N. S. Gill and P. S. Grover, “Component-Based Measurement: Few Useful Guidelines”, ACM SIGSOFT, (2003).
[13] S. S. Gokhale, “Analytical Models for Architecture-Based Software Reliability Prediction: A Unification Framework”, IEEE Transactions on Reliability, (2006).
[14] Y. Jiang, “Comparing Design and Code Metrics for Software Quality Prediction”, ACM, (2008).
[15] N. J. Pizzi, “Software quality prediction using fuzzy integration: a case study”, Springer, (2007).
[16] C. Catal, “Software fault prediction: A literature review and current trends”, Elsevier, (2011).
[17] Y. Weimin and L. Longshu, “A New Method to Predict Software Defect Based on Rough Sets”, IEEE, (2008).
[18] A. Tsakonas and G. Dounias, “Predicting Defects in Software Using Grammar-Guided Genetic Programming”, Springer, (2008).
[19] T. J. Ostrand, E. J. Weyuker and R. M. Bell, “Predicting the Location and Number of Faults in Large Software Systems”, IEEE Transactions, (2006).
[20] P. D. Tomar, “New Method to Find the Maximum Number of Faults by Analyzing Reliability and Reusability in Component-Based Software”, IEEE, (2011).
[21] D. Panwar, P. Tomar, N. S. Gill and A. Kumar, “New Method to Analyze the Impacts on Reliability and Reusability in Component-Based Software Development”, IMS Manthan, (2011).
[22] Vance persons, Joseph Dykshorn, “Mean Time between Failures”.
[23] J. D. Musa, A. Iannino and K. Okumoto, “Software Reliability: Measurement, Prediction, Application”, McGraw-Hill, (1987).