Identification of Error Prone Classes for Fault Prediction Using Object Oriented Metrics

Puneet Mittal1, Satwinder Singh1, and K.S. Kahlon2

1 BBSBEC, Fatehgarh Sahib, Punjab, India
2 GNDU, Amritsar, Punjab, India
Abstract. Various studies have found that software metrics can predict class error proneness. However, these studies focus on the relationship between class error proneness and software metrics during the development phase of software projects, not during a system's post-release evolution. This study focuses on three releases of Javassist, an open-source Java-based software system. This paper describes how we calculated object-oriented metrics to illustrate error-proneness detection. Using FindBugs we collected errors in the post-release system and applied logistic regression to find that some metrics can predict class error proneness in the post-release evolution of a system. We also assessed each model's accuracy by applying it to the data of the other versions.

Keywords: Object oriented metrics, Class error-proneness, Javassist, Findbugs, JColumbus, Together tool.
1 Introduction

It is the dream of every developer to build error-free software. But in spite of software testing, walkthroughs and inspections, post-release maintenance is still required: a few errors remain in the software and can make the developer's life very difficult. It is hard to make changes after the software has been released, but if the probable locations of errors are known in advance, software testers can be helped to locate them easily and effectively. Software metrics are one way to measure software and can be used to locate such errors. One goal of software metrics is to identify and measure the essential parameters that affect software development. Software metrics provide a quantitative basis for the development and validation of models of the software development process, and they can be used to improve software productivity and quality. In this paper, we describe how we calculated object-oriented metrics for error-proneness detection from the source code of the open source software Javassist [12]. We employed statistical methods (i.e. logistic regression) for predicting the error proneness of code.
2 Literature Survey

Various software metrics have been proposed by researchers in different paradigms, and several studies have sought to analyze the connection between object-oriented metrics and code quality ([16], [9], [2]). Chidamber et al. [5] developed and implemented a new set of software metrics for OO designs. These metrics were based on measurement theory and also reflect the viewpoints of experienced OO software developers. They gave a set of six OO metrics (WMC, DIT, RFC, LCOM, NOC, and CBO), where WMC, NOC and DIT reflect the class hierarchy, CBO and RFC reflect class coupling, and LCOM reflects cohesion.

Basili et al. [2] collected data about faults found in object-oriented classes. Based on these data, they verified how much fault-proneness is influenced by internal (e.g., size, cohesion) and external (e.g., coupling) design characteristics of OO classes. From their results, five out of the six CK OO metrics appear to be useful for predicting class fault-proneness during the high- and low-level design phases of the life-cycle.

Chidamber et al. [6] investigated the relationship between the CK metrics and various quality factors: software productivity, rework effort, and design effort. The study also showed that the WMC, RFC, and CBO metrics were highly correlated; therefore, Chidamber et al. did not include these three variables in the regression analysis, to avoid generating coefficient estimates that would be difficult to interpret. The study concluded that there were associations between high CBO values and lower productivity, more rework, and greater design effort.

In another study, Wilkie and Kitchenham [18] validated the relationship between the CBO metric and the change ripple effect in a commercial multimedia conferencing system. The study showed that the CBO metric identified the most change-prone classes, but not the classes that were most exposed to change ripple effects. Cartwright and Shepperd [4] also investigated the relationship between a subset of the CK metrics in a real-time system. The study showed that the parts of the system that used inheritance were three times more error prone than the parts that did not.

Subramanyam and Krishnan [16] validated the WMC, CBO, and DIT metrics as predictors of the error counts in a class in a business-to-consumer commerce system. Their results indicated that the CK metrics can predict error counts. They examined the effect of size along with the WMC, CBO, and DIT values on the faults by using multivariate regression analysis, and concluded that size was a good predictor in both languages, but that WMC and CBO could be validated only for C++.

Alshayeb and Li [1] conducted a study on the relationship between some OO metrics and changes in the source code in two client-server systems and three Java Development Kit (JDK) releases. Three of the CK metrics (WMC, DIT, and LCOM) and three of the Li metrics (NLM, CTA, and CTM) were validated. They found that the OO metrics were effective in predicting design effort and the source lines of code added, changed, and deleted in a short-cycled agile process (the client-server systems); however, the metrics were ineffective predictors of those variables in a long-cycled framework evolution process.

Olague et al. [14] indicated that the CK and QMOOD OO class
metrics suites are useful in developing quality classification models to predict defects in both traditional and highly iterative (agile) software development processes, both for the initial delivery and for multiple, sequential releases.

The above studies suggest that various OO metrics can predict error proneness during development, but their usability for post-release systems is still doubtful. It appears difficult to predict class error proneness in a post-release system, because the system has already passed through various quality tests and only a few exceptional errors remain. In this study we want to know whether software metrics can predict the error proneness of a class in a post-release system. Our study is based on the open source Java project Javassist, a bytecode manipulator. The source code of Javassist is freely available online.
3 Data Collection

We collected errors for three releases of the Javassist project (versions 2.4, 2.6 and 3.0) using the FindBugs [7] tool. FindBugs is a program which uses static analysis to look for bugs in Java code; it produces a list of probable bugs or errors along with their package name, class name and method name. We then collected metrics for all three versions of Javassist using the JColumbus [8] and Together [15] tools. Together is an Eclipse plugin for computing metrics. The metrics used in this study are NM, NA, NOA, NOC, DIT, CBO, RFC, LCOM5, WMC, and TCC (all listed in the appendix). JColumbus provided nine of them (NM, NA, NOA, NOC, DIT, CBO, RFC, LCOM5, WMC); for TCC we used the Together tool. Next we associated errors with each class in the metrics list: a class was marked erroneous if at least one error was found for it, and not erroneous otherwise (a sketch of this labeling step is given after Table 1).

3.1 The Descriptive Statistics of Data

Table 1 shows the distribution of errors and summarizes the total number of errors, the number of error prone classes, the number of classes without errors, and the total number of classes considered in the study. Tables 2-4 summarize the descriptive statistics of the metrics (a sketch of how such statistics can be computed follows Table 4).

Table 1. Distribution of errors based on the error categories
Version         Total no. of errors   No. of error prone classes   No. of classes without errors   Total classes
Javassist 2.4   43                    25                           168                             193
Javassist 2.6   67                    36                           160                             196
Javassist 3.0   78                    46                           192                             238
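As a concrete illustration of the labeling step described above, the following sketch merges a per-class metrics table with a FindBugs report and marks each class as erroneous or not. This is not the authors' actual tooling; the file names, column names and CSV export format are assumptions made for the example.

```python
import pandas as pd

# Hypothetical exports: the paper does not specify intermediate file formats.
metrics = pd.read_csv("javassist_2.4_metrics.csv")  # one row per class: class, NM, NA, ..., TCC
bugs = pd.read_csv("findbugs_2.4_report.csv")       # one row per reported bug: class, bug_type, ...

# Count FindBugs errors per class, then attach the counts to the metrics table.
error_counts = bugs.groupby("class").size().rename("error_count").reset_index()
metrics = metrics.merge(error_counts, on="class", how="left")
metrics["error_count"] = metrics["error_count"].fillna(0).astype(int)

# A class is marked erroneous (1) if at least one error was reported for it, else 0.
metrics["erroneous"] = (metrics["error_count"] >= 1).astype(int)

print(metrics["erroneous"].value_counts())  # error-prone vs. error-free class counts
```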
Table 2. Javassist 2.4 (the 25%, 50% and 75% columns are percentiles)

Metrics   Mean        Standard Deviation   Min   Max   25%    50%    75%
NM        16.82383    22.30751             0     145   4      10     22
NA        22.1399     61.80418             0     248   1      2      4
NOA       0.865285    1.021889             0     5     0      1      1
NOC       0.450777    1.464479             0     10    0      0      0
DIT       0.787565    0.854797             0     4     0      1      1
CBO       5.756477    6.471715             0     30    2      3      8
RFC       16.93782    23.48351             0     147   4      9      19
LCOM5     5.518135    7.891909             0     53    1      3      6
WMC       18.50259    38.17983             0     330   3      6      18
TCC       19.54839    30.74617             0     100   0      0      33
Table 3. Javassist 2.6 (the 25%, 50% and 75% columns are percentiles)

Metrics   Mean        Standard Deviation   Min   Max   25%    50%    75%
NM        17.16837    22.79467             0     147   4      10.5   22.25
NA        22.95408    63.04279             0     249   1      2      4
NOA       0.872449    1.017315             0     5     0      1      1
NOC       0.454082    1.479062             0     10    0      0      0
DIT       0.795918    0.852859             0     4     0      1      1
CBO       5.821429    6.569842             0     30    2      3      8
RFC       17.35385    24.17682             0     151   4      9      19
LCOM5     5.607143    8.01369              0     55    1.75   3      6
WMC       18.79592    39.13224             0     349   4      6      18
TCC       18.89172    29.6024              0     100   0      0      33
Table 4. Javassist 3.0 (the 25%, 50% and 75% columns are percentiles)

Metrics   Mean        Standard Deviation   Min   Max   25%    50%    75%
NM        17.96639    25.28852             0     158   5      11     22.75
NA        22.64286    63.26699             0     310   1      2      4
NOA       0.890756    1.008737             0     5     0      1      1
NOC       0.487395    1.751988             0     13    0      0      0
DIT       0.806723    0.829593             0     4     0      1      1
CBO       6.189076    6.982948             0     36    2      4      8
RFC       18.31933    25.77129             0     180   5      10     19.75
LCOM5     5.57563     7.826208             0     71    2      3      6
WMC       20.0084     40.29239             0     376   4      8      19.75
TCC       22.06842    30.8437              0     100   0      4.5    33
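Descriptive statistics like those in Tables 2-4 can be reproduced directly from a per-class metrics table. A minimal sketch, assuming the `metrics` DataFrame built in the labeling example above (all column names are assumptions):

```python
metric_cols = ["NM", "NA", "NOA", "NOC", "DIT", "CBO", "RFC", "LCOM5", "WMC", "TCC"]

# Mean, standard deviation, min, max and the 25th/50th/75th percentiles per metric,
# laid out one metric per row as in Tables 2-4.
stats = metrics[metric_cols].describe(percentiles=[0.25, 0.50, 0.75]).T
stats = stats[["mean", "std", "min", "max", "25%", "50%", "75%"]]
print(stats.round(2))
```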
We noticed that the NOC metric was zero for 75% of the classes, that the highest DIT value was 4, and that 75% of the classes have only one level of inheritance. To get more insight into the relationship between the classes that have errors and the classes that have no errors, we used error box plot charts for Javassist 2.4. These charts help us compare the two groups (no error/error) and draw conclusions about whether one group is higher (or lower) on average than the other. If the means and their confidence intervals do not overlap, there is a statistically significant difference between the groups; if they overlap, the groups are probably not significantly different in a statistical sense (a sketch of this overlap check follows Fig. 1). The error charts for Javassist 2.4 in Fig. 1 show the means and 95% confidence intervals for all the metrics. All the charts show that, for every metric, the distribution of the classes with errors has more variability than that of the no-error classes. We noticed that the intervals for NM, CBO, RFC, LCOM5 and WMC do not overlap (i.e., the groups differ significantly) and the means for the error group are higher than for the no-error group. The mean of the error group for the DIT metric is lower than that of the no-error group, which is the opposite of what we expected. The means for the NA, NOA, NOC and TCC metrics are higher in the error group than in the no-error group, but the intervals overlap, which means they may not be significantly different.
Fig. 1. Means and CI of all metrics
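The interval overlap comparison behind Fig. 1 can be approximated as below. This is a sketch only: the paper does not state how the 95% intervals were computed, so a normal approximation around each group mean is assumed here, and the `metrics` DataFrame and `metric_cols` list are carried over from the earlier sketches.

```python
import numpy as np
from scipy import stats as st

def mean_ci(values, confidence=0.95):
    """Group mean with a normal-approximation confidence interval (an assumption)."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    half = st.norm.ppf(0.5 + confidence / 2) * values.std(ddof=1) / np.sqrt(len(values))
    return mean, mean - half, mean + half

for col in metric_cols:
    e_mean, e_lo, e_hi = mean_ci(metrics.loc[metrics["erroneous"] == 1, col])
    n_mean, n_lo, n_hi = mean_ci(metrics.loc[metrics["erroneous"] == 0, col])
    overlap = not (e_lo > n_hi or n_lo > e_hi)
    print(f"{col:6s} error mean={e_mean:7.2f}  no-error mean={n_mean:7.2f}  CIs overlap: {overlap}")
```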
4 The Statistical Models

In this study, we used logistic regression to estimate class error probability. Logistic regression predicts the probability of occurrence of an event by fitting data to a logit function. We first used univariate binary regression (UBR) to test whether there was any significant association between an individual metric and class error proneness. We then combined the significant metrics into one set to build the multivariate prediction models for the error probabilities, using multivariate logistic regression (MLR) to predict class error probability. We used 0.05 as the cutoff P-value in both tests [10]. The metrics retained by the univariate tests serve as the independent variables of the MLR analysis, and the binary dependent variable indicates whether a class is erroneous or not. The general MLR model is as follows:
\pi(Y \mid X_1, X_2, \ldots, X_n) = \frac{e^{g(x)}}{1 + e^{g(x)}}    (1)

where g(x) = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_n X_n is the logit function; \pi is the probability of a class being faulty; Y is the dependent variable (a binary variable); and
X_i (1 \le i \le n) are the independent variables, i.e., the OO metrics used as predictors.
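The two-step procedure described above (univariate screening at the 0.05 cutoff, then a multivariate logistic model over the retained metrics) could be realized as follows. This is a sketch under assumptions: the paper does not name its statistical package, so statsmodels is used here, and the `metrics` DataFrame and `metric_cols` list come from the earlier sketches.

```python
import statsmodels.api as sm

y = metrics["erroneous"]

# Univariate binary regression (UBR): keep each metric whose coefficient is
# significant at the 0.05 cutoff used in the paper.
significant = []
for col in metric_cols:
    X = sm.add_constant(metrics[[col]])
    fit = sm.Logit(y, X).fit(disp=0)
    if fit.pvalues[col] < 0.05:
        significant.append(col)

# Multivariate logistic regression (MLR) over the significant metrics:
# pi(Y | X1..Xn) = e^g(x) / (1 + e^g(x)), with g(x) = B0 + B1*X1 + ... + Bn*Xn as in Eq. (1).
X = sm.add_constant(metrics[significant])
mlr = sm.Logit(y, X).fit(disp=0)
print(mlr.summary())

# Predicted error probability per class; applying this fitted model to another
# version's metrics table gives the cross-version accuracy check mentioned in the abstract.
metrics["predicted_prob"] = mlr.predict(X)
```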