Identification of Error Prone Classes for Fault Prediction Using Object Oriented Metrics

Puneet Mittal1, Satwinder Singh1, and K.S. Kahlon2

1 BBSBEC, Fatehgarh Sahib, Punjab, India
2 GNDU, Amritsar, Punjab, India
Abstract. Various studies have found that software metrics can predict class error proneness. However, these studies focus on the relationship between class error proneness and software metrics during the development phase of software projects, not during a system's post-release evolution. This study focuses on three releases of Javassist, an open-source Java-based software system. This paper describes how we calculated object-oriented metrics to illustrate error-proneness detection. Using FindBugs we collected errors in the post-release system and applied logistic regression to find that some metrics can predict class error proneness in the post-release evolution of a system. We also assessed each model's accuracy by applying it to the data of the other versions.

Keywords: Object oriented metrics, Class error-proneness, Javassist, Findbugs, JColumbus, Together tool.
1 Introduction

It is the dream of every developer to build error-free software. But in spite of software testing, walkthroughs and inspections, post-release maintenance is still required: a few errors remain in the software and can make the developer's life very difficult. It is hard to make changes after the software has been released, but if the probable locations of errors are known in advance, software testers can be helped to locate them easily and effectively. Software metrics are one way to measure software and can be used to locate such errors. One goal of software metrics is to identify and measure the essential parameters that affect software development. Software metrics provide a quantitative basis for the development and validation of models of the software development process, and they can be used to improve software productivity and quality. In this paper, we describe how we calculated object-oriented metrics for error-proneness detection from the source code of the open source software Javassist [12]. We employed statistical methods (i.e. logistic regression) for predicting the error proneness of code.
2 Literature Survey

Various software metrics have been proposed by researchers in different paradigms, and several studies have sought to analyze the connection between object-oriented metrics and code quality ([16], [9], [2]). Chidamber et al. [5] developed and implemented a new set of software metrics for OO designs. These metrics were based on measurement theory and also reflect the viewpoints of experienced OO software developers. They gave a set of six OO metrics (WMC, DIT, RFC, LCOM, NOC, and CBO), where WMC, NOC and DIT reflect the class hierarchy, CBO and RFC reflect class coupling, and LCOM reflects cohesion.

Basili et al. [2] collected data about faults found in object-oriented classes. Based on these data, they verified how much fault-proneness is influenced by internal (e.g., size, cohesion) and external (e.g., coupling) design characteristics of OO classes. From their results, five out of the six CK OO metrics appear to be useful for predicting class fault-proneness during the high- and low-level design phases of the life-cycle.

Chidamber et al. [6] investigated the relationship between the CK metrics and various quality factors: software productivity, rework effort, and design effort. The study also showed that the WMC, RFC, and CBO metrics were highly correlated; therefore, Chidamber et al. did not include these three variables in the regression analysis, to avoid generating coefficient estimates that would be difficult to interpret. The study concluded that there were associations between high CBO values and lower productivity, more rework, and greater design effort.

In another study, Wilkie and Kitchenham [18] validated the relationship between the CBO metric and the change ripple effect in a commercial multimedia conferencing system. The study showed that the CBO metric identified the most change-prone classes, but not the classes that were most exposed to change ripple effects. Cartwright and Shepperd [4] also investigated the relationship between a subset of the CK metrics in a real-time system. The study showed that the parts of the system that used inheritance were three times more error prone than the parts that did not.

Subramanyam and Krishnan [16] validated the WMC, CBO, and DIT metrics as predictors of the error counts in a class in a business-to-consumer commerce system. Their results indicated that the CK metrics can predict error counts. They examined the effect of size along with the WMC, CBO, and DIT values on the faults by using multivariate regression analysis, and concluded that size was a good predictor in both languages, but that WMC and CBO could be validated only for C++.

Alshayeb and Li [1] conducted a study on the relationship between some OO metrics and changes in the source code in two client-server systems and three Java Development Kit (JDK) releases. Three of the CK metrics (WMC, DIT, and LCOM) and three of the Li metrics (NLM, CTA, and CTM) were validated. They found that the OO metrics were effective in predicting design effort and the source lines of code added, changed, and deleted in a short-cycled agile process (the client-server systems); however, the metrics were ineffective predictors of those variables in a long-cycled framework evolution process.

Olague et al. [14] indicated that the CK and QMOOD OO class
metrics suites are useful in developing quality classification models to predict defects in both traditional and highly iterative (agile) software development processes, both for the initial delivery and for multiple, sequential releases.

The above studies suggest that various OO metrics can predict error proneness during development, but their usability for post-release systems is still doubtful. It appears difficult to predict class error proneness in a post-release system, because the system has already passed through various quality tests and only a few exceptional errors remain. In this study we want to know whether software metrics can predict the error proneness of a class in a post-release system. Our study is based on the open source Java project Javassist, a bytecode manipulator. The source code of Javassist is freely available online.
3 Data Collection

We collected errors for three releases of the Javassist project (versions 2.4, 2.6 and 3.0) using the FindBugs [7] tool. FindBugs is a program which uses static analysis to look for bugs in Java code; it produces a list of probable bugs or errors along with their package name, class name and method name. We then collected metrics for all three versions of Javassist using the JColumbus [8] and Together [15] tools. Together is an Eclipse plugin for computing metrics. The metrics used in this study are NM, NA, NOA, NOC, DIT, CBO, RFC, LCOM5, WMC, and TCC (all listed in the appendix). JColumbus provided nine of them (NM, NA, NOA, NOC, DIT, CBO, RFC, LCOM5, WMC); for TCC we used the Together tool. Next we associated errors with each class in the metrics list: a class was marked erroneous if at least one error was found for it, and not erroneous otherwise (a sketch of this labeling step is given after Table 1).

3.1 The Descriptive Statistics of Data

Table 1 shows the distribution of errors and summarizes the total number of errors, the number of error prone classes, the number of classes without errors, and the total number of classes considered in the study. Tables 2-4 summarize the descriptive statistics of the metrics (a sketch of how such statistics can be computed follows Table 4).

Table 1. Distribution of errors based on the error categories
Version         Total no. of errors   No. of error prone classes   No. of classes without errors   Total classes
Javassist 2.4   43                    25                           168                             193
Javassist 2.6   67                    36                           160                             196
Javassist 3.0   78                    46                           192                             238
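As a concrete illustration of the labeling step described above, the following sketch merges a per-class metrics table with a FindBugs report and marks each class as erroneous or not. This is not the authors' actual tooling; the file names, column names and CSV export format are assumptions made for the example.

```python
import pandas as pd

# Hypothetical exports: the paper does not specify intermediate file formats.
metrics = pd.read_csv("javassist_2.4_metrics.csv")  # one row per class: class, NM, NA, ..., TCC
bugs = pd.read_csv("findbugs_2.4_report.csv")       # one row per reported bug: class, bug_type, ...

# Count FindBugs errors per class, then attach the counts to the metrics table.
error_counts = bugs.groupby("class").size().rename("error_count").reset_index()
metrics = metrics.merge(error_counts, on="class", how="left")
metrics["error_count"] = metrics["error_count"].fillna(0).astype(int)

# A class is marked erroneous (1) if at least one error was reported for it, else 0.
metrics["erroneous"] = (metrics["error_count"] >= 1).astype(int)

print(metrics["erroneous"].value_counts())  # error-prone vs. error-free class counts
```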
Table 2. Javassist 2.4 (the 25%, 50% and 75% columns are percentiles)

Metrics   Mean        Standard Deviation   Min   Max   25%    50%    75%
NM        16.82383    22.30751             0     145   4      10     22
NA        22.1399     61.80418             0     248   1      2      4
NOA       0.865285    1.021889             0     5     0      1      1
NOC       0.450777    1.464479             0     10    0      0      0
DIT       0.787565    0.854797             0     4     0      1      1
CBO       5.756477    6.471715             0     30    2      3      8
RFC       16.93782    23.48351             0     147   4      9      19
LCOM5     5.518135    7.891909             0     53    1      3      6
WMC       18.50259    38.17983             0     330   3      6      18
TCC       19.54839    30.74617             0     100   0      0      33
Table 3. Javassist 2.6 (the 25%, 50% and 75% columns are percentiles)

Metrics   Mean        Standard Deviation   Min   Max   25%    50%    75%
NM        17.16837    22.79467             0     147   4      10.5   22.25
NA        22.95408    63.04279             0     249   1      2      4
NOA       0.872449    1.017315             0     5     0      1      1
NOC       0.454082    1.479062             0     10    0      0      0
DIT       0.795918    0.852859             0     4     0      1      1
CBO       5.821429    6.569842             0     30    2      3      8
RFC       17.35385    24.17682             0     151   4      9      19
LCOM5     5.607143    8.01369              0     55    1.75   3      6
WMC       18.79592    39.13224             0     349   4      6      18
TCC       18.89172    29.6024              0     100   0      0      33
Table 4. Javassist 3.0 (the 25%, 50% and 75% columns are percentiles)

Metrics   Mean        Standard Deviation   Min   Max   25%    50%    75%
NM        17.96639    25.28852             0     158   5      11     22.75
NA        22.64286    63.26699             0     310   1      2      4
NOA       0.890756    1.008737             0     5     0      1      1
NOC       0.487395    1.751988             0     13    0      0      0
DIT       0.806723    0.829593             0     4     0      1      1
CBO       6.189076    6.982948             0     36    2      4      8
RFC       18.31933    25.77129             0     180   5      10     19.75
LCOM5     5.57563     7.826208             0     71    2      3      6
WMC       20.0084     40.29239             0     376   4      8      19.75
TCC       22.06842    30.8437              0     100   0      4.5    33
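Descriptive statistics like those in Tables 2-4 can be reproduced directly from a per-class metrics table. A minimal sketch, assuming the `metrics` DataFrame built in the labeling example above (all column names are assumptions):

```python
metric_cols = ["NM", "NA", "NOA", "NOC", "DIT", "CBO", "RFC", "LCOM5", "WMC", "TCC"]

# Mean, standard deviation, min, max and the 25th/50th/75th percentiles per metric,
# laid out one metric per row as in Tables 2-4.
stats = metrics[metric_cols].describe(percentiles=[0.25, 0.50, 0.75]).T
stats = stats[["mean", "std", "min", "max", "25%", "50%", "75%"]]
print(stats.round(2))
```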
We noticed that the NOC metric was zero for 75% of the classes, that the highest DIT value was 4, and that 75% of the classes have only one level of inheritance. To get more insight into the relationship between the classes that have errors and the classes that have no errors, we used error box plot charts for Javassist 2.4. These charts help us compare the two groups (no error/error) and draw conclusions about whether one group is higher (or lower) on average than the other. If the means and their confidence intervals do not overlap, there is a statistically significant difference between the groups; if they overlap, the groups are probably not significantly different in a statistical sense (a sketch of this overlap check follows Fig. 1). The error charts for Javassist 2.4 in Fig. 1 show the means and 95% confidence intervals for all the metrics. All the charts show that, for every metric, the distribution of the classes with errors has more variability than that of the no-error classes. We noticed that the intervals for NM, CBO, RFC, LCOM5 and WMC do not overlap (i.e., the groups differ significantly) and the means for the error group are higher than for the no-error group. The mean of the error group for the DIT metric is lower than that of the no-error group, which is the opposite of what we expected. The means for the NA, NOA, NOC and TCC metrics are higher in the error group than in the no-error group, but the intervals overlap, which means they may not be significantly different.
Fig. 1. Means and CI of all metrics
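The interval overlap comparison behind Fig. 1 can be approximated as below. This is a sketch only: the paper does not state how the 95% intervals were computed, so a normal approximation around each group mean is assumed here, and the `metrics` DataFrame and `metric_cols` list are carried over from the earlier sketches.

```python
import numpy as np
from scipy import stats as st

def mean_ci(values, confidence=0.95):
    """Group mean with a normal-approximation confidence interval (an assumption)."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    half = st.norm.ppf(0.5 + confidence / 2) * values.std(ddof=1) / np.sqrt(len(values))
    return mean, mean - half, mean + half

for col in metric_cols:
    e_mean, e_lo, e_hi = mean_ci(metrics.loc[metrics["erroneous"] == 1, col])
    n_mean, n_lo, n_hi = mean_ci(metrics.loc[metrics["erroneous"] == 0, col])
    overlap = not (e_lo > n_hi or n_lo > e_hi)
    print(f"{col:6s} error mean={e_mean:7.2f}  no-error mean={n_mean:7.2f}  CIs overlap: {overlap}")
```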
4 The Statistical Models

In this study, we used logistic regression to estimate class error probability. Logistic regression predicts the probability of occurrence of an event by fitting data to a logit function. We first used univariate binary regression (UBR) to test whether there was any significant association between an individual metric and class error proneness. We then combined the significant metrics into one set to build the multivariate prediction models for the error probabilities, using multivariate logistic regression (MLR) to predict class error probability. We used 0.05 as the cutoff P-value in both tests [10]. The metrics retained by the univariate tests serve as the independent variables of the MLR analysis, and the binary dependent variable indicates whether a class is erroneous or not. The general MLR model is as follows:
\pi(Y \mid X_1, X_2, \ldots, X_n) = \frac{e^{g(x)}}{1 + e^{g(x)}}    (1)

where g(x) = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_n X_n is the logit function; \pi is the probability of a class being faulty; Y is the dependent variable (a binary variable); and
X_i (1 \le i \le n) are the independent variables, i.e., the OO metrics used as predictors.
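The two-step procedure described above (univariate screening at the 0.05 cutoff, then a multivariate logistic model over the retained metrics) could be realized as follows. This is a sketch under assumptions: the paper does not name its statistical package, so statsmodels is used here, and the `metrics` DataFrame and `metric_cols` list come from the earlier sketches.

```python
import statsmodels.api as sm

y = metrics["erroneous"]

# Univariate binary regression (UBR): keep each metric whose coefficient is
# significant at the 0.05 cutoff used in the paper.
significant = []
for col in metric_cols:
    X = sm.add_constant(metrics[[col]])
    fit = sm.Logit(y, X).fit(disp=0)
    if fit.pvalues[col] < 0.05:
        significant.append(col)

# Multivariate logistic regression (MLR) over the significant metrics:
# pi(Y | X1..Xn) = e^g(x) / (1 + e^g(x)), with g(x) = B0 + B1*X1 + ... + Bn*Xn as in Eq. (1).
X = sm.add_constant(metrics[significant])
mlr = sm.Logit(y, X).fit(disp=0)
print(mlr.summary())

# Predicted error probability per class; applying this fitted model to another
# version's metrics table gives the cross-version accuracy check mentioned in the abstract.
metrics["predicted_prob"] = mlr.predict(X)
```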