
On Building a Better Program Size Measure

Akito Monden (1), Shinji Uchida (2), Ken-ichi Matsumoto (1)

(1) Graduate School of Information Science, Nara Institute of Science and Technology, {akito-m, matumoto}@is.naist.jp

(2) Department of Information Engineering, Nara National College of Technology, [email protected]

Abstract: Source Lines of Code (SLOC) is one of the most basic and widely used program size measures in software project management and quality assurance, although it depends greatly on the programmer who implemented the program. To build a better (i.e. programmer-independent) program size measure, this paper analyzed 9 independently built C programs of the same functional specification, and found that 3 base measures (the number of tokens, tokens of code clones, and function parameters) are useful for eliminating programmer-dependent aspects of SLOC. A new size measure called Adjusted Length of Code (ALOC), built upon these 3 base measures, varied by at most a factor of 1.22 among the 9 programs, while SLOC varied by a factor of 3.16. Furthermore, ALOC varied by at most a factor of 1.60 among another 6 independently built programs of an alternative specification, while SLOC varied by a factor of 4.66 among these programs. These results suggest that the new measure ALOC can reduce the programmer-dependent aspects of program size and can be used as a better size measure in project management.

Keywords: Product Metrics, Source Lines of Code, Program Analysis

1 Introduction

From the earliest days of software development to the present, Source Lines of Code (SLOC; usually not including comments and blank lines) has been one of the most basic measures of a software product. Various SLOC-based process measures have been defined and used for project management and quality assurance, such as defect density (defects per SLOC), test case density (test cases per SLOC), and productivity (SLOC per person-hour). However, SLOC has a well-known and serious flaw: it depends greatly on the programmer who implemented the program. Even when the functional specification is the same, some programmers require many more SLOC than others to implement it. Nevertheless, software companies keep using SLOC-based process measures because there is no useful alternative to SLOC.

IWSM/MetriKon 2010

Recently, Function Point (FP) has been proposed as a functional size measure, and has been successfully used as a basis for cost estimation, effort allocation, project scheduling, and so on. However, the use of FP is usually limited to the early phase of software development, since measuring FP requires additional effort. On the other hand, a simple product size measure like SLOC, which can be measured continuously and at low cost throughout development, is still needed, because project managers need to track the growth of the product size to make sure that development is on course.

This paper tries to build a programmer-independent program size measure based on observations of independently built programs of the same functional specification.

2 REQUIREMENTS FOR THE SIZE MEASURE

This section clarifies the requirements that the program size measure needs to satisfy. Let P be a program, S be the functional specification of P, and H be the implementer (programmer) of P. We write P(S, H) for the program implemented by H based on S. Then the Adjusted Length of Code ALOC(P), which is the programmer-independent size of P, should satisfy the following requirement.

[Req. 1] For an arbitrary S, Hi and Hj (i ≠ j), ALOC(P(S, Hi)) ≈ ALOC(P(S, Hj))

This requirement means that two independently built programs of the same specification should have almost the same size. Obviously, conventional SLOC does not satisfy this requirement. In addition to Req. 1, we need the following requirement, because any function that returns a constant value satisfies Req. 1.

[Req. 2] For arbitrary P and Q of different specifications, ALOC(P║Q) ≈ ALOC(P) + ALOC(Q)

where P║Q is the program obtained by concatenating Q to P. For practical use of the size measure, we also impose the following requirement.

[Req. 3] For an arbitrary P, ALOC(P) ≈ E(SLOC(P))

where SLOC(P) is the SLOC of P and E(x) is the expected value of x. This requirement means that ALOC(P) nearly equals the average SLOC of P's potential implementations. By satisfying Req. 3, project managers can directly substitute ALOC for SLOC as a better program size measure. For example, if a company has a productivity baseline of "10 SLOC per person-hour", it can be replaced by "10 ALOC per person-hour".
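The reasoning behind Req. 2 can be illustrated with a small sketch. It is not part of the paper's method: the function names, the tolerance values, and the stand-in "programs" are all hypothetical. It only shows that a measure returning a constant passes a Req. 1 style check while failing a Req. 2 style check.

```python
def spread_ratio(sizes):
    """Largest size divided by smallest size; 1.0 means perfect agreement."""
    return max(sizes) / min(sizes)


def satisfies_req1(sizes, tolerance):
    """Req. 1 style check: sizes of same-spec programs differ by at most `tolerance` times."""
    return spread_ratio(sizes) <= tolerance


def satisfies_req2(size_p, size_q, size_pq, rel_tol):
    """Req. 2 style check: size of the concatenation is close to the sum of the sizes."""
    expected = size_p + size_q
    return abs(size_pq - expected) <= rel_tol * expected


def constant_measure(_program):
    """A degenerate measure (hypothetical) that ignores the program entirely."""
    return 100.0


# Stand-ins for three implementations of one specification (hypothetical).
implementations = ["impl_by_H1", "impl_by_H2", "impl_by_H3"]
sizes = [constant_measure(p) for p in implementations]

# Tolerances are illustrative assumptions, not values from the paper.
req1_ok = satisfies_req1(sizes, tolerance=1.25)
req2_ok = satisfies_req2(
    constant_measure("P"),
    constant_measure("Q"),
    constant_measure("P" + "Q"),
    rel_tol=0.1,
)

print("constant measure satisfies Req. 1:", req1_ok)
print("constant measure satisfies Req. 2:", req2_ok)
```

The same `spread_ratio` helper corresponds to the "times difference" statistic the paper reports, for example 366 / 116 for the smallest and largest SLOC values.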

Software Measurement Conference

3 ANALYSIS OF INDEPENDENTLY-BUILT PROGRAMS

3.1 Materials

To identify a set of source code measures that can be used to eliminate programmer-dependent aspects of SLOC, this section analyzes 9 independently built C programs of the same functional specification. The functional specification, programmers and measurement tools are as follows.

[Specification] The program solves the 8-puzzle (a sliding-block puzzle on a 3×3 board) using a breadth-first search algorithm. It reads the initial state of the board from standard input, finds a solution, and outputs the sequence of states from the initial state to the goal. To avoid state explosion, the program must not inspect any state that has already appeared earlier in the search.

[Programmers] 9 master's course students of Nara Institute of Science and Technology (NAIST).

[Measurement tools] Resource Standard Metrics [4] was used for SLOC measurement. CCFinderX [1][3] was used for code clone measurement. Token Extractor [7] was used for token measurement.

Table 1 shows the characteristics of the 9 programs PA, PB, …, PI. The "Programming Tips" column lists the programming tips, algorithms or data structures used in each program. The "Line feed" column indicates whether the programmer overuses line feeds. "SLOC" is Source Lines of Code, not including comments and blank lines. "Functions" is the number of functions in a program. "Function parameters" is the total number of parameters of all functions. "Tokens" is the number of tokens in a program. "Token Types" is the number of unique tokens in a program. "Tokens of code clone" is the number of tokens covered by any code clone (duplicated code portion). "Coverage of code clone" is the percentage of tokens covered by any code clone (i.e. "Tokens of code clone" divided by "Tokens").

As shown in Table 1, SLOC varies from 116 to 366 (a factor of 3.16). This supports our intuition that SLOC depends greatly on the programmer who implemented the program.

3.2 Analysis

3.2.1 Programming Style

As shown in Table 1, three programs (PD, PE and PF) overuse line feeds (or carriage returns). All of these programs place a line feed before and after every brace "{" and "}", which increases SLOC. Also, some programmers use more line feeds (LFs) when declaring variables, e.g. writing "int i; [LF] int first; [LF] int move;" instead of writing "int i, first, move;" on one line. Such differences in programming style mean that SLOC varies among implementations even if their specification, algorithms and programming tips are the same. To lessen the effect of programming style on program size, we decided to use the number of tokens, rather than the number of lines, as the basis of the program size measure.

P   Programming Tips             Line feed  SLOC  Functions  Function parameters  Tokens  Token Types  Tokens of code clone  Coverage of code clone
PA  queue, adjacency list, hash             165   9          7                    991     114          57                    5.8%
PB  queue, hash                             203   8          7                    1406    154          351                   25.0%
PC                                          119   4          2                    883     100          153                   17.4%
PD  3-dimension array            Yes        329   1          0                    3342    107          2439                  73.0%
PE                               Yes        116   5          2                    782     101          39                    5.0%
PF                               Yes        159   4          1                    1090    109          312                   28.7%
PG                                          135   4          2                    965     119          130                   13.5%
PH  queue, hash                             366   22         25                   2522    209          968                   38.4%
PI                                          168   8          8                    1274    138          254                   20.0%

Table 1: Characteristics of 8-puzzle programs

3.2.2 Granularity of modules

As shown in Table 1, some programs had more fine-grained modules than others. For example, program PH had 22 functions and 25 function parameters (fine-grained), while the others had at most 9 functions and 8 function parameters (coarse-grained). Since defining functions and passing arguments require a significant number of code lines and tokens, the granularity of modules significantly affects program size. Indeed, PH was the largest in SLOC because of its many functions and parameters. To evaluate the effect of module granularity on program size, we use the number of functions and the number of function parameters as further base measures.
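The style effect described in Section 3.2.1 can be made concrete with a short sketch that counts non-blank lines and tokens for two layouts of the same C code. The tokenizer here is a deliberately simplified assumption (identifiers, integer literals, and single punctuation characters); it is not the Token Extractor tool used in the paper and does not handle multi-character operators, strings, or comments.

```python
import re

# Simplified C tokenizer (assumption): identifiers, integer literals,
# and single punctuation characters.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def count_sloc(source):
    """Count non-blank lines (the snippets below contain no comments)."""
    return sum(1 for line in source.splitlines() if line.strip())


def count_tokens(source):
    """Count tokens using the simplified pattern above."""
    return len(TOKEN_PATTERN.findall(source))


# Same statement, two brace styles.
compact_braces = "if (x) {\n    y();\n}\n"
spread_braces = "if (x)\n{\n    y();\n}\n"

# The declaration example quoted in Section 3.2.1.
one_line_decl = "int i, first, move;\n"
three_line_decl = "int i;\nint first;\nint move;\n"

for label, source in [
    ("compact braces", compact_braces),
    ("spread braces", spread_braces),
    ("one-line declaration", one_line_decl),
    ("three-line declaration", three_line_decl),
]:
    print(f"{label}: SLOC={count_sloc(source)}, tokens={count_tokens(source)}")
```

In this sketch the brace placement changes the line count while leaving the token count untouched, and the declaration style triples the line count while adding only two tokens, which is the motivation for a token-based measure.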


3.2.3 Code clone

As shown in Table 1, some programs had many more code clones (duplicated code portions) than others, and this contributed to the variation in SLOC among programs. For example, the coverage of code clone of program PD was 73%, while that of PE was only 5%. Indeed, PD was the largest in tokens because of its many code clones. To evaluate the effect of code clones, we use tokens of code clone and coverage of code clone as further base measures.

3.2.4 Algorithms and programming tips

As shown in the "Programming Tips" column of Table 1, the programming tips (including algorithms and data structures) used varied among programs. The three largest programs (PB, PD and PH) did not use the "adjacency list" technique, which enables programmers to write compact code. This indicates that a lack of proper programming tips can increase program size. Although it is difficult to capture the lack of proper programming tips directly with any source code measure, code clone measures could be used as indirect measures. As shown in Table 1, all three programs PB, PD and PH had a significant amount of code clones. A lack of proper programming tips can be expected to increase the repetition of similar code fragments and thus increase the code clone measures.
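The clone figures referred to above can be re-derived from the Tokens and Tokens of code clone columns of Table 1. The following is a minimal sketch; the variable and function names are illustrative, and the small differences from the reported percentages are presumably due to rounding in the original table.

```python
# (program, Tokens, Tokens of code clone, Coverage of code clone reported in Table 1)
TABLE_1 = [
    ("PA", 991, 57, 5.8),
    ("PB", 1406, 351, 25.0),
    ("PC", 883, 153, 17.4),
    ("PD", 3342, 2439, 73.0),
    ("PE", 782, 39, 5.0),
    ("PF", 1090, 312, 28.7),
    ("PG", 965, 130, 13.5),
    ("PH", 2522, 968, 38.4),
    ("PI", 1274, 254, 20.0),
]


def clone_coverage_percent(tokens, clone_tokens):
    """Coverage of code clone: clone tokens as a percentage of all tokens."""
    return 100.0 * clone_tokens / tokens


def top_three(rows, column):
    """Names of the three programs with the largest value in `column`, sorted by name."""
    ranked = sorted(rows, key=lambda row: row[column], reverse=True)
    return sorted(row[0] for row in ranked[:3])


for name, tokens, clone_tokens, reported in TABLE_1:
    computed = clone_coverage_percent(tokens, clone_tokens)
    print(f"{name}: computed {computed:.2f}%, reported {reported:.1f}%")

largest_by_tokens = top_three(TABLE_1, 1)
largest_by_clone_tokens = top_three(TABLE_1, 2)
print("largest by tokens:", largest_by_tokens)
print("largest by clone tokens:", largest_by_clone_tokens)
```

In Table 1 the three programs with the most tokens are also the three with the most clone tokens, which is consistent with using clone measures as an indirect indicator.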

4 DEFINITION AND DERIVATION OF SIZE MEASURE

Based on the analysis in the previous section, we decided to use Tokens, Functions, Function parameters, Tokens of code clone, and Coverage of code clone in Table 1 as base measures for building the programmer-independent size measure ALOC. We employ a plain linear regression model (equation (1) below) to build the ALOC measure, where the candidate predictor variables are the base measures.

$$\mathrm{ALOC} = k_1 N_1 + k_2 N_2 + \cdots + k_n N_n + C \qquad (1)$$

ALOC: Adjusted Length of Code
N_j: predictor variables
k_j: partial regression coefficients
C: constant

To estimate the regression coefficients k_j and the constant C of equation (1), we need to prepare the fit dataset carefully so that the resulting regression model satisfies Req. 1, 2 and 3 in Section 2. To satisfy Req. 2, we need concatenated programs P║Q where P and Q have different specifications. However, since all programs in Table 1 have the same specification, we instead prepared 9 double-sized programs PA║PA, PB║PB, PC║PC, …, PI║PI, so that we can expect ALOC(Px║Px) to be 2×ALOC(Px), ignoring the code clone pairs between the former and latter halves of Px║Px. As desired outputs, we gave the average SLOC of PA, …, PI to all original-size programs (to satisfy Req. 1 and 3) and twice the average SLOC to the concatenated programs Px║Px.

Next, we selected the set of base measures to be included as predictor variables of equation (1). Considering that equation (1) combines predictor variables additively to compute ALOC, we selected "tokens of code clones" instead of the ratio measure "coverage of code clones". We then excluded the measure "Functions", since it had a very high correlation (0.984) with "Function parameters". As a result, three base measures (tokens, tokens of code clones, and function parameters) were used as predictor variables.

Table 2 shows the resultant regression model. We confirmed that all coefficients and the constant are statistically significant (p < 0.01). This regression model is our program size measure ALOC.

Predictor variable      Regression coefficient  Standardized regression coefficient
Tokens                  .393                    3.98
Function parameters     -12.2                   -.978
Tokens of code clones   -.422                   -3.36
(Constant)              15.5

Table 2: Resultant regression model

Figure 1 shows how the ALOC values computed by the model (y-axis) fit the desired outputs (x-axis). As shown in Figure 1, the computed ALOC satisfies Req. 1-3 for all programs PA, …, PI. Table 3 compares SLOC, desired ALOC (i.e. average SLOC) and computed ALOC for programs PA, …, PI. While SLOC varied from 116 to 366 (a factor of 3.16), computed ALOC varied from 274 to 335, a factor of only 1.22.
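The linear form of equation (1) has a direct consequence for Req. 2, which can be worked out as follows. This is a sketch under one assumption that the text makes only for the doubled programs: every predictor is additive under concatenation, i.e. N_j(P║Q) = N_j(P) + N_j(Q), which is what ignoring clone pairs across the two parts amounts to.

```latex
\begin{align*}
\mathrm{ALOC}(P \parallel Q)
  &= \sum_{j=1}^{n} k_j \bigl( N_j(P) + N_j(Q) \bigr) + C \\
  &= \Bigl( \sum_{j=1}^{n} k_j N_j(P) + C \Bigr)
   + \Bigl( \sum_{j=1}^{n} k_j N_j(Q) + C \Bigr) - C \\
  &= \mathrm{ALOC}(P) + \mathrm{ALOC}(Q) - C
\end{align*}
```

Under that assumption the model departs from exact additivity only by the constant C. With C = 15.5 from Table 2 and ALOC values near 296, the gap is roughly 15.5 / 592, or about 2.6% of the summed size.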


P    SLOC  ALOC (desired)  ALOC (computed)
PA   165   296             296
PB   203   296             335
PC   119   296             274
PD   329   296             301
PE   116   296             282
PF   159   296             300
PG   135   296             316
PH   366   296             294
PI   168   296             312

Table 3: Comparison between SLOC and ALOC of 8-puzzle programs
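To see how the three terms of the regression model combine, the following sketch applies the Table 2 coefficients to the Table 1 measures and prints each term's contribution. The names are illustrative, and because the published coefficients are rounded to three significant digits, the results can differ from the "ALOC (computed)" column by a unit or so.

```python
# Coefficients and constant as published in Table 2 (rounded values).
COEFFICIENTS = {
    "tokens": 0.393,
    "function_parameters": -12.2,
    "clone_tokens": -0.422,
}
CONSTANT = 15.5

# Measures from Table 1 and the computed ALOC reported in Table 3.
PUZZLE_PROGRAMS = {
    "PA": {"tokens": 991, "function_parameters": 7, "clone_tokens": 57, "reported_aloc": 296},
    "PB": {"tokens": 1406, "function_parameters": 7, "clone_tokens": 351, "reported_aloc": 335},
    "PC": {"tokens": 883, "function_parameters": 2, "clone_tokens": 153, "reported_aloc": 274},
    "PD": {"tokens": 3342, "function_parameters": 0, "clone_tokens": 2439, "reported_aloc": 301},
    "PE": {"tokens": 782, "function_parameters": 2, "clone_tokens": 39, "reported_aloc": 282},
    "PF": {"tokens": 1090, "function_parameters": 1, "clone_tokens": 312, "reported_aloc": 300},
    "PG": {"tokens": 965, "function_parameters": 2, "clone_tokens": 130, "reported_aloc": 316},
    "PH": {"tokens": 2522, "function_parameters": 25, "clone_tokens": 968, "reported_aloc": 294},
    "PI": {"tokens": 1274, "function_parameters": 8, "clone_tokens": 254, "reported_aloc": 312},
}


def aloc_terms(measures):
    """Return each predictor's contribution k_j * N_j, plus the constant term."""
    terms = {name: k * measures[name] for name, k in COEFFICIENTS.items()}
    terms["constant"] = CONSTANT
    return terms


def aloc(measures):
    """ALOC as the sum of all terms of equation (1)."""
    return sum(aloc_terms(measures).values())


for name, measures in PUZZLE_PROGRAMS.items():
    terms = aloc_terms(measures)
    breakdown = ", ".join(f"{term}={value:.1f}" for term, value in terms.items())
    print(f"{name}: ALOC={aloc(measures):.1f} (reported {measures['reported_aloc']}); {breakdown}")
```

The breakdown makes the adjustment visible: for PD, the large token term is mostly cancelled by the clone term, and for PH the parameter term removes several hundred units.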

Figure 1: Scatter plot of desired ALOC (x-axis) versus computed ALOC (y-axis); both axes range from 0 to 700

5 EVALUATION OF SIZE MEASURE

5.1 Experiment with Alternative Specification

To evaluate the generality of the ALOC measure (i.e. the regression model) derived in the previous section, we measured another 6 independently built C programs PS, PT, …, PX of an alternative specification. The functional specification of these programs is to translate a text stream using Huffman coding. The 6 programmers were 1 faculty member and 5 master's course students (all came from software companies) of NAIST.

Table 4 shows the result of the experiment. While SLOC varied from 107 to 499 (a factor of 4.66), ALOC varied from 165 to 264, a factor of only 1.60. This suggests that the ALOC measure derived in Section 4 can reduce the programmer-dependent aspects of program size for a different specification, and can be used as a better size measure in project management.

P    SLOC  Function parameters  Tokens  Tokens of code clone  ALOC
PS   334   7                    2135    1433                  165
PT   153   5                    839     204                   198
PU   499   0                    2411    1656                  264
PV   107   0                    625     132                   205
PW   130   3                    730     93                    226
PX   113   1                    786     289                   190

Table 4: Source code measures of Huffman coding programs
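The ALOC column of Table 4 can be checked by applying the Table 2 regression model to the other columns. The sketch below does this and compares the spread of SLOC with the spread of ALOC. Names are illustrative; since the published coefficients are rounded, recomputed values and the resulting spread ratio may differ slightly from the reported figures.

```python
def aloc(tokens, function_parameters, clone_tokens):
    """Regression model of Table 2 (published, rounded coefficients)."""
    return 0.393 * tokens - 12.2 * function_parameters - 0.422 * clone_tokens + 15.5


def spread_ratio(values):
    """Largest value divided by smallest value."""
    return max(values) / min(values)


# (program, SLOC, Function parameters, Tokens, Tokens of code clone, ALOC reported in Table 4)
HUFFMAN_PROGRAMS = [
    ("PS", 334, 7, 2135, 1433, 165),
    ("PT", 153, 5, 839, 204, 198),
    ("PU", 499, 0, 2411, 1656, 264),
    ("PV", 107, 0, 625, 132, 205),
    ("PW", 130, 3, 730, 93, 226),
    ("PX", 113, 1, 786, 289, 190),
]

recomputed = {}
for name, sloc, params, tokens, clone_tokens, reported in HUFFMAN_PROGRAMS:
    recomputed[name] = aloc(tokens, params, clone_tokens)
    print(f"{name}: SLOC={sloc}, ALOC recomputed={recomputed[name]:.1f}, reported={reported}")

sloc_spread = spread_ratio([row[1] for row in HUFFMAN_PROGRAMS])
reported_aloc_spread = spread_ratio([row[5] for row in HUFFMAN_PROGRAMS])
recomputed_aloc_spread = spread_ratio(list(recomputed.values()))

print(f"SLOC spread: {sloc_spread:.2f}")
print(f"ALOC spread (reported values): {reported_aloc_spread:.2f}")
print(f"ALOC spread (recomputed values): {recomputed_aloc_spread:.2f}")
```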

5.2 Threats to Validity

Here we discuss threats to the validity of our work. We used only two functional specifications, with 9 and 6 implementations respectively. We need to analyze other programs and other specifications in future work. There are also other programming factors that can affect program size. For example, failing to use appropriate standard libraries can increase program size, because the programmer then needs to write the additional functionality. We need to consider such factors in future research.

6 RELATED WORK

Software companies often use logical SLOC (which counts the number of statements rather than source lines) instead of physical SLOC to reduce the influence of programming style. We calculated logical SLOC for the 9 programs of Table 1. Logical SLOC still varied from 56 to 235 (a factor of 4.20). Therefore, logical SLOC is insufficient for reducing the programmer-dependent aspects of program size.

Halstead proposed a program size measure called "Volume" that takes the size of the vocabulary into account [2]. Halstead's Volume is given by N×log2(n), where N is the total number of tokens and n is the number of unique tokens. We calculated Volume for the 9 programs of Table 1. Volume varied from 5207 to 22530 (a factor of 4.33). Therefore, Volume is not useful as a programmer-independent size measure.

Kusumoto et al. attempted to measure Function Point from source code in a specific application domain [5][6]. This can be an alternative approach to achieving our goal. Since that research is still in progress, further research is required before practical use.
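The Halstead Volume range quoted in this section can be re-derived from Table 1. The sketch below assumes that the "Tokens" column is N and the "Token Types" column is n; the names are illustrative.

```python
import math

# (program, Tokens N, Token Types n) from Table 1
TABLE_1_TOKENS = [
    ("PA", 991, 114),
    ("PB", 1406, 154),
    ("PC", 883, 100),
    ("PD", 3342, 107),
    ("PE", 782, 101),
    ("PF", 1090, 109),
    ("PG", 965, 119),
    ("PH", 2522, 209),
    ("PI", 1274, 138),
]


def halstead_volume(total_tokens, unique_tokens):
    """Halstead's Volume: N * log2(n)."""
    return total_tokens * math.log2(unique_tokens)


volumes = {name: halstead_volume(n_total, n_unique)
           for name, n_total, n_unique in TABLE_1_TOKENS}

smallest = min(volumes.values())
largest = max(volumes.values())

for name, volume in volumes.items():
    print(f"{name}: Volume = {volume:.0f}")
print(f"Volume spread: {largest / smallest:.2f}")
```

Because Volume grows almost linearly with the token count, it inherits the variation caused by code clones, which is why PD dominates here as it does in the Tokens column.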

7 Conclusion

To reduce the programmer-dependent aspects of program size, this paper first defined three requirements for a programmer-independent size measure. From an analysis of 9 independently built C programs of the same functional specification, we found that 3 base measures (tokens, tokens of code clones, and function parameters) are useful for eliminating programmer-dependent aspects of SLOC. With a new size measure, Adjusted Length of Code (ALOC), built upon these 3 base measures, the variation in size among these 9 programs was greatly reduced. To evaluate the generality of the ALOC measure, we also measured another 6 independently built C programs of an alternative specification. The result showed that the ALOC measure is effective for a different specification and can be used as a better size measure in project management. ALOC can be measured automatically from source code, and since it satisfies Req. 3 of Section 2, project managers can easily substitute ALOC for SLOC as a better program size measure. In the future, it is necessary to improve our measure based on analyses of other programs and other functional specifications.

Acknowledgement

Part of this work was conducted in the StagE Project, the Development of Next Generation IT Infrastructure, supported by the Ministry of Education, Culture, Sports, Science and Technology. Also, part of this work was conducted under Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (C) (22500028).

References

1. CCFinderX, http://www.ccfinder.net/
2. Halstead, M.H., "Elements of Software Science (Operating and programming systems series)", Elsevier Science Inc., New York, 1977.
3. Kamiya, T., Kusumoto, S., and Inoue, K., "CCFinder: A Multi-Linguistic Token-based Code Clone Detection System for Large Scale Source Code," IEEE Trans. Software Engineering, vol. 28, no. 7, pp. 654-670, 2002.
4. Resource Standard Metrics, http://msquaredtechnologies.com/m2rsm/
5. Shinji Kusumoto, Takuto Edagawa, and Yoshiki Higo, "On an Automatic Function Point Measurement from Source Codes," In 2nd Workshop on Accountability and Traceability in Global Software Engineering (ATGSE2008), pp. 27-28, Dec. 2008.
6. Shinji Kusumoto, Masahiro Imagawa, Katsuro Inoue, Shuuma Morimoto, Kouji Matsushita, Michio Tsuda, "Function Point Measurement from Java Program," In Proc. 24th International Conference on Software Engineering (ICSE2002), pp. 576-582, May 2002.
7. Token Extractor for C/C++ Programs, http://www.vector.co.jp/soft/winnt/prog/se482039.html

