SOFTWARE VERIFICATION AND VALIDATION

Regression Test Selection and Prioritization Using Variables: Analysis and Experimentation

Yogesh Singh, Arvinder Kaur, and Bharti Suri, G.G.S.I.P. University, Delhi

Various techniques have been proposed for reducing test suite size without compromising the suite's effectiveness in regression testing. This article presents a hybrid technique, using a variable-based method, that combines both selection and prioritization. It considers source code changes and coverage information with respect to each test case. Variables are the vital source of changes in a program, and the method captures the effect of changes in terms of variable computations. The proposed hybrid approach for test case selection and prioritization is compared with the test selection technique described by Rothermel (1996). Statement coverage, branch coverage, fault coverage, and modified code coverage are computed. The results offer interesting insights into the effectiveness of the technique for selection and prioritization during regression testing.

Key words: hybrid technique, regression testing, test prioritization, test selection
INTRODUCTION

Changes are often indispensable during the maintenance phase of the software life cycle. These changes may involve code changes to satisfy customer requirements, such as adding new functionality, improving existing functionality, and fixing bugs. Because of these changes, it is important to retest in order to verify that the modifications do not have unintended effects and that the system or component still complies with its specified requirements (Aggarwal and Singh 2008; IEEE 1990). The retesting should be carried out in a selective and reduced manner, as thorough testing is neither feasible nor economical. This selective retesting of the system or component is called regression testing. Regression testing thus establishes one's confidence in the modified program. It is primarily a maintenance activity; in fact, it may account for as much as half of the cost of software maintenance (Beizer 1990). Where software quality is concerned, incurring this cost is unavoidable. The importance of regression testing can be understood from the fact, mentioned by Binder (2000), that the single most costly bug in software history could have been revealed by regression testing (Elbaum et al. 2003). The maintenance phase generates large amounts of data and information that can be used later (Weller 2000). This information can be incorporated in preparing regression tests.
The test suites that are already available from earlier versions of software can be expensive to execute completely. Test suite reduction aims at decreasing the execution time for running test cases and thus decreasing the cost of testing (Leung and White 1991). Various test suite reduction techniques have been proposed in the literature, which can be categorized as regression test selection techniques, regression test prioritization techniques, and hybrid techniques. Regression test selection techniques select a subset of test cases to execute on the modified version of the software; they use information about the old and new versions of the program and the existing test suite. Regression test prioritization techniques aim to order test cases so as to meet objectives such as achieving code coverage earlier, checking features of the software in order of expected frequency of use, and detecting faults early. Prioritization can thus help in detecting defects earlier and, further, in correcting them quickly. Hybrid techniques combine both selection and prioritization for regression testing. The research literature describes various techniques for regression test selection based on different concepts. Some of them are: 1) the modified non-core function technique, which selects test cases exercising functions in programs that have been changed or deleted in producing the changed programs (Chen 1994); and 2) the modification-focused minimization technique explained by Fischer, Raji, and Chruscicki (1981) and Gupta, Harrold, and Soffa (1992), which seeks a subset of the test suite that is minimal in covering all functions of the program identified as changed. Other selection techniques have been proposed in (Rothermel 1996; Ball 1998; Bible, Rothermel, and Rosenblum 2001; Graves et al. 1998; Rothermel and Harrold 1998; Rothermel, Harrold, and Dedhia 2000). One of these, the safe test selection technique proposed by Rothermel (1996), involves the construction of control flow graphs for the different versions of the program and uses these graphs to make the test selection. The selection is carried out with the intention of executing the changed code. Elbaum, Malishevsky, and Rothermel (2002) and Elbaum et al. (2004) proposed 18 different test case prioritization techniques, classified into statement-level and function-level techniques. Many search techniques for test case prioritization are also being developed by various researchers in the field (Li, Harman, and Hierons 2007). Rothermel et al. (2001) have presented many techniques for prioritizing
test cases based on the coverage of statements or branches in the program. A prioritization technique based on historical execution data is presented in (Kim and Porter 2002). The number of impacted blocks likely to be covered by the tests is used as the basis for prioritization in (Srivastava and Thiagarajan 2002). A technique using modified condition/decision coverage for test suite prioritization is presented in (Jones and Harrold 2001). Requirements-based prioritization, which incorporates knowledge about requirements, complexity, and volatility, is proposed in (Srikanth 2004). Another prioritization technique, which takes into account the output-influencing statements and branches executed by test cases, is proposed in (Jeffrey and Gupta 2006). The hybrid approach combines both regression test selection and test case prioritization. A number of techniques and approaches based on this concept have evolved in the last decade: 1) the test selection algorithm proposed by Aggarwal, Singh, and Kaur (2004); 2) the hybrid technique proposed by Wong et al. (1997), which combines minimization, modification, and prioritization-based selection using test history; 3) the hybrid technique proposed by Gupta and Soffa (1995), based on regression test selection using slicing and prioritizing the selected definition-use associations; and 4) the version-specific hybrid approach by Singh, Kaur, and Suri (2006).
BASIC METHOD AND TERMINOLOGY

The proposed hybrid approach is based on the selection and prioritization of test cases. It is a version-specific technique that takes into account the variable usage in the old and the modified program, denoted P and P' respectively. The key point of the technique is that each test case in the original test suite T carries, in addition to its basic description, extra information denoting the variable that is checked by that test case. The technique selects all variables that appear in the changed statements and then selects only those test cases that correspond either to these variables or to the variables computed from them, recursively. Multiple-level prioritization of the selected test cases is then performed on the basis of variable usage. Figure 1 shows the layout of the algorithm used in the proposed technique. Appendix 1 defines some related terminology.
FIGURE 1  Hybrid algorithm layout (the initial program P and modified program P' yield the changed lines CLB and the variables V; together with the computed variable table CVT, the variable dependency count array VDC, and the test cases T, these drive selection, Priority1 assignment, and Priority2 assignment, producing the modified test suite T')
The Variable-Based Method

To keep account of variable usage in a program, a computed variable table (CVT) is prepared and maintained through development testing. A two-dimensional array with variable names and dependency counts, called VDC, also is maintained during development testing. A list of variables V is created from the changed (inserted/modified/deleted) lines, as stated in step 2 of the algorithm presented in Appendix 2, which demonstrates the authors' technique. If any variable is permanently deleted from the program by the modification or deletion of a line, this results in modified versions of V, VDC, and CVT. The selection step (step 3 of the algorithm) selects all test cases that correspond to variables contained in the modified V. Priority1 is assigned on the basis of the modified CVT: test cases that correspond to the changed variables are assigned Priority1 as 1, test cases that correspond to variables computed from the changed variables are assigned Priority1 as 2, and so on. After Priority1 is assigned, Priority2 is assigned, as stated in step 4 of the algorithm. The purpose of Priority2 is to further order test cases that have the same Priority1 value. Priority2 is based on the dependency counts in the modified VDC: test cases whose associated variable has the highest dependency count are assigned Priority2 as 1, those whose variable has the next highest dependency count are assigned Priority2 as 2, and so on. The resultant test suite T' consists of the selected test cases with Priority1 and Priority2 assigned.
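To make the selection and two-level prioritization concrete, the following is a minimal sketch in C++. The container choices, the names, and the small sample suite are illustrative assumptions, not the authors' implementation; the CVT, VDC, and changed-variable data are taken from the quadratic example in Appendix 3.

// Minimal sketch of the selection and two-level prioritization step.
// CVT maps each computed variable to the variables it is computed from;
// VDC maps each variable to its dependency count; "changed" holds V, the
// variables found in the changed lines. Each test case is tagged with the
// variable it checks. (Illustrative assumptions, not the authors' code.)
#include <functional>
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

struct TestCase { std::string id; std::string variable; };

int main() {
    // Example data drawn from the quadratic program in Appendix 3.
    std::map<std::string, std::set<std::string>> cvt = {
        {"d", {"a", "b", "c"}}, {"r1", {"a", "b", "Dval"}},
        {"r2", {"a", "b", "Dval"}}, {"Dval", {"d", "a"}},
        {"valida", {"a"}}, {"validb", {"b"}}, {"validc", {"c"}}};
    std::map<std::string, int> vdc = {{"a", 5}, {"b", 4}, {"c", 2}, {"d", 1},
                                      {"Dval", 2}, {"valida", 1},
                                      {"validb", 1}, {"validc", 1}};
    std::set<std::string> changed = {"a", "c"};  // V: variables in changed lines
    std::vector<TestCase> suite = {{"T01", "a"}, {"T15", "c"}, {"T20", "valida"},
                                   {"T22", "validb"}, {"T29", "d"}, {"T30", "Dval"}};

    // Priority1: level 1 for variables in V, level 2 for variables computed
    // from them, and so on (breadth-first expansion over the CVT relation).
    std::map<std::string, int> priority1;
    for (const auto& v : changed) priority1[v] = 1;
    bool grew = true;
    for (int level = 2; grew; ++level) {
        grew = false;
        for (const auto& entry : cvt) {
            if (priority1.count(entry.first)) continue;
            for (const auto& used : entry.second)
                if (priority1.count(used)) { priority1[entry.first] = level; grew = true; break; }
        }
    }

    // Priority2: within each Priority1 level, rank variables by decreasing
    // dependency count; variables with equal counts share a rank.
    std::map<std::string, int> priority2;
    std::map<int, std::set<int, std::greater<int>>> countsPerLevel;
    for (const auto& p : priority1) countsPerLevel[p.second].insert(vdc[p.first]);
    for (const auto& p : priority1) {
        const auto& counts = countsPerLevel[p.second];
        int rank = 1;
        for (int c : counts) { if (c == vdc[p.first]) break; ++rank; }
        priority2[p.first] = rank;
    }

    // Selection: keep only test cases whose variable received a Priority1.
    for (const auto& t : suite)
        if (priority1.count(t.variable))
            std::cout << t.id << " (" << t.variable << ") -> Priority1="
                      << priority1[t.variable] << ", Priority2="
                      << priority2[t.variable] << "\n";
    return 0;
}

For the Appendix 3 data (V = {a, c}), this sketch selects the test cases for a, c, d, Dval, valida, and validc and reproduces the priorities shown in the appendix tables, while the test case for validb is not selected.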
EXPERIMENTAL SET UP AND DATA COLLECTION

To carry out the study, 50 programs coded in C/C++ were selected. The programs used for the experiment were randomly selected from commonly referenced literature (Kamthane 2006; Kernighan and Ritchie 1998; Myers 1978; Venugopal 2008). Some of the programs implement functions of business applications, such as a library management system, flight reservation, and tax calculation (Johnsonbaugh and Kalin 1996; Jorgensen 2002). Test suites were prepared for each of them. A group of four students from the Master of Technology program at Guru Gobind Singh Indraprastha University was formed under the supervision of two senior lecturers. The students had prior knowledge of software testing. The test cases were prepared by these students and verified by the senior lecturers.
Table 1  Descriptive statistics of computed measures from the techniques

Measure                        Minimum   Maximum   Mean     Std. deviation   Variance
Total lines                    26        387       74.78    55.33            3062.25
% R_Test cases selected        0         100       78.84    36.21            1311.84
% Test cases selected          12.33     100       49.13    25.64            657.85
% R_Statement covered          0         100       87.92    25.87            669.38
% Statement covered            49.87     100       91.70    13.62            185.58
% R_Branch covered             0         100       82.60    31.00            961.22
% Branch covered               24.19     100       82.98    21.81            475.84
% R_Faults covered             0         100       85.68    30.45            927.78
% Faults covered               20        100       89.34    20.57            423.31
% R_Modified coverage          0         100       96       19.797           391.83
% Modified coverage            0         100       94       23.98            575.51
% R_Prioritized test cases     0         100       18.24    25.08            629.14
% Prioritized test cases       0         100       21.50    19.70            388.16
A prerequisite to the authors' work is to have the lines of source code numbered so as to perform the gray box analysis, which is the basis of their testing technique. Gray box testing is a combination of the black box and white box testing approaches. It includes testing from the outside of the product, as in black box testing, but the test cases are designed incorporating information about the code or the program operation (Kaner, Bach, and Pettichord 2002). The test suites in the authors' approach follow a gray box technique: the input/output selection follows the black box strategy, while the variable usage information kept with each test case is essentially a white box concern. The technique proposed in the pioneering work of Rothermel (1996) was chosen for comparison. Several factors accounted for choosing this technique: the availability of a detailed explanation of all aspects of that work, and the efficiency of using the control flow graph (CFG) with the test case trace for test selection. The experiment was carried out for each of the two techniques, and the experimental data were gathered as explained next. The analysis was carried out on the accumulated data, and the cumulative results are shown in a later section.
Details of the Study

The following steps describe the experimental work done in this study:

1) There are two versions of the program, the old and the modified, along with the original test suite T, CVT, VDC, CLB, and V.

2) The authors updated/modified V, VDC, and CVT depending upon the changes made in the older version. They then applied the proposed technique, and the result was the reduced test suite T' with priorities assigned. These steps were repeated for all the programs under consideration. With the resultant test suites, the objective was to measure the effectiveness of the technique in terms of statement, branch, and fault coverage.

3) For statement coverage, the statements exercised by each run of T' were traced. To analyze branch coverage, the authors analyzed T' with respect to the branches exercised. A branch here refers to one of the possible outcomes of a decision statement, which can be single or compound. The branch coverage trace was carried out using the graphical approach of the CFG. A similar approach was also followed in the work done by Rothermel (1996), where a CFG was drawn for each of the programs and the corresponding branch coverage was analyzed.

4) The third coverage criterion relates to faults. The authors created faulty versions of the programs and then analyzed the effectiveness of T' in exercising the contained faults. To create the faulty versions, simple errors in an operator or operand were manually seeded on the model of the competent programmer hypothesis and the coupling effect (DeMillo, Lipton, and Sayward 1978; Offutt 1992). The competent programmer hypothesis states that competent programmers tend to write programs that are close to being correct; that is, a program written by a competent programmer may be incorrect by relatively simple faults in comparison to a correct program. The coupling effect states that a test data set that detects all simple faults in a program will also detect more complex faults. (A small illustration of such a seeded fault follows this list.)
5) To analyze the compared technique, the same programs were taken with the original test suites and their CFGs. The experiment was conducted in a similar fashion, looking for coverage patterns for statements, branches, and faults for each program. The difference in this phase lies only in the technique used as the basis for the experiment, which here is the one proposed by Rothermel (1996). The observations were compiled to make a comparative analysis.
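As an illustration of the kind of single-token fault that was seeded, consider the following hypothetical C++ fragment; it is not one of the study's programs, and the function names are assumptions. The faulty version differs from the original by a single operator, which is the style of mutation the competent programmer hypothesis assumes.

// Original: discriminant of a quadratic equation.
#include <iostream>

double discriminant(double a, double b, double c) {
    return b * b - 4 * a * c;
}

// Seeded fault: a single operator change ('-' replaced by '+').
double discriminant_faulty(double a, double b, double c) {
    return b * b + 4 * a * c;   // operator mutation
}

int main() {
    // A test case that exercises the mutated computation exposes the fault.
    std::cout << discriminant(1, 2, 5) << " vs "
              << discriminant_faulty(1, 2, 5) << "\n";  // prints: -16 vs 24
    return 0;
}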
EXPERIMENTAL RESULTS

Descriptive Statistics

The minimum, maximum, mean, standard deviation, and variance of the measures computed from the two techniques are shown in Table 1. The values prefixed by "R_" were computed using Rothermel's technique; the others were computed using the authors' technique. The mean percentage of test cases selected by the authors' technique is 49.13, while that of the compared technique is 78.84. However, the mean statement coverage achieved by the authors' technique is 91.70 percent, compared with 87.92 percent for the compared technique. This shows that the authors' technique selects fewer test cases while achieving higher statement coverage. Table 1 also shows that the branch coverage, fault coverage, and modified coverage achieved by the authors' technique and the compared technique are almost the same. The percentage of test cases selected for modified coverage is lower than in the compared technique.
Analysis of the Proposed Technique

The data were grouped into three categories based on the lines of code (LOC) count. The authors selected LOC for grouping since it is a popular basis of size measurement. Most of the attributes, such as testability, fault proneness, and maintainability, are affected by the size of the software. Using the gathered data, the authors explored the relationships between the computed measures. Table 2 shows the summarized results for the two techniques. Figure 2 compares the average percentage of test cases selected for each technique. Figure 3 shows the comparison in terms of the average percentage of statement coverage achieved. Similarly, Figures 4 and 5 denote the branch and fault coverage patterns, respectively, associated with each reduced test suite.
Table 2  Summary results of the two techniques (average percentages by LOC group)

Measure                                                  LOC 1-50   LOC 51-100   LOC 101-400
Rothermel technique: avg. % of test cases selected       86.90      76.99        66.28
Hybrid technique: avg. % of test cases selected          57.13      47.39        36.35
Rothermel technique: avg. % of statement coverage        93.58      87.52        76.40
Hybrid technique: avg. % of statement coverage           96.56      90.04        85.81
Rothermel technique: avg. % of branch coverage           91.32      80.48        69.35
Hybrid technique: avg. % of branch coverage              91.90      80.77        69.57
Rothermel technique: avg. % of fault coverage            94.44      83.47        72.64
Hybrid technique: avg. % of fault coverage               94.38      87.99        82.08
Rothermel technique: avg. % of modified coverage         94.44      95.83        100.00
Hybrid technique: avg. % of modified coverage            94.44      91.67        100.00
Rothermel technique: avg. % of prioritized test cases    11.79      23.67        19.42
Hybrid technique: avg. % of prioritized test cases       23.70      21.26        17.31

FIGURE 2  Average percentage of test cases selected (Rothermel technique vs. hybrid technique, by LOC group)

FIGURE 3  Average percentage of statement coverage (Rothermel technique vs. hybrid technique, by LOC group)

FIGURE 4  Average percentage of branch coverage (Rothermel technique vs. hybrid technique, by LOC group)

FIGURE 5  Average percentage of fault coverage (Rothermel technique vs. hybrid technique, by LOC group)
While observing the three coverage patterns belonging to the statement, branch, and fault parameters, a fourth parameter that can be examined is modified coverage, shown in Figure 6. Another cumulative graph, Figure 7, shows the average percentage of test cases selected for coverage of the changed code. For the authors' data set, Rothermel's (1996) technique also worked very well, as is evident in the graphs, which shows the versatility and applicability of that work. The proposed technique shows the following benefits when applied to the programs:

1) The savings in terms of fewer test cases selected show an approximate difference of up to 30 percent.

2) The statement, branch, and fault coverage are quite comparable and do not show any significant differences compared with the technique proposed by Rothermel.

3) The number of test cases needed for modified coverage of the programs greater than 50 lines was lower than with Rothermel's technique.

No general observations, however, can be made on the basis of the authors' data set. An elaborate and varied study on different data sets is required to further support these observations.

FIGURE 6  Average percentage of modified coverage (Rothermel technique vs. hybrid technique, by LOC group)

FIGURE 7  Average percentage of test cases selected for modified coverage (Rothermel technique vs. hybrid technique, by LOC group)
APPLICATION OF THE TECHNIQUE

Software practitioners may use the technique developed here to reduce the time and effort required for regression testing. The technique may lead to greater savings when applied to large and complex programs than to small and simple ones. Applying this work may improve the quality, reliability, and effectiveness of the code, which may, in turn, increase the level of customer satisfaction. In practice, there is little time for regression testing because of the pressure to make the software operational; hence, the selection of test cases from the large pool of available test cases becomes very significant. If one is able to choose the test cases that have a higher probability of finding faults, this helps ensure the correctness of the changes and of their ripple effects. The proposed technique may prove useful and effective in real-life situations. Software practitioners may further prioritize the selected test cases to improve confidence in correctness within the minimum possible time.
LIMITATIONS

On carefully analyzing the behavior of the two techniques, the authors observe that their technique gives better results for programs containing intensive variable computations. Further, the technique does not build the new test cases required for code added by the modification. Moreover, the types of decision statements used may affect the percentage of coverage achieved. Coverage depends on the type of decision statement: some decisions are evaluated only after the loop body has executed once, as in "do . . . while," and some before the body executes, as in "while" and "for." There are other constructs available in the programming language, such as the "switch" statement, multiple-condition decision statements, and "if . . . else," which give different coverage for the same test case.
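The following hypothetical C++ fragment (not taken from the study's programs) illustrates the point for a single input value: with n = 0, one test case executes the body of the do-while loop but never enters the body of the while loop, so the same test case yields different branch outcomes for the two forms.

#include <iostream>

int main() {
    int n = 0;   // the single test input

    int runsWhile = 0;
    int j = n;
    while (j > 0) {        // decision taken before the body executes
        ++runsWhile;
        --j;
    }

    int runsDoWhile = 0;
    int k = n;
    do {                   // body executes at least once, then the decision
        ++runsDoWhile;
        --k;
    } while (k > 0);

    std::cout << "while body runs: " << runsWhile
              << ", do-while body runs: " << runsDoWhile << "\n";  // 0 vs 1
    return 0;
}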
CONCLUSIONS

In this article, a hybrid technique using a variable-based method is proposed, and its effectiveness was evaluated through experiments. The main results are:

• More than 80 percent code coverage was achieved in most cases with a much lower percentage of test cases selected. The reason that 100 percent code coverage could not be achieved for every program is the program's inherent structure.

• In larger programs the proposed technique was more effective, as the percentage of tests selected was lower than in the smaller programs.

• The modified coverage achieved was very high for the larger programs. The percentage of prioritized test suites needed for 90 percent to 100 percent of modified code coverage was only 15 percent to 25 percent. Thus, only a fraction of the resultant test suite was enough to cover the changes.
Further data validations are needed in order to establish the acceptability of the authors' technique. They plan to replicate their work with large data sets and different programming languages to give generalized results.

Note: For an example of a program before and after changes have been made, see Appendix 3.

REFERENCES

Aggarwal, K. K., and Y. Singh. 2008. Software engineering, 3rd edition. New Delhi: New Age International Publishers.

Aggarwal, K. K., Y. Singh, and A. Kaur. 2004. Code coverage based technique for prioritizing test cases for regression testing. ACM SIGSOFT Software Engineering Notes 29, no. 5 (September).

Ball, T. 1998. On the limit of control flow analysis for regression test selection. In Proceedings of the ACM International Symposium on Software Testing and Analysis, Clearwater Beach, Fla., 134-142. New York: ACM Press.

Beizer, B. 1990. Software testing techniques. New York: Van Nostrand Reinhold.
Bible, J., G. Rothermel, and D. Rosenblum. 2001. Coarse- and fine-grained safe regression test selection. ACM Transactions on Software Engineering and Methodology 10, no. 2:149-183.
Gupta, R., M. J. Harrold, and M. Soffa. 1992. An approach to regression testing using slicing. In Proceedings of the Conference on Software Maintenance, 299-308.
Binder, R. V. 2000. Testing object-oriented systems. Reading, Mass.: Addison Wesley.
Gupta, R., and M. L. Soffa. 1995. Priority based data flow testing. IEEE ICSM: 348-357.
Chen, Y., D. Rosenblum, and K. Vo. 1994. Test tube: A system for selective regression testing. In Proceedings of the 16th International Conference on Software Engineering, Los Alamitos, Calif., 211-220.
IEEE. 1990. IEEE standard glossary of Software engineering terminology, IEEE standard 610.12. New York: IEEE.
DeMillo, R. A., R. J. Lipton, and G. Sayward. 1978. Hints on test data selection: Help for the practicing programmer. Computer 11, no. 4:34-41. Elbaum, S., A. G. Malishevsky, and G. Rothermel. 2002. Test case prioritization: A family of empirical studies. IEEE Transactions on Software Engineering 28, no. 2 (February):159-182. Elbaum, S., P. Kallakuri, A. G. Malishevsky, G. Rothermel, and S. Kanduri. 2003. Understanding the effects of changes on the cost-effectiveness of regression testing techniques. Journal of Software Testing, Verification, and Reliability 13, no. 2 (June): 65-83. Elbaum, S., G. Rothermel, S. Kanduri, and A. G. Malishevsky. 2004. Selecting a cost-effective test case prioritization technique. Software Quality Journal 12, no. 3: 185-210. Fischer, K., F. Raji, and A. Chruscicki. 1981. A methodology for retesting modified software. In Proceedings of the National Telecommunications Conference B-6-3 (November):1-6. Graves, T., M. J. Harrold, J. M. Kim, A. Porter, and G. Rothermel. 1998. An empirical study of regression test selection techniques. In Proceedings of the 20th International Conference on Software Engineering, Kyoto, Japan, 188-197. Los Alamitos, Calif.: IEEE Computer Society Press.
Jones, J. A., and M. J. Harrold. 2001. Test-suite reduction and prioritization for modified condition/decision coverage. In Proceedings of the International Conference on Software Maintenance, Florence, Italy, 92-101. Jorgensen, P. C.2002. Software testing: A craftsman’s approach. Boca Raton, Fla.: CRC Press. Kamthane, A. N. 2006. Object-oriented programming with Ansi & Turbo C++. Upper Saddle River, N.J.: Pearson Education. Kaner, C., J. Bach, and B. Pettichord. 2002. Lessons learned in software testing. New York: John Wiley and Sons. Kernighan, B. W., and D. M. Ritchie. 1998. The C programming language. Upper Saddle River, N.J.: Pearson Education. Kim, J. M., and A. Porter. 2002. A history-based test prioritization technique for regression testing in resource constrained environments. In Proceedings of the 24th International Conference on Software Engineering, Orlando, Fla., 119-129. Leung, H. K. N., and L. White. 1991. A cost model to compare regression test strategies. In Proceedings of the Conference on Software Maintenance. Sorrento, Italy, 201-208. Li, Z., M. Harman, and R. M. Hierons. 2007. Search algorithms for regression test case prioritization. IEEE Transactions on Software Engineering 33, no. 4 (April).
Offutt, A. J. 1992. Investigations of the software testing coupling effect. ACM Transactions on Software Engineering and Methodology 1, no. 1 (January):5-20.
Rothermel, G. 1996. Efficient effective regression testing using safe test selection techniques. PhD thesis, Clemson University.
Rothermel, G., and M. J. Harrold. 1998. Empirical studies of a safe regression test selection technique. IEEE Transactions on Software Engineering 24, no. 6:401-419.
Johnsonbaugh, R., and M. Kalin. 1996. Applications programming in ANSI C. Upper Saddle River, N.J.: Prentice Hall.

Myers, G. J. 1978. The art of software testing. New York: John Wiley and Sons.
Jeffrey, D., and N. Gupta. 2006. Test case prioritization using relevant slices. In Proceedings of Computer Software and Applications (COMPSAC’06), Chicago, 411-420.
Rothermel, G., M. J. Harrold, and J. Dedhia. 2000. Regression test selection for C++ programs. Software Testing, Verification and Reliability 10, no. 2:77-109.

Rothermel, G., R. H. Untch, C. Chu, and M. J. Harrold. 2001. Prioritizing test cases for regression testing. IEEE Transactions on Software Engineering 27, no. 10 (October):929-948.
Singh, Y., A. Kaur, and B. Suri. 2006. A new technique for version-specific test case selection and prioritization for regression testing. Journal of the CSI 36, no. 4 (October-December):23-32.

Srikanth, H. 2004. Requirements-based test case prioritization. Student Research Forum in 12th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, Newport Beach, Calif.

Srivastava, A., and J. Thiagarajan. 2002. Effectively prioritizing tests in development environment. In Proceedings of the International Symposium on Software Testing and Analysis, Rome, Italy, 97-106.

Venugopal, K. R. 2008. Problems and solutions in C++. New York: Tata McGraw Hill.

Weller, E. 2000. Applying quantitative methods to software maintenance. Software Quality Professional 3, no. 1:22-29.

Wong, W. E., J. R. Horgan, S. London, and H. Agrawal. 1997. A study of effective regression testing in practice. In Proceedings of the 8th IEEE International Symposium on Software Reliability Engineering (ISSRE '97), 264-274.
BIOGRAPHIES

Yogesh Singh is a professor with the University School of Information Technology, Guru Gobind Singh Indraprastha University, India. He is also controller of examinations with the Guru Gobind Singh Indraprastha University, India. He was founder head and dean of the University School of Information Technology, Guru Gobind Singh Indraprastha University. He received his master's degree and doctorate from the National Institute of Technology, Kurukshetra. His research interests include software engineering, focusing on planning, testing, metrics, and neural networks. He is coauthor of a book on software engineering, and is a Fellow of IETE and member of IEEE. He has more than 200 publications in international and national journals and conferences. Singh can be contacted by e-mail at
[email protected]. Arvinder Kaur is a reader with the University School of Information Technology, Guru Gobind Singh Indraprastha University, India. She obtained her doctorate from Guru Gobind Singh Indraprastha University and her master’s degree in computer science from Thapar Institute of Engg. and Tech. Prior to joining the school, she worked with Dr. B.R. Ambedkar Regional Engineering College, Jalandhar and Thapar Institute of Engg. and Tech. She is a recipient of the Career Award for Young Teachers from the All India Council of Technical Education, India. Her research interests include software engineering, object-oriented software engineering, software metrics, software quality, software project management, and software testing. She also is a lifetime member of ISTE and CSI. Kaur has published 40 research papers in national and international journals and conferences. Bharti Suri is a lecturer at the University School of Information Technology, Guru Gobind Singh Indraprastha University, Kashmere Gate, India. She holds master’s degrees in computer science and information technology. Her areas of interest are software engineering, software testing, software project management, software quality, and software metrics. Suri is a lifetime member of CSI. She is co-investigator of a University Grants Commission (UGC) funded major research project in the area of software testing. She has many publications in national and international journals and conferences to her credit. Suri can be contacted by e-mail at
[email protected].
Appendix 1: Important Definitions

Black box testing: Involves only observation of the output for certain input values.
White box testing: Involves examining the internal structure of the program. Test cases are derived by examining the program's logic.
Gray box testing: A combination of the black box and white box testing approaches. It includes testing from the outside of the product, as in black box testing, but the test cases are designed incorporating information about the code or the program operation.
Regression testing: The process of retesting the modified parts of the software and ensuring that no new errors have been introduced into previously tested code.
Control flow graph: A graphical representation of a program. It is a directed graph in which nodes are either entire statements or fragments of a statement, and edges represent flow of control.
Test selection: Selects a subset of an existing test suite to use in retesting a modified program.
Test prioritization: Involves prioritizing the test cases and executing them, while retesting a modified program, in order of their potential abilities to achieve certain testing objectives.
Hybrid technique: Combines the techniques of selection and prioritization for regression testing.

Other Terminology

P: Original program (before modification)
T: Original test suite (built during development testing)
T': Resultant test suite
S, S', X: Intermediate sets of variables computed at different steps of the algorithm
V: A set consisting of the variables from the changed lines
CVT: Computed variable table (built during development testing)
CLB: A two-dimensional array with two fields: the changed line number in the older version and a bit. The bit is 0 if the line is an inserted line and 1 if the line is deleted or modified.
VDC: A two-dimensional array with a variable name and its dependency count (the count of the number of times it is used as an operand).
Priority: If the priority values for two different test cases are i and j, respectively, and i < j, then the test case with priority i is executed before the test case with priority j.
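The bookkeeping structures above can be pictured as small arrays and maps. The following is a minimal C++ sketch of CLB and of how V, the set of variables on the changed lines, might be collected; the container choices and the per-line variable map are illustrative assumptions, while the line numbers and variables follow the quadratic example of Appendix 3.

// Sketch of CLB and of step 2 of the technique: collecting V, the variables
// appearing on the changed lines. (Illustrative assumptions, not the
// authors' implementation.)
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ClbEntry {
    int line;   // changed line number in the older version
    int bit;    // 0 = inserted line, 1 = deleted or modified line
};

int main() {
    // CLB for the Appendix 3 example: lines 11 and 19 were modified (bit = 1).
    std::vector<ClbEntry> clb = {{11, 1}, {19, 1}};

    // Variables appearing on each source line (only the changed lines shown).
    std::map<int, std::set<std::string>> variablesOnLine = {
        {11, {"a"}}, {19, {"c"}}};

    // V: the variables drawn from the changed lines.
    std::set<std::string> v;
    for (const auto& entry : clb)
        for (const auto& name : variablesOnLine[entry.line])
            v.insert(name);

    for (const auto& name : v) std::cout << name << " ";  // prints: a c
    std::cout << "\n";
    return 0;
}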
Appendix 3: Example

Quadratic Program (Original): The original program reads the coefficients a, b, and c of a quadratic equation, validates each value against an allowed range (setting valida, validb, validc, and validInput), computes the discriminant d, and prints whether the roots are real, equal, or imaginary, or an appropriate message when a value does not belong to the valid range or the values do not constitute a quadratic equation.

Quadratic Program (Modified): The modified program is the same except that the validation conditions on lines 11 and 19, the lines involving the variables a and c, have been changed.
Thirty test cases were built for this example program. Test cases T01 to T07, T08 to T13, T14 to T19, T20 to T21, T22 to T23, T24 to T25, T26 to T28, T29, and T30 were used to check the variables a, b, c, valida, validb, validc, validInput, d, and Dval, respectively. The test cases are shown in Table 1. The modifications are in lines 11 and 19; thus, V consists of the variables a and c. The variable dependency count (VDC) array, computed variable table (CVT), and changed line bit (CLB) array are shown in Table 2, Table 3, and Table 4, respectively.

Table 1  Test case table (test cases T01 to T30, showing for each the variable involved, the input values of a, b, and c, and the actual output produced by the program)

Table 2  Variable dependency count (VDC) array

Variable    a  b  c  d  Dval  valida  validb  validc
Dependency  5  4  2  1  2     1       1       1

Table 3  Computed variable table (CVT)

Computed variable   Used variables
d                   a, b, c
r1                  a, b, Dval
r2                  a, b, Dval
Dval                d, a
valida              a
validb              b
validc              c

Table 4  Changed line bit (CLB) array

Changed line   Bit
11             1
19             1

A Priority1 assignment based on the CVT is shown in Tables 5 through 9. After the first iteration, Priority1 is 1 for the test cases corresponding to the variables in V (see Table 5).

Table 5  Resultant test suite after first iteration

Variable   Test case ID                               Priority1
a          T01, T02, T03, T04, T05, T06, T07          1
c          T14, T15, T16, T17, T18, T19               1

The variables d, r1, r2, Dval, valida, and validc are computed from a and c, as shown in Table 3; therefore, Priority1 is assigned as 2 to the test cases corresponding to these variables. There are test cases corresponding to d, Dval, valida, and validc, and these are used to update T' (Table 6). Now the variable Dval is computed from d and thus would be given Priority1 as 3, but since a test case already exists for Dval with Priority1 as 2, it remains 2. Also, r1 and r2 are computed from Dval, but there are no test cases corresponding to them.

Table 6  Resultant test suite after second iteration

Variable   Test case ID                               Priority1
a          T01, T02, T03, T04, T05, T06, T07          1
c          T14, T15, T16, T17, T18, T19               1
d          T29                                        2
Dval       T30                                        2
valida     T20, T21                                   2
validc     T24, T25                                   2

Priority2 further prioritizes test cases having the same value of Priority1. According to the VDC array, a has a dependency count of 5 and c has a dependency count of 2, so among the test cases with Priority1 as 1, those corresponding to a are assigned Priority2 as 1 and those corresponding to c are assigned Priority2 as 2 (Table 7).

Table 7  Priority2 assignment of variables with Priority1 = 1

Variable   Test case ID                               Priority1   Priority2
a          T01, T02, T03, T04, T05, T06, T07          1           1
c          T14, T15, T16, T17, T18, T19               1           2

Further, there are four variables whose corresponding test cases have Priority1 as 2. Assigning Priority2 among them based on the dependency counts in the VDC gives T' after assigning Priority2 (Table 8).

Table 8  T' after assigning Priority2

Variable   Test case ID                               Priority1   Priority2
a          T01, T02, T03, T04, T05, T06, T07          1           1
c          T14, T15, T16, T17, T18, T19               1           2
d          T29                                        2           2
Dval       T30                                        2           1
valida     T20, T21                                   2           2
validc     T24, T25                                   2           2

After sorting on Priority1 and Priority2, the final prioritized resultant test suite is shown in Table 9.

Table 9  Prioritized resultant test suite

Variable   Test case ID                               Priority1   Priority2
a          T01, T02, T03, T04, T05, T06, T07          1           1
c          T14, T15, T16, T17, T18, T19               1           2
Dval       T30                                        2           1
d          T29                                        2           2
valida     T20, T21                                   2           2
validc     T24, T25                                   2           2