Integration Testing Using Interface Mutations

Marcio E. Delamaro
IFSC-USP, Physics Institute of São Carlos, University of São Paulo, São Carlos 13560, BRAZIL

José C. Maldonado
ICMSC-USP, Department of Computer Sciences and Statistics, University of São Paulo, São Carlos 13560-970, BRAZIL

Aditya P. Mathur
Software Engineering Research Center, 1398 Department of Computer Sciences, Purdue University, W. Lafayette, IN 47907, USA

August 28, 1997

Abstract
A criterion for assessing the adequacy of test sets during integration testing is proposed. The criterion is based on a testing technique named Interface Mutation. The technique itself is designed to be scalable with the size of the software under test, the size being measured in the number of subsystems integrated. Using Interface Mutation, a test and development team is able to assess the adequacy of tests incrementally while integrating various subsystems. Also reported are results from a pilot experiment designed to investigate the cost of Interface Mutation compared to that of the traditional mutation testing technique already described in the literature. The pilot experiment was performed with the help of an enhanced version of Proteum.

Keywords: Integration testing, test adequacy, mutation testing, PROTEUM.
Marcio E. Delamaro's research was supported by a grant from CAPES. He is currently visiting the Software Engineering Research Center at Purdue University. Email: [email protected].

José C. Maldonado's research was supported by grants from CAPES, CNPq (Brazilian government agencies) and a grant from the Fulbright Program. He is currently visiting the Software Engineering Research Center at Purdue University. All correspondence regarding this paper may be sent to José Maldonado; data not reported here may also be obtained from the address listed above. Email: [email protected].

Aditya P. Mathur's research was supported in part by an award from the Center for Advanced Studies, IBM Toronto Laboratories, and NSF award CCR-9102331. Email: [email protected].
Contents

1 Introduction
2 Integration Testing
3 Interface Mutation
  3.1 Basics of Interface Mutation
  3.2 Integration errors
4 Experimental evaluation of Interface Mutation
5 Data Analysis
6 Conclusions and future work
7 Acknowledgement

List of Figures

1 An example of integration error.
2 An error revealing mutant.
3 Function A calls several functions.
4 Call graph of the sort program.
5 Generation of a test set using Interface Mutation.
1 Introduction

Software testing may be considered an incremental activity that pervades most, if not all, of the software development cycle. During the development of a software system, one begins with the development of individual modules, also called units. A unit is composed of one or more functions. There may be cases when already developed and tested units from a previous project are selected for reuse in a new project. We assume that soon after its development a unit is tested before it is integrated with other units to form a subsystem or the desired system itself; in the discussion below we do not distinguish between a subsystem and a system. After having integrated units into a subsystem, one tests the subsystem for conformance with its specifications.

During the testing of a unit one constructs one or more test inputs and executes the unit against these inputs in some suitably chosen order. The collection of test inputs is referred to as a test set. A similar procedure is used for testing a subsystem. In either case, one develops a test set to test the unit or the subsystem in question. Considerable research has gone into the development of methods for evaluating how "good" a test set is for a given unit. As a result, a large body of criteria is available to evaluate a test set for a given program against a given specification. Such criteria are known as test adequacy criteria, or simply adequacy criteria. For example, data flow [14] and mutation [5] testing provide a variety of adequacy criteria. A tester may use one or more of these criteria to assess the adequacy of a test set for a given unit and then decide whether to stop testing this unit or to enhance the test set by constructing additional test cases needed to satisfy a criterion.

In principle such criteria and this procedure may also be applied to the assessment of tests for subsystems. However, such an application is likely to be inefficient due to program size, redundant because multiple test teams might test the same part of the code, and incomplete because the large program size might prohibit testers from applying the criteria exhaustively. Considering that units are tested prior to integration, we have developed a new method for assessing a test set for a subsystem. This method is named Interface Mutation as it aims at evaluating how well the interfaces between the various units have been tested. Our method is an extension of mutation testing and is applicable, by design, to large software systems composed of interacting modules.

The development of Interface Mutation was motivated by the need to assess test sets for subsystems that come about through the integration of two or more units. For a given subsystem S that consists of two or more units, the related question is: how good is a test set for subsystem S? Mutation testing has been found to be powerful in its error detection effectiveness when compared to other code-coverage-based criteria for selecting and evaluating test sets at the unit level [17]. One drawback of mutation testing is the large number of mutant programs that need to be executed during a test, which might make it prohibitive to apply to completely or partially integrated software. Alternative approaches have shown how to alleviate this problem and keep the cost of testing within acceptable bounds without significant reduction in effectiveness [17]. Our approach to overcoming this problem at the integration level is based on three key ideas: 1) restrict the mutation operators to model integration errors; 2) test the connections between two modules separately, one at a time; and 3) apply the integration mutation operators only to those parts of the modules that are related to module "interfaces", such as function calls, parameters and global variables.
In the remainder of this paper we describe Interface Mutation and report an experiment to evaluate its fault detection effectiveness. Integration testing is reviewed in Section 2. Interface Mutation is introduced in Section 3. The design of a pilot experiment to assess the cost and effectiveness of Interface Mutation is described in Section 4. The analysis of results from our pilot experiment is presented in Section 5. Our conclusions and plans for future work appear in Section 6.
2 Integration Testing

A testing activity is composed of several phases. At one extreme there is unit (or module) testing, which aims at revealing errors in the implementation of a module regardless of its interaction with other modules. At the other extreme there is system testing, which intends to verify whether a collection of modules working together behaves according to its specifications. Between these two extremes there is integration testing. The entire program is often built gradually as a collection of several units: subsets of these modules are integrated to form a subsystem that in turn is integrated with other modules or other subsystems [7]. The program in Figure 1 contains an example of an integration error. The definitions of functions main and sort are as follows:
main: This function reads five positive integers; places them in an array; calls a function sort which sorts this array in ascending order; and prints the three largest of the values read.

sort: This function gets two parameters: an array of integers and an integer giving the number of elements in the array. It sorts the array in ascending order and returns the sorted array.
In this program a misunderstanding about what "an array with n elements" means has resulted in an error: from one programmer's point of view the array index goes from 1 to n, whereas from the other programmer's point of view the array index varies from 0 to n-1.

Several strategies have been proposed for the integration process; top-down, bottom-up, sandwich and big-bang are a few [13]. It has been suggested that incremental strategies are preferable as they favor finding errors earlier in the system integration process [13]. On the other hand, some experiments have shown that none of these approaches is clearly better than the others in terms of error detection [15]. Regardless of the integration strategy, a criterion to select and to evaluate test sets is considered necessary. In general, during integration testing one is limited to the use of a functional approach or the traditional coverage measures [2]. Criteria for inter-procedural testing that extend the control and data flow criteria defined at the unit level have also been proposed [9, 10]. We present a different approach to test case selection and evaluation at the integration level: our goal is to explore the use of mutation testing, a method traditionally used at the unit level, in the context of integration testing.
#define N 5

unsigned a[N+1];

main()
{
    int i, n;

    n = N;
    for (i = 1; i <= n; i++)      /* main stores the values in a[1]..a[n] */
        scanf("%u", &a[i]);
    sort(a, n);
    for (i = n; i > n-3; i--)     /* print the three largest values */
        printf("%u\n", a[i]);
}

sort(a, n)
unsigned a[];
int n;
{
    int i, j, t;

    for (i = 0; i < n-1; i++)     /* sort assumes the values in a[0]..a[n-1] */
        for (j = n-1; j > i; j--)
            if (a[j] < a[j-1]) {
                t = a[j];
                a[j] = a[j-1];
                a[j-1] = t;
            }
}
Figure 1: An example of integration error.
3 Interface Mutation
3.1 Basics of Interface Mutation
The relations between modules, also known as interfaces and connections, establish how the modules interact and how they should work together in order to behave as a system. In a procedural language such as C, modules represent functions and interfaces represent: 1) function calls and parameter passing; 2) values returned from functions; and 3) global data sharing. The idea underlying Interface Mutation is to create mutants by inducing simple changes only in the entities that belong to the interface between modules. First, simple faults directly related to the connections between modules are located. Two examples of such faults are a function call with a missing parameter and the use of an incorrect value as a parameter. The first example is of a "static" interface fault; a compiler may be able to detect such an error using static analysis. Our concern is primarily with faults that escape such static detection.
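As a hypothetical illustration (the names caller, callee and g are ours, not from the paper), the C fragment below marks the entities that constitute the interface between two functions and the kind of simple changes Interface Mutation induces on them:

#include <stdio.h>

int g = 1;                          /* global shared across the connection */

int callee(int a, int b)            /* interface: parameters a and b, the
                                       global g, and the returned value */
{
    return a + b + g;
}

int caller(void)
{
    int x = 2, y = 3;
    return callee(x, y);            /* interface mutants change only this
                                       call site or callee's interface, e.g.
                                         callee(y, x)      argument swap
                                         callee(x + 1, y)  successor of x
                                         callee(x, g)      scalar-for-scalar
                                                           replacement      */
}

int main(void)
{
    printf("%d\n", caller());       /* a test case kills a mutant when this
                                       output differs from the original */
    return 0;
}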
Figure 2: An error revealing mutant.

Table 2: Faults seeded in the sort program

Fault   Original statement          Mutated statement
A       qsort(hp, l)                qsort(hp, l-1)
B                                   uflg = 0
C                                   setfil(a)
D                                   eargv[1][i] == '\0'
E                                   disorder("nonunique:", ibuf[i]->l)
F       pa = skip(pa, fp, 0)        pa = skip(pa, fp, 1)
G       while (*p != '\n')          while (*p == '\n')
4 Experimental evaluation of Interface Mutation

CCSR: Each scalar variable reference is replaced by each of the scalar constants appearing in the same function. In the experiments, a reference was mutated only if it was an argument in a call to F2 from F1 or was a formal parameter of F2.

SRSR: This operator replaces each statement (simple or compound) by all the return commands in the same function. In the experiments, only return statements were replaced by all the other return statements in the same function (a sketch appears below).

Varr, Vprr, Vsrr and Vtrr: Each array (pointer, scalar and structure, respectively) reference is replaced by all other type-compatible local or global variables. In the experiments, global variables belonging to the F2 interface, arguments in a call to F2 from F1 and formal parameters of F2 were mutated.

VTWD: Each scalar reference x is replaced by succ(x) and by pred(x), where pred and succ denote, respectively, the predecessor and the successor functions. In the context of the experiments, global variables belonging to the F2 interface, arguments in a call to F2 from F1 and formal parameters of F2 were mutated.

A small set of mutation operators was selected in order to reduce the number of mutants: if the operators were applied to all parts of the code, the number of mutants could be very high and the time for mutation testing might be prohibitive for large modules. Table 4 lists the number of mutants generated and other statistics for the various experiments. The second column in this table shows the number of mutants generated when that same set of 10 mutation operators is applied to all the code to which it could have been applied under traditional mutation testing. The third column presents the number obtained by applying Interface Mutation; these are the numbers of mutants actually used in the experiments. The fourth column shows the ratio, expressed as a percentage, of the mutants obtained using Interface Mutation to the mutants obtained otherwise.

For each experiment 50 adequate test sets were constructed. The first step in generating an adequate test set was to create a pool of randomly generated test cases. A set of 200 files was generated randomly and used as input files; each file has from 0 to 15 lines, and each line has from 1 to 20 characters picked from letters, digits, tab and space.
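The following is the sketch referred to in the SRSR description above. It is a hypothetical fragment (the function sign is ours, not part of sort) showing the operator as restricted in the experiments, where each return statement of the called function is replaced by each of its other return statements:

/* F2, a called function with three return statements */
int sign(int v)
{
    if (v > 0) return 1;     /* SRSR mutants: return -1;  return 0; */
    if (v < 0) return -1;    /* SRSR mutants: return 1;   return 0; */
    return 0;                /* SRSR mutants: return 1;   return -1; */
}

Under this restriction the function yields six mutants, two per return statement.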
Table 3: Integration errors identified in the faulty versions of the sort program

Experiment  Fault  Calling Function  Called Function  Error Type
I           A      qsort             qsort            2
II          B      main              field            3
III         B      main              merge            1
IV          B      merge             qsort            2
V           C      merge             setfil           2
VI          D      sort              setfil           3
VII         E      merge             disorder         1
VIII        F      cmp               skip             2
IX          G      skip              eol              2
X           G      cmp               eol              2
In addition, 200 test cases were generated, each as a random combination of the flags c, n, m, u, r and f, at most 3 sort keys and at most 10 input files picked from those 200 randomly generated files [19]. These 200 test cases were then imported into Proteum. The second step was to identify the equivalent mutants and then execute the remaining mutants against the test set. If the 200 test cases did not kill all the non-equivalent mutants, no additional test cases were generated. Consequently, in some cases the mutation score was not 100%, i.e., the test set obtained was not adequate with respect to Interface Mutation. The average mutation score was 96% and the lowest was 84%. From the 200 test cases, only the effective test cases were selected, i.e., only those test cases that killed at least one mutant. A special case arises when one of the 200 test cases causes an execution error (abnormal termination) in the program being tested. In this case, Proteum considers that such a test case kills all the mutants and no additional test cases are required. Thus, for example, if test case 80 causes an execution error in the program being tested, the adequate test set would contain all effective test cases up to test case 79, plus test case 80, and none else. This case occurred only in experiment VII. These steps are summarized in Figure 5. Hereafter these test sets are called Interface Mutation-generated test sets.

Each faulty version of the sort program was then executed against each of the 50 Interface Mutation-generated test sets and the results compared with the results of running the original sort against the same test sets. Differing results implied that the test set was able to reveal the inserted error. Table 5 shows the results obtained. For each experiment, it presents the mutation score obtained by the 50 mutation-generated test sets, the size of the test sets and the percentage of test sets that reveal the error in the faulty program. Both the average and standard deviation values are given.

To determine the effectiveness of Interface Mutation we used the approach used by Weyuker et al. for other criteria [16]. Fifty test sets with the same cardinality as the mutation-adequate test sets were randomly generated for each experiment, and their effectiveness in revealing the errors in the faulty programs was also measured. Table 6 summarizes the results.
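The adequacy measure used in these steps is the standard mutation score. As a reference formulation (our notation; the paper does not spell it out here), for a program P and test set T:

$$ MS(P,T) \;=\; \frac{DM(P,T)}{M(P) - EM(P)} $$

where M(P) is the number of mutants generated for P, EM(P) the number of equivalent mutants, and DM(P,T) the number of mutants killed by T. For instance, a hypothetical test set killing 300 of experiment II's 514 - 180 = 334 non-equivalent interface mutants (Table 4) would score 300/334, or about 0.90.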
Table 4: Reducing mutant rates

EXPERIMENT  TOTAL  REMAINING  PERCENTAGE  EQUIVALENTS
I           1076   128        12          2
II          5565   514        9           180
III         5003   902        18          337
IV          3409   256        8           65
V           2699   292        11          132
VI          1493   334        22          100
VII         2415   98         8           8
VIII        5245   332        6           177
IX          774    22         3           21
X           4543   48         1           31

The second column of Table 6 repeats the effectiveness of the Interface Mutation-generated test sets to make comparison easier. The third column presents the percentage of the 50 randomly generated test sets that revealed the error. The results in the third column may be considered conservative because, when applying mutation testing, if a test case reveals an error no additional test cases are required; so, in most cases a revealing test set may be smaller than those considered in Table 5. The fourth column in Table 6 presents the percentage of revealing randomly generated test sets whose cardinality matches that of the Interface Mutation-generated test sets not at their full size, but only up to the first error-revealing test case (or the full size if the mutation-generated test set does not reveal the error).

Some strategies have been used to minimize the number of mutants in a testing session. For example, constrained mutation, using x% randomly selected mutants or using only a limited set of mutation operators (like the abs and ror operators for FORTRAN programs), has been shown to be a good minimization strategy [18]. In the experiments we also applied constrained Interface Mutation, randomly selecting 10% of the mutants and re-running the experiments presented in Table 5 using only those 10% of the interface mutants. For experiments IX and X this strategy was not applied because the number of mutants in these cases was low. The results obtained are shown in Table 7.
5 Data Analysis

We measure the cost of applying Interface Mutation by the number of mutants to be executed and analyzed, and its effectiveness as the fraction of errors revealed. Based on both measures, Interface Mutation appears to be a promising criterion. The number of mutants must be kept low because the time to execute each mutant tends to be longer in integration testing than in unit testing.
Figure 5: Generation of a test set using Interface Mutation.
In our approach we attempt to reduce the number of mutants by reducing the number of mutation operators and by restricting the sections of the code where they are applied. From Table 4 it can be observed that the number of mutants is reduced significantly. The experiments showed, however, that further improvements could be effected when generating the mutants by taking into consideration the characteristics of particular operators. For example, the operator CCSR replaces each scalar variable by all the scalar constants that appear in the same function, so the number of mutants generated is proportional to the number of scalar references times the number of constants. In most cases the mutants created by applying the CCSR operator to the same variable reference behave in the same way (illustrated in the sketch below), in the sense that if one is killed the others are also killed, and if one is equivalent the others are equivalent too. Thus, this operator is a good candidate for some form of minimization strategy: for example, randomly generating only a small percentage of the mutants rather than all of them, or limiting the maximum number of mutants created at the same point of the code by the same operator.

The time to execute mutants can also be used to measure cost. However, it depends significantly on hardware-related factors and on the strategy used to execute the mutants. Table 8 shows the timings obtained by running the experiments on a Pentium 90 machine under Linux 1.2.7. The table presents two timings for each experiment. The first is the time taken to execute all the live mutants against all 200 test cases in the pool; the values obtained show that in these circumstances the time spent executing the mutants is unacceptably long. The second timing was measured after eliminating the equivalent mutants; it shows that the maximum delay is due to the equivalent mutants being executed against all 200 test cases.
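As referenced above, a hypothetical illustration of CCSR (the function clamp is ours, not from sort): the function contains two scalar constants, so each scalar reference yields one mutant per constant.

int clamp(int v)                /* scalar constants here: 100 and 0     */
{
    if (v > 100) return 100;    /* CCSR on this reference of v yields:  */
                                /*   if (100 > 100) return 100;         */
                                /*   if (0 > 100)   return 100;         */
    if (v < 0) return 0;        /* similarly, two mutants per reference */
    return v;
}

Since such sibling mutants tend to be killed, or to remain equivalent, together, sampling a fraction of them, or capping the number generated per reference, sacrifices little.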
Table 5: Effectiveness results (standard deviations in parentheses)

            MUTATION      NUMBER OF      ERROR
EXPERIMENT  SCORE         TEST CASES     REVEALING SETS (%)
I           1.00 (0.00)    5.20 (0.82)   100
II          0.91 (0.00)   12.42 (1.23)    96
III         0.95 (0.13)   12.28 (1.13)    94
IV          0.84 (0.25)    6.88 (1.19)    76
V           0.99 (0.07)    9.24 (1.10)   100
VI          0.98 (0.15)   11.36 (1.37)   100
VII         1.00 (0.00)    2.64 (0.78)   100
VIII        0.99 (0.05)   12.54 (1.22)   100
IX          0.98 (0.16)    0.98 (0.16)    98
X           0.99 (0.10)    1.96 (0.26)   100
Average     96.3           7.55           96.4
Std. Dev.   0.18           1.94            1.93

If experiment II is taken as an example, 180 × 200 = 36,000 executions are avoided if the equivalent mutants are excluded prior to the execution. These facts support the strategy of marking the equivalent mutants early in a test session to speed up the testing process: we first executed the mutants against a few test cases, then analyzed the live mutants and determined the equivalent ones, and only after that executed the mutants against the remaining test cases.

Deciding whether a mutant is equivalent can also be a problem. As expected, at the integration level this is a harder problem than at the unit level, since at this point the program is more complex. This problem is not particular to mutation testing; deciding whether a mutant is equivalent (and building a test case to kill it if it is not) is perhaps as difficult as deciding whether there is a feasible path exercising a def-use association (and building a test case for it).
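The execution discipline described above can be sketched as follows (assumed structure; kills is a placeholder for running one mutant on one test case, not Proteum's actual interface). Mutants already marked dead or equivalent are skipped, so marking the 180 equivalent mutants of experiment II before running the remaining test cases saves the 36,000 executions mentioned above.

enum status { LIVE, DEAD, EQUIVALENT };

/* Run the remaining test cases against the mutants still live.
   kills(m, t) is a hypothetical driver returning nonzero when mutant m
   produces output different from the original program on test case t. */
void run_remaining(enum status st[], int n_mutants, int first_test,
                   int n_tests, int (*kills)(int m, int t))
{
    for (int t = first_test; t < n_tests; t++)
        for (int m = 0; m < n_mutants; m++)
            if (st[m] == LIVE && kills(m, t))
                st[m] = DEAD;          /* killed: never executed again */
}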
Table 6: Effectiveness of randomly generated test sets

EXPERIMENT  MUTATION  RANDOM (1)  RANDOM (2)
I           100        84         60
II           96        88         48
III          94        90         54
IV           76        62         44
V           100        76         40
VI          100        82         30
VII         100         2          2
VIII        100       100         52
IX           98         4          4
X           100        10          8
Average      96.4      59.80      34.20
Std. Dev.     1.93      5.40       4.13

Furthermore, these facts reinforce the belief that someone who knows the program, possibly the developer, might test it both at unit and at integration time [13].

The effectiveness results obtained are also satisfactory, as Table 5 shows. Of the 10 experiments, 6 had all 50 test sets revealing the error in the respective program. Experiment IX had 1 test set that did not reveal the error; this test set has mutation score 0.0 because the 200 test cases in the pool were not able to kill the only non-equivalent mutant used in the experiment. The worst results were obtained in experiments III and IV. Analyzing these two experiments, we note that the errors not found can be related to the fact that in both cases the seeded fault was not in either of the functions being tested. In both cases, the same fault was seeded in function field. In experiment III this fault causes an error on the interface main--merge and in experiment IV it causes an error on the interface merge--qsort. In these cases it would be more probable for the fault to be revealed when testing the main--field interface, i.e., an interface that reaches the function where the fault is located, as in experiment II. In addition, the mutation scores obtained in experiment IV are the lowest, with an average of 0.84. These results are important in that they may serve as the basis for guidelines for applying Interface Mutation.

Experiment IX deserves some comment. Given the domain defined in the previous section, function eol is never called from function skip. Thus, when testing the connection skip--eol, it would be expected that no mutant could be killed by any test case. This does not happen in experiment IX because mutants generated in the body of function eol were killed through calls from another function (in this case, function cmp). Thus, if we consider it important to know exactly what connection is being tested, the current implementation of the mutation operators is not adequate. A new way of implementing the operators should be defined, one incorporating the ability to identify which connection is being exercised; for example, when testing the connection skip--eol, a mutation inside function eol should be considered only if eol was called from skip.
Table 7: Effectiveness using 10% of the mutants (standard deviations in parentheses)

            MUTATION      NUMBER OF      ERROR               FULL
EXPERIMENT  SCORE         TEST CASES     REVEALING SETS (%)  INTERFACE MUTATION
I           1.00 (0.00)    3.58 (0.81)    96                 100
II          0.91 (0.00)    6.40 (0.91)    82                  96
III         0.94 (0.18)    6.48 (0.97)    68                  94
IV          0.82 (0.21)    4.46 (1.03)    64                  76
V           0.99 (0.06)    4.60 (0.88)   100                 100
VI          0.99 (0.14)    5.62 (0.84)   100                 100
VII         1.00 (0.00)    1.90 (0.38)   100                 100
VIII        0.99 (0.14)    4.62 (0.94)   100                 100
Average     0.96           4.71           88.75               96.4
Std. Dev.   0.21           0.93            3.53                1.93

Using 10% of the interface mutants reinforces the results obtained with full Interface Mutation. The experiments in which the results were near 100% using full mutation remain at the same level of effectiveness when using 10% of the mutants. On the other hand, the cases for which the effectiveness was already low, mainly experiments III and IV, performed worse when using 10% of the mutants. In general, reducing the number of mutants by 90%, and the average number of required test cases from 7.55 to 4.71, lowered the percentage of revealing test sets from 96.40% to 88.75%, a decrease of less than 10%. So the idea of constrained mutation seems promising in the context of Interface Mutation.

Comparing Interface Mutation and random generation (Table 6), we note a significant improvement in the effectiveness of Interface Mutation test sets over randomly generated test sets.
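The constrained selection used above can be sketched as follows, under the assumption that it is a uniform random sample of the generated mutants (the paper specifies only that 10% were randomly selected):

#include <stdlib.h>

/* Mark roughly `fraction` of the n mutants as selected for execution. */
int select_mutants(int selected[], int n, double fraction)
{
    int k = 0;
    for (int m = 0; m < n; m++)
        if (rand() < fraction * ((double)RAND_MAX + 1.0))
            selected[k++] = m;      /* keep mutant m with prob. ~fraction */
    return k;                       /* number of mutants selected */
}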
Table 8: Times to execute mutants

EXPERIMENT  MUTANTS  NON-EQUIV.  TOTAL TIME   NON-EQUIV. TIME
I           128      126         3 min        3 min
II          514      334         4 h 53 min   49 min
III         902      565         8 h 27 min   40 min
IV          256      191         2 h 36 min   1 h 4 min
V           292      132         2 h          4 min
VI          334      234         43 min       9 min
VII         104      52          1 h          52 min
VIII        332      155         4 h 10 min   6 min
IX          22       1           29 min       24 sec
X           48       17          43 min       1 min

Considering random test sets that have the same cardinality as the adequate mutation sets, the average effectiveness over all the experiments is only 59.8%, compared with 96.4% for the mutation-generated test sets. Considering random test sets with the cardinality of the mutation-generated sets only up to the first revealing test case, the effectiveness is 34.20%. This indicates that the effectiveness achieved by Interface Mutation test sets is not due only to their length.
6 Conclusions and future work

Interface Mutation attempts to reveal errors that manifest themselves beyond the scope of a single module and propagate through the interface between functions. In addition, it takes into consideration the fact, pointed out by Haley and Zweben, that even when testing at the integration level it is important to consider some structures internal to the modules [8].

The experiment reported here used the Unix sort program. This program was seeded with 7 faults which create 10 different integration errors. Interface Mutation was then applied to the seeded programs and its effectiveness in revealing the errors was measured. For each of the 10 experiments, 50 test sets were generated using Interface Mutation, and the effectiveness of the criterion was evaluated by counting the number of test sets that reveal the error. The numbers obtained are promising: none of the errors remained unrevealed, and the average percentage of test sets revealing the errors reached 96%. Comparing this number with the effectiveness of randomly generated test sets showed that the results obtained are due not only to the size of the test sets but also to the way they were chosen.

One disadvantage of traditional mutation testing is its cost in terms of the number of mutants to execute and analyze. This experiment showed that applying mutations only at points related to the interface and connections between functions keeps the number of mutants within a manageable limit. In addition, the idea of constrained mutation [18],
used for unit testing, was also explored in the experiment and proved to be a good approach at the integration level as well.

The next step in this work is to define a set of mutation operators specifically designed to reveal integration errors. In addition, it is our goal to turn this idea into a practical approach usable in real software development environments. In this direction, Proteum is being modified: the interface mutation operators have to be implemented, as well as other new features that adapt it to integration testing.
7 Acknowledgement

Thanks to Janet Stappleton for commenting on earlier revisions of this work.
References

[1] H. Agrawal, R. A. DeMillo, R. Hathaway, Wm. Hsu, W. Hsu, E. Krauser, R. J. Martin, A. P. Mathur and E. Spafford, "Design of Mutant Operators for the C Programming Language", Technical Report SERC-TR41-P, Software Engineering Research Center, Purdue University, March 1989.

[2] B. Beizer, Software System Testing and Quality Assurance, Van Nostrand Reinhold Company, 1984.

[3] M. E. Delamaro, Proteum: Um Ambiente de Teste Baseado na Análise de Mutantes (Proteum: A Test Environment Based on Mutation Analysis), Master's Thesis, ICMSC-USP, São Carlos-SP, Brazil, October 1993.

[4] M. E. Delamaro, J. C. Maldonado, M. Jino and M. L. Chaim, "Proteum: Uma Ferramenta de Teste Baseada na Análise de Mutantes (Proteum: A Testing Tool Based on Mutation Analysis)", Tools Session, Proceedings of the Seventh Brazilian Symposium on Software Engineering, October 1993.

[5] R. A. DeMillo, R. J. Lipton and F. G. Sayward, "Hints on Test Data Selection: Help for the Practicing Programmer", IEEE Computer, Vol. 11, No. 4, April 1978.

[6] R. A. DeMillo and A. P. Mathur, "A Grammar Based Fault Classification Scheme and its Application to the Classification of the Errors of TeX", Technical Report SERC-TR165-P, Software Engineering Research Center, Purdue University, September 1995.

[7] C. Ghezzi, M. Jazayeri and D. Mandrioli, Fundamentals of Software Engineering, Prentice-Hall, 1991.

[8] A. Haley and S. Zweben, "Development and Application of a White Box Approach to Integration Testing", The Journal of Systems and Software, No. 4, 1984.

[9] M. J. Harrold and M. L. Soffa, "Selecting and Using Data for Integration Testing", IEEE Software, March 1991.

[10] U. Linnenkugel and M. Müllerburg, "Test Data Selection Criteria for (Software) Integration Testing", Proceedings of the First International Conference on Systems Integration, April 1990.

[11] J. C. Maldonado, Critérios Potenciais Usos: uma Contribuição ao Teste Estrutural de Software (Potential Uses Criteria: a Contribution to Structural Software Testing), PhD Dissertation, DCA/FEE/UNICAMP, 1991.

[12] A. J. Offutt, "The Coupling Effect: Fact or Fiction?", Proceedings of the Third Symposium on Software Testing, Analysis, and Verification (TAV3), December 1989.

[13] R. S. Pressman, Software Engineering: A Practitioner's Approach (3rd edition), McGraw-Hill, 1992.

[14] S. Rapps and E. J. Weyuker, "Selecting Software Test Data Using Data Flow Information", IEEE Transactions on Software Engineering, Vol. SE-11, No. 4, pp. 353-363, April 1985.

[15] J. F. Solheim and J. H. Rowland, "An Empirical Study of Testing and Integration Strategies Using Artificial Software Systems", IEEE Transactions on Software Engineering, Vol. 19, No. 10, October 1993.

[16] E. Weyuker, T. Goradia and A. Singh, "Automatically Generating Test Data from a Boolean Specification", IEEE Transactions on Software Engineering, Vol. 20, No. 5, pp. 353-363, May 1994.

[17] E. W. Wong, On Mutation and Data Flow, PhD Thesis, Department of Computer Science, Purdue University, December 1993.

[18] E. W. Wong, J. C. Maldonado, M. E. Delamaro and A. P. Mathur, "Constrained Mutation in C Programs", Proceedings of the VIII Brazilian Symposium on Software Engineering, Curitiba-PR, Brazil, October 1994, pp. 439-452.

[19] Unix man page for sort.