An Approach for Mutation Testing Using Elitist Genetic Algorithm i
3 K. K. Mishra, 2Shailesh Tiwari, Anoj Kumar and 4A.K. Misra Computer Science & Engineering Department MNNIT, Allahabad
INDIA i 2 3 4
[email protected],
[email protected],
[email protected],
[email protected]
Abstract- Mutation Testing is used as fault-based testing to
difficult. In order to motivate the generation of good test cases, we start with those test cases which have already killed some mutants in this way we can generate good test cases in minimum generations. Also to retain good test cases obtained in various generation we use elitist genetic algorithm. The remainder of the paper is divided into following section: section 2 describes a brief introduction of mutation testing. Section 3 describes the overview of genetic algorithms. Section 4 describes the proposed approach followed by conclusion in section 5.
overcome limitations of other testing approaches but it is recognized as expensive process. In mutation testing, a good test case is one that kills one or more mutants, by producing different mutant output from the original program. Evolutionary algorithms have been proved its suitability for reducing the cost of data generation in different testing methodologies. In order to reduce the cost of mutation testing, efficient test cases are generated that reveal faults and kill mutants. In this paper, we develop a new strategy for generating efficient test input data in the context of mutation testing.
II.
Keywords- Mutation Testing, Genetic Algorithm, Elitist Genetic Algorithm.
I.
MUTATION TESTING
Fault based testing techniques: mutation analysis and mutation testing are often used to overcome the limitations of other testing approaches. Mutation analysis is used to identify mutation techniques while mutation testing tests
INTRODUCTION
The goal of testing is to uncover as many faults as
adequacy criteria based on mutation analysis [2]. The history of mutation testing can be traced back to 1971 in a student paper by Richard Lipton [12]. The birth of mutation testing can also be identified in literature [16]. It can be applied at different levels: unit level, integration level, and specification level. Mutation testing is effectively used to identifying the adequate test data which can be used to find the faults [1]. It is also empirically proved that mutation testing is more powerful than the statement and branch coverage [15]. Mutation testing is more successful in finding faults than data flow testing [9, 10]. Mutation analysis is useful to assess and compare test suites and criteria in terms of their cost effectiveness. It can also be used to compare and assess new testing techniques [7]. The act of assessing test adequacy using mutation is referred as mutation analysis. In mutation testing, faults are deliberately instrumented into the original program by making few syntactic changes in original program. The modified program is called as mutant. Similarly, several mutated copies of original program are
possible with a set of test sets. Software organizations spend more than 40%-50% of their development cost in software testing. In order to test software, test data have to be generated. Generating test data manually is slow, expensive, and requires exhaustive efforts. So, automated test data generation techniques can be used to ease the process [4] and reduce the cost. Testing can be of two types: black box testing and white box testing. Black box testing is mainly a validation technique that checks to see if the product meets the customer requirements. However, white box testing is a verification technique which uses the source code to guide the selection of data [3]. Mutation testing is a kind of white box testing. Basically, it is fault based testing based on mutation analysis which overcomes the limitations of other testing approaches. Mutation analysis identifies technique to mutate, i.e. to modify, software artifacts [2]. Mutation testing provides a testing criterion which can be used to measure the effectiveness of a test set or data in terms of its ability to detect faults. Genetic algorithms are used to generate test data automatically which save the time and cost. But still there is a need to use genetic algorithm effectively to get results in less number of iterations which is possible only when minimal number of test cases that kills the maximum number of mutants are selected. We have modified simple genetic algorithm to achieve these goals. One drawback of simple genetic algorithm is related to initial population. In simple genetic algorithm initial population is generated randomly so searching of good test cases in minimum generation will be
generated. If the mutant and the original program generate different outputs for a test case then the mutant is called killed. If no test case can distinguish between the mutant and the original program then it is said to be alive [4]. There are some mutants that can never be killed, as they always produce the same output as the original program. These mutants are syntactically different but functionally equivalent to the original program [2]. These mutants are called as equivalent mutants. A good test case is one that kills one or more mutants, by producing different mutant output from the original program.
978-1-4244-5540-9/10/$26.00 ©2010 IEEE
426
equivalent mutant's problem is beyond the scope of this paper.
For an instance, consider following program fragmentP, 1. if(x
>
0)
III.
Genetic Algorithms (GAs) are adaptive heuristic search
2. doThisO; 3. if(x
>
algorithm premised on the evolutionary ideas of natural
10)
selection and genetic [13]. The basic concept of GAs is designed to simulate processes in natural system necessary
4. doThatO;
for evolution, specifically those that follow the principles
Some of the possible mutants ofP would be: M1: 1. if(x
problem [6]. A genetic algorithm (GA) is a search technique used in
10)
computing to
4. doThatO;
use techniques inspired by evolutionary biology such as recombination). In software testing, our basic goal is to search the >
domain for input variables which meets with the testing
0)
objective. So, the problem is to maximize a function such
2. doThisO;
adjusted
towards
global
optimum[ll].
Genetic
algorithms are implemented in a computer simulation in
4. doThatO; M4: 1. if(x
approximate solutions to
inheritance, mutation, selection, and crossover (also called
4. doThatO;
3. if(x
or
categorized as global search heuristics. Genetic algorithms
10)
M3: 1. if(x
exact
are a particular class of evolutionary algorithms (EA) that
2. doThisO; >
find
optimization and search problems. Genetic algorithms are
M2: 1. if(x == 0)
3. if(x
GENETIC ALGORITHMS
solutions
2. doThisO;
population
of
abstract
representations (called
(called individuals, creatures, or phenotypes) to
an optimization problem evolves toward better solutions. Traditionally, solutions are represented in binary as strings
3. if(x == 10)
of Os and Is, but other encodings are also possible. The
4. doThatO;
evolution usually starts from a population of randomly generated individuals and happens in generations. In each
Here M1, M2, M3, and M4 are possible mutants of original
generation, the fitness of every individual in the population
program fragmentP. All of these are first-order mutants as
is evaluated, multiple individuals are stochastically selected
only single change is made to create every possible mutant.
from the current population (based on their fitness), and
Let D be the input domain and T be the test case for execution then T is a subset of D. To get the measure of mutation adequacy, every test case is run against each mutated copy of the original program and the number of mutants killed by every test case is counted (i.e. where T fails for a certain mutant). Mutants which are not killed by T are called alive. A mutated copy of original program is said
modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the
next
iteration
of
the
algorithm.
Commonly,
the
algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a
to be equivalent: for every x belongs to D, such that, P.x is equal to M.x then the mutant M is said to be equivalent. We will also write it as M ==P. HereP.x is evaluation of program P on the input x and M.x is the evaluation of program P on input x [5]. Therefore, mutation adequacy criteria or mutation score is defined as, M Score(T)=100*( d / N- eq )
a
chromosomes or the genotype of the genome) of candidate
0)
satisfactory solution may or may not have been reached. Genetic algorithm searches the input domain of the program in an order to generate suitable test cases to kill given mutant. It uses three operators to guide the search: A.
(1)
Here'd' is the number of mutants killed by the test case T, 'N' is the total number of mutants generated and 'eq' is the number of equivalent mutants i.e. M == P. However
Selection
B.
Crossover
C.
Mutation
A variety of literature is available on this topic. Interested candidate may refer these references [14, 17].
427
IV.PROPOSED APPROACH Our approach extends the work of Mattias Bybro[5] and M. Masud[4]. Mattias Bybro developed MU tool(Fig. 2) for java
programming
language.
This
tool
automatically
generates the mutated copies of the original program by instrument the mutants. Initially original program is tested by executing a set of test cases to check whether the program is correct or not. These set of test cases are generated by JUnit. Then, the meta mutants are generated using the operators given in Table1 only if the original program is correct. Figure 1 elaborates the mutation testing process. TABLE I MurATION OPERATORS FOR 00 PROGRAM
Type
Description
AOR
Arithmetic Operator Replacement
EHF
LOR ROR NOR
Exception Handling Fault Logical Operator Replacement Relational Operator replacement No Operation replacement
VCP
Vmabie and Constant PertuIbation
RFI
Referencing Fault Insertion
MCR
Fig. 2 Model Proposed by Mattias Bybro [5]
Methods Call Replacement
M.
Masud proposed a framework in which original
program and mutant program are instrumented so that each program can be viewed as many small units. In his approach, instead of executing the entire program, every small unit is executed by the test case set in an order to kill the mutant unit.
Class A
Selftest A
N
DIagnosis Fig. 3 Model Proposed by M. Masud [4]
Fnhance Selt\est I
bncontDl SReciflClllionr----.,. specification
M. Masud proposes that Genetic algorithm can be used to
Add conlraClS to Ihe
generate test cases for those mutants which are not killed by any chosen test cases. They have given no weightage to those test cases which have already killed some mutants
Fig. 1 Mutation Testing Process [8]
Our approach extends the work of Masud's by collecting all those test cases in a buffer which have already killed some mutants. These collected test cases will act as an
428
required to generate if a mutant survives after the execution
initial population of the genetic algorithm. In this way the probability of getting good test cases in less iteration can be
of test case. To enhance the fault finding capability of a test
increased.
case, new test cases are generated by taking those test cases
Moreover we use elitist version of genetic
algorithm to collect various good test cases which are
as initial population which are capable of killing some
generated in different generation.
mutants or whose mutation score is high. The proposed approach described in this paper is the extension of the work
We use a checker module to compare and trace the output of each unit. The checker writes 1 in a log if the mutant unit
proposed by Mattias Bybro[5] and Masud[4]. However it is
survives and writes 0 if the mutant unit is dead. After
in
running the test case on every program unit, if mutant is
comparison to claim its effectiveness. This approach can be
preliminary
stage.
It
requires
some
experimental
used to develop a tool such as MU tool [5] in future.
killed then that test case is selected as initial population because that test case have high probability to generate a
REFERENCES
good test case that kills the mutants in less iteration. In our approach, those test cases that killed some mutants in
[I]
R. Geist, A. J. Offutt, and F. C. Harris, "Estimation and Enhancement of Real-Time Software Reliability Through Mutation Analysis," IEEE Transactions on Computers, vol. 41, no. 5, pp. 550-558, May 1992. London, Strand, London, 2008.
[2]
K. Ayari, S. Bouktif, G. Antoniol, "Automatic Test Input Data Generation via Ant Colony", Proceedings of the 9th annual conference on Genetic and evolutionary computation (GECCO'07), July 2007
[3]
J. Badlaney, R. Ghatol, R. Jadhwani, " An Introduction to Data Flow Testing", NCSU CSC TR-2006
[4]
M. Masud, A. Nayak, M. Zaman, N. Bansal, " A Strategy for Mutation Testing using Genetic Algorithms", IEEE CCECE/CCGEI, Saskatoon, May 2005
[5]
M. Bybro, "A Mutation Testing Tool for Java Programs," Thesis, Department of Numerical Analysis and Computer Science, Nada at the Royal Institute of Technology, KTH, Sweden, 2003.
[6]
C. C. Michael, G. McGraw, M. A. Schatz, "Generating Software Test Data by Evolution," IEEE Trans. on Software Engineering, Vol. 27, No. 12, pp. 1085-1110, 2001.
[7]
J. Offutt and R. H. Untch. Mutation 2000: Uniting the orthogonal. In Mutation 2000: Mutation Testing in the Twentieth and the Twenty First Centuries, San Jose, CA, pages 45-55, 2000.
[8]
B. Baudry, V. L. Hanh, J-M. Jezequel, and Y. L. Traon,."Building Trust into 00 Components using a Genetic Analogy," Proc. of 11th International Symposium on Software Reliability and Engineering (lSSRE'2K), pp. 4-14, 2000.
[9]
P. G. Frankl, S. N. Weiss, and C. Hu, "All-uses vs. mutation testing: An experimental comparison of effectiveness". The Journal of Systems and Software, Sept. 1997.
program unit are stored in a buffer whose size is equal to the initial population. When buffer is full then these test cases are used as initial population in an order to generate good test cases. We also considered those test cases as initial population which reached specified mutation score threshold in an order to achieve the testing objective. Our approach is clearly illustrated in figure 4.
Suffer SrZE=IntilJl P�ullt:ion
[10] J. Offutt, J. Pan, K. Tewary, and T. Zhang. An experimental evaluation of data flow and mutation testing. Software--Practice and Experience, 26(2): 165-176, Feb. 1996. [II] H. H. Sthamer, "Automatic generation of Software Test Data using Genetic Algorithms," PhD Thesis, University of Glamorgan, UK, 1995. [12] A. P. Mathur, "Mutation Testing," in Encyclopedia of Software Engineering, J. J. Marciniak, Ed., 1994, pp. 707-713. [13] W. M. Spears and V. Anand, "A study of Crossover Operators in Genetic Programming," Proc. 6th Int. Symp. On Methodologies for Intelligent Systems, 1991.
[14] D.E. Goldberg. "Genetic Algorithms: in Search, Optimization & Machine Learning." Addison Wesley, MA. 1989.
Fig. 4 Our Proposed Approach
V.
[15] P. J. Walsh. A measure of test completeness. PhD thesis, State University of New York at Binghamton, 1985. [16] S. C. P. F. Fabbri, J. C. Maldonado, T. Sugeta, and P. C. Masiero, R. G. Hamlet, "Testing Programs with the Aid of a Compiler," IEEE Transactions on Software Engineering, vol. 3, no. 4, pp. 279-290, July 1977
CONCLUSION
In this paper, we proposed an approach through which good test cases are generated for mutation testing by using
[17] J. Holland, "Adaptation in Natural and Artificial Systems", MIT Press, MA, 1975.
elitist version of genetic algorithm. It tries to find faulty unit in the code [4] as well as good test cases are generated having capability to kill the mutants. New test data are
429