Search-Based Test Case Generation for Object-Oriented Java Software Using Strongly-Typed Genetic Programming
José Carlos Bregieiro Ribeiro
[email protected]
Introduction
Test Object and Test Case Representation
Test data generation deals with locating good test data for a particular test criterion; the application of evolutionary algorithms to this process is often referred to as Evolutionary Testing (ET) or Search-Based Test Case Generation (SBTCG). Unit-test cases for object-oriented (OO) software consist of method call sequences (MCS), which define the test scenario. Each test case focuses on the execution of one particular public method, the method under test (MUT). One of the most pressing challenges faced by researchers in the ET area is the state problem, which occurs with objects that exhibit state-like qualities by storing information in private fields: all interaction with such objects must be performed through their public methods, a fundamental principle of OO programming known as data encapsulation. The goal of the evolutionary search is to find MCS that define interesting state scenarios for the variables which will be passed, as arguments, in the call to the MUT. The input domain thus encompasses the parameters of the test object's public methods, including both the implicit parameter (i.e. the this object) and the explicit parameters. Recent surveys indicate that Input Domain Reduction (IDR) strategies can greatly improve search performance on SBTCG problems.
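To make the notions of method call sequence and state scenario concrete, the following is a minimal hand-written sketch (not output of the tool) that uses java.util.Stack as the test object and pop() as the MUT. The class and method names StackPopSequences, feasibleSequence and unfeasibleSequenceThrows are hypothetical and chosen only for illustration.

```java
import java.util.EmptyStackException;
import java.util.Stack;

/** Two hand-written method call sequences whose method under test (MUT) is Stack.pop(). */
class StackPopSequences {

    /** Feasible sequence: the stack is given state before the MUT is called. */
    static Object feasibleSequence() {
        Stack<Object> stack = new Stack<>(); // implicit parameter of the MUT (the "this" object)
        Object element = new Object();       // explicit parameter for push
        stack.push(element);                 // state-setting call
        return stack.pop();                  // call to the MUT
    }

    /** Unfeasible sequence: the MUT is called on an empty stack and throws. */
    static boolean unfeasibleSequenceThrows() {
        Stack<Object> stack = new Stack<>();
        try {
            stack.pop();                     // call to the MUT with no prior state
            return false;
        } catch (EmptyStackException e) {
            return true;                     // the sequence aborts with a runtime exception
        }
    }

    public static void main(String[] args) {
        System.out.println("feasible sequence returned an object: " + (feasibleSequence() != null));
        System.out.println("unfeasible sequence threw: " + unfeasibleSequenceThrows());
    }
}
```

The only difference between the two sequences is the state-setting call to push, which is the kind of difference the search has to discover when the state of the object is reachable only through its public methods.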
Technical Approach
Figure 1. Example Test Object’s Bytecode and corresponding Control-Flow Graph.
Figure 2. Example Strongly-Typed Genetic Programming Tree and corresponding Method Call Sequence.
EMCDGs and Function Sets
Test Case Evaluation
The focus of our ongoing work is on employing evolutionary algorithms for generating and evolving test cases for the structural unit-testing of third-party object-oriented Java programs.
At the beginning of each generation, the weight W_{n_i} of a given node n_i is multiplied by:
Our approach represents and evolves test cases using the Strongly-Typed Genetic Programming (STGP) technique. Test case quality is evaluated by instrumenting the test object and executing it with the generated test cases as inputs, collecting trace information from which coverage metrics are derived. Static analysis, instrumentation and execution tracing are performed solely on the basis of the high-level information extracted from the Java Bytecode of the test object. Test objects are represented internally by weighted control-flow graphs (CFGs); to favour test cases that exercise problematic structures and difficult control-flow paths, the weights of the CFG nodes are dynamically reevaluated every generation. The aim is to guide the search efficiently towards full structural coverage, which often requires promoting complex and intricate test cases that define elaborate state scenarios; to that end, unfeasible test cases are considered at certain points of the evolutionary search. Purity Analysis is employed to automatically identify and remove entries that are irrelevant to the search problem, reducing the size of the set of method calls from which the algorithm can choose when constructing the method call sequences that compose test cases. Relevant contributions include novel methodologies for automation, search guidance and input domain reduction, and the presentation of the eCrash automated test case generation tool.
• the hit count factor, which worsens the weight of recurrently hit CFG nodes;
Methodology Overview
foreach class under test do
    instrument for structural tracing;
    generate control-flow graphs;
    identify test cluster;
    analyse parameter purity;
    generate purified EMCDGs and function sets;
    foreach method under test do
        repeat
            reevaluate weight of CFG nodes;
            generate individuals;
            foreach individual do
                generate method call sequence;
                generate test case;
                compile and execute test case;
                trace CFG nodes hit;
                evaluate test case;
                remove hits from remaining nodes list;
            recombine and mutate individuals;
        until stopping criteria is met;
• the weight decrease constant value α, so as to decrease the weight of all CFG nodes indiscriminately;
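• the path factor, which improves the weight of nodes that lead to interesting nodes and belong to interesting paths.

$$W_{n_i} = (\alpha W_{n_i}) \left( \frac{hitC_{n_i}}{|T|} + 1 \right) \left( \frac{\sum_{x \in N_s^{n_i}} W_x}{|N_s^{n_i}| \times \frac{W_{init}}{2}} \right) \qquad (1)$$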
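The weight reevaluation of Equation (1) can be sketched as a single function. This is an illustration under stated assumptions, not the eCrash implementation: the poster does not define the symbols, so the sketch reads hitC_{n_i} as the hit count of node n_i, |T| as the number of test cases, N_s^{n_i} as the set of successor nodes of n_i, and W_init as the initial node weight. The class name NodeWeightUpdate, the handling of nodes without successors, and all numeric values are hypothetical.

```java
/** Sketch of the per-generation CFG node weight update, following Equation (1). */
class NodeWeightUpdate {

    /**
     * @param weight           current weight W_{n_i}
     * @param alpha            weight decrease constant
     * @param hitCount         hitC_{n_i} (assumed: number of hits recorded for the node)
     * @param testSetSize      |T| (assumed: number of test cases considered)
     * @param successorWeights weights W_x of the nodes in N_s^{n_i} (assumed: successors)
     * @param initialWeight    W_init (assumed: weight assigned to every node at the start)
     */
    static double reevaluate(double weight, double alpha, int hitCount, int testSetSize,
                             double[] successorWeights, double initialWeight) {
        if (testSetSize <= 0) {
            throw new IllegalArgumentException("testSetSize must be positive");
        }
        // Hit count factor: at least 1, grows as the node is hit more often.
        double hitCountFactor = (double) hitCount / testSetSize + 1.0;

        // Path factor: sum of successor weights over |N_s| * W_init / 2.
        // ASSUMPTION: a node without successors is left unchanged by this factor.
        double pathFactor = 1.0;
        if (successorWeights.length > 0) {
            double sum = 0.0;
            for (double w : successorWeights) {
                sum += w;
            }
            pathFactor = sum / (successorWeights.length * initialWeight / 2.0);
        }
        return alpha * weight * hitCountFactor * pathFactor;
    }

    public static void main(String[] args) {
        // Hypothetical values chosen only to exercise the formula.
        System.out.println(reevaluate(200.0, 0.9, 5, 10, new double[] {100.0, 300.0}, 200.0));
    }
}
```

Since the fitness of feasible test cases is the average weight of the nodes they hit and unfeasibility is penalised by adding a constant, lower weights appear to mark more attractive nodes; under that reading, a hit count factor above 1 worsens a weight and a path factor below 1 improves it.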
For feasible test cases, the fitness is computed from their trace information; relevant trace information includes the “Hit List” H_t:

$$Fitness_{feasible}(t) = \frac{\sum_{h \in H_t} W_h}{|H_t|} \qquad (2)$$
Figure 3. Purified EMCDG for the Stack class.

Member                 Return Type    Child Types
Stack()                Stack [IP]     (none)
Object pop()           Object [RE]    Stack [IP]
Object pop()           Stack [IP]     Stack [IP]
Object push(Object)    Object [RE]    Stack [IP], Object [P0]
Object push(Object)    Stack [IP]     Stack [IP], Object [P0]
Object peek()          Object [RE]    Stack [IP]
Object()               Object [RE]    (none)

Table 1. Purified Function Set for the Stack class.
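The typing constraint that such a function set imposes on genetic programming trees can be sketched as follows. This is an illustration only: the class FunctionSetSketch and its members are hypothetical, only four of the Table 1 entries are encoded, and the compatibility rule is an assumption, since the poster does not state how the role tags [IP], [P0] and [RE] take part in type matching. Here the tags are ignored and only the class name is compared.

```java
/** Sketch of a type check over a strongly-typed tree built from function-set entries. */
class FunctionSetSketch {

    /** One function-set entry: member, return type, and the types required of its children. */
    static final class Entry {
        final String member;
        final String returnType;
        final String[] childTypes;

        Entry(String member, String returnType, String... childTypes) {
            this.member = member;
            this.returnType = returnType;
            this.childTypes = childTypes;
        }
    }

    /** A tree node: an entry plus one subtree per required child type. */
    static final class Node {
        final Entry entry;
        final Node[] children;

        Node(Entry entry, Node... children) {
            this.entry = entry;
            this.children = children;
        }
    }

    // Four of the seven entries of Table 1.
    static final Entry NEW_STACK = new Entry("Stack()", "Stack [IP]");
    static final Entry NEW_OBJECT = new Entry("Object()", "Object [RE]");
    static final Entry PUSH_TO_STACK =
            new Entry("Object push(Object)", "Stack [IP]", "Stack [IP]", "Object [P0]");
    static final Entry POP_TO_OBJECT = new Entry("Object pop()", "Object [RE]", "Stack [IP]");

    /** ASSUMPTION: the role tag is dropped, so "Object [P0]" and "Object [RE]" both map to "Object". */
    static String baseType(String taggedType) {
        int tagStart = taggedType.indexOf(" [");
        return tagStart < 0 ? taggedType : taggedType.substring(0, tagStart);
    }

    /** True if every node has the required number of children and each child returns the required type. */
    static boolean isWellTyped(Node node) {
        if (node.children.length != node.entry.childTypes.length) {
            return false;
        }
        for (int i = 0; i < node.children.length; i++) {
            Node child = node.children[i];
            boolean typeMatches =
                    baseType(child.entry.returnType).equals(baseType(node.entry.childTypes[i]));
            if (!typeMatches || !isWellTyped(child)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Tree for: pop( push( Stack(), Object() ) )
        Node pushTree = new Node(PUSH_TO_STACK, new Node(NEW_STACK), new Node(NEW_OBJECT));
        System.out.println(isWellTyped(new Node(POP_TO_OBJECT, pushTree)));
    }
}
```

Under this reading, a tree such as pop(Object()) is rejected because pop requires a Stack child, so an ill-typed method call sequence cannot be represented as a valid tree.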
For unfeasible test cases, the fitness of the individual is calculated in terms of the distance between the runtime exception index (i.e., the position of the method call that threw the exception) and the method call sequence length. Also, an unfeasible penalty constant value β is added to the final fitness value, so as to penalise unfeasibility.

$$Fitness_{unfeasible}(t) = \beta + \frac{(seqLen_t - exInd_t) \times 100}{seqLen_t} \qquad (3)$$
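The two fitness functions of Equations (2) and (3) can be sketched together. This is a minimal illustration rather than the tool's code: the class name TestCaseFitness, the treatment of an empty Hit List, and all numeric values (including the value of β) are hypothetical, and the poster does not say whether the exception index is counted from 0 or from 1.

```java
/** Sketch of the fitness computations for feasible and unfeasible test cases. */
class TestCaseFitness {

    /** Equation (2): average weight of the CFG nodes in the Hit List H_t. */
    static double feasible(double[] hitWeights) {
        if (hitWeights.length == 0) {
            // ASSUMPTION: an empty Hit List is treated as an error; the poster does not cover this case.
            throw new IllegalArgumentException("Hit List must not be empty");
        }
        double sum = 0.0;
        for (double w : hitWeights) {
            sum += w;
        }
        return sum / hitWeights.length;
    }

    /** Equation (3): penalty beta plus the percentage of the sequence left after the exception index. */
    static double unfeasible(double beta, int sequenceLength, int exceptionIndex) {
        if (sequenceLength <= 0) {
            throw new IllegalArgumentException("sequenceLength must be positive");
        }
        return beta + (sequenceLength - exceptionIndex) * 100.0 / sequenceLength;
    }

    public static void main(String[] args) {
        // Hypothetical values chosen only to exercise the formulas.
        System.out.println(feasible(new double[] {100.0, 200.0, 300.0}));
        System.out.println(unfeasible(150.0, 10, 4));
    }
}
```

With a fixed β, an exception thrown later in the sequence yields a smaller second term, so unfeasible test cases that execute more of their method call sequence before failing receive a lower fitness value.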
Experimental Results
Test Case Evaluation
• Unfeasible test cases are considered at certain points of the evolutionary search, once the feasible test cases that are being bred cease to be interesting.
• A good compromise between the intensification and diversification of the search can be achieved.
• The diversity and complexity of method call sequences and the definition of elaborate state scenarios are favoured.
Input Domain Reduction
• The search space of Evolutionary Testing problems can be dramatically reduced by embedding Purity Analysis into the test case generation process.
• For the test objects used, approximately a third of the set of entries that could potentially be selected for integrating the generated test cases were discarded.
• A significant improvement is clearly observable in terms of the efficiency of the search, i.e., less computational time is required to find an adequate test set if the conditions are otherwise similar.
Future Work
• Several open problems persist in the area of search-based test case generation. Future work will be mainly focused on addressing the challenges posed by the three cornerstones of OO programming:
  – encapsulation,
  – inheritance,
  – and polymorphism.
• The most promising research-related challenges that lie ahead of us seem to be the following:
  – Input Domain Reduction: deals with removing irrelevant variables from a given test data generation problem, thereby reducing the size of the search space.
  – Search Space Sampling: deals with the inclusion of all the variables relevant to a given test object in the test data generation problem, so as to enable the coverage of the entire search space.
GECCO’08 Graduate Student Workshop. July 12, 2008, Atlanta, Georgia, USA.