Using Ant’s Alarm Pheromone to Improve Software Testing Automation*

Ronaldo Menezes1, Rafael Silva2, Marcelo Barros2, and Alexandre M. Silva2

1 Computer Sciences, Florida Tech, Melbourne, Florida, USA, [email protected]
2 Software Engineering, IVIA Ltda., Fortaleza, Ceará, Brazil, {rafael.silva,marcelo.barros,alexandre.menezes}@ivia.com.br

Summary. Software testing is the de facto standard for quality control in industry. The complexity of today’s applications is becoming so high that our ability to test software manually is diminishing; experts argue that automation is the way forward in the field. Nature-inspired techniques, and in particular the area called swarm intelligence, have attracted the attention of researchers due to their ability to deal with complexity. In insect societies, and in particular ant colonies, one can find the concept of alarm pheromones, used to indicate an important event to the society (e.g., a threat). Alarm pheromones enable the society to have a uniform spread of its individuals, probably as a survival mechanism: the more uniform the spread, the better the chances of survival at the colony level. This paper describes a model of this ant behavior and shows how it can be integrated into a software testing automation methodology, thus demonstrating that software testing can also benefit from nature-inspired approaches.

1 Introduction

Software applications are executed following a sequence of well-defined functionalities available in the application. The exact sequence of functionalities that is executed depends on several factors, but primarily on the input values, the order in which these values are presented to the application, and the task the user is trying to accomplish. The majority of software testing tools that perform functionality testing rely on test cases being generated from one’s experience with the application. The choice of test cases is driven by the experts’ concept of what is important to test, leading to bias in the test cases. In general there are two main approaches to testing software. The first, more accepted in academia, consists of using formal specification to design an application

* This research is funded by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil, under grant number 551761/2005-9.


and then use theorem provers to demonstrate certain properties of the application. The second approach consists of the traditional software engineering models (e.g., waterfall, spiral, prototyping) that have a specific phase for testing, generally occurring after the application has been implemented. Despite all the claims that these less formal techniques are effective, the truth is that current approaches are incapable of testing software appropriately. From a financial point of view, it makes sense to look at software testing with more attention, given that low estimates put the cost of software testing at around 40% of the overall cost of development [14]. What is needed, therefore, is a scientific method that allows testing to be carried out effectively by practitioners without requiring the rigorous approach of formal methods.

Nature shows us examples of intelligence in groups of beings (agents) with limited cognitive capacity, such as ants, bees and termites. This type of intelligence, which can be observed at the group level but is almost nonexistent at the individual level, is called swarm intelligence (SI) [3]. Swarm-intelligent solutions have been shown to be efficient in dealing with distributed problems where the solution space of the problem is large. Pheromone communication is the basis for many algorithms in the swarm intelligence field. In nature, pheromones are also used in threat situations. Agents (bees, ants) release an alarm pheromone that can be detected by other individuals in the surrounding area. The result of the pheromone release is a uniform spread of individuals, leading to a high survival rate.

Natus started as a project looking into the usefulness of nature-inspired computing concepts such as genetic algorithms (GA) and swarm intelligence in functional software testing automation. Natus uses GA to generate test plans and the model of alarm pheromone [10] to execute these plans. In this paper, we introduce Natus and show how an alarm pheromone model is used in the methodology.

2 Related Topics and Works

2.1 Software Testing Automation

Software testing is a main area of research for the software engineering community and is also a subject of independent study in many universities. The process of software development is far from trivial. The complexity of new applications calls for a development process with strict quality control at every phase of development. In recent years, software testing has been considered one of the most important procedures for assuring software quality. A main theme in testing has been the creation of automated tools that are reasonably autonomous, to ensure that software budgets remain within estimates. An automated testing tool can considerably facilitate the work of a tester and, as its most attractive benefit, may make the whole process cheaper by reducing the work that was previously done manually. Research into technologies such as those inspired by artificial intelligence, and their integration into software testing, appears to be the way forward. Artificial intelligence can naturally aid the process of testing by augmenting the ability of software


to discover defects that would not likely be found by experts due to their bias. Experts are usually blind to certain bugs because they attempt to test the unusual functionalities in an application and tend to ignore ordinary ones. It is fair to say that applications will never be free of defects: problems will exist even if all the development phases are carried out correctly and cautiously, and the requirement analysis is done thoroughly. Software testing aims at eliminating the majority of these problems via a systematic process that identifies components of the software prone to errors. The identification of components with defects is not trivial given the size of the search space: all the possible execution sequences of the software. This is an unbounded space; no systematic process can explore the whole search space in a finite amount of time. So, how can we traverse this space in such a way that the sample of test plans executed is uniformly distributed over the search space? This uniformity is necessary because it increases the chance of finding an error in the application. The solution lies in the development of automated tools that avoid the bias of experts, who tend to prioritize test plans that they think are important but which in practice may not be so. The ALARM model proposed in [10] is an approach to traversing search spaces in a uniform manner, and it is an intrinsic part of the Natus methodology described in Section 3.

2.2 Nature-Inspired Approaches in Automated Software Testing

Automated software testing is an area of growing interest, but until recently very few researchers had looked at the possibility of using nature-inspired approaches to improve this area. Even today, most works focus on the use of genetic algorithms, and these works relate to the testing of VLSI circuits [2, 4] rather than software applications.

In the area of testing software applications, research is focused on the automatic generation of test data and not of test plans, as proposed later in this paper. Mantere [9, 1] has done an extensive study on the use of GA in data generation, but mostly at the level of toy problems and simple program subroutines. For embedded systems, Wegener et al. [12] have described how data can be generated for performing structural testing. At the time of conducting this research, we were aware of only one work looking at the automatic generation of test plans using GAs. Kasik and George [7] propose to use GA in the generation of test plans, but their focus is limited to GUI applications. The idea is to have tests that mimic a novice user manipulating the interface.

Natus is focused on functionality testing of Web applications, but the idea could be used in other applications. The concentration on Web applications at this stage is due to the fact that we are moving to a Web-based era where most applications run remotely on the Web. Additionally, Web applications use well-defined protocols suitable for the automatic generation of test plans.


2.3 Ant Colony Optimization

The Ant Colony Optimization (ACO) algorithm was proposed by Dorigo et al. [5]. Their approach to solving combinatorial optimization problems employs lessons learned from the observation of foraging strategies in ant colonies; in particular, ACO makes heavy use of positive and negative feedback as well as indirect communication via the environment (also known as stigmergy). In the real world, ants search for food and return to the nest when they find it, dropping pheromone along the way. These pheromone deposits attract other ants in the colony, biasing their efforts towards the newly located food source (i.e., positive feedback). As several ants search for food, those that find shorter paths to food sources return home faster, allowing their trails to be traversed by more ants in a shorter amount of time than the less efficient paths. The shortest path, therefore, gets the strongest concentration of pheromone, and the ants in the colony are able to find the shortest route to a new food source. To avoid converging onto paths other than the efficient ones, nature employs evaporation of the pheromone (i.e., negative feedback), thereby gradually removing the incentive for ants to visit inefficient (old) paths and improving the chance that an ant will find better paths to food sources. Negative feedback also serves the purpose of making the colony behavior adaptable to environment changes (e.g., elimination of old paths to no-longer-interesting food sources). Therefore, through the interaction of very simple agents, a relatively complex behavior emerges: the ability to find the shortest path to a destination.

2.4 Alarm Pheromone

Pheromones are chemicals used for communication within the same species, enabling the members of the species to perform actions such as food location, mating, aggregation, and warning other individuals, to name but a few. A special kind of pheromone, named alarm pheromone, was discovered in the early 1960s.

There are two primary alarm behaviors: to disperse (flight behavior) or to move towards the source of alarm in an aggressive manner (fight behavior) [6]. It is generally thought that the differentiation between flight and fight behavior depends on the intensity of the alarm pheromone being released. An example of alarm communication is that of Acanthomyops colonies (living exclusively in the USA). These are subterranean ants that are large in size and dense in number, and therefore it could be speculated that they respond to alarm pheromones with the fight reaction because dispersal is unlikely. Workers that are nearby react the fastest and those that are farther away take longer to react. If more pheromone is not released, the signal dies out within minutes [15]. Another example, of flight behavior, can be seen in ants of the species Lasius alienus, which are smaller and nest under rocks or wood, which allows for fast dispersal when disturbed. The component of their alarm substance is the same as that of Acanthomyops, but instead of running to the source, they frantically run around in no particular


direction. They also react to lower concentrations of alarm pheromones than Acanthomyops [15]. The ALARM model [10] deals with the flight behavior only and is used later in this paper to obtain a uniform execution of test plans.

3 The Natus Testing Methodology

Natus started as a study on the usefulness of nature-inspired metaphors such as genetic algorithms and swarm intelligence in software testing automation. Natus is called a methodology because it goes beyond plan generation and execution; it tells us how these are integrated into the development lifecycle. Figure 1 shows the entire process that the methodology proposes. It can be divided into five parts: learning, definition of dependencies, generation of plans (scripts), execution of the plans, and feedback from execution to improve the learning.

It starts with a process of recording (logging) all the functionalities the application contains. These logs are inserted in a database as scripts for the recorded functionalities. It continues with users defining dependencies between these operations or letting the system create them. For instance, an operation may depend on another being executed beforehand, and the tool needs to understand this dependency so that plans are generated correctly. Next, we move to the generation of test plans (groups of recorded operations) using a GA approach. The goodness of a plan is measured according to the inconsistency level to which it can take the application, so a table of such inconsistencies has to be defined either by the user or randomly by the system (for a further description of all these steps the reader should refer to [11]). After the plans are created, the ALARM model (explained later) is used to prioritize plans according to how many times they have been executed before and how good each plan is. The plans are then executed and the inconsistencies table is updated: if a bug is found, the inconsistency is increased; otherwise it is decreased. This allows the further generation and execution of new test plans, thus repeating the cycle.

Note that the use of ALARM is of utmost importance because the process is not limited to just listing the cases and testing from the beginning of the list to the end. Doing so may not give us a diverse set of tested cases (assuming that not all cases can be tested in the time frame available for testing) and works well only if the number of test cases to be executed is small. Basically, a space of test cases generated by the GA can be explored by ALARM agents (ants) by considering that the fewer times a plan has been executed before, the more likely it is to be executed now. The fitness of each plan can be used to initialize the environment so that fitter plans are prioritized (using food, as explained later in the paper), thus allowing the number of executions to be taken into consideration. In a process opposite to ant foraging and inspired by alarm pheromones [8], the ants can make locations in the environment (test plans) less desirable as they select and execute them. This approach makes the ants more inclined to choose tests that have not yet been executed. Also, one needs to understand that important test plans (according to the fitness value) may be executed and indeed cause problems in the application. Such a test remains open for re-test and should continue to be considered according to its


Fig. 1. The flow of the Natus testing methodology that utilizes the GA and SI approaches.

fitness value (or even with an increased fitness value). Positive feedback can be used to reinforce the fitness value of such test cases, thus attracting more ants to explore that case again. If the test does not cause any problem, its attractiveness diminishes. In the ALARM model this is controlled via the mechanism of pheromones and food in the environment.
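The feedback just described can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the Natus implementation: the `PlanCell` class, the `record_execution` function and the `food_bonus` parameter are hypothetical names, and the default values (20 units of food, four units of pheromone per visit, a floor of one unit) are borrowed from the simulation settings reported in Section 5.

```python
from dataclasses import dataclass


@dataclass
class PlanCell:
    """One test plan seen as a cell of the environment (hypothetical structure)."""
    food: float = 20.0       # attractiveness of the plan
    pheromone: float = 1.0   # repulsiveness; grows with every execution
    executions: int = 0


def record_execution(cell, failed, deposit=4.0, food_bonus=5.0, floor=1.0):
    """Update a cell after its test plan has been executed.

    Every execution deposits alarm pheromone, which makes the plan less
    attractive. A failing plan receives extra food (assumed bonus) so that
    it is likely to be re-tested; a passing plan loses one unit of food.
    """
    cell.executions += 1
    cell.pheromone += deposit
    if failed:
        cell.food += food_bonus                    # positive feedback
    else:
        cell.food = max(floor, cell.food - 1.0)    # attractiveness diminishes
    return cell
```

The size of the food bonus for a failing plan is a free design choice in this sketch; the text only states that such plans should remain attractive, possibly with an increased fitness value.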

4 The Alarm Pheromone Model

Given that Vieira et al. [11] describe how GA can be used in the generation of test plans, we focus here on the description and evaluation of the second part of the


methodology, which involves the use of an alarm pheromone model. Ant Colony Optimization (ACO) has been extensively used as a solution to problems where agents (ants) are required to converge to a path (or a few paths). In this model, ants positively reinforce the path they choose with pheromones in order to attract other ants. Better paths tend to be reinforced more, therefore attracting more ants. To avoid early convergence, the algorithm uses the concept of negative feedback, implemented via evaporation of the pheromone.

If we abstract the search space of test plans as a lattice where ants (test-plan executing agents) walk in search of plans to execute, we require an opposite view of ACO. Our model represents scenarios where ants respond negatively (repulsion) to the existence of pheromones in the locations surrounding their current location. The more pheromone present in a cell, the more the ant is repulsed by it. In terms of the Natus methodology, one could say that these cells represent plans that have already passed the tests and should have less priority in future executions. It is also important to point out that we envision several ants dealing with the same search space, so the pheromone left by one ant influences other agents.

In the proposed model we use the amount of food present in the cells (φ). This food acts as an attractor to an ant and is balanced by the repulsiveness of pheromones (τ); the food exists to counterbalance the pheromones. As mentioned above, if a test plan fails we would like this test plan to be given priority in future executions (even though it has been executed already) to test whether the problem has been fixed. Food can be used to make agents attracted to the cell (test plan) in spite of the pheromone at that location.

$$P_{ij}(t) = \frac{[\phi_{ij}(t)]^{\beta}}{[\tau_{ij}(t)]^{\alpha} \times \sum_{k \in N} [\phi_{ik}(t)]^{\beta}} \tag{1}$$

Equation 1 is the transition rule in the ALARM model and the core of the execution process. Pij represents the chance that a test plan j will be executed after the execution of plan i, where j is a plan in the neighborhood (N) of i (according to the lattice used). The values of α and β control the degree of importance given to the amount of pheromone (number of executions) and food (importance of the plan), respectively. The negative feedback via evaporation requires an update rule for all cells. The update rule is exactly as in the ACO model. More details about the ALARM model can be found in [10].

In our simulations in Section 5 an equal distribution of food in the environment is assumed: 20 units of food in each cell of the environment. If the distribution is unequal, the coverage will not be perfect in the beginning, as it will represent the inequalities of the food distribution (pheromone). As mentioned before, this approach can be used to counterbalance the pheromone. In the Natus methodology, extra amounts of food can be deposited in the environment to bias the agents to re-execute a test plan represented by a specific location in the environment. Each time a unit of food is consumed (that is, each time the location holding the food is visited), an amount of alarm pheromone is deposited in the cell.
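To make the transition rule concrete, the following Python sketch evaluates Equation 1 for the neighbors of the current cell and then draws the next plan. It is a minimal sketch, not the authors' code: the function names `transition_scores` and `choose_neighbour` are hypothetical, and sampling proportionally to the scores (which normalizes them implicitly) is an assumption of this sketch, since the text does not say how the values are turned into a random choice.

```python
import random


def transition_scores(food, pheromone, alpha=1.0, beta=1.0):
    """Evaluate Equation 1 for every neighbour k of the current cell i.

    food[k] plays the role of phi_ik and pheromone[k] of tau_ik.
    Both must be strictly positive (the paper keeps a minimum of one).
    """
    total_food = sum(f ** beta for f in food)
    return [(f ** beta) / ((t ** alpha) * total_food)
            for f, t in zip(food, pheromone)]


def choose_neighbour(food, pheromone, alpha=1.0, beta=1.0, rng=random):
    """Pick the index of the next cell with probability proportional to its score.

    Assumption of this sketch: random.choices treats the scores as relative
    weights, so they do not need to sum to one.
    """
    scores = transition_scores(food, pheromone, alpha, beta)
    return rng.choices(range(len(scores)), weights=scores, k=1)[0]
```

For example, with two neighbors holding 20 units of food each and pheromone levels of 1 and 4, the scores are 0.5 and 0.125, so the less-visited cell is four times as likely to be chosen.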


5 Experimental Results

In order to generate our results and visualize the behavior of our alarm pheromone model, we have implemented a visual simulator in NetLogo [13]. The simulator contains a 2D lattice where agents are randomly placed. Parameters such as the number of pheromone units used by each agent, the evaporation rate of the pheromone, the amount of food in the environment, the quantity of food units consumed per visit, and the control variables α and β can all be controlled in this simulator. Another important aspect of this simulator is the ability to show results related to the coverage of the environment in real time: a panel is available to show the standard deviation of the number of visits to the cells of the environment.

5.1 Performance Evaluation of ALARM

We have tested scenarios where ants placed in the environment behave in two different ways: random walk and ALARM. To assess the uniformity of coverage we use the standard deviation of how many times each cell is visited by an ant. Lower standard deviations represent a more uniform distribution of visits in the environment. Note that the standard deviation is a global characteristic of the environment. In the environment, each cell represents one test plan previously generated using GA.
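The uniformity measure can be computed as in the short Python sketch below. The function name `coverage_uniformity` is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not say which variant the simulator reports.

```python
from statistics import pstdev


def coverage_uniformity(visit_counts):
    """Standard deviation of per-cell visit counts over the whole grid.

    visit_counts is a list of rows, one counter per cell.
    Lower values mean a more uniform coverage of the environment.
    """
    flat = [count for row in visit_counts for count in row]
    return pstdev(flat)
```

A grid in which every cell has been visited the same number of times yields 0, the ideal value for this metric.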

Fig. 2. Comparison between random walk and the execution of test plans in Natus.

In all scenarios tested, the environments are 2D toroidal grids in which each cell has 8 neighbors. For Equation 1 we have used α = 1 and β = 1. At each step of the simulation (in which all ants perform one move) an evaporation of 5% is globally applied to all cells of the grid. Each cell is initialized with 20 units of food, which are consumed one by one as the ants walk to that location. Hence in our simulator we are giving each cell (test plan) the same importance level. The minimum amount of pheromone per cell is one, and the same is true for the food; otherwise the probability of choosing a cell could drop to zero. Every time an ant visits a cell it consumes one unit of food and leaves four units of pheromone. Since the amount of food can represent the importance of a test plan to be executed, we assume in the simulator that errors found during the execution of a test plan are fixed immediately; therefore the amount of food is always


decreasing: the less food a cell has, the more certain we are that the plan has passed and has not caused any problems in the application.

The results shown in Figure 2 are averages of ten runs for each experiment. We ran two types of experiments: scalability in relation to the number of plans (Figure 2, left) and in relation to the number of agents (ants) executing the plans (Figure 2, right). For the first experiment we released 50 ants in environments of sizes 50 × 50, 60 × 60, 70 × 70, 80 × 80, 90 × 90, and 100 × 100. At the end of each simulation (all with the same number of execution steps) we use the counter in each cell (the number of times an ant passed through the location) and calculate the standard deviation for the entire environment. The second experiment examined the behavior of the models when the size of the environment is fixed but the number of ants varies. We used a 50 × 50 grid with 30, 40, 50, 60, 70, and 80 ants respectively. The results were compared with random walk, and in both experiments ALARM performed significantly better.
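The experimental setup can be approximated with the Python sketch below. It is not the NetLogo simulator used for the reported results: the function names `simulate` and `visit_std`, the default grid size, number of ants and number of steps, the sequential order in which ants move within a step, and the use of the population standard deviation are all assumptions of this sketch. The remaining parameters (8 neighbors on a toroidal grid, α = β = 1, 5% evaporation, 20 units of food per cell, four units of pheromone per visit, a minimum of one for food and pheromone) follow the description above.

```python
import random
from statistics import pstdev

# Offsets of the 8 neighbours of a cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def simulate(size=20, ants=10, steps=500, use_alarm=True, alpha=1.0, beta=1.0,
             evaporation=0.05, deposit=4.0, initial_food=20.0, seed=0):
    """Run one simulation and return the grid of per-cell visit counts."""
    rng = random.Random(seed)
    food = [[initial_food] * size for _ in range(size)]
    pher = [[1.0] * size for _ in range(size)]
    visits = [[0] * size for _ in range(size)]
    positions = [(rng.randrange(size), rng.randrange(size)) for _ in range(ants)]

    for _ in range(steps):
        for a, (r, c) in enumerate(positions):
            # Toroidal grid: indices wrap around modulo the grid size.
            cells = [((r + dr) % size, (c + dc) % size) for dr, dc in NEIGHBOURS]
            if use_alarm:
                # Equation 1 up to a constant factor: the sum over the
                # neighbourhood is the same for every candidate cell, so it
                # cancels when the weights are normalised for sampling.
                weights = [food[i][j] ** beta / pher[i][j] ** alpha
                           for i, j in cells]
                i, j = rng.choices(cells, weights=weights, k=1)[0]
                food[i][j] = max(1.0, food[i][j] - 1.0)   # consume one unit
                pher[i][j] += deposit                      # leave alarm pheromone
            else:
                i, j = rng.choice(cells)                   # plain random walk
            visits[i][j] += 1
            positions[a] = (i, j)
        if use_alarm:
            # Global evaporation, never dropping below one unit per cell.
            for row in pher:
                for j in range(size):
                    row[j] = max(1.0, row[j] * (1.0 - evaporation))
    return visits


def visit_std(visits):
    """Population standard deviation of the visit counters (assumed variant)."""
    return pstdev([v for row in visits for v in row])


if __name__ == "__main__":
    print("random walk:", visit_std(simulate(use_alarm=False)))
    print("ALARM      :", visit_std(simulate(use_alarm=True)))
```

Running the script prints one standard deviation per strategy for a single seed; reproducing the averages of Figure 2 would require repeating the runs over several seeds and over the grid sizes and ant counts listed above.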

Fig. 3. Snapshots taken at the end of the execution of the simulator for each approach: (a) Random Walk, (b) ALARM Model. The shades of gray represent an approximation of the number of times a test case was executed. Darker areas represent more executions.

Figure 3 allows us to visualize the status of the environment at the end of the simulation. One can clearly see that the random walk model does not provide a very uniform coverage of the environment: clear patches are formed. On the other hand, the coverage produced by ALARM is very uniform, with no obvious patch formation.

6 Conclusion

This paper described a methodology for software testing called Natus and concentrated on the evaluation of the test plan executions driven by the ALARM model. We demonstrated the uniformity of our model with experiments that measure the standard deviation of the number of times a cell is visited, and compared it with a well-understood baseline: random walk.

Natus allows for the elimination of the bias brought by testing experts when choosing test plans for the applications they work with. Our approach works in unison with a test plan generation module based on GA, where the result of the execution of the test plans using ALARM can be used by the GA to generate other (more appropriate) test plans. Natus is the first full methodology for software testing automation inspired in its entirety by natural algorithms.

References

1. J. T. Alander and T. Mantere. Automatic software testing by genetic algorithm optimization, a case study, June 02 2005.
2. J. Aylor, J. Cohoon, E. Feldhousen, and B. Johnson. Compacting randomly generated test sets. In Proceedings of the IEEE International Conference on Computer Design, pages 153–156. IEEE Press, Sept. 1990.
3. E. Bonabeau, M. Dorigo, and G. Theraulaz. Swarm Intelligence: From Natural to Artificial Systems. Santa Fe Institute Studies in the Sciences of Complexity Series. Oxford Press, July 1999.
4. F. Corno, P. Prinetto, M. Rebaudengo, and M. S. Reorda. GATTO: A genetic algorithm for automatic test pattern generation for large synchronous sequential circuits. IEEE Trans. on CAD of Integrated Circuits and Systems, 15(8):991–1000, 1996.
5. M. Dorigo, V. Maniezzo, and A. Colorni. The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, 26(1):29–41, 1996.
6. W. O. H. Hughes, P. E. Howse, E. F. Vilela, and D. Goulson. The response of grass-cutting ants to natural and synthetic versions of their alarm pheromone. Physiological Entomology, 26(2):165–172, 2001.
7. D. J. Kasik and H. G. George. Toward automatic generation of novice user test scripts. In CHI ’96: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 244–251, New York, NY, USA, 1996. ACM Press.
8. C. Lloyd. The alarm pheromones of social insects: A review. Technical report, Colorado State University, 2003.
9. T. Mantere. Automatic Software Testing by Genetic Algorithms. PhD thesis, University of Vaasa, May 02 2003.
10. R. Menezes, F. Martins, F. E. Vieira, R. Silva, and M. Braga. A model for terrain coverage inspired by ant’s alarm pheromone. In SAC ’07: Proceedings of the 2007 ACM Symposium on Applied Computing, pages 728–732, New York, NY, USA, 2007. ACM Press.
11. F. E. Vieira, R. Silva, F. Martins, R. Menezes, and M. Braga. Using genetic algorithms to generate test plans for functionality testing. In Proceedings of the 44th ACM Southeast Conference, Melbourne, Florida, USA, March 2006. ACM, ACM Press.
12. J. Wegener, K. Buhr, and H. Pohlheim. Automatic test data generation for structural testing of embedded software systems by evolutionary testing. In Proceedings of the 2002 Genetic and Evolutionary Computation Conference, pages 1233–1240, New York, 9-13 July 2002.
13. U. Wilensky. NetLogo. Technical report, Center for Connected Learning and Computer-Based Modeling, Northwestern University, 1999. http://ccl.northwestern.edu/netlogo.
14. C. E. Williams. Software testing and UML. In Proceedings of the 10th International Symposium on Software Reliability Engineering, Boca Raton, Florida, Nov. 1999. IEEE Press.
15. E. O. Wilson. The Insect Societies. Harvard University Press, 1971.
