Genetic Algorithm Based Software Testing

Jarmo T. Alander, Timo Mantere, and Pekka Turunen
Department of Information Technology and Industrial Economics, University of Vaasa
PO Box 700, FIN-65101 Vaasa, Finland
e-mail: [email protected]

Abstract

In this work we study the possibility of testing software using genetic algorithm search. The idea is to produce test cases in order to find problematic situations such as processing time extremes. The proposed test method comes under the heading of automated dynamic stress testing.

Keywords: genetic algorithms, software engineering, dynamic stress testing

1 Introduction

Real-time software is increasingly applied to products in which failure may have severe consequences, so the requirements for correctness and reliability are rising as well. In very reliable sequential programs, the rate of errors should be less than 10 errors/1000 lines of code to avoid functional failure. Achieving this level is very laborious, because the amount of program testing work grows exponentially with code size. Testing software manually is slow, expensive and demands inventiveness. Automated testing can reduce both the time and the cost of performing tests. Exhaustive test data generation is not possible. The most common way of generating test data is random generation, which is considered weak [11]. For this reason, efforts have been made to optimize test data sets using various methods, for example heuristic methods. In our study we try to identify the situations in which the software has the slowest reaction time. The slowest reaction time is identified by running the software on difficult inputs generated by a genetic algorithm (GA).

As the first step, we have tested our approach with a small sequential program consisting of a set of delay loops. In the next step (in a forthcoming paper), we are going to test more complicated real-time software.

1.1 Related work

The automatic generation of test data using genetic algorithms has been studied by Xanthakis et al. [21], Watkins [20] and Jones et al. [9]. These studies have mostly been based on white-box testing methodology; here we use a black-box technique. Recently there has been growing interest in using GA-based methods to test VLSI circuits [4, 5, 7, 8, 10, 12, 13, 15, 16, 17, 19]. See [2] for further references.

2 Software testing

Dynamic testing techniques execute a program on input data, in contrast to static analysis, which uses the program requirements and design documents for visual review. Dynamic testing can be subdivided into two categories of testing techniques, functional and structural. Functional testing, known as black-box testing, aims to test the code by measuring its output or performance without viewing the statements of code being activated and traversed. In contrast, structural tests are considered white-box or glass-box tests; the actual code of the program is viewed [18]. Combining the approaches makes testing effective [1, 20].

2.1 The automation of testing

Software testing is quite an expensive and time-consuming task. It can be done more efficiently through the automatic generation of test data. Some benefits of automatic testing are [6]:
- testing can be prepared beforehand
- test runs are considerably faster
- test runs can be done during the night shift
- the amount of routine work is reduced
- testing can be done remotely.

Disadvantages include:
- preparing tests is quite hard
- more knowledge is needed than in non-automatic testing.

Because of these disadvantages, the profitability of automatic testing is achieved by repeating the tests for newer versions of the software or for different configurations [6]. Normally, the large number of test cases is the problem, and this is why automatic testing is done. This problem cannot be solved completely by automatic testing, and for this reason we try to use genetic algorithms to generate better test data sets.

2.2 Stress testing

The purpose of stress testing is to identify the peak load conditions under which the system fails. The system is subjected to peak loads for key operational parameters: transaction volume, user load, file activity, error rates or their combinations.

3 Testing environment

In this work we have used ESIM, a test automation environment for embedded software development. ESIM uses a workstation for testing embedded software written in the C programming language. The software is compiled with a C compiler and linked to the ESIM environment library. The user describes the input and output system (the application-specific hardware). ESIM then simulates the I/O system and the operating system of the application, allowing the user to monitor what is happening in each of them [14]. The GA and the tested program run separately in their own ESIM tasks, which communicate with each other through simulated ESIM hardware ports.

The GA sends inputs to the tested black-box program and measures the response time. The response time is the time it takes for the black-box program to perform the operations caused by the input parameters it received. When the black-box program has completed, it sends a response signal. The response time is the fitness value for the GA.
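The search loop itself is a standard generational GA that maximises measured response time. Below is a minimal Python sketch of this scheme; the per-bit cost model (`WEIGHTS`, `fitness`) is a purely hypothetical stand-in for the ESIM response-time measurement, and the operators are generic choices rather than the paper's exact ones.

```python
import random

BITS = 32    # the tested program receives a 32-bit input string
POP = 40     # population size used in the paper
ELITE = 20   # individuals carried over to the next generation

random.seed(1)
# Hypothetical stand-in for the black-box program: each set bit enables
# a delay loop with a fixed cost in milliseconds. In the real setup the
# fitness is the response time measured through ESIM.
WEIGHTS = [random.randint(1, 50) for _ in range(BITS)]

def fitness(bits):
    """Simulated response time of the black-box program for this input."""
    return sum(w for w, b in zip(WEIGHTS, bits) if b)

def crossover(a, b):
    cut = random.randrange(1, BITS)          # one-point crossover
    return a[:cut] + b[cut:]

def mutate(bits, rate=1.0 / BITS):
    return [b ^ (random.random() < rate) for b in bits]  # bit-flip mutation

pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for gen in range(50):
    pop.sort(key=fitness, reverse=True)      # maximise response time
    elite = pop[:ELITE]                      # truncation selection
    pop = elite + [mutate(crossover(random.choice(elite),
                                    random.choice(elite)))
                   for _ in range(POP - ELITE)]

worst_case = max(pop, key=fitness)           # slowest input found
```

With a linear cost model the optimum is trivially the all-ones string; the sketch only illustrates the loop structure: evaluate, keep the slowest elite, breed the rest.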

3.1 A test case

Our first test case is a program consisting of 100 randomly generated slowing-down loops (see ftp.uwasa.fi, directory cs/report97-1, for the generating program). The result is a bell-shaped response time distribution, shown in Fig. 2. The GA feeds the tested program a 32-bit string, which is used to select the slowing-down loops. The population size was 40, of which 20 individuals were always selected for the next generation.
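The actual generating program is available on the ftp server; purely for illustration, here is a small Python sketch of how such a dummy test program could be emitted. The layout, the bit-to-loop mapping (several loops sharing one enabling bit) and the loop bounds are our assumptions, not the real report97-1 generator.

```python
import random

def emit_test_program(n_loops=100, seed=0):
    """Emit C source for a program whose slowing-down loops are enabled
    by bits of a 32-bit input word (hypothetical layout)."""
    rng = random.Random(seed)
    lines = ["#include <stdint.h>",
             "volatile long sink;",
             "void tested(uint32_t input) {",
             "    long i;"]
    for k in range(n_loops):
        bound = rng.randint(1000, 100000)   # random loop length
        bit = k % 32                        # bit that enables this loop
        lines.append(f"    if (input & (1u << {bit}))")
        lines.append(f"        for (i = 0; i < {bound}; i++) sink++;")
    lines.append("}")
    return "\n".join(lines)

src = emit_test_program()
```

Since each bit enables a different subset of loops, the response time varies with the input word, which is exactly what gives the GA something to search over.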

4 Results

Fig. 1 shows how the worst case develops over the generations when testing our simple program. Fig. 2 shows the distribution of execution times. Of all the 2^32 possible inputs, every 289th was tested in order to obtain the figure. The maximum response time is 1292 ms. The response time is somewhat non-deterministic, i.e. the same inputs do not always result in the same response time. Fig. 3 shows how the response times differ with the same input parameters (the worst case found). The same inputs were fed in 3000 times; the average deviation was about 5 ms (time between quartiles).
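Stride sampling of the kind behind the distribution figure can be sketched as follows. The per-bit costs here are invented for illustration, and the demo is scaled down to 16 bits so it runs in well under a second instead of stepping through the full 2^32 space.

```python
from collections import Counter

def sample_distribution(costs, stride):
    """Histogram the response time over every stride-th input bit string.
    costs[i] is the (hypothetical) delay added when bit i is set."""
    nbits = len(costs)
    hist = Counter()
    for x in range(0, 2 ** nbits, stride):
        t = sum(c for i, c in enumerate(costs) if (x >> i) & 1)
        hist[t] += 1
    return hist

# Scaled-down demo: 16 bits instead of 32, and an arbitrary stride.
costs = [(7 * i) % 13 + 1 for i in range(16)]
hist = sample_distribution(costs, stride=17)
```

Because the response is a sum of many roughly independent per-bit contributions, the sampled histogram comes out bell-shaped, matching the shape reported for the real test program.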

5 Conclusion and future work

It seems that a GA could be suitable for software testing, with certain limitations. In white-box testing, problem complexity might cause problems [20]. In black-box testing, the problem might be finding characteristic fitness functions. One possible alternative could be to use the number of warnings from the operating system as a fitness function. For memory-critical software, the amount of memory used could be a useful fitness function.

The next step in our project is to evaluate real embedded software, which is a more complex task. Concurrency, continuous operation and the state behaviour of software may introduce non-determinism into the system, which increases the difficulty of software testing [3]. The function of embedded software depends on the scheduling of parallel processes. There are thousands of different scheduling combinations, so when the same inputs are given, different response times are obtained. This can interfere with the search made by the GA. However, because of the stochastic behaviour of the GA, several test runs are needed in any case, so non-determinism should not be an insurmountable problem.

[Figure 1: Fitness development via generations]

In our simple test case there was already some non-determinism, caused by the operating system. This means that the results should be verified in a real environment, but it also means that the GA might solve the problem in spite of the non-determinism, if it is not too strong.

Because of hardware dependencies, a common solution for validating embedded software is to test it in a simulated or target environment [3]. When software is tested without a target environment, a simulation program is needed. Using simulation software, the functioning of a program can be traced. Tracing and repeatability may also make it easier to find errors when the program is black-box tested by a genetic algorithm. If a simulated environment has been used, results based on execution time are not precise enough; in this case, the results must be verified in the real environment.

[Figure 2: Curve of response times]
This test could also be called a system test because it tests the whole functionality of the program. Further information on this work can be found in our anonymous ftp server (ftp.uwasa.fi) in directory cs/report97-1.

[Figure 3: Curve showing the non-determinism of the worst-case response time]

Acknowledgements

This work is supported by the Finnish Technology Development Center (TEKES) and ABB Corporate Research. The authors also gratefully acknowledge the assistance of Mr Jukka Matila from ABB Transmit, and of Mrs Elizabeth Heap-Talvela and Miss Lilian Grahn for their kind proofreading of the manuscript of this paper.

References

[1] Proceedings of the 3rd Computer Science Forum, Baden (Germany), 1996. ABB.

[2] J. T. Alander. Indexed bibliography of genetic algorithms in electronics and VLSI design and testing. Report 94-1-VLSI, University of Vaasa, Department of Information Technology and Production Economics, 1995. (Available via anonymous ftp at site ftp.uwasa.fi, directory cs/report94-1, file gaVLSIbib.ps.Z.)

[3] A. Auer and J. Korhonen. State testing of embedded software. In EuroStar-95, London (UK), 1995.

[4] J. H. Aylor, J. P. Cohoon, E. L. Feldhousen, and B. W. Johnson. Compacting randomly generated test sets. In Proceedings of the 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors, pages 153-156, Cambridge, MA, 17.-19. Sept. 1990. IEEE Computer Society Press, Los Alamitos, CA.

[5] F. Corno, P. Prinetto, M. Rebaudengo, and M. Sonza Reorda. GATTO: a genetic algorithm for automatic test pattern generation for large synchronous sequential circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits, 15(8):991-1000, Aug. 1996.

[6] J. Eskelinen. Automatic testing of an embedded software. In J. Jokiniemi and A. Lehtola, editors, Real-time and Embedded Systems, Espoo (Finland), 1992. (In Finnish.)

[7] T. Hayashi, H. Kita, and K. Hatayama. A genetic approach to test generation for logic circuits. In Proceedings of the Third Asian Test Symposium, pages 101-106, Nara (Japan), 15.-17. Nov. 1994. IEEE Computer Society Press, Los Alamitos, CA.

[8] M. S. Hsiao, E. M. Rudnick, and J. H. Patel. Automatic test generation using genetically-engineered distinguishing sequences. In Proceedings of the 14th IEEE VLSI Test Symposium, pages 216-223, Princeton, NJ, 28. Apr.-1. May 1996. IEEE Computer Society Press, Los Alamitos, CA.

[9] M. R. Jones, A. Tezuka, and Y. Yamada. Thermal tomographic methods. Kikai Gijutsu Kenkyusho Shoho, 49(1):32-43, Jan. 1995.

[10] T. Lee and I. N. Hajj. Test generation for current testing of bridging faults in CMOS VLSI circuits. In Proceedings of the IEEE 38th Midwest Symposium on Circuits and Systems, pages 326-329, Rio de Janeiro (Brazil), 13.-16. Aug. 1996. IEEE, New York.

[11] G. J. Meyers. The Art of Software Testing. John Wiley & Sons, New York, 1979.

[12] M. J. O'Dare and T. Arslan. Transitional gate delay detection for combinational circuits using a genetic algorithm. Electronics Letters, 32(19):1748-1749, 12. Sept. 1996.

[13] I. Pomeranz and S. M. Reddy. LOCSTEP: A logic simulation based test generation procedure. In Proceedings of the 25th International Symposium on Fault-Tolerant Computing, pages 110-118, Pasadena, CA, 27.-30. June 1995. IEEE, Piscataway, NJ.

[14] Prosoft. ESIM - Testing Environment for Embedded Software, User's Guide Version 2.1 for Windows NT. Prosoft, Oulu (Finland), 1995.

[15] E. M. Rudnick and J. H. Patel. Combining deterministic and genetic approaches for sequential circuit test generation. In Proceedings of the 32nd Design Automation Conference, pages 183-188, San Francisco, CA, 12.-16. June 1995. IEEE, New York.

[16] E. M. Rudnick, J. H. Patel, G. S. Greenstein, and T. M. Niermann. Sequential circuit test generation in a genetic algorithm framework. In Proceedings of the 31st Design Automation Conference, pages 698-704, San Diego, CA, 6.-10. June 1994. IEEE, New York.

[17] D. G. Saab, Y. G. Saab, and J. Abraham. CRIS: A test cultivation program for sequential VLSI circuits. In Proceedings of the International Conference on Computer-Aided Design, pages 216-219, 1992.

[18] I. Sommerville. Software Engineering. Addison-Wesley, New York, 1996.

[19] J. Stefanovic and E. Gramatova. RTL level test generation using genetic algorithm and simulated annealing. In Proceedings of the 2nd Workshop on Hierarchical Test Generation, Duisburg (Germany), 25.-26. Sept. 1995.

[20] A. L. Watkins. The automatic generation of test data using genetic algorithms. In I. M. Marshall, W. B. Samson, and D. G. Edgar-Nevill, editors, Proceedings of the 4th Software Quality Conference, volume 2, pages 300-309, Dundee (UK), 4.-5. July 1995. University of Abertay Dundee, Scotland.

[21] S. Xanthakis, C. Ellis, C. Skourlas, A. L. Gall, S. Katsikas, and K. Karapoulios. Application of genetic algorithms to software testing (Application des algorithmes génétiques au test des logiciels). In Proceedings of the 5th International Conference on Software Engineering, pages 625-636, Toulouse (France), 7.-11. Dec. 1992.