A Tool for Generating Test Cases
Mr A M Beitz & Mr T P Rout
Technical Report: SQI-93-11
A Tool for Generating Test Cases

Andrew M. Beitz
Centre for Information Technology Research, University of Queensland
[email protected]

and

Terence P. Rout
Australian Software Quality Research Institute, Griffith University
[email protected]

ABSTRACT

The development of functional test cases based on the system specification is a key part of software development. Test cases can also be generated from the actual code of the system; this type of testing is known as structural testing. Current software testing practices focus, almost exclusively, on the code, despite the widely acknowledged benefits of testing based on software specifications. As a result, most research and tool development on test automation has concentrated on structural rather than functional testing. This paper presents a tool developed to produce functional test cases based on the system specification. Test cases are generated automatically from a specification that describes the behaviour of the program. The specification language used to describe this behaviour is rigorous, but not strictly formal. This offers a significant advantage over previous tools, which cannot fully describe the behaviour of the program and hence cannot produce accurate test cases from the specification.
1. INTRODUCTION

1.1 The Problem of Testing
Testing is the dynamic execution of a software unit under defined conditions, for the purpose of evaluating the software (Graham, 1990). It is a vital part of software development, and is one of the few methods of assuring software quality in widespread use. Testing is used to provide a measure of confidence that a software product correctly meets the requirements of the user. Executing tests on the software can detect the presence of faults, providing a practical means for assessing software reliability.

A test case is defined as an input/output pair, where an "input" is some stimulus to the module and an "output" is some observable behaviour. A test set is a set of test cases. Since, for any non-trivial software program, there is an infinite number of possible test sets, the problem of testing is the derivation of a finite selection of test sets which optimises the chances of detecting defects in the code under test.

The theory of test data selection proposed by Goodenough & Gerhart (1977) provides a basis for constructing program tests. This theory defines test data selection criteria in terms of properties called validity and reliability, and proves a fundamental theorem stating that successful execution of test data satisfying a valid and reliable selection criterion guarantees the absence of errors in a program. The theorem means that testing can show the absence of errors, but only when the tests are properly selected. The aim in test data selection is therefore to ensure the reliability of the data selection criteria: a successfully executed test is equivalent to a direct proof of correctness if the test satisfies a data selection criterion proved to be valid and reliable. The theorem sets standards for evaluating weaknesses of test data selection criteria, and points to the need for defining sufficient conditions and methods for proving test reliability and validity. The conditions for using the theorem will probably be difficult to satisfy in practice: "The virtue of the approach presented here perhaps lies mainly in the impetus it will give to putting software testing on a more rational basis" (Goodenough & Gerhart, 1977).

The method used to define criteria for the selection of test cases is known as a testing technique. There are two main properties that a technique should meet (Laski, 1989). Firstly, the test cases are expected to have a significant potential for detecting a fault in the program. Secondly, the number of test cases should not exceed an economically acceptable level. The key issue in designing effective test cases is to yield the subset of all possible test cases which, within economic constraints such as time and cost, has the highest probability of detecting the most errors (Myers, 1979).
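As a minimal illustration of these definitions, a test case can be modelled as an input/output pair and a test set as a collection of such pairs. The following Python sketch is purely illustrative, and forms no part of the tool described later in this report.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class TestCase:
        """An input/output pair: a stimulus to the module under test and
        the observable behaviour expected in response."""
        stimulus: str
        expected: str

    # A test set is simply a set of test cases; the testing problem is to
    # choose a finite test set that optimises the chance of detecting defects.
    test_set = {
        TestCase("print report", "report printed"),
        TestCase("print", "invalid syntax"),
    }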
1.2 Functional Testing and Test Oracles
Functional and structural testing are two distinct classes of testing methods (Omar & Mohammed, 1989). Functional testing involves the generation of test cases that are based on the requirements, specifications, and design functions of a program; structural testing makes use of the program structure in designing an adequate test case (Omar & Mohammed, 1991).
Neither functional nor structural testing techniques are sufficient alone to detect all the errors in a program. The two techniques are complementary rather than competing, and neither of them should be ignored (Myers, 1979). Both views are useful, both have limitations, and both target different kinds of bugs; moreover, in some cases functional testing will cover all the structural tests, but never the other way around. The art of testing, in part, is in choosing the optimal balance between functional and structural tests (Beizer, 1990). The recommended procedure (Myers, 1979) is to develop test cases using the functional method and then develop supplementary test cases as necessary using the structural method. The reason for combining the methods is that each contributes a particular set of useful test cases, but neither by itself contributes a thorough set. This strategy will not guarantee that all errors will be found, but it has been found to represent a reasonable compromise, given the economic constraints; a tester must be able to find a sound trade-off between cost and quality.

Some testing theoreticians have approached the problem by attempting to devise testing techniques that guarantee to reveal the presence of faults of given types. An advantage of such techniques, from a practical point of view, is that they provide systematic procedures for selecting test data and for determining when testing has been completed. Unfortunately, these techniques tend to be either unacceptably time-consuming in application (White, 1985) or, if not, the types of faults they reveal are restricted to a very small subset of all possible fault types (Howden, 1982). Good automated tools are therefore sought, to decrease the time consumed by techniques that can reveal a large subset of all possible fault types.

Testing often yields very little trustworthy information about the correctness of a software system. This may be due to the lack of an effective means for determining whether the system has behaved correctly on test execution. The tester of the software will usually only visually examine the outcome of some test to determine whether the system has behaved correctly. If this is done carelessly or incompletely, real knowledge about the system behaviour may be lost; and if the system behaviour cannot be determined accurately, all the expense and effort of testing will be for naught. This problem can be tackled by incorporating oracles into the testing process. A test oracle determines whether a system behaves correctly for a test execution (Richardson, Aha & O'Malley, 1992). All testing approaches depend on the availability of an oracle. Test oracles are a critical component of the testing process; however, their effective use has been neglected. The two important aspects to consider in the theory of test case generation are the selection of data input and the determination of the expected response. Test cases must not only contain the input to exercise the software, but must also provide the corresponding correct output responses to the test inputs. 'Real testing', unlike 'ad hoc' approaches, involves predicting and documenting the outcome before the test is run (Beizer, 1990).
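The oracle idea can be made concrete with a small sketch. The following Python fragment is illustrative only: run_module is a hypothetical stand-in for executing the system under test, and expected outputs are assumed to be recorded as strings before the test is run.

    def oracle(expected: str, actual: str) -> bool:
        """A specification-based test oracle: the expected response is
        documented before the test is run, and the observed output is
        checked mechanically rather than by visual inspection."""
        return actual == expected

    # 'run_module' stands in for an invocation of the system under test.
    def run_module(stimulus: str) -> str:
        return "a retrieved box" if stimulus == "g box" else "invalid syntax"

    print("PASS" if oracle("a retrieved box", run_module("g box")) else "FAIL")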
1.3 Testing Tools
Testing has often been perceived as a tedious activity, yet it is seldom adequately tool-supported (Graham, 1990). Testing tools can decrease the time and effort spent on the testing process. Testing can be performed manually or with help from software tools. Manual testing requires preparing the test cases, running the tests, preparing the expected results from the requirements specification, comparing the test output with the expected results to evaluate the correctness of the software being tested, and recording the errors found for later removal. Tool support can be given to many of these areas (Graham, 1990).

In the past, testing has been primarily a manual operation, and often an inefficient function in many organisations, since it can suffer from human error and is often very time-consuming (Perry, 1983). Manual testing can introduce human errors into the process, and testers often consider testing a dull and mundane activity. Many tasks in testing software could be automated to reduce the burden on humans and to remove possible human errors: human testers, unlike automatic testing methods, can get careless and lazy, and this can reduce the usefulness of their tests.

Stocks & Carrington (1991) believe that methods for deriving test information do not need to be automated, and should not be totally automated. Totally automating the testing process may well be impossible. It is also not desirable, since a human tester can bring much insight to testing, as well as a degree of experience and wisdom in test case selection. In many ways, tests derived by humans are superior to any that could be derived automatically.
The growing demand for the production of quality software products will lead to an increasing interest in, and use of, testing tools. Testing tools are being developed to improve testing effectiveness and productivity. Testing, like program development, generates large amounts of information, necessitates numerous computer executions, and requires coordination and communication between the testers. Testing tools can ease the burden of test production, test execution, general information handling, and communication. Leading companies have already achieved reductions in testing time of up to 70%, and 30% improvements in software development productivity, by using tools (Graham, 1991a).

No matter how much automated testing is done, or how carefully manual testing is done, neither will find all errors in the software. The automated testing process will find only "systematic" errors, i.e. those which can be found by applying a rule-based strategy. Most of the elusive errors are found only by human inspiration and ingenuity. Manual testing can find more errors than current software tools: a software tool can find all occurrences of some types of error, but cannot find all types of error, only the types it is capable of looking for. Most tools currently available offer assistance in detecting only syntactic errors, not semantic errors (Graham, 1990).

Testing methods can therefore benefit from a mixture of automation and human interaction. An ideal testing environment would interact with a human user to derive test cases. Mundane and simple work would be automated, requiring the human user only to review this work and add whatever additional tests are deemed necessary. A set of user-definable test paradigms would be incorporated to aid in the selection of appropriate test data.

2. THE INTELLIGENT TESTING TOOL

2.1 Specification
In this paper we present a description of an Intelligent Testing Tool (ITT). ITT carries out disciplined specification-based testing. The test cases are derived from the specification using the cause-effect graphing technique. Cause-effect graphing (Elmendorf, 1973; Myers, 1979) is a technique that aids in selecting, in a systematic way, a high-yield set of test cases; the cause-effect graph is a formal statement of the specification from which optimal test cases can be selected systematically. The tool presents both input and output specifications for derived test cases, and in this manner functions as a test oracle.

Functional testing in the past has mostly been based on informal specifications; although these provide a natural way of specifying the system, their use has been largely unsuccessful (Myers, 1979). There has been a shift of emphasis towards the use of formal specifications in testing software, in large part because of their systematic role in test data selection. In order for a tool to be used successfully to design test input cases, the testing process itself must be systematic. Richardson, Aha & O'Malley (1992) regard the testing process as typically systematic in its selection of test data, and it is the formal specification that makes testing systematic: in functional testing, the specification supplies the organising information for systematic testing. However, such specifications have not generally been rigorous enough to describe the behaviour of programs in detail. The ITT tool bases functional testing upon a rigorous, but not strictly formal, specification. This has a significant role in building test cases that have a high potential of detecting faults within the system.

A specification therefore provides an excellent means from which to derive test oracles. Test oracles can be derived from specifications and effectively incorporated into the testing process; these specification-based oracles provide the necessary information to verify test execution against the specification. The specification method used for the ITT tool contains an input structure, an output structure, and an input-output relationship: the information necessary to describe the behaviour of the program. The specification consists of a composition part and a relation part. The composition part contains the input and output structure of the program. The operations used in the composition part are:
    +        concatenation
    { }      iteration
    [ | ]    selection
    ( )      option
    =        composition
The relation part contains the input-output relationship (transformation). The operations used in the relation part are:

    =        defined as
    AND      logical "AND"
    OR       logical "OR"
    NOT      logical "NOT"
Each part of the specification contains a set of sentences that define the behaviour of the specification. Sentences are made up of values and of operations applied to those values. The specification recognises two types of values: literal values and abstract values. A literal value is written with quotation marks (" ") around the value and represents an actual value used; an abstract value is written with angle brackets (< >) around the value and represents a broad class of values. The specification technique is described in detail in Beitz (1992).
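To make the notation concrete, the specification for the "GET" command of the case study in Section 2.3 might be represented in memory along the lines of the following Python sketch. This is illustrative only: the concrete syntax of the ITT specification language is not reproduced in this report, and the names used here (spec, "input", "output", "transformation") are hypothetical.

    # Hypothetical in-memory form of the "GET" specification (Section 2.3).
    # Literal values ("...") stand for actual values; abstract values (<...>)
    # stand for broad classes of values.
    spec = {
        "input": {
            # <get> = "g " + <item>             (concatenation)
            "<get>": ("+", ['"g "', "<item>"]),
            # <item> = [ "box" | "triangle" ]   (selection)
            "<item>": ("[|]", ['"box"', '"triangle"']),
        },
        "output": {
            # <response> = [ <retrieved item> | "invalid syntax" ]
            "<response>": ("[|]", ["<retrieved item>", '"invalid syntax"']),
        },
        "transformation": [
            # a correct <get> retrieves the item; anything else is an error
            ("<get>", "<retrieved item>"),
            (("NOT", "<get>"), '"invalid syntax"'),
        ],
    }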
2.2 Development and Use of the Tool
The ITT tool transforms the information in the specification into a cause-effect graph, from which the test cases are produced. Figure 1 shows how this information is extracted from the specification to construct the graph; a sketch of this extraction step is given after the figure. The ITT tool identifies in the specification all the causes (inputs), effects (outputs), relations between causes and effects (the transformation), and constraints, to produce a representation of the cause-effect graph. The input part of the specification identifies the causes, which are selected by taking each class domain and the boundaries of that domain; a domain is selected by taking the lowest domain specified in the specification. The output part of the specification identifies the effects; each value in the output part represents an effect. The transformation part of the specification identifies the relations between the causes and effects identified in the composition part. A cause-effect graph is constructed from the relations identified in the transformation part; this may lead to further relations being defined in the composition part of the specification. The composition part also represents relations, but they are hidden within the specification: the concatenation operation (+) represents a logical AND, and the selection operation ([|]) represents a logical OR.
[Figure 1: Capturing the input, output and the transformation. The INPUT part of the specification yields the causes and constraints, the OUTPUT part yields the effects and masks, and the TRANSFORMATION part yields the graph.]
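Continuing the illustrative Python sketch above, the extraction step depicted in Figure 1 might look as follows. The function name extract_graph is hypothetical, and treating literal values as the lowest-level domains is an assumption made for illustration.

    def extract_graph(spec):
        """Identify the causes (input class domains and their boundaries),
        the effects (each value in the output part), and the relations that
        link them, giving an internal form of the cause-effect graph."""
        causes = set()
        for _, (op, operands) in spec["input"].items():
            # literal operands are taken as the lowest-level domains
            causes.update(v for v in operands if v.startswith('"'))
        effects = set()
        for _, (op, operands) in spec["output"].items():
            effects.update(operands)
        relations = list(spec["transformation"])
        return causes, effects, relations

    causes, effects, relations = extract_graph(spec)
    # causes  -> {'"g "', '"box"', '"triangle"'}
    # effects -> {'<retrieved item>', '"invalid syntax"'}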
There are two reasons why the whole specification is not expressed as a set of relations, with no composition part. The first is to enable the user of the specification to distinguish between the composition part and the relation part, as the two parts have different roles: the user must be able to derive a well-defined specification through the inputs, the outputs, and the transformation on those inputs and outputs. The second reason is that the tool has to be able to identify constraints in the specification, and these are identified only in the composition part. Not all constraints can be identified by the ITT tool in its present stage of development; however, such constraints are rare, and can be identified by the tester when testing.

The derivation of test cases using the cause-effect graphing technique is a difficult process to automate efficiently. The conventional procedure involves tracing each effect back through the graph to find all possible combinations of causes that will set the effect to the present state. However, this is an inefficient process when the number of possible combinations in a graph is considered, and a more efficient process is desired to produce an efficient and yet optimal set of test cases. The approach used by the ITT tool is a heuristic algorithm which applies heuristic rules to the cause-effect graphing technique to produce the test cases.

The heuristic algorithm generates test cases by building templates from the cause-effect graph and then constructing a decision table from these templates. A template contains a single combination, or a number of possible combinations, representing a single test case; the combinations are generated by using the rules of the cause-effect graphing technique. The algorithm traverses the built cause-effect graph to find all possible combinations of test cases using the heuristic rules, and produces a template for each test case found. In cases where not all situations in the graph need to be explored, the template retains the redundant combinations and selection is deferred until the decision table is constructed. When the decision table is constructed, a combination is taken from each template to represent the test case in the table: if the template contains only a single combination then that combination is selected; if there are a number of combinations in the template then the "best" combination is selected. The approach defers selecting the best set until the whole template is built, which is more efficient than trying to select the best combination during the construction of the decision table. The decision table is then further reduced by applying logical rules to minimise it. The heuristic algorithm is defined in more detail in Beitz (1992).
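The template idea can be sketched as follows, continuing the running Python example. This illustrates the deferred-selection strategy described above, not the ITT algorithm itself; the expand function and the "shortest combination" tie-break are simplifying assumptions.

    from itertools import product

    def expand(expr):
        """All cause combinations satisfying a (simplified) relation
        expression: a string is a single cause, ('NOT', c) an absent cause,
        ('OR', ...) yields one combination per branch, and ('AND', ...)
        joins the branches' combinations."""
        if isinstance(expr, str):
            return [[expr]]
        op, *args = expr
        if op == "NOT":
            return [[("NOT", args[0])]]
        branches = [expand(a) for a in args]
        if op == "OR":
            return [combo for branch in branches for combo in branch]
        return [sum(pair, []) for pair in product(*branches)]  # "AND"

    def build_templates(relations):
        """One template per effect: every cause combination that can set
        the effect to the present state.  Choosing among them is deferred."""
        return [(effect, expand(cause_expr)) for cause_expr, effect in relations]

    def build_decision_table(templates):
        """Once all templates exist, pick one combination per template: the
        only one where there is no choice, otherwise the 'best' (here,
        simply the shortest)."""
        return [(min(combos, key=len), effect) for effect, combos in templates]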
The ITT tool produces the test cases from the constructed table. A test case contains the necessary properties of the actual test data used to carry out testing on the program. The test data is worked out by the tester by examining the properties of each test case; this brings human interaction into the process, to help find the more elusive errors in the program under test.

2.3 A Case Study
The intent of this case study is to show the use of the ITT tool on a simple procedure which demonstrates principles applicable to much larger systems. Consider the use of the "GET" procedure, with the syntax shown in Figure 2. The procedure uses the command "G" to retrieve items from a domain. The domain of possible objects comprises a box or a triangle. If the correct syntax is specified then the item is "retrieved"; otherwise an error message is displayed saying "invalid syntax".

The specification for the "GET" command defines the input, the output, and the transformation of the input to the output. The input for the specification contains a "get" value. The "get" value requires the "G" plus an "item". The "item" can be either a "box" or a "triangle". The output produced is the "retrieved" item, or an "invalid syntax" message. The transformation taking place is the item being "retrieved" if the "get" command is used correctly, or an "invalid syntax" message if the command is used incorrectly.

The ITT tool produced five test cases from the given specification using the cause-effect graphing technique. Details of the specific responses from the tool are given in Appendix A. From the test cases produced, the tester can then express each test case as a single set of test data for the program. We could use the following test data from the test cases produced by the "GET" command specification:

    Test Case #    Input         Output
    1              g box         a retrieved box
    2              g triangle    a retrieved triangle
    3                            invalid syntax
    4              g             invalid syntax
    5              box           invalid syntax
The test cases contain all the necessary information to derive the test data. The test data represents the test cases generated from the specification and can now be used to test the program. The test data can also be used for regression testing of the program: it will not change unless the specification for the program changes, and it can therefore be used repeatedly during the debugging of the program. This has the significant advantage of saving cost and time, since the test cases need not be redesigned after each change made to the program.
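Because the test data remains valid until the specification changes, it can be stored and replayed mechanically. A minimal regression harness along these lines (illustrative only; run_get is a hypothetical stand-in for invoking the program under test) might be:

    # Stored test data for the "GET" command, replayed after every change.
    TEST_DATA = [
        ("g box",      "a retrieved box"),
        ("g triangle", "a retrieved triangle"),
        ("",           "invalid syntax"),
        ("g",          "invalid syntax"),
        ("box",        "invalid syntax"),
    ]

    def regress(run_get):
        """Run every stored case through the program and report mismatches."""
        failures = [(s, e) for s, e in TEST_DATA if run_get(s) != e]
        for stimulus, expected in failures:
            print(f"FAIL: input {stimulus!r} did not produce {expected!r}")
        return not failures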
3. A NEW LOOK AT TESTING TOOLS

3.1 The Tool in Context
A classification scheme proposed by Graham (1991b) defines six major classes of testing tools: Test Management, Test Design, Non-Execution Evaluation, Test Execution, Test Analysis, and Test Quality Evaluation. Although all of these categories have ample scope for future tool development, testing of software systems has generally been regarded as those activities to do with the execution of code and the examination of the output of code (Graham, 1990). As a result, current tools populate the Test Execution category most thoroughly (Graham, 1991b). The ITT tool falls into the Test Design category, as a tool for Test Input Generation from Requirements.

Test case generation has usually been done manually (Graham, 1990). However, test input cases can be generated automatically for some types of tests, and such automation is important in decreasing the time and effort of designing them, so long as the process is systematic. Test input can be generated from two sources: from the specification of the system, and from the system code. An example of a commercial input generation tool is TESTGEN (Graham, 1990), which derives test input cases from a reasonably rigorous functional specification of the system. A prototype to derive test cases from a functional specification in narrative form is described by Roper (1990), but is far from being a commercial tool. There are no tools known to the authors which derive input test cases based on the structure of the code.

The ITT tool uses a rigorous specification method to generate the test cases. This has a significant advantage over tools which require the use of very formal specifications. The test cases designed by the ITT tool are aimed at covering all of the desired behaviours of the system, so as to ensure the detection of functional errors within the system. However, cause-effect graphs can only express combinatorial relationships among causes; they cannot express such situations as time delays, iterative functions, and feedback loops from effects to causes. Cause-effect graphing is therefore only useful for transformational systems, not reactive systems. In addition, the tool is not able to identify certain constraints which could reduce the number of test cases produced.

In the absence of a fully formal specification, it is not possible to define criteria which would represent "complete" functional test coverage. No technique so far has been able to produce "all" of the useful test cases that might be identified in a specification. However, cause-effect graphing produces a suite of test cases that is close to optimal; this is recognised, for example, in a draft Standard for Testing (BCS, 1990), which identifies the "functional coverage" achieved by cause-effect graphing as the highest available. The test suite can be further optimised by using other techniques, such as boundary value analysis, during the process of constructing the cause-effect graph.

The general opinion (Myers, 1976) is that cause-effect graphing is worth the time and effort involved, even though it is 'hard work'. Practitioners have reported that cause-effect graphing is a difficult and time-consuming process (Myers, 1976); most of the time and effort spent using the technique goes into constructing the graph and the decision table. Tool support can help to overcome this problem, with the advantage of decreasing the
total time and effort needed for developing test cases. It is felt that the ITT tool fills a gap in the available armoury of testing tools.

3.2 Future Directions
There are a number of areas for further development of the ITT tool. The main future focus is to integrate it with a CASE tool, to assist the analyst and tester in test case selection. Computer-Aided Software Engineering (CASE) tools provide software tool support throughout the whole of software development. However, a major problem with current CASE tools is that they focus only on activities in the first half of the software development life-cycle, and do not adequately address activities in later stages, such as testing (Graham, 1991a). This lack of full support in CASE tools can lead to poor productivity and quality in the developed software (Graham, 1991a). Testing can be integrated into CASE tools by using Computer Aided Software Testing (CAST) tools. Eventually, CAST tools and CASE tools will be integrated through a common repository, so that tests can be designed at the analysis and design level and then linked to all related documents. In this way, effective tools can provide input to test case generators and enable the test requirements to inform the software design and build processes (Graham, 1991b).

It should also be possible for the ITT tool to be integrated into a test management system. Existing commercial tools do not yet address all of the specific problems of the management of the testing process, although many of them do offer support to some aspects of test management (Graham, 1990). Test management activities include configuration management and control of test documentation, input cases, expected output files, test outputs, and discrepancy reports, as well as project management features including planning, estimating, resourcing, scheduling, and monitoring the testing process. No existing tool offers comprehensive help in these areas (Graham, 1990). The ITT tool is capable of extension to address these features in a test management system.

The selection of test cases is a major problem in testing, and the ITT tool still requires improvement in this area: in the specification, in the type of test data to select, and in the support of implementation testing alongside functional testing. Constraints are an important factor in the selection of test cases, as they eliminate impossible test cases. The specification language of the tool is not rigorous enough to detect certain constraints within the specification; these constraints require further environmental considerations, which the tool cannot detect because they are hidden within the specification. The specification language of the tool needs to be improved to detect such constraints. This may require considering a closed-world specification, in which the presence or absence of any other state can be detected.

Specification-based testing depends upon the correctness of the specification. The ITT tool can be further adapted to display the cause-effect graph and to check for incompleteness and inconsistencies in the specification. Displaying the cause-effect graph will aid the user in identifying ambiguity, incompleteness and complexity in the specification. The SoftTest tool (Bender, 1991) checks for ambiguity, incompleteness and complexity in a cause-effect graph; this approach could also be integrated into the ITT tool to provide more complete support for the cause-effect graphing method. The key considerations governing future development
of the tool are the provision of a more integrated environment for testing, and expansion of the range of tools and techniques available to testers.

3.3 Conclusion
Even though much of the work performed in test case generation is systematic, it has usually been treated as a manual and laborious task performed by the tester. Much of this work can be taken away from the tester by using automated testing tools, and these tools can bring savings in the cost and time of project development. Little work has so far been achieved in developing tools for Test Design, and it is an area that needs to be considered in the future development of testing tools. The ITT tool is one such move into the future of software testing tools in Test Design.

We have presented a tool for developing functional test cases from a rigorous, though not formal, specification. This has a significant advantage over tools which require very formal specifications to be used, since less formal specifications can better describe the behaviour of programs. Describing the correct behaviour is important in test case development: test cases are useless if the correct behaviour cannot be determined. Test cases must also be selected appropriately, because of the economic constraints on software development. A good way to select test cases is the cause-effect graphing technique, which the tool uses to select its test cases. The tool will aid the tester by saving the time and cost of developing test cases.
APPENDIX A

TEST CASES FOR THE PROCEDURE get

TEST CASE NUMBER 1
    The following inputs should be present:      "g"  "box"
    The following inputs should not be present:  "triangle"
    The following outputs should be produced:    a retrieved box

TEST CASE NUMBER 2
    The following inputs should be present:      "g"  "triangle"
    The following inputs should not be present:  "box"
    The following outputs should be produced:    a retrieved triangle

TEST CASE NUMBER 3
    The following inputs should be present:      (none)
    The following inputs should not be present:  "g"  "box"  "triangle"
    The following outputs should be produced:    invalid syntax

TEST CASE NUMBER 4
    The following inputs should be present:      "g"
    The following inputs should not be present:  "box"  "triangle"
    The following outputs should be produced:    invalid syntax

TEST CASE NUMBER 5
    The following inputs should be present:      "box"
    The following inputs should not be present:  "g"  "triangle"
    The following outputs should be produced:    invalid syntax
REFERENCES
ABBOTT, J., (1986), Software Testing Techniques, Manchester: NCC.

BCS (1990), A Standard for Software Component Testing, Specialist Group in Software Testing, Graham, D. R. (ed.), Grove Consultants.

BEITZ, A. M., (1992), A Tool for the Development of Test Cases, B.Inf. (Honours) Thesis, School of Computing and Information Technology, Griffith University.

BEIZER, B., (1990), Software Testing Techniques, Van Nostrand Reinhold, pp 1-26, 321-362.

BENDER, R. A., (1991), Requirements-based Testing, QAI Journal, pp 27-32.

ELMENDORF, W. R., (1973), Cause-Effect Graphs in Functional Testing, TR-00.2487, IBM Systems Development Division, Poughkeepsie, N.Y.

GOODENOUGH, J. B. & GERHART, S. L., (1977), Toward a Theory of Test Data Selection, in Current Trends in Programming Methodology, Vol. 2, R. T. Yeh (ed.), Prentice-Hall, pp 44-79.

GRAHAM, D. R., (1990), Software Verification and Testing Tools: Availability and Uptake, in Proceedings SE90, Hall, P. A. V. (ed.), Brighton, pp 335-371.

GRAHAM, D. R., (1991a), Computer Aided Software Testing: The CAST Report, Unicom Seminars, U.K.

GRAHAM, D. R., (1991b), Software Testing Tools: A New Classification Scheme, J Softw Testing Verif Reliab, Vol. 1, No. 2, pp 17-34.

HOWDEN, W. E., (1982), Weak Mutation Testing and Completeness of Test Sets, IEEE Trans Softw Eng, Vol. 8, pp 371-379.

LASKI, J., (1989), Testing in the Program Development Cycle, Softw Eng J, Vol. 4, No. 2, pp 95-106.

MYERS, G. J., (1976), Software Reliability, Wiley-Interscience.

MYERS, G. J., (1979), The Art of Software Testing, Wiley-Interscience.

OMAR, A. A. & MOHAMMED, F. A., (1989), Structural Testing of Programs, ACM SIGSOFT Softw Eng Notes, Vol. 14, No. 2.

OMAR, A. A. & MOHAMMED, F. A., (1991), A Survey of Software Functional Testing Methods, ACM SIGSOFT Softw Eng Notes, Vol. 16, No. 2, pp 75-82.

PERRY, W. E., (1983), A Structured Approach to Systems Testing, Prentice-Hall.

RICHARDSON, D. J., AHA, S. L. & O'MALLEY, T. O., (1992), Specification-based Test Oracles for Reactive Systems, Proceedings Fourteenth International Conference on Software Engineering, Melbourne, pp 105-118.

ROPER, M., (1990), The Automatic Generation of Test Cases, in Testing Large Software Systems, Seminar Series on New Directions in Software Development, Wolverhampton Polytechnic.

STOCKS, P. & CARRINGTON, D., (1991), Deriving Software Test Cases from Formal Specifications, Proceedings Australian Software Engineering Conference, pp 327-340.

WHITE, L. J., (1985), Domain Testing and Several Outstanding Problems in Program Testing, INFOR, Vol. 23, pp 53-68.