A Practical and Complete Algorithm for Testing Real-Time Systems

Rachel Cardell-Oliver and Tim Glover
Department of Computer Science, University of Essex, Colchester CO4 3SQ, United Kingdom
Tel. +44-1206-873586 Fax. +44-1206-872788
fcardr,
[email protected]
Abstract. This paper presents a formal method for generating conformance tests for real-time systems. Our algorithm is complete in that, under a test hypothesis, if the system being tested passes every test generated then the tested system is bisimilar to its specification. Because the test algorithm has exponential worst-case complexity and finite state automata models of real-time systems are typically very large, a judicious choice of model is critical for the successful testing of real-time systems. Developing such a model and demonstrating its effectiveness are the main contributions of this paper.

Keywords: real-time systems, black-box testing, timed automata
1 Introduction

An idealistic description of a formal development method from requirements to an implementation in hardware and software runs as follows. Construct successive refinements of the requirements until a detailed design is produced, verifying formally at each stage that the refinement satisfies the requirements imposed by its predecessor. Finally, implement the design in hardware and software. Each of the stages from requirements to design is a textual description, and formal verification can be carried out by symbol manipulation. However, the final implementation is not a description, and its properties can only be determined by performing experiments upon it. Thus the last stage in any formal development is a formal test method which ensures that an implementation meets its specification.

A formal test method has four stages: checking the test hypothesis, generating test cases, running the tests and evaluating test results. The test hypothesis defines the assumptions necessary to draw conclusions about the correctness of the implementation from the evidence of test results. For example, our test hypothesis requires that the system under test can be viewed, at some level of abstraction, as a deterministic finite state automaton. A test case is an experiment to be performed on the implementation, resulting in a pass or fail result. The test generation algorithm constructs a set of test cases from a formal specification. A test method is complete if its test generation algorithm determines a finite set of test cases sufficient to demonstrate that the formal model of the implementation under test is equivalent to its specification. For the method to be practical the number of tests generated must not be too large.

Previously proposed theories for testing reactive systems [8, 9, 11] are incomplete in this sense because the set of test cases required is infinite, and so the outcome of testing a finite number of cases only approximates a complete test. Springintveld et al. present a complete test generation algorithm for dense real-time automata based on [3, 6], but acknowledge that the algorithm is not practical [10]. Indeed, all the test methods cited have a worst-case exponential complexity. The main contribution of this paper is the development of a complete test method which is also practical.

The advantage of a complete test method is that results from test and verification can be combined seamlessly, and the assumptions under which this is done are explicit in the test hypothesis. The test justification theorem states that, IF the specification and implementation satisfy the test hypothesis THEN the implementation passes every test generated for the specification if and only if the implementation is equivalent to the specification. A proof of the test justification theorem for the testing theory developed in this paper may be found in [2].

The paper is organised as follows. In Section 2 we define our specification language and a model of timed action automata for testing real-time systems. Section 3 presents our adaptation for real-time systems of Chow's test generation algorithm for Mealy automata [3, 6]. Test generation examples are presented in Section 4.
2 Specification Language and Test Model

In this section we recall the real-time specification language of timed transition systems (TTSs) [5, 1] and define timed action automata to model TTS computations for testing. It is straightforward to adapt our test method for different real-time specification languages, including those with synchronous communication such as timed CSP, but that is beyond the scope of this paper.

Specifications will be given by two explicit TTS processes: a system process, $S$, and its environment, $E$. The implementation under test is represented by an unknown TTS process $I$. The aim of testing is to show that, at a suitable level of abstraction, $I$ corresponds to $S$ when executed in the same environment $E$. We write $S \parallel E$ for the specification and $I \parallel E$ for the system to be tested. It is a notable feature of our test method that the environment itself is described by a TTS process, and that correctness only has to be demonstrated with respect to this environment. This allows a large process to be decomposed into manageable pieces without requiring the behaviour of each component to be defined in arbitrary contexts.

A TTS process consists of a collection of variables, each with an associated range of values. Each variable is tagged as public or hidden. All environment variables are public since the tester controls the environment for all tests. The public variables of $S$ (and $I$) correspond to those whose values the tester can observe. A process state is a binding of a legal value to each process variable. An action is a list of variable bindings. The idea is that the actions of the process represent (multiple) assignments of values to variables. Thus an action can be seen as a relation between states.

A timed transition system (TTS) specification consists of a unique initial state and a set of timed transition rules $t$, each of the form

    If $C_t$ Then $B_t$ Between $L_t$ And $U_t$
where $C_t$ is a guard (a state predicate), $B_t$ is an action, and $U_t$ and $L_t$ are upper and lower time bounds respectively. Each rule defines a possible timed state transition from any state meeting the guard conditions. Each rule belongs to either the environment or the system process, and the corresponding actions are regarded as inputs and outputs respectively. Associated with each transition rule is a special clock variable which takes discrete time values. The requirement that an action take place between the upper and lower time bounds is a guard condition on the associated clock.

The computations of a TTS are defined in terms of states and labelled transitions. Whenever an action can be inferred between two states, there will be a labelled transition between them. Labels $\delta/\alpha/\tau$ denote an action $\alpha$ occurring after discrete time delay $\delta$. The tag $\tau$ is $i$ for an input action or $o$ for an output action. Each computation of a TTS is an infinite sequence of states and labels

    $\sigma_0 \rightarrow \delta_0/\alpha_0/\tau_0 \rightarrow \sigma_1 \rightarrow \delta_1/\alpha_1/\tau_1 \rightarrow \sigma_2 \rightarrow \cdots$

In any state $\sigma$ the clock variable associated with transition rule $t$ has the value $c_t$. The rule is said to be enabled on $\sigma$ if its guard $C_t$ is true in that state. The rule must become disabled after it is taken, that is, after the assignment bindings $B_t$ are applied to $\sigma$. We say state $\sigma'$ is a $(\delta/\alpha/\tau)$-successor of state $\sigma$ iff there is a timed transition rule $t$ such that

- $t$ is enabled on $\sigma$;
- $\sigma'$ is obtained by applying the bindings $B_t$ to $\sigma$;
- $\tau = i$ if $t$ is a transition rule of environment $E$ and $\tau = o$ if $t$ is a rule of the system process $S$;
- the time step $\delta$ must be chosen so that no enabled rule's upper time bound is exceeded: $c_s + \delta \le U_s$ for all rules $s$ enabled in $\sigma$;
- the clock variables of each transition rule are set in $\sigma'$ as follows: the clocks of all rules (including $t$) which are disabled in $\sigma'$ are set to $-1$; the clocks of all rules disabled in $\sigma$ but enabled in $\sigma'$ are set to $0$; the clocks of all rules enabled in both $\sigma$ and $\sigma'$ are incremented by the delay $\delta$.
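The successor conditions can be illustrated with a minimal Python sketch. Everything named here (`Rule`, `initial_clocks`, `successors`, the rule names `r1` to `r5` and `env`) is hypothetical and is not taken from the authors' tool. The sketch assumes finite integer time bounds, treats the lower bound as a guard on the rule's clock, and uses the traffic light rules of Section 4 with an environment that makes requests between 1 and 10 time units apart.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Rule:
    name: str                      # hypothetical label for the rule
    guard: Callable[[dict], bool]  # state predicate C_t
    bindings: Dict[str, object]    # action B_t
    lower: int                     # lower time bound L_t
    upper: int                     # upper time bound U_t (assumed finite)
    tag: str                       # "i" for environment rules, "o" for system rules


def initial_clocks(rules, state):
    """Clocks start at 0 for enabled rules and at -1 for disabled ones."""
    return {r.name: (0 if r.guard(state) else -1) for r in rules}


def successors(rules, state, clocks):
    """Return every (delay, action, tag, new_state, new_clocks) successor."""
    enabled = [r for r in rules if r.guard(state)]
    if not enabled:
        return []
    # No enabled rule's upper bound may be exceeded by the time step.
    max_delay = min(r.upper - clocks[r.name] for r in enabled)
    result = []
    for t in enabled:
        # The lower bound is a guard on the clock of the rule being taken.
        earliest = max(0, t.lower - clocks[t.name])
        for delay in range(earliest, max_delay + 1):
            new_state = {**state, **t.bindings}
            new_clocks = {}
            for r in rules:
                if not r.guard(new_state):
                    new_clocks[r.name] = -1                 # disabled afterwards
                elif not r.guard(state):
                    new_clocks[r.name] = 0                  # newly enabled
                else:
                    new_clocks[r.name] = clocks[r.name] + delay
            result.append((delay, t.bindings, t.tag, new_state, new_clocks))
    return result


# Traffic light controller of Section 4 plus a request-making environment.
RULES = [
    Rule("r1", lambda s: s["req"] and s["cloc"] == 0, {"req": False, "cloc": 1}, 0, 0, "o"),
    Rule("r2", lambda s: s["cloc"] == 1, {"light": "green", "cloc": 2}, 1, 1, "o"),
    Rule("r3", lambda s: s["cloc"] == 2, {"cloc": 3}, 5, 5, "o"),
    Rule("r4", lambda s: s["cloc"] == 3 and s["req"], {"req": False, "cloc": 2}, 0, 0, "o"),
    Rule("r5", lambda s: s["cloc"] == 3 and not s["req"], {"cloc": 0, "light": "red"}, 1, 1, "o"),
    Rule("env", lambda s: not s["req"], {"req": True}, 1, 10, "i"),
]
INIT = {"req": False, "light": "red", "cloc": 0}
```

Calling `successors(RULES, INIT, initial_clocks(RULES, INIT))` enumerates the request input at each permitted delay. The sketch does not check the requirement that a rule becomes disabled once it is taken; a specification violating that requirement would need an extra check.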
Each specification action may change both public and hidden variables. Actions that affect only hidden variables are not observable directly, and can be thought of as silent actions. Since we assume implementations to be deterministic, these silent actions can simply be elided from the model. This is the motivation behind the following definitions.

The function seen returns the public variable bindings of any action. If there are no public variables changed in output action $\alpha$ then the action is silent and $seen(\alpha) = \varepsilon$, the null action. Otherwise, $\alpha$ is observable. State $\sigma_n$ is a $(\Delta/A/V)$-successor of state $\sigma_0$ iff there exist states $\sigma_1, \ldots, \sigma_{n-1}$ and labels $\delta_0/\alpha_0/\tau_0, \ldots, \delta_{n-1}/\alpha_{n-1}/\tau_{n-1}$ such that

- each $\sigma_{i+1}$ is a $(\delta_i/\alpha_i/\tau_i)$-successor of $\sigma_i$
- all but the last action are silent: $seen(\alpha_i) = \varepsilon$ for $i : 0..n-2$
- the final action is an observable one: $A = \alpha_{n-1} \neq \varepsilon$
- $\Delta = \sum_{i=0}^{n-1} \delta_i$
- $V = \tau_{n-1}$

The $(\Delta/A/V)$-successors of the initial state binding for a timed transition system determine a labelled transition system (LTS). For testing purposes we require that the LTS be finite, and we call the resulting labelled graph of states a timed action automaton. The timed action automaton for a TTS, $P \parallel E$, is denoted $A(P \parallel E)$.

We can now define the test hypothesis. A specification $S$, an implementation $I$ and environment $E$ satisfy the test hypothesis iff

1. $S \parallel E$ and $I \parallel E$ are timed transition systems whose computations can be characterised, using the $\Delta/A/V$ rules, as (finite state) timed action automata.
2. Any two input actions or two output actions of $A(S \parallel E)$ or $A(I \parallel E)$ must be separated by at least one time unit. This constraint is necessary so that a tester can observe each output action correctly and can offer input actions at the right time. For example, an input could be offered just before an output or just after. We also demand the standard TTS constraint that time must behave reasonably [5]: Zeno specifications are not allowed.
3. $S \parallel E$ must be deadlock and livelock free. This is because it is not possible to verify permanent inactivity by finite experiments. Note however that the implementation may contain deadlocks or livelocks, in which case the fact that $A(I \parallel E)$ is not equivalent to $A(S \parallel E)$ will be detected by the tests generated.
4. $I$ and $S$ are deterministic when executed in environment $E$. If this is not the case then, even if the IUT passes every test, we cannot deduce that the system will always behave correctly. This means that for all timed transition rules of $I$ and $S$ the upper and lower time bounds must be equal and enabling conditions must be mutually exclusive. The environment $E$, however, can be (and usually is) non-deterministic. This is reasonable since the tester can simulate (deterministically) any chosen environment behaviour. In practice, limited freedom about the precise time an action occurs can be provided by allowing a certain tolerance in the times at which outputs are observed [10].
5. After each test case is executed the implementation under test can be reset to its initial state within a finite time.
6. The number of states in $A(I \parallel E)$ does not exceed $n + k$, where $n$ is the number of states in the specification automaton $A(S \parallel E)$. The tester must postulate a suitable $k$, which is usually selected on pragmatic grounds. As $k$ grows the number of tests grows exponentially, and so this is the limiting factor in producing a complete test set. The problem of exponential growth in the number of tests is also present in other formal test methods [9, 11], although for different reasons.
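The way a run of silent steps is folded into one observable step can be sketched as follows. This is an illustrative Python fragment, not the authors' implementation: the names `seen` and `collapse` and the use of an empty dictionary for the null action are assumptions made for the example.

```python
def seen(action, public):
    """Public variable bindings of an action; an empty dict plays the null action."""
    return {var: val for var, val in action.items() if var in public}


def collapse(steps, public):
    """Fold silent steps followed by one observable step into a single label.

    `steps` is a non-empty list of (delay, action, tag) triples. The result is
    (total delay, final action, final tag).
    """
    *silent, last = steps
    if any(seen(action, public) for _, action, _ in silent):
        raise ValueError("only the final action may be observable")
    if not seen(last[1], public):
        raise ValueError("the final action must be observable")
    total_delay = sum(delay for delay, _, _ in steps)
    return total_delay, last[1], last[2]


# Traffic light of Section 4: a hidden move of cloc after 5 time units,
# then the light turning red 1 time unit later.
PUBLIC = {"req", "light"}
STEP = collapse(
    [(5, {"cloc": 3}, "o"), (1, {"cloc": 0, "light": "red"}, "o")],
    PUBLIC,
)
```

Here `STEP` is a single observable output occurring 6 time units after the previous observable action.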
3 Test Generation Algorithm

The test generation algorithm takes the timed action automaton $A(S \parallel E)$ associated with a TTS specification and constructs a finite set of test cases which will test each timed action edge of the automaton. Each test is simply a sequence of timed actions. Input actions are to be offered by the test harness and output actions are to be observed. The test set has the property that if all tests succeed then the implementation and specification automata are trace equivalent, where trace equivalence corresponds to bisimilarity under our test hypothesis.

The N transitions of $A(S \parallel E)$ are labelled t1 to tN. The number of states in $A(S \parallel E)$ is n, and k is the maximum number of extra states which may be used in the automaton $A(I \parallel E)$. The tests will actually be applied in a breadth-first manner, to ensure that errors are detected as soon as possible, but we can generate the test cases in any order; our tool generates them depth first. Trace concatenation mapped over sets is written ++.
Algorithm 1 (Test Generation)

    T := EMPTY;
    for j := 0 to k do
      for i := 1 to N do
        T := T UNION ( {reach(src(ti))} ++ {test(ti)} ++
                       step(dst(ti),j,x) ++ cs(x) ++ {reset} )
Each test case consists of four components. The first part, reach(src(ti)) followed by test(ti), exercises the transition to be tested. The second part, step(dst(ti),j,x), performs a number of transition sequences which allow for the possibility that the implementation may possess extra states. The third part, cs(x), resolves any ambiguity in the final state reached by exercising a set of sequences of transitions which distinguishes the expected state from any other state with which it might be confused. The final timed action, reset, returns the implementation to its initial state within a fixed time in readiness for the next test [10].

The sequence reach(s) can be any one of the possible sequences of timed actions starting in the initial state and finishing in state s, in this case the source state of the transition being tested: src(ti). In the context of real-time systems, the important test parameter is the total time elapsed rather than the number of observations in each test, and so a least-time path rather than a least-edge path to src(ti) is constructed. The timed action label of the transition ti being tested is given by test(ti).

After the sequence {reach(src(ti))} ++ {test(ti)} the implementation should be in state dst(ti). To ensure this is the case, each state of the specification automaton has a set of timed action sequences cs(x) called the characterising set for state x. Its sequences distinguish that state from any of the n-1 other states in the specification automaton. Characterising sequences may be empty but otherwise they must end with a visible output action. Each characterising sequence has a pass or fail result (explained below).

Alternatively, the implementation might go to one of the k extra states for which there is no equivalent in the specification. We ensure that this has not happened by checking that every implementation timed action sequence from that state of length 0 to k matches those of the specification. Furthermore, the final state reached by each such sequence must pass the characterising set tests for that state. Every possible sequence of length j from state s in the graph is given by the set step(s,j,x), where x is the state reached after these j steps and j ranges from 0 to k.

Checking that specification and implementation agree on every sequence of length 0 up to k ensures that the implementation's version of transition ti matches the specification's transition ti. For example, compare a sequence of two inputs followed by an output from state s1 allowed by a specification
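    $s1 \rightarrow \delta_1/\alpha/i \rightarrow s2 \rightarrow \delta_2/\beta/i \rightarrow s3 \rightarrow \delta_3/\gamma/o \rightarrow s4$

with this one of an implementation model which has 2 extra states

    $r1 \rightarrow \delta_1/\alpha/i \rightarrow r102 \rightarrow \delta_2/\beta/i \rightarrow r103 \rightarrow \delta_3/\gamma'/o \rightarrow r4$

That the states s1 and r1 are not equivalent is detected by tests from the set step(s1,2,s3)++cs(s3), which detect that states s3 and r103 produce different outputs $\gamma$ and $\gamma'$.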
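Algorithm 1 can be sketched in Python as follows. This is a hedged illustration rather than the authors' SML tool: the edge representation `(src, label, dst)` with `label = (delay, action, tag)`, the function names, the dictionary `cs` of characterising sequences, and the two-state example automaton are all assumptions. Least-time paths are found with Dijkstra's algorithm, and a state with no entry in `cs` is taken to have the empty characterising sequence.

```python
import heapq
from itertools import count


def reach(edges, init, target):
    """Least-time sequence of labels from the initial state to `target`."""
    tie = count()  # tie-breaker so the heap never compares states or paths
    heap = [(0, next(tie), init, ())]
    done = set()
    while heap:
        time, _, state, path = heapq.heappop(heap)
        if state == target:
            return path
        if state in done:
            continue
        done.add(state)
        for src, label, dst in edges:
            if src == state and dst not in done:
                heapq.heappush(heap, (time + label[0], next(tie), dst, path + (label,)))
    raise ValueError("state %r is unreachable" % (target,))


def step(edges, state, j):
    """Yield (sequence, end_state) for every path of exactly j edges from `state`."""
    if j == 0:
        yield (), state
        return
    for src, label, dst in edges:
        if src == state:
            for seq, end in step(edges, dst, j - 1):
                yield (label,) + seq, end


def generate_tests(edges, init, cs, k):
    """Build the test set T for up to k extra implementation states."""
    tests = set()
    for j in range(k + 1):
        for src, label, dst in edges:
            prefix = reach(edges, init, src) + (label,)
            for seq, end in step(edges, dst, j):
                for char_seq in cs.get(end, [()]):
                    tests.add(prefix + seq + char_seq + ("reset",))
    return tests


# Hypothetical two-state automaton: a request input, then a green output.
REQ = (1, "req:=T", "i")
GREEN = (1, "light:=green", "o")
EDGES = [("s0", REQ, "s1"), ("s1", GREEN, "s0")]
TESTS_K0 = generate_tests(EDGES, "s0", {}, 0)
TESTS_K1 = generate_tests(EDGES, "s0", {}, 1)
```

Because the tests are collected in a set, a sequence produced for several values of j is stored only once.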
Characterising Sequences

The timed action automata we generate from timed transition system specifications contain not only labelled transition information but also information about the values of all public variables in each state. This differs from the finite state automata of Chow's test algorithm, where transition actions can be observed but nothing is known about the state. Using this information, it is possible to simplify the standard algorithms for constructing a characterising set. First, it is not necessary to use a sequence to distinguish two states whose public variable values differ. For example, a state just reached by a "turn-red" action could not be confused with a "green" state. Second, since the environment is entirely under the tester's control, two states which have different environment parts need no further comparison.

Algorithm 2 (Characterising Sequences) For the timed action automaton $M = A(S \parallel E)$,

1. Partition the states which occur in $M$ so that within each partition, every state shares the same values for all public variables. If any of these partitions contains only one element, then that configuration is characterised by its observable sub-part and its characterising sequence is the empty sequence.
2. For the remaining partitions, take any pair of states $\sigma_j, \sigma_k$ from the same partition $p_i$. These states are not distinguished by their observable parts. Construct a (minimal) sequence of timed actions ending in a visible output action: $t_0/\alpha_0/\tau_0, \ldots, t_s/\alpha_s/\tau_s, t_{s+1}/\alpha_{s+1}/o$ such that $t_0/\alpha_0/\tau_0, \ldots, t_s/\alpha_s/\tau_s$ is a possible sequence for both $\sigma_j$ and $\sigma_k$. The final timed output action $t_{s+1}/\alpha_{s+1}/o$ should be possible after the pre-sequence from $\sigma_j$ but fail (see below) after the same sequence from $\sigma_k$, or vice versa. Use this sequence to partition all states in $p_i$. Add the sequence $t_0/\alpha_0/\tau_0, \ldots, t_{s+1}/\alpha_{s+1}/o$ to the characterising sequence set and note for each state whether the check condition is True or False.
3. Repeat step 2 until every pair of configurations is distinguished either by their different observable parts or by their response to one of the characterising sequences.

The same characterising sequence may be used to distinguish different state pairs. For a finite state automaton with $n$ states, $n - 1$ characterising sequences will suffice to distinguish all states [6]. However, since the test for transition ti must be run with each of the characterising sequences for dst(ti), it is important to make characterising sets as small as possible. The results of Section 4 confirm that by using observable differences to characterise state pairs we obtain many fewer than $n - 1$ sequences.
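Step 1 of Algorithm 2 can be sketched as a grouping of states by their public variable values. The Python below is illustrative only: the function names, the dictionary representation of states, and the four example states are assumptions, and the construction of the distinguishing sequences in step 2 is not shown.

```python
from collections import defaultdict


def partition_by_public(states, public):
    """Group state names so that each group agrees on every public variable."""
    groups = defaultdict(list)
    for name, bindings in states.items():
        key = tuple(sorted((var, bindings[var]) for var in public))
        groups[key].append(name)
    return list(groups.values())


def needs_sequences(states, public):
    """Partitions with more than one state still need characterising sequences."""
    return [group for group in partition_by_public(states, public) if len(group) > 1]


# Four hypothetical states of the traffic light; cloc is hidden from the tester.
STATES = {
    "s0": {"req": False, "light": "red", "cloc": 0},
    "s1": {"req": True, "light": "red", "cloc": 0},
    "s2": {"req": False, "light": "red", "cloc": 1},
    "s3": {"req": False, "light": "green", "cloc": 2},
}
PUBLIC_VARS = {"req", "light"}
AMBIGUOUS = needs_sequences(STATES, PUBLIC_VARS)
```

In this example only `s0` and `s2` share all public values, so they form the single partition that requires a characterising sequence; `s1` and `s3` are characterised by the empty sequence.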
Applying the Tests

Each test in the test set is performed on the implementation under test by a test driver. The driver offers each timed input in the test case in sequence and observes that all timed outputs occur at the correct time. The implementation may fail a test by failing to produce any expected output from the test sequence at the expected time or by producing an unexpected output. That is, a system under test fails a test iff for an expected output $\delta/\alpha/o$ we observe any of

1. an early output $\delta'/\alpha/o$ where $\delta' < \delta$
2. an incorrect output action $\delta'/\beta/o$ where $\beta \neq \alpha$ and $\delta' \le \delta$
3. a late or missing output $\delta/\alpha/o$ for $\delta < \delta'$
4. an incorrect characterising sequence check. That is, $t_{s+1}/\alpha_{s+1}/o$, True fails conditions 1 to 3, or $t_{s+1}/\alpha_{s+1}/o$, False does not do so.

If any test from the set of test cases is failed then the implementation under test does not satisfy its specification.
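The failure conditions can be sketched as a small verdict function. This Python fragment is an assumption-laden illustration, not the test driver described in the paper: outputs are modelled as `(delay, action)` pairs, `None` stands for no output having been observed before the driver gave up, and the names `verdict` and `check_characterising` are hypothetical.

```python
def verdict(expected, observed):
    """Compare one expected timed output with what the driver observed."""
    exp_delay, exp_action = expected
    if observed is None:
        return "fail: late or missing output"
    obs_delay, obs_action = observed
    if obs_delay < exp_delay:
        return "fail: early output"
    if obs_delay > exp_delay:
        return "fail: late or missing output"
    if obs_action != exp_action:
        return "fail: incorrect output action"
    return "pass"


def check_characterising(expected, observed, condition):
    """A characterising check passes when the outcome matches its True/False flag."""
    return (verdict(expected, observed) == "pass") == condition


EXPECTED = (1, "light:=green")
RESULTS = [
    verdict(EXPECTED, (1, "light:=green")),
    verdict(EXPECTED, (0, "light:=green")),
    verdict(EXPECTED, (1, "light:=red")),
    verdict(EXPECTED, None),
]
```

A characterising sequence whose check condition is False is expected to end in a failing output, so `check_characterising(EXPECTED, (1, "light:=red"), False)` counts as a successful check.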
4 Examples

We plan to use a variety of case studies to identify a practical methodology for testing real-time systems. In this section we describe our first experiments in this area. We have implemented a tool to generate test cases from timed transition system specifications. The first version of this tool was implemented in Miranda, and the current version in SML. The results presented in this section were generated using the SML tool.

The first example is a traffic light controller for a pelican crossing [5]. The system has two public variables, namely light, which may take the values red and green, and request, which may be True or False. In addition it possesses a hidden location variable cloc. By default the controller keeps the light red, but it guarantees to set the light green within 1 second of a request being made, and to maintain the light green until no further request has been received for some pre-determined time period, taken to be 5 in this example. A specification which meets these requirements is as follows.

    Initially:
      Public req=F; Public light=red; Hidden cloc=0;
    Implementation:
      If req=T and cloc=0 Then req:=F and cloc:=1 Between 0 And 0;
      If cloc=1 Then light:=green and cloc:=2 Between 1 And 1;
      If cloc=2 Then cloc:=3 Between 5 And 5;
      If cloc=3 and req=T Then req:=F and cloc:=2 Between 0 And 0;
      If cloc=3 and req=F Then cloc:=0 and light:=red Between 1 And 1;
The controller operates in an environment which may make requests. In general, requests may be received at any time:

    If req=F Then req:=T Between 1 And infinity;
It turns out that, on the assumption that the implementation has no more states than its specification, no extra interesting behaviour is observed by waiting more than 10 seconds between requests, and the specification of the environment may be modified accordingly:

    If req=F Then req:=T Between 1 And 10;
We place a further restriction on the environment that there is a minimum delay $\gamma$ which can be expected between requests. This can be modelled by the following environment.

    If req=F Then req:=T Between $\gamma$ And 10;
The sizes of the timed action automata generated from this specification for different values of $\gamma$ are shown in figure 1. The interesting point is that the behaviour of the combined system depends critically on the delay between requests. If $\gamma$ takes a value larger than 7 then it is never possible for a request to be received whilst the light is green, and the combined system is degenerate. Between these values there is a complex interaction between the value of $\gamma$ and the controller's time period. Clearly, if the environment is restricted there is a reduction in the number of tests that must be performed. Details of the tests generated are available [2].
    gamma   edges   nodes   tests
    1       47      15      90
    2       41      13      51
    3       37      12      40
    4       33      11      31
    5       29      10      24
    6       24      9       17
    7       18      8       14
    8       9       5       6
Fig. 1. Number of test cases for the traffic light under different environments

Recall that a characterising sequence distinguishes a pair of states in a timed action automaton. Frequently the state reached by a test sequence can be determined unambiguously since the values of all public variables are known. In the present example, taking $\gamma$ to be 1, 4 out of the 15 states can be determined unambiguously. The remaining 11 states can be distinguished by just 6 distinct characterising sequences, each consisting of a single observation.

It is clear that the test algorithm described in Section 3 will result in a certain amount of redundancy, since many test sequences will be subsequences of other tests. This is particularly true when extra states are allowed for. In the present case, if we assume k=0 then the 127 test sequences generated can be reduced to 90 by subsuming short sequences into longer ones. If we allow for the possibility of two extra implementation states, the figures are 41,079 and 12,702 respectively. The number of tests required for different values of k are shown below. The tests for k=p include all those for k