Automatic test case generation for critical embedded systems

Guy Durrieu (2), Odile Laurent (1), Christel Seguin (2), Virginie Wiels (2)

(1) Airbus, 316 route de Bayonne, 31000 Toulouse cedex 03, France, [email protected]
(2) ONERA / DTIM, 2 avenue E. Belin, BP 4025, 31055 Toulouse cedex, France, {durrieu,seguin,wiels}@cert.fr

Abstract
This paper presents a research project on the feasibility of automatic test generation from formal specifications in an industrial context. Airbus has used SCADE for several years to specify critical avionics systems. We have experimented with automated test generation tools working from SCADE specifications and defined a coverage criterion adapted to these specifications.

1. Introduction
Airbus has used formal methods for several years to specify avionics systems. Thanks to these techniques, development cycles have been shortened significantly; automatic code generation from formal specifications played an essential role in this improvement. A first research project conducted by Airbus and ONERA showed that specifying avionics systems formally made it possible to use formal proof techniques to validate these systems [LMW01]. SCADE Prover is now a commercial tool for verifying properties of a formal SCADE specification. Static validation techniques can thus be used in an operational way, and this constitutes a first breakthrough with respect to the classical verification and validation process, which was exclusively based on dynamic techniques such as simulation and test. It is not realistic, however, to think that techniques of this kind will be able to replace all tests for critical avionics systems. This industrial fact leads us to look for assistance in the generation of test cases, using automatic test generation from formal specifications.

This paper presents the work done in a project (financed by DPAC, French Program Direction for Civil Aviation) whose objective was to study the feasibility of automatic test generation based on formal techniques by applying them to operational industrial systems. Section 2 describes the current verification and validation means. Section 3 introduces SCADE. Section 4 presents our objectives with respect to these practices, explains the approach and presents the structural coverage criterion it uses. Section 5 presents the two tools we used and summarises the experiments done with them. A conclusion and future work end the paper.

2. Current verification and validation means
Currently, the main verification and validation activities done at Airbus are the following:
- System level simulation: the considered system is validated in a simulated environment. The designer validates the software specification on real time computers providing a panel of commands representing possible pilot actions.
- Aircraft level simulation: several systems are validated in a simulated environment.
- System benches: first tests with real equipment, on a single system.
- Multi-systems benches: at this stage, real equipment exists for the different systems; the goal is to be sure they are correct with respect to their specification.
- General bench: last tests before the tests on the real aircraft; they are done on real equipment and with a good representation of the aircraft environment.

In this project, we were particularly interested in the verification and validation activities done at the detailed specification level (formal SCADE specifications). For this kind of specification, validation activities are possible before system level simulation: tests using the SCADE simulator and proof of properties using SCADE Prover. All the above verification and validation activities require the definition of pertinent test vectors (for simulation activities) and the definition of properties (for proof activities). But defining test cases is not easy, because they have to ensure that all the functions of the system are tested and that dangerous configurations are not forgotten. Civil airworthiness authorities require, for systems of level A, B or C, functional verification and validation activities to ensure correct operation in normal and abnormal conditions; criteria for the termination of these functional V&V activities must also be defined.

To respect these requirements, Airbus defines tests by identifying:
- Equivalence classes: a partitioning of the inputs such that a test of a given class is functionally equivalent to any other test of the same class,
- Singular points,
- Structural coverage criteria, based on the structure of the specifications, chosen as termination criteria.

Currently, functional test cases are described manually in test plans. For each functional test case, the following pieces of information are provided:
- Textual description of the test objective,
- Identification of minimal test cases (that must imperatively be executed),
- For each test objective, the set of chosen scenarios and expected results. Each scenario is described as a sequence of operations together with the associated means.

3. SCADE
The SCADE environment, developed by Esterel Technologies, was defined to assist the development of critical embedded systems. This environment is composed of three tools:
• a graphical editor,
• a simulator,
• a code generator that automatically translates graphical specifications into C code.

The SCADE language is a graphical synchronous data flow specification language. In this language, time is divided into discrete instants defined by a global clock. At instant t the synchronous program reads inputs from the external environment and computes outputs. The synchrony hypothesis states that the computation of the outputs is made at the same instant t. It means that for periodic systems like control/command systems, all outputs are computed at each cycle.

A system is described by several interconnected nodes. A node computes values for its outputs from values of its inputs. A node is defined by a set of equations and assertions. An equation VAR=EXPR specifies that the variable VAR is always equal to EXPR. An assertion assert BOOLEXPR means that the boolean expression BOOLEXPR is assumed to be always true during the execution of the program. Any variable or expression is considered to represent the sequence of values it takes during the whole execution of the system, and Lustre operators operate globally over these sequences. For example, VAR=EXPR means that the sequences of values associated with VAR and EXPR are identical.

Expressions are made of variable identifiers, constants, the usual arithmetic, boolean and conditional operators, and two specific temporal operators: previous (pre) and followed-by (→).
• If E is an expression denoting the sequence (e_0, e_1, e_2, ...), then pre(E) denotes the sequence (nil, e_0, e_1, ...) where nil is an undefined value.
• If E and F are two expressions of the same type respectively denoting the sequences (e_0, e_1, e_2, ...) and (f_0, f_1, f_2, ...), then E → F is an expression denoting the sequence (e_0, f_1, f_2, ...).

An example of specification is given below:

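[Figure: SCADE diagram of the example specification, with inputs x, p1, p2, p3 and p4, intermediate flows z1 to z5, a comparison (>= 2), a confirmation over 8 cycles, a latch (ports S, R, Q) and output z6.]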

This SCADE specification computes a boolean output z6 from five inputs: x (real) and p1 to p4 (booleans). z1 is true if x is greater than or equal to 2; z2 is z1 or p1; z3 is true if z2 is confirmed during at least 8 cycles; z4 is z3 or p2; z5 is the result of a latch; finally, z6 is z5 or p4.
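The behaviour described above can be sketched as a cycle-by-cycle simulation. This is only an illustration under stated assumptions, not the SCADE semantics of the Airbus symbols: the confirmation operator is assumed to output true when z2 is true at the current cycle and was true during the 8 preceding cycles; the latch is assumed to take z4 on S and p3 on R, to give priority to reset, and to start at false. The function name `simulate` and its parameters are hypothetical.

```python
from typing import List


def simulate(x: List[float], p1: List[bool], p2: List[bool],
             p3: List[bool], p4: List[bool], conf_cycles: int = 8) -> List[bool]:
    """Return the value of z6 at each cycle for the example node."""
    count = 0        # number of consecutive cycles during which z2 has been true
    z5_prev = False  # assumed initial latch output, plays the role of pre(z5)
    z6 = []
    for xi, a, b, r, d in zip(x, p1, p2, p3, p4):
        z1 = xi >= 2
        z2 = z1 or a
        count = count + 1 if z2 else 0
        # Assumption: confirmed means true now and during the 8 preceding cycles.
        z3 = count > conf_cycles
        z4 = z3 or b
        # Assumption: S = z4, R = p3, reset has priority over set.
        if r:
            z5 = False
        elif z4:
            z5 = True
        else:
            z5 = z5_prev
        z5_prev = z5
        z6.append(z5 or d)
    return z6


# One cycle with p2 true and p3 false sets the latch immediately.
print(simulate([0.0], [False], [True], [False], [False]))  # [True]
```

The state carried from one loop iteration to the next (`count`, `z5_prev`) stands in for the pre operator, and the explicit initial values stand in for the left-hand side of a followed-by.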

4. Approach for automatic test generation
This section presents our objectives with respect to the current practices described in section 2, explains the proposed approach and briefly describes the structural coverage criterion we have chosen.

Objectives
The main objective of the project was to study the feasibility of test generation automation based on formal techniques. It consisted in determining whether automatic test generation techniques and the associated tools were applicable to specifications of real industrial problems, but also in seeing if these techniques could automatically identify equivalence classes and singular points, two central notions used in current practice. A second issue was to evaluate the structural coverage obtained by a set of test cases. This is possible only if a pertinent structural coverage criterion is defined at the specification level.

Test objectives and test cases
Before explaining the approach, we use the example of section 3 to define what a test objective and a test case are. Formally, a test objective is any property that can be expressed in SCADE. From a user point of view, different levels of test objectives are possible. The objective for the specification of section 3 was to have z6 true, but more detailed objectives are possible; they usually express relationships between inputs and outputs. A test case is a sequence of n steps, where each step gives the values of the inputs and outputs of the considered system, such that the test objective is attained. One possible test case for the example specification of section 3 and the test objective (z6 = true) is the following:

#CYCLE  x   p1     p2     p3     p4     z6
1       1   false  false  true   false  false
2       1   true   false  true   false  false
3       1   true   false  true   false  false
4       1   true   false  true   false  false
5       1   true   false  true   false  false
6       3   false  false  true   false  false
7       4   false  false  true   false  false
8       12  false  false  true   false  false
9       6   true   false  true   false  false
10      5   false  false  false  false  true

This test case is 10 cycles long. Each line represents one time cycle and gives the values of the inputs and of the output at this cycle.

Approach
For a given system, a set of Functional Test Objectives (FTO) is defined. The chosen tools (TCG and GATEL, see section 5) are able to generate test cases from a SCADE specification of the system, a description of the environment (as SCADE assertions) and the FTOs.

At first, only one test case is defined for each FTO (step 1 in the following figure), but this is not sufficient. The behaviour class defined by the FTO has to be refined to generate several test cases for each FTO (step 2). A structural coverage criterion is then used to decide whether a sufficient refinement level has been obtained.

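[Figure: overview of the approach. From the SCADE specification, steps 1 and 2 lead to the functional test objectives FTO1 ... FTOn and to the test cases TC11 ... TC1i (for FTO1) up to TCn1 ... TCnk (for FTOn).]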

TC: Test Case

Structural coverage criterion
We have chosen a structural coverage criterion at the specification level (adapted to SCADE specifications). The criterion has been defined by LSR-IMAG; for more information, see [LPB04]. We do not give the detailed definition of this criterion here but try to describe its objective in terms of coverage: for each node and for each output variable of this node, all possible values of each input variable must be exercised in a context where its value has an influence on the value of the output variable. This criterion is very close to the constraints imposed by MCDC (Modified Condition Decision Coverage) [HVCR01]; the main difference is that MCDC is defined at the code level while we are interested in a criterion at the specification level. To compare both criteria precisely, it would be necessary to study the transformations introduced by automatic code generation.

In practice, we give for each basic SCADE operator a set of boolean conditions, each of which must be set to true at least once by a test case for the operator to be covered. For example, an AND operator (o = AND(i1,i2)) is covered if the following boolean conditions are each set to true at least once:
- not i1
- not i2 and i1
- i1 and i2

Within Airbus, a library of special nodes called symbols exists; symbols are skill oriented nodes that are very often used. Typical symbols are filters, triggers and integrators. We also defined coverage conditions for these symbols, either by combining the conditions of the basic operators defining the symbol or by using designer expertise. For example, the conditions defined for a latch (o=latch(set,reset)) are the following:
- reset
- set and not reset
- not reset and not set and pre(o)
- not reset and not set and not pre(o)

Finally, time plays an essential role in the systems we consider. The criterion defines conditions to be covered at a given time cycle, but two questions must be answered to handle time adequately: first, how many cycles need to be considered; second, if n cycles are considered, whether it is necessary to cover every boolean condition at every cycle. These two points have not yet been settled. We have explored different solutions depending on the tool (cf. section 5).
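The coverage conditions listed above for the AND operator and the latch can be written down directly as boolean functions. The sketch below is an illustration only: the function names are hypothetical, and the way observations are collected (one dictionary per operator evaluation, regardless of the cycle) is an assumption that sidesteps the open questions about time.

```python
from typing import Dict, Iterable, Set


def and_conditions(i1: bool, i2: bool) -> Dict[str, bool]:
    """Coverage conditions for o = AND(i1, i2), evaluated on one observation."""
    return {
        "not i1": not i1,
        "not i2 and i1": (not i2) and i1,
        "i1 and i2": i1 and i2,
    }


def latch_conditions(set_: bool, reset: bool, pre_o: bool) -> Dict[str, bool]:
    """Coverage conditions for o = latch(set, reset), evaluated on one observation."""
    hold = (not reset) and (not set_)
    return {
        "reset": reset,
        "set and not reset": set_ and not reset,
        "not reset and not set and pre(o)": hold and pre_o,
        "not reset and not set and not pre(o)": hold and not pre_o,
    }


def covered(observations: Iterable[Dict[str, bool]]) -> Set[str]:
    """Names of the conditions that are true in at least one observation."""
    done: Set[str] = set()
    for obs in observations:
        done.update(name for name, value in obs.items() if value)
    return done


# Two observations of an AND operator: one condition is still uncovered.
trace = [(False, True), (True, True)]
print(sorted(covered(and_conditions(a, b) for a, b in trace)))
# ['i1 and i2', 'not i1']
```

An operator is covered once `covered(...)` returns every key of its condition dictionary; in the usage above, the condition `not i2 and i1` is still missing.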

5. Test generation tools
We have experimented with two tools: GATEL and TCG. GATEL [MA00] is a tool developed by CEA (French Nuclear Research Centre) based on constraint resolution techniques. TCG (Test Case Generator) is a tool developed by Prover Technology (http://www.prover.com) based on a proof kernel.

GATEL
The user provides a test description, i.e. a description of the environment, a description of the program under test and a test objective. The program is written in Lustre (the textual formal specification language on which SCADE is based [CHPP87]), but GATEL only accepts booleans and integers (no real numbers, no structured types). The environment description is a set of hypotheses on the behaviour of the environment; it defines invariant relationships between the program inputs and the outputs at the previous instant. The goal of these hypotheses is to restrict the possible values of the inputs to those that are realistic with respect to the environment of the system. The test objective defines a property that must be satisfied by the generated test cases. This property may be an invariant (written in Lustre) that must be verified at each instant or a property that must be verified at least at one instant (additional operator reach).

GATEL automatically generates a test case from this test description. Several test cases can also be obtained by interactively "splitting" the operators included in the test description: this amounts to partitioning the system into equivalence classes and generating one test case for each class. GATEL proposes default splitting rules for all operators, but a different strategy can also be encoded in the tool.

An interesting functionality of GATEL is that it first produces test sequences: a test sequence is a test case where only the values that have an influence on the result are given, and where value intervals are given for numerical variables. For example, the test sequence corresponding to the test case given in section 4 is the following:

#CYCLE  x     p1    p2  p3     p4  z6
1       _     _     _   _      _   _
2       _     true  _   _      _   _
3       _     true  _   _      _   _
4       _     true  _   _      _   _
5       _     true  _   _      _   _
6       3..?  _     _   _      _   _
7       3..?  _     _   _      _   _
8       3..?  _     _   _      _   _
9       _     true  _   _      _   _
10      3..?  _     _   false  _   true
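The relation between a test sequence and the test cases it stands for can be sketched as a simple pattern match. The encoding below is an assumption made for illustration and is not GATEL's format: `None` plays the role of `_`, a pair `(lo, hi)` encodes an interval with `None` for an open bound (so `3..?` becomes `(3, None)`), and any other entry is an exact value. The function names are hypothetical.

```python
from typing import Dict, List


def satisfies(value, constraint) -> bool:
    """Check one value against one entry of a test sequence."""
    if constraint is None:             # "_": the value has no influence
        return True
    if isinstance(constraint, tuple):  # interval, None means an open bound
        lo, hi = constraint
        return (lo is None or value >= lo) and (hi is None or value <= hi)
    return value == constraint         # exact boolean or integer value


def is_instance_of(case: Dict[str, List], sequence: Dict[str, List]) -> bool:
    """True if the fully valued test case belongs to the class of the sequence."""
    return all(
        len(case[name]) == len(constraints)
        and all(satisfies(v, c) for v, c in zip(case[name], constraints))
        for name, constraints in sequence.items()
    )


# Two-cycle toy sequence: x must be at least 3 at cycle 2, p1 true at cycle 1.
sequence = {"x": [None, (3, None)], "p1": [True, None]}
print(is_instance_of({"x": [0, 7], "p1": [True, False]}, sequence))  # True
print(is_instance_of({"x": [0, 2], "p1": [True, False]}, sequence))  # False
```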

A test sequence represents a class of test cases and gives interesting information on the system. Each test sequence can then be completely instantiated to produce a test case.

The GATEL kernel is a constraint solving algorithm on boolean variables and integer intervals. From the environment assertions, the test objective and the definition of the input and output variables, the tool creates a constraint system characterising the final state that must be reached by the Lustre program. The resolution of this constraint system consists in progressively eliminating the constraints introduced by the user in the test description. It is implemented by alternating steps of constraint propagation/simplification and variable instantiation. The resolution starts from the final state of the test case (computed from the test description) and incrementally builds previous states using a backward exploration technique.

To apply GATEL to the example of section 3, we provide a textual version of the example and adapted splitting rules, then we define the test objective (reach z6). The tool first automatically produces one simple test sequence satisfying the objective:

#CYCLE  x        p1     p2     p3     p4     z6
3       -56807   true   false  true   false  false
2       121403   false  true   false  true   true
1       -690771  true   false  false  false  true
0       -774527  true   false  false  false  true

Then the user can refine this by interactively applying the defined splitting rules. First, the rule for the OR operator defining z6 is applied; it results in two test sequences (one where z5 is true and one where p4 is true and z5 is false). For each of these two possibilities, the user can further refine the test sequences by applying the splitting rules for BASCR. A test tree is progressively built by applying splitting rules from outputs to inputs. As an illustration, we give the test tree obtained by splitting the OR operator as explained above and the BASCR operator only in the branch where p4 is true (right branch).

[Figure: test tree. Starting point: objective z6 = true; split of the OR; split of the BASCR when p4 is true.]
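The OR split described above (one class where z5 is true, one where p4 is true and z5 is false) can be sketched as follows. This is an illustration of that single rule, not GATEL's rule language; `split_or` and `matching_cases` are hypothetical names. The loop checks that every valuation making the OR true falls into exactly one class, which is what makes the classes a partition.

```python
from itertools import product
from typing import Dict, List

Case = Dict[str, bool]


def split_or(left: str, right: str) -> List[Case]:
    """Classes in which `left or right` is true, following the rule in the text."""
    return [{left: True}, {left: False, right: True}]


def matching_cases(cases: List[Case], valuation: Dict[str, bool]) -> List[int]:
    """Indices of the classes that a complete valuation belongs to."""
    return [i for i, case in enumerate(cases)
            if all(valuation[name] == value for name, value in case.items())]


cases = split_or("z5", "p4")
for z5, p4 in product([False, True], repeat=2):
    print(z5, p4, matching_cases(cases, {"z5": z5, "p4": p4}))
```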

At this stage, we have only refined the test sequences at one time cycle (cycle 0). When all the splitting is done at one cycle, we can refine further by considering the next time cycle. In this example, it is necessary to go at least as far as cycle 9 to see the effect of the CONF operator. The remaining question is whether to consider all the possible boolean combinations at each of the 9 cycles. GATEL allows all the combinations to be considered, but we quickly obtained a huge test tree that is difficult to handle. We think such extensive splitting is not always necessary, and we have defined a methodology to guide the user (a list of cases where no further splitting is needed). This kind of methodology could be adapted to the application domain and to test habits.

TCG
The user defines Functional Test Objectives (FTO) using SCADE. The tool also considers Structural Test Objectives (STO), which correspond by default to branch coverage (one case for each branch of each conditional statement) but can be adapted to handle other coverage criteria (like the one we define in section 4). The tool runs in three steps.
• First, it generates test cases for the given functional test objectives while trying at the same time to cover as many structural test objectives as possible. The structural test objectives considered at this stage are only those that are in the dynamic cone of influence of the functional test objective (i.e. the set of variables that have an influence on the value of the FTO).
• Then TCG can generate test cases for the remaining STO (these STO highlight parts of the specification that are not exercised by the FTO).
• Last, if there are still STO that have not been covered, TCG uses its proof capabilities to see whether these STO are reachable or not (it can help in identifying dead code).
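The three steps above can be sketched with a brute-force search standing in for TCG's proof kernel. Everything in this block is an assumption made for illustration: the toy system `run`, the objective names and the bound `max_len` are hypothetical, the search enumerates input sequences instead of using proof techniques, and step 1 only records the structural objectives a trace happens to cover rather than actively steering towards them. In particular, an objective reported as unreached here was merely not found within the bound, which is weaker than a proof of unreachability.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Step = Dict[str, bool]
Objective = Callable[[Step], bool]
Trace = List[Step]


def run(inputs: Sequence[Tuple[bool, bool]]) -> Trace:
    """Hypothetical toy system: o is set by (a and b) and reset by (not a)."""
    o = False
    trace = []
    for a, b in inputs:
        if not a:
            o = False
        elif b:
            o = True
        trace.append({"a": a, "b": b, "o": o})
    return trace


def search(objective: Objective, max_len: int = 3) -> Optional[Trace]:
    """Shortest trace in which the objective holds at some cycle, if any."""
    for n in range(1, max_len + 1):
        for inputs in product(product([False, True], repeat=2), repeat=n):
            trace = run(inputs)
            if any(objective(step) for step in trace):
                return trace
    return None


def generate(ftos: Dict[str, Objective], stos: Dict[str, Objective],
             max_len: int = 3) -> Tuple[List[Trace], List[str]]:
    tests: List[Trace] = []
    remaining = dict(stos)

    def record(trace: Trace) -> None:
        tests.append(trace)
        hit = [n for n, obj in remaining.items() if any(obj(s) for s in trace)]
        for name in hit:
            del remaining[name]

    for fto in ftos.values():                     # step 1: functional objectives
        trace = search(fto, max_len)
        if trace is not None:
            record(trace)
    for name, sto in list(remaining.items()):     # step 2: remaining structural ones
        if name in remaining:
            trace = search(sto, max_len)
            if trace is not None:
                record(trace)
    return tests, sorted(remaining)               # step 3: objectives never reached


ftos = {"o_true": lambda s: s["o"]}
stos = {
    "set": lambda s: s["a"] and s["b"],
    "reset": lambda s: not s["a"],
    "hold": lambda s: s["a"] and not s["b"] and s["o"],
    "impossible": lambda s: s["o"] and not s["a"],
}
tests, unreached = generate(ftos, stos)
print(len(tests), unreached)  # 3 ['impossible']
```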

The main principle used in TCG for test case generation is the search for counterexamples. The tool searches for counterexamples to the negation of the test objective in order to obtain scenarios where the test objective is satisfied.

TCG is completely automated. The syntax of the produced test cases is compatible with the SCADE simulator. TCG is not yet completely integrated into SCADE, but a prototype has been implemented to allow its use from the SCADE interface. TCG also produces an Excel file that summarises the coverage of the objectives by the test cases. An example of this file is given below:

[Table: coverage synthesis produced by TCG. Columns: name, symbol, variable, cov, functional scenarios 0 to 3, structural scenarios 0 to 4, in cone. Rows: s_1 (por) with variables OR1, OR2, OR3; s_5 (pand) with AND1, AND2, AND3; s_13 (pCONF1) with CONF11, CONF12, CONF13, CONF14; s_18 (pand) with AND1, AND2, AND3; s_22 (pand) with AND1, AND2, AND3; s_25 (BRUDSV). All 16 structural objectives are in the cone of s_25. Cells contain X and S marks; the individual marks are not reproduced here.]

On each line are the test objectives, first the structural test objectives for the operators (AND, OR, CONF1) then the functional test objective (here only one: BRUDSV). The columns give the test cases (or scenarios), there are 4 functional test cases (i.e. test cases generated during the first step) and 5 structural test cases (generated during the second step). A mark means that the test objective of this line is covered by the test scenario of this column. The ‘S’ denotes the test objectives that are only covered by structural scenarios. As far as time is concerned, TCG computes test cases of a given length (at least the number of cycles that is needed to reach the functional objective) and tries to cover each structural objective only at one time cycle, so that finally all the boolean combinations will be covered by the set of test cases, but maybe at different instants.
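The way such a synthesis can be derived from per-scenario coverage is sketched below. The layout is an assumption, not the actual format of the TCG file: each scenario is represented by the set of objectives it covers, each row holds the objective name, an overall flag, then one mark per functional and per structural scenario, and the overall flag is `S` when only structural scenarios cover the objective. The function name `synthesis` is hypothetical.

```python
from typing import List, Sequence, Set


def synthesis(objectives: Sequence[str],
              functional: Sequence[Set[str]],
              structural: Sequence[Set[str]]) -> List[List[str]]:
    """One row per objective: name, overall flag, then one mark per scenario."""
    rows = []
    for obj in objectives:
        f_marks = ["X" if obj in scenario else "" for scenario in functional]
        s_marks = ["X" if obj in scenario else "" for scenario in structural]
        if any(f_marks):
            flag = "X"   # covered by at least one functional scenario
        elif any(s_marks):
            flag = "S"   # covered by structural scenarios only
        else:
            flag = ""    # not covered by any scenario
        rows.append([obj, flag, *f_marks, *s_marks])
    return rows


# Two functional scenarios and one structural scenario over three objectives.
for row in synthesis(["OR1", "OR2", "OR3"],
                     [{"OR1"}, {"OR1", "OR2"}],
                     [{"OR3"}]):
    print(row)
```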

Experiments / Comparison
Both tools have been tried on several examples taken from Airbus specifications. Two main systems were studied: the flight control secondary computer and the flight warning system. Both tools were able to generate interesting test cases, and their good performance allowed us to deal with real size examples. Moreover, the tools have been parameterised to take into account the coverage criterion we have defined.

However, the two tools implement the test generation approach differently. TCG is completely automatic and can be integrated into the SCADE environment. Its optimisation of the number of test cases is interesting. But test objectives are covered at only one time cycle and there is no notion of equivalence classes. GATEL provides the notion of equivalence classes, which we consider essential, and allows the coverage of test objectives at each time cycle, but it is more complex to use (interactive); a methodology is thus needed to guide the user.

6. Conclusion and future work
The main result of the project is to show that automatic test generation is indeed feasible in practice. Tools exist and are able to handle real industrial examples. An adapted criterion has been chosen to analyse coverage at the specification level. A few experiments have been conducted to compare automatically generated tests with tests obtained with the current manual approach, but they are not sufficient. It would be interesting to fully compare both approaches by applying them in parallel to a given system with functional test objectives. As far as tools are concerned, the methods proposed by GATEL and TCG could be combined to take advantage of their complementarity. More work is also necessary to analyse the cases where the tools could not generate a test case covering a given test objective.

Finally, at the methodological level, the proposed approach is compliant with the certification standards. However, it differs from the classical coverage analysis at the code level, so we think it is necessary to study more thoroughly the possible substitution of coverage analysis at the code level by coverage analysis at the specification level. This reflection should be part of a more global study of the integration of this approach into the actual verification and validation process.

References
[CHPP87] P. Caspi, N. Halbwachs, D. Pilaud, and J. Plaice. LUSTRE, a declarative language for programming synchronous systems. In 14th Symposium on Principles of Programming Languages (POPL 87), Munich, pages 178-188. ACM Press, 1987.
[HVCR01] K.J. Hayhurst, D.S. Veerhusen, J.J. Chilenski, L.K. Rierson. A practical tutorial on Modified Condition / Decision Coverage. NASA / TM-2001-210876.
[LPB04] A. Lakehal, I. Parissis, L. du Bousquet. Critères de couverture structurelle de programmes Lustre. Proceedings of AFADL 2004.
[LMW01] O. Laurent, P. Michel, V. Wiels. Using formal verification techniques to reduce simulation and test effort. FME 2001. LNCS 2021, Jose Nuno Oliveira and Pamela Zave eds. Springer Verlag.
[MA00] B. Marre, A. Arnould. Test Sequence Generation from Lustre Descriptions: GATEL. 15th international conference on Automated Software Engineering (ASE) 2000.
