Combining Fault Avoidance, Fault Removal and Fault Tolerance: An Integrated Model

A. Mili, B. Cukic, T. Xia
Institute for Software Research
1000 Technology Drive
Fairmont, WV 26554, USA
{amili, bcukic, [email protected]

Abstract

Fault avoidance, fault removal and fault tolerance represent three successive lines of defense against the contingency of faults in software systems and their impact on system reliability. Beyond the colorful discussions of the relative merits of these techniques, the law of diminishing returns advocates that they be used in concert, where each is applied whenever it is most effective. Such a premise remains an idle act of faith so long as these techniques cannot be captured by a uniform model. This paper proposes such a model, and illustrates how it can be used in practice to improve the quality of software products.
Keywords Fault avoidance, Fault removal, Fault tolerance, Formal specifications, Verification and validation.
1 Successive Lines of Defense

Despite three decades of intensive research, the verification and validation of software products remains an active research area. A great deal of progress has been achieved in this field, but the advent of new programming languages and new software development paradigms, combined with the increasing reliance on software and the increasing complexity of software applications, has maintained the pressure for more research. All the methods of verification and validation revolve around the theme of dealing with the existence and the manifestation of faults. Traditionally, these methods are classified into three categories, which differ by how early faults are identified and dealt with; the categories can be seen as successive lines of defense against the effects of faults on software quality.
R. Ben Ayed School of Engineering, PO Box 37 University of Tunis Belvedere, 1002 Tunisia
[email protected]
Fault Avoidance. These methods take the view that it is possible to build fault-free software, and focus on means to specify, verify and derive software products that are free of faults. Fault Removal. These methods concede that despite our best efforts, developed software may still contain faults, and apply methods to remove faults from existing software products. Fault Tolerance. These methods concede that neither fault avoidance nor fault removal ensure that software products are free of faults; they advocate that measures be taken to prevent faults from causing failure.
Not surprisingly, each family of methods is effective for some types of faults and ineffective for others —as we will discuss later in this paper; hence it makes sense to apply all three families of methods, by virtue of the Law of Diminishing Returns. By doing so, we afford the luxury of applying each method where it is most effective, and dropping it in favor of another when its effectiveness deteriorates. In this paper, we discuss the implications of combining these three categories of methods, using a unified relational framework. We highlight in particular that software specifications can be structured as aggregates of sub-specifications, where each sub-specification is more adapted to a particular family of methods than to others; so that the law of diminishing returns becomes less a general philosophical statement than an actual working solution. In section 2 we give a brief introduction to relations and relational calculus, which we use throughout the paper. In section 3 we use this background to provide a general model for fault avoidance, fault removal and fault tolerance. In section 4 we analyze the structure of specifications, and discuss how to decompose the verification and validation effort of a software product into a fault avoidance effort, fault removal effort and fault tolerance effort. In section 5, we
consider the more constructive question of: given a complex specification, how do we dispatch its components to various verification methods in such a way as to minimize cost and maximize impact. Section 6 is devoted to a case study, where we apply the proposed method to a numerical analysis program and assess the gains achieved by the proposed method. Finally, section 7 presents a synthesis of our results as well as prospects for future work.
2 Relational Specifications In this section we briefly discuss the use of relations as program specifications; the interested reader may find more details on relational mathematics from e.g. [10] and on relational specifications from e.g. [9]. In this section we introduce the material that is minimally required to make this paper self-contained.
2.1 Elements of Relations

A relation on set S is a subset of the Cartesian product S × S. Constant relations on a set S include the identity relation, denoted by I, the universal relation, denoted by L, and the empty relation, denoted by φ. A vector a is a relation of the form a = A × S for some non-empty set A; there is a straightforward isomorphism between vector a (a relation) and set A. An invector (inverse of a vector) is a relation of the form S × A, for some non-empty set A. In addition to traditional set theoretic operations, we will use the following operations on relations: the domain of relation R, which we denote by dom(R); the image set of element s by relation R, which we denote by s.R; the inverse of relation R, which we denote by R̂; and the product of relations R and R′, which we denote by R ∘ R′ (abbreviated by RR′, when no ambiguity arises). Note that for a given relation R, the relation RL is a vector; it can be written as

RL = {(s, s′) | s ∈ dom(R)}.

The restriction of relation R to set A is denoted by A∖R. Note that if a is the vector defined by a = A × S, then A∖R = a ∩ R. Also note that R′L ∩ R is the restriction of R to the domain of R′ (a recurrent expression in the sequel).
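On finite relations these operations are easy to experiment with; the following sketch (ours, not part of the paper's formalism) models a relation as a Python set of pairs, with one function per operation:

```python
# A finite relation on a set S is modeled as a set of pairs (s, t).

def dom(R):
    """Domain of R: the elements that R relates to something."""
    return {s for (s, _) in R}

def image(R, s):
    """Image set of element s by relation R (written s.R in the paper)."""
    return {t for (x, t) in R if x == s}

def inverse(R):
    """Inverse (converse) of R."""
    return {(t, s) for (s, t) in R}

def product(R, Rp):
    """Relational product R o R'."""
    return {(s, u) for (s, t) in R for (tp, u) in Rp if t == tp}

def restrict(A, R):
    """Restriction A\\R of R to the set A."""
    return {(s, t) for (s, t) in R if s in A}

# Example on S = {0, 1, 2}: R relates each s to s + 1 (mod 3).
R = {(0, 1), (1, 2), (2, 0)}
assert dom(R) == {0, 1, 2}
assert image(R, 0) == {1}
assert product(R, R) == {(0, 2), (1, 0), (2, 1)}
assert restrict({0}, R) == {(0, 1)}
```

The relation RL (R composed with the universal relation) then has domain dom(R) and relates every element of dom(R) to all of S, which is the vector property noted above.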
2.2 Specifying with Relations

We use relations to represent specifications. A relation R on some space S contains all the input output pairs (s, s′) that the specifier considers correct. If an element s is outside the domain of R, we understand that it is not expected to be submitted by the user, and the behavior of candidate implementations on s is arbitrary (in particular, they may fail to terminate). If s ∈ dom(R), then s.R is the set of outputs that are considered correct for input s.
Homogeneous relations (from S to S ) are adequate to specify software products that implement simple input output functions, but inadequate for software products that maintain an internal state (such as objects, data types, stimulus response systems, etc). For such systems, we also use a relational model, where specifications are represented by heterogeneous relations; this model is discussed in [12]. For the sake of readability, we focus our discussions on homogeneous specifications and simple input output programs, with the knowledge that our results can be generalized to other forms of specifications and other kinds of software products.
2.3 Refinement Ordering

We want to define an ordering relation between specifications to the effect that one specification captures stronger requirements information than another. We say that specification R refines (or is a refinement of) specification R′ if and only if any implementation that satisfies R satisfies R′. We denote this property by R ⊒ R′, or equivalently R′ ⊑ R. This notion is captured in the following definition.

R ⊒ R′ ⇔ (RL ⊇ R′L ∧ R′L ∩ R ⊆ R′).
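On finite relations, both clauses of this definition can be checked mechanically; the sketch below (our own illustration) does so: the first clause compares domains, the second restricts R to the domain of R′ and checks inclusion in R′:

```python
def dom(R):
    return {s for (s, _) in R}

def refines(R, Rp):
    """R refines R' iff RL contains R'L (R covers R''s domain) and
    R'L ∩ R is included in R' (on R''s domain, R only produces
    outputs that R' allows)."""
    return dom(R) >= dom(Rp) and all(
        (s, t) in Rp for (s, t) in R if s in dom(Rp))

# R' allows outputs 0 or 1 for input 0; R pins the output to 0 and
# also handles input 1, so R captures stronger requirements than R'.
Rp = {(0, 0), (0, 1)}
R  = {(0, 0), (1, 1)}
assert refines(R, Rp)
assert not refines(Rp, R)
```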
Given that ⊒ is a partial ordering, it is legitimate to ponder the question of whether it has lattice-like properties, i.e. whether any two specifications have a join (least upper bound) and a meet (greatest lower bound). We briefly discuss meets in this section, and leave the discussion of joins to section 4.1. We have the following proposition (due to [2]).

Proposition 1 Any two specifications R and R′ have a meet (greatest lower bound), whose expression is

R ⊓ R′ = RL ∩ R′L ∩ (R ∪ R′).

More important than the expression of the meet is its concrete interpretation: R ⊓ R′ represents the requirements information that is common to both R and R′; i.e. R ⊓ R′ represents all the requirements that are captured in both R and R′. The meet can be interpreted by means of the following characterization: if we know that under all circumstances, specification A refines R or refines R′, but we are not sure which it refines, then all we can say is
A ⊒ (R ⊓ R′).

As an example, if we have a program P that sorts arrays in an arbitrary order (increasing or decreasing), depending e.g. on some random function, then all we can say about it is that it refines

Inc ⊓ Dec,
where Inc (resp. Dec) specifies a program that sorts arrays in increasing (resp. decreasing) order.
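The meet is directly computable on finite relations; the following sketch (ours) enumerates Inc and Dec over a toy set of states and confirms that the meet leaves the ordering of the output unconstrained:

```python
def dom(R):
    return {s for (s, _) in R}

def meet(R, Rp):
    """R ⊓ R' = RL ∩ R'L ∩ (R ∪ R'): the union of the two behaviors,
    restricted to the common domain."""
    common = dom(R) & dom(Rp)
    return {(s, t) for (s, t) in R | Rp if s in common}

# Inc (resp. Dec) relates a list to its increasingly (resp. decreasingly)
# sorted permutation, enumerated over a few sample inputs.
inputs = [(2, 1), (1, 3, 2)]
Inc = {(s, tuple(sorted(s))) for s in inputs}
Dec = {(s, tuple(sorted(s, reverse=True))) for s in inputs}
M = meet(Inc, Dec)

# A program refining Inc ⊓ Dec may return either ordering:
assert ((2, 1), (1, 2)) in M and ((2, 1), (2, 1)) in M
```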
begin
  A := emptyset;
  forall x in A0 do
  begin
    P(x,y);    {run P on input data x, producing output data y}
    if oracle(x,y) then A := A + x    {if execution on x is successful, then add x to A}
  end;
  return(A)
end.
3 Fault Management Methods We discuss how the refinement ordering introduced in the previous section, as well as the associated lattice properties, can be used to model all three families of verification and validation methods.
3.1 Modeling Fault Avoidance

We consider a software product P (say, a program P, for simplicity), and we consider that we have established the (total) correctness of this program with respect to some specification V. We denote by [P] the relation that program P defines on its space, i.e. the set of pairs (s, s′) such that if P starts execution in state s then it terminates in state s′. If P is implemented in a (traditional) deterministic language, then this relation is in fact a function, but whether it is actually a function or not has little impact on our subsequent discussions. Nevertheless, we refer to [P] as the functional abstraction of P; we may confuse a program and its functional abstraction when this raises no ambiguity. By virtue of the definition of [P], we infer that dom([P]) is the set of states s such that if P starts execution in state s then it terminates. To be (totally) correct with respect to specification V, program P must terminate for all input states in the domain of V, and must be partially correct with respect to V. For termination, we must have

dom([P]) ⊇ dom(V).

For partial correctness, whenever an input state satisfies the precondition (s ∈ dom(V)) and execution of program P terminates in some state s′ ((s, s′) ∈ [P]), then the output s′ must satisfy the postcondition ((s, s′) ∈ V); in other words,

∀s, s′ : s ∈ dom(V) ∧ (s, s′) ∈ [P] ⇒ (s, s′) ∈ V.

Using algebraic notations, we can write the first clause as

[P]L ⊇ VL,

and the second clause as

VL ∩ [P] ⊆ V.

The conjunction of these two clauses yields that [P] refines V; this provides the proof for the following proposition.

Proposition 2 A program P is correct with respect to a specification V if and only if [P] ⊒ V.

This characterization of correctness is equivalent, modulo superficial differences of notation, to traditional definitions of total correctness [3, 4, 8].
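Proposition 2 reduces total correctness to a refinement check, which is directly executable on a finite state space. A small illustration (ours; the absolute-value program is hypothetical):

```python
def dom(R):
    return {s for (s, _) in R}

def refines(R, Rp):
    return dom(R) >= dom(Rp) and all(
        (s, t) in Rp for (s, t) in R if s in dom(Rp))

# V: the output is the absolute value of the input, over a finite sample.
# P is the functional abstraction of a correct program; Pbad forgets to
# negate negative inputs.
sample = range(-2, 3)
V    = {(s, abs(s)) for s in sample}
P    = {(s, -s if s < 0 else s) for s in sample}
Pbad = {(s, s) for s in sample}

assert refines(P, V)        # P is (totally) correct with respect to V
assert not refines(Pbad, V)  # Pbad violates V on negative inputs
```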
Figure 1. A Generic Certification Model.
3.2 Modeling Fault Removal We consider a program P , and we attempt to capture the result of submitting this program to a set of tests. We distinguish between three goals of software testing:
Debugging. Under this goal, we submit input data to the program and observe its behavior, in the hope of sensitizing, identifying then removing faults. Certification. Under this goal, we submit input data to the program and observe its behavior, in the hope of showing that the program behaves according to its specifications for all instances of the test data. Reliability estimation. Under this goal [1, 7, 6], we submit input data to the program and observe its behavior; whenever it fails, we identify the fault, remove it, then resume testing. We estimate reliability by observing the evolution of inter-failure intervals.
Because we are concerned with verification and validation in this paper, the first goal is of no interest to us, but the second and third goals are. While the second goal produces a logical statement about the program, the third produces a probabilistic statement. For the sake of parsimony, we focus on the second goal in this paper, and briefly discuss our prospects for the third goal in the conclusion. We consider a program that we want to certify, i.e. to test under the goal of certification. Certification testing proceeds as follows (see figure 1): We are given a test oracle, which we represent by a relation, say Ω, and a set of test data, say A0. For the sake of consistency, we assume that all the test data elements are in the domain of Ω, i.e. A0 ⊆ dom(Ω). For each element of the test data, say x, we run the program P on input x and observe the outcome. We consider that the outcome is successful if and only if: first, the program terminates for input x; second, the final state y that it produces satisfies the condition (x, y) ∈ Ω. We let A be the set of input data elements for which the test is successful, and we let T be the restriction of Ω to A, i.e. T = A∖Ω. By virtue of the first clause in the definition of successful tests, we know that program P terminates for all elements of A, hence the domain of [P] is a superset of A. By definition of T, we find that A is the domain of T, hence we infer

[P]L ⊇ TL.

Also, by virtue of the second clause in the definition of successful tests, we know that if x is in A and y is an output produced by [P] for input x, then (x, y) ∈ Ω. We write this as:

∀x, y : x ∈ A ∧ (x, y) ∈ [P] ⇒ (x, y) ∈ Ω.

By virtue of a logical identity, we write this as:

∀x, y : x ∈ A ∧ (x, y) ∈ [P] ⇒ x ∈ A ∧ (x, y) ∈ Ω.

By virtue of the definition of restriction and by definition of T, we write this as:

∀x, y : x ∈ A ∧ (x, y) ∈ [P] ⇒ (x, y) ∈ T.

Because A ⊆ dom(Ω), we have dom(A∖Ω) = A, and we write:

∀x, y : x ∈ dom(A∖Ω) ∧ (x, y) ∈ [P] ⇒ (x, y) ∈ T.

By virtue of the definition of relation T, we write:

∀x, y : (x, y) ∈ TL ∩ [P] ⇒ (x, y) ∈ T.

If we rewrite this algebraically, and consider that we already have [P]L ⊇ TL, we find

[P] ⊒ T.

This constitutes the proof of the following proposition.

Proposition 3 If we run certification testing on program P using oracle Ω and test data A0, where A0 ⊆ dom(Ω), and we let A be the set of all the elements of A0 that produce a successful test, then we can infer

[P] ⊒ T,

where T is the restriction of Ω to A.

Interestingly, certification testing, just as correctness verification, produces a statement to the effect that the program under consideration refines some specification.
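The certification loop of figure 1 can be sketched as follows (our own illustration, with the oracle given as a finite relation Ω and a hypothetical square-root program under test):

```python
def certify(P, Omega, A0):
    """Certification testing: A is the set of inputs in A0 on which P
    terminates and satisfies the oracle; the program is then known to
    refine T = A\\Omega, the oracle restricted to A."""
    A = set()
    for x in A0:
        try:
            y = P(x)
        except Exception:
            continue        # abortion counts as an unsuccessful test
        if (x, y) in Omega:
            A.add(x)
    T = {(x, y) for (x, y) in Omega if x in A}
    return A, T

# Oracle: the output is an integer square root (either sign) of the input.
Omega = {(0, 0), (1, 1), (4, 2), (4, -2), (9, 3), (9, -3)}
P = lambda x: round(x ** 0.5)    # hypothetical program under test
A, T = certify(P, Omega, A0={0, 1, 4, 9})
assert A == {0, 1, 4, 9}
assert T == Omega
```

Note that the oracle may be non-deterministic (here, either sign is acceptable); the certified relation T records everything the oracle allows on the successful inputs, not merely what P produced.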
3.3 Modeling Fault Tolerance

Fault tolerance refers to the set of measures that a program takes to avoid failure after faults have caused errors. One way to achieve fault tolerance [11] is to structure the program as a set of blocks of the form:

B: begin
     ps := s;    {saving initial state}
     body;       {changing s, preserving ps}
     if not correct(ps,s) then recovery(ps,s)
   end;

In its most general form, predicate correct defines a binary relation, which we denote by C, between the state of the program at the beginning of the block (ps) and the state of the program at the end (s), when condition correct is checked. Likewise, procedure recovery defines a binary relation, namely its functional abstraction, which we denote by R. Whenever block B is executed and predicate correct evaluates to true, we know that B is correct with respect to specification C, which (by virtue of proposition 2) we write as:

[B] ⊒ C.

On the other hand, whenever block B is executed and predicate correct evaluates to false, we know that procedure recovery is invoked with the (retrieved) initial state (ps), whence the effect of the execution of body is overridden by the execution of the recovery routine and we get [B] = R, which we rewrite (for uniformity) as

[B] ⊒ R.

Because we do not know in general whether predicate correct will evaluate to true or false, all we can infer in general is that the functional abstraction [B] refines either C (if the test of predicate correct is true) or R (if the test of predicate correct is false). By virtue of the discussions of section 2.3, we infer

[B] ⊒ F,

where F = C ⊓ R. If our program has more than one block, say two for example, that are structured in sequence, then we let F1 and F2 be the relations associated to each block and we infer the relation with respect to which the program is fault tolerant by composing F1 and F2. The details of how these relations are composed are beyond the scope of this paper. This concludes our claim that, interestingly, all three methods produce statements to the effect that the program refines some relational specification. In the next section we discuss how to use this result in such a way as to break down the verification and validation effort and dispatch it among the three families of methods.
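The block structure above can be sketched in executable form (our illustration; the squaring body, correctness check and recovery routine are all hypothetical stand-ins for C and R):

```python
def recovery_block(body, correct, recovery, s):
    """One fault-tolerant block B: save the initial state, run the body,
    and fall back on the recovery routine if the correctness check fails."""
    ps = s                    # saving initial state
    s = body(s)               # body: changes s
    if not correct(ps, s):
        s = recovery(ps)      # recovery overrides the body's effect
    return s

# C: the final state is the square of the initial state. The faulty body
# mishandles negative inputs; recovery recomputes from the saved state.
body     = lambda s: s * s if s >= 0 else s    # faulty for s < 0
correct  = lambda ps, s: s == ps * ps          # defines relation C
recovery = lambda ps: ps * ps                  # defines relation R

assert recovery_block(body, correct, recovery, 3) == 9    # C held
assert recovery_block(body, correct, recovery, -3) == 9   # R took over
```

Since we cannot predict which branch executes, all we may claim of the block is that it refines C ⊓ R, exactly as derived above.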
4 Combining Methods

4.1 The Lattice of Refinement

In section 2.3 we discussed the meet operator that derives from the lattice of the refinement ordering; in this section, we briefly consider the join operator. We have the following proposition (due to [2]).

Proposition 4 Two relations R and R′ have a join (least upper bound) with respect to the refinement ordering if and only if they satisfy the condition (called the consistency condition):

RL ∩ R′L = (R ∩ R′)L.

When R and R′ do satisfy the consistency condition, their join is given by the following expression:

R ⊔ R′ = (R ∩ ¬(R′L)) ∪ (R′ ∩ ¬(RL)) ∪ (R ∩ R′),

where ¬ denotes the relational complement. The consistency condition holds whenever R and R′ add to each other's information (rather than contradict each other); in [2] we had shown that R and R′ have a least upper bound if and only if they have an upper bound. The join reflects all the requirements information of R (upper bound of R) and all the requirements information of R′ (upper bound of R′) and nothing else (least upper bound). In other words, it performs the addition of specifications R and R′. As illustrations of this operator, consider the following examples:

{(s, s′) | s′ = s²} ⊔ {(s, s′) | s ≥ 0 ∧ s′ ≥ 0} = {(s, s′) | s′ = s² ∧ s′ ≥ 0}.

{(s, s′) | s ≥ 0 ∧ s′ = s} ⊔ {(s, s′) | s < 0 ∧ s′ = 0} = {(s, s′) | s ≥ 0 ∧ s′ = s} ∪ {(s, s′) | s < 0 ∧ s′ = 0}.

{(s, s′) | s′ = s + 1} and {(s, s′) | s′ = s + 2} have no join.

As another example, we can write

Sort = Prm ⊔ Ord,

where

Sort: output array is the sorted permutation of the input array.

Prm: output array is a permutation of the input array.

Ord: output array is ordered.

In the sequel we discuss how this join operator can be used to combine verification and validation efforts on a software product.

4.2 Cumulating Verification Results

We consider a software product P on which we have applied three V&V methods:

Correctness verification with respect to some specification V, producing [P] ⊒ V;

Certification testing with respect to some specification T (defined by oracle Ω and test data set A as T = A∖Ω), producing [P] ⊒ T;

Fault tolerance with respect to some specification F (defined by the correctness criterion C and the recovery routine R as F = C ⊓ R), producing [P] ⊒ F.

By virtue of the discussions of section 4.1, specifications V, T, and F have a join (since they have an upper bound, namely [P]). Furthermore, by virtue of lattice theory, we infer

[P] ⊒ (V ⊔ T ⊔ F).

Hence although these three different families of methods have radically different means to verify functional properties of programs, their results can be formulated in a uniform model and can be combined into a cumulative result.
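The join operator of section 4.1, including its consistency condition, can be computed on finite relations; the sketch below (our illustration) enumerates Prm and Ord over a toy set of states and recovers Sort = Prm ⊔ Ord:

```python
from itertools import permutations

def dom(R):
    return {s for (s, _) in R}

def consistent(R, Rp):
    """Consistency condition RL ∩ R'L = (R ∩ R')L."""
    return dom(R) & dom(Rp) == dom(R & Rp)

def join(R, Rp):
    """R ⊔ R': R outside dom(R'), R' outside dom(R), and R ∩ R' on the
    common domain; defined only when R and R' are consistent."""
    assert consistent(R, Rp)
    return ({(s, t) for (s, t) in R if s not in dom(Rp)}
            | {(s, t) for (s, t) in Rp if s not in dom(R)}
            | (R & Rp))

# Sample states: tuples of length 2. Prm relates a state to each of its
# permutations; Ord relates a state to any ordered state in the sample.
states = [(1, 2), (2, 1)]
Prm = {(s, p) for s in states for p in permutations(s)}
Ord = {(s, t) for s in states for t in states if t == tuple(sorted(t))}

Sort = join(Prm, Ord)
assert Sort == {(s, tuple(sorted(s))) for s in states}
```

Neither Prm nor Ord alone pins down the sorted output, but their join does: this is the sense in which the join "adds" specifications.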
5 Dispatching Verification Tasks

Even more useful than cumulating incidental verification results is, for a given verification goal, to dispatch it among verification methods in an optimal manner; this is the subject of the current section.
5.1 Structuring Specifications

In [2] and subsequent works, we have advocated that the join operator can be used as a means to structure complex specifications without violating the generally accepted principle of abstraction (which provides that specifications must not favor specific design choices, leaving these choices at the discretion of the designer). Given a complex set of requirements, one can capture these requirements with arbitrarily partial, arbitrarily weak sub-specifications, say R1, R2, R3, ..., Rn. Then we derive the overall specification as

R = R1 ⊔ R2 ⊔ R3 ⊔ ... ⊔ Rn.

Given a product (program) P that is supposed to satisfy this specification, we are interested in establishing the property [P] ⊒ R by dividing the set of sub-specifications into three classes: sub-specifications for which correctness verification is appropriate will be factored together to produce

V = Rv(1) ⊔ Rv(2) ⊔ ... ⊔ Rv(k).

We do the same for specifications T and F. In the sequel, we discuss how to decide, for each term Ri, whether to factor it into V, T or F.
5.2 Correctness Verification Terms Specification terms that are typically good candidates for correctness verification methods can be characterized in two distinct but not orthogonal manners:
Semantic characterization. Ideal terms for correctness verification methods are relations that reflect information that the program seeks to preserve as it proceeds towards a solution. We have observed over some period of time that program specifications can be broken down into terms that reflect conservative properties that programs preserve while executing, and terms that reflect properties that programs attempt to achieve at the end of the execution. The former tend to be simpler to formulate and simpler to prove: it is much easier to prove preservation of a conservative property that the program maintains while proceeding towards termination than it is to prove complex properties about what the program is trying to establish.

Syntactic characterization. Ideal terms for correctness verification methods are reflexive and transitive relations (hence in particular equivalence relations). We can justify the choice of reflexive and transitive relations in two ways: a logic based argument and an algebraic argument.

– Logical. Most program proofs follow an inductive argument, which espouses the structure of the program (iterative, recursive). An inductive proof includes a basis of induction (dealing with trivial arguments) and an inductive step (dealing with composite arguments). Reflexivity of the specification makes the basis of induction trivial; transitivity of the specification makes the inductive step trivial. Hence the proof of a program with respect to a reflexive and transitive relation is fairly trivial in general, and involves a linear effort (as a function of the program's size).

– Algebraic. Whether the program is iterative or recursive, its inductive proof requires the derivation of a reflexive transitive root of the specification at hand; in the context of iterative programs, the derivation of this relation is the well known problem of inventing loop invariants [14]. When the specification is reflexive and transitive, it is its own reflexive transitive root; hence the invention of the inductive argument, which is reputed to be very difficult (and nearly impossible to automate), is trivial for reflexive transitive specifications.
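As a concrete (and hypothetical) illustration of such a conservative, reflexive and transitive property, consider the permutation relation Prm on arrays: a sort that modifies the array only by swapping two cells preserves Prm at every step, so the "invariant" is the specification itself. The sketch below checks the invariant after each swap:

```python
from collections import Counter

def sort_by_swaps(a):
    """Selection-style sort that only ever swaps two cells, so the
    invariant Prm (the current array is a permutation of the initial
    array) holds after every modification. This is a sketch of the
    preservation argument, not the paper's actual program."""
    initial = Counter(a)
    for i in range(len(a)):
        m = min(range(i, len(a)), key=a.__getitem__)
        a[i], a[m] = a[m], a[i]          # swap: preserves Prm
        assert Counter(a) == initial     # Prm holds at every step
    return a

assert sort_by_swaps([3, 1, 2]) == [1, 2, 3]
```

Note how trivial the proof obligation is: each swap manifestly preserves the multiset of values, and reflexivity and transitivity of Prm do the rest.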
In practice, the two characterizations given above appear to hold simultaneously most of the time, as the following examples show.

P = insertionsort. R = Prm ⊔ Ord. We recommend taking V = Prm, since Prm is an equivalence relation between the initial state of the array and the current state, and reflects what property the program tries to preserve as it sorts the array. Note that the loop invariant required to prove that the program satisfies Prm is Prm itself (no inductive argument is involved, because Prm is reflexive and transitive). Note also that it is in fact very easy to prove the preservation of Prm: whenever the array is modified, it is modified by merely swapping two cells, hence preserving Prm —end of proof.

As another (even more trivial) example, consider the following factorial program.

P = begin k:=1; f:=1; while k < n do begin k:=k+1; f:=f*k end end.

6 A Case Study: Gaussian Elimination

We consider a space defined by variables A, of type matrixtype (an N × N matrix of reals, N > 0); B and X, of type vectortype (vector of size N). We consider that we are interested in proving the following properties about our Gaussian elimination program: that the system of equations obtained after the Gaussian elimination process has the same set of solutions as the initial system; that it is triangular; and that the final result obtained in X solves the initial system with an acceptable precision (say ε). These three requirements are formulated in the following specifications:
SamSol = {(s, s′) | ∀Y : A(s) · Y = B(s) ⇔ A(s′) · Y = B(s′)}.

TriAng = {(s, s′) | ∀i, j : 1 ≤ j < i ≤ N ⇒ A(s′)[i, j] = 0}.

Prec = {(s, s′) | Norm(A(s) · X(s′) − B(s)) < ε}, where Norm is some norm defined on space real^N.

For each of these relations, we derive the adequacy vector to determine which method is best adapted to them. The results of this analysis are shown in figure 3. We summarize the results of figure 3 in the following table, where each column is labeled with a specification term and each row is labeled with a verification method.
             SamSol   TriAng   Prec
Avoidance       3       -2      -3
Removal        -1        2       3
Tolerance      -3        5       2
We infer from this table that specification SamSol is best handled by a correctness method (fault avoidance), specification Prec is best handled by a testing method (fault removal), and specification TriAng is best handled by a fault tolerance method. An interesting property of this particular example is that it also shows that SamSol is the specification that fault avoidance can handle best, Prec is the specification that fault removal can handle best, and TriAng is the specification that fault tolerance can handle best; this need not always be the case, of course, but is nevertheless a nice illustration.
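The table's assignments are plausible on executability grounds alone: TriAng and Prec are directly executable predicates (a fault tolerance check or a test oracle can evaluate them), whereas SamSol quantifies over all vectors Y and is not, which points towards proof. A sketch of the two executable terms (our own helper names; the max-norm is chosen arbitrarily for Norm):

```python
def matvec(A, X):
    """Product of matrix A with vector X."""
    return [sum(A[i][j] * X[j] for j in range(len(X))) for i in range(len(A))]

def triang(A):
    """TriAng: every entry strictly below the diagonal is zero."""
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))

def prec(A0, B0, X, eps=1e-13):
    """Prec: the computed X solves the *initial* system (A0, B0) to
    within eps, using the max-norm for illustration."""
    r = [abs(v - b) for v, b in zip(matvec(A0, X), B0)]
    return max(r) < eps

A0 = [[2.0, 1.0], [4.0, 1.0]]
B0 = [3.0, 5.0]
assert prec(A0, B0, [1.0, 1.0])     # X = (1, 1) solves the system
assert triang([[2.0, 1.0], [0.0, -1.0]]) and not triang(A0)
```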
6.1 Experimental Observations The Gaussian elimination program which we wrote to satisfy the compound specification
R = SamSol ⊔ TriAng ⊔ Prec

is five pages long, for a total of about 220 lines of C code. Its proof, using an inductive relational decomposition method, is 17 pages long, and not altogether very convincing. By contrast, deployment of our approach has produced the following outcome:
Fault Tolerance. Deployment of the fault tolerance capability with respect to specification TriAng involves a very simple executable condition (to the effect that one side of the row is zero) and a very simple recovery routine (taking a linear combination of two rows).

Fault Avoidance. Verifying that the program satisfies specification SamSol is fairly straightforward: the backward substitution step does not affect variables A and B, hence satisfies SamSol vacuously; the Gaussian elimination step affects A and B only in the context of replacing one row by a linear combination of this row with another —which satisfies SamSol by virtue of a simple result of numerical analysis.

Certification Testing. Rather than derive a specification of the form A∖Prec (for some set A) that the program is certified to refine, we have opted instead to measure the ratio of inputs for which specification Prec is satisfied, for ε = 10⁻¹³. For some values of N between 3 and 20, we have run 1000 experiments, whereby we produce random values for A and B and attempt to solve the system of equations AX = B; then we check the ratio of correct answers (according to Prec). We find:
N             3     6     8    10    12    14    16    18    20
Percentage  97.9  93.3  90.3  85.2  81.8  76.2  73.6  66.3  58.0
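The experiment can be sketched as follows (our own naive solver without pivoting, standing in for the paper's 220-line C program; the exact percentages above depend on the solver and random number generator, so do not expect to reproduce them):

```python
import random

def solve(A, B):
    """Naive Gaussian elimination with back substitution (no pivoting)."""
    n = len(B)
    A = [row[:] for row in A]; B = B[:]
    for k in range(n):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            B[i] -= m * B[k]
    X = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * X[j] for j in range(i + 1, n))
        X[i] = (B[i] - s) / A[i][i]
    return X

def ratio_prec(n, runs=200, eps=1e-13, seed=0):
    """Fraction of random n-by-n systems for which Prec holds."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(runs):
        A = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        B = [rng.uniform(-1, 1) for _ in range(n)]
        try:
            X = solve(A, B)
        except ZeroDivisionError:
            continue    # counts as an unsuccessful run
        r = max(abs(sum(A[i][j] * X[j] for j in range(n)) - B[i])
                for i in range(n))
        ok += r < eps
    return ok / runs

r = ratio_prec(3)
assert 0.0 <= r <= 1.0
```

The downward trend of the measured ratio as N grows reflects the accumulation of roundoff error, which is precisely why Prec is assigned to testing rather than proof.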
Figure 3. Assigning Methods to Terms: Gauss Elimination. (For each specification term —SamSol, TriAng and Prec— the figure tabulates the features Arity, Reflexivity and Transitivity, Coding Complexity, Execution Time, and Inductive Reasoning; each feature value is scored for the methods Avoidance, Removal and Tolerance, and the value holding for the term is selected. Summing the selected scores yields 3, -1, -3 for SamSol; -2, 2, 5 for TriAng; and -3, 3, 2 for Prec.)
6.2 Analytical Observations

In section 6.1, we have discussed how dispatching the three terms of the specification onto three appropriate methods led to an economical verification and validation effort. In this section we discuss why any one method, applied exclusively to the whole specification, would have been ineffective, or prohibitively expensive (or both).
Avoidance. Application of a program correctness method to the Gaussian elimination program with respect to the whole specification at hand requires a loop invariant that captures such aspects as: the selection of the pivot, the swapping of rows, the round off error control, the invertibility of the matrix, the preservation of the solution set, etc. Because our method applies the correctness method to relation SamSol alone, which is reflexive and transitive, the loop invariant is merely (s0, s) ∈ SamSol, where s0 is the (symbolic) initial state and s is the current state.

Removal. If we were to apply program testing to the program at hand, we would be unable to infer any property beyond the observations we make on the test data, and we would be unable to distinguish between observations of failure that stem from cumulative roundoff errors and those that stem from logical
design faults.
Tolerance. If we were to apply program fault tolerance to this program, we would adversely affect its performance (due to time and space overheads) and we would significantly magnify its complexity (hence reduce its understandability and, paradoxically, its reliability).
By dispatching various aspects of the specification to different verification methods, we obviate the shortcomings of each method, as we deploy each method where its impact is maximal and its cost is minimal. The availability of a common model allows us to combine results obtained from different methods into a single statement about the correctness of the program at hand with respect to the specification of interest.
7 Conclusion We briefly review our results, then we discuss prospects of future research.
7.1 Synthesis In this paper, we have presented an integrated approach to the verification and validation of software products, and
have discussed how this approach can be used to minimize the cost and maximize the impact of a verification and validation effort. Among our contributions, we mention:
A theoretical result to the effect that all three families of verification and validation methods (proving, certification testing, fault tolerance) can be modeled by a single formula, to the effect that we aim to show that the program refines some specification.

A revisited result to the effect that complex specifications can be structured as aggregates of simpler specifications in a way that enhances constructibility and readability without violating the principle of abstraction.

The observation, barely illustrated in this paper but otherwise borne out by our experience, that each term of a specification structured as discussed above lends itself to a specific verification and validation method much more than to the others.

The observation, equally borne out by our experience, that each family of methods is better adapted to some types of specification terms than to others.

A tabular method that allows one to assign methods to specification terms in a straightforward analytical manner.
These observations lead us to the belief that it is possible to achieve significant gains in the quality and productivity of verification and validation work by a judicious decomposition of complex specifications, and the judicious assignment of methods to specification terms. The examples we have used in this paper, while they are very simple, allow us nevertheless to illustrate how we can dispatch specification terms to verification methods in such a way that each method is applied well within the realm of its optimal performance: hence we apply program verification without having to cope with the difficult task of inventing loop invariants, we apply program fault tolerance without having to deal with the tedium, risk and expense of periodically saving the program state, and we apply program testing without the uncertainties that stem from the absence of an automated oracle.
7.2 Prospects We have mentioned in section 2 that our results can be extended to deal with specifications of objects (in the sense of OOP), since object specifications can be represented by relations; we still consider that application of our results to object specifications is not straightforward, and deserves some research attention. The other issue that we wish to
focus our attention on is the experimental validation of our approach on large scale examples; we are confident that the application of a variety of methods will help us achieve gains in productivity and quality that are not possible with a single method. We are currently looking at the Space Shuttle re-entry software requirements [13], with a view to formulating some aspects in relational terms and determining, ahead of time, which method is best adapted for each term that we formulate. Another issue that we wish to consider is the application of testing for the purpose of reliability estimation rather than certification; this would allow us to overcome a limitation that we currently suffer (that of restricting the domain of the specification terms on which we apply testing), but would replace our logical conclusions with probabilistic conclusions.
References

[1] S. Becker and J. Whittaker. Cleanroom Software Engineering Practice. IDEA Publishing, 1997.
[2] N. Boudriga, F. Elloumi, and A. Mili. The lattice of specifications: Applications to a specification methodology. Formal Aspects of Computing, 4:544–571, 1992.
[3] E. Dijkstra. A Discipline of Programming. Prentice Hall, 1976.
[4] D. Gries. The Science of Programming. Springer Verlag, 1981.
[5] E. Hehner. A Practical Theory of Programming. Prentice Hall, 1992.
[6] R. Linger. Cleanroom software engineering for zero-defect software. In Proceedings, 15th International Conference on Software Engineering, Baltimore, MD, May 1993.
[7] R. Linger and P. Hausler. Cleanroom software engineering. In Proceedings, 25th Hawaii International Conference on System Sciences, Kauai, Hawaii, January 1992.
[8] Z. Manna. A Mathematical Theory of Computation. McGraw Hill, 1974.
[9] A. Mili, J. Desharnais, F. Mili, and M. Frappier. Computer Program Construction. Oxford University Press, 1994.
[10] G. Schmidt and T. Stroehlein. Relations and Graphs, Discrete Mathematics for Computer Scientists. EATCS Monographs on Theoretical Computer Science. Springer Verlag, 1993.
[11] D. Siewiorek and R. Swarz. Reliable Computer Systems, Design and Implementation. Digital Press, 1992.
[12] D. Skuce and A. Mili. Behavioral specifications in object oriented programming. Journal of Object-Oriented Programming, pages 41–49, January 1995.
[13] R. Staff. Space shuttle orbiter, operational flight, level C, functional subsystem software requirements: guidance, navigation and control, part A: entry through landing. Technical Report P.O. No 1970483303, PDRD P1433q, WBS 1.4.3.2 (SFOC-FE0036), Rockwell Aerospace, August 1996.
[14] J. Stark and A. Ireland. Invariant discovery via failed proof attempts. In Proceedings, 8th International Workshop on Logic Based Program Synthesis and Transformation, Manchester, UK, June 1998.