Correct Programs without Proof? On Checker-Based Program Verification
Wolfgang Goerigk, Thilo Gaul, Wolf Zimmermann
To be published in Springer Advances in Computing Sciences, Proc. ATOOLS'98, Malente, Germany, 1998
1 Introduction
In many cases, the effort of proving the correctness of large program systems does not seem justifiable. Since heuristics and programming tricks are used, and are necessary, to solve complex problems successfully, mathematical inductive argumentation often fails because the algorithms to be verified become too complex and tricky. We need more modular approaches to guarantee program correctness. Our paper proposes a checker-based approach to program verification, which works if partial correctness of the application suffices. In many cases, it is much easier to check that a given result is a correct solution of a problem than to verify the generating algorithm. A classical example is the solution of systems of linear equations, where a simple matrix-vector multiplication is sufficient to double-check the calculated results. More realistic examples come, for instance, from compiler verification. In (Heberle et al. 1998) we show how a complete compiler front end, including lexical, syntactical and semantical analysis, can be checked by verified and guided unparsing. In the back end, sophisticated algorithms for, e.g., register allocation or instruction scheduling, though hard to verify, are easily checkable because their correctness predicates are very simple. If we build checker routines into the program code, large program parts need not be verified, because they are checked completely. We can concentrate on the verification of the checker code. Thus, we propose a modular approach using a combination of checking and proving. We may concentrate on the full correctness of small program parts in order to guarantee the partial correctness of the entire program. Moreover, we can formulate checker routines as correctness predicates in a sufficiently small functional language.
Although our ideas are quite general and apply to program certification using code inspection as well, we will concentrate here on mechanical program verification for functional Lisp programs using the Boyer-Moore theorem prover ACL2 (Kaufmann Moore 1994), so as to link mechanical program verification for Lisp programs to the Verifix and VerComp work¹ (Goerigk et al. 1996), (Gaul et al. 1997) on rigorous compiler and compiler implementation verification for ComLisp (Goerigk Hoffmann 1996). In Verifix and VerComp we use preservation of partial correctness as the central implementation correctness notion (Müller-Olm 1996), (Goerigk Müller-Olm 1996), (Goerigk et al. 1996). Thus, mechanically proved partial correctness of Lisp functions carries over to binary executable programs on concrete hardware in a rigorous mathematical sense (Goerigk Hoffmann 1998), (Goerigk 1996). Checkers establish correctness properties of programs without proof, provided the checkers are correct. It is not only convenient but sometimes even necessary to assure correctness properties using checkers; in section 4 we will give an example. On the other hand, runtime checks need execution time, and it will remain a major goal and challenge of program verification to prove that runtime checks are unnecessary, i.e. that programs are correct.

¹ The work reported here has been supported by the Deutsche Forschungsgemeinschaft (DFG) in the Verifix and VerComp projects.
2 Checkers and partial correctness
We assume a programming task to be specified by pre- and post-conditions in VDM style: Let a (partial) function $f : A \xrightarrow{\text{part}} B$ be specified by a pre-condition $P \subseteq A$ (or $P : A \xrightarrow{\text{tot}} \mathit{Bool}$) and a post-condition $Q \subseteq A \times B$ (or $Q : A \times B \xrightarrow{\text{tot}} \mathit{Bool}$). We use $Q(x, y)$ for $(x, y) \in Q$, which means that $Q$ holds for $(x, y) \in A \times B$. $f$ is called (totally) correct w.r.t. $P$ and $Q$ iff, for all $x \in A$,

$$P(x) \Rightarrow Q(x, f(x))$$

holds, i.e. if the result $f(x)$ is guaranteed to be defined and to fulfill the post-condition $Q$ whenever the pre-condition $P$ holds for the argument $x$. $f$ is called partially correct w.r.t. $P$ and $Q$ iff, for all $x \in A$,

$$P(x) \wedge \mathit{defined}(f(x)) \Rightarrow Q(x, f(x)).$$

Partial correctness requires $Q$ to hold only if the result $f(x)$ is defined. In many cases, and in particular for many software engineering tasks (where $f$ denotes a program semantics), partial correctness is the appropriate correctness notion. The following remark is the source of a very practical approach to software verification, which we call checker-based: For any given definition of $f$, we can easily define a partially correct checked version $f^Q$ of $f$ by

$$f^Q : A \xrightarrow{\text{part}} B : x \mapsto \begin{cases} f(x) & \text{if } \mathit{defined}(f(x)) \wedge Q(x, f(x)) \\ \text{undefined} & \text{otherwise} \end{cases} \tag{1}$$

Obviously, the result of $f^Q$ is only defined if $Q$ holds for $(x, f(x)) \in A \times B$, and therefore $f^Q$ is indeed partially correct w.r.t. $P$ and $Q$.
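The construction of a checked version can be sketched in a few lines of Python. This is only an illustration and not part of the ACL2 development: `None` stands in for "undefined" (a non-terminating call simply stays undefined), and the names `checked`, `isqrt_buggy` and `is_isqrt` are hypothetical.

```python
from typing import Callable, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def checked(f: Callable[[A], Optional[B]],
            post: Callable[[A, B], bool]) -> Callable[[A], Optional[B]]:
    """Return a checked version of f; None models an undefined result."""
    def f_checked(x: A) -> Optional[B]:
        y = f(x)
        if y is None:                 # f(x) is itself undefined
            return None
        # Deliver the result only if the post-condition holds for (x, y).
        return y if post(x, y) else None
    return f_checked


def isqrt_buggy(n: int) -> int:
    """Hypothetical unverified program: integer square root, wrong for n == 10."""
    return int(n ** 0.5) + (1 if n == 10 else 0)


def is_isqrt(n: int, r: int) -> bool:
    """Post-condition: r is the integer square root of n."""
    return r * r <= n < (r + 1) * (r + 1)


safe_isqrt = checked(isqrt_buggy, is_isqrt)
```

With these assumptions, `safe_isqrt(16)` delivers the result of the unverified function, while `safe_isqrt(10)` yields `None` instead of the wrong answer: the checked version may fail to deliver a result, but it never delivers an incorrect one.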
2.1 The checker approach
$f^Q$ is not necessarily algorithmic, even if $f$ is, since the post-condition $Q$ may be defined implicitly. Therefore, we formulate a slight generalization of the above remark, where the checker predicate (filter) $q$ is intended to be operational:
Let $q : A \times B \xrightarrow{\text{part}} \mathit{Bool}$ be a predicate s.t. for all $x \in A$, $y \in B$ we have $(\, q(x, y) = tt \Rightarrow Q(x, y) \,)$. Then $f^q$ is partially correct w.r.t. $P$ and $Q$ as well:

$$f^q : A \xrightarrow{\text{part}} B : x \mapsto \begin{cases} f(x) & \text{if } \mathit{defined}(f(x)) \wedge q(x, f(x)) = tt \\ \text{undefined} & \text{otherwise} \end{cases} \tag{2}$$

If we use $q$ to check the results of $f$, we intend to get $q$ "as near as possible" to $Q$ in order to make the checked version $f^q$ of $f$ a "good" program; otherwise we could just use $f^{\mathit{false}}$, which would not give any result at all. However, we are not interested in delivering "bad" programs. We would not be able to sell them anyway. On the other hand, if we defined $q$ by $f$ itself, i.e. $q(x, y) \Leftrightarrow (y = f(x))$ (assuming strict equality), then the proof of $(\, q(x, y) \Rightarrow Q(x, y) \,)$ would force us to prove the correctness of $f$. That would make the check unnecessary.

The important observation is that our generalization opens up a wide spectrum of possibilities to combine checking and proving without any compromise w.r.t. correctness. The completely checked version $f^q$ above, as one extreme, does not require any correctness proof for $f$ and indeed is (partially) correct w.r.t. $P$ and $Q$. The only fact we have to prove is that $q$ guarantees $Q$ to hold. A typical example comes from school mathematics: Suppose we have to deliver a program which solves a system $A x = b$ of $n$ linear equations. We can easily implement matrix-vector multiplication ($q$) in order to double-check the results of, for instance, a Gauss elimination procedure ($f$). $f^q$ then checks the post-condition $Q(A, b, x) \Leftrightarrow (A x = b)$ whenever a result $x$ is produced.

The other extreme in the spectrum of possibilities would be to deliver $f$ itself, without any checking, but together with a full correctness proof. This is and will remain the aim of classical program verification. However, in the following we want to demonstrate the advantages of combining checks and proof.
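The linear-equation example can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the authors' code: exact rational arithmetic (`fractions.Fraction`) is used so that the equality test in the checker is meaningful, `None` models "undefined", and the names `gauss_solve`, `q` and `solve_checked` are hypothetical.

```python
from fractions import Fraction
from typing import List, Optional

Vector = List[Fraction]


def gauss_solve(a, b) -> Optional[Vector]:
    """Unverified solver f (Gauss-Jordan elimination); None models 'undefined'."""
    n = len(a)
    # Augmented matrix [A | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None                       # no pivot found: give no result
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def q(a, b, x) -> bool:
    """Checker: matrix-vector multiplication, tests A x = b row by row."""
    return all(sum(aij * xj for aij, xj in zip(row, x)) == bi
               for row, bi in zip(a, b))


def solve_checked(a, b) -> Optional[Vector]:
    """Checked version of the solver: deliver x only if the check succeeds."""
    x = gauss_solve(a, b)
    if x is None or not q(a, b, x):
        return None
    return x
```

Only `q` would have to be verified here; the elimination routine may use any pivoting strategy or optimization without affecting the guarantee that a delivered `x` satisfies `A x = b`.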
2.2 The Goodenough/Gerhart approach
In (Goodenough Gerhart 1975) J. Goodenough and S. Gerhart propose to divide the program verification task into two separate parts, a proof part and a test part, which together are proved to imply correctness (Langmaack 1997). We will adopt this idea and divide the correctness problem into a checker part and a proof part. Let $Q_1 \subseteq A \times B$ (the theorem) and $q_2 : A \times B \xrightarrow{\text{part}} \mathit{Bool}$ (the test) be two predicates which together imply the post-condition $Q$, i.e. let us assume that we are able to prove, for all $x \in A$, $y \in B$,

$$P(x) \wedge \mathit{defined}(f(x)) \Rightarrow Q_1(x, f(x))$$

and

$$Q_1(x, y) \wedge (\, q_2(x, y) = tt \,) \Rightarrow Q(x, y).$$

In this case, checking $q_2$ is sufficient to guarantee correctness, i.e. the function $f^{q_2}$ (as defined below) is again partially correct w.r.t. $P$ and $Q$.

$$f^{q_2} : A \xrightarrow{\text{part}} B : x \mapsto \begin{cases} f(x) & \text{if } \mathit{defined}(f(x)) \wedge q_2(x, f(x)) = tt \\ \text{undefined} & \text{otherwise} \end{cases} \tag{3}$$
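The step from the two proof obligations to the partial correctness of the checked function is short. It can be spelled out as the following chain of implications, a sketch in the notation of this section, for an arbitrary $x \in A$:

```latex
\begin{aligned}
 & P(x) \wedge \mathit{defined}(f^{q_2}(x)) \\
 \Rightarrow\; & P(x) \wedge \mathit{defined}(f(x)) \wedge q_2(x, f(x)) = tt \wedge f^{q_2}(x) = f(x)
   && \text{by definition (3)} \\
 \Rightarrow\; & Q_1(x, f(x)) \wedge q_2(x, f(x)) = tt \wedge f^{q_2}(x) = f(x)
   && \text{by the first obligation (the theorem)} \\
 \Rightarrow\; & Q(x, f(x)) \wedge f^{q_2}(x) = f(x)
   && \text{by the second obligation with } y = f(x) \\
 \Rightarrow\; & Q(x, f^{q_2}(x))
\end{aligned}
```

The check contributes exactly the fact $q_2(x, f(x)) = tt$ at run time; everything else is established once and for all by proof.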
3 A running example: quicksort
Suppose we want to deliver a proved correct version of Quicksort, which sorts lists of rational numbers. Suppose also that we are not yet able to prove that the result of Quicksort is really sorted. In the following we want to demonstrate that we are nevertheless able to construct and to deliver the desired program, using our approach of checker-based program verification. The post-condition which specifies sorting algorithms, and in particular Quicksort, is defined as usual:

$$Q(x, y) \Leftrightarrow \mathit{issorted}(y) \wedge (\, \forall n : \mathit{ncount}(n, x) = \mathit{ncount}(n, y) \,)$$

where $\mathit{issorted}(y)$ specifies that the list $y$ is sorted, and $\mathit{ncount}(n, y)$ computes the number of occurrences of $n$ in the list $y$, i.e. we specify the result $y$ to be a sorted permutation of the argument $x$. $\mathit{issorted}$ and $\mathit{ncount}$ are both operational, and in section 3.1 we define the corresponding Lisp implementations. Thus, $Q(x, y)$ consists of two parts: the first part is the test $q_2(x, y) \equiv \mathit{issorted}(y)$, and the second part is the theorem $Q_1(x, y) \Leftrightarrow (\, \forall n : \mathit{ncount}(n, x) = \mathit{ncount}(n, y) \,)$. The pre-condition is true, and every ACL2 function is total.

Now we define a (functional) program $\mathit{quicksort}$ and prove that for every list $x$ the result $\mathit{quicksort}(x)$ is a permutation of $x$, i.e. we prove the theorem $Q_1(x, \mathit{quicksort}(x))$:

$$\forall x : \forall n : \mathit{ncount}(n, x) = \mathit{ncount}(n, \mathit{quicksort}(x)).$$

Using our earlier remark on the Goodenough/Gerhart approach, we now use (3) above and prove that the checked version

$$\mathit{quicksort}^{q_2} : x \mapsto \begin{cases} \mathit{quicksort}(x) & \text{if } \mathit{issorted}(\mathit{quicksort}(x)) \\ \text{undefined} & \text{otherwise} \end{cases}$$

is partially correct w.r.t. the pre-condition true and the post-condition $Q$. Section 3.1 gives a complete ACL2 proof script which proves this fact for a Lisp implementation of Quicksort. We can now be sure that our implementation $\mathit{quicksort}^{q_2}$ is safe in the sense that it will never return incorrect results; if we use a compiler which preserves partial correctness for implementation, we can guarantee this for the binary executable program as well.
Our result does not guarantee completeness or quality, i.e. we cannot guarantee that, unless we reach the resource limitations of the underlying hardware, the executable $\mathit{quicksort}^{q_2}$ program will indeed return results. However, we may use other, more traditional methods, like testing or code inspection, in order to convince ourselves of this property. We are ready to deliver our program with a safety guarantee in the above sense.
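The division of labour between the test and the theorem can be illustrated with a short Python sketch. It is not the ACL2 development of section 3.1: Python offers no proof of the permutation property, so `ncount` appears only to state it, `None` models "undefined", and the `sort` parameter and the function `reverse_only` are hypothetical additions for demonstration.

```python
from typing import Callable, List, Optional, Sequence


def quicksort(xs: Sequence) -> List:
    """Plain functional Quicksort; assumed here to be the unverified program."""
    if not xs:
        return []
    pivot, rest = xs[0], list(xs[1:])
    smaller = [v for v in rest if v < pivot]
    larger = [v for v in rest if v >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)


def issorted(ys: Sequence) -> bool:
    """The test: every element is <= its successor."""
    return all(a <= b for a, b in zip(ys, ys[1:]))


def ncount(n, xs: Sequence) -> int:
    """Number of occurrences of n in xs; used to state the permutation theorem."""
    return sum(1 for v in xs if v == n)


def quicksort_checked(xs: Sequence,
                      sort: Callable[[Sequence], List] = quicksort
                      ) -> Optional[List]:
    """Checked version: deliver the result only if it passes the sortedness test."""
    ys = sort(xs)
    return ys if issorted(ys) else None


def reverse_only(xs: Sequence) -> List:
    """Hypothetical faulty 'sort': a permutation of xs, but not sorted in general."""
    return list(reversed(xs))
```

`reverse_only` satisfies the permutation property but not sortedness, so `quicksort_checked([1, 2, 3], sort=reverse_only)` yields `None` rather than a wrong list. This mirrors the situation in the text: the permutation part is covered by the theorem, and only sortedness has to be checked at run time.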
3.1 The ACL2 proof script for quicksort
This section contains the complete ACL2 script for the Quicksort example. The following text is automatically generated from a running ACL2 proof script².

² We thank Michael Smith (Computational Logic Inc.) for his tool IACL2 (infix ACL2).
Specification
First, we define the functions which we need to specify Quicksort. The result of Quicksort is to be a sorted permutation of its argument list.
(defun leq (x y)
  (if (rationalp x)
      (if (rationalp y)
          (