Structural Simulation Proofs based on ASMs even for Non-Terminating Programs
Sabine Glesner, Institut für Programmstrukturen und Datenorganisation, Universität Karlsruhe, 76128 Karlsruhe, Germany. Email: [email protected]
Wolf Zimmermann, Institut für Informatik, Martin-Luther-Universität Halle-Wittenberg, 06099 Halle, Germany. E-mail: [email protected]
– Extended Abstract –
Abstract
When transforming programs, one often has the requirement to preserve their semantics, for example in compilers. To guarantee that such a requirement is fulfilled, formal proofs are necessary. We therefore want to do a simulation proof by showing that the state transitions which appear when running the original and the transformed program, resp., are the same. In a first attempt, one could try an inductive simulation proof over the number of these state transitions. But, as we show here, this approach does not work in general. Induction is fine as long as programs terminate, but it is not appropriate for non-terminating programs. In this paper, we show that a coinductive proof technique is necessary in principle and that the problem of simulation proofs for non-terminating programs can be solved easily using coinduction.
1 Problems with Simulation Proofs
In this section, we demonstrate that simulation proofs for non-terminating programs cannot be done by induction. As an example, consider the transformation of a repeat-loop into a while-loop:
    repeat S until ¬B;
is transformed into
    S; while B do S od;
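The two program shapes can be compared on concrete inputs with a small sketch. This is only an illustration under assumptions that are not part of the paper: states are modelled as Python dictionaries, the body S is a pure function from state to state, B is a predicate on states, and the bound max_bodies is a hypothetical device for cutting off non-terminating runs. Agreement of such bounded traces is evidence, not a proof.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Body = Callable[[State], State]   # S: assumed to return a fresh state
Cond = Callable[[State], bool]    # B: assumed to be side-effect free


def run_repeat(S: Body, B: Cond, state: State, max_bodies: int) -> List[State]:
    """repeat S until not B: record the state after each execution of S."""
    trace: List[State] = []
    for _ in range(max_bodies):
        state = S(state)
        trace.append(state)
        if not B(state):          # exit as soon as "not B" holds
            break
    return trace


def run_while(S: Body, B: Cond, state: State, max_bodies: int) -> List[State]:
    """S; while B do S od: record the state after each execution of S."""
    trace: List[State] = []
    if max_bodies < 1:
        return trace
    state = S(state)              # the unrolled first execution of S
    trace.append(state)
    while B(state) and len(trace) < max_bodies:
        state = S(state)
        trace.append(state)
    return trace


def inc(s: State) -> State:
    """Example body S: increment x."""
    return {"x": s["x"] + 1}


def below3(s: State) -> bool:
    """Example condition B: x < 3."""
    return s["x"] < 3


def always(s: State) -> bool:
    """Example condition B that never becomes false (non-terminating loop)."""
    return True
```

With inc and below3, both functions return the states x = 1, 2, 3. With always, both runs are cut off after max_bodies executions of the body, so only a finite prefix of the infinite behaviour is ever compared.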
To formally prove the correctness of this transformation, we need a formal semantics of these two programs. We use an operational description based on abstract state machines (ASMs) [Gur95]. In this framework, the operational semantics is given wrt. the abstract syntax tree (AST). Some nodes in the AST represent dynamic tasks, e.g. nodes of sort While represent the decision whether the loop body is executed. The AST must contain certain attributes, e.g. the dynamic task to be executed next (NT), the condition of a loop (cond), and the first task of the body of a loop (TT). Moreover, the operational semantics has a pointer to the task to be executed (CT), which moves through the program during execution as an abstract program pointer, cf. figure 1. The notation in this description is the one used in Montages [KP97].
[Figure 1: Program Transformation. Task graphs of the repeat-loop and of the while-loop. Node labels: I, S, B, Not, Repeat, While, T; edge labels: NT, TT, cond, opd. The graphs are annotated with the following transition rules.]
    if CT ∈ Repeat then if value(CT.cond) = true then CT := CT.NT else CT := CT.TT
    if CT ∈ Not then value(CT) := ¬ value(CT.opd)
    if CT ∈ While then if value(CT.cond) = true then CT := CT.TT else CT := CT.NT
The repeat- and the while-loop are each represented by a graph. Square nodes denote graphs themselves while circle nodes are tasks. I denotes the initial state while T is the terminal state. Dashed lines show the control flow of the program, the other lines the data flow (use-def-chains).
To show the correctness of the transformation of the repeat- into the while-loop, we try an inductive simulation proof showing that the two programs go through the same sequence of state transitions. We do an induction over the number of repetitions of the loops, assuming that the initial states are the same. Then one can easily see that the states reached in the following repetitions of the repeat- and the while-loop, resp., are also the same. (The first repetition is the base case. For space reasons, we do not give details of this argument.)
So, are we done now? Unfortunately not, because this is not a valid induction proof. If the loop does not terminate, then the induction step is not valid: in an inductive proof, the induction variable may be arbitrarily large but not infinite. Technically speaking, there is no next smaller element in the induction step which may be used in the argument, since the number of repetitions is infinite. In the next section, we show that this problem also shows up in the well-known Hoare calculus and that it can be solved using coinduction.
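The transition rules of figure 1 can be sketched as a tiny interpreter that moves CT through a task graph. Several things here are assumptions made for illustration and are not taken from the paper: the task sorts "Stmt" and "Expr" (standing for S and B), the default update CT := CT.NT after Not, Stmt and Expr tasks, the class Task, and the helpers build_repeat, build_while and prefix. Only the Repeat, Not and While rules follow the figure.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Dict, Iterator, List, Optional

Store = Dict[str, int]


@dataclass(eq=False, repr=False)
class Task:
    sort: str                                 # "Stmt", "Expr", "Not", "Repeat", "While", "T"
    NT: Optional["Task"] = None               # next task
    TT: Optional["Task"] = None               # first task of the loop body
    cond: Optional["Task"] = None             # condition task of a loop
    opd: Optional["Task"] = None              # operand of Not
    fn: Optional[Callable[[Store], object]] = None  # hypothetical: effect of S or B
    value: object = None                      # value(CT) for expression tasks


def run(first: Task, store: Store) -> Iterator[Store]:
    """Move CT through the graph; yield the store after each Stmt task."""
    ct = first
    while ct.sort != "T":
        if ct.sort == "Repeat":               # rule from figure 1
            ct = ct.NT if ct.cond.value is True else ct.TT
        elif ct.sort == "While":              # rule from figure 1
            ct = ct.TT if ct.cond.value is True else ct.NT
        elif ct.sort == "Not":                # rule from figure 1, then assumed CT := CT.NT
            ct.value = not ct.opd.value
            ct = ct.NT
        elif ct.sort == "Expr":               # assumed rule for B
            ct.value = ct.fn(store)
            ct = ct.NT
        elif ct.sort == "Stmt":               # assumed rule for S
            store = ct.fn(store)
            yield dict(store)
            ct = ct.NT
        else:
            raise ValueError(f"unknown task sort: {ct.sort}")


def build_repeat(S, B) -> Task:
    """Graph for: repeat S until not B."""
    t, s, b = Task("T"), Task("Stmt", fn=S), Task("Expr", fn=B)
    n = Task("Not", opd=b)
    r = Task("Repeat", cond=n, TT=s, NT=t)
    s.NT, b.NT, n.NT = b, n, r
    return s


def build_while(S, B) -> Task:
    """Graph for: S; while B do S od."""
    t, first, b = Task("T"), Task("Stmt", fn=S), Task("Expr", fn=B)
    body = Task("Stmt", fn=S)
    w = Task("While", cond=b, TT=body, NT=t)
    first.NT, b.NT, body.NT = b, w, b
    return first


def prefix(states: Iterator[Store], n: int) -> List[Store]:
    """First n observed stores of a possibly infinite run."""
    return list(islice(states, n))
```

Because run is a generator, a non-terminating loop simply yields an unbounded stream of stores. Calling prefix on both runs compares finite prefixes only, which mirrors the limitation of the inductive argument: every finite prefix can be checked, but no finite check covers the whole infinite run.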
2 Coinductive Simulation Proofs Based on ASMs
Let us consider one of the well-known proof rules of the Hoare calculus [Hoa69] which also uses the coinduction principle. If one wants to prove that a recursive procedure p is correct wrt. a precondition P and a postcondition Q, then one assumes that for all recursive calls of p within the procedure body of p, precondition P and postcondition Q hold. If the procedure p always terminates, then this would be an induction proof since the recursion depth of the inner calls is always smaller than the recursion depth of p itself. But if the procedure p does not terminate, then this is still a valid proof: not an induction but a coinduction proof. (Remember that the proof rules of the Hoare calculus do not assume termination of the programs.)
    {P} proc p ··· {P} p {Q} ··· endproc {Q}
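For comparison, the rule described above is often written as an inference rule in textbook presentations of partial correctness. The following is one such common formulation, not a rule quoted from this paper; the notation S_p for the body of p and the keyword call are assumptions introduced here.

```latex
\[
\frac{\{P\}\ \mathbf{call}\ p\ \{Q\} \;\vdash\; \{P\}\ S_p\ \{Q\}}
     {\{P\}\ \mathbf{call}\ p\ \{Q\}}
\]
```

Read from top to bottom: if the body S_p satisfies the specification under the assumption that every recursive call already satisfies it, then the call itself satisfies the specification. No termination argument enters the premise.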
In a coinductive proof, one shows that an implementation, here the procedure p, is correct wrt. its specification, here the pre- and postcondition P and Q. Correctness means that there is no contradiction between specification and implementation. This is exactly the fact that can be verified with the above-mentioned proof rule of the Hoare calculus [footnote 1].
In an inductive proof, starting from base cases, one assumes that a certain fact holds for finitely many items and shows that it also holds for the next item (“next” wrt. some well-founded order). In a coinductive proof, one goes the other way round: one assumes that the fact already holds for infinitely many items, e.g. for an infinite sequence. Then one shows that the fact still holds if one further item is added to the infinitely many ones, e.g. if a new element is added to the beginning of the infinite sequence.
Technically, coinduction proofs need a certain relation, called a bisimulation relation. The coinduction principle then states that whenever two elements are contained in a bisimulation relation, they are equal. In the same way as induction is a definition principle as well as a proof principle, coinduction can also be used to define structures and to prove properties of them. One should not be confused by the fact that both principles use structural arguments. We do not have enough space here to discuss coinduction in more detail. For a short introduction with many examples see appendix B in [NNH99].
In our case, we need a special instance of the coinduction principle, namely for infinite lists (of state transitions). For infinite lists, a bisimulation relation R is defined as follows: If (l1, l2) ∈ R, then head(l1) = head(l2) and (tail(l1), tail(l2)) ∈ R. Here, head is the function that takes a list and returns its first element; tail takes a list and returns the list obtained by removing its first element.
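The head/tail condition can be made concrete with lazily evaluated infinite lists. The sketch below is an illustration only; the names Stream, nats_from, succ_of and agree_up_to are hypothetical. Since a program can inspect only finitely many elements, agree_up_to checks the bisimulation condition to a bounded depth. The coinduction principle is what justifies the step from "the condition holds at every pair reached" to equality of the whole infinite lists.

```python
from typing import Callable, NamedTuple


class Stream(NamedTuple):
    """An infinite list: a head element and a delayed tail."""
    head: int
    tail: Callable[[], "Stream"]


def nats_from(n: int) -> Stream:
    """The infinite list n, n+1, n+2, ..."""
    return Stream(n, lambda: nats_from(n + 1))


def succ_of(s: Stream) -> Stream:
    """The infinite list obtained by adding 1 to every element of s."""
    return Stream(s.head + 1, lambda: succ_of(s.tail()))


def agree_up_to(l1: Stream, l2: Stream, depth: int) -> bool:
    """Check head(l1) = head(l2) and then move to the tails, depth times."""
    for _ in range(depth):
        if l1.head != l2.head:
            return False
        l1, l2 = l1.tail(), l2.tail()
    return True
```

Here the candidate relation is R = { (nats_from(n + 1), succ_of(nats_from(n))) | n a natural number }: both heads are n + 1, and the pair of tails is again in R with n + 1 in place of n. A call such as agree_up_to(nats_from(1), succ_of(nats_from(0)), 200) exercises this condition on the first 200 pairs.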
Let us return to our simulation proof given in section 1. We want to prove that the implementation of the repeat-loop by the while-loop is correct. We do this by case distinction:
First case: All loop bodies terminate. Then we define the bisimulation R as follows: R contains all pairs of lists such that the i-th element of the first (second, resp.) list contains the state transitions of the repeat-loop (while-loop, resp.) in the i-th repetition of the loop. It is easy to see that the bisimulation condition is met. Hence we conclude that all pairs of lists in R are equal, which means that both loops go through the same state transitions.
Second case: For some natural number i, one of the two loop bodies does not terminate in the i-th repetition of the loop. Then one can show easily by induction that the state transitions in the first (i-1) repetitions of the two loops are the same. Hence, in the i-th repetition both loops start in the same state. Because they consist of the same syntactical structures, they go through the same state transitions and neither of them terminates.
Before ending this section, we want to remark that structural and operational semantics such as Montages [KP97] or SOS (structural operational semantics) [Plo81] are always coinductive definitions. Such a semantics is based on the (inductive) definition of the syntax of a programming language and defines a state transition system with potentially infinite state transition sequences, see [JR97] for details.
Footnote 1: The proof rule for the while loop is also based on the coinduction principle since the while loop is also a potential source of non-termination.
3 A Coinductive Proof Method for ASMs
We want to emphasize that coinductive definitions are common when formally defining the semantics of programming languages, even if they are not always called that. The adequate proof principle is coinduction, which can be used in abstract state machines without any problems, as demonstrated in this paper. To fully exploit the coinductive definition principle, one can put ASMs together hierarchically, analogous to StateCharts [Har87, DH89]. Then coinduction is the method of choice to define the composition semantically as well as to prove properties of it.
Acknowledgements
We want to thank the people in the Verifix project for many helpful discussions on coinduction, especially Axel Dold, Wolfgang Goerigk, Gerhard Goos and Andreas Wolf.
References
[DH89] D. Drusinsky and D. Harel. Using statecharts for hardware description and synthesis. IEEE Trans. on Computer Design, 1989.
[Gur95] Y. Gurevich. Evolving Algebras: Lipari Guide. In E. Börger, editor, Specification and Validation Methods. Oxford University Press, 1995.
[Har87] D. Harel. Statecharts: A visual formalism for complex systems. In Science of Computer Programming, pages 231–274, 1987.
[Hoa69] C.A.R. Hoare. An Axiomatic Basis for Computer Programming. Communications of the ACM, 12(10):576–580, October 1969.
[JR97] Bart Jacobs and Jan Rutten. A Tutorial on (Co)Algebras and (Co)Induction. EATCS Bulletin, 67:222–259, 1997.
[KP97] Philipp W. Kutter and Alfonso Pierantonio. Montages: Towards the Semantics of Realistic Programming Languages, January 1997.
[NNH99] Flemming Nielson, Hanne Riis Nielson, and Chris Hankin. Principles of Program Analysis. Springer, 1999.
[Plo81] Gordon D. Plotkin. A structural approach to operational semantics. Report DAIMI FN-19, Computer Science Department, Aarhus University, Denmark, September 1981.