The Calculational Design of a Generic Abstract Interpreter

Patrick COUSOT
LIENS, Département de Mathématiques et Informatique
École Normale Supérieure, 45 rue d'Ulm, 75230 Paris cedex 05, France

Abstract. We present in extenso the calculation-based development of a generic compositional reachability static analyzer for a simple imperative programming language by abstract interpretation of its formal rule-based/structured small-step operational semantics.
Contents

1. Introduction
2. Definitions
3. Values
   3.1 Machine integers
   3.2 Errors
4. Properties of Values
5. Abstract Properties of Values
   5.1 Galois connection based abstraction
   5.2 Componentwise abstraction of sets of pairs
   5.3 Initialization and simple sign abstraction
   5.4 Initialization and interval abstraction
   5.5 Algebra of abstract properties of values
6. Environments
   6.1 Concrete environments
   6.2 Properties of concrete environments
   6.3 Nonrelational abstraction of environment properties
   6.4 Algebra of abstract environments
7. Semantics of Arithmetic Expressions
   7.1 Abstract syntax of arithmetic expressions
   7.2 Machine arithmetics
   7.3 Operational semantics of arithmetic expressions
   7.4 Forward collecting semantics of arithmetic expressions
   7.5 Backward collecting semantics of arithmetic expressions
8. Abstract Interpretation of Arithmetic Expressions
   8.1 Lifting Galois connections at higher-order
   8.2 Generic forward/top-down abstract interpretation of arithmetic expressions
   8.3 Generic forward/top-down static analyzer of arithmetic expressions
   8.4 Initialization and simple sign abstract forward arithmetic operations
   8.5 Generic backward/bottom-up abstract interpretation of arithmetic expressions
   8.6 Generic backward/bottom-up static analyzer of arithmetic expressions
   8.7 Initialization and simple sign abstract backward arithmetic operations
9. Semantics of Boolean Expressions
   9.1 Abstract syntax of boolean expressions
   9.2 Machine booleans
   9.3 Operational semantics of boolean expressions
   9.4 Equivalence of boolean expressions
   9.5 Collecting semantics of boolean expressions
10. Abstract Interpretation of Boolean Expressions
   10.1 Generic abstract interpretation of boolean expressions
   10.2 Generic static analyzer of boolean expressions
   10.3 Generic abstract boolean equality
   10.4 Initialization and simple sign abstract arithmetic comparison operations
11. Reductive Iteration
   11.1 Iterating monotone and reductive abstract operators
   11.2 Reductive iteration for boolean and arithmetic expressions
   11.3 Generic implementation of reductive iteration
12. Semantics of Imperative Programs
   12.1 Abstract syntax of commands and programs
   12.2 Program components
   12.3 Program labelling
   12.4 Program variables
   12.5 Program states
   12.6 Small-step operational semantics of commands
   12.7 Transition system of a program
   12.8 Reflexive transitive closure of the program transition relation
   12.9 Predicate transformers and fixpoints
   12.10 Reachable states collecting semantics
13. Abstract Interpretation of Imperative Programs
   13.1 Fixpoint precise abstraction
   13.2 Fixpoint approximation abstraction
   13.3 Abstract invariants
   13.4 Abstract predicate transformers
   13.5 Generic forward nonrelational abstract interpretation of programs
   13.6 The generic abstract interpreter for reachability analysis
   13.7 Abstract initial states
   13.8 Implementation of the abstract entry states
   13.9 The reachability static analyzer
   13.10 Specializing the abstract interpreter to reachability analysis from the entry states
14. Conclusion
1. Introduction

The 1998 Marktoberdorf international summer school on Calculational System Design has been "focusing on techniques and the scientific basis for calculation-based development of software and hardware systems as a foundation for advanced methods and tools for software and system engineering. This includes topics of specification, description, methodology, refinement, verification, and implementation." Accordingly, the goal of our course was to explain both

• the calculation-based development of an abstract interpreter for the automatic static analysis of a simple imperative language, and
• the principles of application of abstract interpretation to the partial verification of programs by abstract checking.

For short, in these course notes we concentrate only on the calculational design of a simplified but compositional version of the static analyzer. Despite the fact that the considered imperative language is quite simple and the corresponding analysis problem is supposed to be classical and satisfactorily solved for a long time [9], the proposed analyzer is both compositional and much more precise (e.g. for boolean expressions) than the often naïve solutions proposed in the literature. Consequently the results presented in these notes, although quite elementary, go much beyond a mere introductory survey and are of universal use.

A static analyzer takes as input a program written in a given programming language (or a family thereof) and statically, automatically and in finite time¹ outputs an approximate description of all its possible runtime behaviors considered in all possible execution environments (e.g. for all possible input data). The approximation is sound or conservative in that not a single case is forgotten, but it may not be precise since the problem of determining the strongest program properties (including e.g. termination) is undecidable.
This automatically determined information can then be compared to a specification, either for program transformation, validation, or error detection². This comparison can also be inconclusive when the automatic analysis is too imprecise. The specification can be provided by the formal semantics of the language, which formally defines which errors are detected at runtime, or can be defined by the programmer for interactive abstract checking. The purpose of the static analysis is to detect the presence or absence of runtime errors at compile-time, without executing the program. Because the abstract checking is exhaustive, it can detect rare faults which are difficult to come upon by hand. Because the static determination of non-trivial dynamic properties is undecidable, the analysis may also be inconclusive for some tests. By experience, this usually represents from 5 to 20% of the cases, which shows that static analysis can considerably reduce the validation task (whether it is done by hand or semi-automatically). See [27] for a recent and successful experience on industrial critical code.

The main idea of abstract interpretation [5, 9, 13] is that any question about a program can be answered using some approximation of its semantics. This approximation idea applies to the semantics themselves [6], which describe program execution at an abstraction level which is often very far from the hardware level but is nevertheless precise enough to conclude e.g. on termination (but not e.g. on exact execution times). The specification of a correct static analyzer and its proof can be understood as an approximation of a semantics, a process which is formalized by the abstract interpretation theory. In the context of the Marktoberdorf summer school, these course notes put the emphasis on viewing abstract interpretation as a formal method for the calculational design of static analyzers for programming languages equipped with a formally defined semantics.

¹ From a few seconds for small programs to a few hours for very large programs.
² As shown in the course, the situation is not so simple since the analysis and verification do interact.
2. Definitions

A poset ⟨L, ⊑⟩ is a set L with a partial order ⊑ (that is, a reflexive, antisymmetric and transitive binary relation on L) [20]. A directed complete partial order (dcpo) ⟨L, ⊑, ⊔⟩ is a poset ⟨L, ⊑⟩ such that increasing chains x₀ ⊑ x₁ ⊑ … of elements of L have a least upper bound (lub, join) ⊔_{i≥0} xᵢ. A complete partial order (cpo) ⟨L, ⊑, ⊥, ⊔⟩ is a dcpo ⟨L, ⊑, ⊔⟩ with an infimum ⊥ = ⊔∅. A complete lattice ⟨L, ⊑, ⊥, ⊤, ⊔, ⊓⟩ is a poset ⟨L, ⊑⟩ such that any subset X ⊆ L has a lub ⊔X. It follows that ⊥ = ⊔∅ is the infimum, ⊤ = ⊔L is the supremum and any subset has a greatest lower bound (glb, meet) ⊓X = ⊔{x ∈ L | ∀y ∈ X : x ⊑ y}.

A map F ∈ L ↦ L of L into L is monotonic (written F ∈ L ↦^mon L) if and only if

    ∀x, y ∈ L : x ⊑ y ⟹ F(x) ⊑ F(y).

If F ∈ L ↦^mon L is a monotonic map of L into L and m ⊑ F(m), then lfp⊑_m F denotes the ⊑-least fixpoint of F which is ⊑-greater than or equal to m (if it exists). It is characterized by

    lfp⊑_m F = F(lfp⊑_m F),        m ⊑ lfp⊑_m F,
    (m ⊑ x) ∧ (F(x) = x) ⟹ lfp⊑_m F ⊑ x.

lfp⊑ F = lfp⊑_⊥ F is the least fixpoint of F. The greatest fixpoint (gfp) is defined dually, replacing ⊑ by its inverse ⊒, the infimum ⊥ by the supremum ⊤, the lub ⊔ by the greatest lower bound (glb) ⊓, etc. In order to generalize the Kleene/Knaster/Tarski fixpoint theorem, the transfinite iteration sequence is defined as (𝕆 is the class of ordinals)

    F⁰(m) ≜ m,
    F^{δ+1}(m) ≜ F(F^δ(m))            for successor ordinals,
    F^λ(m) ≜ ⊔_{δ<λ} F^δ(m)           for limit ordinals.        (1)
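When the increasing iterates F⁰(m) ⊑ F¹(m) ⊑ … stabilize after finitely many steps (e.g. on a finite lattice), the transfinite sequence (1) reduces to ordinary iteration from m. A minimal Objective CAML sketch (the names `lfp_from` and `step`, and the graph example, are ours, not the paper's):

```ocaml
(* Least fixpoint of a monotone f above m, assuming the increasing
   chain m, f m, f (f m), ... stabilizes after finitely many steps. *)
let rec lfp_from (eq : 'a -> 'a -> bool) (f : 'a -> 'a) (m : 'a) : 'a =
  let m' = f m in
  if eq m m' then m else lfp_from eq f m'

(* Example: forward reachability in a tiny graph, as a fixpoint over
   sets of nodes represented as sorted integer lists. *)
let succ = function 0 -> [1] | 1 -> [2] | _ -> []
let step s = List.sort_uniq compare (s @ List.concat_map succ s)
let reachable = lfp_from ( = ) step [0]
```

Starting from [0], the iterates are [0], [0; 1], [0; 1; 2], which is stable; this iteration scheme reappears in Sec. 11 and Sec. 13 for the abstract interpreter itself.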
3. Values

3.1 Machine integers

Machine integers are bounded:

    max_int,                            greatest machine integer;
    min_int ≜ − max_int − 1,            smallest machine integer;
    z ∈ Z,                              mathematical integers;
    i ∈ I ≜ [min_int, max_int],         bounded machine integers.        (2)
3.2 Errors

We assume that the programming language semantics keeps track of uninitialized variables (e.g. by means of a reserved value) and of arithmetic errors (overflow, division by zero, …, e.g. by means of exceptions). We use the following notations

    Ω_i,                        initialization error;
    Ω_a,                        arithmetic error;
    e ∈ E ≜ {Ω_i, Ω_a},         errors;
    v ∈ 𝕀 ≜ I ∪ E,              machine values.        (3)
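The machine values v ∈ 𝕀 = I ∪ E can be modelled directly in Objective CAML; the errors Ω_i and Ω_a become extra constructors alongside the bounded integers (the constructor names and the example bound are ours):

```ocaml
(* Machine values v ∈ 𝕀 = I ∪ E. *)
type value =
  | Int of int        (* i ∈ I = [min_int, max_int] *)
  | Omega_i           (* initialization error       *)
  | Omega_a           (* arithmetic error           *)

let max_int_m = 32767               (* an example machine bound      *)
let min_int_m = - max_int_m - 1     (* min_int = - max_int - 1, (2)  *)
let in_range z = min_int_m <= z && z <= max_int_m
```

This representation is reused below when sketching the machine arithmetic operations and the operational semantics.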
4. Properties of Values

A value property is understood as the set of values which have this property. The concrete properties of values are therefore elements of the powerset ℘(𝕀). For example [1, max_int] ∈ ℘(𝕀) is the property "is a positive machine integer" while {2n + 1 ∈ I | n ∈ Z} is the property "is an odd machine integer". ⟨℘(𝕀), ⊆, ∅, 𝕀, ∪, ∩, ¬⟩ is a complete boolean lattice. Elements of the powerset ℘(𝕀) are understood as predicates or properties of values, with subset inclusion ⊆ as logical implication: ∅ is false, 𝕀 is true, ∪ is the disjunction, ∩ is the conjunction and ¬ is the negation.
5. Abstract Properties of Values

5.1 Galois connection based abstraction

For program analysis, we can only use a machine encoding L of a subset of all possible value properties. L is the set of abstract properties. Any abstract property p ∈ L is the machine encoding of some value property γ(p) ∈ ℘(𝕀) specified by the concretization function γ ∈ L ↦ ℘(𝕀). For any particular program to be analyzed, this set can be chosen as a finite set (since there always exists a complete abstraction into a finite abstract domain to prove a specific property of a specific system/program, as shown by the completeness proof given in [16]). However, when considering all programs of a programming language, this set L must be infinite (as shown by the incompleteness argument of [16]). This does not mean that L and its meaning γ must be the same for all programs in the language (see Sec. 13.4 for a counter-example). But then L⟦P⟧ and γ⟦P⟧ must be defined for all programs P in the language, not only for a few given ones. This is a fundamental difference with abstract model checking, where a user-defined, problem-specific abstraction is considered for each particular system (program) to analyze.

We assume that ⟨L, ⊑, ⊥, ⊤, ⊔, ⊓⟩ is a complete lattice, so that the partial ordering ⊑, also called the approximation ordering, is understood as abstract logical implication: the infimum ⊥ encodes false, the supremum ⊤ encodes true, the lub ⊔ is the abstract disjunction and the glb ⊓ is the abstract conjunction. The fact that the approximation ordering ⊑ should encode logical implication on abstract properties is formalized by the assumption that the concretization function is monotone, that is, by definition

    p ⊑ q  ⟹  γ(p) ⊆ γ(q).        (4)

In general, an arbitrary concrete value property P ∈ ℘(𝕀) has no abstract equivalent in L. However it can be overapproximated by any p ∈ L such that P ⊆ γ(p). Overapproximation means that the abstract property p (or its meaning γ(p)) is weaker than the overapproximated concrete property P. Observe that ∩{γ(p) | P ⊆ γ(p)} is a better overapproximation of the concrete property P than any other p ∈ L such that P ⊆ γ(p). The situation where, for all concrete properties P ∈ ℘(𝕀), this best approximation ∩{γ(p) | P ⊆ γ(p)} has a corresponding encoding in the abstract domain L corresponds to Galois connections [13]. This encoding of the best approximation is provided by the abstraction function α ∈ ℘(𝕀) ↦ L such that

    P ⊆ Q ⟹ α(P) ⊑ α(Q)          (α preserves implication),                (5)
    ∀P ∈ ℘(𝕀) : P ⊆ γ(α(P))       (α(P) overapproximates P),                (6)
    ∀p ∈ L : α(γ(p)) ⊑ p          (γ introduces no loss of information).    (7)

Observe that if p ∈ L overapproximates P ∈ ℘(𝕀), that is P ⊆ γ(p), then α(P) ⊑ α(γ(p)) ⊑ p by (5) and (7), so that α(P) is more precise than p since, when considering meanings, γ(α(P)) ⊆ γ(p). It follows that α(P) is the best overapproximation of P in L. The conjunction of properties (4) to (7) is equivalent to

    ∀P ∈ ℘(𝕀), p ∈ L : α(P) ⊑ p  ⟺  P ⊆ γ(p).        (8)

The above characteristic property (8) of Galois connections is denoted

    ⟨℘(𝕀), ⊆⟩ ⇄(α, γ) ⟨L, ⊑⟩        (9)

(the abstraction α going from the concrete to the abstract domain, the concretization γ going back).
[Hasse diagram: TOP above INI and ERR; INI above NEG, ZERO and POS; BOT below everything.]
Figure 1: The lattice of initialization and simple signs

Definitions and proofs relative to Galois connections can be found in pages 103–141 of [14], which were distributed to the summer school students as a preliminary introduction to abstract interpretation. Recall that in a Galois connection α preserves existing joins, γ preserves existing meets, and one adjoint uniquely determines the other. We have

    α(P) = ⊓{p | P ⊆ γ(p)},        γ(p) = ∪{P | α(P) ⊑ p}.        (10)
It follows that α(P) is the abstract encoding of the concrete property γ(α(P)) = γ(⊓{p | P ⊆ γ(p)}) = ∩{γ(p) | P ⊆ γ(p)}, which is the best overapproximation of the concrete property P by abstract properties p ∈ L (from above, whence such that P ⊆ γ(p)).

5.2 Componentwise abstraction of sets of pairs

The nonrelational/componentwise abstraction of properties of pairs of values (that is, sets of pairs) consists in forgetting about the possible relationships between members of these pairs, by componentwise application of the Galois connection (9). Formally

    α²(P) ≜ ⟨α({v₁ | ∃v₂ : ⟨v₁, v₂⟩ ∈ P}), α({v₂ | ∃v₁ : ⟨v₁, v₂⟩ ∈ P})⟩,        (11)
    γ²(⟨p₁, p₂⟩) ≜ {⟨v₁, v₂⟩ | v₁ ∈ γ(p₁) ∧ v₂ ∈ γ(p₂)}        (12)

so that

    ⟨℘(𝕀 × 𝕀), ⊆⟩ ⇄(α², γ²) ⟨L × L, ⊑²⟩        (13)

with the componentwise ordering

    ⟨p₁, p₂⟩ ⊑² ⟨q₁, q₂⟩ ≜ p₁ ⊑ q₁ ∧ p₂ ⊑ q₂.

5.3 Initialization and simple sign abstraction

We now consider an application where abstract properties record initialization and sign only. The lattice L is defined by the Hasse diagram of Fig. 1. The meaning of these abstract properties is the following

    γ(BOT) ≜ {Ω_a},                          γ(INI) ≜ I ∪ {Ω_a},
    γ(NEG) ≜ [min_int, −1] ∪ {Ω_a},          γ(ERR) ≜ {Ω_i, Ω_a},
    γ(ZERO) ≜ {0, Ω_a},                      γ(TOP) ≜ 𝕀.
    γ(POS) ≜ [1, max_int] ∪ {Ω_a},           (14)
In order to later illustrate consecutive losses of information, we have chosen not to include the abstract values NEGZ, NZERO and POSZ such that γ(NEGZ) ≜ [min_int, 0] ∪ {Ω_a}, γ(NZERO) ≜ [min_int, −1] ∪ [1, max_int] ∪ {Ω_a} and γ(POSZ) ≜ [0, max_int] ∪ {Ω_a}.

Observe that if we had defined γ(ERR) ≜ {Ω_i} then γ would not be monotone, so that (9) would not hold. Another abstract value would be needed to discriminate the initialization and arithmetic errors (see Fig. 3).

Another possible definition of γ would have been (14), but with γ(BOT) ≜ ∅. Then γ would not preserve meets (since e.g. γ(NEG ⊓ POS) = γ(BOT) = ∅ ≠ {Ω_a} = γ(NEG) ∩ γ(POS)). It would then follow that ⟨α, γ⟩ is not a Galois connection, since best approximations may not exist. For example {Ω_a} would be upper approximable by the minimal ERR, NEG, ZERO or POS, none of which is more precise than the others in all contexts.

Another completely different choice of γ would be

    γ(BOT) ≜ ∅,                     γ(INI) ≜ I,
    γ(NEG) ≜ [min_int, −1],         γ(ERR) ≜ {Ω_i, Ω_a},
    γ(ZERO) ≜ {0},                  γ(TOP) ≜ 𝕀.
    γ(POS) ≜ [1, max_int],

With such a definition of γ, for a program analysis taking arithmetic overflows into account, the usual rule of signs POS + POS = POS would not hold, since the sum of large positive machine integers may yield an arithmetic error Ω_a such that Ω_a ∉ γ(POS). The correct version of the rule of signs would be POS + POS = TOP, which is too imprecise.

Using (10) and the notation (c₁ ? v₁ | c₂ ? v₂ … | cₙ ? vₙ ¿ vₙ₊₁) to denote v₁ when condition c₁ holds, else v₂ when condition c₂ holds, and so on for vₙ, or else vₙ₊₁ when none of the conditions c₁, …, cₙ hold, we get the initialization and simple sign abstraction as follows (P ∈ ℘(𝕀))

    α(P) ≜ ( P ⊆ {Ω_a} ? BOT
           | P ⊆ [min_int, −1] ∪ {Ω_a} ? NEG
           | P ⊆ {0, Ω_a} ? ZERO
           | P ⊆ [1, max_int] ∪ {Ω_a} ? POS
           | P ⊆ I ∪ {Ω_a} ? INI
           | P ⊆ {Ω_i, Ω_a} ? ERR
           ¿ TOP).        (15)
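On a finite set of concrete values, the cascading conditional (15) is directly executable. A sketch in Objective CAML, where sets are lists and the `value` constructors `Int`, `Omega_i`, `Omega_a` are our hypothetical encoding of machine values:

```ocaml
type value = Int of int | Omega_i | Omega_a
type sign = BOT | NEG | ZERO | POS | INI | ERR | TOP

(* alpha on a finite set p of machine values, following (15):
   the first (smallest) abstract property containing p wins. *)
let alpha (p : value list) : sign =
  let subset pred = List.for_all pred p in
  if subset (fun v -> v = Omega_a) then BOT                         (* {Ωa}              *)
  else if subset (function Int i -> i < 0 | v -> v = Omega_a) then NEG
  else if subset (function Int i -> i = 0 | v -> v = Omega_a) then ZERO
  else if subset (function Int i -> i > 0 | v -> v = Omega_a) then POS
  else if subset (function Int _ -> true | v -> v = Omega_a) then INI
  else if subset (fun v -> v = Omega_i || v = Omega_a) then ERR
  else TOP
```

For instance alpha [Int (-3); Omega_a] is NEG, alpha [Omega_i] is ERR, and alpha [Int 1; Omega_i], which mixes an integer with an initialization error, falls through to TOP, as (15) prescribes.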
The adjoined functions α and γ satisfy conditions (4) to (7), which are equivalent to the characteristic property (8) of Galois connections (9).

5.4 Initialization and interval abstraction

The traditional lattice for interval analysis [8, 9] is defined by the Hasse diagram of Fig. 2 (where −∞ and +∞ are either lower and upper bounds of integers or, as considered here, shorthands for min_int and max_int). The corresponding meaning is

    γ(BOT) ≜ ∅,
    γ([a, b]) ≜ {x ∈ I | a ≤ x ≤ b}.

[Hasse diagram: the intervals [a, b] with −∞ ≤ a ≤ b ≤ +∞, ordered by inclusion, from the singletons [n, n] at the bottom up to [−∞, +∞] at the top, with BOT below all of them.]
Figure 2: The lattice I of intervals

In order to take initialization and arithmetic errors into account, we can use the lattice E with Hasse diagram and concretization function given in Fig. 3.

[Hasse diagram: ERR at the top, AER and IER below it, NER at the bottom.]
    γ(NER) ≜ I
    γ(IER) ≜ I ∪ {Ω_i}
    γ(AER) ≜ I ∪ {Ω_a}
    γ(ERR) ≜ I ∪ {Ω_a, Ω_i}
Figure 3: The lattice E of errors

Combining interval and error information, we get

    L ≜ E × I

with the following meaning
    γ(⟨e, i⟩) ≜ γ(e) ∩ γ(i).

5.5 Algebra of abstract properties of values

The abstract algebra, which consists of abstract values (representing properties of concrete values) and abstract operations (corresponding to abstract property transformers), can be encoded in program modules as follows (the programming language is Objective CAML):

module type Abstract_Lattice_Algebra_signature =
sig
  type lat                             (* abstract properties       *)
  val bot        : unit -> lat         (* infimum                   *)
  val isbotempty : unit -> bool        (* bottom is emptyset?       *)
  val initerr    : unit -> lat         (* uninitialization          *)
  val top        : unit -> lat         (* supremum                  *)
  val join       : lat -> lat -> lat   (* least upper bound         *)
  val meet       : lat -> lat -> lat   (* greatest lower bound      *)
  val leq        : lat -> lat -> bool  (* approximation ordering    *)
  val eq         : lat -> lat -> bool  (* equality                  *)
  val in_errors  : lat -> bool         (* included in errors?       *)
  val print      : lat -> unit
  ...
end;;

(isbotempty ()) is γ(⊥) = ∅ while (in_errors v) implies γ(v) ⊆ {Ω_a, Ω_i}.
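As an illustration, the initialization-and-sign lattice of Fig. 1 can be packaged under a pruned version of this signature; the Hasse diagram of Fig. 1 determines `leq` and `join`. This is a sketch of ours, not the analyzer's actual module:

```ocaml
module Sign_Lattice = struct
  type lat = BOT | NEG | ZERO | POS | INI | ERR | TOP
  let bot () = BOT
  let top () = TOP
  let initerr () = ERR
  (* leq follows the Hasse diagram of Fig. 1: BOT below everything;
     NEG, ZERO, POS below INI; INI and ERR below TOP. *)
  let leq x y = match x, y with
    | BOT, _ | _, TOP -> true
    | (NEG | ZERO | POS), INI -> true
    | x, y -> x = y
  (* join is the least upper bound in that diagram. *)
  let join x y =
    if leq x y then y else if leq y x then x
    else match x, y with
      | (NEG | ZERO | POS), (NEG | ZERO | POS) -> INI
      | _ -> TOP                (* e.g. join INI ERR = TOP *)
  (* gamma(v) ⊆ {Ωa, Ωi} exactly for the values below ERR. *)
  let in_errors v = leq v ERR
end
```

Note that in_errors (bot ()) holds, consistently with γ(BOT) = {Ω_a} ⊆ {Ω_a, Ω_i} in (14).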
6. Environments

6.1 Concrete environments

As usual, we use environments ρ to record the value ρ(X) of program variables X ∈ V:

    ρ ∈ R ≜ V ↦ 𝕀,        environments.

Since environments are functions, we can use the functional assignment/substitution notation defined as (f ∈ D ↦ E)

    f[d ← e](x) ≜ f(x),    if x ≠ d;
    f[d ← e](d) ≜ e;        (16)
    f[d₁ ← e₁; d₂ ← e₂; …; dₙ ← eₙ] ≜ (f[d₁ ← e₁])[d₂ ← e₂; …; dₙ ← eₙ].

6.2 Properties of concrete environments

Properties of environments are understood as sets of environments, that is, elements of ℘(R) where ⊆ is logical implication. Such properties of environments are usually stated using predicates in some prescribed syntactic form. Environment properties are therefore their interpretations. For example the predicate "X = Y" is interpreted as {ρ ∈ V ↦ 𝕀 | ρ(X) = ρ(Y)}, and we prefer the second form.

6.3 Nonrelational abstraction of environment properties

In order to approximate environment properties, we ignore relationships between the possible values of variables

    ⟨℘(V ↦ 𝕀), ⊆⟩ ⇄(αr, γr) ⟨V ↦ ℘(𝕀), ⊆̇⟩

by defining αr(R) ≜ λX ∈ V·{ρ(X) | ρ ∈ R}, γr(r) ≜ {ρ | ∀X ∈ V : ρ(X) ∈ r(X)} and the pointwise ordering, which is denoted with the dot notation

    r ⊆̇ r′ ≜ ∀X ∈ V : r(X) ⊆ r′(X).
For example if R = {[X ↦ 1; Y ↦ 1], [X ↦ 2; Y ↦ 2]} then αr(R) is [X ↦ {1, 2}; Y ↦ {1, 2}], so that the equality information (X = Y) is lost. Since all possible relationships between variables are lost in the nonrelational abstraction, such nonrelational analyses often lack precision, but are rather efficient.

Now the Galois connection (9)

    ⟨℘(𝕀), ⊆⟩ ⇄(α, γ) ⟨L, ⊑⟩

can be used to approximate the codomain

    ⟨V ↦ ℘(𝕀), ⊆̇⟩ ⇄(αc, γc) ⟨V ↦ L, ⊑̇⟩

as follows

    r ⊑̇ r′ ≜ ∀X ∈ V : r(X) ⊑ r′(X),
    αc(R) ≜ α ∘ R,
    γc(r) ≜ γ ∘ r,

so that ⟨V ↦ L, ⊑̇, ⊥̇, ⊤̇, ⊔̇, ⊓̇⟩ is a complete lattice for the pointwise ordering ⊑̇.

We can now use the fact that the composition of Galois connections

    ⟨L₁, ⊑₁⟩ ⇄(α₂₁, γ₁₂) ⟨L₂, ⊑₂⟩    and    ⟨L₂, ⊑₂⟩ ⇄(α₃₂, γ₂₃) ⟨L₃, ⊑₃⟩

is a Galois connection

    ⟨L₁, ⊑₁⟩ ⇄(α₃₂ ∘ α₂₁, γ₁₂ ∘ γ₂₃) ⟨L₃, ⊑₃⟩.

The composition of the nonrelational and codomain abstractions is

    ⟨℘(V ↦ 𝕀), ⊆⟩ ⇄(α̇, γ̇) ⟨V ↦ L, ⊑̇⟩        (17)

where

    α̇(R) ≜ αc ∘ αr(R) = λX ∈ V·α({ρ(X) | ρ ∈ R}),        (18)
    γ̇(r) ≜ γr ∘ γc(r) = {ρ | ∀X ∈ V : ρ(X) ∈ γ(r(X))}.        (19)

If L has an infimum ⊥ such that γ(⊥) = ∅, we observe that if r ∈ V ↦ L has r(X) = ⊥ for some X then γ̇(r) = ∅. It follows that the abstract environments with some bottom component all represent the same concrete information (∅). The abstract lattice can then be reduced to eliminate equivalent abstract environments (i.e. which have the same meaning) [13, 14]. We have

    ⟨℘(V ↦ 𝕀), ⊆⟩ ⇄(α̇, γ̇) ⟨V ↦^⊥ L, ⊑̇⟩        (20)

where

    V ↦^⊥ L ≜ {r ∈ V ↦ L | ∀X ∈ V : r(X) ≠ ⊥} ∪ {λX ∈ V·⊥}.
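On finite sets of environments the nonrelational abstraction αr of (17)–(18) is executable. A sketch of ours (environments as association lists over a fixed, hypothetical variable set), reproducing the example above where the equality information X = Y is lost:

```ocaml
(* A concrete environment is an association list variable -> value;
   a nonrelational abstract environment maps each variable to the
   set (sorted list) of its possible values. *)
let vars = ["X"; "Y"]

let alpha_r (envs : (string * int) list list) : (string * int list) list =
  List.map
    (fun x ->
       (x, List.sort_uniq compare
             (List.map (fun rho -> List.assoc x rho) envs)))
    vars

(* R = {[X -> 1; Y -> 1], [X -> 2; Y -> 2]} *)
let r = alpha_r [ [("X", 1); ("Y", 1)]; [("X", 2); ("Y", 2)] ]
```

Here r is [("X", [1; 2]); ("Y", [1; 2])]: each variable independently takes its set of possible values, and the relation X = Y has been forgotten, exactly as in the text.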
Numbers

    d ∈ Digit ::= 0 | 1 | … | 9            digits,
    n ∈ Nat ::= Digit | Nat Digit          numbers in decimal notation.

Variables

    X ∈ V        variables/identifiers.

Arithmetic expressions

    A ∈ Aexp ::= n | X | ?                 numbers, variables, random machine integer,
      | + A | − A                          unary operators,
      | A₁ + A₂ | A₁ − A₂ | A₁ * A₂ | A₁ / A₂ | A₁ mod A₂        binary operators.

Figure 4: Abstract syntax of arithmetic expressions

6.4 Algebra of abstract environments

In the static analyzer, the complete lattice of environments is encoded by a module parameterized by the module encoding the complete lattice L of abstract properties of values. It is therefore a functor with a formal parameter (along with the expected signature for L) which returns the actual structure itself. The static analyzer is generic in that, by changing the actual parameter, one obtains different static analyzers corresponding to different abstractions of properties of values.

module type Abstract_Env_Algebra_signature =
  functor (L: Abstract_Lattice_Algebra_signature) ->
sig
  open Abstract_Syntax
  type env                     (* complete lattice of abstract environments *)
  type element = env
  val bot     : unit -> env                  (* infimum                *)
  val is_bot  : env -> bool                  (* check for infimum      *)
  val initerr : unit -> env                  (* uninitialization       *)
  val top     : unit -> env                  (* supremum               *)
  val join    : env -> (env -> env)          (* least upper bound      *)
  val meet    : env -> (env -> env)          (* greatest lower bound   *)
  val leq     : env -> (env -> bool)         (* approximation ordering *)
  val eq      : env -> (env -> bool)         (* equality               *)
  val print   : env -> unit
  (* substitution *)
  val get : env -> variable -> L.lat         (* r(X)      *)
  val set : env -> variable -> L.lat -> env  (* r[X <- p] *)
  ...
end;;

7. Semantics of Arithmetic Expressions

7.1 Abstract syntax of arithmetic expressions

The abstract syntax of arithmetic expressions is given in Fig. 4.

7.2 Machine arithmetics

The semantics of a decimal number n d (a number n followed by a digit d) takes overflows into account:

    n d ≜ Ω_a,            if 10 n + d > max_int;
    n d ≜ 10 n + d,       if 10 n + d ≤ max_int.
We respectively write u ∈ 𝕀 ↦ 𝕀 for the machine arithmetic operation and u ∈ Z ↦ Z for the mathematical arithmetic operation corresponding to the language unary arithmetic operators u ∈ {+, −} (context distinguishes the two). Errors are propagated, or raised when the result of the mathematical operation is not machine-representable, so that we have (e ∈ E, i ∈ I):

    u e ≜ e;
    u i ≜ u i,        if u i ∈ I;        (21)
    u i ≜ Ω_a,        if u i ∉ I.
We respectively write b ∈ 𝕀 × 𝕀 ↦ 𝕀 for the machine arithmetic operation and b ∈ Z × Z ↦ Z for the mathematical arithmetic operation corresponding to the language binary arithmetic operators b ∈ {+, −, *, /, mod}. Evaluation of operands, whence error propagation, is left to right. The division and modulo operations are defined for a non-negative first argument and a positive second argument. We have (N⁺ is the set of positive naturals, e ∈ E, v ∈ 𝕀, i, i₁, i₂ ∈ I)

    e b v ≜ e;
    i b e ≜ e;
    i₁ b i₂ ≜ i₁ b i₂,    if b ∈ {+, −, *} ∧ i₁ b i₂ ∈ I;        (22)
    i₁ b i₂ ≜ i₁ b i₂,    if b ∈ {/, mod} ∧ i₁ ∈ I ∩ N ∧ i₂ ∈ I ∩ N⁺ ∧ i₁ b i₂ ∈ I;
    i₁ b i₂ ≜ Ω_a,        if i₁ b i₂ ∉ I ∨ (b ∈ {/, mod} ∧ (i₁ ∉ I ∩ N ∨ i₂ ∉ I ∩ N⁺)).
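Case (22) can be transcribed almost literally in Objective CAML. In this sketch of ours, errors propagate left to right and Ω_a is raised when the mathematical result is not representable or when / and mod get arguments outside their domain (the `value` constructors and the example bound are our assumed encoding):

```ocaml
type value = Int of int | Omega_i | Omega_a
let max_int_m = 32767
let min_int_m = - max_int_m - 1
let in_range z = min_int_m <= z && z <= max_int_m

(* Machine binary operation of (22); op is one of "+","-","*","/","mod". *)
let binop (op : string) (v1 : value) (v2 : value) : value =
  match v1, v2 with
  | (Omega_i | Omega_a), _ -> v1      (* e b v = e : left operand first *)
  | _, (Omega_i | Omega_a) -> v2      (* i b e = e                      *)
  | Int i1, Int i2 ->
    let div_mod = (op = "/" || op = "mod") in
    if div_mod && (i1 < 0 || i2 <= 0) then Omega_a
    else
      let z = match op with
        | "+" -> i1 + i2 | "-" -> i1 - i2 | "*" -> i1 * i2
        | "/" -> i1 / i2 | _ -> i1 mod i2 in
      if in_range z then Int z else Omega_a
```

For instance binop "+" (Int 30000) (Int 30000) overflows to Omega_a, and binop "/" (Int 1) (Int 0) yields Omega_a rather than a machine exception.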
7.3 Operational semantics of arithmetic expressions

The big-step operational semantics [31] (renamed natural semantics by [26]) of arithmetic expressions involves judgements ρ ⊢ A ⇒ v, meaning that in environment ρ the arithmetic expression A may evaluate to v ∈ 𝕀. It is defined in Fig. 5.

    ρ ⊢ n ⇒ n,                decimal numbers;        (23)

    ρ ⊢ X ⇒ ρ(X),             variables;              (24)

    i ∈ I
    ─────────────             random;                 (25)
    ρ ⊢ ? ⇒ i

    ρ ⊢ A ⇒ v
    ─────────────────         unary arithmetic operations;⁴        (26)
    ρ ⊢ u A ⇒ u v

    ρ ⊢ A₁ ⇒ v₁,    ρ ⊢ A₂ ⇒ v₂
    ───────────────────────────────    binary arithmetic operations.        (27)
    ρ ⊢ A₁ b A₂ ⇒ v₁ b v₂

⁴ Observe that if m and M are the strings of digits respectively representing the absolute values of min_int and max_int, then m > max_int, so that ρ ⊢ m ⇒ Ω_a whence ρ ⊢ - m ⇒ Ω_a. However ρ ⊢ (- M) - 1 ⇒ min_int.
Figure 5: Operational semantics of arithmetic expressions

7.4 Forward collecting semantics of arithmetic expressions

The forward/bottom-up collecting semantics of an arithmetic expression defines the possible values that the arithmetic expression can evaluate to in a given set of environments

    Faexp ∈ Aexp ↦ (℘(R) ↦^cjm ℘(𝕀)),
    Faexp⟦A⟧R ≜ {v | ∃ρ ∈ R : ρ ⊢ A ⇒ v}.        (28)

The forward collecting semantics Faexp⟦A⟧R specifies the strongest postcondition that the values of the arithmetic expression A do satisfy when this expression is evaluated in an environment satisfying the precondition R. The forward collecting semantics can therefore be understood as a predicate transformer [22]. In particular it is a complete join morphism (denoted with ↦^cjm), that is (S is an arbitrary set)

    Faexp⟦A⟧(∪_{k∈S} R_k) = ∪_{k∈S} Faexp⟦A⟧R_k,

which implies monotony (when S = {1, 2} and R₁ ⊆ R₂) and ∅-strictness (when S = ∅)

    Faexp⟦A⟧∅ = ∅.

7.5 Backward collecting semantics of arithmetic expressions

The backward/top-down collecting semantics Baexp⟦A⟧(R)P of an arithmetic expression A defines the subset of possible environments R such that the arithmetic expression may evaluate, without producing a runtime error, to a value belonging to a given set P

    Baexp ∈ Aexp ↦ (℘(R) ↦^cjm (℘(𝕀) ↦^cjm ℘(R))),
    Baexp⟦A⟧(R)P ≜ {ρ ∈ R | ∃i ∈ P ∩ I : ρ ⊢ A ⇒ i}.        (29)
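For a deterministic fragment, the rules of Fig. 5 translate into a straightforward recursive evaluator. A sketch of ours in Objective CAML (we fix `?` to one arbitrary value instead of modelling nondeterminism, restrict to unary minus and `+`, and reuse our hypothetical `value` encoding):

```ocaml
type value = Int of int | Omega_i | Omega_a

type aexp =
  | Num of int | Var of string | Random
  | Neg of aexp
  | Add of aexp * aexp

let max_int_m = 32767
let min_int_m = - max_int_m - 1
let norm z = if min_int_m <= z && z <= max_int_m then Int z else Omega_a

(* rho |- A => v, following (23)-(27); errors propagate left to right. *)
let rec eval (rho : string -> value) (a : aexp) : value =
  match a with
  | Num n -> norm n                         (* (23) *)
  | Var x -> rho x                          (* (24) *)
  | Random -> Int 0                         (* (25): one possible choice *)
  | Neg a' ->                               (* (26) *)
    (match eval rho a' with
     | Int i -> norm (- i)
     | e -> e)
  | Add (a1, a2) ->                         (* (27) *)
    (match eval rho a1 with
     | Int i1 ->
       (match eval rho a2 with
        | Int i2 -> norm (i1 + i2)
        | e -> e)
     | e -> e)

let rho = function "X" -> Int 5 | _ -> Omega_i
```

Note that eval rho (Neg (Num min_int_m)) is Omega_a, the overflow behavior described in footnote 4, and that reading any uninitialized variable yields Omega_i.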
8. Abstract Interpretation of Arithmetic Expressions

8.1 Lifting Galois connections at higher-order

In order to approximate monotonic predicate transformers, knowing an abstraction (9) of value properties and (20) of environment properties, we use the following functional abstraction [13]

    α^F(Φ) ≜ α ∘ Φ ∘ γ̇,
    γ^F(φ) ≜ γ ∘ φ ∘ α̇        (30)

so that

    ⟨℘(V ↦ 𝕀) ↦^mon ℘(𝕀), ⊆̇⟩ ⇄(α^F, γ^F) ⟨(V ↦ L) ↦^mon L, ⊑̇⟩.        (31)

The intuition is that for any abstract precondition p ∈ V ↦ L, or its concrete equivalent γ̇(p) ∈ ℘(V ↦ 𝕀), the abstract predicate transformer φ should provide an overestimate φ(p) of the postcondition Φ(γ̇(p)) defined by the concrete predicate transformer Φ. This soundness requirement can be formalized as follows:

    ∀p : γ(φ(p)) ⊇ Φ(γ̇(p))        ⟨soundness requirement⟩
⟺  ∀p : Φ(γ̇(p)) ⊆ γ(φ(p))        ⟨def. inverse ⊇ of ⊆⟩
⟺  ∀p : α(Φ(γ̇(p))) ⊑ φ(p)        ⟨def. Galois connection⟩
⟺  α ∘ Φ ∘ γ̇ ⊑̇ φ                ⟨def. ⊑̇⟩
⟺  α^F(Φ) ⊑̇ φ                    ⟨def. α^F⟩.        (32)
Choosing ϕ = α (8) is therefore the best of the possible sound choices since it always provides the strongest abstract postcondition, whence, by monotony, the strongest concrete one. Observe that 8 (as defined by the collecting semantics) and α are (in general) not comF putable so that α (8) was not proposed by [13] as an implementation of the abstract predicate transformer but instead as a formal specification. In practice, this specification must be refined into an algorithm effectively computing the abstract predicate transformer ϕ. This point is sometimes misunderstood [28]. Moreover [13] does not require the abstract predicate transformer ϕ to be chosen as the F best possible choice α (8). Clearly (32) shows that any overestimate is sound (although less precise but hopefully more efficiently computable). This is also sometimes misunderstood [28]. 8.2 Generic forward/top-down abstract interpretation of arithmetic expressions We now design the generic forward/top-down nonrelational abstract semantics of arithmetic expressions Faexp Faexp
F
∈ Aexp 7→ (V 7−→ L) 7−→ L ,
F
∈ Aexp 7→ (V 7−→ L) 7−→ L ,
⊥
mon
when γ (⊥) = 6 ∅;
mon
when γ (⊥) = ∅
by calculus. This consists, for any possible approximation (9) of value properties, in approximating environment properties by the nonrelational abstraction (20) and in applying the functional abstraction (31) to the forward collecting semantics (28). We get an overapproximation such that

    Faexp^F⟦A⟧ ⊒̇ α^F(Faexp⟦A⟧) .    (33)

Starting from the formal specification α^F(Faexp⟦A⟧), we derive an algorithm Faexp^F⟦A⟧ satisfying (33) by calculus:

    α^F(Faexp⟦A⟧)
=   ⟨def. (30) of α^F⟩
    α ∘ Faexp⟦A⟧ ∘ γ̇
=   ⟨def. of composition ∘⟩
    λr• α(Faexp⟦A⟧(γ̇(r)))
=   ⟨def. (28) of Faexp⟦A⟧⟩
    λr• α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A ⇒ v}) .

If r is the infimum λY• ⊥, where the infimum ⊥ of L is such that γ(⊥) = ∅, then γ̇(r) = ∅, whence:

    α^F(Faexp⟦A⟧)(λY• ⊥)
=   ⟨def. (19) of γ̇⟩
    α(∅)
=   ⟨Galois connection (9), so that α(∅) = ⊥⟩
    ⊥ .

When r ≠ λY• ⊥ or γ(⊥) ≠ ∅, we have

    α^F(Faexp⟦A⟧)r
=   (λr• α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A ⇒ v}))r
=   ⟨def. lambda expression⟩
    α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A ⇒ v})

and we proceed by induction on the arithmetic expression A.
When A = n ∈ Nat is a number, we have

    α^F(Faexp⟦n⟧)r
=   α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ n ⇒ v})
=   ⟨def. (23) of ρ ⊢ n ⇒ v⟩
    α({n})
=   ⟨by defining n^F ≜ α({n})⟩
    n^F
=   ⟨by defining Faexp^F⟦n⟧r ≜ n^F⟩
    Faexp^F⟦n⟧r .

When A = X ∈ V is a variable, we have

    α^F(Faexp⟦X⟧)r
=   α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ X ⇒ v})
=   ⟨def. (24) of ρ ⊢ X ⇒ v⟩
    α({ρ(X) | ρ ∈ γ̇(r)})
=   ⟨def. (19) of γ̇⟩
    α(γ(r(X)))
⊑   ⟨Galois connection (9), so that α ∘ γ is reductive⟩
    r(X)
=   ⟨by defining Faexp^F⟦X⟧r ≜ r(X)⟩
    Faexp^F⟦X⟧r .

When A = ? is random, we have

    α^F(Faexp⟦?⟧)r
=   α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ ? ⇒ v})
=   ⟨def. (25) of ρ ⊢ ? ⇒ v⟩
    α(I)
⊑   ⟨by defining ?^F ⊒ α(I)⟩
    ?^F
=   ⟨by defining Faexp^F⟦?⟧r ≜ ?^F⟩
    Faexp^F⟦?⟧r .

When A = u A′ is a unary operation, we have

    α^F(Faexp⟦u A′⟧)r
=   α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ u A′ ⇒ v})
=   ⟨def. (4) of ρ ⊢ u A′ ⇒ v⟩
    α({u v | ∃ρ ∈ γ̇(r) : ρ ⊢ A′ ⇒ v})
⊑   ⟨γ ∘ α is extensive (6), α is monotone (5)⟩
    α({u v | v ∈ γ ∘ α({v′ | ∃ρ ∈ γ̇(r) : ρ ⊢ A′ ⇒ v′})})
⊑   ⟨induction hypothesis (33), γ (4) and α (5) are monotone⟩
    α({u v | v ∈ γ(Faexp^F⟦A′⟧r)})
⊑   ⟨by defining u^F such that u^F(p) ⊒ α({u v | v ∈ γ(p)})⟩
    u^F(Faexp^F⟦A′⟧r)
=   ⟨by defining Faexp^F⟦u A′⟧r ≜ u^F(Faexp^F⟦A′⟧r)⟩
    Faexp^F⟦u A′⟧r .

When A = A1 b A2 is a binary operation, we have

    α^F(Faexp⟦A1 b A2⟧)r
=   α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A1 b A2 ⇒ v})
=   ⟨def. (27) of ρ ⊢ A1 b A2 ⇒ v⟩
    α({v1 b v2 | ∃ρ ∈ γ̇(r) : ρ ⊢ A1 ⇒ v1 ∧ ρ ⊢ A2 ⇒ v2})
⊑   ⟨α monotone (5)⟩
    α({v1 b v2 | ∃ρ1 ∈ γ̇(r) : ρ1 ⊢ A1 ⇒ v1 ∧ ∃ρ2 ∈ γ̇(r) : ρ2 ⊢ A2 ⇒ v2})
⊑   ⟨γ ∘ α is extensive (6), α is monotone (5)⟩
    α({v1 b v2 | v1 ∈ γ ∘ α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A1 ⇒ v}) ∧ v2 ∈ γ ∘ α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A2 ⇒ v})})
⊑   ⟨induction hypothesis (33), γ (4) and α (5) are monotone⟩
    α({v1 b v2 | v1 ∈ γ(Faexp^F⟦A1⟧r) ∧ v2 ∈ γ(Faexp^F⟦A2⟧r)})
⊑   ⟨by defining b^F such that b^F(p1, p2) ⊒ α({v1 b v2 | v1 ∈ γ(p1) ∧ v2 ∈ γ(p2)})⟩
    b^F(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r)
=   ⟨by defining Faexp^F⟦A1 b A2⟧r ≜ b^F(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r)⟩
    Faexp^F⟦A1 b A2⟧r .
    Faexp^F⟦A⟧(λY• ⊥) ≜ ⊥    if γ(⊥) = ∅    (34)
    Faexp^F⟦n⟧r ≜ n^F
    Faexp^F⟦X⟧r ≜ r(X)
    Faexp^F⟦?⟧r ≜ ?^F
    Faexp^F⟦u A′⟧r ≜ u^F(Faexp^F⟦A′⟧r)
    Faexp^F⟦A1 b A2⟧r ≜ b^F(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r)

parameterized by the following forward abstract operations

    n^F ≜ α({n})    (35)
    ?^F ⊒ α(I)
    u^F(p) ⊒ α({u v | v ∈ γ(p)})
    b^F(p1, p2) ⊒ α({v1 b v2 | v1 ∈ γ(p1) ∧ v2 ∈ γ(p2)})    (36)

Figure 6: Forward abstract interpretation of arithmetic expressions
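Before instantiating the generic framework, the equations of Fig. 6 can be transcribed for a deliberately reduced setting. The following OCaml sketch is ours, not the paper's module: it uses a two-operator expression syntax and a five-point sign lattice without the error components, just to show the structural induction.

```ocaml
(* Minimal transcription of the Faexp^F equations of Fig. 6.
   The reduced syntax and the error-free sign lattice are illustrative. *)
type aexp = Int of int | Var of string | Uminus of aexp | Plus of aexp * aexp

type sign = Bot | Neg | Zero | Pos | Top

(* n^F: abstraction of a singleton *)
let f_int n = if n < 0 then Neg else if n = 0 then Zero else Pos

(* u^F for unary minus *)
let f_uminus = function
  | Neg -> Pos | Pos -> Neg | s -> s

(* b^F for addition *)
let f_plus p q =
  match p, q with
  | Bot, _ | _, Bot -> Bot
  | Zero, s | s, Zero -> s
  | Neg, Neg -> Neg
  | Pos, Pos -> Pos
  | _ -> Top

(* abstract environments as association lists, Top by default *)
let get r x = try List.assoc x r with Not_found -> Top

(* Faexp^F of Fig. 6, by structural induction on the expression *)
let rec faexp a r =
  match a with
  | Int n -> f_int n
  | Var x -> get r x
  | Uminus a1 -> f_uminus (faexp a1 r)
  | Plus (a1, a2) -> f_plus (faexp a1 r) (faexp a2 r)

let () =
  assert (faexp (Plus (Int 1, Int 2)) [] = Pos);
  assert (faexp (Uminus (Var "x")) [("x", Pos)] = Neg)
```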
In conclusion, we have designed the forward abstract interpretation Faexp^F of arithmetic expressions in such a way that it satisfies the soundness requirement (33), as summarized in Fig. 6.

By structural induction on the arithmetic expression A, the abstract semantics Faexp^F⟦A⟧ of A is monotonic (respectively continuous) if the abstract operations u^F and b^F are monotonic (resp. continuous), since the composition of monotonic (resp. continuous) functions is monotonic (resp. continuous).

8.3 Generic forward/top-down static analyzer of arithmetic expressions

The operations on abstract value properties which are used for the forward abstract interpretation of arithmetic expressions of Fig. 6 must be provided with the module implementing each particular algebra of abstract properties.

module type Abstract_Lattice_Algebra_signature =
sig
  (* complete lattice of abstract properties of values *)
  type lat (* abstract properties *)
  ...
  (* forward abstract interpretation of arithmetic expressions *)
  val f_INT    : string -> lat
  val f_RANDOM : unit -> lat
  val f_UMINUS : lat -> lat
  val f_UPLUS  : lat -> lat
  val f_PLUS   : lat -> lat -> lat
  val f_MINUS  : lat -> lat -> lat
  val f_TIMES  : lat -> lat -> lat
  val f_DIV    : lat -> lat -> lat
  val f_MOD    : lat -> lat -> lat
  ...
end;;
In functional programming, the translation from Fig. 6 to a program is immediate, as follows:

module type Faexp_signature =
  functor (L: Abstract_Lattice_Algebra_signature) ->
  functor (E: Abstract_Env_Algebra_signature) ->
sig
  open Abstract_Syntax
  (* generic forward abstract interpretation of arithmetic operations *)
  val faexp : aexp -> E(L).env -> L.lat
end;;

module Faexp_implementation =
  functor (L: Abstract_Lattice_Algebra_signature) ->
  functor (E: Abstract_Env_Algebra_signature) ->
struct
  open Abstract_Syntax
  (* generic abstract environments *)
  module E' = E(L)
  (* generic forward abstract interpretation of arithmetic operations *)
  let rec faexp' a r = match a with
    | (INT i)          -> (L.f_INT i)
    | (VAR v)          -> (E'.get r v)
    | RANDOM           -> (L.f_RANDOM ())
    | (UMINUS a1)      -> (L.f_UMINUS (faexp' a1 r))
    | (UPLUS a1)       -> (L.f_UPLUS (faexp' a1 r))
    | (PLUS (a1, a2))  -> (L.f_PLUS (faexp' a1 r) (faexp' a2 r))
    | (MINUS (a1, a2)) -> (L.f_MINUS (faexp' a1 r) (faexp' a2 r))
    | (TIMES (a1, a2)) -> (L.f_TIMES (faexp' a1 r) (faexp' a2 r))
    | (DIV (a1, a2))   -> (L.f_DIV (faexp' a1 r) (faexp' a2 r))
    | (MOD (a1, a2))   -> (L.f_MOD (faexp' a1 r) (faexp' a2 r))
  let faexp a r =
    if (E'.is_bot r) & (L.isbotempty ()) then (L.bot ())
    else faexp' a r
end;;

module Faexp = (Faexp_implementation:Faexp_signature);;
Speed and low memory consumption are definitely required for analyzing very large programs. This may call for a much more efficient implementation, in which the abstract interpreter [7] is replaced by an abstract compiler producing code for each arithmetic expression to be analyzed, possibly using register allocation algorithms and common subexpression elimination (see e.g. Ch. 9.10 of [1]) to minimize the number of intermediate abstract environments to be allocated and deallocated.

8.4 Initialization and simple sign abstract forward arithmetic operations

Considering the initialization and simple sign abstraction of Sec. 5.3, the calculational design of the forward abstract operations proceeds as follows:

    α({n}) = ⟨(15) and case analysis⟩
        NEG    if n ∈ [min_int, −1]
        ZERO   if n = 0
        POS    if n ∈ [1, max_int]
        BOT    if n < min_int or n > max_int
    ≜ n^F .
    α(I) = ⟨(15)⟩ INI ≜ ?^F .

We design −^F(p) ≜ α({−v | v ∈ γ(p)}) by case analysis:

    −^F(BOT) = α({−v | v ∈ γ(BOT)})               ⟨def. (35) of −^F⟩
             = α({−v | v ∈ {a}})                   ⟨def. (14) of γ⟩
             = α({a})                              ⟨def. (21) of −⟩
             = BOT                                 ⟨def. (15) of α⟩

    −^F(POS) = α({−v | v ∈ γ(POS)})               ⟨def. (35) of −^F⟩
             = α({−v | v ∈ [1, max_int] ∪ {a}})   ⟨def. (14) of γ⟩
             = α([−max_int, −1] ∪ {a})            ⟨def. (21) of − and (2)⟩
             = NEG                                 ⟨def. (15) of α⟩

    −^F(ERR) = α({−v | v ∈ γ(ERR)})               ⟨def. (35) of −^F⟩
             = α({−v | v ∈ {i, a}})                ⟨def. (14) of γ⟩
             = α({i, a})                           ⟨def. (21) of −⟩
             = ERR                                 ⟨def. (15) of α⟩

The calculational design for the other cases of −^F, and that of +^F, is similar, and we get:

    p       | BOT  NEG  ZERO  POS  INI  ERR  TOP
    +^F(p)  | BOT  NEG  ZERO  POS  INI  ERR  TOP
    −^F(p)  | BOT  POS  ZERO  NEG  INI  ERR  TOP
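The unary tables above transcribe directly into OCaml; the following sketch (our own code, with the paper's BOT..TOP names reused as constructors) encodes them as pattern matches.

```ocaml
(* The forward unary tables for the 7-point initialization-and-sign lattice. *)
type lat = BOT | NEG | ZERO | POS | INI | ERR | TOP

let f_uplus (p : lat) : lat = p     (* +^F is the identity *)

let f_uminus = function
  | NEG -> POS
  | POS -> NEG
  | p -> p                          (* BOT, ZERO, INI, ERR, TOP are fixpoints *)

let () =
  assert (f_uminus POS = NEG);
  assert (f_uminus ERR = ERR);
  assert (f_uplus INI = INI)
```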
The calculational design of the abstract binary operators is also similar and will not be fully detailed. For division, we get:

    /^F(p, q)   q: BOT  NEG  ZERO  POS   INI   ERR  TOP
    p = BOT      | BOT  BOT  BOT   BOT   BOT   BOT  BOT
    p = NEG      | BOT  BOT  BOT   BOT   BOT   BOT  BOT
    p = ZERO     | BOT  BOT  BOT   ZERO  ZERO  ERR  TOP
    p = POS      | BOT  BOT  BOT   INI   INI   ERR  TOP
    p = INI      | BOT  BOT  BOT   INI   INI   ERR  TOP
    p = ERR      | ERR  ERR  ERR   ERR   ERR   ERR  ERR
    p = TOP      | ERR  ERR  ERR   TOP   TOP   ERR  TOP
Let us consider a few typical cases. First, division by a negative number always leads to an arithmetic error:

    /^F(POS, NEG)
=   α({v1 / v2 | v1 ∈ γ(POS) ∧ v2 ∈ γ(NEG)})                          ⟨def. (36) of /^F⟩
=   α({v1 / v2 | v1 ∈ [1, max_int] ∪ {a} ∧ v2 ∈ [min_int, −1] ∪ {a}})  ⟨def. (14) of γ⟩
=   α({a})                                                              ⟨def. (22) of /⟩
=   BOT                                                                 ⟨def. (15) of α⟩

No abstract property exactly represents the non-negative numbers, which yields imprecise results:

    /^F(POS, POS)
=   α({v1 / v2 | v1 ∈ γ(POS) ∧ v2 ∈ γ(POS)})                           ⟨def. (36) of /^F⟩
=   α({v1 / v2 | v1 ∈ [1, max_int] ∪ {a} ∧ v2 ∈ [1, max_int] ∪ {a}})   ⟨def. (14) of γ⟩
=   α([0, max_int] ∪ {a})                                               ⟨def. (22) of /⟩
=   INI                                                                 ⟨def. (15) of α⟩

Because of left-to-right evaluation, left errors are propagated first:

    /^F(BOT, ERR)
=   α({v1 / v2 | v1 ∈ γ(BOT) ∧ v2 ∈ γ(ERR)})       ⟨def. (36) of /^F⟩
=   α({v1 / v2 | v1 ∈ {a} ∧ v2 ∈ {i, a}})           ⟨def. (14) of γ⟩
=   α({a})                                           ⟨def. (22) of /⟩
=   BOT                                              ⟨def. (15) of α⟩

    /^F(ERR, BOT)
=   α({v1 / v2 | v1 ∈ γ(ERR) ∧ v2 ∈ γ(BOT)})       ⟨def. (36) of /^F⟩
=   α({v1 / v2 | v1 ∈ {i, a} ∧ v2 ∈ {a}})           ⟨def. (14) of γ⟩
=   α({i, a})                                        ⟨def. (22) of /⟩
=   ERR                                              ⟨def. (15) of α⟩

    /^F(TOP, BOT)
=   α({v1 / v2 | v1 ∈ γ(TOP) ∧ v2 ∈ γ(BOT)})                        ⟨def. (36) of /^F⟩
=   α({v1 / v2 | v1 ∈ [min_int, max_int] ∪ {i, a} ∧ v2 ∈ {a}})       ⟨def. (14) of γ⟩
=   α({i, a})                                                         ⟨def. (22) of /⟩
=   ERR                                                               ⟨def. (15) of α⟩
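The division table and the worked cases above can be transcribed into OCaml for a quick sanity check. This is our own sketch (the type `lat` and the function `f_div` are illustrative, not the paper's module):

```ocaml
(* /^F per the table above: a BOT or NEG dividend, or a divisor in
   {BOT, NEG, ZERO}, yields the arithmetic error (abstracted by BOT);
   the ERR and TOP rows follow left-to-right error propagation. *)
type lat = BOT | NEG | ZERO | POS | INI | ERR | TOP

let f_div p q =
  match p, q with
  | BOT, _ | NEG, _ -> BOT
  | ERR, _ -> ERR
  | TOP, (BOT | NEG | ZERO | ERR) -> ERR
  | TOP, _ -> TOP
  | _, (BOT | NEG | ZERO) -> BOT
  | _, ERR -> ERR
  | _, TOP -> TOP
  | ZERO, _ -> ZERO                 (* 0 divided by a positive *)
  | _ -> INI                        (* POS or INI over POS or INI *)

let () =
  (* the typical cases derived in the text *)
  assert (f_div POS NEG = BOT);
  assert (f_div POS POS = INI);
  assert (f_div BOT ERR = BOT);
  assert (f_div ERR BOT = ERR);
  assert (f_div TOP BOT = ERR)
```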
The other forward abstract binary arithmetic operators for the initialization and simple sign analysis are as follows:

    +^F(p, q)   q: BOT  NEG  ZERO  POS  INI  ERR  TOP
    p = BOT      | BOT  BOT  BOT   BOT  BOT  BOT  BOT
    p = NEG      | BOT  NEG  NEG   INI  INI  ERR  TOP
    p = ZERO     | BOT  NEG  ZERO  POS  INI  ERR  TOP
    p = POS      | BOT  INI  POS   POS  INI  ERR  TOP
    p = INI      | BOT  INI  INI   INI  INI  ERR  TOP
    p = ERR      | ERR  ERR  ERR   ERR  ERR  ERR  ERR
    p = TOP      | ERR  TOP  TOP   TOP  TOP  ERR  TOP

    −^F(p, q)   q: BOT  NEG  ZERO  POS  INI  ERR  TOP
    p = BOT      | BOT  BOT  BOT   BOT  BOT  BOT  BOT
    p = NEG      | BOT  INI  NEG   NEG  INI  ERR  TOP
    p = ZERO     | BOT  POS  ZERO  NEG  INI  ERR  TOP
    p = POS      | BOT  POS  POS   INI  INI  ERR  TOP
    p = INI      | BOT  INI  INI   INI  INI  ERR  TOP
    p = ERR      | ERR  ERR  ERR   ERR  ERR  ERR  ERR
    p = TOP      | ERR  TOP  TOP   TOP  TOP  ERR  TOP

    ∗^F(p, q)   q: BOT  NEG  ZERO  POS  INI  ERR  TOP
    p = BOT      | BOT  BOT  BOT   BOT  BOT  BOT  BOT
    p = NEG      | BOT  POS  ZERO  NEG  INI  ERR  TOP
    p = ZERO     | BOT  ZERO ZERO  ZERO ZERO ERR  TOP
    p = POS      | BOT  NEG  ZERO  POS  INI  ERR  TOP
    p = INI      | BOT  INI  ZERO  INI  INI  ERR  TOP
    p = ERR      | ERR  ERR  ERR   ERR  ERR  ERR  ERR
    p = TOP      | ERR  TOP  TOP   TOP  TOP  ERR  TOP

    mod^F(p, q)  q: BOT  NEG  ZERO  POS   INI   ERR  TOP
    p = BOT       | BOT  BOT  BOT   BOT   BOT   BOT  BOT
    p = NEG       | BOT  BOT  BOT   BOT   BOT   BOT  BOT
    p = ZERO      | BOT  BOT  BOT   ZERO  ZERO  ERR  TOP
    p = POS       | BOT  BOT  BOT   INI   INI   ERR  TOP
    p = INI       | BOT  BOT  BOT   INI   INI   ERR  TOP
    p = ERR       | ERR  ERR  ERR   ERR   ERR   ERR  ERR
    p = TOP       | ERR  ERR  ERR   TOP   TOP   ERR  TOP
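The binary tables above can likewise be encoded as pattern matches; the following OCaml sketch (our code, reusing the paper's BOT..TOP names) covers the +^F table and asserts a few representative cells.

```ocaml
(* +^F per the table above: BOT and ERR rows are absorbing, the TOP row
   degrades to ERR on BOT/ERR columns (left-to-right error propagation),
   and opposite or initialized signs join to INI. *)
type lat = BOT | NEG | ZERO | POS | INI | ERR | TOP

let f_plus p q =
  match p, q with
  | BOT, _ -> BOT
  | ERR, _ -> ERR
  | TOP, (BOT | ERR) -> ERR
  | TOP, _ -> TOP
  | _, BOT -> BOT
  | _, ERR -> ERR
  | _, TOP -> TOP
  | NEG, (NEG | ZERO) | ZERO, NEG -> NEG
  | POS, (ZERO | POS) | ZERO, POS -> POS
  | ZERO, ZERO -> ZERO
  | _ -> INI      (* NEG+POS, POS+NEG, and any combination with INI *)

let () =
  assert (f_plus NEG ZERO = NEG);
  assert (f_plus POS NEG = INI);
  assert (f_plus TOP BOT = ERR);
  assert (f_plus ZERO ERR = ERR)
```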
8.5 Generic backward/bottom-up abstract interpretation of arithmetic expressions

We now design the backward/bottom-up abstract semantics of arithmetic expressions

    Baexp^G ∈ Aexp → (V → L) -mon→ L -mon→ (V → L) .

For any possible approximation (9) of value properties, we approximate environment properties by the nonrelational abstraction (20) and apply the following functional abstraction

    ⟨℘(R) -mon→ ℘(I) -mon→ ℘(R), ⊆̈⟩ ⇄(α^G, γ^G) ⟨(V → L) -mon→ L -mon→ (V → L), ⊑̈⟩

where

    Φ ⊆̈ Ψ ≜ ∀R ∈ ℘(R) : ∀P ∈ ℘(I) : Φ(R)P ⊆ Ψ(R)P,
    φ ⊑̈ ψ ≜ ∀r ∈ V → L : ∀p ∈ L : φ(r)p ⊑̇ ψ(r)p,
    α^G(Φ) ≜ λr ∈ V → L• λp ∈ L• α̇(Φ(γ̇(r))γ(p)),    (37)
    γ^G(φ) ≜ λR ∈ ℘(R)• λP ∈ ℘(I)• γ̇(φ(α̇(R))α(P)) .

The objective is to get an overapproximation of the backward collecting semantics (29) such that

    Baexp^G⟦A⟧ ⊒̈ α^G(Baexp⟦A⟧) .    (38)

We derive Baexp^G⟦A⟧ by calculus, as follows:

    α^G(Baexp⟦A⟧)
=   ⟨def. (37) of α^G⟩
    λr ∈ V → L• λp ∈ L• α̇(Baexp⟦A⟧(γ̇(r))γ(p))
=   ⟨def. (29) of Baexp⟦A⟧⟩
    λr ∈ V → L• λp ∈ L• α̇({ρ ∈ γ̇(r) | ∃i ∈ γ(p) ∩ I : ρ ⊢ A ⇒ i}) .

If r is the infimum λY• ⊥, where the infimum ⊥ of L is such that γ(⊥) = ∅, then γ̇(r) = ∅, whence

    α^G(Baexp⟦A⟧)(λY• ⊥)p
=   ⟨def. (19) of γ̇⟩
    α̇(∅)
=   ⟨def. (18) of α̇⟩
    λY• ⊥ .

Given any r ∈ V → L with r ≠ λY• ⊥ or γ(⊥) ≠ ∅, and p ∈ L, we proceed by structural induction on the arithmetic expression A.
When A = n ∈ Nat is a number, we have

    α^G(Baexp⟦n⟧)(r)p
=   α̇({ρ ∈ γ̇(r) | ∃i ∈ γ(p) ∩ I : ρ ⊢ n ⇒ i})
=   ⟨def. (23) of ρ ⊢ n ⇒ i⟩
    α̇({ρ ∈ γ̇(r) | n ∈ γ(p) ∩ I})
=   ⟨def. conditional (… ? … ¿ …)⟩
    (n ∈ γ(p) ∩ I ? α̇(γ̇(r)) ¿ α̇(∅))
⊑̇  ⟨α̇ ∘ γ̇ is reductive (7) and def. (18) of α̇⟩
    (n ∈ γ(p) ∩ I ? r ¿ λY• ⊥)
=   ⟨by defining n^G(p) ≜ (n ∈ γ(p) ∩ I)⟩
    (n^G(p) ? r ¿ λY• ⊥)
=   ⟨by defining Baexp^G⟦n⟧(r)p ≜ (n^G(p) ? r ¿ λY• ⊥)⟩
    Baexp^G⟦n⟧(r)p .

When A = X ∈ V is a variable, we have

    α^G(Baexp⟦X⟧)(r)p
=   α̇({ρ ∈ γ̇(r) | ∃i ∈ γ(p) ∩ I : ρ ⊢ X ⇒ i})
=   ⟨def. (24) of ρ ⊢ X ⇒ i⟩
    α̇({ρ ∈ γ̇(r) | ρ(X) ∈ γ(p) ∩ I})
⊑̇  ⟨γ ∘ α is extensive (6) and α̇ is monotone (5)⟩
    α̇({ρ ∈ γ̇(r) | ρ(X) ∈ γ(p) ∩ γ ∘ α(I)})
=   ⟨def. (19) of γ̇⟩
    α̇({ρ | ∀Y ≠ X : ρ(Y) ∈ γ(r(Y)) ∧ ρ(X) ∈ γ(r(X)) ∩ γ(p) ∩ γ ∘ α(I)})
=   ⟨γ is a complete meet morphism⟩
    α̇({ρ | ∀Y ≠ X : ρ(Y) ∈ γ(r(Y)) ∧ ρ(X) ∈ γ(r(X) ⊓ p ⊓ α(I))})
=   ⟨def. (16) of environment assignment⟩
    α̇({ρ | ∀Y ≠ X : ρ(Y) ∈ γ(r[X ← r(X) ⊓ p ⊓ α(I)](Y)) ∧ ρ(X) ∈ γ(r[X ← r(X) ⊓ p ⊓ α(I)](X))})
=   ⟨def. (19) of γ̇⟩
    α̇({ρ | ρ ∈ γ̇(r[X ← r(X) ⊓ p ⊓ α(I)])})
=   ⟨set notation⟩
    α̇(γ̇(r[X ← r(X) ⊓ p ⊓ α(I)]))
⊑̇  ⟨α̇ ∘ γ̇ is reductive (7)⟩
    r[X ← r(X) ⊓ p ⊓ α(I)]
⊑̇  ⟨def. (36) of ?^F⟩
    r[X ← r(X) ⊓ p ⊓ ?^F]
=   ⟨by defining Baexp^G⟦X⟧(r)p ≜ r[X ← r(X) ⊓ p ⊓ ?^F]⟩
    Baexp^G⟦X⟧(r)p .

When A = ? is random, we have

    α^G(Baexp⟦?⟧)(r)p
=   α̇({ρ ∈ γ̇(r) | ∃i ∈ γ(p) ∩ I : ρ ⊢ ? ⇒ i})
=   ⟨def. (25) of ρ ⊢ ? ⇒ i⟩
    α̇({ρ ∈ γ̇(r) | γ(p) ∩ I ≠ ∅})
=   ⟨def. conditional (… ? … ¿ …)⟩
    (γ(p) ∩ I = ∅ ? α̇(∅) ¿ α̇(γ̇(r)))
⊑̇  ⟨def. (18) of α̇ and α̇ ∘ γ̇ reductive (7)⟩
    (γ(p) ∩ I = ∅ ? λY• ⊥ ¿ r)
=   ⟨negation⟩
    (γ(p) ∩ I ≠ ∅ ? r ¿ λY• ⊥)
=   ⟨by defining ?^G(p) ≜ (γ(p) ∩ I ≠ ∅)⟩
    (?^G(p) ? r ¿ λY• ⊥)
=   ⟨by defining Baexp^G⟦?⟧(r)p ≜ (?^G(p) ? r ¿ λY• ⊥)⟩
    Baexp^G⟦?⟧(r)p .

When A = u A′ is a unary operation, we have

    α^G(Baexp⟦u A′⟧)(r)p
=   α̇({ρ ∈ γ̇(r) | ∃i ∈ γ(p) ∩ I : ρ ⊢ u A′ ⇒ i})
=   ⟨def. (4) of ρ ⊢ u A′ ⇒ i⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ : ρ ⊢ A′ ⇒ i′ ∧ u i′ ∈ γ(p) ∩ I})
=   ⟨set theory⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ ∈ {v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ A′ ⇒ v} : ρ ⊢ A′ ⇒ i′ ∧ u i′ ∈ γ(p) ∩ I})
⊑̇  ⟨γ ∘ α extensive (6) and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ ∈ γ(α({v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ A′ ⇒ v})) : ρ ⊢ A′ ⇒ i′ ∧ u i′ ∈ γ(p) ∩ I})
⊑̇  ⟨(33) implying Faexp^F⟦A′⟧r ⊒ α({v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ A′ ⇒ v}), γ and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ ∈ γ(Faexp^F⟦A′⟧r) : ρ ⊢ A′ ⇒ i′ ∧ u i′ ∈ γ(p) ∩ I})
=   ⟨def. (21) of u (such that u i′ ∈ I only if i′ ∈ I)⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ ∈ γ(Faexp^F⟦A′⟧r) ∩ I : ρ ⊢ A′ ⇒ i′ ∧ u i′ ∈ γ(p) ∩ I})
=   ⟨set theory⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ ∈ {i ∈ γ(Faexp^F⟦A′⟧r) | u i ∈ γ(p) ∩ I} ∩ I : ρ ⊢ A′ ⇒ i′})
⊑̇  ⟨γ ∘ α extensive (6) and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ ∈ γ(α({i ∈ γ(Faexp^F⟦A′⟧r) | u i ∈ γ(p) ∩ I})) ∩ I : ρ ⊢ A′ ⇒ i′})
⊑̇  ⟨by defining u^G such that u^G(q, p) ⊒ α({i ∈ γ(q) | u i ∈ γ(p) ∩ I}), γ and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃i′ ∈ γ(u^G(Faexp^F⟦A′⟧r, p)) ∩ I : ρ ⊢ A′ ⇒ i′})
⊑̇  ⟨induction hypothesis (38) implying Baexp^G⟦A′⟧(r)p ⊒̇ α̇({ρ ∈ γ̇(r) | ∃i′ ∈ γ(p) ∩ I : ρ ⊢ A′ ⇒ i′})⟩
    Baexp^G⟦A′⟧(r)(u^G(Faexp^F⟦A′⟧r, p))
=   ⟨by defining Baexp^G⟦u A′⟧(r)p ≜ Baexp^G⟦A′⟧(r)(u^G(Faexp^F⟦A′⟧r, p))⟩
    Baexp^G⟦u A′⟧(r)p .

When A = A1 b A2 is a binary operation, we have

    α^G(Baexp⟦A1 b A2⟧)(r)p
=   α̇({ρ ∈ γ̇(r) | ∃i ∈ γ(p) ∩ I : ρ ⊢ A1 b A2 ⇒ i})
=   ⟨def. (27) of ρ ⊢ A1 b A2 ⇒ i⟩
    α̇({ρ ∈ γ̇(r) | ∃i1, i2 : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2 ∧ i1 b i2 ∈ γ(p) ∩ I})
=   ⟨set theory⟩
    α̇({ρ ∈ γ̇(r) | ∃i1 ∈ {v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ A1 ⇒ v} : ∃i2 ∈ {v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ A2 ⇒ v} :
                     ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2 ∧ i1 b i2 ∈ γ(p) ∩ I})
⊑̇  ⟨γ ∘ α extensive (6) and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃i1 ∈ γ(α({v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ A1 ⇒ v})) : ∃i2 ∈ γ(α({v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ A2 ⇒ v})) :
                     ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2 ∧ i1 b i2 ∈ γ(p) ∩ I})
⊑̇  ⟨(33) implying Faexp^F⟦Ai⟧r ⊒ α({v | ∃ρ′ ∈ γ̇(r) : ρ′ ⊢ Ai ⇒ v}), i = 1, 2, γ and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃i1 ∈ γ(Faexp^F⟦A1⟧r) : ∃i2 ∈ γ(Faexp^F⟦A2⟧r) :
                     ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2 ∧ i1 b i2 ∈ γ(p) ∩ I})
=   ⟨def. (22) of b (such that i1 b i2 ∈ I only if i1, i2 ∈ I)⟩
    α̇({ρ ∈ γ̇(r) | ∃i1 ∈ γ(Faexp^F⟦A1⟧r) ∩ I : ∃i2 ∈ γ(Faexp^F⟦A2⟧r) ∩ I :
                     ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2 ∧ i1 b i2 ∈ γ(p) ∩ I})
=   ⟨set theory⟩
    α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ {⟨i1′, i2′⟩ ∈ γ(Faexp^F⟦A1⟧r) × γ(Faexp^F⟦A2⟧r) | i1′ b i2′ ∈ γ(p) ∩ I} ∩ (I × I) :
                     ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
⊑̇  ⟨γ² ∘ α² extensive (13), (6) and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ γ²(α²({⟨i1′, i2′⟩ ∈ γ(Faexp^F⟦A1⟧r) × γ(Faexp^F⟦A2⟧r) | i1′ b i2′ ∈ γ(p) ∩ I})) ∩ (I × I) :
                     ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
⊑̇  ⟨by defining b^G such that b^G(q1, q2, p) ⊒² α²({⟨i1′, i2′⟩ ∈ γ²(⟨q1, q2⟩) | i1′ b i2′ ∈ γ(p) ∩ I}), γ² and α̇ monotone (20), (5)⟩
    α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ γ²(b^G(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r, p)) ∩ (I × I) : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
=   ⟨let notation⟩
    let ⟨p1, p2⟩ = b^G(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r, p) in
    α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ γ²(⟨p1, p2⟩) ∩ (I × I) : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
=   ⟨def. (12) of γ² and α̇ monotone (20), (5)⟩
    let ⟨p1, p2⟩ = b^G(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r, p) in
    α̇({ρ1 ∈ γ̇(r) | ∃i1 ∈ γ(p1) ∩ I : ρ1 ⊢ A1 ⇒ i1} ∩ {ρ2 ∈ γ̇(r) | ∃i2 ∈ γ(p2) ∩ I : ρ2 ⊢ A2 ⇒ i2})
⊑̇  ⟨α̇ complete join morphism⟩
    let ⟨p1, p2⟩ = b^G(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r, p) in
    α̇({ρ1 ∈ γ̇(r) | ∃i1 ∈ γ(p1) ∩ I : ρ1 ⊢ A1 ⇒ i1}) ⊓̇ α̇({ρ2 ∈ γ̇(r) | ∃i2 ∈ γ(p2) ∩ I : ρ2 ⊢ A2 ⇒ i2})
⊑̇  ⟨induction hypothesis (38) implying Baexp^G⟦A′⟧(r)p ⊒̇ α̇({ρ ∈ γ̇(r) | ∃i′ ∈ γ(p) ∩ I : ρ ⊢ A′ ⇒ i′})⟩
    let ⟨p1, p2⟩ = b^G(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r, p) in
    Baexp^G⟦A1⟧(r)p1 ⊓̇ Baexp^G⟦A2⟧(r)p2
=   ⟨by defining Baexp^G⟦A1 b A2⟧(r)p ≜ let ⟨p1, p2⟩ = b^G(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r, p) in
                                          Baexp^G⟦A1⟧(r)p1 ⊓̇ Baexp^G⟦A2⟧(r)p2⟩
    Baexp^G⟦A1 b A2⟧(r)p .
In conclusion, we have designed the backward abstract interpretation Baexp of arithmetic expressions in such a way that it satisfies the soundness requirement (38) as summarized in Fig. 7. G For all p ∈ L and by induction on A, the operator λr • Baexp JAK(r ) p on V 7→ L is ˙ v-reductive and monotonic. 8.6 Generic backward/bottom-up static analyzer of arithmetic expressions A rapid prototyping of Fig. 7 with signature module type Baexp_signature = functor (L: Abstract_Lattice_Algebra_signature) -> functor (E: Abstract_Env_Algebra_signature) -> functor (Faexp: Faexp_signature) -> sig open Abstract_Syntax (* generic backward abstract interpretation of arithmetic operations *) val baexp : aexp -> E (L).env -> L.lat -> E (L).env end;;
is given by the following implementation module Baexp_implementation = functor (L: Abstract_Lattice_Algebra_signature) -> functor (E: Abstract_Env_Algebra_signature) ->
25
    Baexp^G⟦A⟧(λY• ⊥)p ≜ λY• ⊥    if γ(⊥) = ∅    (39)
    Baexp^G⟦n⟧(r)p ≜ (n^G(p) ? r ¿ λY• ⊥)
    Baexp^G⟦X⟧(r)p ≜ r[X ← r(X) ⊓ p ⊓ ?^F]    (40)
    Baexp^G⟦?⟧(r)p ≜ (?^G(p) ? r ¿ λY• ⊥)
    Baexp^G⟦u A′⟧(r)p ≜ Baexp^G⟦A′⟧(r)(u^G(Faexp^F⟦A′⟧r, p))
    Baexp^G⟦A1 b A2⟧(r)p ≜ let ⟨p1, p2⟩ = b^G(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r, p) in
                             Baexp^G⟦A1⟧(r)p1 ⊓̇ Baexp^G⟦A2⟧(r)p2

parameterized by the following backward abstract operations on L

    n^G(p) ≜ (n ∈ γ(p) ∩ I)    (41)
    ?^G(p) ≜ (γ(p) ∩ I ≠ ∅)    (42)
    u^G(q, p) ⊒ α({i ∈ γ(q) | u i ∈ γ(p) ∩ I})    (43)
    b^G(q1, q2, p) ⊒² α²({⟨i1, i2⟩ ∈ γ²(⟨q1, q2⟩) | i1 b i2 ∈ γ(p) ∩ I})    (44)

Figure 7: Backward abstract interpretation of arithmetic expressions

  functor (Faexp: Faexp_signature) ->
struct
  open Abstract_Syntax
  (* generic abstract environments *)
  module E' = E (L)
  (* generic forward abstract interpretation of arithmetic operations *)
  module Faexp' = Faexp(L)(E)
  (* generic backward abstract interpretation of arithmetic operations *)
  let rec baexp' a r p = match a with
    | (INT i) -> if (L.b_INT i p) then r else (E'.bot ())
    | (VAR v) -> (E'.set r v (L.meet (L.meet (E'.get r v) p) (L.f_RANDOM ())))
    | RANDOM -> if (L.b_RANDOM p) then r else (E'.bot ())
    | (UMINUS a1) -> (baexp' a1 r (L.b_UMINUS (Faexp'.faexp a1 r) p))
    | (UPLUS a1) -> (baexp' a1 r (L.b_UPLUS (Faexp'.faexp a1 r) p))
    | (PLUS (a1, a2)) ->
        let (p1, p2) = (L.b_PLUS (Faexp'.faexp a1 r) (Faexp'.faexp a2 r) p) in
        (E'.meet (baexp' a1 r p1) (baexp' a2 r p2))
    | (MINUS (a1, a2)) ->
        let (p1, p2) = (L.b_MINUS (Faexp'.faexp a1 r) (Faexp'.faexp a2 r) p) in
        (E'.meet (baexp' a1 r p1) (baexp' a2 r p2))
    | (TIMES (a1, a2)) ->
        let (p1, p2) = (L.b_TIMES (Faexp'.faexp a1 r) (Faexp'.faexp a2 r) p) in
        (E'.meet (baexp' a1 r p1) (baexp' a2 r p2))
    | (DIV (a1, a2)) ->
        let (p1, p2) = (L.b_DIV (Faexp'.faexp a1 r) (Faexp'.faexp a2 r) p) in
        (E'.meet (baexp' a1 r p1) (baexp' a2 r p2))
    | (MOD (a1, a2)) ->
        let (p1, p2) = (L.b_MOD (Faexp'.faexp a1 r) (Faexp'.faexp a2 r) p) in
        (E'.meet (baexp' a1 r p1) (baexp' a2 r p2))
  let baexp a r p =
    if (E'.is_bot r) & (L.isbotempty ()) then (E'.bot ())
    else baexp' a r p
end;;

module Baexp = (Baexp_implementation:Baexp_signature);;

The operations on abstract value properties which are used for the backward abstract interpretation of arithmetic expressions of Fig. 7 must be provided with the module implementing each particular algebra of abstract properties, as follows:

module type Abstract_Lattice_Algebra_signature =
sig
  (* complete lattice of abstract properties of values *)
  type lat (* abstract properties *)
  ...
  (* forward abstract interpretation of arithmetic expressions *)
  ...
  (* backward abstract interpretation of arithmetic expressions *)
  val b_INT    : string -> lat -> bool
  val b_RANDOM : lat -> bool
  val b_UMINUS : lat -> lat -> lat
  val b_UPLUS  : lat -> lat -> lat
  val b_PLUS   : lat -> lat -> lat -> lat * lat
  val b_MINUS  : lat -> lat -> lat -> lat * lat
  val b_TIMES  : lat -> lat -> lat -> lat * lat
  val b_DIV    : lat -> lat -> lat -> lat * lat
  val b_MOD    : lat -> lat -> lat -> lat * lat
  ...
end;;
The next section is an example of the calculational design of such abstract operations for the initialization and simple sign analysis.

8.7 Initialization and simple sign abstract backward arithmetic operations

In the abstract interpretation (40) of variables, we have ?^F = INI by definition (15) of α. From the definition (41) of n^G and (14) of γ, we directly get by case analysis:

    n^G(p)                     p: BOT  NEG  ZERO  POS  INI  ERR  TOP
    n ∈ [min_int, −1]           |  ff   tt   ff    ff   tt   ff   tt
    n = 0                       |  ff   ff   tt    ff   tt   ff   tt
    n ∈ [1, max_int]            |  ff   ff   ff    tt   tt   ff   tt
    n < min_int ∨ n > max_int   |  ff   ff   ff    ff   ff   ff   ff

From the definition (42) of ?^G and (14) of γ, we directly get by case analysis:

    p       | BOT  NEG  ZERO  POS  INI  ERR  TOP
    ?^G(p)  | ff   tt   tt    tt   tt   ff   tt

For the backward unary arithmetic operations (43), we have:

    p          | BOT  NEG      ZERO      POS      INI      ERR  TOP
    +^G(q, p)  | BOT  q ⊓ NEG  q ⊓ ZERO  q ⊓ POS  q ⊓ INI  BOT  q ⊓ INI
    −^G(q, p)  | BOT  q ⊓ POS  q ⊓ ZERO  q ⊓ NEG  q ⊓ INI  BOT  q ⊓ INI

Let us consider a few typical cases.

1. If p = BOT or p = ERR, then by (14), u i ∈ γ(p) ∩ I ⊆ {i, a} ∩ [min_int, max_int] = ∅ is false, so that u^G(q, p) = α(∅) = BOT.

2. If p = POS, then by (14), −i ∈ γ(p) ∩ I = [1, max_int] if and only if ⟨by def. (21) of −⟩ i ∈ [min_int + 1, −1], so that −^G(q, p) = α(γ(q) ∩ [min_int + 1, −1]) ⊆ α(γ(q) ∩ γ(NEG)) by (14). But γ preserves meets, whence this is equal to α(γ(q ⊓ NEG)) ⊑ q ⊓ NEG, since α ∘ γ is reductive (7).

3. If p = INI or p = TOP, then by (14), −i ∈ γ(p) ∩ I = [min_int, max_int] if and only if ⟨by def. (21) of −⟩ i ∈ [min_int + 1, max_int], so that −^G(q, p) = α(γ(q) ∩ [min_int + 1, max_int]) ⊆ α(γ(q) ∩ γ(INI)) by (14). But γ preserves meets, whence this is equal to α(γ(q ⊓ INI)) ⊑ q ⊓ INI, since α ∘ γ is reductive (7).

For the backward binary arithmetic operations (44), we have

    /^G(q1, q2, p) = mod^G(q1, q2, p) ≜
        ( q1 ∈ {BOT, NEG, ERR} ∨ q2 ∈ {BOT, NEG, ZERO, ERR} ∨ p ∈ {BOT, NEG, ERR}
          ? ⟨BOT, BOT⟩
          ¿ ( p = POS ? smash(⟨q1 ⊓ POS, q2 ⊓ POS⟩) ¿ ⟨q1 ⊓ INI, q2 ⊓ POS⟩ ) ),

    smash(⟨x, y⟩) ≜ ( x = BOT ∨ y = BOT ? ⟨BOT, BOT⟩ ¿ ⟨x, y⟩ ) .

If b ∈ {/, mod} and q1 ∈ {BOT, NEG, ERR} or q2 ∈ {BOT, NEG, ZERO, ERR}, then i1 ∈ γ(q1) ⊆ [min_int, −1] ∪ {i, a} or i2 ∈ γ(q2) ⊆ [min_int, 0] ∪ {i, a}, in which case i1 b i2 ∉ I by (22). It follows that b^G(q1, q2, p) = α²(∅) = ⟨BOT, BOT⟩ by (11) and (15).

If p ∈ {BOT, NEG, ERR}, then i1 b i2 would have to belong to γ(p) ∩ I ⊆ [min_int, −1], in contradiction with (22), which shows that i1 b i2 is never negative. Again b^G(q1, q2, p) = α²(∅) = ⟨BOT, BOT⟩ by (11) and (15).

Otherwise, to have i1 b i2 ∈ I we must have i1 ∈ [0, max_int] and i2 ∈ [1, max_int], whence necessarily i1 ∈ γ(INI) and i2 ∈ γ(POS), so that α²(γ²(⟨q1 ⊓ INI, q2 ⊓ POS⟩)) ⊑² ⟨q1 ⊓ INI, q2 ⊓ POS⟩ ≜ b^G(q1, q2, p). Moreover, the quotient is strictly positive only if the dividend is nonzero.

With the same reasoning, for addition +^G we have
    +^G(q1, q2, p) = ⟨BOT, BOT⟩             if q1 ∈ {BOT, ERR} ∨ q2 ∈ {BOT, ERR} ∨ p ∈ {BOT, ERR},
    +^G(q1, q2, p) = ⟨q1 ⊓ INI, q2 ⊓ INI⟩   if p ∈ {INI, TOP} .

Otherwise:

    +^G(q1, q2, NEG)   q2: NEG          ZERO          POS           INI, TOP
    q1 = NEG            |  ⟨NEG, NEG⟩   ⟨NEG, ZERO⟩   ⟨NEG, POS⟩    ⟨NEG, INI⟩
    q1 = ZERO           |  ⟨ZERO, NEG⟩  ⟨BOT, BOT⟩    ⟨BOT, BOT⟩    ⟨ZERO, NEG⟩
    q1 = POS            |  ⟨POS, NEG⟩   ⟨BOT, BOT⟩    ⟨BOT, BOT⟩    ⟨POS, NEG⟩
    q1 = INI, TOP       |  ⟨INI, NEG⟩   ⟨NEG, ZERO⟩   ⟨NEG, POS⟩    ⟨INI, INI⟩

    +^G(q1, q2, ZERO)  q2: NEG          ZERO          POS           INI, TOP
    q1 = NEG            |  ⟨BOT, BOT⟩   ⟨BOT, BOT⟩    ⟨NEG, POS⟩    ⟨NEG, POS⟩
    q1 = ZERO           |  ⟨BOT, BOT⟩   ⟨ZERO, ZERO⟩  ⟨BOT, BOT⟩    ⟨ZERO, ZERO⟩
    q1 = POS            |  ⟨POS, NEG⟩   ⟨BOT, BOT⟩    ⟨BOT, BOT⟩    ⟨POS, NEG⟩
    q1 = INI, TOP       |  ⟨POS, NEG⟩   ⟨ZERO, ZERO⟩  ⟨NEG, POS⟩    ⟨INI, INI⟩

    +^G(q1, q2, POS)   q2: NEG          ZERO          POS           INI, TOP
    q1 = NEG            |  ⟨BOT, BOT⟩   ⟨BOT, BOT⟩    ⟨NEG, POS⟩    ⟨NEG, POS⟩
    q1 = ZERO           |  ⟨BOT, BOT⟩   ⟨BOT, BOT⟩    ⟨ZERO, POS⟩   ⟨ZERO, POS⟩
    q1 = POS            |  ⟨POS, NEG⟩   ⟨POS, ZERO⟩   ⟨POS, POS⟩    ⟨POS, INI⟩
    q1 = INI, TOP       |  ⟨POS, NEG⟩   ⟨POS, ZERO⟩   ⟨INI, POS⟩    ⟨INI, INI⟩

The backward ternary subtraction operation −^G is defined (using p − q = p + (−q)) as

    −^G(q1, q2, p) ≜ let ⟨r1, r2⟩ = +^G(q1, −^F(q2), p) in ⟨r1, −^F(r2)⟩ .
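The backward unary operations of this section only need the lattice meet and the forward unary table; the following OCaml sketch (our own, not the paper's module) implements −^G(q, p) as q ⊓ −^F(p ⊓ INI), which reproduces the backward unary table cell by cell.

```ocaml
(* Backward unary minus b_uminus q p for the 7-point lattice:
   refine q knowing that -v lies in gamma(p) ∩ I. *)
type lat = BOT | NEG | ZERO | POS | INI | ERR | TOP

(* partial order: BOT <= signs <= INI <= TOP, ERR <= TOP *)
let leq p q =
  p = BOT || q = TOP || p = q ||
  (match p, q with NEG, INI | ZERO, INI | POS, INI -> true | _ -> false)

(* in this lattice every incomparable pair meets at BOT *)
let meet p q =
  if leq p q then p else if leq q p then q else BOT

(* forward unary minus, as in the -F table *)
let f_uminus = function NEG -> POS | POS -> NEG | p -> p

(* b_uminus q p = q ⊓ -F(p ⊓ INI): the meet with INI discards
   the error component of p, mirroring the gamma(p) ∩ I in (43) *)
let b_uminus q p = meet q (f_uminus (meet p INI))

let () =
  assert (b_uminus TOP POS = NEG);   (* p = POS   -> q ⊓ NEG *)
  assert (b_uminus NEG ERR = BOT);   (* p = ERR   -> BOT     *)
  assert (b_uminus INI TOP = INI)    (* p = TOP   -> q ⊓ INI *)
```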
The handling of the backward ternary multiplication operation ∗^G is similar:

    ∗^G(q1, q2, NEG)   q2: NEG          ZERO         POS          INI, TOP
    q1 = NEG            |  ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨NEG, POS⟩   ⟨NEG, POS⟩
    q1 = ZERO           |  ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨BOT, BOT⟩
    q1 = POS            |  ⟨POS, NEG⟩   ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨POS, NEG⟩
    q1 = INI, TOP       |  ⟨POS, NEG⟩   ⟨BOT, BOT⟩   ⟨NEG, POS⟩   ⟨INI, INI⟩

    ∗^G(q1, q2, ZERO)  q2: NEG          ZERO          POS          INI, TOP
    q1 = NEG            |  ⟨BOT, BOT⟩   ⟨NEG, ZERO⟩   ⟨BOT, BOT⟩   ⟨NEG, ZERO⟩
    q1 = ZERO           |  ⟨ZERO, NEG⟩  ⟨ZERO, ZERO⟩  ⟨ZERO, POS⟩  ⟨ZERO, INI⟩
    q1 = POS            |  ⟨BOT, BOT⟩   ⟨POS, ZERO⟩   ⟨BOT, BOT⟩   ⟨POS, ZERO⟩
    q1 = INI, TOP       |  ⟨ZERO, NEG⟩  ⟨INI, ZERO⟩   ⟨ZERO, POS⟩  ⟨INI, INI⟩

    ∗^G(q1, q2, POS)   q2: NEG          ZERO         POS          INI, TOP
    q1 = NEG            |  ⟨NEG, NEG⟩   ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨NEG, NEG⟩
    q1 = ZERO           |  ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨BOT, BOT⟩
    q1 = POS            |  ⟨BOT, BOT⟩   ⟨BOT, BOT⟩   ⟨POS, POS⟩   ⟨POS, POS⟩
    q1 = INI, TOP       |  ⟨NEG, NEG⟩   ⟨BOT, BOT⟩   ⟨POS, POS⟩   ⟨INI, INI⟩
9. Semantics of Boolean Expressions

9.1 Abstract syntax of boolean expressions

We assume that boolean expressions are normalized according to the abstract syntax of Fig. 8.

    Arithmetic expressions A1, A2 ∈ Aexp .
    Boolean expressions B, B1, B2 ∈ Bexp ::=
          true                    truth,
        | false                   falsity,
        | A1 = A2 | A1 < A2       arithmetic comparison,
        | B1 & B2                 conjunction,
        | B1 | B2                 disjunction.

    Figure 8: Abstract syntax of boolean expressions

The normalization is specified by the following recursive rewriting rules:

    T(true) ≜ true,                              T(¬true) ≜ false,
    T(false) ≜ false,                            T(¬false) ≜ true,
    T(A1 < A2) ≜ A1 < A2,                        T(¬(A1 < A2)) ≜ T(A1 >= A2),
    T(A1 <= A2) ≜ (A1 < A2) | (A1 = A2),         T(¬(A1 <= A2)) ≜ T(A1 > A2),
    T(A1 > A2) ≜ A2 < A1,                        T(¬(A1 > A2)) ≜ T(A1 <= A2),
    T(A1 >= A2) ≜ (A1 = A2) | (A2 < A1),         T(¬(A1 >= A2)) ≜ T(A1 < A2),
    T(A1 = A2) ≜ A1 = A2,                        T(¬(A1 = A2)) ≜ (A1 < A2) | (A2 < A1),
    T(B1 & B2) ≜ T(B1) & T(B2),                  T(¬(B1 & B2)) ≜ T(¬(B1)) | T(¬(B2)),
    T(B1 | B2) ≜ T(B1) | T(B2),                  T(¬(B1 | B2)) ≜ T(¬(B1)) & T(¬(B2)),
    T(¬(¬(B))) ≜ T(B) .
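The normalization T can be transcribed directly. The following OCaml sketch is ours: it keeps only the normalized constructors (<, =, negation, conjunction, disjunction), so the comparison-operator rules above appear as the `Not` cases.

```ocaml
(* A sketch of the normalization function T on a reduced boolean syntax. *)
type aexp = Var of string
type bexp =
  | True | False
  | Lt of aexp * aexp | Eq of aexp * aexp
  | Not of bexp | And of bexp * bexp | Or of bexp * bexp

let rec t = function
  | True -> True
  | False -> False
  | Lt (a1, a2) -> Lt (a1, a2)
  | Eq (a1, a2) -> Eq (a1, a2)
  | And (b1, b2) -> And (t b1, t b2)
  | Or (b1, b2) -> Or (t b1, t b2)
  | Not True -> False
  | Not False -> True
  (* T(not (A1 < A2)) = T(A1 >= A2) = (A1 = A2) | (A2 < A1) *)
  | Not (Lt (a1, a2)) -> Or (Eq (a1, a2), Lt (a2, a1))
  (* T(not (A1 = A2)) = (A1 < A2) | (A2 < A1) *)
  | Not (Eq (a1, a2)) -> Or (Lt (a1, a2), Lt (a2, a1))
  | Not (And (b1, b2)) -> Or (t (Not b1), t (Not b2))
  | Not (Or (b1, b2)) -> And (t (Not b1), t (Not b2))
  | Not (Not b) -> t b

let () =
  let x, y = Var "x", Var "y" in
  assert (t (Not (Not (Lt (x, y)))) = Lt (x, y));
  assert (t (Not (Or (True, False))) = And (False, True))
```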
9.2 Machine booleans

We let B be the set of logical boolean values and B̄ be the set of machine truth values (including the errors E = {i, a}):

    B ≜ {tt, ff},    B̄ ≜ B ∪ E .

We respectively write c̄ ∈ I × I → B̄ for the machine arithmetic comparison operation and c ∈ Z × Z → B for the mathematical arithmetic comparison operation corresponding to the language binary arithmetic comparison operators c ∈ {<, <=, =, <>, >=, >}. Evaluation of operands, whence error propagation, is left to right. We have (e ∈ E, v ∈ I, i, i1, i2 ∈ I):

    e c̄ v ≜ e,
    i c̄ e ≜ e,    (45)
    i1 c̄ i2 ≜ i1 c i2 .

We respectively write ū ∈ B̄ → B̄ for the machine boolean operation and u ∈ B → B for the mathematical boolean operation corresponding to the language unary operators u ∈ {¬}. Errors are propagated, so that we have (e ∈ E, b ∈ B):

    ū e ≜ e,
    ū b ≜ u b .

We respectively write b̄ ∈ B̄ × B̄ → B̄ for the machine boolean operation and b ∈ B × B → B for the mathematical boolean operation corresponding to the language binary boolean operators b ∈ {&, |}. Evaluation of operands, whence error propagation, is left to right. We have (e ∈ E, w ∈ B̄, b1, b2 ∈ B):

    e b̄ w ≜ e,
    b1 b̄ e ≜ e,    (46)
    b1 b̄ b2 ≜ b1 b b2 .

9.3 Operational semantics of boolean expressions

The big-step operational semantics [31] of boolean expressions involves judgements ρ ⊢ B ⇒ w meaning that, in environment ρ, the boolean expression B may evaluate to w ∈ B̄. It is formally specified by the inference system of Fig. 9.

    ρ ⊢ true ⇒ tt                               truth;    (47)

    ρ ⊢ false ⇒ ff                              falsity;    (48)

    ρ ⊢ A1 ⇒ v1,  ρ ⊢ A2 ⇒ v2
    ──────────────────────────                  arithmetic comparisons;    (49)
    ρ ⊢ A1 c A2 ⇒ v1 c̄ v2

    ρ ⊢ B ⇒ w
    ──────────────                              unary boolean operations;
    ρ ⊢ u B ⇒ ū w

    ρ ⊢ B1 ⇒ w1,  ρ ⊢ B2 ⇒ w2
    ──────────────────────────                  binary boolean operations.
    ρ ⊢ B1 b B2 ⇒ w1 b̄ w2

    Figure 9: Operational semantics of boolean expressions

9.4 Equivalence of boolean expressions

In general, the semantics of a boolean expression B is not the same as the semantics of its transformed form T(B). This is because the rewriting rule T(A1 > A2) = A2 < A1 does not respect left-to-right evaluation, whence the error propagation order. For example, if ρ(X) = i then ρ ⊢ X > (1 / 0) ⇒ i while ρ ⊢ (1 / 0) < X ⇒ a. However, we will consider that all boolean expressions have been normalized (i.e. B = T(B)), because the respective evaluations of B and T(B) either produce the same boolean values (in general there is more than one possible value, because of random choice) or both expressions produce errors (which may be different). We have

    ∀b ∈ B : ρ ⊢ B ⇒ b ⟺ ρ ⊢ T(B) ⇒ b,
    (∃e ∈ E : ρ ⊢ B ⇒ e) ⟺ (∃e′ ∈ E : ρ ⊢ T(B) ⇒ e′) .

9.5 Collecting semantics of boolean expressions

The collecting semantics Cbexp⟦B⟧R of a boolean expression B defines the subset of possible environments ρ ∈ R for which the boolean expression may evaluate to true (hence without
producing a runtime error):

    Cbexp ∈ Bexp → ℘(R) -cjm→ ℘(R)    (a complete join morphism),
    Cbexp⟦B⟧R ≜ {ρ ∈ R | ρ ⊢ B ⇒ tt} .    (50)
10. Abstract Interpretation of Boolean Expressions 10.1 Generic abstract interpretation of boolean expressions We now consider the calculational design of the generic nonrelational abstract semantics of boolean expressions mon
Abexp ∈ Bexp 7→ (V 7→ L) 7−→ (V 7→ L) . For any possible approximation (9) of value properties, this consists in approximating environment properties by the nonrelational abstraction (20) and in applying the following functional abstraction to the collecting semantics (50). γ¨
cjm ← − h(V 7→ L) 7−mon ˙ − ¨ h℘ (R) 7−→ ℘ (R), ⊆i → (V 7→ L), vi −− → α¨
(51)
where 1 ˙ 9 = 8⊆ ∀R ∈ ℘ (R) : 8(R) ⊆ 9(R), 1 ¨ ψ = ∀r ∈ V 7→ L : ϕ(r ) v ˙ ψ(r ), ϕv 1
α(8) ¨ = α˙ B 8 B γ˙ ,
(52)
1
γ¨ (ϕ) = γ˙ B ϕ B α˙ . We must get an overapproximation such that ¨ α(CbexpJBK) AbexpJBK w ¨ .
(53)
We derive Abexp⟦B⟧ as follows:

  α̈(Cbexp⟦B⟧)
= ⟨def. (52) of α̈⟩
  λr ∈ V → L • α̇(Cbexp⟦B⟧(γ̇(r)))
= ⟨def. (50) of Cbexp⟩
  λr ∈ V → L • α̇({ρ ∈ γ̇(r) | ρ ⊢ B ⇒ tt}) .

If r is the infimum λY•⊥ and the infimum ⊥ of L is such that γ(⊥) = ∅, then γ̇(r) = ∅. In this case

  α̈(Cbexp⟦B⟧)(λY•⊥)
= ⟨def. (19) of γ̇⟩
  α̇(∅)
= ⟨def. (18) of α̇⟩
  λY•⊥ .

Otherwise (r ≠ λY•⊥ or γ(⊥) ≠ ∅), we have

  α̈(Cbexp⟦B⟧)r
= ⟨def. lambda expression⟩
  α̇({ρ ∈ γ̇(r) | ρ ⊢ B ⇒ tt}),

and we proceed by induction on the boolean expression B.
1. When B = true, we have

  α̈(Cbexp⟦true⟧)r
= α̇({ρ ∈ γ̇(r) | ρ ⊢ true ⇒ tt})
= ⟨def. (47) of ρ ⊢ true ⇒ b⟩
  α̇(γ̇(r))
⊑̇ ⟨α̇ ∘ γ̇ is reductive (51), (7)⟩
  r
= ⟨by defining Abexp⟦true⟧r ≜ r⟩
  Abexp⟦true⟧r .

2. When B = false, we have

  α̈(Cbexp⟦false⟧)r
= α̇({ρ ∈ γ̇(r) | ρ ⊢ false ⇒ tt})
= ⟨def. (48) of ρ ⊢ false ⇒ b⟩
  α̇(∅)
= ⟨def. (18) of α̇⟩
  λY•⊥
= ⟨by defining Abexp⟦false⟧r ≜ λY•⊥⟩
  Abexp⟦false⟧r .

3. When B = A1 c A2 is an arithmetic comparison, we have
  α̈(Cbexp⟦A1 c A2⟧)r
= α̇({ρ ∈ γ̇(r) | ρ ⊢ A1 c A2 ⇒ tt})
= ⟨def. (49) of ρ ⊢ A1 c A2 ⇒ b⟩
  α̇({ρ ∈ γ̇(r) | ∃v1, v2 : ρ ⊢ A1 ⇒ v1 ∧ ρ ⊢ A2 ⇒ v2 ∧ v1 c v2 = tt})
⊑̇ ⟨set theory and γ ∘ α is extensive (6)⟩
  α̇({ρ ∈ γ̇(r) | ∃v1 ∈ γ(α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A1 ⇒ v})) : ∃v2 ∈ γ(α({v | ∃ρ ∈ γ̇(r) : ρ ⊢ A2 ⇒ v})) : ρ ⊢ A1 ⇒ v1 ∧ ρ ⊢ A2 ⇒ v2 ∧ v1 c v2 = tt})
⊑̇ ⟨set theory and (33)⟩
  α̇({ρ ∈ γ̇(r) | ∃v1 ∈ γ(Faexp^F⟦A1⟧r) : ∃v2 ∈ γ(Faexp^F⟦A2⟧r) : ρ ⊢ A1 ⇒ v1 ∧ ρ ⊢ A2 ⇒ v2 ∧ v1 c v2 = tt})
= ⟨let notation⟩
  let ⟨p1, p2⟩ = ⟨Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r⟩ in
  α̇({ρ ∈ γ̇(r) | ∃v1 ∈ γ(p1) : ∃v2 ∈ γ(p2) : ρ ⊢ A1 ⇒ v1 ∧ ρ ⊢ A2 ⇒ v2 ∧ v1 c v2 = tt})
= ⟨def. (45) of c, implying v1, v2 ∉ E = {Ωi, Ωa}⟩
  let ⟨p1, p2⟩ = ⟨Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r⟩ in
  α̇({ρ ∈ γ̇(r) | ∃i1 ∈ γ(p1) ∩ I : ∃i2 ∈ γ(p2) ∩ I : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2 ∧ i1 c i2 = tt})
= ⟨set theory⟩
  let ⟨p1, p2⟩ = ⟨Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r⟩ in
  α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ {⟨i1′, i2′⟩ | i1′ ∈ γ(p1) ∩ I ∧ i2′ ∈ γ(p2) ∩ I ∧ i1′ c i2′ = tt} : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
⊑̇ ⟨γ² ∘ α² is extensive (13), (6) and α̇ monotone (20), (5)⟩
  let ⟨p1, p2⟩ = ⟨Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r⟩ in
  α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ γ²(α²({⟨i1′, i2′⟩ | i1′ ∈ γ(p1) ∩ I ∧ i2′ ∈ γ(p2) ∩ I ∧ i1′ c i2′ = tt})) : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
⊑̇ ⟨defining č such that č(p1, p2) ⊒² α²({⟨i1′, i2′⟩ | i1′ ∈ γ(p1) ∩ I ∧ i2′ ∈ γ(p2) ∩ I ∧ i1′ c i2′ = tt}); γ² and α̇ monotone (20), (5)⟩
  let ⟨p1, p2⟩ = ⟨Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r⟩ in
  α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ γ²(č(p1, p2)) : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
= ⟨let notation⟩
  let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in
  α̇({ρ ∈ γ̇(r) | ∃⟨i1, i2⟩ ∈ γ²(⟨p1, p2⟩) : ρ ⊢ A1 ⇒ i1 ∧ ρ ⊢ A2 ⇒ i2})
= ⟨set theory⟩
  let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in
  α̇({ρ ∈ γ̇(r) | ∃i1 ∈ γ(p1) : ρ ⊢ A1 ⇒ i1} ∩ {ρ ∈ γ̇(r) | ∃i2 ∈ γ(p2) : ρ ⊢ A2 ⇒ i2})
⊑̇ ⟨α̇ monotone (20), (5)⟩
  let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in
  α̇({ρ ∈ γ̇(r) | ∃i1 ∈ γ(p1) : ρ ⊢ A1 ⇒ i1}) ⊓̇ α̇({ρ ∈ γ̇(r) | ∃i2 ∈ γ(p2) : ρ ⊢ A2 ⇒ i2})
= ⟨def. (29) of Baexp⟩
  let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in
  α̇(Baexp⟦A1⟧(γ̇(r))γ(p1)) ⊓̇ α̇(Baexp⟦A2⟧(γ̇(r))γ(p2))
= ⟨def. (37) of α^G⟩
  let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in
  α^G(Baexp⟦A1⟧)(r)p1 ⊓̇ α^G(Baexp⟦A2⟧)(r)p2
⊑̇ ⟨def. (38) of Baexp^G and ⊓̇ monotone⟩
  let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in
  Baexp^G⟦A1⟧(r)p1 ⊓̇ Baexp^G⟦A2⟧(r)p2
= ⟨by defining Abexp⟦A1 c A2⟧r ≜ let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in Baexp^G⟦A1⟧(r)p1 ⊓̇ Baexp^G⟦A2⟧(r)p2⟩
  Abexp⟦A1 c A2⟧r .

4. When B = B1 & B2 is a conjunction, we have

  α̈(Cbexp⟦B1 & B2⟧)r
= α̇({ρ ∈ γ̇(r) | ∃w1, w2 : ρ ⊢ B1 ⇒ w1 ∧ ρ ⊢ B2 ⇒ w2 ∧ w1 & w2 = tt})
= ⟨def. (46) of &⟩
  α̇({ρ ∈ γ̇(r) | ρ ⊢ B1 ⇒ tt ∧ ρ ⊢ B2 ⇒ tt})
= ⟨set theory⟩
  α̇({ρ ∈ γ̇(r) | ρ ⊢ B1 ⇒ tt} ∩ {ρ ∈ γ̇(r) | ρ ⊢ B2 ⇒ tt})
⊑̇ ⟨α̇ monotone (20), (5)⟩
  α̇({ρ ∈ γ̇(r) | ρ ⊢ B1 ⇒ tt}) ⊓̇ α̇({ρ ∈ γ̇(r) | ρ ⊢ B2 ⇒ tt})
= ⟨def. (50) of Cbexp⟩
  α̇(Cbexp⟦B1⟧γ̇(r)) ⊓̇ α̇(Cbexp⟦B2⟧γ̇(r))
= ⟨def. (52) of α̈⟩
  α̈(Cbexp⟦B1⟧)r ⊓̇ α̈(Cbexp⟦B2⟧)r
⊑̇ ⟨induction hypothesis (53) and ⊓̇ monotone⟩
  Abexp⟦B1⟧r ⊓̇ Abexp⟦B2⟧r
= ⟨by defining Abexp⟦B1 & B2⟧r ≜ Abexp⟦B1⟧r ⊓̇ Abexp⟦B2⟧r⟩
  Abexp⟦B1 & B2⟧r .
In conclusion (Fig. 10),

Abexp⟦B⟧(λY•⊥) ≜ λY•⊥   if γ(⊥) = ∅,    (54)
Abexp⟦true⟧r ≜ r,
Abexp⟦false⟧r ≜ λY•⊥,
Abexp⟦A1 c A2⟧r ≜ let ⟨p1, p2⟩ = č(Faexp^F⟦A1⟧r, Faexp^F⟦A2⟧r) in Baexp^G⟦A1⟧(r)p1 ⊓̇ Baexp^G⟦A2⟧(r)p2,
Abexp⟦B1 & B2⟧r ≜ Abexp⟦B1⟧r ⊓̇ Abexp⟦B2⟧r,
Abexp⟦B1 | B2⟧r ≜ Abexp⟦B1⟧r ⊔̇ Abexp⟦B2⟧r,

parameterized by the abstract comparison operations č, c ∈ {=, <}, with OCaml signature

    ...
    val a_EQ : lat -> lat -> lat * lat
    val a_LT : lat -> lat -> lat * lat
  end;;
A functional implementation of Fig. 10 is

module Abexp_implementation =
  functor (L: Abstract_Lattice_Algebra_signature) ->
  functor (E: Abstract_Env_Algebra_signature) ->
  functor (Faexp: Faexp_signature) ->
  functor (Baexp: Baexp_signature) ->
  struct
    open Abstract_Syntax
    (* generic abstract environments *)
    module E' = E (L)
    (* generic forward abstract interpretation of arithmetic operations *)
    module Faexp' = Faexp (L) (E)
    (* generic backward abstract interpretation of arithmetic operations *)
    module Baexp' = Baexp (L) (E) (Faexp)
    (* generic abstract interpretation of boolean operations *)
    let rec abexp' b r = match b with
      | TRUE -> r
      | FALSE -> E'.bot ()
      | EQ (a1, a2) ->
          let (p1, p2) = L.a_EQ (Faexp'.faexp a1 r) (Faexp'.faexp a2 r) in
          E'.meet (Baexp'.baexp a1 r p1) (Baexp'.baexp a2 r p2)
      | LT (a1, a2) ->
          let (p1, p2) = L.a_LT (Faexp'.faexp a1 r) (Faexp'.faexp a2 r) in
          E'.meet (Baexp'.baexp a1 r p1) (Baexp'.baexp a2 r p2)
      | AND (b1, b2) -> E'.meet (abexp' b1 r) (abexp' b2 r)
      | OR (b1, b2) -> E'.join (abexp' b1 r) (abexp' b2 r)
    let abexp b r =
      if E'.is_bot r && L.isbotempty () then E'.bot ()
      else abexp' b r
  end;;
10.3 Generic abstract boolean equality

The calculational design of the abstract equality operation =̌ does not depend upon the specific choice of L:

  α²({⟨i1, i2⟩ | i1 ∈ γ(p1) ∩ I ∧ i2 ∈ γ(p2) ∩ I ∧ i1 = i2 = tt})
= ⟨def. (45) of =⟩
  α²({⟨i, i⟩ | i ∈ γ(p1) ∩ γ(p2) ∩ I})
⊑² ⟨γ ∘ α is extensive (6) and α² is monotone⟩
  α²({⟨i, i⟩ | i ∈ γ(p1) ∩ γ(p2) ∩ γ(α(I))})
= ⟨γ preserves meets⟩
  α²({⟨i, i⟩ | i ∈ γ(p1 ⊓ p2 ⊓ α(I))})
⊑² ⟨def. (12) of γ²⟩
  α²(γ²(⟨p1 ⊓ p2 ⊓ α(I), p1 ⊓ p2 ⊓ α(I)⟩))
⊑² ⟨α² ∘ γ² is reductive, and let notation⟩
  let p = p1 ⊓ p2 ⊓ α(I) in ⟨p, p⟩
⊑² ⟨def. (36) of ?^F⟩
  let p = p1 ⊓ p2 ⊓ ?^F in ⟨p, p⟩
= ⟨by defining p1 =̌ p2 ≜ let p = p1 ⊓ p2 ⊓ ?^F in ⟨p, p⟩⟩
  p1 =̌ p2 .

In conclusion,

p1 =̌ p2 ≜ let p = p1 ⊓ p2 ⊓ ?^F in ⟨p, p⟩ .

10.4 Initialization and simple sign abstract arithmetic comparison operations

The abstract strict comparison

    functor (E: Abstract_Env_Algebra_signature) ->
    functor (Faexp: Faexp_signature) ->
    struct
      (* generic abstract environments *)
      module E' = E (L)
      (* iterative fixpoint computation *)
      module F = Fixpoint ((E' : Poset_signature with type element = E (L).env))
      (* generic backward abstract interpretation of arithmetic operations *)
      module Baexp' = Baexp (L) (E) (Faexp)
      (* generic reductive backward abstract int. of arithmetic operations *)
      let baexp a r p = let f x = Baexp'.baexp a x p in F.gfp f r
    end;;
module Baexp_Reductive_Iteration = (Baexp_Reductive_Iteration_implementation (Baexp):Baexp_signature);;
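To make the generic equality of Sec. 10.3 concrete, here is a small OCaml sketch of a_EQ for the initialization-and-simple-sign lattice of Sec. 5.3. The lattice encoding and the meet below are ours (a plausible reading of the domain, not the paper's actual module): INI plays the role of α(I) and ERR that of the uninitialization error.

```ocaml
(* Hypothetical encoding of the initialization-and-simple-sign lattice;
   INI abstracts all initialized integers, ERR the uninitialized error. *)
type lat = BOT | NEG | ZERO | POS | INI | ERR | TOP

(* meet (greatest lower bound); NEG, ZERO, POS are pairwise disjoint *)
let meet p q = match p, q with
  | BOT, _ | _, BOT -> BOT
  | TOP, x | x, TOP -> x
  | INI, x | x, INI -> if x = ERR then BOT else x
  | ERR, ERR -> ERR
  | ERR, _ | _, ERR -> BOT
  | x, y -> if x = y then x else BOT

(* p1 =ˇ p2 = let p = p1 ⊓ p2 ⊓ α(I) in ⟨p, p⟩, as derived in Sec. 10.3 *)
let a_EQ p1 p2 = let p = meet p1 (meet p2 INI) in (p, p)
```

For example, with p1 = ZERO and p2 = INI, the test x = y refines the second property to ZERO, which matches the refinement from {y:INI} to {y:ZERO} in the reachability example below.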
Either of the Baexp or Baexp_Reductive_Iteration modules can be used by (i.e. passed as parameters to) the generic static analyzer. Here is an example of reachability analysis where all variables are assumed to be uninitialized at the program entry point with the initialization and simple sign abstraction. Abstract invariants automatically derived by the analysis are written below in italic between round brackets. without reductive iteration:
{ x:ERR; y:ERR; z:ERR }
x := 0; y := ?; z := ?;
{ x:ZERO; y:INI; z:INI }
if ((x=y)&(y=z)&((z+1)=x)) then
  { x:ZERO; y:ZERO; z:NEG }
  skip
else
  { x:ZERO; y:INI; z:INI }
  skip
fi
{ x:ZERO; y:INI; z:INI }

with reductive iteration:

{ x:ERR; y:ERR; z:ERR }
x := 0; y := ?; z := ?;
{ x:ZERO; y:INI; z:INI }
if ((x=y)&(y=z)&((z+1)=x)) then
  { x:BOT; y:BOT; z:BOT }
  skip
else
  { x:ZERO; y:INI; z:INI }
  skip
fi
{ x:ZERO; y:INI; z:INI }
Informally, without reductive iteration, from {x:ZERO; y:INI} and (x=y) we get {x:ZERO; y:ZERO}. Besides, from {y:INI; z:INI} and (y=z) we gain no information. Finally from {x:ZERO; z:INI} and ((z+1)=x), we get {x:ZERO; z:NEG}. By conjunction, we conclude with the invariant {x:ZERO; y:ZERO; z:NEG}. With reductive iteration, the analysis is repeated. So from {y:ZERO; z:NEG} and (y=z), we reduce to BOT.
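The reductive iteration amounts to a tiny generic fixpoint loop: starting from the current abstract environment, re-apply the (reductive) backward analysis until it stabilizes. A sketch with our own names, assuming f x ⊑ x and a domain without infinite strictly decreasing chains, so the loop terminates:

```ocaml
(* Decreasing iteration to a fixpoint: assumes f is reductive (f x <= x)
   on a domain with finite decreasing chains, so equality is reached. *)
let rec gfp (f : 'a -> 'a) (r : 'a) : 'a =
  let r' = f r in
  if r' = r then r else gfp f r'
```

This is the role played by the `F.gfp f r` step in the `Baexp_Reductive_Iteration_implementation` functor above.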
12. Semantics of Imperative Programs

12.1 Abstract syntax of commands and programs

The abstract syntax of programs is given in Fig. 11.

Variables X ∈ V .
Arithmetic expressions A ∈ Aexp .
Boolean expressions B ∈ Bexp .
Commands C ∈ Com ::=
    skip                          identity,
  | X := A                        assignment,
  | if B then S1 else S2 fi       conditional,
  | while B do S od               iteration.
Lists of commands S, S1, S2 ∈ Seq ::=
    C                             command,
  | C ; S                         sequence.
Programs P ∈ Prog ::= S ;;        program.

Figure 11: Abstract syntax of commands and programs

12.2 Program components

A program may be represented in abstract syntax as a finite ordered labelled tree, the leaves of which are labelled with identity commands, assignment commands and boolean expressions (which can themselves be represented by finite abstract syntax trees), and the internal nodes of which are labelled with conditional, iteration and sequence labels. Each subtree ⌊C⌋π, which uniquely identifies a component (subcommand or subsequence) C of a program, can be designated by a position π, that is, a sequence of positive integers — in Dewey decimal notation — describing the path within the program abstract syntax tree from the outermost program root symbol to the head of the component at that position (which is standard in rewrite systems [21]). These program components are defined as follows:
Cmp⟦S ;;⟧ ≜ {⌊S ;;⌋₀} ∪ Cmp⁰⟦S⟧,
Cmpπ⟦C1 ; ... ; Cn⟧ ≜ {⌊C1 ; ... ; Cn⌋π} ∪ ⋃ᵢ₌₁ⁿ Cmpπ·i⟦Ci⟧,
Cmpπ⟦if B then S1 else S2 fi⟧ ≜ {⌊if B then S1 else S2 fi⌋π} ∪ Cmpπ·1⟦S1⟧ ∪ Cmpπ·2⟦S2⟧,
Cmpπ⟦while B do S1 od⟧ ≜ {⌊while B do S1 od⌋π} ∪ Cmpπ·1⟦S1⟧,
Cmpπ⟦X := A⟧ ≜ {⌊X := A⌋π},
Cmpπ⟦skip⟧ ≜ {⌊skip⌋π} .

For example, Cmp⟦skip ; skip ;;⟧ = {⌊skip ; skip ;;⌋₀, ⌊skip⌋₀₁, ⌊skip⌋₀₂}, so that the two occurrences of the same command skip within the program skip ; skip ;; can be formally distinguished.

12.3 Program labelling

In practice the above positions are not that easy to use for identifying program components. We prefer labels ℓ ∈ Lab designating program points (P ∈ Prog):

at_P, after_P ∈ Cmp⟦P⟧ → Lab,
in_P ∈ Cmp⟦P⟧ → ℘(Lab) .

The labelling of program components is defined as follows (for short we leave positions implicit, writing C for ⌊C⌋π and assuming that the rules for designating subcomponents of a component are clear from Sec. 12.2):
∀C ∈ Cmp⟦P⟧ : at_P⟦C⟧ ≠ after_P⟦C⟧ .    (56)

If C = skip ∈ Cmp⟦P⟧ or C = X := A ∈ Cmp⟦P⟧, then

in_P⟦C⟧ = {at_P⟦C⟧, after_P⟦C⟧} .    (57)

If S = C1 ; ... ; Cn ∈ Cmp⟦P⟧, where n ≥ 1, is a sequence of commands, then

at_P⟦S⟧ = at_P⟦C1⟧,
after_P⟦S⟧ = after_P⟦Cn⟧,
in_P⟦S⟧ = ⋃ᵢ₌₁ⁿ in_P⟦Ci⟧,
∀i ∈ [1, n[ : {after_P⟦Ci⟧} = {at_P⟦Ci+1⟧} = in_P⟦Ci⟧ ∩ in_P⟦Ci+1⟧,
∀i, j ∈ [1, n] : (j ≠ i − 1 ∧ j ≠ i ∧ j ≠ i + 1) ⟹ (in_P⟦Ci⟧ ∩ in_P⟦Cj⟧ = ∅) .    (58)

If C = if B then St else Sf fi ∈ Cmp⟦P⟧ is a conditional command, then

in_P⟦C⟧ = {at_P⟦C⟧, after_P⟦C⟧} ∪ in_P⟦St⟧ ∪ in_P⟦Sf⟧,
{at_P⟦C⟧, after_P⟦C⟧} ∩ (in_P⟦St⟧ ∪ in_P⟦Sf⟧) = ∅,
in_P⟦St⟧ ∩ in_P⟦Sf⟧ = ∅ .    (59)

If C = while B do S od ∈ Cmp⟦P⟧ is an iteration command, then

in_P⟦C⟧ = {at_P⟦C⟧, after_P⟦C⟧} ∪ in_P⟦S⟧,
{at_P⟦C⟧, after_P⟦C⟧} ∩ in_P⟦S⟧ = ∅ .    (60)

If P = S ;; is a program, then

at_P⟦P⟧ = at_P⟦S⟧,    after_P⟦P⟧ = after_P⟦S⟧,    in_P⟦P⟧ = in_P⟦S⟧ .
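Returning to the component function Cmp of Sec. 12.2, it can be sketched in OCaml with positions represented as integer lists (Dewey paths). The syntax mirrors Fig. 11, but constructor names are ours, and for brevity the sketch enumerates command occurrences only (the paper's Cmp also records whole subsequences as components):

```ocaml
(* Sketch of Cmp (Sec. 12.2): each subcommand paired with its Dewey
   position, represented as an int list; expressions kept abstract. *)
type com =
  | Skip
  | Assign of string * string            (* X := A *)
  | If of string * com list * com list   (* if B then S1 else S2 fi *)
  | While of string * com list           (* while B do S od *)

let rec cmp_seq (pi : int list) (s : com list) : (int list * com) list =
  (* the i-th command of a sequence at position pi sits at position pi.i *)
  List.concat (List.mapi (fun i c -> cmp (pi @ [i + 1]) c) s)

and cmp (pi : int list) (c : com) : (int list * com) list =
  (pi, c) ::
  (match c with
   | Skip | Assign _ -> []
   | If (_, s1, s2) -> cmp_seq (pi @ [1]) s1 @ cmp_seq (pi @ [2]) s2
   | While (_, s) -> cmp_seq (pi @ [1]) s)
```

On skip ; skip, starting from the program root position [0], this distinguishes the two skip occurrences at positions [0;1] and [0;2], as in the example of Sec. 12.2.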
12.4 Program variables

The free variables Var ∈ (Prog ∪ Com ∪ Seq ∪ Aexp ∪ Bexp) → ℘(V) are defined as usual: for programs (S ∈ Seq),

Var⟦S ;;⟧ ≜ Var⟦S⟧ ;

for lists of commands (C ∈ Com, S ∈ Seq),

Var⟦C ; S⟧ ≜ Var⟦C⟧ ∪ Var⟦S⟧ ;

for commands (X ∈ V, A ∈ Aexp, B ∈ Bexp, S, St, Sf ∈ Seq),

Var⟦skip⟧ ≜ ∅,
Var⟦X := A⟧ ≜ {X} ∪ Var⟦A⟧,
Var⟦if B then St else Sf fi⟧ ≜ Var⟦B⟧ ∪ Var⟦St⟧ ∪ Var⟦Sf⟧,
Var⟦while B do S od⟧ ≜ Var⟦B⟧ ∪ Var⟦S⟧ ;

for arithmetic expressions (n ∈ Nat, X ∈ V, u ∈ {+, −}, A1, A2 ∈ Aexp, b ∈ {+, −, ∗, /, mod}),

Var⟦n⟧ ≜ ∅,    Var⟦X⟧ ≜ {X},    Var⟦?⟧ ≜ ∅,
Var⟦u A1⟧ ≜ Var⟦A1⟧,    Var⟦A1 b A2⟧ ≜ Var⟦A1⟧ ∪ Var⟦A2⟧ ;

and similarly for boolean expressions (A1, A2 ∈ Aexp, c ∈ {=, <}, ...).

Sequence of commands S = C1 ; ... ; Cn, n > 0 (ℓi, ℓi+1 ∈ in_P⟦Ci⟧ for all i ∈ [1, n]):

  ⟨ℓi, ρi⟩ —⟦Ci⟧→ ⟨ℓi+1, ρi+1⟩
  ─────────────────────────────────────    (74)
  ⟨ℓi, ρi⟩ —⟦C1 ; ... ; Cn⟧→ ⟨ℓi+1, ρi+1⟩

Program P = S ;; :

  ⟨ℓ, ρ⟩ —⟦S⟧→ ⟨ℓ′, ρ′⟩
  ─────────────────────────────────────    (75)
  ⟨ℓ, ρ⟩ —⟦S ;;⟧→ ⟨ℓ′, ρ′⟩

Figure 12: Small-step operational semantics of commands and programs
12.7 Transition system of a program

The transition system of a program P = S ;; is ⟨Σ⟦P⟧, τ⟦P⟧⟩, where Σ⟦P⟧ is the set (61) of program states and τ⟦C⟧, C ∈ Cmp⟦P⟧, is the transition relation for component C of program P, defined by

τ⟦C⟧ ≜ {⟨⟨ℓ, ρ⟩, ⟨ℓ′, ρ′⟩⟩ | ⟨ℓ, ρ⟩ —⟦C⟧→ ⟨ℓ′, ρ′⟩} .    (76)

Execution starts at the program entry point with all variables uninitialized:

Entry⟦P⟧ ≜ {⟨at_P⟦P⟧, λX ∈ Var⟦P⟧ • Ωi⟩} .    (77)

Execution ends without error when control reaches the program exit point:

Exit⟦P⟧ ≜ {after_P⟦P⟧} × Env⟦P⟧ .

When the evaluation of an arithmetic or boolean expression fails with a runtime error, the program execution is blocked, so that no further transition is possible. A basic result on the program transition relation is that it is not possible to jump into or out of program components (C ∈ Cmp⟦P⟧):

⟨⟨ℓ, ρ⟩, ⟨ℓ′, ρ′⟩⟩ ∈ τ⟦C⟧ ⟹ {ℓ, ℓ′} ⊆ in_P⟦C⟧ .    (78)
The proof, by structural induction on C, is trivial, whence omitted.

12.8 Reflexive transitive closure of the program transition relation

The reflexive transitive closure of the transition relation τ⟦C⟧ of a program component C ∈ Cmp⟦P⟧ is τ*⟦C⟧ ≜ (τ⟦C⟧)*. τ*⟦P⟧ can be expressed compositionally (by structural induction on the components C ∈ Cmp⟦P⟧ of program P). The computational design follows.
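Before the structural case analysis, note that for a finite state space the closure (τ⟦C⟧)* can also be computed directly by iterated composition; the compositional formulas derived below coincide with this brute-force closure. A sketch on integer states (the representation and names are ours):

```ocaml
(* Reflexive-transitive closure of a finite relation, as used for tau*[C].
   States are ints; a relation is a list of pairs. *)
let compose r1 r2 =
  List.concat_map
    (fun (a, b) ->
       List.filter_map (fun (c, d) -> if b = c then Some (a, d) else None) r2)
    r1

let union r1 r2 = List.sort_uniq compare (r1 @ r2)

let star (states : int list) (r : (int * int) list) =
  let id = List.map (fun s -> (s, s)) states in          (* 1_Sigma *)
  let rec lfp acc =                                      (* iterate to fixpoint *)
    let acc' = union acc (compose acc r) in
    if acc' = acc then acc else lfp acc'
  in
  lfp (union id r)
```

The iteration terminates because the closure grows monotonically inside the finite set states × states.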
1. For the identity C = skip and the assignment C = X := A,

  τ*⟦C⟧
= ⟨def. of (τ⟦C⟧)* and τ⟦C⟧: since at_P⟦C⟧ ≠ after_P⟦C⟧ by (56), we have (τ⟦C⟧)² = ∅, whence by recurrence (τ⟦C⟧)ⁿ = ∅ for all n ≥ 2; 1_S is the identity relation on the set S⟩
  1_Σ⟦P⟧ ∪ τ⟦C⟧ .

2. For the conditional C = if B then St else Sf fi, we define

τ^B ≜ {⟨⟨at_P⟦C⟧, ρ⟩, ⟨at_P⟦St⟧, ρ⟩⟩ | ρ ⊢ B ⇒ tt},
τ^B̄ ≜ {⟨⟨at_P⟦C⟧, ρ⟩, ⟨at_P⟦Sf⟧, ρ⟩⟩ | ρ ⊢ T(¬B) ⇒ tt},
τ^t ≜ {⟨⟨after_P⟦St⟧, ρ⟩, ⟨after_P⟦C⟧, ρ⟩⟩ | ρ ∈ Env⟦P⟧},
τ^f ≜ {⟨⟨after_P⟦Sf⟧, ρ⟩, ⟨after_P⟦C⟧, ρ⟩⟩ | ρ ∈ Env⟦P⟧} .
It follows by (64) to (69) that τ⟦C⟧ = τtt⟦C⟧ ∪ τff⟦C⟧, where

τtt⟦C⟧ ≜ τ^B ∪ τ⟦St⟧ ∪ τ^t,
τff⟦C⟧ ≜ τ^B̄ ∪ τ⟦Sf⟧ ∪ τ^f .

By the conditions (59) and (78) on the labelling of the conditional command C, we have τtt⟦C⟧ ∘ τff⟦C⟧ = τff⟦C⟧ ∘ τtt⟦C⟧ = ∅, so that

τ*⟦C⟧ = (τtt⟦C⟧)* ∪ (τff⟦C⟧)* .    (79)

Intuitively, the steps which are repeated in the conditional must all take place in one branch or the other, since it is impossible to jump from one branch into the other. Assume by induction hypothesis that

(τtt⟦C⟧)ⁿ = τ^B ∘ τ⟦St⟧^(n−2) ∘ τ^t ∪ τ^B ∘ τ⟦St⟧^(n−1) ∪ τ⟦St⟧^(n−1) ∘ τ^t ∪ τ⟦St⟧ⁿ .    (80)

This holds for the basis n = 1, since τ⟦St⟧^(−1) = ∅ and τ⟦St⟧⁰ = 1_Σ⟦P⟧ is the identity. For n ≥ 1, we have

  (τtt⟦C⟧)^(n+1)
= ⟨def. t^(n+1) = tⁿ ∘ t⟩
  (τtt⟦C⟧)ⁿ ∘ τtt⟦C⟧
= ⟨induction hypothesis⟩
  (τ^B ∘ τ⟦St⟧^(n−2) ∘ τ^t ∪ τ^B ∘ τ⟦St⟧^(n−1) ∪ τ⟦St⟧^(n−1) ∘ τ^t ∪ τ⟦St⟧ⁿ) ∘ τtt⟦C⟧
= ⟨∘ distributes over ∪ (and ∘ has priority over ∪)⟩
  τ^B ∘ τ⟦St⟧^(n−2) ∘ τ^t ∘ τtt⟦C⟧ ∪ τ^B ∘ τ⟦St⟧^(n−1) ∘ τtt⟦C⟧ ∪ τ⟦St⟧^(n−1) ∘ τ^t ∘ τtt⟦C⟧ ∪ τ⟦St⟧ⁿ ∘ τtt⟦C⟧
= ⟨by the labelling scheme (59), (78) and the def. (64) to (69) of the possible transitions, so that τ^t ∘ τtt⟦C⟧ = ∅, etc.⟩
  τ^B ∘ τ⟦St⟧^(n−1) ∘ τtt⟦C⟧ ∪ τ⟦St⟧ⁿ ∘ τtt⟦C⟧
= ⟨def. of τtt⟦C⟧ and ∘ distributes over ∪⟩
  τ^B ∘ τ⟦St⟧^(n−1) ∘ τ^B ∪ τ^B ∘ τ⟦St⟧^(n−1) ∘ τ⟦St⟧ ∪ τ^B ∘ τ⟦St⟧^(n−1) ∘ τ^t ∪ τ⟦St⟧ⁿ ∘ τ^B ∪ τ⟦St⟧ⁿ ∘ τ⟦St⟧ ∪ τ⟦St⟧ⁿ ∘ τ^t
= ⟨by the labelling scheme (59), (78) and the def. (64) to (69) of the possible transitions, so that τ^B ∘ τ⟦St⟧^(n−1) ∘ τ^B = ∅, τ⟦St⟧ⁿ ∘ τ^B = ∅, etc.⟩
  τ^B ∘ τ⟦St⟧ⁿ ∪ τ^B ∘ τ⟦St⟧^(n−1) ∘ τ^t ∪ τ⟦St⟧^(n+1) ∪ τ⟦St⟧ⁿ ∘ τ^t
= ⟨∪ is associative and commutative, and def. (80) of (τtt⟦C⟧)^(n+1)⟩
  (τtt⟦C⟧)^(n+1) .
By recurrence, (80) holds for all n ≥ 1, so that

  (τtt⟦C⟧)*
= ⟨def. t*⟩
  (τtt⟦C⟧)⁰ ∪ ⋃_{n≥1} (τtt⟦C⟧)ⁿ
= ⟨def. t⁰ and (80)⟩
  1_Σ⟦P⟧ ∪ ⋃_{n≥1} (τ^B ∘ τ⟦St⟧^(n−2) ∘ τ^t ∪ τ^B ∘ τ⟦St⟧^(n−1) ∪ τ⟦St⟧^(n−1) ∘ τ^t ∪ τ⟦St⟧ⁿ)
= ⟨∘ distributes over ∪⟩
  1_Σ⟦P⟧ ∪ τ^B ∘ (⋃_{n≥1} τ⟦St⟧^(n−2)) ∘ τ^t ∪ τ^B ∘ (⋃_{n≥1} τ⟦St⟧^(n−1)) ∪ (⋃_{n≥1} τ⟦St⟧^(n−1)) ∘ τ^t ∪ ⋃_{n≥1} τ⟦St⟧ⁿ
= ⟨changing variables k = n − 2 and j = n − 1; τ⟦St⟧^(−1) = ∅, τ⟦St⟧⁰ = 1_Σ⟦P⟧, and by the labelling scheme (59), (78) and the def. (64) to (69) of the possible transitions, τ^B ∘ τ^t = ∅, etc.⟩
  ⋃_{k≥1} τ^B ∘ τ⟦St⟧^k ∘ τ^t ∪ ⋃_{j≥0} τ^B ∘ τ⟦St⟧^j ∪ ⋃_{j≥0} τ⟦St⟧^j ∘ τ^t ∪ ⋃_{n≥0} τ⟦St⟧ⁿ
= ⟨τ^B ∘ τ^t = ∅ and def. of t*⟩
  τ^B ∘ (τ⟦St⟧)* ∘ τ^t ∪ τ^B ∘ (τ⟦St⟧)* ∪ (τ⟦St⟧)* ∘ τ^t ∪ (τ⟦St⟧)*
= ⟨∘ distributes over ∪ (and * has priority over ∘, which has priority over ∪)⟩
  (1_Σ⟦P⟧ ∪ τ^B) ∘ (τ⟦St⟧)* ∘ (1_Σ⟦P⟧ ∪ τ^t) .

A similar result is easily established for (τff⟦C⟧)*, whence by (79) we get

τ*⟦if B then St else Sf fi⟧ = (1_Σ⟦P⟧ ∪ τ^B) ∘ (τ⟦St⟧)* ∘ (1_Σ⟦P⟧ ∪ τ^t) ∪ (1_Σ⟦P⟧ ∪ τ^B̄) ∘ (τ⟦Sf⟧)* ∘ (1_Σ⟦P⟧ ∪ τ^f) .

3. The case of iteration is rather long to handle and can be skipped at first reading. By analogy with the conditional, the big-step operational semantics (94) of iteration should be intuitive. Formally, for the iteration C = while B do S od, we define
τ^B ≜ {⟨⟨at_P⟦C⟧, ρ⟩, ⟨at_P⟦S⟧, ρ⟩⟩ | ρ ⊢ B ⇒ tt},
τ^B̄ ≜ {⟨⟨at_P⟦C⟧, ρ⟩, ⟨after_P⟦C⟧, ρ⟩⟩ | ρ ⊢ T(¬B) ⇒ tt},
τ^R ≜ {⟨⟨after_P⟦S⟧, ρ⟩, ⟨at_P⟦C⟧, ρ⟩⟩ | ρ ∈ Env⟦P⟧} .

It follows by (70) to (73) that

τ⟦C⟧ = τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄ .    (81)

We define the composition ⊙ᵢ₌₁ⁿ tᵢ of relations t₁, ..., tₙ (∘ is associative but not commutative, so the index set must be totally ordered for the notation to be meaningful):

⊙ᵢ₌₁ⁿ tᵢ ≜ ∅,    when n < 0,
⊙ᵢ₌₁ⁿ tᵢ ≜ 1_Σ⟦P⟧,    when n = 0,
⊙ᵢ₌₁ⁿ tᵢ ≜ t₁ ∘ ... ∘ tₙ,    when n > 0 .

In order to compute τ*⟦C⟧ = ⋃_{n≥0} τ⟦C⟧ⁿ for the component C = while B do S od of program P, we first compute the n-th power τ⟦C⟧ⁿ for n ≥ 0. By recurrence, τ⟦C⟧⁰ = 1_Σ⟦P⟧ and τ⟦C⟧¹ = τ⟦C⟧ = τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄. For n > 1, we have

  (τ⟦C⟧)²
= ⟨def. t² = t ∘ t⟩
  τ⟦C⟧ ∘ τ⟦C⟧
= ⟨def. (81) of τ⟦C⟧⟩
  (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄) ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪ (and ∘ has priority over ∪)⟩
  τ^B ∘ τ^B ∪ τ⟦S⟧ ∘ τ^B ∪ τ^R ∘ τ^B ∪ τ^B̄ ∘ τ^B ∪ τ^B ∘ τ⟦S⟧ ∪ τ⟦S⟧ ∘ τ⟦S⟧ ∪ τ^R ∘ τ⟦S⟧ ∪ τ^B̄ ∘ τ⟦S⟧ ∪ τ^B ∘ τ^R ∪ τ⟦S⟧ ∘ τ^R ∪ τ^R ∘ τ^R ∪ τ^B̄ ∘ τ^R ∪ τ^B ∘ τ^B̄ ∪ τ⟦S⟧ ∘ τ^B̄ ∪ τ^R ∘ τ^B̄ ∪ τ^B̄ ∘ τ^B̄
= ⟨τ^B ∘ τ^B = ∅ by (71) and (56); τ⟦S⟧ ∘ τ^B = ∅ by (72), (71), (78) and (60); τ^B̄ ∘ τ^B = ∅ by (70), (71) and (56); τ^R ∘ τ⟦S⟧ = ∅ by (73), (72) and (60); τ^B̄ ∘ τ⟦S⟧ = ∅ by (70), (72), (78) and (60); τ^B ∘ τ^R = ∅ by (71), (73) and (56); τ^R ∘ τ^R = ∅ by (73), (60) and (78); τ^B̄ ∘ τ^R = ∅ by (71), (70), (60) and (78); τ⟦S⟧ ∘ τ^B̄ = ∅ by (72), (70), (60) and (78); τ^B ∘ τ^B̄ = ∅ and τ^B̄ ∘ τ^B̄ = ∅ by (70) and (56)⟩
  τ^R ∘ τ^B ∪ τ^B ∘ τ⟦S⟧ ∪ τ⟦S⟧² ∪ τ⟦S⟧ ∘ τ^R ∪ τ^R ∘ τ^B̄ .
The generalization, after computing the first few iterates n = 1, ..., 4, leads to the following induction hypothesis (n ≥ 1):

(τ⟦C⟧)ⁿ = Aₙ ∪ Bₙ ∪ Cₙ ∪ Dₙ ∪ Eₙ ∪ Fₙ ∪ Gₙ    (82)

where

Aₙ ≜ ⋃_{n = Σᵢ₌₁ʲ(kᵢ+2)} ⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R) ;    (83)

(This corresponds to j loop iterations from and to the loop entry at_P⟦C⟧, where the i-th execution of the loop body S takes exactly kᵢ ≥ 1 steps⁷. Aₙ = ∅ for n ≤ 1.)

Bₙ ≜ ⋃_{n = (Σᵢ₌₁ʲ(kᵢ+2))+1+ℓ} (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ τ^B ∘ τ⟦S⟧^ℓ ;    (84)

(This corresponds to j loop iterations from and to the loop entry at_P⟦C⟧, where the i-th execution of the loop body S takes exactly kᵢ ≥ 1 steps, followed by a successful condition B and a partial execution of the loop body S for ℓ ≥ 0 steps⁸. B₀ = ∅, B₁ = τ^B.)

Cₙ ≜ ⋃_{n = (Σᵢ₌₁ʲ(kᵢ+2))+1} (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ τ^B̄ ;    (85)

(This corresponds to j loop iterations where the i-th execution of the loop body S takes kᵢ ≥ 1 steps within S, until termination with condition B false. C₀ = ∅, C₁ = τ^B̄.)

Dₙ ≜ ⋃_{n = ℓ+1+(Σᵢ₌₁ʲ(kᵢ+2))} τ⟦S⟧^ℓ ∘ τ^R ∘ ⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R) ;    (86)

(This corresponds to an observation of the execution starting in the middle of the loop body S for ℓ steps, followed by the jump back to the loop entry at_P⟦C⟧, followed by j complete loop iterations from and to the loop entry at_P⟦C⟧, where the i-th execution of the loop body S takes exactly kᵢ ≥ 1 steps. D₀ = ∅, D₁ = τ^R.)

Eₙ ≜ ⋃_{n = (Σᵢ₌₁ʲ(kᵢ+2))+ℓ+2+m} τ⟦S⟧^ℓ ∘ τ^R ∘ (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ τ^B ∘ τ⟦S⟧^m ;    (87)

(This corresponds to an observation of the execution starting in the middle of the loop body S for ℓ ≥ 0 steps, followed by the jump back to the loop entry at_P⟦C⟧. Then there are j loop iterations from and to the loop entry at_P⟦C⟧, where the i-th execution of the loop body S takes exactly kᵢ ≥ 1 steps. Finally the condition B holds and a partial execution of the loop body S for m ≥ 0 steps is performed. E₀ = E₁ = ∅ and E₂ = τ^R ∘ τ^B.)

Fₙ ≜ ⋃_{n = (Σᵢ₌₁ʲ(kᵢ+2))+ℓ+2} τ⟦S⟧^ℓ ∘ τ^R ∘ (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ τ^B̄ ;    (88)

(This case is similar to Eₙ except that the execution of the loop terminates with condition B false. F₀ = F₁ = ∅ and F₂ = τ^R ∘ τ^B̄.)

Gₙ ≜ (τ⟦S⟧)ⁿ ;    (89)

(This case corresponds to the observation of n ≥ 1 steps within the loop body S.)

⁷ For short, the constraints kᵢ > 0, i = 1, ..., j are not explicitly inserted in the formula.
⁸ Again, the constraint ℓ ≥ 0 is left implicit in the formula.

We now prove (82) by recurrence on n. Given a formula Fₙ ∈ {Aₙ, ..., Fₙ} of the form Fₙ = ⋃_{C(n,ℓ,m,...)} T(n, ℓ, m, ...), where n, ℓ, m, ... are free variables of the condition C and of the term T, we write Fₙ | C′(n, ℓ, m, ...) for the formula ⋃_{C(n,ℓ,m,...) ∧ C′(n,ℓ,m,...)} T(n, ℓ, m, ...).
3.1 For the basis, observe that for n = 1: A₁ = ∅, B₁ = τ^B, C₁ = τ^B̄, D₁ = τ^R, E₁ = ∅, F₁ = ∅ and G₁ = (τ⟦S⟧)¹ = τ⟦S⟧, so that

(τ⟦C⟧)¹ = τ⟦C⟧ = τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄ = B₁ ∪ G₁ ∪ D₁ ∪ C₁ = A₁ ∪ B₁ ∪ C₁ ∪ D₁ ∪ E₁ ∪ F₁ ∪ G₁ .

3.2 For n = 2, observe that A₂ = ∅, B₂ = τ^B ∘ τ⟦S⟧, C₂ = ∅, D₂ = τ⟦S⟧ ∘ τ^R, E₂ = τ^R ∘ τ^B, F₂ = τ^R ∘ τ^B̄ and G₂ = (τ⟦S⟧)², so that

(τ⟦C⟧)² = τ^R ∘ τ^B ∪ τ^B ∘ τ⟦S⟧ ∪ τ⟦S⟧² ∪ τ⟦S⟧ ∘ τ^R ∪ τ^R ∘ τ^B̄ = E₂ ∪ B₂ ∪ G₂ ∪ D₂ ∪ F₂ = A₂ ∪ B₂ ∪ C₂ ∪ D₂ ∪ E₂ ∪ F₂ ∪ G₂ .
3.3 For the induction step n ≥ 2, we have to consider the compositions Aₙ ∘ τ⟦C⟧, ..., Gₙ ∘ τ⟦C⟧ in turn.

  Aₙ ∘ τ⟦C⟧
= ⟨def. (81) of τ⟦C⟧⟩
  Aₙ ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪; n ≥ 2 so j ≥ 1, whence Aₙ has the form τ′ ∘ τ^R, and τ^R ∘ τ⟦S⟧ = ∅ and τ^R ∘ τ^R = ∅⟩
  Aₙ ∘ τ^B ∪ Aₙ ∘ τ^B̄
= ⟨def. (83) of Aₙ and τ⟦S⟧⁰ = 1_Σ⟦P⟧⟩
  ⋃_{n+1 = (Σᵢ₌₁ʲ(kᵢ+2))+1+0} (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ τ^B ∘ τ⟦S⟧⁰ ∪ ⋃_{n+1 = (Σᵢ₌₁ʲ(kᵢ+2))+1} (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ τ^B̄
= ⟨def. (84) of Bₙ₊₁ with additional constraint ℓ = 0, and def. (85) of Cₙ₊₁⟩
  (Bₙ₊₁ | ℓ = 0) ∪ Cₙ₊₁ .

  Bₙ ∘ τ⟦C⟧
= ⟨def. (81) of τ⟦C⟧⟩
  Bₙ ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪; either ℓ = 0 in Bₙ, in which case Bₙ has the form τ′ ∘ τ^B and τ^B ∘ τ^B = ∅, τ^B ∘ τ^R = ∅ and τ^B ∘ τ^B̄ = ∅, or ℓ > 0 in Bₙ, in which case Bₙ has the form τ″ ∘ τ⟦S⟧ and τ⟦S⟧ ∘ τ^B = ∅ and τ⟦S⟧ ∘ τ^B̄ = ∅⟩
  (Bₙ | ℓ = 0) ∘ τ⟦S⟧ ∪ (Bₙ | ℓ > 0) ∘ τ⟦S⟧ ∪ (Bₙ | ℓ > 0) ∘ τ^R
= ⟨def. (84) of Bₙ⟩
  (Bₙ₊₁ | ℓ = 1) ∪ (Bₙ₊₁ | ℓ > 1) ∪ ⋃_{n = (Σᵢ₌₁ʲ(kᵢ+2))+1+ℓ} (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ τ^B ∘ τ⟦S⟧^ℓ ∘ τ^R
= ⟨∘ distributes over ∪⟩
  (Bₙ₊₁ | ℓ = 1) ∪ (Bₙ₊₁ | ℓ > 1) ∪ ⋃_{n+1 = (Σᵢ₌₁ʲ(kᵢ+2))+2+ℓ} (⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)) ∘ (τ^B ∘ τ⟦S⟧^ℓ ∘ τ^R)
= ⟨by letting kⱼ₊₁ = ℓ ≥ 1⟩
  (Bₙ₊₁ | ℓ = 1) ∪ (Bₙ₊₁ | ℓ > 1) ∪ ⋃_{n+1 = Σᵢ₌₁ʲ⁺¹(kᵢ+2)} ⊙ᵢ₌₁ʲ⁺¹ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)
= ⟨by letting j′ = j + 1 and def. (83) of Aₙ₊₁⟩
  (Bₙ₊₁ | ℓ = 1) ∪ (Bₙ₊₁ | ℓ > 1) ∪ Aₙ₊₁
= ⟨associativity of ∪⟩
  (Bₙ₊₁ | ℓ > 0) ∪ Aₙ₊₁ .

  Cₙ ∘ τ⟦C⟧
= ⟨def. (81) of τ⟦C⟧⟩
  Cₙ ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪; Cₙ has the form τ′ ∘ τ^B̄ and τ^B̄ ∘ τ^B = τ^B̄ ∘ τ⟦S⟧ = τ^B̄ ∘ τ^R = τ^B̄ ∘ τ^B̄ = ∅⟩
  ∅ .

  Dₙ ∘ τ⟦C⟧
= ⟨def. (81) of τ⟦C⟧⟩
  Dₙ ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪; Dₙ has the form τ′ ∘ τ^R and τ^R ∘ τ⟦S⟧ = τ^R ∘ τ^R = ∅⟩
  Dₙ ∘ τ^B ∪ Dₙ ∘ τ^B̄
= ⟨def. (87) of Eₙ₊₁ and (88) of Fₙ₊₁⟩
  (Eₙ₊₁ | m = 0) ∪ Fₙ₊₁ .

  Eₙ ∘ τ⟦C⟧
= ⟨def. (81) of τ⟦C⟧⟩
  Eₙ ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪; Eₙ | m = 0 has the form τ′ ∘ τ^B while Eₙ | m > 0 has the form τ″ ∘ τ⟦S⟧; τ^B ∘ τ^B = τ^B ∘ τ^R = τ^B ∘ τ^B̄ = ∅ and τ⟦S⟧ ∘ τ^B = τ⟦S⟧ ∘ τ^B̄ = ∅⟩
  (Eₙ | m = 0) ∘ τ⟦S⟧ ∪ (Eₙ | m > 0) ∘ τ⟦S⟧ ∪ (Eₙ | m > 0) ∘ τ^R
= ⟨def. (87) of Eₙ and (86) of Dₙ₊₁, with kⱼ₊₁ = m ≥ 1 so that ℓ < n⟩
  (Eₙ₊₁ | m = 1) ∪ (Eₙ₊₁ | m > 1) ∪ (Dₙ₊₁ | ℓ < n)
= ⟨∪ is associative⟩
  (Eₙ₊₁ | m > 0) ∪ (Dₙ₊₁ | ℓ < n) .

  Fₙ ∘ τ⟦C⟧
= ⟨def. (81) of τ⟦C⟧⟩
  Fₙ ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪; by def. (88), Fₙ has the form τ′ ∘ τ^B̄ and τ^B̄ ∘ τ^B = τ^B̄ ∘ τ⟦S⟧ = τ^B̄ ∘ τ^R = τ^B̄ ∘ τ^B̄ = ∅⟩
  ∅ .

  Gₙ ∘ τ⟦C⟧
= ⟨def. (89) of Gₙ and (81) of τ⟦C⟧⟩
  (τ⟦S⟧)ⁿ ∘ (τ^B ∪ τ⟦S⟧ ∪ τ^R ∪ τ^B̄)
= ⟨∘ distributes over ∪; n ≥ 1, τ⟦S⟧ ∘ τ^B = τ⟦S⟧ ∘ τ^B̄ = ∅⟩
  (τ⟦S⟧)ⁿ ∘ τ⟦S⟧ ∪ (τ⟦S⟧)ⁿ ∘ τ^R
= ⟨def. of the (n+1)-th power and (86) of Dₙ₊₁⟩
  (τ⟦S⟧)^(n+1) ∪ (Dₙ₊₁ | ℓ = n) .

Grouping all cases together, we get

  (τ⟦C⟧)^(n+1)
= ⟨def. of the (n+1)-th power and (82)⟩
  (Aₙ ∪ Bₙ ∪ Cₙ ∪ Dₙ ∪ Eₙ ∪ Fₙ ∪ Gₙ) ∘ τ⟦C⟧
= ⟨∘ distributes over ∪, def. (89) of Gₙ⟩
  Aₙ ∘ τ⟦C⟧ ∪ Bₙ ∘ τ⟦C⟧ ∪ Cₙ ∘ τ⟦C⟧ ∪ Dₙ ∘ τ⟦C⟧ ∪ Eₙ ∘ τ⟦C⟧ ∪ Fₙ ∘ τ⟦C⟧ ∪ (τ⟦S⟧)ⁿ ∘ τ⟦C⟧
= ⟨replacing according to the above lemmata⟩
  ((Bₙ₊₁ | ℓ = 0) ∪ Cₙ₊₁) ∪ ((Bₙ₊₁ | ℓ > 0) ∪ Aₙ₊₁) ∪ ∅ ∪ ((Eₙ₊₁ | m = 0) ∪ Fₙ₊₁) ∪ ((Eₙ₊₁ | m > 0) ∪ (Dₙ₊₁ | ℓ < n)) ∪ ∅ ∪ ((τ⟦S⟧)^(n+1) ∪ (Dₙ₊₁ | ℓ = n))
= ⟨∪ is associative and commutative, and (Dₙ₊₁ | ℓ > n) = ∅⟩
  Aₙ₊₁ ∪ Bₙ₊₁ ∪ Cₙ₊₁ ∪ Dₙ₊₁ ∪ Eₙ₊₁ ∪ Fₙ₊₁ ∪ Gₙ₊₁ .
By recurrence on n ≥ 1, we have proved that

(τ⟦C⟧)ⁿ = Aₙ ∪ Bₙ ∪ Cₙ ∪ Dₙ ∪ Eₙ ∪ Fₙ ∪ (τ⟦S⟧)ⁿ

so that

  τ*⟦C⟧ = (τ⟦C⟧)*
= (τ⟦C⟧)⁰ ∪ ⋃_{n≥1} (Aₙ ∪ Bₙ ∪ Cₙ ∪ Dₙ ∪ Eₙ ∪ Fₙ ∪ (τ⟦S⟧)ⁿ)
= (τ⟦C⟧)⁰ ∪ (⋃_{n≥1} Aₙ ∪ ⋃_{n≥1} Bₙ ∪ ⋃_{n≥1} Cₙ ∪ ⋃_{n≥1} Dₙ ∪ ⋃_{n≥1} Eₙ ∪ ⋃_{n≥1} Fₙ ∪ (τ⟦S⟧)*) .
We now compute each of these terms.

  ⋃_{n≥1} Aₙ
= ⟨def. (83) of Aₙ⟩
  ⋃_{n≥1} ⋃_{n = Σᵢ₌₁ʲ(kᵢ+2)} ⊙ᵢ₌₁ʲ (τ^B ∘ τ⟦S⟧^kᵢ ∘ τ^R)
= ⟨for n ∈ [1, 3] this is ∅, while for n > 3 we can always find j and k₁ ≥ 1, ..., kⱼ ≥ 1 such that n = Σᵢ₌₁ʲ(kᵢ + 2); reciprocally, for all choices of j and k₁ ≥ 1, ..., kⱼ ≥ 1 there exists an n > 3 such that n = Σᵢ₌₁ʲ(kᵢ + 2)⟩
  (τ^B ∘ τ⟦S⟧⁺ ∘ τ^R)⁺
= ⟨τ^B ∘ τ^R = ∅⟩
  (τ^B ∘ τ⟦S⟧* ∘ τ^R)⁺ .
By the same reasoning, we get

⋃_{n≥1} Bₙ = (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B ∘ τ⟦S⟧*,
⋃_{n≥1} Cₙ = (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B̄,
⋃_{n≥1} Dₙ = τ⟦S⟧* ∘ τ^R ∘ (τ^B ∘ τ⟦S⟧* ∘ τ^R)*,
⋃_{n≥1} Eₙ = τ⟦S⟧* ∘ τ^R ∘ (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B ∘ τ⟦S⟧*,
⋃_{n≥1} Fₙ = τ⟦S⟧* ∘ τ^R ∘ (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B̄ .
Grouping now all cases together and using the fact that ∘ distributes over ∪, we finally get

  τ*⟦C⟧ = τ⟦S⟧⁰ ∪ (τ^B ∘ τ⟦S⟧* ∘ τ^R)⁺
    ∪ (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B ∘ τ⟦S⟧*
    ∪ (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B̄
    ∪ τ⟦S⟧* ∘ τ^R ∘ (τ^B ∘ τ⟦S⟧* ∘ τ^R)*
    ∪ τ⟦S⟧* ∘ τ^R ∘ (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B ∘ τ⟦S⟧*
    ∪ τ⟦S⟧* ∘ τ^R ∘ (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ τ^B̄
  = (1_Σ⟦P⟧ ∪ τ⟦S⟧* ∘ τ^R) ∘ (τ^B ∘ τ⟦S⟧* ∘ τ^R)* ∘ (1_Σ⟦P⟧ ∪ τ^B ∘ τ⟦S⟧* ∪ τ^B̄) .
4. The case of the sequence is also long to handle and can be skipped at first reading. The big-step operational semantics (95) of the sequence is indeed rather intuitive. Formally, for the sequence S = C1 ; ... ; Cn, n ≥ 1, we first prove a lemma.

4.1 Let P be the program with subcommand S = C1 ; ... ; Cn. Successive small steps in S must be made in sequence since, by the definition (76) and (74) of τ⟦S⟧ and the labelling scheme (58), it is impossible to jump from one command into a different one:

τ^k1⟦C1⟧ ∘ ... ∘ τ^kn⟦Cn⟧ = ( ∀i ∈ [1, n] : ki = 0 ? 1_Σ⟦P⟧    (90)
  | ∃1 ≤ i ≤ j ≤ n : ∀ℓ ∈ [1, n] : (kℓ ≠ 0 ⟺ ℓ ∈ [i, j]) ? τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧
  ¿ ∅ ) .

The proof is by recurrence on n.

4.1.1 If, for the basis, n = 1, then either k1 = 0 and τ⁰⟦C1⟧ = 1_Σ⟦P⟧, or k1 > 0 and then τ^k1⟦C1⟧ = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧ by choosing i = j = 1.

4.1.2 For the induction step, assuming (90), we prove that

T ≜ τ^k1⟦C1⟧ ∘ ... ∘ τ^kn⟦Cn⟧ ∘ τ^kn+1⟦Cn+1⟧

is of the form (90) with n + 1 substituted for n. Two cases, with several subcases, have to be considered.

4.1.2.1 If ∀i ∈ [1, n] : ki = 0, then we consider two subcases.

4.1.2.1.1 If kn+1 = 0 then ∀i ∈ [1, n+1] : ki = 0 and T = τ^k1⟦C1⟧ ∘ ... ∘ τ^kn⟦Cn⟧ ∘ τ^kn+1⟦Cn+1⟧ = 1_Σ⟦P⟧ ∘ τ⁰⟦Cn+1⟧ = 1_Σ⟦P⟧.

4.1.2.1.2 Otherwise kn+1 > 0, and then ∀ℓ ∈ [1, n+1] : (kℓ ≠ 0 ⟺ ℓ ∈ [n+1, n+1]) and T = τ^k1⟦C1⟧ ∘ ... ∘ τ^kn⟦Cn⟧ ∘ τ^kn+1⟦Cn+1⟧ = 1_Σ⟦P⟧ ∘ τ^kn+1⟦Cn+1⟧ = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧ by choosing i = j = n + 1.

4.1.2.2 Otherwise, ∃i ∈ [1, n] : ki ≠ 0.

4.1.2.2.1 If ∃1 ≤ i ≤ j ≤ n : ∀ℓ ∈ [1, n] : (kℓ ≠ 0 ⟺ ℓ ∈ [i, j]), then by (90) we have

T = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧ ∘ τ^kn+1⟦Cn+1⟧ .

4.1.2.2.1.1 If kn+1 = 0 then ∃1 ≤ i ≤ j ≤ n+1 : ∀ℓ ∈ [1, n+1] : (kℓ ≠ 0 ⟺ ℓ ∈ [i, j]) and:

T = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧ ∘ τ^kn+1⟦Cn+1⟧ = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧ ∘ 1_Σ⟦P⟧ = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧ .
4.1.2.2.1.2 Otherwise kn+1 > 0, and we distinguish two subcases.

4.1.2.2.1.2.1 If j < n then, since t^(k+1) = t ∘ t^k = t^k ∘ t,

T = τ^ki⟦Ci⟧ ∘ ... ∘ τ^(kj−1)⟦Cj⟧ ∘ τ⟦Cj⟧ ∘ τ⟦Cn+1⟧ ∘ τ^(kn+1−1)⟦Cn+1⟧ .

By the definition (76) and (74) of τ⟦C⟧ and the labelling scheme (58), we have τ⟦Cj⟧ ∘ τ⟦Cn+1⟧ = ∅ since j < n, so that in that case T = ∅.

4.1.2.2.1.2.2 Otherwise j = n, so ∀ℓ ∈ [1, i[ : kℓ = 0, ∀ℓ ∈ [i, n+1] : kℓ > 0 and T = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kn⟦Cn⟧ ∘ τ^kn+1⟦Cn+1⟧, whence ∀ℓ ∈ [1, n+1] : (kℓ ≠ 0 ⟺ ℓ ∈ [i, j]) with 1 ≤ i < j = n + 1 and T = τ^ki⟦Ci⟧ ∘ ... ∘ τ^kj⟦Cj⟧.

4.1.2.2.2 Otherwise ∀1 ≤ i ≤ j ≤ n : ∃ℓ ∈ [1, n] : (kℓ ≠ 0 ∧ ℓ ∉ [i, j]) ∨ (ℓ ∈ [i, j] ∧ kℓ = 0).

4.1.2.2.2.1 This excludes n = 1, since then i = j = ℓ = 1 and k1 = 0, in contradiction with ∃i ∈ [1, n] : ki ≠ 0.

4.1.2.2.2.2 If n = 2 then k1 = 0 and k2 > 0, or k1 > 0 and k2 = 0, which corresponds to case 4.1.2.2.1, whence is impossible.

4.1.2.2.2.3 So necessarily n ≥ 3. Let p ∈ [1, n] be minimal and q ∈ [1, n] be maximal such that kp ≠ 0 and kq ≠ 0. There exists m ∈ [p, q] such that km = 0 since otherwise, applying the hypothesis of case 4.1.2.2.2 to i = p and j = q, there would exist ℓ with kℓ ≠ 0 and either ℓ < p, in contradiction with the minimality of p, or ℓ > q, in contradiction with the maximality of q. We have p < m < q with kp ≠ 0, km = 0 and kq ≠ 0. Assume m to be minimal with that property, so that km−1 ≠ 0, and let q′ > m be minimal such that kq′ ≠ 0, so that kq′−1 = 0. We have k1 = 0, ..., kp−1 = 0, kp ≠ 0, ..., km−1 ≠ 0, km = 0, ..., kq′−1 = 0, kq′ ≠ 0, ... It follows, by the definition (76) and (74) of τ⟦C⟧ and the labelling scheme (58), that τ^k1⟦C1⟧ ∘ ... ∘ τ^kn⟦Cn⟧ = ∅, so that T = ∅ ∘ τ^kn+1⟦Cn+1⟧ = ∅. It remains to prove that

∀1 ≤ i ≤ j ≤ n+1 : ∃ℓ ∈ [1, n+1] : (kℓ ≠ 0 ∧ ℓ ∉ [i, j]) ∨ (ℓ ∈ [i, j] ∧ kℓ = 0) .

4.1.2.2.2.3.1 If j < n + 1 then this follows from (90).

4.1.2.2.2.3.2 Otherwise j = n + 1, in which case either kn+1 = 0, and then we choose ℓ = j, or kn+1 > 0, so that q′ = j = n + 1. If i ≤ m then for ℓ = m we have ℓ ∈ [i, j] and kℓ = km = 0. Otherwise m < i ≤ q′; choosing ℓ = p, we have ℓ ∉ [i, j] with kℓ = kp ≠ 0.

4.2 We will need a second lemma, stating that k small steps in C1 ; ... ; Cn must be made in sequence, with k1 steps in C1, followed by k2 in C2, ..., followed by kn in Cn, such that the total number k1 + ... + kn of these steps is precisely k:

τ^k⟦C1 ; ... ; Cn⟧ = ⋃_{k = k1+...+kn} τ^k1⟦C1⟧ ∘ ... ∘ τ^kn⟦Cn⟧ .    (91)
4.2.1 The proof is by recurrence on k ≥ 0. For k = 0, we get k1 = · · · = kn = 0 and 1Σ JPK on both sides of the equality.

4.2.2 For k = 1, there must exist m ∈ [1, n] such that km = 1 while for all j ∈ [1, n] − {m}, k j = 0. By the definition (76) and (74) of τ JC1 ; . . . ; Cn K, we have

τ JC1 ; . . . ; Cn K = ⋃_{m=1}^{n} τ JCm K .
4.2.3 For the induction step (proving the case k + 1, k ≥ 1), we have

τ k+1 JC1 ; . . . ; Cn K
= ⟨def. t k+1 = t k B t of powers⟩
τ k JC1 ; . . . ; Cn K B τ JC1 ; . . . ; Cn K
= ⟨def. (76) and (74) of τ JC1 ; . . . ; Cn K⟩
τ k JC1 ; . . . ; Cn K B ⋃_{m=1}^{n} τ JCm K
= ⟨B distributes over ∪⟩
⋃_{m=1}^{n} τ k JC1 ; . . . ; Cn K B τ JCm K
= ⟨induction hypothesis (91)⟩
⋃_{m=1}^{n} (⋃_{k=k1 +···+kn} τ k1 JC1 K B · · · B τ kn JCn K) B τ JCm K
= ⟨B distributes over ∪⟩
⋃_{m=1}^{n} ⋃_{k=k1 +···+kn} τ k1 JC1 K B · · · B τ kn JCn K B τ JCm K
≜ ⟨by definition⟩
T .
4.2.3.1 We first show that

T ⊆ ⋃_{k+1=k′1 +···+k′n} τ k′1 JC1 K B · · · B τ k′n JCn K .

According to lemma (90), three cases have to be considered for

t ≜ τ k1 JC1 K B · · · B τ kn JCn K B τ JCm K .

4.2.3.1.1 The case ∀i ∈ [1, n] : ki = 0 is impossible, since then k = Σ_{j=1}^{n} k j = 0, in contradiction with k ≥ 1.

4.2.3.1.2 Else if ∃1 ≤ i ≤ j ≤ n : ∀` ∈ [1, n] : (k` ≠ 0 ⇔ ` ∈ [i, j]) then

t = τ ki JCi K B · · · B τ k j JC j K B τ JCm K .

We discriminate according to the value of m.
4.2.3.1.2.1 If m = j, we get

t = τ ki JCi K B · · · B τ k j +1 JC j K = τ k′1 JC1 K B · · · B τ k′n JCn K

with k + 1 = k′1 + · · · + k′n where k′1 = 0, . . . , k′i−1 = 0, k′i = ki , . . . , k′ j = k j + 1, k′ j+1 = 0, . . . , k′n = 0.

4.2.3.1.2.2 If m = j + 1, we get

t = τ ki JCi K B · · · B τ k j JC j K B τ 1 JC j+1 K = τ k′1 JC1 K B · · · B τ k′n JCn K

with k + 1 = k′1 + · · · + k′n where k′1 = 0, . . . , k′i−1 = 0, k′i = ki , . . . , k′ j = k j , k′ j+1 = 1, k′ j+2 = 0, . . . , k′n = 0.

4.2.3.1.2.3 Otherwise, by the definition (76) and (74) of τ JCK and the labelling scheme (58), τ JC j K B τ JCm K = ∅, so that t = ∅ = τ k′1 JC1 K B · · · B τ k′n JCn K with k′` = k` for ` ∈ [1, n] − {m} and k′m = km + 1.
4.2.3.1.3 Otherwise T = ∅, so that the inclusion is trivial.

4.2.3.2 Inversely, we now show that

⋃_{k+1=k′1 +···+k′n} τ k′1 JC1 K B · · · B τ k′n JCn K ⊆ T .

According to lemma (90), three cases have to be considered for

t ≜ τ k′1 JC1 K B · · · B τ k′n JCn K .

4.2.3.2.1 If ∀i ∈ [1, n] : k′i = 0 then k + 1 = Σ_{i=1}^{n} k′i = 0, which is impossible with k ≥ 0.

4.2.3.2.2 Else if ∃1 ≤ i ≤ j ≤ n : ∀` ∈ [1, n] : (k′` ≠ 0 ⇔ ` ∈ [i, j]) then

t = τ k′i JCi K B · · · B τ k′ j JC j K

with all k′i > 0, . . . , k′ j > 0.

4.2.3.2.2.1 If k′ j = 1 then

t = τ k′i JCi K B · · · B τ k′ j−1 JC j−1 K B τ 1 JC j K ,

so we choose k1 = 0, . . . , ki−1 = 0, ki = k′i , . . . , k j−1 = k′ j−1 , k j = 0, . . . , kn = 0 and m = j, with k + 1 = k′1 + · · · + k′n whence k = k1 + · · · + kn .

4.2.3.2.2.2 Otherwise k′ j > 1 and t has the form required for T, by choosing k1 = 0, . . . , ki−1 = 0, ki = k′i , . . . , k j = k′ j − 1, k j+1 = 0, . . . , kn = 0 and m = j, with k + 1 = k′1 + · · · + k′n whence k = k1 + · · · + kn .

4.2.3.2.3 Otherwise t = τ k′1 JC1 K B · · · B τ k′n JCn K = ∅, which is obviously included in T .
4.3 We can now consider the case 4 of the sequence S = C1 ; . . . ; Cn , n ≥ 1:

τ ? JC1 ; . . . ; Cn K
= ⟨def. reflexive transitive closure⟩
⋃_{k≥0} τ k JC1 ; . . . ; Cn K
= ⟨lemma (91)⟩
⋃_{k1 +···+kn ≥0} τ k1 JC1 K B · · · B τ kn JCn K
= ⋃_{k1 ≥0} · · · ⋃_{kn ≥0} τ k1 JC1 K B · · · B τ kn JCn K
= ⟨B distributes over ∪⟩
(⋃_{k1 ≥0} τ k1 JC1 K) B · · · B (⋃_{kn ≥0} τ kn JCn K)
= ⟨def. reflexive transitive closure⟩
τ ? JC1 K B · · · B τ ? JCn K .

5 Finally, for programs P = S ;;, we have

τ ? JS ;;K
= ⟨def. reflexive transitive closure⟩
⋃_{k≥0} τ k JS ;;K
= ⟨by the definition (76) and (75) of τ JS ;;K⟩
⋃_{k≥0} τ k JSK
= ⟨def. reflexive transitive closure⟩
τ ? JSK .
In conclusion, the calculational design of the reflexive transitive closure of the program transition relation leads to the functional and compositional characterization given in Fig. 13. Here compositional means, in the sense of denotational semantics, by induction on the program syntactic structure. Observe that, contrary to the classical big-step operational or natural semantics [31], the effect of execution is described not only from entry to exit states but also from any (intermediate) state to any subsequently reachable state. This is better adapted to our later reachability analysis.

12.9 Predicate transformers and fixpoints

The pre-image pre[t] P of a set P ⊆ S of states by a transition relation t ⊆ S × S is the set of states from which it is possible to reach a state in P by a transition t:

pre[t] P ≜ {s | ∃s′ : hs, s′i ∈ t ∧ s′ ∈ P} .

The dual pre-image p̃re[t] P is the set of states from which any transition, if any, must lead to a state in P:
• τ ? JskipK = 1Σ JPK ∪ τ JskipK   (92)

• τ ? JX := AK = 1Σ JPK ∪ τ JX := AK

• τ ? Jif B then St else S f fiK = (1Σ JPK ∪ τ B ) B τ ? JSt K B (1Σ JPK ∪ τ t ) ∪ (1Σ JPK ∪ τ B̄ ) B τ ? JS f K B (1Σ JPK ∪ τ f )   (93)

where:

τ B ≜ {hhat P Jif B then St else S f fiK, ρi, hat P JSt K, ρii | ρ ` B ⇒ tt}
τ B̄ ≜ {hhat P Jif B then St else S f fiK, ρi, hat P JS f K, ρii | ρ ` T (¬B) ⇒ tt}
τ t ≜ {hhafter P JSt K, ρi, hafter P Jif B then St else S f fiK, ρii | ρ ∈ EnvJPK}
τ f ≜ {hhafter P JS f K, ρi, hafter P Jif B then St else S f fiK, ρii | ρ ∈ EnvJPK}

• τ ? Jwhile B do S odK = (1Σ JPK ∪ τ ? JSK B τ R ) B (τ B B τ ? JSK B τ R )? B (1Σ JPK ∪ τ B B τ ? JSK ∪ τ B̄ )   (94)

where:

τ B ≜ {hhat P Jwhile B do S odK, ρi, hat P JSK, ρii | ρ ` B ⇒ tt}
τ B̄ ≜ {hhat P Jwhile B do S odK, ρi, hafter P Jwhile B do S odK, ρii | ρ ` T (¬B) ⇒ tt}
τ R ≜ {hhafter P JSK, ρi, hat P Jwhile B do S odK, ρii | ρ ∈ EnvJPK}

• τ ? JC1 ; . . . ; Cn K = τ ? JC1 K B · · · B τ ? JCn K   (95)

• τ ? JS ;;K = τ ? JSK .   (96)

Figure 13: Big-step operational semantics
p̃re[t] P ≜ ¬pre[t](¬P) = {s | ∀s′ : hs, s′i ∈ t ⟹ s′ ∈ P} .

The post-image post[t] P is the inverse pre-image, that is the set of states which are reachable from P ⊆ S by a transition t:

post[t] P ≜ pre[t −1 ] P = {s′ | ∃s : s ∈ P ∧ hs, s′i ∈ t} .   (97)

The dual post-image p̃ost[t] P is the set of states which can only be reached, if ever possible, by a transition t from P:

p̃ost[t] P ≜ ¬post[t](¬P) = {s′ | ∀s : hs, s′i ∈ t ⟹ s ∈ P} .

We have the Galois connections (t ⊆ S × S)

h℘ (S), ⊆i ⇄ h℘ (S), ⊆i with lower adjoint post[t] and upper adjoint p̃re[t], and h℘ (S), ⊆i ⇄ h℘ (S), ⊆i with lower adjoint pre[t] and upper adjoint p̃ost[t],

as well as (for P ⊆ S, with γ P (Y ) ≜ {hs, s′i | s ∈ P ⟹ s′ ∈ Y })

h℘ (S × S), ⊆i ⇄ h℘ (S), ⊆i with lower adjoint λt • post[t] P and upper adjoint γ P , and symmetrically with lower adjoint λt • pre[t] P .   (98)

We often use the facts that

pre[t1 B t2 ] = pre[t1 ] B pre[t2 ] , post[t1 B t2 ] = post[t2 ] B post[t1 ] ,   (99)

post[1 S ] P = P and pre[1 S ] P = P .   (100)
The following fixpoint characterizations are classical (see e.g. [4], [5]):

pre[t ? ] F = lfp⊆ λX • F ∪ pre[t] X = lfp⊆_F λX • X ∪ pre[t] X ,
p̃re[t ? ] F = gfp⊆ λX • F ∩ p̃re[t] X = gfp⊆_F λX • X ∩ p̃re[t] X ,
post[t ? ] I = lfp⊆ λX • I ∪ post[t] X = lfp⊆_I λX • X ∪ post[t] X ,   (101)
p̃ost[t ? ] I = gfp⊆ λX • I ∩ p̃ost[t] X = gfp⊆_I λX • X ∩ p̃ost[t] X .

12.10 Reachable states collecting semantics

The reachable states collecting semantics of a component C ∈ CmpJPK of a program P ∈ Prog is the set post[τ ? JCK](In) of states which are reachable from a given set In ∈ ℘ (Σ JPK) of initial states, in particular the entry states In = EntryJPK when C is the program P. The program analysis problem we are interested in is to effectively compute a machine-representable program invariant J ∈ ℘ (Σ JPK) such that

post[τ ? JCK] In ⊆ J .   (102)
Using (101), the collecting semantics post[τ ? JCK] In can be expressed in fixpoint form, as follows:

PostJCK ∈ ℘ (Σ JPK) ⟼cjm ℘ (Σ JPK) ,
PostJCKIn ≜ post[τ ? JCK] In = lfp⊆ −→PostJCKIn ,   (103)

where

−→PostJCK ∈ ℘ (Σ JPK) ⟼cjm ℘ (Σ JPK) ⟼cjm ℘ (Σ JPK) ,
−→PostJCKIn ≜ λX • In ∪ post[τ JCK] X .   (104)

It follows that we have to effectively compute a machine-representable approximation to the least solution of a fixpoint equation.
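On a finite state space the least fixpoint of (104) can be computed by plain Kleene iteration, since the increasing chain of iterates stabilizes. A minimal sketch (our Python encoding, not the paper's OCaml analyzer):

```python
# Kleene iteration for Post[C]In = lfp (lambda X: In | post[tau] X), cf. (103)-(104).
# States are integers and the transition relation tau is a set of pairs; on a
# finite space the increasing chain of iterates reaches the least fixpoint.

def post(t, P):
    return {s2 for (s1, s2) in t if s1 in P}

def reachable(tau, init):
    X = set()
    while True:
        X2 = init | post(tau, X)   # one application of lambda X: In ∪ post[tau] X
        if X2 == X:
            return X               # least fixpoint: the states reachable from init
        X = X2

tau = {(0, 1), (1, 2), (3, 4)}
assert reachable(tau, {0}) == {0, 1, 2}   # states 3, 4 are unreachable from 0
assert reachable(tau, {3}) == {3, 4}
```

The abstract interpreter designed below replaces the uncomputable sets of states by machine-representable abstract invariants, but keeps exactly this iteration scheme.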
13. Abstract Interpretation of Imperative Programs

The classical approach [13, 5], followed during the Marktoberdorf course, consists in expressing the reachable states in fixpoint form (104) and then in using fixpoint transfer theorems [13] to get, by static partitioning [5], a system of equations attaching precise (for program proving) or approximate (for automated program analysis) abstract assertions to labels or program points [13]. Any chaotic [10] or asynchronous [4] strategy can be used to solve the system of equations iteratively. In practice, the iteration strategy which consists in iteratively or recursively traversing the dependence graph of the system of equations in weak topological order [3] speeds up the convergence, often very significantly [32]. This approach is quite general in that it does not depend upon a specific programming language. However, for the simplistic language considered in these notes, the iteration order naturally mimics the execution order, as expressed by the big-step relational semantics of Fig. 13. This remark allows us to obtain the corresponding efficient recursive equation solver in a more direct and simpler, purely computational way.
13.1 Fixpoint precise abstraction

The following fixpoint abstraction theorem [13] is used to derive a precise abstract semantics from a concrete one expressed in fixpoint form. Recall that the iteration order of F is the least ordinal ε such that F ε (⊥) = lfp≤ F, if it exists.

Theorem 2 If hM, ≤, 0, ∨i is a cpo, the pair hα, γ i is a Galois connection between hM, ≤i and hL , ⊑i with lower adjoint α and upper adjoint γ, F ∈ M ⟼mon M and G ∈ L ⟼mon L are monotonic and

∀x ∈ M : x ≤ lfp≤ F ⟹ α B F (x) = G B α(x) ,

then α(lfp≤ F ) = lfp⊑ G and the iteration order of G is less than or equal to that of F .

Proof Since 0 is the infimum of M and F is monotone, the transfinite iteration sequence F δ (0), δ ∈ O (1) starting from 0 for F is an increasing chain which is ultimately stationary and converges to F ε = lfp≤ F where ε is the iteration order of F (see [12]). It follows that ∀δ ∈ O : F δ (0) ≤ lfp≤ F so

α B F (F δ (0)) = G B α(F δ (0)) .   (105)
0 is the infimum of M so ∀y ∈ L : 0 ≤ γ (y), whence α(0) ⊑ y for Galois connections (8), proving that ⊥ = α(0) is the infimum of L. Let Gδ (⊥), δ ∈ O be the transfinite iteration sequence (1) starting from ⊥ for G. It is increasing and convergent to Gε′ = lfp⊑ G where ε′ is minimal (see [12]). We have α(F 0 (0)) = α(0) = ⊥ = G0 (⊥). If by induction hypothesis α(F δ (0)) = Gδ (⊥) then

α(F δ+1 (0))
= ⟨def. (1) of the transfinite iteration sequence⟩ α(F (F δ (0)))
= ⟨commutation property (105)⟩ G B α(F δ (0))
= ⟨induction hypothesis⟩ G(Gδ (⊥))
= ⟨def. (1) of the transfinite iteration sequence⟩ Gδ+1 (⊥) .

If λ is a limit ordinal and ∀δ < λ : α(F δ (0)) = Gδ (⊥) by induction hypothesis then, by def. (1) of transfinite iteration sequences, def. of lubs and since α preserves existing lubs in Galois connections, we have α(F λ (0)) = α(∨δ<λ F δ (0)) = ⊔δ<λ α(F δ (0)) = ⊔δ<λ Gδ (⊥) = Gλ (⊥).

    (* fragment of the OCaml signature of abstract domains *)
    val lub : aDom -> aDom -> aDom               (* least upper bound *)
    val leq : aDom -> aDom -> bool               (* approximation ordering *)
    val get : aDom -> label -> E(L).env          (* j(l) *)
    val set : aDom -> label -> E(L).env -> aDom  (* j[l <- t] (substitution) *)

The operations ⊔, ⊓, hα, γ i, etc. will be left implicit, although when programming this must be made explicit. We proceed by structural induction on the components CmpJPK of P as defined in Sec. 12.2, proving that for all C ∈ CmpJPK:

monotony: APostJCK ∈ ADomJPK ⟼mon ADomJPK ,
soundness: α̈̇JPK(PostJCK) ⊑̈̇ APostJCK ,   (112)
locality: ∀J ∈ ADomJPK : ∀` ∈ inJPK − in P JCK : J` = (APostJCKJ )` ,   (113)
dependence: ∀J, J ′ ∈ ADomJPK : (∀` ∈ in P JCK : J` = J ′` ) ⟹ (∀` ∈ in P JCK : (APostJCKJ )` = (APostJCKJ ′ )` ) .   (114)
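The abstract domain operations named in the OCaml signature fragment above (get, set, lub, leq) can be mimicked on a toy encoding. The sketch below is ours, for illustration only: abstract environments are modelled as dicts mapping variables to sets of possible values, standing in for any lattice of abstract environments.

```python
# Toy model of the abstract domain ADom[P] of invariant maps from program
# labels to abstract environments, with operations get/set/lub/leq as in the
# OCaml signature. Encoding (dicts of sets) is illustrative, not the paper's.

def env_lub(r1, r2):
    # pointwise join of two abstract environments
    return {x: r1.get(x, set()) | r2.get(x, set()) for x in r1.keys() | r2.keys()}

def env_leq(r1, r2):
    # pointwise approximation ordering on abstract environments
    return all(v <= r2.get(x, set()) for x, v in r1.items())

def lub(J1, J2):
    # least upper bound of two label-indexed invariant maps
    return {l: env_lub(J1.get(l, {}), J2.get(l, {})) for l in J1.keys() | J2.keys()}

def leq(J1, J2):
    # approximation ordering on invariant maps
    return all(env_leq(r, J2.get(l, {})) for l, r in J1.items())

def get(J, l):
    # j(l): the abstract environment attached to label l
    return J.get(l, {})

def set_(J, l, r):
    # j[l <- r]: substitution at label l
    J2 = dict(J)
    J2[l] = r
    return J2

J1 = {1: {"x": {0}}, 2: {"x": set()}}
J2 = {1: {"x": {1}}}
J = lub(J1, J2)
assert J[1]["x"] == {0, 1}
assert leq(J1, J) and leq(J2, J)   # J is an upper bound of J1 and J2
```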
Intuitively, the locality and dependence properties express that the abstract postcondition of a command can only depend upon, and only affect, the abstract local invariants attached to labels within that command.

1 For programs P = S ;;, this will ultimately allow us to conclude that

α̈̇JPK(PostJPK)
= ⟨def. (103) of PostJPK⟩ α̈̇JPK(post[τ ? JPK])
= ⟨program syntax of Sec. 12.1 so that P = S ;;⟩ α̈̇JPK(post[τ ? JS ;;K])
= ⟨(96)⟩ α̈̇JPK(post[τ ? JSK])
⊑̈̇ ⟨induction hypothesis (112)⟩ APostJSK
= ⟨by letting APostJS ;;K ≜ APostJSK and P = S ;;⟩ APostJPK .

APostJPK = APostJSK is obviously monotonic by induction hypothesis, while (113) vacuously holds and (114) follows by equality. We go on by structural induction on C ∈ CmpJPK, starting from the basic cases.
2 Identity C = skip, where at P JCK = ` and after P JCK = `′.

α̈̇JPK(PostJskipK)
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B PostJskipK B γ̈JPK
= ⟨def. (103) of Post⟩ α̈JPK B post[τ ? JskipK] B γ̈JPK
= ⟨big-step operational semantics (92)⟩ α̈JPK B post[1Σ JPK ∪ τ JskipK] B γ̈JPK
= ⟨Galois connection (98) so that post preserves joins⟩ α̈JPK B (post[1Σ JPK ] ∪̇ post[τ JskipK]) B γ̈JPK
= ⟨Galois connection (106) so that α̈JPK preserves joins⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ JskipK] B γ̈JPK)
⊑̈ ⟨(100) and (106) so that α̈JPK B γ̈JPK is reductive⟩ 1ADomJPK ⊔̈̇ (α̈JPK B post[τ JskipK] B γ̈JPK)
= ⟨def. (107) of α̈⟩ 1ADomJPK ⊔̈̇ λJ • λl ∈ in P JPK• α̇({ρ | hl, ρi ∈ post[τ JskipK] B γ̈JPK(J )})
= ⟨def. (97) of post⟩ 1ADomJPK ⊔̈̇ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : hhl ′ , ρ ′ i, hl, ρii ∈ τ JskipK})
= ⟨def. (76) and (62) of τ JskipK⟩ 1ADomJPK ⊔̈̇ λJ • λl ∈ in P JPK• α̇({ρ | l = `′ ∧ h`, ρi ∈ γ̈JPK(J )})
= ⟨def. (108) of γ̈⟩ λJ • J ⊔̈ λl ∈ in P JPK•(l = `′ ? α̇({ρ | ρ ∈ γ̇ (J` )}) ¿ α̇(∅))
⊑̈̇ ⟨Galois connection (17) so that α̇ B γ̇ is reductive⟩ λJ • J ⊔̈ λl ∈ in P JPK•(l = `′ ? J` ¿ α̇(∅))
= ⟨Galois connection (17) so that α̇(∅) is the infimum⟩ λJ • λl ∈ in P JPK•(l = `′ ? J`′ ⊔̇ J` ¿ Jl )
= ⟨def. substitution⟩ λJ • J [`′ ← J`′ ⊔̇ J` ]
≜ ⟨by letting APostJskipK ≜ λJ • J [`′ ← J`′ ⊔̇ J` ]⟩ APostJskipK .
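The derived transfer function for skip simply joins the invariant at ` into the one at `′ and leaves every other label untouched. A sketch in our toy encoding (invariant maps as dicts from labels to sets of facts; labels are assumed parameters):

```python
# Illustrative encoding (ours) of the derived transfer function
# APost[skip] = lambda J: J[l2 <- J_l2 join J_l], where l = at_P[skip]
# and l2 = after_P[skip].

def apost_skip(l, l2):
    def transfer(J):
        J2 = dict(J)
        J2[l2] = J.get(l2, set()) | J.get(l, set())   # J_l2 join J_l
        return J2
    return transfer

f = apost_skip(1, 2)
J = {1: {"x>0"}, 2: set()}
assert f(J)[2] == {"x>0"}   # the invariant before skip flows to after it
assert f(J)[1] == {"x>0"}   # locality (113): label 1 is unchanged
```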
Monotony and the locality (113) and dependence (114) properties are trivial.

3 Assignment C = X := A, where at P JCK = ` and after P JCK = `′.

α̈̇JPK(PostJX := AK)
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B PostJX := AK B γ̈JPK
= ⟨def. (103) of Post⟩ α̈JPK B post[τ ? JX := AK] B γ̈JPK
= ⟨big-step operational semantics (92)⟩ α̈JPK B post[1Σ JPK ∪ τ JX := AK] B γ̈JPK
= ⟨Galois connection (98) so that post preserves joins⟩ α̈JPK B (post[1Σ JPK ] ∪̇ post[τ JX := AK]) B γ̈JPK
= ⟨Galois connection (106) so that α̈JPK preserves joins⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ JX := AK] B γ̈JPK)
⊑̈ ⟨(100) and (106) so that α̈JPK B γ̈JPK is reductive⟩ 1ADomJPK ⊔̈̇ (α̈JPK B post[τ JX := AK] B γ̈JPK)
= ⟨def. (107) of α̈⟩ 1ADomJPK ⊔̈̇ λJ • λl ∈ in P JPK• α̇({ρ | hl, ρi ∈ post[τ JX := AK] B γ̈JPK(J )})
= ⟨def. (97) of post⟩ 1ADomJPK ⊔̈̇ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : hhl ′ , ρ ′ i, hl, ρii ∈ τ JX := AK})
= ⟨def. (76) and (63) of τ JX := AK⟩ 1ADomJPK ⊔̈̇ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : l ′ = ` ∧ l = `′ ∧ ∃i ∈ I : ρ = ρ ′ [X ← i ] ∧ ρ ′ ` A ⇒ i })
= ⟨def. (108) of γ̈JPK⟩ λJ • J ⊔̈ λl ∈ in P JPK•(l = `′ ? α̇({ρ[X ← i ] | ρ ∈ γ̇ (J` ) ∧ i ∈ I ∧ ρ ` A ⇒ i }) ¿ α̇(∅))
⊑̈ ⟨Galois connection (17) so that α̇ is monotonic⟩ λJ • let V ⊇ {i | ∃ρ ∈ γ̇ (J` ) : ρ ` A ⇒ i } in let V ′ ⊇ V ∩ I in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ V ′ ≠ ∅ ? α̇({ρ[X ← i ] | ρ ∈ γ̇ (J` ) ∧ i ∈ V ′ }) ¿ α̇(∅))
⊑̈ ⟨V ′ = ∅ ⟹ V ∩ I = ∅ ⟹ V ⊆ E since V ⊆ E ∪ I and E ∩ I = ∅, whence V ⊄ E ⟹ V ′ ≠ ∅, together with the monotony of α̇⟩ λJ • let V ⊇ {i | ∃ρ ∈ γ̇ (J` ) : ρ ` A ⇒ i } in let V ′ ⊇ V ∩ I in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ V ⊄ E ? α̇({ρ[X ← i ] | ρ ∈ γ̇ (J` ) ∧ i ∈ V ′ }) ¿ α̇(∅))
⊑̈ ⟨Galois connection (9) so that γ B α is extensive, α is monotonic, V = γ (v) and V ′ = γ (v ′ )⟩ λJ • let v ⊒ α({i | ∃ρ ∈ γ̇ (J` ) : ρ ` A ⇒ i }) in let v ′ ⊒ α(γ (v) ∩ I) in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ γ (v) ⊄ E ? α̇({ρ[X ← i ] | ρ ∈ γ̇ (J` ) ∧ i ∈ γ (v ′ )}) ¿ α̇(∅))
⊑̈ ⟨Galois connection (9) so that α is monotonic whence α(X ∩ Y ) ⊑ α(X ) ⊓ α(Y ), α B γ is reductive, and def. (18) of α̇⟩ λJ • let v ⊒ α({i | ∃ρ ∈ γ̇ (J` ) : ρ ` A ⇒ i }) in let v ′ ⊒ v ⊓ α(I) in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ γ (v) ⊄ E ? λY ∈ V• α({ρ[X ← i ](Y) | ρ ∈ γ̇ (J` ) ∧ i ∈ γ (v ′ )}) ¿ α̇(∅))
⊑̈ ⟨def. ρ[X ← i ], γ̇ (J` ) = ∅ implies v = ⊥ whence γ (v) ⊆ E, def. (36) of ?F and f(x) ≜ γ (x) ⊆ E⟩ λJ • let v ⊒ α({i | ∃ρ ∈ γ̇ (J` ) : ρ ` A ⇒ i }) in let v ′ ⊒ v ⊓ ?F in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ ¬f(v) ? λY ∈ V•(Y = X ? α({i | i ∈ γ (v ′ )}) ¿ α({ρ(Y) | ρ ∈ γ̇ (J` )})) ¿ α̇(∅))
⊑̈ ⟨def. (28) of the forward collecting semantics Faexp of arithmetic expressions, Galois connection (9) so that α B γ is reductive, def. (19) of γ̇⟩ λJ • let v ⊒ α B FaexpJAK B γ̇ (J` ) in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ ¬f(v) ? λY ∈ V•(Y = X ? v ⊓ ?F ¿ α({ρ(Y) | ρ ∈ γ̇ (J` )})) ¿ α̇(∅))
⊑̈ ⟨Galois connection (9) so that α B γ is reductive, ⊥ = α(∅)⟩ λJ • let v ⊒ α B FaexpJAK B γ̇ (J` ) in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ ¬f(v) ? λY ∈ V•(Y = X ? v ⊓ ?F ¿ J` (Y)) ¿ ⊥)
⊑̈ ⟨def. (30) of α F , def. of J` [X ← v ⊓ ?F ]⟩ λJ • let v ⊒ α F (FaexpJAK)(J` ) in J ⊔̈ λl ∈ in P JPK•(l = `′ ∧ ¬f(v) ? J` [X ← v ⊓ ?F ] ¿ ⊥)
⊑̈ ⟨soundness (33) of the abstract interpretation Faexp F JAK of arithmetic expressions A, and def. of ⊔̈⟩ λJ • let v = Faexp F JAK(J` ) in λl ∈ in P JPK•(l = `′ ∧ ¬f(v) ? J`′ ⊔̇ J` [X ← v ⊓ ?F ] ¿ Jl )
= ⟨def. of J [`′ ← ·]⟩ λJ • let v = Faexp F JAK(J` ) in (¬f(v) ? J [`′ ← J`′ ⊔̇ J` [X ← v ⊓ ?F ]] ¿ J )
≜ ⟨by letting APostJX := AK ≜ λJ • let v = Faexp F JAK(J` ) in (f(v) ? J ¿ J [`′ ← J`′ ⊔̇ J` [X ← v ⊓ ?F ]])⟩ APostJX := AK .
Monotony is trivial by monotony of Faexp F JAK, and so are the locality (113) and dependence (114) properties by (57).

4 Sequence C1 ; . . . ; Cn , n > 0.

α̈̇JPK(PostJC1 ; . . . ; Cn K)
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B PostJC1 ; . . . ; Cn K B γ̈JPK
= ⟨def. (103) of Post⟩ α̈JPK B post[τ ? JC1 ; . . . ; Cn K] B γ̈JPK
= ⟨big-step operational semantics (95)⟩ α̈JPK B post[τ ? JC1 K B · · · B τ ? JCn K] B γ̈JPK
= ⟨distribution (99) of post over composition B⟩ α̈JPK B post[τ ? JCn K] B · · · B post[τ ? JC1 K] B γ̈JPK
⊑̈̇ ⟨monotony and Galois connection (109) so that γ̈JPK B α̈JPK is extensive⟩ α̈JPK B post[τ ? JCn K] B γ̈JPK B α̈JPK B · · · B γ̈JPK B α̈JPK B post[τ ? JC1 K] B γ̈JPK
= ⟨def. (110) of α̈̇JPK⟩ α̈̇JPK(post[τ ? JCn K]) B · · · B α̈̇JPK(post[τ ? JC1 K])
= ⟨def. (103) of Post⟩ α̈̇JPK(PostJCn K) B · · · B α̈̇JPK(PostJC1 K)
⊑̈̇ ⟨monotony and induction hypothesis (112)⟩ APostJCn K B · · · B APostJC1 K
≜ ⟨by letting APostJC1 ; . . . ; Cn K ≜ APostJCn K B · · · B APostJC1 K⟩ APostJC1 ; . . . ; Cn K .

Monotony follows from the induction hypothesis and the definition of APostJC1 ; . . . ; Cn K by composition of the monotonic functions APostJCi K, i = 1, . . . , n. The locality (113) and dependence (114) properties follow by induction hypothesis for the APostJCi K, i = 1, . . . , n, whence for APostJC1 ; . . . ; Cn K by definition (58) of in P JC1 ; . . . ; Cn K = ⋃_{i=1}^{n} in P JCi K.
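The sequence rule just derived composes the abstract transfer functions in execution order. A sketch in our encoding, with throwaway stand-ins for the APostJCi K:

```python
# Illustrative (our encoding): the sequence rule
# APost[C1;...;Cn] = APost[Cn] o ... o APost[C1]
# applies the transfer functions of C1, ..., Cn in execution order.
from functools import reduce

def apost_seq(transfers):
    # transfers = [APost[C1], ..., APost[Cn]], applied left to right
    return lambda J: reduce(lambda acc, f: f(acc), transfers, J)

# hypothetical stand-ins for APost[C1] and APost[C2] on a one-field invariant
inc = lambda J: {**J, "n": J["n"] + 1}
dbl = lambda J: {**J, "n": J["n"] * 2}
assert apost_seq([inc, dbl])({"n": 3})["n"] == 8   # (3 + 1) * 2
assert apost_seq([dbl, inc])({"n": 3})["n"] == 7   # 3 * 2 + 1
```

Composition order matters, as the two assertions show: the abstract interpreter must mirror the order fixed by the big-step semantics (95).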
5 Conditional C = if B then St else S f fi, where at P JCK = ` and after P JCK = `′.

5.1 By (93), we will need an over-approximation of

α̈JPK B post[τ B ] B γ̈JPK
= ⟨def. (107) of α̈JPK⟩ λJ • λl ∈ in P JPK• α̇({ρ | hl, ρi ∈ post[τ B ] B γ̈JPK(J )})
= ⟨def. (97) of post⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : hhl ′ , ρ ′ i, hl, ρii ∈ τ B })
= ⟨def. (93) of τ B ⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : l ′ = ` ∧ l = at P JSt K ∧ ρ = ρ ′ ∧ ρ ′ ` B ⇒ tt})
= ⟨def. (108) of γ̈JPK⟩ λJ • λl ∈ in P JPK•(l = at P JSt K ? α̇({ρ ∈ γ̇ (J` ) | ρ ` B ⇒ tt}) ¿ α̇(∅))
= ⟨def. (50) of the collecting semantics CbexpJBK of boolean expressions B, and ⊥̇ ≜ α̇(∅)⟩ λJ • λl ∈ in P JPK•(l = at P JSt K ? α̇ B CbexpJBK B γ̇ (J` ) ¿ ⊥̇)
= ⟨def. (52) of α̈⟩ λJ • λl ∈ in P JPK•(l = at P JSt K ? α̈(CbexpJBK)(J` ) ¿ ⊥̇)
⊑̈̇ ⟨soundness (53) of the abstract semantics AbexpJBK of boolean expressions⟩ λJ • λl ∈ in P JPK•(l = at P JSt K ? AbexpJBK(J` ) ¿ ⊥̇) .   (115)
It is now easy to derive an over-approximation of

α̈JPK B post[1Σ JPK ∪ τ B ] B γ̈JPK
= ⟨Galois connection (98) so that post preserves joins⟩ α̈JPK B (post[1Σ JPK ] ∪̇ post[τ B ]) B γ̈JPK
= ⟨Galois connection (106) so that α̈JPK preserves joins⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ B ] B γ̈JPK)
= ⟨by (100) post preserves identity⟩ (α̈JPK B γ̈JPK) ⊔̈ (α̈JPK B post[τ B ] B γ̈JPK)
⊑̈̇ ⟨by the Galois connection (106) so that α̈JPK B γ̈JPK is reductive, and previous lemma (115)⟩ λJ • J ⊔̇ λl ∈ in P JPK•(l = at P JSt K ? AbexpJBK(J` ) ¿ ⊥̇)
= λJ • λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl ) .   (116)

5.2 By (93), we will also need an over-approximation of
α̈JPK B post[τ t ] B γ̈JPK
= ⟨def. (107) of α̈JPK⟩ λJ • λl ∈ in P JPK• α̇({ρ | hl, ρi ∈ post[τ t ] B γ̈JPK(J )})
= ⟨def. (97) of post⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : hhl ′ , ρ ′ i, hl, ρii ∈ τ t })
= ⟨def. (93) of τ t ⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : l ′ = after P JSt K ∧ ρ ′ = ρ ∧ l = `′ })
= ⟨def. (108) of γ̈JPK⟩ λJ • λl ∈ in P JPK•(l = `′ ? α̇({ρ | ρ ∈ γ̇ (Jafter P JSt K )}) ¿ α̇(∅))
⊑̈̇ ⟨Galois connection (17) so that α̇ B γ̇ is reductive and ⊥̇ = α̇(∅)⟩ λJ • λl ∈ in P JPK•(l = `′ ? Jafter P JSt K ¿ ⊥̇) .   (117)
It is now easy to derive an over-approximation of

α̈JPK B post[1Σ JPK ∪ τ t ] B γ̈JPK
= ⟨Galois connection (98) so that post preserves joins⟩ α̈JPK B (post[1Σ JPK ] ∪̇ post[τ t ]) B γ̈JPK
= ⟨Galois connection (106) so that α̈JPK preserves joins⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ t ] B γ̈JPK)
= ⟨by (100) post preserves identity⟩ (α̈JPK B γ̈JPK) ⊔̈ (α̈JPK B post[τ t ] B γ̈JPK)
⊑̈̇ ⟨by the Galois connection (106) so that α̈JPK B γ̈JPK is reductive, and previous lemma (117)⟩ λJ • J ⊔̇ λl ∈ in P JPK•(l = `′ ? Jafter P JSt K ¿ ⊥̇)
= ⟨⊥̇ is the infimum⟩ λJ • λl ∈ in P JPK•(l = `′ ? J`′ ⊔̇ Jafter P JSt K ¿ Jl ) .   (118)

5.3 By (93), for the then branch of the conditional, we will need an over-approximation of

α̈JPK B post[(1Σ JPK ∪ τ B ) B τ ? JSt K B (1Σ JPK ∪ τ t )] B γ̈JPK
= ⟨distribution (99) of post over B⟩ α̈JPK B post[1Σ JPK ∪ τ t ] B post[τ ? JSt K] B post[1Σ JPK ∪ τ B ] B γ̈JPK
⊑̈̇ ⟨Galois connection (106) so that γ̈JPK B α̈JPK is extensive, and monotony⟩ α̈JPK B post[1Σ JPK ∪ τ t ] B γ̈JPK B α̈JPK B post[τ ? JSt K] B γ̈JPK B α̈JPK B post[1Σ JPK ∪ τ B ] B γ̈JPK
⊑̈̇ ⟨lemma (116) and monotony⟩ α̈JPK B post[1Σ JPK ∪ τ t ] B γ̈JPK B α̈JPK B post[τ ? JSt K] B γ̈JPK B λJ • λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl )
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B post[1Σ JPK ∪ τ t ] B γ̈JPK B α̈̇JPK(post[τ ? JSt K]) B λJ • λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl )
= ⟨def. (103) of Post⟩ α̈JPK B post[1Σ JPK ∪ τ t ] B γ̈JPK B α̈̇JPK(PostJSt K) B λJ • λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl )
⊑̈̇ ⟨induction hypothesis (112) and monotony⟩ α̈JPK B post[1Σ JPK ∪ τ t ] B γ̈JPK B APostJSt K B λJ • λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl )
⊑̈̇ ⟨lemma (118)⟩ λJ • λl ∈ in P JPK•(l = `′ ? J`′ ⊔̇ Jafter P JSt K ¿ Jl ) B APostJSt K B λJ • λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl )
= ⟨def. of the let construct⟩
λJ • let J t′ = λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl ) in
  let J t″ = APostJSt K(J t′ ) in
  λl ∈ in P JPK•(l = `′ ? (J t″ )`′ ⊔̇ (J t″ )after P JSt K ¿ (J t″ )l ) .   (119)
Observe that monotony follows by induction hypothesis, and the locality (113) and dependence (114) properties by induction hypothesis and the labelling condition (59).

5.4 Since the case of the else branch of the conditional is similar to (5.3), we can now come back to the calculational design of APostJif B then St else S f fiK as an upper approximation of

α̈̇JPK(PostJif B then St else S f fiK)
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B PostJif B then St else S f fiK B γ̈JPK
= ⟨def. (103) of Post⟩ α̈JPK B post[τ ? Jif B then St else S f fiK] B γ̈JPK
= ⟨big-step operational semantics (93)⟩ α̈JPK B post[(1Σ JPK ∪ τ B ) B τ ? JSt K B (1Σ JPK ∪ τ t ) ∪ (1Σ JPK ∪ τ B̄ ) B τ ? JS f K B (1Σ JPK ∪ τ f )] B γ̈JPK
= ⟨Galois connection (98) so that post preserves joins⟩ α̈JPK B (post[(1Σ JPK ∪ τ B ) B τ ? JSt K B (1Σ JPK ∪ τ t )] ∪̇ post[(1Σ JPK ∪ τ B̄ ) B τ ? JS f K B (1Σ JPK ∪ τ f )]) B γ̈JPK
= ⟨Galois connection (106) so that α̈JPK preserves joins⟩ (α̈JPK B post[(1Σ JPK ∪ τ B ) B τ ? JSt K B (1Σ JPK ∪ τ t )] B γ̈JPK) ⊔̈̇ (α̈JPK B post[(1Σ JPK ∪ τ B̄ ) B τ ? JS f K B (1Σ JPK ∪ τ f )] B γ̈JPK)
⊑̈̇ ⟨lemma (5.3) and the similar one for the else branch⟩
λJ • let J t′ = λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl ) in
  let J t″ = APostJSt K(J t′ ) in
  λl ∈ in P JPK•(l = `′ ? (J t″ )`′ ⊔̇ (J t″ )after P JSt K ¿ (J t″ )l )
⊔̈ λJ • let J f ′ = λl ∈ in P JPK•(l = at P JS f K ? Jat P JS f K ⊔̇ AbexpJT (¬B)K(J` ) ¿ Jl ) in
  let J f ″ = APostJS f K(J f ′ ) in
  λl ∈ in P JPK•(l = `′ ? (J f ″ )`′ ⊔̇ (J f ″ )after P JS f K ¿ (J f ″ )l )   (120)
= ⟨by grouping similar terms⟩
λJ • let J t′ = λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` ) ¿ Jl )
  and J f ′ = λl ∈ in P JPK•(l = at P JS f K ? Jat P JS f K ⊔̇ AbexpJT (¬B)K(J` ) ¿ Jl ) in
  let J t″ = APostJSt K(J t′ )
  and J f ″ = APostJS f K(J f ′ ) in
  λl ∈ in P JPK•(l = `′ ? (J t″ )`′ ⊔̇ (J t″ )after P JSt K ⊔̇ (J f ″ )`′ ⊔̇ (J f ″ )after P JS f K ¿ (J t″ )l ⊔̇ (J f ″ )l )
= ⟨by locality (113) and the labelling scheme (59), so that in particular (J t′ )`′ = (J t″ )`′ = (J f ′ )`′ = (J f ″ )`′ = J`′ and APostJSt K and APostJS f K do not interfere⟩
λJ • let J ′ = λl ∈ in P JPK•(l = at P JSt K ? Jat P JSt K ⊔̇ AbexpJBK(J` )
  | l = at P JS f K ? Jat P JS f K ⊔̇ AbexpJT (¬B)K(J` )
  ¿ Jl ) in
  let J ″ = APostJSt K B APostJS f K(J ′ ) in
  λl ∈ in P JPK•(l = `′ ? (J ″ )`′ ⊔̇ (J ″ )after P JSt K ⊔̇ (J ″ )after P JS f K ¿ (J ″ )l )
= ⟨by letting APostJif B then St else S f fiK be defined as in (129)⟩ APostJif B then St else S f fiK .

6 Iteration C = while B do S od, where ` = at P JCK and `′ = after P JCK.
6.1 By (94), we will need an over-approximation of

α̈̇JPK(post[τ B ])
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B post[τ B ] B γ̈JPK
= ⟨def. (107) of α̈JPK⟩ λJ • λl ∈ in P JPK• α̇({ρ | hl, ρi ∈ post[τ B ] B γ̈JPK(J )})
= ⟨def. (97) of post⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : hhl ′ , ρ ′ i, hl, ρii ∈ τ B })
= ⟨def. (94) of τ B ⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : l ′ = ` ∧ l = at P JSK ∧ ρ = ρ ′ ∧ ρ ′ ` B ⇒ tt})
= ⟨def. (108) of γ̈JPK⟩ λJ • λl ∈ in P JPK•(l = at P JSK ? α̇({ρ ∈ γ̇ (J` ) | ρ ` B ⇒ tt}) ¿ α̇(∅))
= ⟨def. (50) of the collecting semantics CbexpJBK of boolean expressions B, and ⊥̇ ≜ α̇(∅)⟩ λJ • λl ∈ in P JPK•(l = at P JSK ? α̇ B CbexpJBK B γ̇ (J` ) ¿ ⊥̇)
= ⟨def. (52) of α̈⟩ λJ • λl ∈ in P JPK•(l = at P JSK ? α̈(CbexpJBK)(J` ) ¿ ⊥̇)
⊑̈̇ ⟨soundness (53) of the abstract semantics AbexpJBK of boolean expressions⟩ λJ • λl ∈ in P JPK•(l = at P JSK ? AbexpJBK(J` ) ¿ ⊥̇)
≜ ⟨by introducing the APost B JCK notation, where C = while B do S od⟩ APost B JCK .   (121)

6.2 Similarly (` = at P JCK),

α̈̇JPK(post[τ B̄ ])
= λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : hhl ′ , ρ ′ i, hl, ρii ∈ τ B̄ })
= ⟨def. (94) of τ B̄ ⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : l ′ = ` ∧ l = after P JCK ∧ ρ = ρ ′ ∧ ρ ′ ` T (¬B) ⇒ tt})
⊑̈̇ λJ • λl ∈ in P JPK•(l = after P JCK ? AbexpJT (¬B)K(J` ) ¿ ⊥̇)
≜ ⟨by introducing the APost B̄ JCK notation, where C = while B do S od⟩ APost B̄ JCK .   (122)
6.3 By (94), we will also need an over-approximation of (` = at P JCK)

α̈̇JPK(post[τ R ])
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B post[τ R ] B γ̈JPK
= ⟨def. (107) of α̈JPK⟩ λJ • λl ∈ in P JPK• α̇({ρ | hl, ρi ∈ post[τ R ] B γ̈JPK(J )})
= ⟨def. (97) of post⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : hhl ′ , ρ ′ i, hl, ρii ∈ τ R })
= ⟨def. (94) of τ R ⟩ λJ • λl ∈ in P JPK• α̇({ρ | ∃hl ′ , ρ ′ i ∈ γ̈JPK(J ) : l ′ = after P JSK ∧ ρ ′ = ρ ∧ l = `})
= ⟨def. (108) of γ̈JPK⟩ λJ • λl ∈ in P JPK•(l = ` ? α̇({ρ | ρ ∈ γ̇ (Jafter P JSK )}) ¿ α̇(∅))
⊑̈̇ ⟨Galois connection (17) so that α̇ B γ̇ is reductive and ⊥̇ = α̇(∅)⟩ λJ • λl ∈ in P JPK•(l = ` ? Jafter P JSK ¿ ⊥̇)
≜ ⟨by introducing the APost R JCK notation, where C = while B do S od⟩ APost R JCK .   (123)
6.4 For the loop entry, we will need an over-approximation of

α̈̇JPK(post[1Σ JPK ∪ τ B B τ ? JSK ∪ τ B̄ ])
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B post[1Σ JPK ∪ τ B B τ ? JSK ∪ τ B̄ ] B γ̈JPK
= ⟨Galois connection (98) so that post preserves joins⟩ α̈JPK B (post[1Σ JPK ] ∪̇ post[τ B B τ ? JSK] ∪̇ post[τ B̄ ]) B γ̈JPK
= ⟨by (99) post distributes over B⟩ α̈JPK B (post[1Σ JPK ] ∪̇ (post[τ ? JSK] B post[τ B ]) ∪̇ post[τ B̄ ]) B γ̈JPK
= ⟨Galois connection (106) so that α̈JPK preserves joins⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ ? JSK] B post[τ B ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ B̄ ] B γ̈JPK)
⊑̈̇ ⟨Galois connection (106) so that γ̈JPK B α̈JPK is extensive, and monotony⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ ? JSK] B γ̈JPK B α̈JPK B post[τ B ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ B̄ ] B γ̈JPK)
⊑̈̇ ⟨(100) so that post[1Σ JPK ] is the identity, Galois connection (106) so that α̈JPK B γ̈JPK is reductive, def. (103) of PostJSK, def. (110) of α̈̇JPK⟩ 1ADomJPK ⊔̈̇ (α̈̇JPK(PostJSK) B α̈̇JPK(post[τ B ])) ⊔̈̇ α̈̇JPK(post[τ B̄ ])
⊑̈̇ ⟨lemma (121), lemma (122), induction hypothesis (112) and monotony⟩ 1ADomJPK ⊔̈̇ (APostJSK B APost B JCK) ⊔̈̇ APost B̄ JCK .   (124)

6.5 For the loop exit, we will need an over-approximation of

α̈̇JPK(post[1Σ JPK ∪ τ ? JSK B τ R ])
= ⟨def. (110) of α̈̇JPK⟩ α̈JPK B post[1Σ JPK ∪ τ ? JSK B τ R ] B γ̈JPK
= ⟨Galois connection (98) so that post preserves joins⟩ α̈JPK B (post[1Σ JPK ] ∪̇ post[τ ? JSK B τ R ]) B γ̈JPK
= ⟨by (99) post distributes over B⟩ α̈JPK B (post[1Σ JPK ] ∪̇ (post[τ ? JSK] B post[τ R ])) B γ̈JPK
= ⟨Galois connection (106) so that α̈JPK preserves joins⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ ? JSK] B post[τ R ] B γ̈JPK)
⊑̈̇ ⟨Galois connection (106) so that γ̈JPK B α̈JPK is extensive, and monotony⟩ (α̈JPK B post[1Σ JPK ] B γ̈JPK) ⊔̈ (α̈JPK B post[τ ? JSK] B γ̈JPK B α̈JPK B post[τ R ] B γ̈JPK)
⊑̈̇ ⟨(100) so that post[1Σ JPK ] is the identity, Galois connection (106) so that α̈JPK B γ̈JPK is reductive, def. (103) of PostJSK, def. (110) of α̈̇JPK⟩ 1ADomJPK ⊔̈̇ (α̈̇JPK(PostJSK) B α̈̇JPK(post[τ R ]))
⊑̈̇ ⟨lemma (123), induction hypothesis (112) and monotony⟩ 1ADomJPK ⊔̈̇ (APostJSK B APost R JCK) .   (125)
Observe that in all cases (121), (122), (123), (124) and (125), monotony follows by induction hypothesis, and the locality (113) and dependence (114) properties by induction hypothesis and the labelling condition (60).

6.6 By (94), we will also need an over-approximation of
B ˙¨ αJPK(post[(τ B τ ? JSK B τ R )? ] ) = HChurch λ-notationI ˙¨ • post[(τ B B τ ? JSK B τ R )? ] In) αJPK(λIn ˙¨ = Hdef. (110) of αJPKI αJPK ¨ B λIn• post[(τ B B τ ? JSK B τ R )? ] In B γ¨ JPK = Hfixpoint characterization (101)I ⊆ αJPK ¨ B λIn• lfp λX • In ∪ post[τ B B τ ? JSK B τ R ] X B γ¨ JPK = Hdef. application and composition BI ⊆ λJ • αJPK(lfp ¨ λX • γ¨ JPK(J ) ∪ post[τ B B τ ? JSK B τ R ] X )
In order to apply Th. 3, we compute = = =
= = = = ˙¨ v
αJPK ¨ B λX • γ¨ JPK(J ) ∪ post[τ B B τ ? JSK B τ R ] X B γ¨ JPK HChurch λ-notationI • λX αJPK( ¨ γ¨ JPK(J ) ∪ post[τ B B τ ? JSK B τ R ](γ¨ JPK(X ))) HGalois connection (106) so that αJPK ¨ preserves joinsI B ? ¨ γ¨ JPK(J )) t¨ αJPK(post[τ ¨ B τ JSK B τ R ](γ¨ JPK(X ))) λX • αJPK( HGalois connection (106) so that αJPK ¨ B γ¨ JPK is reductive and by (99) post distributes over BI λX • J t˙¨ αJPK ¨ B post[τ R ] B post[τ ? JSK] B post[τ B ] B γ¨ JPK HGalois connection (106) so that γ¨ JPK B αJPK ¨ is extensive and monotonyI λX • J t˙¨ αJPK ¨ B post[τ ? JSK] B γ¨ JPK B αJPK ¨ B post[τ B ] B γ¨ JPK ¨ B post[τ R ] B γ¨ JPK B αJPK ˙¨ Hdef. (110) of αJPKI B ˙ ˙¨ λX • J t¨ αJPK ¨ B post[τ R ] B γ¨ JPK B αJPK ¨ B post[τ ? JSK] B γ¨ JPK B αJPK(post[τ ]) Hlemma (121) and monotonyI • λX J t˙¨ αJPK ¨ B post[τ ? JSK] B γ¨ JPK B APost B JCK ¨ B post[τ R ] B γ¨ JPK B αJPK ˙¨ Hdef. (110) of αJPK and def. (103) of PostI R ˙ ˙¨ λX • J t¨ αJPK ¨ B post[τ ] B γ¨ JPK B αJPK(PostJSK) B APost B JCK ˙¨ Hinduction hypothesis (112) and monotony, def. (110) of αJPKI R B ˙ ¨ B post[τ ] B γ¨ JPK B APostJSK B APost JCK λX • J t¨ αJPK 72
˙¨ Hdef. (110) of αJPKI R ˙ ˙ ¨ ]) B APostJSK B APost B JCK λX • J t¨ αJPK(post[τ = Hlemma (123)I • λX J t˙¨ APost R JCK B APostJSK B APost B JCK
=
so that we conclude B ˙¨ αJPK(post[(τ B τ ? JSK B τ R )? ] ) ⊆ ¨ λX • γ¨ JPK(J ) ∪ post[τ B B τ ? JSK B τ R ] X ) = λJ • αJPK(lfp = HTh. 3I ¨ v = λJ • lfp λX • J t¨ APost R JCK B APostJSK B APost B JCK(X )
(126)
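To make the use of the fixpoint approximation theorem concrete, here is a self-contained miniature instance; the operator \(F\) and the coarse sign abstraction below are illustrative assumptions, not objects of the notes:

\[
F(X) = \{0\} \cup \{x+1 \mid x \in X\} \quad\text{on } \langle\wp(\mathbb{Z}),\subseteq\rangle,
\qquad
F^\sharp(s) = \mathsf{POSZ} \sqcup s \quad\text{on the signs.}
\]
With \(\alpha(X) = \mathsf{BOT}\) if \(X = \emptyset\), \(\mathsf{POSZ}\) if \(\emptyset \neq X \subseteq [0,+\infty)\), and \(\mathsf{TOP}\) otherwise, one checks the commutation condition \(\alpha \circ F \mathrel{\dot\sqsubseteq} F^\sharp \circ \alpha\) (e.g. \(X \subseteq [0,+\infty)\) implies \(F(X) \subseteq [0,+\infty)\)), so the fixpoint transfer yields
\[
\alpha(\mathrm{lfp}^{\subseteq} F) = \alpha(\mathbb{N}) = \mathsf{POSZ} \sqsubseteq \mathrm{lfp}^{\sqsubseteq} F^\sharp = \mathsf{POSZ},
\]
i.e. the purely abstract fixpoint computation proves that all concretely reachable values are nonnegative.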
Monotony follows by taking the least fixpoint of a functional which, by induction hypothesis, is monotonic. The locality (113) and dependence (114) properties can be proved by induction hypothesis and the labelling condition (60) for all fixpoint iterates and are preserved by lubs, whence when passing to the limit.

6.7 We can now come back to the calculational design of APost⟦while B do S od⟧ as an upper approximation of

α̈˙⟦P⟧(Post⟦while B do S od⟧)
= ⟨def. (110) of α̈˙⟦P⟧⟩
α̈⟦P⟧ ∘ Post⟦while B do S od⟧ ∘ γ̈⟦P⟧
= ⟨def. (103) of Post⟩
α̈⟦P⟧ ∘ post[τ*⟦while B do S od⟧] ∘ γ̈⟦P⟧
= ⟨big-step operational semantics (94) of the iteration⟩
α̈⟦P⟧ ∘ post[(1_{6⟦P⟧} ∪ τ*⟦S⟧ ∘ τ^R) ∘ (τ^B ∘ τ*⟦S⟧ ∘ τ^R)* ∘ (1_{6⟦P⟧} ∪ τ^B ∘ τ*⟦S⟧ ∪ τ^B̄)] ∘ γ̈⟦P⟧
= ⟨distribution (99) of post over ∘⟩
α̈⟦P⟧ ∘ post[1_{6⟦P⟧} ∪ τ^B ∘ τ*⟦S⟧ ∪ τ^B̄] ∘ post[(τ^B ∘ τ*⟦S⟧ ∘ τ^R)*] ∘ post[1_{6⟦P⟧} ∪ τ*⟦S⟧ ∘ τ^R] ∘ γ̈⟦P⟧
⊑̈˙ ⟨Galois connection (106) so that γ̈⟦P⟧ ∘ α̈⟦P⟧ is extensive and monotony⟩
α̈⟦P⟧ ∘ post[1_{6⟦P⟧} ∪ τ^B ∘ τ*⟦S⟧ ∪ τ^B̄] ∘ γ̈⟦P⟧ ∘ α̈⟦P⟧ ∘ post[(τ^B ∘ τ*⟦S⟧ ∘ τ^R)*] ∘ γ̈⟦P⟧ ∘ α̈⟦P⟧ ∘ post[1_{6⟦P⟧} ∪ τ*⟦S⟧ ∘ τ^R] ∘ γ̈⟦P⟧
⊑̈˙ ⟨lemmata (124), (126), (125) and monotony⟩
(1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^B⟦C⟧) ⊔̈˙ APost^B̄⟦C⟧) ∘
    λJ• lfp^⊑̈ λX• J ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(X) ∘
    (1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^R⟦C⟧))
= APost⟦while B do S od⟧ .
In conclusion, the calculational design of the generic forward nonrelational abstract semantics of programs leads to the functional and compositional characterization given in Fig. 14. In order to effectively compute an over-approximation of the set post[τ*⟦P⟧] In of states which are reachable by repeated small steps of the program P from some given set In of initial states, we use an over-approximation I of the initial states

α̈⟦P⟧(In) ⊑̈ I .    (133)
• APost⟦skip⟧ = λJ• J[after_P⟦skip⟧ ← J_{after_P⟦skip⟧} ⊔̇ J_{at_P⟦skip⟧}]    (127)

• APost⟦X := A⟧ = λJ• let ℓ = at_P⟦X := A⟧, ℓ′ = after_P⟦X := A⟧ in
      let v = Faexp^F⟦A⟧(J_ℓ) in
      (f(v) ? J ¿ J[ℓ′ ← J_{ℓ′} ⊔̇ J_ℓ[X ← v ⊓ ?^F]])
  where: ∀v ∈ L : f(v) ⟹ γ(v) ⊆ E    (128)

• C = if B then St else Sf fi,
  APost⟦C⟧ = λJ• let J′ = J[at_P⟦St⟧ ← J_{at_P⟦St⟧} ⊔̇ Abexp⟦B⟧(J_{at_P⟦C⟧});
                         at_P⟦Sf⟧ ← J_{at_P⟦Sf⟧} ⊔̇ Abexp⟦T(¬B)⟧(J_{at_P⟦C⟧})] in
      let J″ = APost⟦St⟧ ∘ APost⟦Sf⟧(J′) in
      J″[after_P⟦C⟧ ← J″_{after_P⟦C⟧} ⊔̇ J″_{after_P⟦St⟧} ⊔̇ J″_{after_P⟦Sf⟧}]    (129)

• C = while B do S od,
  APost⟦C⟧ = (1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^B⟦C⟧) ⊔̈˙ APost^B̄⟦C⟧) ∘
      λJ• lfp^⊑̈ λX• J ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(X) ∘
      (1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^R⟦C⟧))
  where:
      APost^B⟦C⟧ = λJ• ⊥̈[at_P⟦S⟧ ← Abexp⟦B⟧J_{at_P⟦C⟧}]
      APost^B̄⟦C⟧ = λJ• ⊥̈[after_P⟦C⟧ ← Abexp⟦T(¬B)⟧J_{at_P⟦C⟧}]
      APost^R⟦C⟧ = λJ• ⊥̈[at_P⟦C⟧ ← J_{after_P⟦S⟧}]    (130)

• APost⟦C1; …; Cn⟧ = APost⟦Cn⟧ ∘ … ∘ APost⟦C1⟧    (131)

• APost⟦S ;;⟧ = APost⟦S⟧ .    (132)
Figure 14: Generic forward nonrelational reachability abstract semantics of programs

and compute APost⟦P⟧I. Elements of ADom⟦P⟧ must therefore be machine representable, which is obviously the case if the lattice L of abstract value properties is itself machine representable. Moreover, the computation of APost⟦P⟧I terminates if the complete lattice ⟨L, ⊑⟩ satisfies the ascending chain condition. Otherwise convergence must be accelerated using widening/narrowing techniques⁹. The soundness of the approach is easily established:

APost⟦P⟧I
⊒̈ ⟨soundness (111)⟩
α̈˙⟦P⟧(Post⟦P⟧)I
⊒̈ ⟨abstraction (133) of the entry condition and monotony⟩
α̈˙⟦P⟧(Post⟦P⟧)(α̈⟦P⟧(In))
= ⟨def. (110)⟩
α̈⟦P⟧ ∘ Post⟦P⟧ ∘ γ̈⟦P⟧ ∘ α̈⟦P⟧(In)
⊒̈ ⟨Galois connection (106) so that γ̈⟦P⟧ ∘ α̈⟦P⟧ is extensive and monotony⟩
α̈⟦P⟧(Post⟦P⟧(In))

⁹ which were explained in the course but not, for short, in the notes, see [9, 16].
or equivalently, by the Galois connection (106),

post[τ*⟦P⟧] In ⊆ γ̈⟦P⟧(APost⟦P⟧I) .    (134)
Notice that the set γ̈⟦P⟧(APost⟦P⟧I) is usually infinite, so that its exploitation must be programmed using the encoding used for ADom⟦P⟧ (or some machine representable image).

13.6 The generic abstract interpreter for reachability analysis

The abstract syntax of commands is as follows:

type com =
  | SKIP of label * label
  | ASSIGN of label * variable * aexp * label
  | SEQ of label * (com list) * label
  | IF of label * bexp * bexp * com * com * label
  | WHILE of label * bexp * bexp * com * label
For a command C, the first label at P JCK (written (at C)) and the second after P JCK (written (after C)) satisfy the labelling conditions of Sec. 12.3. The boolean expression B of conditional and iteration commands is recorded by T (B) and T (¬(B)) as defined in Sec. 9.1. The signature of the generic abstract interpreter [7] is module type APost_signature = functor (L: Abstract_Lattice_Algebra_signature) -> functor (E: Abstract_Env_Algebra_signature) -> functor (D: Abstract_Dom_Algebra_signature) -> functor (Faexp: Faexp_signature) -> functor (Baexp: Baexp_signature) -> functor (Abexp: Abexp_signature) -> sig open Abstract_Syntax (* generic forward nonrelational abstract reachability semantics of *) (* commands *) val aPost : com -> D(L)(E).aDom -> D(L)(E).aDom end;;
Again the implementation is a prototype (in particular, global operations on abstract invariants do not take the locality (113) and dependence (114) properties into account, a program optimization which is well beyond the current compiler technology for functional languages).

module APost_implementation =
  functor (L: Abstract_Lattice_Algebra_signature) ->
  functor (E: Abstract_Env_Algebra_signature) ->
  functor (D: Abstract_Dom_Algebra_signature) ->
  functor (Faexp: Faexp_signature) ->
  functor (Baexp: Baexp_signature) ->
  functor (Abexp: Abexp_signature) ->
struct
  open Abstract_Syntax
  open Labels
  (* generic abstract environments *)
  module E' = E(L)
  (* generic abstract invariants *)
  module D' = D(L)(E)
  (* generic forward abstract interpretation of arithmetic operations *)
  module Faexp' = Faexp(L)(E)
  (* generic [reductive] abstract interpretation of boolean operations *)
  module Abexp' = Abexp(L)(E)(Faexp)(Baexp)
  (* iterative fixpoint computation *)
  module F = Fixpoint((D':Poset_signature with type element=D(L)(E).aDom))
  (* generic forward nonrelational abstract reachability semantics *)
  exception Error_aPost of string
  let rec aPost c j = match c with
  | (SKIP (l, l')) ->
      (D'.set j l' (E'.join (D'.get j l') (D'.get j l)))
  | (ASSIGN (l, x, a, l')) ->
      let v = (Faexp'.faexp a (D'.get j l)) in
      if (L.in_errors v) then j
      else (D'.set j l' (E'.join (D'.get j l')
                           (E'.set (D'.get j l) x (L.meet v (L.f_RANDOM ())))))
  | (SEQ (l, s, l')) -> (aPostseq s j)
  | (IF (l, b, nb, t, f, l')) ->
      let j' = (D'.set j (at t)
                  (E'.join (D'.get j (at t)) (Abexp'.abexp b (D'.get j l)))) in
      let j'' = (D'.set j' (at f)
                   (E'.join (D'.get j' (at f)) (Abexp'.abexp nb (D'.get j' l)))) in
      let j''' = (aPost t (aPost f j'')) in
      (D'.set j''' l' (E'.join (E'.join (D'.get j''' l')
                                  (D'.get j''' (after t)))
                         (D'.get j''' (after f))))
  | (WHILE (l, b, nb, c', l')) ->
      let aPostB j = (D'.set (D'.bot ()) (at c') (Abexp'.abexp b (D'.get j l))) in
      let aPostnotB j = (D'.set (D'.bot ()) l' (Abexp'.abexp nb (D'.get j l))) in
      let aPostR j = (D'.set (D'.bot ()) l (D'.get j (after c'))) in
      let j' = (D'.join j (aPost c' (aPostR j))) in
      let f x = (D'.join j' (aPostR (aPost c' (aPostB x)))) in
      let j'' = (F.lfp f (D'.bot ())) in
      (D'.join j'' (D'.join (aPost c' (aPostB j'')) (aPostnotB j'')))
  and aPostseq s j = match s with
  | [] -> raise (Error_aPost "empty sequence of commands")
  | [c] -> (aPost c j)
  | h::t -> (aPostseq t (aPost h j))
end;;
module APost = (APost_implementation:APost_signature);;
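The functor F relies on an iterative least-fixpoint computation whose implementation is not reproduced in these notes. The following is a minimal sketch of what F.lfp may compute (hypothetical code, ignoring the functor plumbing and widening): iterate from the infimum until two successive iterates coincide, which terminates when the abstract lattice satisfies the ascending chain condition.

```ocaml
(* Kleene iteration: x0 = bot, x(n+1) = f(xn), stopping at the first xn
   with f(xn) = xn.  Structural equality stands in for the poset's
   equality test. *)
let rec lfp f x =
  let x' = f x in
  if x' = x then x else lfp f x'

(* Example on the finite lattice of sorted integer lists ordered by
   inclusion: the values reachable by "start at 0, add 1, saturate at 3". *)
let step s = List.sort_uniq compare (0 :: List.map (fun x -> min (x + 1) 3) s)
let () = assert (lfp step [] = [0; 1; 2; 3])
```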
13.7 Abstract initial states

We are left with the problem of defining the set In of initial states. More generally, in the course we considered an assertion language allowing such nontrivial safety and liveness specifications. For short here, we consider the simple case when In is just the set Entry⟦P⟧ of program entry states (see (77)):

α̈⟦P⟧(Entry⟦P⟧)
= ⟨def. (107) of α̈⟦P⟧ and (77) of Entry⟦P⟧⟩
λℓ ∈ in_P⟦P⟧• α̇({λX ∈ Var⟦P⟧• i | ℓ = at_P⟦P⟧})
= ⟨def. (18) of α̇⟩
λℓ ∈ in_P⟦P⟧• (ℓ = at_P⟦P⟧ ? λX ∈ Var⟦P⟧• α({i}) ¿ α̇(∅))
= ⟨def. ⊥̈ = λℓ ∈ in_P⟦P⟧• ⊥̇, ⊥̇ = α̇(∅) and (16) of substitution⟩
⊥̈[at_P⟦P⟧ ← λX ∈ Var⟦P⟧• α({i})]
= ⟨by defining AEntry⟦P⟧ = ⊥̈[at_P⟦P⟧ ← λX ∈ Var⟦P⟧• α({i})]⟩
AEntry⟦P⟧ .    (135)

13.8 Implementation of the abstract entry states

The immediate translation is

module AEntry_implementation =
  functor (L: Abstract_Lattice_Algebra_signature) ->
  functor (E: Abstract_Env_Algebra_signature) ->
  functor (D: Abstract_Dom_Algebra_signature) ->
struct
  open Abstract_Syntax
  open Labels
  (* generic abstract environments *)
  module E' = E(L)
  (* generic abstract invariants *)
  module D' = D(L)(E)
  (* abstraction of entry states *)
  exception Error_aEntry of string
  let aEntry c =
    if (at c) <> (entry ()) then
      raise (Error_aEntry "not the program entry point")
    else (D'.set (D'.bot ()) (at c) (E'.initerr ()))
end;;
13.9 The reachability static analyzer The generic abstract interpreter APostJPK(AEntryJPK) can be partially instantiated with (or without) reductive iterations, as follows [7] module Analysis_Reductive_Iteration_implementation = functor (L: Abstract_Lattice_Algebra_signature) -> struct open Program_To_Abstract_Syntax module ENTRY = AEntry(L)(Abstract_Env_Algebra)(Abstract_Dom_Algebra) module POST = APost(L)(Abstract_Env_Algebra)(Abstract_Dom_Algebra)(Faexp) (Baexp_Reductive_Iteration)(Abexp_Reductive_Iteration) module PRN = Pretty_Print(L)(Abstract_Env_Algebra)(Abstract_Dom_Algebra) let analysis () = print_string "type the program to analyze..."; print_newline (); let p = abstract_syntax_of_program () in let j = (POST.aPost p (ENTRY.aEntry p)) in (PRN.pretty_print p j) end;;
and then to a particular value property abstract domain module ISS’ = Analysis_Reductive_Iteration(ISS_Lattice_Algebra);;
Three examples of initialization and simple-sign reachability analysis from the entry states are given below. The comparison of the first and second examples illustrates the loss of information due to the absence of an abstract value POSZ such that γ(POSZ) = [0, max_int] ∪ {a}. The third example shows the imprecision on reachability resulting from the choice to have γ(BOT) ≠ ∅.
[Hasse diagram of the lattice, with BOT at the bottom, TOP at the top, and the intermediate elements INE, ARE, ERR, NEG, ZERO, POS, NEGZ, NZERO, POSZ and INI ordered consistently with the concretizations below.]

γ(BOT)   = ∅
γ(INE)   = {i}
γ(ARE)   = {a}
γ(ERR)   = {i, a}
γ(NEG)   = [min_int, −1] ∪ {a}
γ(ZERO)  = {0, a}
γ(POS)   = [1, max_int] ∪ {a}
γ(NEGZ)  = [min_int, 0] ∪ {a}
γ(NZERO) = [min_int, −1] ∪ [1, max_int] ∪ {a}
γ(POSZ)  = [0, max_int] ∪ {a}
γ(INI)   = I ∪ {a}
γ(TOP)   = I ∪ {i, a}

Figure 15: The lattice of errors and signs
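For readers who wish to experiment, the concretizations of Fig. 15 can be spot-checked mechanically. The following toy OCaml encoding is hypothetical (finite integer range −2..2, string-coded lattice elements, and only a fragment of the join table); it verifies the soundness condition γ(a) ∪ γ(b) ⊆ γ(a ⊔ b) on a few pairs:

```ocaml
(* A toy check that the chosen joins over-approximate unions of
   concretizations.  I and A stand for the initialization and arithmetic
   error values i and a of Fig. 15. *)
type v = Int of int | I | A
let gamma = function
  | "BOT" -> []
  | "NEG" -> [Int (-2); Int (-1); A]
  | "ZERO" -> [Int 0; A]
  | "POS" -> [Int 1; Int 2; A]
  | "NEGZ" -> [Int (-2); Int (-1); Int 0; A]
  | "POSZ" -> [Int 0; Int 1; Int 2; A]
  | "NZERO" -> [Int (-2); Int (-1); Int 1; Int 2; A]
  | "INI" -> [Int (-2); Int (-1); Int 0; Int 1; Int 2; A]
  | _ -> [Int (-2); Int (-1); Int 0; Int 1; Int 2; I; A]   (* TOP *)
let join a b = match a, b with
  | "BOT", x | x, "BOT" -> x
  | "NEG", "ZERO" | "ZERO", "NEG" -> "NEGZ"
  | "POS", "ZERO" | "ZERO", "POS" -> "POSZ"
  | "NEG", "POS" | "POS", "NEG" -> "NZERO"
  | x, y when x = y -> x
  | _ -> "INI"   (* coarse but sound upper bound for remaining sign pairs *)
let subset xs ys = List.for_all (fun x -> List.mem x ys) xs
let sound a b =
  subset (gamma a) (gamma (join a b)) && subset (gamma b) (gamma (join a b))
let () =
  List.iter (fun (a, b) -> assert (sound a b))
    [ "NEG","POS"; "NEG","ZERO"; "POS","ZERO"; "BOT","NEG" ]
```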
{ n:ERR; i:ERR } n := ?; i := 1; { n:INI; i:POS } while (i < n) do { n:POS; i:POS } i := (i + 1) { n:POS; i:POS } od { n:INI; i:POS }
{ n:ERR; i:ERR } n := ?; i := 0; { n:INI; i:INI } while (i < n) do { n:INI; i:INI } i := (i + 1) { n:INI; i:INI } od { n:INI; i:INI }
{ x:ERR } x := (1 / 0); { x:BOT } skip; { x:BOT } x := 1 { x:POS }
Precision can be increased to solve these problems by using the lattice of errors and signs specified in Fig. 15, as shown below. { n:ERR; i:ERR } n := ?; i := 0; { n:INI; i:POSZ } while (i < n) do { n:POS; i:POSZ } i := (i + 1) { n:POS; i:POS } od { n:INI; i:POSZ }
{ x:ERR } x := (1 / 0); { x:BOT } skip; { x:BOT } x := 1 { x:BOT }
The next two examples (for which the gathered information is the same whether reductive iterations are used or not) show that the classical handling of arithmetic or boolean expressions using assignments of simple monomials to auxiliary variables (i1 in the example below) is less precise than the algorithm proposed in these notes. { x:ERR; y:ERR } x := 0; y := ?; { x:ZERO; y:INI } while (x = -y) do { x:ZERO; y:ZERO } skip { x:ZERO; y:ZERO } od { x:ZERO; y:INI }
{ x:ERR; y:ERR; i1:ERR } x := 0; y := ?; i1 := -y; { x:ZERO; y:INI; i1:INI } while (x = i1) do { x:ZERO; y:INI; i1:ZERO } skip; i1 := -y { x:ZERO; y:INI; i1:INI } od { x:ZERO; y:INI; i1:INI }
The same loss of precision due to the nonrelational abstraction (17) appears when boolean expressions are analyzed by compilation into intermediate short-circuit conditional code:

{ x:ERR; y:ERR; z:ERR }
x := 0; y := ?; z := ?;
{ x:ZERO; y:INI; z:INI }
if ((x=y)&((z+1)=x)&(y=z)) then
   { x:BOT; y:BOT; z:BOT }
   skip
else
   { x:ZERO; y:INI; z:INI }
   skip
fi
{ x:ZERO; y:INI; z:INI }

{ x:ERR; y:ERR; z:ERR }
x := 0; y := ?; z := ?;
{ x:ZERO; y:INI; z:INI }
if ((x=y)&((z+1)=x)) then
   { x:ZERO; y:ZERO; z:NEG }
   if (y=z) then
      { x:ZERO; y:BOT; z:BOT }
      skip
   else
      { x:ZERO; y:ZERO; z:NEG }
      skip
   fi
   { x:ZERO; y:ZERO; z:NEG }
else
   { x:ZERO; y:INI; z:INI }
   skip
fi
{ x:ZERO; y:INI; z:INI }
Similar examples can be provided for any nontrivial nonrelational abstract domain.

13.10 Specializing the abstract interpreter to reachability analysis from the entry states

As a very first step towards efficient analyzers, the abstract interpreter of Fig. 14 can be specialized for reachability analysis from program entry states [7]. We want to calculate APost⟦P⟧(AEntry⟦P⟧) and, more generally, for all program subcommands C ∈ Cmp⟦P⟧,

λr ∈ AEnv⟦P⟧• APost⟦C⟧(⊥̈[at_P⟦C⟧ ← r])

that is

APostEn_P⟦C⟧ = α^ε_P⟦C⟧(APost⟦C⟧)    (136)

where

α^ε_P⟦C⟧ = λF• λr• F(⊥̈[at_P⟦C⟧ ← r])    (137)

γ^ε_P⟦C⟧ = λf• λJ• ((∀l ≠ at_P⟦C⟧ : J_l = ⊥̇) ? f(J_{at_P⟦C⟧}) ¿ ⊤̈)    (138)
is such that

α^ε_P⟦C⟧F ⊑̈˙ f
⟺ ⟨def. of the pointwise ordering ⊑̈˙ and (137) of α^ε_P⟦C⟧⟩
∀r ∈ AEnv⟦P⟧ : F(⊥̈[at_P⟦C⟧ ← r]) ⊑̈ f(r)
⟺ ⟨for ⟹, by choosing r = J_{at_P⟦C⟧} and since ⊤̈ is the supremum, while for ⟸, by choosing J = ⊥̈[at_P⟦C⟧ ← r]⟩
∀J ∈ ADom⟦P⟧ : F(J) ⊑̈ ((∀l ≠ at_P⟦C⟧ : J_l = ⊥̇) ? f(J_{at_P⟦C⟧}) ¿ ⊤̈)
⟺ ⟨def. (138) of γ^ε_P⟦C⟧⟩
∀J ∈ ADom⟦P⟧ : F(J) ⊑̈ γ^ε_P⟦C⟧(f)J
⟺ ⟨def. of the pointwise ordering ⊑̈˙ (which is overloaded)⟩
F ⊑̈˙ γ^ε_P⟦C⟧(f)

whence (α^ε_P⟦C⟧, γ^ε_P⟦C⟧) is a Galois connection from ⟨ADom⟦P⟧ ⟼mon ADom⟦P⟧, ⊑̈˙⟩ to ⟨AEnv⟦P⟧ ⟼mon ADom⟦P⟧, ⊑̈˙⟩.
We consider the simple situation where γ(⊥) = ∅, for which (34), (39) and (54) do hold. It follows by structural and fixpoint induction that for all C ∈ Cmp⟦P⟧

APost⟦C⟧(⊥̈) = ⊥̈ .    (139)

We calculate APostEn_P⟦C⟧ by structural induction and trivially prove simultaneously

locality: ∀r ∈ AEnv⟦P⟧ : ∀l ∈ in⟦P⟧ − in_P⟦C⟧ : (APostEn_P⟦C⟧ r)_l = ⊥̇    (140)
extension: ∀r ∈ AEnv⟦P⟧ : r ⊑̇ (APostEn_P⟦C⟧ r)_{at_P⟦C⟧} .    (141)
1. Identity C = skip where at_P⟦C⟧ = ℓ and after_P⟦C⟧ = ℓ′:

APostEn_P⟦skip⟧
= ⟨def. (136) of APostEn_P and (137) of α^ε_P⟩
λr• APost⟦skip⟧(⊥̈[ℓ ← r])
= ⟨def. (127) of APost⟦skip⟧, labelling condition (56) and substitution (16)⟩
λr• ⊥̈[ℓ ← r; ℓ′ ← r] .

(140) and (141) hold because in_P⟦skip⟧ = {ℓ, ℓ′} and by reflexivity.
2. Assignment C = X := A where at_P⟦C⟧ = ℓ and after_P⟦C⟧ = ℓ′:

APostEn_P⟦X := A⟧
= ⟨def. (136) of APostEn and (137) of α^ε⟩
λr• APost⟦X := A⟧(⊥̈[ℓ ← r])
= ⟨def. (128) of APost⟦X := A⟧⟩
λr• let v = Faexp^F⟦A⟧((⊥̈[ℓ ← r])_ℓ) in
    (f(v) ? ⊥̈[ℓ ← r] ¿ (⊥̈[ℓ ← r])[ℓ′ ← (⊥̈[ℓ ← r])_{ℓ′} ⊔̇ (⊥̈[ℓ ← r])_ℓ[X ← v ⊓ ?^F]])
= ⟨labelling condition (56), ⊥̇ is the infimum and def. (16) of substitution⟩
λr• let v = Faexp^F⟦A⟧(r) in (f(v) ? ⊥̈[ℓ ← r] ¿ ⊥̈[ℓ ← r; ℓ′ ← r[X ← v ⊓ ?^F]]) .

(140) and (141) hold because in_P⟦X := A⟧ = {ℓ, ℓ′} and by reflexivity.
3. Conditional C = if B then St else Sf fi where at_P⟦C⟧ = ℓ and after_P⟦C⟧ = ℓ′:

APostEn_P⟦if B then St else Sf fi⟧
= ⟨def. (136) of APostEn and (137) of α^ε⟩
λr• APost⟦if B then St else Sf fi⟧(⊥̈[ℓ ← r])
= ⟨def. (129) of APost⟦if B then St else Sf fi⟧, splitting the two alternatives⟩
λr• let J^t′ = λl ∈ in_P⟦P⟧• (l = at_P⟦St⟧ ? (⊥̈[ℓ ← r])_{at_P⟦St⟧} ⊔̇ Abexp⟦B⟧((⊥̈[ℓ ← r])_ℓ) ¿ (⊥̈[ℓ ← r])_l) in
    let J^t″ = APost⟦St⟧(J^t′) in λl ∈ in_P⟦P⟧• (l = ℓ′ ? J^t″_{ℓ′} ⊔̇ J^t″_{after_P⟦St⟧} ¿ J^t″_l)

and symmetrically for the false alternative, with Abexp⟦T(¬B)⟧ and Sf in place of Abexp⟦B⟧ and St.

For the true alternative, by the labelling condition (59) so that ℓ ≠ at_P⟦St⟧ and def. (16) of substitution, we get

λr• let J^t′ = ⊥̈[ℓ ← r; at_P⟦St⟧ ← Abexp⟦B⟧r] in
    let J^t″ = APost⟦St⟧(J^t′) in λl ∈ in_P⟦P⟧• (l = ℓ′ ? J^t″_{ℓ′} ⊔̇ J^t″_{after_P⟦St⟧} ¿ J^t″_l)
= ⟨by the locality (113) and dependence (114) properties, the labelling conditions (56) so that ℓ ≠ ℓ′ and (59) so that ℓ, ℓ′ ∉ in_P⟦St⟧⟩
λr• let J^t = APost⟦St⟧(⊥̈[at_P⟦St⟧ ← Abexp⟦B⟧r]) in ⊥̈[ℓ ← r; ℓ′ ← J^t_{after_P⟦St⟧}] ⊔̈ J^t
= ⟨def. (136) of APostEn, (137) of α^ε and induction hypothesis⟩
λr• let J^t = APostEn_P⟦St⟧(Abexp⟦B⟧r) in ⊥̈[ℓ ← r; ℓ′ ← J^t_{after_P⟦St⟧}] ⊔̈ J^t ,
so that we get (147) by grouping with a similar result for the false alternative and using the labelling condition (59). (140) and (141) hold by induction hypothesis and (59).

4. Iteration C = while B do S od where at_P⟦C⟧ = ℓ, after_P⟦C⟧ = ℓ′ and ℓ1, ℓ2 ∈ in_P⟦S⟧: According to the definition (130) of APost⟦while B do S od⟧, we start by the calculation of

APost^R⟦C⟧(⊥̈[ℓ ← r])
= ⟨def. (130) of APost^R⟦C⟧⟩
⊥̈[ℓ ← (⊥̈[ℓ ← r])_{after_P⟦S⟧}]
= ⟨labelling condition (56) and def. (16) of substitution⟩
⊥̈[ℓ ← ⊥̇] = ⊥̈ .    (142)

It follows that

(1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^R⟦C⟧))(⊥̈[ℓ ← r])
= ⟨def. identity 1 and pointwise lub ⊔̈˙⟩
⊥̈[ℓ ← r] ⊔̈ APost⟦S⟧ ∘ APost^R⟦C⟧(⊥̈[ℓ ← r])
= ⟨(142), strictness (139) and ⊥̈ is the infimum⟩
⊥̈[ℓ ← r] .    (143)
For the fixpoint

(λJ• lfp^⊑̈ λX• J ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(X))(⊥̈[ℓ ← r])
= ⟨def. application⟩
lfp^⊑̈ λX• ⊥̈[ℓ ← r] ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(X) ,

let us define α′ = λx• ⊥̈[ℓ ← x] and γ′ = λJ• J_ℓ, so that we have the Galois connection (α′, γ′) between ⟨AEnv⟦P⟧, ⊑̇⟩ and ⟨ADom⟦P⟧, ⊑̈⟩. We have

⊥̈[ℓ ← r] ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(α′(x))
= ⟨def. α′⟩
⊥̈[ℓ ← r] ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(⊥̈[ℓ ← x])
= ⟨def. (130) of APost^B⟦C⟧⟩
⊥̈[ℓ ← r] ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧(⊥̈[at_P⟦S⟧ ← Abexp⟦B⟧x])
= ⟨def. (136) of APostEn, (137) of α^ε and induction hypothesis⟩
⊥̈[ℓ ← r] ⊔̈ APost^R⟦C⟧(APostEn_P⟦S⟧(Abexp⟦B⟧x))
= ⟨def. (130) of APost^R⟦C⟧⟩
⊥̈[ℓ ← r] ⊔̈ ⊥̈[ℓ ← (APostEn_P⟦S⟧(Abexp⟦B⟧x))_{after_P⟦S⟧}]
= ⟨def. pointwise union ⊔̈ and (16) of substitution⟩
⊥̈[ℓ ← r ⊔̇ (APostEn_P⟦S⟧(Abexp⟦B⟧x))_{after_P⟦S⟧}]
= ⟨def. α′⟩
α′(r ⊔̇ (APostEn_P⟦S⟧(Abexp⟦B⟧x))_{after_P⟦S⟧}) ,

so that by the fixpoint abstraction theorem 2, we get

lfp^⊑̈ λX• ⊥̈[ℓ ← r] ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(X)
= α′(lfp^⊑̇ λx• r ⊔̇ (APostEn_P⟦S⟧(Abexp⟦B⟧x))_{after_P⟦S⟧})
= ⟨def. α′⟩
⊥̈[ℓ ← lfp^⊑̇ λx• r ⊔̇ (APostEn_P⟦S⟧(Abexp⟦B⟧x))_{after_P⟦S⟧}] .    (144)
It remains to calculate

(1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^B⟦C⟧) ⊔̈˙ APost^B̄⟦C⟧)(⊥̈[ℓ ← r′])
= ⟨def. pointwise lub ⊔̈˙ and identity 1⟩
⊥̈[ℓ ← r′] ⊔̈ APost⟦S⟧ ∘ APost^B⟦C⟧(⊥̈[ℓ ← r′]) ⊔̈ APost^B̄⟦C⟧(⊥̈[ℓ ← r′])
= ⟨def. (130) of APost^B⟦C⟧ and APost^B̄⟦C⟧⟩
⊥̈[ℓ ← r′] ⊔̈ APost⟦S⟧(⊥̈[at_P⟦S⟧ ← Abexp⟦B⟧r′]) ⊔̈ ⊥̈[ℓ′ ← Abexp⟦T(¬B)⟧r′]
= ⟨labelling condition (56), def. (16) of substitution, def. (136) of APostEn, (137) of α^ε and induction hypothesis⟩
⊥̈[ℓ ← r′; ℓ′ ← Abexp⟦T(¬B)⟧r′] ⊔̈ APostEn_P⟦S⟧(Abexp⟦B⟧r′) .    (145)

It follows that for the iteration APostEn_P⟦C⟧, where C = while B do S od:

APostEn_P⟦C⟧
= ⟨def. (136) of APostEn and (137) of α^ε⟩
λr• APost⟦C⟧(⊥̈[ℓ ← r])
= ⟨def. (130) of APost⟦while B do S od⟧⟩
λr• (1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^B⟦C⟧) ⊔̈˙ APost^B̄⟦C⟧) ∘
    λJ• lfp^⊑̈ λX• J ⊔̈ APost^R⟦C⟧ ∘ APost⟦S⟧ ∘ APost^B⟦C⟧(X) ∘
    (1_{ADom⟦P⟧ ⟼mon ADom⟦P⟧} ⊔̈˙ (APost⟦S⟧ ∘ APost^R⟦C⟧))(⊥̈[ℓ ← r])
= ⟨lemmata (143), (144) and (145)⟩
λr• let r′ = lfp^⊑̇ λx• r ⊔̇ (APostEn_P⟦S⟧(Abexp⟦B⟧x))_{after_P⟦S⟧} in
    ⊥̈[ℓ ← r′; ℓ′ ← Abexp⟦T(¬B)⟧r′] ⊔̈ APostEn_P⟦S⟧(Abexp⟦B⟧r′) .

(140) and (141) hold by induction hypothesis, induction on the fixpoint iterates and (60).

5. For the sequence C1; …; Cn, n > 0, where ℓ = at_P⟦C1; …; Cn⟧ = at_P⟦C1⟧ and ℓ′ = after_P⟦C1; …; Cn⟧ = after_P⟦Cn⟧, we show that

APostEn_P⟦C1; …; Cn⟧r =    (146)
    let J^1 = APostEn_P⟦C1⟧r in
    let J^2 = J^1 ⊔̈ APostEn_P⟦C2⟧((J^1)_{at_P⟦C2⟧}) in
    …
    let J^n = J^{n−1} ⊔̈ APostEn_P⟦Cn⟧((J^{n−1})_{at_P⟦Cn⟧}) in
    J^n ,
as well as the locality (140) and extension (141) properties. We proceed by induction on n > 0. This is trivial for the basis n = 1. For the induction step n + 1, we have

APostEn_P⟦C1; …; Cn; Cn+1⟧
= ⟨def. (136) of APostEn and (137) of α^ε⟩
λr• APost⟦C1; …; Cn; Cn+1⟧(⊥̈[ℓ ← r])
= ⟨def. (131) of APost⟦C1; …; Cn; Cn+1⟧ and APost⟦C1; …; Cn⟧ and associativity of ∘⟩
λr• APost⟦Cn+1⟧ ∘ APost⟦C1; …; Cn⟧(⊥̈[ℓ ← r])
= ⟨def. (136) of APostEn, (137) of α^ε and induction hypothesis (146)⟩
λr• let J^1 = APostEn_P⟦C1⟧r in
    …
    let J^n = J^{n−1} ⊔̈ APostEn_P⟦Cn⟧((J^{n−1})_{at_P⟦Cn⟧}) in
    APost⟦Cn+1⟧J^n .

To conclude the induction step, it remains to calculate

APost⟦Cn+1⟧J^n
= ⟨locality property (140), labelling (58) so that after_P⟦Cn⟧ = at_P⟦Cn+1⟧ and in_P⟦Cn⟧ ∩ in_P⟦Cn+1⟧ = {at_P⟦Cn+1⟧}, locality (113) and dependence (114) properties⟩
λl ∈ in_P⟦C⟧• (l ∈ in_P⟦C1; …; Cn⟧ − {at_P⟦Cn+1⟧} ? (J^n)_l
               ¿ APost⟦Cn+1⟧(⊥̈[at_P⟦Cn+1⟧ ← (J^n)_{at_P⟦Cn+1⟧}])_l)
= ⟨def. (136) of APostEn, (137) of α^ε and structural induction⟩
λl ∈ in_P⟦C⟧• (l ∈ in_P⟦C1; …; Cn⟧ − {at_P⟦Cn+1⟧} ? (J^n)_l
               ¿ (APostEn_P⟦Cn+1⟧((J^n)_{at_P⟦Cn+1⟧}))_l)
= ⟨locality (140) and extension (141)⟩
J^n ⊔̈ APostEn_P⟦Cn+1⟧((J^n)_{at_P⟦Cn+1⟧}) ,

so that

APostEn_P⟦C1; …; Cn+1⟧r =
    let J^1 = APostEn_P⟦C1⟧r in
    …
    let J^{n+1} = J^n ⊔̈ APostEn_P⟦Cn+1⟧((J^n)_{at_P⟦Cn+1⟧}) in
    J^{n+1} .

(140) and (141) hold by induction hypothesis and (58).
6. Programs P = S ;;:

APostEn_P⟦S ;;⟧
= ⟨def. (136) of APostEn and (137) of α^ε⟩
λr• APost⟦S ;;⟧(⊥̈[at_P⟦P⟧ ← r])
= ⟨def. (132) of APost⟦S ;;⟧⟩
λr• APost⟦S⟧(⊥̈[at_P⟦P⟧ ← r])
= ⟨def. (136) of APostEn and (137) of α^ε⟩
APostEn_P⟦S⟧ .
The final specification is given in Fig. 16, from which programming is immediate [7]. Notice that the above calculation can be done directly on the program by partial evaluation [25] (although the present state of the art might not allow for a full automation of the program generation). The next step consists in avoiding useless copies of abstract invariants (deforestation).
• APostEn_P⟦skip⟧r = ⊥̈[at_P⟦skip⟧ ← r; after_P⟦skip⟧ ← r]

• APostEn_P⟦X := A⟧r = let v = Faexp^F⟦A⟧r in
      (f(v) ? ⊥̈[at_P⟦X := A⟧ ← r]
       ¿ ⊥̈[at_P⟦X := A⟧ ← r; after_P⟦X := A⟧ ← r[X ← v ⊓ ?^F]])

• C = if B then St else Sf fi,
  APostEn_P⟦C⟧r =    (147)
      let J^tt = APostEn_P⟦St⟧(Abexp⟦B⟧r) in
      let J^ff = APostEn_P⟦Sf⟧(Abexp⟦T(¬B)⟧r) in
      ⊥̈[at_P⟦C⟧ ← r; after_P⟦C⟧ ← J^tt_{after_P⟦St⟧} ⊔̇ J^ff_{after_P⟦Sf⟧}] ⊔̈ J^tt ⊔̈ J^ff

• C = while B do S od,
  APostEn_P⟦C⟧r =
      let r′ = lfp^⊑̇ λx• r ⊔̇ (APostEn_P⟦S⟧(Abexp⟦B⟧x))_{after_P⟦S⟧} in
      ⊥̈[at_P⟦C⟧ ← r′; after_P⟦C⟧ ← Abexp⟦T(¬B)⟧r′] ⊔̈ APostEn_P⟦S⟧(Abexp⟦B⟧r′)

• C = C1; …; Cn,
  APostEn_P⟦C⟧r =
      let J^1 = APostEn_P⟦C1⟧r in
      let J^2 = J^1 ⊔̈ APostEn_P⟦C2⟧((J^1)_{at_P⟦C2⟧}) in
      …
      let J^n = J^{n−1} ⊔̈ APostEn_P⟦Cn⟧((J^{n−1})_{at_P⟦Cn⟧}) in
      J^n

• APostEn_P⟦S ;;⟧ = APostEn_P⟦S⟧(λX ∈ Var⟦P⟧• α({i})) .

Figure 16: Generic forward nonrelational reachability from entry states abstract semantics of programs

By choosing to totally order the labels (such that in_P⟦C⟧ = {ℓ | at_P⟦C⟧ ≤ ℓ ≤ after_P⟦C⟧}) and the program variables, abstract invariants can be efficiently represented as matrices of abstract values. The locality (140) and dependence (equivalent to (114)) properties for APostEn_P⟦C⟧ yield an implementation where the abstract invariant is computed by assignments to a global array. For large programs, more efficient memory management strategies are necessary, which is facilitated by the observation that the only global pieces of information that need to be permanently memorized are the loop abstract invariants.
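As a sanity check of the iteration rule of Fig. 16, the loop invariant of the earlier example `n := ?; i := 0; while (i < n) do i := (i + 1) od` can be recomputed with a drastically simplified OCaml sketch. The assumptions are: one-variable abstract environments reduced to a four-element sign lattice, a test filter that does not refine, and a hypothetical postS standing for (APostEn⟦S⟧(…))_{after⟦S⟧}:

```ocaml
(* Loop invariant r' = lfp (fun x -> join r (postS (abexpB x))) as in the
   while rule of Fig. 16, on a toy lattice Bot ⊑ Zero, Pos ⊑ Posz. *)
type sign = Bot | Zero | Pos | Posz
let join a b = match a, b with
  | Bot, x | x, Bot -> x
  | Zero, Zero -> Zero
  | Pos, Pos -> Pos
  | _ -> Posz
let rec lfp f x = let x' = f x in if x' = x then x else lfp f x'
(* for  i := 0; while (i < n) do i := (i + 1) od : *)
let abexpB x = x                            (* i < n does not refine i's sign *)
let postS x = if x = Bot then Bot else Pos  (* i + 1 is positive when i >= 0 *)
let r = Zero                                (* entry environment: i = 0 *)
let invariant = lfp (fun x -> join r (postS (abexpB x))) Bot
let () = assert (invariant = Posz)
```

The computed invariant Posz for i agrees with the { n:INI; i:POSZ } loop invariant obtained above with the lattice of errors and signs.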
14. Conclusion

These notes cover in part the 1998 Marktoberdorf course on the “calculational design of semantics and static analyzers by abstract interpretation”. We have chosen to put the emphasis on the calculational design more than on the abstract interpretation theory and its possible applications to software reliability. The objective of these notes is to show clearly that the complete calculation-based development of program analyzers is possible, which is much more difficult to explain orally.

The programming language considered in the course was the same, except that the small-step operational semantics (Sec. 7., 9. and 12.) was defined using an ML-type based abstract syntax (indeed isomorphic to the grammar-based abstract syntax of Sec. 7.1, 9.1 and 12.1). We considered a hierarchy of semantics obtained by abstraction of an infinitary trace semantics expressed in fixpoint form (see [6]). The non-classical big-step operational semantics of Sec. 12.8 and the reachable states semantics of Sec. 12.10 are only two particular semantics in this rich hierarchy. The interest of this point of view is to rub out the dependence upon the particular standard semantics which is used as the initial specification of program behaviors, since all semantics are abstract interpretations of one another, hence are all part of the same lattice of abstract interpretations [13].

The Galois connection and widening/narrowing based abstract interpretation frameworks (including the combination and refinement of abstract algebras) were treated at length. Very few elements are given in these written notes (see complements in [14, 16, 17] at higher order and [15] for weaker frameworks not assuming the existence of best approximations). Finite abstract algebras like the initialization and simple sign domain of Sec. 5.3 are often not expressive enough in practice. One must resort to infinite abstract domains like the intervals considered in the course (see [8, 9]), which is the smallest abstract domain complete for determining the sign of addition [23]. With such infinite abstract domains, which do not satisfy the ascending chain condition, widenings/narrowings are needed for accelerating the convergence and improving the precision of fixpoint computations.

Being based on a particular abstract syntax and semantics, the recursive analyzer considered in these notes is dependent upon the language to be analyzed. This was avoided in the course, since the design of generic abstract interpreters was based on compositionally defined systems of equations, chaotic iterations and weak topological orderings.

The emphasis in these notes has been on the correctness of the design by calculus. The mechanized verification of this formal development using a proof assistant can be foreseen, with automatic extraction of a correct program from its correctness proof [30]. Unfortunately most proof assistants are presently still unstable, heavy if not rebarbative to use, and sometimes simply bugged.

The specification of the static analyzer which has been derived in these course notes is well adapted to the higher-order modular functional programming style. Further refinement steps would be necessary for efficiency. The problem of deriving very efficient analyzers which are both fast and memory sparing goes beyond classical compiler program optimization and partial evaluation techniques (as shown by the specialization to entry states in Sec. 13.10). This problem has not been considered in the course nor in these notes. A balance between correctness and efficiency might be found by developing both an efficient static analyzer (with expensive fixpoint computations, etc.) and a correct static verifier (which might be somewhat inefficient, performing a mere checking of the abstract invariant computed by the analyzer). Only the correctness of the verifier must be formally established, without particular concern for efficiency.

The main application of the program static analyzer considered in the course was abstract checking, as introduced¹⁰ in [5] and refined by [2]. The difference with abstract model-checking [19] is that the semantic model is not assumed to be finite, the abstraction is not specific to a particular program (see [16] for a proof that finite abstract domains are inadequate in this context) and specifications are not given using a temporal logic.
By experience, specifications separated from the program do not evolve with program modifications over large periods of time (10 to 20 years) and are unreadable for very large programs (over 100,000 lines). The solution proposed in the oral course was to insert safety/invariant and liveness/intermittent assertions¹⁰, together with final and initial assertions, in the program text. The analysis must then combine forward and backward abstract interpretations (only forward analyses were considered in these written notes, see e.g. [18] for this more general case and an explanation of why decreasing iterations are necessary in the context of infinite systems).

The final question is whether the calculational design of program static analyzers by abstract interpretation of a formal semantics does scale up. Experience shows that it does, by small parts. This provides a thorough understanding of the abstraction process, allowing for the later development of useful large-scale analyzers [27].

¹⁰ without name
Acknowledgments I thank Manfred Broy and the organizers of the International Summer School Marktoberdorf (Germany) on “Calculational System Design” for inviting me to the course and to write these notes. I thank Radhia Cousot and Roberto Giacobazzi for their comments on a draft.
References

[1] A.V. Aho, R. Sethi, and J.D. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, 1986. 19
[2] F. Bourdoncle. Abstract debugging of higher-order imperative languages. In Proc. PLDI, pp. 46–55. ACM Press, 1993. 85
[3] F. Bourdoncle. Efficient chaotic iteration strategies with widenings. In D. Bjørner, M. Broy, and I.V. Pottosin, editors, Proc. FMPA, Academgorodok, Novosibirsk, Russia, LNCS 735, pp. 128–141. Springer, Jun. 28–Jul. 2, 1993. 59
[4] P. Cousot. Méthodes itératives de construction et d’approximation de points fixes d’opérateurs monotones sur un treillis, analyse sémantique de programmes (Iterative methods for constructing and approximating fixed points of monotone operators on a lattice, semantic analysis of programs). Thèse d’État ès sciences mathématiques, Université scientifique et médicale de Grenoble, France, 21 March 1978. 59
[5] P. Cousot. Semantic foundations of program analysis. In S.S. Muchnick and N.D. Jones, editors, Program Flow Analysis: Theory and Applications, ch. 10, pp. 303–342. Prentice-Hall, 1981. 3, 59, 85
[6] P. Cousot. Constructive design of a hierarchy of semantics of a transition system by abstract interpretation. ENTCS, 6, 1997. URL: http://www.elsevier.nl/locate/entcs/volume6.html, 25 pages. 3, 85
[7] P. Cousot. The Marktoberdorf’98 generic abstract interpreter. Available at URL: http://www.dmi.ens.fr/˜cousot/Marktoberdorf98.shtml. 19, 75, 77, 79, 83
[8] P. Cousot and R. Cousot. Static determination of dynamic properties of programs. In Proc. 2nd Int. Symp. on Programming, pp. 106–130. Dunod, 1976. 8, 85
[9] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In 4th POPL, pp. 238–252, Los Angeles, Calif., 1977. ACM Press. 3, 8, 39, 74, 85
[10] P. Cousot and R. Cousot. Automatic synthesis of optimal invariant assertions: mathematical foundations. In ACM Symposium on Artificial Intelligence & Programming Languages, Rochester, N.Y., SIGPLAN Notices 12(8):1–12, 1977. 59
[11] P. Cousot and R. Cousot. A constructive characterization of the lattices of all retractions, preclosure, quasi-closure and closure operators on a complete lattice. Portugal. Math., 38(2):185–198, 1979. 38
[12] P. Cousot and R. Cousot. Constructive versions of Tarski’s fixed point theorems. Pacific J. Math., 82(1):43–57, 1979. 38, 60, 62
[13] P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In 6th POPL, pp. 269–282, San Antonio, Texas, 1979. ACM Press. 3, 6, 11, 15, 37, 59, 60, 61, 85
[14] P. Cousot and R. Cousot. Abstract interpretation and application to logic programs. J. Logic Prog., 13(2–3):103–179, 1992. (The editor of JLP mistakenly published the unreadable galley proof; for a correct version of this paper, see http://www.dmi.ens.fr/˜cousot.) 7, 11, 85
[15] P. Cousot and R. Cousot. Abstract interpretation frameworks. J. Logic and Comp., 2(4):511–547, Aug. 1992. 85
[16] P. Cousot and R. Cousot. Comparing the Galois connection and widening/narrowing approaches to abstract interpretation, invited paper. In M. Bruynooghe and M. Wirsing, editors, Proc. Int. Work. PLILP ’92, Leuven, Belgium, LNCS 631, pp. 269–295. Springer, 13–17 Aug. 1992. 6, 74, 85
[17] P. Cousot and R. Cousot. Higher-order abstract interpretation (and application to comportment analysis generalizing strictness, termination, projection and PER analysis of functional languages), invited paper. In Proc. 1994 ICCL, Toulouse, France, pp. 95–112. IEEE Comp. Soc. Press, 16–19 May 1994. 85
[18] P. Cousot and R. Cousot. Refining model checking by abstract interpretation. Automated Software Engineering Journal, special issue on Automated Software Analysis, 6(1), 1999. To appear. 86
[19] D. Dams, O. Grumberg, and R. Gerth. Abstract interpretation of reactive systems: Abstractions preserving ∀CTL*, ∃CTL* and CTL*. In E.R. Olderog, editor, Proc. IFIP WG2.1/WG2.2/WG2.3 Working Conf. on Programming Concepts, Methods and Calculi (PROCOMET), IFIP Transactions. North-Holland/Elsevier, Jun. 1994. 85
[20] B.A. Davey and H.A. Priestley. Introduction to Lattices and Order. Cambridge U. Press, 1990. 4
[21] N. Dershowitz and J.-P. Jouannaud. Rewrite systems. In J. van Leeuwen, editor, Formal Models and Semantics, volume B of Handbook of Theoretical Computer Science, ch. 6, pp. 243–320. Elsevier, 1990. 41
[22] E.W. Dijkstra and C.S. Scholten. Predicate Calculus and Program Semantics. Springer, 1990. 14
[23] R. Giacobazzi and F. Ranzato. Completeness in abstract interpretation: A domain perspective. In M. Johnson, editor, Proc. 6th Int. Conf. AMAST ’97, Sydney, Australia, LNCS 1349, pp. 231–245. Springer, 13–18 Dec. 1997. 85
[24] P. Granger. Improving the results of static analyses of programs by local decreasing iterations. In Proc. 12th FST & TCS, pp. 68–79, New Delhi, India, LNCS 652. Springer, 18–20 Dec. 1992. 38
[25] N.D. Jones, C.K. Gomard, and P. Sestoft (with chapters by L.O. Andersen and T. Mogensen). Partial Evaluation and Automatic Program Generation. Prentice-Hall, 1993. 83
[26] G. Kahn. Natural semantics. In K. Fuchi and M. Nivat, editors, Programming of Future Generation Computers, pp. 237–258. Elsevier, 1988. 13
[27] P. Lacan, J.N. Monfort, Le Vinh Quy Ribal, A. Deutsch, and G. Gonthier. The software reliability verification process: The Ariane 5 example. In Proc. DASIA 98 – DAta Systems IN Aerospace, Athens, Greece. ESA Publications, SP-422, May 25–28, 1998. 3, 86
[28] B. Le Charlier and P. Flener. On the desirable link between theory and practice in abstract interpretation. In P. Van Hentenryck, editor, Proc. SAS ’97, Paris, France, LNCS 1302, pp. 379–387. Springer, 8–10 Sep. 1997. 15
[29] B. Le Charlier and P. Van Hentenryck. Reexecution in abstract interpretation of Prolog. In K. Apt, editor, Proc. Joint Int. Conf. and Symp. on Logic Programming, pp. 750–764, Washington, USA, Nov. 1992. The MIT Press. 38
[30] D. Monniaux. Réalisation mécanisée d’interpréteurs abstraits (Mechanized implementation of abstract interpreters). Rapport de stage, DEA “Sémantique, Preuve et Programmation”, Jul. 1998. 85
[31] G.D. Plotkin. A structural approach to operational semantics. Tech. rep. DAIMI FN-19, Aarhus University, Denmark, Sep. 1981. 13, 31, 43, 57
[32] E. Schön. On the computation of fixpoints in static program analysis with an application to analysis of AKL. Res. rep. R95:06, Swedish Institute of Computer Science, SICS, 1995. 59