Towards Correct, Efficient and Reusable Transformational Developments

Bernd Krieg-Brückner, Junbo Liu, Hui Shi, Burkhard Wolff
Universität Bremen

Abstract. In the methodology for the development of correct software by transformation, each development step corresponds to the application of a preconceived transformation rule or method. The framework is generic with respect to an object language and permits the verification of semantic correctness. Elementary transformation rules incorporate a powerful notion of matching that allows abstraction to rule schemata. Higher-order rules are the elements of a tactical calculus with a number of desirable algebraic properties. This is the basis for a formalisation of transformational developments, for the generalisation of concrete developments to tactical methods, and for a refinement of methods to efficient transformation scripts. Thus reusability of the development process is achieved, and general, correct development methods can be established and refined into efficient tactical programs.
1 Introduction
1.1 Development by Transformation

Most frameworks for the development of correct software are based on some notion of refinement, i.e. a development step with some formal justification for its correctness. In the framework of the KORSO methodology, for example (cf. [PW94]), this is expressed by the derived-from relation, associating an explicit justification object with the arrow denoting the development step from a unit A to a unit A'. There are various ways of formal justification: correctness may be established by verification of explicit proof obligations that arise from the semantics of the development step; in fact, the derived unit may be conceived independently and the relation (and formal justification) of refinement added as a separate step. We refer to this approach as a-posteriori verification or the "invent-and-verify" approach. An alternative is to have a notion of preconceived formal development step, for which correctness-preservation is intrinsic. The correctness of a schematic development step is proved a-priori, but in general, it is required that the applicability of the instantiated development step is verified as a precondition and justification for its application. In the methodology of program development by transformation, such a development step corresponds to the application of a transformation rule or method (CIP [CIP85, 87], PROSPECTRA [KKLT91, HK93] and [KB94a, b] are the basis for the work presented here). In contrast to the "invent-and-verify" approach, the target is constructed by the transformation, which generates the proof obligation (as the instantiated applicability condition). In both approaches the developer has to have expertise in choosing the right development step and creativity to come up with solutions.
The advantage of the transformational approach is that each transformation rule is preconceived as an (algorithmic) problem solution or optimisation schema; it is clear what the parameters (embodying the design decisions that demand creativity) and proof obligations are at each step. The strict formalisation of each development step as a transformation is seen as a straitjacket by some critics; our notion of transformation is general enough, however, to incorporate, as an extreme case, an "invent-and-verify transformation" in which the developer takes responsibility for the correctness of the refinement. The usefulness of the transformational approach boils down to the successful formalisation of software development knowledge in the form of correct transformation rules and methods, and to adequate system support (cf. e.g. the KORSO System Architecture Framework [KM+94], instantiated to the transformational approach).

1.2 Correctness

The foremost requirement for correctness is often violated. In most transformation systems in practical use, transformation rules have not been formally proved to be correctness-preserving according to the semantics of the object language (see [Wol94, Liu95] for the verification of classic transformation rules, instantiated to SPECTRUM [Bro+93] and applied to the KORSO example LEX [KWL94]). Tactics for application are often buried in complex (meta-)programs and not available for formal reasoning; the situation is similar for proof systems. Thus a semantic framework is called for that allows formal reasoning at the meta-level in relation to the semantics of the object language, e.g. for the correctness of rules, their composition, and tactics. Our notion of correctness of elementary rules (chapter 4) enables a separation of concerns between the correctness of transformation rules and the correctness of their eventual application in a particular context (the proof obligation generated as the instantiated applicability condition). It is our goal to provide a separation between the correctness of generic rules abstracting from the concrete semantics of the underlying specification or programming language, and the correctness of their instantiation to a particular semantics. This has been achieved for composition but not yet for elementary rules.

1.3 Reusability of the Development Process

There is a possible added value that has great potential for an increase in productivity (beyond the correctness aspect of the transformational approach): since the development process itself is formalised (as a composition of transformations), we may be able to abstract from concrete developments to general development methods that can be instantiated and "re-played" in a similar situation, thus allowing reusability of the development process. It is this challenge that prompted the research presented here, based on previous experience: the development of an elementary notion of transformation rule (schema) that permits powerful pattern abstraction (by definable matching combinators, see chapter 2), and of a tactical calculus for rule application that allows generalisation to methods (chapters 3, 5) and yet comes with a notion of refinement, to be able to develop efficient transformation scripts (chapter 6).

1.4 Efficiency

At the object level, applicability conditions need to be established before a transformation rule can be applied. The requirement for efficiency calls for utmost automation, to relieve the user from superfluous and tedious interactive proofs, and to allow the composition of rules and the use of rules in tactics without the need for intermediate user interaction. Context-sensitivity of rules, e.g. using attributes for checking static semantic conditions (cf. [HK93]), will not be discussed here due to lack of space, nor will the realisation of effective parameterisation in interactive dialogue with the user and appropriate graphical presentation.
One way to aid the development of efficient tactical scripts is to use the development methodology and system for the object specification language at the meta-level as well, with automatic translation to a specialised target language for manipulating object programs. Such a uniform approach was suggested in [KKLT91]. Here, we will also use algebraic laws on tactical scripts and transformations at the meta-level; see chapter 6. Thus the methodology is instantiated both for the object language level and for the development of correct tactical programs at the level of the meta-language.
2 Elementary Transformation
2.1 Transformation Rules

Transformation in our sense is a form of deduction, applying transformation rules in a kind of term-rewriting process. A transformation rule is a pair of terms l and r, written as l ⇒ r (we use l, r, t, t' to denote terms of an object language). If a term t matches l, then a term t' can be constructed from r and the match of t and l. If additionally a certain applicability condition holds for t and t', we say that, by a transformation step, (l ⇒ r)(t) reduces to t'. (l ⇒ r) can be seen as a relation between t and t'; in general, the reduction relation is not a function. Transformation rules may have parameters and can even be higher-order, e.g. (l ⇒ r) ⇒ (c l ⇒ c r) is a possible transformation rule, where c is a constructor of a term language and l, r range only over terms (but not over transformation rules). The ⇒ arrow binds to the right.

2.2 Semantics of Elementary Transformation Rules

Let TΣλ(V) be the set of higher-order terms over the signature Σ and the set of variables V, and E a set of equations defining certain relations on terms. The syntax of transformation expressions is defined by: T ::= TΣλ(V) | T ⇒ T | T T. The scope of free variables is the whole rule; such variables are called matching variables. We assume a Hindley-Milner-like type discipline over TΣλ(V) and T, which is not studied here in detail (for a full account, see [WS94]). There is a natural connection between transformation applications and the application of λ-abstractions in the λ-calculus, especially if λ-abstractions are extended to pattern abstractions, as in [Oos90] for the case of an extended untyped λ-calculus. In order to describe the semantics of transformation rules, we first introduce the set D as the least solution (w.r.t. set inclusion) of the following set equation:

D = TΣλ(V) ∪ (D × D)
The semantic function Sem: T → ℘(D), where ℘(D) is the powerset of D, is defined as:

Sem⟦t⟧ = { tσ | σ is a ground substitution }
Sem⟦(R ⇒ R')(R'')⟧ = { t' | (t,t') ∈ Sem⟦R ⇒ R'⟧ ∧ t ∈ Sem⟦R''⟧ }
Sem⟦R ⇒ R'⟧ = { (t,t') | s ∈ Sem⟦R σ⟧ ∧ s =E t ∧ s' ∈ Sem⟦R' σ⟧ ∧ s' =E t' }
where R, R', R'' range over rule expressions T, and Rσ denotes the substitution of free variables in R. =E denotes the congruence modulo a set of equations E (see the next section). For context-sensitive transformations, the set comprehension is extended by an applicability condition.
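The matching-and-instantiation reading of an elementary transformation step can be sketched in a few lines of Python. This is our illustrative encoding, not part of the formal semantics above: terms are nested tuples, matching variables are strings prefixed with "?", and the sketch is first-order only (it ignores =E and applicability conditions).

```python
# Sketch (assumed encoding): terms are nested tuples ("op", arg1, ...);
# matching variables are strings starting with "?".

def match(pat, term, subst=None):
    """Return a substitution making pat equal to term, or None."""
    if subst is None:
        subst = {}
    if isinstance(pat, str) and pat.startswith("?"):   # matching variable
        if pat in subst:
            return subst if subst[pat] == term else None
        subst[pat] = term
        return subst
    if isinstance(pat, tuple) and isinstance(term, tuple) \
            and len(pat) == len(term) and pat[0] == term[0]:
        for p, t in zip(pat[1:], term[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pat == term else None

def instantiate(term, subst):
    """Build the target term from the right-hand side and the match."""
    if isinstance(term, str) and term.startswith("?"):
        return subst[term]
    if isinstance(term, tuple):
        return (term[0],) + tuple(instantiate(a, subst) for a in term[1:])
    return term

def apply_rule(rule, term):
    """One transformation step (l => r)(t); None if not applicable."""
    l, r = rule
    s = match(l, term)
    return None if s is None else instantiate(r, s)

# The rule  a + 0 => a  applied to the term  succ(n) + 0:
PLUS_ZERO = (("+", "?a", ("0",)), "?a")
print(apply_rule(PLUS_ZERO, ("+", ("succ", ("n",)), ("0",))))  # ('succ', ('n',))
```

Since `apply_rule` returns `None` when the left-hand side does not match, it directly mirrors the partial, relational reading of (l ⇒ r).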
(l ⇒ r)(t) reduces to t' iff (t,t') ∈ Sem⟦l ⇒ r⟧.

We write t (l ⇒ r) t' for "(l ⇒ r)(t) reduces to t'". A transformation rule l ⇒ r is applicable to t iff ∃t'. t (l ⇒ r) t'.

2.3 Higher-Order Matching

Higher-order matching as a basis for transformation is not a new idea (the use of second-order terms is suggested in [HL78]), but it is rarely used in practical systems. Higher-order terms have been used informally for a long time (cf. [BW82, CIP85, HK93]), e.g. in the schema:

∀ X. f X = if B X then T X else D (f (H X)) (K X).
Usually, transformation rules including such higher-order variables B, T, D, H, K are treated like the first-order case. This makes them less abstract and less reusable. Our goal is to obtain powerful elementary transformation rules (corresponding, in fact, to rule schemata in a more conventional view) by real second-order matching with matching combinators; the decidability of second-order matching is proved in [HL78].

2.4 Matching Combinators

Theories considered so far in extended matching [JK91] are very specific, hard to combine, and impractical for a transformational setting. In [SW94, Shi94] we present a theoretical framework for a refined "matching language" for schematic transformation rules. To obtain a suitable class of decidable and automatically solvable matching problems, we define a restricted class of theories that define syntactical similarities of terms for transformations. This class is a general extension of recursive functional definitions and allows easy combination. The concatenation function "++" on lists is a typical example that is defined by such a theory. Using "++" we can express some useful term schemata, such as a sequence of terms that contains a particular term. A matching problem consists in finding a substitution σ such that, e.g., σ(X ++ [c] ++ Y) = [c,b,c,d]. An obstacle to finding such a solution is the fact that "++", called a matching combinator, is not a free symbol, but defined by a set of equations. Hence, the solution of such equations would traditionally be considered as a matching-modulo-E problem. The set of abstract syntax terms TΣλ(V) of an object language for matching problems is defined as an extension of the second-order λ-calculus with matching combinators. For clarity, we represent abstract syntax terms by the concrete syntax of a language such as SPECTRUM [Bro+93], possibly with matching combinators.
In the sequel, we will use a font as in then to denote terms of the object language SPECTRUM and a font as in B X to denote abstract syntax terms of the transformation ("meta-") language. The theories associated with matching combinators are induced by sets of equations defined recursively over the constructors of the data types they operate on. "++" is defined as usual by the following two equations (X:Xs prefixes an element X to a list Xs):

[] ++ Ys = Ys
(X : Xs) ++ Ys = X : (Xs ++ Ys).
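Because "++" is not a free symbol, a matching problem such as σ(X ++ [c] ++ Y) = [c,b,c,d] can have several solutions, one for each way the target list splits around an occurrence of c. A small Python sketch (ours, not the paper's matching algorithm; the function name is illustrative) makes the solution set concrete by enumerating those splits:

```python
# Hedged sketch of matching modulo the "++" theory: solve
#   sigma(X ++ [elem] ++ Y) = target
# by enumerating every position where elem occurs in target.
# Plain syntactic matching would fail here, since "++" is defined
# by equations rather than being a free constructor.

def match_concat_around(elem, target):
    """All substitutions {X, Y} with X ++ [elem] ++ Y = target."""
    return [{"X": target[:i], "Y": target[i + 1:]}
            for i, e in enumerate(target) if e == elem]

sols = match_concat_around("c", ["c", "b", "c", "d"])
print(sols)
# [{'X': [], 'Y': ['b', 'c', 'd']}, {'X': ['c', 'b'], 'Y': ['d']}]
```

The two solutions correspond to the two occurrences of c in [c,b,c,d]; a transformation system may pick one by user interaction or by a global strategy, as discussed below.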
In fact, the theories we are interested in can be treated as confluent term rewriting systems. It is not necessary to solve the general matching-modulo-E problem σt ↔*E t'; with σt →*R t', where R is a confluent term rewriting system, we achieve sufficient and supposedly more efficient results. Note that the confluence of matching theories does not imply the confluence of a set of transformation rules constructed over them. The following examples shed some light on the power and usefulness of extended matching. Let us first consider two simple matching combinators:

switch: Bool → (α, α) → (α, α)
switch true (X, Y) = (X, Y)
switch false (X, Y) = (Y, X)
choose: Bool → (α, α) → α
choose true (X, Y) = X
choose false (X, Y) = Y
and an application to Boolean laws in the form of rules:

def MORGAN  = not((choose C (and,or)) (X, Y)) ⇒ (choose C (or,and)) (not X, not Y)
def DISTRIB = (choose C (and,or)) (switch S ((choose C (or,and)) (X, Y), Z))
              ⇒ (choose C (or,and)) ((choose C (and,or)) (switch S (X, Z)),
                                     (choose C (and,or)) (switch S (Y, Z)))
def ABSORB  = not(not X) ⇒ X
Depending on the values of the meta-variables C and S, one of the Boolean laws will be chosen: if C is true and S is false, then DISTRIB becomes and(Z, or(X, Y)) ⇒ or(and(Z, X), and(Z, Y)). This is not necessarily an example of better readability, but of a compact combination of several structurally similar rules with minor deviations, which occur quite often in practice. Thus it is a first example of pattern abstraction. The parameters, e.g. C and S, could be chosen by user interaction, by a global application strategy, or locally by the matching algorithm. Another example of a powerful matching combinator is the substitution "|" (⊕ denotes homomorphic extension over the abstract syntax of terms):

| : Term → (Id, Term) → Term
X | (Id, T) = choose (X = Id) (T, X)
(T1 ⊕ T2) | (Id, T) = T1 | (Id, T) ⊕ T2 | (Id, T)
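How the choose and switch combinators fold four distributive laws into the single DISTRIB schema can be checked mechanically. The following Python sketch is our own rendering (terms are built as plain strings; `app` and `distrib` are illustrative helper names, not from the paper):

```python
# Sketch: the choose/switch matching combinators as plain functions,
# used to enumerate the concrete instances of the DISTRIB rule schema.

def choose(c, pair):
    return pair[0] if c else pair[1]

def switch(s, pair):
    return pair if s else (pair[1], pair[0])

def app(f, args):
    """Render an application f(a1, ..., an) as a string."""
    return f + "(" + ", ".join(args) + ")"

def distrib(c, s):
    """Render the DISTRIB instance selected by meta-variables C and S."""
    outer = choose(c, ("and", "or"))
    inner = choose(c, ("or", "and"))
    lhs = app(outer, switch(s, (app(inner, ("X", "Y")), "Z")))
    rhs = app(inner, (app(outer, switch(s, ("X", "Z"))),
                      app(outer, switch(s, ("Y", "Z")))))
    return lhs + " => " + rhs

print(distrib(True, False))
# and(Z, or(X, Y)) => or(and(Z, X), and(Z, Y))
```

Iterating over the four (C, S) combinations yields all four and/or distributive laws, confirming the pattern-abstraction claim for this schema.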
With the help of "|", we can lift equations to rules by a higher-order rule:

∀X:T • L = R  ⇒  ( L | (X, V) ⇒ R | (X, V)  such that  TypeOf V = T )
(we use some informal notation here to denote an applicability condition (after such that) and its logic, e.g. with a context function TypeOf). This rule can be applied to the equational axiom:

∀ a:Nat • a+0 = a
yielding the following transformation rule:

V + 0 ⇒ V  such that  TypeOf V = Nat

with the matches:

X = a, T = Nat, L = X + 0, R = X
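The lifting step itself is a small computation: substitute a fresh matching variable for the universally bound variable on both sides of the equation. A hedged Python sketch (our encoding from section 2.1, with illustrative names `subst` and `lift`; the type side condition is omitted):

```python
# Sketch of lifting an equational axiom  forall X. L = R  to the rule
# L|(X,V) => R|(X,V) with the substitution combinator "|" (here: subst).
# Terms are nested tuples; variables are plain strings.

def subst(term, var, repl):
    """term | (var, repl): homomorphic substitution over the syntax."""
    if term == var:
        return repl
    if isinstance(term, tuple):
        return tuple(subst(a, var, repl) for a in term)
    return term

def lift(axiom, fresh):
    """Turn (X, L, R), read as  forall X. L = R,  into a rule pair."""
    x, lhs, rhs = axiom
    return (subst(lhs, x, fresh), subst(rhs, x, fresh))

# forall a:Nat. a + 0 = a, lifted with the fresh matching variable V:
rule = lift(("a", ("+", "a", "0"), "a"), "V")
print(rule)  # (('+', 'V', '0'), 'V')
```

The result is exactly the V + 0 ⇒ V rule above, with the match X = a made explicit as an argument.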
3 Tactical Combination
In this section we focus on the "combining forms" or tactical combinators that compose transformation terms to transformational developments. Our aim to formally represent and to reason over them gives rise to the definition of a tactical calculus. Most approaches so far consider tactics as separate programs; whenever they have a formal semantics at all, its properties are so poor that it is rarely used to formally reason about derivations. The view of transformation systems as abstract rewrite systems (see [Klo92]) and the corresponding notion of derivation in equational proof theory [Bac91] on the one hand, and the relational calculi of [Bac+92, BM92, Möl91] on the other, inspired our different view of tactical combinators. Since a rule is regarded as a relation, we can construct new relations by the sequential composition of relations o and the union of two relations X. The latter can be interpreted as the angelic nondeterministic choice between two tactical expressions. Together with recursion (written µU. R), the reflexive transitive closure of a transformation term can be defined in the following way:

(R)* def= µR1. I X (R o R1)

where I is the identity transformation rule x ⇒ x. Here, the set of transformation variables (e.g. S, T, R, U) is sharply distinguished from the set of term variables V of the previous chapter. Since o and X are monotonic w.r.t. set inclusion, the semantics of the recursion can be defined as the least fixpoint as usual.

3.1 The Tactical Focus and its Properties

Moreover, we can now represent the position (or: context) where a rule ought to be applied in a term in a very elegant fashion. For this purpose, we define the "focus" as a higher-order rule (a, b and χ are term variables):

FOC def= χ ⇒ ((a ⇒ b) ⇒ (χ a ⇒ χ b))
The context χ (which has the character of an additional parameter to the rule) is directly represented by a λ-abstraction over terms, hence an ordinary function. The following example may demonstrate the use of the focus rule. Over the term signature Σ = { pred, succ, fib: Nat → Nat, +: Nat × Nat → Nat, n: Nat }, the rule

UNFOLDFIB def= fib(succ(succ X)) ⇒ fib X + fib(succ X)

can be formed (this rule is discussed more intensively in section 5). We want to apply UNFOLDFIB in the term

T def= (fib(succ(pred n)), fib(succ(succ(pred n)))).
Let C be the context λξ.(fib(succ(pred n)), ξ); then UNFOLDFIB can be focussed with

FOC C UNFOLDFIB

which reduces to the rule

C(fib(succ(succ X))) ⇒ C(fib X + fib(succ X))
= (fib(succ(pred n)), fib(succ(succ X))) ⇒ (fib(succ(pred n)), fib X + fib(succ X))

that reduces T to:

(fib(succ(pred n)), fib(pred n) + fib(succ(pred n)))
The semantic function Sem induces a semantic equivalence on transformation terms, denoted by "≡", that can be used for their formal manipulation. For example, since FOC is functional in the first two arguments, the equivalence

FOC (χ) (A ⇒ B) ≡ (χ A) ⇒ (χ B)
holds, which allows the simplification of FOC-applications as if we had β-reduction in our calculus. Note that name clashes with matching variables have to be avoided when applying an equivalence within a rule. The focus rule has a number of properties, expressing distribution w.r.t. o and X over equal contexts, composition of nested contexts, and commutation of applications in the non-overlapping case.

3.2 Tactical Terms and their Embedding in Higher-Order Logics

So far, the calculus is too weak to represent a tactical script that "normalises" an input term t w.r.t. a given transformation R. Moreover, we need an embedding in a logical language that permits statements and deductions over tactical terms. It should be possible to express non-monotonic developments by the introduction of the "else" and "and not" combinators < and > in our language, meaning "if R1 is not applicable, try R2" and "if R1 is applicable, then R2 should not be applicable afterwards". With < and >, the normal form operator ^ and classical rewriting strategies (leftmost-innermost) can be expressed. We can define the set of tactical terms S of our tactical calculus as follows:

S ::= T          -- elementary transformation
    | 0 | I      -- empty relation, identity
    | U | µU.S   -- transformation variables U, recursion
    | S X S      -- union
    | S o S      -- composition
    | S < S      -- else
    | S > S      -- and not
S-terms are used to construct binary predicates a(S)b, denoting that (a,b) is in the relation S.
In particular, a(A ⇒ B)b holds iff (a,b) ∈ Sem⟦A ⇒ B⟧. If R and S are S-expressions, we define:

R ⊆ S def= ∀a,b. a(R)b → a(S)b
R ≡ S def= R ⊆ S ∧ S ⊆ R
It follows immediately that ≡ is the extensional equivalence on tactical terms. The (non-monotonic) applicability predicate and the termination predicate are defined as follows:

R@a def= ∃b. a(R)b
total(R) def= ∀a. R@a
term(R) def= ∃f: α → nat. ∀a,b. a(R)b → f(a) > f(b)

¬ 0@a        I@a        term(0)        ¬ term(I)
Rules obey the following axioms (note that free variables are universally quantified):

a (a ⇒ b) b                                                              (Trafo-Appl)
A' =E A ∧ B' =E B ∧ a (A' ⇒ B') b → a (A ⇒ B) b                          (Eval-Rule)
a ((A ⇒ B) (R) (A' ⇒ B')) b ↔ (A ⇒ B) (R) (A' ⇒ B') ∧ a (A' ⇒ B') b      (HO-Trafo-Appl)
The rest of the tactical combinators are characterised as follows:

a (I) a
a (R X S) b def= a (R) b ∨ a (S) b
a (R o S) b def= ∃c. a (R) c ∧ c (S) b
a (R > S) b def= a (R) b ∧ ¬ S@b
a (µU.S) b def= ∃n. a (Sⁿ[0]) b

µU.S can be seen as the least upper bound of the Sⁿ(0) for n ∈ N. Finally, we define the normal form relator and the usual sequential orelse:

R^ def= R* > R
R k S def= (R < S) X R
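For finite, terminating examples, the relational reading of these combinators can be executed directly: read a tactic as a function from a term to the set of reachable terms (the image of the relation). The sketch below is our interpretation under that assumption, with illustrative names; it covers I, X, o, <, R* and the normal form relator R^:

```python
# Hedged sketch: tactical combinators as set-valued functions
# (a tactic maps a term to the set of terms it can produce).
# Only sensible for finite, terminating examples.

def rule(l, r):            # elementary transformation l => r
    return lambda t: {r} if t == l else set()

def identity(t):           # I
    return {t}

def seq(r1, r2):           # R o S: sequential composition
    return lambda t: {u for m in r1(t) for u in r2(m)}

def union(r1, r2):         # R X S: angelic choice
    return lambda t: r1(t) | r2(t)

def orelse(r1, r2):        # R < S: if R is not applicable, try S
    return lambda t: r1(t) or r2(t)

def star(r):               # R*: reflexive transitive closure
    def run(t):
        seen, todo = {t}, [t]
        while todo:
            for u in r(todo.pop()):
                if u not in seen:
                    seen.add(u)
                    todo.append(u)
        return seen
    return run

def normal_form(r):        # R^ = R* > R: closure results R cannot rewrite
    return lambda t: {u for u in star(r)(t) if not r(u)}

ab = union(rule("a", "b"), rule("b", "c"))
print(normal_form(ab)("a"))   # {'c'}
```

Note how R^ falls out of the "and not" reading: a result of R* is kept exactly when R is no longer applicable to it, which is the ¬R@b side condition in the calculus.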
We can also express the congruence closure, i.e. if t1 (R) t2, then all terms t consisting of a context χ containing t1 are related to χ containing t2. The congruence closure R~ is fundamental to the semantics of term rewriting and can be represented in our calculus (not done here due to lack of space).

3.3 Algebraic Properties of Tactical Expressions

On the basis of the logical interpretation of the last section, we propose an equational, pointless style of meta-deduction. A number of algebraic properties can be verified, stating that X forms a complete semilattice with the neutral element 0, that o has monoid structure, and that left- and right-distributivity w.r.t. X holds. Moreover, one has laws like:

R > 0 ≡ R
0 < R ≡ R
µS. R(S) ≡ R(0) X µS. R(S)
mono(R) → µS. R(S) ≡ R(µS. R(S))
total(R1) → R1 k R2 ≡ R1

The operations dom and codom yield the domain and codomain of a tactical expression:

codom(R) ∩ dom(S) = ∅ → R o (S
dom(R) ∩ dom(S) = ∅ → R X S