Rapid Prototyping of Specification Language Implementations

Martin Leucker and Thomas Noll
Lehrstuhl für Informatik II, Aachen University of Technology
Ahornstr. 55, D-52056 Aachen, Germany
e-mail: {leucker,noll}@informatik.rwth-aachen.de

Abstract. Specification languages such as LOTOS and SDL play an important rôle in the design and implementation of distributed systems. Their formal syntax and semantics support the development of compilers and of verification tools. This paper introduces a generic and uniform approach to supporting such languages in verification tools. We present a compiler generator which, given the description of a specification language, automatically generates a corresponding implementation. More specifically, the syntax and semantics of the specification language have to be defined using Meseguer’s Rewriting Logic formalism, a unified semantic framework for concurrency. From this description a compiler is derived which parses a given system specification and computes the corresponding semantic object, such as a labelled transition system. The latter can be processed further in subsequent analysis and verification phases. We thus propose a kind of “meta-prototyping” approach in the sense that new specification formalisms for distributed systems can easily be tested without the need to develop an implementation by hand.

I. Introduction

Formal methods are becoming more and more popular for the specification, verification, and prototyping of industrial critical systems. Several case studies have shown that corresponding techniques can help to find errors during the design process (see [1] for an overview). They are also gaining commercial success, e.g., companies such as Intel, National Semiconductor or Texas Instruments are establishing new departments for formal methods (see for example the job adverts in [2]). The term formal methods usually denotes the application of rigorous mathematical methods for specifying and verifying complex hardware and software systems. The formal specification of a system helps to understand the system under development. Furthermore, it provides a common and formal basis for reasoning about the system.

The application of formal methods requires the availability of supporting tools, because formal methods are especially adequate for the design of large systems where an ad hoc or conventional software engineering approach is not reasonable. Generally speaking, large systems consist of distributed processes working together concurrently. While the distribution of the processes usually does not involve any conceptual problems, the concurrent behaviour makes the system difficult to understand. Therefore, we put our emphasis on analysing concurrent systems. In recent years several prototypes of corresponding tools have been developed, e.g., CWB ([3]), NCSU-CWB ([4]), SPIN ([5]) and the symbolic model checker SMV ([6]). Most of these tools are tailored to a specific syntactic and semantic setting, e.g., CCS with transition system semantics and µ-calculus model checking.

Our goal is to support rapid prototyping of distributed systems by facilitating the employment of new specification formalisms and semantic domains. To this end we are developing a tool which, given the description of a specification language, automatically generates a corresponding implementation. More specifically, the syntax and semantics of the specification language under consideration have to be defined using the Rewriting Logic formalism, which was proposed in [7] and [8] as a unified semantic framework for concurrency. From this description a compiler is derived which parses any given system specification and computes the corresponding semantic object. This compiler generator is part of our Truth verification tool ([9], [10]), which can subsequently be used to visualise and analyse the derived semantics. In particular, it can perform model checking, i.e., it is possible to verify that the system under consideration fulfils certain conditions specified as formulae of some formal logic.

In the remainder of this paper we introduce our compiler generator in greater detail. Section II gives an overview of its formal basis, the Rewriting Logic approach. Section III describes its implementation and the results obtained so far. Section IV concludes with some remarks.

II. Rewriting Logic

There are two main problems which limit the use of model checking techniques in practical situations:

1. There is a huge number of design formalisms (like SDL [11], ACP [12], LOTOS [13], B(PN)² [14], etc.), and hence tool support is rather limited in the sense that not every design formalism is implemented. This is aggravated by the plurality of techniques for describing requirements. Thus generic methods are desirable which increase the flexibility and adaptability of verification tools with respect to the specification language.
2. The state explosion problem: the usual semantic treatment of concurrent systems in current verification tools, i.e., the interleaving approach, leads to a huge number of states and limits the use of model checking to rather small components.

With the first point in mind, the Process Algebra Compiler PAC ([15]) has been developed as a tool which, given the syntax and the operational rules of a process algebra, generates a compiler front-end which analyses programs and computes their meaning. However, since the semantics is specified in terms of structural operational rules, the scope of this tool is restricted to (labelled) transition systems (henceforth called LTSs for short).
It is possible to add a further degree of freedom by also allowing the semantic domain to be specified. This goal can be achieved by employing the Rewriting Logic approach. It aims at a separate description of the static and the dynamic aspects of a distributed system. More exactly, it distinguishes the laws describing the structure of the states of the system from the rules which specify its possible transitions. The two parts are formalised as an equational theory and as a (conditional) term rewriting system, respectively. Both structures operate on states, represented as (equivalence classes of) terms built up from the operators of the specification language under consideration. Since a single transition may comprise several independent rewriting steps, concurrent behaviour can be modelled explicitly.

Rewriting Logic has been successfully applied to specify various languages and semantic domains; an overview can be found in [16]. Among others, P. Viry gives very natural specifications of CCS ([17]) and of the π-calculus ([18]). We have therefore chosen Rewriting Logic as the formal basis of our compiler generator which, given the definition of a specification language, automatically derives corresponding parsing and LTS-generating functions which can be used as a frontend for verification tools such as Truth. The overall structure is depicted in Figure 1.

[Figure 1. Diagram elements: Rewriting Logic definition of SL (Grammar, Term rewriting rules, Equations); Compiler generator; Truth frontend (SL Parser, LTS generating functions); System specification in SL; Labelled transition system.]

Fig. 1. Generic implementation of specification languages (SL) using Rewriting Logic

As a simple example, we give a Rewriting Logic specification of CCS ([19]) using the syntax accepted by our compiler generator. The description of a specification language consists of three parts. First, the syntax of the language has to be given in terms of a context-free grammar. Besides the grammar itself, the definition contains information on scanning (tokens section), on typing (sorts section), on operator priorities and associativities (priorities section), and on the representation of CCS terms as abstract syntax trees (this mapping is defined in the grammar section and uses the symbols declared in the cons section).

    ALGEBRA CCS
    sorts
      definition, process, action
    cons
      Def    : string * process -> definition
      Par    : process * process -> process
      Plus   : process * process -> process
      Rest   : process * (action list) -> process
      Rel    : process * action * action -> process
      Pref   : action * process -> process
      Nil    : unit -> process
      Var    : string -> process
      Act    : string -> action
      NegAct : string -> action

    SYNTAX
    tokens
      "\\"                => REST
      "/"                 => SLASH
      "."                 => DOT
      "\("                => OPENB
      "\)"                => CLOSEB
      "{"                 => OPENCB
      "}"                 => CLOSECB
      "\["                => OPENSB
      "\]"                => CLOSESB
      "\+"                => PLUS
      "\|"                => PAR
      "'"                 => PRIME
      "nil"               => NIL
      "="                 => EQUAL
      "[A-Z][A-Za-z0-9]*" => PROCVAR of String
      "[a-z]+"            => ACTION of String
    priorities
      left 20 PAR
      left 30 PLUS
      right 40 DOT
    nonterminals
      specification : (definition list)
      def           : definition
      proc          : process
      rest, acts    : (action list)
      act           : action
    grammar
      specification = def+                               def
      def  = PROCVAR EQUAL proc                          (Def(PROCVAR, proc))
      proc = NIL                                         (Nil())
           | PROCVAR                                     (Var(PROCVAR))
           | proc PAR proc                               (Par(proc1, proc2))
           | proc PLUS proc                              (Plus(proc1, proc2))
           | OPENB proc CLOSEB                           (proc)
           | act DOT proc                                (Pref(act, proc))
           | proc rest                                   (Rest(proc, rest))
           | proc OPENSB act SLASH act CLOSESB           (Rel(proc, act1, act2))
      rest = REST OPENCB acts CLOSECB                    (acts)
      acts = act+                                        act
      act  = ACTION                                      (Act(ACTION))
           | PRIME ACTION                                (NegAct(ACTION))

The second part consists of a system of (conditional and labelled) rewrite rules R defining the operational semantics, that is, state transitions are regarded as rewrite steps. In the case of CCS it has the following form:

    SEMANTICS
    vars
      d : (definition list)
      p, p', q, q', r : process
      a, b, c : action
      r : (action list)
      v : string
    transitions

             p -(a)-> p'
      (Sum)  --------------------
             Plus(p, q) -(a)-> p'

      (Pre)  Pref(a, p) -(a)-> p

             p -(a)-> p'
      (Int)  ---------------------------
             Par(p, q) -(a)-> Par(p', q)

             p -(a)-> p', q -(b)-> q', inverse(a, b)
      (Com)  ---------------------------------------
             Par(p, q) -(Act("tau"))-> Par(p', q')

             p -(a)-> p', notin(a, r)
      (Res)  -----------------------------
             Rest(p, r) -(a)-> Rest(p', r)

             p -(a)-> p'
      (Rel)  ---------------------------------
             Rel(p, a, b) -(b)-> Rel(p', a, b)

             p -(a)-> p', different(a, b)
      (Pas)  ---------------------------------
             Rel(p, b, c) -(a)-> Rel(p', b, c)

             p = lookup(d, v), p -(a)-> p'
      (Var)  -----------------------------
             Var(v) -(a)-> p'
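The transition rules can be read as a recursive successor function on process terms. The following is only an illustrative sketch in Python, not the authors' Haskell/ELAN implementation. It assumes a hypothetical encoding of terms as nested tuples such as ("Pref", "a", ("Nil",)), of actions as plain strings with a leading ' for the complement, and of the omitted auxiliary functions inverse and notin as simple string tests. It covers only (Pre), (Sum), (Int), (Com) and (Res), and because it works on plain terms rather than on equivalence classes, it applies (Sum) and (Int) to both operands explicitly.

```python
TAU = "tau"  # assumed name of the silent action, as in Act("tau")


def inverse(a, b):
    """Assumed encoding: "a" and "'a" are complementary actions."""
    return a != TAU and b != TAU and (a == "'" + b or b == "'" + a)


def steps(t):
    """Return the set of (action, successor) pairs derivable for term t."""
    tag = t[0]
    out = set()
    if tag == "Pref":                       # (Pre)
        _, a, p = t
        out.add((a, p))
    elif tag == "Plus":                     # (Sum), for both summands
        _, p, q = t
        out |= steps(p) | steps(q)
    elif tag == "Par":
        _, p, q = t
        ps, qs = steps(p), steps(q)
        out |= {(a, ("Par", p2, q)) for a, p2 in ps}     # (Int), left operand
        out |= {(b, ("Par", p, q2)) for b, q2 in qs}     # (Int), right operand
        out |= {(TAU, ("Par", p2, q2))                   # (Com)
                for a, p2 in ps for b, q2 in qs if inverse(a, b)}
    elif tag == "Rest":                     # (Res); r is a tuple of action names
        _, p, r = t
        out |= {(a, ("Rest", p2, r))
                for a, p2 in steps(p) if a.lstrip("'") not in r}
    return out                              # Nil and unhandled terms: no steps


NIL = ("Nil",)
system = ("Par", ("Pref", "a", NIL), ("Pref", "'a", NIL))   # a.nil | 'a.nil
for action, successor in sorted(steps(system)):
    print(action, successor)
```

For a.nil | 'a.nil the sketch yields three transitions (a, 'a and tau); wrapping the same term in a restriction over a leaves only the tau transition.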

The definition of auxiliary functions like inverse is omitted. Finally, the description contains a set of equations between process terms E, which identify certain states of the respective system. In this way, we reduce the state space of the resulting system as well as the number of rewrite rules. Note that, for example, the symmetric counterparts of (Sum) and (Int) are not required above since both the plus and the parallel operator are declared commutative in the following equations. The special meaning of the arrow symbols will be explained later.

    equations
      Plus(p, q)          =  Plus(q, p)
      Plus(p, Plus(q, r)) =  Plus(Plus(p, q), r)
      Plus(p, Nil())      -> p
      Plus(p, p)          -> p
      Par(p, q)           =  Par(q, p)
      Par(p, Par(q, r))   =  Par(Par(p, q), r)
      Par(p, Nil())       -> p
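To see how these equations collapse states, one can compute a canonical representative for each term. The sketch below is a simplified stand-in in Python, not the rewriting-modulo-AC machinery used by the authors: it assumes a hypothetical tuple encoding of terms such as ("Par", p, q), flattens nested Plus and Par terms, applies the three arrow equations, and then fixes an arbitrary order and bracketing so that equal classes yield identical terms.

```python
from functools import reduce

NIL = ("Nil",)


def operands(tag, t):
    """Collect the operands of nested tag-terms: Par(Par(p, q), r) -> [p, q, r]."""
    if t[0] == tag:
        return operands(tag, t[1]) + operands(tag, t[2])
    return [t]


def normalise(t):
    """Return one fixed representative of t modulo the Plus/Par equations."""
    tag = t[0]
    if tag in ("Plus", "Par"):
        # Normalise operands first; re-flatten because that may expose tag again.
        args = [y for x in operands(tag, t) for y in operands(tag, normalise(x))]
        args = [a for a in args if a != NIL]      # Plus(p, Nil()) -> p, Par(p, Nil()) -> p
        if tag == "Plus":
            args = list(dict.fromkeys(args))      # Plus(p, p) -> p
        args.sort(key=repr)                       # commutativity: pick a fixed order
        if not args:
            return NIL
        return reduce(lambda left, right: (tag, left, right), args)  # fixed bracketing
    # Any other operator: normalise the sub-terms and keep the rest unchanged.
    return tuple(normalise(x) if isinstance(x, tuple) else x for x in t)


a = ("Pref", "a", NIL)
b = ("Pref", "b", NIL)
print(normalise(("Par", a, ("Par", NIL, b))) == normalise(("Par", ("Par", b, a), NIL)))
```

Sorting by repr is an arbitrary choice made for this sketch; any total order on terms would serve to pick the representative.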

Obviously, the introduction of equations has considerable consequences for the semantic treatment. States of the system are no longer represented as terms but as equivalence classes of terms. In other words, to compute the semantics of a specification it is necessary to implement term rewriting modulo an equational theory:

$$[p]_E \longrightarrow_R [p']_E \longrightarrow_R \cdots$$

For reasons of efficiency we have to find suitable representatives for each equivalence class. Hence, given a process term $p$, we have to find a normal form $\hat{p}$ within the congruence class $[p]_E$. Our approach (similar to [18]) is to rewrite $p$ repeatedly to a term $\hat{p}$ using a (partly) oriented version of $E$. Since $E$ may contain equations which cannot be oriented, we still have to deal with rewriting modulo these equations. For our needs, it suffices to restrict ourselves to rewriting modulo associativity and commutativity (AC), which is well understood ([20]). Hence, we assume $E$ to be characterised by two sets $AC$ and $ER$, where $AC$ contains the AC-equations and $ER$ is a set of rewrite rules (the “arrow equations” in the CCS example above) such that $E = AC \cup \{l = r \mid l \to r \in ER\}$. Note that $ER$ must be terminating in order to guarantee the existence of normal forms.

Although the equational rules in $ER$ are similar in nature to the transition rules of $R$, we have to treat them completely differently: the rewrite steps entailed by $R$ represent the actual computation of the system, while the rewrite steps induced by $ER$ are considered to happen internally; they are just used to implement the equational theory efficiently. Hence, we have the following situation:

$$\begin{array}{lcl}
[p]_{AC} & & \\
\downarrow^{*}_{ER} & & \\
{[\hat{p}]_{AC}} & \longrightarrow_R & [p']_{AC} \\
 & & \downarrow^{*}_{ER} \\
 & & \vdots
\end{array}$$

This approach is correct, i.e., every derived term $p'$ can also be derived in the full theory, that is, by using $R$ modulo $E$. However, $ER$ and $R$ have to satisfy certain (strong) coherence properties to make this approach also complete, i.e., to ensure that every term derivable via $R$ modulo $E$ can be obtained by rewriting with respect to $R$ and $ER$ modulo $AC$. A detailed account of this topic can be found in [21].

It should be noted that the overall structure of a system specified in rewriting logic is that of an LTS. (Though, as stated above, a single transition may represent concurrent activities in different subcomponents.) Hence, we are able to reuse the efficient implementation of LTSs in Truth even when dealing with true concurrency.

III. Implementation of the compiler generator and practical results

At present we have a prototype implementation of the compiler generator, built to gain some experience with its usefulness concerning state space reduction in the case of CCS and with its applicability to a broad range of specification formalisms. Our compiler is written in the functional programming language Haskell ([22]), which is well suited for parsing, and in which the Truth system is also implemented. As can be seen in the previous section, the file format of the specification language definition follows a YACC-like syntax. Hence Happy ([23]), a YACC-compatible parser generator, is employed to generate the parser for the definition file. Figure 2 gives a schematic view of the compiler. According to Section II, the specification language definition can be divided into two parts, the definition of the syntax and the definition of the semantics.

[Figure 2. Diagram elements: Definition file format; Happy; Parser sources; Compiler sources; Haskell Compiler; Compiler generator; SL Parser; ELAN input file.]

Fig. 2. The structure of the compiler

In the syntactic part, the user must provide typing information and a context-free grammar. From the grammar two representations of a parser for the respective specification language are constructed.

• The first one is given in Haskell source code. It is permanent in the sense that it is stored in a file and included later in the Truth sources. Thus we obtain a version of Truth which is tailored for the specific specification formalism.
• The second one is kept in memory as a transient object. It is needed for parsing the semantic part of the definition, where both the rewrite rules and the equations employ the syntax of the specification language.

Both parsers are implemented using the parser combinator library ([24]) and the extensions for operator precedences and automatic elimination of left-linear rules proposed in [25]. Using the transient parser, the semantic rules and equations of the definition are parsed. Instead of generating code for rewriting according to the given rules, we employ ELAN ([26]) for our prototype implementation. ELAN is a general-purpose rewrite system which supports AC-rewriting. Thus the compiler generator builds an ELAN input file according to the semantic rules as well as Haskell code for accessing ELAN. The permanent parser and the ELAN interface are compiled together with the basic Truth code to obtain a Truth version tailored for the specification formalism. Figure 3 gives an overview of the output of the compiler generator and its integration into Truth.

[Figure 3. Diagram elements: Compiler generator; ELAN interface; SL Parser; ELAN input file; Truth sources; Haskell Compiler; Truth; LTS generating functions; ELAN.]

Fig. 3. The structure of the resulting tool
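Once such a tailored Truth version exists, the LTS of a specification is built on demand: successors are requested one state at a time and matched against the states found so far. The following Python sketch illustrates that worklist loop under stated assumptions. The names explore, successors and canon are hypothetical, the successor function stands in for the call to the rewrite engine, and a canonicalisation function replaces the pairwise equivalence test that the prototype delegates to ELAN. The toy system is n independent two-state components (free or taken), in the spirit of an n-ary semaphore.

```python
from collections import deque


def explore(initial, successors, canon=lambda s: s):
    """Breadth-first, on-demand construction of an LTS from an initial state."""
    start = canon(initial)
    seen = {start}
    edges = set()
    todo = deque([start])
    while todo:
        s = todo.popleft()
        for label, t in successors(s):      # stand-in for a rewrite-engine query
            t = canon(t)                    # identify equivalent states
            edges.add((s, label, t))
            if t not in seen:               # new state: link it into the LTS
                seen.add(t)
                todo.append(t)
    return seen, edges


def semaphore_successors(state):
    """State: tuple of n components, each 0 (free) or 1 (taken)."""
    for i, bit in enumerate(state):
        label = "get" if bit == 0 else "put"
        yield label, state[:i] + (1 - bit,) + state[i + 1:]


n = 4
plain, _ = explore((0,) * n, semaphore_successors)
modulo_ac, _ = explore((0,) * n, semaphore_successors,
                       canon=lambda s: tuple(sorted(s)))
print(len(plain), len(modulo_ac))   # 16 states on plain tuples, 5 when order is ignored
```

Ignoring the order of the components reduces the 2^n reachable tuples to n + 1 classes, one per counter value.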

The specific Truth version is now ready for use, i.e., to accept and analyse specifications in the new formalism. When the user enters a system specification, it is parsed according to the syntax definition and stored in an internal format. If an analysis (e.g., of deadlock freeness) is requested by the user, the LTS representing the specified system is computed in a demand-driven fashion as follows: the textual representation of the current state s is sent to the ELAN system together with a rewriting command, and the list of successor states is returned. To detect cycles in the LTS, every successor state has to be matched against the set of states obtained so far. Since some of the operators of the specification language may be declared associative or commutative, and because ELAN does not compute an AC-normal form, we employ ELAN again to test whether two states are equivalent. If we encounter a new successor state, it is linked into the LTS.

While the effort for developing the prototype is reduced to a minimum in this way, the use of ELAN has serious drawbacks concerning the runtime efficiency of the system, mainly for two reasons: the Haskell-ELAN interface is based on pipes, hence string-based and therefore slow, and ELAN is an interpreter system.

However, from the point of view of memory efficiency, the prototype is quite successful regarding the state space reduction. We tried several examples: semaphore, bounded stack and alternating bit protocol (ABP, [19, p. 141]), to name a few. An n-ary semaphore is basically a counter bounded by n; one straightforward implementation of it (cf. [19, p. 33]) is

    Sem = S0 | ... | S0
    S0  = get.S1
    S1  = put.S0

where the right-hand side of the first equation consists of n S0 factors. For this example, the reduction is huge. The reason is that every state is the parallel product of n unary semaphores whose ordering does not matter. Considering terms modulo AC identifies all states representing the same counter value, leading to an exponential gain.

The bounded stack is an example of a whole class of processes where our compiler achieves substantial improvements, the fragment of well-terminating processes ([19, p. 173]). The idea is that a process has to indicate its termination via a 'done action. Well-terminating processes can be sequentially composed in a very natural way, as the following specification demonstrates: a stack process is either the empty process (which can just perform the 'done action) or a process accepting a pushed element, then behaving as a stack again, followed by popping the same element and then behaving as a stack again. This is expressed in the following equations, which describe a stack over a single entry type a and use a sequential process composition operator *:

    Stack = Done + a * Stack * 'a * Stack
    Done  = 'done.nil

It is easy to reduce the sequential composition of well-terminating processes to plain CCS. We give a corresponding specification of an n-bounded stack:

    Stack_n = ((a.Stack_n-1)[done/d] | d.'a.Stack_n)\{d} + Done
    ...
    Stack_0 = Done

However, the bounded stack is an infinite-state process in the original semantics, since every pairwise occurrence of a push (a) and a pop action ('a) results in a “dead” nil process existing in parallel with the rest of the system. Due to our Par(p, Nil()) -> p reduction rule (see Section II), the LTS computed by our implementation is finite. Since stacks play an important role in modelling hardware systems, the improvement is crucial. In the case of the ABP, however, no reduction could be achieved at all.

We are still testing the integration of different specification formalisms into our system Truth. First examples seem promising, but it is too early to draw general conclusions from them.

IV. Conclusion

In this paper we described the design and a first implementation of a specification language compiler generator. Given a specification language definition by its syntax and operational semantics, it generates parsing and semantic functions which can be integrated in a verification tool such as Truth to obtain a verification tool tailored for the specification formalism under consideration. Thus our approach supports rapid prototyping in the sense that new specification formalisms for distributed systems can easily be tested without the need to develop an implementation by hand.

By applying our compiler generator to the CCS process algebra we were able to show that considerable reductions of the state space are attainable. This gain is due to the possibility of providing equational laws which describe the structure of the system states, so that similar results can be expected for most specification formalisms.

At present we employ our tool for a concrete programming language (Erlang [27]) together with an abstracted operational semantics. The goal is to obtain an analyser for Erlang programs with a minimum of effort. In particular, different types of abstract operational semantics can be tested easily in this way.
However, the main problem of our implementation with regard to its practical usefulness is its poor runtime efficiency, mainly due to the string-based interfacing and to the current interpreter implementation of ELAN. Future releases will hopefully overcome these drawbacks by compiling the rewriting rules. At the moment, we are testing several other rewrite tools with compiling facilities to improve our prototype. For the full version of our system, we are planning to implement a Haskell library for rewriting modulo AC which can be used in our compiler to obtain an efficient version of a verification tool for the corresponding specification formalism.

References

[1] E. M. Clarke and J. M. Wing, “Formal methods: State of the art and future directions,” Tech. Rep. CMU-CS-96-178, Carnegie Mellon University (CMU), Sept. 1996.
[2] “The concurrency mailing list.”
[3] F. Moller, The Edinburgh Concurrency Workbench (Version 6.1), Department of Computer Science, University of Edinburgh, Oct. 1992.
[4] R. Cleaveland and S. Sims, “The NCSU concurrency workbench,” Lecture Notes in Computer Science, vol. 1102, pp. 394–397, 1996.
[5] Jean-Charles Grégoire, Gerard J. Holzmann, and Doron A. Peled, Eds., The Spin Verification System, vol. 32 of DIMACS series. American Mathematical Society, 1997, ISBN 0-8218-0680-7, 203p.
[6] K. L. McMillan, “The SMV system, symbolic model checking - an approach,” Tech. Rep. CMU-CS-92-131, Carnegie Mellon University, 1992.
[7] José Meseguer, “Rewriting as a unified model of concurrency,” in Proceedings Concur’90 Conference, Amsterdam, Aug. 1990, Lecture Notes in Computer Science, Volume 458, pp. 384–400, Springer. Also Report SRI-CSL-90-02R, Computer Science Lab, SRI International.
[8] J. Meseguer, “Conditional rewriting logic as a unified model of concurrency,” Theoretical Computer Science, vol. 96, no. 1, pp. 73–155, Apr. 1992.
[9] Martin Leucker and Stephan Tobies, “Truth – a platform for verification of distributed systems,” Tech. Rep. 98-05, RWTH Aachen, May 1998.
[10] M. Lange, M. Leucker, T. Noll, and S. Tobies, “Truth – a verification platform for concurrent systems,” in Proceedings of Tools’98. 1998, Christian-Albrechts University of Kiel.
[11] J. Ellsberger, D. Hogrefe, and A. Sarma, SDL – A Formal Object-Oriented Language for Communicating Systems, Prentice Hall, 1997.
[12] J. C. M. Baeten and C. Verhoef, “Concrete process algebra,” in Handbook of Logic in Computer Science, S. Abramsky, D. Gabbay, and T. S. E. Maibaum, Eds., vol. 4, pp. 149–268. Oxford University Press, 1994.
[13] T. Bolognesi and E. Brinksma, “Introduction to the ISO specification language LOTOS,” in The Formal Description Technique LOTOS, P. H. J. van Eijk, C. A. Vissers, and M. Diaz, Eds., pp. 23–73. Elsevier Science Publishers North-Holland, 1989.
[14] Eike Best and Richard P. Hopkins, “B(PN)² – A basic Petri net programming notation,” in PARLE’93 Parallel Architectures and Languages Europe, Arndt Bode, Mike Reeve, and Gottfried Wolf, Eds. June 1993, vol. 694 of Lecture Notes in Computer Science, pp. 379–390, Springer-Verlag.
[15] R. Cleaveland, E. Madelaine, and S. Sims, “A front-end generator for verification tools,” Lecture Notes in Computer Science, vol. 1019, pp. 153–173, 1995.
[16] José Meseguer, “Rewriting logic as a semantic framework for concurrency: a progress report,” in Seventh International Conference on Concurrency Theory (CONCUR ’96). Aug. 1996, vol. 1119 of Lecture Notes in Computer Science, pp. 331–372, Springer-Verlag.
[17] Patrick Viry, “Rewriting: An effective model of concurrency,” in Proceedings of PARLE ’94 – Parallel Architectures and Languages Europe. 1994, vol. 817 of Lecture Notes in Computer Science, pp. 648–660, Springer-Verlag.
[18] Patrick Viry, “A rewriting implementation of pi-calculus,” Tech. Rep. TR-96-30, Dipartimento di Informatica, Mar. 26 1996.
[19] R. Milner, Communication and Concurrency, International Series in Computer Science. Prentice Hall, 1989.
[20] Nachum Dershowitz and Jean-Pierre Jouannaud, “Rewrite systems,” in Handbook of Theoretical Computer Science, J. van Leeuwen, Ed., vol. B: Formal Methods and Semantics, chapter 6, pp. 243–320. North-Holland, Amsterdam, 1990.
[21] Patrick Viry, “Rewriting modulo a rewrite system,” Tech. Rep. TR-95-20, Dipartimento di Informatica, Dec. 01 1995.
[22] John Peterson, Kevin Hammond, et al., “Report on the programming language Haskell, a non-strict purely-functional programming language, version 1.3,” Tech. Rep., Yale University, May 1996.
[23] “Happy, a parser generator for Haskell.”
[24] Graham Hutton and Erik Meijer, “Monadic parsing in Haskell,” Journal of Functional Programming, vol. 8, no. 4, 1998.
[25] Steve Hill, “Combinators for parsing expressions,” Journal of Functional Programming, vol. 6, no. 3, pp. 445–463, May 1996.
[26] P. Borovansky, C. Kirchner, H. Kirchner, P.E. Moreau, and M. Vittek, “Elan: A logical framework based on computational systems,” in Proc. of the First Int. Workshop on Rewriting Logic. 1996, vol. 4 of Electronic Notes in Theoretical Computer Science, Elsevier.
[27] J. Armstrong, M. Williams, and R. Virding, Concurrent Programming in Erlang, Prentice-Hall, Englewood Cliffs, NJ, 1993.