Query Generation Guidelines for Statecharts within Object-Oriented Designs
Hong Liu, David P. Gluch
Embry-Riddle Aeronautical University
600 S. Clyde Morris Blvd, Daytona Beach, FL 32114, U.S.A.
[email protected],
[email protected]
ABSTRACT
This paper provides preliminary results in defining guidelines for generating Linear Temporal Logic (LTL) queries for model checking Statecharts within an object-oriented design and in establishing a framework for eliciting informal queries from requirements documents. Since formal queries are verified against a design element, a semantic gap exists between requirement properties and formal queries. This gap can present a serious challenge in expressing informal queries that are easily translated into formal ones. However, if an object-process methodology is used to guide Statechart modeling within object-oriented designs, the semantic gap can be bridged and proper informal queries can be readily elicited.
Keywords: Software Engineering, Model Checking, Linear Temporal Logic, Object-Process Methodology.
1. Introduction
Formal verification practices based upon automatic model checking tools are beginning to be employed to confirm expected properties of design models [13]. In these practices, informal natural language statements based upon a system’s expected properties are translated to formal queries. These formal queries are generally mathematically based refutable claims specified in either Linear Temporal Logic (LTL) or Computation Tree Logic (CTL) [1, 2, 8]. The system or a system design is typically modeled as a cluster of state machines, often using Statecharts, which are essentially hierarchically organized state machines [6]. Formal verification offers considerable potential because of its theoretical soundness, reliability, effectiveness, and potential to automate tedious verification routines. However, some drawbacks hinder its adoption as a common industry practice. To most practitioners, who have not had extensive training in formal methods, one of the most notable challenges is how to generate a set of formal queries that faithfully reflects the set of informal queries. This problem has been addressed by others [7, 9, 17]. This work addresses the issue of the expertise level required of a practitioner by considering specific aspects of the problem within a familiar software engineering perspective (object-oriented designs).
The objective of this paper is to present our preliminary results on guidelines for generating queries in LTL form to verify Statechart models within an object-oriented design. Ideally, we would first elicit some informal queries from the expected properties in the requirements, then set up heuristics (rules of thumb) for matching each informal query to a formal one. However, such a straightforward approach does not work for most real-world systems that merit the application of formal methods. On one hand, the expected properties in a typical analysis package are specified in the vocabulary of the system’s application domain. On the other hand, formal queries in LTL use states and transitions extracted from state machine design representations (often Statecharts). Most of these representations and associated queries involve logical entities that do not exist in the application domain package. It is clear that the desired informal queries need to satisfy two vocabularies. One is the application domain vocabulary used to express requirements and expected properties. The other is the set of logical states and transitions used in design. We identify this difference between vocabularies as a query semantic gap (QSG). Unless this gap is bridged, it is often very difficult to obtain informal queries in a form that is ready to be translated into formal ones.
In addition to our previous work [8, 9, 11], other investigations have motivated and influenced this work. Z. Manna and A. Pnueli [14] defined the terminologies and classifications of atomic query patterns. Bandera presented some useful but sometimes unwieldy query patterns in [17]. The I-Logic Model Certifier presented 15 kernel patterns, each of which can be combined with a query lead mode and a starting phase to form a complete query to verify against Statechart models built in Statemate [13].
Another influential work is the Object-Process Methodology (OPM) proposed by Dov Dori [4]. It is used as a guiding principle to bridge the semantic gap between expected properties and formal queries. The proposed query generation process and its artifacts are illustrated in Fig. 1, where an analysis package with its prerequisite requirements documents and architecture is shown at the top; the guidelines that we present in this paper are shown at the bottom; and the three artifacts for formal verification are shown in the center. Section 2 presents a framework for eliciting properties of requirements. Section 3 briefly introduces OPM and demonstrates why the formal queries associated with a Statechart that is modeled in accordance with OPM can be easily mapped from its corresponding informal queries. Section 4 defines a set of compound query patterns from the atomic query patterns of [12] by borrowing the idea of modes and kernel patterns of Statemate queries. Section 5 uses many examples to demonstrate how to map complicated formal queries from the literature and informal properties from our case study to formal queries in constrained compound forms. Finally, Section 6 summarizes and closes this paper.

[Figure 1. Query Generation Process and Its Artifacts. Top: Project Documents, comprising the Analysis Package (Requirements, Engineering Issues, Formal Method Strategy, Checking Tool, ...) and the Architecture Design (Conceptual View, Process View, Code View, ...). Center: Query Generation Artifacts (Modified Statechart Guided by OPM; Itemized Informal Queries; Formal Queries in LTL form), which feed the Model Checking Process (Model Check Tool, Check Result). Bottom: Guidelines (Query Elicit Framework, LTL Query Pattern Set, Pattern Matching Template, OPM Guideline).]
2. Basic Framework
This framework describes an approach to formulating informal expected properties from a requirements specification and expressing them as informal queries.

Eliciting Properties
The verification and validation strategy considered here is embodied first in a global analysis package that is detailed in individual analysis packages. The concept for identifying informal queries is to use conventional elicitation and capture processes and techniques, augmented by a question-driven inquiry approach and segmented into structural aspects, behavioral aspects, functional aspects, and an integrated view (use cases and scenarios). Each of these is founded on the dichotomy of desired versus undesired structure or behavior and on considerations of how the system may break. The framework for eliciting properties and informal queries is summarized in Figure 2.

Property Decomposition
Often a single expected property of the requirements must be expressed as one or several informal queries, each of which can be mapped to a formal query for an appropriate model checking tool. Although it is not the traditional order, it can be advantageous to elicit and capture informal queries after building their associated Statecharts. This can often align the vocabularies, so that the informal expressions are more meaningful within the context of a Statechart.
The generation process involves the perusal (exploration) of sources employing a variety of information elicitation and capture techniques, especially well-established techniques that have been used for requirements elicitation and capture. These techniques allow the identification of expected property statements. For queries, additional considerations are required to ensure a proper expression. These include considerations of the form (i.e., as a refutable declarative natural language statement, with formality considered later). To reduce the difficulty of mapping an informal query to a formal one, we apply the following constraints to the structure of a single informal query expressed in natural language.
1. A compound sentence should involve no more than two different tenses.
2. The time span of each sentence is one of after, before, until, between, or after-until two events [7].
3. A subject or object phrase should be one or several states in an associated Statechart.
The first two constraints may require a single property to be decomposed into a few sub-properties with a logical AND relationship. The third constraint is the most demanding because of the semantic gap identified in the introduction. In Section 3 we argue that nouns in expected property statements should have corresponding states in their associated Statechart of an object-oriented design, where the Statechart is modeled in accordance with the OPM principles.
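The constraints above can be read as a checklist applied to each itemized informal query. The following is a minimal sketch in Python, not part of the framework itself: the `InformalQuery` record, its field names, and the cruise-control state names are hypothetical, and only constraints 2 and 3 are checked mechanically (counting tenses is left to the reviewer).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Time spans allowed by constraint 2 (keywords taken from the list above).
ALLOWED_SPANS = frozenset({"after", "before", "until", "between", "after-until"})


@dataclass(frozen=True)
class InformalQuery:
    """Hypothetical record for one itemized informal query."""
    text: str                 # the refutable natural-language statement
    span: str                 # time span keyword used by the sentence
    phrases: Tuple[str, ...]  # subject/object phrases of the sentence


def violations(query: InformalQuery, statechart_states: FrozenSet[str]) -> List[str]:
    """Return human-readable violations of constraints 2 and 3."""
    problems = []
    if query.span not in ALLOWED_SPANS:
        problems.append(f"unsupported time span: {query.span!r}")
    for phrase in query.phrases:
        if phrase not in statechart_states:
            problems.append(f"phrase is not a Statechart state: {phrase!r}")
    return problems


# Hypothetical Statechart states for a cruise-control example.
STATES = frozenset({"cruise_on", "cruise_off", "engine_on", "engine_off"})

ok = InformalQuery("After the engine is on, cruise is off until it is set.",
                   "after-until", ("engine_on", "cruise_off", "cruise_on"))
bad = InformalQuery("The driver is comfortable while cruising.",
                    "while", ("driver_comfort", "cruise_on"))
```

A query such as `bad` would be sent back for rephrasing or decomposition before any attempt is made to translate it into LTL.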
3. OPM Conceptual Models as Guides
In this paper, we do not assume that readers are familiar with OPM and its tools. OPM is a formal systems engineering approach that, while recognizing the duality of objects and processes, establishes a peer relationship among them. This peer relationship enables modelers to describe real world problems naturally, using nouns and verbs as peer phrases to describe both physical and abstract phenomena. The dual representations of OPM models are specified by object-process diagrams (OPD) and a list of equivalent English object-process language (OPL) descriptions. In [11], we presented the rationale and procedure to build OPM conceptual models from requirements specifications. A detailed procedure for building Statecharts using the OPM conceptual model is beyond the scope of this paper. Our focus in this paper is on building Statecharts in accordance with OPM principles. An OPM-based approach facilitates Statechart analyzability by reducing the semantic gap between expected properties and formal queries. In this section, we introduce the essential principles of OPM and provide guidelines for building Statecharts founded upon those principles.
Separation of concerns in the form of aspect-based decomposition is used as a principal strategy to conquer complexity in OOA/D. This strategy results in the
multiplicity of UML models [16]. Unfortunately, multiplicity also brings a traceability problem for entities spanning from concepts within an application domain to abstract design concepts. In addition, ensuring the consistency of those entities throughout the diagrams can be difficult. Most importantly, because modelers have significant freedom to designate and organize the logical states and transitions within a Statechart [15], it is problematic to trace an object’s designated states and the processes that trigger transitions. In these approaches the semantic gap is addressed by introducing abstract states or simply mapping a noun in the expected property to a set of corresponding states, thereby fulfilling the third constraint on informal queries identified in Section 2.

[Figure 2. Framework for Eliciting Informal Queries. Labels: Sources (Application: Product Specific, Domain Knowledge; Technology Development: Idiosyncratic; Analysis Package; Analysis Procedure Pattern); Analysis View (Internal Consistency analysis and query generation); Desired and Undesired; Question Categories (core properties, global-local, standard, historically critical); Sets (Invariants, Constraints, Pre- and Post-Conditions, Assertions); Techniques (Documentation Review, Individual Interviews, Group Review Processes).]

In OPM, a single model is used to control complexity through granularity levels, refinement, and abstraction, similar to zooming in and out of a digital map. As a top-down ontology, OPM integrates the structural (classes, objects), functional (processes, methods), and behavioral (events, activities, state transitions) considerations into one diagram. This central diagram usually includes many hierarchically organized sub-diagrams. The integrated view of OPM confines the traceability issues for objects, processes, and states to a single diagram. This characteristic is used to guide Statechart modeling.
We restrict our concern to object-oriented designs because these designs facilitate the traceability of objects in the architecture design to objects in the application domain and vice versa. If one follows the OPM principles listed below to develop a Statechart associated with an object-oriented design, then the Statechart will contain a more integrated view, even though some objects and processes are not specified as explicitly as in OPM diagrams.
1: The first principle from OPM is to clearly identify the responsibilities and roles of all entities such as states, objects, and processes, and to answer the following three questions relating to roles when an entity is to be added to a model:
• Process: Which objects instrument and initiate the process, and which of the three functionalities does it fulfill: create, destroy, or modify the states of an object?
• Object: What are the states of the object; which processes create, destroy, and modify the object; and which processes does it instrument?
• State: What object does it belong to; which process triggered the change of the state; and what are the source and destination states?
2: Objects that own states in a Statechart should be traceable to some concrete objects in the application system. The triggering process of a state transition should, if possible, be traceable to some operation of the system or some method in the architecture design.
3: The hierarchy of a Statechart should be similar to the hierarchy of the itemized functional requirements.
If a Statechart is modeled in accordance with these principles, the semantic gap between expected properties and formal queries can be bridged and the third constraint on informal queries listed in Section 2 can be fulfilled. However, if it is not possible to adhere to all of these principles, it may be necessary to rephrase specific requirement statements. Further details and examples about the similarities and differences between OPM models and Statecharts can be found in [5].
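Principle 2 and the State question of principle 1 amount to a traceability check over the Statechart. A minimal sketch follows, assuming a flat, hypothetical representation: `State`, `Transition`, `traceability_gaps`, and the cruise-control names are illustrative only and do not come from OPM or Statechart tooling.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class State:
    name: str
    owner: Optional[str]    # object that the state belongs to


@dataclass(frozen=True)
class Transition:
    source: str
    target: str
    trigger: Optional[str]  # process (operation or method) causing the change


def traceability_gaps(states: Iterable[State],
                      transitions: Iterable[Transition],
                      domain_objects: Set[str],
                      operations: Set[str]) -> List[str]:
    """List states and transitions that cannot be traced back to the domain."""
    states = list(states)
    names = {s.name for s in states}
    gaps = []
    for s in states:
        if s.owner not in domain_objects:
            gaps.append(f"state {s.name}: owner {s.owner!r} is not a domain object")
    for t in transitions:
        if t.source not in names or t.target not in names:
            gaps.append(f"transition {t.source}->{t.target}: unknown state")
        if t.trigger not in operations:
            gaps.append(f"transition {t.source}->{t.target}: "
                        f"trigger {t.trigger!r} is not a known operation")
    return gaps


# Hypothetical example: "waiting" is an abstract state with no owning object.
states = [State("cruise_on", "CruiseControl"),
          State("cruise_off", "CruiseControl"),
          State("waiting", None)]
transitions = [Transition("cruise_off", "cruise_on", "set_cruise_control"),
               Transition("cruise_on", "waiting", None)]
gaps = traceability_gaps(states, transitions,
                         domain_objects={"CruiseControl", "Engine"},
                         operations={"set_cruise_control", "turn_off_cruise_control"})
```

Because principle 2 asks for trigger traceability only "if possible", a reported trigger gap is better treated as a prompt for review than as an error.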
4. Query Structures in LTL
The unconstrained and direct use of linear temporal logic (LTL) has great expressive power but requires extensive training to develop the skills to understand and use LTL expressions effectively. The hard-coded patterns of Statemate target industry users. However, they lack the flexibility of a more direct implementation and are not very useful for queries associated with other tools such as SMV or Spin. This paper considers query patterns that attempt to balance expressiveness and comprehensibility so that practitioners and undergraduate students can readily comprehend and use them.
Canonical Form of Atomic Query
Let P, P1, and P2 be propositional logic expressions. An atomic temporal logic expression (ATL) has one of the following forms:
1. (NULL) P: current operator
2. X P: next operator
3. □P: henceforth operator
4. ◊P: future (first) operator
5. □◊P: iterative (infinitely often) operator
6. ◊□P: eventually invariant operator
7. P1 W P2: weak until operator
8. P1 U P2: strong until operator, where P1 U P2 = (P1 W P2) ∧ ◊P2.
The last two, W and U, are binary operators. The other six are unary operators. All are defined in [14]. An ATL uses only one of the 8 future temporal logic operators above.
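To make the eight forms concrete, the sketch below evaluates each of them on an ultimately periodic ("lasso") trace, i.e. a finite prefix followed by a loop that repeats forever. This is an illustration under stated assumptions, not how a model checker works: the `Lasso` class, the function names, and the engine example are hypothetical, and propositions are plain Python predicates over a state dictionary. Strong until is defined exactly by the identity given in item 8.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

State = Dict[str, str]
Prop = Callable[[State], bool]


@dataclass(frozen=True)
class Lasso:
    """Infinite trace: `prefix` once, then `loop` (non-empty) repeated forever."""
    prefix: Sequence[State]
    loop: Sequence[State]

    def at(self, i: int) -> State:
        p = len(self.prefix)
        return self.prefix[i] if i < p else self.loop[(i - p) % len(self.loop)]

    def future(self, i: int) -> range:
        # Positions i, i+1, ... until every distinct future state has been seen.
        return range(i, max(i, len(self.prefix)) + len(self.loop))


def now(t: Lasso, p: Prop, i: int = 0) -> bool:            # (NULL) P
    return p(t.at(i))

def next_(t: Lasso, p: Prop, i: int = 0) -> bool:          # X P
    return p(t.at(i + 1))

def henceforth(t: Lasso, p: Prop, i: int = 0) -> bool:     # henceforth P
    return all(p(t.at(j)) for j in t.future(i))

def eventually(t: Lasso, p: Prop, i: int = 0) -> bool:     # future (first) P
    return any(p(t.at(j)) for j in t.future(i))

def infinitely_often(t: Lasso, p: Prop) -> bool:           # iterative P
    return any(p(s) for s in t.loop)

def eventually_invariant(t: Lasso, p: Prop) -> bool:       # eventually invariant P
    return all(p(s) for s in t.loop)

def weak_until(t: Lasso, p1: Prop, p2: Prop, i: int = 0) -> bool:    # P1 W P2
    for j in t.future(i):
        if p2(t.at(j)):
            return True
        if not p1(t.at(j)):
            return False
    return True  # P1 holds forever and P2 never occurs

def strong_until(t: Lasso, p1: Prop, p2: Prop, i: int = 0) -> bool:  # P1 U P2
    return weak_until(t, p1, p2, i) and eventually(t, p2, i)


# Hypothetical trace: the engine is off, is turned on, and then stays on.
trace = Lasso(prefix=[{"engine": "off"}, {"engine": "on"}], loop=[{"engine": "on"}])
engine_on: Prop = lambda s: s["engine"] == "on"
engine_off: Prop = lambda s: s["engine"] == "off"
```

On this trace, `weak_until(trace, engine_on, engine_off, 1)` holds because the engine stays on forever, while `strong_until` with the same arguments does not, since the engine is never turned off again.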
Structure of Constrained Compound Queries
Our objective is to present a set of compound query patterns that are easy to comprehend but sufficiently rich to express a wide variety of properties stated in natural language. The kernel of a compound query is uniformly structured as (Premise → Conclusion) because we observe that most requirement properties can be described by conditional compound sentences in English. In addition, we borrow the concept of mode from Statemate and the concept of scope from [7]. Hence, a Constrained Compound Query (CCQ) has the form
Mode (Scope ∧ Premise → Conclusion)
Both the Premise and the Conclusion are ATLs involving either states or events. The Mode is one of the 6 unary temporal operators defined above. It defines the region and frequency in which the truth-values of the body (Scope ∧ Premise → Conclusion) are evaluated. We identify 6 forms for Mode: Initial, Next, Invariant, First, Iteratively, and Eventually-Invariant. Since Statemate only checks finite state models, the interpretations of these modes are slightly different from those of Statemate queries, even though four of them use the same mode names [13]. Scope defines when the query kernel is to be checked against the model. It is an ATL used to define the end of an initialization (the start phase) or the last state to check (terminal phase) in a single iteration (see the example below). We assume that the start phase is included in the Scope but the terminal condition state is not. [7] defines 5 scopes as either after, before, or between two events or two states. Scope is defined by a trigger α that results in a start condition Α becoming true immediately and persistently, or a trigger ω that results in a terminal condition Ω becoming true immediately, or both.
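The CCQ shape Mode (Scope ∧ Premise → Conclusion) can be captured as a small record that renders a formula string. The sketch below is illustrative only: the `CCQ` class and `to_ltl` method are hypothetical names, the mapping of the six mode names onto the six unary operators follows the order in which both lists are given in the text, and the sub-formulas are kept as opaque strings rather than parsed.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """The six modes, paired with the six unary operators (assumed pairing)."""
    INITIAL = ""                 # (NULL)
    NEXT = "X"
    INVARIANT = "□"              # henceforth
    FIRST = "◊"                  # future (first)
    ITERATIVELY = "□◊"           # infinitely often
    EVENTUALLY_INVARIANT = "◊□"


@dataclass(frozen=True)
class CCQ:
    mode: Mode
    conclusion: str    # an ATL, kept as text
    premise: str = ""  # an ATL; empty means NULL
    scope: str = ""    # an ATL; empty means the global scope

    def to_ltl(self) -> str:
        # Join whichever of Scope and Premise are present into the antecedent.
        antecedent = " ∧ ".join(part for part in (self.scope, self.premise) if part)
        body = f"({antecedent}) → ({self.conclusion})" if antecedent else self.conclusion
        return f"{self.mode.value}({body})"


# "S precedes P between α and ω", written with invariant mode and Scope α.
between = CCQ(Mode.INVARIANT, scope="α", premise="◊ω", conclusion="¬P U (S ∨ ω)")

# "S precedes P globally": NULL mode and NULL scope.
globally = CCQ(Mode.INITIAL, premise="◊P", conclusion="¬P U (S ∧ ¬P)")
```

Keeping Mode, Scope, Premise, and Conclusion as separate fields mirrors the way an informal query is decomposed before translation, so each field can be reviewed against the Statechart on its own.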
To be more specific regarding this Scope definition, let α be a Scope start trigger that results in the start condition Α becoming true. Once Α becomes true, it remains true until an associated terminal trigger occurs. ω is a Scope terminal trigger that results in the condition Ω becoming true, where Ω is always false until ω occurs. Ω is a system terminal condition expressed in terms of one or more state variables. If a Scope start event α is not specified, then Α is true globally. If a Scope terminal event ω is not specified, then Ω is false globally. We reserve the Greek characters α, ω, Α, and Ω to refer to triggers (generally associated with events) and conditions (associated with states) related to the scope definitions. Examples of Scope start triggers are Set Cruise Control and Turn on Engine Power; cruise = on and engine = on are then the start conditions. Examples of Scope terminal events are Turn Off Cruise Control and Turn off Engine Power, where cruise = off and engine = off are the terminal conditions. We call the states from cruise on to cruise off a single iteration. If the iterative mode is applied, the query kernel will be checked between these two states again and again. If we follow the third principle for Statechart modeling discussed in Section 3, under which Statecharts are organized hierarchically as systems and subsystems in accordance with OPM, then we can assume that the scope is either a system or a subsystem between initialization and termination, each of which typically occurs once in a single iteration. This reduces the complexity of the model and associated queries. To use CCQ, we recommend distinguishing the scope definition events from internal repeating events or state transitions. If an event occurs more than once in a single iteration, it should be included in the kernel query rather than used as a scope definition event. For example, suppose we want to check whether P occurs between S and T, neither of which is a Scope definition trigger. We can decompose the query into two queries. One is that P responds to S, and the other is that S precedes T. Then we check whether both queries are true. This solution follows the property decomposition principles addressed in Section 2.
We set a uniform structure for CCQ to reduce the complexity of the queries without limiting their expressiveness. Firstly, an informal query in English usually consists of two different tenses at most; otherwise, it can be difficult to comprehend. Secondly, the implication operator can be used to code all other operators such as NOT, AND, OR, XOR, etc. Thirdly, state machine models are effective in describing control behaviors of reactive or concurrent systems, which represent chains of states (or events) consisting of one or more precedent conditions and one or more effects. Consequently, the corresponding queries are sequences of premises and conclusions. Most importantly, as shown in Fig 2, we can always decompose a complicated property into informal queries using a constrained form. Therefore, the CCQ structure increases comprehensibility without significantly sacrificing expressiveness. Nevertheless, the idea of the CCQ structure is not mature. This is work in progress, and we expect to make modifications and extensions to the results presented here.
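The Scope convention used in this section (the start phase is included, the terminal condition state is excluded, and a missing trigger means a global condition) can be sketched over a recorded event sequence. This is a simplified illustration, not a model checking procedure: `scope_active` and the event names are hypothetical, and each step of the trace is assumed to carry exactly one event.

```python
from typing import List, Optional, Sequence


def scope_active(events: Sequence[str],
                 start: Optional[str] = None,
                 stop: Optional[str] = None) -> List[bool]:
    """For each step, report whether the query kernel is checked at that step.

    start: Scope start trigger (alpha); None means the start condition
           is true globally.
    stop:  Scope terminal trigger (omega); None means the terminal
           condition is false globally.
    """
    active = start is None
    flags = []
    for event in events:
        if stop is not None and event == stop:
            active = False   # the terminal condition state is excluded
        elif start is not None and event == start:
            active = True    # the start phase is included
        flags.append(active)
    return flags


# Two iterations of a hypothetical cruise-control run.
events = ["idle", "set_cruise_control", "accelerate",
          "turn_off_cruise_control", "idle", "set_cruise_control", "brake"]

between = scope_active(events, "set_cruise_control", "turn_off_cruise_control")
before = scope_active(events, stop="turn_off_cruise_control")
```

With both triggers given, each run of consecutive `True` flags corresponds to one iteration in the sense used above; with only a terminal trigger, the scope ends at the first occurrence of that trigger and does not reopen.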
5. Mapping Properties to LTL Queries
In [7], the authors classify commonly used queries into two primary categories: occurrence and order. Occurrence includes absence, universality, existence, and bounded existence. Order includes precedence, response, chained precedence, and chained response. Each pattern can have 5 different scopes: global, before, after, between, and after-until. The patterns listed above cover 92% of the queries found in the specifications studied in [7]. Many of the examples in [7] do not fit our CCQ structure unless we rewrite them. Several examples perhaps cannot be reduced to CCQ form directly. Our ongoing work involves either rewriting the patterns of [7] into CCQ form or modifying the corresponding Statechart so that an equivalent property on the modified Statechart can be defined in CCQ form. We choose 6 examples shown in [7] to demonstrate our approach. In the first five examples, our CCQ forms are the mathematical equivalents of the original queries in [7]. An easy way to show the equivalences is to prove the inclusion of the quantified regular expressions of the examples in [7] in those of our CCQ forms and vice versa.
1. S Precedes P Globally: ◊P → (¬P U (S ∧ ¬P)).
2. S Precedes P Before ω: ◊ω → (¬P U (S ∨ ω)).
The first example is in CCQ form since the Premise and Conclusion are clearly ATLs, the Mode is NULL, and the Start Phase includes the initial states (NULL, default). The second is also in CCQ form. Its Mode and Premise are both NULL, and the Scope and Conclusion are both ATLs.
3. S Precedes P after α: Orig. form: □¬α ∨ ◊(α ∧ (¬P U (S ∨ □¬P))). CCQ form: □((α ∧ ◊P) → (¬P U S)).
4. S Precedes P between α and ω: Orig. form: □((α ∧ ◊ω) → (¬P U (S ∨ ω))). This query is in CCQ form with invariant mode, where α is the Scope and both Premise and Conclusion are ATLs.
5. S Precedes P after α until ω: Orig. form: □(α → (¬P U ((S ∨ ω) ∨ □¬P))). CCQ form: □((α ∧ ◊P) → (¬P U (S ∨ ω))).
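The five precedence examples can be collected into a small lookup table that maps a scope keyword to a CCQ template, in the spirit of a pattern matching template. The sketch below is an assumption-laden illustration: the dictionary, the `instantiate` helper, and the `request`/`grant` state names are hypothetical, and the leading □ on the after, between, and after-until templates reflects a reading of examples 3 to 5 as invariant-mode queries.

```python
# Scope keyword -> CCQ template for "S precedes P".
# Placeholders: S, P (states or events), alpha and omega (scope triggers).
PRECEDENCE_CCQ = {
    "globally":    "◊{P} → (¬{P} U ({S} ∧ ¬{P}))",
    "before":      "◊{omega} → (¬{P} U ({S} ∨ {omega}))",
    "after":       "□(({alpha} ∧ ◊{P}) → (¬{P} U {S}))",
    "between":     "□(({alpha} ∧ ◊{omega}) → (¬{P} U ({S} ∨ {omega})))",
    "after-until": "□(({alpha} ∧ ◊{P}) → (¬{P} U ({S} ∨ {omega})))",
}


def instantiate(scope: str, **names: str) -> str:
    """Fill a precedence template with Statechart state or event names.

    Raises KeyError if the scope is unknown or a required placeholder
    (for example alpha for the "after" scope) is not supplied.
    """
    return PRECEDENCE_CCQ[scope].format(**names)


# Hypothetical names: a request must precede a grant while cruise is engaged.
query = instantiate("between", S="request", P="grant",
                    alpha="cruise_on", omega="cruise_off")
```

A response table could be built the same way once the response patterns have been rewritten into CCQ form.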
Writing occurrence patterns in CCQ form is straightforward, since most of them can be expressed by a single ATL. The subtle difference between precedence and response as defined in [7] lies in their emphases. Precedence emphasizes that if P occurs, S must occur before P within the scope (typically safety concerns). Response emphasizes that if S occurs, P must follow within the scope (typically guarantee concerns). It is easy to see that "P responds to S" patterns can be written in similar CCQ forms. We add the After N steps start phase because some models address real-time issues and some model checking tools (e.g. Statemate) are bounded by finite states. Some model checking tools, such as interval model checking tools and Statemate, allow the user to configure the first state at which the check is initiated. In the case of a model checker that does not support a start phase feature, we can include a concurrent state machine model as an initialization observer that counts steps.
6. As an example, in [7] the bounded existence property "P occurs at most twice between α and ω" is specified by 7 temporal operators and over a dozen parentheses and logic operators. The expression becomes more complicated if P may occur at most three times or 10 times between α and ω. We can simplify the expression of this property by adding an occurrence observer as a concurrent state machine. The observer state machine includes a counter parameter C that is incremented whenever P occurs after α occurs. C is reset to 0 when the terminal trigger ω occurs. To check the bounded existence property that P occurs at most N times between α and ω, we only need to check the following CCQ: ((α ∧◊ω) → ( C