An Automated Approach for Supporting Software Reuse via Reverse Engineering Gerald C. Gannody, Yonghao Chen, and Betty H. C. Chengz Department of Computer Science Michigan State University 3115 Engineering Building East Lansing, MI 48824-1226 E-mail: fgannod,chenyong,
[email protected] Abstract Software reuse is a process of constructing a software system using existing software components. Formal approaches to software reuse rely heavily upon specification matching criterion, where a search query using formal specifications is used to search a library to identify components that can be used for reuse purposes. In previous investigations, we addressed the use of formal methods and component libraries to support software reuse and construction of software based on architectural specifications. One of the primary difficulties of using a formal approach for software reuse that makes use of formally specified components is that it is not clear how the formal specification indices are created. Software reverse engineering is a process of examining system components and component interrelationships in order to derive a description of the system at a higher level of abstraction. We have developed an approach to reverse engineering that is based on the use of formal methods to derive formal specifications of existing programs. In this paper, we present an approach for combining software reverse engineering and software reuse to support populating specification libraries for the purposes of software reuse. In addition, we discuss the results of our initial investigations into the use of tools to support an entire process of populating and using a specification library to construct a software application. This work was supported in part by the NASA Graduate Student Researcher Fellowship NGT-70376, DARPA grant F30602-96-1-0298 (administered by Air Force’s Rome Laboratory), and the National Science Foundation grants CCR-9633391, CCR-9407318, CCR-9209873, and CDA-9312389. y This author was supported in part by a NASA Graduate Student Researchers Program Fellowship. A portion of this research was performed while this author was at the NASA Jet Propulsion Laboratory. z Please address all correspondences to this author.
1 Introduction Historically, the use of software components-off-theshelf (COTS) has been limited to complete applications. The introduction of object-oriented programming, design patterns, and other new development techniques have focused on the creation and reuse of finer grained units such as software COTS, but the wide-scale use of such components in the same manner as hardware integrated components has been limited. As a development technique, software reuse is a process of constructing a software system using existing software components. Formal approaches to software reuse rely heavily upon specification matching criterion, where a search query using formal specifications is used to search a library to identify components that can be used for reuse purposes. In previous investigations, we addressed the use of formal methods and component libraries to support software reuse [?, ?], and construction of software based on architectural specifications [?, ?]. One of the primary difficulties of using a formal approach for software reuse that makes use of formally specified components is that it is not clear how the formal specification indices are created. Software reverse engineering is a process of examining system components and component interrelationships in order to derive a description of the system at a higher level of abstraction [?]. We have developed an approach to reverse engineering that is based on the use of formal methods to derive formal specifications of existing programs [?, ?, ?]. In this paper, we present an approach for combining software reverse engineering and software reuse to support populating specification libraries for the purposes of software reuse. In addition, we discuss the results of our preliminary investigations into the use of tools to support an entire process of populating and using a specification library to construct a software application. The remainder of this paper is organized as follows. Section 2 presents background information in the areas of soft-
ware maintenance, formal methods, and software reuse. A framework for combining the use of software reverse engineering and software reuse is discussed in Section 3. Sections 4 and 5 describe the reverse engineering and reuse engineering aspects of the framework, respectively. An example that illustrates use of the entire framework is described in Section 6. Related work is presented in Section 7. Finally, Section 8 draws conclusions and suggests future investigations.
2 Background This section gives background information in the areas of software maintenance, software reuse and formal methods for software development.
methodologies that support the life-cycle (i.e., Structured Analysis and Structured Design [?]) make use of informal techniques, thus increasing the potential for introducing ambiguity, inconsistency, and incompleteness in designs and implementations. In contrast, formal methods used in software development are rigorous techniques for specifying, developing, and verifying computer software [?]. A formal method consists of a well-defined specification language with a set of well-defined inference rules that can be used to reason about a specification [?]. A benefit of formal methods is that their notations are well-defined and thus, are amenable to automated processing [?].
2.3 Strongest Postcondition
In previous investigations, we described the use of strongest postcondition as the formal basis for reverse engineering [?]. In addition, we have applied strongest postFigure 1 contains a graphical depiction of a process condition to the definition of the semantics of the C promodel for reverse engineering and reengineering [?]. The gramming language in order to analyze software used by process model is captured by two sectioned triangles, mission control technicians for NASA’s deep space prowhere each section in a triangle represents a different level gram [?, ?]. Strongest postcondition, denoted sp (S ; Q ) of abstraction. The higher levels in the model are concepts is defined as follows: given that Q holds, execution of and requirements. The lower levels include designs and imS results in sp (S ; Q ) true, if S terminates [?]. As such, plementations. Entry into this reengineering process model sp(S; Q) assumes partial correctness. The weakest preconbegins with system A, where Abstraction (or reverse engidition predicate transformer wp(S; R) is defined as the set neering) is performed to a level of detail appropriate to the of all states in which the statement S can begin execution task being performed. For instance, if a system is to be and terminate with postcondition R true. Given a Hoare reengineered in response to efficiency constraints, then abtriple Q f S g R, we note that wp is a “backward” rule, in straction to the design level may be appropriate. The next that a derivation of a specification begins with postcondistep is Alteration, where the system is configured into a tion R, and produces a predicate wp(S; R). The predicate new form at a different level of abstraction. Finally, Retransformer wp assumes a total correctness model of comfinement of the new form into an implementation can be putation, meaning that given S and R, if the computation performed to create system B. of S begins in state wp(S; R), then the program S will halt with condition R true. We contrast this model with the sp model, a “forward” derivation rule. That is, given a preconAlteration dition Q and a program S , sp derives a predicate sp(S; Q). The motivation for using sp instead of wp is that sp derives a postcondition, which is typically the objective of reverse ‘‘Forward Engineering’’ Concept ‘‘Reverse Engineering’’ engineering. That is, the objective of reverse engineering Refinement Abstraction is to determine the purpose of a given code segment. In Requirements contrast, wp requires a postcondition in order to derive a Design precondition.
2.1
Software Maintenance
2.4 Software Reuse
Implementation System A
System B
Figure 1: Reverse Engineering Process Model
2.2
Formal Methods
Although the waterfall development life-cycle provides a structured process for developing software, the design
Software reuse is the use of existing software components to construct new systems [?, ?]. At a high-level, software reuse consists of two types of engineering activities: one is component engineering, and the other is application engineering. Component engineering deals with the specification, creation, classification, storage, and retrieval of software components. Components can be extracted and/or reengineered from existing systems, or developed from scratch. Application engineering focuses on the assembly
of reusable components to construct problem-oriented solutions. Application engineering involves the specification of component composition, component adaptation, and integration. As a means to address tasks in component engineering activities, Jeng and Cheng developed a formal approach for specifying and classifying components in a reusable library [?, ?]. They also introduced the use of a generality operator as the formal basis for identifying reusable components via specification matching [?]. Zaremski and Wing [?] describe additional operators for matching queries to components for software reuse. In addition, Penix and Alexander [?] define the satisfies criterion for component matching. Figure 2 shows several of the matching operators and their relations. 1 When these operators are used for software reuse, A is a query specification and R is a library specification.
the reused component in both plug-in post and weak post matches.
3 A Software Reverse Engineering and Reuse Framework
Many software reuse approaches depend on the assumption that a library of reusable components is available for use. There are two techniques for populating these libraries with reusable code. The first technique is to construct components with the intention of placing them into code repositories. Examples of these repositories are the Microsoft Foundation Classes [?] as well as libraries for standard problem domains such as mathematics, graphics, and networking. The second technique for populating code repositories is by identifying existing code as potential candidates for reuse, and then packaging that code into a library. In either case, a primary concern is the mechanisms used for indexing, identifying, and retrieving the components from the libraries [?, ?, ?, ?]. Exact Pre/Post (Apre Rpre) /\ (Apost Rpost) This paper describes how a formal approach to reverse engineering can be integrated with a formal approach to software reuse in order to support after-the-fact construction and use of reusable code libraries. Figure 3 gives Plug-in an overview of the reverse engineering and software reuse (Apre => Rpre) /\ (Rpost => Apost) framework in the form of a data-flow diagram. The diagram shows two distinct components within the framework: the Reverse Engineering component, indicated by Weak Plug-in Plug-inthe Post process circle labeled “RevEgr Suite”, and the Reuse Rpost => Apost (Apre => Rpre) /\ ((Rpre /\ Rpost) => Apost) component, indicated by the process circle labeled “Reuse Suite”. Two integrating factors within this framework provide the linkage between the two components; the User Satisfies Weak Post and the Specification Library. The user is required to diRpre => (Rpost => Apost) rect the reverse engineering and the reuse processes, and (Apre => Rpre) /\ ((Apre /\ Rpost) => Apost) the specification library is the common medium and repository between the RevEgr Suite and the Reuse Suite. Figure 2: A partial-order of matching operators [?]
In Figure 2, an arrow represents a logical implication between two matching operators. When used to select reusable components, all the matching operators along the leftmost branch in the figure ensure that a selected component can be reused for implementing a query specification without semantic adaptation. 2 This semantic requirement is relaxed in the cases of “plug-in post” and “weak post” matches, where logical implications are not required between the preconditions of the two matched functions. Adaptation may be needed to establish the precondition of 1 In this paper, we assume that the signatures of a query specifica-
tion and a library specification match. For details on signature matching, see [?]. 2 Syntax mismatches may still exist, thus syntactic adaptation may be needed.
User Assertions, Decisions
Code
Code
Problem Definitions, Decisions
RevEgr Suite
Reuse Suite
Module Specifications
Packaged Components
Module Specifications Specification Library
Figure 3: The Reverse Engineering and Reuse Framework
Sys
Within this framework, a user can analyze source code and construct formal specifications that can be used to index the source for reuse purposes. This reverse engineering and population activity facilitates the reuse part of the framework, namely the search, identification, and packaging of components into new systems. Sections 4 and 5 discuss the specific techniques and tools that are used to support the reverse engineering and reuse aspects of the framework, respectively.
4.2 As-Built Specifications As stated in Section 2, the strongest postcondition (denoted sp) is defined as the strongest condition R that is true after the execution of a program S , when starting with condition Q true. Table 1 summarizes the strongest postcondition semantics of the Dijkstra guarded command language [?], where IF represents the n alternative conditional statement if B1
4 Reverse Engineering
! S1 ;
:::
In this section we describe an approach for reverse engineering that is applicable to the implementation and design levels of abstraction. Given the reengineering framework in Figure 4, the context of our reverse engineering investigations are indicated by the dashed arrow.
jj Bn ! Sn ; fi;
Bi ! Si represents a guarded command such that Si is only executed if logical expression (guard) Bi is true. DO represents the loop statement “do B ! S od” where S is executed iteratively until guard B is false.
Alteration
‘‘Reverse Engineering’’
Concept
sp (x ‘‘Forward Engineering’’ Refinement
Abstraction
Construct := e; Q ) sp (IF; Q ) sp (DO; Q )
sp (S1 ; S2 Q )
Requirements
;
sp Semantics
(9 :: xv ^ = xv ) sp (S1 B1 ^ Q ) _ _ sp (Sn Bn ^ Q ) : ^ (9 : 0 : sp (IFi Q )) sp (S2 sp (S1 Q )) v
Q
x
e
;
B
:::
i
;
i
;
;
;
Design Implementation System A
System B
Figure 4: Reverse Engineering Context
4.1
Method
In previous work, we investigated two phases of software reverse engineering and design recovery. The first phase involves the construction of low-level, as-built specifications from program code [?]. The specifications in this phase are considered to be as-built since they describe a system as it was implemented rather than how the system was designed. A defining characteristic of as-built specifications is that they contain a significant amount of implementation bias. The second phase of our investigations into reverse engineering involves the derivation of highlevel abstractions from as-built specifications [?]. By constructing high-level abstractions, several activities can be facilitated including high-level reasoning, program understanding, and software reuse. The combination of these two investigations provides the basis for an overall process for reverse engineering that involves two steps: 1) the construction of as-built specifications from program code, and 2) the derivation of highlevel specifications from the as-built specifications.
Table 1: Strongest Postcondition Semantics for the Dijkstra Guarded Command Language In the table, the semantics for sp (x := e; Q ) states that after the execution of “x := e” there exists some value v such that every free occurrence of x in Q is replaced with v and x = exv . The semantics for sp (IF; Q ) states that after execution of the if-fi statement, at least one of sp(Si ; Bi ^ Q) is true. In the case of iteration, denoted sp (DO; Q ), the semantics are that after execution of the loop, the loop guard is false (:B ), and a disjunctive expression describing the effects of iterating the loop some number of times (approximated by the notation IFk ) is true, where k 0. Finally, for sequences, sp (S1 ; S2 ; Q ) means that the postcondition for statement S1 is the precondition for some subsequent statement S2 .
4.3 Deriving Abstractions Section 2 describes several specification matching criteria that have been used for software component reuse purposes. In the context of specification matching, we have defined the concept of an abstraction match as follows [?]: Definition 1 (Abstraction Match) Let I be a program with specification i such that the corresponding precondition and postcondition are
ipre and ipost , respectively, and let l be an axiomatic specification with precondition lpre and postcondition lpost . A match is an abstraction match if i l, so that (lpre ! ipre ) ^ (ipost ! lpost ):
tions of as-built specifications. The remainder of this section presents delete a conjunct strategy and conditions for its application. For further discussion about the remaining strategies, please refer to [?].
One of the interesting properties of some of the matching operators in Figure 2 is that they are partial-order relations [?]. As such, high-level abstractions of axiomatic specifications can be derived as long as the matching criterion are preserved. That is, given an axiomatic specification I that consists of precondition Ipre and postcondition Ipost , we would like to identify an axiomatic specification A such that I A. We can identify such a specification by modifying I so that we have a specification I 0 that satisfies the relationship that I I 0 . If, for instance, is the abstraction match operator, then by either strengthening the precondition Ipre , weakening the postcondition Ipost , or both, we produce a specification I 0 that satisfies the property that I I 0 . In order to address the computational complexities of deriving abstractions based on matching, we have developed a number of guidelines that focus on weakening the postcondition Ipost and strengthening the precondition Ipost [?]. One technique for preserving an abstraction match relationship is by weakening the postcondition of a specification. Let I be a specification with precondition Ipre and postcondition Ipost and let I 0 be a specification such that 0 $I 0 0 Ipre pre and Ipost ! Ipost . As such, I I , since 0 )) I $ Ipre ) ^ (Ipost ! Ipost ) 0 !I 0 ((Ipre pre ) ^ (Ipost ! Ipost )): 0 (( pre
(1)
Expression (1) provides a basis for deriving abstractions from a specification by weakening a postcondition Ipost 0 to produce a postcondition Ipost . Several options are available for weakening the postcondition including those listed in Table 2, which includes delete a conjunct, add a disjunct, ^ to _ transformation, and ^ to ! transformation. Operation Delete a conjunct Add a disjunct ^ to ! ^ to _
Ipost A^B^C A^B A^B A^B
0 Ipost A^C (A ^ B ) _ C A!B A_B
Table 2: Weakening the postcondition In Section 6.1 we describe an example that applies the use of the delete a conjunct strategy for deriving abstrac-
Delete a conjunct. Given a specification in conjunctive form (not necessarily a normal form), deletion of a conjunct weakens a specification by removing additional or constraining conditions. Below are guidelines that can be used to identify the appropriate conjunct for deletion. Local Scope: If a conjunct specifies behavior that is local to a procedure and has no impact on the output variables of the system, then that conjunct is a candidate for deletion. Examples include specifications of the value of a loop index or temporary variables. Independence: If a conjunct specifies some behavior that is logically independent of the remaining conjuncts, then that conjunct is a candidate for deletion. As an example, consider the expression (x = c) ^ (c = y) ^ (z = n). The conjunct (z = n) is independent of the conjuncts (x = c) and (c = y ). Preservation: If a conjunct captures some behavior that must be expressed in the higher level specification, the remaining conjuncts are candidates for deletion. These guidelines are by no means comprehensive. Ultimately, a maintenance engineer using this approach must decide whether to delete a specific conjunct in a specification
4.4 Tool Support In order to support the techniques described in Sections 4.2 and 4.3, we have been developing a suite of prototype analysis tools that facilitate both phases of the reverse engineering process. The first tool, called A UTO S PEC [?], is used by the maintenance programmer to construct an asbuilt formal specification from program code. A second tool, called S PEC G EN [?], is used by the maintenance programmer to derive formal high-level abstractions from the as-built specifications that are constructed by AUTO S PEC. Figure 5 shows a data-flow diagram depicting the context of the tools used to support the two phase reverse engineering technique. The process circles labeled “AutoSpec” and “SpecGen” correspond to the AUTO S PEC and S PEC G EN support tools, respectively. The AUTO S PEC tool takes source code and user-defined assertions as input, and with user guidance, produces as-built specifica-
tions that can be used as input to the S PEC G EN tool. S PEC G EN uses the abstraction matching technique to then derive high-level module specifications based on user decisions to populate a library of specifications.
User
ABRIE
User
Problem Descriptions
Architecture Component Packaged Specification Select/ Info Components Arch Package Design Match As-built Module Specifications Specifications Specification AutoSpec SpecGen Library Component Specification Component Implementation
Assertions
Code
Code
Decisions Decisions
Decisions
Figure 5: Reverse Engineering Component Specification Library
The convention used in this paper for the as-built specifications generated by A UTOSPEC is given in Figure 6. The format is based on the Larch interface language [?] syntax. In this convention, domainsort and rangesort are the input and output types of a given function, respectively. The locals keyword lists the variables defined within the scope of the specification, if applicable. The requires keyword is used to indicate the precondition of the given function. The ensures keyword describes the postcondition of a given function. Finally, the modifies keyword lists the variables that are modified by the function. spec name ( (var: domainsort) ) ?! var: rangesort locals (var: domainsort) requires precondition modifies variables ensures postcondition Figure 6: Syntax of Library Specifications
5 Software Reuse Although a large number of techniques have been developed to address various reuse issues, the lack of seamless integration of these techniques imposes significant obstacles to achieving effective reuse. Previously, we developed an architecture-based framework, called ABRIE (Architecture Based Reuse and Integration Environment) for reuse-based software development [?]. Within this framework, reuse issues are addressed in an integrated and automated fashion. Figure 7 depicts the ABRIE framework in the form of a data flow diagram, where the dashed line encapsulates the ABRIE system. As shown in Figure 7, given the problem definition, software development starts with architecture design. By specifying the constituent components and their interre-
Figure 7: Software Reuse Framework lationships for a target system, the software architecture specification serves as a basis for evaluating and integrating existing components. The evaluation and selection of existing components are based on both architectural characteristics and behavioral specifications. Mismatches may be identified in the component evaluation and matching process. Based on the selected components and matching information, the packaging process generates a system construction file that describes how the target system can be constructed. Identified mismatches may also be resolved in the packaging process by generating appropriate wrappers for the reused components.
5.1 Software Architecture Specification Software architectures describe the overall organization of a software system in terms of its constituent elements, including computational units and their interrelationships [?, ?, ?]. A number of architectural description languages (ADLs) have been proposed to formally specify and analyze software architectures [?, ?, ?]. In general, an architecture is specified as a configuration of components and connectors. A component is an encapsulation of a computational unit. A component has an interface that specifies the capabilities that the component can provide and the ways that a component delivers its capabilities. The implementation of a component should conform to its interface, meaning that it implements the capabilities and delivers them in the way specified in the interface. The interface of a component is specified by the following properties: Type: type of the component. fPortg+: one or more ports supported by the component type
Sys
Constraints: constraints imposed on the ports of the component Component types are intended to capture architectural properties. For example, the type filter specifies that a component must be used in a pipe-and-filter system, whereas a component of type module should be used in a main program/subroutine system. Ports are the interaction points through which a component exchanges resources with its environment. Port specifications specify the signatures and optionally the behaviors of the resource. Common types of resources include procedure, data store, data stream, abstract data type, and events. A component may either provide or require a resource through a port. The type and direction of the resource associated with a port is described in the port type. For example, a port of type ProcDef defines and exports a procedure, whereas a port of type ProcInvoc requires a procedure from its environment. Logic-based formal specifications may be attached to a port to precisely capture the behavioral properties of a port. For example, for procedure-related ports such as those of type ProcDef, and ProcInvoc, we have used the Larch specification language [?] to specify their functionality. Formal specifications enable a semantic-based approach to determining the reusability through specification matching. By incorporating existing formal specification languages, we can take advantage of the available tools, including theorem provers. Connectors encapsulate the ways that components interact. A connector is specified by the following properties: Type: the type of the connector. fRoleg+: roles defined by the connector type. Constraints: constraints imposed on the roles of the connector. A connector defines a set of roles for the participants of the interaction specified by the connector. For example, a CallProc connector has two roles: a Definer and a Caller. Connector types are intended to capture recurring component interaction styles. Typical component interaction styles and the corresponding connector types include: procedure call (described by type CallProc), data access (AccessData), pipeline (Pipe), and remote procedure call (RPC). Components are connected by configuring their ports to the roles of connectors. A port cannot be configured to arbitrary roles. Each role has a domain that defines a set of port types. Only the ports whose types are in the domain can be configured to the role. Furthermore, the ports configured to the roles of a connector should satisfy the constraints of the connector. For example, in a CallProc connector, the port configured to Definer should define and export a procedure (function) that can fulfill the functionality required by the port configured to Caller. These obligations provide a way to check the correctness of architecture configurations.
5.2 Component Selection and Matching By specifying constituent components and connectors, the architecture specification serves as a basis for assembling a target system. The first step in the system assembly process is to find an implementation for each interface in the architecture specification of the target system. While the implementation may be constructed from scratch, we emphasize the reuse of existing components for fulfilling the interface. Given an interface specification, the objective of component selection is to locate the most appropriate components that may satisfy the interface. A basic criterion for component selection is that the type of a selected component should be the same as that specified in the interface. More fine-grained approaches may consider the number of ports, port types, and port signatures, or even port specifications and constraints. The rule is that the selected component should be suitable for fulfilling the interface. Given candidate components for implementing a (query) interface, the matching process aims to establish a mapping between the ports of a candidate component and the ports specified in the interface. Only if such a mapping is established can the reusability of the existing component for implementing the interface be validated. Furthermore, the matching process identifies mismatches (such as naming conflicts) between the existing component and the interface. The criteria for mapping a port Q to a port P is : (1) the capabilities delivered through the two ports are matched, and (2) the capabilities are delivered in the same way. Condition (2) means that the types of the two ports are the same. For the first condition, the simplest case is when the capabilities are exactly the same. However, relaxed matches should be considered. The concrete forms of these relaxed matches depend on the port types and port specifications. The specification matching criteria in Section 2.4 provides a semantic-based approach to determine the reusability of procedure-related ports.
5.3 Packaging After all the interfaces in an architecture specification have been matched with an implementation, the packaging process may be invoked to generate a package for the target system. System packaging consists of three steps: First, the packaging process checks if the implementation of an interface has mismatches with the interface. In the case that mismatches exist, software wrappers may be generated for adapting the reused components. The adaptation is based on the port mappings generated by the component matching process. Then the packaging process generates implementation for connectors. The implementation of connectors depends upon their types and role configurations. In addition to establishing connections, the implementation of a connector may need to resolve mismatches between
the ports that are configured to the connector. A typical mismatch is a naming conflict, which can be resolved with automatically generated code. Finally, construction files (such as makefiles) are generated to describe the construction process for the executables of the target system.
5.4
Tool Support
We have developed an integrated environment to support the whole reuse process. ABRIE is an experimental system designed to explore the use of software architectures as a framework for assembling software components. The design objective is to provide an integrated environment to address various reuse issues: specification of component composition, component management, and component integration. Three characteristics are specifically emphasized in the ABRIE design: visualization, multilevel abstractions, and automation. In addition to the textual form, visual representation and manipulation of components, connections, and architectures are supported. ABRIE uses three levels of abstraction in determining the reusability of existing components: types, signatures, and behavioral specifications. Automation is one main potential benefit of architecture-based reuse. In ABRIE, the component integration (packaging) process is fully automated. In component retrieval, evaluation, and matching, automated support is also provided. ABRIE consists of three functional components that respectively implement the three processes shown in Figure 7: architecture design, component management, and system packaging.
typedef int QDATA;
int enQueue int tail int head
#define MAXSIZE 100 struct queue { int head; int tail; QDATA data[MAXSIZE]; };
if ((q-> { print retur } else { q->da q->ta retur }
typedef struct queue Queue; /* Operations */ int is_empty(const Queue); QDATA head(const Queue); QDATA dequeue(Queue *); int enQueue(Queue *, QDATA *); Queue *new_queue(); void printQ(Queue);
}
QDATA deque
Queue *newQ; newQ = (Queue *)malloc(sizeof(Queue)); newQ->head = 0; newQ->tail = 0; return newQ; } } int is_empty(const Queue q){
QDATA head( return (q.head == q.tail);
return q } }
6 Example In this section we discuss an example that illustrates the use of the integrated reverse engineering and reuse framework. First, we populate a library, using reverse engineering techniques, with component specifications. Then we demonstrate the process of specifying an application and searching the specification and component library for suitable reusable code. Finally, we show the process of packaging components into the final application.
6.1
Figure 8: Queue Source Code
tail
. . .
Populating the Library
Figure 8 shows the source code for an array implementation for a queue abstract data type. The source code, written using the C programming language, implements a circular queue so that the head and tail of the queue can “wrap” around the lowest and highest indices of the array, as shown in Figure 9. The queue data structure consists of three parts: an index to the front, or head of the queue, an index to the end, or tail of the queue, and an array that is used to store the elements of the queue. The queue source code contains several functions that correspond to the operations typically associated with queues including enqueue, dequeue,
int temp if (!is_ temp q->he retur } else { retur }
Queue *new_queue(){
n
.
0 1
9 8 7 6
2 3 4 5
queue elements head
Figure 9: Circular Queue Diagram new queue, head, and is empty. The abstract behavior of these operations is as expected, where enqueue adds a new element to the end of the queue, dequeue removes the element at the front of the queue, new queue creates a new queue, head returns the element at the front
of the queue, and is empty checks to see if the queue contains any elements. As described in Section 4.1, construction of a specification that is suitable for populating a component specification library involves two primary steps: 1) construction of an as-built specification, and 2) derivation of a high-level abstraction. Using the sp semantics of the C programming language [?], the complete as-built specifications of the source code in Figure 8 can be constructed as shown in Appendix A (Figure 20), where the symbols ‘&&’ and ‘||’ are used to indicate the logical connectives and (^) and or (_), respectively. In addition, the symbol ‘.>’ is the points-to operator [?] for reasoning about pointers and pointer operations. For the purposes of illustration, the remainder of this section will focus on the analysis of the enQueue procedure. Figure 10 shows the as-built specification of the enqueue procedure as derived by the A UTOSPEC tool. Informally, the as-built specification of the enQueue procedure states that prior to the execution of the procedure, the pointer e points to an object param4, the value of param4 is pVal5, and the pointer q points to an object param3 such that the tail component of param3 has the value pVal4. In addition, the specification states that after execution of the procedure, one of two conditions is true. The first condition describes the behavior when the queue is full in which case the difference between the values param3.tail.V and param3.head.V is equal to the maximum size of the queue. Here, the return value of the procedure is 0, indicating a failure. The second condition describes the behavior when there is room to place an item on the queue. In this case, the return value of the procedure is 1, the index to the tail is incremented, and the data element is added to the queue data array. While the specification in Figure 10 is accurate with respect to the original source program, the level of detail can inhibit high-level reasoning. As described in Section 4.3, one technique that can be used to derive an abstraction of an as-built specification is to weaken the postcondition by using the delete a conjunct strategy. For the enQueue example, we must first put the ensures clause into a conjunctive form, as shown in Figure 11. In this form, the enQueue specification has four conjuncts that specify that (a) e points to the object param4, (b) param4.V has value pVal5, (c) q points to param3, and (d) the disjunctive statement:
spec int enQueue(Queue *q, QDATA *e) requires (((e .> _param4) && (_param4.V == _pVal5)) && ((q .> _param3) && (_param3.tail.V == _pVal4))) modifies q (_param3) ensures ((((((e .> _param4) && (_param4.V == _pVal5)) && ((q .> _param3) && (_param3.tail.V == _pVal4))) && ((_param3.tail.V - _param3.head.V) == MAXSIZE)) && (return.V = 0)) || (((((((e .> _param4) && (_param4.V == _pVal5)) && ((q .> _param3) && (_param3.tail.V == _pVal4))) && (!((_pVal4 - _param3.head.V) == MAXSIZE))) && (_param3.data[(_pVal4 % MAXSIZE)].V = _param4.V)) && (_param3.tail.V = (_pVal4 + 1))) && (return.V = 1)))
Figure 10: Output generated by A UTOSPEC for the enQueue procedure ((((_param3.tail.V = _pVal4) & ((_param3.tail.V - _param3.head.V) = MAXSIZE)) & (return.V = 0)) || (((((as_const9 = _pVal4) & (!((_pVal4 - _param3.head.V) = MAXSIZE))) & (_param3.data[as_const9 % MAXSIZE].V = _param4.V)) & (_param3.tail.V = (as_const9 + 1))) & (return.V = 1))).
Figure 12(a) shows the results generated by the S PEC G EN tool when applied to the specification in Figure 11. One of the operations that can be performed by the tool is to generate all the possible abstractions of a specification based on preserving one or more of the conjuncts in the specification. The representation of the specifications that are generated by preserving conjuncts is called a focus graph. In our example, we are interested in preserving
(e .> _param4) && (_param4.V = _pVal5) && (q .> _param3) ((((_param3.tail.V = _pVal4) & ((_param3.tail.V - _param3.head.V) = MAXSIZE)) & (return.V = 0)) || (((((as_const9 = _pVal4) & (!((_pVal4 - _param3.head.V) = MAXSIZE))) & (_param3.data[as_const9 % MAXSIZE].V = _param4.V)) & (_param3.tail.V = (as_const9 + 1))) & (return.V = 1)))
Figure 11: The enQueue ensures clause in a conjunctive form
the conjuncts (b) and (d) and deleting conjuncts (a) and (c) due to the independence property stated in Section 4.3. Figure 12(b) shows the focus graph for the example, where the vertex labeled “abcd” indicates that conjuncts (a), (b), (c), and (d) are conjuncted. This vertex corresponds to the original specification in Figure 11. The remaining vertices in the graph represent the possible abstractions that are formed by deleting conjunct (a), (c), or both. Using the information provided by S PEC G EN, several transformations of the specification in Figure 11 can be performed that simplify and introduce abstraction into the postcondition. First, based on the focus graph in Figure 12(b), conjuncts (a) and (c) are deleted due to the independence criteria. After the deletion of conjuncts (a) and (c), we can perform a textual substitution of all references to the param3 identifier with a more descriptive symbol, like Q. Finally, given that the term as const9 has the value pVal4 and in the precondition for the specification param3.tail.V == pVal4, we can replace as const9 with the term Q.tailˆ, which represents the pre-value for the tail component of the queue (i.e., the value of the tail component before execution of the procedure). Given these transformations, the abstraction for the enQueue procedure can be derived as shown in Figure 13. Informally, the specification states that after execution of the procedure, the return value is set to 0 when the difference between the head and tail of the queue is the maximum array size. When the difference between the head and tail is not the maximum array size, then the element E is added to the array at the tail index, the tail index is incremented, and the return value is set to 1. A process similar to the one used for enQueue can be applied to derive abstractions for the remainder of the queue as-built specifications. For the purposes of combining the reverse engineering suite with the reuse suite, the resulting specifications must be translated into the syntax for the ABRIE system. The reason for the differences between the syntax of the AUTOSPEC and ABRIE specification languages is historical. Our initial investigations for deriving specifications for programs focused on the analysis of the Dijkstra guarded command language [?]. As such, a general Larch Interface Language variant [?] was developed. The expansion of the AUTOSPEC tool to support the C programming language has since prompted a need to modify the tools to generate Larch C (LCL) specifications, an activity that we are currently performing. In contrast, ABRIE was not developed for a specific programming language, but was intended to be tailorable to a given language. As such, we used a generic procedureoriented sytnax for the Larch Interface Language. However, since the output formats for the A UTOSPEC and S PEC G EN systems and the input format for the ABRIE system are all based on the Larch interface language (all
contain header information and the requires, modifies and ensures clauses), the actual step for preparing the procedure specifications generated by A UTOSPEC and S PEC G EN to the library format for the ABRIE system is straightforward and can be facilitated with automated tools. Appendix B contains the module specification in the ABRIE syntax that was constructed for the example described in this section.
6.2 Specifying an Application In the following discussion, we describe how a solution to the Josephus problem [?] can be specified and assembled from reusable components in ABRIE. In particular, we show how the formal specifications generated by the reverse engineering process can be used to semantically determine the reusability. The Josephus game can be described as follows: N people, numbered 1 to N, are sitting in a circle. Starting at person 1, a hot potato is passed. After M passes, the person holding the hot potato is eliminated, the circle closes ranks, and the game continues with the person who was sitting after the eliminated person picking up the hot potato. The last remaining persons wins. Given N and M, the Josephus problem is to determine who will win. Figure 14 shows the structure of a solution to the Josephus problem that uses a queue to represent people sitting in a circle. The solution is specified in ABRIE. Component master simulates the game, and calls queue operations provided by component queue. The two components are connected through three connectors of procedure calls. As shown in the “Component Property” window of Figure 14, component queue has three ports, each of which defines and provides a queue operation. Figure 15 shows the textual specification of the architecture. As shown in Figure 15, component master is implemented using a C source file jp main.c. Component queue needs to be implemented and will be the focus of the reuse activities. The required behaviors of its ports have been specified. In the next subsection, we discuss how a library component can be selected based on these behavioral specifications to implement the queue interface.
6.3 Component Reuse ABRIE incorporates a library manager for organizing and managing existing components. Components are classified and retrieved based on their interfaces (i.e., types and ports). When implementing an abstract component (interface) in an architecture, a single click on the reuse button in the “Component Property” window (see Figure 14) triggers ABRIE to search for the current library (which is loaded through the library manager). All components of the same type as the query interface will be presented to the user. Based on their specifications, the user selects one candi-
date for further evaluation. Figure 16 shows the scenario of matching the library component circqueue for satisfying interface queue. In order to determine the reusability of circqueue, we need to establish a mapping from the ports of the target component queue to those of circqueue so that each operation specified in queue can be implemented by a corresponding operation in circqueue. As shown in Figure 16, we conjecture that qDelete can be matched (implemented) by dequeue, qInsert by enQueue, and qCreate by new queue. Given a match, specification-based proof obligations may be generated to validate the matching. Figure 17 depicts the proof obligation generated by ABRIE for matching dequeue to qDelete using the “satisfies” matching method (See Table 1). The proof obligations are generated based on the Larch specifications of the two components. The proof obligations are represented in terms of a Larch Shared Language (LSL) specification to facilitate analysis with the Larch Prover (LP), an interactive theorem proving system for multi-sorted first-order logic [?]. A LSL trait for each operation match is created, where the proof obligation stating the logical relationship (match criteria between the two operators) is stated in the “implies” section. After preprocessing the obligations, we can invoke the LP from the “Component Matching” window to assist in proving these obligations. Typically, user interaction is required in proving/disproving a conjecture. Figure 18 shows a snapshot of LP while it is discharging the proof obligations shown in Figure 17. In the figure, the user guides LP to use “prove by cases” method to prove the conjecture. The proof establishes the reusability of dequeue for implementing qDelete.
A similar process can be applied to validate the matchings of enQueue to qInsert and new queue to qCreate. Therefore, the port mapping between components queue and circqueue is justified, so is the reusability of circqueue for implementing interface queue. As exhibited in the mapping between circqueue and queue, naming conflicts, such as qDelete of queue and dequeue of circqueue, may exist between a query specification and the reused component. Resolving these mismatches is one of the tasks of the packaging process. Figure 19 shows the wrappers generated by ABRIE for resolving the naming conflicts between circqueue and queue, where wrappers are generated based on the established port mappings. The packaging process also checks connectors and generates their implementation, as well as a system construction file (a makefile) that describes how an executable system is produced.
7 Related Work Several approaches and experiences have been described for finding reusable components in program code [?, ?, ?]. Neighbors [?] described an approach for constructing reusable components from large existing systems by identifying the tightly coupled subsystems contained in the system. The approach is primarily based on the use of informal and experimental methods. Burd et al. [?] describe an approach that applies the use of the structured system analysis and design methodology (SSADM) to identify reusable components. The approach utilizes call graphs and data dependencies to support maintenance and software reuse. In contrast to these approaches, the approach described in this paper enables the use of automated reasoning techniques for generating specifications and checking matches between target specifications and library specifications. Yang et al. [?] describe a formal approach to identification of reusable components via program understanding. While the goals of their approach are similar to the reverse engineering component described in this paper, the technique is different in that their approach is based on the use of program transformations (a form of rewriting where a library of rules is used to replace one part of a program or specification with another) within the context of a wide spectrum language. In contrast, our approach is based on the use of program translation, where an atomic set of rules is used to rigorously derive specifications. Approaches to reverse engineering focus on the construction of specifications, both informal and formal, and are based on the identification of plans [?], the construction of high-level structural specifications such as data flow and call diagrams [?], or transformation of programs into specifications [?, ?]. The Rigi system [?] is targeted towards the construction of structural abstractions in order to facilitate program understanding. In the investigations described in this paper, the reverse engineering component focuses on the construction of functional abstractions and thus is targeted towards both program understanding and the support of software reuse activities. The reverse engineering approach suggested by Baxter and Mehlich [?] is based on applying design and program transformations in order to derive functional abstractions. One of the primary goals of their approach is to facilitate design maintenance. In contrast, the approach described in this paper focuses primarily on facilitating software reuse. Many approaches have been suggested for the retrieval of components from reusable component libraries, ranging from classification of search criteria [?, ?] to retrieval [?, ?, ?] and library structuring [?]. Jeng and Cheng describe the use of analogy and generality [?, ?] as the basis for matching functions. Zaremski and Wing have proposed a technique for signature [?] and specification
matching [?]. Fischer et al. have described an approach for retrieval of reusable components using filters to narrow the search space [?, ?]. While many of these approaches have made significant advances in the area of software reuse, few address the creation of specification libraries.
8 Conclusions and Future Investigations Specification libraries have been used extensively and with relative success for code derivation [?]. In addition, formal specification libraries are the basis for many software component reuse approaches [?, ?, ?, ?, ?]. One of the difficulties of using a formal approach to software component reuse is that the assumption of library existence may not be a reasonable one. However, by applying a formal reverse engineering technique to candidate reusable components, the arguments against the existence assumption can be mitigated. As such, an entire framework for constructing systems based on formal methods via software reuse can be facilitated. As described in Section 5.1, a component can be described by specifying three properties : type, port, and constraints. The specifications that are constructed using AUTOSPEC and S PEC G EN addresses only one aspect of these properties; the ports. In order to strengthen the interaction between the reverse engineering and reuse aspects of the integrated framework, it would be desirable to be able to capture other information such as the type and constraints of the particular component. Specifically, by specifying that a component is of a certain type, the component selection and matching procedure in the ABRIE system can better determine the appropriateness of using the component in a particular context. In order to address this issue, we are investigating techniques for deriving implicit behavioral properties from program source code. In addition, to further address the feasibility of the approach described in this paper, we are currently analyzing a software library used to implement quaternion operations for graphics applications. We are investigating the use of a quaternion specification library developed by NASA as the query specifications for retrieving the appropriate reusable components from a code library. One of the benefits of using a formal approach is that the formal notations facilitate the use of tools to automate software development processes, including software reverse engineering and reuse. To this end, the A UTOSPEC, S PEC G EN, and ABRIE tools have been developed to support the approaches described in this paper. In order to address problems of scale, the A UTOSPEC tool is being combined with a host of other semi-formal support tools to provide an environment for supporting the understanding of C programs. In addition, the S PEC G EN tool is being updated to facilitate the introduction of several strategies for simplifying as-built formal specifications. Finally, further work is
being performed to broaden the reuse strategies supported by ABRIE.
References [1] Jun-Jang Jeng and Betty H.C. Cheng. Reusing Analogous Components. IEEE Transactions on Knowledge and Data Engineering, 1996. [2] Jun-Jang Jeng and Betty H.C. Cheng. Specification Matching for Software Reuse: A Foundation. In Proceedings of the ACM Symposium on Software Reuse, pages 97–105, 1995. [3] Yonghao Chen and Betty H. C. Cheng. Formalizing and automating component reuse. In Proc. of 9th IEEE Intl. Conference on Tools with Artificial Intelligence, November 1997. [4] Yonghao Chen and Betty H. C. Cheng. Facilitating an automated approach to architecture-based software reuse. In Proceedings of the 12th International Conference on Automated Software Engineering, 1997. [5] Elliot J. Chikofsky and James H. Cross. Reverse Engineering and Design Recovery: A Taxonomy. IEEE Software, 7(1):13–17, January 1990. [6] Gerald C. Gannod and Betty H. C. Cheng. Strongest Postcondition as the Formal Basis for Reverse Engineering. Journal of Automated Software Engineering, 3(1/2):139–164, June 1996. A preliminary version appeared in the Proceedings for the IEEE Second Working Conference on Reverse Engineering, July 1995. [7] Gerald C. Gannod and Betty H. C. Cheng. Using Informal and Formal Methods for the Reverse Engineering of C Programs. In Proceedings of the 1996 International Conference on Software Maintenance, pages 265–274. IEEE, 1996. Also appears in the Proceedings for the Third IEEE Working Conference on Reverse Engineering. [8] Gerald C. Gannod and Betty H. C. Cheng. A Formal Automated Approach for Reverse Engineering Programs with Pointers. In Proceedings of the Twelfth IEEE International Automated Software Engineering Conference, pages 219–226. IEEE, 1997. [9] Eric J. Byrne. A Conceptual Foundation for Software Re-engineering. In Proceedings for the Conference on Software Maintenance, pages 226–235. IEEE, 1992.
[10] E. Yourdon and L Constantine. Structured Analysis and Design: Fundamentals Discipline of Computer Programs and System Design. Yourdon Press, 1978. [11] Jeannette M. Wing. A Specifier’s Introduction to Formal Methods. IEEE Computer, 23(9):8–24, September 1990. [12] B. H. C. Cheng. Applying formal methods in automated software engineering. Journal of Computer and Software Engineering, 2(2):137–164, 1994. [13] Gerald C. Gannod and Betty H. C. Cheng. A formal approach to reverse engineering c programs. Technical Report MSUCPS-TR98-12, Michigan State University, April 1998. [14] Edsger W. Dijkstra and Carel S. Scholten. Predicate Calculus and Program Semantics. Springer-Verlag, 1990. [15] Hafedh Mili, Fatam Mili, and Ali Mili. Reusing software: Issues and research directions. IEEE Transactions on Software Engineering, 21(6):528–561, June 1995. [16] Charles W. Krueger. Software reuse. ACM Computing Surveys, 24(2), June 1992. [17] Jun-jang Jeng and Betty H. C. Cheng. Using Automated Reasoning Techniques to Determine Software Reuse. International Journal of Software Engineering and Knowledge Engineering, 2(4):523–546, December 1992. [18] Jun-jang Jeng and Betty H. C. Cheng. Using Formal Methods to Construct a Software Component Library. Proc. of Fourth European Software Engineering Conference, September 1993. Published in Lecture Notes of Computer Science by Springer-Verlag. [19] A. M. Zaremski and J. M. Wing. Specification Matching of Software Components. In Proceedings of the 3rd ACM SIGSOFT Symposium on the Foundations of Software Engineering, 1995. [20] John Penix and Perry Alexander. Toward Automated Component Adaptation. In Proceedings of the 9th International Conference on Software Engineering and Knowledge Engineering, June 1997. [21] A. M. Zaremski and J. M. Wing. Signature Matching: a Tool for Using Software Libraries. ACM Transactions on Software Engineering and Methodology, April 1995. [22] Microsoft Corporation. Microsoft Visual C++ MFC Library Reference, Part 1 & 2. Microsoft Press, 1997.
[23] B. Fischer, M. Kievernagel, and W. Struckmann. VCR: A VDM-Based software component retrieval tool. In Proceedings of the ACM Symposium on Formal Methods Application in Engineering Practice, 1995. [24] Gerald C. Gannod and Betty H. C. Cheng. A specification matching based approach to reverse engineering. Technical Report MSUCPS-TR98-15, Michigan State University, April 1998. [25] J.V. Guttag and J.J. Horning. Larch: Languages and Tools for Formal Specification. Springer-Verlag, 1993. [26] David Garlan and Dewayne Perry. Introduction to the special issue on software architecture. IEEE Transaction on Software Engineering, 21(4), April 1995. [27] Dewayne E. Perry and Alexander L. Wolf. Foundations for the study of software architecture. ACM SIGSOFT Software Engineering Notes, 17(4), October 1992. [28] M. Shaw and D. Garlan. Software Architectures: Perspectives on an Emerging Discipline. Prentice Hall, 1996. [29] M. Shaw, R. DeLine, D. Klein, T. Ross, D. Young, and G. Zelesnik. Abstractions for software architecture and tools to support them. IEEE Transactions on Software Engineering, 21(4), April 1995. [30] David C. Luckham and et al. Specification and analysis of system architecture using rapide. IEEE Transactions on Software Engineering, 11(4), April 1995. [31] Special Issue on Software Architecture, IEEE Transaction on Software Engineering, volume 21. IEEE, April 1995. [32] Mark Allen Weiss. Algorithms, Data Structures, and Problem Solving with C++. Addison-Wesley Publishing Company, Inc., 1996. [33] Stephen J. Garlan and John V. Guttag. LP, the Larch Prover: User and Reference Manual. MIT, December 1994. [34] James M. Neighbors. Finding reusable components in large systems. In Proceedings of the Third IEEE Working Conference on Reverse Engineering, pages 2–10. IEEE, November 1996. [35] Elizabeth Burd, Malcolm Munro, and Clazien Wezeman. Extracting reusable modules from legacy code. In Proceedings of the Third IEEE Working Conference on Reverse Engineering, pages 189–196. IEEE, November 1996.
[36] Hongji Yang, Paul Luker, and William C. Chu. Code understanding through program transformation for reusable component identification. In Proceedings of the Fifth IEEE International Workshop on Program Comprehension, pages 148–157. IEEE, May 1997. [37] Alex Quilici. A Memory-Based Approach to Recognizing Program Plans. Communications of the ACM, 37(5):84–93, May 1994. [38] Scott R. Tilley, Kenney Wong, Margaret-Anne Storey, and Hausi A. M¨uller. Programmable Reverse Engineering. The International Journal of Software Engineering and Knowledge Engineering, 4(4):501– 520, 1994. [39] M. P. Ward and K. H. Bennett. A Practical Solution to Reverse Engineering Legacy Systems using Formal Methods. In Proceedings of the Working Conference on Reverse Engineering, 1993. [40] Ira D. Baxter and Michael Mehlich. Reverse Engineering is Reverse Forward Engineering. In Proceedings of the Fourth IEEE Working Conference on Reverse Engineering. IEEE, October 1997. [41] B. Fischer, M. Kievernagel, and W. Struckmann. Deduction-Based Software Component Retrieval. In Proceedings of IJCAI ’95 Workshop on Formal Approaches to the Reuse of Plans, Proofs, and Programs, August 1995.
(a) SpecGen
[42] Ali Mili, Rym Mili, and Roland T. Mittermeir. Storing and Retrieving Software Components: A Refinement Based System. IEEE Transactions on Software Engineering, 23(7), July 1997. [43] Douglas R. Smith and Eduardo A. Parra. Transformational Approach to Transportation Scheduling. In Proceedings of the 18th Knowledge-Based Software Engineering Conference, pages 60–68, Sept 1993.
(b) Focus Graph
Figure 12: S PEC G EN Interface and Output
spec int enQueue(Queue *q, QDATA E) requires ((q .> Q) && (Q.tail == Q.tailˆ)) modifies Q.tail, Q.data ensures ((((Q.tail - Q.head) = MAXSIZE) && (return = 0)) || ((((!((Q.tailˆ - Q.head) = MAXSIZE)) && (Q.data[Q.tailˆ % MAXSIZE] = E)) && (Q.tail’ = (Q.tailˆ + 1))) && (return = 1)))
Figure 13: The enQueue abstraction
Figure 14: Architecture of a solution to Josephus problem
Architecture jp Components Module master Ports ProcDef main() { } ProcInvoc createQueue() return Queue* { } ProcInvoc addToQueue(Queue*,int) return Bool { ProcInvoc delFromQueue(Queue*) return int { } Implementation source("/user/r02/chengb/chenyong/JP", "jp_mai End Module queue Ports ProcDef qCreate() return Queue* { uses auxTheories; ensures result.head=0 /\ result.tail=0; } ProcDef qInsert(Queue* q,int i) return Bool { uses auxTheories; modifies q; ensures (q.tail-q.head = MAXSIZE => result /\ (q.tailˆ -q.headˆ ˜= MAXSIZE => ( result = true /\ q.tail’ = q.tailˆ + 1 /\ q.data[mod(q.tailˆ, MAXSI } ProcDef qDelete(Queue* q) return int { uses auxTheories; requires q.headˆ ˜= q.tailˆ; modifies q; ensures result=q.data[mod(q.headˆ, MAXSIZE /\ q.head’=q.headˆ+1; } End Connections CallProc pc1 Roles Caller -> master . createQueue Definer -> queue . qCreate End CallProc pc2 Roles Caller -> master . addToQueue Definer -> queue . qInsert End CallProc pc3 Roles Caller -> master . delFromQueue Definer -> queue . qDelete End End
Figure 15: Architecture specification
.. LP1.15: prove ((q_pre.head ˜= q_pre.tail => true) /\ ((((q_pre.head q_post.head = (q_pre.head + 1) /\ result = q_pre.dat MAXSIZE) ]) \/ (q_post.head = q_pre.head /\ q_pre.he result = 0)) /\ q_pre.head ˜= q_pre.tail) => (result mod(q_pre.head, MAXSIZE) ] /\ q_post.head = (q_pre.h .. Attempting to prove conjecture dequeue_qDeleteTheorem.1: (q_pre.head ˜= q_pre.tail => true) /\ (((q_post.head = q_pre.head + 1 /\ result = q_pre.data[mod(q_pre.head, MAXSI /\ q_pre.head ˜= q_pre.tail) \/ (result = 0 /\ q_post.head = q_pre.head /\ q_pre.head = q_pre.tail)) /\ q_pre.head ˜= q_pre.tail => result = q_pre.data[mod(q_pre.head, MAXSIZE /\ q_post.head = q_pre.head + 1)
Figure 16: Component Matching
Suspending proof of conjecture dequeue_qDeleteTheorem.1 LP2: resume by cases q_pre.head=q_pre.tail
Creating subgoals for proof by cases New constant: q_prec Case hypotheses: dequeue_qDeleteTheoremCaseHyp.1.1: q_prec.head = q_pre dequeue_qDeleteTheoremCaseHyp.1.2: ˜(q_prec.head = q_p Same subgoal for all cases: ((q_post.head = q_prec.head + 1 /\ result = q_prec.data[mod(q_prec.head, MAXSIZE)] /\ ˜(q_prec.head = q_prec.tail)) \/ (result = 0 /\ q_post.head = q_prec.head /\ q_pr /\ ˜(q_prec.head = q_prec.tail) => result = q_prec.data[mod(q_prec.head, MAXSIZE)] /\ q_post.head = q_prec.head + 1
Attempting to prove level 2 subgoal for case 1 (out of 2
Added hypothesis dequeue_qDeleteTheoremCaseHyp.1.1 to th Level 2 subgoal for case 1 (out of 2) [] Proved by normalization.
Attempting to prove level 2 subgoal for case 2 (out of 2
Added hypothesis dequeue_qDeleteTheoremCaseHyp.1.2 to th Level 2 subgoal for case 2 (out of 2) [] Proved by normalization. Conjecture dequeue_qDeleteTheorem.1 [] Proved by cases q_pre.head = q_pre.tail.
Figure 17: Proof obligations
The system now contains 55 rewrite rules and 2 deduction system is NOT guaranteed to terminate.
Figure 18: A Snapshot of LP in resolving proof obligations
// _circqueue_wrapper.cc // Generated by ABRIE for wrapping component circqueue #include "auxTypes.h" extern int dequeue(Queue *); int qDelete(Queue *q) { return dequeue(q); } extern Bool enQueue(Queue *, int); Bool qInsert(Queue *q, int i) { return enQueue(q,i); } extern Queue* new_queue(); Queue *qCreate() { return new_queue(); }
Figure 19: Wrappers generated by ABRIE for resolving naming conflicts
A As-built specification for the Queue source code Figure 20 shows the as-built specifications for the queue source code as they were constructed by the A UTOSPEC system. The figure contains five specifications corresponding to the dequeue, enQueue, new queue, head, and is empty operations. The format for the specifications, based on the Larch interface language [?] is shown in Figure 6. spec QDATA dequeue(Queue *q) locals int temp requires (q .> _param2) && (_param2.tail.V = _pVal3.tail) && (_param2.head.V = _pVal3.head) modifies q (_param2) ensures (q .> _param2) && (_param2.tail.V = _pVal3.tail) && ((as_const2 == _pVal3.head) && (is_empty(_param2.V) != 1) && (temp.V == (as_const2 % MAXSIZE)) && (_param2.head.V = (as_const2 + 1)) && (return.V = _param2.data[temp.V])) || ((_param2.head.V = _pVal3.head) && (!(is_empty(_param2.V) != 1)) && (return.V = 0)) spec QDATA head(const Queue q) requires (q.V = _param1) && (q.tail.V == _pVal3.tail) && (q.head.V == _pVal3.head) ensures (q.V = _param1) && (q.tail.V = _pVal3.tail) && (q.head.V = _pVal3.head) (return.V = q.data[(q.head.V % MAXSIZE)])
spec Queue *new_queue() requires true ensures (newQ.V .> o) && (o.head.V = 0) && (o.tail.V = 0) && (return.V = newQ.V)
Figure 20: A UTOSPEC output of as-built specifications for queue source code
spec int enQueue(Queue *q, QDATA *e) requires (((e .> _param4) && (_param4.V == _pVal5)) && ((q .> _param3) && (_param3.tail.V == _pVal4))) modifies q (_param3) ensures ((((((e .> _param4) && (_param4.V == _pVal5)) && ((q .> _param3) && (_param3.tail.V == _pVal4))) && ((_param3.tail.V _param3.head.V) == MAXSIZE)) && (return.V = 0)) || (((((((e .> _param4) && (_param4.V == _pVal5)) && ((q .> _param3) && (_param3.tail.V == _pVal4))) && (!((_pVal4 - _param3.head.V) == MAXSIZE))) && (_param3.data[(_pVal4 % MAXSIZE)].V = _param4.V)) && (_param3.tail.V = (_pVal4 + 1))) && (return.V = 1))) spec int is_empty(const Queue q) requires (q.V = _param0) ensures (q.V = _param0) && (return.V = (q.head.V == q.tail.V))
B Circular Queue Library Specification Figure 21 shows the library specification for the queue source code. The format for the specifications is based on the Larch interface language [?] and is used by the ABRIE system as a means for storing specifications and references to supporting source code. Module CircularQueue Ports ProcDef dequeue(Queue* q) return int { uses auxTheories; requires true; modifies q.head; ensures (q.headˆ ˜= q.tailˆ /\ q.head’ = q.headˆ + 1 /\ result = q.data[mod(q.headˆ,MAXSIZE)]) \/ (q.head’ = q.headˆ /\ q.headˆ = q.tailˆ /\ result = 0); } ProcDef enQueue(Queue* q, int e) return int { uses auxTheories; requires true modifies q.tail, q.data; ensures (q.tail - q.head = MAXSIZE) /\ result = 0) \/ (q.tailˆ - q.headˆ ˜= MAXSIZE /\ q.data’[mod(q.tailˆ,MAXSIZE)] = e /\ q.tail’ = q.tailˆ + 1 /\ result = 1); } ProcDef head(Queue q) return int { uses auxTheories; requires true; ensures result = q.data[mod(q.headˆ,MAXSIZE)]; } ProcDef is_empty(Queue q) return Bool { uses auxTheories; requires true ensures result = (q.head == q.tail); } ProcDef new_queue() return Queue* { uses auxTheories; requires true; ensures o.head = 0 /\ o.tail = 0 /\ result = o; } Implementation source ("/user/r02/chengb/gannod/Research/CircQueue/queue/","queue.c") End
Figure 21: Circular Queue Library Specification