The Location Linking Concept: A Basis for Verification of Code Using Pointers

Gregory Kulczycki (1), Hampton Smith (2), Heather Harton (2), Murali Sitaraman (2), William F. Ogden (3), and Joseph E. Hollingsworth (4)

(1) Battelle Memorial Institute, 2111 Wilson Blvd, Arlington, VA 22201, USA, [email protected]
(2) School of Computing, Clemson University, Clemson, SC 29634, USA, {hamptos | hkeown | msitara}@clemson.edu
(3) Department of Computer and Information Science, Ohio State University, Columbus, OH 43210, USA, [email protected]
(4) Department of Computer Science, Indiana University Southeast, New Albany, IN 47150, USA, [email protected]
Abstract. Ultimately, any verifying compiler effort needs to be able to verify code that makes use of pointers, though language mechanisms for data abstraction, alias avoidance and control, or disciplined software development techniques may minimize the need for code that is directly based on pointers. It is also clear that the verification machinery of such a compiler must use specifications of components to reason about component-based software in order to be scalable. This paper follows a natural question that arises from putting these two ideas together: Can the general machinery in a verifying compiler for component specification-based verification also be used to verify code that uses typically built-in types, such as arrays and pointers, if those types are defined to have specifications similar to any other component? This paper answers the question in the affirmative by presenting a Location Linking Template, a concept that captures pointer behavior, and using it to verify the code of a simple data abstraction realized using pointers. Additionally, we note that the concept can be extended and realized so that different languages can plug in alternative implementations to give programmers the flexibility to choose manual memory management or automatic garbage collection depending on their performance concerns. (footnote 5)

Keywords: Formal specification; linked data structures; memory management; reusable components; verification

Footnote 5: This research was funded in part by NSF grant CCF-0811748.
1 Introduction

Software components must be used strictly on the basis of their specifications [14]. This is necessary for clients to understand and reason about components without concern for how they might be implemented. Implementation-neutral specifications give implementers the flexibility to provide alternative implementations for components based on different performance profiles. To verify component-based software in a scalable fashion, the verification machinery of a verifying compiler should use only the specifications of subcomponents [8, 21, 22]. We have developed and experimented with such a compiler for RESOLVE [17].

Data types commonly built into languages can also be viewed as components reused by nearly all software systems. If components are client-oriented software (footnote 6), then the software elements most frequently used by clients, the built-in data types, are the quintessential components. A natural research question, therefore, is whether the same verification machinery used for verifying component-based software can also verify code that is based on built-in types such as arrays and pointers. In answering this question, this paper presents a concept for “location linking” to capture the behavior of the most complex and controversial of all data structures: the pointer.

Section 2 offers an informal introduction to this pointer component. Section 3 presents a formal specification. Section 4 discusses how a sufficiently generic concept supports alternative memory management choices. Section 5 discusses results from applying the RESOLVE verifying compiler to a data abstraction implemented using such a pointer component. Finally, Section 6 contains conclusions and a summary.

1.1 Reasoning about pointer behavior and related work
Pointers break encapsulation [15], complicating reasoning about software. Hoare compares them to jumps (goto statements) because they can “be used to create wide interfaces between parts of a program which appear to be disjoint” [7]. Despite having been well understood for over a decade, this fundamental problem remains [5], and many schemes continue to be suggested to curtail, remove, or otherwise manage the alias problem. Examples include ownership systems [1] and dynamic frames [9]. A complementary technique that merits specific discussion is separation logic [16], wherein procedures may define properties that hold on certain parts of the heap over their lifetime. This technique is particularly effective for links contained entirely within a component, while dealing with leaked pointers still introduces significant complexity, as discussed in [3], where a policy of “blaming the client” is suggested. If the behavior of a component is to remain well-specified, this requires, at minimum, a scheme for specifying the available types of alias interference for which the component explicitly abdicates responsibility, a longstanding problem [2].

Footnote 6: Attributed to Christine Mingins in [14].
1.2 Overview of a Specification-Based Approach
The approach we take for unifying verification of pointer-based code with verification of other software relies on having a formal specification of a pointer component. This idea is illustrated in Figure 1 which shows a design time diagram of a software system. In the figure, circles represent component specifications and rectangles represent realized implementations. Each implementation may depend on several other components. As an example, to reason that the Queue-Based realization is correct with respect to the specified functional behavior in the Messenger component specification, it is sufficient to know the component specifications Array and Queue. Thus, the Queue component may be implemented as an Array-Based realization or a Pointer-Based realization, but this is irrelevant to the person reasoning about the implementation of Messenger. The verification of the Pointer-Based realization in turn depends only on the specification of a pointer component. This specification-based reasoning allows components that have the same functional specification—but potentially different performance behavior—to be substituted for one another in the same system without requiring the programmer to reanalyze the entire system. This is important for maintainability and scalability.
Fig. 1. Design-time diagram with component specifications and implementations.
Just as a concept exists that defines the behavior of a Queue independent from implementation, pointer behavior can also be captured in a concept. Doing so permits the system to use the same general verification machinery on pointers as on queues and other components, but does not preclude a language designer from providing syntactic sugar for pointer operations or instructing a compiler to translate such syntax as more straightforward pointer operations.

In this paper, we present specifications and code in the RESOLVE language [11, 20, 18], an integrated specification and programming language designed to facilitate full program verification. Though any specification language can be used to describe the component's behavior, using the component in the context of the RESOLVE system highlights some of its benefits. In particular, the RESOLVE reasoning system supports a simple, value-based semantics in which (1) the state space is made up of the currently defined variables and their values, and (2) the effects of a procedure call are restricted to the arguments to the call and global variables listed in the updates clause of the operation declaration. In addition, the RESOLVE programming language avoids unintentional aliasing from reference assignment and parameter passing, relying on swapping as its primary means of data assignment [4]. These design choices ensure that any aliasing is enacted exclusively by the pointer component.
2 Specification of a system of linked locations
This section introduces an informal idiom for a pointer component based on a metaphor of linked locations. Figure 2 shows a diagram of an example system where symbols (Greek letters) are the information and each location has exactly one link. The eight locations to the left of the dotted line are free; the six locations to the right are occupied. The location with the slash through it is the Void location and it can never be taken. The information in free locations is always an initial value, and the links of free locations always point to Void. We omit these details in the interest of making the diagram less cluttered.
Fig. 2. A system of linked locations with symbol information and one link per location.
Before a system of linked locations can be used, it has to be instantiated with an information type and a number of links. The depiction above is a result of the following instantiation.

    Facility Symbol_Pointer_Fac is Location_Linking_Template(Symbol, 1)
        realized by Default_Realiz;

Once instantiated, variables to allocate, manipulate information and links, or abandon locations can be declared and used as shown below.

    Var p, q: Symbol_Pointer_Fac.Position;

The effects of various operations on Position type variables are illustrated in Figure 3.
Fig. 3. The effect of selected actions on a system.
3 A Formal Specification
This section describes a formal specification of the Location Linking Template. The relationships between the mathematical objects in the concept and the notions introduced informally are straightforward.

Listing 1.1. A formal specification of Location Linking Template

    Concept Location_Linking_Template(type Info; evaluates k: Integer);
        uses Function_Theory, Closure_Op_Ext;
        requires 1 ≤ k;

        Defines Location: Set;
        Defines Void: Location;
        Defines Ocpn_Disp_Incr: N>0;

        Var Ref: Location × [1..k] → Location;
        Var Content: Location → Info;
        Var Occupied_Loc: ℘(Location);

        Constraints
            Info.Is_Initial[Content[Location ∼ Occupied_Loc]] = {True} and
            Ref[(Location ∼ Occupied_Loc) × [1..k]] = {Void} and
            Void ∉ Occupied_Loc and
            ||Location|| > Total_Mem_Cap / Ocpn_Disp_Incr;

        Initialization ensures Occupied_Loc = φ;

        Family Position ⊆ Location;
            exemplar p;
            initialization ensures p = Void;

            Definition Var Accessible_Loc: ℘(Location) =
                ({Void} ∪ Closure_for(Location,
                    ⋃ i:[1..k] {λu: Location. (Ref(u, i))},
                    Position.Val_in[Position.Receptacle]));

            finalization updates Accessible_Loc;

        Operation Take_New_Loc(updates p: Position);
            updates Occupied_Loc, Accessible_Loc;
            requires p ∉ Occupied_Loc and Ocpn_Disp_Incr ≤ Rem_Mem_Cap;
            ensures p ∉ #Accessible_Loc and Occupied_Loc = #Occupied_Loc ∪ {p};

        Operation Rem_Loc_Capacity(): Integer;
            ensures Rem_Loc_Capacity = ⌊Rem_Mem_Cap / Ocpn_Disp_Incr⌋;

        Operation Redirect_Link_at(preserves p: Position; preserves i: Integer;
                                   updates q: Position);
            updates Ref;
            requires p ∈ Occupied_Loc and 1 ≤ i ≤ k which_entails i: [1..k];
            ensures Ref = λu: Location, λj: [1..k]. ({#q if u = p and j = i;
                                                       #Ref(u, j) otherwise}) and
                    q = #Ref(p, i);

        Operation Follow_Link(updates p: Position; preserves i: Integer);
            updates Accessible_Loc;
            requires p ∈ Occupied_Loc and 1 ≤ i ≤ k which_entails i: [1..k];
            ensures p = Ref(#p, i);

        Operation Swap_Contents(preserves p: Position; updates I: Info);
            updates Content;
            requires p ∈ Occupied_Loc;
            ensures I = #Content(p) and
                    Content = λq: Location. ({#I if q = p;
                                              #Content(q) otherwise});

        Operation Relocate(replaces cpy: Position; restores orig: Position);
            updates Accessible_Loc;
            ensures cpy = orig;

        Operation Abandon_Location(clears p: Position);
            updates Occupied_Loc, Ref, Content, Accessible_Loc;
            requires p ∈ Occupied_Loc;
            ensures Occupied_Loc = #Occupied_Loc ∼ {#p} and
                    Info.Init(Content(#p)) and
                    Content↿(Location ∼ {#p}) = #Content↿(Location ∼ {#p}) and
                    Ref = λq: Location, λj: [1..k]. ({Void if q = p;
                                                       #Ref(q, j) otherwise});

        Operation Check_Colocation(preserves p, q: Position;
                                   replaces Are_Colctd: Boolean);
            ensures if p ∈ Occupied_Loc ∪ {Void} and q ∈ Occupied_Loc ∪ {Void},
                    then Are_Colctd = (p = q);

        Operation Is_Void(preserves p: Position): Boolean;
            ensures Is_Void = (p = Void);

        Operation Occupied_Loc_Ct(): Integer;
            ensures Occupied_Loc_Ct = (||Occupied_Loc||);
    end;

3.1 Shared Conceptual State
The shared conceptual state of Location Linking Template is specified through defines clauses, (conceptual or specification) variable declarations, and their constraints. The defines clauses at the beginning are placeholders for deferred implementation-dependent definitions. Though we expect that objects of type Location will somehow be tied to a machine's memory addresses, for the purposes of reasoning about this component the programmer need only know that Location is a set, Void is a specific location, and Ocpn_Disp_Incr is a nonzero natural number that represents the memory overhead for a Location. Total_Mem_Cap is a global variable across the system.

Objects of type Location correspond to the notion of locations described in Section 2. The type parameter, Info, indicates the type of information that a location contains, while the second parameter, k, indicates the number of links from a given location. The three conceptual variables near the beginning of the concept enable clean mathematical reasoning about locations: Ref(q, i) returns the location targeted by the ith link of q, Content(q) returns the information at a given location q, and Occupied_Loc is the set of all non-free locations.

The first conjunct of the constraints clause asserts that all the unoccupied locations have an initial information value. It uses the square bracket notation to lift a function from one that operates on a domain D and a range R to one that operates on the powerset of D, unioning all results and returning a value in the powerset of R. So the first conjunct unions the results of testing whether each Info in an unoccupied location is an initial value. The result must be the singleton set containing only True, i.e., all unoccupied locations must always contain initial values. The second conjunct ensures that each link of each unoccupied location references Void. The third ensures that Void cannot become an occupied location. The last conjunct ensures that the Location set is at least as big as necessary.

The assertion that free locations have default information and default links exists strictly for reasoning about functional behavior. The performance part of this specification assumes that no memory is allocated for information until the Take_New_Loc operation is invoked.
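As a hedged illustration only (it is not part of the RESOLVE specification), the first three conjuncts of the constraints clause can be checked on a small finite model in Python. The class name LocationSystem, the encoding of locations as integers with 0 standing for Void, and the use of None as the initial Info value are all assumptions of this sketch; the memory-capacity conjunct is omitted.

```python
from dataclasses import dataclass, field

VOID = 0  # assumption of this sketch: location 0 plays the role of Void


@dataclass
class LocationSystem:
    """Toy finite model of the shared conceptual state (illustrative names)."""
    size: int                      # number of locations, including Void
    k: int                         # links per location
    initial_info: object = None    # stands in for an initial Info value
    ref: dict = field(default_factory=dict)      # (location, i) -> location
    content: dict = field(default_factory=dict)  # location -> info
    occupied: set = field(default_factory=set)   # Occupied_Loc

    def __post_init__(self):
        # Initialization: every location is free, holds an initial value,
        # and has all of its links targeting Void.
        for loc in range(self.size):
            self.content[loc] = self.initial_info
            for i in range(1, self.k + 1):
                self.ref[(loc, i)] = VOID

    @staticmethod
    def lift(f, elements):
        """Square-bracket lifting: apply f pointwise, collect results in a set."""
        return {f(x) for x in elements}

    def constraints_hold(self):
        free = set(range(self.size)) - self.occupied
        info_ok = self.lift(
            lambda q: self.content[q] == self.initial_info, free) == {True}
        free_links = {(q, i) for q in free for i in range(1, self.k + 1)}
        links_ok = self.lift(lambda qi: self.ref[qi], free_links) == {VOID}
        return info_ok and links_ok and VOID not in self.occupied


system = LocationSystem(size=4, k=1)
ok_initially = system.constraints_hold()

system.occupied.add(2)
system.content[2] = "alpha"        # occupied locations may hold any value
ok_after_take = system.constraints_hold()

system.content[3] = "beta"         # a free location with non-initial content
ok_after_corruption = system.constraints_hold()
```

The lifted function returns a set, so comparing against {True} or {VOID} mirrors the way the specification states a property of every unoccupied location in a single equation.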
The value of Ocpn_Disp_Incr is the overhead space that a newly taken location with k links occupies in memory. Once a system of locations is instantiated, the initialization clause ensures that all locations in the newly created system are unoccupied, and the constraints clause ensures that all of these free locations have default information and default links. The default target for links is the Void location.

3.2 Position Type
The concept exports the programming type Position, which represents a pointer. Position variables are modeled mathematically as Locations. For example, the initialization ensures clause asserts that p = Void. Since the symbol p that occurs here is the mathematical value of the programming variable p rather than the programming variable p itself, the assertion is interpreted as “The location of the variable named p is Void.” The clause indicates that all new pointer variables are conceptually at the Void location. The exemplar clause simply introduces an example position variable so that the name can be used in the scope of the type declaration.

The mathematical definition Accessible_Loc is a variable because its value depends not only on the value of its parameter, q, but also on the conceptual variables Ref and Occupied_Loc; so the same location may be accessible in one program state and inaccessible in the next. Variable definitions such as this are simply a notational convenience, as the necessary conceptual variables can be passed explicitly as parameters. Accessible_Loc states that a Location is accessible if it is Void or in the closure of all links from all Locations starting from any named position (a “receptacle” in the syntax here) currently in use. The type Position.Receptacle represents the set of all named positions currently in scope, while Position.Val_in is a function that takes a Receptacle and returns its corresponding Location. When a variable is declared of type Position, its unique identifier is added to Position.Receptacle. When a variable is destroyed (goes out of scope), the identifier is removed.

When an operation can potentially modify state that affects a definition variable such as Accessible_Loc, we include the variable in the updates clause; ones not specified to be updated are not affected. For example, the finalization (which acts as an implicit operation) includes Accessible_Loc in its updates clause, since the destruction of a Position may impact the set of accessible locations if one of its links referenced a location that was otherwise inaccessible.

3.3 Operations
The management actions informally described in Section 2 correspond directly to the operations given in the concept. The Take_New_Loc operation takes a single position variable as a parameter. The updates parameter mode indicates to the client that the operation modifies the value of p. The updates clause on the following line gives the conceptual (state) variables that we can expect this operation to modify. In this case, we can expect the operation to affect both the occupation status of one or more locations and the accessibility of the system. The requires clause guarantees that p cannot reside at a taken location and that sufficient memory exists. Since the Void location is perpetually free, it will be the location where p typically resides when the operation is called. In general, performance behavior such as memory use should be decoupled from the behavioral specification, but here sufficient memory is a constraint of any conceivable implementation. Performance specification is discussed further in [6, 19].

The ensures clause guarantees that the newly taken location was not previously accessible and that the set of occupied locations is extended by precisely the newly occupied location. As noted above, all operations have an implicit frame property that ensures that any variables not mentioned in the parameter list or updates clause will remain unchanged. Thus, a client can be sure that the newly taken location has default information and that all of its links point to the Void location.

The parameters in these operations have various modes that summarize the operation's effect. The updates mode has already been mentioned. The clears mode ensures that an argument will return with an initial value. The preserves mode prohibits any changes to an argument's value. The replaces mode indicates that the incoming value will be ignored and replaced with a meaningful one. The evaluates mode indicates that the operation expects an expression, which modifies the parameter swapping behavior.

We show only specifications of those operations that are used in the code in the next section. Operation Follow_Link causes a position to point to the target of one of its links, whereas Swap_Contents exchanges the Info associated with a given location with the given Info. The interested reader is referred to [12] for specifications of additional operations to redirect a position variable's link, abandon one, etc.
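The behavior of a few of these operations can be sketched in Python as a reading aid. This is a toy model under stated assumptions, not the RESOLVE realization: the names Position and PointerFacility are hypothetical, locations are integers with 0 as Void, requires clauses become assertions, and Abandon_Location is left out so that a free location is never accessible.

```python
VOID = 0  # assumption of this sketch: location 0 plays the role of Void


class Position:
    """A position variable; its mathematical value is a location."""

    def __init__(self):
        self.loc = VOID  # initialization ensures p = Void


class PointerFacility:
    """Toy model with k links per location (all names are illustrative)."""

    def __init__(self, capacity, k=1):
        self.k = k
        self.free = list(range(capacity, 0, -1))  # locations 1..capacity
        self.occupied = set()                     # Occupied_Loc
        self.content = {}  # location -> info; a missing key means initial (None)
        self.ref = {}      # (location, i) -> location; a missing key means Void

    def take_new_loc(self, p):
        # requires p not in Occupied_Loc and sufficient remaining capacity
        assert p.loc not in self.occupied and self.free
        p.loc = self.free.pop()
        self.occupied.add(p.loc)

    def follow_link(self, p, i=1):
        # requires p in Occupied_Loc and 1 <= i <= k; ensures p = Ref(#p, i)
        assert p.loc in self.occupied and 1 <= i <= self.k
        p.loc = self.ref.get((p.loc, i), VOID)

    def redirect_link(self, p, i, q):
        # ensures Ref(p, i) = #q and q = #Ref(p, i): link target and q swap
        assert p.loc in self.occupied and 1 <= i <= self.k
        old_target = self.ref.get((p.loc, i), VOID)
        self.ref[(p.loc, i)] = q.loc
        q.loc = old_target

    def swap_contents(self, p, info):
        # ensures I = #Content(p) and Content(p) = #I; returns the new I
        assert p.loc in self.occupied
        old_info = self.content.get(p.loc)
        self.content[p.loc] = info
        return old_info


fac = PointerFacility(capacity=3)
p, q = Position(), Position()
fac.take_new_loc(p)
fac.take_new_loc(q)
fac.swap_contents(p, "alpha")   # store "alpha" at p's location
fac.redirect_link(q, 1, p)      # q's link now targets p's old location
fac.follow_link(q)              # q moves along its link
retrieved = fac.swap_contents(q, None)
```

Because Python has no swapping parameter mode, swap_contents returns the outgoing value instead of exchanging it in place; that is a design choice of the sketch rather than a feature of the concept.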
4 Memory management
Through its operations, the Location Linking Template provides all the functionality of traditional pointers. For example, the client can obtain the benefits of aliasing by positioning two or more variables at the same location. But the concept also allows the client to fall into the traditional traps involving pointers: dangling references and memory leaks. This section looks at different ways these problems can be managed.

4.1 Performance and extensions
A dangling reference occurs when a location is free but remains accessible, as in the following code.

    Var x, y: Position;
    Take_New_Loc(x);
    Relocate(y, x);
    Abandon_Location(x);

When x abandoned its location, the location's status changed from taken to free. Though x was relocated to Void, y remained at the location, so the location continues to be accessible. Position variables are effectively bound to the type of Info during instantiation, so there is no danger of inadvertently modifying (through the dangling reference) the contents of a memory location that is being used by another variable somewhere else in the program. Real memory locations on a machine are limited, so the specification permits implementations that can reclaim memory even if a dangling reference exists for it.

The Is_Occ operation (provided in an extension to the concept and shown in Listing 1.2) effectively tells the client whether a position variable is a dangling reference. Since a Position variable resides at the location in question, the location is accessible. If the location is taken, it is usable by the client; if the location is free, the client cannot affect it.

A memory leak occurs when a location is taken but not accessible. The following code segment creates a memory leak.

    Var x, y: Position;
    Take_New_Loc(x);
    Relocate(x, y);

The location that was taken by x continues to have a taken status but has become inaccessible. The operation that performs garbage collection, Abandon_Useless, is provided in an extension to the concept. Extensions are separate from the main concept because their operations cannot be implemented without the normal performance penalties. Some realizations do not implement extensions, but some may. Both the Abandon_Useless and Is_Occ operations reside in extensions. A garbage collecting implementation of Location Linking Template would also provide a procedure for the Abandon_Useless operation. A client may then choose to ignore the Abandon_Location operation and periodically invoke the Abandon_Useless operation instead.

Listing 1.2. Extensions to Location Linking Template

    Extension Occ_Checking_Capability for Location_Linking_Template;
        Operation Is_Occ(preserves p: Position): Boolean;
            ensures Is_Occ = (p ∈ Occupied_Loc);
    end;

    Extension Cleanup_Capability for Location_Linking_Template;
        Operation Abandon_Useless();
            updates Occupied_Loc, Content, Ref;
            ensures Occupied_Loc = #Occupied_Loc ∩ Accessible_Loc and
                    Content↿((Location ∼ #Occupied_Loc) ∪ Accessible_Loc) =
                        #Content↿((Location ∼ #Occupied_Loc) ∪ Accessible_Loc) and
                    Info.Is_Initial[Content[#Occupied_Loc ∼ Occupied_Loc]] = {True} and
                    Ref = λq: Location, λj: [1..k].
                        ({Void if q ∈ #Occupied_Loc ∼ Occupied_Loc;
                          #Ref(q, j) otherwise});
    end;

Another extension could be written that uses a two-phase approach to garbage collection, as in the popular mark-and-sweep approach. It would have at least two operations: one, Mark_Inaccessible, would update a set of marked locations; another, Abandon_Marked, would abandon the locations in the set and then clear the set.
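To make the roles of accessibility and Abandon_Useless concrete, the following Python sketch computes the accessible set as a closure over links and then frees every occupied location outside it. It assumes an integer encoding of locations with 0 as Void and dictionary-based Ref and Content maps; the function names and signatures are hypothetical and only illustrate the ensures clause, not any actual realization.

```python
VOID = 0  # assumption of this sketch: location 0 plays the role of Void


def accessible(ref, k, positions):
    """{Void} plus every location reachable from a named position via links."""
    seen, pending = {VOID}, list(positions)
    while pending:
        loc = pending.pop()
        if loc not in seen:
            seen.add(loc)
            # A missing (location, i) entry means the link targets Void.
            pending.extend(ref.get((loc, i), VOID) for i in range(1, k + 1))
    return seen


def abandon_useless(occupied, content, ref, k, positions):
    """Free each occupied but inaccessible location; return new Occupied_Loc."""
    reachable = accessible(ref, k, positions)
    for loc in occupied - reachable:
        content.pop(loc, None)           # content reverts to an initial value
        for i in range(1, k + 1):
            ref.pop((loc, i), None)      # links revert to Void
    return occupied & reachable          # #Occupied_Loc ∩ Accessible_Loc


# Memory leak: locations 1 and 2 are taken, but no position variable reaches
# location 3, and location 3 is not linked from 1 or 2.
occupied = {1, 2, 3}
content = {1: "a", 2: "b", 3: "c"}
ref = {(1, 1): 2}
occupied_after = abandon_useless(occupied, content, ref, 1, positions=[1])

# Dangling reference: a position sits at location 5, which is no longer taken.
dangling = 5 in accessible({}, 1, positions=[5]) and 5 not in occupied_after
```

A mark-and-sweep extension would split abandon_useless into its two halves: computing occupied - reachable is the marking phase, and the loop that resets content and links is the sweep.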
4.2 Implementation flexibility
A given programming language will typically hardcode the choice of implementation for the location linking concept, but the concept itself allows implementation options. If a language allows options, then a facility declaration mechanism such as in RESOLVE can be used. The following declaration creates a pointer facility containing Queue information and one link per location, and it uses the default no-frills realization for Location Linking Template that puts the burden of memory management on the client.

    Facility Queue_Pointer_Fac is Location_Linking_Template(Queue, 1)
        realized by Default_Realiz;

A garbage collecting implementation would implement the Cleanup_Capability extension in addition to the operations in the concept.

    Facility GC_Queue_Pointer_Fac is Location_Linking_Template(Queue, 1)
        extended by Cleanup_Capability
        realized by Garbage_Collecting_Realiz;

Each facility acts as a distinct component. Thus, an object of type Queue_Pointer_Fac.Position cannot be used where an object of type GC_Queue_Pointer_Fac.Position is expected. The stack component described in the next section contains a local pointer facility and uses a form of manual component-level memory management [13]. Manual memory management for global pointer facilities becomes more difficult with each realization that imports the facility, because all realizations that import the same facility share the same set of locations. Therefore, global pointer facilities may be good candidates for garbage collection implementations. The facility mechanism allows multiple pointer components with different implementations to exist in the same program.
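The separation between facilities can be imitated in Python, with the caveat that RESOLVE enforces it statically while this sketch can only check it at run time. The factory make_facility and everything it returns are hypothetical names introduced for illustration; each call yields a facility with its own Position type and its own set of locations.

```python
VOID = 0  # assumption of this sketch: location 0 plays the role of Void


def make_facility(name):
    """Build a facility that owns its Position type and its own locations."""

    class Position:
        def __init__(self):
            self.loc = VOID

    class Facility:
        def __init__(self):
            self.name = name
            self.Position = Position   # analogue of <facility>.Position
            self.occupied = set()
            self._fresh = 0

        def take_new_loc(self, p):
            # Reject positions that were declared against another facility.
            if type(p) is not Position:
                raise TypeError(f"{name}.Position expected")
            self._fresh += 1
            p.loc = self._fresh
            self.occupied.add(p.loc)

    return Facility()


queue_fac = make_facility("Queue_Pointer_Fac")
gc_fac = make_facility("GC_Queue_Pointer_Fac")

p = queue_fac.Position()
queue_fac.take_new_loc(p)

try:
    gc_fac.take_new_loc(p)      # a position from the wrong facility
    mixed_use_accepted = True
except TypeError:
    mixed_use_accepted = False
```

Taking a location in queue_fac leaves gc_fac.occupied empty, which reflects the point that only realizations importing the same facility share a set of locations.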
5 Application
Using Location Linking Template to implement linked data structures will be familiar to anyone who has implemented a linked list in a language with explicit pointers such as C or Pascal, though the nomenclature is different. This section gives excerpts from a stack data abstraction implemented using the Location Linking Template, along with the necessary verification conditions (VCs) generated by the RESOLVE verifying compiler in establishing its correctness. The key point is that the compiler uses the same verification machinery to generate these VCs as for any other code based on a formal specification.

Due to space constraints, we make a number of simplifications for the example discussed here. The specifications and implementations used here assume that an unlimited number of locations exist, so the notions of occupied locations or abandoning locations are not used. The Stack specification is unbounded. Similarly, since a stack implementation requires only one link, we fix the number of links, k, to be 1. So the Follow_Link operation, for example, does not have an argument that indicates which link to follow. Also, note that all notations here, unlike in previous sections, are given in ASCII, because our compiler can only accept ASCII inputs at this time.

5.1 Specification of a Stack Concept
Just as a Pointer is modeled mathematically as a Location containing a generic type, a Stack is modeled mathematically as a string (i.e., a finite sequence) of generically typed entries.

    Concept Unbounded_Stack_Template(type Entry);
        uses String_Theory;

        Type Family Stack is modeled by Str(Entry);
            exemplar S;
            initialization ensures S = empty_string;

        Operation Pop(replaces R: Entry; updates S: Stack);
            requires S /= empty_string;
            ensures #S = <R> o S;

        (* Other operations omitted *)
    end Unbounded_Stack_Template;

5.2 Pointer-Based Implementation of Stacks
In this implementation, the Stack type is represented by a position. This requires an instantiation of Location Linking Template with appropriate parameters (Entry in the place of Info, for example) and a standard realization in the library. The representation convention uses a locally defined predicate Is_Void_Reachable (not shown) that is true iff Void can be reached by following links defined by some reference function (for example, “Ref” from Location_Linking_One_Template). The invariant is assumed to be true by every procedure at its beginning (except initialization) and needs to be confirmed to be true at its end by every procedure (except finalization). So a number of generated VCs pertain to the invariant. The correspondence uses another definition, Str_Info, that takes a Location as well as a Content function and a linking function, and returns the sequence of Info elements contained along its link chain as a string (finite sequence) of Entries (footnote 7). Only the code for Pop is shown here.

Footnote 7: The actual definitions of Str_Info and Is_Void_Reachable are slightly more complex in order to maintain totality yet deal with cycles, but we omit this detail for brevity.

    Facility Entry_Ptr_Fac is Location_Linking_One_Template(Entry)
        realized by Std_Location_Linking_One_Realiz;

    Type Stack is represented by Entry_Ptr_Fac.Position;
        convention Is_Void_Reachable(S, Ref);
        correspondence Conc.S = Str_Info(S, Content, Ref);

    Procedure Pop(replaces R: Entry; updates S: Stack);
        Swap_Contents(S, R);
        Follow_Link(S);
    end;

    (* Other Procedures omitted *)
    end;

5.3 Verification Process
Applying the specification-based verification machinery yields the VCs found in Table 1 to be proved for implementation correctness, which arise from the ensures clauses of Stack operations, requires clauses of called Location Linking operations, and conventions. Proofs of the VCs are straightforward and can be handled by most provers, such as those summarized in [10]. As an example, consider the following VC for establishing the convention at the end of a call to Pop (footnote 8):

    Goal:  Is_Void_Reachable(Ref(S), Ref)
    Given: Is_Void_Reachable(S, Ref)

Note that this is a straightforward proof: if S's links can be followed to Void under the reference function Ref, then the links of the location that S links to directly, Ref(S), can also be followed to Void under the same reference function.
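The convention and correspondence obligations for Pop can also be exercised on a concrete chain of locations. The Python sketch below is an informal sanity check, not a proof and not the compiler's VC generator: it assumes one link per location, integer locations with 0 as Void, and simplified versions of Is_Void_Reachable and Str_Info (the paper notes that the actual definitions are more involved).

```python
VOID = 0  # assumption of this sketch: location 0 plays the role of Void


def is_void_reachable(s, ref):
    """True iff following links from s eventually reaches Void (cycle-safe)."""
    seen = set()
    while s != VOID:
        if s in seen:
            return False          # a cycle never reaches Void
        seen.add(s)
        s = ref.get(s, VOID)
    return True


def str_info(s, content, ref):
    """Entries along the link chain from s; assumes the convention holds."""
    entries = []
    while s != VOID:
        entries.append(content[s])
        s = ref.get(s, VOID)
    return entries


def pop(r, s, content, ref):
    """Pop as Swap_Contents(S, R) then Follow_Link(S); returns new (R, S)."""
    assert s != VOID                   # follows from requires S /= empty_string
    r, content[s] = content[s], r      # Swap_Contents exchanges R and Content(S)
    return r, ref.get(s, VOID)         # Follow_Link moves S to Ref(#S)


# A two-entry stack: location 1 links to location 2, which links to Void.
content = {1: "a", 2: "b"}
ref = {1: 2}
s = 1

conceptual_before = str_info(s, content, ref)
r, s = pop(None, s, content, ref)
conceptual_after = str_info(s, content, ref)

convention_holds = is_void_reachable(s, ref)                   # cf. the VC above
ensures_holds = conceptual_before == [r] + conceptual_after    # #S = <R> o S
```

Running the check on a handful of chains gives some confidence that the VCs say what is intended, while the actual guarantee for all states comes only from proving them.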
6 Summary

We have presented a formal specification of a concept to capture pointer behavior. The specification is designed such that extensions to the basic specification can give language designers and programmers the flexibility to choose between manual memory management and automatic garbage collection based on their performance concerns. We have shown that a verifying compiler with the necessary machinery to reason about component-based software via the specifications of reusable components can be used naturally to verify pointer-based code using the given specification.

Footnote 8: Irrelevant conjuncts of the “Given:” portion have been removed for brevity.
Table 1. VCs for Location Linking Realization

VC | Given | Goal
1 | true | Is_Void_Reachable(Void, Ref)
2 | true | Str_Info(Void, Content, Ref) = empty_string
3 | Str_Info(S, Content, Ref) /= empty_string | S /= Void
4 | Str_Info(S, Content, Ref) /= empty_string | S /= Void
5 | Is_Void_Reachable(S, Ref) | Is_Void_Reachable(Ref(S), Ref)
6 | Is_Void_Reachable(S, Ref) | Str_Info(S, Content, Ref) = (<Content(S)> o Str_Info(Ref(S), lambda L: Z ({R if L = S; Content(L) otherwise}), Ref))
7 | true | Void = Void
8 | Temp' /= Void | Temp' /= Void
9 | Temp' /= Void | Temp' /= Void
10 | Is_Void_Reachable(S, Ref) | Is_Void_Reachable(Temp', lambda L: Z ({S if L = Temp'; Ref(L) otherwise}))
11 | Is_Void_Reachable(S, Ref) | Is_Void_Reachable(S, Ref)
12 | true | (S = Void) = (Str_Info(S, Content, Ref) = empty_string)
13 | true | Str_Info(S, Content, Ref) = Str_Info(S, Content, Ref)
References

1. Banerjee, A., Naumann, D.A.: State based ownership, reentrance, and encapsulation. In: European Conference on Object-Oriented Programming (ECOOP). pp. 387–411 (2005)
2. Ernst, G.W., Hookway, R.J., Ogden, W.F.: Modular verification of data abstractions with shared realizations. IEEE Trans. Softw. Eng. 20, 288–307 (April 1994), http://dl.acm.org/citation.cfm?id=630806.631117
3. Filipović, I., O'Hearn, P., Torp-Smith, N., Yang, H.: Blaming the client: on data refinement in the presence of pointers. Form. Asp. Comput. 22, 547–583 (September 2010), http://dx.doi.org/10.1007/s00165-009-0125-8
4. Harms, D.E., Weide, B.W.: Copying and swapping: Influences on the design of reusable software components. IEEE Trans. Softw. Eng. 17, 424–435 (May 1991), http://dl.acm.org/citation.cfm?id=114769.114773
5. Hatcliff, J., Leavens, G.T., Leino, K.R.M., Müller, P., Parkinson, M.: Behavioral interface specification languages (2009), http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.150.723
6. Hehner, E.C.R.: Formalization of time and space. Formal Aspects of Computing 10, 290–306 (1998), http://dx.doi.org/10.1007/s001650050017
7. Hoare, C.A.R.: Recursive data structures. In: Hoare, C.A.R., Jones, C.B. (eds.) Essays in Computing Science. Prentice Hall, New York (1989)
8. Jones, C.B.: Systematic software development using VDM. Prentice Hall International (UK) Ltd., Hertfordshire, UK (1986)
9. Kassios, I.T.: Dynamic frames: Support for framing, dependencies and sharing without restrictions. In: Misra, J., Nipkow, T., Sekerinski, E. (eds.) FM. Lecture Notes in Computer Science, vol. 4085, pp. 268–283. Springer (2006)
10. Klebanov, V., Müller, P., Shankar, N., Leavens, G.T., Wüstholz, V., Alkassar, E., Arthan, R., Bronish, D., Chapman, R., Cohen, E., Hillebrand, M.A., Jacobs, B., Leino, K.R.M., Monahan, R., Piessens, F., Polikarpova, N., Ridge, T., Smans, J., Tobies, S., Tuerk, T., Ulbrich, M., Weiß, B.: The 1st verified software competition: Experience report. In: Butler, M., Schulte, W. (eds.) FM. Lecture Notes in Computer Science, vol. 6664, pp. 154–168. Springer (2011)
11. Kulczycki, G., Sitaraman, M., Roche, K., Yasmin, N.: Formal specification. In: Wah, B.W. (ed.) Wiley Encyclopedia of Computer Science and Engineering. John Wiley & Sons, Inc. (2008)
12. Kulczycki, G., Smith, H., Harton, H., Sitaraman, M., Ogden, W.F., Hollingsworth, J.E.: Technical report RSRG-11-04, The Location Linking Concept: A Basis for Verification of Code Using Pointers (Sep 2011), http://www.cs.clemson.edu/group/resolve/reports.html
13. Meyer, B.: Object-Oriented Software Construction. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1st edn. (1988)
14. Meyer, B.: On to components. Computer 32, 139–140 (1999)
15. Noble, J., Vitek, J., Potter, J.: Flexible alias protection. In: ECOOP'98. pp. 158–185. Springer-Verlag (1998)
16. Reynolds, J.C.: Separation logic: A logic for shared mutable data structures. In: Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science. pp. 55–74. LICS '02, IEEE Computer Society, Washington, DC, USA (2002), http://dl.acm.org/citation.cfm?id=645683.664578
17. Sitaraman, M., Adcock, B., Avigad, J., Bronish, D., Bucci, P., Frazier, D., Friedman, H., Harton, H., Heym, W., Kirschenbaum, J., Krone, J., Smith, H., Weide, B.: Building a push-button RESOLVE verifier: Progress and challenges. Formal Aspects of Computing 23, 607–626 (2011), http://dx.doi.org/10.1007/s00165-010-0154-3
18. Sitaraman, M., Atkinson, S., Kulczycki, G., Weide, B.W., Long, T.J., Bucci, P., Heym, W.D., Pike, S.M., Hollingsworth, J.E.: Reasoning about software-component behavior. In: ICSR. pp. 266–283 (2000)
19. Sitaraman, M., Kulczycki, G., Krone, J., Ogden, W.F., Reddy, A.L.N.: Performance specification of software components. In: SSR. pp. 3–10 (2001)
20. Sitaraman, M., Weide, B.: Component-based software using RESOLVE. SIGSOFT Softw. Eng. Notes 19, 21–22 (October 1994), http://doi.acm.org/10.1145/190679.199221
21. Spivey, J.M.: The Z notation: a reference manual. Prentice-Hall, Inc., Upper Saddle River, NJ, USA (1989)
22. Wing, J.M.: A specifier's introduction to formal methods. Computer 23, 8–23 (September 1990), http://dl.acm.org/citation.cfm?id=102815.102816
A Location_Linking_Realiz VCs
//
// Generated by the RESOLVE Verifier, March 2009 version
// from file: Location_Linking_Realiz.rb
// on: Wed Sep 14 15:35:42 EDT 2011
//

Free Variables:
Entry, Info

VC: 0_1:
Requirement for Facility Declaration Rule for Entry_Ptr_Fac: Location_Linking_Template.co(1)

Goal:
true

Given:

Free Variables:
S: Entry_Ptr_Fac.Position, Void: Z, Str_Info: Str(Info), Last_Char_Num: N, min_int: Z, max_int: Z, Max_Char_Str_Len: N

VC: 1_1:
Correspondence Rule for Stack: Location_Linking_Realiz.rb(10)

Goal:
true

Given:
((((min_int 0)) and Is_Void_Reachable(S, Ref))
Free Variables:
S: Entry_Ptr_Fac.Position, Conc_S: Str(Entry), Content: Z -> Info, Ref: Z -> Z

VC: 2_1:
Convention for Stack modified by Variable Declaration rule: Location_Linking_Realiz.rb(9)

Goal:
Is_Void_Reachable(Void, Ref)

Given:

VC: 2_2:
Initialization Rule for Stack modified by Variable Declaration rule: Unbounded_Stack_Template.co(6)

Goal:
Str_Info(Void, Content, Ref) = empty_string

Given:
1: Conc_S = Str_Info(Void, Content, Ref)

Free Variables:
Void: Z, Str_Info: Str(Info), Last_Char_Num: N, min_int: Z, max_int: Z, Max_Char_Str_Len: N, Conc_S: Str(Entry), R: Entry, S: Entry_Ptr_Fac.Position, Content: Z -> Info, Ref: Z -> Z, S': Entry_Ptr_Fac.Position, R': Entry, Temp: Z

VC: 3_1:
Requires Clause of Swap_Info in Procedure Pop: Location_Linking_Realiz.rb(15)

Goal:
S /= Void

Given:
(((((min_int 0)) and Is_Void_Reachable(S, Ref)) and (Entry.is_initial(R) and Str_Info(S, Content, Ref) /= empty_string))
VC: 3_2:
Requires Clause of Follow_Link in Procedure Pop: Location_Linking_Realiz.rb(16)

Goal:
S /= Void

Given:
((((((min_int 0)) and Is_Void_Reachable(S, Ref)) and (Entry.is_initial(R) and Str_Info(S, Content, Ref) /= empty_string)) and Content' = lambda L: Z ({{R if L = S Content(L) otherwise}}))

VC: 3_3:
Convention for Location_Linking_Realiz: Location_Linking_Realiz.rb(9)

Goal:
Is_Void_Reachable(Ref(S), Ref)

Given:
((((((min_int 0)) and Is_Void_Reachable(S, Ref)) and (Entry.is_initial(R) and Str_Info(S, Content, Ref) /= empty_string)) and Content' = lambda L: Z ({{R if L = S Content(L) otherwise}}))
VC: 3_4:
Ensures Clause of Pop: Unbounded_Stack_Template.co(13)

Goal:
Str_Info(S, Content, Ref) = (<Content(S)> o Str_Info(Ref(S), lambda L: Z ({{R if L = S Content(L) otherwise}}), Ref))

Given:
((((((min_int 0)) and Is_Void_Reachable(S, Ref)) and (Entry.is_initial(R) and Str_Info(S, Content, Ref) /= empty_string)) and Content' = lambda L: Z ({{R if L = S Content(L) otherwise}}))

Free Variables:
Void: Z, Str_Info: Str(Info), Last_Char_Num: N, min_int: Z, max_int: Z, Max_Char_Str_Len: N, Conc_S: Str(Entry), E: Entry, S: Entry_Ptr_Fac.Position, Content: Z -> Info, Ref: Z -> Z, S': Entry_Ptr_Fac.Position, E': Entry, Temp': Z, Temp: Z

VC: 4_1:
Requires Clause of Swap_Info in Procedure Push: Location_Linking_Realiz.rb(23)

Goal:
Temp' /= Void

Given:
(((((min_int 0)) and Is_Void_Reachable(S, Ref)) and Temp' /= Void)
VC: 4_2:
Requires Clause of Redirect_Link in Procedure Push: Location_Linking_Realiz.rb(24)

Goal:
Temp' /= Void

Given:
((((((min_int 0)) and Is_Void_Reachable(S, Ref)) and Temp' /= Void) and (E' = Content(Temp') and Content' = lambda L: Z ({{E if L = Temp' Content(L) otherwise}})))

VC: 4_3:
Convention for Location_Linking_Realiz: Location_Linking_Realiz.rb(9)

Goal:
Is_Void_Reachable(Temp', lambda L: Z ({{S if L = Temp' Ref(L) otherwise}}))

Given:
(((((((min_int 0)) and Is_Void_Reachable(S, Ref)) and Temp' /= Void) and (E' = Content(Temp') and Content' = lambda L: Z ({{E if L = Temp' Content(L) otherwise}}))) and Ref' = lambda L: Z ({{S if L = Temp' Ref(L) otherwise}}))
VC: 4_4:
Ensures Clause of Push: Unbounded_Stack_Template.co(9)

Goal:
Str_Info(Temp', lambda L: Z ({{E if L = Temp' Content(L) otherwise}}), lambda L: Z ({{S if L = Temp' Ref(L) otherwise}})) = (<E> o Str_Info(S, Content, Ref))

Given:
(((((((min_int 0)) and Is_Void_Reachable(S, Ref)) and Temp' /= Void) and (E' = Content(Temp') and Content' = lambda L: Z ({{E if L = Temp' Content(L) otherwise}}))) and Ref' = lambda L: Z ({{S if L = Temp' Ref(L) otherwise}}))

Free Variables:
Void: Z, Str_Info: Str(Info), Last_Char_Num: N, min_int: Z, max_int: Z, Max_Char_Str_Len: N, Conc_S: Str(Entry), S: Entry_Ptr_Fac.Position, Content: Z -> Info, Ref: Z -> Z, Temp: Z, Is_Empty: Boolean.B

VC: 5_1:
Convention for Location_Linking_Realiz: Location_Linking_Realiz.rb(9)

Goal:
Is_Void_Reachable(S, Ref)

Given:
((((min_int 0)) and Is_Void_Reachable(S, Ref))
The Location Linking Concept: A Basis for Verification of Code Using Pointers

Gregory Kulczycki, Hampton Smith, Heather Harton, Murali Sitaraman, William F. Ogden, and Joseph E. Hollingsworth

Technical Report RSRG-11-04
School of Computing
100 McAdams
Clemson University
Clemson, SC 29634-0974 USA
September 2011
Copyright © 2011 by the authors. All rights reserved.
VC: 5_2:
Ensures Clause of Is_Empty: Unbounded_Stack_Template.co(16)

Goal:
(S = Void) = (Str_Info(S, Content, Ref) = empty_string)

Given:
((((min_int 0)) and Is_Void_Reachable(S, Ref))

VC: 5_3:
Ensures Clause of Is_Empty: Unbounded_Stack_Template.co(16)

Goal:
Str_Info(S, Content, Ref) = Str_Info(S, Content, Ref)

Given:
((((min_int 0)) and Is_Void_Reachable(S, Ref))
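
The goals of several VCs above can be checked on one concrete memory state with a small executable model. The following Python sketch is illustrative only and is not part of the RESOLVE tool chain. It assumes that Is_Void_Reachable(p, Ref) holds when following Ref from p eventually reaches Void, and that Str_Info(p, Content, Ref) is the string of contents met along that chain, which is how the goals of VCs 2_2 and 3_4 read. The encoding of Void as 0, the dictionary representation of Content and Ref, and the helper name override are assumptions made for this example. Evaluating the goals on a single state is a sanity check, not a proof.

```python
VOID = 0  # assumed encoding of the Void location


def is_void_reachable(p, ref, limit=10_000):
    """Return True if following ref from p reaches VOID within limit steps."""
    steps = 0
    while p != VOID:
        if steps >= limit or p not in ref:
            return False  # treat a cycle or a dangling link as "not reachable"
        p = ref[p]
        steps += 1
    return True


def str_info(p, content, ref):
    """Return the contents along the chain from p to VOID as a list."""
    out = []
    while p != VOID:
        out.append(content[p])
        p = ref[p]
    return out


def override(f, key, value):
    """Model a lambda that maps key to value and agrees with f elsewhere."""
    g = dict(f)
    g[key] = value
    return g


# A stack <"a", "b"> held at locations 1 and 2; location 3 is fresh.
content = {1: "a", 2: "b", 3: None}
ref = {1: 2, 2: VOID, 3: VOID}
S = 1

# Push "e" through the fresh location Temp' = 3 (goals of VCs 4_3 and 4_4).
temp = 3
content_push = override(content, temp, "e")
ref_push = override(ref, temp, S)
push_convention_holds = is_void_reachable(temp, ref_push)
push_ensures_holds = (
    str_info(temp, content_push, ref_push) == ["e"] + str_info(S, content, ref)
)

# Pop (goal of VC 3_4), with R standing for an initial Entry value.
R = None
content_pop = override(content, S, R)
pop_ensures_holds = (
    str_info(S, content, ref)
    == [content[S]] + str_info(ref[S], content_pop, ref)
)

# Is_Empty (goal of VC 5_2).
is_empty_holds = (S == VOID) == (str_info(S, content, ref) == [])
```

The step limit in is_void_reachable is there so that a cyclic Ref, which would violate the convention, makes the check fail instead of looping forever.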