Component Technology for Pointers: Why and How Greg Kulczycki, Murali Sitaraman, William F. Ogden, and Joseph E. Hollingsworth

Technical Report RSRG-03-03 Department of Computer Science 451 Edwards Hall Clemson University Clemson, SC 29634-0974 USA

April 2003

Copyright © 2003 by the authors. All rights reserved.

Gregory Kulczycki,1 Murali Sitaraman,1 William F. Ogden,2 and Joseph E. Hollingsworth3

1 Department of Computer Science, Clemson University, Clemson, SC, 29634, USA
2 Department of Computer and Information Science, The Ohio State University, Columbus, OH, 43210, USA
3 Department of Computer Science, Indiana University Southeast, New Albany, IN, 47150, USA

Keywords: Formal specification; reusable components; linked data structures; memory management

Abstract. Key software engineering questions concerning ease of understanding, reasoning, and efficient and predictable performance can be traced to the use of pointers in imperative programming languages. In this paper we apply modern component technology to pointers and, as a result, take an important step towards addressing these questions at a foundational level. We present a formal specification for a component that captures the complex behavior of pointers. The specification provides programmers with a precise mathematical abstraction for understanding pointer behavior, and enables sound and systematic reasoning about pointer-based data structures. The specification is designed to be implementation-neutral. Alternative implementations of the specification can be plugged in to give programmers the flexibility to choose manual memory management or automatic garbage collection depending on their performance concerns.1

1. Introduction

Software components must be used strictly on the basis of their specifications [Mey99]. This is necessary for clients to understand and reason about components without concern for how they might be implemented. Implementation-neutral specifications give implementers the flexibility to provide alternative implementations for components based on different performance profiles.

Component technology, popularized by standards such as CORBA, COM, and EJB,2 can be applied to any software element that is potentially composable [Szy98]. At the higher end of this spectrum, entire applications may be viewed as components of still larger systems; at the lower end, data types commonly built into languages can be viewed as reusable components of nearly all software systems. If components are client-oriented software,3 then the software elements most frequently used by clients (components that capture built-in data types) are the quintessential components.

Correspondence and offprint requests to: Murali Sitaraman, [email protected]
1 This research funded in part by NSF grant CCR-0113181.
2 See www.omg.com, www.microsoft.com/com, and java.sun.com/products/ejb respectively.
3 Attributed to Christine Mingins in [Mey99].


The objective of this paper is to demonstrate the software engineering benefits of applying component technology to the most complex and controversial of all data structures—the pointer. The rest of this section cites related work to motivate the need for a pointer component and to highlight the contributions of the paper. Section 2 gives the interface of a component that captures pointer behavior and presents an informal introduction to its behavior by describing a system of linked locations. Section 3 presents a formal specification for the component. Section 4 discusses effective memory management techniques and related performance issues. Section 5 demonstrates how the component can be used. The final section contains a summary.

1.1. Motivation for formal specification of pointer behavior

Pointers complicate understanding and reasoning about software. Hoare compares them to jumps (goto statements) because they can “be used to create wide interfaces between parts of a program which appear to be disjoint” [Hoa89b]. Koenig and Moo call them a “slippery subject to master” and recommend that beginning C++ students be introduced first to classes with value semantics [KM98]. The orthodox canonical form is a popular C++ idiom that allows programmers to view objects as values rather than references [Cop91], and value semantics is a fundamental design principle of the C++ Standard Template Library [MDS01].

Many object-oriented languages have implicit pointers, and researchers interested in reasoning about these languages have focused on aliasing as a fundamental problem with pointers [HLW+92]. Aliasing breaks encapsulation [NVP98], complicates formal specification and verification [WH01], and causes a “lack of modularity in reasoning due to the inability to localize object references” [CPN98]. Modular verification of object-oriented programs typically cannot be accomplished without some form of alias control [Smi95, Mül02].

To reduce the reasoning difficulties caused by pointers and the aliasing they introduce, researchers have variously proposed methods for pointer replacement, alias prevention, and alias control. Hoare and Kieburtz suggested that pointers, like goto statements, should be replaced in high-level languages by more manageable constructs [Hoa89a, Kie76]. In support of this goal, Hoare introduced recursive data structures as an alternative to pointers for implementing non-cyclic linked data structures such as lists and trees [Hoa89b]. Efficient data movement and parameter passing for non-trivial objects is typically accomplished by copying object references, which creates aliases. Researchers have explored alternative mechanisms for data movement and parameter passing that are still efficient but do not introduce aliasing [HW91, Bak95, KSO+02]. A significant body of research has been devoted to alias control in object-oriented languages [Alm97, Hog91, NVP98]. These techniques are aimed at encapsulating aliasing to facilitate modular reasoning and are designed to be consistent with an object-oriented style of programming.

Our approach to simplifying pointer-based reasoning centers on the formal specification of a pointer component. The general benefits of formal specification to facilitate modular software composition are well documented in the literature [Jon86, Spi89, Win90]. The component is designed so that programmers can easily implement linked data structures. Unlike previous work in recursive data structures, the component can be used to implement cyclic data structures. Additionally, the design is flexible and can be used with alternative data movement techniques.

1.2. Allowing alternative implementations for pointers

The definition and use of pointers are tied closely to memory management. Dangling references and memory leaks are among the most difficult problems to avoid in programming. Practitioners have devised pointer-like data structures such as safe pointers and checked pointers to alleviate these problems [Mey95, Ale01, PWH00], but they still require discipline on the part of the programmer. Hence, many software engineers advocate the use of automatic garbage collection [JL96, Mey97, GM96]. Garbage collection is an attractive language feature because it simplifies programming and allows programmers to focus their efforts on functional behavior and efficiency. However, garbage collection behavior is difficult to predict, and therefore garbage collectors are not always appropriate for systems in which predictable execution is necessary, such as real-time and embedded systems [Sha01, VG02]. Research into making garbage collection suitable for such systems is ongoing [BCR03, Hen98].

Some programmers prefer to manage memory themselves in any case. Stepanov, for example, says that “disciplined use of containers allows you to do whatever you need to do without automatic memory management” [SS95]. While opinions differ, there seems to be agreement that ideally we should provide for both automatic and manual memory management: Stroustrup is not opposed to garbage collecting implementations of C++, but states that “I don’t want to make the C++ semantics dependent on a garbage collector” [ST02]; and the Real-Time Specification for Java is “independent of any particular [garbage collecting] algorithm” and also permits manual reclamation of memory [BG00]. Our approach is designed to be flexible, and it permits programmers to choose automatic garbage collection or manual memory management based on their performance needs.

[Figure 1: diagram not reproduced. Its elements are labeled C, C.R1, C.R2, D, E, E.R1, E.R2, and F, and its arrows are labeled "implements" and "uses".]
Fig. 1. Design time diagram with component specifications and implementations.

1.3. A component-based approach to reasoning

One reason to define reusable software components, such as the one described in this paper, is that they facilitate modular or compositional reasoning. Reasoning about a component is modular when it depends only on the specifications of other components, and does not require the programmer to reason about other implementations. This allows components that have the same functional specification, but potentially different performance behavior, to be substituted for one another in the same system without requiring the programmer to re-analyze the entire system. Modular reasoning is important for maintainability and scalability.

Figure 1 shows a design time diagram of a software system in which ovals represent concepts (component specifications) and rectangles represent realizations (component implementations). Each realization implements a specification and may depend on several other components. To reason that a realization C.R1 is correct with respect to the specified functional behavior in C, it is sufficient for a programmer to know the component specifications D and E; the programmer does not need to know anything about how D and E might be implemented. Thus, the component specified by E may be implemented with E.R1 or E.R2, but this is irrelevant to the person reasoning about C.R1. This compositional style of reasoning also permits a programmer to substitute C.R2 for C.R1 as long as C.R2 is correct with respect to C. Such a substitution will not affect the functional behavior of the system.
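The substitution argument can be illustrated with a small executable sketch. It is not taken from the paper: the Stack protocol stands in for a concept such as E, ListStack and LinkedStack stand in for realizations E.R1 and E.R2, and reversed_copy stands in for a realization such as C.R1 that is written only against the specification. All of these names are hypothetical.

```python
from typing import Callable, List, Optional, Protocol, Tuple


class Stack(Protocol):
    """Stands in for concept E: the only thing a client may rely on."""

    def push(self, item: int) -> None: ...
    def pop(self) -> int: ...
    def is_empty(self) -> bool: ...


class ListStack:
    """Stands in for realization E.R1: backed by a Python list."""

    def __init__(self) -> None:
        self._items: List[int] = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items


class LinkedStack:
    """Stands in for realization E.R2: backed by nested (item, rest) pairs."""

    def __init__(self) -> None:
        self._top: Optional[Tuple[int, object]] = None

    def push(self, item: int) -> None:
        self._top = (item, self._top)

    def pop(self) -> int:
        item, self._top = self._top  # unpack the top pair, keep the rest
        return item

    def is_empty(self) -> bool:
        return self._top is None


def reversed_copy(items: List[int], make_stack: Callable[[], Stack]) -> List[int]:
    """Stands in for realization C.R1: uses only operations named in Stack."""
    stack = make_stack()
    for item in items:
        stack.push(item)
    out: List[int] = []
    while not stack.is_empty():
        out.append(stack.pop())
    return out


with_list = reversed_copy([1, 2, 3], ListStack)
with_links = reversed_copy([1, 2, 3], LinkedStack)
```

Because reversed_copy never looks inside the stack it is given, swapping one realization for the other changes at most its performance, not its result.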

1.4. Contributions

The general contribution of this paper is the application of component technology to pointers. Specific contributions include:

• We present an abstraction of pointers using a mathematical system of linked locations that captures pointer behavior and is especially well-suited for the implementation of linked data structures, both cyclic and non-cyclic. The abstraction is based on a new metaphor that students and veteran programmers alike will find accessible.
• We give a formal specification of the pointer component that details its functional behavior and includes performance considerations. The specification supports modular reasoning in pointer-based programs and serves as a precise mathematical interface that allows clients to understand and reason about pointers as distinct data structures.
• The pointer component makes it possible for an underlying support system to implement pointers in multiple ways. The specification can be implemented using garbage collection, reference counting, or a basic implementation that leaves the responsibility of reclaiming memory locations to programmers. Different implementations of the specification may be used in different parts of a larger system. Programmers can make this choice based on their performance needs.

    Concept Location Linking Template(type Info; evaluates k: Integer);
        Type Family Position;
        Operation Take New Location(updates p: Position);
        Operation Abandon Location(clears p: Position);
        Operation Relocate(updates p: Position; preserves new location: Position);
        Operation Check Colocation(preserves p, q: Position; replaces are colocated: Boolean);
        Operation Swap Locations(preserves p: Position; evaluates i: Integer; updates new target: Position);
        Operation Redirect Link(preserves p: Position; evaluates i: Integer; preserves new target: Position);
        Operation Follow Link(updates p: Position; evaluates i: Integer);
        Operation Swap Contents(preserves p: Position; updates I: Info);
        Operation Is At Void(preserves p: Position): Boolean;
        Operation Location Size(): Integer;
    end Location Linking Template;

Fig. 2. Syntactic interface of a component that captures pointer behavior.

2. An informal introduction to a system of linked locations

This section introduces the interface of a pointer component and gives an informal description of its behavior. The component provides all the functionality of general pointers and has a design that is particularly well-suited to the easy implementation of data structures that are typically linked, such as lists, trees, and graphs. The description given here differs from traditional explanations of pointers: it is based on a metaphor of linked locations. While it is intended to be easier to grasp for students new to the notion of references, it is also straightforward for programmers already comfortable with pointers and linked data structures. Figure 2 shows an interface of the component. The formal specification appears later in Section 3.

2.1. Organization

The interface in Figure 2 is given in Resolve notation [SW94], though any formal language, such as those summarized in [Win90], is appropriate. The keyword concept denotes a specification which, in this case, is parameterized. A client needs to instantiate the Location Linking Template to create a particular system of linked locations.

Informally, a system is made up of a finite number of locations, each of which contains information and a fixed number of one-way connections to other locations called links. The kind of information and the precise number of links is supplied by the client when she creates the system. Locations are either free or taken. Locations start off as free and are taken by the client one at a time as she needs them. Only taken locations may be manipulated by the client. When the client no longer needs a (taken) location, she can abandon it, and the location will become free again. Since there are only a finite number of locations available, a smart client will be careful to abandon locations that she no longer intends to use (unless they are collected automatically).

The client manipulates locations through her workers. A client can recruit workers, and workers can retire, so unlike the number of locations in a system, the number of workers can vary. All workers reside at some location in the system and serve as representatives of the location they occupy. If the client wishes to alter some aspect of a location, she must do so through a worker. Suppose Pat and Quin are workers. The client may wish to redirect the third link at Pat’s location toward Quin’s location. Rather than change the link herself, she must direct Pat to change it:

    Pat, redirect the third link at your location toward Quin’s location.

The practical result of this command structure is that even if a location has been taken, the client cannot modify its information or links unless the location is occupied by a worker. Therefore, the smart client is careful to ensure that she can get a worker to any location she wants to update. A location that can be reached by a worker is said to be accessible.

Every system has a special location named Void that serves as a default location. The default location is perpetually free, i.e., it cannot be taken by the client. Unlike other free locations, however, the Void location is a useful part of the system. When a client recruits a new worker, he is automatically located at the Void location. Furthermore, the links of every location always point to Void until the client modifies them.

Figure 3 shows a diagram of an example system where symbols (Greek letters) are the information and each location has exactly one link. Workers are represented by lowercase Roman letters. The seven locations inside the dotted line are free, the six locations containing symbols are taken, and the location with the slash through it is the Void location. Notice that cycles are permitted in a system, as are locations whose links point back to themselves. Furthermore, any number of workers can reside at a single location, and different locations may have copies of the same information. This figure is slightly inaccurate because all locations, even free ones, have information and links. The information in free locations is always default information (represented here by the Greek letter φ), and the links of free locations always point to Void. We omit these details in the interest of making the diagram less cluttered.

[Figure 3: diagram not reproduced. The dotted region is labeled "free, non-void locations".]
Fig. 3. A system of linked locations with symbol information and one link per location.

2.2. Actions

The actions presented here are those that a client may perform on a system. They include administrative actions, such as establishing the system and recruiting workers, and management actions, such as changing information and redirecting links. Examples of each action are presented followed by a short description and comments. The first two actions are administrative.

• Establish a system of linked locations that hold symbol information and have one link each.
  Before the client can manage her system she must create it, and this action establishes the specified system. The information and links in free locations are not observable by the client. There are a finite number of locations but the client has no control over the number. From a programming perspective, this action corresponds to instantiating the Location Linking Template. During instantiation, the specification is bound to a particular implementation and the component parameters are bound to the actuals provided by the client.

      Facility Symbol Pointer Fac is Location Linking Template(Symbol, 1) realized by Default Realiz;

• Recruit workers p and q.
  The client can only manipulate locations through her workers. All newly recruited workers begin their careers at the Void location. The equivalent of this action in the programming world is the declaration of Position variables. The Position type is exported by the Location Linking Template.

      Var p, q: Position;
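The two administrative actions can be mirrored in a small executable model. The following Python sketch is only an illustration under stated assumptions: the class names LinkedLocations and Position, the method recruit, the integer encoding of locations with 0 as Void, and the explicit capacity argument are all hypothetical and are not part of the Location Linking Template.

```python
VOID = 0  # assumed encoding: location 0 plays the role of Void


class Position:
    """A worker. Newly recruited workers start at Void."""

    def __init__(self):
        self.loc = VOID

    def is_at_void(self):
        return self.loc == VOID


class LinkedLocations:
    """A system of linked locations with k links per location."""

    def __init__(self, default_info, k, capacity):
        n = capacity + 1  # the extra slot at index 0 is Void
        self.k = k
        self.default_info = default_info
        self.is_taken = [False] * n                   # every location starts free
        self.contents = [default_info] * n            # free locations hold default info
        self.target = [[VOID] * k for _ in range(n)]  # all links point to Void

    def recruit(self):
        """Counterpart of declaring a Position variable."""
        return Position()


# Counterpart of the facility declaration: symbol information, one link each.
symbol_pointer_fac = LinkedLocations(default_info="φ", k=1, capacity=7)

# Counterpart of "Var p, q: Position;".
p = symbol_pointer_fac.recruit()
q = symbol_pointer_fac.recruit()
```

In the paper the client has no control over the number of locations; the capacity argument exists here only so that the toy model is finite and inspectable.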

The remaining actions enable the client to manage the system. Since all locations are manipulated through workers, the actions take the form of commands to a worker. The actions described here all correspond to procedure calls to operations in the Location Linking Template, and the effects of most of these actions are illustrated in Figure 4.

    Action                                                   Procedure Call
    -------------------------------------------------------  ----------------------
    p, relocate to q’s location.                             Relocate(p,q);
    p, redirect your first link toward q’s location while    Swap_Locations(p,1,q);
      q jumps to the link’s original target.
    p, redirect your first link toward q’s location.         Redirect_Link(p,1,q);
    p, follow your first link.                               Follow_Link(p,1);
    p, exchange the information at your location with        Swap_Contents(p,s);
      messenger t’s information.

    [The Before and After columns of the original figure are diagrams and are not reproduced.]

Fig. 4. The effect of selected actions on a system.

• p, take a new location.
  A client must take a location before she can manipulate it, but the taking can only be performed under certain conditions. First, p must not occupy a taken location. If p is a new recruit, this is not a problem because he resides at the perpetually free location Void. Second, there must be at least one free location to take. Once p takes a location its status changes to taken. A taken location remains taken until the client abandons it.

• p, abandon your location.
  For this action to be performed p must occupy a taken location. After p abandons the location, he relocates to Void and the location is added to the pool of free locations. Recall that all free locations have default information and default links, so a side effect of this action is that the information in the location is changed back to the default and all links in the location are redirected to Void. A problem may occur if two workers reside at the same location when the client abandons it. For example, suppose p and q reside at the same location. Both workers are representatives of the location, so the client can direct either of them to abandon it. If she tells p to abandon the location, p will relocate to Void, but q will find himself occupying a free location. If the client does sloppy bookkeeping she may try to manipulate q’s location, causing unpredictable results. In a programming environment, a worker (Position variable) at a free location is known as a dangling reference.

• p, relocate to q’s location.
  This is one way a client can move p from one location to another. In this case the new location must already have a worker at it, but the locations may be free or taken. Thus, this action may be used to move workers to or from Void. A companion action to this enables the client to ask p if he and q are colocated.
• p, follow your location’s third link.
  If p is at a taken location, the client can move him to a new location by directing him to follow any of the links there. A location is accessible if it is occupied by a worker or if it can be reached by a worker who follows a series of links. For example, if p occupies a location that contains a link to a second location, both locations are accessible regardless of whether the second location is occupied by a worker, because p can reach the second location by following the link from his original location. When a location is first taken it is accessible because a worker resides there, but actions that relocate a worker or redirect links can potentially change the accessibility of the system. In an ideal setting a client would like to ensure that all taken locations remain accessible. A taken but inaccessible location presents a problem: the location is not free so the client cannot take it, and the location is not accessible so the client cannot use it. The total number of locations is finite, so if enough locations fall into this state the client may find herself trying to take a new location from an empty pool. In a programming environment, taken but inaccessible locations are known as memory leaks. Note that it is technically possible for the Void location to be inaccessible, but practically the client need only recruit a new worker and Void becomes accessible.

• p, exchange the information at your location with messenger t’s information.
  The client needs a way to modify the information contained in locations and this action provides it. Implicit in this action is an assumption that the client has a way to manage information, including the ability to recruit messengers who carry that information. This action can only be performed on taken locations. The “messenger” t in this action corresponds to a variable declared to be of the information type. Recall that the client provides the information type when she instantiates the system. This is the only action involving information directly, indicating that a client cannot modify information while it is still in the system; she can only exchange it with information that is outside the system. A client may, of course, swap out the information at a particular location, modify it, and then swap it back in.

• p, redirect the third link at your location toward q’s location.
  If p occupies a taken location, any of its links can be redirected toward q’s location. q’s location need not be taken, so a link at p’s location can be directed toward Void. This action can potentially change the accessibility of the system.
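The two hazards named in the list above, dangling references and memory leaks, can be reproduced in a toy model of the informal rules. The sketch below is an assumption-laden illustration, not the paper's component: the names System, Position, recruit, dangling, and leaked, the integer encoding of locations with 0 as Void, and the policy of taking the lowest-numbered free location are all invented for this example.

```python
VOID = 0  # assumed encoding of the Void location


class Position:
    def __init__(self):
        self.loc = VOID  # new workers start at Void


class System:
    """Toy model of a system of linked locations with k links each."""

    def __init__(self, k, capacity, default_info=None):
        self.k, self.default_info = k, default_info
        self.taken = [False] * (capacity + 1)  # index 0 is Void
        self.contents = [default_info] * (capacity + 1)
        self.target = [[VOID] * k for _ in range(capacity + 1)]
        self.workers = []  # every recruited Position

    def recruit(self):
        p = Position()
        self.workers.append(p)
        return p

    def take_new_location(self, p):
        assert not self.taken[p.loc], "p must not occupy a taken location"
        free = [q for q in range(1, len(self.taken)) if not self.taken[q]]
        assert free, "there must be at least one free location"
        p.loc = free[0]
        self.taken[p.loc] = True

    def abandon_location(self, p):
        assert self.taken[p.loc], "p must occupy a taken location"
        q = p.loc
        self.taken[q] = False
        self.contents[q] = self.default_info  # back to default information
        self.target[q] = [VOID] * self.k      # back to default links
        p.loc = VOID

    def relocate(self, p, new_location):
        p.loc = new_location.loc

    def redirect_link(self, p, i, new_target):
        assert self.taken[p.loc] and 1 <= i <= self.k
        self.target[p.loc][i - 1] = new_target.loc

    def follow_link(self, p, i):
        assert self.taken[p.loc] and 1 <= i <= self.k
        p.loc = self.target[p.loc][i - 1]

    def accessible(self):
        """Locations occupied by a worker or reachable by following links."""
        seen, stack = set(), [w.loc for w in self.workers]
        while stack:
            q = stack.pop()
            if q not in seen:
                seen.add(q)
                stack.extend(self.target[q])
        return seen

    def dangling(self):
        """Workers at a free location other than Void."""
        return [w for w in self.workers if w.loc != VOID and not self.taken[w.loc]]

    def leaked(self):
        """Taken locations that no worker can reach."""
        reach = self.accessible()
        return [q for q in range(1, len(self.taken)) if self.taken[q] and q not in reach]


s = System(k=1, capacity=4)
p, q = s.recruit(), s.recruit()

s.take_new_location(p)   # p takes location 1
s.relocate(q, p)         # q joins p at location 1
s.abandon_location(p)    # p goes to Void, location 1 becomes free
dangling_after_abandon = s.dangling()   # q is left at a free location

s.relocate(q, p)         # bring q back to Void
s.take_new_location(p)   # p takes location 1 again
s.relocate(p, q)         # p leaves for Void without abandoning location 1
leaked_after_relocate = s.leaked()      # location 1 is taken but unreachable
```

The model reports the dangling worker and the leaked location only because it keeps a list of all workers, bookkeeping that a client of the real component would have to do herself or delegate to a garbage-collecting implementation.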

3. The formal specification of Location Linking Template

This section describes the formal specification of the Location Linking Template given in Figure 5. Most of the relationships between the mathematical objects in the concept and the notions introduced informally are straightforward.

3.1. Definitions

The defines clauses at the beginning of Location Linking Template indicate that any implementation of this concept must provide definitions for the mathematical type Location, the mathematical object Void, and the mathematical object Taken Location Displacement. Though we expect that objects of type Location will somehow be tied to a machine’s memory addresses, we don’t want to presume how the implementer will model them. In particular, the specification is flexible enough so that Locations that are free or inaccessible (or both) need not correspond to real memory locations. For the purposes of reasoning about this component, the programmer need only know that Location is a set, Void is a specific location, and Taken Location Displacement is a positive integer.

Objects of type Location in the concept correspond to the notion of locations in the system of linked locations described above. The type parameter, Info, indicates the type of information that a location contains, while the second parameter, k, indicates the number of links that a location contains. The three conceptual variables near the beginning of the concept are functions that take locations as arguments: Contents(q) returns the information at a given location q, Target(q, i) returns the location targeted by the i-th link of q, and Is Taken(q) returns true if q is taken and false if q is free.

The constraints clause ensures three things: that the distinguished location Void is always free, that all free locations have default information and default links, and that a specific relationship holds between the cardinality of the set Location and Taken Location Displacement. The assertion that free locations have default information and default links exists strictly for reasoning about functional behavior. The performance part of this specification assumes that no memory is allocated for information until the Take New Location operation is invoked.
The value of Taken Location Displacement is the amount of space that a newly taken location occupies in memory; it depends on the type Info, the number k of links, and the implementation. For example, a programmer may implement Taken Location Displacement as the amount of memory for a default object of type Info plus the amount of memory for k memory addresses. The last part of the constraints section therefore ensures that the total number of locations is at least as large as the total amount of memory in the system (Total Memory Capacity) divided by the amount of memory that a newly taken location occupies. In other words, the number of mathematical locations is always greater than or equal to the number of locations that can be stored in real memory.

    Concept Location Linking Template(type Info; evaluates k: Integer);
        uses String Theory, Std Boolean Facility, Std Integer Fac;

        Defines Location: Set;
        Defines Void: Location;
        Defines Taken Location Displacement: N+;

        Var Target: Location × [1..k] → Location;
        Var Contents: Location → Info;
        Var Is Taken: Location → B;

        Constraints ¬Is Taken(Void) and
            ( ∀q: Location, if ¬Is Taken(q), then
                  Info.Is Initial(Contents(q)) and ∀j: [1..k], Target(q, j) = Void ) and
            ||Location|| ≥ Total Memory Capacity / Taken Location Displacement;

        Initialization ensures ∀q: Location, ¬Is Taken(q);

        Type Family Position is modeled by Location;
            exemplar p;
            Definition Variable Is Accessible(q: Location): B =
                ∃p id: Position.names, ∃h: [1..k], ∃ρ: Str(Location × [1..k]) ∋
                    Position.Is Active(p id) and
                    ⟨(Position.Denote(p id), h)⟩ Is Prefix ρ and
                    ∀u, v: Location, ∀i, j: [1..k],
                        if ⟨(u, i)⟩ ◦ ⟨(v, j)⟩ Is Substring ρ then Target(u, i) = v and
                    ⟨(q, 1)⟩ Is Suffix ρ;
            initialization updates Is Accessible;
                ensures p = Void;
            finalization updates Is Accessible;

        Operation Take New Location(updates p: Position);
            updates Is Taken, Is Accessible;
            requires ¬Is Taken(p) and Taken Location Displacement ≤ Remaining Memory Capacity;
            ensures ¬#Is Taken(p) and ¬#Is Accessible(p) and
                ∀q: Location,
                    Is Taken(q) = { true            if q = p
                                  { #Is Taken(q)    otherwise;

        Operation Abandon Location(clears p: Position);
            updates Target, Contents, Is Taken, Is Accessible;
            requires Is Taken(p);
            ensures ∀q: Location,
                    Is Taken(q) = { false           if q = p
                                  { #Is Taken(q)    otherwise
                and
                    if Is Taken(q) then Contents(q) = #Contents(q) and
                        ∀n: [1..k], Target(q, n) = #Target(q, n);

        Operation Relocate(updates p: Position; preserves new location: Position);
            updates Is Accessible;
            ensures p = new location;

        Operation Check Colocation(preserves p, q: Position; replaces are colocated: Boolean);
            ensures if ((Is Taken(p) or p = Void) and (Is Taken(q) or q = Void))
                then are colocated = ( p = q );

        Operation Swap Locations(preserves p: Position; evaluates i: Integer; updates new target: Position);
            updates Target;
            requires 1 ≤ i ≤ k and Is Taken(p);
            ensures ∀q: Location, ∀j: [1..k],
                    Target(q, j) = { #new target     if q = p and j = i
                                   { #Target(q, j)   otherwise
                and new target = #Target(p, i);

        Operation Redirect Link(preserves p: Position; evaluates i: Integer; preserves new target: Position);
            updates Target, Is Accessible;
            requires 1 ≤ i ≤ k and Is Taken(p);
            ensures ∀q: Location, ∀j: [1..k],
                    Target(q, j) = { #new target     if q = p and j = i
                                   { #Target(q, j)   otherwise;

        Operation Follow Link(updates p: Position; preserves i: Integer);
            updates Is Accessible;
            requires Is Taken(p) and 1 ≤ i ≤ k;
            ensures p = Target(#p, i);

        Operation Swap Contents(preserves p: Position; updates I: Info);
            updates Contents;
            requires Is Taken(p);
            ensures I = #Contents(p) and
                ∀q: Location,
                    Contents(q) = { #I              if q = p
                                  { #Contents(q)    otherwise;

        Operation Is At Void(preserves p: Position): Boolean;
            ensures Is At Void = ( p = Void );

        Operation Location Size(): Integer;
            ensures Location Size = ( Taken Location Displacement );
    end Location Linking Template;

Fig. 5. A formal specification of Location Linking Template.

[Figure 6: diagrams (a) and (b) not reproduced.]
Fig. 6. A newly initialized system of linked locations (a), and a simplified version (b).

The initialization clause ensures that immediately after this concept is instantiated all locations are free. Instantiation can be viewed as establishing a system of linked locations and giving it a name. The following facility declaration establishes Symbol Pointer Fac as a system of linked locations with Symbol information and one link per location.

    Facility Symbol Pointer Fac is Location Linking Template(Symbol, 1) realized by Default Realiz;

The initialization clause ensures that all locations in the newly created system are free, and the constraints clause ensures that all of these free locations have default information and default links. The default target for links is the Void location, so assuming that the default symbol is φ, a picture of the newly created system will look similar to Figure 6(a). The constraints indicate that free locations always hold default information and default links and that the Void location is special, so a useful simplification of this picture is given in Figure 6(b), where a dotted line surrounds free, non-Void locations, and the Void location is represented by a circle with a bisecting slash.
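As a rough illustration of the three parts of the constraints clause, the following Python sketch checks them against an explicit state. Everything concrete in it is an assumption made for the example: the State record, the integer encoding of locations with 0 as Void, equality with a default value as a stand-in for Info.Is Initial, and the sizes (8 units for Info, 4 units per address, 1024 units of total memory).

```python
from dataclasses import dataclass
from typing import Dict, List

VOID = 0  # assumed encoding of the Void location


@dataclass
class State:
    k: int
    default_info: str
    is_taken: Dict[int, bool]     # models Is Taken
    contents: Dict[int, str]      # models Contents
    target: Dict[int, List[int]]  # models Target; the list holds links 1..k


def fresh_state(num_locations: int, k: int, default_info: str) -> State:
    """State right after initialization: every location is free."""
    locs = range(num_locations)
    return State(
        k=k,
        default_info=default_info,
        is_taken={q: False for q in locs},
        contents={q: default_info for q in locs},
        target={q: [VOID] * k for q in locs},
    )


def satisfies_constraints(s: State, total_memory_capacity: int, displacement: int) -> bool:
    # Part 1: Void is always free.
    void_is_free = not s.is_taken[VOID]
    # Part 2: free locations hold default information and default links.
    free_are_default = all(
        s.contents[q] == s.default_info and all(t == VOID for t in s.target[q])
        for q in s.is_taken
        if not s.is_taken[q]
    )
    # Part 3: ||Location|| >= Total Memory Capacity / Taken Location Displacement.
    enough_locations = len(s.is_taken) >= total_memory_capacity / displacement
    return void_is_free and free_are_default and enough_locations


INFO_SIZE, ADDRESS_SIZE, TOTAL_MEMORY = 8, 4, 1024  # invented sizes
K = 2
DISPLACEMENT = INFO_SIZE + K * ADDRESS_SIZE  # one Info object plus k addresses

fresh = fresh_state(64, K, "φ")      # exactly 1024 / 16 locations
too_small = fresh_state(63, K, "φ")  # one location short
corrupted = fresh_state(64, K, "φ")
corrupted.contents[5] = "α"          # a free location with non-default information
```

With these numbers a newly taken location occupies 16 units, so at least 64 mathematical locations are required; the second state fails the cardinality condition and the third fails the default-information condition.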

3.2. Position Type The concept exports the programming type Position. As noted, Position variables are the workers described in the system above. Position variables are modeled mathematically as locations. Therefore, although position variables may be thought of as workers in the system, their mathematical values correspond to locations. For example, the initialization ensures clause asserts that p = Void. Since the symbol p that occurs here


is the mathematical value of the programming variable p rather than the programming variable itself, the assertion is interpreted as “The location of the worker named p is Void,” or simply “Worker p resides at Void.” The clause indicates that all newly recruited workers begin their careers at Void. The exemplar clause simply asserts that p represents an arbitrary position variable when it occurs in the scope of the type declaration.

The predicate Is Accessible is defined inside the type declaration only because we follow the convention of defining objects before using them. Definitions that occur in the scope of the type declaration are treated as if they occurred at the module level. The mathematical definition Is Accessible is a variable rather than a constant (most definition objects are constants) because its value depends not only on the value of its parameter, q, but also on the state of the program: the same location may be accessible in one program state and inaccessible in the next. The definition states that a location is accessible if there is a series of links to that location starting from a location represented by a Position variable.

The type Position.names represents the set of all possible identifiers for variables declared to be of type Position, and the predicate Position.Is Active(p id) indicates whether the identifier corresponds to an active variable. When a variable is declared of type Position, it gets a unique identifier from the set Position.names and that identifier becomes active. When a variable is destroyed (goes out of scope) the identifier becomes inactive again. The function Position.Denote(p id) returns the object corresponding to the Position variable identified by p id. The angle brackets and small circle are notations for mathematical strings: ⟨x⟩ is a string containing the object x, and the ◦ operator indicates string concatenation.
When an operation can potentially modify state that affects a definition variable such as Is Accessible, we include the variable in the updates clause as a convenience to the client. Hence, both the initialization and finalization (which act as implicit operations) include Is Accessible in their updates clauses to indicate that the declaration or destruction of a Position variable may potentially change the accessibility of the system.
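The state dependence of Is Accessible can be made concrete with a small Python sketch. It treats accessibility as graph reachability from the locations where active Position variables reside. The function names, the dictionary encoding of Target, and the use of 0 for Void are assumptions made for illustration.

```python
from collections import deque

VOID = 0  # assumption: location 0 stands for Void


def accessible_locations(position_values, target):
    """Return every location reachable from some active Position variable.

    position_values: locations where the active Position variables reside.
    target: dict mapping a location to the list of locations its links point to.
    """
    seen = set(position_values)
    frontier = deque(seen)
    while frontier:
        q = frontier.popleft()
        for nxt in target.get(q, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def is_accessible(q, position_values, target):
    """Accessibility depends on the program state, not only on q."""
    return q in accessible_locations(position_values, target)


# Location 1 links to 2, 2 links to Void, and 3 links to 1.
target = {VOID: [VOID], 1: [2], 2: [VOID], 3: [1]}
```

With a single variable residing at location 1, location 3 is inaccessible; declaring a second variable that resides at 3 makes it accessible without changing any link, which is why declaration and destruction of Position variables list Is Accessible in their updates clauses.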

3.3. Operations

The management actions informally described above correspond directly to the operations given in the concept. Allocation and deallocation often have special status in programming languages. In the Location Linking Template they are handled the same as any other operation. The Take New Location operation is repeated here for convenience.

    Operation Take New Location(updates p: Position);
        updates Is Taken, Is Accessible;
        requires ¬Is Taken(p) and
            Taken Location Displacement ≤ Remaining Memory Capacity;
        ensures ¬#Is Taken(p) and ¬#Is Accessible(p) and
            ∀q: Location, Is Taken(q) = { true            if q = p
                                        { #Is Taken(q)    otherwise;

The operation takes a single position variable as a parameter. The updates parameter mode indicates to the client that the operation modifies the value of p. The updates clause on the following line gives the conceptual (state) variables that we can expect this operation to modify. In this case, we can expect the operation to affect both the taken status of one or more locations and the accessibility of the system. The requires clause has two conditions. First, the position variable p cannot reside at a taken location. Since the Void location is perpetually free, it will be the location where p typically resides when the operation is called. The second condition is that the amount of real memory available (represented by the global system variable Remaining Memory Capacity) is greater than or equal to the memory that must be allocated to a newly taken location. This, like the requirement involving memory in the constraints clause, is an example of a lightweight performance specification. The unavoidable impact of performance considerations on formal specification design is documented in [Sit96]. These specifications are included with the behavioral specifications because they are intended to apply to all implementations of the concept, even though various implementations are still free to make other performance tradeoffs, as we shall see in the next section. In this particular case, the notion that there must be enough memory to dynamically create a new object is central to the behavior of the operation, since otherwise the operation will fail. Note that the principle of observability compels us to include an operation that allows the user to check this condition. The function Location Size supplied by this module (along with a global operation that returns the amount of available memory) serves this purpose. A discussion of performance specification and analysis can be found in [KOS02, Heh98]. 
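The requires and ensures clauses of Take New Location can be read as run-time checks in the following Python sketch. The System class, the decrement of the remaining memory, and the helper can_take_new_location (standing in for the observability check built from Location Size) are assumptions. The sketch does not model position variables, so the conjunct about prior accessibility is left out.

```python
VOID = 0  # assumption: location 0 stands for Void


class System:
    def __init__(self, size, remaining_memory, location_size):
        self.is_taken = [False] * size
        self.remaining_memory = remaining_memory   # Remaining Memory Capacity
        self.location_size = location_size         # Taken Location Displacement


def can_take_new_location(system):
    """Client-side check of the memory half of the requires clause."""
    return system.location_size <= system.remaining_memory


def take_new_location(system, p):
    """p is the location where the position currently resides; returns its new location."""
    # requires clause
    if system.is_taken[p]:
        raise ValueError("p must not reside at a taken location")
    if not can_take_new_location(system):
        raise ValueError("not enough remaining memory for a new location")
    old_taken = list(system.is_taken)              # plays the role of #Is Taken
    # Choose any free, non-Void location (assumed to exist in this finite toy).
    new_p = next(q for q in range(1, len(old_taken)) if not old_taken[q])
    system.is_taken[new_p] = True
    system.remaining_memory -= system.location_size  # assumption of this sketch
    # ensures clause: only the status of the new location changed.
    assert not old_taken[new_p]
    assert all(system.is_taken[q] == (True if q == new_p else old_taken[q])
               for q in range(len(old_taken)))
    return new_p
```

A caller that wants to avoid the failure case would test can_take_new_location first, which is the role the text assigns to the observability principle.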
For complete performance specifications—those encompassing both duration and displacement—we envision a


separate performance profile for each implementation. The displacement information described here merits special treatment because it represents displacement in the system that is permanent until it is explicitly deallocated.

The ensures clause guarantees two things: first, that the newly taken location was not previously taken or previously accessible; second, that the newly taken location has a status of taken while the status of all other locations remains unchanged. All operations have an implicit frame property that ensures that any variables not mentioned in the parameter list or updates clause will remain unchanged. Thus, a client can be sure that the newly taken location has default information and that all of its links point to the Void location by the following reasoning: from the constraints clause, she knows that all free locations contain default information and default links; from the ensures clause, she knows that the newly taken location was previously free; and from the frame property she knows that the Target and Contents variables did not change since they were not mentioned in the updates clause.

The parameters in these operations have various modes that affect the specification and help the client understand what happens to the arguments of a call. The updates mode has already been mentioned. The clears mode ensures that an argument will return from the procedure with an initial value of its type, the preserves mode prohibits any changes to an argument’s value, and the replaces mode indicates that the incoming value will be ignored and replaced. The evaluates mode indicates that the operation expects an expression in this position; it is typically used with types that are often returned from functions, like integers.

The Swap Locations operation was not described in the informal introduction. The operation is similar to Redirect Link except that it also relocates the worker from the new target location to the old target location (see Figure 4). In effect, the specified link and the specified worker are swapping locations. This operation has the desirable property that it does not affect the accessibility of the system, which means that no new memory leaks will be created by using it. Furthermore, using Swap Locations and Relocate one can implement both Redirect Link and Follow Link, effectively making them secondary operations.
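One way the claim about secondary operations might be realized is sketched below in Python. The Position class, the dictionary of link lists, zero-based link indices, and the particular sequences of calls are assumptions of this sketch; the paper states only that such implementations exist.

```python
VOID = 0  # assumption: location 0 stands for Void


class Position:
    """A worker; `loc` is the location where it currently resides."""

    def __init__(self):
        self.loc = VOID  # newly recruited workers begin at Void


def relocate(p, new_location):
    """Relocate(updates p; preserves new_location)."""
    p.loc = new_location.loc


def swap_locations(target, p, i, q):
    """Link i at p's location and worker q trade the locations they refer to."""
    target[p.loc][i], q.loc = q.loc, target[p.loc][i]


def redirect_link(target, p, i, q):
    """Aim link i at p's location toward q's location, leaving q in place."""
    temp = Position()
    relocate(temp, q)
    swap_locations(target, p, i, temp)   # temp ends at the old target and is retired


def follow_link(target, p, i):
    """Move p along link i of its current location, leaving the link unchanged."""
    dest, helper = Position(), Position()
    swap_locations(target, p, i, dest)    # dest at the old target; link now aims at Void
    relocate(helper, dest)
    swap_locations(target, p, i, helper)  # link restored; helper is back at Void
    relocate(p, dest)
```

For example, with links 1 → 2 → 3 → Void and p at location 1, follow_link moves p to location 2 and leaves the link at location 1 still aimed at 2.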

4. Memory management

Through its operations, the Location Linking Template provides all the functionality of traditional pointers. For example, the client can obtain the benefits of aliasing by positioning two or more workers at the same location. But the concept also allows the client to fall into the traditional traps involving pointers: dangling references and memory leaks. This section looks at different ways these problems can be managed.

4.1. Performance and extensions

A dangling reference occurs when a location is free but remains accessible, as in the following code.

    Var x, y: Position;
    Take New Location(x);
    Relocate(y, x);
    Abandon Location(x);

When x abandoned his location, the location’s status changed from taken to free. Though x was relocated to Void, y remained at the location, so the location continues to be accessible. Position variables are effectively bound to the type of Info during instantiation, so there is no danger of inadvertently modifying (through the dangling reference) the contents of a memory location that is being used by another variable somewhere else in the program. Real memory locations on a machine are limited, so the specification permits implementations that reclaim a location's memory even if a dangling reference to it exists. The Is Usable operation (provided in an extension to the concept and shown in Figure 7) effectively tells the client whether a worker is a dangling reference. Since a worker resides at the location in question, the location is accessible. If the location is taken, it is usable by the client; if the location is free, the client cannot affect it.

A memory leak occurs when a location is taken but not accessible. The following code segment creates a memory leak.

    Var x, y: Position;
    Take New Location(x);
    Relocate(x, y);


    Extension Usability Checking Capability for Location Linking Template;
        Operation Is Usable(preserves p: Position): Boolean;
            ensures Is Usable = ( Is Taken(p) );
    end Usability Checking Capability;

    Extension Cleanup Capability for Location Linking Template;
        Operation Abandon Inaccessible();
            updates Is Taken, Contents, Target;
            ensures ∀q: Location,
                Is Taken(q) = (#Is Accessible(q) and #Is Taken(q)) and
                if Is Taken(q) then Contents(q) = #Contents(q) and
                    ∀i: [1..k], Target(q, i) = #Target(q, i);
    end Cleanup Capability;

Fig. 7. Extensions to Location Linking Template.

The location that was taken by x continues to have a taken status but has become inaccessible. Real memory locations are limited, so a proliferation of memory leaks is a serious problem. There are traditionally two ways to deal with memory leaks. The first is to avoid them by keeping a careful accounting of the locations in the system and explicitly abandoning a location before it becomes a leak. The second is to do periodic garbage collection.

The operation that performs garbage collection, Abandon Inaccessible, is provided in an extension to the concept. Extensions are separate from the main concept because their operations cannot be implemented without costly performance penalties. A realization of the concept may, but need not, implement an extension. Both the Abandon Inaccessible and Is Usable operations reside in extensions. A garbage collecting implementation of Location Linking Template would also provide a procedure for the Abandon Inaccessible operation. A client may then choose to ignore the Abandon Location operation and periodically invoke the Abandon Inaccessible operation instead. This reflects an all-or-nothing style of garbage collection, as in the copying-collection approach to garbage collection.

Another extension could be written that uses a two-phase approach to garbage collection, as in the popular mark-and-sweep approach. It would have at least two operations: one, Mark Inaccessible, would update a set of marked locations; another, Abandon Marked, would abandon the locations in the set and then clear the set.
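The two hazards discussed in this section and the two extension operations of Figure 7 can be exercised in a toy Python model. The Pointers class, the use of variable names as dictionary keys, one link per location, and the helper methods dangling and leaked are assumptions made for illustration; only the operation names come from the concept.

```python
VOID = 0  # assumption: location 0 stands for Void


class Pointers:
    """Toy system with one link per location."""

    def __init__(self, size):
        self.is_taken = [False] * size
        self.target = [VOID] * size
        self.positions = {}                      # variable name -> location

    def declare(self, *names):
        for name in names:
            self.positions[name] = VOID          # new workers start at Void

    def accessible(self):
        seen, stack = set(), list(self.positions.values())
        while stack:
            q = stack.pop()
            if q not in seen:
                seen.add(q)
                stack.append(self.target[q])
        return seen

    def take_new_location(self, p):
        assert not self.is_taken[self.positions[p]]
        reachable = self.accessible()
        new = next(q for q in range(1, len(self.is_taken))
                   if not self.is_taken[q] and q not in reachable)
        self.is_taken[new] = True
        self.positions[p] = new

    def relocate(self, p, new_location):
        self.positions[p] = self.positions[new_location]

    def abandon_location(self, p):
        q = self.positions[p]
        assert self.is_taken[q]
        self.is_taken[q], self.target[q] = False, VOID   # free locations hold defaults
        self.positions[p] = VOID

    def is_usable(self, p):                      # Usability Checking Capability
        return self.is_taken[self.positions[p]]

    def abandon_inaccessible(self):              # Cleanup Capability
        reachable = self.accessible()            # computed on the incoming state
        for q, taken in enumerate(self.is_taken):
            if taken and q not in reachable:
                self.is_taken[q], self.target[q] = False, VOID

    def dangling(self):
        """Free, non-Void locations that are still accessible."""
        return {q for q in self.accessible() if q != VOID and not self.is_taken[q]}

    def leaked(self):
        """Taken locations that are no longer accessible."""
        reachable = self.accessible()
        return {q for q, taken in enumerate(self.is_taken) if taken and q not in reachable}


def dangling_example():
    s = Pointers(4)
    s.declare("x", "y")
    s.take_new_location("x")
    s.relocate("y", "x")
    s.abandon_location("x")
    return s


def leak_example():
    s = Pointers(4)
    s.declare("x", "y")
    s.take_new_location("x")
    s.relocate("x", "y")
    return s
```

In dangling_example the worker y is left at a free location, so is_usable("y") is false. In leak_example one taken location is unreachable until abandon_inaccessible reclaims it.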

4.2. Implementation flexibility

For clients to take full advantage of the implementation options allowed by this component, they need a programming environment that lets them choose different implementations on a case-by-case basis. The facility declaration provides a mechanism for doing this in Resolve, the language used throughout this paper [SW94, SAK+00]. The following declaration creates a pointer facility containing Queue information and one link per location, and it uses the default no-frills realization for Location Linking Template that puts the burden of memory management on the client.

    Facility Queue Pointer Fac is Location Linking Template(Queue, 1)
        realized by Default Realiz;

A garbage collecting implementation would implement the Cleanup Capability extension in addition to the operations in the concept.

    Facility GC Queue Pointer Fac is Location Linking Template(Queue, 1)
        extended by Cleanup Capability
        realized by Garbage Collecting Realiz;

Each facility acts as a distinct component. Thus, an object of type Queue Pointer Fac.Position cannot be used where an object of type GC Queue Pointer Fac.Position is expected. The question of how clients will choose to manage memory will depend on several factors, including how predictable the application needs to be and how difficult it is to do manual memory management. Manual memory management is easier when pointer facilities are local rather than global. Local facilities are declared inside a realization, while global facilities are declared as distinct facilities and imported by other compilation units (typically realizations). Memory management for local pointer facilities can often be accomplished manually since reasoning about reference behavior remains local to the realization. This also provides a

Fig. 8. The top portion of the figure depicts an empty list and the bottom portion depicts a list with three items where the cursor is between the second and third item. Abstract list representations are on the left and their concrete counterparts are on the right.

simple way of encapsulating aliasing, because the programmer can be certain that no local pointer variables will be exported outside of the component containing the local facility. The list component described in the next section (and implemented in the appendix) contains a local pointer facility. The list implementation uses a form of manual memory management known as component-level memory management, which is a term for manual memory management when it is applied to reusable software components [Mey97]. For acyclic data structures such as lists and trees, a reference counting implementation may be appropriate. Instead of clients having to prove to themselves (or a verification system) that the code does not create memory leaks, they need only prove that their data structure remains acyclic. Manual memory management for global pointer facilities becomes more difficult with each realization that imports the facility, because all realizations that import the same facility share the same set of locations. Therefore, global pointer facilities may be good candidates for garbage collection implementations. The facility mechanism allows multiple pointer components with different implementations to exist in the same program. The next section gives an example of a list component implemented using Location Linking Template.
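The remark about reference counting for acyclic structures can be illustrated with a hypothetical realization sketched in Python. Nothing here is taken from the paper's realizations: the class RefCountedSystem, the rule that a count covers both workers and links, and the signatures (positions are passed and returned as plain location numbers) are assumptions of this sketch.

```python
VOID = 0  # assumption: location 0 stands for Void


class RefCountedSystem:
    """One link per location; a location is freed when nothing refers to it."""

    def __init__(self, size):
        self.is_taken = [False] * size
        self.target = [VOID] * size
        self.count = [0] * size        # workers plus links aimed at each location

    def _retain(self, q):
        if q != VOID:
            self.count[q] += 1

    def _release(self, q):
        # Free q once its count reaches zero, then release whatever q linked to.
        while q != VOID:
            self.count[q] -= 1
            if self.count[q] > 0:
                return
            self.is_taken[q] = False
            nxt = self.target[q]
            self.target[q] = VOID      # free locations hold default links
            q = nxt

    def take_new_location(self):
        q = next(i for i in range(1, len(self.is_taken)) if not self.is_taken[i])
        self.is_taken[q] = True
        self.count[q] = 1              # the worker that took it
        return q

    def relocate(self, p, new_location):
        """Move a worker from location p to new_location; returns its new location."""
        self._retain(new_location)
        self._release(p)
        return new_location

    def redirect_link(self, p, q):
        """Aim the link at location p toward location q."""
        self._retain(q)
        old, self.target[p] = self.target[p], q
        self._release(old)
```

When the last worker leaves the front of an acyclic chain, the whole chain is reclaimed without any call to Abandon Location. A cycle would keep its counts above zero and leak, which is why the client's proof obligation shifts to showing that the structure stays acyclic.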

5. Application

Using Location Linking Template to implement linked data structures will be familiar to anyone who has implemented a linked list in a language with explicit pointers such as C or Pascal. This section describes a data type, List Template, and its implementation using Location Linking Template. Both the concept (specification) and a realization (implementation) are given in the appendix.

The mathematical model for List consists of two strings: a left string and a right string. (A mathematical string is like a sequence, except that it is not indexed.) The conceptual position between the strings is known as the cursor position. For example, ⟨α, β⟩⟨γ⟩ is a list of three Greek letters whose cursor is between β and γ, and ⟨ ⟩⟨ ⟩ is an empty list whose cursor is at the beginning of the list. The operations in List Template include Insert, Remove, Advance, and Reset. When an item is inserted into the list it is inserted at the beginning of the right string; when an item is removed from the list it is removed from the beginning of the right string. To advance the cursor, an item is moved from the beginning of the right string to the end of the left string. To reset the cursor, all items in the left string are moved (preserving their order) to the beginning of the right string.

The implementation of List Template (Location Linking Based) appears after its specification in the appendix. The list data type is represented by a record with three Position fields (head, pre, and last) and two Integer fields (right length and left length). The lengths correspond to the left and right string lengths in the concept. The field head perpetually resides at a dummy location at the beginning of the list. This location serves as a sentinel: it will not be used to hold information, but it allows the implementer to ignore certain boundary conditions. The field pre is always located at the end of the left string; if the left string is empty, pre is colocated with head. The field last is always located at the end of the right string; if the right string is empty, last is colocated with pre. Figure 8 shows the location of these position variables for two different states of a list object, and Figure 9 illustrates how a system is updated during the call to an Insert operation.

An interesting aspect of the Location Linking Based implementation of List Template is the set of conventions (also called representation invariants) that must hold before and after each procedure. The first few

Fig. 9. The Insert procedure consists of four main sections: recruiting a worker named post to reside at the location targeted by pre, creating a new location with the desired information (the left diagram), redirecting links so that the new location is positioned correctly in the list (the right diagram), and ensuring that the worker last resides at his correct location in the updated system. When the procedure ends, the workers new and post are retired.

assertions in the conventions clause describe how the fields of the record (head, pre, last, left length, and right length) are related. Using these assertions together with the correspondence (also called the abstraction relation) one can show that the location Void is accessible from all locations that hold items in the list. The last assertion,

    ∀q: Location, ( Is Accessible(q) iff Is Taken(q) );

prohibits dangling references and memory leaks. The list example provides a classic application of data abstraction and information hiding: the list interface is easy to use effectively and clients do not have to reason about pointers to do so.
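The two-string mathematical model of List described in this section can be sketched directly in Python, with ordinary lists standing in for mathematical strings. The class name ListModel and the method names are assumptions; the behavior follows the informal descriptions of Insert, Remove, Advance, and Reset given above.

```python
class ListModel:
    """Abstract value of a List: a left string and a right string."""

    def __init__(self):
        self.left, self.right = [], []   # the cursor sits between them

    def insert(self, entry):
        """A new item goes at the beginning of the right string."""
        self.right.insert(0, entry)

    def remove(self):
        """Remove and return the item at the beginning of the right string."""
        if not self.right:
            raise ValueError("requires |S.Right| > 0")
        return self.right.pop(0)

    def advance(self):
        """Move one item from the front of the right string to the end of the left."""
        if not self.right:
            raise ValueError("requires |S.Right| > 0")
        self.left.append(self.right.pop(0))

    def reset(self):
        """Move the whole left string, in order, to the front of the right string."""
        self.right = self.left + self.right
        self.left = []
```

Inserting γ, β, and α in that order and then advancing twice yields the value ⟨α, β⟩⟨γ⟩ used as the example in the text.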

6. Summary

We have presented a formal specification of a component that captures pointer behavior. The specification allows programmers to understand pointers precisely and reason soundly and systematically about pointer-based code and components. It is sufficiently general to be used for object identity. It allows efficient data movement and is suitable for the implementation of acyclic and cyclic data structures. The component is based on strict value semantics. It provides a simple way to encapsulate aliasing, and it supports modular reasoning. The design has been influenced by performance considerations. Extensions to the basic specification give programmers the flexibility to choose between manual memory management and automatic garbage collection based on their performance concerns.

References

[Ale01] Andrei Alexandrescu. Modern C++ Design: Generic Programming and Design Patterns Applied. Addison-Wesley, 2001.
[Alm97] Paulo Sergio Almeida. Balloon types: Controlling sharing of state in data types. In Proceedings ECOOP ’97, number 1241 in Lecture Notes in Computer Science, pages 32–59, New York, 1997. Springer-Verlag.
[Bak95] Henry G. Baker. ‘Use-once’ variables and linear objects -- storage management, reflection and multi-threading. ACM SIGPLAN Notices, 30(1):45–52, January 1995.
[BCR03] David F. Bacon, Perry Cheng, and V. T. Rajan. A real-time garbage collector with low overhead and consistent utilization. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 285–298. ACM Press, 2003.
[BG00] Greg Bollella and James Gosling. The real-time specification for Java. Computer, 33(6):47–54, 2000.
[Cop91] James O. Coplien. Advanced C++ Programming Styles and Idioms. Addison-Wesley, 1991.
[CPN98] David G. Clarke, John M. Potter, and James Noble. Ownership types for flexible alias protection. In OOPSLA ’98 Conference Proceedings, pages 48–64. ACM Press, 1998.
[GM96] James Gosling and Henry McGilton. The Java language environment: A white paper. Technical report, Sun Microsystems, Inc., 1996. http://java.sun.com/docs/white/langenv/.
[Heh98] Eric C. R. Hehner. Formalization of time and space. Formal Aspects of Computing, 10(3):290–306, 1998.
[Hen98] R. Henriksson. Scheduling Garbage Collection in Embedded Systems. PhD thesis, Lund Institute of Technology, July 1998.
[HLW+92] John Hogg, Doug Lea, Alan Wills, Dennis deChampeaux, and Richard Holt. The Geneva Convention on the treatment of object aliasing. OOPS Messenger, 3(2):11–16, 1992.
[Hoa89a] C. A. R. Hoare. Hints on programming language design. In C. A. R. Hoare and C. B. Jones, editors, Essays in Computing Science. Prentice Hall, New York, 1989.
[Hoa89b] C. A. R. Hoare. Recursive data structures. In C. A. R. Hoare and C. B. Jones, editors, Essays in Computing Science. Prentice Hall, New York, 1989.
[Hog91] John Hogg. Islands: Aliasing protection in object-oriented languages. In Proceedings OOPSLA ’91, volume 26 of ACM SIGPLAN Notices, pages 271–285. ACM, 1991.
[HW91] Doug E. Harms and Bruce W. Weide. Copying and swapping: Influences on the design of reusable software components. IEEE Transactions on Software Engineering, 17(5):424–435, May 1991.
[JL96] Richard Jones and Rafael D. Lins. Garbage Collection: Algorithms for Automatic Dynamic Memory Management. John Wiley & Sons, 1996.
[Jon86] C. B. Jones. Systematic Software Development Using VDM. Prentice Hall International (UK) Ltd., 1986.
[Kie76] Richard B. Kieburtz. Programming without pointer variables. In Proceedings of the SIGPLAN ’76 Conference on Data: Abstraction, Definition, and Structure. ACM Press, 1976.
[KM98] Andrew Koenig and Barbara Moo. Teaching standard C++. JOOP, 11(7):11–17, 1998.
[KOS02] Joan Krone, William F. Ogden, and Murali Sitaraman. Modular verification of performance correctness. In OOPSLA 2001 SAVCBS Workshop Proceedings, 2002. http://www.cs.iastate.edu/~leavens/SAVCBS/papers-2001/index.html.
[KSO+02] Gregory W. Kulczycki, Murali Sitaraman, William F. Ogden, Bruce W. Weide, and Gary T. Leavens. Reasoning about procedure calls with repeated arguments and the reference-value distinction. Technical Report 02-13, Department of Computer Science, Iowa State University, Ames, Iowa, 50011, December 2002.
[MDS01] David R. Musser, Gillmer J. Derge, and Atul Saini. STL Tutorial and Reference Guide. Addison-Wesley, Boston, 2nd edition, 2001.
[Mey95] Scott Meyers. More Effective C++. Addison-Wesley, 1995.
[Mey97] Bertrand Meyer. Object-Oriented Software Construction. Prentice Hall PTR, Upper Saddle River, New Jersey, 2nd edition, 1997.
[Mey99] Bertrand Meyer. On to components. IEEE Computer, January 1999.
[Mül02] Peter Müller. Modular specification and verification of object-oriented programs. Lecture Notes in Computer Science, 2262, 2002.
[NVP98] James Noble, Jan Vitek, and John Potter. Flexible alias protection. Lecture Notes in Computer Science, 1445:158–185, 1998.
[PWH00] Scott M. Pike, Bruce W. Weide, and Joseph E. Hollingsworth. Checkmate: cornering C++ dynamic memory errors with checked pointers. In Proceedings of the 31st SIGCSE Technical Symposium on Computer Science Education. ACM Press, March 2000.
[SAK+00] Murali Sitaraman, Steven Atkinson, Gregory Kulczycki, Bruce W. Weide, Timothy J. Long, Paolo Bucci, Wayne Heym, Scott Pike, and Joseph E. Hollingsworth. Reasoning about software-component behavior. In Procs. Sixth Int. Conf. on Software Reuse, pages 266–283. Springer-Verlag, 2000.
[Sha01] Alan C. Shaw. Real Time Systems and Software. John Wiley & Sons, 2001.
[Sit96] Murali Sitaraman. Impact of performance considerations on formal specification design. Formal Aspects of Computing, 8(6):716–736, 1996.
[Smi95] Graeme Smith. Reasoning about Object-Z specifications. In Proceedings of Asia-Pacific Software Engineering Conference, pages 489–497. IEEE Computer Society Press, December 1995.
[Spi89] J. M. Spivey. The Z Notation. Prentice Hall, New York, 1989.
[SS95] Al Stevens and Alexander Stepanov. C programming. Dr. Dobb’s Journal, March 1995.
[ST02] Bjarne Stroustrup and Pierre Tran. Interview of Bjarne Stroustrup. Developpeur Reference, March 2002. http://www.research.att.com/~bs/nantes-interview-english.html.
[SW94] Murali Sitaraman and Bruce W. Weide. Component-based software using RESOLVE. ACM Software Engineering Notes, 19(4):21–67, 1994.
[Szy98] Clemens Szyperski. Component Software: Beyond Object-Oriented Programming. ACM Press/Addison-Wesley Publishing Co., 1998.
[VG02] Frank Vahid and Tony Givargis. Embedded System Design: A Unified Hardware/Software Introduction. John Wiley & Sons, 2002.
[WH01] Bruce W. Weide and Wayne D. Heym. Specification and verification with references. In Proceedings OOPSLA Workshop on Specification and Verification of Component-Based Systems. ACM, October 2001.
[Win90] Jeannette M. Wing. A specifier’s introduction to formal methods. IEEE Computer, 23(9):8–24, 1990.

Appendix A

    Concept List Template(type Entry);
        Defines List Unit Displacement: N+;

        Type Family List is modeled by Cart Prod
                Left: Str(Entry);
                Right: Str(Entry);
            end;
            exemplar S;
            initialization ensures |S.Left| = 0 and |S.Right| = 0;

        Operation Insert(alters E: Entry; updates S: List);
            requires List Unit Displacement ≤ Remaining Memory Capacity;
            ensures S.Left = #S.Left and S.Right = ⟨#E⟩ ◦ #S.Right;

        Operation Remove(replaces R: Entry; updates S: List);
            requires |S.Right| > 0;
            ensures S.Left = #S.Left and #S.Right = ⟨R⟩ ◦ S.Right;

        Operation Advance(updates S: List);
            requires |S.Right| > 0;
            ensures S.Left ◦ S.Right = #S.Left ◦ #S.Right and
                |S.Left| = |#S.Left| + 1;

        Operation Advance To End(updates S: List);
            ensures |S.Right| = 0 and S.Left = #S.Left ◦ #S.Right;

        Operation Reset(updates S: List);
            ensures |S.Left| = 0 and S.Right = #S.Left ◦ #S.Right;

        Operation Swap Rights(updates S1, S2: List);
            ensures S1.Left = #S1.Left and S2.Left = #S2.Left and
                S1.Right = #S2.Right and S2.Right = #S1.Right;

        Operation Left Length(restores S: List): Integer;
            ensures Left Length = ( |S.Left| );

        Operation Right Length(restores S: List): Integer;
            ensures Right Length = ( |S.Right| );

        Operation Unit Size(): Integer;
            ensures Unit Size = ( List Unit Displacement );
    end List Template;

Appendix B

    Realization Location Linking Realiz for List Template;
        uses Location Linking Template;

        Facility Std Pointer Fac is Location Linking Template(Entry, 1)
            realized by Std Realiz;

        Definition List Unit Displacement = Taken Location Displacement;
        Definition Variable Next(p: Location): Location = Target(p, 1);

        Type List = Record
                head, pre, last: Position;
                left length, right length: Integer;
            end;
            conventions
                S.left length ≥ 0 and S.right length ≥ 0 and
                S.pre = { S.head                          if S.left length = 0
                        { Next^(S.left length)(S.head)    otherwise
                and
                S.last = { S.pre                          if S.right length = 0
                         { Next^(S.right length)(S.pre)   otherwise
                and
                Next(S.last) = Void and
                ∀q: Location, ( Is Accessible(q) iff Is Taken(q) );
            correspondence
                Conc.S.Left = ∏ (k = 1 to S.left length) ⟨Contents(Next^k(S.head))⟩ and
                Conc.S.Right = ∏ (k = 1 to S.right length) ⟨Contents(Next^k(S.pre))⟩;
            initialization
                Take New Location(S.head);
                Relocate(S.pre, S.head);
                Relocate(S.last, S.head);
            end;

        Procedure Insert(alters E: Entry; updates S: List);
            Var post, new: Position;
            Relocate(post, S.pre);
            Follow Link(post, 1);
            Take New Location(new);
            Swap Contents(new, E);
            Redirect Link(S.pre, 1, new);
            Redirect Link(new, 1, post);
            If S.right length = 0 then Follow Link(S.last, 1); end;
            S.right length := S.right length + 1;
        end Insert;

        Procedure Remove(replaces R: Entry; updates S: List);
            Var p: Position;
            Relocate(p, S.pre);
            Follow Link(p, 1);
            Follow Link(p, 1);
            Swap Locations(S.pre, 1, p);
            Swap Contents(p, R);
            Abandon Location(p);
            If S.right length = 1 then Relocate(S.last, S.pre); end;
            S.right length := S.right length - 1;
        end Remove;

        Procedure Advance(updates S: List);
            Follow Link(S.pre, 1);
            S.left length := S.left length + 1;
            S.right length := S.right length - 1;
        end Advance;

        Procedure Advance To End(updates S: List);
            Relocate(S.pre, S.last);
            S.left length := S.left length + S.right length;
            S.right length := 0;
        end Advance To End;

        Procedure Reset(updates S: List);
            Relocate(S.pre, S.head);
            S.right length := S.right length + S.left length;
            S.left length := 0;
        end Reset;

        Procedure Swap Rights(updates S1, S2: List);
            Var post: Position;
            Relocate(post, S1.pre);
            Follow Link(post, 1);
            Swap Locations(S2.pre, 1, post);
            Swap Locations(S1.pre, 1, post);
            S1.last :=: S2.last;
            S1.right length :=: S2.right length;
            (* By the conventions, last is colocated with pre when the right string is empty. *)
            If S1.right length = 0 then Relocate(S1.last, S1.pre); end;
            If S2.right length = 0 then Relocate(S2.last, S2.pre); end;
        end Swap Rights;

        Procedure Left Length(restores S: List): Integer;
            Left Length := S.left length;
        end;

        Procedure Right Length(restores S: List): Integer;
            Right Length := S.right length;
        end;

        Procedure Unit Size(): Integer;
            Unit Size := Location Size();
        end;
    end Location Linking Realiz;
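For readers more familiar with conventional references, the representation used in this realization can be mirrored in Python. This is only an analogue: ordinary object references replace Position variables, None plays the role of Void, Python's own memory management replaces Abandon Location, and the names Node, LinkedList, conventions_hold, and abstract_value are assumptions of the sketch.

```python
class Node:
    def __init__(self, contents=None):
        self.contents = contents
        self.next = None                 # None plays the role of Void


class LinkedList:
    def __init__(self):
        self.head = Node()               # sentinel location, holds no item
        self.pre = self.last = self.head
        self.left_length = self.right_length = 0

    def insert(self, entry):
        new = Node(entry)
        new.next = self.pre.next
        self.pre.next = new
        if self.right_length == 0:
            self.last = self.last.next   # last was colocated with pre
        self.right_length += 1

    def remove(self):
        if self.right_length == 0:
            raise ValueError("requires |S.Right| > 0")
        old = self.pre.next
        self.pre.next = old.next
        if self.right_length == 1:
            self.last = self.pre
        self.right_length -= 1
        return old.contents

    def advance(self):
        if self.right_length == 0:
            raise ValueError("requires |S.Right| > 0")
        self.pre = self.pre.next
        self.left_length += 1
        self.right_length -= 1

    def reset(self):
        self.pre = self.head
        self.right_length += self.left_length
        self.left_length = 0

    def conventions_hold(self):
        """Check the relationships among head, pre, last and the two lengths."""
        node = self.head
        for _ in range(self.left_length):
            node = node.next
        if node is not self.pre:
            return False
        for _ in range(self.right_length):
            node = node.next
        return node is self.last and self.last.next is None

    def abstract_value(self):
        """The correspondence: recover the (Left, Right) strings from the links."""
        items, node = [], self.head.next
        while node is not None:
            items.append(node.contents)
            node = node.next
        return items[:self.left_length], items[self.left_length:]
```

Calling conventions_hold after each operation is a cheap way to test the representation invariant, and abstract_value lets a test compare the linked structure against the two-string model of Appendix A.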

Appendix C

The realization below makes use of special syntax provided for the Location Linking Template. Workers in the system will be of type Node. A Node is a Position for a system in which locations contain information of type Entry and a single link, labeled next. An up-arrow (↑) before a worker indicates that he is taking a new location, and a down-arrow (↓) in front of a worker indicates that he is abandoning his location. A hat (ˆ) between a location and a link name denotes the link at that location, and an arrow (→) between locations denotes movement of a worker or link (at the foot of the arrow) to a new location (at the head of the arrow). So x → y is syntactic sugar for Relocate(x, y), x → xˆnext is syntactic sugar for Follow Link(x, 1), and xˆnext → yˆnext is syntactic sugar for a secondary operation that redirects the first link at x’s location to the location pointed to by the first link of y’s location. The double arrow (↔) indicates simultaneous movement in both directions. The statement x *:=: C is syntactic sugar for Swap Contents(x, C).

    Realization Location Linking Based for List Template;
        uses Location Linking Template;

        Type Node = ˆEntry(next);    (* Facility implicitly declared here. *)

        (* Definitions are the same as above. *)

        Type List = Record
                head, pre, last: Node;
                left length, right length: Integer;
            end;
            (* Conventions and correspondence are the same as above. *)
            initialization
                ↑S.head;
                S.pre → S.head;
                S.last → S.head;
            end;

        Procedure Insert(alters E: Entry; updates S: List);
            Var new: Node;
            ↑new;
            new *:=: E;
            newˆnext → S.preˆnext;
            S.preˆnext → new;
            If S.right length = 0 then S.last → S.lastˆnext; end;
            S.right length := S.right length + 1;
        end Insert;

        Procedure Remove(replaces R: Entry; updates S: List);
            Var old: Node;
            old → S.preˆnext;
            S.preˆnext → S.preˆnextˆnext;
            old *:=: R;
            ↓old;
            If S.right length = 1 then S.last → S.pre; end;
            S.right length := S.right length - 1;
        end Remove;

        Procedure Advance(updates S: List);
            S.pre → S.preˆnext;
            S.left length := S.left length + 1;
            S.right length := S.right length - 1;
        end Advance;

        Procedure Advance To End(updates S: List);
            S.pre → S.last;
            S.left length := S.left length + S.right length;
            S.right length := 0;
        end Advance To End;

        Procedure Reset(updates S: List);
            S.pre → S.head;
            S.right length := S.right length + S.left length;
            S.left length := 0;
        end Reset;

        Procedure Swap Rights(updates S1, S2: List);
            If S.right length ≠ 0 then
                S1.preˆnext ↔ S2.preˆnext;
                S1.last ↔ S2.last;
            end;
        end Swap Rights;

        (* Remaining procedures are the same as above. *)
    end Location Linking Based;