An Abstract Machine for the Old Value Retrieval*

Piotr Kosiuczenko
Institute of Information Systems, WAT, Warsaw, Poland

Abstract. The evaluation of post-conditions requires the computation of old attribute values. Until recently, existing computation methods were not efficient in terms of time and space complexity. Moreover, they were applicable only to a restricted form of post-conditions. Recently a new algorithm was proposed to overcome those deficiencies. In this paper, an abstract machine corresponding to this algorithm is defined. Its transitions simulate steps of object-oriented systems and preserve an invariant implying the properties needed to compute old attribute values. The machine is based on a kind of structure called here sufficiently persistent, as opposed to persistent and partially persistent structures. A space bound on the structure size is given. It is also demonstrated that methods which do not have post-conditions can be abstracted away.

Keywords: persistent data structures, old value retrieval, @pre, old.

1 Introduction

Contracts are used to specify object-oriented systems from the user's point of view [10]. They consist of three basic constraint types: invariants, pre-conditions and post-conditions. System consistency is ensured by invariants. A pre-condition specifies the states in which a method can be called. Post-conditions specify system states after a method execution. Their validation is not straightforward, since it is necessary to compare attribute values in the method's pre- and post-states, and method calls can be nested. Old attribute values are accessed with the operator @pre in the Object Constraint Language (OCL, see [12]) and with old in Eiffel [11], the Java Modeling Language (JML, see [4]) and Spec# [1].

Copying the whole pre-state before a method execution is out of the question due to its time and memory cost. This problem can be avoided by saving, before the execution, the values of those attributes whose @pre/old values are referred to in the corresponding post-condition. This is similar to the way old variable values are treated in Hoare logic [6], which uses fresh variables to store values from before a method execution. This approach is followed in tools supporting other contractual languages such as OCL, JML and Spec# (see [9, 13] for an overview of OCL tools). The current implementations of the @pre operator are discussed in [8, 9, 3].

* This is a corrected version of the paper which appeared in: Bolduc, C. et al. (eds.): Mathematics of Program Construction (MPC 2010), LNCS, Vol. 6120, Springer, 2010. This research has been partially supported by grant No PBZ-MNiSW-DBO02/I/2007/.

This approach requires restricting the syntax of post-conditions to formulas of the form $t_0[t_1\mathtt{@pre}/x_1, \ldots, t_n\mathtt{@pre}/x_n]$ (we use the OCL notation here), where term $t_0$ does not include @pre and term $t_i\mathtt{@pre}$ is obtained from term $t_i$ by replacing every attribute a by a@pre, for $i = 1, \ldots, n$ (for example, term (self.a.b)@pre is an abbreviation of the OCL term self.a@pre.b@pre). Here $[t_1\mathtt{@pre}/x_1, \ldots, t_n\mathtt{@pre}/x_n]$ denotes the simultaneous substitution of terms $t_i\mathtt{@pre}$ for variables $x_i$, for $i = 1, \ldots, n$. The values of terms $t_i$ are computed before the underlying method is executed, saved, and then used after the method execution to compute the value of the post-condition.

For example, the constraint self.a.b@pre = self.a@pre.b@pre + 1 is not of the above form. In this case, it is not possible to compute in the pre-state the values of attribute b for the objects related by a with self in the post-state. To do this, one would have to know in advance which objects will be related by a in the post-state.

There are other problems with this approach. If terms $t_i$ are of collection types, then the actual collection values must be cloned. Such clones are computationally expensive and pose logical problems, since reference identity cannot be used for object comparison. The computation of all potentially needed values in the pre-state may even fail to terminate. For example, consider the expression if 1 + 1 = 2 then 1 else q@pre endif, where q is a computationally complex, or non-terminating, integer-valued query. Clearly, there is no need to evaluate q in the pre-state to compute the value of the whole expression.

In [9] an algorithm was proposed which overcomes the above-mentioned problems. It allows @pre-values to be accessed during post-condition evaluation as needed. Consequently, it is applicable to all forms of post-conditions.
Its execution does not increase the complexity class of post-condition evaluation; the evaluation behaves as if the computation of old values took constant time. Since the values of @pre-terms are not recorded in advance, there is no need either to clone system states or to restrict the post-condition syntax. The algorithm is implemented in AspectJ. It superimposes the so-called fat structures [5] on object-oriented systems. Those structures make arbitrary linked structures partially persistent, i.e. when a sequence of updates is performed, all versions of a linked structure can be accessed and the newest version can be modified. The method proposed in [5] applies to a sequence of operations. Basically, every modification of an attribute is accompanied by storing the modified value and the corresponding modification number in a data structure. The notion of persistence requires that all previous versions of a structure can be accessed and modified [5]. However, method calls can be recursive and form a tree instead of a sequence. There is no obvious relation between consecutive versions of modified attributes and the call-stack. Thus we need a different way of handling attribute modifications. Moreover, fat structures hinder garbage collection. Therefore the stored information should be minimized.

In [2], the notion of semi-persistence was coined for backtracking algorithms. The authors observed that when backtracking from a branch it suffices to use the old version of a structure without undoing changes. The ancestors of the current version are reused, but never another version obtained from a common ancestor. They call a data structure semi-persistent if only ancestors of the newest version can be updated. This notion differs from partial persistence, since it allows the update of ancestor structures. However, the authors considered neither ways of ensuring semi-persistence nor the way old attribute values can be handled and memory use minimized.

In this paper we define a labelled transition system, or as we sometimes say an abstract machine, which simulates steps of object-oriented programs and manages attribute histories. Method execution is simulated by transitions such as method call, setting an object attribute and method return. For every attribute we define a history-stack whose elements are pairs consisting of an old attribute value and its time-stamp, which is a call number. Basically, if an attribute is set for the first time during a method execution and the stack does not include a timely snapshot, its previous value is pushed on the stack with a time-stamp equal to the number of the currently executing method. We also define clean operations which remove outdated snapshots from history-stacks. We formulate an invariant which guarantees that @pre-attribute values are recoverable from the current values and the corresponding histories. We show that it is preserved by the transitions.

Efficient space management is crucial here, since the reclamation of objects during garbage collection is restrained by the old attribute values saved in the attribute histories. Partial persistence is more than we need and as such causes unnecessary space overhead. In that case, the history of an attribute has a size equal to the number of its modifications.
The access time to attribute values from before the current method execution is logarithmic if binary search trees are used to store attribute histories [5]. We call a structure sufficiently persistent if all its versions directly preceding non-terminated method calls can be accessed, but only the most recent version can be updated. Sufficiently persistent structures are related to semi-persistent structures as partially persistent structures are to persistent ones. The abstract machine defined here corresponds to a general form of sufficiently persistent structures. It should be stressed that our method applies to all forms of linked structures, not only object-oriented ones.

We show that in the case of the Towers of Hanoi algorithm, our method requires space of size $O(n^2)$, where $n$ is the number of rings, whereas partially persistent structures require space of exponential size. In our case, at any point of execution the lengths of attribute histories are linearly bounded by the maximal size of the call-stack reached up to that point. We show that the cumulative access time to an attribute value from before a method execution is constant. This is due to the cleaning of outdated values and to the fact that in our approach we register only the attribute values prior to a method execution, not every version. If recursion is replaced by iteration, then our method results in leaner structures.

Some methods do not have post-conditions containing attributes of the form a@pre; we call them irrelevant. We prove that it is possible to abstract from calls of such methods. We do it by defining a proper bisimulation relation. This corresponds in a sense to the refactoring pattern called inline method [7], which results in less structured code. Thus the length of an attribute history can be linearly bounded by the number of relevant methods on the call-stack.

This paper is organized as follows. In Section 2, we define a labelled transition system which manages old attribute values. In Section 3, we discuss its properties and define an invariant preserved by its transitions. In Section 4, we prove that when computing old attribute values it is possible to abstract from methods without post-conditions. In Section 5, we relate the abstract machine to the AspectJ implementation proposed in [9] and consider the time and space complexity of the proposed method. Section 6 concludes this paper.

2 The Abstract Machine

In this section we define a labelled transition system, or as we sometimes say an abstract machine. Transitions of this system define steps of an object-oriented system and the way attribute values are archived. Executions of object-oriented programs, for example written in Java, correspond to a restricted subset of all runs of this machine. This machine does not take into consideration restrictions due to the use of method parameters. In any execution state it is possible to modify any object. For simplicity we do not deal with object creation explicitly. We assume that the initialization of an object consists of setting its attributes, which are all initially equal to $\bot$. Similarly, we do not model garbage collection here. We discuss the issue of efficient space use in subsection 5.2.

2.1 States

In this subsection we define states of the abstract machine. We assume that there exists an infinite set of object locations/addresses $OL$ and that the undefined symbol $\bot$ does not belong to $OL$. $A = \{a_1, \ldots, a_n\}$ is a finite set of attributes. $Op$ is the set of methods/operations. We assume that main belongs to $Op$. $\mathbb{N}$ is the set of natural numbers. Below, for an arbitrary set $B$, $B_\bot$ denotes the set $B \cup \{\bot\}$.

We model a heap state (called here object store or simply store) by a function mapping pairs consisting of an attribute and an object location to an object location, i.e. $St =_{def} \{st : A \times OL \to OL_\bot\}$. In our model, all objects can potentially have all attributes. Classes can be modelled by infinite subsets of $OL$ with a restriction on attributes. Thus if, say, objects of class B do not possess attribute $a$, then for every object $o$ of class B and every store $st$, $st(a, o) = \bot$ must hold. Likewise, we do not deal directly with inheritance. However, we can restrict method calls in a similar way. Below we abstract from method parameters, making the machine runs even more general; the considered methods can potentially modify any object. Clearly, we get a number of machine runs which do not correspond to an execution of a method implemented in a language such as Java.

We call a pair consisting of an attribute value and the corresponding time-stamp an 'attribute snapshot'. An attribute history for an object is a sequence of snapshots: $H =_{def} (OL \times \mathbb{N})^*$. Attribute histories are modelled by history functions. A history function for an attribute and a location is either undefined or equal to a sequence of snapshots. $AH =_{def} \{h : A \times OL \to H_\bot\}$ is the set of such functions.

$SH =_{def} (St \times Op \times \mathbb{N})^+$ is the set of store histories, or as we sometimes say heap histories. Such a history is a sequence of triples consisting of a store, the name of a currently executing method and a time-stamp corresponding to a method call. Method calls are numbered starting with one. For every subsequent call, the counter is increased by one.

Computation states are the states of the abstract machine. They consist of a non-empty store history, an attribute history function and a value corresponding to the number of executed method calls: $CS =_{def} SH \times AH \times \mathbb{N}$.

The initial state models the situation when method main starts to execute. The store is constantly equal to $\bot$, since no attributes are set. All history values are undefined, since there is nothing to be archived when main is called. $st_\bot$ and $h_\bot$ are respectively the store and the history function constantly equal to $\bot$. The call number is 0, since no method different from main has started to execute, i.e. $inSt =_{def} ((st_\bot, main, 0), h_\bot, 0)$.

2.2 Transitions

In this subsection we define a transition relation $R$ corresponding to computation following the post-condition specification style (cf. e.g. [12]). There are five kinds of transition steps. The first one corresponds to a method call. The second one concerns setting an attribute. This transition records changed attribute values in the attribute history-stacks. The third one concerns a method return. The last two concern cleaning an attribute history. In the first case, only the top of a history-stack is cleaned. In the second one, all outdated snapshots are removed.

We denote by $f[x \mapsto v]$ a function that maps $x$ to $v$ and differs from $f$ only for argument $x$; i.e. $f[x \mapsto v](x) = v$ and $f[x \mapsto v](y) = f(y)$, for $y \neq x$. Let computation states $cs$, $cs'$ be of the form $(sh, h, n)$ and $(sh', h', n')$, respectively. Below we assume that store history $sh$ has the form $(st_0, main, 0) \ldots (st_k, op_k, n_k)$. We assume that $sh_0$ is its initial subsequence up to $k - 1$, i.e. $sh = sh_0\,(st_k, op_k, n_k)$. If $h(a, o) \neq \bot$, then we assume that $h(a, o)$ is a sequence of the form $ah_0\,(o_l, r_l)$, where $ah_0 = (o_0, r_0) \ldots (o_{l-1}, r_{l-1})$. Note that $op_k$ is the method executing when the transitions below are started and $n_k$ is its call number. Below, $\varepsilon$ denotes the empty sequence. Let $a \in A$ and $o, o' \in OL$.

1. $cs \; R_{call(op)} \; cs' \Leftrightarrow n' = n + 1 \land sh' = sh\,(st_k, op, n + 1) \land h' = h$

2. $cs \; R_{set(a,o,o')} \; cs' \Leftrightarrow \exists\, st' \in St:\; n' = n \land sh' = sh_0\,(st', op_k, n_k)$, where $st' = st_k[(a, o) \mapsto o']$. Moreover,
   (i) $h(a, o) = \bot \Rightarrow h' = h[(a, o) \mapsto (o', n_k)]$
   (ii) $h(a, o) = \varepsilon \lor (r_l < n_k \land st_k(a, o) \neq o_l) \Rightarrow h' = h[(a, o) \mapsto h(a, o)\,(st_k(a, o), n_k)]$
   (iii) $r_l < n_k \land st_k(a, o) = o_l \Rightarrow h' = h[(a, o) \mapsto ah_0\,(o_l, n_k)]$
   (iv) In other cases $h' = h$

3. $cs \; R_{return(op_k)} \; cs' \Leftrightarrow sh' = (st_0, main, 0) \ldots (st_{k-2}, op_{k-2}, n_{k-2})\,(st_k, op_{k-1}, n_{k-1})$
   (i) $|sh_0| > 0 \Rightarrow h' = h \land n' = n$
   (ii) $|sh_0| = 0 \Rightarrow \forall\, a \in A,\ o \in OL:\; h'(a, o) = \mathbf{if}\ h(a, o) \neq \bot\ \mathbf{then}\ \varepsilon\ \mathbf{else}\ \bot\ \mathbf{endif} \;\land\; n' = 0$

4. $cs \; R_{cleanTop(a,o)} \; cs' \Leftrightarrow 1 < |h(a, o)| \land n_k \le r_{l-1} \land sh' = sh \land n' = n \land h' = h[(a, o) \mapsto ah_0]$

5. $cs \; R_{cleanWhole(a,o)} \; cs' \Leftrightarrow 1 < |h(a, o)| \land sh' = sh \land n' = n \land h' = h[(a, o) \mapsto ah]$, where $ah = (o_{s_0}, r_{s_0})(o_{s_1}, r_{s_1}) \ldots (o_{s_p}, r_{s_p})$ is a subsequence of $h(a, o)$ such that $(\forall\, 0 \le i \le k,\ 0 \le j \le l:\; n_i \le r_j \Rightarrow \exists\, 0 \le d \le p:\; n_i \le r_{s_d} \le r_j) \;\land$ $(\forall\, 0 \le d$
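The call, set, return and cleanTop transitions can be illustrated with a small plain-Java sketch. It is a minimal illustration under stated assumptions, not the AspectJ implementation of [9]: the class `HistoryMachine`, its method names and the string keys for (attribute, object) pairs are all hypothetical, `cleanWhole` is left out, `atPre` follows the retrieval rule described in subsection 5.1, and emptying the histories once control is back in main is our reading of rule (3ii) together with the remarks in subsection 5.2.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Illustrative sketch only; class and method names are assumptions, not the paper's code. */
final class HistoryMachine {
    /** Attribute snapshot: an archived value and the call number used as its time-stamp. */
    private static final class Snapshot {
        final Object value;
        final int stamp;
        Snapshot(Object value, int stamp) { this.value = value; this.stamp = stamp; }
    }

    private final Map<String, Object> store = new HashMap<>();            // current store st_k
    private final Map<String, List<Snapshot>> history = new HashMap<>();  // h; absent key = undefined
    private final Deque<Integer> callStack = new ArrayDeque<>();          // call numbers n_0 ... n_k
    private int counter = 0;                                              // n

    HistoryMachine() { callStack.push(0); }                               // main has call number 0

    private static String key(String attr, int obj) { return attr + "#" + obj; }

    /** Transition (1): a call receives the next call number. */
    void call() { callStack.push(++counter); }

    /** Transition (2): archive the old value if needed, then update the store. */
    void set(String attr, int obj, Object value) {
        String k = key(attr, obj);
        int now = callStack.peek();
        Object old = store.get(k);
        List<Snapshot> h = history.get(k);
        if (h == null) {                                        // (2i) undefined history
            h = new ArrayList<>();
            h.add(new Snapshot(value, now));
            history.put(k, h);
        } else if (h.isEmpty()) {                               // (2ii) empty history
            h.add(new Snapshot(old, now));
        } else {
            Snapshot top = h.get(h.size() - 1);
            if (top.stamp < now && !Objects.equals(top.value, old)) {
                h.add(new Snapshot(old, now));                  // (2ii) push the old value
            } else if (top.stamp < now) {
                h.set(h.size() - 1, new Snapshot(old, now));    // (2iii) refresh the time-stamp
            }                                                   // (2iv) otherwise nothing to do
        }
        store.put(k, value);
    }

    /** Transition (3); histories are emptied once control is back in main (our reading). */
    void ret() {
        callStack.pop();
        if (callStack.size() == 1) {
            for (List<Snapshot> h : history.values()) h.clear();
            counter = 0;
        }
    }

    /** Transition (4): drop the top snapshot if the one below already covers the current call. */
    boolean cleanTop(String attr, int obj) {
        List<Snapshot> h = history.get(key(attr, obj));
        if (h == null || h.size() < 2 || callStack.peek() > h.get(h.size() - 2).stamp) return false;
        h.remove(h.size() - 1);
        return true;
    }

    /** Old value for the current call, following the retrieval rule of subsection 5.1. */
    Object atPre(String attr, int obj) {
        while (cleanTop(attr, obj)) { /* remove outdated snapshots from the top */ }
        List<Snapshot> h = history.get(key(attr, obj));
        if (h != null && !h.isEmpty() && h.get(h.size() - 1).stamp >= callStack.peek()) {
            return h.get(h.size() - 1).value;
        }
        return store.get(key(attr, obj));   // attribute not modified during the current call
    }

    int historyLength(String attr, int obj) {
        List<Snapshot> h = history.get(key(attr, obj));
        return h == null ? 0 : h.size();
    }
}
```

In this sketch a nested call that modifies an attribute pushes one snapshot, and the caller's later `atPre` query first removes that snapshot with `cleanTop` once the snapshot below it already covers the caller.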

    st.top().meterReading) {
                if (st.top().value != cur)  // corresponds to transition (2ii)
                    st.push(new SnapshotVal(cur, Meter.getReading()));
                else
                    st.top().meterReading = Meter.getReading();  // corresponds to transition (2iii)
            }
        }
    }

If a class C contains an attribute b of type T requiring archiving, then we introduce aspect ArchiveC, which superimposes on C attribute bHIST of type Stack and method getBATpre(). The method is implemented using getValueATpre() and getLastUpdateTime(), which returns the last update time. Every manipulation of b is detected by pointcut modB. If the current meter-reading is larger than 0, meaning that there is a relevant method on the stack, then the archiving is performed by doArchiving. If no method with a post-condition is executed, then there is no need for archiving the pre-state.

    public aspect ArchiveC {
        public Stack Anchor.bHIST = new Stack();

        Element C.getBATpre() {
            return C.getValueATpre(bHIST, b);
        }

        Integer C.getBLastUpdateTime() {
            return C.getLastUpdateTime(bHIST);
        }

        pointcut modB(C target) : target(target) && set(T C.b);

        // Archive the current value of b before every assignment,
        // but only while a relevant method is on the call-stack.
        before(C target) : modB(target) {
            if (Meter.getReading() > 0) {
                Archive.doArchiving(target.bHIST, target.b);
            }
        }
    }

Lemma 2 implies that method getValueATpre(Stack st, T val) is correctly implemented. More precisely, if the if-part of the lemma is satisfied for $n_{i+1}$ being the number of the currently executing method call, then the value stored in the topmost snapshot is returned. This implies that if the top of history-stack st holds a snapshot with a time-stamp larger than or equal to the current one, and the previous snapshot, if there is any, has a time-stamp smaller than the current one, then the value stored in the top snapshot is the @pre-value. Otherwise, val is returned. The optimization step allows us to archive attributes only for relevant method calls.

5.2 Complexity

In this subsection we discuss time and memory use. We show that the proposed algorithm does not increase the time-complexity class of constrained methods and that access to @pre-values during post-condition evaluation can be treated as if it took constant time. We also compare the space requirements of our approach with those of partially persistent structures.

Our method does not increase the time-complexity class of instrumented methods, since setting an attribute is accompanied by at most one snapshot archiving, which requires a bounded number of steps. A call of a constrained method results in increasing the call-stack and the call counter. Access to an @pre-value may require the removal of some outdated snapshots using method cleanTop. Each removal requires a bounded number of steps, and there are at most as many snapshots to remove as there were executions of set in the past. Thus the time for the removal of an outdated snapshot can be accounted for when treating the execution of set. Similarly, we can account for the history removal when control returns to method main.

In the case of partially persistent structures, the size of an attribute history is equal to the number of its modifications. Access to old attribute values is logarithmic with respect to the number of attribute updates if binary search trees are used to store old values [5]. In our case, we store only the attribute value prior to a method execution, at the moment the attribute is modified for the first time. When a post-condition is evaluated, access to an @pre-value may require the removal of outdated snapshots from the top of a history-stack (see subsection 5.1). However, this can be accounted for when considering operation set. Thus the evaluation can be treated as if the extraction of @pre-values took constant time.

It should be noted that operation cleanWhole can be performed in time linear in the actual stack size $k$ and the history length $l$ (see the previous subsection). Thus, there exists a constant $c_1$ such that the operation requires no more than $c_1 \cdot (k + l)$ steps. If we start cleanWhole only when the length of the history-stack is twice the size of the call-stack, i.e. $l = 2 \cdot k$, then the operation requires at most $c_1 \cdot l \cdot 3/2$ steps. Since afterwards the length of the history is smaller than or equal to $k$, at least $l/2$ snapshots are removed, so every outdated snapshot removal requires on average at most $3 \cdot c_1$ steps. Similarly, there is a constant $c_2$ such that the removal of an outdated snapshot from the top requires at most $c_2$ steps.
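The averaging argument for cleanWhole can be written out step by step. This is only a restatement of the arithmetic above; $T_{cleanWhole}$ is a name introduced here for the number of steps of one cleanWhole run started at $l = 2 \cdot k$.

```latex
\begin{align*}
T_{cleanWhole} &\le c_1 (k + l) = c_1\left(\tfrac{l}{2} + l\right) = \tfrac{3}{2}\, c_1\, l
  && \text{(since } l = 2k\text{)}\\
\text{removed snapshots} &\ge l - k = \tfrac{l}{2}
  && \text{(the cleaned history has length at most } k\text{)}\\
\frac{T_{cleanWhole}}{\text{removed snapshots}} &\le \frac{\tfrac{3}{2}\, c_1\, l}{\tfrac{l}{2}} = 3\, c_1
\end{align*}
```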
We define constant $c$ to be the maximum of $c_1$ and $c_2$. $c$ bounds the number of steps needed for an outdated snapshot removal.

The use of space is crucial, since the reclamation of objects during garbage collection is restrained by history attributes. If an object is stored in a history attribute of another object, then it cannot be deleted before it is removed from the history or the other object is deleted. Thus, the information about the past should be kept as small as possible. During a method execution the corresponding call-stack evolves. For an arbitrary sequence of computation states $inSt, cs_1, \ldots, cs_m$ related by transition relation $R$, let $k_0, k_1, \ldots, k_m$ be the sizes of the corresponding call-stacks (see subsection 2.2) and let $k_{max}$ be the maximal stack size. We now show that for an arbitrary sequence of this form, the cleaning can be performed in a way guaranteeing that the length of an arbitrary history does not exceed $2 \cdot k_{max}$, without increasing the complexity class of the executed method. This can be ensured by starting operation cleanWhole before every execution of $set(a, o, o')$ if the length of the corresponding history is twice the actual stack size, i.e. $|h(a, o)| = 2 \cdot k$. set is the only operation which increases the length of a history. Thus, it is guaranteed that at all times, for every attribute $a$ and object $o$, $|h(a, o)| \le 2 \cdot k_{max}$, where $k_{max}$ is the maximal stack size reached so far. Unfortunately, we cannot prove in this way that the history length is bounded by twice the actual stack size $k$, i.e. $|h(a, o)| \le 2 \cdot k$. However, this bound can be ensured using algorithms that increase the time-complexity class of instrumented methods.

We now consider the Towers of Hanoi problem with $n$ rings. The underlying structure can be implemented using three linked lists storing natural numbers sorted increasingly. The standard solution uses a recursive algorithm which moves $n - 1$ rings from the source location to the intermediate one, then moves the last ring to the target location and finally moves the rings from the intermediate location to the target one. This requires an exponential number of moves. Therefore, the corresponding partially persistent structure ends up with a size of exponential order with respect to $n$. The size of the call-stack is at most $n$. Consequently, we can ensure that the sufficiently persistent structure has a size of order at most $n^2$, since there are $n$ rings to move, every ring is linked to at most one other ring and $n$ is the maximal size of the call-stack. At the end of the algorithm execution, the histories can be emptied (see transition (3)), since there is no need to store information about the past forms of the structure.

Note that recursion should be used sparingly due to its high time and memory overhead. Instead one should use iteration. In the case of iterative algorithms, our method performs much better than partially persistent structures.
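The Towers of Hanoi comparison can be summarized as a rough count. It is a sketch under assumptions not spelled out in the text: $m(n)$ denotes the number of ring moves, each move is assumed to update a constant number of link attributes, and the structure is assumed to have $n + 3$ link attributes (one per ring plus one head per list).

```latex
\begin{gather*}
m(n) = 2\,m(n-1) + 1,\quad m(0) = 0 \;\Longrightarrow\; m(n) = 2^n - 1\\
\text{partially persistent:}\quad \sum_{(a,o)} |h(a,o)| = \Theta(m(n)) = \Theta(2^n)\\
\text{sufficiently persistent:}\quad \sum_{(a,o)} |h(a,o)| \le (n + 3) \cdot 2\,k_{max} \le (n + 3) \cdot 2n = O(n^2)
\end{gather*}
```

The second line uses the fact that a partially persistent history grows by one entry per modification; the third uses the bound $|h(a, o)| \le 2 \cdot k_{max}$ together with $k_{max} \le n$.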

6 Conclusion

In this paper we defined an abstract machine simulating steps of an object-oriented system and managing old attribute values. We defined an invariant and proved that it is preserved by transitions of this machine. Using this invariant we proved that it is possible to compute attribute values from before a method execution using attribute histories. We demonstrated that it is possible to abstract from calls of methods which do not have post-conditions. We related the abstract machine to the AspectJ implementation proposed in an earlier paper. Finally, we showed that this machine requires less space to store old values than partially persistent structures. In the future, we are going to investigate how to apply our technique to partially persistent structures and whether the space bounds presented in this paper can be improved.

References

1. Barnett, M., Leino, K. R. M., Schulte, W., The Spec# Programming System: An Overview, in CASSIS 2004, LNCS, Vol. 3362, Springer, 2004.
2. Conchon, S., Filliâtre, J.-C., Semi-Persistent Data Structures, in Drossopoulou, S. (Ed.): ESOP 2008, LNCS, Vol. 4960, 2008, pp. 322-336.
3. Dzidek, W., Briand, L., Labiche, Y., Lessons Learned from Developing a Dynamic OCL Constraint Enforcement Tool for Java, Best Papers of Satellite Workshops at the Models'05 conference, LNCS, Vol. 3844, Springer, 2006, pp. 9-19.
4. Darvas, A., Müller, P., Reasoning About Method Calls in JML Specifications, Proceedings of the 7th Workshop on Formal Techniques for Java-like Programs (FTfJP05), Glasgow, Scotland, July, 2005.
5. Driscoll, J. R., Sarnak, N., Sleator, D. D., Tarjan, R. E., Making Data Structures Persistent, Journal of Computer and System Sciences, Vol. 38, No. 1, 1989.
6. Floyd, R. W., Assigning Meanings to Programs, in Mathematical Aspects of Computer Science, Vol. 19, Proceedings of Symposium in Applied Mathematics, American Mathematical Society, 1967, pp. 19-32.
7. Fowler, M., Refactoring: Improving the Design of Existing Code, Reading, Mass., Addison-Wesley, 2000.
8. Hussmann, H., Finger, F., Wiebicke, R., Using Previous Property Values in OCL Postconditions: An Implementation Perspective, Int. Workshop 'UML 2.0 - The Future of the UML Constraint Language OCL', 2nd of October, York, UK, 2000.
9. Kosiuczenko, P., On the Implementation of @pre, in Chechik, M., Wirsing, M. (Eds.): Fundamental Approaches to Software Engineering, LNCS, Vol. 5503, Springer, 2009, pp. 246-261.
10. Meyer, B., Applying Design by Contract, Computer, Vol. 25(10), IEEE Computer Society Press, 1992, pp. 40-51.
11. Meyer, B., Eiffel: The Language, Object-Oriented Series, Prentice Hall, New York, 1992.
12. OMG, OCL 2.0 Specification, Version 2005-06-06, Jun 2005.
13. Toval, A., Requena, V., Fernandez, J., Emerging OCL Tools, Journal of Software and System Modelling, Vol. 2(4), Springer, 2003, pp. 248-261.
