Using Compile-Time Reflection for Object Checkpointing 1
Marc-Olivier Killijian, Jean-Charles Fabre, Juan-Carlos Ruiz-Garcia
LAAS-CNRS, 7 Avenue du Colonel Roche, 31077 Toulouse cedex, France
{killijian, fabre, ruiz}@laas.fr
Abstract. This paper tackles the problem of checkpointing object-oriented programs using a reflective approach. The objective of the technique and the corresponding tool is to provide checkpointing methods to application classes. In conventional object-oriented fault-tolerant systems, the implementation of these methods to save and restore the state of objects is often delegated to the application programmer; the dependability of the system thus relies on the programmer's ability to implement these core functions. Our approach enables the automatic provision of these methods for classes that obey some programming restrictions. In a second step, we present an optimization of this technique using runtime reflection: the data checkpointed correspond to the set of attributes that have been modified since the last checkpoint. Finally, some preliminary results of the evaluation of these techniques and an overview of the CORBA system framework in which they are used are described.
1 This work has been partially supported by the European Esprit Project n° 20072, DEVA, by a contract with FRANCE TELECOM (ref. ST.CNET/DTL/ASR/97049/DT) and by a grant from CNRS (National Center for Scientific Research in France) in the framework of international agreements between CNRS and JSPS (Japan Society for the Promotion of Science).

1 Introduction
The definition of application checkpoints is a major issue in the design and implementation of dependable systems, especially for building fault tolerance strategies. Checkpointing the state of individual active objects is needed for many replication strategies and is mandatory for cloning a new replica during the reconfiguration of the system after a failure. The problem of checkpointing a distributed application involves complex algorithms to ensure the consistency of the distributed recovery state of the application. All these algorithms assume that individual checkpoints of the application objects can be obtained easily. However, this is a strong assumption and, in practice, it is not so easy when complex active objects are considered. The available solutions either rely on mechanisms that are highly dependent on the hardware or the operating system, or delegate this task to the application programmer. These solutions have many drawbacks, and recent approaches investigate the use of compiler-assisted checkpointing techniques [9, 15]. The proposed solution can be compared to the latter and shows that compile-time reflection is a promising approach to tackle the problem of obtaining the state of individual objects. This state information is obtained here at a high level of abstraction and handled through the combined use of compile-time and runtime reflection.

The technique proposed here is part of the definition of a metaobject protocol (MOP) for implementing fault tolerance in CORBA applications [8], although its use is not restricted to this topic. Obtaining the state of an object is part of this MOP but is also useful for other aims, e.g. object migration in mobile systems or simply load balancing. In this paper, we investigate this problem for object-oriented programs; the real question is: what is the state of individual object-oriented programs? This state is required to resume execution on a remote system.

After a brief analysis of current approaches and an introduction to our solution (Section 2), we describe a new approach to object state capture using compile-time reflection. Using class definitions and implementations at compile-time, we are able to automatically generate methods, such as SaveState and RestoreState, responsible for the capture and restoration of the state of objects of a given class (Section 3). An optimization of this approach using runtime reflection is then proposed: thanks to runtime information, we are able to checkpoint only the attributes that have been modified since the last checkpoint (Section 4). A comparison of our approach with a conventional non-reflective compile-time solution illustrates both the performance efficiency and the coverage of the state capture (Section 5). Section 6 gives an overview of the system in which these mechanisms are presently used.
2 Problem Statement and Related Work
The available techniques to obtain the state information of an object vary considerably with the abstraction level at which they apply and, of course, depend on the information that is actually needed. The state information of an active object encompasses two kinds of information: the internal state of the object (namely its state variables) and the information in transit (invocation messages). We concentrate here on the internal state; the information in transit is dealt with by end-to-end communication protocols not described in this paper, see [5].

The internal state of an active object is mapped at a very low level to memory objects (segments, regions, etc.) which are handled either by the operating system or, better, by a middleware (the runtime layer for the application objects). Understanding the mapping implies diving into the operating system or the middleware to identify which memory objects hold the internal state of a given application object. This approach often leads to a customization of the software runtime layer in order to obtain such detailed information. This is a first drawback of this solution, since off-the-shelf runtime layers cannot be used in this case. A second drawback is that it provides raw information concerning the state of an object. It is worth noting that such information is not appropriate to install and initialize a newly created object on a different site of the system, for the following reasons:
1. Some data items are not significant on a different site since their value is very site dependent (e.g. pointers to objects, file descriptors, semaphore descriptors, etc.).
2. The semantics of state information variables must thus be interpreted in order to perform an appropriate initialization of such variables on a remote site.

Indeed, some data items are related to internal objects within the middleware of the remote site; a raw copy of the memory objects is thus not consistent on the target site and some additional actions must be performed. These actions must create and initialize the corresponding internal objects within the runtime layer of the remote site. As soon as these actions complete and the new instance is updated with the current state information, the new object copy holds a consistent state from which the computation can continue. This important observation indicates that semantic information regarding such data items is necessary to obtain a consistent checkpoint for the new object copy.

Another approach consists in providing the user with libraries of functions [14] or classes [10, 11, 13] to deal with fault tolerance protocols and state information. In object-oriented terms, the application classes must inherit from some base class in which the two methods SaveState and RestoreState are defined as virtual methods. This second approach is clearly not transparent for the user and relies on the user’s skills to use the library functions or to implement the virtual methods correctly. This means that the implementation of both methods must be error-free. For instance, a wrong implementation of the SaveState function will lead the new copy to hold an inconsistent state from which the execution cannot be restarted. This is a major drawback of such solutions. Moreover, since the implementation can be very difficult for very complex objects, the probability of introducing software faults (bugs) in so doing is certainly very high.
Another side effect of such solutions is that any modification (evolution) of the application object implementation must be consistently reflected in the SaveState and RestoreState implementation. Because different application programmers can perform the long-term evolution of such application objects, a consistent implementation of these methods is not guaranteed during the lifetime of the application. Clearly, in the short or in the long term the whole distributed application will fail.

It is worth noting that the definition and the implementation of such core functions must be tool assisted. Following this idea, some works have investigated the use of customized compilers to generate these functions automatically, e.g. [15]. More recently, the use of compile-time reflection was introduced to tackle this problem [7, 8]. In this type of solution, the identification of the internal state of application objects is very language dependent. This is however the only way to obtain detailed information about the internal state of application objects on off-the-shelf runtime systems. Any other solution would provide a coarse view of this information and make the interpretation of site-dependent objects very uncertain. A language-independent solution would require the runtime system (the middleware, e.g. the ORB) to reify such detailed information, i.e. making the runtime system reflective. Both solutions rely on a reflective approach, the latter involving in any case a customization of the runtime system or the use of an appropriate reflective runtime system (only the Java runtime system provides type information, using the java.lang.reflect package).
It is worth mentioning that pure object-oriented languages are easier to handle with this approach than hybrid languages such as C++. However, the experiments reported in this paper have been performed on C++ objects, and the tool is based on Open C++ [1]. This is why some programming conventions and restrictions have been considered in order to enforce a strong encapsulation principle and avoid programming statements that may lead to uncontrolled side effects on the object state. The necessary restrictions would have been very different using Java. We will comment on this point later, in Section 4.3, but programming restrictions are not a real problem from our viewpoint. The most important issue is that the state obtained must be complete and consistent with the current state of the computation, enabling the computation to continue with a new object copy on a remote site. This is mandatory for a fault-tolerant system whose first objective is to tolerate faults, e.g. object crashes.

The first role of the tool is then to filter classes according to these programming conventions; any class not obeying the programming restrictions we have identified will be rejected. It is up to the application designer/programmer either to implement SaveState and RestoreState for this class by hand (not recommended) or to iterate on the design/implementation of the class until it obeys the programming restrictions. A second important design issue for the tool is the coverage of the object state, i.e. making sure that all the information necessary for the new copy to resume execution is obtained. This property must be ensured when getting the whole state of the object (cf. Section 3) but also when obtaining any partial state information (cf. Section 4). Any superset of such necessary information is acceptable, although the objective is to minimize the redundancy for performance reasons.

This work is part of the definition and the implementation of a metaobject protocol for fault-tolerant CORBA applications [8].
This MOP has two roles: (i) interception of object creation, deletion and method invocation and (ii) the capture/restoration of the object state. With this MOP, metaobjects can implement non-functional mechanisms such as active or passive replication for fault tolerance and authentication for communication security as previously illustrated in the Friends system [6] . This MOP can also be used for other aims, as previously mentioned.
3 Object State using Compile-Time Reflection
3.1 Context and Motivations
We describe here how compile-time reflection can be used to define and generate the implementation of both methods SaveState and RestoreState. These two methods are first generated to obtain the whole state of the application objects, namely the full set of its private attributes. Public attributes are forbidden in a first step to ensure a strong encapsulation principle (programming restriction that can be bypassed, see Section 4.3). We also assume in the remainder of this paper that base level
methods are executed sequentially within application objects (no internal concurrency; this problem has not been tackled yet). The state is thus obtained at runtime on the source object and forwarded within checkpoint messages through metaobjects to the target object (see Fig. 1). We also assume that a checkpoint can be taken after one or several method executions; this strategy is left open to the metaobjects.
Fig. 1. Metaobjects checkpointing objects (diagram labels: Source Object, MetaObject, SaveState, checkpoints, MetaObject’, RestoreState, Target Object)
The use of compile-time reflection enables the needed static information to be obtained and the adequate source code modifications to be made to get dynamic information when necessary. In practice, metaclasses are used to analyze and translate classes and to generate new methods at compile-time. These features enable the automatic generation of both the state information data structures and the SaveState and RestoreState methods for each class. The reflective compiler OpenC++ v2 [1] was used in our experiments as a powerful macro processing system [2]. The generic approach proposed in this paper needs to be applied to a convenient object model. CORBA [12] provides such a convenient object model. However, depending on the programming language used to implement CORBA objects, some programming restrictions must be obeyed. Finally, the IDL compiler is used in combination with the metacompiler to manage the state information.

3.2 Approach Overview
The starting point is that compile-time reflection enables the definitions of application classes to be reified during the compilation process. This includes attribute names and types, parent classes, object references (composition), etc. This information is handled by metaclasses at compile-time. Given this information, (i) a data structure (called StateContainer) is defined to hold the object state and (ii) new class methods to save and restore the object state are created. In brief, the role of these methods is to write the attributes’ values into the StateContainer or the StateContainer fields back into the attributes, respectively. The StateContainer structure is defined by a metaclass that creates a field for each attribute of the class. Each field in this structure holds an IDL type. The
translation of the attribute types to these fields is performed according to the C++-to-IDL mapping defined by OMG [12]. A simple example of such a StateContainer data structure is proposed in Fig. 2.

    class Example {
        int a, b;
        float c, d;
        char e;
        void set(int i, int j, float k, float l, char m);
        void calculus(int step);
        .....
    };

    struct StateContainer {
        long a, b;
        double c, d;
        char e;
    };
Fig. 2. Example of a StateContainer Structure
At runtime, the generated SaveState method fills the corresponding StateContainer data structure with the state information of a concrete object, and the generated RestoreState method reads it back. For each class, the interface of these methods is thus defined as in Fig. 3.

    StateContainer SaveState();
    void RestoreState(StateContainer);

Fig. 3. Interface for SaveState and RestoreState
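To make the interface of Fig. 3 concrete, here is a minimal hand-written sketch of what the generated pair of methods might look like for the Example class of Fig. 2. It is not the output of the tool: the public `set`, the `sum` accessor, the `const` qualifiers and the `roundTripSum` helper are assumptions added only so that the example is self-contained.

```cpp
#include <cassert>

// IDL-side container mirroring Fig. 2 (int -> long, float -> double).
struct StateContainer {
    long a, b;
    double c, d;
    char e;
};

class Example {
public:
    void set(int i, int j, float k, float l, char m) {
        a = i; b = j; c = k; d = l; e = m;
    }
    int sum() const { return a + b; }  // hypothetical accessor, demo only

    // One copy per attribute, from the object into the container.
    StateContainer SaveState() const {
        StateContainer s;
        s.a = a; s.b = b;
        s.c = c; s.d = d;
        s.e = e;
        return s;
    }

    // The inverse: every field is written back into its attribute.
    void RestoreState(const StateContainer& s) {
        a = static_cast<int>(s.a);
        b = static_cast<int>(s.b);
        c = static_cast<float>(s.c);
        d = static_cast<float>(s.d);
        e = s.e;
    }

private:
    int a = 0, b = 0;
    float c = 0.0f, d = 0.0f;
    char e = 0;
};

// Checkpoint a source object and install its state in a fresh target.
inline int roundTripSum() {
    Example source;
    source.set(3, 4, 1.5f, 2.5f, 'x');
    Example target;
    target.RestoreState(source.SaveState());
    return target.sum();
}
```

In this sketch the target object is created first and then updated, which matches the idea that a new copy is initialized from the checkpoint rather than raw-copied.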
The body of these methods is also generated by metaclasses. Both aspects are presented in the following sections.

3.3 Object Attributes Handling and Methods Generation
Object Model Assumptions
To ensure the consistency of the checkpoints obtained using this approach, we need to have a real object model. We consider a pure object world; objects have only private attributes and are single threaded (no internal concurrency). The attribute types considered are simple types, classes, pointers to objects, arrays of the three preceding types (simple, class, and pointer to object) and object references by means of CORBA_Reference. The work presented in this paper does not solve all the problems identified in the previous section, in particular regarding the use of internal variables within the middleware. This is why a pure object model was considered; even if local pointers cannot be raw-copied, this approach makes it possible to create remote copies of local objects by re-creating them. If the pure object assumption is not met, i.e. if the system/middleware is hybrid, one possible solution is to wrap the system calls using object servers. For instance, a file object server can encapsulate file accesses and log local actions; this would make it possible
to copy remote file references. The same approach can be applied to other site-dependent variables.

Simple Types
The handling of simple C++ types is the basic case: as explained previously, each member of simple type in the class is mapped to an element of the StateContainer data structure following the C++-to-IDL mapping. An example is given in Fig. 4. The generation of the StateContainer data structure for simple types is very easy: the metaclass parses the class definition for attribute declarations. For each attribute, the metaclass retrieves its type, gets the corresponding IDL type from a dictionary and generates an entry in the StateContainer for this attribute (see Fig. 5). The generation of the SaveState and RestoreState methods follows the same scheme. The method SaveState creates a StateContainer data structure (see Fig. 6) and writes each attribute into the corresponding field of the structure. Similarly, the RestoreState method interprets the StateContainer input information and writes each field into the corresponding attribute (see Fig. 7).

Arrays
In the object model considered, arrays can be of different types: arrays of simple data types, arrays of objects, arrays of strings and arrays of pointers to objects. However, the handling of arrays can be expressed in a generic manner: each element of the array is written into the corresponding array element of the StateContainer data structure. When array elements are of simple type, individual elements are stored in the same way as simple type attributes. This also applies to arrays of strings. In all other cases, a recursive technique is used to handle each element, i.e. each object in the array. The example given in Figures 4-7 illustrates the technique for both simple data types and arrays.

    class NQueen {
    public:
        NQueen(int num);
        bool compute(int l);
    private:
        char ChessBoard[MAX][MAX];
        inline bool check(int R, int C);
        int N;
        int placed;
        int nbsteps;
    };
Fig. 4. Original class definition
    struct StateContainer {
        char ChessBoard[100][100];
        long N;
        long placed;
        long nbsteps;
    };
Fig. 5. StateContainer Data Structure
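The dictionary-based generation step described for simple types can be sketched as follows. This is an illustration only, not the OpenC++ metaclass itself: the `Attribute` record, `idlDictionary` and `generateStateContainer` are hypothetical names, and the dictionary lists only the type pairs visible in Fig. 2 and Fig. 5.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One reified attribute: its C++ type, its name and an optional array suffix.
struct Attribute {
    std::string cppType;
    std::string name;
    std::string dims;  // e.g. "[100][100]", empty for scalars
};

// Hypothetical dictionary from simple C++ types to IDL types.
inline const std::map<std::string, std::string>& idlDictionary() {
    static const std::map<std::string, std::string> dict = {
        {"int", "long"}, {"float", "double"}, {"char", "char"}};
    return dict;
}

// Emit the text of a StateContainer declaration, one field per attribute.
inline std::string generateStateContainer(const std::vector<Attribute>& attrs) {
    std::string out = "struct StateContainer {\n";
    for (const Attribute& a : attrs) {
        // at() throws std::out_of_range for a type missing from the dictionary,
        // which loosely mirrors rejecting a class the tool cannot handle.
        out += "    " + idlDictionary().at(a.cppType) + " " + a.name + a.dims + ";\n";
    }
    out += "};\n";
    return out;
}
```

Fed with the four attributes of NQueen, this sketch produces the same declaration as Fig. 5. A real metaclass would obtain the attribute list from the compiler's reified class definition instead of a hand-built vector.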
    StateContainer NQueen::GetState() {
        StateContainer State;
        for (int i = 0; i < ... nbsteps;
    }
Fig. 6. The SaveState Method
Fig. 7. The RestoreState Method
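Since the listings of Fig. 6 and Fig. 7 are incomplete here, the following is a hedged sketch of how the element-wise array handling described above could look for the NQueen class. It is a reconstruction under assumptions, not the paper's generated code: `MAX` is set to 8 for brevity, the method is named SaveState after the figure caption, and `place`, `at`, `placedCount` and `restoredCopyMatches` are hypothetical helpers for the demonstration.

```cpp
#include <cassert>

const int MAX = 8;  // assumed bound; Fig. 5 uses 100

struct StateContainer {
    char ChessBoard[MAX][MAX];
    long N;
    long placed;
    long nbsteps;
};

class NQueen {
public:
    explicit NQueen(int num) : N(num), placed(0), nbsteps(0) {
        for (int i = 0; i < MAX; i++)
            for (int j = 0; j < MAX; j++)
                ChessBoard[i][j] = '.';
    }
    // Hypothetical mutator standing in for compute(), so state can change.
    void place(int r, int c) { ChessBoard[r][c] = 'Q'; placed++; nbsteps++; }
    char at(int r, int c) const { return ChessBoard[r][c]; }
    int placedCount() const { return placed; }

    // Each array element is written into the matching container element.
    StateContainer SaveState() const {
        StateContainer State;
        for (int i = 0; i < MAX; i++)
            for (int j = 0; j < MAX; j++)
                State.ChessBoard[i][j] = ChessBoard[i][j];
        State.N = N;
        State.placed = placed;
        State.nbsteps = nbsteps;
        return State;
    }

    // Each container field is written back into the matching attribute.
    void RestoreState(const StateContainer& State) {
        for (int i = 0; i < MAX; i++)
            for (int j = 0; j < MAX; j++)
                ChessBoard[i][j] = State.ChessBoard[i][j];
        N = static_cast<int>(State.N);
        placed = static_cast<int>(State.placed);
        nbsteps = static_cast<int>(State.nbsteps);
    }

private:
    char ChessBoard[MAX][MAX];
    int N;
    int placed;
    int nbsteps;
};

// Save the source after one move and restore it into a fresh target.
inline bool restoredCopyMatches() {
    NQueen source(4);
    source.place(1, 2);
    NQueen target(4);
    target.RestoreState(source.SaveState());
    return target.at(1, 2) == 'Q' && target.placedCount() == 1;
}
```

For arrays of objects, the inner assignment would be replaced by a recursive call to the element's own SaveState or RestoreState.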
Object Composition and Delegation
In the object model described above we make a clear distinction between internal objects (composition) and external objects (delegation), see Fig. 8. The former correspond to objects addressed either directly (an object is an attribute of another object) or by reference (using pointers) within a CORBA object. The latter correspond to the delegation relationship to a different CORBA object using explicit references, i.e. CORBA_Reference. This has a strong impact on the checkpointing technique since external objects and internal objects are handled in a different way.

Fig. 8. Composition vs. Delegation (diagram labels: SaveState(), copy_ref())
External objects can be shared by several objects and are checkpointed independently from the objects holding a reference to them. They are thus checkpointed by their own metaobject. This means that any CORBA object is checkpointed independently. However, the reference to an external object has to be duplicated and stored in the StateContainer. Internal objects are members of one instance of a class, so they are really part of the state of objects of this class. These objects are checkpointed recursively; for instance in Fig. 9, the SaveState method of class B calls the SaveState method of class A to get the state of both Object_1 and Object_2. The corresponding states are stored in the StateContainer data structure of class B. This corresponds to a deep copy
of the objects while the delegation relationship implies only shallow copies (duplication of the reference).

Examples of composition:

    class A;
    class B {
        A Object_1;
        A* Object_2;
    };

Examples of delegation:

    class C {
        CORBA_Reference delegate;
    };

Fig. 9. Composition versus Delegation
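The difference between the deep copy used for composition and the shallow copy used for delegation can be sketched as follows. This is only an illustration under assumptions: `Reference` is a plain string stand-in for CORBA_Reference, the composed and delegated attributes of Fig. 9 are merged into a single class B for brevity, and `BState`, `second`, `delegateId` and `deepAndShallow` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for CORBA_Reference: an opaque identifier (assumption).
struct Reference { std::string id; };

struct AState { long value; };

class A {
public:
    explicit A(int v = 0) : value(v) {}
    AState SaveState() const { return AState{value}; }
    void RestoreState(const AState& s) { value = static_cast<int>(s.value); }
    int get() const { return value; }
private:
    int value;
};

struct BState {
    AState object1;      // deep copy of the embedded object
    bool hasObject2;     // whether the pointer attribute was set
    AState object2;      // deep copy of the pointed-to object
    Reference delegate;  // shallow copy: only the reference is duplicated
};

class B {
public:
    B(int v1, int v2, const std::string& ref)
        : Object_1(v1), Object_2(new A(v2)), delegate{ref} {}

    BState SaveState() const {
        BState s;
        s.object1 = Object_1.SaveState();          // recursive call
        s.hasObject2 = (Object_2 != nullptr);
        s.object2 = Object_2 ? Object_2->SaveState() : AState{0};
        s.delegate = delegate;                     // reference only
        return s;
    }

    void RestoreState(const BState& s) {
        Object_1.RestoreState(s.object1);
        if (s.hasObject2) {
            // The pointer value itself is never copied: the object is re-created.
            if (!Object_2) Object_2.reset(new A());
            Object_2->RestoreState(s.object2);
        } else {
            Object_2.reset();
        }
        delegate = s.delegate;
    }

    int second() const { return Object_2 ? Object_2->get() : -1; }
    const std::string& delegateId() const { return delegate.id; }

private:
    A Object_1;
    std::unique_ptr<A> Object_2;
    Reference delegate;
};

// Restore a checkpoint of one B into another and check both kinds of copy.
inline bool deepAndShallow() {
    B source(1, 2, "hypothetical-ref");
    B target(0, 0, "");
    target.RestoreState(source.SaveState());
    return target.second() == 2 && target.delegateId() == "hypothetical-ref";
}
```

The external object named by the reference is deliberately left untouched here, since in the model above it is checkpointed by its own metaobject.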
Similarly, the RestoreState method performs the restoration of the state recursively. It is worth noting that, during this operation, at least the objects created since the last checkpoint have to be created. In practice, all objects addressed by pointers are created and updated with the last corresponding state available in the checkpoint.

Class Inheritance
Inheritance (presently, only single inheritance is considered) was identified in our previous experiments [6] as problematic when using basic runtime MOPs such as those provided by Open C++ v1 [3]. Thanks to compile-time reflection, inheritance can be handled recursively, as for object composition: each class is responsible for checkpointing its own set of attributes, and derived classes automatically call the SaveState/RestoreState methods of their base class in order to complete the checkpoint. Polymorphism can also be used with these techniques since base objects are responsible for obtaining their own state. Consider a class hierarchy composed of classes A, B and C, all inheriting from a parent class M; another class D owns an attribute P whose type is pointer to class M. When an instance of D saves its state, it calls P->SaveState(); the object pointed to by P can save its own state whether it is of class A, B or C.

Packing and Portability Issues
The representation of the StateContainer is an important issue. This representation should be generic for two purposes:
1. the state of any object must be handled within StateContainer independently of the class;
2. using a generic format for StateContainer also enables the propagation of this state information to different environments.
For this purpose, the StateContainer data structure is defined in IDL and mapped to the Any IDL type. The IDL compiler automatically generates conversion functions from and to the type Any for each StateContainer data structure (see Fig. 10). The Any type can hold any data structure.
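The recursive treatment of inheritance and polymorphism described in this section can be illustrated with a small sketch. It is an assumption-laden simplification: the state is a flat `std::vector<long>` rather than an IDL StateContainer packed into an Any, SaveState appends to an output parameter instead of returning a structure, only classes M, A, B and D are shown, and all attribute names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a generic state container: a flat list of longs (assumption).
typedef std::vector<long> State;

class M {
public:
    explicit M(int ident) : id(ident) {}
    virtual ~M() {}
    // Each class saves only its own attributes.
    virtual void SaveState(State& out) const { out.push_back(id); }
private:
    int id;
};

class A : public M {
public:
    A(int ident, int xv) : M(ident), x(xv) {}
    void SaveState(State& out) const override {
        M::SaveState(out);  // base class part first
        out.push_back(x);   // then this class's own attribute
    }
private:
    int x;
};

class B : public M {
public:
    B(int ident, int yv, int zv) : M(ident), y(yv), z(zv) {}
    void SaveState(State& out) const override {
        M::SaveState(out);
        out.push_back(y);
        out.push_back(z);
    }
private:
    int y, z;
};

class D {
public:
    explicit D(const M* p) : P(p) {}
    State SaveState() const {
        State out;
        P->SaveState(out);  // virtual dispatch selects A's or B's version
        return out;
    }
private:
    const M* P;  // pointer to the base class; not owned in this sketch
};
```

With this layout, D never needs to know the dynamic type behind P: an object of class A contributes two values and an object of class B three, each prefixed by the part saved by M.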