A synchronization mechanism for typed objects in a distributed system D. Decouchant, S. Krakowiak, M. Meysembourg, M. Riveill, X. Rousset de Pina Laboratoire de Génie Informatique, IMAG BP 53X, 38041 Grenoble Cedex, France e-mail:
[email protected]
Abstract
This paper presents a mechanism for synchronizing shared objects in a distributed system based on persistent, typed objects. This mechanism allows the synchronization constraints to be expressed as separate control clauses and to be factored for a class of objects. The interference of this mechanism with inheritance is examined and a solution is proposed. Examples of synchronized objects are provided and a semaphore-based implementation of the mechanism is described.
1. Introduction
A major characteristic of many computer systems is the potential sharing of data by a number of concurrent activities. This sharing must be controlled in order to guarantee that the shared data remain in a consistent state. This problem has been known for a long time, and a number of solutions have been proposed. However, two new features have appeared in recent systems:
- object orientation, i.e. a specific mode of information structuring, mainly characterized by the association of operations with data, by dynamic instance creation and by inheritance of properties;
- distribution, i.e. the ability to execute computational processes that span a number of physical nodes on a network, and to move objects from one node to another.
The synchronization mechanism presented in this paper has been designed for the object-oriented distributed system Guide, currently being implemented at Grenoble, as a joint project of Laboratoire de Génie Informatique and Bull Research Center. This system also
10/11/98
embodies the object-oriented architecture defined in the Comandos project under the ESPRIT Program supported by the Commission of European Communities [Horn 87].
Before describing this mechanism in detail, we briefly introduce the main features of the Guide system and language that are relevant to the topic of this paper.
An object is the association of a set of data (its state) and a set of procedures (its operations or methods), which provide the only means for consulting or changing the state. Objects are typed. A type is a specification of the operations applicable to objects of that type, each operation being defined by its signature. The internal representation of objects and the code of methods are not included in the type definition. The examples are written in Guide, an object-oriented language that has been developed within the project, and the notations should be self-explanatory. The following declaration specifies a type.

TYPE ProducerConsumer IS
  METHOD Put(IN Element);     // deposit an Element in the buffer
  METHOD Get(OUT Element);    // extract an Element from the buffer
END ProducerConsumer.

The type Element is supposed to be defined elsewhere.
A class is an object model and an object constructor. It defines a specific implementation of a type and a mechanism for the creation of objects represented according to this implementation. These objects are instances of the class. A class provides a description of the internal representation of its instances and the code of the operations applicable to these instances. The operations that create and delete instances are provided by the run-time system. The instances of a class have the same behavior (described by the type that the class implements); in addition, they share the code of their operations, although each instance has a separate state representation.

CLASS FixedSizeBuffer IS
  IMPLEMENTS ProducerConsumer;
  METHOD Put(IN m: Element);
  BEGIN END Put;
  METHOD Get(OUT m: Element);
  BEGIN END Get;
END FixedSizeBuffer.

This class could have been defined as generic, i.e. parameterized by a value or a type, but we retain the non-generic form in this example to make the programs simpler. Note that no synchronization constraints are specified. A given type may be implemented by several different classes. The instances of two different classes which implement the same type share the same external interface but have different internal representations and different operation implementations. Objects are persistent, i.e. the lifetime of an object is independent of that of the program or process in which the object was created. An object is named by a system-wide reference (a universal unique name). An object is always located on a single node, i.e.
objects are not partitioned (however, an object may move from one node to another and include references to other, possibly remote, objects).
Objects are essentially passive. The active entities of the system are called jobs and activities. A job may be regarded as a multiprocessor distributed virtual machine. It defines an addressing space, possibly distributed on several nodes, in which objects may be dynamically bound upon invocation. A job contains one or more parallel activities, i.e. sequential threads of control. The execution of an activity is described as a sequence of (synchronous) invocations on object methods. Communication between activities, within the same job or in different jobs, takes place through shared objects. Examples of widely used shared objects are communication channels, whose function is similar to that of Unix pipes; however, they carry typed objects instead of bytes.
The expression of synchronization for shared objects is described in section 2. An implementation of the synchronization mechanism is presented in section 3.
2. Expressing synchronization constraints
2.1 Introductory example
Our basic choice is to express synchronization as a set of constraints associated with objects, not as primitives appearing in the program of activities. This is fully consistent with the object approach. In addition, since the class mechanism is used to describe the behavior of objects, the specification of synchronization is included in a class and applies to all instances of the class.
The only way for an activity to access or modify an object is to execute a method of this object. Therefore, we specify synchronization as a set of activation conditions. An activation condition is attached to a method and must be satisfied before the execution of this method may start. This condition is expressed in terms of the internal state of the object and of the parameters of the invocation.
If no activation condition is attached to a method, then the execution of this method is unconstrained. Activation conditions are illustrated by the following example, which gives a complete program for the class FixedSizeBuffer introduced in section 1. These conditions enforce the usual bounded buffer synchronization scheme.

CLASS FixedSizeBuffer IS
  IMPLEMENTS ProducerConsumer;
  CONST size = …;
  buffer : ARRAY[0..size-1] OF Element;
  first, last, nbr : Integer = 0, 0, 0;
  METHOD Put(IN m: Element);
  BEGIN
    buffer[last]:=m;
    last:=(last+1) MOD size;
    nbr:=nbr+1
  END Put;
  METHOD Get(OUT m: Element);
  BEGIN
    m:=buffer[first];
    first:=(first+1) MOD size;
    nbr:=nbr-1
  END Get;
  CONTROL
    Put: NOT Get AND NOT Put AND nbr < size;
    Get: NOT Put AND NOT Get AND nbr > 0
END FixedSizeBuffer.

The two statements in the control clause specify activation conditions for the methods Put and Get, respectively. The activation condition for Put specifies that no Get and no Put is active and the buffer is not full. The activation condition for Get specifies that no Put and no Get is active and the buffer is not empty. The next section defines a more general form for the expression of activation conditions.
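As an illustration only, the two activation conditions described above can be mimicked in Python with a condition variable. This is a sketch under stated assumptions, not the Guide implementation: the names `_may_put`, `_may_get`, `_enter`, `_leave` and the `active` table are hypothetical, and `threading.Condition` stands in for whatever the run-time system uses to block activities.

```python
import threading


class FixedSizeBuffer:
    """Bounded buffer whose methods are guarded by activation conditions."""

    def __init__(self, size):
        self.size = size
        self.buffer = [None] * size
        self.first = 0   # index of the next element to extract
        self.last = 0    # index of the next free slot
        self.nbr = 0     # number of elements currently stored
        self.active = {"Put": 0, "Get": 0}   # executions in progress (assumption)
        self._cond = threading.Condition()   # hypothetical helper, not in the paper

    def _may_put(self):
        # no Get and no Put is active, and the buffer is not full
        return (self.active["Get"] == 0 and self.active["Put"] == 0
                and self.nbr < self.size)

    def _may_get(self):
        # no Put and no Get is active, and the buffer is not empty
        return (self.active["Put"] == 0 and self.active["Get"] == 0
                and self.nbr > 0)

    def _enter(self, name, guard):
        # block the caller until the activation condition holds
        with self._cond:
            self._cond.wait_for(guard)
            self.active[name] += 1

    def _leave(self, name):
        # the state changed: let blocked callers re-evaluate their guards
        with self._cond:
            self.active[name] -= 1
            self._cond.notify_all()

    def put(self, m):
        self._enter("Put", self._may_put)
        try:
            self.buffer[self.last] = m
            self.last = (self.last + 1) % self.size
            self.nbr += 1
        finally:
            self._leave("Put")

    def get(self):
        self._enter("Get", self._may_get)
        try:
            m = self.buffer[self.first]
            self.first = (self.first + 1) % self.size
            self.nbr -= 1
            return m
        finally:
            self._leave("Get")


def demo():
    """One producer and one consumer sharing a buffer of size 2."""
    buf = FixedSizeBuffer(2)
    received = []
    consumer = threading.Thread(
        target=lambda: received.extend(buf.get() for _ in range(10)))
    consumer.start()
    for i in range(10):
        buf.put(i)
    consumer.join()
    return received
```

The method bodies contain no synchronization code of their own; all constraints live in the two guard functions, which is the separation the control clause is meant to provide.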
2.2 Expressing activation conditions
The syntax of the control clause is as follows:

CONTROL [ <method name> : <activation condition> ]*

where <activation condition> is a boolean expression which may contain the following:
- instance variables, which represent the internal state of an instance,
- actual parameters of the method,
- synchronization counters,
- names of methods.

Synchronization counters are internal data that specify, for each method of a given object, the total number of invocations, the total number of completed executions, the current number of pending invocations, etc. These counters are automatically updated by the system. Synchronization based on counters was introduced in another context by Robert and Verjus [Robert 77]. The following counters are defined for each method m:

invoked(m)   : number of invocations of method m
started(m)   : number of accepted (non-blocked) invocations of method m
completed(m) : number of completed executions of method m
current(m)   : number of activities currently executing method m
pending(m)   : number of activities currently blocked on invocations to m

The following relationships hold:

current(m) = started(m) - completed(m)
pending(m) = invoked(m) - started(m)

The first three counters are created and initialized to 0 when the instance is created. They record total numbers, and are non-decreasing. An alternative program for FixedSizeBuffer is as follows:

CLASS FixedSizeBuffer IS
  IMPLEMENTS ProducerConsumer;
  CONST size = …;
  buffer : ARRAY[0..size-1] OF Element;
  first, last: Integer = 0, 0;
  METHOD Put(IN m: Element);
  BEGIN
    buffer[last]:=m;
    last:=(last+1) MOD size;
  END Put;
  METHOD Get(OUT m: Element);
  BEGIN
    m:=buffer[first];
    first:=(first+1) MOD size;
  END Get;
  CONTROL
    Put: (completed(Put) - completed(Get) < size) AND current(Put)=0;
    Get: (completed(Put) > completed(Get)) AND current(Get)=0;
END FixedSizeBuffer.

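The relationships between the counters can be checked with a small Python sketch. The `Counters` record below is a hypothetical illustration: only the three stored counters and the two derived ones come from the text, and the way they are updated here by hand is an assumption standing in for the system's automatic updates.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Per-method synchronization counters; only three are stored."""
    invoked: int = 0     # total invocations
    started: int = 0     # total accepted (non-blocked) invocations
    completed: int = 0   # total completed executions

    @property
    def current(self):
        # activities currently executing the method
        return self.started - self.completed

    @property
    def pending(self):
        # activities currently blocked on an invocation
        return self.invoked - self.started


put, get = Counters(), Counters()

# Three Put invocations arrive, two are accepted, one of those finishes.
put.invoked += 3
put.started += 2
put.completed += 1

print(put.current, put.pending)        # 1 1
print(put.completed - get.completed)   # 1 element deposited and not yet extracted
```

The last line shows why the variable nbr is no longer needed in the alternative program: the number of stored elements is the difference between the completed counters of Put and Get.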
Mutual exclusion between all methods in an object is expressed by a single activation condition which is common to all methods:

$\sum_i \mathrm{current}(m_i) = 0$

Since this condition is frequently used, it is replaced by a single keyword EXCLUSIVE in the control clause.
It should be noted that activation conditions are expressed using only boolean expressions. Modifications of internal variables, or more generally algorithms, are not allowed. As a consequence, some synchronization schemes cannot be expressed directly; this limitation is illustrated in 2.4.
2.3 Inheriting synchronization constraints
Synchronization counters provide a simple and synthetic way for the expression of synchronization constraints on shared objects. However, an important issue is the possible interference of this expression with the inheritance mechanism.
The object model defined in Guide includes single inheritance, with possible overloading. For example, let M be a method defined in class C and let C1 be a subclass of C. Then the program of M may be overloaded (i.e. redefined) in C1. However, the program of the original method M, as defined in C, is still accessible within C1 (although it is not accessible by users of instances of C1, who only see the overloaded method).
We now have to specify how the inheritance mechanism applies to synchronization. Two remarks are in order:
1) The activation condition for a method is actually part of the program. As such, it may be inherited and overloaded. Note that if the activation condition for an inherited method is overloaded, the whole method must be considered as overloaded even if its actual program is not.
2) The internal counters associated with a method M (e.g. invoked(M), etc.) are actually part of the state of the instance. Therefore, if a method defined in a class C is inherited without overloading in C1 (a subclass of C), so is the representation of its internal counters. However, if method M is overloaded in C1, a new set of counters must be defined for M in C1. The counters associated with M as defined in C are maintained separately (although they are not explicitly visible within C1).
For example:

CLASS C IS
  …
  METHOD M(…) …
  CONTROL
    ...
    M: condition0
END C.

CLASS C1 IS C              // this is a subclass of C
  …
  METHOD M(…)              // overloads method M defined in superclass
  BEGIN
    …
(1) SUPER.M(…);            // invokes the method M in superclass
    ...
  END M;
  CONTROL
    …
    M: condition1;         // refers to the overloaded methods
END C1.

Both the body of method M and its activation condition are overloaded in class C1. In this case, two sets of counters are maintained for method M. The counters of the first set are updated when the original method, defined in the superclass, is invoked, as for example in the line labeled (1). The counters of the second set are updated when the overloaded method is invoked. The first set of counters is only used in association with activation conditions that apply to the non-overloaded method M in class C, and is not visible in class C1. We check neither at compile time nor at execution time whether condition1 and condition0 can lead to deadlock.
2.4 The "Readers and Writers" problem
Among all the canonical synchronization problems which we have programmed using our mechanism, the readers and writers problem best illustrates the power and limitations of the mechanism. We introduce the type TFile:

TYPE TFile IS
  METHOD Write (IN e: Element, dep: Integer);    // write e in the file at position dep
  METHOD Read (IN dep: Integer): Element;        // read an Element at position dep
END TFile.

and a class File which implements this type:

CLASS File IS
  IMPLEMENTS TFile;
  METHOD Write (IN e: Element, dep: Integer);
  BEGIN END Write;
  METHOD Read (IN dep: Integer): Element;
  BEGIN END Read;
END File.

Instances of class File are similar to Unix files, i.e. access is unrestricted. We shall now implement access restriction according to the Readers and Writers scheme, using inherited types and classes to specify the additional conditions and operations. The following class implements the Readers and Writers scheme with priority to the Readers:

CLASS ReadersFirst IS File
  IMPLEMENTS TFile;
  CONTROL
    Write: (current(Write) = 0) AND (current(Read) = 0) AND (pending(Read) = 0);
    Read: current(Write) = 0;
END ReadersFirst.

In the same manner, we define the class WritersFirst, which gives priority to the writers (i.e. reads are delayed if a write call is current or pending).

CLASS WritersFirst IS File
  IMPLEMENTS TFile;
  CONTROL
    Write: (current(Write) = 0) AND (current(Read) = 0);
    Read: (current(Write) = 0) AND (pending(Write) = 0);
END WritersFirst.

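To make the difference between the two policies concrete, the four activation conditions can be written as Python predicates over the counters of Write and Read and evaluated on one scenario. This is only a sketch: the `Counters` record and the function names are hypothetical, and the counter values are set by hand rather than by a run-time system.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    invoked: int = 0
    started: int = 0
    completed: int = 0

    @property
    def current(self):
        return self.started - self.completed

    @property
    def pending(self):
        return self.invoked - self.started


# Activation conditions of ReadersFirst (w = counters of Write, r = counters of Read)
def readers_first_write(w, r):
    return w.current == 0 and r.current == 0 and r.pending == 0


def readers_first_read(w, r):
    return w.current == 0


# Activation conditions of WritersFirst
def writers_first_write(w, r):
    return w.current == 0 and r.current == 0


def writers_first_read(w, r):
    return w.current == 0 and w.pending == 0


# Scenario: one Read is executing and one Write is blocked behind it.
r = Counters(invoked=1, started=1, completed=0)
w = Counters(invoked=1, started=0, completed=0)

# A new Read arrives: admitted under ReadersFirst, delayed under WritersFirst.
print(readers_first_read(w, r))   # True
print(writers_first_read(w, r))   # False

# The first Read completes while a second Read is still blocked.
r = Counters(invoked=2, started=1, completed=1)
print(readers_first_write(w, r))  # False: a Read is pending
print(writers_first_write(w, r))  # True: no Read or Write is executing
```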
The implementation of a symmetric scheme is more difficult. In this scheme, a Write request may only be executed after completion of all Reads and Writes prior to it. Conversely, a Read request may only be executed after completion of all Writes prior to it. When a Read is in progress, any incoming Reads are processed until a Write request arrives. We must therefore be able to memorize pending invocations of Writes during the execution of both primitives. This is achieved by defining two additional methods, AskRead and AskWrite, whose body is empty and whose control clause is used to count the number of pending Writes and pending Reads.

TYPE TNoPriority IS TFile
  METHOD AskRead (IN Integer);
  METHOD AskWrite (IN Integer);
END TNoPriority.

CLASS NoPriority IS File
  IMPLEMENTS TNoPriority;
  METHOD AskWrite (IN NbWriteAndNbReadBeforeMe: Integer);
  BEGIN
    // used to memorize the invoked(Write)+invoked(Read) value at call time.
  END AskWrite;
  METHOD AskRead (IN NbWriteBeforeMe: Integer);
  BEGIN
    // used to memorize the invoked(Write) value at call time.
  END AskRead;
  METHOD Write (IN e: Element, dep: Integer);
  BEGIN
    SELF.AskWrite(invoked(Write) + invoked(Read) - 1);    // control the Write access
    SUPER.Write(e,dep);                                   // call the method Write of the superclass
  END Write;
  METHOD Read (IN dep: Integer): Element;
  BEGIN
    SELF.AskRead(invoked(Write));                         // control the Read access to the file
    RETURN SUPER.Read(dep);                               // call the method Read of the superclass
  END Read;
  CONTROL
    AskWrite: (completed(Write) + completed(Read)) = NbWriteAndNbReadBeforeMe;
    AskRead: (completed(Write) = NbWriteBeforeMe)
END NoPriority.

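The idea of memorizing counter values at call time can be sketched in Python as follows. This is an illustration under assumptions, not the Guide implementation: the class `NoPriorityFile`, its dictionary-backed storage, and the use of `threading.Condition` are all hypothetical. Unlike the Guide program, the sketch reads the counters before incrementing invoked for the call itself, so no "- 1" correction is needed for Write.

```python
import threading


class NoPriorityFile:
    """Symmetric readers/writers: requests are served in arrival order,
    except that consecutive Reads may overlap."""

    def __init__(self):
        self._cond = threading.Condition()   # hypothetical helper
        self.invoked = {"Read": 0, "Write": 0}
        self.completed = {"Read": 0, "Write": 0}
        self.data = {}                        # stand-in for the file contents

    def write(self, e, dep):
        with self._cond:
            # number of Reads and Writes requested before this Write
            before_me = self.invoked["Read"] + self.invoked["Write"]
            self.invoked["Write"] += 1
            # guard: every earlier Read and Write has completed
            self._cond.wait_for(
                lambda: self.completed["Read"] + self.completed["Write"] == before_me)
        self.data[dep] = e
        with self._cond:
            self.completed["Write"] += 1
            self._cond.notify_all()

    def read(self, dep):
        with self._cond:
            # number of Writes requested before this Read
            writes_before_me = self.invoked["Write"]
            self.invoked["Read"] += 1
            # guard: every earlier Write has completed
            self._cond.wait_for(
                lambda: self.completed["Write"] == writes_before_me)
        value = self.data.get(dep)
        with self._cond:
            self.completed["Read"] += 1
            self._cond.notify_all()
        return value


f = NoPriorityFile()
f.write("a", 0)
print(f.read(0))   # a
```

A Write that arrives after a Read cannot start before that Read completes, so completed(Write) can never overshoot the value a blocked Read is waiting for.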
3. Implementation of the synchronization mechanism
We describe the principle of an implementation of the synchronization mechanism presented in the previous section. This description uses semaphores to specify the low-level scheduling of activities, although any equivalent mechanism would be appropriate.
Let us first briefly describe how distribution is taken care of. An activity is represented by several processes, each of which is local to a node where the activity has invoked an object. Remember that an instance of an object is located on a single node. A locating algorithm (not described here) determines the current location of an object upon invocation, given its reference. The calling job then "diffuses" to the site of the called object,
or, conversely, the object is moved to the site where the call was executed, depending on the load sharing policy. Diffusion of a job to a site essentially amounts to the creation of a new process on that site, as a representative for the calling activity (if a representative is already present, it is reused), followed by a remote invocation using that representative. In summary, the synchronization is always local to a node and involves several local processes which represent the calling activities on that node.
The implementation uses the following data structure associated with each synchronized instance:
a) a mutual exclusion semaphore that is used to lock the access to the rest of the structure;
b) an activity queue which records the identifications of the activities which are waiting to execute a method of the class on this instance;
c) for each method applicable to the instance:
- a record containing the counters defined in section 2.2 (only three counters have to be maintained). An additional set of counters must be maintained for each overloaded method, as explained in 2.3;
- an internal procedure (with boolean result) which computes the activation condition for the method, if any.
Each instance of the class has its own copy of this structure while it is loaded on a node. This structure is automatically generated by the compiler as a result of the parsing of the control clauses.
In addition, each activity has a private semaphore which is used to block its execution while it waits for a condition to be satisfied. Since an activity is essentially a sequence of synchronous method invocations, a single semaphore per activity is sufficient.
We now describe the programs involved in this implementation. This description is given in the Guide language, as a set of methods associated with type SynchronizedObject, which is a supertype of all objects that contain a control clause.

TYPE SynchronizedObject IS
  METHOD InitAtLoad ;
  METHOD FreeAtUnload ;
  METHOD ExecutionCondition (IN methodIndex : Integer) : Boolean ;
  METHOD Prologue (IN methodIndex : Integer) ;
  METHOD Epilogue (IN methodIndex : Integer) ;
END SynchronizedObject.

When an object is loaded from secondary storage to be executed, the procedure InitAtLoad is automatically activated by the system. It initializes the data structure described above. When the object returns to secondary storage, the procedure FreeAtUnload is invoked.
The invocation of a method in a synchronized object is bracketed by calls to a Prologue and an Epilogue, which take as argument the index of the method. These calls are automatically inserted by the compiler. The function of the Prologue is to evaluate the activation condition associated with the method, and to block the invoking activity if necessary. The function of the Epilogue is to wake up the activities which are currently waiting for an activation condition.
We now give an outline of the program of class SynchronizedObject.

CLASS SynchronizedObject IS
  IMPLEMENTS SynchronizedObject;
  SYNONYM ActivityList = List OF REF Activity ;
  mutexSem : REF Semaphore ;
  waitingActivityList : REF ActivityList ;
  // loc
  METHOD ExecutionCondition (IN methodIndex : Integer) : Boolean ;
  BEGIN
    // empty method in class SynchronizedObject, overloaded by the boolean
    // control expression defined in its sub-classes
  END ExecutionCondition ;
  METHOD Prologue (IN methodIndex : Integer) ;
    okForExecution, invokedUpdate : Boolean ;
  BEGIN
    okForExecution := FALSE ;
    invokedUpdate := FALSE;
    WHILE NOT okForExecution DO
    BEGIN
      mutexSem.P ;
      IF NOT invokedUpdate THEN
      BEGIN
        invokedUpdate := TRUE
      END;
      okForExecution := SELF.ExecutionCondition (methodIndex) ;
      IF okForExecution THEN …
      ELSE … ;
      mutexSem.V ;
      IF NOT okForExecution THEN …
    END ;
  END Prologue ;
  METHOD Epilogue (IN methodIndex : Integer) ;
    waitingActivity : REF Activity ;
  BEGIN
    mutexSem.P ;
    …
    mutexSem.V ;
  END Epilogue ;
  METHOD InitAtLoad ;
    cpt : Integer ;
  BEGIN
  END InitAtLoad ;
  METHOD FreeAtUnload ;
  BEGIN
  END FreeAtUnload ;
END SynchronizedObject.

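One way the Prologue and Epilogue could work is sketched below in Python with `threading.Semaphore`. The steps that the outline above leaves open are filled in here as assumptions, not as the authors' code: where each counter is updated, the choice to wake every waiting activity in the Epilogue, and the creation of one private semaphore per Prologue call rather than per activity. The `Exclusive` subclass is a hypothetical example of an overloaded execution condition.

```python
import threading


class SynchronizedObject:
    def __init__(self, method_names):
        self.mutex = threading.Semaphore(1)   # protects the structure below
        self.waiting = []                     # private semaphores of blocked callers
        self.counters = {m: {"invoked": 0, "started": 0, "completed": 0}
                         for m in method_names}

    def execution_condition(self, method):
        return True                           # overloaded in subclasses

    def prologue(self, method):
        private = threading.Semaphore(0)      # assumption: one per call
        invoked_updated = False
        while True:
            self.mutex.acquire()
            if not invoked_updated:           # count the invocation only once
                self.counters[method]["invoked"] += 1
                invoked_updated = True
            if self.execution_condition(method):
                self.counters[method]["started"] += 1
                self.mutex.release()
                return
            self.waiting.append(private)      # register before releasing the mutex
            self.mutex.release()
            private.acquire()                 # block until some epilogue wakes us

    def epilogue(self, method):
        self.mutex.acquire()
        self.counters[method]["completed"] += 1
        woken, self.waiting = self.waiting, []
        self.mutex.release()
        for sem in woken:                     # each woken caller re-evaluates its condition
            sem.release()


class Exclusive(SynchronizedObject):
    """Mutual exclusion between all methods: sum of current(m) is 0."""

    def execution_condition(self, method):
        return all(c["started"] - c["completed"] == 0
                   for c in self.counters.values())
```

Because a blocked caller is appended to the waiting list while it still holds the mutex, an Epilogue that runs afterwards is guaranteed to find it there, so a wake-up cannot be lost even if the release happens before the caller reaches `private.acquire()`.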
We finally present the code that is automatically inserted in the program of the class of a synchronized object.

CLASS aClass IS                     // implicit sub-class of SynchronizedObject
  IMPLEMENTS aType;
  SYNONYM CounterType = RECORD      // internally defined type name
    invoked : Integer ;
    started : Integer ;
    completed : Integer ;
  END ;
  // internal state variables inserted by the compiler
  CONST NbMethods = … ;
  Counters : Array [NbMethods] OF CounterType ;
  …
  // methods defined by the user
  …
  METHOD aMethod ;                  // #n denotes the index of this method (a constant)
  BEGIN
    SELF.Prologue (#n) ;            // inserted by the compiler
    …
    SELF.Epilogue (#n) ;            // inserted by the compiler
  END aMethod ;
  …
  // method inserted by the compiler
  // for the evaluation of execution conditions
  METHOD ExecutionCondition (IN methodIndex : Integer) : Boolean ;
  BEGIN
    CASE methodIndex OF
      1 : …
      2 : …
      …
    END ;
  END ExecutionCondition ;
END aClass.

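The bracketing performed by the compiler can be imitated in Python with a decorator. This is a sketch of the idea only: `bracketed`, `TracingObject` and `AClass` are hypothetical names, the prologue and epilogue merely record a trace instead of blocking, and running the epilogue in a `finally` clause is an assumption that the source does not discuss.

```python
import functools


def bracketed(method_index):
    """Wrap a method body between prologue(index) and epilogue(index)."""
    def wrap(body):
        @functools.wraps(body)
        def method(self, *args, **kwargs):
            self.prologue(method_index)
            try:
                return body(self, *args, **kwargs)
            finally:
                self.epilogue(method_index)   # assumption: also runs on error
        return method
    return wrap


class TracingObject:
    """Stand-in for SynchronizedObject that only records the calls."""

    def __init__(self):
        self.trace = []

    def prologue(self, index):
        self.trace.append(("prologue", index))

    def epilogue(self, index):
        self.trace.append(("epilogue", index))


class AClass(TracingObject):
    @bracketed(1)
    def a_method(self, x):
        self.trace.append(("body", x))
        return x * 2


o = AClass()
print(o.a_method(3))   # 6
print(o.trace)         # [('prologue', 1), ('body', 3), ('epilogue', 1)]
```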
4. Conclusion
We have presented a synchronization mechanism for shared typed objects in a distributed environment. This mechanism is characterized by the following features:
- Synchronization constraints are common to a class of objects, and are expressed as separate control clauses in the program of that class.
- A control clause contains an activation condition (possibly empty) for each method defined in the class.
- An activation condition associated with a method specifies a "guard" which must be true before the execution of the method may start. This condition is expressed in terms of the internal state of the object, the parameters of the method invocation, and internal counters attached to the object.
- Synchronization constraints are inherited by the subclassing mechanism. Internal counters should be maintained in multiple copies for overloaded methods.
This mechanism may be compared with the mediator mechanism proposed by J.E. Grass and R.H. Campbell [Grass 86]. The mediator concept is more general, as it allows any algorithm to be specified and therefore allows object data to be modified. To express synchronization, mediators introduce primitives involving activity identification, thus allowing policies to be defined on the basis of the identity of calling activities. We think that this particular feature should not be allowed in an object-oriented framework where applications are expressed in terms of objects rather than in terms of activities.
The choice of a basic synchronization mechanism is obviously a tradeoff between complexity and power. From the point of view of expressive power, exclusion and execution conditions may be viewed as a subset of the mediator concept. However, we think that this basic tool is satisfactory for the following reasons:
- Most synchronization schemes can be expressed with exclusion and execution conditions.
- We think that the system should only provide basic mechanisms. The design of more complex synchronization schemes is the responsibility of the application designer.
The implementation of the system is currently under way [Decouchant 88]. Its first version is based on Unix: activities are mapped on Unix processes, and objects are represented in shared memory.
Acknowledgments
Roland Balter and Jacques Mossière contributed to the definition of the synchronization mechanism. Interaction with members of the Object-Oriented group of the Comandos ESPRIT project was beneficial to this work. The Guide project is also partially supported by Centre National de la Recherche Scientifique (CNRS), Centre National d’Etudes des Télécommunications (CNET) and Comandos ESPRIT project.
References
[Campbell 74] Campbell R.H., Habermann A.N., The specification of process synchronization by path expressions, Operating Systems, Lecture Notes in Computer Science, Springer-Verlag (1974), pp. 89-102
[Decouchant 88] Decouchant D., Duda A., Freyssinet A., Paire E., Riveill M., Rousset de Pina X., Vandôme G., Implementing a distributed object-oriented architecture on Unix, to appear, Proc. EUUG, Lisbon, oct. 1988
[Grass 86] Grass J.E., Campbell R.H., Mediators: a synchronization mechanism, Proceedings of 6th International Conference on Distributed Computing Systems, Cambridge (1986), pp. 468-477
[Horn 87] Horn C.J., Krakowiak S., An object-oriented architecture for distributed office systems, Proc. ESPRIT Technical Conf., Brussels (sept. 1987)
[Robert 77] Robert P., Verjus J.-P., Toward autonomous descriptions of synchronization modules, Proc. IFIP Congress (B. Gilchrist, ed.), North-Holland (1977), pp. 981-986