Resolution of Parallel Deadlock by Partial Abortion - CiteSeerX

10 downloads 0 Views 42KB Size Report
deadlock is detected, one deadlocked transaction T is selected o be aborted in order ... inds of objects, worker, secretary, room, and ohp (overhead s t projector).
Resolution of Parallel Deadlock by Partial Abortion Shinji YASUZAWA, Makoto TAKIZAWA, and Takuma OUCHI Dept. of Information and Systems Engineering Tokyo Denki University Ishizaka, Hatoyama, Hiki, Saitama 350-03, Japan Tel. +81-492-96-2911 ext.2406 Fax. +81-492-96-6185 e-mail {yasu, taki, ouchi}@takilab.k.dendai.ac.jp

Abstract A distributed system is composed of multiple objects interconnected by communication networks. Each object is an abstract data type. Users write transactions to manipulate distributed objects in a nested fashion. Transactions are composed of operations and also atomic units of work for users. Each operation can call operations on another objects. Suppose that one operation op1 on an object o1 calls two operations op2 on p and op3 on q. op2 and op3 can be called in parallel. This means that op1 can be executed in parallel. In this sense, nested transactions can be executed in parallel in the distributed systems. In this paper, we would like to discuss deadlock problems occurred when transactions are executed in parallel. 1. Introduction Recently, distributed applications are required in many systems. In the distributed applications, transaction management has become more important. A distributed system is considered to be composed of multiple objects which are interconnected by communication networks. Objects are abstract data types, which provide the data structure and the operations for manipulating the data structure. Users issue transactions [GRAY81] to manipulate one or more than one object. Transactions are composed of multiple operations on the objects and also atomic units of work. Each operation op on an object o can call operations on another objects. Suppose that one operation a on an object r calls two operations b on an object p and c on q. By executing a, the objects r, p, and q are manipulated by a, b, and c, respectively. If b and c are called independently by a, they -1-

can be executed in parallel. Transactions and operations are specified by using some concurrent language. Each object o uses conflict-based locking mechanism [WEIH89] to synchronize the concurrent execution of operations on o. If the locking mechanism is used, deadlock might occur. Since transactions in new applications like CAD require more objects for longer time than conventional ones, there is bigger possibility that deadlock occurs. In this paper, we discuss deadlock problems occurred when transactions are executed in parallel. We assume that deadlock can be detected by some deadlock detection mechanism [KNAP87]. After the deadlock is detected, one deadlocked transaction T is selected to be aborted in order to resolve the deadlock. In the conventional system, all the operations executed in T are aborted, i.e. T is totally aborted. However, all the operations in T are not necessarily related to the deadlock. In our system, only a part of T which is necessary to resolve the deadlock is aborted, i.e. T is partially aborted. In order to abort the operations executed in T, they are undone by restoring the old states, e.g. before image of the operations stored in the log and backup copy in the conventional system. In this paper, we present another method in which some operation is executed to abort each operation or transaction. The operation is named a compensate operation or a compensating transaction [KORT90]. The compensate operations are defined for each operation on the basis of the application semantics. By executing the compensate operations in stead of using the conventional state based log, information to be stored in the log can be reduced. It is useful because longer transactions have to store more data in the log. In section 2, we describe our system model for executing nested transactions on the objects. In section 3, we present a model of parallel transactions. In section 4, we present how deadlock occurs when each transaction is executed in parallel. In section 5, we discuss how to resolve the deadlock by aborting partially a deadlocked transaction T by executing the compensate operations of the operations executed in T. 2. System Model A distributed system M is a collection of objects interconnected by communication networks. Each object o is an abstract data type which provides data structure Do and a set Po of operations for manipulating Do. We assume that o has its own recovery mechanism, i.e. o has an object log Lo which keeps in track of operations executed in o. An object o provides a compensate operation op˜ for each operation op. op˜ is defined on the basis of the applica-2-

tion semantics. For a system state s, let op(s) denote a state obtained by applying op on s. op˜ is an operation such that for every state s, op˜ (op(s)) = s. s can be restored by executing op˜ . For example, suppose that a state t is obtained by applying an operation append of some record r to a state u of a file f. Then, if delete of r is applied to t, u is obtained again. delete is an example of a compensate operation of append, i.e. delete is append˜ . In this paper, we try to abort the transaction by executing the compensate operations of the committed operations of a higher level. Each op of o has a lock mode mode(op). Some compatibility relation is defined among the modes provided by o. mode(op1) is compatible with mode(op2) iff op1 can use o when op2 holds o. op1 is compatible with op2 iff mode(op1) is compatible with mode(op2). op1 is incompatible with op2 iff op1 is not compatible with op2. There are two kinds of operations provided by o, i.e. non-primitive and primitive ones. The primitive operations directly manipulate o without locking o and do not call any operation. On the other hand, the non-primitive operations can call primitive operations of o and another operations. [Example 2.1] An office system Office is composed of several kinds of objects, worker, secretary, room, and ohp (overhead projector). Each worker object w has a schedule table wsch as the local state. The worker provides operations Book to update wsch to make an appointment with it, Cancel to cancel the booking. Cancel is the compensate of Book, and vice versa. Each room object has the schedule table rsch which denotes when the room is booked. The room provides operations BookR to book the room, i.e. to update rsch. BookR calls BookO to book the device ohp to be used in the meeting. The secretary takes an operation OpenM and then asks the members to join the meeting by sending the operation Book to the worker objects and book the room for the meeting by sending BookR to the room object.` Each object communicates with another objects. In our system, the objects are assumed to be interconnected by a reliable broadcast communication network named an SPO network [NAKA91]. In this network, each object can broadcast messages to their destination objects in the sending order without any loss of messages. 3. Transactions 3.1 Transaction Structure A transaction T is composed of operations, i.e. T calls -3-

operations on the objects. Each operation in T can call another operations. In this sense, T is structured in a partially ordered tree named a transaction tree. The root denotes T. Each leaf denotes a primitive operation. Here, a notation [[op]] is used to explicitly denote that an operation op is primitive. Non-leaf nodes are called subtransactions which denote non-primitive operations. For a transaction T, opT is used to denote that op is called in T. [Example 3.1] Let us consider Office as presented in Example 2.1. In order for a manager mgr to hold a meeting, mgr asks the secretary sec to open the meeting with the members a and b when mgr would be free. sec makes not only an appointment of a and b, but also asks room to be booked for the meeting. In addition to booking room, sec books the device ohp. This transaction Meeting is represented in a tree as shown in Fig.1. ` Meeting Book(mgr) [[Write(msch)]]

OpenM(sec)

Book(a)

Book(b)

BookR(room)

[[Write(asch)]] [[Write(bsch)]] [[Write(rsch)]] BookO(ohp) [[Write(ohp)]]

Fig.1 Transaction Meeting Suppose that op on an object o (directly) calls m operations op1, ..., opm, (for i = 1, ..., m). Let [op and op] denote the begin and end of op, respectively. Let Σop be a set of operations called by op and begin/end operations. We introduce a precedence relation →op on Σop which denotes the execution sequence of operations. For two operations opi and opj called by op, opi →op opj means that opj is executed after opi commits. If there exists no relation →op among opi and opj, they can be executed in parallel. When all the operations called by op commit, op commits. Hence, op] has to wait for all the operations called by op, i.e. [op →op opi →op op] for i = 1, and the top is [op. In this paper, only AND-parallelism is considered. [Example 3.2] Let us consider Meeting of Fig.1. ΣMeeting = {[Meeting, Book(mgr), [OpenM(sec), Book(a), Book(b), BookR(room), OpenM(sec)], Meeting]}. In Meeting, mgr asks -4-

sec to organize the meeting after his schedule is fixed by Book(mgr). Since there is no precedence relation →Meeting among Book(a), Book(b), and BookR(room), a and b, and room can be booked in parallel. As a result, Meeting is represented as a directed graph as shown in Fig.2, where the nodes denote operations and directed edges represent the precedence relation. The graph is named a transaction graph. ` [Meeting

e

[Book(mgr)

e

[[Write(msch)]]

e

Book(mgr)]

e

[OpenM(sec) [Book(b)

[Book(a)

[BookR(room)

[[Write(bsch)]] [[Write(asch)]] [[Write(rsch)]] [BookO(ohp) Book(b)]

Book(a)]

BookR(room)]

OpenM(sec)]

[Arrange [Book(b)

[BookO(ohp) [[Write(ohp)]]

Fig.2 Meeting and Arrange 3.2 Synchronization The locking mechanism [ESWA76] for nested transactions is presented in [MOSS85]. Here, Lock(op) denotes a set of objects held by op after op commits. Suppose that op calls operations op1, ..., opm. op is executed as follows. [Execution scheme of op] (1) If op is the root, nothing is done and Lock(op) = φ. (2) If op is not the root, o must be locked in a mode mode(op) before executing op. (2-1) For every mode m of operations which hold o, if mode(op) is compatible with m, op can be applied to o. (2-2) Otherwise, op blocks and waits until (2-1) holds. (3) When op1, ..., opm commit, Lock(op) = {o} ∪ Lock(op1) ∪...∪ Lock(opm). If op is the root of the transaction, all -5-

the locks in Lock(op) are released.

`

All the locks are released when all the operations commit in T. This means that the transactions are two-phase locked [ESWA76]. 4. Parallel Deadlock There might occur some deadlock if the locking mechanism is used. A system state is represented by a waitfor graph. According to the famous deadlock discussion [KNAP87], the system state s is deadlocked iff the wait-for graph for s includes a directed cycle. Transactions in the deadlock cycle are deadlocked. In this paper, we discuss what kind of deadlock occurs when the transaction is executed in parallel. Also, we try to abort only the smallest part of the transaction which is necessary to resolve the deadlock in stead of aborting the whole part of the transaction. In order to do so, the internal structure of the transaction, i.e. the dependency relation ⇒L on the operations must be considered. [Definition] An operation op1 is said to L-depend on op2 (op1 ⇒L op2) iff (1) op1 waits on an object o held by op2, and op1 and op2 conflict, (2) op1 precedes op2 in some transaction, or (3) for some op3, op1 ⇒L op3 ⇒L op2. ` A system state s is represented as a directed graph named an extended wait-for (EWF) graph where the nodes denote the operations and the directed edges denote the minimum covering of ⇒L. If there is a directed edge from op1 to op2 in the EWF graph, op1 is said to be an immediate predecessor of op2 and op2 is an immediate successor of op1. A current operation of T is one being executed. Let C(T) be a set of current operations of T. For each operation op in T, CO(T, op) denotes a set of current operations called by op. [Example 4.1] Suppose that there is another transaction Arrange which tries to book both b and ohp in parallel as shown in Fig.2 Suppose that Arrange waits on b held by Meeting, and Meeting waits on ohp held by Arrange, i.e. [BookArrange(b) ⇒L [BookMeeting(b) and [BookOMeeting(ohp) ⇒L [BookOArrange(ohp). Here, although there is no directed cycle on ⇒L. both transactions are deadlocked. ` [Definition] An operation op1 is said to depend on op2 (op1 ⇒ op2) iff (1) op1 ⇒L op2, (2) op1 and op2 are called by some transaction T, op1 holds an object and op2 waits on some object, or (3) For some operation op3, op1 ⇒ op3 ⇒ op2. ` -6-

[Example 4.2] In Fig.2, there exists a cycle on ⇒, i.e. [BookOArrange(ohp) ⇒ [BookArrange(b) ⇒ [BookMeeting(b) ⇒ [BookOMeeting(ohp) ⇒ [BookOArrange(ohp). This represents that deadlock occurs among Meeting and Arrange. ` [Definition] An operation op is said to be deadlocked iff op ⇒ op. ` A deadlocked operation op is said to be waited iff there exists an operation op1 in another transaction such that op ⇒ op1 ⇒ op. Let W(T) be a set of waited operations in T. Let LW(T) be a set of waited operations which no waited operation precedes. If the deadlock is detected and some deadlocked T is selected to be aborted, all the operations in LW(T) must be aborted. 5. Deadlock Resolution After the deadlock is detected by some deadlock maneger (DM) as discussed in [KNAP87], the deadlock is resolved by aborting some deadlocked transaction. 5.1 Greatest Committed Operations In our system, compensate operations are executed for committed operations instead of aborting them by using the conventional state-based log [BERN87]. In this paper, we assume that every operation op has the compensate operation op˜ . The compensate operation is defined based on the application semantics. Operations which commit and are called by uncommitted operations are said to be greatest committed operations in T. In this paper, only the greatest committed operations are compensated in stead of executing the compensate operations of all the committed operations. [Example 5.1] Let us consider an EWF graph as shown in Fig.2. As stated in Example 4.1, deadlock occurs among Meeting and Arrange. Suppose that Meeting is selected to be aborted by the DM and all the operations executed in Meeting are aborted. One method is to execute the compensate operations of all the operations in the reverse order. In stead of executing the compensate operations of all the operations executed in Meeting, the greatest committed operations BookMeeting(mgr), BookMeeting(a), and BookMeeting(b) can be compensated by executing BookMeeting(mgr)˜ , BookMeeting(a)˜ , and BookMeeting(b)˜ . ` Since op˜ is a transaction, op˜ requires the locks on the objects which op˜ uses. Each committed operation op still -7-

holds objects when op˜ is executed. When op is aborted by executing op˜ , the locks held by op or op˜ are released. If op˜ requires objects different from ones obtained in op, further deadlock might occur by executing op˜ . Assume that each compensate operation op˜ requires only objects obtained by op in compatible modes with op. 5.2 Partial Abortion Suppose that Meeting is selected to be aborted in Fig.2. In the conventional method, all the operations in Meeting are aborted. However, the deadlock can be resolved by aborting only the operation BookMeeting(b). It is an example of the partial abortion of Meeting. Next problem is whether BookMeeting(b) can be restarted after aborting it or not. [Example 5.2] Let us consider the EWF graph of Fig.2. Suppose that Arrange is selected to be aborted. Since Meeting waits on ohp held by [BookOArrange(ohp), [BookOArrange(ohp) has to be aborted. When BookOArrange(ohp) is aborted, the local state, i.e. local variables of Arrange is the same as one on which BookOArrange(ohp) was applied before is executed. After that, BookOArrange(ohp) can be restarted. ` Following the example, there are two cases. Here, let op be a waited operation deadlocked in a transaction T, i.e. an operation to be aborted. [Operations to be aborted] (1) If op is current, op and all the operations preceded by op are aborted. Then, op is restarted. (2) If op commits already and there is some operation op’’ preceding op and that op’’ and op are called by the some operation op’, op’ and all the operations following op’ are aborted. Then, op’ is restarted. ` 5.3 Compensation Mechanism First, the DM constructs an EWF graph G, and selects a deadlocked transaction T. Let AO(T) denote a set of operations to be aborted in T. AO(T) is obtained by traversing G starting from the waited operations in LW(T). Since operations are called in parallel, some operations called by the operations in AO(T) may be still being executed in s. They have to be terminated. In our system, the message Pabort(T, AO(T)) which requires objects to abort the operations in AO(T) is broadcast to all the objects. [Behavior of an object o] -8-

(1) [On receipt of Pabort(T, AO(T))] If o has a current op called by an operation in AO(T), o aborts op, and o broadcasts a Compensate(T, op) to all the objects which execute the immediate predecessors of op in T. (2) [On receipt of Compensate(T, op)] If o has an immediate predecessor op’ of op, o waits for Compensate from all the successors. If o receives all the Compensates, (2-1) if op’ is some commit operation a], a˜ is executed. If a˜ commits, o broadcasts Unlock(T, {a, a˜ }) to unlock all the objects obtained by a or a˜ . o broadcasts Compensate(T, [a). (2-2) otherwise, i.e. op’ is not a commit, (op’)˜ is executed. If (op’)˜ commits, o broadcasts Unlock(T, {op’, (op’)˜ }) to release all the locks obtained by op’ or (op’)˜ . o broadcasts Compensate(T, op’). (3) [On receipt of Unlock(T, OP)] If o has an operation op called by an operation in OP, o releases the locks of op and removes the record on op from the log Lo. ` [Example 5.3] In Fig.2, suppose that Meeting is selected to be aborted to resolve the deadlock occurred among Meeting and Arrange. Here, C(Meeting) = {OpenMMeeting(sec)], BookRMeeting(room)], [BookOMeeting(ohp)}, LW(Meeting) = {[BookMeeting(b)}, and AO(Meeting) = {OpenMMeeting(sec)}. All the operations called by OpenMMeeting(sec) have to be aborted. First, pabort(Meeting, {OpenMMeeting(sec)}) is broadcast. The objects sec, room, and ohp receive it, and stop the execution. sec sends Compensate(Meeting, OpenMMeeting(sec)) to b to abort BookMeeting(b). b executes BookMeeting(b)˜ . When BookMeeting(b)˜ commits, b releases the locks obtained by BookMeeting(b). Also, the locks obtained by BookMeeting(b)˜ are released. a executes BookMeeting(a)˜ in the same way as b. room also compensates BookRMeeting(room). On receipt of Compensate from a, b, and room, sec aborts [OpenMMeeting(sec) and then OpenMMeeting(sec) is restarted. ` 6. Concluding Remarks In this paper, we have discussed the parallel execution of transactions in the distributed system. We have presented what kind of deadlock might occur and how to resolve the deadlock in a case that each transaction is executed in parallel. In our method, some deadlocked transaction is selected and aborted partially. Also, it is aborted by executing the compensate operations of the operations which have been executed in the transaction. Time for aborting and restarting transactions by the partial abortion and also the size of log by using the compensate operations can be reduced. Since the com-9-

pensate operation is also an operation on an object, it is executed in parallel in the same manner as the operation. The further deadlock might occur by executing the compensate operations to resolve the deadlock, since the compensate operation might require objects which have not been held by an object. This problem is discussed in [TAKI90]. References [BERN87] Bernstein, P. A., Hadzilacos, V., and Goodman, N., "Concurrency Control and Recovery in Database Systems," Addison Wesley, 1987. [ESWA76] Eswaren, K. P., Gray, J., Lorie, R. A., and Traiger, I. L., "The Notion of Consistency and Predicate Locks in Database Systems," CACM, Vol.19, No.11, 1976, pp.624-637. [GRAY81] Gray, J., "The Transaction Concept: Virtues and Limitations," Proc. of the 7th VLDB, 1981. [KNAP87] Knapp, E., "Deadlock Detection in Distributed Databases," ACM Computing Surveys, Vol.19, No.4, 1987, pp.303-328. [KORT90] Korth, H. F., Levy, E., and Silberschatz, A., "A Formal Approach to Recovery by Compensating Transactions," Proc. of the 16th VLDB, 1990, pp.96-106. [NAKA91] Nakamura, A. and Takizawa, M., "Reliable Broadcast Protocol for Selectively Partially Ordering PDUs (SPO protocol)," Proc. of the IEEE 11th ICDCS, Arlington, 1991, pp.239-246. [MOSS85] Moss, J. E., "Nested Transactions: An Approach to Reliable Distributed Computing," The MIT Press Series in Information Systems, 1985. [TAKI90] Takizawa, M. and Deen, S. M., "Synchronization Problems of Compensate Operations in the Object-Model," Proc. of International Conf. on Cooperating Knowledge Based Systems, Keele, England, 1990. [WEIH89] Weihl, W. E., "Local Atomicity Properties: Modular Concurrency Control for Abstract Data Types," ACM Trans. on Programming Language and Systems, Vol.11, No.2, 1989, pp.249-283.

- 10 -