Exploiting Multi-Level Transactions in Distributed Database Systems
M. Kirchberg
Massey University, New Zealand

Abstract

Multi-level transactions occur naturally in distributed database systems. They exploit the fact that many low-level conflicts become irrelevant if high-level operation semantics are taken into account. Thus, multi-level transactions may lead to an increase in concurrency. Hence, the transaction throughput increases too.

In distributed database systems, the challenge in synchronising concurrent user transactions is to extend the serializability argument, concurrency control algorithms, and reliability issues to the distributed execution environment. Nowadays most of these issues have been investigated heavily, but for flat transactions only. Comparatively less work has been done on the basis of more sophisticated transaction models like the multi-level transaction model.

This article exploits multi-level transactions in distributed databases. We face each of the outlined challenges, and introduce and investigate possible solutions, both popular and alternative.

Extensions w.r.t. the serializability argument have been well known for decades. The main focus of this paper is on the other two challenges. Despite several disadvantages, locking protocols still dominate the market. We show that there are promising alternatives around, in particular the hybrid FoPL protocol, especially when using a multi-level scheduler in a distributed environment. Considering recovery, only a few mechanisms have been proposed for multi-level recovery. These are ARIES/NT, MLR by Weikum et al., MLR by Lomet, and ARIES/ML. We show that ARIES/ML is the most suitable one and fit to be used in distributed systems.

Keywords

Distributed Databases, Distributed System Architectures, Transaction Processing, Concurrency Control, Logging and Recovery.

1 Introduction

Modern enterprises are typically based upon a distributed architecture. Their information processing needs can best be satisfied by truly distributed systems.

This work has been supported in part by FRST/NERF of New Zealand.


Proceedings in Informatics

In distributed database systems, the challenge in synchronising concurrent user transactions is to extend the serializability argument and concurrency control algorithms to the distributed execution environment. In addition, reliability issues, i.e., dealing with communication, transaction, system and media failures, have to be considered.

Nowadays most of these issues have been investigated heavily, but for flat transactions only. Comparatively less work has been done on the basis of alternative, more sophisticated transaction models. We took on this challenge and based our work on a more advanced transaction model, the so-called multi-level transaction model [Bee89]. This model is regarded as the most promising transaction model in theory, but it is still awaiting practical maturity.

A multi-level transaction is a special kind of open nested transaction in which all the leaves of the transaction tree have the same depth. Each node in the tree corresponds to some operation implemented by its successors. The root is the transaction itself. The lowest level L0 corresponds to operations that access the physical database directly. The main advantage of multi-level transactions is the expected increase in transaction throughput, a crucial issue for all database systems.

The basic idea of multi-level serializability is that sequences of low-level (e.g., page-level or record-level) database operations represent application-dependent operations on higher levels, and there are usually fewer conflicts on higher levels. Consequently, some of the conflicts on lower levels, so-called pseudo-conflicts, can be ignored. Hence, their detection increases the rate of concurrency.

The general approach to concurrency control is the use of locking protocols, especially two-phase locking. However, they suffer from some severe weaknesses that decrease the transaction throughput. These weaknesses include the deadlock problem and the impossibility of accepting all (conflict-)serializable schedules. Therefore, alternatives to locking protocols dominate the research in concurrency control. Solutions comprise timestamp [Lin83, Li87], optimistic [Kun79, Har84, Elm87] and hybrid [Bok87, Hal89, Hal91, Rip98, Sch00] protocols.

Since concurrency control may force transactions to be aborted, it is necessary to employ recovery mechanisms dealing with these situations. For this, the sophisticated ARIES algorithm [Moh92] is generally accepted as a good starting point. So far only a few research projects have focussed on recovery mechanisms for multi-level transactions. Mainly, these are ARIES/NT [Rot89], MLR by Lomet [Lom92], and a non-ARIES-based approach to multi-level recovery by Weikum et al. [Wei91]. However, none of these approaches comes without major restrictions or disadvantages limiting its practicability. ARIES/ML [Dre98, Sch00] overcomes these problems. It preserves the major features of ARIES, can be coupled with all kinds of concurrency control protocols, and supports both inverse and non-inverse operations as well as concurrent rollbacks of multiple (sub-)transactions. Furthermore, it enables undo and redo of single subtransactions in order to resolve situations such as deadlocks more efficiently.
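The notion of a pseudo-conflict can be made concrete with a small sketch. It is only an illustration under stated assumptions: the operation name `incr_by`, the page identifier `p7` and the read/write conflict rule are hypothetical and not taken from any of the cited protocols. Two increments on the same record commute at the application level, although their page-level implementations conflict.

```python
# Hypothetical illustration of a pseudo-conflict; all names are assumptions.

def commute(op_a, op_b, state):
    """True if applying op_a then op_b yields the same state as op_b then op_a."""
    return op_b(op_a(state)) == op_a(op_b(state))

def incr_by(n):
    """High-level (application) operation: add n to a counter value."""
    return lambda value: value + n

def low_level_ops(txn, page):
    """Low-level view: an increment is a read followed by a write of one page."""
    return [(txn, "read", page), (txn, "write", page)]

def low_level_conflict(a, b):
    """Classical rule: different transactions, same page, at least one write."""
    return a[0] != b[0] and a[2] == b[2] and "write" in (a[1], b[1])

t1 = low_level_ops("T1", "p7")
t2 = low_level_ops("T2", "p7")

# Conflicts seen by a page-level scheduler.
page_conflicts = [(a, b) for a in t1 for b in t2 if low_level_conflict(a, b)]

# What a scheduler aware of the increment semantics sees.
high_level_commute = commute(incr_by(5), incr_by(3), 100)
```

Here the page level reports three conflicting pairs, whereas the two increments commute. These are exactly the conflicts a multi-level scheduler may ignore once each increment has been executed in isolation.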

M.Kirchberg: Exploiting Multi-Level Transactions in Distributed DB

In distributed databases multi-level transactions occur naturally [Ber8 For example, in distributed object bases we may think of a global level, a local logical object level, a local level of physical objects and a record level. In this paper we exploit multi-level transactions and corresponding algorithms in a distributed execution environment. We show how multi-level transactions, multi-level concurrency control and multi-level recovery mechanisms can be used to synchronise distributed transactions efficiently. To this end, we compare and contrast corresponding popular and alternative approaches.

The first challenge in synchronising concurrent user transactions is to extend the serializability argument to the distributed execution environment. These extensions have been well known for decades [Ber81, Ber85, Gra93, Osz99].

The second challenge concerns concurrency control algorithms. Difficulties arise with guaranteeing global serializability. If locking-based algorithms are used, we run into the problem of dealing with distributed deadlocks. The detection and management of distributed deadlocks involving a number of sites is difficult [Osz97]. There are several algorithms for deadlock detection in distributed databases [Cha82, Obe82]. However, their complexity is non-negligible. Timestamp protocols are particularly easy to adapt to the distributed case, whereas optimistic and hybrid protocols are rarely used even in prototype systems. However, research results indicate that the complexity of these approaches is less than the complexity of detecting distributed deadlocks. Nevertheless, locking protocols are still the dominant concurrency control protocols in distributed execution environments.

The third challenge is to guarantee reliability. This refers to the desired atomicity and durability properties of transactions. In a distributed execution environment, two specific aspects of reliability protocols need to be discussed. These are commit protocols, e.g., two-phase commit [Osz99], and appropriate recovery mechanisms, e.g., ARIES/ML.

Outline. In Section 2 we introduce the multi-level transaction model, including two approaches to multi-level concurrency control and a multi-level recovery mechanism. Section 3 then briefly outlines a sample architecture for distributed object bases on the operational and physical level. In Section 4 we focus on the three challenges mentioned above. We investigate possible solutions, and compare and contrast them with one another. Finally, Section 5 concludes our work.

2 The Multi-Level Transaction Model

In this section we discuss the multi-level transaction model including multi-level transactions, multi-level schedules and conflict serializability. Furthermore, we introduce two different approaches to multi-level concurrency control. In particular, we discuss the most popular locking protocol, strict two-phase locking (str-2PL), and a hybrid protocol, called FoPL. Subsequently, we give a brief overview of published multi-level recovery algorithms. Finally, the multi-level recovery approach of our choice is introduced in more detail.

2.1 Multi-Level Transactions

    

  

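An n-level-system $(L_{n-1}, \ldots, L_0)$ consists of n layers $L_i = (\mathcal{O}_i, \mathcal{P}_i)$, where $\mathcal{O}_i$ is a set of objects and $\mathcal{P}_i$ a set of operators. An $L_i$-operation is an element of $\mathcal{OP}_i = \mathcal{P}_i \times \mathcal{O}_i$.

An n-level transaction is defined next, exploiting the notion of an index tree, which is a finite set of finite sequences over $\mathbb{N}$. We let $\mathbb{N}^*$ denote the set of such sequences. $|\alpha|$ denotes the length of $\alpha$. Furthermore, we identify $\mathbb{N}$ with sequences of length 1 and denote the empty sequence by $\varepsilon$. An index tree of depth n is a finite subset $I \subseteq \mathbb{N}^*$ with $\varepsilon \in I$, $\alpha(k+1) \in I \Rightarrow \alpha k \in I$ and $\alpha \in I \wedge |\alpha| < n \Rightarrow \alpha 1 \in I$ for all $\alpha \in \mathbb{N}^*$ and $k \in \mathbb{N}$.

An n-level-transaction $T_j$ over an index tree I of depth n assigns to each $\alpha \in I$ an $L_{n-|\alpha|}$-operation, denoted as $op_{j\alpha}$, and partial orders $\prec_i^j$ on each $\mathcal{OP}_i^j = \{ op_{j\alpha} \mid |\alpha| = n - i \}$ ($0 \le i < n$), such that [...] holds. We call $\prec_i^j$ the $L_i$-precedence relation.

Edges in a transaction tree represent the implementation of an $L_i$-operation by a set of $L_{i-1}$-operations. If $op_{j\mu k}$ is an $L_i$-operation of a transaction $T_j$, then $trans(op_{j\mu k}) = op_{j\mu}$ ($0 \le i < n$) is the $L_{i+1}$-operation (also called the parent operation) that invokes $op_{j\mu k}$. Conversely, $act(op_{j\nu}) = \{ op_{j\mu k} \mid trans(op_{j\mu k}) = op_{j\nu} \}$ defines the set of $L_{i-1}$-operations, so-called child operations, implementing the $L_i$-operation $op_{j\nu}$.

Precedence relations are meant to express a necessary ordering of operations. This carries over to the implementing operations, i.e., we require $op_{j\alpha} \prec_i^j op_{j\beta} \Rightarrow op_{j\alpha k} \prec_{i-1}^j op_{j\beta l}$, whenever the involved operations exist. In this case, the transaction $T_j$ is well-defined.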
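To make the definitions of this subsection tangible, the following sketch encodes an index tree as a set of integer tuples, with the empty tuple standing for the empty sequence. It is a minimal illustration, not part of the formal model: the function names are hypothetical, and the checks that every node has its parent in the tree and that no node lies deeper than n are assumptions we add because a tree structure is implied.

```python
from typing import Set, Tuple

Index = Tuple[int, ...]  # finite sequence of positive integers; () is the empty sequence

def is_index_tree(I: Set[Index], n: int) -> bool:
    """Check the index-tree conditions for depth n (plus the assumed parent/depth checks)."""
    if () not in I:
        return False
    for alpha in I:
        if len(alpha) > n or any(k < 1 for k in alpha):
            return False
        # Assumption: the parent of every non-root node belongs to the tree.
        if alpha and alpha[:-1] not in I:
            return False
        # alpha(k+1) in I implies alpha k in I: siblings are numbered without gaps.
        if alpha and alpha[-1] > 1 and alpha[:-1] + (alpha[-1] - 1,) not in I:
            return False
        # Every node above the leaf level has a first child.
        if len(alpha) < n and alpha + (1,) not in I:
            return False
    return True

def level(alpha: Index, n: int) -> int:
    """The node with index alpha carries an operation of level n - |alpha|."""
    return n - len(alpha)

def trans(alpha: Index) -> Index:
    """Index of the parent operation that invokes the operation at alpha."""
    if not alpha:
        raise ValueError("the root has no parent operation")
    return alpha[:-1]

def act(alpha: Index, I: Set[Index]) -> Set[Index]:
    """Indices of the child operations implementing the operation at alpha."""
    return {beta for beta in I if beta and beta[:-1] == alpha}

# A 2-level transaction: root, two L1-operations, three L0-operations.
tree = {(), (1,), (2,), (1, 1), (1, 2), (2, 1)}
```

With this encoding, `act((1,), tree)` returns the two leaves below the first L1-operation, and removing `(1, 1)` from `tree` violates the sibling condition.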

2.2 Multi-Level Schedules

The execution of concurrent transactions is described by an n-level-schedule $S = (\mathcal{OP}_n, \mathcal{OP}_{n-1}, \ldots, \mathcal{OP}_0, \prec_0)$. It consists of a set $\mathcal{OP}_n = \{T_1, \ldots, T_k\}$ of n-level-transactions and a partial order $\prec_0$ on $\mathcal{OP}_0$ containing all $L_0$-precedence relations $\prec_0^j$, where $\mathcal{OP}_i = \bigcup_{j=1}^{k} \mathcal{OP}_i^j$ is the set of all $L_i$-operations in these transactions ($0 \le i < n$). $\prec_0$ induces a partial order $\prec_i$ on each level $i > 0$ by

$$op_\mu \prec_i op_\nu \;\Leftrightarrow\; \forall\, op_{\mu k} \in act(op_\mu)\ \forall\, op_{\nu l} \in act(op_\nu):\ op_{\mu k} \prec_{i-1} op_{\nu l}.$$

Using this, we may define the level-by-level-schedule $S_{i,i-1}$ as the one-level-schedule $(\mathcal{OP}_i, \mathcal{OP}_{i-1}, \prec_{i-1})$.

The basic idea of multi-level concurrency control is to use the semantics of operations in level-specific conflict relations $CON_i \subseteq \mathcal{OP}_i \times \mathcal{OP}_i$. Non-conflicting operations should commute. As with precedence relations, the intention behind the conflict relations forces us to require the following conformity condition: If $(op_\mu, op_\nu) \in CON_i$ holds for some $op_\mu, op_\nu \in \mathcal{OP}_i$, then there should exist $op_{\mu k} \in act(op_\mu)$ and $op_{\nu l} \in act(op_\nu)$ with $(op_{\mu k}, op_{\nu l}) \in CON_{i-1}$.
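The conformity condition lends itself to a mechanical check. The sketch below is an illustration only: conflict relations are represented as sets of ordered pairs of operation names, `act` as a dictionary from an operation to its child operations, and all operation names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]

def conformity_violations(con_hi: Set[Pair],
                          con_lo: Set[Pair],
                          act: Dict[str, Set[str]]) -> List[Pair]:
    """Return the higher-level conflict pairs that no pair of child operations witnesses."""
    violations = []
    for mu, nu in sorted(con_hi):
        witnessed = any((c_mu, c_nu) in con_lo
                        for c_mu in act.get(mu, ())
                        for c_nu in act.get(nu, ()))
        if not witnessed:
            violations.append((mu, nu))
    return violations

# Child operations of three hypothetical higher-level operations.
act = {
    "deposit": {"r1", "w1"},   # reads and writes the account page
    "audit": {"r3"},           # reads the same account page
    "lookup": {"r9"},          # reads an unrelated page
}
# Lower-level conflicts: a write conflicts with a later read of the same page.
con_lo = {("w1", "r3")}

ok = conformity_violations({("deposit", "audit")}, con_lo, act)
bad = conformity_violations({("deposit", "lookup")}, con_lo, act)
```

The pair (deposit, audit) is backed by the lower-level conflict between `w1` and `r3`, so `ok` is empty. Declaring a conflict between deposit and lookup has no lower-level witness and is reported in `bad`.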




2.3 Multi-Level Conflict Serializability

We have to extend the notion of conflict-serializability to multi-level transactions to make these arguments rigorous. First, an n-level-schedule with a total order $\prec_n$ is called serial. Then serializability means equivalence to a serial schedule in the following formal sense.

Let $S = (\mathcal{OP}_n, \mathcal{OP}_{n-1}, \ldots, \mathcal{OP}_0, \prec_0)$ be an n-level-schedule with induced partial orders $\prec_i$ on level i. Let $CON_i$ ($i = 0, \ldots, n-1$) be conflict relations. Define a relation $\to_i$ on $\mathcal{OP}_i$ by

$$op_\mu \to_i op_\nu \;\Leftrightarrow\; (op_\mu, op_\nu) \in CON_i \wedge op_\mu \prec_i op_\nu \quad \text{for } op_\mu, op_\nu \in \mathcal{OP}_i.$$

Then two n-level-schedules are called (conflict-)equivalent iff their associated relations $\to_i$ coincide for all $i = 0, \ldots, n-1$. An n-level-schedule which is equivalent to a serial one is called (n-level-)serializable.

According to [Bee89] an n-level-schedule S is n-level-serializable if all level-by-level schedules $S_{i,i-1}$ of S ($0 < i \le n$) are conflict serializable.
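The level-by-level criterion can be sketched with the classical conflict-graph test applied once per pair of adjacent levels. The code below is a simplified illustration under assumptions of our own: each level is given as a total execution order of child operations, a map from child to parent operation, and a conflict predicate. The operation names and the page assignment are hypothetical.

```python
from typing import Callable, Dict, List, Set

def conflict_graph(order: List[str], parent: Dict[str, str],
                   conflicts: Callable[[str, str], bool]) -> Dict[str, Set[str]]:
    """Edge p -> q if a child of p runs before a conflicting child of q (p != q)."""
    graph: Dict[str, Set[str]] = {p: set() for p in parent.values()}
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            if parent[a] != parent[b] and conflicts(a, b):
                graph[parent[a]].add(parent[b])
    return graph

def is_acyclic(graph: Dict[str, Set[str]]) -> bool:
    """Depth-first search; a grey node reached again closes a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}

    def visit(v: str) -> bool:
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY or (colour[w] == WHITE and not visit(w)):
                return False
        colour[v] = BLACK
        return True

    return all(colour[v] != WHITE or visit(v) for v in graph)

def level_by_level_serializable(levels) -> bool:
    """Every one-level schedule must have an acyclic conflict graph."""
    return all(is_acyclic(conflict_graph(*lvl)) for lvl in levels)

# T1 = inc1(x), inc3(y); T2 = inc2(x), inc4(y); each increment is a read plus a write.
page = {"r1": "x", "w1": "x", "r2": "x", "w2": "x",
        "r3": "y", "w3": "y", "r4": "y", "w4": "y"}

def page_conflict(a: str, b: str) -> bool:
    return page[a] == page[b] and "w" in (a[0], b[0])

def increments_commute(a: str, b: str) -> bool:
    return False  # increments never conflict with each other

l0_order = ["r1", "w1", "r2", "w2", "r4", "w4", "r3", "w3"]
l0_parent = {"r1": "inc1", "w1": "inc1", "r2": "inc2", "w2": "inc2",
             "r3": "inc3", "w3": "inc3", "r4": "inc4", "w4": "inc4"}
l1_order = ["inc1", "inc2", "inc4", "inc3"]
l1_parent = {"inc1": "T1", "inc3": "T1", "inc2": "T2", "inc4": "T2"}

multi_level_ok = level_by_level_serializable(
    [(l1_order, l1_parent, increments_commute), (l0_order, l0_parent, page_conflict)])

# Flat view: page operations attributed directly to their transactions.
flat_parent = {op: l1_parent[l0_parent[op]] for op in l0_order}
flat_ok = is_acyclic(conflict_graph(l0_order, flat_parent, page_conflict))
```

In this example the flat conflict graph contains the cycle T1, T2, T1, so the schedule is rejected as a flat schedule, while both level-by-level schedules are conflict serializable.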

2.4 Multi-Level Concurrency Control Protocols

After a short introduction to the multi-level transaction model, we now introduce two different concurrency control protocols: str-2PL and FoPL.

2.4.1 The Strict Two-Phase Locking Protocol

Only $L_i$-operations accessing the same object might give rise to a conflict. Therefore, it is sufficient to define a level-specific lock $lock_{op}$ for each operator $op$. Then each $L_i$-operation $op_\mu = op(x)$ may only be executed after getting a lock $lock_{op}$ on object x. If $op_\mu = op_1(x)$ and $op_\nu = op_2(x)$ are $L_i$-operations on the same object $x \in \mathcal{O}_i$, their corresponding locks $lock_{op_1}$ and $lock_{op_2}$ are called incompatible iff $(op_\mu, op_\nu) \in CON_i$ or $(op_\nu, op_\mu) \in CON_i$ holds. Otherwise they are called compatible.

Strict two-phase locking is defined by the following two rules:

1) Each $L_i$-operation $op_{\mu k}$ working on object $x \in \mathcal{O}_i$ has to acquire a lock $lock_{op}$ on object x before actually being executed.
2) This lock is released at the end of the $L_{i+1}$-operation $op_\mu$ to which $op_{\mu k}$ belongs, together with all other locks acquired during $op_\mu$'s execution.
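The two rules can be sketched as a small lock table for a single level. This is a minimal illustration and not the scheduler of any cited system: the class and method names are hypothetical, incompatibility is supplied as a table of operator pairs, and a refused request simply returns `False` where a real scheduler would block or queue the operation.

```python
from collections import defaultdict

class LevelLockManager:
    """Lock table for one level; incompatibility comes from a table of operator pairs."""

    def __init__(self, conflicting):
        # conflicting: set of frozensets of operator names, e.g. {frozenset({"write"})}
        self.conflicting = conflicting
        self.held = defaultdict(list)  # object -> list of (operator, parent operation)

    def compatible(self, op1, op2):
        return frozenset((op1, op2)) not in self.conflicting

    def acquire(self, operator, obj, parent):
        """Rule 1: grant the lock unless another parent holds an incompatible lock on obj."""
        for held_op, holder in self.held[obj]:
            if holder != parent and not self.compatible(operator, held_op):
                return False
        self.held[obj].append((operator, parent))
        return True

    def release_all(self, parent):
        """Rule 2: at the end of the parent operation, drop all locks acquired on its behalf."""
        for obj in list(self.held):
            self.held[obj] = [(o, p) for (o, p) in self.held[obj] if p != parent]
            if not self.held[obj]:
                del self.held[obj]

# Read/write locks on one level: write conflicts with read and with write.
lm = LevelLockManager({frozenset({"read", "write"}), frozenset({"write"})})
g1 = lm.acquire("read", "x", "op1")    # granted
g2 = lm.acquire("read", "x", "op2")    # granted, read locks are compatible
g3 = lm.acquire("write", "x", "op2")   # refused while op1 holds its read lock
lm.release_all("op1")                  # op1 ends, its locks are released together
g4 = lm.acquire("write", "x", "op2")   # granted now
```

Releasing all locks of a parent operation in one step at its end is what makes the protocol strict. One lock manager of this kind would be needed per level.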
