Simulation Models of Two-Phase Locking of Distributed Transactions

International Conference on Computer Systems and Technologies - CompSysTech’08

Svetlana Vasileva, Aleksandar Milev

Abstract: The paper proposes simulation models of concurrency control of global transactions in distributed database management systems. We consider mainly concurrency control by the method of two-phase locking (2PL). We present simulation models of 2PL in distributed databases (DDB), developed in the GPSS World simulation environment. Queuing-system models of the execution of 2PL in DDB with data replication are presented: centralized 2PL, primary copy 2PL and distributed 2PL. Some results of the simulations of 2PL in DDB are presented.

Key words: Distributed transactions, Distributed database, Two-phase locking, Simulation models, GPSS blocks, GPSS transactions.

INTRODUCTION

GPSS World is a comprehensive modeling tool covering both discrete and analogue simulation. Using GPSS World makes it possible to evaluate the effect of design decisions in many complex real-world systems [6]. This basic formulation, together with the work and results of many authors who use simulation modeling ([5], [6] and others not mentioned here), convinces us of the advantages of the GPSS World environment for simulating transaction processing systems, which is what Database Management Systems (DBMS) in fact are.

The authors face the problem of investigating, by means of simulation, the concurrency control of transactions in distributed DBMS (DDBMS) in which the data are fragmented and replicated across the nodes of the system. In the optimistic methods for concurrency control (timestamp ordering and data validation), when conflicts between concurrent transactions are frequent, rollbacks are frequent as well, which keeps transactions in the system longer than under two-phase locking (2PL). This motivates investigating transaction concurrency control by the method of 2PL.

In the sources we analyzed ([1], [2], [7], [8], [11], [12], [13] and others not mentioned here) we did not find a detailed description of algorithms that simulate 2PL in a distributed database (DDB), which made it necessary to develop them. Our literature review also did not reveal simulation models of the main DDB models; therefore the present paper proposes GPSS models simulating the operation of the 2PL algorithms in DDBMS described in detail in [3] and [4].

BASIC ELEMENTS OF THE DEVELOPED GPSS SIMULATION MODELS

The presented simulation models use generated streams of transactions which imitate global transactions in DDB systems.
They all arrive in parallel streams, and their intensity λ is given in tr/ms (number of transactions per millisecond). To make the processed element random we use a uniformly distributed random number generator over the interval bounded by the number of elements in the DDB. The number of the site where a copy of a given data element resides, and to which the transaction is redirected to read or write, is a random number as well.

Our simulation models contain conditions for granting the lock of a data element, and conditions for granting the lock to subsequent transactions that request a lock for reading an element already recorded in the lock table. The lock table is simulated by a two-dimensional matrix which records the type of the lock and the number of the transaction that has locked the element. The number of the site that generated the transaction is recorded too. One condition checks whether the element is locked with a shared lock, the transaction requests a shared lock, and there is no transaction requesting an exclusive lock. The type of the lock requested by the transaction is recorded.

The locations of the first, the second and the third replicas of the data elements are random and are simulated on different sites. The models are constructed to read the first copy and to write both copies. Further on, the data elements are envisaged to have four or more copies.

The type of the lock requested for the element that the generated transaction will process is a random value, taken from the numbers 1 (for reading), 2 (for updating) or 3 (for writing). The type of the operation that the generated transaction performs on the element is random too and takes one of the values 0 (if the lock type is 1), 1 (addition or subtraction) or 2 (multiplication or division), chosen with equal probabilities.

We simulate the different states of the transaction in the phases it passes through. These are the acquisition of the locks (the first phase) and the second phase, in which the transaction finishes its work and has to release the locks. In future work, when longer transactions make deadlocks possible, the case of a transaction that has to be restarted will be handled accordingly.

COMPONENTS OF THE MODELS OF SERVICING OF THE CONCURRENT TRANSACTIONS

The conceptual schemes of the developed simulation models are presented in [10]. These schemes include the following basic components:
- In the Centralized 2PL protocol: SP2 – generator of transactions, with exponentially distributed arrival intervals; QLM0 – queue of the transactions waiting for processing; LM0 – lock manager device.
If the necessary locks are granted, the transaction is split and replicated by the corresponding transaction coordinator TCP2 and is executed in the corresponding nodes DMP7 (and DMP8); after that the sub-transactions are reunited in the transaction manager device, the held locks are released (block LT0) and the transaction leaves the system.
- In the Primary copy 2PL protocol: SP2 – generator of transactions, with exponentially distributed input intervals; TCP2 – transaction coordinator, which splits the transaction, determines the necessary primary copies and directs the transaction to the corresponding lock managers LMP7; QLMP7 – queue of the transactions waiting for processing by LMP7. If the necessary locks are granted, the transaction is replicated and executed in the corresponding nodes DMP8 (and DMP9); after that the sub-transactions are reunited in the transaction manager block TMP2, the held locks are released in the corresponding blocks LTP7 and the transaction leaves the system.
- In the Distributed 2PL protocol: SP2 – generator of transactions, with exponentially distributed input intervals; TCP2 – transaction coordinator, which splits and replicates the transaction, determines the necessary executor nodes DMP7 (and DMP8) and directs the transaction to the corresponding lock managers LMj; QLMj – queue of the transactions waiting for processing by LMP7 (and LMP8). If the necessary locks are granted, the transaction is executed in the corresponding data managers DMP7 (and DMP8); after that the sub-transactions are reunited in the transaction manager block TMP2, the held locks are released in the corresponding lock tables LTP7 (and LTP8) and the transaction leaves the system.

SCHEMES OF THE ALGORITHMS SIMULATING 2PL

The paper considers models with 2 and 3 copies of the data elements; therefore there are two or three parameters, respectively, which store the numbers of the sites to which the sub-transactions of the global transaction in the DDB (simulated in the GPSS models by a split GPSS transaction) should be directed. Fig. 1 - 3 show block-schemes of the algorithms modeling the performance of simple global transactions (accessing 1 data element) in DDBMS under the centralized 2PL protocol, the primary copy 2PL protocol and the distributed 2PL protocol, respectively, for 2 data element copies. In the model with 3 data element copies an updating transaction is split in three. There is then one more branch, PatS3, in the part "Transmission through channels", and the corresponding sub-transaction proceeds to the third executor site, in this case SP9.
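The generation of a global transaction and its splitting into sub-transactions, as described above, can be illustrated by the following minimal Python sketch. It is not the GPSS World code of the models. The names GlobalTransaction, generate_transaction and subtransaction_sites, the numbering of elements from 0 and of sites from 1, the "nearest copy first" ordering of the site list, and the equal treatment of lock types 2 and 3 when splitting are assumptions made only for illustration.

```python
import random
from dataclasses import dataclass

READ, UPDATE, WRITE = 1, 2, 3  # lock-type codes listed in the paper


@dataclass
class GlobalTransaction:
    tx_id: int
    arrival_ms: float    # arrival time in model milliseconds
    element: int         # data element to be processed (0-based, assumed)
    lock_type: int       # 1 (read), 2 (update) or 3 (write)
    operation: int       # 0 for reads, otherwise 1 or 2
    replica_sites: list  # sites holding the copies, nearest first (assumed)


def generate_transaction(tx_id, now_ms, rate_per_ms, n_elements, n_sites,
                         n_copies, rng):
    """Draw one global transaction.

    The inter-arrival time is exponential with intensity rate_per_ms;
    every other random choice is uniform.
    """
    arrival = now_ms + rng.expovariate(rate_per_ms)
    element = rng.randrange(n_elements)
    lock_type = rng.choice((READ, UPDATE, WRITE))
    operation = 0 if lock_type == READ else rng.choice((1, 2))
    # Copies of one element are placed on different sites.
    replica_sites = rng.sample(range(1, n_sites + 1), n_copies)
    return GlobalTransaction(tx_id, arrival, element, lock_type,
                             operation, replica_sites)


def subtransaction_sites(tx):
    """A read goes to the first copy only; an update or write goes to every copy."""
    if tx.lock_type == READ:
        return tx.replica_sites[:1]
    return list(tx.replica_sites)
```

With n_copies = 3 an updating transaction yields three sub-transactions in this sketch, which corresponds to the additional branch PatS3 mentioned above.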

Fig. 1. Block-scheme of the algorithm which models the performance of the centralized 2PL

Fig. 2. Block-scheme of the algorithm which models the performance of the primary copy 2PL

- In the Centralized 2PL model (Fig. 1), parameters P7 and P8 record, respectively, the number of the site holding the nearest copy of the data element and the number of the site holding the second replica of the data element processed by the transaction. The transfer through the network to the central lock manager LM0 and to the executor sites DM is simulated with a delay. The functions PredS0 and PrS0Dev give the parameters of the uniform distribution of the transfer time through the channels from the generating sites to LM0.
- In the Primary Copy 2PL model (Fig. 2), parameter P7 holds the number of the primary site (given by the function PrimaS). This is the site of the lock manager which controls the lock of the primary copy of the data element. Parameters P8 and P9 record, respectively, the number of the site holding the nearest copy of the data element and the number of the site holding the second replica of the data element processed by the transaction. The transfer through the network to the primary lock manager LMP7 and to the executor sites DM is simulated with a delay. The matrices RAZST (х) and RAZDEV (х) give the parameters of the uniform distribution of the transfer time through the channels from the generating sites to LMP7 and to the data managers DM.
- In the Distributed 2PL model (Fig. 3), parameters P7 and P8 have the same meaning as in the centralized 2PL protocol. The functions DistrS1 and DistrS2 give the numbers of the nodes holding replicas of the data element. The transfer through the network to the executor sites DM, where the lock managers reside, is simulated with a delay. The matrices RAZST and RAZDEV (with the same sizes and elements as in primary copy 2PL) give the parameters of the uniform distribution of the delivery time through the channels from the generating sites to the executor sites. To simulate the waiting of the sub-transactions for processing by the lock managers LM we define and use the functions Opash1 and Opash2, which give the numbers of the queues. The queues are organized as FIFO.
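The behaviour of a lock manager with its lock table and FIFO waiting queue can be sketched as follows. This is a minimal Python illustration, not the authors' GPSS implementation. The class name LockTable, the code 0 for a free element, the treatment of both lock types 2 and 3 as exclusive, and the rule for waking waiting transactions on release are assumptions. The condition for granting a shared lock (element held in shared mode, shared request, no waiting exclusive request) follows the condition stated in the description of the basic elements.

```python
from collections import deque

READ, UPDATE, WRITE = 1, 2, 3  # lock-type codes listed in the paper
FREE = 0                       # assumed code for "no lock held"


class LockTable:
    """Toy lock table with one FIFO waiting queue per data element."""

    def __init__(self, n_elements):
        self.mode = [FREE] * n_elements                       # lock type held
        self.holders = [set() for _ in range(n_elements)]     # transaction ids
        self.waiting = [deque() for _ in range(n_elements)]   # (tx, lock_type)

    def _grant(self, tx, element, lock_type):
        self.mode[element] = lock_type
        self.holders[element].add(tx)

    def request(self, tx, element, lock_type):
        """Phase 1: return True if the lock is granted now, else queue the request."""
        mode = self.mode[element]
        exclusive_waiting = any(t != READ for _, t in self.waiting[element])
        if mode == FREE:
            self._grant(tx, element, lock_type)
            return True
        if mode == READ and lock_type == READ and not exclusive_waiting:
            self._grant(tx, element, lock_type)
            return True
        self.waiting[element].append((tx, lock_type))
        return False

    def release(self, tx, element):
        """Phase 2: release the lock and wake waiters in FIFO order.

        Returns the list of transactions that were granted the lock.
        """
        self.holders[element].discard(tx)
        if self.holders[element]:
            return []          # other readers still hold the element
        self.mode[element] = FREE
        granted = []
        queue = self.waiting[element]
        while queue:
            next_tx, next_type = queue[0]
            shared_ok = self.mode[element] == READ and next_type == READ
            if self.mode[element] == FREE or shared_ok:
                queue.popleft()
                self._grant(next_tx, element, next_type)
                granted.append(next_tx)
            else:
                break
        return granted
```

In this sketch a reader arriving after a queued writer waits behind it, so the FIFO order of the queue is preserved.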

Fig. 3. Block-scheme of the algorithm which models the performance of the distributed 2PL
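The network delay used in all three schemes can be illustrated by a short Python sketch. It assumes that RAZST holds the mean transfer time and RAZDEV the half-width of the uniform interval for each pair of sites, in the manner of the two operands of the GPSS ADVANCE block; the paper does not state this explicitly. The 3-site matrices and their values are hypothetical, and transfer_delay is an illustrative name.

```python
import random

# Hypothetical 3-site example: rows are source sites, columns destination sites.
RAZST = [[0.0, 4.0, 6.0],
         [4.0, 0.0, 5.0],
         [6.0, 5.0, 0.0]]    # assumed mean transfer times, ms
RAZDEV = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 1.5],
          [2.0, 1.5, 0.0]]   # assumed half-widths of the uniform interval, ms


def transfer_delay(src, dst, rng, mean=RAZST, half_width=RAZDEV):
    """Draw a transfer time uniformly from [mean - half_width, mean + half_width]."""
    m = mean[src][dst]
    d = half_width[src][dst]
    return rng.uniform(m - d, m + d)
```

A sub-transaction sent from site 0 to site 2 would then be delayed by a value between 4 and 8 ms under these example matrices.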

SIMULATION RESULTS

Our research has been carried out for 2 and 3 copies of the data element. Several parameters were varied in order to obtain a complete picture of the states through which the three chosen models pass. These states have been investigated and analyzed in the simulation models. The parameters and indexes of the simulations of the considered models are as follows:
NumTr – total number of transactions generated during the modeling time;
FixTr – total number of completed (committed) transactions for the same period;
X = FixTr/Tn – throughput of the queuing system;
Tn – time interval during which the system is observed;
Ps = FixTr/NumTr – probability of transaction service.

The results are obtained for 6 streams of concurrent transactions with different intensity. The copies of the data elements are distributed evenly and randomly over 6 sites in the system. Service and rejection probabilities are calculated by approximate formulas, so the values obtained differ from those given by more detailed expressions. The results of the simulations of the models for equal intensity of the 6 input flows, for 2 and 3 element copies, are summarized and presented graphically in Fig. 4 - Fig. 6. The corresponding summarized results for different intensities are shown in Fig. 7 - Fig. 9.
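As a worked example of the two indexes defined above, the following Python sketch computes X and Ps. The function names and the input numbers are hypothetical and are not results from the paper.

```python
def throughput(fix_tr, tn):
    """X = FixTr / Tn: committed transactions per unit of observed time."""
    return fix_tr / tn


def service_probability(fix_tr, num_tr):
    """Ps = FixTr / NumTr: share of generated transactions that were committed."""
    return fix_tr / num_tr


# Hypothetical run: 3000 generated, 750 committed, observed for 150 time units.
x = throughput(750, 150.0)            # 5.0 transactions per time unit
ps = service_probability(750, 3000)   # 0.25
```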

Fig. 4. Throughput of the model and probability service for Centralized 2PL

Fig. 5. Throughput of the model and probability service for Primary Copy 2PL

Fig. 6. Throughput of the model and probability service for Distributed 2PL

(Fig. 4 - 6: horizontal axis 60, 90, 120, 150 ms; vertical axes: number of transactions and probability; series: Ps (2 cop), Ps (3 cop), X (2 cop), X (3 cop).)

Fig. 7. Throughput of the model and probability service for Centralized 2PL

Fig. 8. Throughput of the model and probability service for Primary Copy 2PL

Fig. 9. Throughput of the model and probability service for Distributed 2PL

(Fig. 7 - 9: horizontal axis I distr, II distr, III distr, IV distr; vertical axes: number of transactions and probability; series: Ps (2 cop), Ps (3 cop), X (2 cop), X (3 cop).)

The diagrams show the results of the simulations of the presented GPSS models of 2PL in DDB for different input intervals Tin between transaction arrivals. They conform to the effectiveness indexes of algorithms suggested in [9]. The service probabilities of transactions and the service coefficient of transactions are shown for the different 2PL types.

The analysis shows that the Centralized 2PL model runs more stably when the data elements have two copies, because the sub-transactions split from one updating transaction (one family) have to wait for each other. The coefficients of committed (fixed) transactions coincide at lower input flow intensity, owing to the successful servicing of the transactions by the lock manager. Fig. 5 shows that the coefficients of committed transactions for 3 copies are almost the same across the input flow intensities used and are smaller than the coefficients for 2 copies. This could be because the sub-transactions split from one updating transaction have to wait for each other in order to be processed, which delays the global transaction. On the other hand, transaction execution for 3 copies is more evenly distributed than for 2 copies. The initial allocation of the copies to sites influences this, as does the allocation of the primary sites. From these graphs we can conclude that transaction execution is more uniform in the distributed 2PL model than in the centralized and primary copy models.

Fig. 7 shows that the coefficient of committed transactions is higher for 3 copies, owing to the larger number of committed reading transactions and the absence of any need for splitting and mutual waiting. The number of transactions served per unit of model time is smaller in the model with 3 copies because of the need to wait, and transaction transmission depends on the initial allocation of the copies to sites and on the allocation of the primary sites. The analysis shows that the model algorithm for the distributed 2PL system is much more stable than the others.


CONCLUSIONS AND FUTURE WORK

Future work will be as follows:
1. Simulation models of the servicing of distributed transactions of different complexity (longer transactions and a greater number of input streams of global transactions) in 2PL protocols with deadlock detection and resolution, and the development of a simulation model of the majority copy 2PL algorithm.
2. Gathering statistics on the results of the simulations and evaluating the efficiency of the 2PL algorithms according to the criteria suggested in [9].

REFERENCES

[1] Agrawal, R., M. Carey, M. Livny. Concurrency Control Performance Modeling: Alternatives and Implications. http://www.cs.berkeley.edu/~brewer/cs262/ConcControl.pdf
[2] Carey, M., M. Livny. Distributed Concurrency Control Performance: A Study of Algorithms, Distribution and Replication. Proceedings of the 14th VLDB Conference, Los Angeles, 1988, pp. 13-25.
[3] Connolly, T., C. Begg. Database Systems. Addison-Wesley, 2002.
[4] Garcia-Molina, H., J. Ullman, J. Widom. Database Systems: The Complete Book. New Jersey, Prentice Hall 07458.
[5] General Purpose Simulation System World. Minuteman Software, www.minutemansoftware.com.
[6] GPSS World – общецелевая система имитационного моделирования. www.gpss.ru
[7] Miller, J., N. Griffeth. Performance Modeling of Database and Simulation Protocols: Design Choices for Query Driven Simulation. http://chief.cs.uga.edu/~jam/papers/prot.ps.
[8] Srinivasa, R., C. Williams, P. Reynolds. Distributed Transaction Processing on an Ordering Network. Technical Report CS-2001-08, February 2001.
[9] Vasileva, S. Effectivity of Algorithms for Concurrency Control of Transactions in Distributed Database Management Systems. Opportunities for Assessment. Proceedings of the 3rd Balkan Conference in Informatics "Research in Informatics and Information Society Technologies", 2007, pp. 205-216.
[10] Vasileva, S., P. Milev, B. Stoyanov. Some Models of a Distributed Database Management Systems with Data Replication. International Conference on Computer Systems and Technologies - CompSysTech 2007, Rousse, Bulgaria, II.12-1-II.12-6.
[11] Алиев, А., А. Юсубов. Моделирование системы управления транзакциями расспределенной базы данных. http://ict.edu.ru/ft/003584/itmo11.pdf, с. 50-54.
[12] Алиев, А., Аг. Алиев. Имитационное моделирование модулей системы управления транзакциями. http://www.ncs.ru/ws/list_doc.dhtml. V Всероссийская конференция молодых ученых по математическому моделированию и информационным технологиям. 2004.
[13] Гасанова, Н. Разработка алгоритмов управления транзакциями, основанных на методе временных меток в расспределенных базах данных. 2004, http://science.az/autoreferats/referat_hasanova_nazli.pdf.

ABOUT THE AUTHORS

Assist. Prof. Svetlana Vasileva, PhD student, College - Dobrich, University of Shumen, Phone: +359 58 603 248, E-mail: [email protected]
Assist. Prof. Aleksandar Milev, PhD, Department of Mathematics and Informatics, University of Shumen, Phone: +359 54 830 340, E-mail: [email protected].
