SPECULATION-BASED PROTOCOLS FOR IMPROVING THE PERFORMANCE OF READ-ONLY TRANSACTIONS
Thesis submitted in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Computer Science and Engineering
by T. Ragunathan 200499003
[email protected]
Center for Data Engineering
International Institute of Information Technology
Hyderabad, India
December 2010
INTERNATIONAL INSTITUTE OF INFORMATION TECHNOLOGY Hyderabad, India
CERTIFICATE
It is certified that the work contained in this thesis, titled “Speculation-Based Protocols for Improving the Performance of Read-only Transactions” by T. Ragunathan, has been carried out under my supervision and is not submitted elsewhere for a degree.
Date
Advisor: Dr. P. Krishna Reddy
Copyright © T. Ragunathan, 2010
All Rights Reserved
To my wife and children, for their everlasting love and support.
Acknowledgements
I would like to convey my deep sense of appreciation and gratitude to my supervisor, Professor P. Krishna Reddy, for motivating and guiding me in my research work. He showed me how to do research and gave me valuable suggestions that moulded me as a researcher. He spent his precious time with me in long research discussions and in providing valuable feedback on my research drafts. Without his affectionate guidance and help, I would not have completed this thesis work. I thank him for the enormous patience that he has shown towards me.
I am deeply obliged to Professor Kamalakar Karlapalem, Head of CDE and Dean (Academics), who taught me the fundamentals and advancements of database systems. I also wish to express heartfelt thanks to him for providing valuable suggestions towards building the analytical model. I extend my sincere thanks to Dr. Vikram Pudi for providing valuable feedback. I express my sincere thanks to Professor Rajeev Sangal, Director, who is the major guiding force for all of the research students of IIIT-H. My sincere thanks are also due to Professor P. J. Narayanan, Dean (Research), whose writings regarding research and research students motivated me to carry out research work in a balanced manner. I am grateful to all the faculty of IIIT-H, whose encouragement towards research helped me to contribute some good work to the research community.
I thank my colleagues Mr. Uday Kiran, Mr. Venugopal Reddy, Mr. Kumaraswamy and Mr. Satheesh Kumar for helping me in various aspects during my stay at IIIT-H. I also thank Ms. K. V. Seetha Lakshmi for her help in making logistical arrangements for attending conferences and seminars. I extend my thanks to the members of the eSagu team and the staff members of the college main office, server room and library for their valuable help.
I wish to express my special thanks to my wife, Mrs. R. Saraswathy, and my sons, R. Kishore Raja and R. Vikram Prabhu, who lent me a share of their precious time for the completion of this thesis work. Finally, I thank the almighty who guided me throughout the tough periods.
Abstract
In the emerging web database and e-commerce scenario, information systems have to meet intensive information requirements from a large number of users. These information systems receive both update transactions (UTs) and read-only transactions (ROTs). A UT contains both read and write operations, whereas an ROT contains only read operations. Designing efficient protocols to process ROTs is a research issue. There are three aspects to designing a concurrency control protocol for processing ROTs: correctness, data currency and performance. For correctness, the protocol should process transactions by satisfying the serializability criteria. For data currency, the protocol should process ROTs without missing the effect of any preceding UTs. Finally, the protocol should give high performance. If both UTs and ROTs are processed with the popular two-phase locking (2PL) protocol, performance degrades as data contention increases, because UTs have to wait for ROTs and vice versa in case of a conflict. On the other hand, 2PL processes ROTs without any correctness or data currency issues. So, efforts are being made to propose improved protocols by exploiting the special property of an ROT that “it does not modify the data” and by compromising on correctness and data currency. There are efforts to improve performance by processing ROTs with multi-version based approaches, at lower isolation levels, and by compromising correctness. One of the popular protocols is the snapshot isolation (SI)-based protocol, in which an ROT reads a snapshot of the database, ignoring modifications done by preceding UTs. In SI-based protocols, even though performance is improved, both correctness and data currency are compromised.
The 2PL protocol suffers from the performance problems and SI-based protocol suffers from both correctness and data currency problems. So, the research challenge is to develop efficient concurrency control protocols to improve the performance of ROTs without correctness and data currency problems. In this thesis, we have proposed speculation- and semantics-based protocols for processing ROTs that provide high performance and process transactions without any correctness and data currency issues. Speculation-based protocols have been proposed in the literature to improve the performance of UTs. Basically, under speculation, a transaction carries out multiple executions by reading the uncommitted values of preceding conflicting transactions. A transaction retains one of the speculative executions based on the termination decisions of preceding transactions. Speculation improves the parallelism but requires extra processing resources. Also, the number of speculative executions and number of object versions explode with data contention. In this thesis, the notion of speculation is being extended to propose improved protocols for processing ROTs. In the proposed protocols, ROTs are processed with speculation and UTs are processed with 2PL. The ROTs access after-images produced by UTs and carry out speculative executions. As a result, the following advantages have been identified: (i) an ROT can carry out speculative executions and commit independently based on the status of preceding UTs, and (ii) the number of speculative executions and data versions is significantly reduced. We have proposed two speculation-based protocols for processing ROTs. In the synchronous approach, the speculative executions of an ROT can be carried out in a synchronous manner, i.e., all the speculative
executions of an ROT are carried out at the same pace. In the asynchronous approach, the speculative executions of an ROT are carried out at different paces. It was observed that the asynchronous approach improves performance over the synchronous approach. In the proposed speculation-based protocols for ROTs, UTs follow 2PL and are made to wait if they conflict with concurrent UTs and ROTs. To reduce the waiting time of UTs, we have proposed a semantics-based approach built on the notion of “compensatability”. In this protocol, a UT need not wait for an ROT if the computation of that ROT is compensatable. However, the compensatable ROT has to perform compensating computations before its commitment. We have proposed enhanced protocols by extending semantics to the proposed speculation-based protocols. The proposed protocols bring significant benefits. The ROTs are processed without violating correctness and without compromising data currency. The performance results show that the proposed protocols significantly improve performance over 2PL and SI-based protocols with manageable extra processing resources.
Contents
1 Introduction
  1.1 Background
  1.2 Overview of Proposed Approaches
    1.2.1 Speculation-Based ROT Approaches
    1.2.2 Semantics-Based Approaches
  1.3 Motivation
  1.4 Major Contributions of the Thesis
  1.5 Organization of the Thesis
2 Overview of Transaction Management, Research Problem and Related work
  2.1 Overview of Transaction Management
    2.1.1 Transaction Concept
    2.1.2 Correctness Criteria
    2.1.3 Transaction Manager and Scheduler
  2.2 Research Problem about Processing of Read-only Transactions
    2.2.1 Read-only Transactions and Update Transactions
    2.2.2 About Data currency of Read-only Transactions
    2.2.3 Research Problem
  2.3 Related Work
    2.3.1 Isolation levels
    2.3.2 Concurrency Control Protocols for Processing ROTs
    2.3.3 Speculation-Based Approaches
    2.3.4 Semantics-Based Approaches
    2.3.5 Analytical Modeling
  2.4 Chapter Summary
3 The 2PL, SI-Based and SL Protocols
  3.1 Two-phase Locking
    3.1.1 Performance of 2PL
  3.2 Snapshot Isolation-Based Protocol
    3.2.1 Basic idea
    3.2.2 Processing of ROTs in FCWR
    3.2.3 Performance of FCWR
  3.3 Speculative Locking Protocol
    3.3.1 Basic Idea
    3.3.2 Overview of the SL protocol
    3.3.3 Performance of SL Protocols Regarding Processing of ROTs
  3.4 Chapter Summary
4 Synchronous Speculative Locking Protocol for ROTs
  4.1 Basic Idea
    4.1.1 Few Data Object Versions and Speculative Executions
    4.1.2 Independent Commitment of ROTs
  4.2 Proposed Protocol
    4.2.1 Transaction Processing with SSLR
    4.2.2 Protocols for UTs and ROTs
  4.3 Correctness Proof of SSLR Protocol
  4.4 Simulation Experiments
    4.4.1 Closed Queue Model
    4.4.2 Simulation Parameters
    4.4.3 Performance Metrics
    4.4.4 Protocols Simulated
    4.4.5 Experiments under Unlimited Resources
    4.4.6 Experiments under Limited Resources
    4.4.7 Experiments about Resource Utilization
    4.4.8 Experiments for Data Currency
  4.5 Discussion
    4.5.1 Simulation of Speculative Executions of ROT
    4.5.2 Implementation Issues
  4.6 Chapter Summary
5 Asynchronous Speculative Locking Protocol for ROTs
  5.1 Basic Idea
  5.2 Transaction Processing with ASLR
  5.3 Differences between SSLR and ASLR Protocols
  5.4 The ASLR Protocol
  5.5 Correctness Proof of ASLR protocol
  5.6 Performance Results
    5.6.1 Experiments under Unlimited Resources
    5.6.2 Experiments under Limited Resources
    5.6.3 Experiments for Resource Utilization
    5.6.4 Experiment for Data Currency
  5.7 Discussion
  5.8 Chapter Summary
6 Semantics-Based Speculative Locking Protocols for ROTs
  6.1 Basic Idea of Semantics-Based Protocol
    6.1.1 Basic Idea
  6.2 The Semantics-Based SSLR protocol
    6.2.1 Overview
    6.2.2 The SSSLR Protocol
  6.3 The Semantics-Based ASLR protocol
    6.3.1 Overview
    6.3.2 The SASLR Protocol
  6.4 Comparison of Schedules
  6.5 Performance Results
    6.5.1 Simulation Model and Parameters
    6.5.2 Protocols Simulated
    6.5.3 Experiments under 30% and 50% UTs Environment
  6.6 Discussion and Implementation Issues
    6.6.1 Discussion
    6.6.2 Implementation issues
  6.7 Chapter Summary
7 Performance Evaluation through Analytical Modeling
  7.1 Performance Evaluation of 2PL and Optimistic protocols
    7.1.1 Transaction Model
    7.1.2 Hardware Resource Model
    7.1.3 Response Time Study for 2PL and Optimistic Protocols
  7.2 Response Time Study of 2PL, FCWR and ASLR Protocols for ROT Processing Environment
    7.2.1 2PL Protocol
    7.2.2 ASLR protocol
    7.2.3 FCWR protocol
    7.2.4 Discussion
  7.3 Performance Results
    7.3.1 Simulation Model
    7.3.2 Simulation Parameters
    7.3.3 Performance Metrics
    7.3.4 Protocols
    7.3.5 Results
  7.4 Chapter Summary
8 Conclusion
List of publications
Appendix A
Bibliography
List of Figures
2.1 Example for data currency
3.1 Lock compatibility matrix for 2PL
3.2 Depiction of transaction processing with 2PL: ‘xi’ represents the ith version of ‘x’, ri[xj] indicates that a read operation is executed on ‘xj’ by the transaction Ti, and wi[xj] denotes that the transaction Ti performs a write operation on a particular version of ‘x’ and produces the version ‘xj’
3.3 Depiction of transaction processing with FCWR
3.4 Example for low data currency
3.5 Depiction of transaction processing with SL (Tij indicates the jth (j > 0) speculative execution of Ti)
3.6 Lock compatibility matrix for SL
3.7 Depiction of Object tree for SL
4.1 Depiction of Object trees for (a) SL and (b) proposed protocol
4.2 Lock compatibility matrix for SSLR
4.3 Depiction of transaction processing with SSLR
4.4 Logical queuing model: closed queue
4.5 Physical queuing model related to Figure 4.4
4.6 MPL versus Throughput for 2PL protocol
4.7 MPL versus Throughput
4.8 % of UTs versus Transaction aborts
4.9 Details of speculative executions
4.10 MPL versus Average Response time
4.11 MPL versus Throughput in limited resources environment
4.12 MPL versus CPU Utilization
4.13 MPL versus I/O Device Utilization
4.14 MPL versus Average data currency
4.15 MPL versus Average data currency
5.1 Processing speculative executions in ASLR
5.2 Lock compatibility matrix for ASLR
5.3 Depiction of Transaction processing with ASLR
5.4 Serializable schedule of SSLR
5.5 Serializable schedule of ASLR
5.6 MPL versus Throughput
5.7 % of UTs versus Transaction aborts
5.8 Details of speculative executions
5.9 MPL versus Average Response time
5.10 MPL versus Throughput in limited resources environment
5.11 MPL versus CPU Utilization
5.12 MPL versus I/O device Utilization
5.13 MPL versus Average Data currency
6.1 Lock compatibility matrix for SSSLR
6.2 Depiction of transaction processing with SSSLR
6.3 Lock compatibility matrix for SASLR
6.4 Depiction of transaction processing with SASLR
6.5 Serializable schedule of SSLR
6.6 Serializable schedule of SSSLR
6.7 Serializable schedule of ASLR
6.8 Serializable schedule of SASLR
6.9 % of CROTs vs Throughput (30% UTs)
6.10 % of CROTs vs Throughput (50% UTs)
6.11 % of CROTs vs UT throughput (30% UTs)
6.12 % of CROTs vs UT throughput (50% UTs)
6.13 % of CROTs vs ROT throughput (30% UTs)
6.14 % of CROTs vs ROT throughput (50% UTs)
6.15 % of CROTs vs Average number of speculative executions (30% UTs)
6.16 % of CROTs vs Average number of speculative executions (50% UTs)
7.1 Logical queuing model: Open queue
7.2 Physical queuing model related to Figure 7.1
7.3 Transaction arrival rate versus Average response time
7.4 Transaction arrival rate versus Prob. of aborts
7.5 Transaction arrival rate versus Prob. of lock contention
List of Tables
4.1 Simulation Parameters, Meaning and Values
6.1 Simulation Parameters, Meaning and Values
7.1 Parameters, Meaning and Values
7.2 Notations and Meanings
Notations
• Transactions are represented with Ti, Tj, ... (i, j are integers).
• Data objects are denoted with ‘x’, ‘y’, ... For the data object ‘x’, ‘xi’ (i = 0 to n) represents the ith version of ‘x’.
• The notation ri[xj] indicates that a read operation is executed on ‘xj’ by the transaction Ti, and wi[xj] denotes that the transaction Ti performs a write operation on a particular version of ‘x’ and produces ‘xj’.
• The notations ‘si’, ‘ci’ and ‘ai’ denote the start, commit and abort of the transaction Ti, respectively.
• Tij indicates the jth speculative execution of the transaction Ti.
• ri[x] (or wi[x]) denotes the execution of a Read (or Write) operation issued by transaction Ti on a data object ‘x’.
• The notations used for the simulation and analytical study are available in Tables 4.1, 6.1, 7.1 and 7.2.
Chapter 1
Introduction
1.1 Background
In the internet era, business organizations rely on computer-based information systems for information management. These information systems enable users to store, read, and update information, and they serve several users in parallel. An information system should be built such that it serves multiple user requests in an efficient and correct manner. The database and the database management system (DBMS) are the important components of an information system. A database is a collection of data objects, and a DBMS is a collection of programs responsible for managing the database. Based on their requirements, users access information systems by executing application programs. The DBMS processes these application programs by treating them as transactions. A transaction is processed by ensuring the following properties: atomicity, consistency, isolation and durability [1]. The work in this thesis can be categorized as an effort to propose efficient protocols for ensuring the isolation property, which we now explain in more detail.
To improve performance, the DBMS processes transactions concurrently. When several transactions are executed concurrently, some of their operations may be executed in an interleaved fashion. If multiple transactions access common data objects, arbitrary interleaving of operations on those objects may lead to problems such as dirty read and lost update. As a result, the isolation property is violated. To ensure correctness (proper interleaving of operations), the serializability criteria are followed. As per the serializability criteria, the concurrent execution of a set of transactions is correct if it is equivalent to a serial execution of the same set of transactions. The DBMS employs concurrency control protocols to ensure the proper interleaving of the operations of transactions so that the serializability criteria are satisfied.
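The serializability criteria described above can be illustrated with a small check. The sketch below tests conflict serializability, a commonly used sufficient condition for serializability, by building a precedence graph over the transactions and looking for a cycle. The schedule encoding (tuples of transaction id, operation and object) and the function name are assumptions made for illustration; they are not notation or code from this thesis.

```python
from collections import defaultdict

def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, obj) tuples, with op in {'r', 'w'}."""
    edges = defaultdict(set)
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            # Two operations conflict if they touch the same object, come
            # from different transactions, and at least one is a write.
            if x == y and ti != tj and 'w' in (op_i, op_j):
                edges[ti].add(tj)  # ti must precede tj in any serial order

    # Depth-first search for a cycle in the precedence graph.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)

    def has_cycle(node):
        colour[node] = GREY
        for nxt in edges[node]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and has_cycle(nxt):
                return True
        colour[node] = BLACK
        return False

    txns = {t for t, _, _ in schedule}
    return not any(colour[t] == WHITE and has_cycle(t) for t in txns)


# Lost update: r1[x] r2[x] w1[x] w2[x] is not serializable.
lost_update = [(1, 'r', 'x'), (2, 'r', 'x'), (1, 'w', 'x'), (2, 'w', 'x')]
# A serial schedule (T1 then T2) is trivially serializable.
serial = [(1, 'r', 'x'), (1, 'w', 'x'), (2, 'r', 'x'), (2, 'w', 'x')]
```

With these inputs, the lost-update interleaving yields a cycle between T1 and T2, while the serial schedule does not.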
There are two kinds of transactions in the DBMS: read-only transactions (ROTs) and update transactions (UTs). A UT is a transaction that can perform both read and write operations on the database. An ROT is a transaction that contains only read operations, which do not modify data. The two-phase locking (2PL) protocol is followed in the DBMS to ensure that transactions are processed in a correct manner. In 2PL, if an ROT conflicts with a UT, the processing of the ROT is delayed until the corresponding UT terminates. Likewise, if a UT conflicts with an ROT, the processing of the UT is delayed until the ROT releases its locks on the conflicting objects. As a result, the throughput (number of transactions processed per second) deteriorates as data contention increases.
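The blocking behaviour described above follows from the usual shared/exclusive lock compatibility rule. The following is a minimal sketch, assuming a standard textbook lock table with shared ('S') locks for reads and exclusive ('X') locks for writes; the class and method names are hypothetical and lock upgrades, queues and deadlock handling are deliberately left out.

```python
# Shared (read) / exclusive (write) compatibility: (held, requested) -> ok?
COMPATIBLE = {
    ('S', 'S'): True,
    ('S', 'X'): False,
    ('X', 'S'): False,
    ('X', 'X'): False,
}

class LockTable:
    def __init__(self):
        self.held = {}  # obj -> list of (txn, mode)

    def request(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn must wait."""
        holders = self.held.setdefault(obj, [])
        for other, other_mode in holders:
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False
        holders.append((txn, mode))
        return True

    def release_all(self, txn):
        """Release every lock held by txn (at commit or abort)."""
        for obj in self.held:
            self.held[obj] = [(t, m) for t, m in self.held[obj] if t != txn]


table = LockTable()
rot_reads = table.request('ROT1', 'x', 'S')      # granted
ut_blocked = table.request('UT1', 'x', 'X')      # UT waits behind the ROT
table.release_all('ROT1')
ut_granted = table.request('UT1', 'x', 'X')      # granted after release
rot_blocked = table.request('ROT2', 'x', 'S')    # ROT waits behind the UT
```

Two ROTs never block each other in this model; contention arises only when a UT is involved, which is the case the thesis targets.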
To improve performance, efforts are being made to propose improved protocols by exploiting the fact that “an ROT does not modify the data” and by compromising correctness. Several efforts have been made in the literature to process ROTs and UTs using separate protocols in order to improve the overall performance of transaction processing in DBMSs [2] [3] [4] [5]. There are efforts to improve performance by processing ROTs with multi-version based approaches, at lower isolation levels, and by compromising correctness. One of the popular protocols is the snapshot isolation (SI)-based protocol. Commercial DBMSs such as Oracle and Microsoft SQL Server follow SI-based protocols to process ROTs [6]. In an SI-based protocol, an ROT reads a snapshot of the database, ignoring the effects of concurrent UTs. As a result, the data currency of the transaction is affected: the transaction misses the effect of preceding UTs. In addition, SI does not guarantee that the serializability criteria are satisfied.
Based on the above discussion, the following can be identified as the major issues in developing protocols for processing ROTs: correctness, performance and data currency.
• Correctness: The protocol should process concurrent transactions by satisfying the serializability criteria.
• Performance: The protocol should provide high “throughput”.
• Data currency: The protocol should provide the most recent data available in the database to the ROTs. (Please refer to Section 2.2 for details.)
It can be observed that, in the emerging e-commerce scenario, information systems have to serve a large number of online users. Very frequently, these information systems receive ROTs. The ROTs have to be processed in a correct manner, and recent information should be provided to the users [7].
The read-only queries/ROTs generated in the information system have to be processed by satisfying the serializability criteria so that the schedules of ROTs and UTs are consistent. The information system has to provide high throughput in order to serve the large number of read-only queries generated in the system. Overall, the 2PL protocol satisfies the correctness criteria and processes transactions without any data currency issues, but its performance deteriorates with data contention. Even though the SI-based protocol improves performance over 2PL, the serializability criteria are violated and the protocol suffers from data currency problems. So, the research challenge is to come up with a new concurrency control protocol that provides high performance while processing transactions correctly and without any data currency issues. In this thesis, we investigate the development of improved concurrency control protocols that process ROTs without any correctness or data currency problems by using the notions of speculation and semantics.
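The data currency problem of snapshot-based reads discussed above can be made concrete with a toy multi-version store. This is only a sketch under simplifying assumptions: a single logical clock, one committed version per write, and hypothetical names (`VersionedStore`, `read_snapshot`, `read_latest`) that do not come from the thesis or from any particular DBMS.

```python
class VersionedStore:
    """Toy multi-version store: each committed write adds a new version."""

    def __init__(self, initial):
        self.clock = 0
        self.versions = {k: [(0, v)] for k, v in initial.items()}

    def commit_write(self, key, value):
        # A UT commits a new version with the next timestamp.
        self.clock += 1
        self.versions[key].append((self.clock, value))

    def read_snapshot(self, key, snapshot_ts):
        # Latest version committed at or before the snapshot timestamp.
        return max((ts, v) for ts, v in self.versions[key]
                   if ts <= snapshot_ts)[1]

    def read_latest(self, key):
        return self.versions[key][-1][1]


store = VersionedStore({'seats': 10})
rot_snapshot = store.clock           # the ROT starts and fixes its snapshot
store.commit_write('seats', 9)       # a UT commits while the ROT is running
stale = store.read_snapshot('seats', rot_snapshot)
current = store.read_latest('seats')
```

Here the ROT still sees 10 seats although the committed value is already 9, which is the kind of gap the data currency requirement is meant to rule out.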
1.2 Overview of Proposed Approaches
In this section, we give an overview of the proposed speculative locking and semantics-based approaches.
1.2.1 Speculation-Based ROT Approaches
In the literature, the speculative locking (SL) protocol [8] has been proposed to improve transaction processing performance in distributed database systems. In SL, a transaction releases the lock on a data object whenever it produces the corresponding after-image. The waiting transaction accesses both the before- and after-images and carries out speculative executions. After completion, the transaction retains one of the speculative executions based on the termination decisions of the preceding transactions. The SL protocol improves transaction parallelism by carrying out multiple executions per transaction. It was proposed to improve transaction processing performance in OLTP environments by considering transactions that contain both read and write operations. Through SL, performance can be improved by trading extra processing resources without violating the serializability criteria. However, in SL, the number of speculative executions of a transaction and the number of data object versions explode with data contention.
In this thesis, we have investigated efficient protocols for processing ROTs by extending speculation. It was identified that significant benefits could be achieved if we employ speculation to process ROTs and 2PL to process UTs. As ROTs do not modify any data, the number of speculative executions to be carried out by ROTs is reduced significantly, and so is the number of data object versions. Another significant benefit is that an ROT can commit independently whenever it completes, without waiting for preceding UTs, by retaining the effect of the transactions that have committed at that instant. So, the speculation-based approach can improve the performance of ROTs without any correctness or data currency problems. The speculation-based approaches can also exploit the parallel processing power of modern multi-core CPU-based systems for the parallel execution of the speculative threads of ROTs.
It was also observed that the speculative executions of an ROT can be carried out either in a synchronous manner or in an asynchronous manner. Based on this, we have proposed two protocols. In the synchronous approach, an ROT waits for the production of the after-image of a data object whenever it conflicts with a UT, and then starts speculative executions by accessing both the before- and after-images. As a result, all the speculative executions are processed at the same pace. In the asynchronous approach, an ROT continues its execution by accessing the available data objects whenever it conflicts with a UT. Further speculative executions are started dynamically whenever preceding UTs produce after-images. As a result, the speculative executions of an ROT are carried out at different paces. Overall, both protocols improve performance significantly over 2PL and SI-based protocols. Of the two, the asynchronous approach provides more concurrency than the synchronous approach.
1.2.2 Semantics-Based Approaches
In the proposed speculative locking protocols for ROTs, UTs are processed with 2PL and ROTs are processed with speculation. In these protocols, a UT waits if it conflicts with an ROT. We observed that a certain class of UTs can be processed without blocking (when a UT conflicts with ROTs) by exploiting the semantics of the ROTs. We have come up with a notion called “compensatability”, which can be defined for the computations performed by the ROTs. Based on this notion, we classify ROTs as compensatable and non-compensatable ROTs. A protocol is proposed for processing the compensatable ROTs in an effective manner. The resultant approach improves the performance by reducing the blocking of UTs which are in conflict with compensatable ROTs.
1.3 Motivation
Due to the wide usage of the Internet, a large number of users access information systems online. Serving the information access requests of these users correctly and efficiently is very important. More often than not, the users pose read-only queries/transactions to these information systems. The information systems should be able to provide the most recent data available in the database to these users. For example, the online railway
reservation system should be able to provide the latest information regarding the seat availability status of trains to the users. In an online stock exchange system, for the queries issued by the registered users, the system has to provide the most recent data available in the database, which is essential for the users to take a fair decision on purchasing or selling shares. So, developing high performance protocols for processing ROTs without any data currency related issues is an important research problem [7]. We discussed in the previous section that SI-based protocols improve the performance of ROTs over the 2PL protocol by compromising on correctness and data currency. In the literature, efforts are going on to address the correctness issue of SI-based protocols. In [9], a new method for implementing serializable isolation based on a modification of snapshot isolation is discussed. This method improves the performance over the 2PL protocol without violating the correctness criteria. However, it causes more transaction aborts than the conventional SI-based protocol. Sometimes, transactions get aborted even though their execution does not violate the serializability criteria. In this method, the transactions read from a snapshot of the database and miss the updates performed by the concurrent UTs. So, this protocol suffers from data currency issues. The main issue with the protocols proposed in this thesis is that the performance is traded with extra processing resources. It means that with extra processing resources, there is an opportunity to improve parallelism. We have demonstrated through performance evaluation that the proposed protocols require only 0.2 times additional processing resources (main memory or CPU power).
We feel that by observing technology trends, the future information systems will be powered with multi-core CPU-based systems and additional memory can be added at an affordable cost. It is a research issue to find avenues to improve parallelism among applications to exploit parallel processing capabilities of modern multi-core CPU-based computers. We have to investigate and identify the opportunities of parallelism in database systems to utilize the parallel processing capabilities of modern computer architectures. The notion of speculation provides opportunity to improve the parallelism in database systems. By observing the current technology trends, the proposed protocols provide an opportunity to improve the performance of ROTs by trading extra processing resources in a cost-effective manner without any correctness and data currency issues.
1.4 Major Contributions of the Thesis
The major contributions of the thesis work are as follows.
(i) Synchronous speculation-based protocol for ROTs. We have proposed an approach in which synchronous speculation improves the performance of ROTs. The advantage of speculation in processing ROTs has been explained. The protocol and its proof of correctness are presented. We also discuss the performance evaluation results.
(ii) Asynchronous speculation-based protocol for ROTs. We have proposed an approach in which asynchronous speculation improves the performance of ROTs. The difference between the synchronous and asynchronous approaches has been explained. It has been shown that asynchronous speculation provides more concurrency. The protocol and its proof of correctness are presented. We also present the performance evaluation results.
(iii) Semantics-based protocols. The notion of “compensatability” has been proposed. The protocol to process ROTs by using the notion of “compensatability” has been discussed. How the waiting of UTs can be reduced using the notion of “compensatability” has been explained. Integrated protocols have been proposed by combining the semantics with speculation. The performance evaluation results are discussed.
(iv) Performance evaluation through analytical modeling. We have also evaluated the performance of the proposed asynchronous speculation-based protocol through analytical methods. We have compared the results of 2PL, SI-based and asynchronous speculation-based protocols, obtained through analytical and simulation studies, and observed similar performance trends.
1.5 Organization of the Thesis
The rest of the thesis is organized as follows. In the next chapter, we discuss the basic concepts of transaction management, correctness criteria for transaction processing, and the research issues about ROT processing. We also present the related work regarding correctness criteria, processing ROTs, semantics-based protocols and analytical methods for performance evaluation. In Chapter 3, we briefly discuss how 2PL and snapshot isolation-based protocols process ROTs and the corresponding issues. The basic speculative locking protocol proposed in the literature is also discussed. In Chapter 4, we present the synchronous speculative locking protocol for ROTs along with performance results. In Chapter 5, we present the asynchronous speculative locking protocol for ROTs along with performance results. In Chapter 6, we explain how the notion of “compensatability” can be used to reduce the waiting of UTs. Also, we present the proposed integrated protocols for processing ROTs along with performance results. In Chapter 7, we evaluate the performance of the proposed protocols through an analytical study. Chapter 8 presents the summary and future work. In Appendix A, we explain the correctness criteria for transaction processing and the correctness proof of the 2PL protocol.
Chapter 2
Overview of Transaction Management, Research Problem and Related Work
In this chapter, we give an overview of transaction management. Next, we present the notion of data currency and the requirements for designing protocols to process ROTs. Subsequently, we discuss the related approaches that have been proposed in the literature for processing ROTs and the semantics-based approaches for processing transactions. We also review the analytical methods for evaluating the performance of concurrency control approaches.
2.1 Overview of Transaction Management
In this section, we briefly explain the transaction concept, ACID properties of transactions and correctness criteria.
2.1.1 Transaction Concept
A database is a collection of related data objects. A DBMS is a collection of programs that enables users to create and maintain a database. The database and the DBMS together are called a database system. A transaction is an execution of a program that accesses a shared database using read and write operations. A transaction terminates by executing either a commit or an abort operation. A commit operation implies that the transaction was successful, and hence all its updates should be incorporated into the database. An abort operation indicates that the transaction has failed, and hence requires the DBMS to cancel or abolish all its effects on the database system. Transactions are represented with Ti, Tj, ... (i, j are integers). We use ri[x] (or wi[x]) to denote the execution of a Read (or Write) operation issued by transaction Ti on data object ‘x’. After reading a data object, the transaction may perform computations using that data object. Then the transaction may modify the same or some other data object present in the database. Here, we use the notations ci and ai to denote Ti’s Commit and Abort operations respectively. The formal definition of transaction is as follows [10]. Definition 1 A transaction Ti is a partial order with ordering relation 0) number of speculative executions and access the data object with m versions. Then, each of the n speculative executions has to split into m speculative executions. So, in the SL protocol, the number of speculative executions explodes with data contention.
3.3.3 Performance of SL Protocols Regarding Processing of ROTs
(i) Correctness. In the SL protocol, as per the commit dependency rule, a transaction cannot commit until the preceding transactions have terminated. So, the SL protocol ensures serializable schedules [8].
Figure 3.7: Depiction of Object tree for SL
(ii) Throughput. SL protocols improve concurrency by trading extra processing resources. SL protocols perform better than both 2PL and FCWR protocols. However, the number of speculative executions explodes with data contention.
(iii) Data currency. In SL, a transaction reads the before- and after-images produced by the preceding transactions and proceeds with speculative executions. Also, in SL, the speculative execution of a transaction which contains the effect of all preceding transactions is selected for commitment. So, a transaction processed under SL always reads the most recent data from the database. Hence, SL always provides the maximum data currency to the transactions.
(iv) Complexity. As explained in the preceding section, under the SL protocol, the number of data object versions and speculative executions of an ROT explode with data contention. So, improving the throughput of transaction processing under high data contention with speculation is an issue.
3.4 Chapter Summary
The popular 2PL protocol processes ROTs correctly and provides high data currency to them, but its performance deteriorates with data contention. The SI-based protocols are widely used for improving the performance of ROTs, but they compromise on correctness and data currency. Speculative locking protocols proposed in the literature can improve the performance of ROTs without correctness and data currency issues. However, in SL protocols, the number of speculative executions explodes with the increase in data contention. In the next chapter, we discuss the proposed synchronous speculation-based protocol for ROTs in detail.
Chapter 4
Synchronous Speculative Locking Protocol for ROTs
The 2PL protocol processes transactions correctly without any data currency problems, but its performance deteriorates with data contention. Even though the SI-based protocols improve the performance over 2PL, they process the transactions with correctness and data currency issues. The SL protocols have scope for improving the performance of ROTs without correctness and data currency problems. However, in SL, the speculative executions of a transaction explode with data contention, which requires more processing resources. We have investigated the requirements of the ROT processing environment. Based on this investigation, we have come up with SL-based protocols for ROTs that improve the performance while carrying out fewer speculative executions. In this chapter, we propose the synchronous speculative locking protocol for ROTs (SSLR), in which the speculative executions are carried out in a synchronous manner. Subsequently, we discuss the correctness proof and performance results of the proposed protocol. In Section 4.1, we discuss the basic idea of the proposed protocol. In Section 4.2, we explain how ROTs are processed with the proposed protocol and then give the details of the protocol. In Section 4.3, we discuss the correctness proof. The simulation results are presented in Section 4.4. In Section 4.5, we discuss the simulation of speculative executions of ROTs and the implementation issues of the SSLR protocol.
4.1 Basic Idea
The basic idea is as follows. A UT follows 2PL and allows only ROTs to access its after-images. An ROT is processed with speculation. If there is a conflict, the ROT waits till the preceding UT produces the after-image and then starts two speculative executions in parallel: one with the before-image and another with the after-image. Since the speculative executions of an ROT are carried out in a synchronous manner, we call the proposed protocol the synchronous speculative locking protocol for ROTs (SSLR). We observed that this approach yields the following benefits, which lead to significant improvements in transaction processing performance, especially for ROTs.
• Few data object versions and speculative executions
• Independent commitment of ROTs
We discuss these benefits in detail.
4.1.1 Few Data Object Versions and Speculative Executions
We discuss how the proposed approach can reduce the number of data object versions and speculative executions.
• Data object versions. In the proposed SSLR protocol, only one UT at a time can hold an EW-lock on a data object. So, only two versions are maintained for a data object in the object tree for the proposed protocol. The processing is depicted in Figure 4.1 (b). In this figure, T1 is a UT. T1 accesses the data object version ‘x0’ and produces ‘x1’. No other UT can access ‘x’ until T1 releases the lock.
Figure 4.1: Depiction of Object trees for (a) SL and (b) proposed protocol
• Speculative executions. It can be observed that write operations are the cause of the generation of new uncommitted versions under SL. As a result, the number of speculative executions that are to be carried out by waiting transactions and the number of versions stored in the trees explode with the increase in data contention, which requires more memory and processing resources. Regarding the processing of ROTs, it can be observed that an ROT only reads the data and does not generate any new versions. So, if we process only ROTs through speculation and UTs with 2PL, it is possible to improve the performance by consuming fewer resources than are required for processing UTs with speculation. If the number of conflicts that occur for an ROT is n, then 2^n speculative executions are generated for that ROT. However, the value of n is reduced significantly compared with SL.
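The growth in the number of executions can be made concrete with a small Python sketch. It assumes that every conflict of an ROT under SSLR splits each of its current executions into two (one per image), and that under SL each of n executions splits into m when the accessed object has m versions, as stated in Section 3.3. The function names are hypothetical and serve only as an illustration.

```python
def sslr_executions(num_conflicts: int) -> int:
    """Executions of one ROT after num_conflicts conflicts.

    Assumption: each conflict doubles the number of executions
    (before-image copy and after-image copy).
    """
    if num_conflicts < 0:
        raise ValueError("num_conflicts must be non-negative")
    return 2 ** num_conflicts


def split_on_access(current_executions: int, versions: int) -> int:
    """Executions after accessing an object that has `versions` versions.

    Each of the current executions splits into one execution per version.
    Under SSLR an object has at most two versions, under SL possibly more.
    """
    if current_executions < 1 or versions < 1:
        raise ValueError("both arguments must be positive")
    return current_executions * versions


if __name__ == "__main__":
    for n in range(5):
        print(n, "conflicts ->", sslr_executions(n), "executions")
```

For example, an ROT that meets three conflicts carries out 2^3 = 8 executions, whereas four executions that access an object with three versions under SL become twelve.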
4.1.2 Independent Commitment of ROTs
In SL, a transaction can commit only after the termination of the transactions with which it has formed a commit dependency. Note that, in SL, after completing execution, UTs must wait for the termination of preceding transactions to decide which speculative execution is to be retained. Suppose an ROT is processed in speculative mode. After its completion, it can commit without waiting for the termination of the UTs with which it has formed a commit dependency; it can simply commit by retaining the execution which contains the effect of the transactions committed at that instant. Note that this type of processing satisfies the serializability criteria.
                        Lock held by Ti
Lock requested by Tj    RR     RU     EW     SPW
RR                      yes    yes    no     ssp yes
RU                      yes    yes    no     no
EW                      no     no     no     no
Figure 4.2: Lock compatibility matrix for SSLR
Let Ti be an ROT and Tj be a UT. Suppose Ti forms commit dependency with Tj . In SL, Ti commits only after the termination of Tj . Whereas in the proposed protocol, whenever Ti completes its execution it would be possible to commit Ti by retaining one of the speculative executions without waiting for Tj to terminate. However, it can be noted that, a UT waits for the termination of preceding UTs and ROTs. Overall, in the proposed SSLR, the ability to commit ROT independently by selecting appropriate speculative execution is a crucial property and brings significant flexibility. The waiting by ROT could be reduced as compared to 2PL by ensuring the maximum data currency.
4.2 Proposed Protocol
In this section, we discuss how ROTs are processed under the proposed SSLR. Next, we present the details of the protocol.
4.2.1 Transaction Processing with SSLR
The proposed protocol is a lock-based protocol. The locks are acquired in a dynamic manner by both ROTs and UTs. The lock compatibility matrix of SSLR is shown in Figure 4.2. The W-lock is divided into the EW-lock and the SPW-lock. The EW-lock is requested by UTs for writing a data object. The RU-lock (read lock for UT) is requested by UTs for reading a data object. The RR-lock (read lock for ROT) is requested by ROTs for reading a data object. The entry “ssp yes” (synchronous speculation yes) indicates that the requesting ROT carries out speculative executions and forms a commit dependency.
• Processing of UTs. The UTs are processed with 2PL, but with a slight difference. The UTs request an RU-lock for read and an EW-lock for write/update. After producing the after-image, the EW-lock is converted into an SPW-lock. (Note that the conversion from EW-lock to SPW-lock is an atomic operation.) Note that the EW-lock is incompatible with all other locks. Also, the RU-lock is incompatible with EW- and SPW-locks.
• Processing of ROTs. In SSLR, the speculative executions of an ROT are processed in a synchronous manner. Suppose an ROT is carrying out n speculative executions and requests an RR-lock on a data object. Whenever a preceding UT holds the EW-lock, the ROT suspends all its speculative executions till the preceding UT converts the EW-lock into an SPW-lock on the conflicting data object. As the RR-lock is speculatively compatible with the SPW-lock, the ROT then accesses the after-image of the data object, carries out 2n speculative executions and forms a commit dependency with the preceding UT.
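The lock compatibility rules described above can be written down as a small lookup table. The following Python sketch is an illustration only: the names `Lock`, `Decision`, `COMPAT` and `decide` are hypothetical, and the way decisions are combined when several locks are held on the same object (any "no" blocks the request, otherwise any "ssp yes" makes the grant speculative) is an assumption rather than something the text specifies.

```python
from enum import Enum


class Lock(Enum):
    RR = "RR"    # read lock requested by ROTs
    RU = "RU"    # read lock requested by UTs
    EW = "EW"    # write lock held while a UT produces the after-image
    SPW = "SPW"  # write lock held after the after-image is produced


class Decision(Enum):
    YES = "yes"
    NO = "no"
    SSP_YES = "ssp yes"


# (requested lock, held lock) -> decision, following Figure 4.2.
COMPAT = {
    (Lock.RR, Lock.RR): Decision.YES,
    (Lock.RR, Lock.RU): Decision.YES,
    (Lock.RR, Lock.EW): Decision.NO,
    (Lock.RR, Lock.SPW): Decision.SSP_YES,
    (Lock.RU, Lock.RR): Decision.YES,
    (Lock.RU, Lock.RU): Decision.YES,
    (Lock.RU, Lock.EW): Decision.NO,
    (Lock.RU, Lock.SPW): Decision.NO,
    (Lock.EW, Lock.RR): Decision.NO,
    (Lock.EW, Lock.RU): Decision.NO,
    (Lock.EW, Lock.EW): Decision.NO,
    (Lock.EW, Lock.SPW): Decision.NO,
}


def decide(requested, held_locks):
    """Decision for a request against all locks currently held on the object.

    Assumption: a single incompatible lock blocks the request; otherwise a
    held SPW-lock turns an RR request into a speculative grant.
    """
    if requested is Lock.SPW:
        raise ValueError("SPW is obtained only by converting an EW-lock")
    decisions = [COMPAT[(requested, held)] for held in held_locks]
    if Decision.NO in decisions:
        return Decision.NO
    if Decision.SSP_YES in decisions:
        return Decision.SSP_YES
    return Decision.YES


if __name__ == "__main__":
    print(decide(Lock.RR, [Lock.SPW]).value)  # an ROT reading behind a UT
```

An ROT requesting an RR-lock on an object whose writer already holds the SPW-lock gets "ssp yes", while a UT requesting an RU-lock on the same object is blocked.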
Under SSLR, an ROT commits whenever it completes execution. At that point, it retains the speculative execution which contains the effects of the transactions that have committed at that instant. The processing of ROTs under SSLR is illustrated in Figure 4.3. Here, T1 and T3 are UTs which are processed with 2PL and T2 is an ROT which is processed with SSLR. T1 obtains the EW-lock on data object ‘x’. It reads ‘x0’, produces ‘x1’ and converts the EW-lock on ‘x’ to an SPW-lock. T3, being a UT, waits till T1 releases the lock on ‘x’. The ROT T2 is processed as follows. Even though both T1 and T2 have arrived at the same instant, T2 waits till T1 produces the after-image ‘x1’. T2 then carries out two executions, T21 and T22, by accessing ‘x0’ and ‘x1’ respectively. Note that T21 and T22 are carried out synchronously. After T2’s completion, T21 is retained even though T1 is not yet committed. We can observe that T2 is committed without waiting for the termination of T1. Also, the transactions are serialized as per the order T2 ≪ T1 ≪ T3.
Figure 4.3: Depiction of transaction processing with SSLR (operations shown along the time axis: T1: r1[x0] w1[x1] r1[y0] w1[y1] r1[p0] w1[p1]; T21: r2[x0] r2[z0]; T22: r2[x1] r2[z0]; T3: r3[x1] w3[x2])
Next, we discuss the commit dependency rule followed in SSLR. Note that the notion of commit dependency in SSLR is different from the notion of commit dependency in SL. In SSLR, if Ti carries out speculative executions and forms a commit dependency with Tj, Ti commits whenever it is completed (say at time ‘t’) by retaining the appropriate speculative execution based on the termination status of Tj at time ‘t’. That is, even though Tj is not completed, Ti can retain one of the speculative executions and proceed to commit whenever Ti completes execution. This is possible only if Ti is an ROT. In SL, in contrast, Ti has to wait for Tj’s termination, as SL is proposed for UTs.
4.2.2 Protocols for UTs and ROTs
In this subsection, we discuss the data structures required for implementing the SSLR protocol. Next, we present the protocols for UTs and ROTs separately. The list dependset(Tij) stores the commit dependency details of the ‘jth’ speculative execution of a transaction Ti. This list is maintained for each speculative execution of transactions. A lockqueue is a FIFO queue maintained for each data object to store the pending lock requests.
• Protocol for UTs. In SSLR, a UT requests an RU-lock to read and an EW-lock to write. The protocol for UTs is as follows. Let Ti be a UT.
1. Lock acquisition. Suppose Ti requests an RU- or EW-lock on ‘x’. The lock request is entered into the lockqueue.
1.1 Ti obtains the RU-lock if no transaction holds an EW-lock or SPW-lock. Step (2) is followed.
1.2 Ti obtains the EW-lock on ‘x’ if no transaction holds RU-, RR-, EW- or SPW-locks.
2. Execution. During execution, whenever Ti produces the after-image for a data object, the EW-lock on the data object is converted into an SPW-lock. If Ti obtains all the locks, step (3) is followed. Otherwise, step (1) is followed.
3. Commit/Abort Rule. Whenever Ti commits, the speculative executions of ROTs which have been carried out with the before-images of Ti are terminated. Whenever Ti aborts, the speculative executions of ROTs carried out with the after-images of Ti are terminated. The information regarding Ti is deleted from the dependset maintained for each of the speculative executions of ROTs which are dependent on Ti. Also, all the related lock entries of Ti are deleted.
• Protocol for ROTs. In SSLR, an ROT requests an RR-lock to read. The protocol for ROTs is as follows. Let Tj be an ROT.
4. Lock acquisition. Suppose Tj requests an RR-lock on ‘x’. The lock request is entered into the lockqueue.
4.1 If no transaction holds EW- or SPW-locks, the RR-lock is allocated to Tj. Step (5.1) is followed.
4.2 If the preceding transaction is holding an SPW-lock, the lock is granted in speculative mode (ssp yes). Step (5.2) is followed.
5. Execution.
5.1 Tj continues with the current executions by accessing ‘x’. If Tj obtains all the locks, step (6) is followed. Otherwise, step (4) is followed.
5.2 Each execution of Tj is split into two executions: one with the before-image and the other with the after-image. The identifier of the preceding transaction is added into the dependset of the speculative executions of Tj which used the after-image of ‘x’. If there are no further lock requests for Tj, step (6) is followed. Otherwise, step (4) is followed.
6. Commit/Abort. Once Tj’s processing is completed, one of its speculative executions is chosen as follows. Suppose Tj has completed at time ‘t’. Tj retains the speculative execution which contains the effect of all the transactions which have committed before ‘t’. If Tj is aborted, then all its speculative executions are also aborted. The locks allocated to Tj are released after the termination and the lock entries of Tj are deleted.
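The ROT side of the protocol (steps 5 and 6, together with the effect of step 3 on ROTs) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the classes `Execution` and `ROT`, the field `before_of`, and the rule that an execution which has already chosen an image of a UT keeps that choice on later conflicts with the same UT are all hypothetical. The retained execution is taken to be the one whose dependset is empty at completion time, which is our reading of step 6.

```python
from dataclasses import dataclass, field


@dataclass
class Execution:
    reads: dict = field(default_factory=dict)    # data object -> version read
    dependset: set = field(default_factory=set)  # pending UTs whose after-images were read
    before_of: set = field(default_factory=set)  # pending UTs whose before-images were read


class ROT:
    def __init__(self):
        self.executions = [Execution()]

    def read(self, obj, version):
        """Step 5.1: no conflict, every execution reads the same version."""
        for ex in self.executions:
            ex.reads[obj] = version

    def read_speculative(self, obj, before, after, ut):
        """Step 5.2: split each execution into before- and after-image copies."""
        result = []
        for ex in self.executions:
            if ut not in ex.dependset:  # may continue with the before-image
                result.append(Execution({**ex.reads, obj: before},
                                        set(ex.dependset), ex.before_of | {ut}))
            if ut not in ex.before_of:  # may continue with the after-image
                result.append(Execution({**ex.reads, obj: after},
                                        ex.dependset | {ut}, set(ex.before_of)))
        self.executions = result

    def on_ut_commit(self, ut):
        """Step 3: executions that used before-images of the UT are terminated."""
        self.executions = [ex for ex in self.executions if ut not in ex.before_of]
        for ex in self.executions:
            ex.dependset.discard(ut)

    def on_ut_abort(self, ut):
        """Step 3: executions that used after-images of the UT are terminated."""
        self.executions = [ex for ex in self.executions if ut not in ex.dependset]
        for ex in self.executions:
            ex.before_of.discard(ut)

    def commit(self):
        """Step 6: retain the execution that depends on no unterminated UT."""
        (retained,) = [ex for ex in self.executions if not ex.dependset]
        self.executions = [retained]
        return retained.reads


if __name__ == "__main__":
    # Scenario of Figure 4.3: T2 completes before T1 commits.
    t2 = ROT()
    t2.read_speculative("x", "x0", "x1", "T1")
    t2.read("z", "z0")
    print(t2.commit())
```

In the scenario of Figure 4.3, the sketch retains the execution that read ‘x0’ and ‘z0’; if T1 had committed before T2 completed, the execution that read ‘x1’ would have been retained instead.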
4.3 Correctness Proof of SSLR Protocol
In this section, we present the correctness proof of SSLR on the lines of 2PL [10]. The terms transaction, history over a set of transactions, and serializability theorem are defined in Appendix A. Under 2PL, three types of conflicts occur: R-W, W-W and W-R conflicts. However, under SSLR, the following conflicts occur (refer to Figure 4.2): RR-EW, RR-SPW, RU-EW, RU-SPW, EW-RR, EW-RU, EW-EW and EW-SPW conflicts. (Note that in SSLR, in place of the W-lock, both the EW-lock and the SPW-lock are employed.
Also, UTs request an RU-lock for read and an EW-lock for write. The UTs convert the EW-lock into an SPW-lock after completing the work on the data object. ROTs request an RR-lock to read.) Let pi[x] denote an operation (RR, RU, EW) requested by the transaction manager (TM) for Ti, and let pli[x] denote a lock (RR-, RU-, EW- and SPW-locks). The notation ‘≪’ indicates partial order. The notation ‘Ti ≪ Tj’ indicates that Ti precedes Tj in the history. We use the notations ‘o’ to denote an operation (RR, RU, EW), ‘l’ to denote locking operations, ‘li’ to denote all the locking operations of Ti, ‘u’ to denote unlocking operations, ‘ui’ to denote all the unlocking operations of Ti and ci to denote the commitment of Ti. The SSLR protocol manages locks using the following rules.
1. A transaction has to acquire a lock on a data object in order to perform an operation on it.
2. When the scheduler receives an operation pi[x] from the TM, the scheduler tests the conflict between pli[x] and some qlj[x] that is already set. If there is no conflict, the lock is set for Ti.
3. Once the scheduler has set a lock for Ti, say pli[x], it may not release that lock at least until Ti is committed or aborted.
4. Once a transaction completes the work on a data object, the scheduler converts the EW-lock into the SPW-lock for that object in an atomic manner.
5. Once the scheduler has released a lock for a transaction, it may not subsequently obtain any more locks for that transaction (on any data object).
6. When the scheduler receives an operation pi[x] from the TM, the scheduler tests the conflict between pli[x] and some qlj[x] that is already set. Note that two types of conflicts can occur in SSLR, namely “no” and “ssp yes”.
(a) If the conflict is of type “no”, the scheduler delays pi[x] by forcing Ti to wait until it can set the lock it needs. This is equivalent to “qlj[x] ≪ pli[x]”.
(b) If the conflict is of type “ssp yes”, Ti forms a commit dependency with Tj and carries out speculative executions. After its completion, Ti retains one of the speculative executions based on Tj’s termination status. If Tj has already committed, Ti retains the execution which was carried out by reading the after-images produced by Tj. This is equivalent to the order “qlj[x] ≪ pli[x]” or (“cj ≪ ci” or “uj ≪ ci”). Otherwise, Ti retains the execution which was carried out by reading the before-images of Tj. This is equivalent to the order “pli[x] ≪ qlj[x]” or (“ci ≪ cj” or “ui ≪ cj”).
Based on the preceding rules, we state the following propositions.
Proposition 1. Let H be a history produced by an SSLR scheduler. If oi[x] is in C(H), then oli[x] and oui[x] are in C(H), and “oli[x] ≪ oi[x] ≪ oui[x]”. There are three kinds of operations: read by ROTs, read by UTs and write by UTs. Whenever the transaction manager requests a read operation on behalf of an ROT or UT, the operation is executed after obtaining the corresponding lock as per rules (1) and (2). When a UT requests a write operation, the operation is executed after obtaining the EW-lock. After the completion of work on the data object, the EW-lock is converted into an SPW-lock atomically. All the locks are released after the commit/abort of transactions as per rule (3). So, a lock is obtained for every operation and released after the completion of the operation, i.e. “oli[x] ≪ oi[x] ≪ oui[x]”.
Proposition 2. Let H be a complete history produced by an SSLR scheduler. If pi[x] and qi[y] are in C(H), then “pli[x] ≪ qui[y]”. As per rule (5), a transaction cannot obtain any lock after releasing any other lock. It means every locking operation is executed before any unlocking operation. (However, note that the conversion of an EW-lock into an SPW-lock is not equivalent to an unlocking operation.)
Proposition 3. Let H be a history produced by an SSLR scheduler. If ci and ui operations are in C(H), then ci ≪ ui. As per rule (3), a transaction may not release the locks until it commits or aborts. So, the commit operation of a transaction precedes all unlocking operations of that transaction.
Proposition 4. Let H be a history produced by an SSLR scheduler. If pi[x] and qj[x] (i ≠ j) are conflicting operations in C(H), then either “ui ≪ cj” or “uj ≪ ci”. Suppose we have two operations pi[x] and qj[x] that are in conflict; then, according to the commit dependency rule (6(b)), either “ui ≪ cj” or “uj ≪ ci”.
Theorem 1 Let H be a history of the committed transactions under SSLR. Then, H is serializable.
Proof:
To prove that H is serializable, we have to prove that SG(H) is acyclic. Suppose an edge Ti → Tj is in SG(H). Then, as per the propositions (1), (2) and (3), “ui ≪ lj” (the unlocking operations of Ti precede the locking operations of Tj on data objects) or, as per the proposition 4, “ui ≪ cj” (the unlocking operations of Ti on data objects precede the commitment of Tj). Suppose Ti → Tj → Tk is in SG(H). This means that “uj ≪ lk” or “uj ≪ ck”. By transitivity, “ui ≪ lk” or “ui ≪ ck”. By induction, this argument extends to long paths. For any long path T1 → T2 → ... → Tn, “u1 ≪ ln” or “u1 ≪ cn”. Suppose SG(H) contains a cycle T1 → T2 → ... → Tn → T1. This means “u1 ≪ ln” or “u1 ≪ cn” and in turn “un ≪ l1” or “un ≪ c1”. By transitivity, “u1 ≪ l1” or “u1 ≪ c1”. The term “u1 ≪ l1” is a contradiction as per the propositions (1) and (2). The term “u1 ≪ c1” is a contradiction as per the proposition 3. Thus SG(H) has no cycles and therefore H is serializable.
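The proof reduces serializability to the acyclicity of SG(H). As an aside, the acyclicity check itself can be sketched in a few lines of Python. The representation of SG(H) as a list of (Ti, Tj) edges and the function name `is_acyclic` are assumptions made for this illustration; the check uses the standard technique of repeatedly removing nodes that have no incoming edges.

```python
def is_acyclic(edges):
    """Return True if the directed graph given as (src, dst) pairs has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Repeatedly remove nodes without incoming edges.
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Every node can be removed only if no cycle exists.
    return removed == len(nodes)


if __name__ == "__main__":
    # Serialization order of the example in Figure 4.3: T2, then T1, then T3.
    print(is_acyclic([("T2", "T1"), ("T1", "T3")]))
```

The edge list used in the usage example corresponds to the order T2 ≪ T1 ≪ T3 of Figure 4.3, which contains no cycle.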
4.4 Simulation Experiments
In this section, we discuss the simulation model, based on [52], which is used for evaluating the performance of the concurrency control protocols. The simulation model is described by using logical and physical queueing models. We have considered a closed queue system for evaluating the performance. Next, we discuss the parameters used for conducting simulation experiments. Also, we explain the metrics used for evaluating the performance of the protocols. Subsequently, we discuss the various protocols which are considered for performance evaluation. Finally, we discuss the performance results.
4.4.1 Closed Queue Model
We have developed a discrete event simulator based on the closed queuing model for measuring the performance of SSLR protocol. Central to our simulation model for evaluating concurrency control algorithm performance is the closed queueing model of a centralized database system shown in Figure 4.4. In this model, when a new transaction originates, it enters the ready queue. The transaction then enters cc queue (concurrency control queue) and makes the first of its concurrency control requests. The cc manager serves the data object access requests
available in the cc queue on an FCFS basis. If the data object access is granted by the cc manager, the transaction proceeds to the object queue and accesses the data object. If the data object access request is not granted, the transaction enters the blocked queue until it is once again able to proceed. If a request leads to a decision to restart the transaction, it goes to the back of the cc queue, possibly after some period, through the ready queue. It then begins requesting data object access from the first object. Eventually the execution of a transaction may complete and the cc manager may choose to commit that transaction. If the transaction is an ROT, it is completed. If the transaction is a UT, it writes its updates into the database by going through the update queue. After the completion of the commit operation, the transaction leaves the system. In this closed queue system, only a fixed number of transactions are allowed to stay alive in the system at a time. As and when a transaction is committed, possibly after some period, a new transaction is generated from one of the terminals and is allowed to enter the ready queue. This model can be used for both locking and optimistic protocols. The physical queuing model is depicted in Figure 4.5. The physical model is a collection of terminals, multiple CPU servers and multiple I/O servers. The delay paths for think and restart delays are also reflected in the physical queue model. The service discipline used for the CPU queues is FCFS. Whenever a transaction requires CPU service, it is assigned to a free CPU server; otherwise the transaction waits until one becomes free. Our I/O model is a probabilistic model of a database that is spread out across all of the disks. There is a queue maintained for each I/O server. When a transaction needs I/O service, it chooses a disk (at random, with all disks being equally likely) and waits in the I/O queue associated with the selected disk. The service discipline for the I/O queues is also FCFS.
Figure 4.4: Logical queuing model: closed queue
Figure 4.5: Physical queuing model related to Figure 4.4 (terminals, think and restart delay paths, ready queue, CPU servers, disk servers)
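The FCFS discipline with several identical servers described above can be illustrated with a minimal sketch. It is not the simulator used in this work; the function name `fcfs_multi_server` and its parameters are hypothetical, and a fixed service time per request is assumed.

```python
import heapq

def fcfs_multi_server(arrivals, service_time, num_servers):
    """Return (start, finish) times for requests served FCFS by identical servers.

    arrivals: request times in non-decreasing order (ms).
    service_time: fixed service time per request (ms), an assumption of this sketch.
    num_servers: number of identical servers (for example, CPU servers).
    """
    # Each heap entry is the time at which one server becomes free.
    free_at = [0.0] * num_servers
    heapq.heapify(free_at)
    schedule = []
    for t in arrivals:
        earliest = heapq.heappop(free_at)  # server that becomes free first
        start = max(t, earliest)           # wait only if every server is busy
        finish = start + service_time
        heapq.heappush(free_at, finish)
        schedule.append((start, finish))
    return schedule

# Three CPU requests of 5 ms arriving together at two CPU servers:
# the third request waits until a server becomes free at 5 ms.
print(fcfs_multi_server([0.0, 0.0, 0.0], 5.0, 2))
```

The same routine with `num_servers=1` models a single disk queue; the random choice of a disk would be made before the request is appended to that disk's arrival list.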
4.4.2 Simulation Parameters
The parameters and their values are described in Table 4.1. The database size is denoted by “dbSize”. The parameters “cpuTime” and “ioTime” denote the CPU and I/O time, respectively, associated with reading or writing an object (equivalent to an operating system page). The parameters “rotMaxTranSize” and “rotMinTranSize” are the maximum and minimum number of objects in an ROT, respectively. The maximum and minimum number of objects in a UT are represented by the parameters “utMaxTranSize” and “utMinTranSize”, respectively. Each resource unit (RU) consists of 1 CPU and 2 I/O servers, on the assumption that one CPU can drive two I/O servers. The parameter “noResUnits” represents the number of resource units. The parameter “MPL” denotes the number of active transactions in the system. The parameter “% of UTs” denotes the percentage of UTs currently active in the system, and the parameter “% of ROTs” denotes the percentage of ROTs currently active in the system.
The value for “dbSize” is chosen as 1000 data objects [52]. This value is chosen to create a situation in which conflicts are more frequent. The value for “cpuTime” is chosen as 5 ms by considering the speed of modern processors [53]. The value for “ioTime” is fixed at 10 ms by considering the speed range of current hard disk drives [54]. Regarding transaction size, we have chosen different parameter values for ROTs and UTs by considering the load characteristics of modern information systems. The values for “rotMaxTranSize” and “rotMinTranSize” are fixed at 20 and 15 objects respectively, and the values for “utMaxTranSize” and “utMinTranSize” are fixed at 15 and 5 objects respectively [55]. The size of an ROT is a random number between 15 and 20 (both inclusive) and the size of a UT is a random number between 5 and 15 (both inclusive). We have fixed the percentage of ROTs at 70% and the percentage of UTs at 30% as per [34]. We conducted some of the experiments by varying “MPL” from 10 to 100.
Table 4.1: Simulation Parameters, Meaning and Values
Parameter        Meaning                               Value
dbSize           Number of objects in the database     1000
cpuTime          Time to carry out CPU request         5 ms
ioTime           Time to carry out I/O request         10 ms
rotMaxTranSize   Size of largest ROT                   20 objects
rotMinTranSize   Size of smallest ROT                  15 objects
utMaxTranSize    Size of largest UT                    15 objects
utMinTranSize    Size of smallest UT                   5 objects
noResUnits       Number of RUs (1 CPU, 2 I/O)          8
MPL              Multiprogramming level                Simulation variable (10-100)
% of UTs         Percentage of UTs currently active    30%
% of ROTs        Percentage of ROTs currently active   70%
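As an illustration, the parameters of Table 4.1 could be captured in a small configuration object together with a transaction generator. This is only a sketch: the names `SimParams` and `generate_transaction` are hypothetical, and the choice of distinct, uniformly selected objects is an assumption that the text does not state.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class SimParams:
    db_size: int = 1000        # dbSize
    cpu_time_ms: float = 5.0   # cpuTime
    io_time_ms: float = 10.0   # ioTime
    rot_min_size: int = 15     # rotMinTranSize
    rot_max_size: int = 20     # rotMaxTranSize
    ut_min_size: int = 5       # utMinTranSize
    ut_max_size: int = 15      # utMaxTranSize
    num_res_units: int = 8     # noResUnits
    pct_uts: int = 30          # % of UTs (the rest are ROTs)

    @property
    def num_cpus(self):
        return self.num_res_units      # 1 CPU per resource unit

    @property
    def num_disks(self):
        return 2 * self.num_res_units  # 2 I/O servers per resource unit

def generate_transaction(params, rng):
    """Return (kind, objects) for one new transaction."""
    is_ut = rng.random() < params.pct_uts / 100.0
    if is_ut:
        size = rng.randint(params.ut_min_size, params.ut_max_size)    # inclusive bounds
    else:
        size = rng.randint(params.rot_min_size, params.rot_max_size)  # inclusive bounds
    # Assumption: objects are distinct and chosen uniformly from the database.
    objects = rng.sample(range(params.db_size), size)
    return ("UT" if is_ut else "ROT", objects)

params = SimParams()
kind, objects = generate_transaction(params, random.Random(7))
print(kind, len(objects), params.num_cpus, params.num_disks)
```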
4.4.3 Performance Metrics
We have employed the following performance metrics: throughput, UT throughput, ROT throughput, percentage of transaction aborts, response time, average number of speculative executions per transaction, % of CPU utilization, % of I/O device utilization and average data currency.
• Throughput: It is the number of transactions completed per second. The UT throughput is the number of UTs completed per second and the ROT throughput is the number of ROTs completed per second.
• Percentage of transaction aborts: It is the ratio of the number of aborted transactions to the number of committed transactions, expressed as a percentage.
• Response time: For a transaction, the response time is the time elapsed between the start of the transaction and its commitment. Let ‘t1’ denote the time instance at which a transaction T is submitted to the system. Let ‘t2’ denote the time instance at which T is committed in the system. Then $t_2 - t_1$ is the response time of T. Let ‘s’ be the sum of the response times of all transactions committed in the system and ‘n’ be the total number of transactions (both UTs and ROTs) committed in the system. Then the average response time is equal to $s/n$.
• Average number of speculative executions: Let ‘e’ denote the total number of speculative executions. Then the average number of speculative executions per transaction is equal to $e/n$.
• CPU/IO utilization: Let ‘ct’, ‘it’ and ‘st’ denote the CPU idle time, the I/O device idle time and the total simulation time respectively. Then the percentage of CPU utilization is equal to $100\,(1 - ct/st)$ and the percentage of I/O device utilization is equal to $100\,(1 - it/st)$.
• Average data currency: Let us consider that an ROT Ti reads a version of a data object ‘o’ which has been produced at the time instance ‘ti’. Let us consider that the most recent version of ‘o’ (which is available at the commitment time of Ti) has been produced at the time instance ‘tj’. Then the data currency provided to Ti with regard to ‘o’ is equal to $t_i - t_j$. Let ‘tc’ be the lowest data currency value of the data objects accessed by Ti. Then the data currency provided to Ti is equal to ‘tc’. Let ‘rn’ be the total number of ROTs present in the system. Let ‘rc’ be the sum of the ‘tc’ values of all ROTs present in the system. Then the average data currency provided to ROTs is equal to $rc/rn$.
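The metric definitions above translate directly into a few helper functions. The sketch below uses hypothetical function names and made-up sample values purely to show the arithmetic.

```python
def average_response_time(submit_commit_pairs):
    """s/n, where each pair is (t1, t2) for one committed transaction."""
    times = [t2 - t1 for t1, t2 in submit_commit_pairs]
    return sum(times) / len(times)

def utilization_pct(idle_time, total_time):
    """100 * (1 - idle/total); used for CPU (ct/st) and I/O (it/st) alike."""
    return 100.0 * (1.0 - idle_time / total_time)

def rot_data_currency(reads):
    """tc for one ROT: the lowest value of ti - tj over the objects it read.

    reads: list of (ti, tj), where ti is the creation time of the version read
    and tj is the creation time of the most recent version at commit time.
    """
    return min(ti - tj for ti, tj in reads)

def average_data_currency(rots):
    """rc/rn over all ROTs; each element of rots is a list of (ti, tj) pairs."""
    return sum(rot_data_currency(reads) for reads in rots) / len(rots)

# Illustrative values only (not simulation results).
print(average_response_time([(0, 40), (10, 70)]))             # 50.0
print(utilization_pct(25.0, 100.0))                           # 75.0
print(average_data_currency([[(5, 5), (3, 9)], [(2, 2)]]))    # -3.0
```

A value of 0 means every object read was the most recent version at commit time; more negative values mean staler reads.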
4.4.4 Protocols Simulated
We have compared SSLR with the 2PL, FCWR, SI-2PL, and SL protocols. In the experiments, the graphs show the mean results of 20 experiments; each experiment was carried out for 10,000 transactions. The mean values were obtained with 95 percent confidence intervals; these confidence intervals are omitted from the graphs. The following protocols are implemented.
• 2PL: The lock requests are generated by a transaction in a dynamic manner, one by one. We have considered a variation of 2PL called “strict two-phase locking” here. The strict 2PL scheduler releases all locks of a transaction together, when the transaction terminates.
• FCWR: This is a variation of the SI-based protocol. In this protocol, the read requests of the transactions are carried out by accessing the snapshot of the committed data and UTs are executed as per the FCWR rule.
• SI-2PL: This is similar to the approach proposed in [3]. Using snapshot isolation, the read requests of the transactions are carried out without any waiting. The UTs follow the 2PL procedure. So, the lock requests are issued dynamically, one by one, by the UTs.
• SL: This is the protocol proposed in [8]. The lock requests are issued dynamically. Both UTs and ROTs speculate. Whenever a transaction produces the after-image for a data object, the EW-lock on the data object is changed to an SPW-lock. All the speculative executions of a transaction are executed in a synchronous manner.
• SSLR: This is the proposed protocol. The lock requests are issued dynamically, one by one, by the transactions. UTs follow 2PL and ROTs follow synchronous speculation.
In all these protocols, we have assumed that aborted transactions are resubmitted after a delay equal to the average response time in order to reduce repeated aborts. For the SL and SSLR protocols, we have assumed that all the speculative executions are carried out in parallel. Also, we have not considered the overhead involved in thread creation and management. Modern operating systems support efficient mechanisms such as thread pools [56] for the creation of threads. So, we believe that there will not be a major change in the results due to the addition of this overhead. Also, we have not taken into account the cost of deadlock detection as it is the same for all locking-based protocols.
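For readers who wish to reproduce the averaging over 20 experiments, one common way to obtain a mean with a 95 percent confidence interval is the Student-t interval sketched below. The text does not state which method was used, so this is an assumption; the function name `mean_and_ci` and the sample values are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 19 degrees of freedom (20 runs).
T_CRIT_95_DF19 = 2.093

def mean_and_ci(samples, t_crit=T_CRIT_95_DF19):
    """Return (mean, half_width) so that the interval is mean +/- half_width.

    The default critical value is only valid for 20 samples; pass another
    value of t_crit for a different number of runs.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    half_width = t_crit * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width

# Made-up throughput values from 20 hypothetical runs (not thesis results).
runs = [10.0] * 10 + [12.0] * 10
mean, half_width = mean_and_ci(runs)
print(round(mean, 3), round(half_width, 3))
```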
4.4.5 Experiments under Unlimited Resources
Figure 4.6 shows how the throughput of the 2PL protocol varies with MPL for different numbers of resource units. We can observe from the figure that increasing the number of resource units beyond 8 does not increase the performance much. So, we have decided to use 8 resource units for the experiments that evaluate the performance of the 2PL, FCWR, SSLR, SL and SI-2PL protocols. In the following experiments, we report results obtained by simulating an unlimited resources environment; i.e., there is no restriction on the number of speculative executions for a transaction. Figure 4.7 shows how the throughput of 2PL, FCWR, SL, SSLR and SI-2PL varies with MPL in an unlimited resources environment. The performance of the 2PL protocol suffers due to longer lock waiting time. For FCWR, the performance deteriorates due to the increased number of aborts. It can be noted that the
performance of SL and SSLR is significantly higher than that of 2PL and FCWR because of improved parallelism due to speculation. Among the speculative protocols, the performance of SL and SSLR is close. This is due to the fact that, under an unlimited resources environment, both SL and SSLR perform in a similar manner. It can be noted that SI-2PL also exhibits higher performance. However, FCWR suffers from correctness and data currency problems and SI-2PL suffers from data currency problems. Overall, speculative protocols exhibit better performance than the other protocols.
Figure 4.6: MPL versus Throughput for 2PL protocol
Figure 4.7: MPL versus Throughput
Figure 4.8 shows how the abort performance of the 2PL, SSLR, SI-2PL and FCWR protocols varies with the increase in the percentage of UTs. It can be observed that the number of transaction aborts under FCWR increases with the increase in data contention. However, the number of transaction aborts under the 2PL, SI-2PL and SSLR protocols is much lower than under FCWR.
Figure 4.8: % of UTs versus Transaction aborts (x-axis: % of UTs; y-axis: % of transaction aborts; MPL=20, #RUs=8; protocols: 2PL, SSLR, FCWR, SI-2PL)
In Figure 4.9, the percentages of transactions which consumed 1, 2, 4, 8 and more than 8 speculative executions under SL and SSLR are shown. It can be noted that about 60 percent of transactions carry out a single execution in SL, whereas about 70 percent of transactions carry out a single execution in SSLR. This is due to the fact that both UTs and ROTs speculate in SL whereas only ROTs speculate in SSLR. Because more transactions speculate in SL, fewer transactions complete with a single execution in SL than in SSLR. Only three percent of transactions carry out two speculative executions in SL, whereas the figure is 24 percent in SSLR. It can be observed that 24 percent of transactions carry out four speculative executions in SL, whereas the figure is 5 percent in SSLR. Also, in SL, about 10 percent of transactions carry out more than 8 speculative executions, whereas this percentage is very small in the case of SSLR.
The results show that a transaction carries out more speculative executions in SL than in SSLR. The average number of speculative executions for SL and SSLR comes to 4.2 and 1.5 respectively. This experiment shows that the proposed SSLR protocol requires far fewer speculative executions than SL. So, by consuming fewer extra processing resources, it is possible to improve the performance in an ROT environment. Figure 4.10 shows the average response time for the protocols 2PL, FCWR, SSLR, SL and SI-2PL. The performance of 2PL is poor due to high lock waiting time. As FCWR aborts more UTs due to conflicts, its performance is also poor. SSLR and SL exhibit good performance. We can observe that the
performance of the SL and SI-2PL protocols is close. So, overall, we can conclude that in the unlimited resources environment, the SSLR protocol performs better than the FCWR and 2PL protocols. Note that, even though the SI-2PL protocol performs well, it does not satisfy the correctness criteria and it provides reduced data currency to ROTs.
Figure 4.9: Details of speculative executions (x-axis: number of speculative executions, 1, 2, 4, 8 and >8; y-axis: % of transactions; #RUs=8, MPL=20, #UTs=30%; protocols: SL, SSLR)
Figure 4.10: MPL versus Average Response time
Figure 4.11: MPL versus Throughput in limited resources environment (x-axis: total memory units in multiples of MPL, from 1 to 2; y-axis: throughput; #UTs=30%, #RUs=8, MPL=20; protocols: SSLR, 2PL, SL)
4.4.6 Experiments under Limited Resources
In this experiment, we report results obtained by simulating the limited resources environment, in which the number of speculative executions of a transaction is restricted. Each processing unit is termed one memory unit (MU), which can be used to process one speculative execution. We assume that a fixed number of MUs are available in the system. The MUs are allocated dynamically based on the requirement of the transaction. If the requested number of MUs is not available, the transaction is made to wait. Figure 4.11 shows the performance of the 2PL, SL and SSLR protocols in limited resources environments. The experiment starts with the number of MUs equal to MPL, and we have plotted the results by varying the number of MUs from MPL to 2*MPL. It can be observed that the performance of SSLR reaches its maximum value and saturates when the number of MUs is equal to 1.2*MPL, whereas the performance of SL increases linearly with the number of MUs and remains significantly lower than that of SSLR even at MUs=2*MPL. Also, note that the 2PL performance does not vary with the increase in MUs. This experiment shows that the performance of ROTs can be improved significantly with a fraction (0.2 times MPL) of additional resources.
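The MU allocation policy described above can be sketched as a small pool that grants a request when enough units are free and otherwise makes the transaction wait. The class name `MemoryUnitPool` is hypothetical, and serving waiting transactions in FIFO order is an assumption; the text only says that the transaction is made to wait.

```python
from collections import deque

class MemoryUnitPool:
    """Fixed pool of memory units (MUs); one MU runs one speculative execution."""

    def __init__(self, total_units):
        self.free = total_units
        self.waiting = deque()  # FIFO of (txn_id, units) that could not be served

    def request(self, txn_id, units):
        """Grant the units if available; otherwise queue the transaction."""
        if not self.waiting and units <= self.free:
            self.free -= units
            return True
        self.waiting.append((txn_id, units))
        return False

    def release(self, units):
        """Return units to the pool and wake waiting transactions in FIFO order."""
        self.free += units
        granted = []
        while self.waiting and self.waiting[0][1] <= self.free:
            txn_id, need = self.waiting.popleft()
            self.free -= need
            granted.append(txn_id)
        return granted

# With MPL = 20, a pool of 24 MUs corresponds to 1.2 * MPL.
pool = MemoryUnitPool(24)
print(pool.request("T1", 20))  # True: 4 MUs remain free
print(pool.request("T2", 8))   # False: T2 waits
print(pool.release(4))         # ['T2']: 8 MUs are free, so T2 proceeds
```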
4.4.7 Experiments about Resource Utilization
In Figure 4.12, the CPU utilization of the 2PL, FCWR and SSLR protocols is shown. It can be noted that CPU utilization is high in the case of FCWR. This is due to the fact that more UTs are aborted and resubmitted. Note that in SSLR, a transaction waits for the lock until the preceding transaction produces the after-image. Due to this waiting, the CPU utilization is lower in SSLR than in FCWR. Also, the CPU utilization in the case of 2PL is much lower than that of the other protocols due to more waiting. In Figure 4.13, the performance regarding I/O device utilization of 2PL, FCWR and SSLR protocols is
shown. The trend is similar to that of the CPU utilization graphs in Figure 4.12.
Figure 4.12: MPL versus CPU Utilization (x-axis: MPL; y-axis: % of CPU utilization; #RUs=8, #UTs=30%; protocols: 2PL, SSLR, FCWR)
Figure 4.13: MPL versus I/O Device Utilization (x-axis: MPL; y-axis: % of I/O device utilization; #RUs=8, #UTs=30%; protocols: 2PL, SSLR, FCWR)
Figure 4.14: MPL versus Average data currency
4.4.8 Experiments for Data Currency
Experiment 1
In Figure 4.14, the variation of average data currency with MPL is shown. Here, the “average data currency” value is calculated by taking the lowest data currency value of the data objects accessed by each transaction. A more negative value of “average data currency” indicates that the transactions are provided with less “data currency”, and a less negative value indicates that the transactions are provided with higher “data currency”. We can observe from the figure that the “average data currency” value of the 2PL and SSLR protocols is 0, which indicates that these protocols provide the highest possible “data currency” to ROTs, whereas the FCWR and SI-2PL protocols provide low “data currency” to ROTs. Note that, as MPL increases, the “average data currency” becomes more negative, which indicates that the “data currency” provided by the FCWR and SI-2PL protocols to ROTs decreases. Also, we can observe from the figure that the performance difference between the FCWR and SI-2PL protocols is comparatively small.
Experiment 2
In Figure 4.15, the variation of average data currency with MPL is shown. Here, the “average data currency” value is calculated by taking the average of the data currency values of the data objects accessed by each transaction. As in Experiment 1, a more negative value indicates that the transactions are provided with less “data currency”. We can observe from the figure that the “average data currency” value of the 2PL and SSLR protocols is 0, which indicates that these protocols provide the highest possible “data currency” to ROTs, whereas the FCWR and SI-2PL protocols provide low “data currency” to ROTs. Note that, as MPL increases, the “average data currency” becomes more negative, which indicates that the “data currency” provided by the FCWR and SI-2PL protocols to ROTs decreases.
However, from the figure we can observe that the performance difference between the FCWR and SI-2PL protocols is larger than the one observed in Experiment 1. We observe from Figure 4.10 that the average response time of ROTs is higher under the FCWR protocol than under the SI-2PL protocol. As ROTs spend more time in the system, there is a possibility that the data objects accessed by them have been modified by concurrent UTs. As a result, ROTs are provided with low data currency in FCWR. In SI-2PL, this possibility is smaller as ROTs spend less time in the system. So, SI-2PL provides higher data currency to ROTs than FCWR. Overall, we conclude that SSLR and 2PL provide high data currency to ROTs, whereas FCWR and SI-2PL provide low data currency to ROTs.
Figure 4.15: MPL versus Average data currency
4.5 Discussion
In this section, we discuss the issue of simulating the speculative executions of ROTs and the implementation issues.
4.5.1 Simulation of Speculative Executions of ROT
In the simulation experiments, we assumed that all speculative executions of an ROT are processed in parallel. Also, we have not taken into account the overhead of creating and killing threads. In the simulation program, we maintained a ‘count’ variable for each ROT generated in the system. Whenever speculative threads are created by an ROT or killed due to the commit/abort of UTs, we update the value of this ‘count’ variable accordingly. The results obtained through the simulation experiments of the SSLR protocol closely reflect the processing of speculative executions in a multi-core environment. However, in environments with a single CPU or only a few CPUs, the threads may be put into the wait queue for obtaining CPU time. As a part of future work, we are planning to conduct simulation experiments by considering ROT processing in a single CPU environment.
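The bookkeeping with the ‘count’ variable can be pictured with the following sketch. It is not the simulation program itself: the class name `RotExecutionCounter` is hypothetical, and it assumes that each conflict with a distinct UT doubles the number of executions (one branch per image) and that the termination of that UT kills half of them.

```python
class RotExecutionCounter:
    """Tracks the number of live speculative executions of one ROT."""

    def __init__(self):
        self.count = 1          # an ROT starts with a single execution
        self.peak = 1           # largest number of executions alive at once
        self.pending_uts = set()

    def on_conflict(self, ut_id):
        """The ROT reads an object for which ut_id has produced an after-image."""
        if ut_id in self.pending_uts:
            return              # already speculating on this UT
        self.pending_uts.add(ut_id)
        self.count *= 2         # every execution splits: before- and after-image
        self.peak = max(self.peak, self.count)

    def on_ut_termination(self, ut_id):
        """Commit or abort of ut_id invalidates one of the two images."""
        if ut_id in self.pending_uts:
            self.pending_uts.remove(ut_id)
            self.count //= 2    # executions based on the wrong image are killed

counter = RotExecutionCounter()
counter.on_conflict("T1")
counter.on_conflict("T3")
counter.on_ut_termination("T1")
print(counter.count, counter.peak)  # 2 4
```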
4.5.2 Implementation Issues
Some of the implementation issues of the proposed SSLR protocol are as follows.
(i) Lock conversion
There is a notion of “lock conversion” under the SSLR protocol. We have considered that a UT converts an EW-lock to an SPW-lock whenever it produces the after-image. We believe that, since transactions are stored procedures, it is possible to put lock conversion markers in the transactions by analyzing the corresponding stored procedures. The lock conversion marker indicates when the transaction finishes work on that object. During execution, when a lock conversion marker is encountered, the corresponding EW-lock is converted to an SPW-lock.
(ii) Index locking
In a DBMS, hierarchical locking schemes are used to support concurrent access to index structures. A node in the index structure can be locked either in shared mode (for reading) or in exclusive mode (for writing). We have to investigate the issue of converting the exclusive lock held on a node in the index structure into a speculative lock, so that ROTs can speculatively read that node to access the relevant data page.
(iii) Creation of speculative threads
In the SSLR protocol, the transaction manager has to create speculative threads for the ROTs with the help of the thread creation system calls available in the host operating system. In the SSLR protocol, only the main speculative thread of an ROT issues the requests for locks on the data objects. Also, whenever the transaction manager creates a child thread for a parent thread, the child thread has to start its execution from a particular point and it should be provided with the same state as the parent thread up to that point. As a part of future work, we plan to study the thread creation facilities available in various operating systems to examine the possibilities of creating threads in such a fashion.
4.6 Chapter Summary
In this chapter, we discussed the details of the proposed synchronous speculative locking protocol for ROTs. The proposed SSLR protocol improves the performance of ROTs by performing fewer speculative executions than the basic SL protocol. Also, the proposed protocol does not affect correctness and provides maximum data currency to transactions. The results of our simulation study show that the proposed SSLR protocol can perform better than the 2PL and FCWR protocols. The simulation results indicate that SSLR requires additional processing resources of only 0.2 times MPL to bring the above crucial benefits. In the next chapter, we discuss the asynchronous speculation-based protocol in detail.
Chapter 5
Asynchronous Speculative Locking Protocol for ROTs
In the SSLR protocol discussed in the previous chapter, ROTs are processed using the synchronous speculation procedure. In synchronous speculation, whenever a conflict occurs for an ROT, it has to wait for the generation of the after-image by the preceding UT. We have made efforts to avoid this waiting time by starting the speculative executions in an asynchronous manner. In this chapter, we discuss how speculative executions can be carried out in an asynchronous manner for improving the performance of ROTs. Also, we present the correctness proof and the simulation results of the proposed asynchronous speculative locking protocol for ROTs (ASLR). In Section 5.1, we discuss the basic idea of the proposed ASLR protocol. In Section 5.2, we discuss how transactions are processed under the ASLR protocol with an example. In Section 5.3, we discuss the differences between the SSLR and ASLR protocols. In Section 5.4, we present the details of the protocol. In Section 5.5, the correctness proof of the ASLR protocol is discussed. In Section 5.6, we present the simulation results of the ASLR protocol. In Section 5.7, we discuss the implementation issues of the ASLR protocol.
5.1 Basic Idea
In synchronous speculation, whenever a conflict occurs for an ROT on a data object, the ROT has to wait for the generation of the after-image of that data object by the preceding UT. Instead of waiting, the ROT can be allowed to continue its current speculative executions with the before-image of the conflicting data object. Once the after-image of that data object becomes available, the ROT is allowed to start further speculative executions in an asynchronous manner. This type of processing reduces the waiting time of ROTs. In the SSLR protocol, all speculative executions of a transaction are processed synchronously. That is, the waiting transaction waits for the preceding transaction to produce the after-image and then starts the speculative executions synchronously. However, in an ROT environment, an ROT can read the available version and proceed with its existing speculative executions without waiting for the preceding transaction to produce the after-image. Whenever the preceding transaction produces the after-image, the ROT starts further speculative executions. So, there are two ways to carry out speculative executions: the synchronous method and the asynchronous method. In the proposed protocol, the speculative executions of an ROT are carried out in an asynchronous manner.
In SSLR, an ROT can commit with an appropriate speculative execution/thread only when all of its speculative threads are completed. But in ASLR, an ROT can commit with a speculative thread if that thread contains the effects of all UTs that have committed before this ROT. This type of processing of speculative executions further reduces the waiting time of ROTs.
Figure 5.1: Processing speculative executions in ASLR (timeline; T1: r1[x0] w1[x1] r1[y0] w1[y1] S1 C1; T2 as speculative executions T21: r2[x0] r2[p0] r2[q0] S2 and T22: r2[x1]; C2)
Figure 5.1 shows how speculative executions are processed in the ASLR protocol. Here, T1 is a UT and T2 is an ROT. T2 is in conflict with T1 on the data object ‘x’. Here, T2 reads the before-image of ‘x’ and proceeds with the speculative execution T21. Once the after-image ‘x1’ is produced by T1, T2 starts an additional speculative execution T22. Note that T2 commits the speculative thread T21 without waiting for T1 to terminate. As T1 is not committed yet, committing the thread T21 does not violate correctness. The proposed ASLR protocol improves the performance of ROTs due to reduced waiting. Because of speculative processing and the autonomy to commit as soon as processing is completed, the proposed protocol reduces the waiting time without compromising correctness and data currency.
5.2 Transaction Processing with ASLR
The lock compatibility matrix of ASLR is shown in Figure 5.2. The W-lock is divided into EW-lock and SPW-lock. The EW-lock is requested by UTs for writing a data object. The RU-lock (read lock for UT) is requested by UTs for reading a data object. The RR-lock (read lock for ROT) is requested by ROTs for reading a data object. Similar to SSLR, the EW-lock is incompatible with the other locks. Also, the RU-lock is compatible with both RR- and RU-locks, but incompatible with both EW- and SPW-locks. However, ASLR differs from SSLR in the compatibility of the RR-lock with other locks. In SSLR, the RR-lock is speculatively compatible only with the SPW-lock, whereas in ASLR, the RR-lock is speculatively compatible with both the EW-lock and the SPW-lock. But the nature of the compatibility is different from SSLR. So, a different name, “asp yes” (asynchronous speculative yes), is used here, which means that the ROT carries out the possible speculative executions by accessing the available versions of the data object and forms a commit dependency with the preceding UTs. The method of commitment is also different (refer to the discussion regarding “commit processing”).
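The compatibility rules described above can be written down as a lookup table. The following is a minimal sketch; the names `COMPAT` and `decide` are hypothetical, and the rule for combining several held locks (the most restrictive outcome wins) is an assumption of this sketch.

```python
# (requested lock, held lock) -> outcome, following Figure 5.2.
COMPAT = {
    ("RR", "RR"): "yes", ("RR", "RU"): "yes",
    ("RR", "EW"): "asp_yes", ("RR", "SPW"): "asp_yes",
    ("RU", "RR"): "yes", ("RU", "RU"): "yes",
    ("RU", "EW"): "no", ("RU", "SPW"): "no",
    ("EW", "RR"): "no", ("EW", "RU"): "no",
    ("EW", "EW"): "no", ("EW", "SPW"): "no",
}

def decide(requested, held_locks):
    """Return 'yes', 'asp_yes' or 'no' for a request against all held locks."""
    outcomes = {COMPAT[(requested, held)] for held in held_locks}
    if "no" in outcomes:
        return "no"        # the requester must wait
    if "asp_yes" in outcomes:
        return "asp_yes"   # an ROT proceeds speculatively with a commit dependency
    return "yes"           # plain compatibility (or no lock is held)

print(decide("RR", ["EW"]))         # asp_yes
print(decide("RU", ["RR", "SPW"]))  # no
print(decide("EW", []))             # yes
```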
1. Processing of UTs
The UTs are processed with 2PL. The EW-lock is incompatible with all other locks and the RU-lock is incompatible with both EW- and SPW-locks.
2. Processing of ROTs
Lock requested by Tj    Lock held by Ti
                        RR     RU     EW        SPW
RR                      yes    yes    asp yes   asp yes
RU                      yes    yes    no        no
EW                      no     no     no        no
Figure 5.2: Lock compatibility matrix for ASLR
• Processing
In ASLR, the speculative executions of an ROT are processed in an asynchronous manner. Whenever an ROT is involved in a conflict, it can continue its current speculative executions by accessing the available data object versions, without waiting for the preceding transaction to produce the after-image. Whenever the after-image becomes available, further speculative executions are started dynamically. In ASLR, the RR-lock is speculatively compatible with the EW-lock. Suppose an ROT is carrying out n speculative executions. While the preceding UT holds the EW-lock, the ROT accesses the available object versions and continues the n speculative executions. When the preceding UT converts the EW-lock into an SPW-lock, the ROT starts n additional speculative executions. In this manner, the speculative executions of an ROT progress asynchronously.
• Commit processing
In ASLR, an ROT tries to commit whenever it completes one of its speculative executions. Whenever one or more speculative executions of an ROT complete (say at time ‘t’), the following procedure is followed for each completed speculative execution. If a speculative execution (Tij) contains the effect of all the transactions committed at ‘t’, the ROT retains that execution and aborts all its other speculative executions. Otherwise, Tij is obsolete and is therefore aborted.
Figure 5.3 depicts the processing under ASLR. Here, T1 and T3 are UTs which are processed with 2PL and T2 is an ROT which is processed with ASLR. T2 accesses the before-image ‘x0’ and the other available values of the data objects, ‘y0’ and ‘z0’, and starts the speculative execution T21. Once the after-image ‘x1’ becomes available, another speculative execution T22 is started. Note that T21 and T22 are executed in parallel. Whenever the processing is completed for any one of the speculative executions, the ROT can be committed provided that execution contains the effect of the transactions committed at that instant. We can observe that T21 does not depend on T1 and it is committed once it completes its execution, without waiting for T1. So, T22 is aborted. Note that T3, being a UT, waits for T1 to release the lock on ‘x’ as per the 2PL rule. It can be noted that, in ASLR, each speculative execution of an ROT progresses at a different pace. As a result, some of the speculative executions may complete early. However, there is a chance that a speculative execution which completes early cannot commit, as it may not contain the effect of all transactions which have committed by the time of its completion. Overall, for an ROT, the speculative execution which started early can commit with high probability. As a result, ASLR reduces the waiting time and improves performance over SSLR. However, in the worst case, the performance is equivalent to SSLR for certain transactions.
Figure 5.3: Depiction of Transaction processing with ASLR (timeline; T1: r1[x0] w1[x1] r1[p0] w1[p1] r1[q0] w1[q1] C1 S1; T2 as speculative executions T21: r2[x0] r2[y0] r2[z0] S2 and T22: r2[x1]; C2; T3: r3[x1] w3[x2] S3 C3)
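The commit processing rule of ASLR illustrated in Figure 5.3 can be sketched as follows. The representation is hypothetical: each speculative execution is a mapping from data object to the version number it read, and `latest_committed` maps each object to the version produced by the most recently committed writer at time ‘t’.

```python
def contains_committed_effects(read_versions, latest_committed):
    """True if every object was read in its latest committed version."""
    return all(latest_committed[obj] == version
               for obj, version in read_versions.items())

def try_commit(completed, executions, latest_committed):
    """Apply the commit rule to one completed speculative execution.

    Returns (retained, aborted): the execution kept for commitment (or None)
    and the list of executions that are aborted as a consequence.
    """
    if contains_committed_effects(executions[completed], latest_committed):
        others = [name for name in executions if name != completed]
        return completed, others      # retain it and abort all the others
    return None, [completed]          # obsolete execution: abort only this one

# The ROT T2 of Figure 5.3 with its two speculative executions.
executions = {
    "T21": {"x": 0, "y": 0, "z": 0},  # read the before-image x0
    "T22": {"x": 1, "y": 0, "z": 0},  # read the after-image x1 produced by T1
}
# T21 completes while T1 is still uncommitted: x0 is the latest committed version.
print(try_commit("T21", executions, {"x": 0, "y": 0, "z": 0}))
# If T1 had committed first, T21 would be obsolete.
print(try_commit("T21", executions, {"x": 1, "y": 0, "z": 0}))
```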
5.3 Differences between SSLR and ASLR Protocols
In this section, we discuss the differences between the SSLR and ASLR protocols.
1. The SSLR protocol processes the ROTs with synchronous speculation whereas the ASLR protocol processes the ROTs with asynchronous speculation. Note that asynchronous speculation reduces the waiting time of ROTs.
2. In SSLR, an ROT has to wait for the completion of all of its speculative executions in order to choose the appropriate one for commitment. But in ASLR, an ROT does not wait for the completion of all its speculative executions. Instead, whenever a speculative execution of an ROT is completed and that speculative execution contains the effect of all preceding committed UTs, that speculative execution is chosen for commitment. This type of processing further reduces the waiting time of ROTs.
3. In SSLR, the speculative executions for ROTs are started after the production of after-images. All the executions in SSLR proceed at the same pace, so the implementation is straightforward. ASLR allows the speculative executions for ROTs to start early without waiting for the production of after-images. Further executions start dynamically whenever the after-images are produced by the preceding transactions. Not all the speculative executions of an ROT are carried out at the same pace. So, the implementation is more difficult than that of SSLR.
4. The ASLR protocol produces schedules (histories) which are different from the ones produced by the SSLR protocol. We explain this through an example. Let T1 be a UT and T2 be an ROT. Let us assume that both transactions are started at the same time. The schedule shown in Figure 5.4 represents an interleaved execution of the transactions T1 and T2 under the SSLR protocol (serialization order: T1, T2). Note that T2 has waited for the production of the after-image ‘x1’ and committed after T1. The schedule shown in Figure 5.5 represents an interleaved execution of the transactions T1 and T2 under the ASLR protocol (serialization order: T2, T1). We can observe that T2 has read the before-image ‘x0’ and started its execution. Also, T2 has committed before T1. This type of schedule is not possible in SSLR. So, the ASLR protocol allows more concurrency than SSLR.
T1: r[x0] r[y0] w[y1] w[x1] commit
T2: r[x1] r[z0] commit (after the commit of T1)
Figure 5.4: Serializable schedule of SSLR
T1: r[x0] r[y0] w[y1] w[x1] commit
T2: r[x0] r[z0] commit (before T1 continues with r[y0])
Figure 5.5: Serializable schedule of ASLR
5.4 The ASLR Protocol
In this section, we discuss the data structures required for implementing the ASLR protocol. Next, we present the protocols for UTs and ROTs separately. The list dependset(Tij) stores the commit dependency details of the ‘j’th speculative execution of a transaction Ti. This list is maintained for each speculative execution of the transactions. A lockqueue is a FIFO queue maintained for each data object to store the pending lock requests.
• Protocol for UTs
In ASLR, a UT requests an RU-lock to read and an EW-lock to write. The protocol for UTs is as follows.
1. Lock acquisition. Let Ti be a UT that requests an RU-lock to read ‘x’ or an EW-lock to write ‘x’. The lock request is entered into the lockqueue.
1.1 Ti obtains the RU-lock if no transaction holds an EW-lock or SPW-lock. Step (2) is followed.
1.2 Ti obtains the EW-lock on ‘x’ if no transaction holds RU-, RR-, EW-, or SPW-locks.
2. Execution. During execution, whenever Ti produces the after-image for a data object, the EW-lock on the data object is converted into an SPW-lock. If Ti has obtained all the locks, step (3) is followed. Otherwise, step (1) is followed.
3. Commit/Abort Rule. Whenever Ti commits, the speculative executions of ROTs that have been carried out with the before-images of Ti are terminated. Whenever Ti aborts, the speculative executions of ROTs which have been carried out with the after-images of Ti are terminated. The information regarding Ti is deleted from the dependset maintained for each of the speculative executions of ROTs which are dependent on Ti. Also, all the related lock entries of Ti are deleted.
• Protocol for ROTs
In ASLR, an ROT requests an RR-lock to read. The protocol for ROTs is as follows.
4. Lock acquisition. Let Tj be an ROT that requests an RR-lock to read ‘x’. The lock request is entered into the lockqueue.
4.1 If no transaction holds EW- or SPW-locks, the RR-lock is allocated to Tj. Step (5.1) is followed.
4.2 Otherwise, the RR-lock is granted (“asp yes” in Figure 5.2).
4.3 If the preceding transaction is holding an SPW-lock, step (5.2) is followed. If the preceding transaction is holding an EW-lock, step (5.3) is followed.
5. Execution.
5.1 Tj continues with the current executions by accessing ‘x’. Step (5.4) is followed.
5.2 Each execution of Tj is split into two speculative executions: one with the before-image and the other with the after-image. The identifier of the preceding transaction is added to the dependset of the speculative executions of Tj that used the after-image of ‘x’. Step (5.4) is followed.
5.3 Tj continues with the current executions by accessing ‘x’. Whenever the preceding UT converts its EW-lock into an SPW-lock, Tj starts additional speculative executions by accessing the after-image of ‘x’ produced by that UT. Also, the identifier of the preceding transaction is added to the dependset of the speculative executions of Tj that used the after-image of ‘x’.
5.4 If Tj has obtained all its locks, step (6) is followed. Otherwise, step (4) is followed.
6. Commit/Abort Rule. Suppose one of the speculative executions Tjk of Tj has completed at time ‘t’. If the read set of Tjk contains the effect of all the conflicting transactions that have committed before ‘t’, Tjk is retained and the other speculative executions of Tj are aborted. Otherwise, Tjk is aborted. (Note that one of the speculative executions will be committed.) If Tj is aborted, then all of its speculative executions are also aborted. Also, the locks allocated to Tj are released and the lock entries of Tj are deleted.
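The lock-acquisition rules in steps (1) and (4) can be sketched as a small decision function. This is a minimal illustration rather than the thesis implementation: the function names `aslr_lock_decision` and `rot_next_step` and the string return values are assumptions introduced here, and only the conditions stated in steps 1.1, 1.2 and 4.1 to 4.3 are encoded.

```python
# Hypothetical sketch of the ASLR lock-acquisition rules (steps 1 and 4).
# Lock names follow the text; function names and return strings are assumptions.

def aslr_lock_decision(requested, held):
    """Return 'yes', 'no' or 'asp yes' for a lock request on one data object.

    requested: 'RR' (ROT read), 'RU' (UT read) or 'EW' (UT write)
    held: lock types held by other transactions on the same object
    """
    held = set(held)
    if requested == "RU":
        # Step 1.1: RU is granted only if nobody holds EW or SPW.
        return "no" if held & {"EW", "SPW"} else "yes"
    if requested == "EW":
        # Step 1.2: EW is granted only if nobody holds RU, RR, EW or SPW.
        return "no" if held & {"RU", "RR", "EW", "SPW"} else "yes"
    if requested == "RR":
        # Steps 4.1-4.3: an ROT is not blocked; it speculates when a
        # preceding UT holds EW or SPW on the object.
        return "asp yes" if held & {"EW", "SPW"} else "yes"
    raise ValueError(f"unknown lock type: {requested}")


def rot_next_step(held):
    """Map the locks seen by an ROT to the execution step it follows."""
    held = set(held)
    if "SPW" in held:
        return "5.2"  # split each execution: before-image and after-image
    if "EW" in held:
        return "5.3"  # continue now, speculate once the after-image exists
    return "5.1"      # no conflicting UT: plain execution
```

For example, `aslr_lock_decision("RR", ["EW"])` yields `"asp yes"`, and `rot_next_step(["EW"])` yields `"5.3"`, matching the asynchronous case in which the ROT proceeds before the after-image is available.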
5.5 Correctness Proof of ASLR protocol
In this section, we present the correctness proof of ASLR. The terms transaction, history over a set of transactions, and serializability theorem are defined in Appendix A. Under ASLR, the following conflicts occur (refer to Figure 5.2): RR-EW, RR-SPW, RU-EW, RU-SPW, EW-RR, EW-RU, EW-EW and EW-SPW. (Note that in ASLR, too, both EW-lock and SPW-lock are employed in place of W-lock. Also, UTs request an RU-lock to read and an EW-lock to write. The UTs convert the EW-lock into an SPW-lock after completing the work on the data object. ROTs request an RR-lock to read.)
Let pi[x] denote an operation (RR, RU, EW) requested by the transaction manager (TM) for Ti, and let pli[x] denote a lock (RR-, RU-, EW- or SPW-lock). The notation ‘≪’ indicates partial order. The notation ‘Ti ≪ Tj’ indicates that Ti precedes Tj in the history. We use ‘o’ to denote an operation (RR, RU, EW), ‘l’ to denote locking operations, ‘li’ to denote all the locking operations of Ti, ‘u’ to denote unlocking operations, ‘ui’ to denote all the unlocking operations of Ti, and ci to denote the commitment of Ti.
The ASLR protocol manages locks using the following rules.
1. A transaction has to acquire a lock on a data object in order to perform an operation on it.
2. When the scheduler receives an operation pi[x] from the TM, the scheduler tests pli[x] for conflict with any qlj[x] that is already set. If there is no conflict, the lock is set for Ti.
3. Once the scheduler has set a lock for Ti, say pli[x], it may not release that lock at least until Ti is committed or aborted.
4. Once a transaction completes its work on a data object, the scheduler atomically converts the EW-lock on that object into an SPW-lock.
5. Once the scheduler has released a lock for a transaction, it may not subsequently obtain any more locks for that transaction (on any data object).
6. When the scheduler receives an operation pi[x] from the TM, the scheduler tests pli[x] for conflict with any qlj[x] that is already set. Note that two types of conflict can occur in ASLR, namely “no” and “asp yes”.
(a) If the conflict is of type “no”, the scheduler delays pi[x] by forcing Ti to wait until it can set the lock it needs. This is equivalent to “qlj[x] ≪ pli[x]”.
(b) If the conflict is of type “asp yes” and Tj currently holds an SPW-lock, then Ti forms a commit dependency with Tj and carries out speculative executions. After its completion, Ti retains one of the speculative executions based on the termination status of Tj. If Tj has already committed, Ti retains the execution that was carried out by reading the after-images produced by Tj. This is equivalent to the order “qlj[x] ≪ pli[x]” or (“cj ≪ ci” or “uj ≪ ci”). Otherwise, Ti retains the execution that was carried out by reading the before-images of Tj. This is equivalent to the order “pli[x] ≪ qlj[x]” or (“ci ≪ cj” or “ui ≪ cj”).
(c) If the conflict is of type “asp yes” and Tj currently holds an EW-lock, then Ti acquires the lock and continues its execution.
i. Whenever Tj converts the EW-lock into an SPW-lock, Ti starts additional speculative executions. After its completion, Ti retains one of the speculative executions based on the termination status of Tj. If Tj has already committed, Ti retains the execution that was carried out by reading the after-images produced by Tj. This is equivalent to the order “qlj[x] ≪ pli[x]” or (“cj ≪ ci” or “uj ≪ ci”). Otherwise, Ti retains the execution that was carried out by reading the before-images of Tj. This is equivalent to the order “pli[x] ≪ qlj[x]” or (“ci ≪ cj” or “ui ≪ cj”).
ii. If no after-image is created by Tj, then Ti completes the execution that was carried out by reading the before-images of Tj. This is equivalent to the order “pli[x] ≪ qlj[x]” or (“ci ≪ cj” or “ui ≪ cj”).
Based on the preceding rules, we state the following propositions.
Proposition 1. Let H be a history produced by an ASLR scheduler. If oi[x] is in C(H), then oli[x] and oui[x] are in C(H), and “oli[x] ≪ oi[x] ≪ oui[x]”.
There are three kinds of operations: read by ROTs, read by UTs and write by UTs. Whenever the TM requests a read operation on behalf of an ROT or a UT, the operation is executed after obtaining the corresponding lock as per rules (1) and (2). When a UT requests a write operation, the operation is executed after obtaining the EW-lock. After the completion of work on the data object, the EW-lock is converted into an SPW-lock atomically. All the locks are released after the commit or abort of the transaction as per rule (3). So, a lock is obtained for every operation and released after the completion of the operation, i.e. “oli[x] ≪ oi[x] ≪ oui[x]”.
Proposition 2. Let H be a complete history produced by an ASLR scheduler. If pi[x] and qi[y] are in C(H), then “pli[x] ≪ qui[y]”.
As per rule (5), a transaction cannot obtain any lock after releasing any other lock. This means that every locking operation is executed before any unlocking operation. (However, note that the conversion of an EW-lock into an SPW-lock is not equivalent to an unlocking operation.)
Proposition 3. Let H be a history produced by an ASLR scheduler. If ci and ui operations are in C(H), then ci ≪ ui.
As per rule (3), a transaction may not release its locks until it commits or aborts. So, the commit operation of a transaction precedes all unlocking operations of that transaction.
Proposition 4. Let H be a history produced by an ASLR scheduler. If pi[x] and qj[x] (i ≠ j) are conflicting operations in C(H), then either “ui ≪ cj” or “uj ≪ ci”.
Suppose we have two operations pi[x] and qj[x] that are in conflict. Then, according to the commit dependency rules (6(b) and 6(c)), either “ui ≪ cj” or “uj ≪ ci”.
Theorem 2. Let H be a history of the committed transactions under ASLR. Then, H is serializable.
Proof: To prove that H is serializable, we have to prove that SG(H) is acyclic. Suppose an edge Ti → Tj is in SG(H). As per propositions (1), (2) and (3), “ui ≪ lj” (the unlocking operations of Ti precede the locking operations of Tj on the data objects) or, as per proposition 4, “ui ≪ cj” (the unlocking operations of Ti on the data objects precede the commitment of Tj). Suppose Ti → Tj → Tk is in SG(H). This means that “uj ≪ lk” or “uj ≪ ck”. By transitivity, “ui ≪ lk” or “ui ≪ ck”. By induction, this argument extends to long paths. For any long path T1 → T2 → ... → Tn, “u1 ≪ ln” or “u1 ≪ cn”. Suppose SG(H) contains a cycle T1 → T2 → ... → Tn → T1. This means “u1 ≪ ln” or “u1 ≪ cn” and, in turn, “un ≪ l1” or “un ≪ c1”. By transitivity, “u1 ≪ l1” or “u1 ≪ c1”. The term “u1 ≪ l1” is a contradiction as per propositions (1) and (2). The term “u1 ≪ c1” is a contradiction as per proposition 3. Thus SG(H) has no cycles and therefore H is serializable.
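The acyclicity argument used in the proof can be checked mechanically on small histories. The sketch below is a generic serialization-graph test, not part of the thesis: it assumes a history is given as a list of `(transaction, operation, object)` triples for committed transactions only, treats two operations as conflicting when they touch the same object, belong to different transactions and at least one is a write, and uses the hypothetical names `serialization_graph` and `is_acyclic`.

```python
from collections import defaultdict


def serialization_graph(history):
    """Build SG(H) edges Ti -> Tj from an ordered list of (txn, op, obj).

    op is 'r' or 'w'; only committed transactions are assumed to appear.
    """
    edges = defaultdict(set)
    for i, (ti, op_i, x) in enumerate(history):
        for tj, op_j, y in history[i + 1:]:
            # Conflict: same object, different transactions, one write.
            if ti != tj and x == y and "w" in (op_i, op_j):
                edges[ti].add(tj)
    return edges


def is_acyclic(edges):
    """Depth-first search; a grey node reached again means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, ()):
            if colour[nxt] == GREY:
                return False
            if colour[nxt] == WHITE and not visit(nxt):
                return False
        colour[node] = BLACK
        return True

    nodes = set(edges) | {n for targets in edges.values() for n in targets}
    return all(colour[n] != WHITE or visit(n) for n in nodes)
```

A history in which T1 writes ‘x’ and ‘y’ before T2 reads them gives the single edge T1 → T2 and passes the test, whereas a history in which T2 reads ‘x’ after T1 writes it but T1 reads ‘y’ after T2 writes it produces the cycle T1 → T2 → T1.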
5.6 Performance Results
We have considered a closed queue model for conducting the simulation experiments. This model is discussed in the previous chapter. The simulation parameters used here are similar to those discussed in the previous chapter. We have employed the following performance metrics, which are also discussed in the previous chapter: throughput, UT throughput, ROT throughput, average number of speculative executions per transaction, % of transaction aborts, % of CPU utilization, % of I/O device utilization and average data currency. We have compared ASLR and SSLR with the 2PL, FCWR, SI-2PL and SL protocols. In all these protocols, we have assumed that aborted transactions are resubmitted after a delay equal to the average response time in order to reduce repeated aborts. For SL, ASLR and SSLR, we have assumed that all the speculative executions are carried out in parallel. Also, we have not taken into account the cost of deadlock detection, as it is the same for all locking-based protocols. The graphs show the mean results of 20 experiments; each experiment was carried out for 10,000 transactions. The plotted values are means with 95 percent confidence intervals; the intervals themselves are omitted from the graphs. The following protocols are implemented.
Figure 5.6: MPL versus Throughput (#UTs = 30%, #RUs = 8; protocols plotted: 2PL, SSLR, ASLR, FCWR, SI-2PL, SL)
5.6.1 Experiments under Unlimited Resources
In the following experiments, we report results obtained by simulating an unlimited resources environment; i.e., there is no restriction on the number of speculative executions for a transaction. Figure 5.6 shows how the throughput of 2PL, FCWR, SL, SSLR, ASLR and SI-2PL varies with MPL in an unlimited resources environment. The performance of the 2PL protocol is low due to longer lock waiting time. In FCWR, the performance deteriorates due to the increased number of aborts. It can be noted that the performance of SL, SSLR and ASLR is significantly higher than that of 2PL and FCWR because of improved parallelism due to speculation. Among the speculative protocols, the performance of SL and SSLR is close. This is due to the fact that, under an unlimited resources environment, both SL and SSLR perform in a similar manner. The ASLR protocol performs better than SL and SSLR due to increased parallelism. It can be noted that SI-2PL also exhibits higher performance. However, FCWR suffers from correctness and data currency problems, and SI-2PL suffers from data currency problems. Overall, the speculative protocols perform better than the other protocols.
Figure 5.7 shows how the abort performance of the 2PL, SSLR, ASLR and FCWR protocols varies with the increase in the number of UTs. It can be observed that the number of transaction aborts under FCWR increases with the increase in data contention. However, the number of transaction aborts under the 2PL, SI-2PL, SSLR and ASLR protocols is very low in comparison with FCWR.
In Figure 5.8, the percentages of transactions that consumed 1, 2, 4, 8 and above 8 speculative executions in SL, SSLR and ASLR are shown. It can be noted that about 60 percent of transactions carry out a single execution in SL, whereas about 70 percent of transactions carry out a single execution in SSLR and ASLR. This is due to the fact that both UTs and ROTs speculate in SL, whereas only ROTs speculate in SSLR and ASLR. Because more transactions carry out speculation in SL, fewer transactions carry out a single execution in SL than in SSLR and ASLR. Only three percent of transactions carry out two speculative executions in SL, whereas it is 24 percent in SSLR and ASLR. It can be observed that 24 percent of transactions carry out four speculative executions in SL, whereas it is 5 percent in SSLR and ASLR. Also, in SL, about 10 percent of transactions carry out more than 8 speculative executions, whereas the percentage is very low in the case of SSLR and ASLR. The results show that
Figure 5.7: % of UTs versus Transaction aborts (MPL = 20, #RUs = 8; protocols plotted: 2PL, SSLR, ASLR, FCWR, SI-2PL)
Figure 5.8: Details of speculative executions
Figure 5.9: MPL versus Average Response time
a transaction carries out more speculative executions in SL as compared to SSLR and ASLR. The average number of speculative executions for SL, SSLR and ASLR comes to 4.2, 1.5 and 1.5 respectively. This experiment shows that the proposed protocols require very few speculative executions as compared to SL. So, by consuming fewer extra processing resources, it is possible to improve the performance in an ROT environment.
Figure 5.9 shows the average response time of the 2PL, FCWR, SSLR, ASLR, SL and SI-2PL protocols. The performance of 2PL is poor due to high lock waiting time. As FCWR aborts more UTs due to conflicts, its performance is also poor. SSLR and SL exhibit good performance. We can observe that the performance of the ASLR protocol is marginally ahead of SI-2PL. So, overall, we can conclude that in the unlimited resources environment, the ASLR protocol performs better than the SSLR, SL, FCWR, 2PL and SI-2PL protocols.
5.6.2 Experiments under Limited Resources
In this experiment, we report results obtained by simulating a limited resources environment, in which the number of speculative executions per transaction is restricted. Each processing unit is termed one memory unit (MU), which can be used to process one speculative execution. We assume that a fixed number of MUs is available in the system. The MUs are allocated dynamically based on the requirements of the transaction. If the requested number of MUs is not available, the transaction is made to wait. Figure 5.10 shows the performance of the 2PL, SL, SSLR and ASLR protocols in the simulated limited resources environment. The experiment starts with the number of MUs equal to MPL; we have plotted the results by varying the MUs from MPL to 2*MPL and evaluated the performance. It can be observed that the performance of both SSLR and ASLR reaches its maximum value and saturates at an MU value equal to 1.2*MPL. In contrast, the performance of SL increases linearly with MUs and remains significantly lower than that of SSLR and ASLR even at MUs = 2*MPL. Also, note that the 2PL performance does not vary with the increase in MUs. This experiment shows that the performance of ROTs can be improved significantly with only a fraction (0.2 times) of additional resources.
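The MU allocation rule described above (a fixed pool, dynamic allocation, waiting when the request cannot be met) can be sketched as follows. This is an illustrative model only: the class name `MemoryUnitPool`, its methods and the first-in-first-out wake-up order are assumptions, since the text does not state how waiting transactions are resumed.

```python
from collections import deque


class MemoryUnitPool:
    """Hypothetical fixed pool of memory units (MUs); one MU runs one
    speculative execution."""

    def __init__(self, total_mus):
        self.free = total_mus
        self.waiting = deque()  # (txn_id, requested_mus) in arrival order

    def request(self, txn_id, mus):
        """Allocate `mus` units if available; otherwise queue the transaction."""
        if mus <= self.free:
            self.free -= mus
            return True
        self.waiting.append((txn_id, mus))
        return False

    def release(self, mus):
        """Return units to the pool and resume waiters in FIFO order
        (the FIFO policy is an assumption of this sketch)."""
        self.free += mus
        resumed = []
        while self.waiting and self.waiting[0][1] <= self.free:
            txn_id, need = self.waiting.popleft()
            self.free -= need
            resumed.append(txn_id)
        return resumed
```

With MPL = 20 and a pool of 1.2*MPL = 24 MUs, a transaction asking for 20 MUs is served at once, a second one asking for 8 MUs waits, and it is resumed when the first transaction releases its units.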
Figure 5.10: MPL versus Throughput in limited resources environment (x-axis: Total Memory Units in multiples of MPL; #UTs = 30%, #RUs = 8, MPL = 20; protocols plotted: SSLR, ASLR, 2PL, SL)
5.6.3 Experiments for Resource Utilization
In Figure 5.11, the CPU utilization of the 2PL, FCWR, SSLR and ASLR protocols is shown. It can be noted that CPU utilization is high in the case of FCWR. This is due to the fact that more UTs are aborted and resubmitted. Between SSLR and ASLR, the CPU utilization of ASLR is higher. Note that in SSLR, a transaction waits for the lock until the preceding transaction produces the after-image, whereas in ASLR, a transaction accesses the available version of the data object and carries out processing. Due to more waiting, the CPU utilization is lower in SSLR. Also, the CPU utilization of 2PL is much lower than that of the other protocols due to more waiting. In Figure 5.12, the I/O device utilization of the 2PL, FCWR, SSLR and ASLR protocols is shown. The trend is similar to that of the CPU utilization graphs in Figure 5.11.
5.6.4 Experiment for Data Currency
In Figure 5.13, the variation of average data currency with MPL is shown. A more negative value of “average data currency” indicates that the transactions are provided with lower “data currency”, and a less negative value indicates that the transactions are provided with higher “data currency”. We can observe from the figure that the “average data currency” value of the 2PL, SSLR and ASLR protocols is 0, which indicates that these protocols provide the highest possible “data currency” to ROTs, whereas the FCWR protocol provides low “data currency” to ROTs. Note that, as MPL increases, the magnitude of the negative “average data currency” value also increases, which indicates that the “data currency” provided by FCWR to ROTs decreases.
5.7 Discussion
In the previous chapter (Section 4.5), as a part of the discussion, we explained the assumptions that were followed for simulating the speculative executions of ROTs. A similar process has been followed for simulating the speculative executions of ROTs under the ASLR protocol. In addition to the implementation issues (like lock conversion, index locking and creation of speculative threads) which have been discussed
Figure 5.11: MPL versus CPU Utilization (#RUs = 8, #UTs = 30%; protocols plotted: 2PL, SSLR, ASLR, FCWR)
Figure 5.12: MPL versus I/O device Utilization (#RUs = 8, #UTs = 30%; protocols plotted: 2PL, SSLR, ASLR, FCWR)
Figure 5.13: MPL versus Average Data currency
for SSLR protocol, we explain the following implementation issue concerning the ASLR protocol. In the ASLR protocol, as the speculative threads of an ROT proceed at different paces, any of the speculative threads of an ROT can request a lock on a data object. Also, once a lock is obtained on a data object by a speculative thread of an ROT, the other speculative threads of that ROT need not request the lock on the same data object. Instead, these threads can access the before-images from memory and continue the execution. Once the after-image becomes available, further speculative threads can be created in the system.
5.8 Chapter Summary
In this chapter, we discussed the details of the proposed asynchronous speculative locking protocol for ROTs. The proposed protocol improves the performance of ROTs without compromising correctness or data currency. The results of the simulation study indicate that the proposed ASLR protocol performs better than the 2PL, FCWR, SL and SSLR protocols. The proposed protocol requires 0.2 times additional processing resources for improving the performance. In the next chapter, we explain the semantics-based protocols.
Chapter 6
Semantics-Based Speculative Locking Protocols for ROTs
In SL-based protocols, ROTs are processed using speculation and UTs are processed with 2PL. So, the UTs that conflict with ROTs or with other UTs have to wait in the queue. We have investigated the ROT processing environment further and found that it is possible to reduce the waiting of UTs by exploiting the semantics of ROTs. In this chapter, we discuss how the semantics of ROTs enhance the performance of both the SSLR and ASLR protocols. In Section 6.1, we discuss the basic idea of the proposed semantics-based protocols. In Section 6.2, we present the details of the proposed semantics-based synchronous speculative locking protocol for ROTs (SSSLR). In Section 6.3, we discuss the details of the proposed semantics-based asynchronous speculative locking protocol for ROTs (SASLR). In Section 6.4, we compare the schedules produced by the SSLR, SSSLR, ASLR and SASLR protocols. In Section 6.5, we discuss the performance results of the SSSLR and SASLR protocols. We present the implementation issues of the protocols in Section 6.6.
6.1 Basic Idea of Semantics-Based Protocol
In both the SSLR and ASLR protocols, the UTs follow 2PL. So, the UTs that conflict with ROTs are forced to wait in the queue. We have investigated the UTs and come up with a semantics-based approach that can process UTs without waiting if they satisfy certain conditions. We have also come up with a method in which ROTs can be executed without blocking and also without spawning speculative executions. Since the 1980s, database researchers have been working to exploit the semantics of applications for improving the performance of transactions. In the literature, many research works based on the commutativity property have been discussed. The steps of a transaction that satisfy the commutativity property can be executed in parallel. This type of processing improves the performance of transactions [32] [33]. We have made efforts to exploit the semantics of ROTs for improving the performance. We have proposed a new notion called “compensatability” for ROTs. In this section, we propose a semantics-based synchronous speculative locking protocol for ROTs that exploits both speculation and the “compensatability” property. The proposed protocol can improve the performance by increasing the UT throughput and by reducing the number of speculative executions for ROTs.
6.1.1 Basic Idea
Basically, in both SSLR and ASLR, UTs wait for ROTs. The basic idea is as follows: if the computation carried out by an ROT is compensatable, the UTs need not wait. Both the ROT and the UT can proceed in parallel. But, before completion, the ROT compensates the computation. If we identify the ROTs that are “compensatable”, then, by modifying the transaction code, the parallelism can be improved by allowing the execution of UTs in parallel with conflicting ROTs. The property of ROTs that allows compensation is called “compensatability”, which is defined as follows.
Definition 3 Compensatability. Let Ti be an ROT and Tj be a UT. Consider that Ti accesses data object ‘x’ at time instant ‘ts’ and produces a new data object ‘y’ at time instant ‘te’ (te > ts). In parallel, Tj accesses ‘x’ and produces ‘x′’ at time instant ‘tu’ (ts < tu < te). Here ‘x′’ is the value of ‘x’ modified by Tj. Consider that, if Ti had accessed ‘x′’, it would have produced a new data object ‘z’. We say that the computation by Ti on ‘x’ is “compensatable” if there exists a function or computation ‘g’ such that z = g(x′). Suppose Ti accesses ‘n’ data objects. The computation carried out by Ti is “compensatable” if it is “compensatable” for each conflicting data object.
The problem is to find the function ‘g’ for each data object accessed by an ROT. It may be difficult for UTs. However, it is not that difficult for ROTs, as they do not modify any data objects. In addition, for those ROTs which produce arithmetic results such as SUM, AVERAGE, PERCENTAGE and so on, we consider that it is easy to find the function ‘g’ and improve the parallelism.
We now explain the notion of “compensatability” with an example. Let T1 be an ROT and T2 be a UT. Let ‘x’, ‘y’ and ‘z’ be integer data objects. T1 and T2 are defined as given below.
T1: r[x], r[y], z = x + y, d[z], commit.
T2: r[y], y = y + 10, w[y′], commit.
Both T1 and T2 are concurrent transactions. We can assume that, while T1 is performing the computation (z = x + y), T2 modifies the value of ‘y’ to ‘y + 10’ (y′). In T1, d[z] refers to displaying the value of ‘z’ on the terminal. Note that the computation performed by T1 (addition) satisfies the “compensatability” property. Here the function ‘g’ can be expressed as “z + (y′ - y)”. The transaction T1 has to perform the compensating function ‘g’ after its completion if T2 completes first. Otherwise, there is no need for the compensating computations. So, if the ROTs perform computations of the “compensatable” type and they conflict with UTs, then the ROTs and UTs can be processed in parallel without blocking, which results in increased performance. However, before committing, the ROTs have to perform compensating computations by reading the updated data produced by the conflicting committed UTs.
Compensating operations
In the proposed protocol, ROTs of the compensatable type are processed without blocking. However, when a UT conflicts with an ROT of the compensatable type, or an ROT of the compensatable type conflicts with a UT, the transaction identification number of the UT and the identification number of the data object modified by that UT are recorded in a list. During its commit time, an ROT of the compensatable type has to read the identification numbers of the conflicting UTs and data objects from this list and search for the same in the
                        Lock held by Ti
Lock requested by Tj    CR     NR     RU     EW        SPW
CR                      yes    yes    yes    sm yes    sm yes
NR                      yes    yes    yes    no        sp yes
RU                      yes    yes    yes    no        no
EW                      yes    no     no     no        no
Figure 6.1: Lock compatibility matrix for SSSLR
transaction log. The transaction log is searched in reverse order by the ROTs of the compensatable type. This is similar to the approach followed in [23]. After reading the up-to-date values of the data objects from the transaction log, the ROT of the compensatable type can perform the compensating computations as per the procedure available in its transaction program. The procedure to perform compensating computations has to be developed by the database programmers. We believe that only a few lines of code have to be added to the transaction program for performing compensating computations. The software routine for searching the transaction log has to be available in the transaction manager for the use of ROTs of the compensatable type. It can be observed that the notion of “compensatability” can be extended to both the SSLR and ASLR protocols. In the following sections, we explain both semantics-based SSLR (SSSLR) and semantics-based ASLR (SASLR).
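The T1/T2 example of Section 6.1.1 and the commit-time log search described above can be put together in a short sketch. It is only an illustration under stated assumptions: the log is modelled as a list of `(txn_id, object, after_value)` records for committed UTs in commit order, the conflict list holds `(txn_id, object)` pairs, and the function names are hypothetical. The compensating function is the one given in the text, g = z + (y′ - y).

```python
def run_sum_rot(snapshot):
    """T1: r[x], r[y], z = x + y, using the values it read originally."""
    return snapshot["x"] + snapshot["y"]


def latest_logged_value(log, txn_ids, obj):
    """Search the log in reverse for the most recent value of `obj`
    written by one of the conflicting UTs; None if there is no record."""
    for rec_txn, rec_obj, value in reversed(log):
        if rec_obj == obj and rec_txn in txn_ids:
            return value
    return None


def compensate_sum(z, snapshot, conflicts, log):
    """Apply g = z + (y' - y) once for every object in the conflict list."""
    txn_ids = {txn for txn, _ in conflicts}
    for obj in {o for _, o in conflicts}:
        new_value = latest_logged_value(log, txn_ids, obj)
        if new_value is not None:
            z += new_value - snapshot[obj]
    return z


# T1 reads x = 5 and y = 7; T2 then commits y' = y + 10 = 17.
snapshot = {"x": 5, "y": 7}
z = run_sum_rot(snapshot)                  # 12, computed from the old y
conflicts = [("T2", "y")]                  # recorded when T2 conflicted with T1
log = [("T2", "y", 17)]                    # committed after-image of y
z_compensated = compensate_sum(z, snapshot, conflicts, log)
```

Here `z_compensated` equals 22, the value T1 would have produced had it read y′ directly. If T2 had not committed, the log would hold no record for ‘y’ and `z` would be left unchanged, which corresponds to the case in which no compensation is needed.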
6.2 The Semantics-Based SSLR protocol
6.2.1 Overview
The proposed protocol exploits the “compensatability” property of the operations used in the applications for improving the performance of ROTs. In this approach, based on the “compensatability” property of the operations performed, we classify the ROTs into two types, namely compensatable ROTs (CROTs) and non-compensatable ROTs (NCROTs). If all the computing operations of an ROT satisfy the “compensatability” property, we call that ROT a CROT. Otherwise, we call the ROT an NCROT. We allow CROTs to execute without blocking and NCROTs to follow synchronous speculation as per SSLR. Note that CROTs do not perform speculative executions, but they have to perform compensating computations during commit time. 2PL is chosen to process the UTs. However, the UTs conflicting with CROTs alone are processed without blocking.
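The classification rule above can be written down in a few lines. This is a sketch under assumptions: operations are represented by name, the set `COMPENSATABLE_OPS` contains only the arithmetic examples named in Section 6.1.1, and `OPAQUE_REPORT` is a made-up operation standing for any computation with no known compensating function.

```python
# Examples of compensatable computations named in Section 6.1.1.
COMPENSATABLE_OPS = {"SUM", "AVERAGE", "PERCENTAGE"}


def classify_rot(operations):
    """An ROT is a CROT only if every computing operation is compensatable."""
    if all(op in COMPENSATABLE_OPS for op in operations):
        return "CROT"
    return "NCROT"


def read_lock_for(kind):
    """CROTs request CR-locks; NCROTs request NR-locks."""
    return {"CROT": "CR", "NCROT": "NR"}[kind]


# OPAQUE_REPORT is a hypothetical operation without a compensating function.
kind_a = classify_rot(["SUM", "AVERAGE"])
kind_b = classify_rot(["SUM", "OPAQUE_REPORT"])
```

In this sketch `kind_a` is `"CROT"` and `kind_b` is `"NCROT"`; a single non-compensatable operation is enough to send the whole ROT down the speculative path.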
Types of locks
The CROTs request compensating read locks (CR-locks) for reading. The NCROTs request non-compensating read locks (NR-locks) for reading. The UTs request read update locks (RU-locks) for reading and exclusive write locks (EW-locks) for writing. The lock compatibility matrix of SSSLR is shown in Figure 6.1. The entry “yes” indicates that the corresponding locks are compatible and “no” indicates that the corresponding locks are incompatible. The entry “sm yes” (semantic yes) indicates that the requesting CROT is allowed to continue the execution. However, the CROT has to perform compensating computations during its commit time. Note that the UTs conflicting with CROTs are allowed to continue without blocking, which is different
[Figure 6.2 timeline, horizontal axis: Time, with markers S1 to S4 and C1 to C4. T1: r1[x0] w1[x1] r1[y0] w1[y1] r1[p0] w1[p1]; T2 as speculative executions T21: r2[x0] r2[z0] and T22: r2[x1] r2[z0]; T3: r3[z0] r3[y0] r3[p0] followed by compensating operations; T4: r4[z0] w4[z1].]
Figure 6.2: Depiction of transaction processing with SSSLR
from the 2PL procedure. The entry “sp yes” (speculation yes) indicates that the requesting NCROT carries out speculative executions with the after-image produced by the preceding UT and forms a commit dependency with that UT.
Transaction processing in SSSLR
Figure 6.2 depicts the processing under SSSLR. Here T2 is an NCROT, T3 is a CROT, and T1 and T4 are UTs. Note that T2 waits for T1 to produce the after-image. Once the after-image is available, T2, being an NCROT, proceeds with speculative executions. In the figure, T3 conflicts with T1 on the data objects ‘y’ and ‘p’. Also, T3 conflicts with T4 on the data object ‘z’. Being a CROT, T3 proceeds with its execution without speculation and blocking. During commitment, T3 performs compensating computations by reading the modified values of ‘y’, ‘p’ and ‘z’ from the transaction log. These values are available in the log as the UTs T1 and T4 are committed before T3. In the SSSLR protocol, the UTs conflicting with CROTs alone are executed without blocking. By following this procedure, T4 is executed without blocking even though it conflicts with T3 on the data object ‘z’. Note that this type of processing does not violate the serializability criteria.
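The compatibility matrix of Figure 6.1 lends itself to a table lookup. The sketch below encodes the matrix as read from the figure and from the lock-acquisition conditions in the text; the names `SSSLR_MATRIX`, `compat` and `decide`, and the rule for combining several held locks (any “no” blocks, otherwise any conditional entry wins), are assumptions made for illustration.

```python
# Columns: lock held by Ti. Rows: lock requested by Tj (as in Figure 6.1).
HELD = ("CR", "NR", "RU", "EW", "SPW")
SSSLR_MATRIX = {
    "CR": ("yes", "yes", "yes", "sm yes", "sm yes"),
    "NR": ("yes", "yes", "yes", "no", "sp yes"),
    "RU": ("yes", "yes", "yes", "no", "no"),
    "EW": ("yes", "no", "no", "no", "no"),
}


def compat(requested, held):
    """Look up one cell of the matrix."""
    return SSSLR_MATRIX[requested][HELD.index(held)]


def decide(requested, held_locks):
    """Combine the cells for all locks currently held on the object.

    Assumed combination rule: any 'no' blocks the request; otherwise a
    conditional entry ('sm yes' or 'sp yes') takes precedence over 'yes'.
    """
    cells = [compat(requested, h) for h in held_locks]
    if "no" in cells:
        return "no"
    for conditional in ("sp yes", "sm yes"):
        if conditional in cells:
            return conditional
    return "yes"
```

For instance, `compat("EW", "CR")` is `"yes"`, which is the cell that lets T4 write ‘z’ while the CROT T3 holds a CR-lock on it, and `compat("NR", "SPW")` is `"sp yes"`, the case in which T2 speculates on the after-image of T1.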
6.2.2 The SSSLR Protocol
For each data object, a lockqueue is maintained to store the lock requests. The CROTs use the transaction log to read the updated values produced by the conflicting UTs. The list dependset(Tij) stores the commit dependency details of the jth speculative execution of Ti. This list is maintained for each speculative execution of NCROTs. The list dependset(Ti) stores the details of the UTs with which the CROT Ti had conflicts during its execution. This list is used by the CROTs for identifying conflicting UTs while performing compensating computations.
• Protocol for UTs
In SSSLR, the UTs request RU-locks for reading and EW-locks for writing. The protocol for UTs is as follows. Let Ti be a UT.
1. Lock acquisition. Ti requests an RU-lock to read ‘x’ or an EW-lock to write ‘x’. The lock request is entered into the lockqueue.
1.1 Ti obtains the RU-lock if no transaction holds an EW-lock or SPW-lock. Step (2) is followed.
1.2 Ti obtains the EW-lock on ‘x’ if no transaction holds an RU-, NR-, EW- or SPW-lock.
2. Execution. During execution, whenever Ti produces the after-image for a data object, EW-lock on the data object is converted into SPW-lock. If Ti obtains all the locks, step (3) is followed. Otherwise step (1) is followed. 3. Commit/Abort Rule. Whenever Ti commits, the speculative executions of NCROTs that have been carried out with before-images of Ti are terminated. Whenever Ti aborts, the speculative executions of NCROTs which have been carried out with after-images of Ti are terminated. Commit status of Ti is added into the dependset of CROTs which are having dependency with Ti . Whenever Ti commits or aborts, the information regarding Ti is deleted from the dependset maintained for each of the speculative execution of NCROTs which are dependent on Ti . Also, all the related lock entries of Ti are deleted. • Protocol for CROTs In SSSLR, the CROTs request CR-locks for reading. The protocol for CROTs is as follows. Let Tj be a CROT. 4. Lock acquisition. Let Tj requests for CR-lock to read ‘x’. The lock is allocated to Tj . If the lock request conflicts with a UT (sm yes), then the details of that UT is added into the dependset(Tj ). 5. Execution. Tj continues the execution by accessing ‘x’. If Tj obtains all the locks then step (6) is followed. Otherwise, step (4) is followed. 6. Commit/Abort Rule. Whenever Tj commits, necessary compensating computations specified in its transaction program are performed using updated values available in the transaction log, which have already been produced by the conflicting committed UTs. The details of conflicting committed UTs are available in dependset(Tj ). All the related lock entries of Tj are deleted. If Tj aborts, then also the lock entries of Tj are deleted. • Protocol for NCROTs In SSSLR, the NCROTs request NR-locks for reading. The protocol for NCROTs is as follows. Let Tk be a NCROT. 7. Lock acquisition. Let Tk requests for NR-lock to read ’x’. The lock request is entered into the lockqueue. 
7.1 If no transaction holds an EW- or SPW-lock on ‘x’, the NR-lock is allocated to Tk. Step (8.1) is followed.
7.2 If a preceding transaction holds an SPW-lock (sp yes), the NR-lock is granted. The identifier of the preceding transaction that holds the SPW-lock on ‘x’ is included in Tk’s dependset. Step (8.2) is followed.
8. Execution.
8.1 Tk continues with its current executions by accessing ‘x’. Step (8.3) is followed.
8.2 Each execution of Tk is split into two speculative executions: one with the before-image and the other with the after-image. The identifier of the preceding transaction is added to the dependset of the speculative executions of Tk that used the after-image of ‘x’.
8.3 If Tk has obtained all its locks, step (9) is followed. Otherwise, step (7) is followed.
                      Lock requested by Tj
Lock held by Ti       CR        NR         RU     EW
CR                    yes       yes        yes    yes
NR                    yes       yes        yes    no
RU                    yes       yes        yes    no
EW                    sm yes    asp yes    no     no
SPW                   sm yes    asp yes    no     no
Figure 6.3: Lock compatibility matrix for SASLR
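[Timeline diagram; horizontal axis: Time. T1 (UT): r1[x0] w1[x1] r1[p0] w1[p1] r1[q0] w1[q1] C1. T2 (NCROT): speculative execution T21 performs r2[x0] r2[y0] r2[z0]; speculative execution T22 performs r2[x1]; C2. T3 (CROT): r3[z0] r3[y0] r3[p0], compensating operations, C3. T4 (UT): r4[z0] w4[z1] C4.]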
Figure 6.4: Depiction of transaction processing with SASLR
9. Commit/Abort Rule. Once Tk’s processing is completed, one of its speculative executions is chosen as follows. Suppose Tk has completed at time ‘t’. Tk retains the speculative execution that contains the effects of all the transactions that committed before ‘t’. If Tk is aborted, all its speculative executions are also aborted. The locks allocated to Tk are released after termination and the lock entries of Tk are deleted.
6.3 The Semantics-Based ASLR Protocol
6.3.1 Overview
In the proposed SASLR protocol, the NCROTs are executed with asynchronous speculation, the CROTs are executed without speculation, and the UTs are processed with 2PL. In addition, UTs that conflict only with CROTs are processed without blocking. The CROTs have to perform compensating computations after completing their execution. The notion of “compensatability” and the compensating operations are defined in Section 6.1.
Types of locks
The lock compatibility matrix of SASLR is shown in Figure 6.3. The entry “yes” indicates that the corresponding locks are compatible and “no” indicates that they are incompatible. The entry “sm yes” (semantic yes) indicates that the requesting CROT is allowed to continue its execution; however, the CROT has to perform compensating computations at commit time. Note that the UTs conflicting with CROTs are allowed to continue without blocking, which differs from the 2PL procedure. The “asp yes” (asynchronous speculation yes) entry for the EW-NR conflict indicates that the requesting NCROT carries out the possible speculative executions with the available versions of the data object. Once the after-image is produced, the requesting NCROT can start further speculative executions. The entry “asp yes” for the SPW-NR conflict indicates that the requesting NCROT carries out speculative executions by accessing the after-image produced by the preceding UT and forms a commit dependency with that UT.
Transaction processing in SASLR
Figure 6.4 depicts the processing under SASLR. Here T2 is an NCROT, T3 is a CROT, and T1 and T4 are UTs. T2 starts its execution by reading the before-image of the data object ‘x’. Once the after-image of ‘x’ becomes available, T2 starts another speculative execution, following the asynchronous speculation policy. In the figure, T3 conflicts with T1 on the data object ‘p’ and with T4 on the data object ‘z’. Being a CROT, T3 proceeds without speculation and without blocking. At commit time, T3 performs compensating computations by reading the modified values of ‘p’ and ‘z’ from the transaction log. These values are available in the log because the UTs T1 and T4 commit before T3. In the SASLR protocol, UTs that conflict only with CROTs are executed without blocking. Accordingly, T4 is executed without blocking even though it conflicts with T3 on the data object ‘z’. Note that this type of processing does not violate the serializability criteria.
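The entries of Figure 6.3 can be read as a lookup table. The following Python sketch transcribes the matrix and resolves a request against the locks already held on a data object. The names COMPAT and decide, and the priority order used to combine several held locks ("no" first, then "asp yes", then "sm yes"), are illustrative assumptions and not part of the protocol specification.

```python
# Lock compatibility for SASLR, transcribed from Figure 6.3.
# Rows: lock held by Ti. Columns: lock requested by Tj.
REQUESTED = ("CR", "NR", "RU", "EW")

_ROWS = {
    "CR":  ("yes",    "yes",     "yes", "yes"),
    "NR":  ("yes",    "yes",     "yes", "no"),
    "RU":  ("yes",    "yes",     "yes", "no"),
    "EW":  ("sm yes", "asp yes", "no",  "no"),
    "SPW": ("sm yes", "asp yes", "no",  "no"),
}

# (held, requested) -> matrix entry
COMPAT = {
    (held, req): entry
    for held, row in _ROWS.items()
    for req, entry in zip(REQUESTED, row)
}


def decide(held_by_others, requested):
    """Combine the matrix entries for every lock currently held on the object.

    Assumed priority: any "no" blocks the request; otherwise an "asp yes"
    (speculate) or "sm yes" (compensate at commit) qualifies the grant.
    """
    entries = {COMPAT[(held, requested)] for held in held_by_others}
    for outcome in ("no", "asp yes", "sm yes"):
        if outcome in entries:
            return outcome
    return "yes"
```

For instance, decide(["EW"], "CR") yields "sm yes": the CROT proceeds and compensates at commit time, while decide(["NR"], "EW") yields "no", so a UT still waits for an NCROT.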
6.3.2 The SASLR Protocol
For each data object, a lockqueue is maintained to store the lock requests. The CROTs use the transaction log to read the updated values produced by the conflicting UTs. The list dependset(Tij) stores the commit dependency details of the jth speculative execution of Ti; it is maintained for each speculative execution of an NCROT. The list dependset(Ti) stores the details of the UTs with which the CROT Ti had conflicts during its execution; the CROT uses it to identify the conflicting UTs while performing compensating computations.
• Protocol for UTs
In SASLR, the UTs request RU-locks for reading and EW-locks for writing. The protocol for UTs is as follows. Let Ti be a UT.
1. Lock acquisition. Let Ti request an RU-lock to read ‘x’ or an EW-lock to write ‘x’. The lock request is entered into the lockqueue.
1.1 Ti obtains the RU-lock if no transaction holds an EW-lock or SPW-lock on ‘x’. Step (2) is followed.
1.2 Ti obtains the EW-lock on ‘x’ if no transaction holds an RU-, NR-, EW- or SPW-lock on ‘x’.
2. Execution. During execution, whenever Ti produces the after-image of a data object, the EW-lock on that data object is converted into an SPW-lock. If Ti has obtained all its locks, step (3) is followed. Otherwise, step (1) is followed.
3. Commit/Abort Rule. Whenever Ti commits, the speculative executions of NCROTs that have been carried out with the before-images of Ti are terminated. Whenever Ti aborts, the speculative executions of NCROTs that have been carried out with the after-images of Ti are terminated. The commit status of Ti is added to the dependset of the CROTs that have a dependency on Ti. Whenever Ti commits or aborts, the information regarding Ti is deleted from the dependset maintained for each speculative execution of the NCROTs that depend on Ti. Also, all the related lock entries of Ti are deleted.
• Protocol for CROTs
In SASLR, the CROTs request CR-locks for reading. The protocol for CROTs is as follows. Let Tj be a CROT.
4. Lock acquisition. Let Tj request a CR-lock to read ‘x’. The lock is allocated to Tj. If the lock request conflicts with a UT (sm yes), the details of that UT are added to dependset(Tj).
5. Execution. Tj continues its execution by accessing ‘x’. If Tj has obtained all its locks, step (6) is followed. Otherwise, step (4) is followed.
6. Commit/Abort Rule. Whenever Tj commits, the necessary compensating computations specified in its transaction program are performed using the updated values available in the transaction log, which have already been produced by the conflicting committed UTs. The details of the conflicting committed UTs are available in dependset(Tj). All the related lock entries of Tj are deleted. If Tj aborts, its lock entries are deleted as well.
• Protocol for NCROTs
In SASLR, the NCROTs request NR-locks for reading. The protocol for NCROTs is as follows. Let Tk be an NCROT.
7. Lock acquisition. Let Tk request an NR-lock to read ‘x’. The lock request is entered into the lockqueue.
7.1 If no transaction holds an EW- or SPW-lock on ‘x’, the NR-lock is allocated to Tk. Step (8.1) is followed.
7.2 Otherwise, the NR-lock is also granted. The identifier of the preceding transaction that holds the SPW-lock on ‘x’ is included in Tk’s dependset.
7.3 If the preceding transaction holds an SPW-lock (asp yes), step (8.2) is followed. If the preceding transaction holds an EW-lock (asp yes), step (8.3) is followed.
8. Execution.
8.1 Tk continues with its current executions by accessing ‘x’. Step (8.4) is followed.
8.2 Each execution of Tk is split into two speculative executions: one with the before-image and the other with the after-image. Step (8.4) is followed.
8.3 Tk continues with its current executions by accessing ‘x’. Whenever the preceding UT converts its EW-lock into an SPW-lock, Tk starts an additional speculative execution by accessing the after-image of ‘x’ produced by that UT.
8.4 If Tk has obtained all its locks, step (9) is followed. Otherwise, step (7) is followed.
9. Commit/Abort Rule. Suppose one of the speculative executions Tkj of Tk has completed at time ‘t’. If the read set of Tkj contains the effects of all the conflicting transactions that have committed before ‘t’, Tkj is retained and Tk’s other speculative executions are aborted. Otherwise, Tkj is aborted. (Note that one of the speculative executions will be committed.) If Tk is aborted, all of its speculative executions are also aborted. Also, the lock entries of Tk are deleted.
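The splitting in step (8.2) and the selection in step (9) can be illustrated with a small Python sketch. It assumes, as a simplification, that each speculative execution is summarised by a dictionary mapping a conflicting UT to the image ("before" or "after") it read, and that an execution is retained when it used the after-image of exactly those UTs that have committed. The function names and this representation are hypothetical.

```python
def split_on_conflict(executions, ut):
    """Step (8.2): split every current execution into a before-image
    and an after-image variant with respect to the conflicting UT."""
    result = []
    for images in executions:
        result.append({**images, ut: "before"})
        result.append({**images, ut: "after"})
    return result


def select_execution(executions, committed):
    """Step (9), simplified: retain the execution that read the
    after-image of every committed UT and the before-image of the rest."""
    for images in executions:
        if all((img == "after") == (ut in committed)
               for ut, img in images.items()):
            return images
    return None


# One initial execution, then conflicts with the UTs T1 and T4.
execs = split_on_conflict([{}], "T1")
execs = split_on_conflict(execs, "T4")
retained = select_execution(execs, committed={"T1"})
```

With two conflicts there are four speculative executions; if only T1 has committed by time ‘t’, the retained one is the execution that read the after-image of T1 and the before-image of T4.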
6.4 Comparison of Schedules
In the proposed semantics-based protocols, UTs that conflict only with CROTs are processed without waiting. So, the proposed semantics-based protocols produce serializable schedules that differ from those of the SSLR and ASLR protocols. Let T1 be a compensatable ROT and T2 be a UT. Let us assume that T2 requests a lock on ‘x’ after T1 has acquired a lock on ‘x’.
In SSLR, UTs are processed using 2PL. The schedule shown in Figure 6.5 represents an interleaved execution of the transactions T1 and T2 under the SSLR protocol (serialization order: T1, T2). Here T2, being a UT, waits for the lock on ‘x’ to be released. The schedule shown in Figure 6.6 represents an interleaved execution of T1 and T2 under the SSSLR protocol. Note that T2 acquires the lock on ‘x’ even though a read lock is held by T1. In the figure, the symbol ‘#’ indicates that T1 performs a compensating computation during its commitment by reading the version ‘x1’. So, the serialization order is T2, T1, which is different from SSLR.
T1: r[x0] r[y0] r[z0] commit
T2: r[x0] w[x1] commit
Figure 6.5: Serializable schedule of SSLR
T1: r[x0] r[y0] r[z0] #r[x1] commit
T2: r[x0] w[x1] commit
Figure 6.6: Serializable schedule of SSSLR
The schedule shown in Figure 6.7 represents an interleaved execution of T1 and T2 under the ASLR protocol (serialization order: T1, T2). Here also T2, being a UT, waits for the lock on ‘x’ to be released. The schedule shown in Figure 6.8 represents an interleaved execution of T1 and T2 under the SASLR protocol. Here, the symbol ‘#’ indicates that T1 performs a compensating computation during its commitment by reading the version ‘x1’. So, the serialization order is T2, T1, which is different from ASLR.
T1: r[x0] r[y0] r[z0] commit
T2: r[x0] w[x1] commit
Figure 6.7: Serializable schedule of ASLR
T1: r[x0] r[y0] r[z0] #r[x1] commit
T2: r[x0] w[x1] commit
Figure 6.8: Serializable schedule of SASLR
So, we can say that the SSSLR and SASLR protocols produce schedules that are different from the ones produced by the SSLR and ASLR protocols.
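The ‘#r[x1]’ step in Figures 6.6 and 6.8 can be made concrete with a small Python sketch. It assumes, purely for illustration, that the compensatable ROT T1 computes a sum over the objects it reads and that the transaction log exposes the committed after-images as a dictionary; the data values and the function name compensate_sum are hypothetical.

```python
def compensate_sum(total, values_read, committed_updates):
    """Adjust a previously computed sum using after-images from the log.

    values_read: object -> value the ROT originally read (e.g. x0)
    committed_updates: object -> value written by a committed conflicting UT
    """
    for obj, new_value in committed_updates.items():
        if obj in values_read:
            # Replace the contribution of the stale value with the new one.
            total += new_value - values_read[obj]
    return total


# T1 reads x0, y0, z0 and sums them while T2 writes x1 and commits.
values_read = {"x": 10, "y": 20, "z": 30}
total = sum(values_read.values())
log_after_images = {"x": 15}  # x1 produced by the committed UT T2
compensated = compensate_sum(total, values_read, log_after_images)
```

The compensated result equals the sum T1 would have obtained had it run after T2, which is why the serialization order becomes T2, T1.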
Table 6.1: Simulation Parameters, Meaning and Values
Parameter         Meaning                                  Value
dbSize            Number of objects in the database        1000
cpuTime           Time to carry out CPU request            5 ms
ioTime            Time to carry out I/O request            10 ms
rotMaxTranSize    Size of largest ROT transaction          20 objects
rotMinTranSize    Size of smallest ROT transaction         15 objects
utMaxTranSize     Size of largest UT transaction           15 objects
utMinTranSize     Size of smallest UT transaction          5 objects
noResUnits        Number of RUs (1 CPU, 2 I/O)             8
MPL               Multiprogramming level                   20
% of UTs          Percentage of UTs currently active       30% and 50%
% of CROTs        Percentage of CROTs currently active     Simulation variable (10 to 50)
logOverhead       Time to search transaction log           5 ms
6.5 Performance Results
We discuss the results of the simulation experiments after discussing the simulation model and parameters.
6.5.1 Simulation Model and Parameters
We have considered the closed queue model discussed in Chapter 4 for evaluating the performance. Some additional simulation parameters are used here. The simulation parameters and their values are shown in Table 6.1. The parameter “logOverhead” denotes the CPU time for reading the log in reverse chronological order. The parameter “% of UTs” denotes the percentage of UTs currently active in the system, and the parameter “% of CROTs” denotes the percentage of CROTs currently active in the system. Let ‘u’ denote the “% of UTs”, which means that at any point of time ‘u’ percent of the active transactions are UTs. Let ‘c’ denote the “% of CROTs”, which means that at any point of time ‘c’ percent of the active transactions are CROTs. Note that the remaining (100-u-c) percent of the active transactions are NCROTs. We conducted the experiments by varying “% of CROTs” from 10 to 50.
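As a worked illustration of the mix implied by ‘u’ and ‘c’, the following Python sketch converts the percentages into transaction counts for a given multiprogramming level. It assumes that the percentages apply directly to the MPL of Table 6.1; the function name active_mix is hypothetical.

```python
def active_mix(mpl, pct_uts, pct_crots):
    """Return (UTs, CROTs, NCROTs) active at a time for the given MPL."""
    if pct_uts + pct_crots > 100:
        raise ValueError("u + c must not exceed 100")
    uts = mpl * pct_uts // 100
    crots = mpl * pct_crots // 100
    ncrots = mpl - uts - crots  # the remaining (100 - u - c) percent
    return uts, crots, ncrots


mix_low = active_mix(20, 30, 10)
mix_high = active_mix(20, 50, 50)
```

With MPL = 20, u = 30 and c = 10, this gives 6 UTs, 2 CROTs and 12 NCROTs; with u = 50 and c = 50 no NCROTs remain.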
6.5.2 Protocols Simulated
We have compared the SASLR and SSSLR protocols with the 2PL, FCWR, SSLR and ASLR protocols. For all of these protocols, we have assumed that an aborted transaction is resubmitted after a delay equal to the average response time, in order to reduce repeated aborts. For the SSLR, ASLR, SSSLR and SASLR protocols, we have assumed that all the speculative executions are carried out in parallel. Also, we have not taken into account the cost of deadlock detection, as it is the same for all locking-based protocols. Each point in the graphs is the mean of 20 experiments, and each experiment was carried out for 10,000 transactions. The means were computed with 95 percent confidence intervals; these intervals are omitted from the graphs.
[Plot: Throughput (y-axis) versus % of CROTs (x-axis, 10 to 50) for 2PL, FCWR, SSLR, SSSLR, ASLR and SASLR; #UTs = 30%, #RUs = 8, MPL = 20]
Figure 6.9: % of CROTs vs Throughput (30% UTs)
6.5.3 Experiments under 30% and 50% UTs Environment
In the following experiments, we report the results obtained by simulating environments with 30% and 50% UTs. Note that the performance of the 2PL, FCWR, SSLR and ASLR protocols is not affected by a change in the “% of CROTs”, because these protocols treat both CROTs and NCROTs as simple ROTs.
Figure 6.9 shows how the throughput of 2PL, FCWR, SSLR, ASLR, SSSLR and SASLR varies with “% of CROTs”. We can observe that the SSSLR protocol performs marginally better than SSLR in the 30% UTs environment. The figure also shows that the performance of SASLR is slightly better than that of ASLR. In SSSLR and SASLR, UTs conflicting with CROTs are not blocked; however, UTs conflicting with other UTs are blocked. So, the performance of SSSLR and SASLR is only marginally better than that of SSLR and ASLR, respectively. A similar trend can be observed in Figure 6.10.
Figure 6.11 shows how the UT throughput of the 2PL, FCWR, SSLR, ASLR, SSSLR and SASLR protocols varies with “% of CROTs”. It can be observed that both the SSSLR and SASLR protocols perform better than the remaining protocols in the 30% UTs environment. Both protocols allow the UTs conflicting with CROTs to continue their executions without blocking, so their UT throughput is higher than that of the remaining protocols. From Figure 6.12, we can observe that the UT throughput of SSSLR and SASLR is better than in the 30% UTs environment. In the 50% UTs environment, more UTs are allowed to enter the system. Note that UTs conflicting with CROTs are executed without blocking. As the % of CROTs in the system increases, more UTs complete their executions. So, the UT throughput of the SSSLR and SASLR protocols improves in the 50% UTs environment.
Figure 6.13 shows how the ROT throughput of 2PL, FCWR, SSLR, ASLR, SSSLR and SASLR varies with “% of CROTs”. We can observe that the performance of the SSSLR and SASLR protocols decreases slightly as we increase “% of CROTs” under the 30% UTs environment. The policy of allowing UTs conflicting with CROTs to continue their executions makes more UTs actively compete for resources than under SSLR and ASLR. So, the ROT throughput of SSSLR is slightly less than that of SSLR. We can also observe that the ROT throughput of the SASLR protocol is less than that of ASLR. Note that the performance reduction of
[Plot: Throughput (y-axis) versus % of CROTs (x-axis, 10 to 50) for 2PL, FCWR, SSLR, SSSLR, ASLR and SASLR; #UTs = 50%, #RUs = 8, MPL = 20]
Figure 6.10: % of CROTs vs Throughput (50% UTs)
Figure 6.11: % of CROTs vs UT throughput (30% UTs)
[Plot: UT Throughput (y-axis) versus % of CROTs (x-axis, 10 to 50) for 2PL, FCWR, SSLR, SSSLR, ASLR and SASLR; #UTs = 50%, #RUs = 8, MPL = 20]
Figure 6.12: % of CROTs vs UT throughput (50% UTs)
[Plot: ROT Throughput (y-axis) versus % of CROTs (x-axis, 10 to 50) for 2PL, FCWR, SSLR, SSSLR, ASLR and SASLR; #UTs = 30%, #RUs = 8, MPL = 20]
Figure 6.13: % of CROTs vs ROT throughput (30% UTs)
[Plot: ROT Throughput (y-axis) versus % of CROTs (x-axis, 10 to 50) for 2PL, FCWR, SSLR, SSSLR, ASLR and SASLR; #RUs = 8, MPL = 20]
Figure 6.14: % of CROTs vs ROT throughput (50% UTs)
SASLR is less than that of SSSLR. In the SASLR protocol, ROTs need not wait for the after-image to be produced. So, the performance of SASLR is better than that of the SSSLR protocol. A similar trend is observed in Figure 6.14 for the 50% UTs environment.
Figure 6.15 shows the average number of speculative executions per transaction required by SSLR, ASLR, SSSLR and SASLR as “% of CROTs” varies under the 30% UTs environment. We can observe that the average number of speculative executions per transaction is lower for the SSSLR and SASLR protocols. This is because CROTs are processed without speculation in these protocols. In Figure 6.16, we can observe that the performance difference between the SSLR and ASLR protocols on the one hand and the SSSLR and SASLR protocols on the other is larger than the difference observed in Figure 6.15. In the 50% UTs environment, more UTs are allowed to enter the system. Note that UTs are processed without speculation. So, the average number of speculative executions per transaction decreases as “% of CROTs” increases.
6.6 Discussion and Implementation Issues
6.6.1 Discussion
We first discuss the application of semantics to the processing of the transactions specified in the TPC-C benchmark. Next, we discuss the data currency issues.
• Performance enhancement of transactions specified in the TPC-C benchmark
TPC-C [57] is an online transaction processing workload. It is a mixture of read-only and update-intensive transactions that simulate the activities found in complex OLTP application environments. TPC-C uses many tables and transactions. We consider only the STOCK table and the NEW-ORDER and STOCK-LEVEL transactions for our discussion. The NEW-ORDER transaction updates the S_QUANTITY column of the STOCK table for each item specified in the order. The STOCK-LEVEL transaction determines the number of recently sold items that have a stock level below a specified threshold. It reads the S_QUANTITY column of the STOCK table for each comparison. So, from the above discussion, we can classify STOCK-LEVEL as a compensatable ROT and NEW-ORDER as
Figure 6.15: % of CROTs vs Average number of speculative executions (30% UTs)
[Plot: Average number of speculative executions (y-axis) versus % of CROTs (x-axis, 10 to 50) for SSLR, SSSLR, ASLR and SASLR; #UTs = 50%, #RUs = 8, MPL = 20]
Figure 6.16: % of CROTs vs Average number of speculative executions (50% UTs)
a short UT. The proposed approach can execute the NEW-ORDER and STOCK-LEVEL transactions without blocking. At commit time, STOCK-LEVEL has to perform some compensating operations to include the updates of the NEW-ORDER transaction. Thus the proposed approach improves the performance. Also, it provides high data currency to ROTs.
• Data currency issues
The data currency provided to ROTs is very important for some web-based information systems. For example, consider an on-line stock exchange system, which provides facilities such as the purchase and sale of company shares for its registered users. For the read-only queries issued by the users of such a system, the SI-based protocols return the data that was available in the database before the start of the current query, and a decision taken on the basis of this data may affect the financial benefits of the users. The 2PL protocol provides recent data by including the effects of concurrent updates, but its response time is higher. The proposed approach also provides recent data by including the effects of concurrent updates, with the advantage that its response time will be less than that of 2PL. So, the users get recent data quickly and can take a better decision.
6.6.2 Implementation Issues
Classification of ROTs
A ROT can be classified as a compensatable ROT if the computations it performs are compensatable. This classification information can be recorded in the transaction program itself. The problem is to identify the “compensatability” property of the operations. If a ROT performs computations of an arithmetic type, it is easy to identify the “compensatability” property of the computations. For other computations, however, checking the “compensatability” property is difficult. This issue requires further investigation.
6.7 Chapter Summary
In this chapter, we introduced a notion called “compensatability” for classifying ROTs into compensatable and non-compensatable ROTs. We proposed the semantics-based SSLR and ASLR protocols by combining speculation and semantics. These protocols process compensatable ROTs without speculation and blocking, and non-compensatable ROTs with speculation. Also, these protocols process UTs conflicting with compensatable ROTs without blocking. These protocols improve the performance of ROTs by carrying out fewer speculative executions. The simulation results indicate that these protocols perform better than 2PL, FCWR and the SL-based protocols for ROTs. In the next chapter, we discuss the analytical methods for evaluating the performance of the 2PL, FCWR and ASLR protocols.
Chapter 7
Performance Evaluation through Analytical Modeling
In the literature, analytical methods are discussed to evaluate the performance of concurrency control protocols. In this chapter, after discussing the procedure for deriving the analytical model for 2PL and optimistic concurrency control protocols, we present the analytical procedures for the proposed ASLR protocol. The results of our simulation and analytical studies indicate that the ASLR protocol performs better than the 2PL and SI-based FCWR protocols. In Section 7.1, we present the analytical study of 2PL and optimistic protocols. In Section 7.2, we discuss the analytical procedures for evaluating the performance of the 2PL, FCWR and ASLR protocols in a ROT processing environment. In Section 7.3, we compare the performance results of the analytical and simulation studies for the 2PL, FCWR and ASLR protocols.
7.1 Performance Evaluation of 2PL and Optimistic Protocols
In this section, first, we discuss the transaction model. Next, we discuss the hardware resource model used for calculating processing delays in accessing hardware devices. Subsequently, we discuss the response time study for 2PL and optimistic protocols by considering UTs.
7.1.1 Transaction Model
The transaction model consists of nL + 2 states, where nL is the number of data objects accessed by the transaction. The transaction has an initial setup phase, which is denoted as state 0. Following the initial setup, a transaction progresses through states 1, 2, ..., nL, in that order. This is the execution phase. At the start of each state i, the transaction begins to access a new data object, and it moves to state i + 1 when the next data object access begins. At the end of state nL, if successful, the transaction enters the commit phase at state nL + 1. A conflict is defined to be an event in which a transaction accesses a data object that is currently accessed or in use by another transaction in an incompatible mode. The result of a conflict is either a transaction wait or a transaction abort. In this section, for the analytical study, we have considered only one class of transactions, which perform read and update operations on the database.
7.1.2 Hardware Resource Model
In this subsection, we discuss the hardware resource model used for calculating processing delays in accessing hardware devices such as the CPU and I/O devices. We describe the model by considering K tightly coupled processors and a database spread over 2K disks. Assuming an open queue environment with Poisson arrivals of transactions, we model the processors and disks as M/M/m queues (m refers to multiple CPU and I/O servers) with FCFS discipline. The notations used here are described in Table 7.2. The CPU and I/O waiting times are calculated based on the procedure discussed in [58].
(i) CPU waiting time
Let $\mathit{arrivalRate}$ be the rate at which transactions arrive into the system, $\mathit{Time}_{cpuServer}$ be the CPU time required to access a data object and to perform the computations on that data object, and $N_{cpuServers}$ be the number of CPUs present in the system. Then the utilization of the CPUs ($\mathit{cpuUtilization}$) can be expressed as follows.
\[ \mathit{cpuUtilization} = \frac{\mathit{arrivalRate} \times \mathit{Time}_{cpuServer}}{N_{cpuServers}} \tag{7.1} \]
Let $\mathit{cpuWaitingTime}_{queue}$ be the average waiting time of the data object access requests generated by the transactions for a CPU. The average number of requests currently waiting in the queue maintained for the CPUs is given by
\[ \mathit{Length}_{cpuQueue} = \mathit{arrivalRate} \times \mathit{cpuWaitingTime}_{queue} \tag{7.2} \]
The probability that there are no requests waiting in the queue maintained for the CPUs is given by
\[ \mathit{Prob}_{0\,tasks} = \left[ 1 + \frac{(N_{cpuServers} \times \mathit{cpuUtilization})^{N_{cpuServers}}}{N_{cpuServers}! \times (1 - \mathit{cpuUtilization})} + \sum_{n=1}^{N_{cpuServers}-1} \frac{(N_{cpuServers} \times \mathit{cpuUtilization})^{n}}{n!} \right]^{-1} \tag{7.3} \]
The probability that there are at least as many requests as there are CPUs is given by
\[ \mathit{Prob}_{tasks \geq N_{cpuServers}} = \frac{(N_{cpuServers} \times \mathit{cpuUtilization})^{N_{cpuServers}} \times \mathit{Prob}_{0\,tasks}}{N_{cpuServers}! \times (1 - \mathit{cpuUtilization})} \tag{7.4} \]
Then the average waiting time of the transaction requests waiting in the queue maintained for the CPUs is given by
\[ \mathit{cpuWaitingTime}_{queue} = \mathit{Time}_{cpuServer} \times \frac{\mathit{Prob}_{tasks \geq N_{cpuServers}}}{N_{cpuServers} \times (1 - \mathit{cpuUtilization})} \tag{7.5} \]
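Eqs. (7.1) to (7.5) can be evaluated directly. The following Python sketch is one way to do so for an M/M/m station; it reads the numerator of Eq. (7.4) as the same power term that appears in Eq. (7.3), which is an interpretation on our part, and the function name mmm_waiting_time is hypothetical. The same function applies to the disks by passing the I/O service time and the number of disks.

```python
from math import factorial


def mmm_waiting_time(arrival_rate, service_time, servers):
    """Return (utilization, prob_0_tasks, prob_tasks_ge_servers, waiting_time)."""
    rho = arrival_rate * service_time / servers                  # Eq. (7.1)
    if rho >= 1:
        raise ValueError("utilization must be below 1 for a stable queue")
    load = servers * rho
    # Common term (N * rho)^N / (N! * (1 - rho)) of Eqs. (7.3) and (7.4).
    tail = load ** servers / (factorial(servers) * (1 - rho))
    partial = sum(load ** n / factorial(n) for n in range(1, servers))
    prob_0 = 1.0 / (1 + tail + partial)                           # Eq. (7.3)
    prob_wait = tail * prob_0                                     # Eq. (7.4)
    waiting = service_time * prob_wait / (servers * (1 - rho))    # Eq. (7.5)
    return rho, prob_0, prob_wait, waiting


# Illustrative values: 0.1 requests per ms, 10 ms service time, 2 servers.
rho, prob_0, prob_wait, waiting = mmm_waiting_time(0.1, 10.0, 2)
queue_length = 0.1 * waiting                                      # Eq. (7.2)
```

With a single server the expressions reduce to the familiar M/M/1 results, which gives a quick sanity check of the reconstruction.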
(ii) I/O waiting time
Let $\mathit{Time}_{ioServer}$ be the time required to access a database object from the disk and $N_{ioServers}$ be the number of disks present in the system. Then the utilization of the disks is given by
\[ \mathit{ioUtilization} = \frac{\mathit{arrivalRate} \times \mathit{Time}_{ioServer}}{N_{ioServers}} \tag{7.6} \]
Let $\mathit{ioWaitingTime}_{queue}$ be the average waiting time of the data object access requests generated by the transactions for a disk. The average number of requests currently waiting in the queue maintained for the disks is given by
\[ \mathit{Length}_{ioQueue} = \mathit{arrivalRate} \times \mathit{ioWaitingTime}_{queue} \tag{7.7} \]
The probability that there are no requests waiting in the queue maintained for the disks is given by
\[ \mathit{Prob}_{0\,tasks} = \left[ 1 + \frac{(N_{ioServers} \times \mathit{ioUtilization})^{N_{ioServers}}}{N_{ioServers}! \times (1 - \mathit{ioUtilization})} + \sum_{n=1}^{N_{ioServers}-1} \frac{(N_{ioServers} \times \mathit{ioUtilization})^{n}}{n!} \right]^{-1} \tag{7.8} \]
The probability that there are at least as many requests as there are disks is given by
\[ \mathit{Prob}_{tasks \geq N_{ioServers}} = \frac{(N_{ioServers} \times \mathit{ioUtilization})^{N_{ioServers}} \times \mathit{Prob}_{0\,tasks}}{N_{ioServers}! \times (1 - \mathit{ioUtilization})} \tag{7.9} \]
Then the average waiting time of the transaction requests waiting in the queue maintained for the disks is given by
\[ \mathit{ioWaitingTime}_{queue} = \mathit{Time}_{ioServer} \times \frac{\mathit{Prob}_{tasks \geq N_{ioServers}}}{N_{ioServers} \times (1 - \mathit{ioUtilization})} \tag{7.10} \]
7.1.3 Response Time Study for 2PL and Optimistic Protocols
In a database management system, the transaction response time depends on the occurrence of data conflicts in accessing the database and also on the queuing and processing delays in accessing hardware resources such as the CPU and I/O devices. In the presence of data contention, the probability of data conflict for any transaction depends not only on the concurrency control scheme but also on the transaction response time, which in turn depends on the conflict probability. This cyclic effect needs to be incorporated into the analytical model. In this subsection, we first discuss the analytical model for computing the probability of lock contention and the average response time for 2PL. Next, we discuss the analytical methodology for computing the probability of aborts and the average response time for optimistic protocols. Here, we have not considered the effect of deadlocks in evaluating performance. The notations used in this subsection are explained in Table 7.2.
(i) 2PL protocol
Let $R_{INPL}$ be the time taken for loading and initiating a transaction, $R_E$ be the time taken for executing a transaction, $N_L$ be the number of data objects accessed by the transaction, $P_W$ be the probability of lock contention, $R_W$ be the mean lock waiting time and $T_{Commit}$ be the average commit time to reflect the updates into the database. The mean transaction response time for the 2PL protocol can be expressed as follows.
\[ R = R_{INPL} + R_E + N_L P_W R_W + T_{Commit} \tag{7.11} \]
We can consider $R_{INPL}$ as a constant for all transactions. Let ‘c’ be the time taken to commit the update of a single data object into the database. Then $T_{Commit}$ can be calculated as given below.
\[ T_{Commit} = N_L\, c \tag{7.12} \]
Let $a$ be the mean time spent for reading a data object and performing computations. Then we can calculate $R_E$ as given below.
\[ R_E = N_L\, a \tag{7.13} \]
The variable $a$ can be computed as given below by considering the hardware resource model discussed in the previous subsection.
\[ a = \mathit{Time}_{cpuServer} + \mathit{cpuWaitingTime}_{queue} + \mathit{Time}_{ioServer} + \mathit{ioWaitingTime}_{queue} \tag{7.14} \]
In the above Eq., $\mathit{Time}_{cpuServer}$ and $\mathit{Time}_{ioServer}$ are constants. We have to derive the equations required for computing $\mathit{cpuWaitingTime}_{queue}$ and $\mathit{ioWaitingTime}_{queue}$. Note that a waiting transaction does not consume resources. This factor is considered in calculating both the CPU and the I/O waiting time. For computing $\mathit{cpuWaitingTime}_{queue}$, $\mathit{cpuUtilization}$ has to be computed first. The procedure for computing $\mathit{cpuUtilization}$ is given below.
\[ \mathit{cpuUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{cpuServer} + N_L \mathit{Time}_{cpuServer}) - P_W\, \mathit{arrivalRate}\,(\mathit{Time}_{cpuServer} + N_L \mathit{Time}_{cpuServer})}{N_{cpuServers}} \tag{7.15} \]
In the above Eq., $P_W$ is the probability of lock contention. Then the probability that there are no requests waiting in the queue maintained for the CPUs, the probability that there are at least as many requests as there are CPUs, and $\mathit{cpuWaitingTime}_{queue}$ are calculated as per Eqs. (7.3), (7.4) and (7.5). The utilization of the I/O devices can be calculated as given below.
\[ \mathit{ioUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{ioServer} + N_L \mathit{Time}_{ioServer}) - P_W\, \mathit{arrivalRate}\,(\mathit{Time}_{ioServer} + N_L \mathit{Time}_{ioServer})}{N_{ioServers}} \tag{7.16} \]
Then, by using Eqs. (7.8), (7.9) and (7.10), $\mathit{ioWaitingTime}_{queue}$ can be calculated. Let us now describe the procedure required for computing $P_W$ and $R_W$. For this, we first have to compute the sum of lock holding times, which is explained below.
Calculation of sum of lock holding time (G): Let $G$ be the sum of the (mean) lock holding times for each data object over all $N_L$ data objects accessed by a transaction. Let $T_{Commit}$ be the commit time required for a transaction to reflect the updates into the database. Then $G$ is computed as given below.
\[ G = \left[ \sum_{i=1}^{N_L} \bigl( i\,a + (i-1)\,b \bigr) \right] + N_L T_{Commit} \tag{7.17} \]
Probability of lock contention: The probability of lock contention is calculated as given below.
\[ P_W = \frac{\mathit{arrivalRate} \times G}{L} \tag{7.18} \]
Calculation of mean lock waiting time: The mean lock waiting time $R_W$ can be computed as given below.
\[ R_W = \frac{(a + b)(N_L - 1) + N_L T_{Commit}}{3} \tag{7.19} \]
Calculation of b: We have used the average waiting time $b$ in Eq. (7.17) to compute $G$. This waiting time can be calculated as given below.
\[ b = P_W R_W \tag{7.20} \]
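Because $b$ depends on $P_W$ and $R_W$, which in turn depend on $G$ and hence on $b$, Eqs. (7.17) to (7.20) form a fixed-point problem. The following Python sketch iterates them until $b$ stabilises. As a simplification, it holds $a$ fixed instead of recomputing it from the hardware resource model in every round; the function name, the tolerance and the sample values are hypothetical, and the argument L stands for the symbol $L$ of Eq. (7.18).

```python
def two_pl_fixed_point(arrival_rate, n_l, a, t_commit, L,
                       tol=1e-9, max_iter=1000):
    """Iterate Eqs. (7.17)-(7.20) from b = 0 and return (P_W, R_W, b)."""
    b = 0.0
    for _ in range(max_iter):
        # Eq. (7.17): sum of mean lock holding times.
        g = sum(i * a + (i - 1) * b for i in range(1, n_l + 1)) + n_l * t_commit
        p_w = arrival_rate * g / L                            # Eq. (7.18)
        r_w = ((a + b) * (n_l - 1) + n_l * t_commit) / 3      # Eq. (7.19)
        new_b = p_w * r_w                                     # Eq. (7.20)
        if abs(new_b - b) < tol:
            return p_w, r_w, new_b
        b = new_b
    raise RuntimeError("fixed-point iteration did not converge")


def two_pl_response_time(r_inpl, n_l, a, p_w, r_w, t_commit):
    """Eq. (7.11) with R_E = N_L * a from Eq. (7.13)."""
    return r_inpl + n_l * a + n_l * p_w * r_w + t_commit


# Illustrative values only (times in ms).
p_w, r_w, b = two_pl_fixed_point(0.001, 10, 15.0, 10.0, 1000)
response = two_pl_response_time(5.0, 10, 15.0, p_w, r_w, 10.0)
```

For light loads such as the sample values, the iteration settles within a handful of rounds, in line with the convergence behaviour described for the procedure.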
Iterative procedure for computing $a$, $P_W$ and $R_W$: In the beginning, we assume a small fractional value for $P_W$ to compute $\mathit{cpuWaitingTime}_{queue}$ and $\mathit{ioWaitingTime}_{queue}$. Next, we calculate $a$. Then we calculate $G$, $P_W$ and $R_W$ by initializing $b$ to 0. After this, we calculate the next values of $b$ and $a$, and recompute $G$, $P_W$ and $R_W$ with the new $a$ and $b$. This process continues and typically converges in a few iterations. Once this process is completed, the values of $a$, $P_W$ and $R_W$ are known.
(ii) Optimistic protocols
In optimistic protocols, conflicting transactions can be aborted. Whenever a transaction is aborted, it is rerun in the system after $T_{Backoff}$ time, which is equivalent to the average commit time of a transaction ($T_{Commit}$). Let $P_A$ be the probability of transaction abort and $R'_E$ be the time required to rerun a transaction. Then the mean response time ($R$) can be calculated as follows.
\[ R = R_{INPL} + R_E + P_A\,(T_{Backoff} + R'_E) + T_{Commit} \tag{7.21} \]
$T_{Commit}$ can be computed by using Eq. (7.12). As per [47], $T_{Backoff}$ is equivalent to $T_{Commit}$. Also, $R_{INPL}$ is a constant, as discussed earlier. We have assumed that $R_E$ and $R'_E$ are equal, on the basis that rerunning a transaction takes as long as its first run. We also compute $R_E$ as discussed in the previous section. Computing cpuUtilization requires the following modification.

\[ \mathit{cpuUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{cpuServer} + N_L \mathit{Time}_{cpuServer}) + P_A\,(N_L \mathit{Time}_{cpuServer}\,\mathit{arrivalRate})}{N_{cpuServers}} \tag{7.22} \]
Then the probability that no requests are waiting in the queue maintained for the CPUs, the probability that there are as many or more requests than there are CPUs, and cpuWaitingTime are calculated based on Eqs. (7.3), (7.4) and (7.5). Next, ioUtilization can be calculated as follows.

\[ \mathit{ioUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{ioServer} + N_L \mathit{Time}_{ioServer}) + P_A\,(N_L \mathit{Time}_{ioServer}\,\mathit{arrivalRate})}{N_{ioServers}} \tag{7.23} \]
Similarly, ioWaitingTime can be calculated as per Eqs. (7.8), (7.9) and (7.10). Initially, we take a very small value for $P_A$ and calculate $a$ as per the procedure explained in the previous subsection. Let $T_H$ be the mean data object holding time. Also, we compute $T_{Commit}$ as per Eq. (7.12). Now, $P_A$ is computed as follows.

\[ P_A = \frac{\mathit{arrivalRate} \cdot N_L^{2}\,(T_H + T_{Commit})}{L} \tag{7.24} \]
The mean data object holding time can be calculated by using the following equation.

\[ T_H = \frac{(N_L + 1)\,a}{2} \tag{7.25} \]

After computing $P_A$, the procedures for computing $a$ and $P_A$ are repeated. This process continues and typically converges in a few iterations. Once this process is completed, the values of $a$ and $P_A$ are known, and with these values the mean response time $R$ can be computed by using Eq. (7.21).
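The iteration over Eqs. (7.24) and (7.25) can be sketched as a small fixed-point loop. This is only an illustrative sketch, not the implementation used in the thesis: the function names are hypothetical, and `access_time` is an assumed caller-supplied stand-in for the queueing computation of a (Eqs. (7.13) and (7.14)), which is not reproduced here.

```python
def mean_holding_time(n_l, a):
    """Mean data object holding time T_H, Eq. (7.25)."""
    return (n_l + 1) * a / 2


def abort_probability(arrival_rate, n_l, t_h, t_commit, db_size):
    """Probability of aborts P_A, Eq. (7.24)."""
    return arrival_rate * n_l ** 2 * (t_h + t_commit) / db_size


def solve_optimistic(arrival_rate, n_l, t_commit, db_size, access_time,
                     p_a0=1e-3, tol=1e-12, max_iter=100):
    """Repeat the computation of a and P_A until P_A stops changing.

    access_time(p_a) is a hypothetical callable that returns a for the
    current P_A; in the full model this step uses the CPU and I/O
    utilization and waiting-time equations.
    """
    p_a = p_a0  # start from a very small abort probability
    for _ in range(max_iter):
        a = access_time(p_a)
        t_h = mean_holding_time(n_l, a)
        new_p_a = abort_probability(arrival_rate, n_l, t_h, t_commit, db_size)
        if abs(new_p_a - p_a) < tol:
            return a, new_p_a
        p_a = new_p_a
    raise RuntimeError("P_A did not converge")
```

A call such as `solve_optimistic(10, 10, 0.002, 1000, lambda p: 0.001 * (1 + p))` uses purely illustrative numbers and a toy model of a; it shows the loop settling after a handful of iterations, as the text describes.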
7.2 Response Time Study of 2PL, FCWR and ASLR Protocols for ROT Processing Environment

In this section, we discuss the analytical methods developed for the 2PL, FCWR and ASLR protocols by considering the ROT processing environment.
7.2.1 2PL Protocol

In subsection 7.1.3, we discussed how to calculate the average response time of UTs processed by 2PL. We have analyzed the ROT processing environment and derived the modifications required to the equations of subsection 7.1.3 for 2PL. Let $P_{UT}$ be the percentage of UTs currently processed in the system, $P_{ROT}$ be the percentage of ROTs currently existing in the system, $R_{UT}$ be the average response time for UTs and $R_{ROT}$ be the average response time for ROTs. Then the average response time can be calculated as given below.

\[ R = P_{UT} R_{UT} + P_{ROT} R_{ROT} \tag{7.26} \]
We have derived the equations necessary for calculating the average response time of UTs and ROTs as given below.

\[ R_{UT} = R_{INPL} + N_{LU}\,a + N_{LU} P_W R_W + T_{Commit} \tag{7.27} \]

\[ R_{ROT} = R_{INPL} + N_{LR}\,a + N_{LR} P_W R_W \tag{7.28} \]
In the above equations, $N_{LU}$ denotes the number of data objects accessed by the UTs, $N_{LR}$ denotes the number of data objects accessed by the ROTs and $T_{Commit}$ denotes the commit time of a UT. The commit time required for UTs can be calculated as given below.

\[ T_{Commit} = c\,N_{LU} \tag{7.29} \]
We can compute $R_E$ and $a$ as per Eqs. (7.13) and (7.14). Similarly, cpuUtilization and ioUtilization can be calculated by modifying Eqs. (7.15) and (7.16) as given below.

\[ \mathit{cpuUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{cpuServer} + N_{LRU} \mathit{Time}_{cpuServer}) - P_W\,\mathit{arrivalRate}\,(\mathit{Time}_{cpuServer} + N_{LRU} \mathit{Time}_{cpuServer})}{N_{cpuServers}} \tag{7.30} \]

\[ \mathit{ioUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{ioServer} + N_{LRU} \mathit{Time}_{ioServer}) - P_W\,\mathit{arrivalRate}\,(\mathit{Time}_{ioServer} + N_{LRU} \mathit{Time}_{ioServer})}{N_{ioServers}} \tag{7.31} \]
In the above equations, $N_{cpuServers}$ and $N_{ioServers}$ are the number of CPUs and disks available in the system, respectively, and $N_{LRU}$ is the average number of data objects accessed by a transaction. $N_{LRU}$ can be computed as given below.

\[ N_{LRU} = N_{LR} P_{ROT} + N_{LU} P_{UT} \tag{7.32} \]
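As a quick worked instance of Eq. (7.32), take the workload values listed in Table 7.1 (rotTranSize = 20, utTranSize = 15, 70% ROTs and 30% UTs), with the percentages written as fractions:

```latex
\[
N_{LRU} = N_{LR} P_{ROT} + N_{LU} P_{UT}
        = 20 \times 0.7 + 15 \times 0.3
        = 14 + 4.5
        = 18.5
\]
```

So, under this mix, a transaction accesses 18.5 data objects on average.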
The cpuWaitingTime and ioWaitingTime are calculated as per Eqs. (7.2), (7.3), (7.4), (7.5), (7.7), (7.8), (7.9) and (7.10). The calculation of the sum of mean lock holding time is modified by considering both ROTs and UTs.

In the ROT processing environment, suppose a UT conflicts with an ROT. Then, as per the 2PL rule, the UT has to wait for the commitment of that ROT. Until then, that ROT will hold the locks. This lock holding time ($G_1$) is calculated as given below.

\[ G_1 = \sum_{i=1}^{N_{LR}} \bigl( i a + (i-1) b \bigr) \tag{7.33} \]

Suppose a UT conflicts with another UT. Then the lock requesting UT has to wait until the lock holding UT commits. This lock holding time ($G_2$) is calculated as given below.

\[ G_2 = \left[ \sum_{i=1}^{N_{LU}} \bigl( i a + (i-1) b \bigr) \right] + N_{LU} T_{Commit} \tag{7.34} \]

Suppose an ROT conflicts with a UT. The conflicting UT holds the lock until its commitment. This lock holding time ($G_3$) is calculated as given below.

\[ G_3 = \left[ \sum_{i=1}^{N_{LU}} \bigl( i a + (i-1) b \bigr) \right] + N_{LU} T_{Commit} \tag{7.35} \]
Next, we can calculate the mean lock holding time $G$ averaged over all data objects accessed by a transaction as given below.

\[ G = P_{ROT} G_3 + P_{UT} (P_{ROT} G_1 + P_{UT} G_2) \tag{7.36} \]
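The sums that appear in Eqs. (7.33), (7.34) and (7.35) have a simple closed form, which may be convenient when evaluating $G$ numerically. The derivation below is added only as an aid; it uses the standard identity $\sum_{i=1}^{N} i = N(N+1)/2$, where $N$ stands for either $N_{LR}$ or $N_{LU}$.

```latex
\[
\sum_{i=1}^{N} \bigl( i a + (i-1) b \bigr)
  = a \sum_{i=1}^{N} i + b \sum_{i=1}^{N} (i-1)
  = a\,\frac{N(N+1)}{2} + b\,\frac{N(N-1)}{2}
\]
```

For example, with $N = N_{LU} = 15$ (the UT size in Table 7.1) the sum equals $120a + 105b$.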
Then the probability of lock contention can be calculated as per Eq. (7.18). Let us now discuss the procedure required to compute the mean lock waiting time in the ROT processing environment. Suppose a UT conflicts with an ROT or a UT. Then the UT has to wait until the conflicting UT or ROT commits. The mean lock waiting times due to this can be computed as given below.

\[ R_{W1} = \frac{(a + b)(N_{LU} - 1) + N_{LU} T_{Commit}}{3} \tag{7.37} \]

\[ R_{W2} = \frac{(a + b)(N_{LR} - 1)}{3} \tag{7.38} \]

Suppose an ROT conflicts with a UT. Then the ROT has to wait until the conflicting UT commits.

\[ R_{W3} = \frac{(a + b)\,N_{LU} + N_{LU} T_{Commit}}{3} \tag{7.39} \]

Next, we calculate the mean lock waiting time by considering both the ROTs and UTs as given below.

\[ R_W = P_{ROT} R_{W3} + P_{UT} (P_{ROT} R_{W1} + P_{UT} R_{W2}) \tag{7.40} \]

Then, by following the iterative procedure discussed for 2PL in subsection 7.1.3, it is possible to compute $a$, $P_W$ and $R_W$.
7.2.2 ASLR Protocol

In this subsection, we discuss the analytical methodology required for calculating the average response time for the ASLR protocol. In the ASLR protocol, ROTs are executed without blocking by following the asynchronous speculation policy, whereas UTs are processed as per the 2PL rule. So, we require a modified procedure for computing the sum of mean lock holding time and the average lock waiting time.

Calculation of sum of lock holding time (G): In ASLR, only UTs wait for locks. Suppose a UT conflicts with an ROT. Then the conflicting ROT will hold the locks until its commitment. This lock holding time ($G_1$) is calculated as given below.

\[ G_1 = \sum_{i=1}^{N_{LR}} i a \tag{7.41} \]

Suppose a UT conflicts with another UT. Then the lock requesting UT has to wait until the lock holding UT commits. This lock holding time ($G_2$) is calculated as given below.

\[ G_2 = \left[ \sum_{i=1}^{N_{LU}} \bigl( i a + (i-1) b \bigr) \right] + N_{LU} T_{Commit} \tag{7.42} \]

Next, the sum of mean lock holding time averaged over all data objects accessed by a transaction can be derived as follows.

\[ G = P_{UT} (P_{ROT} G_1 + P_{UT} G_2) \tag{7.43} \]

Probability of lock contention: The probability of lock contention, considering both the ROTs and UTs, can be calculated as given below.

\[ P_W = \frac{\mathit{arrivalRate} \cdot G}{L} \tag{7.44} \]
Calculation of mean lock waiting time: Suppose a UT conflicts with another UT. Then the UT has to wait until the conflicting UT commits. The mean lock waiting time due to this can be computed as given below.

\[ R_{W1} = \frac{(a + b)(N_{LU} - 1) + N_{LU} T_{Commit}}{3} \tag{7.45} \]

Suppose a UT conflicts with an ROT. Then the UT has to wait for that ROT to commit. The lock waiting time due to this can be calculated as follows.

\[ R_{W2} = \frac{a\,N_{LR}}{3} \tag{7.46} \]

Next, we calculate the mean lock waiting time by considering both the ROTs and UTs as given below.

\[ R_W = P_{UT} (P_{UT} R_{W1} + P_{ROT} R_{W2}) \tag{7.47} \]

The iterative procedure discussed for 2PL is also followed in ASLR for computing $a$, $b$, $P_W$ and $R_W$. Next, we can calculate the average response time of UTs and ROTs as given below.

\[ R_{UT} = R_{INPL} + N_{LU}\,a + N_{LU} P_W R_W + T_{Commit} \tag{7.48} \]

\[ R_{ROT} = R_{INPL} + N_{LR}\,a \tag{7.49} \]
Finally the average response time can be calculated by using the Eq. (7.26).
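The ASLR equations can be put together in a short numerical sketch. It is an illustration under stated assumptions rather than the model code of this thesis: the function and argument names are hypothetical, $P_{ROT}$ and $P_{UT}$ are passed as fractions, and a is treated as a fixed input, whereas the full procedure recomputes a from the CPU and I/O waiting times in every iteration.

```python
def holding_sum(n, a, b):
    """sum_{i=1}^{n} (i*a + (i-1)*b), the sum used in Eqs. (7.41) and (7.42)."""
    return sum(i * a + (i - 1) * b for i in range(1, n + 1))


def aslr_model(arrival_rate, db_size, n_lr, n_lu, p_rot, p_ut,
               a, c, r_inpl, tol=1e-12, max_iter=1000):
    """Iterate on b = P_W * R_W, then evaluate the ASLR response times.

    Assumption: a (mean object access time) is held constant here.
    """
    t_commit = c * n_lu                                   # Eq. (7.29)
    b = 0.0                                               # initial lock wait
    for _ in range(max_iter):
        g1 = holding_sum(n_lr, a, 0.0)                    # Eq. (7.41)
        g2 = holding_sum(n_lu, a, b) + n_lu * t_commit    # Eq. (7.42)
        g = p_ut * (p_rot * g1 + p_ut * g2)               # Eq. (7.43)
        p_w = arrival_rate * g / db_size                  # Eq. (7.44)
        r_w1 = ((a + b) * (n_lu - 1) + n_lu * t_commit) / 3   # Eq. (7.45)
        r_w2 = a * n_lr / 3                               # Eq. (7.46)
        r_w = p_ut * (p_ut * r_w1 + p_rot * r_w2)         # Eq. (7.47)
        new_b = p_w * r_w                                 # Eq. (7.20)
        converged = abs(new_b - b) < tol
        b = new_b
        if converged:
            break
    else:
        raise RuntimeError("fixed-point iteration did not converge")
    r_ut = r_inpl + n_lu * a + n_lu * p_w * r_w + t_commit    # Eq. (7.48)
    r_rot = r_inpl + n_lr * a                             # Eq. (7.49)
    r = p_ut * r_ut + p_rot * r_rot                       # Eq. (7.26)
    return {"p_w": p_w, "r_w": r_w, "b": b,
            "r_ut": r_ut, "r_rot": r_rot, "r": r}
```

For instance, `aslr_model(10, 1000, 20, 15, 0.7, 0.3, a=0.015, c=0.010, r_inpl=0.0)` uses the transaction sizes, database size and transaction mix of Table 7.1; the values of a, c and r_inpl in this call are hypothetical placeholders.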
7.2.3 FCWR Protocol

In this subsection, we discuss the procedure for calculating the average response time for the FCWR protocol. FCWR is an optimistic protocol which processes ROTs without blocking and UTs by following the first committer wins rule. As per this rule, if two UTs are in conflict, one of the UTs is committed and the other UT is aborted. In this thesis, we have assumed that aborted transactions are resubmitted after a time period equivalent to the average commit time of a UT. Also, we have assumed that aborted transactions are resubmitted only once. In FCWR, we follow the same procedures discussed in section 7.1.3 for computing the average response time, with some modifications. The cpuUtilization, ioUtilization, probability of aborts and mean data object holding time are calculated by modifying Eqs. (7.22), (7.23), (7.24) and (7.25) as given below.

\[ \mathit{cpuUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{cpuServer} + N_{LRU} \mathit{Time}_{cpuServer}) + P_A\,(N_{LU} \mathit{Time}_{cpuServer}\,\mathit{nuts})}{N_{cpuServers}} \tag{7.50} \]

\[ \mathit{ioUtilization} = \frac{\mathit{arrivalRate}\,(\mathit{Time}_{ioServer} + N_{LRU} \mathit{Time}_{ioServer}) + P_A\,(N_{LU} \mathit{Time}_{ioServer}\,\mathit{nuts})}{N_{ioServers}} \tag{7.51} \]

\[ P_A = \frac{\mathit{nuts} \cdot N_{LU}^{2}\,(T_H + T_{Commit})}{L} \tag{7.52} \]

\[ T_H = \frac{(N_{LU} + 1)\,a}{2} \tag{7.53} \]
In the above equations, $\mathit{nuts}$ denotes the total number of UTs currently present in the system.

Calculation of average response time (R): First, we calculate the average response time of the UTs as given below.

\[ R_{UT} = R_{INPL} + N_{LU}\,a + T_{Commit} + P_A (T_{Backoff} + N_{LU}\,a) \tag{7.54} \]

Here we consider that $T_{Backoff}$ is equivalent to $T_{Commit}$, as per [47]. Next, we can calculate the average response time of the ROTs as per Eq. (7.49). Finally, the average response time can be calculated as per Eq. (7.26).
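To make the composition of the FCWR response time concrete, the sketch below evaluates Eqs. (7.52), (7.53), (7.54), (7.49) and (7.26) once for given inputs. The function name and the sample values are hypothetical, and a is again assumed to be supplied by the caller instead of being obtained from the full iterative procedure.

```python
def fcwr_response_times(n_lr, n_lu, p_rot, p_ut, a, t_commit,
                        r_inpl, nuts, db_size):
    """One-pass evaluation of the FCWR response time equations."""
    t_h = (n_lu + 1) * a / 2                               # Eq. (7.53)
    p_a = nuts * n_lu ** 2 * (t_h + t_commit) / db_size    # Eq. (7.52)
    t_backoff = t_commit                                   # as per [47]
    r_ut = (r_inpl + n_lu * a + t_commit
            + p_a * (t_backoff + n_lu * a))                # Eq. (7.54)
    r_rot = r_inpl + n_lr * a                              # Eq. (7.49)
    r = p_ut * r_ut + p_rot * r_rot                        # Eq. (7.26)
    return p_a, r_ut, r_rot, r
```

With the transaction sizes and mix of Table 7.1 and illustrative values a = 0.015, t_commit = 0.15, r_inpl = 0 and nuts = 3, the abort term adds P_A (T_Backoff + N_LU a) to the UT response time, while the ROT response time depends only on N_LR and a.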
7.2.4 Discussion

In our analytical and simulation studies of ASLR, we have not considered the overhead involved in thread creation and management. Modern operating systems support efficient mechanisms such as thread pools [56] for the creation of threads. So, we believe that adding this overhead would not change the results significantly.
7.3 Performance Results

In this section, we compare the results of the analytical methods with those of the simulation experiments. First, we discuss the simulation model. Next, we discuss the simulation parameters. Subsequently, we discuss the performance metrics and protocols considered for performance evaluation. Finally, we compare the results of the analytical study and the simulation experiments conducted for the 2PL, FCWR and ASLR protocols.
7.3.1 Simulation Model

In this section, we discuss the simulation model, based on [52], which is used for evaluating the performance of the concurrency control protocols. The simulation model is described by using logical and physical queueing models. We have considered the open queue model for evaluating the performance.
Figure 7.1: Logical queuing model: Open queue

Central to our simulation model for studying concurrency control algorithm performance is the open queueing model of a centralized database system shown in Figure 7.1. Transactions originate from the terminals and follow a Poisson arrival process. When a new transaction originates, it enters the ready queue. The transaction then enters the cc queue (concurrency control queue) and makes the first of its concurrency control requests. The concurrency control (cc) manager is the core of the database system and manages access to data objects. It serves the data object access requests in the cc queue on an FCFS basis. If the cc manager grants the data object access, the transaction proceeds to the object queue and accesses the data object. If the request is not granted, the transaction enters the blocked queue until it is once again able to proceed. If a request leads to a decision to restart the transaction, it goes to the back of the cc queue, possibly after some period, through the ready queue. It then begins requesting data object access again from the first object. Eventually the execution of a transaction may complete and the cc manager may choose to commit that transaction. If the transaction is an ROT, it is finished. If the transaction is a UT, it writes its updates into the database by going through the update queue. After the completion of the commit operation, the transaction leaves the system.

The model described above is a generic model which can be extended to locking or optimistic protocols. For a locking protocol, each data object access request includes a lock request for the data object. A transaction is allowed to access the data object only if the lock is granted. Here, the blocked queue is used for storing waiting lock requests. For optimistic protocols, the data object access requests are kept in the object queue and are served one by one on an FCFS basis. After the last data object access is completed, a transaction returns to the cc queue and a validation test is performed. If the test is successful, the transaction enters the commit phase; otherwise the transaction is aborted and restarted, possibly after some period.

Figure 7.2: Physical queuing model related to Figure 7.1 (terminals, ready queue, CPU servers and disk servers)

The hardware resources required for executing transactions are mainly CPU, memory and I/O devices. We have assumed that enough memory is available for storing all of the transactions that originate from the terminals. So, transactions only have to wait for CPU and I/O devices during their execution. The physical queuing model, which includes CPU and I/O devices, is depicted in Figure 7.2. Here, the CPU servers and I/O servers may be thought of as pools of servers, and requests are serviced in an FCFS manner. Whenever a transaction requires CPU service, it is assigned to a free CPU server; otherwise the transaction waits until one becomes free. A queue is likewise maintained for the I/O servers. When a transaction needs I/O service, it is assigned to a free I/O server; otherwise the transaction waits until one becomes free.
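The behaviour of such a server pool can be illustrated with a few lines of code. The sketch below is not the simulator used for the experiments; `fcfs_pool_waits` is a hypothetical helper that assumes requests are listed in arrival order and that all servers in the pool are identical.

```python
import heapq


def fcfs_pool_waits(arrivals, service_times, n_servers):
    """Waiting time of each request served FCFS by a pool of servers.

    arrivals must be sorted in ascending order; service_times[i] is the
    service demand of the request that arrives at arrivals[i].
    """
    free_at = [0.0] * n_servers      # time at which each server becomes free
    heapq.heapify(free_at)
    waits = []
    for arrival, service in zip(arrivals, service_times):
        earliest = heapq.heappop(free_at)   # server that frees up first
        start = max(arrival, earliest)      # wait only if no server is free
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service)
    return waits
```

For example, three requests arriving together at time 0 with a service demand of 5 each, on a pool of two servers, give waits of 0, 0 and 5: the third request waits until the first server becomes free. Poisson arrivals could be generated for such an experiment by accumulating inter-arrival gaps drawn with `random.expovariate(rate)` from the Python standard library.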
7.3.2 Simulation Parameters

The simulation parameters used here are similar to the parameters discussed in Chapter 4. In addition, a new simulation parameter, “arrivalRate”, which denotes the arrival rate of transactions in the system, is included. The parameters used by both the simulation and the analytical model are described in Table 7.1 along with their values. The values of “rotTranSize” and “utTranSize” are fixed at 20 and 15 objects, respectively [55]. Also, we have considered Poisson transaction arrivals, and the transaction arrival rate (“arrivalRate”) varies between 10 and 90 transactions per second.
7.3.3 Performance Metrics

We have employed “average response time”, “probability of lock contention” and “probability of aborts” as the performance metrics.

• Average response time: Let ‘t1’ denote the time instant at which a transaction T is submitted to the system. Let ‘t2’ denote the time instant at which T is committed in the system. Then ‘t2’ - ‘t1’ is the response time of T. Let ‘s’ be the sum of the response times of all transactions committed in the system and ‘n’ be the total number of transactions (both UTs and ROTs) committed in the system. Then the average response time is equal to s/n.

• Probability of lock contention: Let ‘d’ be the total number of lock conflicts that happened in the system and ‘e’ be the total number of lock requests generated in the system. Then d/e is calculated as the probability of lock contention.

• Probability of aborts: Let ‘a’ be the total number of aborts that happened for UTs and ‘u’ be the total number of UTs generated in the system. Then a/u is calculated as the probability of aborts for UTs.
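The three metrics reduce to simple ratios over counters collected during a run. The following sketch shows one possible way to compute them; the function names and the shape of the input records are assumptions made for illustration.

```python
def average_response_time(committed):
    """s/n, where committed holds (t1, t2) pairs of committed transactions."""
    response_times = [t2 - t1 for t1, t2 in committed]
    return sum(response_times) / len(response_times)


def probability_of_lock_contention(lock_conflicts, lock_requests):
    """d/e: lock conflicts divided by lock requests generated."""
    return lock_conflicts / lock_requests


def probability_of_aborts(ut_aborts, uts_generated):
    """a/u: UT aborts divided by UTs generated."""
    return ut_aborts / uts_generated
```

For example, two committed transactions with (t1, t2) pairs (0.0, 2.0) and (1.0, 2.0) have response times 2.0 and 1.0, so the average response time is 1.5.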
7.3.4 Protocols

We have compared the performance of the ASLR protocol with that of the 2PL and FCWR protocols. In the FCWR protocol, we have assumed that aborted transactions are resubmitted after a time equivalent to the commit time of a UT. In the experiments, the graphs show the mean results of 20 experiments; each experiment was carried out for 10,000 transactions. The 95 percent confidence intervals of these means were computed but are omitted from the graphs.
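The text does not state how the confidence intervals were obtained. One common choice, shown here purely as an assumption, is a Student-t interval over the 20 per-experiment means; the constant 2.093 is the two-sided 95% critical value for 19 degrees of freedom, and the function name is hypothetical.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 19 degrees of freedom (20 runs).
T_CRIT_95_DF19 = 2.093


def mean_and_ci95(run_results):
    """Return (mean, half-width of the 95% confidence interval) for 20 runs."""
    if len(run_results) != 20:
        raise ValueError("the critical value above is valid for 20 runs only")
    mean = statistics.mean(run_results)
    std_err = statistics.stdev(run_results) / math.sqrt(len(run_results))
    return mean, T_CRIT_95_DF19 * std_err
```

The interval is then mean ± half-width. A different number of runs would need the critical value for the matching degrees of freedom.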
7.3.5 Results

In this subsection, we compare the results obtained through the analytical study and the simulation experiments. Figure 7.3 shows the average response time of transactions, considering both UTs and ROTs, as the transaction arrival rate increases. We found similar trends in the results of both the analytical methods and the simulation experiments. As expected, the performance of 2PL is poor due to high data contention; both the analytical and simulation results exhibit this. We can observe that as the transaction arrival rate increases beyond 20, the response time of 2PL shoots up due to high data contention. In FCWR, the conflicting UTs are aborted and resubmitted. As the transaction arrival rate increases, more UTs arrive in the system, and so the probability of aborts increases for UTs. As more UTs are aborted, the performance of FCWR degrades. It can be observed in Figure 7.3 that the ASLR protocol performs better than both the 2PL and FCWR protocols. Even under high data contention, the average response time of the ASLR protocol is much lower than that of the 2PL and FCWR protocols. This is because the ASLR protocol processes ROTs without waiting by following the asynchronous speculation policy.

Figure 7.4 shows the probability of aborts of UTs for the FCWR protocol. The analytical and simulation results are close. We can observe that the abort ratio increases as the transaction arrival rate increases. Due to this high abort ratio, FCWR performs poorly when compared with the ASLR protocol.

Figure 7.5 shows the probability of lock contention for the 2PL and ASLR protocols. We can observe that the simulation and analytical results are fairly close. For 2PL, the lock contention increases rapidly as the transaction arrival rate increases. The ASLR protocol shows less lock contention in the figure. As the ASLR protocol follows the asynchronous speculation policy, it does not block ROTs. For this reason, the probability of lock contention is low for the ASLR protocol. So, our conclusion is that in a centralized database environment, the ASLR protocol can perform better than both the 2PL and FCWR protocols.
Figure 7.3: Transaction arrival rate versus Average response time
Figure 7.4: Transaction arrival rate versus Prob. of aborts (x-axis: transactions per second; y-axis: probability of aborts; curves: FCWR-A, FCWR-S)
Table 7.1: Parameters, Meaning and Values

Parameter | Meaning | Value
dbSize | Number of objects in the database | 1000
cpuTime | Time to carry out CPU request | 5 ms
ioTime | Time to carry out I/O request | 10 ms
rotTranSize | Size of ROT | 20 objects
utTranSize | Size of UT | 15 objects
noResUnits | Number of RUs | 8 (1 CPU, 2 I/O)
arrivalRate | Transaction arrival rate | 10-90
% of ROTs | Percentage of ROTs | 70%
% of UTs | Percentage of UTs | 30%
Table 7.2: Notations and Meanings

Notation | Meaning
$R_W$ | The mean lock waiting time
$P_W$ | The probability of lock contention
$N_L$ | The number of data objects accessed by the transactions
$R_{INPL}$ | The time taken for loading and initiating the transaction
$R_E$ | The time taken for executing a transaction
$R$ | Transaction response time
$P_A$ | The probability of transaction aborts
$R'_E$ | The time taken for rerunning the transaction
$T_{Backoff}$ | The average time to start rerunning a transaction after its abort
$T_{Commit}$ | The average commit time of a transaction to reflect the updates into the database
$\mathit{arrivalRate}$ | Rate at which transactions arrive in the system
$\mathit{Time}_{cpuServer}$ | The average CPU time required for serving a request given by a transaction for a data object
$N_{cpuServers}$ | Number of CPUs present in the system
$\mathit{Time}_{ioServer}$ | The average I/O time required for serving a request given by a transaction for a data object
$N_{ioServers}$ | Number of disks present in the system
$L$ | Number of data objects available in the database
$N_{LR}$ | The number of data objects accessed by ROTs
$N_{LU}$ | The number of data objects accessed by UTs
$P_{ROT}$ | Percentage of ROTs present in the system
$P_{UT}$ | Percentage of UTs present in the system
$N_{LRU}$ | Average number of data objects accessed by a transaction
$a$ | The mean time spent for reading a data object and performing computations
$b$ | The mean wait time for a lock
$c$ | The time required to commit the updates of a data object into the database
$n$ | The total number of transactions currently present in the system
$\mathit{nuts}$ | The total number of UTs currently present in the system
$G$ | Sum of lock holding time
Figure 7.5: Transaction arrival rate versus Prob. of lock contention (x-axis: transactions per second; y-axis: probability of lock contention; curves: 2PL-A, 2PL-S, ASLR-A, ASLR-S)
7.4 Chapter Summary

In this chapter, we evaluated the performance of the ASLR, 2PL and SI-based FCWR protocols using simulation and analytical methods. The results of the simulation and analytical studies indicate that the proposed ASLR protocol processes ROTs with better performance than the 2PL and SI-based FCWR protocols.
Chapter 8

Conclusion

To improve the performance of ROTs, the snapshot isolation (SI) level has been proposed in [18]. Commercial DBMSs like Oracle and Microsoft SQL Server follow SI to process ROTs. In SI-based protocols, ROTs are executed without blocking. However, the ROTs are processed with both correctness and data currency problems. In this thesis, we have proposed improved protocols for processing ROTs by extending speculation. In the proposed protocols, ROTs are processed with speculation and UTs are processed with 2PL. As a result, significant benefits are realized in the performance of ROTs.

Based on the mode of carrying out speculative executions, we have proposed two protocols: the synchronous speculative locking protocol for ROTs (SSLR) and the asynchronous speculative locking protocol for ROTs (ASLR). We have proposed the protocols and discussed their correctness proofs. Also, we have evaluated the performance of the proposed protocols by conducting simulation experiments. The simulation results indicate that the proposed ASLR and SSLR protocols improve the performance of ROTs significantly over the 2PL, FCWR and SL protocols. The proposed protocols improve the performance by trading extra processing resources. The results indicate that the protocols require only 0.2 times additional resources for a significant performance improvement. Also, we have evaluated the performance of the 2PL, FCWR and ASLR protocols using the analytical method by considering the open queue environment. The results indicate that the proposed ASLR protocol performs significantly better than both the 2PL and FCWR protocols.

In the proposed SSLR and ASLR protocols, conflicting UTs are made to wait for ROTs, as UTs follow 2PL. We have made an effort to reduce the blocking of UTs by exploiting the semantics of ROTs. We have proposed a new notion called “compensatability” and, based on this notion, classified the ROTs into “compensatable ROTs (CROTs)” and “non-compensatable ROTs (NCROTs)”.
The CROTs are executed without speculation and blocking. However, at commit time, a CROT has to perform compensating computations to include the effects of concurrent UTs which have committed before this CROT. We have proposed improved SSLR and ASLR protocols by incorporating the notion of compensatability. Overall, we conclude the following:

• The simulation and analytical results indicate that the proposed speculation-based protocols have the potential to improve the performance of ROTs over the 2PL, FCWR, SI-based and SL protocols.

• The proposed protocols process the transactions without any correctness issues. It has been shown that the proposed protocols process transactions by following the serializability criteria.

• The proposed protocols process the transactions without any data currency issues. An ROT does not miss the updates made by the preceding committed transactions.
• The proposed protocols carry out multiple speculative executions for ROTs. So, the performance is improved by trading extra processing resources. The simulation results show that, on average, a transaction carries out 1.5 parallel executions. So, by adding 0.2 times additional resources, we can have a high performance protocol to process ROTs without any correctness and data currency issues.

The proposed approaches provide the scope to improve the performance by trading extra processing resources. For implementing the proposed protocols in a real DBMS, further investigation is required. The idea of speculation can be extended to develop protocols to improve the transaction processing performance in data warehousing environments and distributed database systems. The list of future works is as follows.

(i) Implementation: The prototype RDBMS PostgreSQL implements a type of SI-based protocol for transaction processing [6]. As a part of future work, we would like to implement the proposed SL-based protocols in PostgreSQL or another open source DBMS to analyze the performance improvement with TPC-C and TPC-W benchmark workloads.

(ii) Index locking: In a DBMS, one or more indices may be maintained for each table present in the database. Access to a table is made only through one of the indices maintained for that table. The DBMS may use an index locking protocol to provide higher concurrency while preventing the phantom phenomenon. In future, we wish to investigate how a speculation-based locking protocol can be implemented for such indices.

(iii) Semantics of ROTs in TPC-C and TPC-W benchmark workloads: The TPC-C and TPC-W benchmark programs use many ROTs in their specifications. We would like to analyze whether the ROTs specified in these benchmark programs satisfy the notion of “compensatability” or not.
We believe that the results of this investigation will show that there are ample opportunities for improving the performance by exploiting the notion of “compensatability”.

(iv) Long running transactions of data warehouse environment: In a data warehouse environment, long running ROTs perform statistical computations by reading a large number of data objects. Also, many long running UTs present in such an environment may update the data warehouse concurrently. How to reflect these concurrent updates to the long running ROTs is a research issue. As a part of future work, we would like to investigate improved approaches by applying the nested transactions concept.

(v) Distributed transactions under SOA/Cloud environment: Under the centralized environment, the simulation results indicated that ASLR improves the performance over SSLR. However, in a distributed environment the situation might be different due to the transmission delay factor. Also, all the data may not be available at the same site. So, in the distributed environment, both SSLR and ASLR provide scope to build a hybrid ROT processing protocol. The details will be investigated as a part of future work. Modern web-based systems are built by following the service oriented architecture (SOA) framework. Such systems have to handle a large number of distributed transactions in order to provide various services to the users. There are approaches discussed in which SOA transactions are handled by the application programs on their own. Many “cloud” vendors in the market provide both hardware and software as scalable services to organizations. In future, software developed using the SOA framework will run on such cloud infrastructure. We would like to investigate the issues in processing distributed transactions generated in SOA-based systems running in a cloud environment.

Overall, it can be observed that modern multi-core CPU-based systems possess abundant computing power and huge main memories. Researchers are working to develop software systems which can exploit this high processing power to improve the performance. Software engineers are trying to convert existing single threaded applications into multi threaded applications in order to improve the performance by exploiting the high processing power of multi-core CPU-based systems. The proposed speculation-based protocols have the potential to exploit the parallel processing power of modern multi-core CPU-based systems to improve the query performance in modern database driven systems, in terms of throughput, correctness and data currency.
List of Publications

1. T. Ragunathan and P. Krishna Reddy, Performance Evaluation of Speculation-based Protocols for Read-only Transactions through Analytical Modeling (to be submitted to Performance Evaluation journal, Elsevier publications).
2. Mohit Goyal, T. Ragunathan and P. Krishna Reddy, Extending Speculation-Based Protocols for Read-only Transactions to Distributed Database Systems, HPCC 2010, conference proceedings, pp. 527-532.
3. T. Ragunathan and P. Krishna Reddy, Semantics-Based Speculative Locking Protocols for Improving the Performance of Read-only Transactions (to be submitted to IEEE TKDE journal).
4. T. Ragunathan, P. Krishna Reddy and Mohit Goyal, Semantics-Based Asynchronous Speculative Locking Protocol for Improving the Performance of Read-only Transactions, SpringSim 2010 conference, April 12-15, Orlando, Florida.
5. T. Ragunathan and P. Krishna Reddy, Performance Evaluation of Speculation-Based Protocol for Read-only Transactions, ACM COMPUTE 2010 conference, January 22 & 23, Bangalore, India.
6. T. Ragunathan and P. Krishna Reddy, Speculation-Based Protocols for Improving the Performance of Read-only Transactions, COMAD 2009 Ph.D. Workshop, December 9th, 2009, at ISIM, Mysore.
7. T. Ragunathan and P. Krishna Reddy, Exploiting Semantics and Speculation for Improving the Performance of Read-only Transactions, Proceedings of COMAD 2008, pp. 162-173, 14th International Conference on Management of Data, December 17-19, 2008, IIT Bombay.
8. T. Ragunathan and P. Krishna Reddy, Speculation-Based Protocols for Improving the Performance of Read-only Transactions (to appear in IJCSE journal, Inderscience Publications).
9. T. Ragunathan, Extending Speculation for Improving the Performance of Read-only Transactions, 11th International Conference on Extending Database Technology (EDBT 2008), Ph.D. Workshop, March 2008, Nantes, France.
10. T. Ragunathan and P. Krishna Reddy, Improving the Performance of Read-only Transactions through Asynchronous Speculation, Proceedings of SpringSim 2008, pp. 467-474, April 2008, Ottawa, Canada.
11. T. Ragunathan and P. Krishna Reddy, Improving the Performance of Read-only Transactions through Speculation, 5th International Workshop on Databases in Networked Information Systems (DNIS 2007), Lecture Notes in Computer Science vol. 4777, pp. 203-221 (October 2007).
12. T. Ragunathan and P. Krishna Reddy, Speculation-Based Protocols for Improving the Performance of Read-only Transactions, Poster Presentation, I-CARE, IBM-IRL Collaborative Academia Research Exchange Programme, IBM Research India, Delhi, October 26, 2009.
13. T. Ragunathan and P. Krishna Reddy, Improving the Performance of Read-only Transactions through Speculation, Poster Presentation, Microsoft Techvista’07, Bangalore, January 23, 2007.
14. T. Ragunathan and P. Krishna Reddy, Performance Enhancement of Read-only Transactions Using Speculative Locking Protocol, Sixth Annual Inter Research Institute Student Seminar in Computer Science (IRISS 2007), IIIT, Hyderabad, January 4-5, 2007.
Appendix A

1. Correctness Criteria for Transaction Processing

In this section, we briefly explain the correctness criteria for transaction processing. The operations performed by the concurrent execution of a set of transactions may be interleaved. Such an execution is modeled by a structure called a history. A history indicates the order in which the operations of the transactions were executed relative to each other. A history can be defined as a partial order because some of the operations of the concurrent transactions may be executed in parallel. If a transaction $T_i$ specifies the order of two of its operations, these two operations must appear in that order in any history that includes $T_i$. Also, we require that a history specify the order of all conflicting operations that appear in it.

Two operations are said to conflict if they both operate on the same data object and at least one of them is a Write. Thus, r[x] conflicts with w[x], while w[x] conflicts with both r[x] and w[x]. If two operations conflict, their order of execution matters. The value of x returned by r[x] depends on whether that operation precedes or follows a particular w[x]. Also, the final value of x depends on which of two w[x] operations is processed last.

Definition 4 History of a set of transactions. Let $T = \{T_1, T_2, \ldots, T_n\}$ be a set of transactions. A complete history H over T is a partial order with ordering relation∪