Design and Evaluation of Protocols for Maintaining Data Consistency in Intermittently Connected Databases

Chang-Ming Tsai
Dept. of Information Management
Chang Jung University
Tainan County, Taiwan 711, ROC
[email protected]

LihChyun Shu
College of Management
National Cheng Kung University
Tainan, Taiwan 701, ROC
[email protected]

January 27, 2005

Abstract

We consider a distributed information system that consists of multiple mobile sites, each holding a local copy of the data needed at that site. Synchronization technologies exchange information among the sites via a central server, keeping each site correct and current. In this paper, we propose two concurrency control protocols for such a system. While both protocols use an optimistic replication strategy and guarantee serializable executions of transactions, their characteristics differ in other aspects. Performance evaluation via simulation experiments reveals the conditions under which each protocol performs reasonably well.
1 Introduction
Intermittently connected databases (ICDB) are gaining popularity due to the growing trend towards mobile computing [MDN+ 98]. In such a system, multiple clients with replicated local data work asynchronously while disconnected from a server that acts as a data repository for all the clients. Each client connects only when it needs to download data from the server or commit local changes on the server. Because disconnected clients may modify local data, maintaining consistency of data copies replicated at multiple sites is an important issue. One approach to this problem is to model concurrent client activities as transactions. The following example illustrates the concurrent update problem in such an environment: Example 1.
Assume an ICDB environment with a server and two mobile clients A and B. Two data items x and y are maintained at the server. In the beginning, the server initializes both x and y with the value 1. Client A connects first, replicates x from the server, and disconnects. Client B then connects, also replicates x, and disconnects. While disconnected, both clients increment their local replicas of x. Upon reconnection, client A updates x at the server with the value 2, i.e., the value of its local copy of x, and disconnects. Client B then reconnects, also updates x at the server with the value 2, and disconnects. Clearly, the lost update problem has occurred in Example 1: the update done on x by client A was ignored by client B. In this paper we focus on the problem of maintaining consistency of replicated
data in an ICDB environment where mobile users can read/write their replicated data while disconnected. We model concurrent client activities as transactions. For Example 1 above, the activity of client A will be modelled as a transaction which consists of a read operation on x, denoting its downloading of x, followed by a write operation on x, denoting its writing back of x at reconnection. By the same token, the activity of client B is modelled as another transaction. We observe that for the ICDB model we are considering (every data item in the database has a primary copy that is stored on the server, but data can be replicated and updated at every mobile client; committing local updates must be done via handshaking with the server), one can adopt the typical correctness criterion used for centralized database systems, i.e., serializability, rather than the more complicated criterion for replicated distributed databases, i.e., one-copy serializability. The main reason is that the master copy of every data item is kept on the server: local updates done by each disconnected client are tentative, and they will need to be reconciled with the master copies kept by the server when the client connects and tries to commit its updates [GHOS96]. We say a client starts a committing session when it is ready to write back its local updates. We make an important observation: although the activity of every committing client performed between two consecutive committing sessions may consist of multiple tentative local transactions, it can in fact be modelled as a single transaction. The above two observations greatly simplify our design of concurrency control protocols and their correctness reasoning.
Based on them, we propose two concurrency control protocols: (1) a match-and-go protocol that aborts a local transaction sent from a connecting mobile client if data in the transaction's read set has been changed with different values by other mobile clients; (2) a graph-based scheduler that maintains and tests the serialization graph of the history that represents the execution the scheduler controls. The two protocols have different characteristics in terms of the types of histories they accept or reject, the classes of serializability their generated histories belong to, and scalability, making them appropriate for different applications. We organize the remainder of the paper as follows. Section 2 discusses related work. Section 3 first describes the problem of concurrency control for ICDBs, then looks into appropriate correctness underpinnings for the problem and a simpler way to model clients' concurrent activities. Sections 4 and 5 describe the match-and-go protocol and the graph-based scheduler, respectively. In Section 6, we compare the characteristics of the two protocols and evaluate them via simulation experiments. Section 7 concludes.
2 Related Work
Replica control strategies can be categorized as either pessimistic or optimistic. The two protocols we propose belong to the optimistic camp. The pessimistic approach does not allow two disconnected clients to perform conflicting operations at the same time. The optimistic approach, on the other hand, permits reads and writes everywhere, hence must detect and resolve conflicts after their occurrences. A good overview of existing replication solutions in traditional distributed databases can be found in [KA00].
In the mobile data management domain, the optimistic approach is generally preferable to the pessimistic one, because with pessimistic schemes data needed by a client may be held by another disconnected client for an extended period of time. The optimistic approach, however, incurs abort and redo overheads when conflicts cannot be resolved. On the other hand, deadlock is a potential problem when locking is used to enforce the pessimistic strategy. An analytic study of conflict detection and resolution for the optimistic approach and deadlock detection and resolution for the pessimistic approach can be found in [GHOS96]. In the two-tier replication scheme proposed by Gray et al. [GHOS96], base nodes store the master versions of replicated data and are always connected. A tentative version records local updates on a mobile node. A tentative transaction operates on and produces tentative data. A base transaction, on the other hand, operates on and produces only master data. Tentative transactions are reprocessed as base transactions when a mobile node reconnects to a base node. A tentative transaction fails if a pre-selected acceptance criterion is not met. Gray et al. [GHOS96] pointed out that an ideal replication scheme would achieve serializable transaction execution, although application-specific acceptance criteria may be acceptable in some cases. Pitoura and Bhargava [PB99] proposed a replication model that takes varying connectivity conditions among communicating nodes into consideration. Nodes are divided into clusters, where strongly connected sites belong to the same cluster. Copies within the same cluster are required to be consistent, while intercluster data inconsistency is bounded. Two types of transactions are identified: weak and strict transactions. Weak transactions access local, potentially inconsistent copies and perform tentative updates. Strict transactions access consistent data and perform permanent updates.
When disconnected, a client can still operate by employing weak transactions. To determine correct concurrent execution of weak and strict transactions, the proposed scheme uses intracluster and intercluster serialization graphs. The intercluster serialization graph is maintained at run time, which is analogous to our graph-based scheduler. Commercial database products have also included support for maintaining data consistency in ICDB environments. For example, Sybase SQL Anywhere provides a reconciliation scheme in its MobiLink server synchronization technology [Syb00], which is analogous to our match-and-go protocol. Unlike our approach, Sybase's scheme relies on the server to detect and resolve conflicts, thereby limiting its scalability. We have seen other variants of our match-and-go protocol in [CC02, PMDD00]. To our knowledge, however, none has formally reasoned about the correctness of such protocols as we do.
3 Concurrency Control for ICDBs
As we described in Example 1, concurrent client activities in ICDBs could corrupt shared data if not properly coordinated. Concurrency control protocols are thus designed to be used as validation rules whenever a client connects with the server and tries to commit its local updates. Depending on the validation result, local updates done by a committing client may or may not be propagated to the server. In this paper, we assume connecting clients are processed one at a time, in the order they connect with the server. Because each client replicates data from the server and operates asynchronously, it is natural to consider
using one-copy serializability [BHG87] as the underlying correctness criterion for ICDBs. A replicated data history is one-copy serializable if it is equivalent to a serial execution of the same transactions on a one-copy database. However, when we look more closely into the ICDB model we are considering, we find that our model is distinct from conventional distributed and replicated databases in one important respect: the master copy of every data item is kept on the server. Updates done by each disconnected client are tentative; local updates become committed and can be propagated to the server only when a handshaking rule with the server is passed. By contrast, a replicated database as described in [BHG87] is a type of peer-to-peer system in which transactions executed on each individual machine operate on replicated data and can be committed locally provided that certain pre-specified protocol rules, e.g., the quorum consensus algorithm, are followed. A replicated copy of a data item becomes the master copy when its update is committed, and the committed value must be propagated to other machines that have replicated the same item. The above discussion leads us to the conclusion that we can adopt the typical correctness criterion used for centralized database systems, i.e., serializability, rather than the more complicated criterion for replicated distributed databases, i.e., one-copy serializability. Furthermore, from the server's point of view, it is concerned only with what data has been read by a committing client and what data the committing client wishes to write back. Although a committing client may execute many local transactions while disconnected, its activity can be modelled as a single transaction consisting of reads on data it downloaded from the server and writes on data it wishes to propagate back to the server. The above two observations greatly simplify our design and analysis of concurrency control protocols for ICDBs.
It is a well-known fact that serializable executions can be precisely characterized by serialization graphs [EGLT76]. The first protocol we propose for ICDBs in Section 4, called the match-and-go protocol, does not physically maintain the serialization graph itself, but produces histories that belong to a generalized class of view serializability. The second scheduler, described in Section 5, is based on maintaining and testing the conflict serialization graph of the history that represents the execution the scheduler controls. We discuss a couple of observations that facilitate the scheduler's incremental construction of the graph. The two protocols have different characteristics and can be extended in different ways, which we discuss in Section 6.1.
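The modelling simplification above, namely that everything a committing client did while disconnected collapses into a single transaction, suggests a simple data structure. The following Python sketch is our own illustration (the class and method names are not from the paper): a transaction records only the values seen at download time and the final values to be propagated back.

```python
# Sketch: a committing client's activity between two committing sessions
# modelled as one transaction. Only the read set (values observed when the
# data was replicated) and the write set (final values to propagate back)
# matter to the server; intermediate local transactions are invisible.

from dataclasses import dataclass, field

@dataclass
class ClientTransaction:
    tid: int
    read_set: dict = field(default_factory=dict)   # item -> value at download
    write_set: dict = field(default_factory=dict)  # item -> value to upload

    def download(self, item, value):
        # Record the value observed when the item was first replicated.
        self.read_set.setdefault(item, value)

    def local_update(self, item, value):
        # Tentative local update; only the final value matters at commit.
        self.write_set[item] = value

# Client A from Example 1: it reads x = 1, then writes x = 2.
t_a = ClientTransaction(tid=1)
t_a.download("x", 1)
t_a.local_update("x", 2)
assert t_a.read_set == {"x": 1} and t_a.write_set == {"x": 2}
```

Many intermediate local updates to the same item would still leave exactly one entry per item in `write_set`, which is the sense in which the whole disconnected activity is one transaction.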
4 Match and Go
In our ICDB environment, the server needs to serve all of its clients' requests. It is often desirable to let the clients share the system load as much as possible. This motivates the design of our first protocol, called the match-and-go protocol (MAG). A client is committing when it is ready to commit its changes to shared data on the server. This committing client is represented by a transaction consisting of zero or more reads on data downloaded from the server during previous connections, followed by writes on shared data. The basic idea underlying MAG is that updates done by the current transaction depend on the data values it reads. It is possible that the data values used by the client transaction to compute its updates are changed
by other clients. But as long as the value of each data item in the transaction's read set remains the same as the value of the datum when it was downloaded after the transaction started, the transaction can still be committed. With MAG, if a transaction reads a data item for computation purposes, then there will be a second read operation on the same item for validation purposes at commit time. It is the committing client, rather than the server, that is responsible for checking any data value discrepancy. We show the protocol's handshaking rules in Figure 1. Note that if the values of all data items in the committing transaction's read set are found to remain the same at commit time, then this condition must not change to false until the write set of the committing transaction is written back to the server. This property holds in our protocol because we assume that only one connecting transaction is processed at a time. While the concept of MAG should be easy to grasp, its correctness reasoning may not be straightforward. Because the protocol does not check for read/write operation conflicts between transactions submitted by different clients, conflict serializability does not apply here. We claim that all histories representing executions that could be produced by MAG belong to a generalized class of view serializability, which we call view∗ serializability. We first define the notion of view∗ equivalence below.

Definition 1 Two histories H and H′ are said to be view∗ equivalent if (1) they are over the same set of transactions and have the same operations; (2) each transaction's reads read the same values in the two histories; and (3) the final write operation on any data item comes from the same transaction in the two histories.

Note that view∗ equivalence requires that every read in the two histories acquire the same value. By contrast, view equivalence requires that every read in the two histories read from the same transaction [BHG87].
We define a history H to be view∗ serializable if for any prefix H′ of H, the committed projection C(H′) is view∗ equivalent to some serial history. Clearly, view∗ serializability is a superset of view serializability. Now we are ready to show that MAG produces only view∗ serializable executions.

Theorem 1 Any history H produced by MAG is view∗ serializable.

Proof: Let H′ be any prefix of H. We want to show that C(H′) is view∗ equivalent to the following serial history Hs: the transactions in Hs are the committed transactions in C(H′), and a transaction T1 appears before T2 in Hs if T1 commits before T2. Obviously, the two histories C(H′) and Hs must have the same set of transactions and the same set of operations. For validation purposes, if a transaction reads a data item, then there will be another read operation on the same datum when the transaction is ready to commit. If a transaction T commits in C(H′) and it reads any data item, say x, then its two read operations on x must have acquired the same value. Now consider the second read on x in C(H′) and the second read on x in Hs. Because we place committed transactions in Hs according to their commit order in C(H′), the two read operations must read from the same write operation in both histories, hence the same value. Furthermore, the two reads on x in Hs must also read the same value because Hs is a serial history. Hence, all four read
At the server site:
    if the connecting client's message is "DOWNLOAD" then
        transmit every data item needed by the client;
    else (the connecting client's message is "UPLOAD")
        install the data values supplied by the client;
    endif;

At the client site:
    if the processing mode is "READ" then
        send the "DOWNLOAD" message to the server;
        download any data needed by the client, and place these
            downloaded data items with values in a read pool;
    else (the processing mode is "VALIDATE+WRITE")
        send the "DOWNLOAD" message to the server;
        download all data items appearing in the read pool, and
            compare them with those in the read pool;
        if there are no changes in values for all data items then
            send the "UPLOAD" message to the server;
            send the data items to be uploaded with values to the server;
            commit the local transaction;
        else
            abort the local transaction;
        endif;
        flush the read pool;
    endif;

Figure 1: Handshaking rules for the match-and-go scheduler
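The client-side validate-and-write step of Figure 1 can be sketched in a few lines. The following Python fragment is illustrative only: an in-memory dictionary stands in for the server's download/upload messages, and all names are hypothetical.

```python
# Minimal sketch of the match-and-go VALIDATE+WRITE step (Figure 1).
# `server` is an in-memory stand-in for the real download/upload messages.

def mag_commit(server, read_pool, write_set):
    """Install write_set iff every item in the read pool still has the
    value observed at download time; otherwise abort (return False)."""
    # VALIDATE: re-download every item in the read pool and compare.
    current = {item: server[item] for item in read_pool}
    if current != read_pool:
        return False          # a value changed -> abort the local transaction
    # WRITE: upload the new values to the server, then commit.
    server.update(write_set)
    return True

server = {"x": 1, "y": 1}

# Client A downloads x, computes x + 1 locally, and validates successfully.
pool_a = {"x": server["x"]}
assert mag_commit(server, pool_a, {"x": pool_a["x"] + 1})   # x is now 2

# Client B downloaded x earlier (value 1); validation now fails.
pool_b = {"x": 1}
assert not mag_commit(server, pool_b, {"x": 2})
assert server["x"] == 2   # A's update is preserved: no lost update
```

Note how this replays Example 1: client B's write is rejected rather than silently overwriting A's, which is exactly the lost-update scenario MAG is designed to prevent.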
operations on x in C(H′) and Hs must read the same value. As a result, each transaction's reads read the same value in both histories. Finally, the final write operation on any data item in both histories must come from the same transaction, due to our way of placing committed transactions in Hs. □
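To make the three conditions of Definition 1 concrete, the following sketch checks them over histories encoded as (kind, transaction, item, value) tuples with kind "r" or "w". The encoding is our own, and condition (2) is checked as a multiset comparison of read values, a simplification of per-read matching.

```python
# Sketch of Definition 1: view* equivalence of two histories, each a list
# of (kind, transaction, item, value) tuples, kind in {"r", "w"}.

def view_star_equivalent(h1, h2):
    def ops(h):                       # operations without values
        return sorted((k, t, x) for k, t, x, _ in h)
    def reads(h):                     # values acquired by each read
        return sorted((t, x, v) for k, t, x, v in h if k == "r")
    def final_writers(h):             # later writes overwrite earlier ones
        w = {}
        for k, t, x, _ in h:
            if k == "w":
                w[x] = t
        return w
    # (1) same transactions and same operations
    if ops(h1) != ops(h2):
        return False
    # (2) each transaction's reads acquire the same values
    if reads(h1) != reads(h2):
        return False
    # (3) the final write on each item comes from the same transaction
    return final_writers(h1) == final_writers(h2)

# Interleaved history vs. the serial history in commit order: equivalent.
h1 = [("r", 1, "x", 1), ("r", 2, "y", 1), ("w", 1, "x", 2), ("w", 2, "y", 2)]
hs = [("r", 1, "x", 1), ("w", 1, "x", 2), ("r", 2, "y", 1), ("w", 2, "y", 2)]
assert view_star_equivalent(h1, hs)

# Same operations but T2 read a different value: not view* equivalent.
h_bad = [("r", 1, "x", 1), ("r", 2, "y", 9), ("w", 1, "x", 2), ("w", 2, "y", 2)]
assert not view_star_equivalent(h_bad, hs)
```

The second example shows why the value-based condition (2) is weaker than view equivalence's reads-from condition: only the values read matter, not which transaction produced them.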
5 Graph-Based Scheduler
The match-and-go protocol introduced in Section 4 has one important characteristic: the data read (downloaded) by a committing client can be obtained in one or more connections. In other words, a mobile client is not required to download all the data it needs during one single connection. Whenever a need arises, a client can connect to the server and download any data it needs. This design permits the read operations of a committing client to be interleaved with write operations of other clients. Nevertheless, we are able to delay checking for data value discrepancies until a client commits because the underlying correctness criterion is not conflict-based. Another characteristic of the match-and-go protocol is that it requires each client to always use recently changed data. While this property is important in certain applications, less up-to-date data may be acceptable in other applications. In this section, we introduce a scheduler that is based on conflict serializability. In particular, the scheduler incrementally builds a serialization graph that represents the execution history the scheduler processes. The graph-based scheduler (GBS) produces validation results based on detection of cycles in the graph. We still desire the flexibility of allowing a client to download the data it needs in more than one connection. Due to the ordering nature of conflicting operations, our GBS checks possible conflict relationships whenever they arise, rather than delaying the check until commit time. Each committing client consists of zero or more read blocks and a write block. Each read block consists of reads on data that were downloaded by the client during one connection. The write block consists of updates on data that the client wishes to propagate back to the server. We denote the read blocks of a client transaction Ti as RBi,0, RBi,1, .... When we do not need to refer to a particular read block of Ti, we use RBi to denote a read block of Ti.
The first time Ti downloads data, i.e., when RBi,0 is processed by the server, a new node Ti is added to the serialization graph, and Ti is in the active state. When processing a read block RBi, we check the conflict relationship between RBi and each write block WBj of a client transaction Tj already in the history; if WBj contains an operation that conflicts with an operation in RBi, we add an edge from Tj to Ti in the serialization graph. We say WBj conflicts with RBi. We say a path exists from T1 to Tn if there is an edge from Tl to Tl+1, 1 ≤ l ≤ n − 1. T1 is called an ancestor of Tn and Tn is a descendant of T1. A problem we must consider is how far back in the history our GBS should look for the write blocks that conflict with a given read block. We need to consider the conflict relation between a read block RBi and a write block WBj in the history if it could induce an edge from Tj to Ti and this edge may become part of a cycle in the serialization graph. The following observation characterizes the condition under which the edge Tj → Ti (if it exists) may be involved in a cycle.
Observation 1 When processing a read block RBi, the GBS must consider the conflict relation between RBi and a write block WBj that is already in the history if there exists a path from an active transaction Tk to Tj in the serialization graph.

The observation can be explained as follows. If Tk is active, then a path from Ti to Tk becomes possible when Tk is about to commit, i.e., when its write block is received by the server. We must check the conflict relation between RBi and WBj, because ignoring it may lead to a missed cycle (formed by an existing path from Tk to Tj, then a possible edge from Tj to Ti, and finally a possible path from Ti back to Tk) in the serialization graph. On the other hand, if no such Tk exists when we process RBi, then we can safely ignore the conflict relation between RBi and WBj, knowing that no cycle will form even if there were an edge from Tj to Ti. In a similar vein, when processing a write block WBi, the GBS must consider the conflict relations between WBi and those write blocks whose transactions have active ancestors. In addition, the conflict relation between WBi and every read block already in the history may form an edge, which in turn may lead to a cycle afterwards. We thus obtain the following observation with regard to a received write block.

Observation 2 When processing a write block WBi, the GBS must consider the conflict relation between WBi and a write block WBj if there exists a path from an active transaction Tk to Tj in the serialization graph. In addition, it must also consider the conflict relation between WBi and the read block(s) of every active transaction Tj.

To quickly identify which write (read) blocks the GBS must consider when processing a received block, we maintain the set of transactions whose write (read) blocks have active ancestors, denoted WAA (RAA). Whenever the GBS processes a read block, it only needs to consider the write blocks belonging to transactions in WAA.
Whenever the GBS processes a write block, it needs to consider the write blocks of transactions in WAA and the read blocks of transactions in RAA. WAA and RAA, both initialized to the empty set, are updated incrementally. Based on Observation 2, the first time a read block of Tj, RBj, is received, we include Tj in RAA, meaning that the active transaction Tj is an ancestor of its own read block. To keep track of the number of read blocks of Tj that have been processed, we use a variable NRBj. The GBS then uses RAA and NRBj when it builds the serialization graph by considering the conflict relationships between a received write block and the read blocks of transactions in RAA. For every active transaction Tj, the GBS uses DATj to maintain the set of transactions that are descendants of Tj. DATj is mainly used by the GBS to efficiently detect cycles: when an edge from Tk to Tj is to be added due to the conflict relation between their corresponding blocks, a cycle will be formed if Tk is in DATj, i.e., a path from Tj to Tk together with an edge from Tk to Tj. Note that when such a situation arises, the GBS must be processing the write block of Tj. In this case, the GBS must reject WBj and abort Tj to avoid a non-serializable execution. To facilitate maintenance of RAA, WAA, and DATj for every active transaction Tj, the GBS maintains the set of active ancestors of Tj, denoted AAj. When a read block RBj is received,
we make Tj a member of AAj , meaning that Tj is an active ancestor of itself. If an edge Tk → Tj is added, then AAj is set to be the union of AAj and AAk , i.e., active ancestors of Tk also become active ancestors of Tj . For each member Tl of AAj , Tj is made a member of DATl . We give the detailed algorithms of GBS for processing read and write blocks in the appendix of the paper. To ease exposition, we assume every transaction downloads data just once, i.e., with only one read block. The algorithms can be easily extended to handle multiple downloads for every single transaction.
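The edge-addition and cycle-detection logic above can be condensed into the following Python sketch. It assumes one read block per transaction, as in the appendix algorithms, and uses plain graph search in place of the incremental RAA/WAA/DAT bookkeeping; it therefore matches GBS's accept/abort decisions but not its data structures.

```python
# Condensed sketch of the graph-based scheduler (GBS): one read block per
# transaction, cycle detection by graph search rather than DAT sets.

class GBS:
    def __init__(self):
        self.edges = {}         # tx -> set of successor txs
        self.read_blocks = {}   # active tx -> set of items read
        self.write_blocks = {}  # committed tx -> set of items written

    def _reaches(self, src, dst):
        stack, seen = [src], set()
        while stack:
            t = stack.pop()
            if t == dst:
                return True
            if t not in seen:
                seen.add(t)
                stack.extend(self.edges.get(t, ()))
        return False

    def _add_edge(self, frm, to):
        self.edges.setdefault(frm, set()).add(to)

    def process_read_block(self, tx, items):
        self.edges.setdefault(tx, set())
        self.read_blocks[tx] = set(items)
        # write-read conflicts: an earlier WB_j induces an edge Tj -> Ti
        for tj, wb in self.write_blocks.items():
            if wb & self.read_blocks[tx]:
                self._add_edge(tj, tx)

    def process_write_block(self, tx, items):
        items = set(items)
        pending = []
        # write-write conflicts with committed write blocks: Tj -> Ti
        for tj, wb in self.write_blocks.items():
            if wb & items:
                pending.append(tj)
        # read-write conflicts with active transactions' read blocks: Tj -> Ti
        for tj, rb in self.read_blocks.items():
            if tj != tx and rb & items:
                pending.append(tj)
        # adding Tj -> Ti closes a cycle iff Ti already reaches Tj
        for tj in pending:
            if self._reaches(tx, tj):
                self.read_blocks.pop(tx, None)   # reject WB_i, abort Ti
                return False
        for tj in pending:
            self._add_edge(tj, tx)
        self.write_blocks[tx] = items
        self.read_blocks.pop(tx, None)           # Ti commits, becomes inactive
        return True

g = GBS()
g.process_read_block(1, {"x"})
g.process_read_block(2, {"x"})
assert g.process_write_block(1, {"x"})       # T1 commits; edge T2 -> T1 added
assert not g.process_write_block(2, {"x"})   # T2's write would close a cycle
g.process_read_block(3, {"y"})
assert g.process_write_block(3, {"y"})       # disjoint items: T3 commits
```

In the usage example, T2's write block is rejected for exactly the Example 1 scenario: the edge T1 → T2 would complete the cycle with the existing edge T2 → T1.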
6 Discussions and Performance Evaluation
We have described two concurrency control protocols designed to ensure data consistency in ICDB environments. The match-and-go (MAG) protocol aborts a local transaction sent from a committing client if data in its read set has been changed with different values by other mobile clients. The graph-based scheduler (GBS) incrementally maintains the serialization graph of the history that represents the execution the scheduler controls. A local transaction is accepted by GBS if its inclusion in the graph does not lead to a cycle. We compare the characteristics of the two protocols in Section 6.1. In Section 6.2, we evaluate them via simulation experiments.
6.1 Discussions
The two protocols are designed to be used as validation procedures when a client connects with its server. Our reconciliation criterion is based on global serializability, i.e., every concurrent execution of committed local transactions is equivalent to a sequential execution of the same transactions. The histories produced by the two protocols are, however, members of different serializability classes. While GBS produces histories that are in the class of conflict serializability, the set of histories produced by MAG is shown to be in the class of view∗ serializability, a generalized class of view serializability. It can be shown that the set of histories that can be generated by GBS has a nonempty intersection with the set of histories that can be generated by MAG; however, the two sets are incomparable with respect to set inclusion. Another aspect that distinguishes the two protocols is that MAG requires each mobile client to always use recently changed data (made by other clients). Hence, the protocol is more suitable for applications such as stock trading, for which knowing and using the latest data state is important. GBS, on the other hand, allows a mobile client to compute results based on out-of-date data values. Hence, it is more suitable for applications such as executive decision making, for which an approximate data state may be acceptable. As for the reconciliation overhead, it is the server's responsibility to produce correct histories in the graph-based scheduler, whereas the reconciliation task is distributed to each connecting mobile client in the match-and-go protocol. As a result, MAG is more scalable than GBS.
Table 1: System Parameters and Default Settings

Parameter    Setting         Meaning
DBSize       1000∼5000       Number of data items in database
DomainSize   1∼1000          Number of values that each data item can take
ClientSize   50∼5000         Number of clients
TransSize    8               Number of data items read or written by a transaction
mu           5               Average inter-arrival time on server
ConnPattern  Normal/Uniform  Client connection pattern

6.2 Experimental Evaluation
In order to better understand these protocols, we conduct a series of experiments using a software simulator. The simulator is written in the C language and runs on a standard personal computer. Typically the simulator is instructed to run for a large number of time ticks, say 10,000,000, to simulate a long-lasting process. The number of connections made by the clients in any given time interval is designed to follow a Poisson distribution; equivalently, the inter-arrival time between every two connections follows an exponential distribution. Each client generates its own transactions independently. The number of transactions each client generates follows one of two distributions: (1) a uniform distribution, meaning that all clients produce approximately the same number of transactions; or (2) a normal distribution, intended to simulate scenarios in which some clients connect to the server more often than others. The parameters, together with their settings and meanings, are listed in Table 1. The principal performance metric used in this simulation is the transaction success rate, defined as the ratio of the number of successful transactions to the total number of transactions generated by all clients. In this simulation we also include a variant of the MAG protocol in the discussion. Note that with the MAG protocol, the all-or-nothing approach to writing values back to the server is a rather strict requirement: one value change in the read set will abort a committing transaction, even if no data item in the write set depends on the data item whose value happened to be modified. If the write set can be decomposed into mutually independent subsets, then we can treat the transaction as a set of subtransactions, according to those independent write subsets and suitably adjusted read sets. In general, the read set of each subtransaction will be smaller than that of the whole transaction.
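The decomposition into mutually independent subtransactions can be sketched as follows. The dependency map, which records the read items each write item depends on, is a hypothetical input; writes whose read dependencies overlap end up in the same subtransaction.

```python
# Sketch: partition a transaction's write set into subtransactions whose
# read sets are mutually disjoint. `deps` maps each write item to the set
# of read items it depends on (a hypothetical input for illustration).

def decompose(deps):
    """Return groups of (write items, union of their read items)."""
    groups = []
    for w, reads in deps.items():
        merged_w, merged_r = {w}, set(reads)
        remaining = []
        for ws, rs in groups:
            if rs & merged_r:          # shared read dependency: merge groups
                merged_w |= ws
                merged_r |= rs
            else:
                remaining.append((ws, rs))
        groups = remaining + [(merged_w, merged_r)]
    return groups

# w1 and w2 both depend on x, so they form one subtransaction;
# w3 depends only on y and forms another.
subs = decompose({"w1": {"x"}, "w2": {"x", "z"}, "w3": {"y"}})
assert sorted(len(ws) for ws, _ in subs) == [1, 2]
```

Each resulting group can then be validated against the server independently, which is the basis of the partially committed variant discussed next.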
If we apply the MAG protocol to each subtransaction, then some of them may succeed even if the transaction as a whole fails to commit. We call this modified protocol the partially committed MAG protocol, abbreviated PCMAG. Obviously, all histories generated by the PCMAG protocol are view∗ serializable. To enable simulation with PCMAG, we introduce a dependency factor, d, between the write set and the read set of a transaction. The value d is simply the probability that, for any given read data item, a write data item happens to
depend on.

Figure 2: The impact of different database sizes on success rate (50 clients uniformly connected, domain size = 100)

We begin by looking at the effect of database size on the transaction success rate. Figures 2 and 3 show that the success rate of each protocol increases as the database size increases. This is because the more data items a database has, the smaller the chance that two transactions access the same data items. Note that the dependency factor for PCMAG is 0.5 in these figures. Also note that, in our setting, transactions originating from the same client arrive at the server one after another and cannot cause each other to abort; a transaction is aborted only when it conflicts with transactions from other clients. If the clients' connection pattern follows a uniform distribution, then transactions from different clients interleave more than under a normal distribution. Therefore, a higher success rate is expected with a normally distributed connection pattern, as shown in Figure 3.

Intuitively, increasingly complicated interaction among transactions can be expected as the number of clients increases. When MAG or PCMAG is used, the data values a transaction reads in its first phase are more likely to be updated by other clients' transactions before the transaction is ready to commit, so the transaction success rate drops as the number of clients increases. If GBS is used, more clients will likely add more nodes to the serialization graph maintained by the system, so the probability of forming cycles in the graph increases. More clients thus have a negative effect on the success rate in general. This observation agrees with the data shown in Figure 4.
In Figure 4, we include simulations of three different dependency levels for PCMAG. We are also interested in the relationship between the data domain size and the transaction success rate of these protocols. Under the MAG protocol, a transaction can commit successfully only if the two read operations it performs on a data item return equal values; similar reasoning applies to PCMAG. When the data domain size
Figure 3: The impact of different database sizes on success rate (50 clients normally connected, domain size = 100)
Figure 4: The relationship between client size and transaction success rate
Figure 5: The relationship between data domain size and transaction success rate (Clients are uniformly connected.)

is small, a data item has a better chance of being switched back to an earlier value after several update operations; this chance degrades quickly as the domain size grows. The domain size can therefore affect the transaction success rate when the MAG or PCMAG protocol is used. Under the GBS protocol, however, the server never inspects data values when deciding whether to commit or abort a transaction, so the data domain size has no effect on the success rate. Figures 5 and 6 illustrate this issue. All previous figures in this section show that a system running GBS achieves a better transaction success rate than one running MAG. This holds in most practical scenarios and can be explained as follows. Under MAG, a transaction is aborted whenever it detects any value change in its read set; such value changes correspond to read-write conflict edges in the serialization graph of GBS. Under GBS, however, a read-write conflict does not necessarily lead to an abort: a transaction is aborted only when it is involved in a cycle of the serialization graph, and each edge in such a cycle results from either a read-write or a write-write conflict. Case (1): if it is a read-write conflict, an abort is bound to occur under MAG unless the write operation writes back the very value the read operation saw earlier, and it is rather uncommon in practice for a data item to be updated by different, independent transactions and yet retain its original value afterwards. Case (2): if it is a write-write conflict, the read set of the earlier of the two transactions is not modified.
So the earlier transaction passes through a system using MAG successfully. With a database that is large relative to the data set a typical transaction accesses, the chance of write-write conflicts is small. A closer look at Figure 4, however, reveals an interesting crossing between the curves for GBS and for PCMAG with a 0.25 dependency factor. This tells us that, under certain conditions, PCMAG can achieve a better success rate than GBS. Figures 7 and 8 give us
Figure 6: The relationship between data domain size and transaction success rate (Clients are normally connected.)

more details on this issue. When the dependency between the read set and the write set of each transaction is low, the environment favors the PCMAG protocol.
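The value-matching rule that makes MAG and PCMAG sensitive to domain size and dependency can be sketched as follows. This is our own illustrative reading, not the paper's exact algorithm: `mag_validate` compares every value a transaction read while disconnected against the server's current value, while the hypothetical `pcmag_validate` checks only the reads that the transaction's write set actually depends on.

```python
def mag_validate(server_db: dict, read_values: dict) -> bool:
    """MAG-style commit check (sketch): commit only if every value read
    while disconnected still matches the server's current value.  An item
    that was updated and later switched back to its old value does not
    trigger an abort, which is why small data domains raise the success rate."""
    return all(server_db[x] == v for x, v in read_values.items())

def pcmag_validate(server_db: dict, read_values: dict, depended_on: set) -> bool:
    """PCMAG-style variant (our interpretation): only the reads that the
    transaction's writes depend on need to match."""
    return all(server_db[x] == v
               for x, v in read_values.items() if x in depended_on)

# A client read x = 1 and y = 1, then disconnected.
db = {"x": 1, "y": 1}
reads = {"x": 1, "y": 1}
db["y"] = 2                              # another client changes y meanwhile
print(mag_validate(db, reads))           # MAG aborts: y changed
print(pcmag_validate(db, reads, {"x"}))  # PCMAG commits: writes depend only on x
```

With a low dependency factor, `depended_on` is a small subset of the read set, so PCMAG tolerates many of the value changes that force MAG to abort.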
7 Conclusion
In this paper, we have proposed two concurrency control protocols for intermittently connected client-server databases. The protocol rules determine whether the updates made by a committing client can be reflected back to the server. We make two important observations that greatly simplify the design and analysis of our protocols. In particular, we argue that one can adopt serializability, rather than one-copy serializability, as the underlying correctness criterion. In addition, the activities of each committing client can be modelled as a single transaction, rather than one or more local transactions. The two protocols produce schedules belonging to different classes of serializability: view* serializability for the match-and-go protocol (MAG) and conflict serializability for the graph-based scheduler (GBS). When comparing their performance, we also consider a variant of MAG, called PCMAG, that takes the dependency of a transaction's write set on its read set into consideration. Simulation results reveal that all three protocols commit more transactions as the database size increases, and all of them abort more transactions as the number of clients increases. As the data domain size increases, the success rates achieved by MAG and PCMAG drop quickly and stabilize once the domain size exceeds ten; GBS, by contrast, is not sensitive to domain size changes. Generally speaking, GBS outperforms MAG and PCMAG in most scenarios, but PCMAG can outperform GBS when the dependency of transactions' write sets on their read sets is low.
Figure 7: The impact of different levels of dependency between each transaction's write set and read set on success rate (50 clients, database size = 1000)
Figure 8: The impact of different levels of dependency between each transaction's write set and read set on success rate (500 clients, database size = 5000)
References

[BHG87] Philip A. Bernstein, Vassos Hadzilacos, and Nathan Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1987.

[CC02] Sidney Chang and Dorothy Curtis. An approach to disconnected operation in an object-oriented database. In Proceedings of the Third International Conference on Mobile Data Management, pages 19-26, 2002.

[EGLT76] K. Eswaran, J. Gray, R. Lorie, and I. Traiger. The notions of consistency and predicate locks in a database system. Communications of the ACM, 19(11):624-632, 1976.

[GHOS96] Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha. The dangers of replication and a solution. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 173-182, Montreal, Quebec, Canada, June 1996.

[KA00] B. Kemme and G. Alonso. A new approach to developing and implementing eager database replication protocols. ACM Transactions on Database Systems, 25(3):333-379, September 2000.

[MDN+98] Sameer Mahajan, Michael J. Donahoo, Shamkant B. Navathe, Mostafa H. Ammar, and Sanjoy Malik. Grouping techniques for update propagation in intermittently connected databases. In Proceedings of the International Conference on Data Engineering, pages 46-53, 1998.

[PB99] Evaggelia Pitoura and Bharat K. Bhargava. Data consistency in intermittently connected distributed systems. IEEE Transactions on Knowledge and Data Engineering, 11(6):896-915, 1999.

[PMDD00] Nuno Preguiça, J. Legatheaux Martins, Henrique Domingos, and Sérgio Duarte. Data management support for asynchronous groupware. In Proceedings of the ACM 2000 Conference on Computer Supported Cooperative Work, December 2000.

[Syb00] Sybase. Building UltraLite Applications Using SQL Anywhere Studio. Student guide, Sybase, Inc., 2000.
A Algorithm for processing a read block RBj in GBS
Proc for processing a read block RBj
  create a node Tj in SG;
  AAj = {j};                     // Tj is an active ancestor of RBj (i.e., itself)
  RAA = RAA ∪ {j};
  DATj = DATj ∪ {j};
  for each k ∈ WAA               // if WBk has an active ancestor and WBk ∩ RBj ≠ ∅,
    if WBk ∩ RBj ≠ ∅ then        // then an edge from Tk to Tj is found
      add Tk → Tj to SG;
      AAj = AAj ∪ AAk;           // active ancestors of Tk also become active ancestors of Tj
    end if;
  end for;
  for each l ∈ AAj
    DATl = DATl ∪ DATj;
  end for;
  for each l ∈ DATj
    AAl = AAl ∪ AAj;
  end for;
End Proc;
B Algorithm for processing a write block WBj in GBS
Proc for processing a write block WBj
  for each k ∈ WAA
    if WBk ∩ WBj ≠ ∅ then
      if k ∉ DATj then
        add Tk → Tj to SG;
        AAj = AAj ∪ AAk;         // active ancestors of Tk also become active ancestors of Tj
        WAA = WAA ∪ {j};
      else                       // cycle found
        trim(j);
        return;
      end if;
    end if;
  end for;
  for each k ∈ RAA and k ≠ j
    if RBk ∩ WBj ≠ ∅ then
      if k ∉ DATj then
        add Tk → Tj to SG;
        AAj = AAj ∪ AAk;         // active ancestors of Tk also become active ancestors of Tj
        WAA = WAA ∪ {j};
      else                       // cycle found
        trim(j);
        return;
      end if;
    end if;
  end for;
  // The edges that enter Tj may be formed by WBj, not RBj; in this case
  // we must treat j as the descendant of j's ancestors (or new ancestors due to WBj).
  for each l ∈ AAj
    DATl = DATl ∪ DATj;
  end for;
  for each l ∈ DATj
    AAl = AAl ∪ AAj;
  end for;
  // If Tj has an incoming edge, we still treat it as active (even though it is now committed).
  if Tj has no incoming edge then
    trim(j);
  end if;
End Proc;
// Proc for removing Tj, either because it is aborted, or because it is committed and there is
// no incoming edge toward Tj in the serialization graph.
Proc trim(j)
  for each l ∈ DATj
    AAl = AAl - {j};
    if AAl == ∅ then
      WAA = WAA - {l};
      RAA = RAA - {l};
    end if;
  end for;
  for each child Tk of Tj
    remove Tj → Tk from SG;
    if Tk is committed and has no incoming edge then
      trim(k);
    end if;
  end for;
  remove Tj from SG;
  remove DATj;
  remove AAj;
  RAA = RAA - {j};
  WAA = WAA - {j};
  remove edges incident with Tj;
End Proc;
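The core idea of these procedures, adding a conflict edge per overlapping block and aborting on a cycle, can be sketched in executable form. The class below is our own simplified reading, not the paper's exact algorithm: it replaces the incremental AA/DAT ancestor-descendant bookkeeping with an explicit depth-first reachability search, and all names are ours.

  # Simplified GBS sketch (assumptions: one read block then one write block
  # per transaction; explicit DFS instead of the paper's AA/DAT sets).
  class GraphBasedScheduler:
      def __init__(self):
          self.reads = {}   # tid -> read set (from the read block)
          self.writes = {}  # tid -> write set (empty until the write block arrives)
          self.edges = {}   # tid -> set of successors in the serialization graph

      def _reaches(self, src, dst):
          # Depth-first search: is dst reachable from src along conflict edges?
          stack, seen = [src], set()
          while stack:
              n = stack.pop()
              if n == dst:
                  return True
              if n not in seen:
                  seen.add(n)
                  stack.extend(self.edges.get(n, ()))
          return False

      def process_read_block(self, tid, read_set):
          self.reads[tid] = set(read_set)
          self.writes[tid] = set()
          self.edges[tid] = set()
          for k, w in self.writes.items():
              if k != tid and w & self.reads[tid]:
                  self.edges[k].add(tid)   # Tk wrote what Tj now reads: Tk -> Tj

      def process_write_block(self, tid, write_set):
          """True if tid commits, False if it must abort (a cycle was found)."""
          ws = set(write_set)
          for k in list(self.edges):
              if k != tid and (self.writes[k] & ws or self.reads[k] & ws):
                  if self._reaches(tid, k):    # edge Tk -> Tj would close a cycle
                      self._remove(tid)
                      return False
                  self.edges[k].add(tid)
          self.writes[tid] = ws
          return True

      def _remove(self, tid):
          # Counterpart of trim(j), without the recursive cleanup of committed nodes.
          for succ in self.edges.values():
              succ.discard(tid)
          for table in (self.reads, self.writes, self.edges):
              table.pop(tid, None)

  # Two interleaved transactions whose blocks form a cycle:
  g = GraphBasedScheduler()
  g.process_read_block("T1", {"x"})
  g.process_read_block("T2", {"y"})
  print(g.process_write_block("T2", {"x"}))  # True: only edge T1 -> T2 so far
  print(g.process_write_block("T1", {"y"}))  # False: edge T2 -> T1 closes a cycle

The example mirrors the protocol's abort condition: T1's read of x orders T1 before T2, and T1's write of y would order T2 before T1, so T1 must abort.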