Real-Time Client-Server Push Strategies: Specification and Evaluation
Vinay Kanitkar and Alex Delis
Department of Computer and Information Science
Polytechnic University
Brooklyn, NY 11201
Abstract
The widespread use of networked environments and the recent availability of high network bandwidths have rekindled interest in the area of automatic data refresh/update mechanisms. In many application areas, the updated information has a limited period of usefulness. Therefore, the development of systems and protocols that can handle such update tasks within predefined deadlines is required. In this paper, we propose and evaluate two real-time update-propagation mechanisms in a client-server environment. The main idea is the transport of update transactions to the location of the cached data rather than the updated data itself. Unlike current push mechanisms, the propagation of update transactions to client sites is neither periodic nor mandatory. Instead, it is based on client-specific criteria which depend on the contents of the database object being updated. We examine real-time push scheduling issues using the popular client-server database model as our underlying framework and carry out a performance scalability study under varying system configurations.
Copyright 1998 IEEE. Published in the Proceedings of the Real-Time Technology and Applications Symposium, 3-5 June 1998 at Denver, Colorado.
1 Introduction
Most contemporary database systems and corporate information systems have been built on a base architecture of inter-networked workstations. The Client-Server (CS) paradigm, in particular, has become very common in the implementation of such systems [4, 6, 24]. The server hosts the database which is queried and updated by the clients. In the past, such clients used to be dumb terminals which acted as little more than user-interface devices. Now, with the decreasing cost of powerful workstations, the clients’ resources are also utilized to perform database processing. Local storage (memory/disk) is used to cache frequently accessed data, off-loading operations on that data from the server to the client. This localized transaction processing can result in faster response times but requires more complicated schemes for concurrency control [22, 4, 24] and crash recovery [18, 9]. An important measure of performance in the client-server model has been the response time in obtaining data from the server, and the swift availability of updated data. There has been considerable research in this field with respect to improving server response times without compromising the requirements of data consistency and concurrency control [7, 19, 24].
The increasing availability of network bandwidth gives us the opportunity to focus on automatic data update/refresh mechanisms. We call the process of automatically updating cached data or information update-push, and servers which implement such techniques are termed push-servers. So far, limited processing power at the server and constraints on network capacities have made push-server implementations limited in scope. Conceptually, current push mechanisms operate as follows:
• Logical predicates specify the data that client machines want to receive automatic updates to. At specific time-intervals, or when the data changes, the updated data is automatically sent (pushed) to the client where it is displayed in place of, or in addition to, the old data.
• All updates to the data occur only at the server.
However, in reality, most push techniques execute in the form of a “programmed” pull. This means that the client site is set to contact the server at periodic intervals and download updates to the data. The major disadvantage of this method is that the frequency with which automatic updates are performed does not depend on the content of the data, but on the specified refresh interval. Allowing updates only at the server simplifies the push protocol but does not reflect the mode of operation of contemporary client-server databases, where updates can occur at the clients as well [19]. In this paper, we propose two data-triggered push protocols which
push an update to a client only if it causes the contents of the data to satisfy client-specified conditions. Hence, the push mechanism is active rather than monotonic. An update is propagated by sending the updating transaction over to the requesting client, where it is re-executed on a locally cached copy of the data. Such transactions are called push-transactions. In addition to providing active rule-based updates, we also impose deadlines on the transmission and execution of the update transactions. Specifically, we use deadlines to impose a period of usefulness on the update, because in many real-world applications, such as stock/futures market systems and geographical positioning systems, the usefulness of the data drops sharply once it becomes old. How quickly data becomes old depends on the application context. In our model, clients define their data sets of interest and provide logical predicates (push predicates) which specify the conditions to be satisfied for automatic updates to those data. Client predicates are in the form of forward-chaining rules, which are similar to production rules [11]. These rules primarily execute in response to updates made to the database, and have the general form:
ON EVENT IF CONDITION THEN ACTION
The EVENT belongs to the set of events/states that have importance within the system context. The CONDITION is a look-up over the database and the ACTION is the set of operations to be performed when the CONDITION is satisfied. In the context of a push mechanism, the EVENT is the completion of an update to the data. The CONDITION is a look-up of updated database objects where the contents of the objects are evaluated against push predicates specified by the clients (that are interested in those objects), and the ACTION specifies the propagation of update transactions to interested clients if the CONDITION is satisfied. Since the update transactions are shipped to the clients and re-executed on local copies of the data, they can be executed in parallel without the use of global locks. This results in greater availability of the primary copy of the database for conventional transaction processing. This paper is organized as follows: in the next section, we describe the database processing and transaction scheduling models. Section 3 describes the two update-push protocols while Section 4 outlines the experimental methodology and presents our experimental results. We discuss related work in Section 5, and conclusions can be found in Section 6.
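The ON EVENT IF CONDITION THEN ACTION form described above can be illustrated with a minimal Python sketch. The names `Rule`, `fire`, the event label `"update_committed"` and the client label are hypothetical and chosen only for illustration; the paper does not prescribe a rule representation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Rule:
    event: str                                   # EVENT the rule listens for
    condition: Callable[[Dict[str, float]], bool]  # CONDITION: look-up over the database
    action: Callable[[Dict[str, float]], None]     # ACTION: run when CONDITION holds

def fire(rules: List[Rule], event: str, db: Dict[str, float]) -> int:
    """Evaluate every rule registered for `event`; return how many fired."""
    fired = 0
    for rule in rules:
        if rule.event == event and rule.condition(db):
            rule.action(db)
            fired += 1
    return fired

# Hypothetical rule: after an update commits, push XYZ's price to a
# client if the price has dropped below 8.5.
pushed = []
rules = [
    Rule(
        event="update_committed",
        condition=lambda db: db["XYZ"] < 8.5,
        action=lambda db: pushed.append(("client-34", db["XYZ"])),
    )
]

fire(rules, "update_committed", {"XYZ": 9.0})  # condition false, nothing pushed
fire(rules, "update_committed", {"XYZ": 8.0})  # condition true, one push recorded
```

In a push setting the action would enqueue a push-transaction rather than append to a list; the list merely makes the effect observable.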
2 Client-Server Database Configuration and Real-Time Scheduling
This subsection describes the model and locking scheme in a data-shipping client-server database (CS-DBS) [4, 19]. The push mechanisms described in this paper have been superimposed on such an architecture. In this environment, the database resides at the server and user transactions are initiated at the clients. Client computing systems are expected to have significant processing and storage capabilities. A local area network is assumed to provide transport support for both query/update requests and data/result shipping. The disk and main memory available at the clients are used to cache significant portions of the database.
When a client transaction needs to access database objects, it first checks if the object has been cached locally (with the required lock). If it has, then it can be used immediately. Otherwise, the server is contacted and requested to grant the required lock and ship the object over to the client. Once all the requisite data becomes available, the client executes the transaction locally. Hence, all the transaction processing is performed at the clients while the server performs only low-level database functionalities on behalf of the requesting clients. Objects and locks that have been fetched from the server are cached until the server explicitly requests the client to release them. This inter-transaction caching allows subsequent requests on the cached data to be satisfied locally without interaction with the server. Hence, in addition to off-loading the database processing from the server, the use of data-shipping also allows a possible reduction of network usage and contention (when transactions demonstrate spatial and/or temporal locality of database accesses).
The server is used to provide a global concurrency-control mechanism. A global lock table allows the server to ensure sequential write access to database objects. There are two kinds of locks, Shared (read) and Exclusive (read-write) [19], and the locking scheme works as follows:
• If a client has requested a Shared Lock on an object, the server grants the lock only if no other client has an Exclusive Lock on that object. Similarly, if a client requests an Exclusive Lock on an object, then this lock is granted only if no other client has any type of lock on that object. When a lock is granted to a client, a copy of the object is shipped over to the requesting client, if necessary.
• If one or more other clients hold a conflicting lock on that object, the server contacts them and requests that they return the object and release their locks on it. Once the object has been returned, the server grants the lock to the requesting client and sends the object over.
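The locking rules above can be sketched as a small global lock table. This is an illustrative sketch only: the class `LockTable` and its methods are hypothetical names, and the callback to a conflicting client is simulated by simply removing its entry rather than by exchanging messages.

```python
from collections import defaultdict

SHARED, EXCLUSIVE = "S", "X"

class LockTable:
    """Server-side lock table: object id -> {client id: lock mode}."""

    def __init__(self):
        self.holders = defaultdict(dict)

    def conflicts(self, obj, client, mode):
        """Return the other clients whose locks conflict with the request."""
        others = {c: m for c, m in self.holders[obj].items() if c != client}
        if mode == SHARED:
            # A Shared request conflicts only with an Exclusive holder.
            return [c for c, m in others.items() if m == EXCLUSIVE]
        # An Exclusive request conflicts with any other holder.
        return list(others)

    def request(self, obj, client, mode):
        """Grant the lock, first calling back every conflicting holder.

        Returns the list of clients that were asked to return the object.
        """
        called_back = self.conflicts(obj, client, mode)
        for c in called_back:
            # Simulated callback: the holder returns the object and
            # releases its lock before the new lock is granted.
            del self.holders[obj][c]
        self.holders[obj][client] = mode
        return called_back

table = LockTable()
table.request("obj1", "A", SHARED)      # granted, no conflict
table.request("obj1", "B", SHARED)      # granted, Shared locks coexist
table.request("obj1", "C", EXCLUSIVE)   # A and B are called back first
```

A real server would block the requester until the callbacks complete; the sketch collapses that wait into a single call.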
In many cases, an update to the database may be of interest to other clients in the system. To satisfy such requirements, the updates to the data must be shipped to all interested clients. Such updates are propagated by means of push-transactions. We assume that the propagation of such updates has to be done within specified time constraints. Below, we describe the three update scheduling algorithms that we have considered in this paper. Figure 1 shows the three types of transactions in our system: regular updates, pushed updates and queries. Regular updates and queries are part of the underlying client-server DBMS while the pushed updates belong to the automatic data refresh mechanism. There are no deadline constraints on regular updates or queries. Deadlines are imposed only on the pushed updates (push-transactions) since our emphasis is on the timely communication of data updates to interested clients.
[Figure 1: Transactions in the system. Transactions are either updates or queries; updates are either regular updates or pushed updates.]
After every modification to the database, the rule-base of clients’ push predicates is evaluated to determine which clients require this update to be pushed to them. Each push predicate has an associated deadline which indicates the time within which the update must reach a particular client. A push-transaction is said to have successfully completed only if the update has successfully reached the requesting client within its deadline. In a Real-Time Database System (RTDBS), the priority assignment scheme used to schedule transactions often plays an important part in deciding the efficiency of the system [1, 14]. This efficiency is measured in terms of the percentage of transactions completed within their deadlines. RTDBSs use several methods for assigning priorities to transactions [17, 13]. The most elementary ones are:
• First-Come First-Serve (FCFS): This policy schedules transactions to be processed in the order in which they arrive. Since the deadline information is not used, FCFS will discriminate against a newly arrived transaction with an earlier deadline in favor of an older transaction which may have a later deadline. Therefore, FCFS is not really a real-time scheduling strategy, but serves as a baseline for comparison with other scheduling techniques.
• Earliest Deadline (ED): The transaction with the earliest deadline is assigned the highest priority. This policy does not look at the expected processing time for transactions and hence has the weakness that it can schedule transactions that have already missed their deadlines or are certain to miss them.
• Least Slack First (LS): For each transaction, a slack time, S = d - (t + E), is defined. Here, t is the current time, and E and d are the estimated processing time and deadline for the transaction, respectively. The slack time, S, is an estimate of how long a transaction can be delayed without missing its deadline. The transaction with the least available slack time will be scheduled first by this algorithm.
An important assumption that we make in the discussion of real-time scheduling is that the expected processing time of all jobs is known. This assumption is necessary to use the LS scheduling discipline. Availability of the expected processing time for a job also allows us to determine the feasibility of the job meeting its deadline. Transactions that are expected to miss their deadlines are designated as Tardy Transactions, and are executed at a lower priority. In ED scheduling, we adjust the deadline for a transaction by an estimate of the time required to ship the transaction to its execution point, while in LS scheduling, we include this estimate as part of the estimated processing time E in the calculation of the slack for the transaction. This is an adaptation of the virtual deadline assignment technique proposed in [16]. In the context of real-time update push scheduling, we have additional information about the push-transactions to be scheduled. Using this extra information, we devise two criteria that can be used to enhance the quality of the schedules generated by the above scheduling algorithms. These criteria are:
• Client Count (CC): The use of this information allows the scheduler to break ties between push-transactions by choosing first the transaction which, at the point of activation, affects the greatest number of clients.
• De-scheduling of Redundant Transactions (DRT): The idea here is to de-schedule transactions whose effect has been superseded by later transactions with the same or earlier deadlines. For example, if a transaction that updates the stock price of XYZ Inc. to $56 has not been pushed by the time a later transaction changes it to $52, then it is redundant to push the earlier transaction. This technique is also used to remove tardy transactions from push schedules.
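To make the interplay of the three priority policies with the CC and DRT criteria concrete, the following Python sketch orders a queue of push-transactions. It is a simplified illustration under stated assumptions: the `PushTxn` record and the function names are hypothetical, deadlines are absolute times, the estimated time already includes the shipping estimate, and tardy transactions are dropped outright (the text also mentions running them at lower priority).

```python
from dataclasses import dataclass

@dataclass
class PushTxn:
    obj: str          # database object the update applies to
    arrival: float    # time the update was performed
    deadline: float   # absolute deadline
    est_time: float   # estimated processing plus shipping time
    clients: int      # number of clients affected (used by CC)

def slack(txn, now):
    """Slack = deadline - (current time + estimated processing time)."""
    return txn.deadline - (now + txn.est_time)

def drop_redundant(queue, now):
    """DRT: remove tardy transactions and those superseded by a later
    update to the same object with the same or an earlier deadline."""
    kept = []
    for t in queue:
        tardy = slack(t, now) < 0
        superseded = any(
            o.obj == t.obj and o.arrival > t.arrival and o.deadline <= t.deadline
            for o in queue
        )
        if not tardy and not superseded:
            kept.append(t)
    return kept

def schedule(queue, now, policy):
    """Order the queue by FCFS, ED or LS; ties go to the larger client count."""
    primary = {
        "FCFS": lambda t: t.arrival,
        "ED": lambda t: t.deadline,
        "LS": lambda t: slack(t, now),
    }[policy]
    return sorted(drop_redundant(queue, now), key=lambda t: (primary(t), -t.clients))

queue = [
    PushTxn("XYZ", arrival=0, deadline=50, est_time=5, clients=3),   # superseded
    PushTxn("XYZ", arrival=1, deadline=40, est_time=5, clients=2),
    PushTxn("ABC", arrival=2, deadline=30, est_time=15, clients=1),
    PushTxn("DEF", arrival=3, deadline=30, est_time=5, clients=4),
    PushTxn("OLD", arrival=0, deadline=8, est_time=5, clients=9),    # tardy
]
order_ed = [t.obj for t in schedule(queue, now=10, policy="ED")]
order_ls = [t.obj for t in schedule(queue, now=10, policy="LS")]
```

Note that the superseding test is only meaningful for absolute updates, where a later value fully replaces an earlier one.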
3 Description of the Push Protocols
Push-Transactions: When an update to the database is to be communicated to requesting clients, a push-transaction is shipped to each one of them. A push-transaction specifies the operation to be performed on the copy of the database object resident in the client’s cache. Once a pushed transaction is received, the client executes it locally to make its copy of the database object consistent. Pushed transactions are scheduled with a higher priority than locally generated (regular) transactions as they have deadlines to meet. There are two basic types of push-transactions, absolute-updates and relative-updates. An example of an absolute-update would be to set the stock price of XYZ Inc. to $56 per share, while a relative-update would be like crediting $200 to the bank account of X. An absolute-update transaction can be pushed to clients at any time without risking global data consistency. However, to successfully execute relative updates, we need to ensure that all updates reach the interested clients in the order that they were performed. Therefore, in addition to the deadline information, push-transactions affecting the same object also have to be scheduled in the same relative order that the original updates were executed in. Although our protocols can be easily extended to handle relative-update transactions, for reasons of simplicity, we consider only absolute-update transactions in the following description of our protocol.
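The difference between the two kinds of push-transactions can be seen by replaying them on a cached copy. The sketch below is illustrative only: the tuple encoding `(kind, key, value)` and the helper names `apply_txn` and `replay` are assumptions made for this example.

```python
def apply_txn(txn, cache):
    """Re-execute one push-transaction on a client's cached copy."""
    kind, key, value = txn
    if kind == "absolute":
        cache[key] = value                      # overwrite with the new value
    elif kind == "relative":
        cache[key] = cache.get(key, 0) + value  # adjust the existing value
    else:
        raise ValueError(f"unknown push-transaction kind: {kind}")
    return cache

def replay(txns):
    """Apply a sequence of push-transactions to an empty cache."""
    cache = {}
    for txn in txns:
        apply_txn(txn, cache)
    return cache

set_balance = ("absolute", "acct_X", 100)
credit = ("relative", "acct_X", 200)

in_order = replay([set_balance, credit])      # the order the updates were performed in
out_of_order = replay([credit, set_balance])  # the same updates, delivered reordered

# An absolute update that has been superseded can be skipped without
# changing the final cached value.
both = replay([("absolute", "XYZ", 56), ("absolute", "XYZ", 52)])
latest_only = replay([("absolute", "XYZ", 52)])
```

Here `in_order` and `out_of_order` end with different balances, which is why relative updates to the same object must keep their original order, whereas `both` and `latest_only` agree.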
Push Predicates: Clients specify their data set of interest by providing logical predicates, which we call push predicates. The complete set of push predicates forms the rule base for the active update processing and is stored at the server. A client’s push predicates not only denote the data that is to be automatically updated, but also specify the conditions under which these updates should be pushed and the deadline within which the update should be visible at the client. This imposition of time constraints on the push mechanism dictates the use of real-time scheduling techniques. As an example, a user at a client computer (Client ID 34) may be interested in tracking the share price of XYZ Inc. To receive automatic updates of this information, the user can specify a push predicate to the server of the database system. A template for such a push predicate is given in Figure 2. It specifies that the stock price of XYZ Inc. is to be pushed to Client 34 every fifteen minutes or as soon as it goes below $8.50.
    PUSHPREDICATE example;
    CLIENT ID: 34;
    CONDITION: ((GetObject("XYZ Inc.").StockPrice < 8.5) OR (TimeSinceLastPush(15 minutes)));
    ACTION: Push Updated Stock Price Information;
    DEADLINE: Within 5 minutes of the CONDITION becoming TRUE;
Figure 2: Push predicate template
Any modification to an object changes the state of the database and may require that the updated data be pushed to a set of clients (via push-transactions) within specified time intervals. Therefore, the push-transaction has to be scheduled according to its deadline and the number of clients it affects. Our measure of performance is the percentage of such transactions that meet their deadlines. For the actual push mechanism, we propose two different schemes:
• Server–Push: In this scheme, an update at a client is first sent to the server. The server may then push the update to the clients that have requested it. Therefore, the complete set of client push predicates is stored only at the server. Once a client update reaches the server, the rule base is evaluated to determine whether this change to the database state may trigger any push-transactions. These push-transactions are transmitted to the requesting clients where they are executed in order to bring the clients’ copies of the database objects up-to-date (Figure 3). At any given time, the server may have several updates that need to be pushed to various groups of clients. These push-transactions need to be scheduled so that the number of deadlines that are missed is minimized. The strategies that we use (and evaluate) for this scheduling have been described in the previous section. As the evaluation of the rule base is done in a centralized manner (at the server only), the server DBMS needs to be modified significantly to perform such update refreshment in a traditional client-server environment. The client DBMS needs only minor modifications to accommodate the receipt and scheduling of pushed transactions. This protocol will clearly be beneficial when the server is better placed to deliver the updates than the client where the updates originated, e.g., when the network topology resembles a star with the server at the center.
[Figure 3: Server–Push. A client update is sent to the server, where the rule base and the push-transaction scheduler generate push-transactions that are sent to the clients.]
• Client–Push: In this scheme, the updating client pushes the update to the specified set of clients directly, and the push is performed with multiple point-to-point transfers. When a client is granted a lock on an object, the set of push predicates associated with the state of that object is shipped over to the client as well. The set of push predicates cached at a client forms its local rule base. After an update to a database object, the local rule base is evaluated to determine which clients’ predicates require the updates to be sent to them. The updating client first invalidates the data object stored at the server, and then pushes the updates to the interested set of clients directly. These pushes are assigned priorities according to a scheduling algorithm and transmitted in that order. The server is treated on a par with the other clients and an update-push is scheduled to be sent to the server. The Client–Push scheme is depicted in Figure 4. The advantage offered by this scheme is that the effort involved in pushing the updates is off-loaded from the server, giving it more processing power to serve non-push object requests and manage updates to the rule base of clients’ push predicates. In contrast to Server–Push, adapting a traditional CS-DBS to provide such update refreshment requires considerable modification of the clients’ DBMS (to include rule base evaluation and push-transaction scheduling). On the other hand, the database object server requires very little modification.
[Figure 4: Client–Push. The updating client evaluates its local rule base and, through its own push-transaction scheduler, sends push-transactions to the server and to the other clients.]
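The rule-base evaluation step shared by both schemes can be sketched in Python using the predicate of Figure 2 as the example. This is a hedged illustration, not the paper's implementation: `PushPredicate` and `evaluate_rule_base` are hypothetical names, times are in seconds, and `TimeSinceLastPush(15 minutes)` is interpreted here as "at least 15 minutes have elapsed since the last push to this client".

```python
from dataclasses import dataclass

@dataclass
class PushPredicate:
    client_id: int
    obj: str
    price_below: float      # push as soon as the price drops below this
    max_interval: float     # push if this much time passed since the last push
    deadline_window: float  # update must arrive within this time of triggering
    last_push: float = 0.0

    def condition(self, price, now):
        return price < self.price_below or (now - self.last_push) >= self.max_interval

def evaluate_rule_base(rule_base, obj, price, now):
    """After an update to `obj`, return (client id, absolute deadline)
    for every predicate whose condition now holds."""
    triggered = []
    for pred in rule_base:
        if pred.obj == obj and pred.condition(price, now):
            triggered.append((pred.client_id, now + pred.deadline_window))
            pred.last_push = now
    return triggered

# Figure 2 predicate: Client 34, XYZ Inc., price < 8.5 or 15 minutes
# since the last push, deadline 5 minutes after the condition holds.
rule_base = [PushPredicate(34, "XYZ Inc.", 8.5, 15 * 60, 5 * 60)]

no_push = evaluate_rule_base(rule_base, "XYZ Inc.", 9.0, now=60)
price_push = evaluate_rule_base(rule_base, "XYZ Inc.", 8.0, now=120)
```

In Server–Push this evaluation would run at the server over the full rule base; in Client–Push the updating client would run it over the predicates shipped with the lock. Because the sketch only evaluates predicates when an update arrives, a strictly periodic push would additionally need a timer.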
4 Experimental Evaluation of the Models
To evaluate our update push protocols, we have constructed a simulation model of our push protocols and implemented it using the CSIM simulation library. In this section, we describe the experimental setup and discuss the results of our simulation experiments. We mainly investigate the following key points: (i) the scalability of the two push schemes as the number of clients attached to the server, and hence the update and push load, increases, and (ii) the effect of the various scheduling policies on the performance of the two push techniques.
For our experiments, we used a database consisting of 10,000 objects. The database was resident at the server. The cache (disk and memory) of each client could accommodate up to 10% of the database. The database was divided into 10 disjoint partitions and the frequency of accesses to each partition was decided according to either a Normal or a Zipf distribution. The probability distributions used are shown in Figures 5 and 6. There was no restriction on client accesses to any portion of the database. Therefore, for a system with n clients, there could be n-way contention for each database object. The timing parameters and results are described in terms of CSIM time units. Transaction arrivals at each client were modeled as a Poisson process (with mean inter-arrival time 1000 CSIM time units). The load on the system was scaled upwards by increasing the number of clients attached to the server. Each locally generated client job could either be a query or an update. The processing time for every non-push (query or regular update) transaction was modeled as an exponential distribution with a mean of 25 time units. The number of objects requested by each transaction was exponentially distributed with a mean of 30. The percentage of regular client transactions that were updates was fixed at 10%. Database object requests by a regular update transaction were either Shared or Exclusive lock requests which were generated randomly according to the percentage of updates for the experiment (10%). The values of the parameters for each set of experiments are shown in Table 1.
[Figure 5: The database access pattern for each client according to a Normal distribution. Horizontal axis: partitions 1 to 10; vertical axis: the probability of accesses for each of the ten partitions of the database.]
[Figure 6: The database access pattern for each client according to a Zipf distribution. Horizontal axis: partitions 1 to 10; vertical axis: the probability of accesses for each of the ten partitions of the database.]
The push-transactions were scheduled according to the ED, LS and FCFS scheduling algorithms. In addition to these scheduling strategies, we used the CC (Client Count) criterion so that the push scheduler could propagate those updates which affected the largest number of clients first. The DRT (De-scheduling of Redundant Transactions) criterion was used to remove redundant and tardy transactions from the push schedules. We simulated the network by means of a CSIM process. The network delays in the transmission of data or requests were modeled as those in an Ethernet LAN [3]. The topology is depicted in Figure 7.
Since the deadline assignment plays an important role in the evaluation of a real-time system, we describe our deadline assignment model in detail. In a data-driven update protocol, an update to a data value has varying significance to different clients. With this in mind, we define a (client, probability, deadline) triplet for each database object. After an update to an object, the push scheduler evaluates the triplet for Client i, (i, p_i, d_i), and pushes an update transaction to Client i with probability p_i. If triggered, this push transaction has to reach its destination client within d_i time units after the update transaction commits. Thus, d_i is the time interval for which Client i recognizes the value of receiving the update. For each triplet (i, p_i, d_i), we generated the probability and the deadline randomly. The probabilities, p_i, were uniformly distributed (between 0 and 1). The deadlines were generated according to an exponential distribution which had the average update transaction length (25 CSIM time units) as its average. Hence, for most triplets, the deadlines are moderately tight. Figure 8 shows the distribution of the deadlines for the push-transactions.
[Figure 8: The distribution of push deadlines in CSIM units. Horizontal axis: length of deadlines in CSIM units; vertical axis: number of deadlines.]
The results that we present are: (i) the percentage of update-pushes that reached the clients within their specified deadlines in the two systems, (ii) the effect of the three scheduling strategies on the efficiency of the two push schemes, and (iii) the average response times for non-push object requests from clients. The results presented have been averaged over three separate runs for each experiment.
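The workload and deadline-assignment model described above can be sketched with Python's standard `random` module. This is only an approximation of the setup, not the CSIM model used in the experiments: the function names are hypothetical, and the Zipf exponent `s = 1.0` is an assumption, since the text does not state the skew parameter.

```python
import random

def zipf_partition_probs(n=10, s=1.0):
    """Access probability for each of n partitions under a Zipf law.
    The exponent s is an assumed value, not taken from the paper."""
    weights = [1.0 / (k ** s) for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def make_triplets(num_clients, mean_deadline=25.0, rng=random):
    """One (client, probability, deadline) triplet per client for an object:
    probability uniform in [0, 1), deadline exponential with the given mean."""
    return [
        (i, rng.random(), rng.expovariate(1.0 / mean_deadline))
        for i in range(num_clients)
    ]

def pushes_for_update(triplets, commit_time, rng=random):
    """After an update commits, push to client i with probability p_i;
    the push must arrive by commit_time + d_i."""
    return [(i, commit_time + d) for i, p, d in triplets if rng.random() < p]

def next_arrival(now, mean_interarrival=1000.0, rng=random):
    """Poisson arrivals: exponentially distributed inter-arrival times."""
    return now + rng.expovariate(1.0 / mean_interarrival)

rng = random.Random(1998)               # fixed seed for repeatability
probs = zipf_partition_probs()
triplets = make_triplets(num_clients=20, rng=rng)
pushes = pushes_for_update(triplets, commit_time=500.0, rng=rng)
```

Each simulated update then yields a list of (client, absolute deadline) pairs that a push scheduler would have to order.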
In the first set of experiments, we compare the efficiency of the three scheduling strategies for the Server–Push and Client–Push schemes when the distribution of accesses to the database conforms to the Normal distribution (Figure 5).
Table 1: Parameters for each set of experiments
Parameter                        Experiment Set 1             Experiment Set 2
Push Protocol                    Server       Client          Server       Client
Database Size                    10000        10000           10000        10000
Client Disk Cache Size           900          900             900          900
Client Main Memory Size          100          100             100          100
Scheduling Strategies            ED,LS,FCFS   ED,LS,FCFS      ED,LS,FCFS   ED,LS,FCFS
Access Distribution              Normal       Normal          Zipf         Zipf
Percentage of Updates            10%          10%             10%          10%
Average Number of Objects
Accessed by Each Transaction     30           30              30           30
[Figure 7: System topology. The server (server DBMS, DBMS buffers, lock tables, server disk, communication software) is connected over a LAN to the clients, each with application software, a client DBMS, client DBMS buffers, a client disk holding cached objects, and communication software.]