Coelho da Silva TL, Nascimento MA, de Macêdo JAF et al. Non-intrusive elastic query processing in the cloud. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 28(6): 932-947 Nov. 2013. DOI 10.1007/s11390-013-1389-2
Non-Intrusive Elastic Query Processing in the Cloud

Ticiana L. Coelho da Silva¹, Mario A. Nascimento², Senior Member, ACM, José Antônio F. de Macêdo¹, Member, ACM, Flávio R. C. Sousa¹, and Javam C. Machado¹
¹ Department of Computing, Federal University of Ceará, Fortaleza, Ceará, Brazil
² Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada
E-mail: [email protected]; [email protected]; [email protected]; {sousa, javam}@ufc.br

Received December 1, 2012; revised May 15, 2013.

Abstract    Cloud computing is a very promising paradigm of service-oriented computing. One major benefit of cloud computing is its elasticity, i.e., the system's capacity to provide and remove resources automatically at runtime. For that, it is essential to design and implement an efficient and effective technique that takes full advantage of the system's potential flexibility. This paper presents a non-intrusive approach that monitors the performance of relational database management systems in a cloud infrastructure and automatically makes decisions to maximize the efficiency of the provider's environment while still satisfying agreed-upon service level agreements (SLAs). Our experiments, conducted on Amazon's cloud infrastructure, confirm that our technique is capable of automatically and dynamically adjusting the system's allocated resources while observing the SLA.

Keywords    elasticity, query processing, non-intrusive, service level agreement

1 Introduction
A cloud computing platform typically consists of a very large number of computers responsible for data computing and storage[1]. Such massive computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to the users' demands[2]. We consider a cloud computing environment as a set of virtual machines (VMs) which may be assigned to different physical machines (PMs); different VMs may yield different performance due to many factors, e.g., concurrency with other processes on the PMs. Cloud computing elasticity enables the system to provide and remove resources according to the application's needs in real time[3]. However, providing adequate cloud elasticity on demand is not a trivial matter. A cloud computing environment is subject to several factors that may influence its performance, including the different types of virtual systems provided by the service, different availability zones, and demand variation[4]. Therefore, providing cloud computing elasticity requires closely monitoring (or predicting) the system's demand for resources in order to decide when to add or remove resources. This is a critical task,
given that under-provisioning or over-provisioning computing resources will likely negatively affect the cloud's user and/or provider. When users purchase computing time from a cloud provider, the user and the provider typically agree on the quality of service via a service level agreement (SLA), which may be composed of the following parameters[5]:
• revenue: monetary value paid by the user to the provider for the computing time;
• operating costs: monetary value paid by the provider for the computing resources allocated for processing the user's workload;
• service level objective (SLO): a user-defined metric which must be satisfied by the provider, e.g., response time, throughput, availability;
• penalty: monetary value paid by the provider to the user for not satisfying the SLO.
In this context, it is to the advantage of cloud providers to automatically monitor and scale their resources, e.g., VMs, in real time as a function of the current workload, in order to lower their computational cost while satisfying all current SLAs and minimizing penalties as much as possible. This is the very problem we address.
Regular Paper

The preliminary version of the paper was published in the Proceedings of the 4th International Workshop on Cloud Data Management.

©2013 Springer Science + Business Media, LLC & Science Press, China
In this paper we aim at continuously monitoring a DBMS's performance and automatically minimizing the number of VMs used for query processing while minimizing potential SLO violations. We use query processing time as the contracted SLO metric. Unlike other existing approaches, which we discuss in Section 5, our resource provisioning approach does not assume that the number of VMs is fixed[6] or that all VMs yield the same performance[7], nor do we assume that we can predict the workload (or that it is given beforehand)[8]. Moreover, we do not rely on user-defined rules for scaling the cloud resources allocated to a given workload up or down①②.

① Amazon CloudWatch, http://aws.amazon.com/pt/cloudwatch, April 2013.
② Amazon Auto Scaling, http://aws.amazon.com/pt/autoscaling, April 2013.

Consider a company which wishes to migrate its applications to a cloud environment in order to allocate computing resources according to demand (elasticity) while maintaining a quality of service on the applications' response time as the contracted SLO metric. Using our elastic solution to ensure both, the applications can be migrated to the cloud directly.

Our approach uses the relational data model and works with full (database-wise) replication; thus, each deployed VM has a DBMS with a complete copy of the database. Our solution does not partition the data, but the query: we apply virtual partitioning[9] to divide the query into subqueries to be processed on the allocated virtual machines. Full replication facilitates virtual partitioning, and we do not need to deal with rewriting the query or transferring data between the VMs during query processing, as we would with another type of replication, such as vertical or horizontal replication.

A preliminary version of this paper appeared in [10], where only select-range queries based on key attributes were investigated. This paper extends [10] by showing how to address select-range queries on non-key attributes as well as aggregation queries. We also extend the experimental analysis by investigating those types of queries using different distributions for the data and for the query arrival times. With this in mind, the main contributions of this paper are:
• a non-intrusive, automatic and adaptive performance monitoring technique for DBMSs on the currently allocated VMs;
• a pragmatic approach which dynamically aims at providing the smallest set of VMs capable of satisfying each query's SLO, and thus the user's SLA in general.
Our experimental results, using TPC-H data on Amazon's cloud, show that our approach is capable of dynamically adjusting the number of allocated VMs, using SLO satisfaction as a guideline, outperforming
uninformed or poorly estimated decisions on a static deployment.

Our previous work[10] only deals with range queries based on primary keys. Inspired by the microbenchmark proposed in [11], in this paper we present the solution and extensive results for range queries based on non-key attributes as well as aggregation queries. One of the main advantages of the approach we propose is that it may be easily applied to cases where the user already has applications using a relational database and wishes to deploy them in a cloud infrastructure, without re-architecting applications which are predominantly based on RDBMS technology.

The remainder of this paper is organized as follows. Section 2 defines our problem and gives an example of our strategy. Section 3 presents our architecture, our monitoring approach and our adaptive provisioning, and discusses our proposed algorithms. Experimental results are discussed in Section 4. Section 5 reviews related work, and in Section 6 we conclude our findings and offer directions for further research.

2 Problem Formulation
Let TR_Q denote the estimated total time needed to process a query Q and let SLO_Q be the agreed time to process a given query Q as per the SLA. If p is the cost, per unit of time, of failing to satisfy SLO_Q, we can define the provider's penalty (cost) as:

pp_Q = max{(TR_Q − SLO_Q) × p, 0}.    (1)

Let c_vm be the cost, per unit of time, for using a VM which contains a full replica of a relational database, and let n_Q(t) be the number of VMs allocated to execute Q at time t, where time is discretized in billable units (e.g., hours in the case of Amazon's cloud). Then, we can define the computing cost for running Q as:

cc_Q = Σ_{t=1}^{TR_Q} (n_Q(t) × c_vm).    (2)
Clearly, allocating as many VMs as possible will minimize the provider's penalty cost but will increase its computing cost. Likewise, allocating a single VM will minimize the provider's computing cost but will very likely yield a potentially large penalty cost. Hence, our problem can be defined as: given a query Q, obtain, for each (discretized) time point t, the number of VMs n_Q(t) minimizing Q's cost
pp_Q + cc_Q.    (3)
This problem definition captures the elasticity of the cloud environment, namely that at different points in time a different number of VMs may be sufficient to process a query with respect to its SLO. As argued earlier, this number may vary due to, for instance, higher (or lower) workloads on the VMs' actual physical machines or more (or fewer) queries being submitted to the VMs' DBMSs. Similarly to [12], we simplify our cost model by not including any deployment costs for VMs. We also assume that even though different VMs may yield different performance, they all have the same cost. Since our queries are read-only, there are no costs associated with updating replicated data, and we do not consider the costs associated with DBMS usage, i.e., we assume the cost of the VMs is the most significant one. We do not consider the costs of storing copies of the VM images and the persistent data store, namely Amazon's S3 (in the case of our experimental setup), because they are constant. Moreover, the cost of storing a VM image is much lower than the VM's cost③.

③ Amazon Web Services, http://aws.amazon.com/, April 2013.

2.1 Query Definition
In this work we focus on single-table queries: select-range and aggregation queries. [10] already defined select-range queries, but it only investigated the case where the range is over a primary key attribute; in this paper we also handle the case where the range is over any kind of attribute. A select-range query is defined as follows:

SELECT * FROM table T WHERE T.attr >= Vs and T.attr < Vf;

where T.attr denotes an attribute of table T and Vs and Vf are integer values. This type of query is generic enough to accommodate more specific single-table queries discussed in the microbenchmark proposed in [11]. For instance, scan queries:

SELECT * FROM table T;

can be trivially rewritten as the following two queries:

SELECT MIN(T.pk) into Vs, MAX(T.pk) into Vf FROM table T;
SELECT * FROM table T WHERE T.pk >= Vs and T.pk < Vf + 1;

where T.pk denotes the primary key attribute of T.

We also handle aggregation queries and aggregation over a range:

SELECT OPER(T.attr) FROM table T;

where OPER is an aggregate operator (SUM, for example) and T.attr denotes an attribute of a table T. In the following we offer a motivation for our proposed approach, which is detailed in the forthcoming subsections. Table 1 summarizes the notation used throughout this paper.

Table 1. Notation

Notation    Meaning
Q           A select-range query
Q_s         Q's selectivity (number of tuples to be retrieved)
SLO_Q       Maximum allowed time (s) to process Q
vm_i        VM i, i ∈ {1, 2, ..., m}
RR_i        vm_i's reading rate (tuples/s)
NT_i^Q      RR_i × SLO_Q, i.e., the number of tuples vm_i can read without violating Q's SLO
ST_i        (Slack) time that vm_i can further allocate without violating any SLO
p           Cost, per unit of time, of failing to satisfy SLO_Q
c_vm        Cost, per unit of time, for using a VM
H           Data table whose entries are 4-tuples ⟨partitioning attribute, query's selectivity, number of partitions, average processing time⟩
TM_i        Time spent to monitor vm_i's performance
S           An ordered set of pairs ⟨vm_i, ST_i⟩ such that ST_i > 0 and there is only one element for each vm_i
V           A set of vm_i allocated to process a query Q
T_remain    Time remaining to finish Q's processing without violating SLO_Q

2.2 Motivating Example

2.2.1 Select-Range Query

Assume that the following select-range query Q with SLO_Q is received by the cloud provider:

SELECT * FROM table T WHERE T.pk >= 0 and T.pk < 3000;

where T.pk is the primary key of table T. Assuming that T's primary key has no gaps, Q_s = 3000 tuples. Further, let us assume that SLO_Q is 100 seconds and that our initial provisioning is one single machine vm_0 such that RR_0 = 20 and consequently NT_0^Q = 2000. Clearly, using only vm_0 will yield a penalty to be paid by the provider, since reading all 3000 tuples takes 150 seconds: (TR_Q − SLO_Q) × p = (150 − 100) × p = 50 × p. It is wise, then, to bring another VM (vm_1) up to help. Let us assume that RR_1 = 10 and NT_1^Q = 1000.
At this point it may seem that those two VMs are enough to process Q and satisfy its SLO, by rewriting Q into the following two (sub)queries, Q1 and Q2, and running the first on vm_0 and the second on vm_1, respectively. Note that we use virtual partitioning (i.e., partitioning the range over the primary key) to divide Q into Q1 and Q2.

// Q1:
SELECT * FROM table T WHERE T.pk >= 0 and T.pk < 2000;
// Q2:
SELECT * FROM table T WHERE T.pk >= 2000 and T.pk < 3000;

Using only two VMs during the whole processing would be possible if the cloud computing environment were stable. In reality it is not: both VMs can change their performance considerably as time goes by. For instance, other processes can be started on the physical machines hosting the VMs, or other processes, even other queries, can be started on the VMs themselves. These will affect the VMs' performance and potentially make them violate SLO_Q. Any deviation in performance that may affect the provider's ability to satisfy SLO_Q must be proactively addressed, which requires continuous performance monitoring.

Our proposal to address this issue is to further partition queries so that the executing VMs' reading rates can be monitored often enough that other VMs can be added as needed in order to enforce SLO_Q. An obvious question is how often such monitoring should happen. If monitoring is too frequent, meaning the original queries have to be partitioned into too many sub-queries, the added overhead may hurt more than help; if it is done too seldom, it may be too late to make corrections and avoid potential penalties.

To guide the partitioning process we rely on historical data, i.e., how long it took to process a select-range query with a given selectivity using a certain number of partitions. We assume the existence of a table H for each VM where each entry is a 4-tuple: ⟨partitioning attribute, query's selectivity, number of partitions, average processing time⟩. Using H we can find the maximum number of partitions into which we can divide a query, thus allowing as-frequent-as-possible monitoring while still satisfying SLO_Q. For the sake of argument, let us assume that H has enough information so that the following entries can be found when the query's selectivity is set to 3000 and the partitioning attribute is pk:

H = { ⟨pk, 3000, 2, 72⟩, ⟨pk, 3000, 3, 80⟩, ⟨pk, 3000, 4, 95⟩, ⟨pk, 3000, 5, 120⟩ }.

In this case, the largest number of partitions we can use is 4, maximizing the chance to monitor Q's performance while still satisfying SLO_Q. Thus, we divide query Q1 into four queries:

SELECT * FROM table T WHERE T.pk >= 0 and T.pk < 500;
SELECT * FROM table T WHERE T.pk >= 500 and T.pk < 1000;
SELECT * FROM table T WHERE T.pk >= 1000 and T.pk < 1500;
SELECT * FROM table T WHERE T.pk >= 1500 and T.pk < 2000;

Even though it is not discussed here for the sake of brevity, a similar reasoning would be applied with respect to Q2, to be processed at vm_1. This partitioning methodology, using the primary key as the partitioning attribute, is the same for the select-range queries here and for the aggregation queries presented in the next subsection.

Let us assume that vm_1's performance is stable and that it is able to finish its workload as planned. When Q1,1 finishes we have the first opportunity to monitor, in a non-intrusive manner, the VM's performance. Let us assume that it actually took 50 seconds to finish. This leads us to reset that VM's reading rate to RR_0 = 10 (500 tuples in 50 seconds), which yields an expected completion time of 150 seconds for the three remaining sub-queries. This brings the expected completion time above SLO_Q and triggers a revision of the initial provisioning so that SLO_Q can still be satisfied. Note that before the initial provisioning is revised, the three remaining partitions (Q1,2, Q1,3 and Q1,4) are gathered into a single query.

Given vm_0's current reading rate, the best one can hope for is that it will be able to read only 500 of the remaining 1500 tuples in the 50 seconds left before SLO_Q is violated. The only chance of satisfying SLO_Q is to offload some of the query processing to another (newly allocated) VM. Let us assume that the new VM (vm_2) is such that RR_2 = 50. Then the remaining 1000 tuples can be read by vm_2 in 20 seconds in the best case, which does not lead to a violation of SLO_Q (recall that at this point 50 seconds have already been spent on Q1,1). Note that our partitioning is adaptive, and we may partition the query again if the system allocates a new number of machines.

In summary, our elastic system would process Q as follows: vm_0 would be used from time 0 to 100 to retrieve the tuples in the primary key range [0, 1000], vm_1 would also be used from time 0 to 100 to retrieve the tuples in the range [2000, 3000], and vm_2 would be used from time 50 to 70 to retrieve the tuples in the range [1000, 2000]. As a function of time, executing Q would require two VMs between times [0, 50], three VMs between times [50, 70] and again two VMs between times [70, 100], as shown in Fig.1. At the 100th second, the VMs were deallocated.
Fig.1. Variation in the number of nodes allocated by our approach.
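To illustrate the virtual partitioning step used above, here is a small sketch (our code, not the paper's; VirtualPartitioner and partitionRange are hypothetical names) that splits a primary key range into equal-width sub-queries:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative virtual partitioning of a select-range query over the primary
// key, in the spirit of the example above; names and layout are ours.
public final class VirtualPartitioner {

    /** Splits [lo, hi) on the primary key into `partitions` sub-queries of
     *  (roughly) equal width. */
    public static List<String> partitionRange(String table, String pk,
                                              long lo, long hi, int partitions) {
        List<String> subQueries = new ArrayList<>();
        long width = (hi - lo + partitions - 1) / partitions; // ceiling division
        for (long start = lo; start < hi; start += width) {
            long end = Math.min(start + width, hi);
            subQueries.add(String.format(
                "SELECT * FROM %s WHERE %s >= %d AND %s < %d;",
                table, pk, start, pk, end));
        }
        return subQueries;
    }

    public static void main(String[] args) {
        // Q1 of the running example: pk in [0, 2000), divided into 4 partitions.
        partitionRange("T", "pk", 0, 2000, 4).forEach(System.out::println);
    }
}
```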
One scenario not discussed above is when a VM's performance improves. For instance, suppose that query Q2 was partitioned into, say, three subqueries. It could be the case that after the first subquery finishes and the monitoring stage starts, one discovers that RR_1 has improved, possibly because some other processes on the VM have finished, and vm_1 can therefore finish its query ahead of schedule. As vm_1's reading rate improves, the total time to finish processing its queries is expected to be less than the SLO. Thus, vm_1 has slack time, which could be used "to help" vm_0 also finish its current workload while satisfying the SLO. This situation is handled by our approach, as detailed next. As the reader may have noted, a number of assumptions were made in the discussion above without proper argumentation; those will be discussed in the following as well.

In the discussion above the range is based on the primary key of the table. Clearly, not all range queries are based on primary keys, i.e., we also need to be able to address queries such as:

SELECT * FROM table T WHERE T.attr >= Vs and T.attr < Vf;

where T.attr denotes an attribute of table T and Vs and Vf are integer values. One obvious option is to try the same type of partitioning as done for the primary key. However, as opposed to the previous case, where an index can be trivially used, the partitions now yield query plans using a costly linear scan, and creating an index for each possible "queriable" attribute is not feasible. Thus, our proposed
solution is to rewrite the query to use the primary key for partitioning, as follows:

SELECT MIN(T.pk) into Vmin, MAX(T.pk) into Vmax FROM table T;
SELECT * FROM table T WHERE T.attr >= Vs and T.attr < Vf and T.pk >= Vmin and T.pk < Vmax + 1;

In this case, Q_s is computed over the primary key attribute. This is an over-estimation, because the selectivity of the range over T.pk is greater than or equal to the selectivity of the range over T.attr, and therefore it may allocate more VMs than necessary; but it is safe, as it avoids penalties, and it makes partitioning possible. Even if the range over T.attr is smaller than the range over T.pk, it is worth partitioning by T.pk, because then each partition does not perform a full table scan, as it would when partitioning by T.attr. To partition Q we use the same partitioning methodology presented before, and monitoring and provisioning virtual machines to process Q proceed in the same way presented previously.

It is important to note that all this monitoring and adjusting is done in a non-intrusive manner, i.e., the VMs and associated DBMSs are used exactly as delivered by their respective vendors and do not require any change.

2.2.2 Aggregation Query

Let us now discuss how to address aggregation queries using the query partitioning approach discussed above. Assume that the following single aggregation query Q, with SLO_Q equal to 100 seconds, is received by the cloud provider:

SELECT OPER(T.attr) FROM table T WHERE T.pk >= 0 and T.pk < 3000;

As in the previous example, Q is rewritten into two (sub)queries:

// Q1:
SELECT OPER(T.attr) FROM table T WHERE T.pk >= 0 and T.pk < 2000;
// Q2:
SELECT OPER(T.attr) FROM table T WHERE T.pk >= 2000 and T.pk < 3000;

We can rely on the same historical data H as in the previous example, since we use the same partitioning attribute. The largest number of partitions in this case is again 4, because it maximizes the chance to monitor Q's performance while still satisfying SLO_Q. Thus, we divide query Q1 into four queries:

SELECT OPER(T.attr) FROM table T WHERE T.pk >= 0 and T.pk < 500;
SELECT OPER(T.attr) FROM table T WHERE T.pk >= 500 and T.pk < 1000;
SELECT OPER(T.attr) FROM table T WHERE T.pk >= 1000 and T.pk < 1500;
SELECT OPER(T.attr) FROM table T WHERE T.pk >= 1500 and T.pk < 2000;

A similar reasoning would be applied with respect to Q2, to be run at vm_1. As in the previous example, we assume that vm_1's performance is stable and it is able to finish its workload as planned. After Q1,1 finishes we have the first opportunity to monitor the VM's performance in a non-intrusive manner. For the sake of brevity, suppose the same thing happens as in the previous example: Q1,1 takes 50 seconds to finish, therefore the VM's reading rate decreases to RR_0 = 10, which leads to an expected completion time of 150 seconds for the three remaining partitions (Q1,2, Q1,3 and Q1,4). This puts us above SLO_Q, and we can follow the same steps as in the previous example to revise the initial provisioning and perform the necessary (dynamic) provisioning so that SLO_Q can still be satisfied. Given vm_0's current reading rate, the best we can hope for is that vm_0 will read only 500 of the remaining 1500 tuples in the 50 seconds left before SLO_Q is violated. We therefore allocate a new VM (vm_2) with RR_2 = 50 to offload some of the query processing and satisfy SLO_Q. The remaining 1000 tuples can then be read by vm_2 in 20 seconds in the best case, which does not lead to a violation of SLO_Q (recall that at this point 50 seconds have already been spent on Q1,1). We have the same elastic behaviour (two VMs during [0, 50], three VMs during [50, 70], and again two VMs during [70, 100]) presented in Fig.1.

The fact that a single basic approach can be easily adapted and applied to different types of queries is a significant advantage. For instance, any optimization that can be done to the basic underlying (select-range) query can benefit other types of queries.

3 Our Adaptive Approach

3.1 Prototype Architecture

Fig.2. Architecture.

Our architecture, depicted in Fig.2, is composed of four modules: partition engine, monitoring engine, capacity planner and orchestration engine. The partition engine uses table H and is responsible for partitioning the query aiming at respecting the query's SLO. The monitoring engine is executed within each VM vm_i allocated to process a query Q and aims at making sure each VM keeps within the expected SLO. VMs can "request" and "offer" help. The partition engine and the monitoring engine form the core of our approach and are discussed in more detail in the next subsection. The capacity planner initially provisions a number of VMs to process a query Q within the agreed SLO_Q, minimizing the computational cost and penalty; it also makes decisions when the monitoring engine warns that SLO_Q is about to be violated. This is the subject of Subsection 3.3. The orchestration engine communicates with the capacity planner to obtain a provisioning, and with the partition engine to obtain the partitions, which it then hands to the monitoring engine.

3.2 Non-Intrusive Monitoring
In this subsection we describe the partitioning and monitoring methods used in our non-intrusive elastic query processing. These methods are implemented by two algorithms: Algorithm 1, responsible for query partitioning, is executed by a single VM, which hosts the partition engine; Algorithm 2, which monitors the processing of query partitions, is carried out at each virtual machine allocated to this process.

Algorithm 1: Partitioning Q
Input: H, Q, SLO_Q, V = {vm_0, ..., vm_m}
1   begin
2     foreach vm_i ∈ V do
3       Q_i ← subquery of Q with NT_i^Q selectivity;
4       n_i^Q ← query in H the maximum number of partitions for Q_i satisfying SLO_Q;
5       P_i ← divide Q_i into n_i^Q partitions;
6       monitoring(P_i, SLO_Q);
7     end
8   end
Our partitioning algorithm distributes a number of partitions to each available VM based on its performance. In a dynamic cloud environment, where the performance of each VM may vary over time, this is a complex task. Thus, our algorithm uses table H to choose an adequate number of partitions into which to divide a query Q. Recall that table H contains information about the maximum number of partitions into which we can divide a query Q while still satisfying SLO_Q. We assume that a sufficient number of samples for table H can be obtained offline for each database and query deployed in our system. In our setting, the average processing time stored in each entry of table H can be computed using PostgreSQL's command EXPLAIN ANALYZE ⟨query⟩. The execution time returned by this command is based on a query plan and is usually smaller than the real query response time. Thus it is possible that each VM gets more partitions to monitor than necessary to satisfy SLO_Q. However, with more partitions we can monitor more often, and hence have more opportunities to make corrections and better adapt to variations in the environment.
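As an illustration of how such samples could be collected, the sketch below is ours (the JDBC URL, credentials and the idea of averaging results into H are assumptions); it issues EXPLAIN ANALYZE through JDBC and picks out the measured time reported in the command's textual output:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative sketch of collecting one timing sample for table H via
// EXPLAIN ANALYZE. Only the EXPLAIN ANALYZE behavior follows PostgreSQL's
// documentation (it runs the query and reports measured times in its text
// output); everything else here is an assumption of ours.
public final class HSampler {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost/tpch", "user", "password");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "EXPLAIN ANALYZE SELECT * FROM orders " +
                 "WHERE o_orderkey >= 0 AND o_orderkey < 500")) {
            while (rs.next()) {
                String line = rs.getString(1);
                // PostgreSQL reports the measured time in the final output
                // lines, e.g., "Total runtime: 42.017 ms" on the 9.1 series.
                if (line.contains("runtime") || line.contains("Execution Time")) {
                    System.out.println(line); // parse and average into H offline
                }
            }
        }
    }
}
```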
Algorithm 1 details how partitions are created from a query Q taking into account SLO_Q, the performance of each vm_i ∈ V, and the data table H. It works as follows. For each vm_i allocated to process Q, the partition engine rewrites Q into a (sub)query Q_i (line 3) whose selectivity equals the number of tuples that vm_i can read without violating SLO_Q, i.e., NT_i^Q. Q_i is then partitioned using H by constructing the largest set of partitions P_i to be processed by vm_i such that the average processing time of P_i does not exceed SLO_Q (lines 4 and 5). How to obtain the value NT_i^Q for each vm_i is discussed in the next subsection. After the partitions P_i are built, Algorithm 2 is applied inside each vm_i (line 6).

Algorithm 2: Monitoring
Input: P_i, SLO_Q
1   begin
2     P_s ← selectivity(P_i);
3     while P_i ≠ ∅ do
4       q ← P_i.remove();
5       T_start ← timer();
6       TP ← execute(q);
7       T_end ← timer();
8       T_q ← T_end − T_start;
9       T_spent ← T_spent + T_q;
10      P_s ← P_s − |TP|;
11      T_estimated ← P_s / RR_i;
12      TR_Q ← T_spent + T_estimated + TM_i;
13      if TR_Q − SLO_Q > 0 then
14        T_remain ← SLO_Q − (T_spent + TM_i);
15        return makeDecision(vm_i, P_i, T_remain);
16      else if TR_Q − SLO_Q < 0 then
17        ST_i ← SLO_Q − TR_Q;
18        hasSlackTime(ST_i);
19    removeVM(vm_i);
At each vm_i, Algorithm 2 monitors the VM's reading rate and estimates how long it will take to finish all partitions in P_i. For each partition in P_i (line 4), Algorithm 2 calculates the time spent (line 8) to process the partition (line 6). It is then possible to know how many tuples of P_i remain to be retrieved (line 10), and also the estimated time to do so (line 11). This estimated time is based on the VM's reading rate, i.e., RR_i.
If the estimated total time needed to process Q (line 12) is greater than SLO_Q (line 13), SLO_Q may be violated. Note that we also consider the time TM_i spent monitoring vm_i's performance. Hence, continuing to process P_i only with vm_i would yield a penalty. In order to use the remaining time (line 14) to process the remaining partitions in P_i without violating SLO_Q, a decision has to be made (line 15); at this point P_i is recomputed for vm_i using Algorithm 4, explained in the next subsection. Algorithm 2 also copes with the situation where SLO_Q can be satisfied faster than expected (line 16). In such a case, we may use the slack time (line 17) to process other incoming query partitions or to decrease the number of allocated VMs. ST_i represents the slack time of vm_i that can be further allocated without violating SLO_Q. The VMs with slack time are used by the provisioning algorithms, explained in the next subsection, to relieve overloaded machines by reusing available resources. The selectivity of each partition is calculated according to the selectivity of the range over the partitioning attribute; recall that in all queries we chose the primary key as the partitioning attribute.
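A minimal Java rendering of Algorithm 2's loop is sketched below; this is our illustration, where executePartition, makeDecision, hasSlackTime and removeVM are assumed hooks into the other modules:

```java
import java.util.Deque;

// Our illustrative rendering of Algorithm 2's monitoring loop for one vm_i.
// The four abstract methods are assumed hooks; we also assume each executed
// partition returns at least one tuple so the observed rate is positive.
public abstract class PartitionMonitor {
    protected abstract long executePartition(String sql); // returns tuples read
    protected abstract void makeDecision(double tRemain, Deque<String> remaining);
    protected abstract void hasSlackTime(double slack);
    protected abstract void removeVM();

    public void monitor(Deque<String> partitions, long selectivity,
                        double sloQ, double tMonitor) {
        double tSpent = 0.0;
        long tuplesLeft = selectivity;
        while (!partitions.isEmpty()) {
            String q = partitions.remove();                // line 4
            long start = System.nanoTime();
            long read = executePartition(q);               // line 6
            double tq = (System.nanoTime() - start) / 1e9; // seconds (lines 5-8)
            tSpent += tq;                                  // line 9
            tuplesLeft -= read;                            // line 10
            double readingRate = read / tq;                // observed RR_i
            double tEstimated = tuplesLeft / readingRate;  // line 11
            double trQ = tSpent + tEstimated + tMonitor;   // line 12
            if (trQ > sloQ) {                              // SLO about to be violated
                makeDecision(sloQ - (tSpent + tMonitor), partitions);
                return;
            } else if (trQ < sloQ) {
                hasSlackTime(sloQ - trQ);                  // advertise slack ST_i
            }
        }
        removeVM();
    }
}
```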
3.3 Dynamically Provisioning

In this subsection we present a method for dynamically provisioning VMs to satisfy the queries' SLOs and to minimize the computational cost, taking into account the workload and the variations in the VMs' performance. To achieve this goal, we propose two algorithms. The first computes the initial number of VMs necessary before query processing starts. During query processing, the second algorithm may be called in order to allocate more VMs to aid a virtual machine that is expected to violate SLO_Q; indeed, this algorithm is called at the monitoring stage, as we saw in the previous subsection, where in Algorithm 2 a decision has to be made in order to continue query processing without penalty.

Algorithm 3 implements the initial provisioning approach. Its purpose is to compute the smallest set V of virtual machines (V = {vm_0, ..., vm_n}) that should initially be devoted to processing Q while satisfying SLO_Q. For each allocated vm_i it is necessary to know the number of Q's tuples that vm_i can process without violating SLO_Q, i.e., NT_i^Q, which is computed from vm_i's reading rate (RR_i) within SLO_Q. Algorithm 3 computes Q's selectivity and the adequate number of tuples (NT_i^Q) that should be delivered to each vm_i ∈ V. We can obtain the query's selectivity by using the query plan statistics presented by the DBMS; for example, in PostgreSQL and MySQL, the command EXPLAIN ⟨query⟩ can be used to obtain such information.

Algorithm 3: Initial Provisioning
Input: Q, SLO_Q
1   begin
2     V ← ∅;
3     Q_s ← selectivity(Q);
4     while Q_s > 0 ∧ S ≠ ∅ do
5       ⟨vm_i, ST_i⟩ ← S.remove();
6       if ST_i > SLO_Q then
7         NT_i^Q ← SLO_Q × RR_i;
8       else
9         NT_i^Q ← ST_i × RR_i;
10      Q_s ← Q_s − NT_i^Q;
11      V ← V ∪ {vm_i};
12    while Q_s > 0 do
13      vm_i ← new();
14      NT_i^Q ← SLO_Q × RR_i;
15      Q_s ← Q_s − NT_i^Q;
16      V ← V ∪ {vm_i};
17    partitioningQ(H, Q, SLO_Q, V);

We aim at taking advantage of the VMs' slack time. Let S be a set of pairs ⟨vm_i, ST_i⟩ containing the slack time ST_i of each vm_i, such that there is only one element per vm_i and ST_i is greater than 0 for all elements; S is ordered by descending ST_i. Algorithm 3 finds in S the largest set of vm_i whose ST_i can be dedicated to processing Q. Choosing the largest set allows the provider to serve more users with fewer computing resources (VMs) while meeting the SLA. The choice is made greedily, following the order of S. Algorithm 3 calculates Q's selectivity (line 3). While S ≠ ∅ (line 4), the algorithm removes the first vm_i from S (line 5) and computes the number of tuples vm_i can retrieve: if ST_i > SLO_Q, then NT_i^Q is calculated based on SLO_Q and RR_i (line 7); otherwise, NT_i^Q is computed based on ST_i and RR_i (line 9). The second loop in Algorithm 3 (line 12) is responsible for distributing the tuples remaining in Q to new VMs (line 13). When instantiating a new virtual machine vm_i, the algorithm calculates its NT_i^Q (line 14) using the VM's reading rate (RR_i) and the value of SLO_Q. Algorithm 3 terminates when there are no tuples of Q left to be distributed. After that, our partitioning strategy (Algorithm 1, discussed in the previous subsection) is called to monitor the performance of each provisioned VM during Q's execution (line 17).
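The greedy allocation at the heart of Algorithm 3 (and reused, with T_remain in place of SLO_Q, by Algorithm 4 below) can be sketched in Java as follows; this is our illustration, with VmInfo and newVm as hypothetical helpers:

```java
import java.util.ArrayList;
import java.util.List;

// Our illustrative rendering of Algorithm 3's greedy allocation. VmInfo and
// newVm() are assumed helpers; slackOrdered stands for S, sorted by
// descending slack, and reading rates are assumed positive.
public final class InitialProvisioner {
    public static final class VmInfo {
        final double readingRate; // RR_i (tuples/s)
        final double slackTime;   // ST_i (s)
        long assignedTuples;      // NT_i^Q
        VmInfo(double rr, double st) { readingRate = rr; slackTime = st; }
    }

    /** Returns the set V of VMs that should process Q, given Q's selectivity. */
    public static List<VmInfo> provision(long selectivity, double sloQ,
                                         List<VmInfo> slackOrdered) {
        List<VmInfo> v = new ArrayList<>();
        long qs = selectivity;
        // First reuse VMs with slack time, largest slack first (lines 4-11).
        for (VmInfo vm : slackOrdered) {
            if (qs <= 0) break;
            double usable = Math.min(vm.slackTime, sloQ);
            vm.assignedTuples = (long) (usable * vm.readingRate); // NT_i^Q
            qs -= vm.assignedTuples;
            v.add(vm);
        }
        // Then instantiate new VMs for the remaining tuples (lines 12-16).
        while (qs > 0) {
            VmInfo vm = newVm();
            vm.assignedTuples = (long) (sloQ * vm.readingRate);
            qs -= vm.assignedTuples;
            v.add(vm);
        }
        return v; // the caller then partitions Q across V (Algorithm 1)
    }

    private static VmInfo newVm() {
        // Assumption: a freshly started VM's reading rate is estimated elsewhere.
        return new VmInfo(20.0, Double.MAX_VALUE);
    }
}
```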
3.4 Decision Making

Recall that in Algorithm 2 we have to make a decision in order to continue query processing while satisfying the SLO. We therefore propose an elastic solution (described in Algorithm 4) which dynamically provisions VMs. At the monitoring stage, Algorithm 4 is called for each vm_i that may violate SLO_Q. Suppose that vm_i has a set of remaining partitions P_i which should be processed within T_remain units of time to avoid violating SLO_Q. Algorithm 4 recalculates the number of VMs needed to help vm_i finish processing P_i within T_remain, aiming at satisfying SLO_Q. First, Algorithm 4 recalculates how many tuples vm_i may retrieve while satisfying SLO_Q. After that, it provisions a new set of VMs and allocates to each of them a number of tuples to be processed.

Algorithm 4: MakeDecision
Input: vm_i, P_i, T_remain
1   begin
2     V ← ∅;
3     Q' ← gatherPartitions(P_i);
4     Q'_s ← selectivity(Q');
5     NT_i^Q' ← T_remain × RR_i;
6     Q'_s ← Q'_s − NT_i^Q';
7     V ← V ∪ {vm_i};
8     while Q'_s > 0 ∧ S ≠ ∅ do
9       ⟨vm_j, ST_j⟩ ← S.remove();
10      if ST_j > T_remain then
11        NT_j^Q' ← T_remain × RR_j;
12      else
13        NT_j^Q' ← ST_j × RR_j;
14      Q'_s ← Q'_s − NT_j^Q';
15      V ← V ∪ {vm_j};
16    while Q'_s > 0 do
17      vm_j ← new();
18      NT_j^Q' ← T_remain × RR_j;
19      Q'_s ← Q'_s − NT_j^Q';
20      V ← V ∪ {vm_j};
21    partitioningQ(H, Q', T_remain, V);

In Algorithm 4, at line 3, the remaining partitions of P_i are gathered into a new query Q'. Then NT_i^Q' is computed, i.e., the number of tuples of Q' that vm_i has to process. Next, the same idea as in Algorithm 3 is applied: find in S the largest set of VMs whose slack time can be devoted to processing Q' (lines 8∼15) and distribute Q''s tuples to each of these VMs. If there are still tuples to be processed in Q' and S = ∅ (line 16), we have to allocate new VMs (line 17). When instantiating a new machine vm_j, the algorithm calculates its NT_j^Q' (line 18) using the VM's reading rate (RR_j) and the value of T_remain. Algorithm 4 terminates when there are no more tuples of Q' to be distributed. After that, our partitioning strategy (Algorithm 1, discussed in the previous subsection) is called to monitor the performance of each provisioned VM during Q''s execution, trying to ensure that Q' executes within T_remain (line 21).

④ Amazon Web Services, http://aws.amazon.com/, April 2013.
⑤ TPC-H Benchmark, http://www.tpc.org/tpch/, April 2013.

4 Evaluation
We implemented a prototype of our strategy in Java, which runs on Amazon's EC2 cloud infrastructure. Each machine in our system runs on a small instance of EC2, i.e., the environment is homogeneous, but recall that the VMs' individual performance may vary over time. Each instance is a virtual machine with a 2.4 GHz Xeon processor, 1.7 GB of memory and 160 GB of disk capacity. We created an Amazon Machine Image (AMI) with the DBMS and the agent, which allows starting a new VM quickly. The Amazon Elastic Block Store (EBS) was used for storing the AMI at low cost④. With this setup, instantiating a VM on demand can be done quickly, i.e., there is no need to have a pool of "reserved" VMs, thus fully exploiting the elasticity of the cloud environment. We use the Ubuntu 11.10 operating system and the PostgreSQL 9.1 DBMS.

We used TPC-H to construct the dataset, in particular TPC-H's scale factor 8⑤, which gave us a database of approximately 13 GB. The dataset was stored in the DBMS and fully replicated in each VM. In these experiments the queries are of the select-range and aggregation types. All queries have the same SLO values and selectivities, namely 2 million tuples. The idea is not to have the queries themselves interfering with the query processing time, but rather the system's resource allocation. We used the orders table of the TPC-H benchmark, and its primary key (o_orderkey) was used as the partitioning attribute. We stressed the system with a series of seven query bursts. Each burst contains between 8 and 13 queries, and the arrival times follow either a Poisson distribution with λ = 60[14] or a uniform distribution with one burst arriving every 30 seconds.
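For concreteness, burst arrival times under the Poisson assumption can be generated from exponential inter-arrival gaps; the sketch below is ours, and interpreting λ = 60 as the mean gap in seconds between bursts is our assumption about the setup:

```java
import java.util.Random;

// Our illustrative generator of burst arrival times: exponentially distributed
// inter-arrival gaps yield a Poisson arrival process. Treating lambda = 60 as
// the mean gap in seconds between bursts is an assumption.
public final class BurstArrivals {
    public static double[] poissonArrivals(int bursts, double meanGapSeconds, long seed) {
        Random rng = new Random(seed);
        double[] arrival = new double[bursts];
        double t = 0.0;
        for (int i = 0; i < bursts; i++) {
            // Inverse-transform sampling of an exponential inter-arrival gap.
            t += -meanGapSeconds * Math.log(1.0 - rng.nextDouble());
            arrival[i] = t;
        }
        return arrival;
    }

    public static void main(String[] args) {
        for (double t : poissonArrivals(7, 60.0, 42L)) {
            System.out.printf("burst at %.1f s%n", t);
        }
    }
}
```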
We then used different SLOs, ranging from a stricter one to a more relaxed one. Recall that the SLO is a parameter given by the user at query time; the only reason we use different values is to investigate the robustness of our approach. Note that with a stricter SLO it is reasonable to expect that queries from one burst will still be executing when queries from the next burst arrive. This allows us to precisely investigate whether our solution is elastic, i.e., whether it acquires and uses more resources when needed and releases them when not. Each sequence of bursts was repeated five times, and in the following we report the maximum, minimum and average number of VMs needed as a function of time over all repetitions. We rounded the average number of VMs to report an integer value. Finally, in order to eliminate possible interference between subsequent test runs (sequences of bursts), in particular caching effects, we cleaned the OS caches and restarted the DBMS before each test run. We measured the computational cost by summing the number of VMs in use every 60 seconds for the experiments using the Poisson distribution and every 30 seconds for those using the uniform distribution.

4.1 Select-Range Query
For select-range queries with the range over a non-primary key attribute, we used the attribute o_totalprice of the orders table, and the range boundaries for each query were generated randomly from among the values stored in the database for o_totalprice, such that each query's selectivity is approximately 2 million tuples. In Figs. 3 and 4 we present the experimental results for select-range queries over a non-primary key attribute. We used different SLOs, from a stricter one (80 seconds) to a more relaxed one (120 seconds). We note that even though the query load was the same for all test runs, several factors outside our control, e.g., other processes running on the physical machines hosting the used VMs, contributed to the variation in performance[4]. We show the maximum, minimum and average values obtained in the experiments. We also randomly generated the ranges for select-range queries over the primary key attribute; in this case we used o_orderkey, the primary key of the orders table. In Figs. 5 and 6 we present the experimental results for select-range queries over the primary key attribute.
Fig.3. Number of nodes with different SLO values in Poisson distribution using select-range over non-primary key attribute on range. (a) 80 s. (b) 100 s. (c) 120 s.

Fig.4. Number of nodes with different SLO values in uniform distribution using select-range over non-primary key attribute on range. (a) 80 s. (b) 100 s. (c) 120 s.
Fig.5. Number of nodes with different SLO values in Poisson distribution using select-range over primary key attribute on range. (a) 60 s. (b) 80 s. (c) 100 s.

Fig.6. Number of nodes with different SLO values in uniform distribution using select-range over primary key attribute on range. (a) 60 s. (b) 80 s. (c) 100 s.

In these experiments we could be stricter with the SLO values than in the previous experiments, because the response times of the queries used here are lower than those of the queries used previously. Under the strictest setting each query had an SLO of 60 seconds, and thus less time to be processed; under the most relaxed setting each query had 100 seconds to finish. Clearly, our approach dynamically increases and decreases the number of VMs used during the processing of the select-range query bursts, reflecting the intended adaptivity to load increases and decreases while using the queries' SLOs as a guideline.

Tables 2 and 3 show the average computing cost of the test runs as a function of the provided SLO. In Table 2, as one would expect, the computing cost for a stricter SLO value is greater than or equal to that for a more relaxed value, in order to avoid potential penalties. The same pattern can be observed in Table 3.

Table 2. Comparing the Execution of Our Approach with Varying SLO Values for Select-Range Queries over Non-Primary Key Attribute in Poisson and Uniform Distribution

Distribution   SLO's Value (s)   Computing Cost (c_vm)
Uniform        80                83
Uniform        100               74
Uniform        120               72
Poisson        80                93
Poisson        100               81
Poisson        120               73
Note: In all the cases our approach did not yield any penalty.

Table 3. Comparing the Execution of Our Approach with Varying SLO Values for Select-Range Queries over Primary Key Attribute in Poisson and Uniform Distribution

Distribution   SLO's Value (s)   Computing Cost (c_vm)
Uniform        60                62
Uniform        80                61
Uniform        100               61
Poisson        60                78
Poisson        80                69
Poisson        100               61
Note: In all the cases our approach did not yield any penalty.

The computational cost is higher under the Poisson distribution than under the uniform distribution because new query bursts arrive while the remaining bursts are still being processed; therefore, more VMs are allocated and the computational cost increases. As for select-range queries over a non-primary key attribute, a new range between the lowest and highest values of the primary key is added, and all tuples in the table are checked against the range over the non-primary key attribute. Hence, in these experiments, select-range queries over a non-primary key attribute require more time to be processed than select-range queries over the primary key attribute. With a greater response time the VMs are allocated for longer, and consequently a potentially higher computational cost is required to process select-range queries over a non-primary key attribute, as can be seen in Table 2.

4.2 Aggregation Query
In the following experiments we chose aggregation queries over a range. The range is over the attribute o_orderkey, the primary key of the orders table. In fact, we used the same queries as in the experiments with select-range over the primary key attribute, only changing the SELECT clause from "*" to AVG(o_totalprice). As we used TPC-H to construct the dataset, and as we aimed to keep the same query selectivity and the same SLO values as in the previous experiments, we could not create bursts with aggregation queries without a range: the range limits the selectivity and enables us to create seven bursts of 8 to 13 queries, varying the queries' SLO from 60 seconds to 100 seconds as in the previous experiments.

The experiments with aggregation queries are presented in Figs. 7 and 8. Note that we again varied the SLO value, from a stricter one (60 seconds) to a more relaxed one (100 seconds). We observed the same behaviour as in the select-range experiments: when the number of queries in the workload increases or decreases, the number of allocated virtual machines increases or decreases accordingly. Table 4 shows the computing cost of the test runs with aggregation queries. These experiments did not yield any penalty and confirm what we expected: for a stricter SLO value, the computing cost is greater than or equal to that for a more relaxed value, in order to avoid potential penalties. In these experiments, the aggregation queries with a range over the primary key attribute have a lower response time than the select-range queries over the primary key attribute, i.e., we could have been even stricter with the SLO values for the aggregation-query experiments.
Fig.7. Number of nodes with different SLO values in Poisson distribution using aggregation query. (a) 60 s. (b) 80 s. (c) 100 s.
Fig.8. Number of nodes with different SLO values in uniform distribution using aggregation query. (a) 60 s. (b) 80 s. (c) 100 s.
Table 4. Comparing the Execution of Our Approach with Varying SLO Values for Aggregation Queries in Uniform and Poisson Distribution

Distribution   SLO's Value (s)   Computing Cost (c_vm)
Uniform        60                58
Uniform        80                55
Uniform        100               45
Poisson        60                73
Poisson        80                58
Poisson        100               50
Note: In all the cases our approach did not yield any penalty.
The select-range queries over the primary key attribute took longer because it is necessary to retrieve the values of all attributes present in the SELECT clause (a large number of bytes), so more I/O operations are performed; we believe there is an overhead in bringing these data into memory. The aggregation queries used in these experiments retrieve only the aggregated value of o_totalprice.

In order to assess the value of our elastic approach we also ran tests where the number of allocated VMs was fixed. The idea is that a user would normally have to "guess" the number of VMs to be used, risking either underestimating (saving on VM costs but likely paying a penalty for violating some queries' SLOs) or overestimating (paying for more VMs than necessary). Table 5 shows the average number of machines chosen by our strategy during the experiments for select-range and aggregation queries, where the SLO for all queries is the strictest one (80 seconds for select-range over a non-primary key attribute, and 60 seconds for select-range over the primary key attribute and for aggregation queries).

Table 5. Comparing the Average Number of VMs Used by Our Approach with the Optimum Number of VMs

Distribution   Query Type                  Average Allocated   No-Penalty Minimum
Poisson        Select-range over PK        11                  7
Poisson        Select-range over non-PK    13                  9
Poisson        Aggregation                 10                  7
Uniform        Select-range over PK         8                  7
Uniform        Select-range over non-PK    10                  9
Uniform        Aggregation                  8                  7
Note: PK: primary key.
But making the correct choice about the number of VMs is guesswork. We only know the minimum number of machines required because we ran other tests beforehand using fewer machines, all of which incurred penalties. Note that these minimum numbers of machines hold for one workload configuration; if other queries arrived at our system, we would have to run exhaustive tests again. Users might instead rely on automated query execution time prediction techniques, but state-of-the-art techniques either present significant error margins or require in-depth runtime information that is not always available[7]. If the cloud provider tries to guess using an under-provisioning strategy, he/she pays less for the VMs, but a penalty may have to be paid for not satisfying some queries' SLOs. On the other hand, with over-provisioning, another guess, there is no penalty, but the computational cost can be higher. The results yielded by our approach show an elastic solution which reacts to the variations of the environment and to different sizes of query bursts. Our solution ensured that the SLOs were satisfied, without having to guess, in a non-intrusive and automatic manner. Moreover, the penalty could have dominated the cost of workload execution and incurred a much higher cost, whereas using our approach in these experiments no penalties had to be paid.

5 Related Work
Alves et al.[7] proposed FloodDQ, a MapReduce-based system that adaptively increases and decreases computing power at runtime towards completing execution within a specified deadline. It restricts its scope to a single pipeline of queries. The number of nodes to be added or removed is calculated based on the rate of data processing, assuming that all nodes have the same processing capability. This strategy is similar to ours, but we do not monitor on a fixed time schedule, our partitioning strategy for monitoring is adaptive, we assume that nodes may have different performance, and we present an algorithm for initial provisioning.

An adaptive approach for provisioning VMs for distributed stream processing systems (DSPSs) in the cloud is presented in [15]. The proposed provisioning algorithm uses a black-box approach, i.e., it is independent of the specifics of the queries running in the DSPS. It scales the number of VMs used solely based on measurements of input stream rates. It detects an overload condition when the processing rate of input data decreases because of data tuples discarded due to load shedding. The algorithm is invoked periodically and calculates the new number of VMs needed to support the current workload demand; however, the paper does not specify how often the algorithm is invoked. We do not focus only on resource provisioning: our contribution also includes adaptive monitoring, and re-provisioning if necessary, during query processing.

The authors of [8] proposed Kingfisher, a cost-aware system that tries to minimize the customer-centric cost,
i.e., the cost of renting servers, while meeting the application's SLA. It solves an integer linear program accounting for both infrastructure and transition costs to derive appropriate elasticity decisions under each workload change. Kingfisher uses a proactive approach to decide when to provision, as well as an ideal workload predictor that uses statistics gathered by its monitoring engine to estimate future workload. Our approach differs from this in many ways: we take a cloud-provider-centric approach, and we do not assume a workload predictor; rather, the system's performance is continuously monitored.

In [6] the authors present a framework for resource provisioning that identifies a set of minimum-cost infrastructure resources to guarantee a target QoS. The paper describes intrusive solutions to the resource provisioning problem: black-box and white-box. Black-box provisioning profiles the performance and cost of different VM types under varying query input rates using sample executions; the aim is to capture the input rate that each VM can support without violating the QoS of the queries it executes. White-box provisioning estimates the resource requirements of each target workload using statistics from the database optimizer; these data are used to solve a multidimensional bin-packing problem. Since public-cloud VMs may have variable performance, and since neither method is guaranteed to generate optimal solutions, monitoring during workload execution, and possibly decision making, is required so that the QoS is guaranteed. The paper does not present a monitoring strategy as our work does. Our solution is also non-intrusive, which is an important practical aspect.

The problem of provisioning resources in a public cloud to execute data analytic workloads is examined in [12]. The algorithm presented explores the space of possible configurations (a set of different types of VMs and a mapping of the query classes to VMs) for the input workload based on predicted configuration costs, trying to find a configuration where resource costs are minimized while the SLA associated with the workload is met. The cost model presented is similar to ours. However, considering that the performance of a particular configuration can degrade and SLAs can be violated, it might be necessary to change the resources allocated to the application; the paper does not present a dynamic provisioning approach as we have done in our work.

Kairos is presented in [16]. It is a system that offers a consolidation scheme for over-provisioned database servers. It uses monitoring techniques and resource models to measure the hardware requirements of
database workloads and to predict the combined resource utilization of those workloads. The work aims at minimizing the number of servers and improving load balance while observing an SLA. While its goal of reducing operational cost is similar to ours, Kairos differs from our work in that it does not perform full-range (increase and decrease of resource allocation) dynamic provisioning as our approach does.

The work in [17] presents an adaptive method to optimize the response time of range queries in a distributed database. The algorithm partitions each query and adaptively identifies the best level of parallelism for it, since choosing the maximum level of parallelism is not necessarily the best strategy to optimize a query's performance: if a query is sent to too many storage hosts, it can saturate a single client by returning results faster than the client can consume them. That work, similarly to ours, proposes an adaptive provisioning algorithm for range queries and considers possible variations in VM performance, but it differs from our work in that it does not have an SLA to observe and does not specify how often the algorithm is invoked.

Finally, in [10] we focused only on select-range queries with a range over a primary key attribute. This work extends that one to cover further types of queries, namely aggregation queries and select-range queries over non-primary key attributes.

6 Conclusions and Future Work
In this paper we proposed a cloud-based, non-intrusive, automatic and adaptive performance monitoring technique for DBMSs for select-range and aggregation queries, as well as an approach that dynamically minimizes the number of VMs needed to satisfy the queries' SLOs. Our experimental results confirm that our technique is capable of dynamically adjusting to the system's and the VMs' load while observing the queries' SLAs, minimizing the computational cost and outperforming uninformed decisions on a static deployment.

Future work will focus on queries that use joins. We also intend to conduct more experiments and address costs related to I/O and storage. Another concern for future work is to investigate the influence of parameters beyond the reading rate of each VM, such as the CPU and effective memory available in each VM.

Acknowledgement    This research was performed while the first author, supported by a CAPES scholarship, was visiting the University of Alberta (under the auspices of DFAIT's Emerging Leaders in the Americas Program, ELAP). The use of Amazon's cloud computing environ-
ment was possible thanks to an Amazon AWS Research Grant. Mario A. Nascimento is also a visiting professor at the Federal University of Ceará.
References

[1] Zhao J, Hu X, Meng X. ESQP: An efficient SQL query processing for cloud data management. In Proc. the 2nd International Workshop on Cloud Data Management, Oct. 2010, pp.1-8.
[2] Mell P, Grance T. The NIST definition of cloud computing. NIST Special Publication 800-145, 2011.
[3] Islam S, Lee K, Fekete A, Liu A. How a consumer can measure elasticity for cloud platforms. In Proc. the 3rd International Conference on Performance Engineering, April 2012, pp.85-96.
[4] Schad J, Dittrich J, Quiané-Ruiz J A. Runtime measurements in the cloud: Observing, analyzing, and reducing variance. Proc. VLDB Endowment, 2010, 3(1/2): 460-471.
[5] Sousa F R C, Moreira L O, Santos G A C, Machado J C. Quality of service for database in the cloud. In Proc. the 2nd International Conference on Cloud Computing and Services Science, April 2012, pp.595-601.
[6] Rogers J, Papaemmanouil O, Cetintemel U. A generic auto-provisioning framework for cloud databases. In Proc. the 26th IEEE International Conference on Data Engineering Workshops, March 2010, pp.63-68.
[7] Alves D, Bizarro P, Marques P. Deadline queries: Leveraging the cloud to produce on-time results. In Proc. the 4th IEEE International Conference on Cloud Computing, July 2011, pp.171-178.
[8] Sharma U, Shenoy P, Sahu S, Shaikh A. A cost-aware elasticity provisioning system for the cloud. In Proc. the 31st International Conference on Distributed Computing Systems, June 2011, pp.559-570.
[9] Lima A A B, Mattoso M, Valduriez P. Adaptive virtual partitioning for OLAP query processing in a database cluster. Journal of Information and Data Management, 2010, 1(1): 75-88.
[10] Coelho da Silva T L, Nascimento M A, de Macêdo J A F, Sousa F R C, Machado J C. Towards non-intrusive elastic query processing in the cloud. In Proc. the 4th International Workshop on Cloud Data Management, Oct. 29-Nov. 2, 2012, pp.9-16.
[11] Popescu A D, Dash D, Kantere V, Ailamaki A. Adaptive query execution for data management in the cloud. In Proc. the 2nd International Workshop on Cloud Data Management, October 2010, pp.17-24.
[12] Mian R, Martin P, Vazquez-Poletti J L. Provisioning data analytic workloads in a cloud. Future Generation Computer Systems, 2013, 29(6): 1452-1458.
[13] Papadias D, Kalnis P, Zhang J, Tao Y. Efficient OLAP operations in spatial data warehouses. In Proc. the 7th International Symposium on Advances in Spatial and Temporal Databases, July 2001, pp.443-459.
[14] Willig A. A short introduction to queueing theory. Technical Report, Technical University Berlin, 1999.
[15] Cervino J, Kalyvianaki E, Salvachua J, Pietzuch P. Adaptive provisioning of stream processing systems in the cloud. In Proc. the 28th IEEE International Conference on Data Engineering Workshops, April 2012, pp.295-301.
[16] Curino C, Jones E P C, Madden S, Balakrishnan H. Workload-aware database monitoring and consolidation. In Proc. the 2011 ACM SIGMOD International Conference on Management of Data, June 2011, pp.313-324.
[17] Vigfusson Y, Silberstein A, Cooper B F, Fonseca R. Adaptively parallelizing distributed range queries. Proc. VLDB Endowment, 2009, 2(1): 682-693.

Ticiana L. Coelho da Silva is an assistant professor at the Federal University of Ceará (UFC), Quixadá, Brazil, and is currently also a Ph.D. candidate in computer science at UFC. She obtained her M.Sc. and B.Sc. degrees, also in computer science, at UFC. Her research interests lie in cloud computing, quality of service, query processing, big data and data analytics.

Mario A. Nascimento is a professor and associate chair (research) at the Department of Computing Science of the University of Alberta, Canada. Before joining the University of Alberta, he was a researcher with the Brazilian Agency for Agricultural Research (1989∼1999) and also an adjunct faculty member with the Institute of Computing of the University of Campinas (1997∼1999). In addition, Mario has also been a visiting professor (during a sabbatical leave) at the National University of Singapore's School of Computing and Aalborg University's Department of Computer Science, and is currently an adjunct visiting professor at the Federal University of Ceará in Brazil. Besides often serving as a program committee member for the top database conferences, and as (co-)chair of several workshops and symposia, Mario has also served as ACM SIGMOD's Information Director (2002∼2005) and ACM SIGMOD Record's Editor-in-Chief (2005∼2007). Currently he is a member of the VLDB Journal's Editorial Board and a senior member of the ACM. His main research interests lie in the areas of spatio-temporal data management and data management for wireless sensor networks.

José Antônio F. de Macêdo holds an M.Sc. degree in computer science and a Ph.D. degree in computing, both from the Pontifical Catholic University of Rio de Janeiro, Brazil. During his Ph.D. course he worked at the ISI Lab at ENST-Bretagne, France, and at the Database Laboratory at EPFL, Switzerland. From 2006 to 2009, he was a senior researcher at the Database Laboratory at EPFL, Switzerland. He joined the Computer Science Department of the Federal University of Ceará in 2009, where he is currently an associate professor. In 2010, he was granted a research fellowship from the National Counsel of Technological and Scientific Development (CNPq) of Brazil. He is the author of more than 30 refereed journal and conference papers, and his research interests include conceptual modeling and database design, cloud computing data management, and the semantic Web.
Flávio R. C. Sousa is an adjunct professor at the Federal University of Ceará (UFC), Quixadá, Brazil. He obtained a B.Sc. degree (2004) in computer science from the Federal University of Piauí (UFPI), and M.Sc. and Ph.D. degrees in computer science from UFC in 2007 and 2013, respectively. His primary research interest is distributed systems and database management, with specialization in cloud computing and big data.
Javam C. Machado is an associate professor at the Federal University of Ceará (UFC), Brazil. He obtained an M.Sc. degree in computer science from the Federal University of Rio Grande do Sul, Brazil, and a Ph.D. degree in computer science from the University of Grenoble, France. For 8 years, he was the manager of UFC's IT infrastructure and, in 2011, he became the vice-director of the Science College at the same university. Since 1995, he has coordinated research projects in the area of databases and distributed systems and has advised M.Sc. and Ph.D. candidates. He is interested particularly in data management and cloud computing.