Optimizing the Energy Consumption of Continuous Query Processing with Mobile Clients

Panayiotis Neophytou (1), Jesse Szwedko (1), Mohamed A. Sharaf (2), Panos K. Chrysanthis (1), Alexandros Labrinidis (1)

(1) Department of Computer Science, University of Pittsburgh
(2) School of Information Technology and Electrical Engineering, The University of Queensland

{panickos, jjs86}@cs.pitt.edu, [email protected], {panos, labrinid}@cs.pitt.edu
Abstract—Complex event detection over data streams has become ubiquitous through the widespread use of sensors, wireless connectivity and the wide variety of end-user mobile devices. Typically, such event detection is carried out by a data stream management system executing continuous queries (CQs), previously submitted by the users. In this paper, we consider the situation where the results of the CQs, which are in the form of individual data streams, are disseminated to the users’ handheld, battery-operated devices over a shared broadcast medium. In order to reduce the overall energy consumption of the mobile devices, we propose BOSe*, a power-aware query operator placement algorithm that determines which part of a CQ plan should be executed at the data stream management system and which part should be executed at the mobile device. BOSe*’s effectiveness in reducing energy consumption, as well as response time under specific conditions, is evaluated using simulation, driven by parameters measured on real mobile devices.
This work was supported by NSF grants IIS-0534531 and IIS-1050301.

I. INTRODUCTION
Research in energy conservation in mobile and wireless networks has always been driven by the fact that, at a mobile device, there is a gap between the energy consumed for processing data and the energy consumed for transmitting or receiving data, in favor of local processing. This is especially true as the CPUs of mobile devices become even faster, with newer devices even featuring two cores. This fact has led to the design principle of trading off communication energy consumption for computation energy consumption (e.g., [14]).

In this paper, we apply this principle to minimize the energy consumption of the hand-held, battery-operated devices used by mobile users of Data Stream Management Systems (DSMSs). Specifically, we consider monitoring applications where mobile users register continuous queries (CQs) that specify events of interest to them and that are executed efficiently by a DSMS over unbounded data streams. The results of CQs are also continuous data streams, which are continuously disseminated to the mobile end-users over a shared broadcast wireless medium.

As an example of a CQ submitted by a mobile user, consider the query of a trader who monitors stock price updates on the floor of a market exchange (Fig. 1): it selects the stocks that are in the NASDAQ index (operator $O_1$); projects out the columns of no interest (operator $O_2$); joins the tuples with the user's portfolio to append the buying price (operator $O_3$); and finally, calculates the user's profit in the last 5 minutes, every 30 seconds (operator $O_4$).

Our approach is to split the load of query processing between the DSMS and the mobile user devices themselves by opportunistically taking advantage of the reduced size of intermediate results to shorten the mobile devices' listening time on the broadcast and, consequently, reduce their communication energy cost, at the cost of additional local processing. In our trader query example, the first two operators, $O_1$ (select) and $O_2$ (project), are data reducing, while the operators $O_3$ (join) and $O_4$ (aggregation) could potentially be data expanding. Thus, $O_3$ and $O_4$ could be shipped to the mobile client, and the much smaller intermediate result produced by $O_2$ would be broadcast. Further, CQs often share prefixes which include the same operators (as in Fig. 1(b)), making it even more beneficial to broadcast intermediate results shared by multiple queries, since the overall broadcast size will be reduced.

In our prior work [12], we laid the groundwork for our approach by proposing three operator placement algorithms assuming a simplistic CQ execution model. Of these, the BOSe (Broadcast-aware Operator Selection) algorithm was the best performing. Encouraged by our preliminary results, in this paper we present and evaluate BOSe*, which is based on a realistic CQ execution model that supports operator sharing among queries and sharing of queries among users. Our results also suggest that BOSe* reduces response times under specific conditions.

Contributions: This paper's contributions can be summarized as follows:
1) We propose BOSe*, an operator placement algorithm which takes into account the broadcast organization, sharing of operators and sharing of queries to provide the greatest overall energy savings at the mobile devices.
2) We present a heuristic parameterized variant of BOSe*, namely BOSe-vlook, which significantly reduces the cost of generating high-quality operator placement plans that are comparable to those of the basic BOSe*.
3) We provide an extensive experimental evaluation, using parameters obtained from experiments on real mobile devices. Our results show that BOSe* can achieve an overall energy reduction of up to 53%.

Outline: Section II provides the system model. Section III presents BOSe* and Section IV describes its evaluation. Section V surveys related work. Section VI lists our findings.
II. SYSTEM MODEL
In our model, the DSMS implements, as a core module, a wireless disseminator which broadcasts the results of CQs to the mobile users (clients).

A. Data Stream Processing
A CQ evaluation plan generated by the query optimizer can be conceptualized as a data flow tree [1], [6], where the nodes are operators that process tuples and the edges represent the flow of tuples from one operator, e.g., $O_x$, to another, $O_y$ (Fig. 1(a)), where $upstream(O_y) = O_x$ and $downstream(O_x) = O_y$. Each operator is associated with a queue where input tuples are buffered until they are processed.

A single-stream query $Q_k$ has a single source operator, e.g., $O_1^k$, and a single output operator, e.g., $O_4^k$. Further, in a query plan $Q_k$, an operator segment $G^k_{x,y}$ is the sequence of operators that starts at $O_x^k$ and ends at $O_y^k$. If the last operator on $G^k_{x,y}$ is the output operator, then we simply denote that operator segment as $G^k_x$.

When two or more continuous queries share a common subexpression, that sharing translates into identical prefix operator segments across the plans for those queries. In such a case, the DSMS continuous query optimizer typically chooses to instantiate this prefix only once. The results produced by this prefix segment are shared among the remaining suffixes of the queries that share this prefix. For each operator $O_x$ in the shared segment, we define $Q(O_x)$ as the set of queries sharing $O_x$. For example, in Fig. 1(b), $Q(O_1) = Q(O_2) = \{Q_k, Q_j\}$.

Another popular form of sharing, which can be seen as a special case of operator sharing, is when the same query is shared by (submitted by) two or more clients. In such a case, for each shared query $Q_k$, we define $C(Q_k)$ as the set of clients sharing $Q_k$. We define $C(O_x)$ as the set of clients who registered queries that share $O_x$ (e.g., $C(O_x^{j,k}) = C(Q_j) \cup C(Q_k)$).

In a query $Q_k$, an operator $O_x^k$ (or simply $O_x$) could be select ($\sigma$), project ($\pi$), aggregate (e.g., $\sum$), or join-table ($\bowtie_T$).
Each operator is associated with three parameters:
• $c_x$: the number of cycles needed to process an input tuple.
• $s_x$: the ratio of output tuples produced by $O_x$ after processing one input tuple. Thus, $s_x$ is less than or equal to 1 for a filter operator and it could be greater than 1 for a join operator.
• $p_x$: the ratio between the size of a tuple produced by $O_x$ (i.e., output size) and its size before being processed (i.e., input size). Thus, $p_x$ is less than or equal to 1 for a project operator and it may be greater than 1 for a join operator.

For an operator $O_x$ with $upstream(O_x) = O_{x-1}$, we define the following characterizing parameters:
• $tn_x$: the number of tuples produced at the output of $O_x$ after processing the $tn_{x-1}$ tuples in its input queue: $tn_x = tn_{x-1} \times s_x$
• $ts_x$: the size of each tuple produced at the output of $O_x$ after processing a tuple in its input queue: $ts_x = ts_{x-1} \times p_x$
• $ds_x$: the size of the data block produced at the output queue of $O_x$ after processing a block of data from its input queue: $ds_x = tn_x \times ts_x$

Notice that if $O_x$ is the output operator in query $Q_k$, then $ds_x$ is the total size of the data block produced by $Q_k$.

Fig. 1: Continuous Query Plan

B. Wireless Broadcast
In this work, we adopt broadcast push [3], since it naturally complies with the DSMS access model where a client installs a CQ once and the server repeatedly broadcasts the new results as they become available. Hence, any number of clients can monitor the broadcast channel and retrieve data as it arrives, at a constant bandwidth $BW$.

In our model, the wireless disseminator initiates a new broadcast cycle as soon as the previous one ends. Each cycle consists of a sequence of results, each of which is either a final result (i.e., produced at a query's output operator) or an intermediate result (i.e., produced at a query's internal operator). In general, we denote the broadcast result produced by $O_x$ as $D_x$ and its size in bytes as $|D_x|$. Moreover, if $O_x$ is shared by more than one query, then those results are also shared on the broadcast and appear only once.

The result $D_x$ of an operator $O_x$ appears on the broadcast channel as a contiguous sequence of data packets preceded by a descriptor packet that contains an identifier of $Q(O_x)$ and the time offset to the next broadcast cycle. Accordingly, each client should tune to each broadcast cycle for results corresponding to its registered CQs. During tuning, the client's network interface card (NIC) is in active mode, consuming relatively large amounts of energy compared to when the NIC is switched to idle mode. Hence, the amount of energy consumed by a wireless client depends on the data organization [11], [10]. In this paper, we adopt two basic data organization schemes.

The first data organization scheme uses sorted broadcast, where the broadcast server sorts the results according to the data size and the popularity of each result.
In particular, each result $D_x$ is assigned a priority $P_x$ equal to $R_x / |D_x|$, where $R_x$ is the number of queries that share the result $D_x$ produced by operator $O_x$. That is, $R_x = \left| \bigcup_{Q_i \in Q(O_x)} C(Q_i) \right|$. The broadcast is then organized in descending order of priority. This maintains a broadcast that follows the weighted shortest job first scheduling policy, which has been shown to minimize total response time in shared resources [7].

Accordingly, a client $N_i$ that registered a set of queries $Q(N_i) = \{Q_1, Q_2, ..., Q_m\}$ will need to download a set of results (either final or intermediate), where each of those results might correspond to one or more queries in $Q(N_i)$. In particular, $N_i$ will download results $\{D_1, D_2, ..., D_n\}$, where $n$ is the number of results and $n \leq m$. Hence, if the queries in $Q(N_i)$ do not share any operators then $n = m$, otherwise $n < m$. Finally, $N_i$'s tuning time $TT(N_i)$ is computed as:

\[ TT(N_i) = \frac{|D_{min}| + \sum_{j} |D_j|}{BW}, \quad \forall j : P_j > P_{min} \tag{1} \]
where $D_{min}$ is the result with the minimum priority among all the results $\{D_1, D_2, ..., D_n\}$. In particular, the client will tune in from the beginning of the broadcast, downloading results for queries in its registered set, and it will stay tuned until it downloads the last one ($D_{min}$) in that set.

The second data organization scheme uses an indexed broadcast, where the broadcast server attaches an index at the beginning of each broadcast cycle (e.g., a (1,1) index [11]). The index contains an entry for each result $D_x$ on the broadcast in the form $\langle Q(D_x), t_x \rangle$, where $t_x$ is the time offset of $D_x$ within the broadcast cycle. In this scheme, a client $N_i$ first needs to tune to the index packet to learn the broadcast times of the results corresponding to the queries that it has registered, and then powers off its NIC until the smallest of those timestamps, say $t_s$. At time $t_s$, $N_i$ powers on its NIC again, tunes into the broadcast to retrieve that result (i.e., $D_s$) and, after it finishes fetching all of $D_s$'s data packets, powers off the NIC again until the next timestamp, or until the next broadcast cycle if no other timestamps remain. The tuning time for a node $N_i$ in the indexed broadcast is computed as:

\[ TT(N_i) = \frac{\sum_{x=1}^{n} |D_x| + |Index|}{BW} \tag{2} \]
where $n$ is the number of results corresponding to $N_i$'s queries as defined before, $|D_x|$ is the size of each of those results, and $|Index|$ is the size of the index.

C. Mobile Clients
Mobile clients serviced by the system can register multiple queries and then listen to a broadcast medium to get their results. We assume that each mobile client is associated with a profile that includes the following four characteristics:
• $Speed(N_i)$: processing speed of the client in cycles per unit of time.
• $PP(N_i)$: power consumed per unit of time of processing.
• $PT(N_i)$: power consumed per unit of time of tuning (i.e., when the NIC is active).
• $E_{PowerUp}$: energy needed to power up the NIC.

Based on the client profiles, the energy consumption and computational cost can be computed. Specifically, the tuning energy $E_{Tune}$ for a client $N_i$ is:

\[ E_{Tune}(N_i) = TT(N_i) \times PT(N_i) + U(N_i) \times E_{PowerUp} \tag{3} \]
where $TT(N_i)$ is the tuning time and $U(N_i)$ is the number of times the client needs to power up the NIC. The processing energy $E_{Process}$ for a client $N_i$, given the processing time $TP$, is:

\[ E_{Process}(N_i) = TP \times PP(N_i) \tag{4} \]
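The energy model of Eqs. (1) to (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Client` class, its field names, and the representation of a broadcast cycle as an ordered list of `(result_id, size)` pairs are hypothetical and do not come from the paper's simulator.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """Client profile of Section II-C (field names are illustrative)."""
    speed: float       # Speed(N_i): cycles per unit of time
    pp: float          # PP(N_i): power while processing
    pt: float          # PT(N_i): power while the NIC is active
    e_power_up: float  # E_PowerUp: energy per NIC power-up


def tuning_time_sorted(broadcast, wanted, bw):
    """Eq. (1): listen from the start of the cycle through the last wanted result.

    broadcast: list of (result_id, size_bytes), already in broadcast order.
    wanted: set of result ids the client must download.
    """
    cumulative = 0.0
    listened = 0.0
    for result_id, size in broadcast:
        cumulative += size
        if result_id in wanted:
            listened = cumulative  # everything up to here had to be received
    return listened / bw


def tuning_time_indexed(broadcast, wanted, index_size, bw):
    """Eq. (2): listen only to the index and to the wanted results."""
    wanted_bytes = sum(size for result_id, size in broadcast if result_id in wanted)
    return (wanted_bytes + index_size) / bw


def tuning_energy(tt, client, power_ups):
    """Eq. (3): tuning time cost plus NIC power-up cost."""
    return tt * client.pt + power_ups * client.e_power_up


def processing_energy(tp, client):
    """Eq. (4): processing time multiplied by processing power."""
    return tp * client.pp
```

With a three-result cycle of 1000, 2000 and 4000 bytes and a client that wants the first and last result, the sorted scheme charges the client for all 7000 bytes, whereas the indexed scheme charges it for 5000 bytes plus the index.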
III. BOSe*: BROADCAST AWARE OPERATOR SELECTION
In this section, we formalize the placement of CQ operators in DSMS environments with mobile clients and propose our new operator placement algorithm, BOSe*, and its variant.

A. Problem Statement
Our goal is to design operator placement algorithms that work in synergy with the broadcast organization so that we minimize the total energy consumption at the mobile clients. The total energy consumption is the sum of two components, tuning and processing, which can be expressed as:

\[ E_{Total} = E_{Tune} + E_{Process} \tag{5} \]
Given the clients' profiles and their corresponding registered queries, an operator placement algorithm decides to shift some of the computation to the client if doing so reduces the overall total energy consumption. Specifically, it splits each query plan into two segments, where the first segment is processed on the DSMS and the second one on the clients who registered the query. For instance, if client $N_i$ registered queries $Q(N_i) = \{Q_j, Q_k\}$ and their plans were split at operators $O_x^j$ and $O_y^k$ respectively, then the operators in $G^j_{x+1,output}$ and $G^k_{y+1,output}$ have to be processed on the client. The set of operators executed at $N_i$ is expressed as the union $G = G^j_{x+1,output} \cup G^k_{y+1,output}$, to cover the case where the two segments share common operators, which need to be instantiated only once on each client. Thus, for a client $N_i$ running segments $G = G^j_{x+1,output} \cup G^k_{y+1,output} \cup ...$, the processing time is computed as:

\[ TP(N_i) = \sum_{O_k \in G} \frac{c_k \times tn_{k-1}}{Speed(N_i)} \tag{6} \]
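As a minimal sketch of Eq. (6), the function below accumulates the cycles spent by a client on its segment while propagating the tuple count with $tn_k = tn_{k-1} \times s_k$. It assumes, for simplicity, that the client-side segment is a single linear chain of operators; the function name and the `(c_k, s_k)` tuple layout are illustrative choices, not part of the paper.

```python
def processing_time(segment, tn_in, speed):
    """Eq. (6) for a linear chain of operators executed at one client.

    segment: list of (c_k, s_k) pairs in pipeline order, where c_k is the
             cost in cycles per input tuple and s_k is the selectivity.
    tn_in:   number of tuples entering the first operator of the segment.
    speed:   Speed(N_i) in cycles per unit of time.
    """
    cycles = 0.0
    tn = float(tn_in)
    for c_k, s_k in segment:
        cycles += c_k * tn  # every tuple in the input queue costs c_k cycles
        tn *= s_k           # tn_k = tn_{k-1} * s_k feeds the next operator
    return cycles / speed
```

For example, a two-operator segment with costs 100 and 200 cycles per tuple and selectivities 0.5 and 2.0, fed with 1000 tuples on a client running at $10^6$ cycles per second, needs $100 \times 1000 + 200 \times 500 = 200000$ cycles, i.e., 0.2 seconds.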
Using Eq. 4, we can now find the processing energy $E_{Process}$ consumed by each wireless client $N_i$, for use in calculating the total energy consumption of each client in Eq. 5.

In our prior work [12], we proposed the following two algorithms along with BOSe (the preliminary version of BOSe*):
a) DataMinCut: minimizes the tuning energy expended by the clients. It does so by applying the Max-flow Min-cut theorem to the whole query network, which is cast into a flow graph, to select the edges $E$ whose data will populate the broadcast. Each edge in the flow graph is labeled with the average data size flowing through it.
b) PowerMinCut: attempts to minimize the overall energy by choosing the edges that result in the smallest total energy (tuning and processing) for the client. It augments the edge labels with the client processing energy cost of all the operators downstream of each edge. This algorithm neglects the broadcast organization and thus may result in suboptimal energy consumption since, depending on the broadcasting scheme, a local decision may negatively affect other clients.

B. Broadcast-aware Operator Selection (BOSe*)
BOSe* can be perceived as a hybrid of DataMinCut and PowerMinCut, as it integrates the desirable features of each. On one hand, like DataMinCut, BOSe* tries to minimize the length of the broadcast cycle to minimize tuning energy. On the other hand, like PowerMinCut, BOSe* considers the extra energy needed for operator processing at the client.

BOSe* uses the DataMinCut output as a starting point and then applies a greedy selection process geared towards finding a segment of operators downstream from the current cut and reinstating them on the server. Since DataMinCut gives the minimal broadcast size, any reinstatement by BOSe* necessarily increases the size of the broadcast. However, BOSe* will only perform a reinstatement if its benefit in terms of reducing processing energy is greater than the cost incurred in terms of increasing tuning energy, which depends on the broadcast organization.

At each iteration, BOSe* examines all the current broadcast edges, $E$. For each edge $E_x \in E$, it generates a list of all the possible segments of operators following that edge, vertically as well as horizontally; that is, all the prefixes of the operator segments $G^j_{x+1,output}$, for all of the queries $Q_j$ in $Q(O_x)$. It does this recursively by adding to the current segment all possible combinations of operators connected to it (the power set), taken from the next level of operators. This process is performed for each edge in $E$, and the segment with the highest impact in reducing total energy is selected and its operators are reinstated on the server. BOSe* repeats the selection process until no further improvement in energy is achievable.

BOSe* optimization function: Recall that at each step, BOSe* is expected to increase the tuning energy while decreasing the processing energy as compared to DataMinCut. Assume these changes are $\Delta_{Tune}$ and $\Delta_{Process}$, respectively. Hence, after BOSe* makes a selection, the overall energy consumption from Eq. (5) can be expressed as:

\[ E_{Total} = (E_{Tune} + \Delta_{Tune}) + (E_{Process} - \Delta_{Process}) \]
Clearly, our objective is to select an operator segment which minimizes the value $\Delta_{Tune} - \Delta_{Process}$, which should also be less than zero. Thus, we simply need to compute that value for each operator segment under consideration and select the one with the lowest value.

To illustrate this process, assume that $E_R \in E$. Further, assume that $G$ is one of the prefixes under examination (for instance, $G$ in Fig. 2), which is considered as a candidate to be moved from client $N_i$ back to the server side. Moving $G$ back to the server will reduce the processing energy by the amount $\Delta_{Process}$, computed as follows:

\[ \Delta_{Process} = \sum_{O_j \in G} \sum_{N_i \in C(O_j)} \left( \frac{tn_{j-1} \times c_j \times PP(N_i)}{Speed(N_i)} \right) \]
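The selection rule and the $\Delta_{Process}$ formula above can be sketched as follows. This is an illustrative sketch, not the paper's implementation: the dictionary layout of operators, the `(speed, pp)` pairs standing for client profiles, and the assumption that $\Delta_{Tune}$ has already been computed for each candidate are all choices made for this example.

```python
def delta_process(segment_ops, clients_of):
    """Processing energy saved at the clients if segment G returns to the server.

    segment_ops: list of dicts with keys "id", "c" (cycles per tuple) and
                 "tn_in" (tn_{j-1}, tuples in the operator's input queue).
    clients_of:  maps an operator id to the clients in C(O_j), each given
                 as a (speed, pp) pair.
    """
    saved = 0.0
    for op in segment_ops:
        for speed, pp in clients_of[op["id"]]:
            saved += op["tn_in"] * op["c"] * pp / speed
    return saved


def best_candidate(candidates):
    """Return the candidate minimizing delta_tune - delta_process, or None.

    candidates: list of (name, delta_tune, delta_process) tuples.
    A candidate is only acceptable when its score is below zero, i.e., when
    the processing energy saved exceeds the tuning energy added.
    """
    if not candidates:
        return None
    score, name = min((d_tune - d_proc, name) for name, d_tune, d_proc in candidates)
    return name if score < 0 else None
```

In a greedy loop of the kind described in Section III-B, `best_candidate` would be called once per iteration, and the loop would stop as soon as it returns `None`.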
Further, moving segment $G$ back to the server entails replacing, in the broadcast $E$, its input edge (i.e., edge $E_R$) with $G$'s output edges, say $E_{A_1}$ and $E_{A_2}$ (Fig. 3). The impact of this replacement operation on the tuning energy ($\Delta_{Tune}$) depends on the broadcast organization.

In the case of sorted broadcast, removing $E_R$ will reduce the tuning time for all clients $N_i$ receiving data from $E_R$. Additionally, it will also reduce the tuning time for all the clients waiting for results that appear after $E_R$ on the broadcast cycle ($N_R$). This reduction per client is simply $ds_R / BW$, where $ds_R$ is the data size associated with edge $E_R$. Hence, the total reduction is computed as:

\[ \Delta_R = \frac{ds_R}{BW} \times \left( \sum_{N_i \in C(E_R)} PT(N_i) + \sum_{N \in N_R} PT(N) \right) \tag{7} \]

Fig. 2: DataMinCut and operator segment $G = \{G^{d,e}_{2,2}, G^{c}_{2,3}\}$.

Fig. 3: Optimization step for BOSe algorithm.
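Eq. (7) translates directly into code. The sketch below assumes that the caller has already determined which clients receive $E_R$ and which clients are waiting for results placed after it; the function name and argument layout are hypothetical.

```python
def delta_r_sorted(ds_r, bw, pt_receivers, pt_waiting):
    """Eq. (7): tuning energy saved by removing edge E_R from a sorted broadcast.

    ds_r:         data size associated with edge E_R (bytes).
    bw:           broadcast bandwidth (bytes per unit of time).
    pt_receivers: PT(N_i) of each client in C(E_R).
    pt_waiting:   PT(N) of each client in N_R, i.e., clients waiting for
                  results that appear after E_R on the broadcast cycle.
    """
    per_client_time = ds_r / bw  # every affected client listens this much less
    return per_client_time * (sum(pt_receivers) + sum(pt_waiting))
```

For instance, removing a 2000-byte edge at 1000 bytes per second shortens the tuning time of each affected client by 2 seconds; with two receivers drawing 1.0 and 1.5 units of power and one waiting client drawing 2.0, the saving is $2 \times 4.5 = 9$ units of energy.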
Similarly, adding $E_{A_i}$ will increase the tuning time for the clients receiving data from $E_{A_i}$ and will also increase the tuning time for all the clients waiting for results that appear after $E_{A_i}$ on the broadcast cycle (Fig. 3), including any newly added edges which will appear after $E_{A_i}$. This increase per client is simply $ds_{A_i} / BW$, where $ds_{A_i}$ is the data size associated with edge $E_{A_i}$. The total increase is computed as:

\[ \Delta_A = \sum_{A_i \in E_A} \left( \frac{ds_{A_i}}{BW} \times \left( \sum_{N \in C(E_{A_i})} PT(N) + \sum_{N \in N_{A_i}} PT(N) \right) \right) \tag{8} \]
where $N_{A_i}$ is the set of clients waiting for results that appear after $E_{A_i}$ on the broadcast cycle.

Moreover, under a sorted broadcast, the location of each new edge $E_{A_i}$ on the broadcast cycle is determined according to its data size. Since $ds_{A_i} > ds_R, \forall i$, $E_{A_i}$ will appear at a later offset than $E_R$ did. Therefore, when $E_{A_i}$ replaces $E_R$, the clients which consume results in $E_{A_i}$ will have to spend more tuning time to receive them than they used to spend to receive $E_R$. This translates into extra tuning energy, which is computed as:

\[ \Delta_{R,A} = \sum_{A_i \in E_A} \sum_{N \in C(E_{A_i})} \frac{\sum_{E \in E_{R,A_i}} ds_E}{BW} \times PT(N) \tag{9} \]
where $E_{R,A_i}$ is the set of edges on the broadcast cycle that appear after $E_R$ and before $E_{A_i}$. Thus, $\Delta_{Tune}$ for the sorted broadcast is computed as:

\[ \Delta_{Tune} = -\Delta_R + \Delta_A + \Delta_{R,A} \]

In the case of indexed broadcast, the optimization is simplified, since the edge remove/add operation will only affect the client under examination without any impact on the other clients in the system (i.e., $\Delta_{R,A} = 0$ in Eq. 9). Thus, $\Delta_{Tune}$ for the indexed broadcast is denoted by $\Delta'_{Tune} = -\Delta'_R + \Delta'_A$, where Equations 7 and 8 become:

\[ \Delta'_R = \sum_{N \in C(E_R)} \frac{ds_R}{BW} \times PT(N) \tag{10} \]

\[ \Delta'_A = \sum_{A_i \in E_A} \sum_{N \in C(E_{A_i})} \frac{ds_{A_i}}{BW} \times PT(N) \tag{11} \]

It is interesting to note that the above equations clearly reveal that BOSe* on indexed broadcast behaves like PowerMinCut on indexed broadcast. This fact is also confirmed by our experimental results.

Complexity and Approximation: The BOSe* algorithm performs an exhaustive search over all possible combinations of operators, both vertically and horizontally. The number of combinations for each edge in $E$ is $\sum_{i}^{\log_d n} 2^{i}$. To reduce the search space, we propose a parameterized version of BOSe*, called BOSe-vlook(x), which limits the cardinality of the sets taken from the power set (as mentioned before) to at most $x$.

IV. EXPERIMENTAL EVALUATION
In this section, we illustrate the performance of our algorithms using our own simulator, driven by parameters determined from experimental results on a mobile device.

A. Experimental Setup
The algorithms (DataMinCut, PowerMinCut, and BOSe* and its variant) are implemented in Java and run as they would on a real system. As a base case, we also implemented ServerOps, which executes all the operators on the server and broadcasts the final results to the clients. The workload used in the simulator is a complete plan of the registered CQs, along with the mobile clients' profiles, which include their speed and power consumption parameters for processing and tuning. We measure the total energy consumption at all the mobile clients as the ratio of processing to tuning power of the nodes increases linearly from 0.01 to 0.65, while keeping the tuning power constant. The remaining parameters are summarized in Table I. It should be noted that, from our measurements on real hardware, we found these ratios to be between 0.0104 and 0.6572. The values reported are averages of 15 runs.

TABLE I: Workload Default Characteristics
Levels per query refers to the number of operators which exist in every single-stream query in the workload. Operator cost skewness follows a Zipf distribution per level that is skewed towards the high cost and is proportional to the level number; in the default setting the skewness of operator level $i$ is equal to $0.2 \times i$.

Parameter | Values
Number of queries | 50
Number of clients | 25
Number of clients per query | 1-5 (Zipf)
Levels per query | 10
Sharing levels | 2
Maximum Degree of Operator Sharing | 3
Sources tuple rate | 500 – 1000 tuples/sec, uniform
Sources tuple size | 2000 – 4000 bytes, uniform
Selectivity | 0.2 – 1.8, uniform
Projectivity | 0.5 – 1.5, uniform
Operator costs (cycles/tuple) | $100 \times 10^6$ – $200 \times 10^6$
Operator cost skewness | 0.2 increments (Zipf)
Hand-held device speed | $1 \times 10^9$ cycles/sec
Server speed | 1 × 3.49 cycles/sec
Bandwidth | 125000 bytes/sec

B. Experimental Results (Energy)
Sorted Broadcast: As Fig. 4 shows, BOSe* outperforms all other algorithms. ServerOps' performance is constant because it runs all operators at the server side, and thus increasing the processing power of mobile clients has no impact on its performance. Fig. 4 also shows that the energy consumption of DataMinCut increases linearly, because DataMinCut tries to minimize only the tuning energy and not the processing energy. Hence, DataMinCut selects the same set of edges for every setting regardless of the increase in power consumption needed for processing. PowerMinCut performs better than DataMinCut and ServerOps; however, it is oblivious to the broadcast organization, as it considers each query individually without measuring the impact of its selected edge on the other clients in the system. BOSe* always performs better than any of the other algorithms because it evaluates the different options taking into account both the broadcast organization and the processing power costs of operators running on the mobile clients, thus striking a fine balance between tuning and processing energies. For instance, at a processing to tuning energy ratio of 0.01, BOSe* provides an improvement in energy of 50% over ServerOps, and at 0.35 an improvement of 14%. In the extreme case of 0.65, it still provides a small improvement.

Fig. 4: Energy consumption for tuning vs. processing power cost for sorted broadcast, normalized over ServerOps. [Plot, panel title "Popularity": x-axis, Client CPU/Transmission Power Ratio; y-axis, Client Power Consumption (normalized for ServerOps); series: ServerOps, DataMinCut, PowerMinCut, BOSe*, BOSe-vlook(1), BOSe-vlook(2), BOSe-vlook(3).]

The heuristics BOSe-vlook(2) and BOSe-vlook(3) perform very similarly to BOSe*. The reduction in the number of computations is very significant: 38% for BOSe-vlook(3) and 75% for BOSe-vlook(2), while the performance remains very close to the best (up to 99%) for BOSe-vlook(3).

Indexed Broadcast: Due to space limitations we omit these results; however, they are similar to those seen in Fig. 4. For example, the energy savings in this broadcasting scheme are 53% compared to ServerOps, at a ratio of 0.01. In this case BOSe* and PowerMinCut perform identically, as the broadcast order does not affect the power consumption of the clients.
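The savings reported for BOSe-vlook(x) come from capping the size of the operator subsets drawn from the power set at each expansion step. The sketch below illustrates only that enumeration, using the standard `itertools.combinations`; the function name is hypothetical, and excluding the empty subset is an assumption made for this example rather than something stated in the paper.

```python
from itertools import combinations


def next_level_subsets(operators, x=None):
    """Yield non-empty subsets of the next-level operators.

    operators: sequence of candidate operators connected to the current segment.
    x:         optional cap on subset cardinality, as in BOSe-vlook(x);
               None enumerates the full (non-empty) power set.
    """
    limit = len(operators) if x is None else min(x, len(operators))
    for size in range(1, limit + 1):
        yield from combinations(operators, size)
```

With four candidate operators, the uncapped enumeration yields 15 subsets, while a cap of 2 yields 10 and a cap of 1 yields 4, which shows how quickly the search space shrinks as $x$ decreases.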
Fig. 5: Average response time of client queries for tuning vs. processing power cost for sorted broadcast, normalized over ServerOps. [Plot, panel title "Popularity": x-axis, Client CPU/Transmission Power Ratio; y-axis, Client Response Time (normalized for ServerOps); series: ServerOps, DataMinCut, PowerMinCut, BOSe*, BOSe-vlook(1), BOSe-vlook(2), BOSe-vlook(3).]
C. Experimental Results (Response Time)
Although we were optimizing for power consumption on the client, we observed notable gains in the response time of client queries (Fig. 5). This can be attributed to the fact that, while the server handles all queries (we assume a round-robin scheduler), the client need only be concerned with processing operators for the queries that it is interested in and can dedicate its resources to that end. Thus, by broadcasting an intermediate result, the client can complete the computations more quickly. However, this is only true when the following inequality holds for a client $N_i$ (without loss of generality, assume only one query for $N_i$):

\[ TT_{ServerOps} + TP(Server) \geq TT_{BOSe} + TP(N_i) + TP'(Server) \]

If the workload remains the same, we know that $TP(Server) \geq TP'(Server)$ and $TT_{ServerOps} \geq TT_{BOSe}$, where $TP'$ is the new server processing time under BOSe*. This means that $TP(N_i)$ (the time to process on client $N_i$) has to be low enough not to dominate the other components. In general, longer processing time on a client means higher energy drain, so BOSe*, while minimizing power consumption, will also reduce response time, since long-running operators will not be moved to the client. However, clients with efficient CPUs may see a higher response time, as BOSe* may choose to move more operators to these clients, thus increasing response time while decreasing power consumption.

V. RELATED WORK
Current DSMSs assume that the underlying network layer is responsible for propagating the output data streams to end-users. However, this decoupling of the system from the transport layer eliminates the chance of exploiting the CQs' characteristics for better bandwidth utilization. Previous research on Publish/Subscribe and mobile information systems shows the importance of considering queries' semantics together with employing advanced data dissemination schemes (e.g., [5], [4], [8], [11]). In these schemes, data of interest to multiple clients is disseminated only once, thus making effective use of the available bandwidth and allowing maximum scalability. In this paper, we apply the same concept in disseminating a DSMS's output data streams for scalability and energy savings.

The idea of query operator shipping to reduce the network cost in a distributed database system was used in MOCHA [13], where data reducing operators were pushed towards the data sources and the data producing operators towards the clients. This is similar to our approach, but in our case the data is disseminated to the clients through a broadcast network instead of point-to-point connections. Also, [13] considers ad-hoc queries, whereas in our work we consider CQs over data streams.

The idea of query operator distribution was proposed in distributed DSMSs (D-DSMSs) (e.g., [2]) for workload balancing. Similar to our approach, several approaches to CQ operator distribution in D-DSMSs consider data transmission overhead (e.g., [16], [15], [9]). All the D-DSMS approaches consider point-to-point unicast connections between the distributed nodes, as opposed to our work, which considers broadcast push. Further, these techniques do not consider energy consumption.

VI. CONCLUSIONS
In this paper, we proposed BOSe*, a query operator placement algorithm that splits the load of CQ processing between the DSMS and the mobile user devices in order to minimize the overall energy consumption on the mobile devices. Our extensive experimental evaluation showed the effectiveness of BOSe* in distributing the query operators in a way that reduces the size of the query results that need to be continuously disseminated to the mobile users, which leads to an overall reduction of up to 53% in energy on mobile devices compared to traditional centralized CQ processing at the DSMS.

REFERENCES
[1] D. J. Abadi et al. Aurora: a new model and architecture for data stream management. The VLDB Journal, 12(2):120–139, 2003.
[2] D. J. Abadi et al. The design of the Borealis stream processing engine. In CIDR, 2005.
[3] S. Acharya, M. Franklin, and S. Zdonik. Balancing push and pull for data broadcast. In SIGMOD, 1997.
[4] S. Acharya and S. Muthukrishnan. Scheduling on-demand broadcasts: New metrics and algorithms. In MobiCom, 1998.
[5] D. Aksoy and M. Franklin. RxW: A scheduling approach for large-scale on-demand data broadcast. IEEE/ACM Trans. Netw., 7(6):846–860, 1999.
[6] B. Babcock, S. Babu, M. Datar, R. Motwani, and D. Thomas. Operator scheduling in data stream systems. VLDB J., 13(4):333–353, 2004.
[7] N. Bansal and K. Dhamdhere. Minimizing weighted flow time. ACM Trans. Algorithms, 3(4), 2007.
[8] A. Crespo, O. Buyukkokten, and H. G. Molina. Query merging: Improving query subscription processing in a multicast environment. IEEE Trans. Knowl. Data Eng., 15(1):174–191, 2003.
[9] P. Pietzuch et al. Network-aware operator placement for stream-processing systems. In ICDE, 2006.
[10] Q. Hu, W.-C. Lee, and D. L. Lee. Power conservative multi-attribute queries on data broadcast. In ICDE, 2000.
[11] T. Imielinski, S. Viswanathan, and B. R. Badrinath. Energy efficient indexing on air. In SIGMOD, 1994.
[12] P. Neophytou, M. A. Sharaf, P. K. Chrysanthis, and A. Labrinidis. Power-aware operator placement and broadcasting of continuous query results. In MobiDE, 2010.
[13] M. Rodriguez-Martinez and N. Roussopoulos. MOCHA: A self-extensible database middleware system for distributed data sources. In SIGMOD, 2000.
[14] M. A. Sharaf and P. K. Chrysanthis. On-demand data broadcasting for mobile decision making. MONET, 9(6):703–714, 2004.
[15] Y. Ahmad and U. Cetintemel. Network-aware query processing for stream-based applications. In VLDB, 2004.
[16] Y. Zhou, B. Chin Ooi, and K.-L. Tan. Dynamic load management for distributed continuous query systems. In ICDE, 2005.