Combining Optimal Performance with Cost-efficiency in Adaptive Wireless Broadcast-based Systems

Christos K. Liaskos, Sophia G. Petridou, Georgios I. Papadimitriou

Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece 54124

Abstract: Research on push-based systems has produced several outstanding theoretical analyses and algorithms that aim to minimize the clients' mean waiting time under various conditions. However, computational and memory requirements have been largely neglected, which undermines the primary advantage of push systems over pull-based ones, i.e. their minimal cost. In this paper, influential, top-performing and well-known algorithms are evaluated from a cost perspective. It is shown that radical improvements are required for them to be realizable. Moreover, a new cost-efficient broadcast scheduling algorithm is introduced, achieving nearly top performance with minimal CPU and memory requirements. The new algorithm also promotes the adaptivity of push systems to the clients' changing needs, another factor that has not been taken into account by traditional approaches.

978-1-4244-5795-3/10/$26.00 ©2010 IEEE

I. INTRODUCTION

Data dissemination through broadcasting has been gaining popularity over the past years. Many modern wireless telecommunication systems suffer from scalability, population coverage and QoS problems. Overcoming these shortcomings through the traditional pull-based logic is rather costly, as it entails the installation of powerful hardware at the server side. Data broadcasting, on the other hand, has been adopted by the IT industry as a cheap alternative, either in the form of multicasting/broadcasting-enabled internet routers and related protocols [1] or push-based services via cellular networks [2]. Typical recent examples of application are XM Traffic and Weather [3] and Microsoft's MSN Direct [4], both providing real-time environmental data to subscribers.

It must be clarified that the classic pull architecture generally outperforms the broadcast approach in terms of client serving time. The primary advantage of the latter architecture is its minimal cost. This fact implies the presence of hardware limitations (i.e. CPU power and operating memory) at the server side, whose primary role is the compilation of the common broadcast schedule that is disseminated to the clients. The reader is encouraged to visualize such a scheduling server as a device of the scale of a standard x86 computer with a 2.5 GHz CPU and 4 GB of memory. However, related research has ignored this observation to the point that current, top-performing scheduling algorithms require even supercomputers to run efficiently, or are unrealizable in several cases.

A. Summary of contents

Section II provides the basics for understanding the terms, architecture and operation of push systems. The related work in Section III is extended to briefly present the compared algorithms. In Section IV the proposed cost-effective, adaptivity-oriented broadcast scheduling procedure is formulated. A comparison with the state of the art and with the most popular similar approach is presented in Section V, alongside the comparison results and remarks. Finally, in Section VI concluding remarks are presented, followed by the authors' related work and contribution in Sections VII and VIII, respectively.

II. FUNDAMENTALS OF WIRELESS DATA BROADCASTING

The main objective of a broadcast-based system is the dissemination of data units, commonly referred to as pages. These can be assumed to be of equal size (in an ATM fashion) or not. The pages are stored in a typical database. A central Broadcast Scheduling server is responsible for extracting the pages and scheduling them appropriately for broadcasting to the wireless clients. This typical topology is illustrated in Fig. 1. Appropriate scheduling, i.e. the construction of the optimal Broadcast Schedule (or simply BS), is most usually taken to mean the formation of the shortest BS that minimizes the clients' mean waiting time. This implies that the clients have some means of notifying the server of their changing preferences. As shown in [5], this feedback scheme must require minimal client upload bandwidth so as not to upset the push nature of the system. The information obtained by the feedback scheme is usually used to approximate the page access probabilities $\pi_i$, as in [6]. The clients' preferences may also be static, depending on the system's purpose. However, the main research interest focuses on adaptive push-based systems, where the clients' needs change over time.
It is essential that the BS be as short as possible, since this entails shorter database content advertising/client feedback cycles, and therefore higher adaptivity. Feedback schemes have been proposed [2], [5], and the analysis yielding the ideal BS characteristics has been completed [7]. In conclusion, the role of the central server is twofold:
• The processing of client feedback for the extraction of the approximate page access probabilities.
• The construction of the optimal Broadcast Schedule.

As in any centralized system, the central server carries the largest part of the computational burden. Not all of the proposed feedback schemes had their computational needs evaluated, but it is safe to assume that, due to their lightweight nature, they contribute little to the server's CPU overhead and memory needs. However, little (if any) research has been done on optimizing the performance and assessing the CPU/memory impact of the proposed Schedule Construction algorithms.

Fig. 1. Topology of a wireless push system.

III. RELATED WORK

Research in the field of broadcast systems can be roughly split into two main categories: the mathematical foundation and analysis of the broadcast problem in general, and algorithmic approaches that tend to provide simplified, algorithmically applicable solutions of the general problem and/or examine other advanced aspects.

In the definition of the data broadcast problem, each item to be broadcast is assigned a generic "broadcast cost". The term "cost" is purely theoretical in this case, and represents a floating point value with no specific physical meaning or correlation to any financial meaning. The problem is to find a schedule over an infinite time horizon so that the mean client waiting time as well as the mean broadcast cost are minimized. This problem is proved to be NP-hard [8], [9]. [8] discusses the generalized maintenance problem (a known NP-hard problem), and proves the broadcasting of equally sized items to be a subcase of it. A lower bound for the client's mean waiting time is also provided. [10] proves the existence of a possible solution for teletext systems, and also defines a lower bound for the client's mean waiting time in this case.

Another research branch ignores broadcast costs. Most importantly, [7] presented the analytically optimal characteristics of the Broadcast Schedule. Based on this analysis, it is trivial to show that in a schedule of starting size $L_0$ (with varying page size), each page $i$ with access probability $\pi_i$ must have exactly

$$u_i = L_0 \frac{\sqrt{\pi_i}}{\sum_{j=1}^{N} \sqrt{\pi_j}} \qquad (1)$$

occurrences, where $N$ is the total number of pages. The clients' mean response time $D$ is equal to

$$D = \frac{L_0}{2} \sum_{i=1}^{N} \frac{\pi_i}{u_i} \qquad (2)$$

which, when (1) holds, is minimized and expected to be equal to

$$D^{min} = \frac{1}{2} \left( \sum_{i=1}^{N} \sqrt{\pi_i} \right)^2 \qquad (3)$$

Equation (1) obviously yields non-integer values, and thus rounding must be employed. The real number of occurrences of each page $i$, i.e. the $i$-th page's "speed", is therefore equal to

$$u_i^{real} = round(u_i) \qquad (4)$$

where round(.) denotes the nearest integer value. The length of the final schedule becomes

$$L^{real} = \sum_{i=1}^{N} u_i^{real} \qquad (5)$$
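As a rough illustration of equations (1) to (5), the following Python sketch computes the ideal occurrences, their rounded values, the resulting schedule length and the mean response time. The function names, the Zipf-like input and the values N = 5 and L0 = 15 are assumptions chosen purely for illustration and are not taken from [7]. Rounding is implemented as floor(u + 0.5), which is one possible reading of round(.), and equation (2) is applied to the rounded schedule with its own length, which is also an assumption.

```python
import math

def zipf_probabilities(n_pages, theta):
    """Illustrative Zipf-like access probabilities (an assumed input)."""
    weights = [(1.0 / i) ** theta for i in range(1, n_pages + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def ideal_occurrences(probs, l0):
    """Equation (1): u_i = L0 * sqrt(pi_i) / sum_j sqrt(pi_j)."""
    roots = [math.sqrt(p) for p in probs]
    root_sum = sum(roots)
    return [l0 * r / root_sum for r in roots]

def mean_response_time(probs, occurrences, length):
    """Equation (2): D = (L / 2) * sum_i pi_i / u_i."""
    return 0.5 * length * sum(p / u for p, u in zip(probs, occurrences))

def rounded_occurrences(occurrences):
    """Equation (4): nearest-integer occurrences, via floor(u + 0.5)."""
    return [int(math.floor(u + 0.5)) for u in occurrences]

probs = zipf_probabilities(5, 1.0)   # hypothetical 5-page example
u = ideal_occurrences(probs, 15)     # hypothetical starting size L0 = 15
d_ideal = mean_response_time(probs, u, 15)
d_min = 0.5 * sum(math.sqrt(p) for p in probs) ** 2   # equation (3)
u_real = rounded_occurrences(u)
l_real = sum(u_real)                                   # equation (5)
d_real = mean_response_time(probs, u_real, l_real)
print(u_real, l_real, d_ideal, d_min, d_real)
```

With the ideal, non-integer occurrences the value of (2) coincides with the bound of (3); after rounding it can only stay equal or grow. Note that a page whose ideal value rounds to zero would cause a division by zero in this sketch, so such pages need separate handling.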

A scheduling algorithm is also defined in [7], which even today achieves the minimum client waiting times and is discussed in this paper.

Retaining the assumption of no broadcast costs, several algorithmic approaches have been proposed to simplify the data broadcasting problem, the most influential of all being the Broadcast Disks model [11], which is also discussed in this paper. The pages are grouped according to their "speeds" into groups called disks. These abstractly represent an array of disks spinning around a common axis. The disks are set to rotate with an angular velocity proportional to the aggregate demand of the contained pages. An imaginary set of stationary heads then retrieves pages from the disks, serializes them, and forwards them to the antenna for broadcasting. A great deal of work has been done based on this model, studying data prefetching, caching [12] and indexing [13], hybrid data broadcasting [14], as well as scheduling strategies [15] and noise interference [11].

IV. COST-EFFECTIVE, ADAPTIVITY-ORIENTED BROADCAST SCHEDULER

The proposed Cost-effective, Adaptivity-oriented Broadcast Scheduler (CABS) employs a rotating disks scheme, depicted in Fig. 2. (It must be clarified that, apart from basic conceptual similarities, this scheme is unrelated to the broadcast disks scheme introduced in [11], where the rotating disks concept is used solely for algorithmic illustration purposes; the algorithm presented therein has nothing to do in reality with any array of spinning disks. A brief description of that algorithm has been given in Section III.) It is assumed that a voting-based feedback scheme [5] is in operation, and that after an amount of time the server has acquired sufficient knowledge of the clients' preferences, and therefore of the pages' access probabilities as well. Several pages may have received zero votes due to null usefulness.

Fig. 2. The proposed broadcast schedule constructor.

In the case of equally sized pages, we begin by grouping the non zero-voted ones by their speeds, as these are defined by (4). The zero-voted pages form the last group and are assigned a speed equal to 1 unit. Each group of pages is then placed on the periphery of a circle with unit radius. The disks are set to rotate around a common axis with their corresponding speeds, and a set of stationary heads detects the start/end of pages or packets as shown in Fig. 2. When a page/packet start/end is detected, the page/packet is immediately broadcast. Should two or more pages conflict, they are sorted by descending impact $I_i$ on the mean client waiting time (see the right term of Equation (2))

$$I_i = \frac{\pi_i}{u_i} \qquad (6)$$

and are broadcast in that order. To minimize the probability of a conflict occurring, the starting angle of each disk is randomized in the range $[0, 2\pi]$. The procedure lasts for a time interval of $T = \frac{2\pi R \cdot \min\{u_i\}}{\min\{u_i\}} = 2\pi R = 2\pi$, at which point a BS of exactly $L^{real} = \sum_{i=1}^{N} u_i^{real}$ has been produced.

Should the pages be of varying size, they are divided into chunks of the size of e.g. the smallest of the pages. Each group of chunks belonging to a single page is assigned a distinct disk. From that point the procedure is the same as described. In this way, "Data Scrambling" is achieved, i.e. the contents of every page are spread uniformly throughout the Broadcast Schedule. The advantage of this approach is that the clients do not need to wait while an excessively big page is transmitted, a behavior that would have a direct negative impact on the clients' mean waiting time.

The algorithm is expected to achieve near-minimum client mean waiting times since it is designed in accordance with the analysis of [7]. Since it simulates a group of physical disks rotating with a prefixed speed, the computational cost, i.e. the time required for the completion of the procedure, is expected to be steady. The length of the produced schedule is also always equal to that of (5), since no zero adding is involved in any stage. The CABS algorithm is formulated as Algorithm 1.

Algorithm 1 The CABS Schedule Constructor.
Input: The speeds $u_i$ and access probabilities $\pi_i$ of the $N$ pages.
Output: The broadcast schedule.
1: Group pages by $u_i$
2: Set $NoD$ = number of created groups
3: Set $DiskSize_i$ = {number of pages in group $i$}, $i = 1 \ldots NoD$
4: Set initial position of heads: $S_0^i = random\{[0, 2\pi]\}$, $i = 1 \ldots NoD$
5: Set $\Delta t = \min\{\frac{2\pi}{DiskSize_i \cdot u_i},\ i = 1 \ldots NoD\}$
6: Set $T = 2\pi$
Create the broadcast schedule
7: for $t := \Delta t$ to $T$, step by $\Delta t$ do
8:    Calculate current head positions: $S_t^i = S_0^i + u_i \cdot t$, $i = 1 \ldots NoD$
9:    Collect pages/packets whose boundaries have been crossed in $[t - \Delta t, t]$, in array $B$
10:   Sort $B$ by descending impact factors $I_i$
11:   Broadcast items of $B$
12: end for
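The rotating-disk idea of Algorithm 1 can be sketched in Python as follows. This is a minimal, event-driven variant written for illustration, not the authors' MATLAB implementation: instead of stepping time by Δt, it computes directly the instants at which each page boundary passes under its head, which should visit the same boundaries in the same order. The function name `cabs_schedule`, the `seed` parameter and the sample inputs are hypothetical, and the sketch assumes equally sized pages with integer speeds of at least 1.

```python
import math
import random
from collections import defaultdict

def cabs_schedule(speeds, probs, seed=0):
    """Event-driven sketch of the rotating-disk schedule constructor.

    speeds[i] is the integer speed u_i (>= 1) of page i and probs[i]
    its access probability. Returns the broadcast schedule as a list
    of page indices of length sum(speeds).
    """
    rng = random.Random(seed)
    disks = defaultdict(list)            # one disk per distinct speed
    for page, u in enumerate(speeds):
        disks[u].append(page)

    events = []                          # (time, -impact, page)
    for u, pages in disks.items():
        n = len(pages)
        slot = 2 * math.pi / n           # angular width of one page
        s0 = rng.uniform(0.0, 2 * math.pi)   # random starting angle
        first = math.floor(s0 / slot) + 1    # first boundary ahead of the head
        # In T = 2*pi the disk turns u times, so u * n boundaries are crossed.
        for k in range(u * n):
            m = first + k
            t = (m * slot - s0) / u      # instant at which boundary m is crossed
            page = pages[m % n]
            impact = probs[page] / u     # impact factor of equation (6)
            events.append((t, -impact, page))

    # Earlier crossings first; simultaneous ones by descending impact.
    events.sort()
    return [page for _, _, page in events]

speeds = [3, 3, 2, 1, 1, 1]                   # hypothetical speeds u_i
probs = [0.3, 0.25, 0.2, 0.1, 0.1, 0.05]      # hypothetical probabilities
schedule = cabs_schedule(speeds, probs, seed=1)
print(schedule)
```

Each page appears exactly as many times as its speed, so the schedule length equals the sum of the speeds, in line with equation (5). The exact interleaving depends on the random starting angles.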

V. PERFORMANCE AND COST ASSESSMENT

Even though the proposed algorithm adheres to the guidelines of the theoretical analysis, it is imperative that its performance be assessed and compared to the current state of the art and other popular approaches. To this end, the CABS algorithm is compared with the one introduced in [7] and the one of [11]. The former, though introduced in 1999, still represents the top-performing algorithm in terms of client waiting time. The latter represents the most popular approach in terms of influence, as discussed in Section III. As the current trends in push technology incline towards adaptivity in dynamic client environments and cost minimization, the performance assessment is threefold:
1) The classic criterion, i.e. the mean client waiting times achieved by all three algorithms, is compared for a variety of different client probabilistic configurations.
2) The lengths of the produced broadcast schedules are compared as a metric of the system's adaptive capabilities, as described in Section II. From another point of view, smaller BS lengths stand for lower memory requirements (e.g. cache size) and have a direct impact on the financial cost of the hardware implementation of the system.
3) As a metric of the cost of each algorithm, the CPU times required to construct the corresponding BS are compared.
Notice that, to the best of our knowledge, the latter two criteria are overlooked in the totality of the research on push systems, and are unconsciously discarded as insignificant.
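The first two criteria can be evaluated directly on any periodic schedule. The Python sketch below is an illustration under stated assumptions rather than the simulation used in this paper: pages are of unit length, a client arrives uniformly at random within a cycle, and waiting is measured until the start of the next broadcast of the requested page. The function name `mean_waiting_time` and the toy schedule are hypothetical.

```python
def mean_waiting_time(schedule, probs):
    """Mean client waiting time (in page units) of a periodic schedule.

    For a page whose consecutive broadcasts are separated by gaps g_j
    (summing to the cycle length L), a uniformly arriving client waits
    sum_j g_j**2 / (2 * L) on average; pages are weighted by probs.
    Assumes every page with non-zero probability appears in the schedule.
    """
    length = len(schedule)
    slots_of = {}
    for slot, page in enumerate(schedule):
        slots_of.setdefault(page, []).append(slot)

    total = 0.0
    for page, slots in slots_of.items():
        count = len(slots)
        # Cyclic gaps between consecutive occurrences; a page that
        # appears once has a single gap equal to the cycle length.
        gaps = [(slots[(k + 1) % count] - slots[k]) % length or length
                for k in range(count)]
        total += probs[page] * sum(g * g for g in gaps) / (2 * length)
    return total

toy_schedule = [0, 1, 0, 2]          # hypothetical 4-slot cycle
toy_probs = [0.5, 0.25, 0.25]        # hypothetical access probabilities
toy_wait = mean_waiting_time(toy_schedule, toy_probs)
toy_length = len(toy_schedule)       # second criterion: BS length
print(toy_wait, toy_length)
```

When the occurrences of every page are equally spaced, this expression reduces to equation (2); uneven spacing can only increase it, which is why spreading the occurrences of each page evenly matters.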

As the results prove though, it is exactly these criteria that determine whether an algorithm can be implemented in the real world. Moreover, push systems are de facto used as a minimal-cost solution. Should an algorithm require exquisite, powerful and expensive hardware to run, it would be wise to abandon the push logic altogether and resort to classic client-server architectures, which are bound to perform better.

A. Simulation Set-up and Results

As shown in Table I, not all compared algorithms support all the features discussed in the theoretical analysis. Since the SOA (State-Of-the-Art [7]) and BDISKS ([11]) algorithms are at a disadvantage in this respect, the comparison will be limited to the case of equally sized, non zero-voted pages. The comparison is simulation based and the corresponding parameters are presented in Table II. The parameter values used are typical of similar papers [16], [17]. All algorithms were implemented in MATLAB as standalone functions. The CPU time spent in each one was then measured by the MATLAB Code Profiler tool. The reasons for this choice were the following:
1) Handwritten assembly code is obviously much less reliable than the corresponding code produced by a well-known and globally trusted compiler or interpreter.
2) Each algorithm typically has an implementation of 10 or fewer lines of MATLAB code, leaving little (if any) room for implementation issues to be considered.
3) The MATLAB interpreter and profiler are popular and well known in the scientific community globally.
4) Only one official MATLAB interpreter exists, as opposed to other programming languages like C/C++ or FORTRAN. The results are thus more verifiable and trustworthy.
5) The MATLAB version used supports JIT compiling, providing execution times comparable to a corresponding C/C++ implementation.

Throughout the execution time, great care was taken to keep the CPU usage steady at 100% and dedicated to the MATLAB interpreter. Specification details are given in Table III. Finally, results are displayed in Figs. 3-6.

TABLE I
FEATURE SUPPORT OF PSC ALGORITHMS.

Feature                    | SOA*1 | BDISKS*2 | CABS
Periodic BS                | no    | yes      | yes
Varying page size          | yes   | no       | yes
Data scrambling            | no    | no       | yes
Zero voted page management | no    | no       | yes

*1: The algorithm presented in [7]. *2: The algorithm presented in [11].

TABLE II
SIMULATION PARAMETERS.

Parameter           | Value
Topology            | STAR {1 node-server: n wireless clients}
Client query p.d.f. | ZIPF: $\pi_i = \frac{(1/i)^{\theta}}{\sum_{i=1}^{N} (1/i)^{\theta}}$, $i = 1 \ldots N$
θ                   | {[0.5 : 0.1 : 1.2]} ∩ {1}
Feedback            | Server has sufficient knowledge of clients' p.d.f. (i.e. steady state)
L0                  | 15,000
Number of pages     | N = 5,000
Page size           | 1 unit (uniform)
Simulation duration | 30,000 client queries
ThinkTime*          | 2 pages

*: Introduced in [11]: upon receiving a wanted page, the client halts queries for this time interval.

TABLE III
CPU TIME ASSESSMENT SPECIFICATIONS.

Parameter                  | Value
Test CPU model             | x86 Family 6 model 15 Stepping 11 GenuineIntel
Number of cores            | 4 × 2672 MHz
Test RAM                   | 4 GB DDR2 800 MHz
Operating System           | Windows XP Professional
Version                    | 5.1.2600 SP3 Build 2600
Programming language       | Single-threaded MATLAB M-code
Environment                | MATLAB R2009b
CPU usage during execution | Steady 100% on one dedicated core*
Application priority       | 13 ("High")

*: Measured by means of Windows Task Manager.

Fig. 3. Mean Client Waiting times achieved by the scheduling algorithms. (Plot: client mean waiting time in page unit durations versus the Zipf p.d.f. θ parameter, for the theoretical minimum, SOA, CABS and BDISKS; BDISKS performance is beyond the figure scale.)

B. Remarks

Concerning the achieved mean waiting time, both SOA and CABS achieve results of the same level, their performance practically coinciding with the corresponding ideal (theoretical minimum) values, as shown in Fig. 3. SOA has a slight advantage (1%) for every θ value. The BDISKS algorithm, on the other hand, fails by far to achieve similar results, its performance reaching even 1030 values.

Fig. 4. Illustration of the Schedule Lengths required by each scheduling algorithm. (Plot: BS length in pages, scale ×10^7, versus the Zipf p.d.f. θ parameter, for the theoretical minimum, SOA, CABS and BDISKS; BDISKS performance is beyond the figure scale.)

Fig. 5. CPU time required by each scheduling algorithm on the test machine. BDISKS values are approximates. (Plot: CPU time in seconds on a semi-logarithmic axis versus the Zipf p.d.f. θ parameter, for SOA, CABS and BDISKS.)

In the case of the required BS sizes, displayed in Fig. 4, both SOA and BDISKS produce extremely big broadcast schedules. Excessively big BS sizes have a negative impact on the system's adaptive capabilities, since the content-advertising cycles would last too long. In this interval the clients' preferences will probably have changed, and the produced BS will no longer be suitable for them. From another point of view, the BS size has a direct impact on the data storage capabilities of the server. Fig. 6 illustrates the memory needs of a server managing 5000 HTML pages of 10 KBytes each. Notice that SOA requires even 1.25 TBytes of storage area, while CABS requires merely 300 MBytes in any case. The BDISKS performance is not displayed as it exceeds the figure scale by far.

Finally, Fig. 5 presents the CPU cost of each algorithm. The y-axis represents values in semi-logarithmic format. While CABS steadily requires less than 1 second in any case, SOA requires from 10 minutes to 3 hours, while BDISKS requires even days. This means that a 2.67 GHz CPU-based server would have to halt every other function for minutes, hours or days in order to produce the BS. Once again, by the time the BS is completed the client preferences will have changed, and the produced BS will not be suitable and will have to be recalculated. The system's adaptivity would therefore be severely downgraded. Also notice that in the context of this paper the server is assumed to have a perfect knowledge of the clients' p.d.f., which is clearly not the case for BDISKS, and only conditionally possible for SOA.

As a final remark, modern devices of the scale of a standard x86 computer are equipped with 2.5 GHz CPUs and 4 GB of memory. The only algorithm capable of running on devices of this level is CABS. BDISKS is shown to fail in both the performance and cost aspects, and is thus better avoided in real-world implementations. SOA, on the other hand, would require a small cluster computer to run, and only if a multi-threaded version of the algorithm were available. The current version is strictly single-threaded. Even if pure assembly implementations boosted its performance by an excessive factor of ×100, it would still be only conditionally acceptable. Another possible solution would be the avoidance of the x86 architecture and the design of specialized, dedicated hardware. However, should one possess these funds, it would be more profitable to invest in a top-performing pure client-server architecture, abandoning the push logic, or to invest in other aspects of a telecommunications system design, like population coverage and more efficient antenna installations.

Again, it must be stated that this study merely points out an important aspect of push systems that has been neglected. It does not in any way downgrade the significance of the compared studies, on which it actually heavily relies. The general remark is the encouragement of future studies to take the cost parameter into consideration.
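The link between BS length and storage quoted in the remarks can be checked with a back-of-the-envelope calculation. The Python sketch below simply divides the storage figures stated above by the assumed 10 KByte page size; decimal units (1 KByte = 1000 bytes) and the helper name `schedule_slots` are assumptions made for illustration.

```python
def schedule_slots(memory_bytes, page_bytes):
    """Number of page slots that fit in the given amount of storage."""
    return memory_bytes // page_bytes

KB, MB, GB = 10**3, 10**6, 10**9      # decimal units (an assumption)
page_bytes = 10 * KB                   # 10 KByte pages, as in Fig. 6

cabs_slots = schedule_slots(300 * MB, page_bytes)    # CABS: 300 MBytes
soa_slots = schedule_slots(1250 * GB, page_bytes)    # SOA: 1.25 TBytes
ratio = soa_slots / cabs_slots
print(cabs_slots, soa_slots, ratio)
```

Under these assumptions the quoted storage sizes correspond to schedules that differ in length by more than three orders of magnitude, which is the gap the remarks attribute to the two algorithms.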

Fig. 6. Memory in GBs required to hold the BS of each scheduling algorithm. Standard 10KB page size is assumed.

VI. CONCLUSION

In the context of the present work:
1) Implementation cost assessments have been shown to be critical for push systems, and every newly proposed algorithm should provide proof of minimal or at least realistic computational and memory requirements.
2) The testing of the adaptivity capabilities of proposed broadcast schemes has been shown to be of equal importance, since there is little interest in old-fashioned, static push systems. The Broadcast Schedule length has been proposed as a representative factor of a push system's adaptive capabilities.
3) A new cost-effective, adaptivity-oriented broadcast schedule constructor algorithm has been proposed.
It must be clarified that the present work relies on the excellent analysis of [7] and was inspired by the broadcast disks model. Its main purpose is thus not in any way the deprecation of these studies, but the encouragement of cost assessment in future ones.

VII. AUTHORS' RELATED WORK

The authors have so far contributed to both analytical and algorithmic approaches to the generic-cost-independent broadcast problem. In [5] and [6] an effort to combine clustering techniques with the broadcast disks model was made. In [18] an analytical approach was presented, providing the tools for defining the optimal scheduling parameters in advance, as well as for accurately estimating the mean client waiting time in advance, without relying on time-consuming simulations. Moreover, research on adaptive push systems has been carried out in [2], [5]. The authors began to study the cost parameter in [19].

VIII. CONTRIBUTION

This paper focused on the real-world implementation perspectives of algorithms designed to construct the schedule of broadcast-based (or push-based) systems. By theoretically analyzing many aspects of the broadcast scheduling procedure, a new broadcast schedule constructor algorithm, the Cost-effective Adaptivity-oriented Broadcast Scheduler (CABS), has been proposed. CABS was compared in terms of performance, adaptivity, and CPU/memory cost with the current state-of-the-art SOA algorithm, and with another very popular and influential approach, the Broadcast Disks model (BDISKS). The results have shown that these algorithms have neglected the adaptivity and cost attributes to such an extent that they may be only conditionally realizable. As a general contribution of the paper, it is shown that any push-related study should take the implementation cost and adaptive capabilities into account, apart from waiting time-based performance. Successful compliance with the aforementioned criteria results in realistic and practically useful systems.

REFERENCES
[1] Y. Yang, J. Wang, and M. Yang, "A service-centric multicast architecture and routing protocol," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 1, 2009.
[2] P. Nicopolitidis, G. Papadimitriou, and A. Pomportsis, "Exploiting locality of demand to improve the performance of wireless data broadcasting," IEEE Transactions on Vehicular Technology, vol. 55, no. 4, 2006.
[3] XM-Satellite-Radio, "Xm radio," World Wide Web electronic publication, 2009. [Online]. Available: http://www.xmradio.com/
[4] Microsoft-Inc., "Msn direct-connected navigation made simple," World Wide Web electronic publication, 2009. [Online]. Available: http://www.msndirect.com/
[5] C. Liaskos, S. Petridou, G. Papadimitriou, P. Nicopolitidis, M. Obaidat, and A. Pomportsis, "Clustering-driven wireless data broadcasting," IEEE Wireless Communications Magazine, to appear.
[6] C. K. Liaskos, S. G. Petridou, G. I. Papadimitriou, P. Nicopolitidis, M. S. Obaidat, and A. S. Pomportsis, "A novel clustering-driven approach to wireless data broadcasting," in Proc. of IEEE SCVT '08.
[7] N. Vaidya and S. Hameed, "Scheduling data broadcast in asymmetric communication environments," Wireless Networks, vol. 5, no. 3, 1999.
[8] A. Bar-Noy, R. Bhatia, J. Naor, and B. Schieber, "Minimizing service and operation costs of periodic scheduling," in Proc. of SODA '98.
[9] C. Kenyon and N. Schabanel, "The data broadcast problem with nonuniform transmission times," in Proc. of SODA '99.
[10] M. Ammar and J. Wong, "On the optimality of cyclic transmission in teletext systems," IEEE Transactions on Communications, vol. 35, no. 11, 1987.
[11] S. Acharya, R. Alonso, M. Franklin, and S. Zdonik, "Broadcast disks: data management for asymmetric communication environments," in Proc. of SIGMOD '95.
[12] K. Wu, P. Yu, and M. Chen, "Energy-efficient caching for bandwidth-limited wireless mobile computing," in Proc. of ICDE '96.
[13] J. Cai and K. Tan, "Tuning integrated dissemination-based information systems," Data and Knowledge Eng., vol. 30, no. 1, 1999.
[14] K. Stathatos, N. Roussopoulos, and J. Baras, "Adaptive data broadcast in hybrid networks," in Proc. of VLDB '97.
[15] W. Wang and C. Ravishankar, "Adaptive data broadcasting in asymmetric communication environments," in Proc. of IDEAS '04.
[16] E. Alexandre, A. Pena, and M. Sobreira, "Low-complexity bit-allocation algorithm for mpeg aac audio coders," IEEE Signal Processing Letters, vol. 12, no. 12, 2005.
[17] R. Sukkar, S. Herman, A. Setlur, and C. Mitchell, "Reducing computational complexity and response latency through the detection of contentless frames," in Proc. of ICASSP '00.
[18] C. Liaskos, S. Petridou, and G. Papadimitriou, "On the analytical performance optimization of wireless data broadcasting," IEEE Trans. Vehicular Technology, to appear, 2009.
[19] C. Liaskos, S. Petridou, and G. Papadimitriou, "A new approach to the design of wireless data broadcasting systems: An analysis-based cost-effective scheme," in Proc. of ICUMT '09.
