Design and Simulation of Power-Aware Scheduling Strategies of Streaming Data in Wireless LANs

Andrea Acquaviva, Emanuele Lattanzi, Alessandro Bogliolo
Information Science and Technology Institute (STI), 61029 Urbino, Italy
{acquaviva, lattanzi, bogliolo}@sti.uniurb.it

ABSTRACT

One of the major concerns for 802.11b wireless local area networks is energy efficiency. Mobile devices spend a large amount of power on their radio interface when accessing multimedia services such as audio and video streaming. In this work we address the problem of energy-aware scheduling of streaming data provided by a single server to multiple clients. We propose both open-loop and closed-loop strategies that exploit application-level information to perform energy-efficient traffic reshaping. We evaluate the effectiveness of the proposed strategies by means of accurate power/performance simulations performed on top of Mathworks' Simulink. System components are modeled as generalized semi-Markov processes (GSMPs) and characterized by means of real-world measurements. In particular, the timing and power behavior of wireless network interface cards is accurately captured in order to evaluate the impact of power management strategies on a wireless 802.11b link. Experimental results show that up to 75% of the communication energy can be saved by means of power-aware traffic scheduling with negligible user-perceived performance degradation.

Categories and Subject Descriptors

I.6 [Simulation and Modeling]: Model Validation and Analysis

General Terms

Design, Management, Experimentation

Keywords

Wireless LANs, Power management, Traffic scheduling, Simulation

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MSWiM'04, October 4-6, 2004, Venezia, Italy. Copyright 2004 ACM 1-58113-953-5/04/0010 ...$5.00.

1. INTRODUCTION

Supporting wireless local area network (WLAN) connectivity may cost up to 60% of the energy budget of a palmtop multimedia appliance [2]. Power consumption of wireless network interface cards (WNICs) is the main drawback of IEEE 802.11x WLANs. Different techniques have been proposed to reduce WNIC power consumption, including transmission power control [10] and MAC-level power management (PM) [7, 12], which can be activated by the card driver when power is critical. Furthermore, virtually all cards support various shut-down states, spanning the trade-off between power consumption and re-activation latency. It has been shown [3] that an effective way to further reduce energy consumption is to create opportunities for card shut-down by reshaping network traffic at a higher level than the MAC. The basic rationale of this approach is to create long idle periods for the NIC, so that the high shut-down transition cost (in terms of latency and energy) can be fully amortized and power can be saved during long shut-down times. In many WLANs, such as home networks, a few servers connect multiple WLAN clients to a wired network via access points (APs). In a multi-client environment, traffic reshaping becomes a scheduling problem. While clients run on power-constrained devices, servers are typically not as power constrained. In addition, servers can have access to information about both wired and wireless network conditions. For these reasons, servers are the best candidates for efficiently scheduling data transmission to clients.

Traditionally, scheduling for multimedia traffic has been studied from two main perspectives. In the context of multimedia data delivery across large network topologies, several Quality of Service (QoS) sensitive schemes have been proposed. These schemes are designed to work in the network elements (switches and routers) responsible for allocating a share of the link bandwidth to multimedia streams. They are basically aimed at overcoming limitations of fair queuing schemes such as Weighted Fair Queuing (WFQ) and Virtual Clock (VC) in providing QoS guarantees to soft real-time applications. In this context, the Dual Queue discipline has been proposed, which tries to maximize the number of customers receiving good service in case of congestion [6]. Real-time traffic scheduling schemes suitable for QoS-oriented network architectures, such as IntServ, have also been proposed [11].

When multimedia data must be delivered across a local area network, scheduling strategies can be implemented at the traffic source level. Several schemes have been proposed in this context, mostly for video-on-demand (VOD) systems. These schedulers have traditionally targeted minimizing client waiting time. To ensure service robustness with respect to packet delivery latency variations and time-varying re-transmission rates, streaming video clients and the server decouple frame transmission and playback through client frame buffers, which are controlled by the server via packet transmission scheduling. A traditional scheduling policy is join-the-shortest-queue (JSQ) (i.e., earliest-deadline-first), which guarantees optimal results in terms of buffer emptiness avoidance [8]. More recent work in this area has led to the development of schedulers aimed at matching real-time constraints for scalable VOD systems [14] and QoS requirements for simultaneous video transmission [13].

In this work we address the problem of traffic scheduling for multimedia applications using the client-side communication energy as the objective function for optimization under real-time constraints. We propose energy-aware scheduling and buffer management policies exploiting the WNIC radio-off state to aggressively reduce power. We present both open-loop and closed-loop strategies. The open-loop strategy is completely controlled by the server, which exploits its knowledge of the consumption rate of all clients to provide bursts of new frames followed by a timed shut-down command that turns off the WNIC of the client for a given amount of time. The closed-loop strategy is based on a low-water-mark notification sent by each client to the server whenever the number of packets in the local application buffer falls below a given threshold. In both cases, the best energy efficiency is provided by a join-the-longest-queue (JLQ) scheduling policy, since it maximizes the burstiness of the traffic directed to each client.
We present comparative simulation results showing that our approach can drastically improve energy efficiency (by as much as 75%) compared to traditional scheduling policies coupled with MAC-level PM, without impairing the quality of service. We use Mathworks' Simulink as a simulation testbed, modeling all system components as generalized semi-Markov processes (GSMPs) [5]. In particular, we capture the timing and power behavior of the WNIC in all operating modes supported by the 802.11b protocol, in order to evaluate the impact of DPM strategies and traffic reshaping on the client's consumption. All simulation models are accurately characterized and validated against real-world measurements performed on a working prototype, while the workload for the simulation model is directly generated from time-stamped traces taken from video streams.

2. SYSTEM MODEL AND SIMULATION

In this section we describe the main components of our simulation framework: (i) the streaming video application, (ii) the power management support of WNICs, and (iii) the client-server power-manageable system.

2.1 Streaming Video over an IEEE 802.11b Link

Our target application is streaming video accessed by mobile clients through an IEEE 802.11b wireless link. The video server communicates with the wireless network through an access point (AP) acting as a bridge to the wireless NICs installed on the palmtops. Each client runs an MPEG4 decoder that reads, buffers, decodes and plays back the video frames at a constant rate (the consumer rate). Server and clients exchange data using an RTP application-level protocol over a UDP/IP stack. The amount of traffic on the channel depends on the compression rate and frame rate of the video stream. While the frame rate is constant, the compression rate is variable, so that a frame can be represented by one or more packets. When more than one client is simultaneously connected to the video server, both CPU and communication resources are time-multiplexed. In general, clients may request different data with different frame rates. Without any power control, the server adapts the packet transmission rate (the producer rate) to the playback rates read from time-stamps in the RTP packet headers. Packets belonging to the same frame are sent as a single burst over the channel, frames belonging to the same stream are spaced in time according to their time-stamps, and frames belonging to different streams (i.e., sent to different clients) are interleaved in time according to their rates. In order to meet real-time video constraints, the client application buffer should never be empty when the decoder looks for a frame to decode; if this happens, a deadline miss occurs. Without any power control, the shape of the traffic on the wireless channel reflects the server transmission rate, and client-side buffers are needed only to compensate for timing uncertainties due to server-side resource sharing, communication overheads and variable compression rates.
To provide a realistic workload to our simulation model, we instrumented the MPEG video decoder to produce a time-stamped trace of packets that can be collected while running the streaming application on a PC.
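The paper does not specify the trace format. A minimal sketch of one plausible representation, with hypothetical field names (timestamp, frame id, packet size, packets per frame), illustrates how such a time-stamped packet trace could be parsed and replayed by a simulator:

```python
from dataclasses import dataclass

@dataclass
class PacketRecord:
    """One line of a hypothetical time-stamped packet trace."""
    timestamp: float    # playback time-stamp from the RTP header (seconds)
    frame_id: int       # frame this packet belongs to
    size: int           # payload size in bytes
    pkts_in_frame: int  # total number of packets encoding the same frame

def parse_trace(lines):
    """Parse 'timestamp frame_id size pkts_in_frame' lines into records."""
    records = []
    for line in lines:
        ts, fid, size, ppf = line.split()
        records.append(PacketRecord(float(ts), int(fid), int(size), int(ppf)))
    return records

# A 2-packet frame at t=0 followed by a 1-packet frame at t=0.066 (15 fps)
trace = parse_trace(["0.000 0 1400 2", "0.000 0 230 2", "0.066 1 900 1"])
```

Any format carrying these four quantities would suffice; the simulator's producer block only needs per-packet sizes, frame membership, and inter-frame spacing.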

Figure 1: Simulink model of the streaming application over a wireless channel.

2.2 Power Management Support

Most wireless cards provide at least two low-power states: a sleep state (or doze mode) and a radio-off state. When sleeping, the card is inactive and its power consumption is much lower than in the active states, but it maintains synchronization with the access point by waking up periodically to listen for beacons. In the radio-off state, the card is completely switched off and consumes no power.

MAC-Level DPM Support. According to the 802.11b MAC-level protocol [9], the AP can perform traffic reshaping by buffering incoming packets, allowing the WNIC to sleep for a given period of time in order to save power. If MAC-level power management is enabled, the WNIC goes to sleep for a fixed period as soon as a fixed timeout has elapsed since the last received packet. While the NIC is sleeping, incoming packets are buffered by the AP. After expiration of the sleeping period, the card wakes up and listens for the beacon periodically sent by the AP. The beacons contain a traffic indication map (TIM) providing information about buffered packets, and are used to re-synchronize the WNIC to the AP. If there are buffered packets to be received by the client, the WNIC replies to the beacon by sending polling frames to get the backlog from the AP packet by packet.

MAC-level DPM has several limitations. First, it cannot completely shut down the card, so the power consumption of the sleep state represents a lower bound on the achievable average power. Second, it does not provide any mechanism for waking up the NIC earlier than the fixed sleep time if the incoming packets saturate the AP buffer. Third, the maximum sleep time supported by commercial implementations of the IEEE 802.11b protocol is usually lower than 1 second, in order to reduce the risk of AP buffer saturation and to keep the card synchronized with the AP (thus reducing the wakeup time). Fourth, the protocol power management has a sizeable overhead due to the polling frames used to download the backlog.

Application-Level DPM Support. Even if the power consumption in the sleep state is low, it is not negligible. Moreover, the sleeping card is still sensitive to broadcast traffic. A more aggressive policy is to completely shut off the card when it is not needed by any active application in the system. More power can thus be saved, at the price of the larger wakeup delay needed for network re-association. On the other hand, the radio-off state of the WNIC cannot be directly managed at the protocol level, since the card is completely switched off; OS-level policies can be implemented for this purpose [2].
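As an illustration of the MAC-level doze cycle described above, the following sketch (our simplification: a fixed beacon period, the whole backlog drained in one batch, and no polling cost) computes when a sleeping card would wake up and how many AP-buffered packets it would find at each wakeup:

```python
def psp_wakeups(arrivals, beacon_period):
    """For packets buffered by the AP while the card dozes, return
    (wakeup_time, backlog) pairs: the card wakes at the first beacon
    after a pending arrival and drains the backlog (modeled as a batch).
    'arrivals' must be sorted packet arrival times in seconds."""
    wakeups = []
    i = 0
    while i < len(arrivals):
        # first beacon strictly after the next pending arrival
        t = (int(arrivals[i] // beacon_period) + 1) * beacon_period
        n = 0
        while i < len(arrivals) and arrivals[i] < t:
            n += 1
            i += 1
        wakeups.append((t, n))
    return wakeups

# Two packets buffered during the first beacon interval, one during the second
schedule = psp_wakeups([0.1, 0.2, 1.5], beacon_period=1.0)
```

The per-packet PS-Poll exchange, which the paper identifies as a source of overhead, is deliberately abstracted away here; a fuller model would charge a poll/ack cost per buffered packet.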

2.3 Power Manageable System Model

Discrete-event models [1] were built on top of Mathworks' Simulink for all system elements: the MPEG4 server (i.e., the producer), the access point with its internal buffer, the IEEE 802.11 wireless link, the NIC (namely, a CISCO Aironet), the application buffer and the video client (i.e., the consumer). Behavioral, timing and energy models were characterized and validated against real-world measurements performed on a fully-operational instrumented system. In particular, all MAC-level power management modes supported by the CISCO NIC were accurately modeled. The block diagram of the Simulink model of the system is shown in Figure 1.

Producer. The producer generates output events (representing network packets) according either to a given distribution of inter-arrival times, or to a trace of time-stamped packet information. For each packet, three properties are either randomly generated or read from the trace: the size (in bytes), the frame it belongs to, and the total number of packets representing the same frame. Packets belonging to the same frame are generated as a burst, while packets belonging to different frames are generated at a rate depending on the application. Packet information is made available at the output ports, together with the event that represents the generation of a new packet.

Access point buffer. Packet events generated by the producer become input events for the buffer of the base station, which is explicitly represented in the model as a limited FIFO queue with a customizable size. The content of the queue is saved in memory as an array of packets, each with the corresponding information (size, frame number, packets per frame). If the producer tries to send a packet when the queue is full, a lost event is generated and the corresponding packet is discarded.

Access point. The access point (or base station) gets the packets from the queue and sends them to the wireless channel. If the input buffer is empty or the receiver is not ready, the AP goes to a waiting state. If the receiver is sleeping because of DPM, the AP goes to an idle state, causing incoming packets to accumulate in the input buffer.

Wireless Channel. The wireless channel is represented by a block that receives input events representing incoming packets and generates, with a given latency, output events representing packet delivery. The wireless channel is bidirectional and has a user-defined packet-loss probability. Lost packets are not delivered to the receiver. Notice that we use a simple channel model since we are interested in modeling the entire wireless system, rather than the wireless channel by itself. Channel latency, loss probability and bandwidth (implicitly modeled by the receiver) are sufficient to perform realistic system-level simulations. Nevertheless, any channel model [4] can easily be embedded in this block to take into account complex error statistics.

Wireless Network Card. This is the most critical block of the system, since its power consumption is critical for the battery lifetime of the palmtop. Focusing only on the reception of UDP traffic, we describe two main operating modes: always on (ON) and power-save protocol (PSP). The two operating modes can be viewed as macro-states of a top-level state diagram where state transitions are triggered by user commands.
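The access point buffer block described above behaves as a bounded FIFO that discards on overflow. A minimal sketch (class and method names are ours, not the actual Simulink block interface):

```python
from collections import deque

class APBuffer:
    """Limited FIFO queue: a full queue generates a 'lost' event
    and the offending packet is discarded."""
    def __init__(self, capacity):
        self.q = deque()
        self.capacity = capacity
        self.lost = 0            # count of lost events

    def push(self, packet):
        """Enqueue a packet event; return False (lost) if the queue is full."""
        if len(self.q) >= self.capacity:
            self.lost += 1
            return False
        self.q.append(packet)
        return True

    def pop(self):
        """Dequeue the oldest packet, or None if the buffer is empty."""
        return self.q.popleft() if self.q else None

buf = APBuffer(capacity=2)
for p in ("p1", "p2", "p3"):
    buf.push(p)                  # third push overflows and is dropped
```

In the full model each queued entry would also carry the packet's size, frame number and packets-per-frame fields read from the producer's output ports.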
The state diagram of the ON mode is shown in Figure 2.a. The card waits for incoming packets that trigger transitions from wait to receive. Depending on the nature and on the correctness of the packet, the card may or may not send a MAC-level acknowledge to the base station. In particular, a positive acknowledge is required whenever a unicast packet is properly received, while no negative acknowledge is sent for corrupted packets, and no acknowledge is required for multicast or broadcast packets. We call a-packet any packet that requires a positive acknowledge, n-a-packet any packet that will not be acknowledged. Although the power state of the card during reception is independent of the nature of the

State Poll Wait Receive Ack Process

Current V=0.0781+/-0.0015 V=0.0527+/-0.0015 V=0.0607+/-0.0006 V=0.0842+/-0.0008 V=0.0573+/-0.0026

overhead AckTime APtime latency payload latency

Time T=0.3430+/-0.0285 T=1.1673+/-0.1184 T=0.7199+/-0.1346 T=0.3153+/-0.0209 T=0.2883+/-0.1416

overall packet time

t

Figure 3: Timing diagram of the wireless channel

Table 1: Power consumption of the NIC in each state

Output Buffer. This block represents the consumer buffer. It can be used to model either the protocol stack buffer or the application buffer (if present). More levels of buffering can be added. In our case we decided to represent only the UDP protocol buffer, since the application buffer is usually larger and less critical. Consumer. The consumer simulates a streaming application that reads packets from the output buffer at a given rate. Since real-time constraints impose a constant frame rate, but each frame may be encoded using a different number of packets, packet requests from the consumer do not arrive at a constant rate. Rather, the consumer decides how many packets to read within a frame period based on the information associated with the incoming packets. Frames that either arrive late with respect to the deadline or are incomplete because of a packet loss are discarded by the consumer.

received packet, in our state diagram we use two different receive states: receive w ack, that leads to the Ack state, and receive w/o ack, that leads back to the Wait state. State duplication is used only to represent the dependence of the acknowledge on the nature of the received packet. The card remains in the receiving state for a time interval that is dynamically computed as the ratio between the size of the packet (including the payload and the protocol overhead) and the bandwidth of the channel. After each packet has been completely received, the NIC takes some time to process the packet at the MAC level and then sends an acknowledge back to the AP through the wireless channel. After the acknowledge packet has been sent, the NIC goes back to the waiting state until the next packet arrives. The timing diagram of the wireless channel is shown in Figure 3. The state diagram that represents the behavior of the card in power save mode is shown in Figure 2.b. For readability, states are clustered into two subsets labeled Idle and Busy. The Idle part of the graph describes the behavior of the card when no traffic is received. The card stays in a low-power sleep state until a given timeout expires. Then it wakes up and waits for the beacon. We call beacon1 (beacon0) a beacon if its TIM indicates that there is (there is not) buffered traffic for the card. If a beacon0 is received, the card goes back to sleep as soon as the beacon has been completely received, otherwise it stays awake and enters the Busy subgraph to get the buffered backlog from the base station. To get the buffered packets the card enters a 5-state loop that entails: sending a polling frame (Poll), waiting for a packet (Wait), receiving a packet (receive w ack), sending a positive acknowledge (Ack) and preparing a new polling frame (Process). Each packet contains a more bit telling the card whether there are additional buffered frames (more=1) or not (more=0). 
The card exits the loop as soon as the last packet (i.e., a packet with more=0) is received. Notice that, when in the wait state, the card is also sensitive to broadcast and multicast packets that do not require any acknowledge. We use the same modeling strategy used for the ON-mode to model the reaction of the card to packets that do not require a positive acknowledge. Timing and power values characterized for each state of the NIC are reported in Table 1. Power Manager. The power manager represents the model of the actual implementation of the power management protocol of the 802.11b standard [9]. Based on external settings (that can be decided by the user), the power manager generates output events (ShutDown and WakeUp) to notify the beginning and ending of sleeping periods. After the card has been woken-up, it starts to receive packets accumulated by the AP. After each received packet, the power manager resets a timeout counter. If the timeout expires before the reception of a new packet, the card is put again in the idle state by means of a ShutDown event.
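Given a per-state characterization like that of Table 1, a GSMP-style simulation accumulates energy as power times state residency. The sketch below uses the table's measured values illustratively, treating the V readings as relative power and the T values as per-visit durations (the excerpt does not state the units or the sense-resistor conversion, so the result is in relative units only):

```python
# Per-state measured values (relative power units), from Table 1
STATE_POWER = {"Poll": 0.0781, "Wait": 0.0527, "Receive": 0.0607,
               "Ack": 0.0842, "Process": 0.0573}

def visit_energy(state_sequence):
    """Energy of a sequence of state visits: sum of power * residency
    over the visited states (relative units)."""
    return sum(STATE_POWER[state] * dt for state, dt in state_sequence)

# One iteration of the PSP polling loop: Poll -> Wait -> Receive -> Ack -> Process,
# with the mean residencies reported in Table 1
loop = [("Poll", 0.343), ("Wait", 1.167), ("Receive", 0.720),
        ("Ack", 0.315), ("Process", 0.288)]
e = visit_energy(loop)
```

Multiplying by the number of loop iterations per backlog gives the polling overhead the paper attributes to MAC-level PM, which is why fewer, longer bursts reduce per-packet energy.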

3. POWER-AWARE SCHEDULING STRATEGIES

In this section we present two scheduling policies that exploit the application-level DPM support described in the previous section to save power. We consider multiple clients independently accessing the same video server.

Figure 2: State diagram of the WNIC working either in ON mode (a) or in PSP mode (b).

3.1 Closed-Loop JLQ

The first policy we implemented is a client-driven DPM technique based on client buffers with high-water-mark (HWM) and low-water-mark (LWM) thresholds. The server application refills the client buffer until the HWM is reached. The client explicitly notifies the server when the threshold is reached and prepares to shut down the WNIC. When the server receives the HWM notification it stops sending packets to the client. The client keeps receiving incoming packets possibly buffered by the AP, and puts the WNIC into the radio-off state as soon as a given timeout has elapsed since the last received packet. Notice that the HWM is significantly lower than the maximum buffer size, in order to avoid buffer overflow caused by packets in the AP queue. While the WNIC is turned off, the video decoder consumes frames accumulated in the client-side buffer, until the number of buffered frames reaches the LWM. At this point the client wakes up the card and notifies the LWM to the server. The LWM threshold is set to a conservative value, taking into account that the client keeps consuming buffered frames during wake-up, while the card is still unable to receive new packets. The effectiveness of the policy depends on the distance between HWM and LWM and on the production rate of the server.

In the case of multiple clients, each client behaves exactly as if there were no other clients. With respect to the single-client situation, the only difference perceived by the client is the possibility of the WNIC being turned off (according to the given timeout) even if the buffer level is below the HWM. This may happen when the server is sending packets to other clients. LWM and HWM notifications are used by the server to implement an energy-aware packet scheduling policy. When a client starts, its buffer is empty and it sends a LWM notification. The server starts servicing the first client that notifies a LWM situation. It then keeps sending packets to the same client until either the HWM is reached by its buffer, or another client notifies a LWM situation. If the HWM is reached when no other clients have buffer levels below the LWM, the server stops sending packets until the next LWM notification. If, instead, a LWM notification is received, the server stops sending packets to the previous client and starts servicing the new one. In practice, the scheduling algorithm tends to provide each client with the longest bursts of packets compatible with the frame buffers and with the real-time constraints of all other clients. Since incoming packets tend to join the longest queue until a HWM notification is received by the server, we denote the scheduling algorithm JLQ, as opposed to JSQ. The worst-case scenario for our algorithm is that of multiple clients simultaneously starting to download streaming data. In this case, in fact, all clients send LWM notifications and need to receive packets as soon as possible. Since concurrent LWM notifications are handled in a first-come-first-served way, at the beginning the server provides a packet to each client, acting as a round-robin arbiter.
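The client-side watermark behavior just described can be sketched as a small state machine. This is a simplification with hypothetical names: the idle timeout before shut-down is folded into the HWM branch, and wake-up latency is ignored (the paper absorbs it into the conservative LWM setting):

```python
class ClosedLoopClient:
    """Client-side buffer-level logic for the closed-loop JLQ policy:
    notify HWM to stop the server, notify LWM to request a refill."""
    def __init__(self, hwm, lwm):
        self.hwm, self.lwm = hwm, lwm
        self.level = 0            # packets currently buffered
        self.nic_on = True

    def on_packet(self):
        """Called on each received packet; returns 'HWM' when the server
        should stop sending and the NIC will be shut down."""
        self.level += 1
        if self.level >= self.hwm:
            self.nic_on = False   # radio-off (after the idle timeout)
            return "HWM"
        return None

    def on_frame_consumed(self):
        """Called by the decoder; returns 'LWM' when the card must be
        woken up and a refill requested from the server."""
        self.level -= 1
        if self.level <= self.lwm and not self.nic_on:
            self.nic_on = True    # wake up and notify the server
            return "LWM"
        return None

c = ClosedLoopClient(hwm=3, lwm=1)
events = [c.on_packet() for _ in range(3)]   # third packet hits the HWM
```

On the server side, the corresponding JLQ scheduler simply keeps serving the notifying client until an HWM arrives or another client's LWM preempts it.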

3.2 Open-Loop JLQ

An alternative to receiving feedback from the client about its queue level is to let the server predict when the client buffer will be empty, exploiting its knowledge of the workload. This leads to an open-loop policy, where the server decides when to shut off the card and for how long [2]. The server schedules the transmission to each client in bursts, so as to create idle periods long enough to exploit the radio-off state of the WNIC. The wireless card of the client is switched off once the server has sent a burst of data that will keep the client application busy until the next communication burst. Burst size and idle time between bursts are decided by the server in order to fully exploit the application buffers while preventing overflows and frame misses.

The upper bound for the burst size is given by the client-side application buffer, while the maximum idle time following a burst depends on the time taken by the client to consume all packets in the burst. Since the number of packets per frame is not constant, while the frame rate is fixed, the actual consumption time may vary from burst to burst even if all bursts have the same size. On the other hand, the actual size of the application buffer is a theoretical bound that cannot be reached in practice, since a mispredicted latency could give rise to a buffer overflow. Similarly, the WNIC cannot stay idle until all buffered packets have been consumed, since the wake-up and transmission latency would cause a frame miss. In practice, the burst size is decided based on the trade-off between energy savings (the larger the bursts, the larger the savings) and quality of service (the shorter the bursts, the lower the risk of data overflow). Once the burst size has been decided based on the above considerations, for our video streaming application the delay between bursts can be directly computed by the server, exploiting its knowledge of frame packetization. The server transmits a burst of data to the client preceded by a special control packet containing the number of packets in the burst and the time at which the next burst will be transmitted. The client reacts to the special packet by counting the received packets and by setting a timeout Dclient to be used to wake up in time to receive the next burst. Once the total number of packets in the burst has been received, the client shuts down the WNIC, which will be preemptively turned on when the timeout expires. Since the wake-up time is conservatively accounted for when setting the timeout, no performance penalty is incurred. In a multi-client environment, the server must send control packets to all clients. Clients are served in earliest-deadline-first (EDF) order, the deadline being the estimated buffer depletion time. A round-robin scheme is used if two or more clients have the same deadline.
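A sketch of the server-side open-loop planning, using the quantities defined in the Parameter Setting subsection (conservative burst size Burst_size = Buffer_size - Ncushion, burst consumption time Tcons from the frame rate, and client wake-up timeout Dclient = Tcons - Toff-on); the function name and example values are ours:

```python
def plan_burst(buffer_size, n_cushion, frames_in_burst, frame_rate, t_off_on):
    """Open-loop planning: conservative burst size (packets), exact
    consumption time of the burst (s), and the client-side wake-up
    timeout Dclient (s) that hides the radio-off wake-up latency."""
    burst_size = buffer_size - n_cushion   # leave slack against overflow
    t_cons = frames_in_burst / frame_rate  # time for the client to play the burst
    d_client = t_cons - t_off_on           # wake up early enough for the next burst
    return burst_size, t_cons, d_client

# e.g. 50-packet buffer, 10-packet cushion, 30 frames at 15 fps, 0.3 s wake-up
size, t_cons, d_client = plan_burst(50, 10, 30, 15.0, 0.3)
```

Both burst_size and t_cons would be shipped to the client in the special control packet that precedes each burst.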

3.3 Open-Loop JLQ with MAC-level PM In the previous section we described how the server knowledge of workload can be exploited to shut-off the radio of the NIC by organizing the traffic in bursts, in order to compensate for the re-association delay. An alternative to this approach is to use doze mode instead of completely shutting

43

1000

Open loop. The key parameter of the open loop policy is the burst size that is decided based on the size of the buffer and affects the sleep time. First, a conservative burst size is computed as Burst size = Buf f er size − Ncushion , where Ncushion is needed to avoid buffer overflow. Then, the complete knowledge of the packets in the burst is exploited by the server to compute the time needed by the client to consume all packets in the burst. This time, denoted as Tcons is exactly computed for each burst based on the frame rate and on the size of the frames in the burst. Both the number of packets and the consumption time are sent to the client within the control packet. The timeout to be used to preemptively wakeup the WNIC is then computed by the client as

# of frames

800 a) Benchmark1

600 400 200 0 2500

# of frames

b) Benchmark2 2000 1500 1000 500 0

0

1

2

3 4 # of packets per frame

5

6

7

Figure 4: Packet histograms for each benchmark

Dclient = Tcons − Tof f −on

down the card, while keeping the same scheduling strategy. More precisely, we enable the MAC-level power management of the card in all the clients while the server schedules the packets in bursts as in the open-loop JLQ approach. From an energy perspective, the advantage of this solution is to reduce the wake-up delay and energy consumption. On the other hand, more energy and time is spent for transmitting bursts, due to the polling frame mechanism, and more energy is consumed during idle periods.

3.4 Parameter Setting

Low-power scheduling policies are characterized by several parameters whose settings need to be discussed, since they are affected by network and workload conditions and by buffer size.

Closed loop. The closed-loop policy depends on low and high watermarks, which must be set taking into account transition delays and network conditions. To handle the dependency on network conditions, a cushion value is added to the LWM computation, so that the LWM can be computed as follows:

LWM = (Toff-on + Tcushion) · λs    (1)

where Toff-on is the wakeup time of the WNIC from the radio-off state and Tcushion is a "cushion" time used to prevent the buffer from emptying completely because of network uncertainties. A 10% cushion time was used in our experiments. λs is the average packet consumption rate at the client side, computed from the frame rate and the average number of packets per frame (i.e., the frame size):

λs = frame_rate · avg_packets_per_frame

Figure 4 shows the distribution of the number of packets per frame for two different video streams. The first refers to an almost static video sequence, a video conference with a frame rate of 15 frames/sec; the second refers to a fireworks video, characterized by sharp picture changes, with a frame rate of 30 frames/sec.

Tcushion is also used for the HWM, which is computed as follows:

HWM = Buffer_size − (Tmessage + Tcushion) · (λa − λs)    (2)

where λa is the arrival rate and Tmessage is the time needed for the client to send a HWM notification message. In fact, the HWM must be set so as to leave enough empty slots in the buffer to accommodate the network packets received during HWM notification.

3.5 Fairness Considerations

The proposed scheduling policies are not fair on a small time scale, in that they tend to keep serving the same client as long as possible. However, they can be considered fair over a time scale larger than the time needed to refill the buffers of all the clients, which is usually small since the policy aims at minimizing the activity period of the WNICs. On the other hand, fairness is not an issue as long as real-time constraints are satisfied for all clients.

3.6 Implementation Issues

In our simulations we assumed streaming video to be the only application running on each client and the three video streams to be the only traffic through the access point. In practice this may not be the case. The effectiveness of the proposed DPM strategies can be limited both by other applications running on the same hardware and by other WNICs connected to the same access point. First, both additional applications and additional clients may use shared communication resources (wireless link, access point), thus limiting the effective bandwidth and the benefit of increasing the production rate. Second, other applications may prevent the WNIC from being shut down: the OS support for WNIC shutdown can take into account the communication needs of all applications before serving a shutdown request [2]. The proposed policies add two kinds of overhead, related to scheduling decisions and to the communication of additional packets (closed-loop only). While the second overhead is taken into account in our simulations, the first one is not. However, the scheduling decision overhead can be considered negligible, as it does not require complex computations.

4. SIMULATION RESULTS

4.1 Comparative Analysis of Scheduling Strategies

The results reported in this section refer to a single server accessed by 3 independent clients. The energy-aware scheduling strategies we compare are: JSQ scheduling combined with MAC-level DPM (JSQ-MAC), closed-loop and open-loop JLQ scheduling policies with software-controlled shutdown of the WNIC (JLQ-CL, JLQ-OL), and open-loop JLQ with MAC-level power management (JLQ-OL-MAC). We used as a benchmark a 10-sec MPEG4 video stream characterized by a frame rate of 15 frames/sec. The frames have variable length, depending on their complexity, so that
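As a concrete illustration, the watermark computations of Eqs. (1) and (2) can be sketched as follows. This is a minimal sketch: the 10% cushion and the 1.2 packets/frame average are taken from the paper, while the wakeup time, buffer size, notification time, and arrival rate below are hypothetical values, not the measured card characteristics.

```python
# Sketch of the closed-loop watermark computation (Eqs. 1 and 2).
# Parameter values marked "assumed" are illustrative, not measured.

def consumption_rate(frame_rate, avg_packets_per_frame):
    """Average packet consumption rate lambda_s at the client (packets/sec)."""
    return frame_rate * avg_packets_per_frame

def low_watermark(t_off_on, t_cushion, lambda_s):
    """Eq. (1): LWM = (T_off-on + T_cushion) * lambda_s."""
    return (t_off_on + t_cushion) * lambda_s

def high_watermark(buffer_size, t_message, t_cushion, lambda_a, lambda_s):
    """Eq. (2): HWM = Buffer_size - (T_message + T_cushion) * (lambda_a - lambda_s)."""
    return buffer_size - (t_message + t_cushion) * (lambda_a - lambda_s)

lam_s = consumption_rate(15, 1.2)    # 15 frames/sec, 1.2 packets/frame -> 18 pkt/s
t_off_on = 0.3                       # assumed WNIC wakeup time (sec)
t_cushion = 0.1 * t_off_on           # 10% cushion, as in the paper
lwm = low_watermark(t_off_on, t_cushion, lam_s)
hwm = high_watermark(buffer_size=50, t_message=0.05,   # assumed values
                     t_cushion=t_cushion, lambda_a=300, lambda_s=lam_s)
print(round(lwm, 2), round(hwm, 2))
```

The cushion simply inflates the wakeup latency before multiplying by the drain rate, so a slower card or a faster consumer both push the LWM up, as the text requires.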


Figure 5: Comparison among DPM policies

Figure 6: Energy-efficiency ranges of the DPM policies (bar-graph summary of Figure 5)

each frame is transmitted using one or more packets. The average number of packets per frame is 1.2. Actual traces were used to provide a realistic workload to our simulations. Energy efficiency is always reported in terms of energy per packet received by each NIC.

Comparative results are reported in Figure 5: the energy efficiency (defined as the average energy consumed by the client's NIC to receive a packet) is plotted as a function of the production rate. Notice that each scheduling strategy depends on tuning parameters that impact its energy efficiency: HWM and LWM for JLQ-CL, burst size for JLQ-OL, and sleep period for MAC-level PM. The results reported in Figure 5 were obtained using HWM=30, LWM=10, bursts of 40 packets, and sleep periods of 100ms and 200ms. The criteria for choosing the HWM, LWM, and burst size are discussed below, while for the sleep periods we simulated the only values actually supported by the CISCO card. Notice that JSQ scheduling requires the production rate to be adapted to the overall consumer rate, giving rise to a single point (for each value of the sleep period) in the plot of Figure 5. We use horizontal segments to represent JSQ results in order to ease the comparison with the other policies. Solid lines represent results obtained for JLQ scheduling strategies. Error bars are associated with each point to show the difference between the average energy per packet of the three WNICs. Interestingly, for all JLQ scheduling algorithms the three cards have almost the same energy efficiency, and the curve almost coincides with that obtained with a single client (not shown in the figure because it is almost indistinguishable from the 3-client curve), demonstrating the good scalability of the proposed approach. As expected, energy efficiency increases as a function of the production rate. This is due to the reduced burst duration, which reduces the energy consumption for a fixed number of transmitted packets.
On the other hand, the wireless link has a limited bandwidth that becomes the bottleneck at higher production rates. This is why the energy efficiency no longer improves for production rates above 300 packets per second. First of all, we observe that the energy per packet provided by JLQ policies is always lower than that provided by JSQ policies. The only exception is JLQ-CL, which is less energy efficient at very low production rates but becomes much more efficient at higher rates. JLQ-OL achieves the best energy efficiency at all production rates, although JLQ-CL reaches very similar results above 300 packets per second. In fact, the open-loop approach explicitly shuts off the WNIC immediately after the end of each burst, without waiting for a timeout to expire as in the closed-loop case.
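The qualitative advantage of immediate shutdown over timeout-based shutdown can be illustrated with a simple per-refill-cycle energy model. This is only a sketch: the power levels, transition cost, and channel rate below are assumed round numbers, not the measured characteristics of the CISCO card used in the paper.

```python
# Per-refill-cycle energy model for the client WNIC (illustrative numbers).
# Open loop: the card is shut off immediately after the burst.
# Closed loop: the card stays idle for a timeout before shutting off.

P_RX, P_IDLE, P_OFF = 1.4, 0.9, 0.05   # Watts (assumed)
E_TRANSITION = 0.2                      # Joules per off/on cycle (assumed)
CHANNEL_RATE = 500.0                    # packets/sec over the wireless link (assumed)

def energy_per_packet(burst, cycle_time, idle_timeout):
    """Average NIC energy per received packet over one refill cycle."""
    t_rx = burst / CHANNEL_RATE                   # time spent receiving the burst
    t_off = max(cycle_time - t_rx - idle_timeout, 0.0)  # time spent radio-off
    energy = P_RX * t_rx + P_IDLE * idle_timeout + P_OFF * t_off + E_TRANSITION
    return energy / burst

cycle = 40 / 18.0   # cycle length: a 40-packet burst drained at ~18 packets/sec
open_loop = energy_per_packet(40, cycle, idle_timeout=0.0)
closed_loop = energy_per_packet(40, cycle, idle_timeout=0.3)
assert open_loop < closed_loop   # immediate shutdown avoids the idle-timeout energy
```

Under this model the gap between the two policies is exactly the idle energy burned while waiting for the timeout, which matches the explanation given above for the residual advantage of JLQ-OL over JLQ-CL.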

Open-loop JLQ combined with MAC-level DPM (JLQ-OL-MAC) is less efficient than JLQ-OL because of the additional energy per packet due to the polling frames. As expected, the marginal effect of the polling frames is higher at higher production rates. To appreciate the effectiveness of the proposed techniques, it is worth mentioning that the average energy per received packet spent by the same NIC without any DPM strategy would be more than 23 mJ. The results of Figure 5 are summarized by the bar graph of Figure 6. For each policy, the shaded part of the bar represents the range of energy efficiencies obtained while varying the production rate from 50 to 500 packets per second. It can be observed that JLQ-OL is more energy efficient than JSQ-MAC even under worst-case conditions.

4.2 Sensitivity Analysis

Both closed-loop and open-loop JLQ policies have parameters to be tuned based on the size of the available buffers and on quality-of-service and energy-efficiency targets. In general, energy efficiency depends on the burst size, which is directly controlled in the open-loop policy, while in the closed-loop policy it depends on the difference between HWM and LWM. In both cases, the burst size is limited by the buffer size and by quality-of-service constraints. The dependence of energy per packet on burst size is plotted in Figure 7. The x-axis represents the HWM for JLQ-CL (with the LWM fixed at 5) and the burst size for JLQ-OL. As expected, the higher the HWM/burst size, the greater the energy reduction. The white circles denote the settings used in our simulations, since they provide sizeable energy savings while keeping memory requirements small. For our experiments we used a burst size of 40 packets, conservatively smaller than the AP buffer size, in order to be able to raise the production rate above the channel bandwidth and to guarantee scalability.
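The trend in Figure 7 follows from amortizing the fixed shutdown/wakeup cost over the burst: each burst pays one transition cost, so the per-packet overhead shrinks as the burst grows. A minimal sketch of this relation, with assumed (not measured) cost constants:

```python
# Amortization of the fixed transition cost over the burst size.
# E_FIXED is the per-cycle shutdown/wakeup energy, E_RX the per-packet
# reception energy; both values are illustrative assumptions.

E_FIXED = 0.25   # Joules per shutdown/wakeup cycle (assumed)
E_RX = 0.003     # Joules to receive one packet (assumed)

def energy_per_packet(burst_size):
    """Per-packet energy when one transition cost is paid per burst."""
    return E_RX + E_FIXED / burst_size

# Larger bursts amortize the transition cost better:
curve = [energy_per_packet(b) for b in (10, 20, 40, 80)]
assert curve == sorted(curve, reverse=True)  # monotonically decreasing
```

The curve flattens toward E_RX for large bursts, which is consistent with the diminishing returns visible in Figure 7 and with the buffer-size and QoS limits that cap the useful burst size.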

4.3 QoS Considerations

Scheduling policies impact quality of service in that they affect the probability of frame misses due to buffer overflow or deadline misses. Buffer overflow may occur when the burst transmitted by the server exceeds the available space in the AP or client buffer. To reduce the risk of overflow, the HWM and burst size must be set conservatively lower than the maximum buffer size, in order to compensate for traffic fluctuations. Deadline misses, on the other hand, may occur if the client buffer is empty when the client requests a new frame. To avoid this situation, a minimum number of packets must always be kept in the buffer in nominal conditions. In the closed-loop policy this is done by setting the LWM, while in the open-loop policy it is done by waiting an initial latency before rendering the first frame. Traffic fluctuations may be due both to the non-ideality of the wireless link and to the non-uniform frame complexity. In the case of an ideal error-less link, a LWM of 5 (or a latency of 100ms) is sufficient to avoid any frame miss possibly caused by the variable efficiency of MPEG4 encoding. Higher values of the LWM (and latency) are required to compensate for channel non-ideality, as discussed in the following subsection.

Figure 7: Energy per packet provided by JLQ-CL and JLQ-OL as a function of HWM and burst size, respectively.

4.4 Channel Error Effects

The rationale behind JLQ policies is to exploit buffer space at the client side to create opportunities to shut down the WNIC by means of traffic reshaping. However, if the buffers were designed to improve system robustness against bad network conditions, the application of the proposed policies may increase the frame-miss probability (impairing the quality of service). For a fixed buffer size, there is a trade-off between energy efficiency and quality of service. On the other hand, buffers could be designed to satisfy both quality-of-service and energy-efficiency requirements. We tested our system under lossy channel conditions by simulating MAC-level packet loss probabilities of up to 30%. A packet loss causes a MAC-level retransmission: a retransmission happens if a MAC-level ACK for a transmitted packet is not received by the access point within a timeout. The maximum number of MAC retries supported by CISCO cards is 16. Hence, MAC-level errors are seen at the application level as transmission delays. Our experiments showed that error rates up to 30% could be tolerated without causing any frame miss by using LWM=10 or a playback latency of 200ms.

5. CONCLUSION

In this paper we have presented power-aware application-level scheduling policies for streaming data over a wireless network. Comparative experimental results, obtained from simulation models validated against real-world measurements, show that application-level scheduling provides power savings much larger than those provided by MAC-level Dynamic Power Management (DPM) combined with traditional performance-oriented scheduling. Our experiments, conducted on a system composed of three independent clients concurrently accessing the same MPEG4 server, show that the policy parameters can be set so as to achieve energy savings of up to 75% without any frame miss, even under bad channel conditions.

6. REFERENCES

[1] A. Acquaviva, E. Lattanzi, A. Bogliolo, and L. Benini. A simulation model for streaming applications over a power-manageable wireless link. In Proc. of the European Simulation and Modelling Conference (ESMC), Oct 2003.
[2] A. Acquaviva, T. Simunic, V. Deolalikar, and S. Roy. Remote power control of wireless network interfaces. In Proc. of Power and Timing Modeling, Optimization and Simulation (PATMOS), pages 369–378, Sept 2003.
[3] D. Bertozzi, A. Raghunathan, L. Benini, and S. Ravi. Transport protocol optimization for energy efficient wireless embedded systems. In Proc. of the Design, Automation and Test in Europe Conference (DATE), pages 706–711, Mar 2003.
[4] C.-C. C. et al. A new statistical wideband spatio-temporal channel model for 5-GHz band WLAN systems. IEEE Journal on Selected Areas in Communications, 21(2):139–150, Feb 2003.
[5] P. W. Glynn. A GSMP formalism for discrete event systems. Proceedings of the IEEE, 77(1):14–23, 1989.
[6] D. Hayes, M. Rumsewicz, and L. Andrew. Quality of service driven packet scheduling disciplines for real-time applications: Looking beyond fairness. In Proc. of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), pages 402–412, Mar 1999.
[7] R. Krashinsky and H. Balakrishnan. Minimizing energy for wireless web access with bounded slowdown. In Proc. of the Mobile Computing and Networking Conference (MobiCom), pages 119–130, Sep 2002.
[8] X. Liu, Y. Xiang, and T. J. Li. Counter based routing policies. In Proc. of the High Performance Computing Conference, pages 389–393, Dec 1999.
[9] LAN/MAN Standards Committee of the IEEE Computer Society. Part 11: Wireless LAN MAC and PHY specifications: Higher-speed physical layer extension in the 2.4 GHz band. IEEE, 1999.
[10] V. Raghunathan, S. Ganeriwal, C. Schurgers, and M. Srivastava. E2WFQ: An energy efficient fair scheduling policy for wireless systems. In Proc. of the International Symposium on Low Power Electronics and Design (ISLPED), pages 30–35, Aug 2002.
[11] H. Shi and H. Sethu. Scheduling real-time traffic under controlled load service in an integrated services Internet. In Proc. of the Workshop on High Performance Switching and Routing, pages 11–15, May 2001.
[12] K. Sivalingam, J. Chen, P. Agrawal, and M. Srivastava. Design and analysis of low-power access protocols for wireless and mobile ATM networks. Wireless Networks, 6:73–87, 2000.
[13] H. Wan and X. Lin. Multiple priorities QoS scheduling for simultaneous video transmissions. In Proc. of the International Symposium on Multimedia Software Engineering, pages 135–141, Dec 2000.
[14] M. Y. Wu, S. Ma, and W. Shu. Scheduled video delivery for scalable on-demand service. In Proc. of the 12th International Workshop on Network and Operating Systems Support for Digital Audio and Video, pages 167–175, May 2002.

