RATE CONTROL AND STREAM ADAPTATION ... - Stanford University

RATE CONTROL AND STREAM ADAPTATION FOR SCALABLE VIDEO STREAMING OVER MULTIPLE ACCESS NETWORKS

Cheng-Hsin Hsu, Nikolaos M. Freris, Jatinder Pal Singh

Xiaoqing Zhu

Deutsche Telekom R&D Laboratories USA 5050 El Camino Real 221 Los Altos, CA 94022

Cisco Systems, Inc. 425 East Tasman Drive San Jose, CA 95134

ABSTRACT

In a multihomed video streaming system, a video sequence is simultaneously transmitted over multiple access networks to a client. In this paper, we formulate the rate control and stream adaptation problems into a unified optimization problem, which determines the sending rates of individual networks, selects which video packets to transmit, and assigns each packet to an access network. We propose two heuristic algorithms with a trade-off between optimality and computational complexity. One of the proposed algorithms runs faster, while the other one results in better video quality. We propose a hybrid algorithm that demonstrates a good balance between optimality and computational complexity. We conduct extensive packet-level simulations to evaluate our algorithms using real network conditions and actual scalable video streams. We compare our algorithms against the rate control algorithms defined in the Datagram Congestion Control Protocol (DCCP) standard. The simulation results show that our algorithms significantly outperform current systems while being TCP-friendly. Our algorithms achieve at least 10 dB quality improvement over DCCP and result in up to 83% packet delivery delay reduction.

Index Terms: Scalable video streaming, distortion minimization, rate allocation, stream adaptation

1. INTRODUCTION

An increasing number of fixed and mobile Internet devices have access to multiple networks. Multihomed systems [1, 2] concurrently utilize multiple access networks for higher aggregate bandwidth, better load balancing, more pervasive connectivity, improved error resilience, and lower network latency [3]. These benefits are crucial for both end users and service providers. In particular, end users who exclusively use 3G data networks may suffer from insufficient network capacity [4], whereas those who only connect to wireless local-area networks (WLANs) may suffer from frequent disconnections, as each access point only covers a small area. Furthermore, a service provider may save its transit cost by offloading Internet traffic to other access networks. In fact, several US mobile service providers recently report a staggering 50-fold increase in data traffic due to smartphone users [5, 6]. Since increasingly more cellular data plans are becoming flat-rate, these mobile service providers need to offload the Internet traffic over WLANs to remain profitable.

Video streaming has high bandwidth and stringent delay requirements, and can greatly benefit from multihoming. In a multihomed streaming system, the server needs to carefully choose the streaming rates: choosing a low rate may result in under-utilization of the access network, while selecting a high rate may lead to network congestion and video packets missing their playout deadlines. Hence, effective rate control for a good trade-off between the achieved throughput and experienced delay is important.

Once the streaming rates are determined, the server needs to convert the video stream into a format that can be delivered to the client on time. We refer to this conversion as stream adaptation, which is traditionally implemented via computationally intensive transcoding [7], which does not scale well. In contrast, scalable video coding [8] supports flexible stream adaptation, and further enables service providers to save the cost of streaming servers and transcoders. Although scalable video coding comes with a small coding inefficiency, the recent H.264/SVC standard is reported to be quite efficient [8]; it even outperforms some nonscalable coders, such as MPEG-4 Advanced Simple Profile (ASP) [9].

We study scalable video streaming and consider the problem of determining the portions of a scalable video stream that should be sent over individual access networks under several constraints, including the available bit rate (ABR) and round-trip time (RTT) of each access network, and the rate-distortion (R-D) functions of individual video frames. Our contributions can be summarized as follows:

• We formulate the rate control and stream adaptation problems over multiple access networks into a unified optimization problem. Solving this problem jointly: (i) determines sending rates of individual access networks, (ii) selects video packets to transmit, and (iii) assigns each packet to an access network, in order to maximize video quality at a client.

• We propose two heuristic algorithms to solve our problem, which provide a trade-off between optimality and computational complexity. We present a hybrid algorithm for a good balance of this trade-off.

• We conduct extensive trace-driven, packet-level simulations using actual network conditions and real scalable streams. Our simulation results show that our proposed algorithms outperform the rate control algorithms defined in the Datagram Congestion Control Protocol (DCCP) standard [10].

Fig. 1. Multihomed scalable video streaming. [Diagram: a server streams over the Internet to a client through three access networks (WiFi, Ethernet, and 3G cellular) at rates r1, r2, and r3.]

The rest of this paper is organized as follows. In Sec. 2, we present related work. We present the problem formulation in Sec. 3. In Sec. 4, we propose and analyze three heuristic algorithms. The proposed algorithms are evaluated in Sec. 5. Sec. 6 concludes the paper.

Fig. 2. SVC prediction structure.

2. RELATED WORK

Rate control of nonscalable video streams for multihomed clients has been investigated in [1, 2, 11]: Singh et al. [11] propose a solution based on stochastic control of Markov Decision Processes, Alpcan et al. [2] give a solution based on H∞-optimal control of linear dynamic systems, and Zhu et al. [1] present a solution based on convex optimization. Efficient stream adaptation using scalable streams has also been studied [12–15]: Hefeeda and Hsu [12] consider the stream adaptation problem of a Fine-Grained Scalable (FGS) stream between one receiver and multiple senders. Amonou et al. [13] study the problem of prioritizing video packets for H.264/SVC streams. They empirically calculate the distortion impact of dropping each video packet, and assign higher priority to video packets with higher impact values. Mansour et al. [14] study the stream adaptation problem in a single-hop wireless network, where the receivers share a given network capacity for receiving FGS streams from the base station. Sun et al. [15] propose an R-D model for FGS streams coded by H.264/SVC, which is based on a generalized Gaussian distribution source model.

To the best of our knowledge, our work is the first that simultaneously considers end-to-end rate control and scalable stream adaptation for multihomed clients. Previous works either consider nonscalable video streaming [1, 2, 11], or concentrate on scalable stream adaptation without accounting for diverse and dynamic network conditions [12–15].

3. PROBLEM FORMULATION

3.1. Heterogeneous Access Networks

We consider the multihomed video streaming problem, in which a streaming server sends video data over $N$ heterogeneous and time-varying access networks to a client.¹ Fig. 1 shows an example for $N = 3$. The network condition of each access network $n$ ($1 \le n \le N$) is described by its available bit rate (ABR) $c_n$ and round-trip time (RTT) $\tau_n$, which are periodically measured using a lightweight measurement tool, such as Abing [16]. Based on network conditions, the server splits the video stream into $N$ transport streams, and transmits transport stream $n$ over access network $n$. We let $r_n$ be the transport stream rate over access network $n$, and $r := \sum_{n=1}^{N} r_n$ be the total video streaming rate. Operating a network at a rate $r_n$ close to its available bit rate $c_n$ leads to higher delays and to packets being dropped because of network congestion. Since the available bit rate $c_n$ varies with time, rate control is important for timely delivery of video packets. We take the one-way delay $t_n$ as half of the RTT $\tau_n$, i.e., $t_n := \tau_n / 2$. Following the derivation in [1, 17], we relate the one-way delay $t_n$ over access network $n$ and its remaining bandwidth $c_n - r_n$ as:

\[ t_n = \frac{\alpha_n}{c_n - r_n}, \tag{1} \]

where parameter $\alpha_n$ is estimated from past observations of the RTT $\tau_n$ and the residual bandwidth $c_n - r_n$ by linear regression.

¹ Scalable video streaming over a single access network is a special case of our analysis with $N = 1$.

Let $p_n$ be the packet loss probability over access network $n$, which corresponds to random losses and packets missing their playout deadlines. We assume that packet losses are statistically independent across different access networks and across different transmissions, and we write $p_n = g_n(c_n, r_n)$, where $g_n(\cdot, \cdot)$ models the packet loss rate. We assume that $g_n(\cdot, \cdot)$ is increasing in $r_n$ and decreasing in $c_n$. By considering a general $g_n(\cdot, \cdot)$ function we can accommodate various queueing models [18], e.g., the M/M/1 queueing model or finite-length queueing models such as M/M/1/K and G/D/1/K. Real networks are far more complicated than a single-hop queueing model, since streaming takes place over multiple hops and varying routes. Nevertheless, the M/M/1 model has been observed to yield a good approximation in previous work [1, 17], and our simulation results (Sec. 5) also confirm that it is quite effective and leads to significant performance improvements. Therefore, we consider the M/M/1 model in the rest of the paper, but our analysis is also applicable to other network models.
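As an illustration of the delay model in (1), the following Python sketch estimates $\alpha_n$ from past measurements and predicts the one-way delay at a candidate sending rate. The paper states only that $\alpha_n$ is obtained by linear regression; the specific form used here (least squares through the origin on $t_n$ versus $1/(c_n - r_n)$) and the function names `estimate_alpha` and `one_way_delay` are assumptions made for this example, not the authors' implementation.

```python
def estimate_alpha(rtts, abrs, rates):
    """Fit alpha in Eq. (1), t = alpha / (c - r), by least squares through the origin.

    rtts  : measured round-trip times tau_n (seconds)
    abrs  : measured available bit rates c_n (bits/s)
    rates : sending rates r_n in effect at each measurement (bits/s)
    The regression form is an assumption; the paper only says "linear regression".
    """
    num = 0.0
    den = 0.0
    for tau, c, r in zip(rtts, abrs, rates):
        if c <= r:
            continue  # Eq. (1) is undefined without residual bandwidth
        x = 1.0 / (c - r)  # regressor: inverse residual bandwidth
        t = tau / 2.0      # one-way delay taken as half of the RTT
        num += x * t
        den += x * x
    if den == 0.0:
        raise ValueError("no usable samples")
    return num / den


def one_way_delay(alpha, c, r):
    """Predicted one-way delay t_n from Eq. (1) for a rate r below c."""
    if r >= c:
        raise ValueError("rate must stay below the available bit rate")
    return alpha / (c - r)
```

For instance, with $\alpha_n = 5 \times 10^4$ bits, $c_n = 1$ Mbps, and $r_n = 0.5$ Mbps, the predicted one-way delay is 0.1 s; pushing $r_n$ closer to $c_n$ makes the delay grow without bound, which is why the rate must be controlled.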

Under the assumption of the M/M/1 model, we write $p_n = g_n(c_n, r_n) = e^{-t_0/t_n}$ [18], where $t_0$ is the playout deadline and $t_n$ is the average one-way delay. The parameter $t_0$ is application-dependent. Combining this equation with (1), we get

\[ p_n = e^{-t_0 (c_n - r_n)/\alpha_n}. \tag{2} \]

3.2. Scalable Video Streams

The H.264/SVC [8] standard employs a hierarchical prediction structure among frames within the same group of pictures (GoP). That is, video frames are divided into a temporal base layer and multiple temporal enhancement layers. Frames in the temporal base layer use only the temporal base layer frame of the previous GoP for prediction, while frames in the temporal enhancement layers use the two neighboring frames in lower temporal layers for prediction. Fig. 2 illustrates this prediction structure. Each frame consists of multiple quality layers. There are two types of quality scalability: coarse-grained scalability (CGS) and medium-grained scalability (MGS). We consider MGS layers in this work.

H.264/SVC streams are divided into Network Abstraction Layer units (NALUs), and each NALU $g_{m,q}$ is identified by the frame number $m$, $1 \le m \le M$, and the quality layer $q$, $0 \le q \le Q$, where $M$ is the number of frames considered in each optimization problem and $Q + 1$ is the number of quality layers. We consider $M$ to be a multiple of the GoP size. For any frame $m$, NALU $g_{m,0}$ carries the basic quality representation, and NALU $g_{m,q}$, where $0 < q \le Q$, contains the quality enhancement for that frame. We let $s_{m,q}$ be the size of NALU $g_{m,q}$. NALU $g_{m,q}$, $1 \le q \le Q$, is decodable if and only if all NALUs of lower quality layers ($g_{m,q'}$, $q' < q$) are received on time and decodable. NALU $g_{m,0}$ is decodable if its hierarchical prediction parents are successfully received on time. We let $\mathbf{P}_m$ be the ancestor frames² of frame $m$. The two immediate parents of frame $m$ are denoted by $p_{m,1}$ and $p_{m,2}$.

With the above notation, we can describe our multihomed scalable video streaming problem as determining which NALUs to send, and associating each NALU with an access network. We let $x_{m,q,n}$ be a boolean variable indicating whether we send $g_{m,q}$ over access network $n$:

\[ x_{m,q,n} = \begin{cases} 1, & \text{if we send } g_{m,q} \text{ over network } n, \\ 0, & \text{otherwise.} \end{cases} \tag{3} \]

² In this paper, we use bold symbols to represent vectors.

We assume that the packet loss rates of access networks are sufficiently low so that sending a NALU over multiple access networks results in no advantage, while incurring higher network load due to duplicated data transfer. This assumption is reasonable because many modern access networks implement forward error correction (FEC) and automatic repeat request (ARQ) at the data-link layer. These mechanisms are not controlled by the video streaming system at the application layer. We define

\[ x_{m,q} := \sum_{n=1}^{N} x_{m,q,n}, \tag{4} \]

to be a binary variable with value 1 if NALU $g_{m,q}$ is sent over some network, and 0 otherwise.

Next, we develop our distortion model to estimate the distortion of a substream extracted from a scalable stream. We use mean square error (MSE) as the distortion metric and denote the total distortion of frame $m$ by $d_m$, which can be divided into two components: truncation distortion $e_m$ and drifting distortion $y_m$. We let $d_m = e_m + y_m$.

Truncation distortion, $e_m$, refers to the quality degradation due to dropping NALUs of frame $m$. We let $\hat{\delta}_m$ be the full-quality distortion of frame $m$, which is achieved when $g_{m,q}$ for all $q = 0, 1, \ldots, Q$ are received on time. We let $\delta_{m,q}$ ($0 \le q \le Q$) be the additional distortion introduced by dropping NALU $g_{m,q}$. Because of the dependency among MGS layers, to decode $g_{m,q}$, all NALUs $g_{m,q'}$, where $q' < q$, must have been decoded. Following the definition of (3), we can write truncation distortion as:

\[ e_m = \hat{\delta}_m + \sum_{q=0}^{Q} \left(1 - \bar{x}_{m,q}\right) \delta_{m,q}, \tag{5} \]

where

\[ \bar{x}_{m,q} := \prod_{q' \le q} x_{m,q'} = \min_{q' \le q} x_{m,q'}. \tag{6} \]

The value of $\bar{x}_{m,q}$ is equal to 1 if and only if all NALUs of frame $m$ with quality layers lower than or equal to $q$ are received.

Drifting distortion, $y_m$, refers to the distortion caused by inter-frame prediction due to imperfect reconstruction of ancestor frames in $\mathbf{P}_m$. We consider an abstract model of the form $y_m = f_m(\mathbf{e}_{\mathbf{P}_m})$, where $f_m$ is increasing in each argument on $\mathbb{R}_+^{|\mathbf{P}_m|}$. Sun et al. [15] propose to use an increasing bilinear distortion model:

\[ y_m = f_m(e_{p_{m,1}}, e_{p_{m,2}}) = \zeta_{m,0} + \zeta_{m,1} e_{p_{m,1}} + \zeta_{m,2} e_{p_{m,2}} + \eta_m e_{p_{m,1}} e_{p_{m,2}}. \tag{7} \]

In this model, $\zeta_{m,1}$ and $\zeta_{m,2}$ are nonnegative constants that are computed from the known fraction of inter-coded macroblocks in frame $m$. The parameters $\zeta_{m,0}$ and $\eta_m$ are derived from empirical data, and $\eta_m$ is assumed nonnegative. We generalize this bilinear model into a degree-2 polynomial model:

\[ y_m = f_m(e_{p_{m,1}}, e_{p_{m,2}}) = \zeta_m + \eta_{m,1} e_{p_{m,1}} + \eta_{m,2} e_{p_{m,2}} + \kappa_{m,1} e_{p_{m,1}}^2 + 2 \kappa_{m,2} e_{p_{m,1}} e_{p_{m,2}} + \kappa_{m,3} e_{p_{m,2}}^2, \tag{8} \]

where $\zeta_m$, $\eta$, and $\kappa$ are model parameters. We consider a convex increasing drifting distortion function by adding the constraints that $\eta$, $\kappa$ are nonnegative and that the matrix

\[ \begin{pmatrix} \kappa_{m,1} & \kappa_{m,2} \\ \kappa_{m,2} & \kappa_{m,3} \end{pmatrix} \]

is positive semidefinite. We call this generalized model the degree-2 polynomial model.

Liang et al. [19] suggest that the truncation error of a frame propagates to its descendants in the partial ordering of inter-frame prediction in the same GoP, in a linear fashion. Inspired by this, we propose another generalized model

\[ y_m = f_m(\mathbf{e}_{\mathbf{P}_m}) = \gamma_{m,m} + \sum_{k \in \mathbf{P}_m} \gamma_{m,k} e_k, \tag{9} \]

where $\gamma_{m,k}$ are nonnegative parameters. We refer to this model as the multi-scale linear model. In Sec. 5.2, we empirically compare the two proposed distortion models against a model in the literature.

3.3. Optimization Problem

We formulate the multihomed scalable video streaming problem as one of finding the $x_{m,q,n}$ values to maximize video quality at the client, i.e., minimize the total expected distortion, under current network conditions. Let $F$ be the frame rate in frames-per-second (fps). The average transport stream rate for network $n$, in a time interval of length $M/F$, is

\[ r_n = \frac{F}{M} \sum_{m=1}^{M} \sum_{q=0}^{Q} s_{m,q} x_{m,q,n}. \tag{10} \]

The rate can be used to estimate the packet loss probability $p_n$ over network $n$ by means of the network model (2). Under the statistical independence assumption, the expected delivery probability of NALU $g_{m,q}$, denoted (by some abuse of notation) by $x_{m,q} \in [0, 1]$, can be calculated as:

\[ x_{m,q} = \sum_{n=1}^{N} (1 - p_n) x_{m,q,n}. \tag{11} \]

In writing (11), we assume that each access network $n$ behaves like a binary channel with loss probability $p_n$ for each packet, and, for brevity, that each NALU comprises a single packet. In the light of (11), we can rewrite $\bar{x}_{m,q}$ in (6) and model the truncation distortion using (5). The joint rate control and stream adaptation problem can be written as an optimization problem of finding $\mathbf{x} := \{x_{m,q,n}\}$:

\begin{align}
\min_{\mathbf{x}} \quad & \sum_{m=1}^{M} d_m \tag{12a} \\
\text{s.t.} \quad & r_n = \frac{F}{M} \sum_{m=1}^{M} \sum_{q=0}^{Q} s_{m,q} x_{m,q,n}, \tag{12b} \\
& p_n = e^{-t_0 (c_n - r_n)/\alpha_n}, \tag{12c} \\
& x_{m,q} = \sum_{n=1}^{N} (1 - p_n) x_{m,q,n}, \tag{12d} \\
& \bar{x}_{m,q} = \prod_{q' \le q} x_{m,q'}, \tag{12e} \\
& e_m = \hat{\delta}_m + \sum_{q=0}^{Q} (1 - \bar{x}_{m,q}) \delta_{m,q}, \tag{12f} \\
& y_m = f_m(\mathbf{e}_{\mathbf{P}_m}), \tag{12g} \\
& d_m = e_m + y_m, \tag{12h} \\
& x_{m,q,n} \in \{0, 1\}, \quad m = 1, \ldots, M, \; q = 0, \ldots, Q, \; n = 1, \ldots, N, \tag{12i} \\
& \sum_{n=1}^{N} x_{m,q,n} \le 1. \tag{12j}
\end{align}

In this problem, rate control is performed through (12c). This is a form of proactive congestion control, in the sense that it seeks to avoid causing network congestion, as opposed to the responsive nature of TCP-like rate control algorithms. Assuming that $f_m$ is increasing in each argument, the objective function is increasing in $p_n$, for fixed $\mathbf{x}$. It is decreasing in $x_{m,q}$ for each $m = 1, 2, \ldots, M$ and $q = 0, 1, \ldots, Q$. The objective function is increasing in $e_m$ and $y_m$ for each $m$. Based on these properties, we can replace the equality constraints in (12c), (12d), and (12g) with $\ge$, $\le$, and $\ge$ inequality constraints, respectively. This yields an equivalent formulation with no nonlinear equality constraints. We note that (12) is an integer program [20], and its state space has a cardinality of $2^{MQN}$, which renders exhaustive search intractable for actual applications. While dynamic programming can be employed for computing the optimal solution, doing so still leads to exponential complexity and prohibitively long running time.

Remark 1. The formulation in (12) can account for the case that NALU $g_{m,q}$ comprises multiple packets, say $U_{m,q}$ packets. Let $x_{m,q,u,n}$ be 1 if the $u$-th packet of NALU $g_{m,q}$ is sent over access network $n$, and 0 otherwise. We then substitute $x_{m,q,n}$ with $x_{m,q,u,n}$ in (12i), (12j) and replace (12d), (12e) with

\[ x_{m,q,u} = \sum_{n=1}^{N} (1 - p_n) x_{m,q,u,n}, \tag{13} \]

\[ \bar{x}_{m,q} = \prod_{q' \le q} \prod_{u=1}^{U_{m,q'}} x_{m,q',u}. \tag{14} \]

Typically, $U_{m,q}$ is a function of the NALU size $s_{m,q}$. For example, for a path with maximum payload length $\theta$, the streaming server may send NALU $g_{m,q}$ with $U_{m,q} = \lceil s_{m,q} / \theta \rceil$. The algorithms proposed in Sec. 4 can be readily extended by updating the distortion model.

4. HEURISTIC ALGORITHMS

In this section, we present three heuristic algorithms.

4.1. Simple Rate-Distortion Optimization

By ignoring the drifting distortion, we propose a heuristic algorithm that first sorts NALUs $g_{m,q}$ by their importance. Then, it sequentially schedules the NALUs until the access networks are fully loaded, i.e., right before their loss probabilities exceed a desired maximal value $P_{\max}$. We call this algorithm Simple Rate-Distortion Optimization (SRDO), and we give its pseudocode in Fig. 3. The SRDO algorithm takes the maximum packet loss rate as an input. In line 2, it sorts NALUs on the ratio of potential quality improvement $\delta_{m,q}$ and size $s_{m,q}$. The for-loop between lines 3–7 iteratively finds the least loaded access network, and transmits the next unsent NALU over it. The algorithm returns in line 5 if the maximum packet loss rate is exceeded, and in line 7 if all NALUs have been sent. The computational complexity of the SRDO algorithm is dominated by the sorting in line 2, on the order of $O\big(M(Q+1) \log[M(Q+1)]\big)$.

  1. let $\mathbf{x} = \{x_{m,q,n} = 0 \mid \forall m, q, n\}$
  2. sort $g_{m,q}$ on $\delta_{m,q} / s_{m,q}$
  3. for $\hat{n} = \arg\min_{n=1}^{N} p_n$
  4.     let $g_{\hat{m},\hat{q}}$ be the next unsent NALU
  5.     if sending $g_{\hat{m},\hat{q}}$ on $\hat{n}$ causes $p_{\hat{n}} > P_{\max}$ return $\mathbf{x}$
  6.     else update $\mathbf{x}$ with $x_{\hat{m},\hat{q},\hat{n}} = 1$
  7.     if no more unsent NALU return $\mathbf{x}$

Fig. 3. Simple Rate-Distortion Optimization algorithm.

  1. let $\mathbf{x} = \{x_{m,q,n} = 0 \mid \forall m, q, n\}$
  2. forever
  3.     let $\mathbf{g}_d$ be all immediately decodable NALUs
  4.     if $\mathbf{g}_d$ is empty return $\mathbf{x}$
  5.     for $g_{m,q} \in \mathbf{g}_d$
  6.         for $n = 1$ to $N$
  7.             compute $b_{m,q,n}$ based on $\mathbf{x}$
  8.     let $b_{\hat{m},\hat{q},\hat{n}} / s_{\hat{m},\hat{q}} \ge b_{m,q,n} / s_{m,q} \;\; \forall m, q, n$
  9.     if $b_{\hat{m},\hat{q},\hat{n}} \le 0$ return $\mathbf{x}$
  10.    update $\mathbf{x}$ with $x_{\hat{m},\hat{q},\hat{n}} = 1$, update $\mathbf{g}_d$

Fig. 4. Progressive Rate-Distortion Optimization algorithm.
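The SRDO procedure of Fig. 3 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the tuple layouts for NALUs and networks, the function names `loss_prob` and `srdo`, the descending sort order on $\delta_{m,q}/s_{m,q}$, and the choice to treat the loss probability as 1 once the rate reaches the available bit rate are all assumptions made for this example. Loss probabilities follow the M/M/1 expression (2), and each NALU adds $(F/M)\, s_{m,q}$ to the rate of its network, as in (10).

```python
import math


def loss_prob(c, r, alpha, t0):
    """Eq. (2): p = exp(-t0 * (c - r) / alpha).

    Treating p as 1 when r >= c is an assumption for this sketch.
    """
    if r >= c:
        return 1.0
    return math.exp(-t0 * (c - r) / alpha)


def srdo(nalus, nets, frame_rate, num_frames, t0, p_max):
    """Greedy SRDO sketch following the steps of Fig. 3.

    nalus : list of (m, q, size_bits, delta) tuples (hypothetical layout)
    nets  : list of (c_bps, alpha) tuples, one per access network
    Returns a dict mapping (m, q) to the index of the chosen network.
    """
    rate_scale = frame_rate / num_frames  # the factor F / M in Eq. (10)
    rates = [0.0] * len(nets)             # current r_n of each network
    assignment = {}                       # line 1: nothing is sent yet
    # line 2: most distortion reduction per bit first (descending order assumed)
    order = sorted(nalus, key=lambda g: g[3] / g[2], reverse=True)
    for m, q, size, _delta in order:
        # line 3: pick the network with the lowest current loss probability
        probs = [loss_prob(c, r, a, t0) for (c, a), r in zip(nets, rates)]
        n_hat = min(range(len(nets)), key=probs.__getitem__)
        c, a = nets[n_hat]
        new_rate = rates[n_hat] + rate_scale * size
        # line 5: stop once the chosen network would exceed P_max
        if loss_prob(c, new_rate, a, t0) > p_max:
            return assignment
        # line 6: commit the NALU to network n_hat
        rates[n_hat] = new_rate
        assignment[(m, q)] = n_hat
    return assignment  # line 7: every NALU has been scheduled
```

As in the pseudocode, the sort considers only $\delta_{m,q}/s_{m,q}$ and the loop stops at the first NALU that would push the least loaded network past $P_{\max}$; a production scheduler would also need to respect the decoding dependency among quality layers.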

4.2. Progressive Rate-Distortion Optimization

The SRDO algorithm assumes the drifting distortion is insignificant, which is less accurate for videos with higher temporal correlation. We propose a new algorithm following the observation that sending one more NALU $g_{m,q}$ concurrently incurs positive and negative impacts on the total distortion. More specifically, sending NALU $g_{m,q}$ over access network $n$ leads to: (i) lower truncation distortion for frame $m$ and lower drifting distortion for its descendants, and (ii) higher distortion for frames with NALUs already assigned to access network $n$, since the packet loss rate $p_n$ is an increasing function of the network load. We let $b_{m,q,n}$ be the net distortion impact of sending NALU $g_{m,q}$ over access network $n$ on top of what has already been sent. The new algorithm is referred to as Progressive Rate-Distortion Optimization (PRDO): it follows the video dependency structure and iteratively sends more NALUs by selecting the NALU that would reduce total distortion the most. The algorithm stops if all NALUs have non-positive net distortion impact values, or if there is no unsent NALU. By leveraging the dependency structure, we can largely reduce the number of $b_{m,q,n}$ values to be computed. More precisely, our algorithm considers only the immediately decodable NALUs at each step, that is to say the NALUs with all their ancestors transmitted earlier.

We give the pseudocode of our proposed PRDO algorithm in Fig. 4. The loops starting in lines 2, 5, and 6 have at most $M(Q+1)$, $M$, and $N$ iterations, respectively. Furthermore, line 7 can be computed by scanning through all $M(Q+1)N$ NALUs only once. Hence, the PRDO algorithm has a polynomial time complexity of $O\big(M^3 (Q+1)^2 N^2\big)$.

4.3. Hybrid Rate-Distortion Optimization

We propose a Hybrid Rate-Distortion Optimization (HRDO) algorithm, which uses SRDO to bootstrap a solution, and then applies PRDO to send more NALUs. HRDO stops when there are no immediately decodable NALUs leading to distortion reduction.

5. EVALUATION

5.1. Network and Video Trace Collection

We use Abing [16] to periodically measure ABR and RTT values between hosts on two networks. We chose Abing because it converges fast and is lightweight [21]. We collect network traces between Deutsche Telekom Laboratories (in Berlin) and Stanford University. Three access networks are considered: Ethernet, 802.11b, and 802.11g. We configure Abing to take a measurement every two seconds for two hours. Parts of the network traces have been used in our previous studies [1, 11].

We consider four 10-sec video sequences: City, Soccer, Crew, and Harbour. These sequences are in 4CIF (704x576) resolution at 30 fps. We use JSVM Reference Software (version 9.19.4) to encode each sequence into a scalable stream with a GoP size of eight and eight MGS layers. We tested different numbers of MGS layers and found that the number of MGS layers does not affect coding efficiency substantially. Fig. 5 illustrates that, compared to $Q = 2$, $Q = 8$ only results in a 5–7.5% rate increase. Once we get the scalable streams, we parse them for the NALU sizes $s_{m,q}$.

Fig. 5. Rate increase of different number of MGS layers. [Plot: rate increase (%) versus number of MGS layers (2 to 8) for City, Crew, Harbour, and Soccer.]

5.2. Video Distortion Model Validation

For each scalable stream, we estimate the truncation distortion model parameters as follows. We first decode the complete scalable stream and compute the full-quality distortion $\hat{\delta}$. Next, we truncate NALUs $g_{m,q}$ for each frame $m = 1, 2, \ldots, M$ with $q = 0, 1, \ldots, Q$, but we keep all NALUs of frames in temporal layers lower than that of frame $m$. We then decode the truncated video stream to compute $\delta_{m,q}$. We preserve the complete frames in lower temporal layers to prevent any drifting distortion.

To estimate the drifting distortion model parameters, we decode each scalable stream 32 times with random ancestor frame truncations, and fit the samples to the distortion models using Matlab (version R2009b). We use the same samples for the proposed degree-2 polynomial model of (8) and multi-scale linear model of (9). We then compare their goodness-of-fit results against the bilinear model of (7) in Table 1. This table shows that the degree-2 polynomial model outperforms the model in [15] for all sequences, and that the multi-scale linear model provides the best fit for all sequences except City; therefore, we use the multi-scale linear model in simulations.

Table 1. Goodness-of-fit comparison in RMSE.

Seq.      Bilinear   Multi-scale Linear   Deg-2 Poly.
Crew      0.4348     0.1493               0.4086
Harbour   1.1201     0.6566               0.9762
Soccer    0.5392     0.2865               0.5034
City      0.4777     0.5122               0.3963

To validate the model accuracy, we randomly drop NALUs from each scalable video stream and decode it to get the empirical distortion. We then use the multi-scale linear model of (9) to estimate the distortion. Fig. 6 plots a sample time period of Crew, which shows that the proposed model closely follows the empirical distortion.

Fig. 6. Sample distortion model accuracy from Crew. [Plot: estimated and actual distortion in MSE versus frame number.]

5.3. Setup

We evaluate our proposed algorithms using the NS-2 simulator [22] by implementing a multihomed streaming server which supports the SRDO, PRDO, and HRDO algorithms. These algorithms are implemented as Matlab subroutines. For comparison, we consider the rate control algorithms defined in DCCP [10], a modern transport protocol designed for video streaming. We use a public DCCP implementation [23] in NS-2, with two standard rate control algorithms: (i) the TCP-like algorithm, which implements window-based TCP rate control, and (ii) the TFRC (TCP-Friendly Rate Control) algorithm, which is an equation-based algorithm achieving long-term TCP fairness. We have implemented a multihomed DCCP streaming server that sets up a connection over each access network. This server iterates through each DCCP connection and transmits NALUs from lower to higher quality layers until reaching the rate limit computed by the congestion control algorithms. We refer to the DCCP streaming server with TCP-like rate control as DCCP-TCP and the one with TFRC rate control as DCCP-TFRC.

We simulate multihomed video streaming sessions for random starting times in network traces of the four considered video sequences. For each session, the simulator adjusts the capacity and delay of each access network following the network traces, and it rewinds video sequences once their ends are reached. We use the NS-2 traffic generators to add background traffic over each access network at a rate between 30% and 90% of its available bandwidth. We also implement Abing in NS-2 for ABR and RTT measurements. We choose $M = 32$ for our optimization problem, and set the playout deadline $t_0 = 1$ sec. For the SRDO algorithm, the maximum packet loss rate $P_{\max}$ is set to 10%. The maximum UDP packet size is set to 1000 bytes. We run the simulations with four video sequences. For each setup, we use each algorithm to solve the optimization problem 180 times, and report average results. We consider four measures of performance: video quality in PSNR (Peak Signal-to-Noise Ratio), streaming rate, packet delivery delay, and running time.

5.4. Results

Benefits of Multihoming. We use the DCCP streaming server with different numbers of access networks and report the results from City with 30% background traffic. We run the DCCP rate control algorithms with one, two, and three access networks and compute the video quality. We plot sample results for a 60-sec period using DCCP-TCP in Fig. 7, while results from DCCP-TFRC are similar. This figure shows that multihoming can significantly increase video quality and reduce the number of quality fluctuations.

Fig. 7. Video quality achieved by different numbers of access networks. [Plot: quality in PSNR (dB) versus time (sec) for 1, 2, and 3 networks.]

Video Quality. We compare the video quality achieved by the proposed SRDO and PRDO algorithms against DCCP-TCP and DCCP-TFRC with 30% background traffic. In Fig. 8(a) we plot the video quality achieved using each algorithm for a 60-sec sample period. We observe that both DCCP-TCP and DCCP-TFRC suffer from sudden quality drops and that the proposed algorithms achieve high video streaming quality. We report the aggregate video quality for different video sequences in Fig. 8(b). The proposed algorithms outperform the DCCP rate control algorithms by at least 10 dB in video quality.

Fig. 8. Video quality comparison: (a) City and (b) overall results. [Plots: quality in PSNR (dB) versus (a) time (sec) and (b) video sequence, for DCCP-TCP, DCCP-TFRC, SRDO, and PRDO.]

Streaming Rate and TCP-Friendliness. DCCP rate control algorithms are designed to be TCP-friendly. We report the streaming rates for different algorithms with 30% background traffic. The simulation results (figures not shown due to the page limitations) indicate that the proposed algorithms lead to smooth streaming rates, comparable to the average rates of the DCCP rate control algorithms. In Fig. 9, we present the average streaming rate for all considered video sequences and algorithms. This figure shows that the SRDO and PRDO algorithms result in almost the same streaming rates as DCCP-TCP and DCCP-TFRC, and hence are TCP-friendly.

Fig. 9. Streaming rate comparison from all video sequences. [Plot: streaming rate (Mbps) by video sequence for DCCP-TCP, DCCP-TFRC, SRDO, and PRDO.]

Packet Delivery Delay. We present the average packet delivery delay for all video sequences under 30% background traffic in Fig. 10. Our proposed algorithms result in short packet delivery delays, at least 2.5 seconds shorter than DCCP (or 83% delay reduction). This, in turn, shows that the inferior video quality of DCCP rate control algorithms is partially due to longer packet delivery delays causing video packets to miss their playout deadlines.

Trade-off between Optimality and Computational Complexity. To thoroughly evaluate the proposed SRDO, PRDO, and HRDO algorithms, we stream City with them under various background traffic loads from 50% to 90%.³ We plot the video quality achieved by the proposed algorithms at different background traffic loads in Fig. 11(a). This figure shows that when the background traffic is not significant, SRDO performs almost as well as PRDO, but the performance gap becomes nontrivial (about 10 dB) when background traffic is increased to 90%. HRDO performs slightly worse than PRDO, unless the bandwidth is highly saturated. We plot the running time of different algorithms in Fig. 11(b). We observe that SRDO runs in real-time (between 150–200 msec), while PRDO takes significantly longer to finish, and HRDO has an intermediate running time. Fig. 11 shows the trade-off between optimality and computational complexity: PRDO results in better video quality, but has higher complexity, while SRDO runs faster, but leads to lower video quality. The HRDO algorithm offers a good trade-off between optimality and complexity.

³ In our experiments, we found that these algorithms result in similar performance when the background traffic load is lower than 50%.

6. CONCLUSIONS

We have addressed the problem of streaming scalable videos over multiple access networks to a client, based on an optimization framework. The objective is to maximize the perceived video quality at the client subject to constraints on network conditions and video characteristics. We have formulated the problem into an integer programming problem, and have proposed two heuristic algorithms: the SRDO algorithm assumes the drifting distortion is insignificant and has a very low time complexity, while the PRDO algorithm employs an elaborate distortion model for better video quality at the expense of longer running time. We have also proposed a hybrid HRDO algorithm with a good balance between optimality and computational complexity. We have evaluated all the algorithms using the NS-2 simulator with real network and video traces. The simulation results have shown that the proposed algorithms outperform the rate control algorithms defined in the DCCP standard [10]. In particular, the proposed algorithms: (i) result in higher video quality, (ii) are TCP-friendly, and (iii) incur short packet delivery delays.

The present work can be extended along several directions. We plan to develop algorithms based on suboptimal convex programs for the joint rate and distortion optimization problem. We plan to integrate the packet loss rates of wireless networks into our formulation, and further assign different FEC rates to different NALUs. Another direction is generalizing the optimization problem for multiple streaming servers competing for the same access networks.


Fig. 10. Packet delivery delay comparison. [Plot: packet delay (sec) for each video sequence (City, Soccer, Crew, Harbour); series: DCCP-TCP, DCCP-TFRC, SRDO, PRDO.]

Fig. 11. Trade-off between (a) optimality and (b) computational complexity. [Plots: (a) quality in PSNR (dB) versus background traffic (%); (b) running time (sec) versus background traffic (%); series: SRDO, PRDO, HRDO.]

7. REFERENCES

[1] X. Zhu, P. Agrawal, J. Singh, T. Alpcan, and B. Girod, “Distributed rate allocation policies for multihomed video streaming over heterogeneous access networks,” IEEE Transactions on Multimedia, vol. 11, no. 4, pp. 752–764, June 2009.
[2] T. Alpcan, J. Singh, and T. Basar, “Robust rate control for heterogeneous network access in multihomed environments,” IEEE Transactions on Mobile Computing, vol. 8, no. 1, pp. 41–51, January 2009.
[3] J. Apostolopoulos and M. Trott, “Path diversity for enhanced media streaming,” IEEE Communications Magazine, vol. 42, no. 8, pp. 80–87, August 2004.
[4] F. Hartung, U. Horn, J. Huschke, M. Kampmann, T. Lohmar, and M. Lundevall, “Delivery of broadcast services in 3G networks,” IEEE Transactions on Broadcasting, vol. 53, no. 1, pp. 188–199, March 2007.
[5] “AT&T faces 5,000 percent surge in traffic,” http://www.internetnews.com/mobility/article.php/3843001, 2009.
[6] “T-Mobile’s growth focusing on 3G,” http://connectedplanetonline.com/wireless/news/t-mobile-3g-growth-0130, 2009.
[7] J. Xin, C. Lin, and M. Sun, “Digital video transcoding,” Proceedings of the IEEE, vol. 93, no. 1, pp. 84–97, January 2005.
[8] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding extension of the H.264/AVC standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1103–1120, September 2007.
[9] M. Wien, H. Schwarz, and T. Oelbaum, “Performance analysis of SVC,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1194–1203, September 2007.
[10] E. Kohler, M. Handley, and S. Floyd, “Datagram congestion control protocol (DCCP),” RFC 4340, March 2006.
[11] J. Singh, T. Alpcan, P. Agrawal, and V. Sharma, “An optimal flow assignment framework for heterogeneous network access,” in Proc. of IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM’07), Helsinki, Finland, June 2007, pp. 1–12.
[12] M. Hefeeda and C. Hsu, “Rate-distortion optimized streaming of fine-grained scalable video sequences,” ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 4, no. 1, pp. 2:1–2:28, January 2008.
[13] I. Amonou, N. Cammas, S. Kervadec, and S. Pateux, “Optimized rate-distortion extraction with quality layers in the scalable extension of H.264/AVC,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1186–1193, September 2007.
[14] H. Mansour, V. Krishnamurthy, and P. Nasiopoulos, “Channel aware multiuser scalable video streaming over lossy underprovisioned channels: Modeling and analysis,” IEEE Transactions on Multimedia, vol. 10, no. 7, pp. 1366–1381, November 2008.
[15] J. Sun, W. Gao, D. Zhao, and W. Li, “On rate-distortion modeling and extraction of H.264/SVC fine-granular scalable video,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 3, pp. 323–336, March 2009.
[16] “Abing project page,” http://www-iepm.slac.stanford.edu/tools/abing/.
[17] X. Zhu, E. Setton, and B. Girod, “Congestion-distortion optimized video transmission over ad hoc networks,” Signal Processing: Image Communication, vol. 20, no. 8, pp. 773–783, September 2005.
[18] D. Gross, J. Shortle, J. Thompson, and C. Harris, Fundamentals of Queueing Theory, Wiley-Interscience, 4th edition, 2008.
[19] Y. Liang, J. Apostolopoulos, and B. Girod, “Analysis of packet loss for compressed video: Effect of burst losses and correlation between error frames,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 7, pp. 861–874, July 2008.
[20] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 1st edition, 2004.
[21] J. Navratil and R. Cottrell, “ABwE: A practical approach to available bandwidth estimation,” in Proc. of Passive and Active Measurement Workshop (PAM’03), La Jolla, CA, April 2003.
[22] “The network simulator,” http://www.isi.edu/nsnam/ns/.
[23] N. Mattsson, “A DCCP module for NS-2,” M.S. thesis, Department of Computer Science and Electrical Engineering, Lulea Tekniska University, 2004.
