This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE ICC 2011 proceedings
QoE-driven Sender Bitrate Adaptation Scheme for Video Applications over IP Multimedia Subsystem Asiya Khan, Is-Haka Mkwawa, Lingfen Sun and Emmanuel Ifeachor Centre for Signal Processing and Multimedia Communications, School of Computing and Mathematics, University of Plymouth, Plymouth PL4 8AA, UK Email:
[email protected];
[email protected];
[email protected];
[email protected]
Abstract— IP Multimedia Subsystem (IMS) offers a framework which enables the provisioning of multimedia services with Quality of Service (QoS) and mobility support across heterogeneous networks. The aim of this paper is twofold. First, to present a new fuzzy logic based Sender Bitrate (SBR) adaptation scheme at pre-encoding stage that is Quality of Experience (QoE) driven for video applications. The scheme was tested and evaluated in the NS2 based simulation access networks of third generation Universal Mobile Telecommunication System (UMTS) networks. Second, to demonstrate the implementation of the proposed adaptation scheme in our developed open Android-based IMS test bed. The test bed was developed to fully understand and manipulate the effects of network conditions on perceptual quality. The SBR adaptive scheme is evaluated in terms of the Mean Opinion Score (MOS). Extensive simulation and test bed results demonstrate the effectiveness of the proposed adaptation scheme especially at UMTS bottleneck access networks where perceived video quality is most affected. The proposed scheme was responsive to available network bandwidth and congestion and adapted the SBR accordingly maintaining acceptable quality in terms of the MOS. The proposed scheme has applications in network planning and content provisioning for network/service providers.
Keywords: IMS, QoE, SBR, MOS, NS2, UMTS
I. INTRODUCTION
Transmission of multimedia applications and services over wireless access technologies is continuously gaining popularity. IP Multimedia Subsystem (IMS), as defined by the 3rd Generation Partnership Project (3GPP) [1] and adopted by several standardization bodies, acts as a service-oriented enabler across fixed and mobile IP networks. With the convergence of the Internet, fixed and mobile communications and the increase of multimedia applications, the issue of maximizing resource utilization while satisfying the user’s Quality of Experience (QoE) requirements has been gaining importance. This paper proposes a new QoE-driven adaptation scheme at pre-encoding stage, which was successfully implemented in the Android-based IMS test bed.
The optimization of QoE is crucial for multimedia design and delivery. Several researchers have proposed adaptation schemes in the literature. In [2] the authors propose an adaptive fuzzy rate control feedback algorithm based on packet loss rate and congestion notification from routers. However, they did not consider the initial optimum encoding rate of the video. In [3] a model is proposed based on dynamic bitrate control to subjectively estimate the quality of video streaming. Their estimation model considers user perception in three areas where quality degradation is high, the impression of past quality and the duration of degradation. In [4] the authors have proposed a bitrate control scheme based on congestion feedback over the Internet. The scheme reacts to network congestion but does not consider the user’s QoE. In [5] the authors have proposed an adaptation algorithm which dynamically adapts scalable video to a suitable three dimension combination. In [6],[7] the authors have presented adaptation based on network state and congestion control over UMTS transport channels. The authors in [8] have presented an adaptive bandwidth allocation scheme based on the queue length and the packet loss probability. A scheme based on packet dispersion instead of packet loss is presented in [9], using a fuzzy rule in combination with a transcoder to adapt the video bitrate. Most of these schemes do not take into account the video content, even though the dynamics of the content are critical for the final perceptual outcome. In addition, the main aim of most of these schemes is to minimize the end-to-end packet loss and/or delay and to optimize network QoS parameters only, without any consideration of QoE metrics. Also, current work is limited to simulation only, with no real implementation in mobile devices. In a previous work the adaptation for VoIP (Voice-over-Internet) was implemented in our developed IMS test bed [10]. In this paper we have extended that to video applications.
The focus here is on QoE-based adaptation as the prime criterion for the quality of multimedia applications is the user’s perception of service quality [11]. The most widely used metric is the Mean Opinion Score (MOS). Hence, the main contributions of the paper are twofold:
978-1-61284-231-8/11/$26.00 ©2011 IEEE
• Propose an efficient SBR adaptation scheme that is QoE-driven at pre-encoding stage over UMTS access networks.
• Demonstrate the implementation of the proposed scheme in an open Android-based IMS test bed which was developed to fully understand the effects of network conditions on perceptual quality adaptation.
The QoE-adaptation scheme was also implemented in NS2 [12] to allow extensive simulation. Android open mobile platform [13] has been used with G1 mobile handset as an IMS client [14] in the IMS test bed as its future has been shown very promising for UMTS access networks. Preliminary results show a clear improvement in overall quality in response to bandwidth and network congestion.
The rest of the paper is organized as follows. In section II we present the proposed adaptation scheme. Section III presents the evaluation set up and results. Section IV describes the implementation of the proposed scheme in Android-based IMS test bed. Section V concludes the paper highlighting areas of future work.
II. PROPOSED QOE-DRIVEN ADAPTATION SCHEME
Fig. 1 illustrates how the video quality is predicted non-intrusively and shows the concept of QoE-driven adaptation. At the top of Fig. 1, an intrusive video quality measurement block is used to measure video quality under different network QoS conditions (e.g. different packet loss, jitter and delay) or different application QoS settings (e.g. different codec type, content type, sender bitrate, frame rate, resolution). The measurement is based on comparing the reference and the degraded video signals. PSNR to MOS conversion from EvalVid [15] is used for measuring video quality in this paper. The video quality measurements based on MOS values are used to derive the non-intrusive QoE prediction model and the sender bitrate adaptive control mechanism, based on non-linear regression methods from [16]. The following sub-sections (A and B) describe the model and adaptation scheme in detail.
Figure 1. Conceptual diagram to illustrate video quality prediction and QoE-driven adaptation
A. QoE prediction model
The non-linear regression-based model was developed in an earlier work [16] to predict video quality for all content types from both application and physical layer parameters for video applications over UMTS networks. In Fig. 1, video content classification is carried out on the video at the receiver side by extracting its spatial and temporal features using cluster analysis. The details are given in [17]. The proposed model is trained with the sequences akiyo, foreman and stefan and validated with carphone, suzie and football. The video sequences represent content with low to high Spatio-Temporal (ST) features, as classified in our previous work [17]. As the transmission of video was for mobile handsets, all the video sequences were of QCIF resolution (176x144) and encoded in H.264 with Baseline Profile at level 1.2, with the open source JM software [18] encoder/decoder. The considered frame structure is IPPP for all the sequences, since the extensive use of I frames could saturate the available data channel.
The model is predicted with a combination of parameters in the application layer, namely Content Type (CT), Frame Rate (FR) and Sender Bitrate (SBR), and in the physical layer, namely Block Error Rate (BLER) modeled with a 2-state Markov model with variable Mean Burst Length (MBL) of 1.75.
(1) [16]
The coefficients along with the goodness of fit (R2) and Root Mean Squared Error (RMSE) are given in Table I.
TABLE I. COEFFICIENTS OF METRIC MODELS
Coefficient    Value
α              5.2266
β              3.681e-08
γ              -0.1134
δ              8.1466
ε              -1.9643
ξ              -0.7166
μ              -1.3502
R2             87.89%
RMSE           0.373
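The 2-state Markov block error model used for the physical layer can be illustrated with a minimal Python sketch. It is not the authors' NS2 implementation. It assumes a simplified Gilbert model in which every block sent in the bad state is lost and none in the good state, so that the stationary probability of the bad state equals the mean BLER and the mean stay in the bad state equals the MBL. The function names and the seed are hypothetical.

```python
import random

def gilbert_transition_probs(bler, mbl):
    """Return (p_good_to_bad, p_bad_to_good) for a 2-state Markov error model.

    Assumption: all blocks in the bad state are lost, none in the good state.
    Then mean burst length = 1 / p_bad_to_good and the stationary probability
    of the bad state, p_gb / (p_gb + p_bg), equals the mean BLER.
    """
    if not 0.0 <= bler < 1.0:
        raise ValueError("bler must be in [0, 1)")
    if mbl < 1.0:
        raise ValueError("mbl must be >= 1")
    p_bg = 1.0 / mbl
    p_gb = bler * p_bg / (1.0 - bler)
    if p_gb > 1.0:
        raise ValueError("bler too high for this mean burst length")
    return p_gb, p_bg

def simulate_block_errors(bler, mbl, n_blocks, seed=0):
    """Simulate n_blocks transmissions; True marks a lost block."""
    p_gb, p_bg = gilbert_transition_probs(bler, mbl)
    rng = random.Random(seed)
    bad = rng.random() < bler  # start from the stationary distribution
    lost = []
    for _ in range(n_blocks):
        lost.append(bad)
        if bad:
            bad = rng.random() >= p_bg  # remain in the bad state
        else:
            bad = rng.random() < p_gb   # enter the bad state
    return lost

if __name__ == "__main__":
    trace = simulate_block_errors(bler=0.1, mbl=1.75, n_blocks=100_000)
    print("empirical BLER:", sum(trace) / len(trace))
```

With MBL = 1.75 the chain leaves the bad state with probability 1/1.75 per block, so losses arrive in short bursts rather than independently.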
B. QoE-driven adaptation scheme
We use fuzzy logic [19], implemented at the sender side, to process the feedback information and decide the optimum number of layers to send (the fuzzy logic control block in Fig. 1). Layered encoding is used for adapting the video streams to the network dynamics. Video streams are encoded in a layered manner such that every additional layer increases the perceived quality of the stream. Base layers are encoded at a very low rate to accommodate the UMTS access network conditions. Additional layers are added or dropped in order to adapt the video stream according to the content type and network conditions. We describe the two inputs to our adaptation scheme, Congestion (C) and Degradation (D), in detail. To calculate the first input, C, we use the model proposed in eq. (1) for MOS prediction. The model is lightweight and easy to implement. The predicted QoE metrics together with network QoS parameters are then used in the QoE-driven adaptation scheme to adapt the sender bitrate as shown in Fig. 1. RTCP is used to provide feedback on the quality of the data distribution by exchanging reports between the sender and the receiver. The feedback information is sent every second through extended RTCP reports [20], which collect QoS information such as loss rate, delay and jitter from the core network to give the network congestion level. The network congestion level is calculated from the Block Error Rate (BLER), computed as the total number of blocks lost over the total blocks sent. BLER is used in this paper as
$D = \mathrm{MOS}_{max} - \mathrm{MOS}_{t}$    (3)
The maximum value that D (degradation) can have is 3.4 (as the range of MOS is from 1-5), indicating maximum degradation, and the minimum value that D can have is 0, indicating no degradation at all. The degradation, D, has been split into four levels: 0-0.25, 0.25-0.7, 0.7-1.2 and D>1.2. The split in the values of D reflects the changes in visual quality, and each level is then linked with an SBR level. The degradation, D, along with the congestion, C, are used as inputs to the fuzzy logic sender bitrate adaptor. The membership functions for the two inputs (linguistic input variables) and the output (SBRchange) are shown in Fig. 2. Triangular functions are chosen due to their simplicity. The SBR change (output) surface is also given in Fig. 2, which shows the overall behavior of the SBR adaptor. The first linguistic variable (LV) input, C, is the network congestion. It ranges from 0 to 1. The second LV, D, is the degradation calculated from the QoE model. D ranges from 0 to 3.4. The fuzzy SBR adaptor processes the two linguistic variables based on the predefined if-then rule statements (rule base) shown in Table II, and derives the linguistic output variable SBRchange, which is defined for every possible combination of inputs. An example of a fuzzy rule is:
If congestion is large (L) and degradation is medium (M) then SBRchange is BC (big change)
The linguistic variables in Table II for the output are given by the membership functions of the output in Fig. 2 and are described as No Change (NC), Very Small Change (VSC), Small Change (SC) and Big Change (BC). The linguistic variables in Table II for the two inputs are given by Zero (Z), Small (S), Medium (M) and Large (L). The defuzzified output can then be used to determine the next level of SBR as given by eq. (4).
$\mathrm{SBR}_{new} = \mathrm{SBR}_{old} + \mathrm{SBR}_{change}$    (4)
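The fuzzy SBR adaptor and eq. (4) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation. The membership breakpoints, the output values in kb/s, the rule base ("output level is the larger of the two input levels") and the weighted-average defuzzification are all hypothetical stand-ins, since Table II and the exact shapes in Fig. 2 are not reproduced in this text. The hypothetical rule base is chosen only so that it agrees with the single example rule quoted above (C is L and D is M gives BC).

```python
LEVELS = ["Z", "S", "M", "L"]  # Zero, Small, Medium, Large

# Hypothetical triangular sets (left foot, peak, right foot).
C_SETS = {"Z": (0.0, 0.0, 0.33), "S": (0.0, 0.33, 0.67),
          "M": (0.33, 0.67, 1.0), "L": (0.67, 1.0, 1.0)}
# Peaks loosely follow the degradation level boundaries 0.25, 0.7, 1.2.
D_SETS = {"Z": (0.0, 0.0, 0.25), "S": (0.0, 0.25, 0.7),
          "M": (0.25, 0.7, 1.2), "L": (0.7, 1.2, 1.2)}
# Hypothetical output singletons in kb/s for NC, VSC, SC, BC.
OUT_KBPS = [0.0, -8.0, -16.0, -32.0]

def tri(x, a, b, c):
    """Triangular membership; a == b or b == c gives a flat shoulder."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def sbr_change(congestion, degradation):
    """Fuzzy inference: AND as min, weighted-average defuzzification."""
    num = den = 0.0
    for i, c_level in enumerate(LEVELS):
        mu_c = tri(congestion, *C_SETS[c_level])
        for j, d_level in enumerate(LEVELS):
            weight = min(mu_c, tri(degradation, *D_SETS[d_level]))
            # Hypothetical rule base: output level = max of input levels.
            num += weight * OUT_KBPS[max(i, j)]
            den += weight
    return num / den if den > 0.0 else 0.0

def next_sbr(sbr_old, congestion, degradation):
    """Eq. (4): SBRnew = SBRold + SBRchange."""
    return sbr_old + sbr_change(congestion, degradation)

if __name__ == "__main__":
    print(next_sbr(104.0, congestion=1.0, degradation=0.7))
```

Treating the output sets as singletons keeps the sketch short; a centroid computed over full triangular output sets would be closer to a conventional Mamdani controller.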
[Fig. 2 panels: Membership function for LV input 1 (x-axis: Congestion, 0 to 1; sets Z, S, M, L); Membership function for LV input 2 (x-axis: Degradation, 0 to 3; sets Z, S, M, L); Membership function for LV output. y-axis in each panel: degree of membership.]
opposed to packet loss because, in UMTS networks, the physical layer passes the transport blocks to the Medium Access Control (MAC) layer together with the error indication from the Cyclic Redundancy Check, so the output of the physical layer can be characterized by the overall block error probability (BLER). Thus, an error model based on a 2-state Markov model [21] of block errors was used in the simulation. We define Congestion (C), computed from [21], as the number of Blocks Lost (BL) divided by the total number of Blocks Sent (BS) within an interval. Therefore, the congestion, C, is given by eq. (2) as:
$C = BL / BS$    (2)
The range of the congestion level is [0,1], with 0 being no congestion and 1 meaning a fully congested network. The Congestion, C, was partitioned into four levels as (0