A LOW-COMPLEXITY PACKET CLASSIFICATION ALGORITHM FOR MULTIPLE ... AC 3. AC 2. AC 1. AC 0. pMD2 i,j. pMD1 i,j. Transport. UDP/IP. Physical. CD.
A LOW-COMPLEXITY PACKET CLASSIFICATION ALGORITHM FOR MULTIPLE DESCRIPTION VIDEO STREAMING OVER IEEE802.11E NETWORKS Simone Milani‡, Giancarlo Calvagno‡ , Riccardo Bernardini∗, Roberto Rinaldo∗ ‡
Dept. of Information Engineering, University of Padova, Italy ∗ DIEGM, University of Udine, Italy
{simone.milani,calvagno}@dei.unipd.it, {riccardo.bernardini,rinaldo}@uniud.it
ABSTRACT The robust transmission of video sequences over wireless LANs presents several challenging problems concerning the presence of packet losses, delays, and bandwidth limitations. Their effect on the visual quality of the sequence reconstructed by the end user can be mitigated by adopting a wireless architecture that is able to support different levels of Quality-of-Service (QoS), like the IEEE 802.11e standard, and by compressing the video sequence to be transmitted using a robust source coder, like a Multiple Description Coding (MDC) scheme. This paper presents a MDC-based video streaming architecture that tries to adaptively optimize the performance of both solutions by assigning the RTP video packets produced by the MDC encoder to the different QoS classes of IEEE 802.11e. Experimental results shows that the performance is significantly improved with respect to a non-adaptive solution. Index Terms— Multiple Description, IEEE 802.11e, cross-layer optimization, packet classification. 1. INTRODUCTION The widespreading of IEEE 802.11 networks has enabled the possibility of distributing rich media content anywhere, anytime, and from any device. However, the streaming of video sequences over wireless networks still presents several issues due to the presence of delays, packet losses and bandwidth limitations that affect the transmission performance. In order to improve the quality of the reconstructed sequence, video coding experts have been focusing on error concealment algorithms and robust video coding paradigms. Among these, Multiple Description Coding schemes [1] have proved to be significantly efficient for wireless scenarios where channel conditions are time-varying and the bursty nature of packet losses can lead to a significant distortion. At the same time, network designers have been involved in defining network architectures that are able to provide different classes of service. This work was carried on within the the PRIN research project prot. 2005099247 with the support of the Italian Ministry of University and Research (MiUR).
In particular, the emerging IEEE 802.11e medium access control protocol [2] defines different levels of Quality-of-Service (QoS) that permit handling and transmitting video packets according to different priorities and policies. As a consequence, this differentiation poses the problem of assigning the most appropriate QoS class at the lower layers of the protocol stack to each video packet generated at application level. During the last years this possibility has lead to the design of several cross-layer (CL) optimization strategies that assign to each video packet the most appropriate QoS class according to the characteristics of the carried information and of the source coding scheme. As an example, the video coding setting in [3] divides macroblock headers and Motion Vectors from the coded residual signal generating different types of packets thanks to the Data Partitioning (DP) mode of the H.264/AVC standard. Each packet is then assigned to a different class of service according to the information it carries. However, at high bit error rates (BER) the performance of the MDC-based schemes is significantly better than that of the DP mode since they permit a more gentle degradation of the visual quality (see [4]). As a consequence, in this paper we design an efficient adaptive classification strategy for packets generated by a Multiple Description coder taking advantage of the QoS differentiation provided by the IEEE 802.11e standard. More specifically, packets are distributed among the different service queues at MAC level according to the percentage of null quantized DCT coefficients. Experimental results show that this adaptive approach improves the average PSNR value with respect to a traditional MDC approach where each description is assigned to a fixed QoS class. In the following, Section 2 presents the adopted crosslayer scheme, while Section 3 describes the classification algorithm that maps the RTP video packets to the QoS classes defined in the IEEE 802.11e specification. Section 4 reports the experimental results obtained on a set of sequences coded with different QP, while Section 5 draws the conclusions.
MD1
p i,j
CL control
MD2
H.264 decoder
p i,j
Physical
Mult. Acc. CD
AC 2 AC 3
Transport
Radio Channel
AC 0 AC 1
p i,j
UDP/IP
MD2
Physical
H.264 coder
MAC/LLC
MD2
MD Conceal.
UDP/IP
H.264 coder
Mux
Recon. seq.
MD1
p i,j
Transport
Frame Odd/ even rows
MD1 Input seq.
H.264 decoder RTP packets
IEEE 802.11e MAC layer
Fig. 1. The adopted Multiple Description scheme 2. A CROSS-LAYER MULTIPLE DESCRIPTION SCHEME BASED ON THE IEEE 802.11E STANDARD In this work we adopted the very simple 2-description MDC scheme shown in Fig. 1, where the application and the MAC layers are depicted in more detail with respect to other parts of the protocol stack. In fact, the implemented cross-layer strategy relies on adequately tuning the MD source coder at the application layer and the multiplexer for the queues at MAC layer in order to optimize the performance of the MD concealment unit at the receiver. In the next subsection, these two layers will be described more accurately. 2.1. The MD coder and decoder at the application layer At the application layer, the odd and even rows of each input frame are sampled into two fields (descriptions) and sent to two independent H.264/AVC coders. Each coder generates a packet stream that is sent to the lower levels in the protocol stack. At the receiver, in case both H.264/AVC decoders correctly get all the packets, both fields can be correctly decoded, and the coded sequence can be reconstructed without any further quality loss. In case some parts of one description are missing, the MD Concealment unit estimates the lost rows through a bilinear interpolation of the rows from the other description, which have been correctly decoded and are interleaved with the lost ones. This estimation produces a less degraded approximation of the coded frame which reduces the loss in the PSNR value of the reconstructed sequence. In case both descriptions get lost, it is necessary to adopt the same error concealment techniques that are adopted by single description (SD) coding schemes. Therefore, the transmission conditions for the two descriptions must be varied in such a way that at least one description is correctly received. To achieve this, the proposed CL scheme takes advantage of the features offered by the IEEE 802.11e protocol, which will be described in the following paragraph.
2.2. The IEEE 802.11e protocol The basic 802.11 MAC protocol, named Distribution Coordination Function (DCF), operates according to the Carrier Sense Multiple Access/Collision Avoidance (CSMA/CA) access strategy. Before a station starts transmitting a queued MAC Protocol Data Unit (MPDU), the channel has to remain available for a random time interval, called backoff time, that varies in the interval [0, CW], where the Contention Window parameter CW is initially set to CWmin and is doubled any time the transmission fails (up to CWmax ). Whenever the packet is not correctly acknowledged by the receiver, the station retransmits it until the maximum number RETRY LIMIT (RL) of trials is reached. In order to support different levels of QoS, the 802.11e [2] standard introduces the Enhanced Distributed Channel Access (EDCA) strategy, where multiple backoff processes (up to 4) are allowed by distinguishing multiple packet queues within the same wireless station. Each queue is referenced as Access Category (AC), which can be characterized by a different set of parameters [CWmin , CWmax , RL] and a different priority. It is possible to distribute the RTP packets among the different queues in order to optimize the quality of the sequence reconstructed at the decoder. 3. A CLASSIFICATION ALGORITHM FOR MDC RTP PACKETS Previous works have shown that for SD-based video coders the distortion produced by the loss of a single packet strongly depends on the characteristics of the coded information. In [5] Qu et al. propose an Unequal Error Protection (UEP) strategy that increases the number of FEC packets according to the activity level of the original sequence. In [6], the authors propose a joint source-channel coding optimization algorithm that tunes the FEC coder according to the percentage ρ of null quantized transform coefficients (called zeros). It is possible to find a similar relation for MDC schemes, too. Figure 2
MD2 lost
MD1 lost
0.3
δ PSNRl
δ PSNRl
0.25 0.2 0.15 0.1
0.25
0.2
0.15
2
2 1.5
1.5 1 0.5 0
NIntraMB
0.99 0.992 0.994 0.996 0.998
ρ
0.988
1 0.5 0
NIntraMB
0.99 0.992 0.994 0.996 0.998
0.988
ρ
Fig. 2. The relative PSNR decrement δPSNR vs. the percentage ρ of zeros and the number NIntraMB of Intra macroblocks for the lost slice (sequence foreman CIF). Two separate graphs are reported assuming that either MD2 or MD1 is correctly received. The PSNR value is evaluated on the final sequence after the MD error concealment. reports the relative quality loss for the foreman sequence, measured by the parameter δPSNRl =
PSNRo − PSNRl PSNRo
(1)
where PSNRo is the PSNR value for the frame correctly reconstructed and PSNRl is the PSNR of the reconstructed approximation after the loss of a packet. The plots show that the parameter δPSNRl is approximately linearly-related with the parameter ρ so that, as the percentage ρ decreases, the PSNR loss increases. Therefore, it is possible to design a classification algorithm for the EDCA strategy that takes into consideration this relation. In [4] the authors assign a description to the AC2 class and the other to the AC1 class, made exception for Intra packets that are assigned to AC3. The assignment is fixed and proves to be effective since it is able to grant the necessary diversity between the loss statistics of each MD stream. In [7] the author adopts an unbalanced MD-FEC scheme and assigns to the most reliable channel the description with the highest bit rate. However, these assignment techniques do not take into consideration the characteristics of the video signal. In our research, we have designed and tested a classification strategy that distributes the packets among the different ACs according to the description they belong to and according to the percentage of zeros for the coded signal. More specifically, given the i-th frame and the for the percentage of zeros ρMD1 for the field MD1 and ρMD2 i i field MD2, the corresponding packets are assigned to the ACi queue according to Algorithm 1, where ρMD1 is the average percentage of zeros for the fields of description MD1 (computed from the previous frames) and ρMD2 is the average percentage of zeros for description MD2. In this way, the highest priority AC is reserved for Intra frame packets and Inter packets from description MD1 with a low ρ value, while AC1 and AC2 are used for Inter packets according to the corresponding ρ values. Hence, the highest QoS level is assigned to the
Algorithm 1 Packet classification using ρ 1: Naming pMD1 i,j the j-th packet of MD1 for the i-th frame 2: and pMD2 the j-th packet of MD2 for the i-th frame i,j 3: if the i-th frame is Intra coded then 4: AC3 ← pMD1 AC3 ← pMD2 i,j i,j 5: else 6: if ρMD1 < ρMD1 then i 7: AC3 ← pMD1 i,j 8: else 9: AC2 ← pMD1 i,j 10: end if 11: if ρMD2 < ρMD2 then i 12: AC2 ← pMD2 i,j 13: else 14: AC1 ← pMD2 i,j 15: end if 16: end if
packets that contain the most critical information for the decoding process, while the lowest priority is given to the least important data. Note also that the proposed algorithm provides a different average QoS level to the packet streams belonging to different descriptions. In fact, the MDC paradigm proves to be effective whenever it is possible to estimate the lost rows from the other ones. Giving the same priority to both descriptions would reduce the probability of having at least one description correctly received. In a second optimization, we considered the possibility of switching the description MD1 with description MD2 whenever the recovering performance of MD2 is greater than that of MD1, i.e. estimating MD1 from MD2 leads to a better quality in the reconstructed sequence. To this purpose, at the beginning of each GOP MD1 and MD2 are switched according to the criterion reported in Algorithm 2. Algorithm 2 Description switching 1: if ρMD2 < ρMD1 then 2: MD2 becomes the highest priority description 3: else 4: MD1 remains the highest priority description 5: end if
4. EXPERIMENTAL RESULTS In order to evaluate the performance of the presented CL algorithm, we adopted the same experimental setting of [4]. The video sequence is transmitted by a video RTP server in an IEEE 802.11e network implemented using the Omnet++ simulator (see Fig. 3). In our tests, we simulated the transmission of various sequences (coded with fixed QP = {18, 24, 30}, GOP IPP...P of 15 frames) varying the channel propagation condition. The plots in Figure 4 report the average PSNR vs.
45
41 40 39
AC1 (CWmin , CWmax , RL) AC2 (CWmin , CWmax , RL)
(31, 1023, 4) (15, 31, 8)
AC3 (CWmin , CWmax , RL) Bit rate
(7, 15, 8) 11 Mbit/s
Mobile speed
5 m/s
Path loss α Transmission power
[2.0, 2.3] 75 mW
38 37 36
38 37 36 35
35
Alg. 1 Alg. 1 + Alg. 2 fixed
34
Alg. 1 Alg. 1 + Alg. 2 fixed
−3
−3
10
10
BER
BER
(a) foreman QP=18
(b) foreman QP=24 39
35.5
38
35
PSNR (dB)
34.5 34 33.5
37 36 35
33
34
32.5
31.5
Alg. 1 Alg. 1 + Alg. 2 fixed
33
Alg. 1 Alg. 1 + Alg. 2 fixed −3
−3
10
10
(c) foreman QP=30
(d) table QP=24 37
39
36.5
38
36
37 36 35
35.5 35 34.5 34
34 33 32
Fig. 3. The adopted network scenario.
BER
BER
MDC Video stream
PSNR (dB)
Value (31, 1023, 4)
PSNR (dB)
PSNR (dB)
42
PSNR (dB)
Parameter
39
43
32
AC0 (CWmin , CWmax , RL)
40
44
PSNR (dB)
the BER value from a set of 10 trials per point. It is possible to notice that the Algorithm 1 allows us to improve the PSNR value for sequences coded at different quality levels whenever the BER value increases. For the sequence foreman coded with QP = 24, the average PSNR is increased by 2 dB with respect to the fixed approach of [4] when the BER is 9 · 10−4 , but it is possible to notice a similar improvement for other sequences too (see the results with BER=8 · 10−4 for table in Fig. 4(d) and for paris in Fig. 4(e)). The use of Algorithm 2 slightly improves the quality of the reconstructed video sequence, as it is shown in Fig. 4(d) for BER=10−3 . However, this increment is strictly dependent on the characteristics of the video sequence. For the sequence news coded with QP = 30 (see Fig. 4(f)) no significant differences can be appreciated between Alg. 1 and Alg. 1+Alg. 2.
33.5
Alg. 1 Alg. 1 + Alg. 2 fixed
33 32.5
Alg. 1 Alg. 1 + Alg. 2 fixed
−3
−3
10
BER
(e) paris QP=24
10
BER
(f) news QP=30
5. CONCLUSIONS The paper has presented a cross-layer packet classification algorithm for a Multiple Description Coding scheme based on the IEEE 802.11e protocol. The proposed strategy relies on assigning the highest-priority service classes to the packets of frames with a low percentage of null quantized transform coefficients, while the others are transmitted with a lower QoS level since the quality of the reconstructed sequence is not significantly affected by their loss. Experimental results show that is possible to improve the PSNR value up to 2 dB with respect to a similar approach where packet classification only depends on which description they are related to. 6. REFERENCES [1] Vivek K. Goyal, “Multiple Description Coding: Compression Meets The Network,” IEEE Signal Processing Mag., vol. 8, no. 5, pp. 74–93, Sept. 2001. [2] IEEE 802.11/D13.0, Part 11, “Wireless LAN medium access control (MAC) and physical layer (PHY) specifications: Medium access control (MAC) enhancements for Quality of Service (QoS),” Jan. 2005. [3] A. Ksentini, M. Naimi, and A. Gu´eroui, “Toward an improvement of H.264 video transmission over IEEE
Fig. 4. Comparison of different classification algorithm for the CL schemes. The plots report the average PSNR value for the reconstructed sequence from 10 trials vs. the BER value. 802.11e through a cross-layer architecture,” IEEE Commun. Mag., vol. 44, no. 1, pp. 107–114, Jan. 2006. [4] R. Bernardini, M. Durigon, R. Rinaldo, P. Zontone, and A. Vitali, “Real-Time Multiple Description Video Streaming Over QoS-Based Wireless Networks,” in Proc. of ICIP 2007, San Antonio, TX, USA, Sept. 2007. [5] Q. Qu, Y. Pei, and W. Modestino, “An Adaptive MotionBased Unequal Error Protection Approach for Real-Time Video Transport Over Wireless IP Networks,” IEEE Trans. on CSVT, vol. 8, no. 5, pp. 1033–1044, Oct. 2006. [6] S. Milani, G. A. Mian, and L. Celetto, “Joint Optimization of Source-Channel Video Coding Using the H.264 Encoder and FEC Codes,” in Proc. of EUSIPCO 2005, Antalya, Turkey, Sept. 2005. [7] I. V. Baji`c, “The Effects of Channel Correlation on the Performance of Some Multiple Description Schemes,” in Proc. of CWIT 2007, Edmonton, Alberta, Canada, June 6–8 2007.