Priority Ordering and Packetization for Scalable Video Multicast with Network Coding*

Song Xiao1,2, Hui Wang2, and C.-C. Jay Kuo2

1 ISN key Lab, Xidian University, Xi’an, Shaanxi, 710071, China
2 Ming Hsieh Dept. of Electrical Engineering, University of Southern California, Los Angeles, CA, 90089, USA

Abstract. The integration of scalable video representation and network coding (NC) offers an excellent solution to robust and flexible video multicast over IP networks. In this work, we examine one critical component of this system, i.e., video priority ordering and packetization at the source of the multicast tree. First, a GOP-adaptive layer-based packet priority ordering algorithm is proposed to allow flexible prioritized video transmission with unequal error protection. Then, a packetization scheme tailored to NC delivery is discussed. Simulation results demonstrate that the proposed algorithms offer better video quality and bandwidth efficiency than the SNR-based packetization method.

Keywords: scalable video coding (SVC), network coding (NC), robust.

* This project is supported by the NSFC (No. 60702058) and Chinese Scholarship Council.

H.H.S. Ip et al. (Eds.): PCM 2007, LNCS 4810, pp. 520–529, 2007. © Springer-Verlag Berlin Heidelberg 2007

1 Introduction

With the maturity of video coding technologies and networking infrastructures and the rapid growth of computing power, digital video services such as video conferencing, multimedia chatting, video on demand (VoD) and IPTV have reached us over wired and wireless IP networks. Video transmission over broadband networks in general, and wireless networks in particular, suffers from time-varying bandwidth, packet delay and packet loss. The scalable video coding (SVC) standard [1], [2], developed as an extension of H.264/MPEG-4 AVC, has attracted considerable research interest for its flexible scalability and adaptability to a wide range of network conditions, applications and terminals. An efficient and robust SVC transmission system is expected to overcome these challenges for reliable video transmission.

Robust video transmission over wired or wireless IP networks has been studied extensively for years. Unequal error protection (UEP) [3], [4], [5] based on priority encoding transmission (PET) [6] is one of the most promising techniques among the numerous proposed methods. For example, the data partition mode in H.264 divides a bit stream into three types of data of different importance so that different channel coding rates can be applied to them. Rate allocation to minimize the distortion or power as well as maximize video quality was considered by Xiao et al. in [3]. For SVC transmission, Fang and Chau [4] proposed a scheme that uses a genetic algorithm to allocate an unequal amount of protection, based on Reed-Solomon codes, to different frames of a GOP (i.e., temporal scalability) or to the progressive bit stream in each frame (i.e., SNR scalability). Schierl et al. [5] used Raptor forward error correction (FEC) codes to protect SVC layers of different importance. These methods consider only source and channel coding; packets traversing the network are still delivered by the traditional store-and-forward (S/F) mechanism at intermediate nodes.

More recently, network coding (NC) [7]-[9] has been extensively studied by researchers in the information theory community. NC allows packets to be encoded at intermediate nodes and can achieve the maximum multicast information rate. The application of NC to wireless video multicasting using the H.264/SVC video format was studied by Wang, Xiao and Kuo [7]. However, due to the space limit, a critical component of the whole system was not well addressed in [7]; namely, video packetization at the source node. This work serves as a companion paper to [7] by focusing on the source video packetization problem.

In this research, a GOP-adaptive layer-based priority ordering algorithm is proposed to organize the H.264/SVC video bitstream, which makes the whole system more robust under the same bandwidth condition than the default SVC ordering scheme or the SNR-based ordering algorithm. Then, the packetization algorithm optimized for NC delivery is discussed.

The rest of this paper is organized as follows. Sec. 2 presents an overview of the NC-based video multicast system. Sec. 3 describes the packet priority ordering algorithm and the packetization algorithm in detail. The proposed ordering scheme is compared with other ordering algorithms in Sec. 4. Concluding remarks and future research directions are given in Sec. 5.

Fig. 1. The proposed SVC video multicast system using the NC technique in the network

2 Video Multicast with Network Coding

The proposed SVC video multicast system using the NC technique is shown in Fig. 1. First, SVC video coding and packetization are performed at the source node. Then, random linear network coding (RLNC) is performed at intermediate nodes of a multicast tree with multiple in-degrees. Finally, the Gaussian elimination method is used for random linear network decoding, which is followed by packet reconstruction at all receiving nodes. In this work, we focus on the first stage, i.e., source video packetization. For more details on RLNC encoding and decoding, we refer to [7]. There are three major steps in preparing packets at the source node, as detailed below.

Step 1: SVC bitstream generation and priority layering
The prioritized layers of H.264/SVC are represented as $L_0, \ldots, L_{M-1}$. Each layer is composed of symbols $s_j \in L_i$ over the finite Galois field $F$.

Step 2: Packet mapping
Different amounts of redundant protection bits are assigned to layers of different priority, and these data are interleaved into packets. Most of the PET-based methods [3], [4] adopt the $(n,k)$ Reed-Solomon code, where $n-k$ redundant packets protect $k$ source packets. With NC, we can use zeros rather than RS codes [7] to simplify the packetization process. For example, some packets are concatenations of symbols from all layers ($P_i = s_{1,i} \,\|\, s_{2,i} \,\|\, \cdots \,\|\, s_{M-1,i}$), while others are concatenations of symbols from some layers and redundant protection bits ($P_j = 0 \,\|\, 0 \,\|\, s_{m,j} \,\|\, \cdots \,\|\, s_{M-1,j}$). Interleaving data this way helps equalize the importance of different packets. As a result, there is no need for intermediate nodes to differentiate the importance of each packet in the NC mixing process, which simplifies the design and maintenance of multicast trees.

Step 3: Packet loading
Consider a video stream of $N$ packets. Due to the finite input bandwidth, we do not pump all $N$ packets into the source node simultaneously. Instead, they are organized into multiple groups, and one group is sent to the source node per unit time. This group of packets is represented by $G_t$, where $t$ is the time index. Packets within the same group are mixed in intermediate nodes of the network using the RLNC algorithm.

The loading time $\tau$ is the total time required for the source node to transmit all the packets of the video stream. Two factors determine the loading time. One is the capacity of the outgoing links of the source node, denoted by $C_s$, which limits the amount of data that can be sent to the network. The other is the bottleneck of the network, which corresponds to the minimum cut of the graph, denoted by $C_{min}$. If $C = \min(C_s, C_{min})$, we need $\tau = N / C$ time units to send out all $N$ packets to the receiver. In the following sections, we describe Step 1 and Step 2 in detail; namely, the packet priority ordering and packetization algorithms.
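Step 3 can be illustrated with a short sketch. The function names below are hypothetical, and rounding the loading time up to a whole number of time units is an assumption; the text only states $\tau = N / C$.

```python
import math

def loading_time(n_packets, c_source, c_min_cut):
    """Time units needed to load all packets, with C = min(C_s, C_min)."""
    c = min(c_source, c_min_cut)
    if c <= 0:
        raise ValueError("capacities must be positive")
    # Assumption: a partially filled last group still occupies one time unit.
    return math.ceil(n_packets / c)

def split_into_groups(packets, c_source, c_min_cut):
    """Split the packet stream into groups G_t, one group per unit time."""
    c = min(c_source, c_min_cut)
    return [packets[t:t + c] for t in range(0, len(packets), c)]

# Toy numbers: N = 10 packets, C_s = 4, C_min = 3, so C = 3.
GROUPS = split_into_groups(list(range(10)), 4, 3)
TAU = loading_time(10, 4, 3)
```

With these toy numbers the stream is split into four groups, the last of which holds a single packet.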


3 Priority Ordering and Packetization

3.1 Priority Ordering

When transmitting SVC video over the network, the source content is encoded once at the highest resolution and bit rate. If there is no error or packet loss during transmission, different receivers can extract bit streams of different resolutions according to their own capabilities. However, if errors or packet losses occur, the reconstructed video quality may degrade. The default ordering of the SVC bit stream is done according to the temporal layer (frame) in units of a GOP. Each temporal layer contains several spatial layers, each of which may contain one base SNR layer and several enhancement SNR layers. The frontal portion is the base layer. At the same time, SVC uses the network abstraction layer (NAL) packet as its basic unit for transmission over the network. When one NAL packet is lost, all packets that depend on this NAL packet are affected. Sometimes, the loss may make the subsequent bit stream undecodable at the receiver. As a result, the impact of packet loss on SVC video transmission over networks must be considered carefully.

A 3D layer-based ordering algorithm is proposed in this work to organize SVC video with prioritization within one GOP. It aims at the highest quality (or least distortion) for a given number of correctly received NAL packets. In other words, the impact of NAL packet loss on the overall performance of the GOP can be significantly reduced by the proposed ordering algorithm. The method can be applied to bandwidth adaptation. When the target resolution is determined, the method can provide optimal performance to the receivers under different bandwidth conditions.

Fig. 2. Illustration of the path selection process, where lines with different colors represent different paths from the start point to the destination point

The objective of the method is to find the path in the 3D coordinates that gives the optimal rate-distortion (RD) performance, where the 3D coordinates are formed by temporal, spatial and SNR layers as shown in Fig. 2. By the optimal path, we mean the path with the optimal rate-distortion performance from the lowest spatial, temporal and SNR resolution (i.e., the base layer) to the highest resolution (i.e., the inclusion of all enhancement layers). We see from the figure that there are many candidate paths for selection. Since each GOP of different video sequences may have different characteristics, the path may vary from one GOP to the other. The basic idea of the proposed algorithm is to choose several candidates under certain constraints at each local step and then integrate these local decisions to form a path that provides the optimal RD performance.

Let $(s,t,q)$ be the 3D integer layer indices, where $s$, $t$ and $q$ are the spatial, temporal and quality layer indices, respectively. The lowest resolution is $(0,0,0)$ while the highest resolution is $(s^*,t^*,q^*)$. By ordering, we map the 3D coordinates into a 1D array, i.e., $L(s,t,q) = i$, where $L$ represents a specific ordering scheme. We use $L_j$ to denote the $j$th element of the 1D array ordered by scheme $L$. It is clear that a legitimate order has to meet the following criterion:

$$\mathrm{PSNR}_{L_i} > \mathrm{PSNR}_{L_j}, \quad \forall i > j.$$

In other words, video quality improves if an additional layer is added to the existing video. A greedy algorithm to select a legitimate path is described below.

• Initialization ($i = 0$): The lowest layer $(0,0,0)$ is chosen as the start point of the path.

• Iteration ($i = 0, 1, \ldots$): Suppose that $L(s,t,q) = i$ is the current 1D index. We consider the three positions $(s+1,t,q)$, $(s,t+1,q)$ and $(s,t,q+1)$ as candidates for the next 1D index, denoted by $j = i+1$. Then, we choose the following one:

$$L_j^* = \arg\max \frac{\partial \mathrm{PSNR}_{L_j}}{\partial R_{L_j}},$$

where $\partial R_{L_j}$ is the rate increase due to the addition of this new layer. The above process is repeated until the maximum resolution along each dimension is reached.

3.2 Packetization Algorithm

After priority ordering, we consider the packetization problem. Assume that all NALs in the $p$th GOP are partitioned into $M$ layers $L_i$, $i = 0, 1, \ldots, M-1$. These $M$ layers are packetized into $N$ packets, each of which has $K$ bytes as shown in Fig. 3. The width and the height of each layer are $w_i$ and $h_i$ bytes, respectively. Then, we have

$$\sum_{i=0}^{M-1} w_i h_i = R_p \tag{1}$$

$$\sum_{i=0}^{M-1} h_i + H = K \tag{2}$$

$$w_i \le w_j \quad \text{when } i < j \tag{3}$$

where $R_p$ is the total bit rate of the $p$th GOP.
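Looking back at the greedy path selection of Sec. 3.1, a minimal sketch is given here. It assumes that the PSNR and cumulative rate of every operating point $(s,t,q)$ are available as lookup tables; the function name, the tables and their toy values are hypothetical, and the partial derivative is approximated by a finite difference between neighboring points.

```python
from itertools import product

def greedy_order(psnr, rate, top):
    """Greedy path from (0, 0, 0) to top = (s*, t*, q*).

    psnr[p] and rate[p] are hypothetical tables for operating point p = (s, t, q).
    Rates are assumed to increase strictly along every axis.
    """
    current = (0, 0, 0)
    path = [current]
    while current != top:
        candidates = []
        for axis in range(3):                      # spatial, temporal, quality
            if current[axis] < top[axis]:
                nxt = list(current)
                nxt[axis] += 1
                nxt = tuple(nxt)
                # Finite-difference estimate of dPSNR / dR for this step.
                slope = (psnr[nxt] - psnr[current]) / (rate[nxt] - rate[current])
                candidates.append((slope, nxt))
        current = max(candidates)[1]               # steepest PSNR gain per rate
        path.append(current)
    return path

# Toy 2 x 2 x 2 layer grid with made-up numbers.
TOP = (1, 1, 1)
POINTS = list(product(range(2), repeat=3))
RATE = {p: 10 + 10 * p[0] + 5 * p[1] + 20 * p[2] for p in POINTS}
PSNR = {p: 30 + 4 * p[0] + 3 * p[1] + 2 * p[2] for p in POINTS}
PATH = greedy_order(PSNR, RATE, TOP)
```

For these toy tables the temporal step has the steepest slope (3/5), followed by the spatial step (4/10) and the quality step (2/20), so the path visits (0,0,0), (0,1,0), (1,1,0) and (1,1,1) in that order.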

Fig. 3. The packetization structure
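One possible reading of the structure in Fig. 3 and constraints (1)-(3) is sketched below: each packet is a column of $K$ bytes, layer $i$ contributes $h_i$ bytes to each of the first $w_i$ packets, and the remaining cells of that layer are filled with zeros in place of RS parity. The column-wise byte order, the all-zero placeholder header and the function names are assumptions made for illustration.

```python
def check_layout(widths, heights, header_len, packet_len, gop_bytes):
    """Return True if (w_i, h_i) satisfy constraints (1)-(3)."""
    area_ok = sum(w * h for w, h in zip(widths, heights)) == gop_bytes   # (1)
    height_ok = sum(heights) + header_len == packet_len                  # (2)
    order_ok = all(a <= b for a, b in zip(widths, widths[1:]))           # (3)
    return area_ok and height_ok and order_ok

def build_packets(layers, widths, heights, header_len, n_packets):
    """Lay out layer bytes column by column; unused cells are zero."""
    packets = []
    for col in range(n_packets):
        body = bytearray(header_len)               # placeholder header of H zero bytes
        for data, w, h in zip(layers, widths, heights):
            if col < w:
                body += data[col * h:(col + 1) * h]
            else:
                body += bytes(h)                   # zero padding for this layer
        packets.append(bytes(body))
    return packets

# Toy example: M = 2 layers, N = 3 packets, H = 1, K = 5, R_p = 8 bytes.
LAYERS = [b"AB", b"CDEFGH"]
WIDTHS, HEIGHTS = [1, 3], [2, 2]
PACKETS = build_packets(LAYERS, WIDTHS, HEIGHTS, header_len=1, n_packets=3)
```

In this toy layout only the first packet carries bytes of the most important layer, so that layer occupies the fewest columns and is padded with the most zeros, which matches the ordering required by (3).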

Usually, $H$ and $K$ are fixed parameters. $N$ is always smaller than or equal to $C_s$, which is the capacity of the outgoing channel of the source node. We show how to find parameters $w_i$ and $h_i$, $i = 0, 1, \ldots, M-1$, below. The distortion function can be written as

$$D(p) = \sum_{i=0}^{M-1} \left( \Delta D_i \cdot P(\cdot) \right), \tag{4}$$

where

$$\Delta D_i = \begin{cases} D_i, & i = 0 \\ D_i - D_{i-1}, & 0 < i \le M-1 \end{cases}$$

and $P(\cdot)$ is the probability that the $i$th layer could be correctly received. The objective is to find parameters to minimize the overall distortion or maximize the quality of decoded video. This constrained optimization problem can be formulated as

$$\min_{w_i \in w_{opt},\ h_i \in h_{opt}} D \quad \text{subject to constraints (1)-(3)} \tag{5}$$
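The sum in (4) can be evaluated literally as in the sketch below. The per-layer reception probabilities are taken as given inputs, since the arguments of $P(\cdot)$ are not spelled out at this point; the function name and the numbers are hypothetical.

```python
def distortion(d, p_recv):
    """Evaluate (4): sum of dD_i * P_i, with dD_0 = D_0 and dD_i = D_i - D_{i-1}."""
    total = 0.0
    previous = 0.0                      # makes the first increment equal to D_0
    for d_i, p_i in zip(d, p_recv):
        total += (d_i - previous) * p_i
        previous = d_i
    return total

# Made-up distortions D_0 > D_1 > D_2 for M = 3 layers.
D = [100.0, 60.0, 40.0]
ALL_RECEIVED = distortion(D, [1.0, 1.0, 1.0])
PARTIAL = distortion(D, [1.0, 0.5, 0.0])
```

When every layer is received with probability one, the sum telescopes to the distortion of the last layer, which is a quick sanity check on the increments.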

By using random linear network coding (RLNC), the source information is distributed among different packets. For example, for the $i$th layer, $w_i$ units of source information are distributed among $N$ packets. The more the source information is distributed, the higher the probability of correct reception. Then, we can relate $P(\cdot)$ in (4) with the ratios of $w_i$ and replace the objective function in (5) by the following:

$$\Delta = \sum_{i=0}^{M-1} \left( \frac{D_0}{D_i} \, \frac{w_0}{w_i} \right) \tag{6}$$

$$\min_{w_i \in w_{opt},\ h_i \in h_{opt}} \Delta \quad \text{subject to constraints (1)-(3)} \tag{7}$$

The fast bidirectional local search algorithm with iterative improvement can be used to solve the optimization problem. For details, we refer to [3].

3.3 Packet Delivery with Network Coding

After priority ordering and packetization, we have $w_{M-1}$ source symbol vectors, denoted by $x_j = [x_{j,1}, x_{j,2}, \ldots, x_{j,K}]$, $j = 0, 1, \ldots, w_{M-1}-1$, for one GOP. To transmit them, we can follow the standard network coding framework as stated in [8]-[9]. Consider an acyclic graph $(V, E)$, a sender $s \in V$ and a set of receivers $T \subseteq V$. Then, each edge $e \in E$ dispersed from a node $v = in(e)$ carries a symbol $y(e)$ that is a linear combination of source symbols, i.e.,

$$y(e) = \sum_{j=0}^{w_{M-1}-1} f_j(e)\, x_j,$$

where the vector of coefficients $f(e) = [f_0(e), f_1(e), \ldots, f_{w_{M-1}-1}(e)]$ is known as the global kernel vector on edge $e$. It can be determined recursively by local kernel vectors that are randomly chosen from the finite field $F$ and entering symbols. Any receiver $t \in T$ receiving $w_i$ ($i = 0, 1, \ldots, M-1$) or more incoming symbols in the form

$$\begin{bmatrix} y(e_0) \\ y(e_1) \\ \vdots \\ y(e_{w_i}) \end{bmatrix} = \begin{bmatrix} f_0(e_0) & f_1(e_0) & \cdots & f_{w_i}(e_0) \\ f_0(e_1) & f_1(e_1) & \cdots & f_{w_i}(e_1) \\ \vdots & \vdots & \ddots & \vdots \\ f_0(e_{w_i}) & f_1(e_{w_i}) & \cdots & f_{w_i}(e_{w_i}) \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{w_i} \end{bmatrix} = F_e \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{w_i} \end{bmatrix} \tag{8}$$

can recover source symbols $x_0, x_1, \ldots, x_{w_i}$ as long as the matrix $F_e$ of global kernel vectors $f(e_0), f(e_1), \ldots, f(e_{w_i})$ has rank $w_i$. This implies that $i$ source layers (including all layers smaller than or equal to $i$) can be decoded correctly. During the packetization procedure, the global kernel vector is recorded in each packet header so that it can reach any receiver via received packets. This can be implemented by appending the $j$th global kernel vector to the $j$th source vector $x_j$, $j = 0, 1, \ldots, w_{M-1}-1$. Any receiver can recover the source vector by applying the Gaussian elimination algorithm to $w_{M-1}$ or more received packets.
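The encoding and decoding described in this subsection can be sketched as follows. This is a simplified single-stage illustration, not the multicast system of [7]: coded packets are generated directly from the source vectors rather than recoded hop by hop, the reduction polynomial 0x11D with generator 2 is an assumption (only the field size is specified), and all function names are hypothetical. Each coded packet carries its global kernel vector in front of the payload.

```python
import random

# Log/antilog tables for GF(2^8); polynomial 0x11D and generator 2 are assumptions.
EXP = [0] * 510
LOG = [0] * 256
_value = 1
for _i in range(255):
    EXP[_i] = _value
    LOG[_value] = _i
    _value <<= 1
    if _value & 0x100:
        _value ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]

def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    return EXP[255 - LOG[a]]

def rlnc_encode(sources, n_coded, rng):
    """Return n_coded packets: random coefficient vector followed by the payload."""
    coded = []
    for _ in range(n_coded):
        coeffs = [rng.randrange(256) for _ in sources]
        payload = [0] * len(sources[0])
        for c, src in zip(coeffs, sources):
            for pos, sym in enumerate(src):
                payload[pos] ^= gf_mul(c, sym)     # addition in GF(2^8) is XOR
        coded.append(coeffs + payload)
    return coded

def rlnc_decode(coded, k):
    """Gauss-Jordan elimination on [F | Y]; returns k source vectors or None."""
    rows = [list(r) for r in coded]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None                            # rank deficient: more packets needed
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [v ^ gf_mul(factor, p) for v, p in zip(rows[r], rows[col])]
    return [row[k:] for row in rows[:k]]

# Toy run: 3 source vectors, 5 coded packets, decode from all of them.
SOURCES = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
CODED = rlnc_encode(SOURCES, 5, random.Random(0))
DECODED = rlnc_decode(CODED, 3)
```

Decoding succeeds whenever the received coefficient vectors span the full space, which for random coefficients over a field of 256 elements fails only with very small probability.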


4 Simulation Results

To verify the efficiency of the proposed priority ordering and packetization algorithms, the bit stream generated by the SVC reference source code JSVM6 [2] was used. The standard sequences mother&daughter (class A), foreman (class B) and football (class C), each of size 352×288 at 30 frames per second with 300 frames in total, were tested. The test sequences were coded by H.264/SVC with 2 spatial layers, 3 temporal layers and 2 SNR layers. The GOP size was 32, where the first frame of each GOP was intra-coded. A 3-tier multicast tree with 5 receiving nodes of increasing access capacity was adopted to simulate the single-source multicast network. The maximum in-degree of any intermediate node was 3. The finite field was of size 256. We set the target resolution to be the full spatial, temporal and SNR resolutions. If the bit stream could not be decoded due to packet loss, error concealment by frame copy in the temporal domain and by the AVC half-sample interpolation filter ({1, -5, 20, 20, -5, 1}/32 for luminance and {16, 16}/32 for chrominance) in the spatial domain was used to achieve the target resolution.
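As an illustration of the spatial concealment step, the luminance taps quoted above can be applied to one row of samples as sketched below. This is a one-dimensional sketch only: the edge replication, the rounding offset and the function name are assumptions, and the actual JSVM behavior may differ.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # luminance taps from the text, normalized by 32

def upsample_row_2x(row):
    """Double the length of a row of 8-bit luminance samples."""
    n = len(row)
    out = []
    for i in range(n):
        out.append(row[i])                           # integer position: copy sample
        acc = 0
        for k, tap in enumerate(TAPS):
            j = min(max(i - 2 + k, 0), n - 1)        # assumption: replicate edge samples
            acc += tap * row[j]
        # Assumption: round to nearest, divide by 32, clip to the 8-bit range.
        out.append(min(max((acc + 16) >> 5, 0), 255))
    return out

FLAT = upsample_row_2x([100, 100, 100, 100])
RAMP = upsample_row_2x([0, 10, 20, 30, 40, 50])
```

Because the taps sum to 32, a flat row stays flat, and on a linear ramp the interior half-sample falls midway between its two neighbors.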

(a) Mother&daughter sequence

(b) Foreman sequence

(c) Football sequence

Fig. 4. R-D performance comparison of three priority ordering methods


In Fig. 4, we compare the R-D performance of three sequences using the default SVC ordering, the SNR-based ordering and the proposed priority ordering, where the PSNR values averaged over multiple simulation runs are shown. From the highest resolution layer to the lowest one, the bit stream is arranged first by the SNR layer, then the temporal layer and finally the spatial layer for the SNR-based ordering, while it is arranged first by the temporal layer, then the spatial layer and finally the SNR layer for the default SVC ordering. When the bit rate is low, the proposed ordering method provides better performance than the SNR-based ordering and the default SVC bit stream for all three sequences. Once the bit rate reaches a certain level (about 18 Kbytes for mother&daughter, 41 Kbytes for foreman and 103 Kbytes for football), the gap between the proposed ordering algorithm and the SNR-based ordering algorithm becomes very small, because the SNR layer contributes more to the R-D performance at this point and the proposed method will choose the SNR layer as its increment direction. However, both are still much better than the default SVC ordering. When the bit rate reaches the highest spatial, temporal and SNR resolution, the difference among the three ordering methods disappears.

It is also shown in the figures that the R-D performance of the three priority ordering methods is content sensitive. For sequences with low spatial detail and a low amount of movement (mother&daughter), the performance gap between the proposed ordering algorithm and the SNR-based ordering algorithm is the smallest (up to 1.8dB) among the three sequences. For sequences with medium spatial detail and a low amount of movement (foreman), the coding gain of the proposed method over the SNR-based method (up to 3.7dB) is even higher than that of sequences with a medium amount of movement and high spatial detail (football) (up to 2.8dB). This is probably because, when the bit rate is low, the spatial enhancement layer often contributes more to the R-D performance, and the foreman (class B) sequence has less spatial detail than the football (class C) sequence. Thus, when the full resolution is required but only a small number of bits are correctly decoded at the receiver, class B sequences perform better than class C sequences.

Fig. 5. Comparison of NC and S/F delivery mechanism using three orderings


The performance of the default SVC ordering, the SNR-based ordering and the proposed priority ordering at different packet loss rates, for video transmission over a multicast network using NC delivery and store-and-forward (S/F) delivery with Reed-Solomon codes, is shown in Fig. 5. The results were obtained from 1000 simulation runs. We see that NC delivery outperforms S/F delivery by about 5dB at all packet loss rates for the same ordering. With the same delivery mechanism, the proposed ordering gives the best performance while the default SVC ordering gives the worst. Our priority ordering outperforms the SNR-based ordering by 3dB under different packet loss rates with either the NC or the S/F delivery mechanism.

5 Conclusion and Future Work

A new priority ordering scheme and a packetization algorithm for H.264/SVC video multicasting using NC were proposed. The performance of the proposed priority ordering and packetization with NC or S/F delivery was demonstrated by computer simulation. The proposed priority ordering method provides better performance than the SNR-based and default SVC methods. The performance can be further improved by NC as compared with S/F delivery with FEC. Our preliminary study reveals the significant advantage of integrating NC and H.264/SVC in video multicasting. More test cases, including different network topologies and video sequences, will be conducted in the near future.

References

1. Wiegand, T., Sullivan, G., Reichel, J., Schwarz, H., Wien, M.: Joint Draft 8 of SVC Amendment, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Hangzhou, China (October 20-27, 2006)
2. Reichel, J., Schwarz, H., Wien, M.: Joint scalable video model JSVM-6, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Geneva, Switzerland (31 March-7 April 2006)
3. Xiao, S., Wu, C., Du, J., Yang, Y.: Reliable transmission of H.264 video over wireless network. In: Proceedings of the 20th International Conference on Advanced Information Networking and Applications, Vienna, Austria, vol. 2, pp. 84–88 (April 18-20, 2006)
4. Fang, T., Chau, L.P.: GOP-based channel rate allocation using genetic algorithm for scalable video streaming over error-prone networks. IEEE Trans. Image Processing 15(6), 1323–1329 (2006)
5. Schierl, T., Ganger, K., Hellge, C., Wiegand, T.: SVC-based multisource streaming for robust video transmission in mobile ad hoc network. IEEE Wireless Comm. 13(5), 96–103 (2006)
6. Albanese, A., Blömer, J., Edmonds, J., Luby, M., Sudan, M.: Priority encoding transmission. IEEE Trans. Inf. Theory 42(6), 1737–1744 (1996)
7. Wang, H., Xiao, S., Jay Kuo, C.-C.: Robust and flexible wireless video multicast with network coding (accepted for publication in Globecom 2007)
8. Chou, P.A., Wu, Y., Jain, K.: Practical network coding. In: 41st Allerton Conference on Communication, Control, and Computing, Monticello, IL (October 2003)
9. Li, S.-Y.R., Yeung, R.W., Cai, N.: Linear network coding. IEEE Trans. on Information Theory 49(2), 371–381 (2003)
