J. Vis. Commun. Image R. 16 (2005) 93–114 www.elsevier.com/locate/jvci

An error resilient coding scheme for H.264/AVC video transmission based on data embedding

Li-Wei Kang, Jin-Jang Leou*
Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan 621, ROC

Received 29 December 2003; accepted 15 April 2004
Available online 14 August 2004

Abstract

For entropy-coded H.264/AVC video frames, a transmission error in a codeword will not only affect the underlying codeword but may also affect subsequent codewords, resulting in a great degradation of the received video frames. In this study, an error resilient coding scheme for H.264/AVC video transmission is proposed. At the encoder, for an H.264/AVC intra-coded I frame, the important data for each macroblock (MB) are extracted and embedded into the next frame by the proposed MB-interleaving slice-based data embedding scheme for I frames. For an H.264/AVC inter-coded P frame, two types of important data with different error recovery capabilities for each MB are extracted and embedded into the next frame by the proposed MB-interleaving slice-based data embedding scheme for P frames. At the decoder, if the important data for a corrupted MB can be correctly extracted, the extracted important data for the corrupted MB will facilitate the employed error concealment scheme to conceal the corrupted MB; otherwise, the employed error concealment scheme is simply used to conceal the corrupted MB. As compared with some recent error resilient approaches based on data embedding, in this study, the important data selection mechanism for different types of MBs, the detailed data embedding mechanism, and the error detection and concealment scheme performed at the decoder are well developed to design an integrated error resilient coding scheme. Additionally, two types of important data with different transmission error recovery capabilities for each MB in P frames can provide more reliable error resiliency. Based on the simulation results obtained in this study, the proposed scheme can recover high-quality H.264/AVC video frames from the corresponding corrupted video frames up to a video packet loss rate of 20%.
© 2004 Elsevier Inc. All rights reserved.

Index terms: Error resilient coding; Error concealment; H.264/AVC video; Transmission error; Data embedding

This work was supported in part by the National Science Council, Republic of China, under Grants NSC 91-2213-E-194-025 and NSC 92-2213-E-194-038.
* Corresponding author. Fax: +886-5-2720859. E-mail address: [email protected] (J.-J. Leou).
1047-3203/$ - see front matter © 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.jvcir.2004.04.003

1. Introduction

To reduce the transmission bit rate or storage capacity, many compression techniques have been developed for various applications, such as videophones, videoconferencing, and multimedia communications. Reliable transmission of compressed images/video over noisy channels is a challenging problem (Wang and Zhu, 1998; Wang et al., 2000, 2002a). For entropy-coded H.264/AVC video frames (JVT, 2003; Stockhammer et al., 2003; Wenger, 2003; Wiegand et al., 2003), a transmission error in a codeword will not only affect the underlying codeword but may also affect subsequent codewords, resulting in a great degradation of the received video frames. To cope with the synchronization problem, each of the two top layers of the H.264/AVC hierarchical structure (JVT, 2003), namely, picture and slice, is preceded by a fixed-length start code. After the decoder receives any start code (a synchronization codeword), it resynchronizes regardless of the preceding slippage. Although the propagation effect of a transmission error within a video frame can be terminated when any start code is correctly received, a transmission error may still affect the underlying codeword and its subsequent codewords within a corrupted slice. Moreover, because of the use of motion-compensated interframe coding, the effect of a transmission error may be propagated to subsequent video frames, as illustrated by the example shown in Fig. 1. In this study, an error resilient coding scheme for H.264/AVC video transmission based on data embedding is proposed.

Fig. 1. The error-free and corrupted H.264/AVC video frames of the sixth frame of the 'Carphone' sequence with the video packet loss rate = 10%: (A) the error-free video frame and (B) the corresponding corrupted video frame.


In general, error resilient approaches fall into three categories (Wang and Zhu, 1998; Wang et al., 2000), namely: (1) the error resilient encoding approach (Frossard and Verscheure, 2001; Gallant and Kossentini, 2001; Redmill and Kingsbury, 1996; Wang and Lin, 2002), (2) the error concealment approach (Kang and Leou, 2002a, 2004; Li and Orchard, 2002; Wang et al., 2002; Zhang et al., 2000), and (3) the encoder–decoder interactive error control approach (Stockhammer et al., 2002). The error resilient encoding approach can be further divided into four categories, namely, (1) the robust entropy coding approach (Redmill and Kingsbury, 1996), (2) the error resilient prediction approach (Frossard and Verscheure, 2001), (3) the layered coding with unequal error protection approach (Gallant and Kossentini, 2001), and (4) the multiple description coding approach (Wang and Lin, 2002). The robust entropy coding approach (Redmill and Kingsbury, 1996) copes with the synchronization problem, whereas the error resilient prediction approach (Frossard and Verscheure, 2001) copes with the temporal error propagation problem. The layered coding with unequal error protection approach (Gallant and Kossentini, 2001) divides an image/video bitstream into a base layer and one or several enhancement layer(s) with unequal degrees of error protection. The multiple description coding approach (Wang and Lin, 2002) divides an image/video bitstream into several sub-bitstreams, known as descriptions; any single description can provide a basic quality, and more descriptions together provide an improved quality.

The error concealment approach (Kang and Leou, 2002a, 2004; Li and Orchard, 2002; Wang et al., 2002; Zhang et al., 2000) conceals the corrupted (lost) information due to transmission errors in the transmitted video bitstream at the decoder by using (1) spatial (spectral) (Li and Orchard, 2002), (2) temporal (Kang and Leou, 2002a), or (3) hybrid (spatial and temporal) (Kang and Leou, 2002a, 2004; Wang et al., 2002) image/video information. Using the information of spatially and/or temporally neighboring blocks of a corrupted block to conceal the corrupted block may introduce some problems. First, the information of neighboring blocks may not be available (the neighboring blocks may also be corrupted). Second, the video contents of a corrupted block and its neighboring blocks may be very different. In those cases, the concealed results of the foregoing approaches are usually not good enough.

Recently, several error resilient coding approaches based on data embedding have been proposed (Kang and Leou, 2003a,b, 2002b; Song and Liu, 2001; Yilmaz and Alatan, 2003; Yin et al., 2001; Zeng, 2003), in which some important data useful for error concealment performed at the decoder are embedded into video frames at the encoder. The embedded data should be 'almost' invisible and should not degrade the video quality greatly, just like digital watermarking (Hartung and Kutter, 1999). At the decoder, if some corrupted blocks are detected and located, the important (embedded) data for the corrupted blocks are extracted and used to facilitate error concealment performed at the decoder. Song and Liu (2001) proposed a data embedding scheme for error-prone channels, in which some redundant information used to protect the motion vectors (MVs) and coding modes of macroblocks (MBs) in one frame is embedded into the MVs in the next frame.
Based on the two assumptions that the next frame of a corrupted video frame will be correctly received and that at most one group of blocks (GOB) will be corrupted in a video frame, the decoder can exactly recover the MVs of the corrupted GOBs in the corrupted video frame. In some recent error resilient approaches based on data embedding (Song and Liu, 2001; Yilmaz and Alatan, 2003; Zeng, 2003), the case in which both an MB (or block) and its important data (embedded into another MB or block) are corrupted simultaneously is not well treated. In this case, the embedded data for a corrupted MB (or block) are not available and an error concealment scheme is required. In this study, the important data selection mechanism for different types of MBs, the detailed data embedding mechanism, and the error detection and concealment scheme performed at the decoder are well developed to design an integrated error resilient coding scheme. Additionally, the above-mentioned case is well treated.

In this study, an error resilient coding scheme for H.264/AVC video transmission is proposed. At the encoder, for an I frame, the important data for each MB are extracted and embedded into the next frame by the proposed MB-interleaving slice-based data embedding scheme for I frames. For a P frame, two types of important data for each MB are extracted and embedded into the next frame by the proposed MB-interleaving slice-based data embedding scheme for P frames. At the decoder, if the important data for a corrupted MB can be correctly extracted, the extracted important data will facilitate the employed error concealment scheme to conceal the corrupted MB; otherwise, the employed error concealment scheme is simply used to conceal the corrupted MB.

This paper is organized as follows. A brief overview of the H.264/AVC video compression standard and the employed error detection and error concealment schemes for H.264/AVC video transmission are given in Section 2. The proposed error resilient coding scheme for H.264/AVC video transmission is addressed in Section 3. Simulation results are included in Section 4, followed by concluding remarks.

2. H.264/AVC video compression standard, error detection, and error concealment for H.264/AVC video transmission

2.1. H.264/AVC video compression standard

H.264/AVC (JVT, 2003; Stockhammer et al., 2003; Wenger, 2003; Wiegand et al., 2003) is a recent video coding standard developed by the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goals of H.264/AVC are to provide enhanced compression performance and network-friendly video applications, including 'conversational' and 'non-conversational' applications. Similar to previous video coding standards, such as H.263, H.264/AVC utilizes transform coding of the prediction residual. However, in H.264/AVC, the transform is applied to 4 × 4 blocks and, instead of the conventional discrete cosine transform (DCT), a 4 × 4 integer DCT is used so that the inverse transform mismatch effect is avoided. Additionally, H.264/AVC provides several enhanced features, such as (1) variable block-size motion compensation; (2) quarter-sample-accuracy motion compensation; (3) multiple reference frame motion compensation; (4) directional spatial prediction for intra coding; (5) more effective entropy coding, e.g., context-adaptive variable-length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC); and (6) more flexible encoding features, e.g., flexible slice size, flexible macroblock ordering (FMO), and arbitrary slice ordering (ASO).

2.2. Error detection for H.264/AVC video transmission

In this study, error detection for an H.264/AVC slice (or, equivalently, a video packet) is performed by checking a set of error-checking conditions, without adding extra redundant bits to the transmitted video bitstream. The set of error-checking conditions is derived from the constraints imposed on the H.264/AVC video bitstream syntax and is listed as follows. (1) An invalid codeword for the VLC code, the transform coefficient, the motion vector code, CBP, DQUANT, the MBTYPE code, or the REFFRAME code is found. (2) The total number of decoded MBs in a slice is not equal to the size of the slice. (3) The number of decoded transform coefficients within a 4 × 4 block is larger than 16. (4) Invalid video data are detected; for example, the prediction error between the predictive block and the current block is an invalid value. If any of the above error-checking conditions is satisfied, decoding of the current slice (or video packet) is stopped and the current slice (or video packet) is marked as a corrupted one.

2.3. Error concealment for H.264/AVC video transmission

In this study, the spatial error concealment algorithm for I frames in H.264/AVC (Wang et al., 2002) is employed, in which each pixel value in a corrupted MB is concealed by a weighted sum of the closest boundary pixels of the selected four-connected neighboring MBs. The weight associated with each boundary pixel is proportional to the inverse of the distance between the pixel to be concealed and the closest boundary pixel.

In this study, the two employed error concealment schemes for P frames are based on the motion-compensated best neighborhood matching (BNM) algorithm (Kang and Leou, 2002a). As shown in Fig. 2, each corrupted block of size M × M is extracted from a video frame together with its neighborhood as a range block of size (M + m) × (M + m). Within a range block, all the pixels in the corrupted region belong to the lost part and the others belong to the good part. After a range block is extracted, an H × L searching range block centered at the motion-compensated block in the reference frame is generated. Here, the motion vector (MV) of a corrupted block is recovered as the average of the MVs of the 'believable' (i.e., correctly received or previously concealed) neighboring blocks of the corrupted block. Each (M + m) × (M + m) block in the searching range block may be a candidate domain block for recovering the lost part of the range block (the corrupted block).


Fig. 2. The motion-compensated best neighborhood matching (BNM) algorithm.

For each candidate domain block, the mean absolute error (MAE) between the good part of the range block and the corresponding good part of the candidate domain block is evaluated. The candidate domain block with the minimum MAE (the best domain block) is then used to conceal the lost part of the range block by copying its corresponding central part to the lost part of the range block. The first employed error concealment scheme is the zero MV BNM algorithm, in which S candidate domain blocks with zero MVs in the S reference frames of a corrupted P frame are evaluated. Here S = 3. The second error concealment scheme applies the motion-compensated BNM algorithm (Kang and Leou, 2002a) to the S reference frames.

In general, a corrupted MB (to be concealed) usually has statistical, spectral, spatial, and 'motion' properties similar to those of its believable neighboring MBs. Thus, a corrupted MB usually has a motion vector similar to those of its believable neighboring MBs. In the two employed error concealment schemes, based on the criteria (Kang and Leou, 2004) for determining the motion mode ('slow-motion' or 'fast-motion') of a corrupted MB, the average motion magnitude, $|\mathrm{MV}|_{\mathrm{ave}}$, for a corrupted MB in a P frame is computed by

$$|\mathrm{MV}|_{\mathrm{ave}} = \sum_{i=1}^{N_b} |\mathrm{MV}_i| / N_b, \qquad (1)$$

where $N_b$ is the number of blocks in the believable eight-connected spatially neighboring MBs of the corrupted MB and $|\mathrm{MV}_i|$ is the magnitude of the MV of the ith block in the neighboring MBs, $1 \le i \le N_b$. Here, the magnitude, $|\mathrm{MV}|$, of an MV, $\mathrm{MV} = (\mathrm{MV}_x, \mathrm{MV}_y)$, is defined as

$$|\mathrm{MV}| = \sqrt{\mathrm{MV}_x^2 + \mathrm{MV}_y^2}, \qquad (2)$$

where $\mathrm{MV}_x$ and $\mathrm{MV}_y$ are the horizontal and vertical components of MV, respectively. If $|\mathrm{MV}|_{\mathrm{ave}}$ is smaller than a predefined threshold, $T_{\mathrm{MV}}$, the corrupted MB is determined to be a 'slow-motion' MB and is concealed by applying the zero MV BNM algorithm on the S reference frames (selecting the best one among the S reference frames). Otherwise, the corrupted MB is determined to be a 'fast-motion' MB and is concealed by applying the motion-compensated BNM algorithm (with the motion vector MV) on the S reference frames (selecting the best one among the S reference frames).
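To make the motion-mode decision concrete, the following Python sketch computes the average motion magnitude of Eqs. (1)-(2) and selects between the two employed BNM variants. It is an illustration rather than the authors' implementation; the threshold value T_MV = 5 follows the parameter settings reported in Section 4, and the function names and data layout are assumptions made for this example.

```python
import math

T_MV = 5  # threshold on the average motion magnitude (value reported in Section 4)

def mv_magnitude(mv):
    """Eq. (2): |MV| = sqrt(MV_x^2 + MV_y^2)."""
    mv_x, mv_y = mv
    return math.sqrt(mv_x ** 2 + mv_y ** 2)

def average_mv_magnitude(believable_neighbor_mvs):
    """Eq. (1): |MV|_ave over the N_b blocks of the believable neighboring MBs."""
    n_b = len(believable_neighbor_mvs)
    if n_b == 0:
        return 0.0  # no believable neighbors: treat as slow motion
    return sum(mv_magnitude(mv) for mv in believable_neighbor_mvs) / n_b

def select_concealment_mode(believable_neighbor_mvs):
    """Return which BNM variant would be applied to the S reference frames."""
    if average_mv_magnitude(believable_neighbor_mvs) < T_MV:
        return "zero-MV BNM"            # slow-motion MB
    return "motion-compensated BNM"     # fast-motion MB

# Example: three believable neighboring blocks with small motion
print(select_concealment_mode([(0, 1), (1, 0), (2, 1)]))  # -> "zero-MV BNM"
```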

3. Proposed error resilient coding scheme

In this study, an error resilient coding scheme for H.264/AVC video transmission based on data embedding is proposed. Within the proposed scheme, the following issues are addressed: (1) what kind of important data for the macroblocks (MBs) within a video frame should be extracted and embedded, (2) where the important data should be embedded, (3) how to embed the important data into the corresponding 'masking' MBs or slices, and (4) how to extract and use the important data to facilitate error concealment performed at the decoder.

In this study, the flexible macroblock ordering (FMO) capability in H.264/AVC (JVT, 2003; Stockhammer et al., 2003; Wenger, 2003; Wiegand et al., 2003) is enabled. By using FMO, a video frame can be partitioned into several slice groups, in which each slice group contains a set of MBs defined by an MB-to-slice-group map. Each slice group contains one or more slices. In this study, a QCIF video frame is partitioned into nine slice groups, in which each slice group consists of exactly one slice. A slice contains the even-numbered MBs in one row and the odd-numbered MBs in another row, so that the neighboring MBs of a corrupted MB may not be corrupted simultaneously. The 99 MBs of the nine slices of a QCIF video frame are illustrated in Fig. 3, in which the number (0–8) denotes the slice that an MB belongs to. For example, the first slice contains the even-numbered MBs in the first row and the odd-numbered MBs in the second row. Note that a slice is the smallest synchronization unit in H.264/AVC.

Fig. 3. The 99 macroblocks of the nine slices of a QCIF H.264/AVC video frame.
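The sketch below illustrates one MB-to-slice-group map of this kind for a QCIF frame (11 × 9 = 99 MBs). Only slice 0 (even MBs of the first row plus odd MBs of the second row) is stated explicitly in the text; the pairing assumed here for the remaining slices (slice s takes the even MBs of row s and the odd MBs of row s + 1, wrapping around) is an illustrative assumption, not necessarily the exact map of Fig. 3.

```python
# Minimal sketch of an FMO macroblock-to-slice-group map in the spirit of Fig. 3.
# Assumption: slice s = even-numbered MBs of row s + odd-numbered MBs of row (s + 1) mod 9.

MB_COLS, MB_ROWS = 11, 9  # QCIF: 176x144 pixels = 11x9 macroblocks = 99 MBs

def slice_of_mb(mb_index):
    """Return the assumed slice (group) number of a raster-scan MB index."""
    row, col = divmod(mb_index, MB_COLS)
    if col % 2 == 0:
        return row                    # even-numbered MB stays with its own row's slice
    return (row - 1) % MB_ROWS        # odd-numbered MB joins the previous row's slice

# Each slice then covers 6 even MBs of one row and 5 odd MBs of the next row, so
# the spatial neighbors of any MB mostly belong to other slices (other packets).
slices = {}
for mb in range(MB_COLS * MB_ROWS):
    slices.setdefault(slice_of_mb(mb), []).append(mb)
assert all(len(members) == 11 for members in slices.values())
```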


3.1. Proposed error resilient coding scheme for H.264/AVC intra-coded I frames

3.1.1. What important data should be extracted and embedded?

At the encoder, similar to Yilmaz and Alatan (2003) and Yin et al. (2001), for an H.264/AVC intra-coded I frame, the edge direction within an MB is extracted as its important data. For a pixel f(i, j) in an MB, the two gradient components, $G_x(i, j)$ and $G_y(i, j)$, of f(i, j) are computed by the Sobel operators (Gonzalez and Woods, 2002). The magnitude, $|\nabla f(i, j)|$, and the direction, $\theta(i, j)$, of the gradient of f(i, j) can be computed as:

$$|\nabla f(i, j)| = |G_x(i, j)| + |G_y(i, j)|, \qquad (3)$$

$$\theta(i, j) = \arctan\left(G_y(i, j) / G_x(i, j)\right). \qquad (4)$$

If $|\nabla f(i, j)|$ is greater than a predefined threshold, t, f(i, j) is determined to be an edge pixel. If an MB contains no edge pixels, it is determined to be a 'smooth' MB; otherwise, it is determined to be an 'edge' MB. For an edge MB, the direction of each edge pixel is quantized to one of the E equally spaced directions between 0° and 180°, and the dominant edge direction of the edge MB is used as its edge direction (Yilmaz and Alatan, 2003; Yin et al., 2001). Hence, one bit is used to denote the MB type (smooth or edge) and $\lceil \log_2 E \rceil$ bits are used to denote the dominant edge direction of an edge MB, i.e., the important data for a 'smooth' MB contain only 1 bit and the important data for an 'edge' MB contain $1 + \lceil \log_2 E \rceil$ bits.
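As a hedged sketch of this extraction step, the code below computes the Sobel gradients of Eqs. (3)-(4), applies the edge-pixel threshold, and returns the dominant quantized edge direction of an 'edge' MB. The values t = 300 and E = 32 follow Section 4; the loop-based convolution and the majority vote over quantized directions are implementation assumptions, not the authors' code.

```python
import numpy as np

T_EDGE = 300   # gradient-magnitude threshold t (Section 4)
E_DIRS = 32    # number of quantized directions in [0, 180) (Section 4)

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def mb_important_data(mb):
    """mb: 16x16 luma block (2-D array). Returns ('smooth',) or ('edge', direction_index)."""
    mb = np.asarray(mb, dtype=float)
    votes = np.zeros(E_DIRS, dtype=int)
    for i in range(1, mb.shape[0] - 1):
        for j in range(1, mb.shape[1] - 1):
            window = mb[i - 1:i + 2, j - 1:j + 2]
            gx = float(np.sum(window * SOBEL_X))
            gy = float(np.sum(window * SOBEL_Y))
            if abs(gx) + abs(gy) > T_EDGE:                      # Eq. (3): |grad f| = |Gx| + |Gy|
                theta = np.degrees(np.arctan2(gy, gx)) % 180.0  # Eq. (4), folded into [0, 180)
                votes[int(theta // (180.0 / E_DIRS)) % E_DIRS] += 1
    if votes.sum() == 0:
        return ("smooth",)                    # important data: 1 bit
    return ("edge", int(votes.argmax()))      # important data: 1 + ceil(log2 E) = 6 bits

# Example: a vertical step edge should yield an 'edge' MB
block = np.zeros((16, 16)); block[:, 8:] = 255
print(mb_important_data(block))
```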

3.1.2. Where to embed the important data?

The important data for each MB in an I frame will be embedded by using the proposed MB-interleaving slice-based data embedding scheme, in which the important data for all the MBs in a slice will be embedded into two corresponding 'masking' slices in the next P frame. Because a slice and its masking slices may be corrupted simultaneously, the important data for a slice should not be completely embedded into only one masking slice. For example, the important data for the even-numbered MBs of slice 0 and those for the odd-numbered MBs of slice 2 can be interleaved and concatenated into a mixed bitstream, which is embedded into one 'masking' slice, slice 1, in the next P frame. The corresponding 'masking' slices in the second frame for all the slices in the first QCIF I frame are illustrated in Table 1.

Table 1
The corresponding 'masking' slices in the second video frame for all the slices in the first QCIF H.264/AVC intra-coded I frame

Masking slice in the second frame    Embedded important data from the first I frame
0                                    Important data for even MBs of slice 8 and odd MBs of slice 3
1                                    Important data for even MBs of slice 0 and odd MBs of slice 2
2                                    Important data for even MBs of slice 3 and odd MBs of slice 1
3                                    Important data for even MBs of slice 2 and odd MBs of slice 0
4                                    Important data for even MBs of slice 5 and odd MBs of slice 8
5                                    Important data for even MBs of slice 4 and odd MBs of slice 6
6                                    Important data for even MBs of slice 7 and odd MBs of slice 5
7                                    Important data for even MBs of slice 6 and odd MBs of slice 4
8                                    Important data for even MBs of slice 1 and odd MBs of slice 7

The major design criteria of the proposed MB-interleaving slice-based data embedding scheme for I frames shown in Table 1 can be described as follows.


(1) The important data for the MBs in a slice should be separately embedded into two different masking slices in the next frame because a slice and its masking slices may be corrupted simultaneously. (2) The masking slices for a slice should not be two consecutive ones because burst video packet loss may occur. In fact, if the two above-mentioned criteria are satisfied, the interleaving operations described in Table 1 can be modified in some other similar manner.

3.1.3. How to embed the important data?

To perform data embedding in an I frame, the odd–even data embedding scheme (Yin et al., 2001) is employed and applied to the non-zero quantized integer transform coefficients in the corresponding masking slice. If the data bit to be embedded is "0," the non-zero quantized integer transform coefficient will be forced to be an even number, whereas if the data bit to be embedded is "1," the non-zero quantized integer transform coefficient will be forced to be an odd number. That is, if the data bit to be embedded is $b_j$, the non-zero quantized integer transform coefficient $C_i$ of the odd–even data embedding scheme is determined as

$$C_i = \begin{cases} C_i + 1, & \text{if } C_i \bmod 2 \neq b_j \text{ and } C_i > 0, \\ C_i - 1, & \text{if } C_i \bmod 2 \neq b_j \text{ and } C_i < 0, \\ C_i, & \text{otherwise.} \end{cases} \qquad (5)$$

The important data of an MB will be embedded into both the luminance and chrominance blocks of its masking slice. The proposed data embedding scheme for an I frame can be summarized as follows. (1) Extract the important data, i.e., the MB type (smooth or edge) and/or the dominant edge direction, for each MB. (2) Determine the masking slice in the next P frame for each MB by using Table 1. (3) Embed the important data for each MB into its masking slice in the next P frame by using the odd–even data embedding scheme (Eq. (5)).
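The following is a minimal sketch of the odd–even embedding rule of Eq. (5) (Yin et al., 2001). The coefficient ordering and the way bits are paired with coefficients are assumptions made for illustration; the paper only specifies the parity rule itself.

```python
def embed_bit(coeff, bit):
    """Force a non-zero quantized coefficient to the parity of `bit` (Eq. (5))."""
    if coeff % 2 != bit:          # Python's % keeps the result in {0, 1} even for coeff < 0
        return coeff + 1 if coeff > 0 else coeff - 1
    return coeff

def embed_bits(nonzero_coeffs, bits):
    """Embed one bit per non-zero coefficient, in order; leftover coefficients are unchanged."""
    out = list(nonzero_coeffs)
    for k, bit in enumerate(bits):
        if k >= len(out):
            break                  # masking slice ran out of embedding capacity
        out[k] = embed_bit(out[k], bit)
    return out

print(embed_bits([7, -4, 3, -9], [0, 1, 1]))   # -> [8, -5, 3, -9]
```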

3.1.4. How to use the important data for error concealment?

At the decoder, the masking slice for each corrupted MB is determined, and its important data are extracted if the masking slice is correctly received. Then, if the corrupted MB is an 'edge' MB, it is concealed by using bilinear interpolation with its dominant edge direction and the two corresponding boundary pixels in the neighboring MBs. On the other hand, if (1) the corrupted MB is a smooth MB, (2) either of the two boundary pixels is not available, or (3) its masking slice is also corrupted, the corrupted MB is concealed by the spatial interpolation algorithm provided in H.264/AVC (Wang et al., 2002).
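A hedged sketch of this decoder-side logic is given below: the embedded bits are read back as the parity of the non-zero coefficients of the masking slice, and the concealment method is then selected. The argument names and string labels are illustrative only; the actual interpolation routines are outside this sketch.

```python
def extract_bits(nonzero_coeffs, n_bits):
    """Recover n_bits that were embedded with the odd-even rule of Eq. (5)."""
    return [abs(c) % 2 for c in nonzero_coeffs[:n_bits]]

def choose_intra_concealment(masking_slice_ok, mb_type, boundary_pixels_ok):
    """Pick the concealment method for one corrupted I-frame MB (Section 3.1.4)."""
    if masking_slice_ok and mb_type == "edge" and boundary_pixels_ok:
        return "bilinear interpolation along the extracted dominant edge direction"
    return "spatial interpolation of H.264/AVC (Wang et al., 2002)"

print(extract_bits([8, -5, 3, -9], 3))                 # -> [0, 1, 1]
print(choose_intra_concealment(True, "edge", True))
```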


3.2. Proposed error resilient coding scheme for H.264/AVC inter-coded P frames

3.2.1. What important data should be extracted and embedded?

At the encoder, for an MB in an H.264/AVC inter-coded P frame, two types (Type-I and Type-II) of important data will be extracted and embedded. The Type-I data for an MB contain the coding mode, the reference frame(s), and the motion vector(s) for the MB, whereas the Type-II data for the MB indicate the best error concealment scheme among 15 'pre-evaluated' error concealment schemes for the MB.

3.2.1.1. Type-I data. For an MB in a P frame, there are three coding modes, namely, skip, inter-coded, and intra-coded. Within the inter-coded mode, there are seven block sizes (4 × 4, 4 × 8, 8 × 4, 8 × 8, 8 × 16, 16 × 8, and 16 × 16) for motion estimation and compensation. For an inter-coded MB, there are at least one and at most 16 motion vectors (MVs). For the Type-I data of an MB, one bit is used to denote whether the Type-I data will be embedded. If the Type-I data will be embedded, one bit is used to denote its coding mode (skip or inter-coded). Note that the rarely used intra-coded mode is ignored here. For an inter-coded MB, 2 bits are used to denote which inter-coded mode (inter-16 × 16, inter-16 × 8, or inter-8 × 16) is used. Any of the other inter-coded modes, which contain too many MVs, is ignored here. For an MB (inter-16 × 16) or subblock (inter-16 × 8 or inter-8 × 16), 2 bits are used to denote the reference frame (three reference frames are used here, i.e., S = 3, where S is the number of reference frames for a P frame) and one bit is used to denote whether the MV is a zero MV. For an MB or subblock with a non-zero MV, 16 bits are used to denote the MV, with the search range being set to ±16 pixels in quarter-pixel accuracy. Note that for an ignored intra-coded MB or an ignored inter-coded MB, only one bit is embedded, denoting that no Type-I data are embedded.

3.2.1.2. Type-II data. Different error concealment schemes have their own advantages for different video sequences and different error scenarios. The relative performances of different error concealment schemes for a particular corrupted MB can be easily 'pre-evaluated' at the encoder (Zeng, 2003). For the Type-II data of an MB, 15 error concealment schemes for the MB are 'pre-evaluated' at the encoder and the best one is extracted as the important data (4 bits) of the MB. The 15 error concealment schemes are briefly described as follows. (1) Three zero MV techniques, in which the three MBs at the corresponding spatial locations in the three reference frames are copied to conceal the corrupted MB, respectively. (2) Three average MV techniques. If the eight-connected spatially neighboring MBs of the corrupted MB are denoted as Bi, 1 ≤ i ≤ 8, and three reference frames are used, i.e., S = 3, all the MVs of the Bi's will be located on the three reference frames. The average MV on the sth reference frame of the MVs of the eight-connected spatially neighboring MBs of the corrupted MB is denoted as mv_s,av, s = 1, 2, 3. The three motion-compensated MBs with the three MVs, mv_s,av, s = 1, 2, 3, are used to conceal the corrupted MB, respectively. (3) Nine combined techniques for the inter-16 × 8 case: the coding mode of the corrupted MB is assumed to be inter-16 × 8, and the top (bottom) 16 × 8 subblock is concealed by the three average MV techniques with mv_s,av, s = 1, 2, 3, calculated over the top (bottom) half of the eight-connected spatially neighboring MBs (subblocks), respectively; hence, there are nine possible combinations, i.e., nine error concealment schemes. In summary, the mean absolute errors (MAEs) between the error-free MB and the 15 'pre-concealed' MBs obtained by the 15 'pre-evaluated' error concealment schemes are evaluated, and the error concealment scheme with the minimum MAE is finally extracted as the Type-II data for the MB. If more reference frames are used or more error concealment schemes are evaluated, the data size of the Type-II important data for an MB will increase accordingly.
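The pre-evaluation itself reduces to a minimum-MAE selection, sketched below under stated assumptions. The candidate schemes are passed in as callables that each return a 16 × 16 pre-concealed block; constructing the actual 15 schemes (3 zero-MV, 3 average-MV, 9 inter-16 × 8 combinations) is outside this illustration.

```python
import numpy as np

def mae(a, b):
    """Mean absolute error between two equally sized blocks."""
    return float(np.mean(np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))))

def best_concealment_index(error_free_mb, candidate_schemes):
    """Return the index of the minimum-MAE pre-concealed MB (the 4-bit Type-II data)."""
    errors = [mae(error_free_mb, scheme()) for scheme in candidate_schemes]
    return int(np.argmin(errors))

# Example with two dummy candidates: a perfect copy and a darker copy
truth = np.full((16, 16), 100.0)
schemes = [lambda: np.full((16, 16), 100.0), lambda: np.full((16, 16), 90.0)]
print(best_concealment_index(truth, schemes))   # -> 0
```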


3.2.2. Where to embed the important data?

For an MB in a P frame, the important data will be embedded by using the proposed MB-interleaving slice-based data embedding scheme for P frames, in which the important data for all the MBs in a slice will be embedded into the four corresponding 'masking' slices in the next frame. For example, the Type-I data for the even-numbered MBs of slice 0, the Type-I data for the odd-numbered MBs of slice 2, the Type-II data for the even-numbered MBs of slice 4, and the Type-II data for the odd-numbered MBs of slice 6 are interleaved and concatenated into a mixed bitstream, which is embedded into their 'masking' slice, slice 1, in the next frame. The corresponding 'masking' slices in the (k+1)th frame for all the slices in the kth QCIF P frame are illustrated in Table 2. The major design criteria of the proposed MB-interleaving slice-based data embedding scheme for P frames shown in Table 2 can be described as follows. (1) The important data for the MBs in a slice should be separately embedded into different masking slices in the next frame because a slice and its masking slices may be corrupted simultaneously. (2) The masking slices for a slice should not be consecutive ones because burst video packet loss may occur. (3) The two types (Type-I and Type-II) of important data for an MB should be separately embedded into different masking slices in the next frame because, if the masking slice for one type of important data is also corrupted, the other type is usually still available. In fact, if the three above-mentioned criteria are satisfied, the interleaving operations described in Table 2 can be modified in some other similar manner.

Table 2
The corresponding 'masking' slices in the (k+1)th video frame for all the slices in the kth QCIF H.264/AVC inter-coded P frame

Masking slice in the (k+1)th frame    Embedded important data from the kth P frame
0    Type-I data for even MBs of slice 8, Type-I data for odd MBs of slice 3, Type-II data for even MBs of slice 5, and Type-II data for odd MBs of slice 7
1    Type-I data for even MBs of slice 0, Type-I data for odd MBs of slice 2, Type-II data for even MBs of slice 4, and Type-II data for odd MBs of slice 6
2    Type-I data for even MBs of slice 3, Type-I data for odd MBs of slice 1, Type-II data for even MBs of slice 7, and Type-II data for odd MBs of slice 8
3    Type-I data for even MBs of slice 2, Type-I data for odd MBs of slice 0, Type-II data for even MBs of slice 6, and Type-II data for odd MBs of slice 4
4    Type-I data for even MBs of slice 5, Type-I data for odd MBs of slice 8, Type-II data for even MBs of slice 1, and Type-II data for odd MBs of slice 3
5    Type-I data for even MBs of slice 4, Type-I data for odd MBs of slice 6, Type-II data for even MBs of slice 0, and Type-II data for odd MBs of slice 2
6    Type-I data for even MBs of slice 7, Type-I data for odd MBs of slice 5, Type-II data for even MBs of slice 8, and Type-II data for odd MBs of slice 1
7    Type-I data for even MBs of slice 6, Type-I data for odd MBs of slice 4, Type-II data for even MBs of slice 2, and Type-II data for odd MBs of slice 0
8    Type-I data for even MBs of slice 1, Type-I data for odd MBs of slice 7, Type-II data for even MBs of slice 3, and Type-II data for odd MBs of slice 5

3.2.3. How to embed the important data?

Here, the odd–even data embedding scheme defined by Eq. (5) (Yin et al., 2001) is also employed to embed the important data for all slices (or MBs). For an MB in a P frame, the data size of the Type-I data is at most 42 bits, which is sometimes too large to be embedded. Hence, before embedding the Type-I data, the priority of each MB is determined by the mean absolute error (MAE) between the error-free MB and the corresponding 'pre-concealed' MB obtained by the best error concealment scheme among the 15 'pre-evaluated' schemes. The larger the MAE is, the higher the priority of the MB will be. Then, the Type-I data for the R MBs having the lowest priorities among all the MBs within the same 'masking' slice will be ignored, i.e., replaced by R 'denoting' bits denoting that no Type-I data for the R MBs are embedded. Note that the Type-II data for each MB will always be embedded. The proposed data embedding scheme for P frames can be summarized as follows:

(1) For each MB, extract its Type-I data: (i) one bit denoting whether its Type-I data will be embedded; if so, (ii) if the MB is skipped, one bit denoting its coding mode 'skip' is embedded, or (iii) if the MB is inter-coded, one bit denoting its coding mode 'inter-coded,' 2 bits denoting its inter-coded mode (inter-16 × 16, inter-16 × 8, or inter-8 × 16), 2 bits denoting each reference frame, 1 bit denoting whether each MV is a zero MV, and 16 bits denoting each non-zero MV are embedded. (2) For each MB, extract its Type-II data (4 bits). (3) Determine the priority of each MB. (4) Determine the masking slice in the next frame for each MB by using Table 2. (5) Embed the Type-I data and/or the Type-II data for each MB, according to its priority, into its masking slice in the next frame by using the odd–even data embedding scheme (Eq. (5)).
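As a hedged sketch of step (1), the code below packs the Type-I payload for one MB following the bit budget described above (at most 1 + 1 + 2 + 2 × (2 + 1 + 16) = 42 bits). The bit ordering, the offset-binary coding of each MV component into 8 bits (quarter-pel, search range ±16 pixels, i.e., ±64 quarter-pel units), and the dictionary layout of `mb` are illustrative assumptions.

```python
def pack_type1_bits(mb):
    """Return the Type-I data of one MB as a list of 0/1 bits."""
    if mb is None:                      # ignored MB (intra-coded, too many MVs, or low priority)
        return [0]                      # single bit: no Type-I data embedded
    bits = [1]                          # Type-I data present
    if mb["mode"] == "skip":
        bits.append(0)                  # coding mode: skip
        return bits
    bits.append(1)                      # coding mode: inter-coded
    inter_modes = {"16x16": [0, 0], "16x8": [0, 1], "8x16": [1, 0]}
    bits += inter_modes[mb["inter_mode"]]
    for part in mb["partitions"]:       # 1 partition for 16x16, 2 for 16x8 / 8x16
        ref = part["ref_frame"]         # 0..2 (S = 3 reference frames)
        bits += [(ref >> 1) & 1, ref & 1]
        mvx, mvy = part["mv"]           # quarter-pel units, assumed within +/-64
        if mvx == 0 and mvy == 0:
            bits.append(1)              # zero-MV flag set
        else:
            bits.append(0)
            for comp in (mvx, mvy):     # 8 bits per component, offset-binary
                val = comp + 64
                bits += [(val >> k) & 1 for k in range(7, -1, -1)]
    return bits

example = {"mode": "inter", "inter_mode": "16x8",
           "partitions": [{"ref_frame": 1, "mv": (3, -2)},
                          {"ref_frame": 0, "mv": (0, 0)}]}
print(len(pack_type1_bits(example)))    # 1+1+2 + (2+1+16) + (2+1) = 26 bits
```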


3.2.4. How to use the important data for error concealment?

At the decoder, the masking slice for a corrupted MB is determined, and the important data for the corrupted MB are extracted if its masking slice is correctly received. If the coding mode of the corrupted MB is 'skip,' the corresponding MB in the previous frame is used to conceal the corrupted MB. If the coding mode of the corrupted MB is 'inter-coded,' the coding mode, reference frame(s), and MV(s) of the corrupted MB are used together to conceal the corrupted MB. For a corrupted MB, if either (1) no Type-I data are embedded or (2) its Type-I data cannot be correctly extracted, but its Type-II data are available, the best 'pre-evaluated' error concealment scheme (Type-II data) is used to conceal the corrupted MB. If no important data for the corrupted MB are available, the employed error concealment scheme for P frames is used to conceal the corrupted MB. Note that, before concealing a corrupted MB by using either its Type-II data or the employed error concealment scheme for P frames, the eight-connected spatially neighboring corrupted MBs of the corrupted MB can be concealed first if they can be concealed with their correctly extracted Type-I data. Then, the corrupted MB can be concealed by using the best 'pre-evaluated' scheme (Type-II data) or the employed error concealment scheme with more spatially neighboring MB information, resulting in a better concealed MB. In this study, for a corrupted slice, if at least one of its four masking slices is correctly received, at least the even-numbered (or odd-numbered) MBs of the corrupted slice can be concealed first by using the important data (Type-I or Type-II) extracted from the 'good' masking slice. Then the odd-numbered (or even-numbered) MBs can be concealed by the employed error concealment scheme with more neighboring MB information.

Because the corresponding four masking slices of a slice are usually far apart, i.e., the four masking slices are seldom corrupted simultaneously, the concealed results obtained by using the proposed MB-interleaving slice-based data embedding scheme will be better than those of recent error resilient approaches based on data embedding (Song and Liu, 2001; Yilmaz and Alatan, 2003). In Song and Liu (2001), a frame-based embedding scheme is proposed, based on the two assumptions that the next frame of a corrupted video frame will be correctly received and that at most one group of blocks (GOB) will be corrupted in a video frame. If either of the two assumptions is not valid, the important data for the corrupted video frame cannot be correctly extracted; note that the two assumptions are usually not valid in a noisy transmission channel. In Yilmaz and Alatan (2003), the important data for a GOB are completely embedded into one masking GOB in the next video frame. If the GOB and its masking GOB are corrupted simultaneously, the important data for the corrupted GOB cannot be correctly extracted. In the proposed scheme, however, the important data for a slice in a P frame are distributed into the four (remote) masking slices in the next P frame by using the proposed MB-interleaving slice-based data embedding scheme. Hence, the important data for a corrupted slice are usually either 'completely' available or at least 'partially' available. Cooperating with the employed error detection and concealment scheme, better concealed results are obtained in this study.
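The decoder-side decision can be summarized by the small sketch below. It only returns which action would be taken for one corrupted P-frame MB; the actual reconstruction routines (motion compensation, the 15 Type-II schemes, and the error concealment of Section 2.3) are outside this illustration, and the argument layout is an assumption.

```python
def choose_p_concealment(type1, type2_available):
    """type1: dict with the extracted Type-I fields, or None if unavailable/ignored."""
    if type1 is not None:
        if type1["mode"] == "skip":
            return "copy the co-located MB from the previous frame"
        if type1["mode"] == "inter":
            return "motion-compensate with the embedded mode, reference frame(s), and MV(s)"
    if type2_available:
        return "apply the best pre-evaluated concealment scheme (Type-II index)"
    return "apply the employed error concealment scheme for P frames (Section 2.3)"

print(choose_p_concealment(None, True))   # Type-I missing, Type-II available
```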


4. Simulation results

Four QCIF test video sequences, 'Carphone,' 'Coastguard,' 'Foreman,' and 'Salesman,' with different video packet loss rates, denoted by VPLR, are used to evaluate the performance of the proposed scheme. Here it is assumed that a video packet consists of one complete slice, and all the test video sequences are coded at the default frame rate of 30 frames per second (fps). To prevent the bit rate from increasing due to data embedding, in the proposed scheme, the quantization parameter (QP) for P frames is set to 30 (default QP = 28). The bit rates for the four test video sequences obtained by the original H.264/AVC (denoted by H.264) and the proposed scheme (denoted by Proposed) are listed in Table 3. The peak signal-to-noise ratio (PSNR) is employed in this study as an objective performance measure for the three components (Y, U, and V) of video frames. The PSNR of the ith video frame in a video, denoted by $\mathrm{PSNR}_i$, is given by

$$\mathrm{PSNR}_i = (4 \cdot \mathrm{PSNR}_{Y,i} + \mathrm{PSNR}_{U,i} + \mathrm{PSNR}_{V,i}) / 6, \qquad (6)$$

where $\mathrm{PSNR}_{Y,i}$, $\mathrm{PSNR}_{U,i}$, and $\mathrm{PSNR}_{V,i}$ are the corresponding PSNR values of the Y, U, and V components of the ith video frame, respectively. The average PSNR of a video sequence is denoted by $\mathrm{PSNR}_{\mathrm{seq}}$. In the employed error concealment scheme for P frames, M, m, H, and L are empirically set to 16, 8, 30, and 30, respectively. In the proposed scheme, S, T_MV, t, E, and R are empirically set to 3, 5, 300, 32, and 2, respectively.

To evaluate the performance of the proposed scheme, five existing error resilient coding and error concealment approaches (Kang and Leou, 2002a,b; Song and Liu, 2001; Wang et al., 2002) are implemented in this study for comparison. They are: (1) zero-substitution, which simply replaces all pixels in a corrupted MB by zeros (denoted by Zero-S); (2) the error concealment scheme in H.264/AVC (denoted by H.264) (Wang et al., 2002); (3) the motion-compensated BNM algorithm (denoted by BNM) (Kang and Leou, 2002a); (4) the data embedded video coding scheme (denoted by DEVCS) (Song and Liu, 2001); and (5) the error resilient video coding scheme based on data embedding (denoted by ERDE) (Kang and Leou, 2002b). In terms of PSNR (dB), the performance comparisons between the five existing approaches for comparison and the proposed scheme for the first 100 frames of the 'Carphone,' 'Foreman,' and 'Salesman' sequences are shown in Figs. 4–6.
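A small sketch of the weighted frame PSNR of Eq. (6) is given below; the per-component PSNR helper assumes 8-bit samples (peak value 255) and array-like inputs, details the paper does not spell out.

```python
import numpy as np

def psnr(component_ref, component_test):
    """Conventional PSNR of one color component, in dB (8-bit samples assumed)."""
    mse = np.mean((np.asarray(component_ref, float) - np.asarray(component_test, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def frame_psnr(ref_yuv, test_yuv):
    """Eq. (6): PSNR_i = (4*PSNR_Y,i + PSNR_U,i + PSNR_V,i) / 6."""
    psnr_y = psnr(ref_yuv[0], test_yuv[0])
    psnr_u = psnr(ref_yuv[1], test_yuv[1])
    psnr_v = psnr(ref_yuv[2], test_yuv[2])
    return (4.0 * psnr_y + psnr_u + psnr_v) / 6.0
```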

Table 3
The bit rates (kbps) for the four test video sequences obtained by the original H.264/AVC and the proposed scheme

Sequence      H.264     Proposed
Carphone      126.01    124.19
Coastguard    255.63    220.86
Foreman       143.29    137.88
Salesman       82.44     83.60


Fig. 4. The performance comparison between the five existing approaches for comparison and the proposed scheme for the first 100 frames of the 'Carphone' sequence with the video packet loss rate = 10%.

Fig. 5. The performance comparison between the five existing approaches for comparison and the proposed scheme for the first 100 frames of the 'Foreman' sequence with the video packet loss rate = 15%.

Fig. 6. The performance comparison between the five existing approaches for comparison and the proposed scheme for the first 100 frames of the 'Salesman' sequence with the video packet loss rate = 20%.

In terms of PSNRseq (dB), the simulation results for the 'Carphone,' 'Coastguard,' 'Foreman,' and 'Salesman' sequences with different VPLRs of the five existing approaches for comparison and the proposed scheme are listed in Tables 4–7.


Table 4
The simulation results, PSNRseq (dB), for the 'Carphone' sequence with different video packet loss rates of the five existing error resilient coding and error concealment approaches for comparison and the proposed scheme

              Without data embedding          With data embedding
VPLR (%)      Zero-S    H.264    BNM          DEVCS    ERDE     Proposed
0             38.47     38.47    38.47        37.46    37.72    37.51
10             8.95     30.45    31.79        30.53    33.35    35.45
15             8.87     29.53    30.89        29.62    32.63    34.27
20             8.60     27.95    29.72        27.94    31.49    33.20

Table 5
The simulation results, PSNRseq (dB), for the 'Coastguard' sequence with different video packet loss rates of the five existing error resilient coding and error concealment approaches for comparison and the proposed scheme

              Without data embedding          With data embedding
VPLR (%)      Zero-S    H.264    BNM          DEVCS    ERDE     Proposed
0             37.23     37.23    37.23        36.08    36.24    36.10
10             8.25     31.38    32.07        31.41    32.62    33.68
15             7.89     30.72    31.15        30.74    32.05    33.20
20             7.03     30.21    31.01        30.22    31.53    32.67

Table 6
The simulation results, PSNRseq (dB), for the 'Foreman' sequence with different video packet loss rates of the five existing error resilient coding and error concealment approaches for comparison and the proposed scheme

              Without data embedding          With data embedding
VPLR (%)      Zero-S    H.264    BNM          DEVCS    ERDE     Proposed
0             37.15     37.15    37.15        36.18    36.42    36.23
10             7.24     30.68    31.74        30.76    33.67    34.86
15             7.13     29.33    30.71        29.35    32.89    34.03
20             7.06     28.49    29.82        28.52    31.98    33.56

Table 7
The simulation results, PSNRseq (dB), for the 'Salesman' sequence with different video packet loss rates of the five existing error resilient coding and error concealment approaches for comparison and the proposed scheme

              Without data embedding          With data embedding
VPLR (%)      Zero-S    H.264    BNM          DEVCS    ERDE     Proposed
0             36.98     36.98    36.98        36.23    36.61    36.40
10            10.82     31.75    32.18        31.79    32.45    33.57
15            10.60     30.23    31.91        30.26    32.12    33.46
20            10.51     29.76    31.36        29.79    31.93    32.97


As a subjective measure of the quality of the concealed video frames, the error-free and concealed video frames obtained by the five existing approaches for comparison and the proposed scheme for the 'Carphone,' 'Foreman,' and 'Salesman' sequences with different VPLRs are shown in Figs. 7–10. The rate-distortion curves for the error-free 'Carphone' and 'Foreman' sequences obtained by the original H.264/AVC, the original H.264/AVC with the additional important data transmitted as extra video packets (denoted by H.264 extra), and the proposed scheme with data embedding are shown in Fig. 11.

Fig. 7. The error-free and concealed H.264/AVC video frames of an I-frame (the first frame) within the 'Foreman' sequence with the video packet loss rate = 10%: (A) the error-free frame; (B) the error-free frame with data embedding; and (C)–(H) the concealed frames by Zero-S, H.264, BNM, DEVCS, ERDE, and the proposed scheme, respectively.

Fig. 8. The error-free and concealed H.264/AVC video frames of a P-frame (the 14th frame) within the 'Carphone' sequence with the video packet loss rate = 15%: (A) the error-free frame; (B) the error-free frame with data embedding; and (C)–(H) the concealed frames by Zero-S, H.264, BNM, DEVCS, ERDE, and the proposed scheme, respectively.

Fig. 9. The error-free and concealed H.264/AVC video frames of a P-frame (the fifth frame) within the 'Foreman' sequence with the video packet loss rate = 20%: (A) the error-free frame; (B) the error-free frame with data embedding; and (C)–(H) the concealed frames by Zero-S, H.264, BNM, DEVCS, ERDE, and the proposed scheme, respectively.

Fig. 10. The error-free and concealed H.264/AVC video frames of a P-frame (the 12th frame) within the 'Salesman' sequence with the video packet loss rate = 10%: (A) the error-free frame; (B) the error-free frame with data embedding; and (C)–(H) the concealed frames by Zero-S, H.264, BNM, DEVCS, ERDE, and the proposed scheme, respectively.

5. Concluding remarks

Based on the simulation results obtained in this study, several observations can be made. (1) Based on the simulation results shown in Tables 4–7 and Figs. 4–10, the concealed results of the proposed scheme are better than those of the five existing approaches for comparison.


Fig. 11. The rate-distortion curves for the error-free (A) 'Carphone' and (B) 'Foreman' sequences obtained by the original H.264/AVC, the original H.264/AVC with additional data transmitted as extra video packets, and the proposed scheme with data embedding.

(2) Based on the simulation results shown in Tables 4–7 and Figs. 4–10, the relative performance gains (in dB) of the proposed scheme over the five existing approaches for comparison increase as the VPLR increases. (3) Based on the simulation results shown in Tables 4–7 and Figs. 4–10, the performance of the proposed scheme is 'slightly' better than those of the five existing approaches for comparison for 'slow-motion' video sequences, such as the 'Salesman' sequence, whereas the performance of the proposed scheme is 'much' better than those of the five existing approaches for comparison for 'fast-motion' video sequences, such as the 'Carphone' sequence. (4) Based on the simulation results shown in Tables 4–7, for the error-free video sequences, the degradation of the proposed scheme with data embedding (compared with the original H.264/AVC) is usually below 1.2 dB, which is comparable with those in Kang and Leou (2002b) and Song and Liu (2001). (5) Based on the simulation results shown in Fig. 11, the average additional bit rates for embedding the important data of the error-free 'Carphone' and 'Foreman' sequences in the proposed scheme are about 23 and 24 kbps, respectively, when the video qualities obtained by the original H.264/AVC and the proposed scheme are similar (similar PSNR values). However, based on Table 3, when perceptually invisible video quality degradation is allowed in the proposed scheme, the bit rate is only slightly increased.

The approach in which the important data are transmitted as extra packets and the proposed scheme with data embedding can be compared as follows. (1) If the important data are transmitted as extra packets, more overhead may be consumed in maintaining the inter-packet synchronization between the important data packets and the video data packets. Furthermore, the delay jitter of the important data packets may also introduce extra delay in the error concealment performed at the decoder, since the decoder may need to wait for the important data packets when errors occur. However, if the important data are embedded, less extra overhead is needed. (2) If the important data transmitted as extra packets are variable-length coded, extra synchronization codewords will be required to protect these data. On the other hand, if the important data are embedded into the next video frame, they are protected without extra synchronization codewords.


(3) Based on Fig. 11, the bit rate increments induced by transmitting the important data as extra packets are more significant than those induced by the proposed scheme with data embedding when the video qualities are similar (similar PSNR values). Note that, for the error-free transmission case, the rate-distortion curve of the original H.264/AVC is better than that of the proposed scheme with data embedding. However, for the noisy transmission case, the rate-distortion performance of the proposed scheme with data embedding is better than that of the original H.264/AVC.

Compared with the five existing approaches for comparison, the proposed scheme is more robust for noisy channels with burst video packet losses. That is, for low video packet loss rate cases, burst video packet losses seldom occur and the important data, such as the MV(s), for each corrupted MB can be well estimated. On the other hand, for high video packet loss rate cases, burst video packet losses occur frequently and the important data, such as the MV(s), for each corrupted MB cannot be estimated accurately. In the proposed scheme using the MB-interleaving slice-based data embedding scheme, the important data, such as the MV(s), for each corrupted MB are usually available. Moreover, by embedding two types (Type-I and Type-II) of important data for each MB in a P frame, each corrupted MB can usually be concealed by using at least one type of correctly extracted important data. Additionally, by applying the proposed data embedding scheme and the flexible macroblock ordering (FMO) capability in H.264/AVC, on the average, each corrupted MB without correctly extracted important data can be concealed with more neighboring MB information, resulting in better concealed results.

In this study, an error resilient coding scheme for H.264/AVC video transmission based on data embedding is proposed. At the encoder, for an I frame, the important data (edge information) for each MB are extracted and embedded into the next frame by the proposed MB-interleaving slice-based data embedding scheme for I frames and the odd–even data embedding scheme. For an H.264/AVC inter-coded P frame, two types of important data for each MB, i.e., Type-I (coding mode, reference frame(s), and/or motion vector(s)) and Type-II (the best 'pre-evaluated' error concealment scheme), are extracted and embedded into the next frame by the proposed MB-interleaving slice-based data embedding scheme for P frames and the odd–even data embedding scheme. At the decoder, if the important data for a corrupted MB can be correctly extracted, the extracted important data for the corrupted MB will facilitate the employed error concealment scheme to conceal the corrupted MB; otherwise, the employed error concealment scheme is simply used to conceal the corrupted MB. Note that the edge information for I frames was shown to be effective (Yilmaz and Alatan, 2003; Yin et al., 2001), the coding mode and motion vector information for P frames was shown to be effective (Kang and Leou, 2002b; Song and Liu, 2001; Yilmaz and Alatan, 2003), the best 'pre-evaluated' error concealment scheme for P frames was shown to be effective (Zeng, 2003), and the employed odd–even data embedding scheme was shown to be effective (Yin et al., 2001; Kang and Leou, 2003a; Kang and Leou, 2002b; Yilmaz and Alatan, 2003; Zeng, 2003). Additionally, the employed error concealment schemes performed at the decoder for I and P frames were also shown to be effective (Kang and Leou, 2002a, 2004; Wang et al., 2002).

The new contributions of this paper can be described as follows. (1) A set of extraction methods for the important data is well developed, which is suitable for H.264/AVC video.


(2) The proposed MB-interleaving slice-based data embedding scheme and the odd–even data embedding scheme are used together to provide a more reliable mechanism for data embedding. (3) For a P frame, two types (Type-I and Type-II) of important data with different transmission error recovery capabilities for each MB can provide more reliable error resiliency. Additionally, embedding the Type-I and/or Type-II data for each MB according to its priority provides more efficient utilization of the limited embedding capacity of a P frame. (4) Cooperating with the employed error detection and concealment scheme performed at the decoder, better concealed results can be obtained. As compared with some recent error resilient approaches based on data embedding, in this study, the important data selection mechanism for different types of MBs, the detailed data embedding mechanism, and the error detection and concealment scheme performed at the decoder are well developed to design an integrated error resilient coding scheme. Based on the simulation results obtained in this study, the proposed scheme can recover high-quality H.264/AVC video frames from the corresponding corrupted video frames up to a video packet loss rate of 20%.

References

Frossard, P., Verscheure, O., 2001. AMISP: a complete content-based MPEG-2 error-resilient scheme. IEEE Trans. Circuits Syst. Video Technol. 11 (9), 989–998.
Gallant, M., Kossentini, F., 2001. Rate-distortion optimized layered coding with unequal error protection for robust Internet video. IEEE Trans. Circuits Syst. Video Technol. 11 (3), 357–372.
Gonzalez, R.C., Woods, R.E., 2002. Digital Image Processing, second ed. Prentice Hall, New Jersey.
Hartung, F., Kutter, M., 1999. Multimedia watermarking techniques. Proc. IEEE 87 (7), 1079–1107.
Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, 2003. Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC).
Kang, L.W., Leou, J.J., 2002a. A new hybrid error concealment scheme for MPEG-2 video transmission. In: Proceedings of the IEEE International Workshop on Multimedia Signal Processing, St. Thomas, US Virgin Islands, pp. 29–32.
Kang, L.W., Leou, J.J., 2002b. A new error resilient coding scheme for H.263 video transmission. In: Proceedings of the IEEE Pacific-Rim Conference on Multimedia, Hsinchu, Taiwan, pp. 814–822.
Kang, L.W., Leou, J.J., 2003a. A new error resilient coding scheme for JPEG image transmission based on data embedding and vector quantization. In: Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 2, Bangkok, Thailand, pp. 532–535.
Kang, L.W., Leou, J.J., 2003b. Two error resilient coding schemes for wavelet-based image transmission based on data embedding and genetic algorithms. In: Proceedings of the IEEE International Conference on Image Processing, Barcelona, Spain, pp. 461–464.
Kang, L.W., Leou, J.J., 2004. A hybrid error concealment scheme for MPEG-2 video transmission based on best neighborhood matching algorithm. In: Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan.
Li, X., Orchard, M.T., 2002. Novel sequential error-concealment techniques using orientation adaptive interpolation. IEEE Trans. Circuits Syst. Video Technol. 12 (10), 857–864.
Redmill, D.W., Kingsbury, N.G., 1996. The EREC: an error-resilient technique for coding variable-length blocks of data. IEEE Trans. Image Process. 5 (4), 565–574.
Stockhammer, T., Jenkac, H., Weiß, C., 2002. Feedback and error protection strategies for wireless progressive video transmission. IEEE Trans. Circuits Syst. Video Technol. 12 (6), 465–482.


Stockhammer, T., Hannuksela, M.M., Wiegand, T., 2003. H.264/AVC in wireless environments. IEEE Trans. Circuits Syst. Video Technol. 13 (7), 657–673.
Song, J., Liu, K.J.R., 2001. A data embedded video coding scheme for error-prone channels. IEEE Trans. Multimedia 3 (4), 415–423.
Wang, Y., Lin, S., 2002. Error-resilient video coding using multiple description motion compensation. IEEE Trans. Circuits Syst. Video Technol. 12 (6), 438–452.
Wang, Y., Zhu, Q.F., 1998. Error control and concealment for video communication: a review. Proc. IEEE 86 (5), 974–997.
Wang, Y., Ostermann, J., Zhang, Y.Q., 2002a. Video Processing and Communications. Prentice Hall, New Jersey.
Wang, Y., Wenger, S., Wen, J., Katsaggelos, A.K., 2000. Error resilient video coding techniques. IEEE Signal Process. Magazine 17 (4), 61–82.
Wang, Y.K., Hannuksela, M.M., Varsa, V., Hourunranta, A., Gabbouj, M., 2002. The error concealment feature in the H.26L test model. In: Proceedings of the IEEE International Conference on Image Processing, Rochester, NY, USA, pp. 729–732.
Wenger, S., 2003. H.264/AVC over IP. IEEE Trans. Circuits Syst. Video Technol. 13 (7), 645–656.
Wiegand, T., Sullivan, G.J., Bjøntegaard, G., Luthra, A., 2003. Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 13 (7), 560–576.
Yilmaz, A., Alatan, A.A., 2003. Error concealment of video sequences by data hiding. In: Proceedings of the IEEE International Conference on Image Processing, Barcelona, Spain, pp. 679–682.
Yin, P., Liu, B., Yu, H.H., 2001. Error concealment using data hiding. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1453–1456.
Zeng, W., 2003. Spatial-temporal error concealment with side information for standard video codecs. In: Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 2, Baltimore, Maryland, USA, pp. 113–116.
Zhang, J., Arnold, J.F., Frater, M.R., 2000. A cell-loss concealment technique for MPEG-2 coded video. IEEE Trans. Circuits Syst. Video Technol. 10 (4), 659–665.

Biographical sketch of Li-Wei Kang. Li-Wei Kang was born in Taipei, Taiwan, Republic of China, on December 26, 1974. He received the B.S. and M.S. degrees in computer science and information engineering in 1997 and 1999, respectively, both from National Chung Cheng University, Chiayi, Taiwan. Since September 1999, he has been working toward the Ph.D. degree in computer science and information engineering at National Chung Cheng University, Chiayi, Taiwan. His current research interests include image/video processing, image/video communication, and pattern recognition.

Biographical sketch of Jin-Jang Leou. Jin-Jang Leou was born in Chiayi, Taiwan, Republic of China, on October 25, 1956. He received the B.S. degree in communication engineering in 1979, the M.S. degree in communication engineering in 1981, and the Ph.D. degree in electronics in 1989, all from National Chiao Tung University, Hsinchu, Taiwan. From 1981 to 1983, he served in the Chinese Army as a Communication Officer. From 1983 to 1984, he was at National Chiao Tung University as a lecturer. Since August 1989, he has been on the faculty of the Department of Computer Science and Information Engineering at National Chung Cheng University, Chiayi, Taiwan. His current research interests include image/video processing, image/video communication, pattern recognition, and computer vision.
