EFFICIENT DISTRIBUTED VIDEO CODING USING SYMMETRIC MOTION ESTIMATION AND CHANNEL DIVISION

Sang-Uk Park†, Jin-Woo Choi†, Chang-Su Kim‡, Sang-Uk Lee†, Jung-Won Kang§, and Kyung-Jun Lee§

† School of Electrical Engineering and INMC, Seoul National University, Korea
‡ School of Electrical Engineering, Korea University, Seoul, Korea
§ Electronics & Telecommunications Research Institute (ETRI), Korea

† supark, jwchoi, sanguk@ipl.snu.ac.kr, ‡ [email protected], § lkj0610, jungwon@etri.re.kr

ABSTRACT
An efficient distributed video coding algorithm using symmetric motion estimation and channel division is proposed in this work. We employ the symmetric motion estimation to generate high quality side information for Wyner-Ziv frames. Also, in the channel division, we classify blocks in the side information into reliable ones and unreliable ones. Then, we transmit parity bits for unreliable blocks only, achieving a coding gain. Simulation results demonstrate that the proposed algorithm provides up to 4 dB better PSNR performance than the conventional distributed video coding algorithms.

Index Terms: Distributed video coding, Wyner-Ziv coding, side information, and frame interpolation.

1. INTRODUCTION

Recently, distributed video coding (DVC) has drawn considerable interest as a means to achieve low complexity encoding in various applications, such as mobile video communications, wireless video surveillance, and sensor networks. Theoretical R-D performance bounds of distributed coding have been studied based on information theory [1]. Girod et al. [2] proposed a practical DVC system, called the pixel domain Wyner-Ziv (PDWZ) coder. In their system, the decoder generates side information (SI) for Wyner-Ziv frames, and errors in the SI are corrected by parity bits, which are requested through a feedback channel. Ramchandran et al. [3] proposed a DVC system without a feedback channel, which uses coset-based syndromes. These systems reduce the encoding complexity by transferring the computationally expensive motion search module to the decoder. However, the performance of DVC algorithms is inferior to that of traditional video coders, such as H.264, which use a hybrid of motion compensated prediction and transform coding.

To improve the coding performance of DVC, several attempts have been made to generate higher quality SI. As the quality of SI gets higher, the encoder needs to transmit fewer parity bits to the decoder, yielding better R-D performance. Lu et al. [4] proposed a multi-frame SI generation scheme that employs adaptive temporal filtering to estimate pixel values and motion vector filtering for the refinement. Ascenso et al. [5] proposed a spatial motion smoothing algorithm, which can alleviate the distortions in SI using the weighted vector median filter. Argyropoulos et al. [6] showed that an adaptive block size can reduce errors around sharp edges in SI frames. Adikari et al. [7] proposed a sequential motion refinement scheme for SI frames, which uses the information in already decoded bit planes to reconstruct a current bit plane.

In this work, we propose a DVC algorithm using an efficient SI generation scheme. To obtain higher quality SI, the proposed algorithm employs the symmetric motion estimation, which minimizes the forward and backward matching costs simultaneously. Moreover, after the SI generation, the proposed algorithm classifies blocks in the side information into reliable blocks and unreliable blocks, which can be regarded as the channel division. Then, parity bits are transmitted for unreliable blocks only, reducing the overall bitrate. Simulation results show that the proposed algorithm provides significantly better R-D performance than the conventional algorithms [2, 5].
2. PROPOSED ALGORITHM

Fig. 1 shows the block diagram of the proposed algorithm, which is based on the PDWZ architecture [2]. The side information $\hat{W}$ is generated for a Wyner-Ziv frame $W$ based on the motion compensated interpolation from adjacent key frames. The differences between $W$ and $\hat{W}$ are corrected by LDPC parity bits. To improve the quality of the side information $\hat{W}$, we propose the symmetric motion estimation. Moreover, after constructing $\hat{W}$, the decoder estimates the reliability of each block in $\hat{W}$ and classifies it as either reliable or unreliable. A reliable block corresponds to a region where the symmetric motion estimation is successful. Thus, it can be used in the reconstructed frame $W'$ without the LDPC decoding.
This work was supported by the IT R&D program of MIC/IITA. [2008S-006-01, Development of Open-IPTV (IPTV2.0) Technologies for Wired and Wireless Networks]
978-1-4244-4561-5/09/$25.00 ©2009 IEEE
PACRIM’09
[Figure 1: block diagram. Encoder side: Wyner-Ziv frames $W$, unreliable blocks $W_{bad}$, $2^{M_k}$-level quantizer, bit-plane extraction (bit-planes 1 to $M_k$), LDPC encoder, buffer; key frames $K$, conventional intra-frame encoder. Decoder side: conventional intra-frame decoder ($K'$), side information generation ($\hat{W}$), channel division ($\hat{W}_{bad}$), LDPC decoder with bit requests, reconstruction ($W'$), buffer.]
Fig. 1. The block diagram of the proposed DVC algorithm.

Therefore, the decoder requests parity bits only for unreliable blocks, reducing the overall bitrate. Notice that the differences between $W$ and $\hat{W}$ can be regarded as channel distortions. Thus, the partitioning of the side information $\hat{W}$ into reliable blocks and unreliable blocks is conceptually equivalent to the channel division. A Wyner-Ziv frame $W$ is transmitted over two distinct channels to form $\hat{W}$ at the decoder. Reliable blocks are transmitted over a good channel, while unreliable blocks are transmitted over a bad channel. Parity bits are used for the bad channel only.
2.1. Side Information Generation

Given two adjacent key frames, the decoder generates the side information for a Wyner-Ziv frame. Let $X_{n-1}$ and $X_{n+1}$ denote the previous and the next key frames, and $X_n$ denote the current Wyner-Ziv frame. The objective of the side information generation is to interpolate $X_n$ from $X_{n-1}$ and $X_{n+1}$.

Fig. 2 illustrates two widely used motion-compensated frame interpolation schemes: the direct prediction [8] and the bilateral motion estimation [9]. In the direct prediction, to motion compensate a block in $X_n$, the motion vector of the co-located block in $X_{n-1}$ to $X_{n+1}$ is estimated and scaled down by a factor of 2. The direct prediction becomes ineffective when the motion vector field varies rapidly in space or time. The bilateral motion estimation was proposed to overcome this weakness. In the bilateral motion estimation, a block in $X_{n-1}$ is matched to another block in $X_{n+1}$ so that their average position is identical to the position of the current block in $X_n$. The performance of the bilateral motion estimation, however, depends strongly on the search range of motion vectors. It may provide unreliable performance if a sequence contains irregular motions.

[Figure 2: (a) co-located block and forward MV; (b) co-located block and bilateral MV; each panel shows frames $X_{n-1}$, $X_n$, and $X_{n+1}$.]
Fig. 2. Frame interpolation schemes: (a) the direct prediction [8] and (b) the bilateral motion estimation [9].

In this work, we propose the symmetric motion estimation to find motion vectors of the Wyner-Ziv frame $X_n$ more reliably. Note that the direct prediction in Fig. 2(a) can also be performed in a backward manner. Specifically, instead of finding the forward motion vector of the co-located block in $X_{n-1}$, we can estimate the backward motion vector of the co-located block in $X_{n+1}$. Since both the forward and the backward motion vectors are used to approximate the motion vector of the current block in $X_n$, they should point in opposite directions. Based on this observation, we find the motion vector $\mathbf{v}^*$ that minimizes the sum of the forward sum of absolute differences (SAD) and the backward SAD, given by
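$$\mathbf{v}^* = \arg\min_{\mathbf{v}} \left\{ \sum_{\mathbf{p} \in B} \left| X_{n-1}(\mathbf{p}) - X_{n+1}(\mathbf{p} + \mathbf{v}) \right| + \sum_{\mathbf{p} \in B} \left| X_{n+1}(\mathbf{p}) - X_{n-1}(\mathbf{p} - \mathbf{v}) \right| \right\} \tag{1}$$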
where $\mathbf{p}$ denotes a pixel coordinate in block $B$. The estimated motion vector is halved before the motion-compensated frame interpolation.

To improve the frame interpolation performance further, we adopt the motion refinement scheme in [5]. Specifically, within a search range around the estimated motion vector, the bilateral criterion is employed to obtain the refined motion vector. Also, outlier motion vectors are removed using the weighted vector median filter. Finally, the SI, i.e., the motion-compensated interpolated frame $\hat{X}_n$, is constructed by

$$\hat{X}_n(\mathbf{p}) = \frac{1}{2} \left[ X_{n-1}\!\left(\mathbf{p} - \frac{\mathbf{v}}{2}\right) + X_{n+1}\!\left(\mathbf{p} + \frac{\mathbf{v}}{2}\right) \right] \quad \text{for } \mathbf{p} \in B. \tag{2}$$

[Figure 3: backward and forward predictive blocks in $X_{n-1}$ and $X_{n+1}$, displaced by $-\mathbf{v}/2$ and $\mathbf{v}/2$ from the current block in $X_n$, and the side matching cost along the block boundary in $X_n$.]
Fig. 3. The bilateral cost and the side matching cost are used to estimate the reliability of a block in the side information.
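The symmetric search in (1) and the interpolation in (2) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the block size of 8, the search range of ±4, the exhaustive full search, and the integer halving of the motion vector are all assumptions made for illustration, and the refinement and weighted vector median filtering steps from [5] are omitted.

```python
import random

def sad(frame_a, ya, xa, frame_b, yb, xb, size):
    """SAD between the size x size blocks at (ya, xa) in frame_a and (yb, xb) in frame_b."""
    return sum(
        abs(frame_a[ya + i][xa + j] - frame_b[yb + i][xb + j])
        for i in range(size) for j in range(size)
    )

def inside(frame, y, x, size):
    """True if the block with top-left corner (y, x) lies fully inside the frame."""
    return 0 <= y and 0 <= x and y + size <= len(frame) and x + size <= len(frame[0])

def symmetric_me(prev, nxt, y, x, size=8, search=4):
    """Full search for the vector v minimising forward SAD + backward SAD, as in (1).

    Assumes the co-located block at (y, x) lies inside both frames.
    """
    best_v, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose displaced blocks leave the frame.
            if not (inside(nxt, y + dy, x + dx, size) and inside(prev, y - dy, x - dx, size)):
                continue
            forward = sad(prev, y, x, nxt, y + dy, x + dx, size)   # X_{n-1}(p) vs X_{n+1}(p + v)
            backward = sad(nxt, y, x, prev, y - dy, x - dx, size)  # X_{n+1}(p) vs X_{n-1}(p - v)
            cost = forward + backward
            if best_cost is None or cost < best_cost:
                best_v, best_cost = (dy, dx), cost
    return best_v, best_cost

def interpolate_block(prev, nxt, y, x, v, size=8):
    """Average the two predictive blocks displaced by -v/2 and +v/2, as in (2).

    The halved vector is truncated to integer precision (an assumption).
    """
    hy, hx = int(v[0] / 2), int(v[1] / 2)
    return [
        [(prev[y - hy + i][x - hx + j] + nxt[y + hy + i][x + hx + j] + 1) // 2
         for j in range(size)]
        for i in range(size)
    ]

# Synthetic example: the scene translates by (1, 1) pixels per frame.
random.seed(0)
H = W = 32
prev = [[random.randrange(256) for _ in range(W)] for _ in range(H)]

def shift(frame, d):
    """Translate the frame content by (d, d) pixels with wrap-around."""
    return [[frame[(i - d) % H][(j - d) % W] for j in range(W)] for i in range(H)]

mid = shift(prev, 1)   # the (unknown) Wyner-Ziv frame
nxt = shift(prev, 2)   # the next key frame

v, cost = symmetric_me(prev, nxt, 12, 12)
si = interpolate_block(prev, nxt, 12, 12, v)
```

On this synthetic translation the search recovers the key-frame-to-key-frame displacement, and the interpolated block coincides with the corresponding block of the middle frame. Real sequences would additionally need the refinement and outlier removal described above.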
2.2. Channel Division

In the conventional DVC algorithm, parity bits are transmitted to the decoder to correct errors in a whole Wyner-Ziv frame. However, in regions with no motion or linear motion, the generated SI can provide a reconstruction of sufficiently high quality. In such a case, the transmission of parity bits is unnecessary and only wastes the limited bit budget. Therefore, we propose the channel division, which classifies blocks in the SI into two categories: reliable blocks and unreliable blocks. Then, parity bits are transmitted for unreliable blocks only.

Fig. 3 illustrates how to estimate the reliability of a block in the SI. First, we use the bilateral criterion. In (2), the current block is reconstructed by the average of the backward predictive block and the forward predictive block. This is based on the assumption that the block experiences a constant motion between $X_{n-1}$ and $X_{n+1}$. Thus, the bilateral cost, given by

$$C_{\text{bilateral}} = \sum_{\mathbf{p} \in B} \left| X_{n-1}\!\left(\mathbf{p} - \frac{\mathbf{v}}{2}\right) - X_{n+1}\!\left(\mathbf{p} + \frac{\mathbf{v}}{2}\right) \right| \tag{3}$$

should be small for a reliable block $B$. Also, we use the side matching cost

$$C_{\text{side}} = \sum_{\mathbf{p} \in \partial B} \left| \hat{X}_n(\mathbf{p}) - \hat{X}_n(\mathbf{n}_{\mathbf{p}}) \right| \tag{4}$$

where $\partial B$ is the set of boundary pixels in the block $B$ and $\mathbf{n}_{\mathbf{p}}$ denotes the neighboring pixel of $\mathbf{p}$ in the adjacent block. Then, the overall cost is defined as a weighted sum of the bilateral cost and the side matching cost,

$$C = C_{\text{bilateral}} + \lambda \, C_{\text{side}}. \tag{5}$$

The weighting coefficient $\lambda$ is fixed to 0.64 in this work.

The decoder sorts the blocks in a Wyner-Ziv frame in the decreasing order of their costs. Then, it declares the top 50% of blocks as unreliable ones, and requests parity bits to correct errors in the unreliable blocks. Reliable blocks are used directly in the reconstructed frame without error correction.

In this work, half of the blocks are declared unreliable, but this is a conservative threshold since most errors are concentrated in only a few blocks where the motion-compensated interpolation is not successful. By transmitting parity bits for only half of the blocks, the proposed algorithm saves bitrate. Moreover, the decoding time is also reduced by about 35%, since the computationally demanding error-correction decoding is performed for unreliable blocks only.

3. EXPERIMENTAL RESULTS

We evaluate the performance of the proposed algorithm using two QCIF test sequences, "Foreman" and "Salesman." For the side information generation, a Wyner-Ziv frame is divided into fixed-size blocks, and the motion vector of each block is estimated within a fixed search range with integer-pixel accuracy.

First, Fig. 4 compares the motion-compensated interpolation results for the 10th frame of the "Foreman" sequence, obtained by Ascenso et al.'s algorithm [5] and by the proposed algorithm. We see that the proposed algorithm provides higher quality SI by employing the symmetric motion estimation. On average, the proposed SI generation algorithm yields about 0.3 dB better PSNR performance on the "Foreman" sequence than Ascenso et al.'s SI generation algorithm.

Fig. 4. The motion-compensated interpolation results on the 10th frame in the "Foreman" sequence: (a) Ascenso et al.'s algorithm and (b) the proposed algorithm.
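The block classification of Section 2.2 (side matching cost, weighted overall cost, and the top-50% rule) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the weight 0.64 is applied to the side matching term (the text only says the overall cost is a weighted sum), boundary pixels on the frame border are skipped because they have no neighbor, and the per-block cost values in the example are made up.

```python
import math

def side_matching_cost(si, y, x, size=8):
    """Sum of |si(p) - si(n_p)| over the boundary pixels p of the block at (y, x).

    n_p is the pixel just outside the block next to p; sides that coincide
    with the frame border are skipped (an assumption).
    """
    h, w = len(si), len(si[0])
    cost = 0
    for k in range(size):
        if y > 0:                       # top side
            cost += abs(si[y][x + k] - si[y - 1][x + k])
        if y + size < h:                # bottom side
            cost += abs(si[y + size - 1][x + k] - si[y + size][x + k])
        if x > 0:                       # left side
            cost += abs(si[y + k][x] - si[y + k][x - 1])
        if x + size < w:                # right side
            cost += abs(si[y + k][x + size - 1] - si[y + k][x + size])
    return cost

def overall_cost(c_bilateral, c_side, weight=0.64):
    """Weighted sum of the two costs; weighting the side term is an assumption."""
    return c_bilateral + weight * c_side

def channel_division(costs, unreliable_fraction=0.5):
    """Return the indices of the blocks declared unreliable (highest costs first)."""
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    n_bad = math.ceil(len(costs) * unreliable_fraction)
    return set(order[:n_bad])

# Side matching on a toy 4x4 frame: a bright 2x2 block on a dark background.
toy = [
    [0, 0, 0, 0],
    [0, 10, 10, 0],
    [0, 10, 10, 0],
    [0, 0, 0, 0],
]
toy_side = side_matching_cost(toy, 1, 1, size=2)

# Illustrative (made-up) per-block costs for six blocks.
bilateral = [120, 15, 300, 40, 8, 95]
side = [50, 10, 200, 30, 5, 100]
costs = [overall_cost(b, s) for b, s in zip(bilateral, side)]
unreliable = channel_division(costs)   # parity bits are requested for these blocks only
```

Sorting once and cutting at a fixed fraction keeps the rate saving predictable; an adaptive threshold on the cost itself would be an alternative, but the text describes the fixed 50% split.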
Next, Fig. 5 evaluates the R-D performance of the proposed DVC algorithm in comparison with conventional DVC algorithms and the H.264 standard. In this test, all key frames are assumed to be transmitted losslessly to the decoder. By employing a more efficient SI generation scheme, the proposed algorithm provides better R-D performance than Ascenso et al.'s algorithm. Moreover, when the channel division scheme is incorporated, the performance of the proposed algorithm improves significantly further. For instance, on the "Foreman" sequence at 500 Kbps, the proposed algorithm with the channel division provides about 1.6 dB better performance than Ascenso et al.'s algorithm. Furthermore, note that the proposed algorithm provides up to 4 dB better performance than the PDWZ algorithm [2]. However, the performance of the proposed algorithm is still worse than that of the H.264 inter mode. The performance gap is narrower on the "Salesman" sequence, which has slower motions.

Fig. 5. Comparison of the R-D performances on (a) the "Foreman" sequence and (b) the "Salesman" sequence.

4. CONCLUSIONS

In this work, we proposed an efficient DVC algorithm using the symmetric motion estimation and the channel division. The symmetric motion estimation is employed to construct higher quality SI at the decoder. Moreover, the channel division classifies blocks in the SI into reliable ones and unreliable ones, and parity bits are transmitted for unreliable blocks only. Experimental results demonstrated that the proposed algorithm provides up to 4 dB better PSNR performance than the conventional PDWZ algorithm. The notion of channel division is new and promising. Future research issues include the development of more accurate reliability estimation for blocks in SI frames, the classification of blocks into more than two categories, and the adaptive allocation of parity bits to those classes.
5. REFERENCES

[1] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inform. Theory, vol. IT-19, no. 4, pp. 471–480, July 1973.
[2] B. Girod, A. Aaron, S. Rane, and D. Rebollo-Monedero, "Distributed video coding," Proc. IEEE, vol. 93, no. 1, pp. 71–83, Jan. 2005.
[3] R. Puri, A. Majumdar, and K. Ramchandran, "PRISM: a video coding paradigm with motion estimation at the decoder," IEEE Trans. Image Proc., vol. 16, no. 10, pp. 2436–2448, Oct. 2007.
[4] L. Lu, D. He, and A. Jagmohan, "Side information generation for distributed video coding," in Proc. IEEE Intl. Conf. Image Processing, Oct. 2007, pp. II.13–II.16.
[5] J. Ascenso, C. Brites, and F. Pereira, "Improving frame interpolation with spatial motion smoothing for pixel domain distributed video coding," in Proc. EURASIP Conf. Speech and Image Processing, July 2005, pp. 311–316.
[6] S. Argyropoulos, N. Thomas, N. V. Boulgouris, and M. G. Strintzis, "Adaptive frame interpolation for Wyner-Ziv video coding," in Proc. IEEE Intl. Workshop Multimedia Signal Processing, Oct. 2007, pp. 159–162.
[7] A. B. B. Adikari, W. A. C. Fernando, and H. K. Arachchi, "A sequential motion compensation refinement technique for distributed video coding of Wyner-Ziv frames," in Proc. IEEE Intl. Conf. Image Processing, Oct. 2006, pp. 597–600.
[8] A. M. Tourapis, F. Wu, and S. Li, "Direct mode coding for bipredictive slices in the H.264 standard," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 1, pp. 119–126, Jan. 2005.
[9] B.-D. Choi, J.-W. Han, C.-S. Kim, and S.-J. Ko, "Motion-compensated frame interpolation using bilateral motion estimation and adaptive overlapped block motion compensation," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 4, pp. 407–416, Apr. 2007.