Motion Estimation for Frame-Rate Reduction in H ... - Semantic Scholar

Motion Estimation for Frame-Rate Reduction in H.264 Transcoding

Il-hong Shin, Yung-Lyul Lee* and HyunWook Park
Dept. of Electrical Engineering, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon, 305-701, Korea [email protected]
*Department of Internet Engineering, Sejong University, Seoul, Korea

Abstract

This paper proposes a transcoding method for frame-rate reduction in the H.264 video coding standard. H.264 adopts various block types and multiple reference frames for motion compensation. When frames are skipped to reduce the frame rate in a transcoder, it is not easy to estimate optimum motion vectors and block types in H.264. A simple and effective block-adaptive motion vector resampling (BAMVR) method is proposed to estimate motion vectors for motion compensation. To improve coding efficiency and visual quality, the rate-distortion optimization (RDO) algorithm is also combined with the BAMVR method in the transcoder. In the experimental results, the rate-distortion performance and computational complexity of the proposed transcoder are analyzed for various video sequences. The proposed method achieves a remarkable improvement in computational complexity compared to the full-search motion estimation (ME) with RDO method.

1. Introduction

Ubiquitous networks require situation-aware control of application services such as video and audio transmission. In a home-network server, transcoding is required to allow spontaneous adaptation of the bitrate, which depends on the screen size or processing power of the mobile device. Transcoding can be an efficient method to achieve spontaneous bitrate adaptation and good rate-distortion performance [1]-[3]. A simple transcoding method is to convert original bitstreams into lower-bitrate bitstreams to meet the required channel bandwidth. Various transcoder structures have been proposed for the H.263 [4], MPEG-2 [5], and MPEG-4 [6] codecs.

The H.264 [7] codec, which is a joint standard of ITU-T video compression and ISO/IEC MPEG-4 Part 10 AVC (Advanced Video Coding), shows improved video coding performance. Although it has the same video coding building blocks as the previous MPEG-4 standard [6], it has some improved features such as the 4×4 integer transform, multiple reference frames, variable block types for motion compensation (MC), quarter-pixel MC, universal variable length coding (UVLC) or context-based adaptive binary arithmetic coding (CABAC), and a non-normative rate-distortion optimization (RDO) tool that is used to decide the optimal block type. With these improved features, H.264 is expected to have high coding efficiency, with more than 50% improvement compared to existing coding standards such as H.263, MPEG-2, and MPEG-4. H.264 can be used in many applications such as video on demand (VOD), teleconferencing, and distance learning. Therefore, transcoding from previous codecs to H.264, or from H.264 to H.264, is required.

This paper proposes an H.264-to-H.264 transcoder that reduces the frame rate. We also propose a block-adaptive motion vector resampling (BAMVR) method to estimate optimum motion vectors for MC. An advanced method, called the BAMVR with RDO method, is also proposed to improve the rate-distortion performance compared to the BAMVR only method. The proposed H.264 transcoder structure is described in Section 2. The BAMVR only method and the BAMVR with RDO method are introduced in Section 3. Experimental results and analysis of the proposed methods are given in Section 4, and our concluding remarks are given in Section 5.

2. The proposed transcoder

The block diagram of the proposed transcoder is shown in Fig. 1; it has a straightforward cascaded architecture of a decoder and an encoder. Motion estimation usually accounts for most of the computation in the encoder. The proposed BAMVR method is used to select optimum motion vectors and reference frames. The loop filter improves visual quality by reducing blocking artifacts. The non-linearity of the loop filter generates drift and mismatch artifacts in transcoding methods [1]. The proposed cascade-type transcoder in H.264 is drift-free and mismatch-free thanks to the feedback loop and the 4×4 integer transform, respectively.

In H.264, multiple reference frames are used in the motion estimation process to obtain better motion compensation performance and to help make the H.264 bitstream error resilient [8]. In this paper, multiple reference frames are not used, for simplicity. The number of reference frames for tracing motion vectors in the transcoder is S = N_FrameSkip + 1, where N_FrameSkip is the number of frames to be skipped in the transcoder.
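The frame-skipping arithmetic can be illustrated with a small sketch. It assumes that frame 0 is retained and that N_FrameSkip frames are then dropped after every retained frame; the function names are hypothetical and are not part of the paper or of the JM software. The general form S = (N_FrameSkip + 1) × N_REF is taken from the label in Fig. 1.

```python
def retained_frames(num_frames, n_frame_skip):
    """Indices of the frames kept by the transcoder.

    Assumption: frame 0 is kept, then n_frame_skip frames are skipped
    after every kept frame.
    """
    return list(range(0, num_frames, n_frame_skip + 1))


def tracing_depth(n_frame_skip, n_ref=1):
    """S = (N_FrameSkip + 1) * N_REF, the number of reference frames traced."""
    return (n_frame_skip + 1) * n_ref


def output_frame_rate(input_fps, n_frame_skip):
    """Frame rate after skipping n_frame_skip frames per retained frame."""
    return input_fps / (n_frame_skip + 1)
```

Under these assumptions, skipping three frames of a 30 fps input keeps every fourth frame, gives S = 4 for a single reference frame, and yields a 7.5 fps output.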

Figure 1. Block diagram of the proposed transcoder structure (figure not reproduced; one of its labels reads "MV data, S = (N_FrameSkip + 1) × N_REF")

3. Block-adaptive motion vector resampling (BAMVR) method

The optimized motion vector can be obtained by re-estimating a new motion vector in the transcoder. However, motion estimation (ME) requires high computational complexity unless it utilizes the motion vectors of the input video stream. Many studies [2]-[3] exploit the incoming motion vectors from the decoder of the transcoder to estimate new motion vectors. Fig. 2 shows an example of frame skipping in the transcoder. N_REF denotes the number of reference frames in the original H.264 encoder and is set to 1 here (the maximum number of reference frames in the current H.264 standard is 5), and N_FrameSkip is the number of frames to be skipped in the transcoder.

Figure 2. An example of video frames that are retained or skipped for motion vector estimation in the transcoder (three frames are skipped; figure not reproduced, its labels include cases 1 to 3 and mv = mv_t + mv_{t-1})

When motion vectors are traced through skipped frames, however, the current block may not be aligned with the blocks in the reference frame. In addition, each block can have a different block type. For example, the reference block of the current 8×8 block in Fig. 3 overlaps 4×4 (upper-left), 4×8 (upper-right), 8×4 (lower-left), and 8×8 (lower-right) blocks in the reference frame. The motion vectors of all four overlapped blocks must then be traced, since block types vary widely within a frame in H.264. Tracing motion vectors is therefore difficult because of the various overlapping regions when frames are skipped in the transcoder.

Figure 3. An example of motion vector resampling of the decomposed 4×4 blocks (figure not reproduced; its labels include the 8×8 block in the current frame, the 4×4 subblocks SB00, SB01, SB10, and SB11, and, in the reference frame skipped in the transcoder, the overlapped blocks OB0 to OB3 with motion vectors MV0 to MV3, overlap sizes h0 to h3 and w0 to w3, and Ref0 to Ref3)

To handle these overlapping regions, an M × N block is divided into 4×4 subblocks to trace the motion vectors, where M × N can be 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, or 4×4 according to the block type. Tracing the motion vector of each 4×4 subblock provides an efficient composition method. The motion vector of the kj-th 4×4 subblock (SB_kj) of Fig. 3 in the M × N block can be traced as follows:

\[
MV_{kj} = \frac{\sum_{i=0}^{3} A(OB_i) \times h_i \times w_i \times MV_i}{\sum_{i=0}^{3} A(OB_i) \times h_i \times w_i} \tag{1}
\]

where the subscripts k and j denote the horizontal and vertical indices of the 4×4 subblock in the M × N block, and A(OB_i) denotes the area of the i-th overlapped block in the skipped reference frame. In eq. (1), MV_i is the motion vector of the i-th overlapped block in Fig. 3, and h_i and w_i are the horizontal and vertical overlapping regions between the 4×4 subblock and the i-th overlapped block, respectively. After obtaining the motion vector MV_kj of each 4×4 subblock, the motion vector for the current M × N block is computed as follows:

\[
MV = \frac{\sum_{k=0}^{\lfloor M/4 \rfloor} \sum_{j=0}^{\lfloor N/4 \rfloor} \alpha_{kj} \times MV_{kj}}{\sum_{k=0}^{\lfloor M/4 \rfloor} \sum_{j=0}^{\lfloor N/4 \rfloor} \alpha_{kj}} \tag{2}
\]

where MV is the motion vector of the current M × N block in the frame-skipping situation, ⌊x⌋ denotes the nearest integer less than or equal to x, and the weighting factor α_kj is set to one in this paper. Using eqs. (1) and (2), new motion vectors are adaptively estimated for the current M × N block from the incoming block types and motion vectors when the first reference frame is skipped in the transcoder. The pseudo code of the BAMVR method in the frame-skipping situation is shown in Fig. 4, where % denotes the modulo operator and MV_t is the final motion vector of the current block from the BAMVR method. The while loop in Fig. 4 checks the validity of the reference frame number: the adaptive estimation is repeated until meet_condition becomes 0.

    meet_condition = 1
    n = reference frame number of current block
    mv = motion vector of current block
    while (meet_condition)
    {
        adaptive estimation of current motion vectors (MV_temp) using eqs. (1)-(2)
        n = n + 1;
        mv = mv + MV_temp;
        if ((n + 1) % (N_FrameSkip + 1) == 0)
            meet_condition = 0
    }
    MV_t = mv

Figure 4. Pseudo code of the proposed BAMVR method in the transcoder

The BAMVR method that reuses the incoming block types as the current block types is called the BAMVR only method; the transcoded block type is defined to be the same as the incoming block type. When the QP value in the transcoder differs from that of the original compressed bitstream, the incoming block type may not be optimal in the transcoder because the rate and distortion characteristics vary with QP. Direct use of the incoming block type also degrades visual quality. To improve rate-distortion performance and visual quality in the transcoder, the BAMVR only method is combined with RDO [7], which selects the block type that minimizes the rate-distortion function from {INTRA4×4, INTRA16×16} for INTRA frames, and from {INTRA4×4, INTRA16×16, SKIP, 16×16, 16×8, 8×16, P8×8} for INTER frames, in which P8×8 means 8×8, 8×4, 4×8, and 4×4 for each 8×8 block. The improved method, called the BAMVR with RDO method, is illustrated in Fig. 5. The arrow in each block indicates the direction and magnitude of its motion vector. First, each 8×8 block in the MB is divided into four 4×4 blocks, and the motion vector of the 8×8 block is copied to the four 4×4 blocks. For each 4×4 block, MV_kj is obtained by using eq. (1). After obtaining the motion vector and the reference frame number of each 4×4 block, the motion vectors of each M × N block type are obtained by using eq. (2). Seven sets of motion vectors are therefore obtained for the seven block types. RDO is applied to these seven sets of MVs for the seven block types and to the two intra block types.
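The resampling of eqs. (1) and (2) and the accumulation loop of Fig. 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tuple layout of the overlap data, the zero-vector fallback, and the `estimate_mv_temp` callback (which stands in for "adaptive estimation using eqs. (1)-(2)" in a skipped frame) are all assumptions introduced here. The sums in eq. (2) are taken over the 4×4 subblocks passed in by the caller.

```python
def resample_subblock_mv(overlaps):
    """Eq. (1): weighted mean of the motion vectors of the overlapped blocks.

    overlaps: list of (area, h, w, (mvx, mvy)), one entry per overlapped
    block OB_i, where area is A(OB_i) and h, w are the overlap sizes.
    """
    den = sum(area * h * w for area, h, w, _ in overlaps)
    if den == 0:
        # Assumption: with no overlap information, fall back to a zero vector.
        return (0.0, 0.0)
    num_x = sum(area * h * w * mv[0] for area, h, w, mv in overlaps)
    num_y = sum(area * h * w * mv[1] for area, h, w, mv in overlaps)
    return (num_x / den, num_y / den)


def compose_block_mv(subblock_mvs, alphas=None):
    """Eq. (2): weighted mean of the 4x4 subblock vectors MV_kj.

    subblock_mvs must be non-empty; alpha_kj defaults to 1 as in the paper.
    """
    if alphas is None:
        alphas = [1.0] * len(subblock_mvs)
    den = sum(alphas)
    x = sum(a * mv[0] for a, mv in zip(alphas, subblock_mvs)) / den
    y = sum(a * mv[1] for a, mv in zip(alphas, subblock_mvs)) / den
    return (x, y)


def bamvr_trace(mv, n, n_frame_skip, estimate_mv_temp):
    """Fig. 4 loop: add the traced vectors until the stop condition holds.

    estimate_mv_temp(n, mv) is a hypothetical callback returning MV_temp
    for reference frame number n.
    """
    while True:
        mv_temp = estimate_mv_temp(n, mv)
        n += 1
        mv = (mv[0] + mv_temp[0], mv[1] + mv_temp[1])
        # Stop condition copied from Fig. 4.
        if (n + 1) % (n_frame_skip + 1) == 0:
            return mv
```

For example, two overlapped blocks with equal weights and vectors (4, 0) and (0, 4) resample to (2.0, 2.0), and with N_FrameSkip = 3 and n starting at 0 the loop runs three times, so the final vector is the incoming vector plus three traced vectors.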

RDO selects the block type that is best in terms of rate-distortion performance. In addition to the BAMVR method, motion vector refinement can be executed to improve the rate-distortion performance. This paper adopts a motion vector refinement process with a search range of ±2 in the experiments.

Figure 5. An example of motion vector composition of 8×8 block using BAMVR with RDO method (figure not reproduced; its labels are: MV of each 8×8 block in a 16×16 block; decompose into 16 4×4 blocks; estimate MV_kj for every 4×4 block (eq. (1)); MV of each 8×4 block; MV of each 4×8 block; RDO in P8×8 for each 8×8 block; determined P8×8 type; MV of each 16×8 block; MV of each 8×16 block; MV of 16×16 block by using eq. (2); two intra block types; RDO in MB; final block type)

4. EXPERIMENTAL RESULTS

The proposed transcoder was implemented using the H.264 JM4.2 video codec [7] with CABAC [7], variable block-based ME/MC, a motion vector search range of ±16, quarter-pixel MC, the 4×4 integer DCT, one reference frame, and rate-distortion optimization. In the experiments, the original compressed bitstreams have a QP value of 10 at 30 frames per second (fps) to provide a large variation of PSNR (peak signal-to-noise ratio) and bitrate in the transcoder. The first frame was coded as an INTRA frame, and all the others as INTER frames. Bi-directionally predicted frames (B-frames) of H.264 were not considered in this paper. PSNR, bitrate, and computational complexity were analyzed for two video sequences, "Foreman" and "Silent", in the quarter common intermediate format (QCIF). The Foreman and Silent sequences have large, complex local motion and small motion, respectively. The full-search ME with RDO method has a motion search range of ±16, and the proposed transcoder employs a search window of ±2 pixels in integer units for motion vector refinement. In this paper, fast motion estimation methods [9] are not applied to the transcoder; if they were applied, the computational complexity of the proposed transcoder would be reduced. The rate-distortion plots in Fig. 6 are labeled "Foreman, 7.5 Hz" and "Silent, 10 Hz", with PSNR [dB] on the vertical axis and rate [kbits] on the horizontal axis.
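The ±2 integer-pixel motion vector refinement used in the experiments can be sketched as a small exhaustive search around the BAMVR vector. The paper does not state the matching criterion, so the sum of absolute differences (SAD) used here is an assumption, as are the function names, the frame layout (lists of rows of luma samples), and the convention that a vector (dx, dy) points from the block at (bx, by) to the reference block at (bx + dx, by + dy).

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced reference block."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(size)
        for x in range(size)
    )


def refine_mv(cur, ref, bx, by, size, mv, search_range=2):
    """Search the integer offsets within +/-search_range around mv.

    Returns (best_mv, best_cost); best_cost is None if no candidate lies
    inside the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = mv, None
    for ddy in range(-search_range, search_range + 1):
        for ddx in range(-search_range, search_range + 1):
            dx, dy = mv[0] + ddx, mv[1] + ddy
            inside = (0 <= bx + dx and bx + dx + size <= width
                      and 0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue  # skip candidates that leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

A ±2 window evaluates at most 25 candidates per block, compared with 33 × 33 = 1089 integer positions for a ±16 full search, which is consistent with the complexity reduction the paper aims for.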

Figure 6. Rate-distortion (PSNR vs. bitrate) plots of three methods for (a) Foreman and (b) Silent (plots not reproduced)

Fig. 6 shows the rate-PSNR curves of the proposed BAMVR only method, in which the incoming block types are reused, the BAMVR with RDO method, and the full-search ME with RDO method. The PSNR of the proposed BAMVR with RDO method is approximately 0.2 dB lower than that of the full-search ME with RDO method for Silent, as shown in Fig. 6(b). A video sequence with fast and complex motion such as Foreman, however, has a more severe PSNR degradation of 0.5 dB, as shown in Fig. 6(a). To reduce the PSNR degradation, a larger search window can be applied for motion vector refinement in fast-motion sequences. It is noted, however, that the BAMVR with RDO method shows a PSNR improvement of 1 to 3 dB over the BAMVR only method.

Figure 7. Average computation amount (bar chart not reproduced; its labels include "The computational complexity", a relative complexity ratio axis from 0 to 1, and bars for BAMVR only, BAMVR with RDO, and full-search ME with RDO)

Fig. 7 shows the average computation amounts over the two video sequences for the BAMVR only method, the BAMVR with RDO method, and the full-search ME with RDO method. For the comparison of computation amounts, we used a Pentium IV 1.7 GHz with 512 MB of memory. The relative computational complexity of the proposed BAMVR with RDO method is about 38% of that of the full-search ME with RDO method. Although the BAMVR only method requires the lowest computation amount, about 11% of that of the full-search ME with RDO method, it shows poor rate-distortion performance and visual quality.

5. CONCLUSION

We proposed an efficient transcoder for frame-rate reduction in H.264. The BAMVR method was proposed to reduce the computational complexity of estimating optimum motion vectors for various block types. In addition, the BAMVR with RDO method was suggested to improve rate-distortion performance and visual quality. The experimental results show that the BAMVR with RDO method comes within about 0.2 to 0.5 dB PSNR of the full-search ME with RDO method at about 38% of its computational complexity. We expect the proposed method to be applied to real-time transcoding of H.264 bitstreams in the near future.

Acknowledgement

This work has been supported by CUCN (National Center of Excellence in Ubiquitous Computing and Networking).

6. REFERENCES

[1] G. Keesman et al., "Transcoding of MPEG bitstreams", Signal Processing: Image Commun., vol. 8, pp. 481-500, 1996.
[2] J. Youn, M.-T. Sun, and C.-W. Lin, "Motion vector refinement for high performance transcoders", IEEE Trans. Multimedia, vol. 1, pp. 30-40, Mar. 1999.
[3] B. Shen, I. K. Sethi, and B. Vasudev, "Adaptive motion vector resampling for compressed video downscaling", IEEE Trans. Circuits Syst. Video Technol., vol. 9, pp. 929-936, Sept. 1999.
[4] Video Coding for Low Bitrate Communication, ITU-T Draft Recommendation H.263, May 1996.
[5] ISO/IEC 13818-2 (MPEG-2 Video), Information Technology – Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s: Video, 1993.
[6] ISO/IEC JTC1/SC29/WG11, MPEG-4 Video Verification Model 8.0, MPEG97/N1796, July 1997.
[7] T. Wiegand, Joint Final Committee Draft (JFCD) of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC), JVT-D157, August 2002.
[8] T. Wiegand, X. Zhang, and B. Girod, "Long-term memory motion-compensated prediction", IEEE Trans. Circuits Syst. Video Technol., vol. 9, pp. 70-84, Feb. 1999.
[9] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block matching motion estimation", IEEE Trans. Image Processing, vol. 10, pp. 287-290, Feb. 2000.
