FAST AND EFFICIENT FRACTIONAL PIXEL MOTION ESTIMATION FOR H.264/AVC VIDEO CODING

Humaira Nisar and Tae-Sun Choi, Senior Member, IEEE
Gwangju Institute of Science and Technology, Gwangju, Korea
[email protected],[email protected]

ABSTRACT
This paper presents a fast algorithm for H.264 fractional motion estimation (ME). In H.264, ME is the most time-consuming component. The ME process consists of two stages: integer pixel search and fractional pixel search. To reduce the complexity of fractional pixel ME we propose a quadrant-based directional fractional pixel ME algorithm that relies on the unimodal property of the fractional pixel error surface. This strategy reduces the number of fractional search points that have to be evaluated. Experimental results show that, compared with the fast sub-pel ME adopted for H.264, the proposed method speeds up fractional ME with negligible degradation in video quality.

Index Terms— Motion Estimation, Block Matching, Unimodal Error Surface Assumption, Video Coding, Fractional Pixel.

1. INTRODUCTION
In H.264/AVC the motion estimation (ME) process is divided into two steps: integer pixel ME and fractional pixel ME at quarter pel accuracy. The search range of fractional pel ME in the Joint Video Team (JVT) reference software is fixed to ±3 for the quarter pel accuracy case, so in many cases full search (FS) is chosen at this stage for simplicity. Generally, integer pixel ME accounts for most of the computational cost of the whole ME. However, with the development of fast ME algorithms [1]-[5] the computational cost of integer pixel ME has been greatly reduced. Fractional ME has a strong impact on peak signal to noise ratio (PSNR), with an improvement of about 2-3 dB, and also has high computational complexity because of the complex sub pel interpolation process. The computational cost of fractional pel ME therefore becomes comparable to that of integer pel ME, as the full fractional search method requires 49 points. The conventional Hierarchical Fractional Pixel Search (HFPS) that has been adopted in the reference software needs to check 17 search points. Hence reducing the computational load of fractional pel motion search is both necessary and significant.

Fig. 1 shows the conventional hierarchical fractional pel search (HFPS) method [2] in H.264. It first examines the eight ½ pel positions surrounding the best integer pixel position and obtains the best ½ pel motion vector (MV). It then checks the eight ¼ pel positions around that point to obtain the best ¼ pel MV. Center Biased fractional pel search (CBFPS) [2] has also been adopted in JVT for fast fractional pel ME, as shown in Fig. 2.

Fig. 1. Hierarchical Fractional Pel Search algorithm. Legend: integer pel, ½ pel, ¼ pel.
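The search-point counts quoted in the introduction can be reproduced with a few lines of arithmetic. The sketch below assumes that the fixed quarter pel range covers offsets from -3 to +3 in each dimension, and that HFPS evaluates the center once plus eight new neighbors at each of its two refinement levels. The function names are illustrative and are not taken from the JVT reference software.

```python
def full_search_points(search_range: int) -> int:
    """Candidates in a square window of offsets -search_range..+search_range."""
    side = 2 * search_range + 1
    return side * side


def hierarchical_search_points(levels: int = 2, neighbors: int = 8) -> int:
    """Center point plus `neighbors` new candidates per refinement level."""
    return 1 + levels * neighbors


if __name__ == "__main__":
    print(full_search_points(3))         # 49 points for full fractional search
    print(hierarchical_search_points())  # 17 points for HFPS
```

Under these assumptions the two functions return 49 and 17, the figures given above for full fractional search and HFPS.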

978-1-4244-5654-3/09/$26.00 ©2009 IEEE


ICIP 2009

Fig. 2. Center Biased Fractional Pel Search algorithm. Legend: integer pel, ¼ pel; marked points: (0,0) and PMV.

2. FRACTIONAL PIXEL ERROR SURFACE
It has been observed that fast integer ME works best if the error surface inside the search window is unimodal. Unfortunately this does not hold for integer pel ME, because of the large search window and the complexity of real video content, so the search is easily trapped in a local minimum. On the other hand, since sub pels are generated by interpolating integer pels, the correlation inside the fractional pel search window is much higher than inside the integer pel search window. Thus the unimodal error surface assumption is valid in most fractional pel cases, and the matching error decreases monotonically as the search point moves closer to the global minimum. The error surfaces of fractional and integer pel ME are shown in Fig. 3 (a) and 3 (b) respectively.

Fig. 3. Statistics from [3]: (a) error surface of fractional pel motion estimation (1/8-pel case), (b) error surface of integer pel motion estimation (search range = 32).

2. PROPOSED ALGORITHM
In the full search method every fractional pel around the integer pel position is checked. However, under the unimodal error surface assumption discussed above it is redundant to check two points that lie in opposite directions, since the minimum cannot occur in both directions simultaneously. Hence it is good practice to check only a few directions (pels) around the search window center instead of checking all eight directions (pels). To implement this technique we adopt the Quadrant Selection Approach, which is discussed in the following section.

2.1. Quadrant Selection Approach
Quadrant selection divides the search area into quadrants and thereby localizes the search in a certain direction. Instead of searching haphazardly, we obtain an approximate idea of the direction of motion, which decreases the computational load. To avoid checking redundant directions we use the quadrant selection approach shown in Fig. 4.

If D(X) > D(Z) and D(X) > D(Y), Quadrant I is selected.      (1)
If D(X) > D(Z) and D(X) < D(Y), Quadrant II is selected.     (2)
If D(X) < D(Z) and D(X) < D(Y), Quadrant III is selected.    (3)
If D(X) < D(Z) and D(X) > D(Y), Quadrant IV is selected.     (4)

D(X), D(Y) and D(Z) stand for the distortion at points X, Y and Z respectively.

Fig. 4. Quadrant Selection Approach: (a) search area divided into 4 quadrants (I to IV), (b) mandatory points required for quadrant selection (X, Y and Z).

2.2. Search Process
The proposed algorithm uses a two-step search process for the ½ pel search points and another two steps for the ¼ pel search points, as shown in Fig. 6.

2.2.1 Half Pel Points:

Step 1) Calculate the cost of the best integer pel position and of 2 search points (below and right) in half pel positions (dark circular points).
Step 2) Select the quadrant using Eq. 1 to 4. Depending on the quadrant selected, the algorithm then selects additional ½ pel points in that quadrant to find the best ½ pel position.
Case 1) For quadrant I, only one additional ½ pel point is selected (lighter circle), as shown in Fig. 5 (a).
Case 2) For quadrant II, two additional ½ pel points are selected (lighter circles), as shown in Fig. 5 (b).
Case 3) For quadrant III, three additional ½ pel points are selected (lighter circles), as shown in Fig. 5 (a).
Case 4) For quadrant IV, two additional ½ pel points are selected (lighter circles), as shown in Fig. 5 (b).

Once the best ½ pel point is chosen, the search for the ¼ pel point starts from the best ½ pel point.
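The quadrant rule of Eq. 1 to 4 and the two-step half pel search can be sketched as follows. This is a minimal illustration under stated assumptions, not the JM implementation: it assumes that Z is the half pel point to the right of X and Y the point below it, that y grows downwards, that equal costs count as "not smaller", and that the additional points of each quadrant are the untested neighbors on the side of X where the minimum is expected. The names `select_quadrant`, `half_pel_search` and the `cost` callback are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Offset = Tuple[int, int]  # (dx, dy) in half pel units, y grows downwards

# Assumed layout: Z is right of X and Y is below X (the figure is not reproduced).
Z_OFFSET: Offset = (1, 0)
Y_OFFSET: Offset = (0, 1)

# Assumed extra candidates per quadrant; the counts (1, 2, 3, 2) follow Cases 1 to 4.
EXTRA_POINTS: Dict[int, List[Offset]] = {
    1: [(1, 1)],                      # Y and Z both beat X: try below-right
    2: [(0, -1), (1, -1)],            # only Z beats X: try above and above-right
    3: [(-1, 0), (0, -1), (-1, -1)],  # neither beats X: try left, above, above-left
    4: [(-1, 0), (-1, 1)],            # only Y beats X: try left and below-left
}


def select_quadrant(d_x: float, d_y: float, d_z: float) -> int:
    """Eq. 1 to 4; equal costs are treated as 'not smaller' (an assumption)."""
    z_better = d_x > d_z
    y_better = d_x > d_y
    if z_better and y_better:
        return 1
    if z_better:
        return 2
    if y_better:
        return 4
    return 3


def half_pel_search(cost: Callable[[Offset], float]) -> Tuple[Offset, int, int]:
    """Return (best offset, selected quadrant, number of cost evaluations)."""
    # Step 1: center X plus the two mandatory points Z and Y.
    costs = {p: cost(p) for p in [(0, 0), Z_OFFSET, Y_OFFSET]}
    # Step 2: pick the quadrant and evaluate only its extra points.
    quadrant = select_quadrant(costs[(0, 0)], costs[Y_OFFSET], costs[Z_OFFSET])
    for p in EXTRA_POINTS[quadrant]:
        costs[p] = cost(p)
    best = min(costs, key=lambda p: costs[p])
    return best, quadrant, len(costs)
```

With a bowl-shaped toy cost whose minimum lies at (1, 1), the sketch selects quadrant I and evaluates 4 points; with the minimum at (-1, -1) it selects quadrant III and evaluates 6. If the quarter pel stage reuses the cost already computed for its center, it adds 3 to 5 further evaluations, which is consistent with the 7 to 11 points quoted for the complete algorithm.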

2.2.2. Quarter Pel Points:
The search procedure for the ¼ pel points is exactly the same as for the ½ pel points. Two examples are shown in Fig. 6 and are explained as follows:

Example 1: In the first ½ pel case (lower right quadrant), quadrant I is selected. The lighter circle is chosen as the best ½ pel point. For the ¼ pel case quadrant I is chosen again and three ¼ pel points are selected. The minimum is calculated among this one ½ pel point and three ¼ pel points.

Example 2: In the second ½ pel case, quadrant III is selected. The lighter circle is chosen as the best ½ pel point. For the ¼ pel case quadrant II is chosen and two ¼ pel points are selected. The minimum is calculated among this one ½ pel point and three ¼ pel points.

Fig. 5. Proposed Search algorithm for ½ pel: (a) Case 1 and Case 3, (b) Case 2 and Case 4. Legend: integer pel, ½ pel; quadrants I to IV are marked.

Fig. 6. Proposed Search algorithm for ¼ pel. Legend: integer pel, ½ pel, ¼ pel.

The algorithm thus rests on the unimodal error surface assumption. It examines only the points in the selected quadrant and skips the unlikely positions among the eight neighboring points, which are redundant under this assumption. Compared with the 17 points checked by HFPS in the reference software, the algorithm therefore reduces the number of search points. Both the ½ pel and the ¼ pel search processes are adaptive. With the above steps, the algorithm needs only 7 (Case 1 for both ½ pel and ¼ pel) to 11 (Case 3 for both ½ pel and ¼ pel) points.

3. SIMULATION RESULTS
We implemented the proposed algorithm in JM-12.2 [6], the H.264/AVC reference software, and compared it with Center Biased fractional pel search (CBFPS) [2], which has been adopted in JVT. Quality is evaluated by luminance peak signal to noise ratio (YPSNR) and bit rate; computational complexity is evaluated by fractional pixel ME time and by the speedup relative to the JVT software. The simulation is carried out at quantization parameters QP = 28 and QP = 32 to test the algorithm at different bit rates. The JM-12.2 Main profile is used for encoding. For each test sequence the first frame is coded as an I frame and the remaining frames as P frames. One reference frame is used and the search range is set to 16. The simulation platform is a PC with an Intel Pentium IV 2.66 GHz CPU. The algorithm is tested using the sequences Flower, Car Phone, Mobile and Claire, which show a variety of motion content, from the slow video conferencing sequence Claire to the fast and complex sequence Car Phone; the Mobile sequence exhibits occlusion, and Flower is a typical camera panning sequence.

The same integer pel ME algorithm is used with both fractional pel ME algorithms for comparison. The sequence information is given in Table I and the simulation results are tabulated in Tables II to V. The results show that the YPSNR and bit rate of the proposed algorithm are quite close to those of CBFPS, while in terms of speed the proposed algorithm is slightly better, with an average time saving of 9.6% at QP = 28 and 4.0% at QP = 32. Further improvement in speed can be achieved by using early termination criteria and prediction information.

4. CONCLUSION
The proposed fast fractional pel motion estimation algorithm is based on the strict application of the unimodal error surface assumption. The algorithm divides the search area into four quadrants and then finds the minimum error point by searching only a few points that lie in the selected quadrant. In this way a computational gain is achieved. The experimental results show a reduction in computation time while keeping almost the same performance as the CBFPS algorithm.
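The time-saving percentages in the speedup tables appear to be the relative reduction in fractional ME time with respect to CBFPS. The sketch below states that formula explicitly. It is an assumption rather than a definition given in the paper, although it reproduces the QP=28 Flower entry (24.56 s against 22.00 s, 10.4%) and the 9.6% average. The function name is illustrative.

```python
def time_saving_percent(t_reference: float, t_proposed: float) -> float:
    """Relative reduction in ME time, as a percentage of the reference time."""
    if t_reference <= 0:
        raise ValueError("reference time must be positive")
    return 100.0 * (t_reference - t_proposed) / t_reference


if __name__ == "__main__":
    # ME times in seconds as read from the QP=28 speedup table: (CBFPS, proposed).
    qp28_times = [(24.56, 22.00), (12.44, 10.51), (4.58, 4.39), (2.88, 2.64)]
    savings = [time_saving_percent(ref, new) for ref, new in qp28_times]
    print([round(s, 1) for s in savings])         # [10.4, 15.5, 4.1, 8.3]
    print(round(sum(savings) / len(savings), 1))  # 9.6
```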

1563

TABLE I SEQUENCE INFORMATION

Sequence     Resolution   No. of frames   Frame rate
Flower       CIF          150             30 f/s
Car Phone    QCIF         300             30 f/s
Mobile       QCIF         100             30 f/s
Claire       QCIF         100             30 f/s

TABLE II PERFORMANCE COMPARISON (SPEEDUP) QP=28

Sequence     Method     ME Time (s)   Time Saving (%)
Flower       CBFPS      24.56         --
             Proposed   22.00         10.4
Car Phone    CBFPS      12.44         --
             Proposed   10.51         15.5
Mobile       CBFPS      4.58          --
             Proposed   4.39          4.1
Claire       CBFPS      2.88          --
             Proposed   2.64          8.3
Average time saving                   9.6%

TABLE III PERFORMANCE COMPARISON (QUALITY) QP=28

Sequence     Method     YPSNR (dB)   Bitrate (kb/s)
Flower       CBFPS      34.80        1472.9
             Proposed   34.72        1526.2
Car Phone    CBFPS      36.47        154.72
             Proposed   36.44        157.37
Mobile       CBFPS      33.14        451.85
             Proposed   33.12        453.67
Claire       CBFPS      39.77        33.28
             Proposed   39.72        33.23

TABLE IV PERFORMANCE COMPARISON (SPEEDUP) QP=32

Sequence     Method     ME Time (s)   Time Saving (%)
Flower       CBFPS      24.75         --
             Proposed   24.21         2.2
Car Phone    CBFPS      12.22         --
             Proposed   11.70         4.3
Mobile       CBFPS      3.06          --
             Proposed   2.84          7.2
Claire       CBFPS      5.01          --
             Proposed   4.91          2
Average time saving                   4.0%

TABLE V PERFORMANCE COMPARISON (QUALITY) QP=32

Sequence     Method     YPSNR (dB)   Bitrate (kb/s)
Flower       CBFPS      31.17        807.3
             Proposed   31.12        822.2
Car Phone    CBFPS      33.47        84.69
             Proposed   33.48        84.58
Mobile       CBFPS      36.82        18.84
             Proposed   36.77        18.45
Claire       CBFPS      29.49        224.33
             Proposed   29.42        224.23

REFERENCES

[1] S. Zhu and K. K. Ma, “A new diamond search algorithm for fast block matching motion estimation,” IEEE Trans. Image Process., Vol. 9, No. 2, pp. 287-290, Feb. 2000.

[2] Z. Chen, P. Zhou, Y. He and Y. Chen, “Fast Integer Pel and Fractional Pel Motion Estimation for JVT,” ITU-T, Doc. #JVT-F-017, Dec. 2002.

[3] Humaira Nisar and Tae-Sun Choi, “A fast block motion estimation algorithm based on motion classification and directional search patterns,” Optical Engineering, Vol. 47, No. 10, pp. 107001.1-10, Oct. 2008.

[4] Humaira Nisar and Tae-Sun Choi, “Multiple Initial Point Prediction based Search Pattern Selection for Fast Motion Estimation,” Pattern Recognition, Vol. 42, No. 3, pp. 475-486, Mar. 2009.

[5] Yu-Jen Wang, Chao-Chung Cheng and Tian-Sheuan Chang, “A Fast Algorithm and its VLSI Architecture for Fractional Motion Estimation for H.264/MPEG-4 AVC Video Coding,” IEEE Trans. Circuit and Sys. Video Tech., Vol. 17, No. 5, pp. 578-583, May 2007.

[6] Joint Video Team Reference Software, Version 12.2 (JM12.2), http://iphome.hhi.de/suehring/tml/download/.
