
2003 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2003) Awaji Island, Japan, December 7-10, 2003

D3-6

A FAST ADAPTIVE TWO-LEVEL MULTI-MODE SEARCH ALGORITHM FOR MOTION ESTIMATION

Yilong Liu and Soontorn Oraintara
Department of Electrical Engineering, University of Texas at Arlington, 416 Yates St., Arlington, TX, 76019-0016 USA
Email:[email protected], [email protected]

ABSTRACT

Motion estimation is the main bottleneck in real-time video coding applications, and the search for fast and effective motion estimation algorithms remains a challenging problem. This paper describes a new block-matching algorithm that exploits spatial and temporal correlations, and proposes an adaptive two-level multi-mode search algorithm (ATMMS) based on an analysis of the neighboring area surrounding the predicted motion vectors. Compared with the well-known diamond search (DS) algorithm recently proposed, our algorithm is robust and achieves better performance with fewer search points.

1. INTRODUCTION

Owing to the fast development of communication techniques, the transmission of video sequences has received increasing emphasis. Given limited channel bandwidth and real-time processing requirements, it is necessary to apply efficient video source coding with a very high compression ratio. To achieve this objective, the temporal redundancy between adjacent frames in a video sequence has to be properly identified and eliminated. Block-matching motion estimation (BME), an efficient and popular method, has been widely adopted to reduce the temporal redundancy in many video compression standards, such as ITU-T H.263, ITU-T H.26L, ISO/IEC MPEG-1, MPEG-2 and MPEG-4 [1]. However, this process is typically computationally intensive, since the block that best matches the one being considered has to be selected from a large set of candidates. Therefore, it is highly desirable to find a fast, low-complexity block-based search technique that still maintains good reconstructed video quality.

The simplest BME is the full search (FS) algorithm, which gives the globally optimal motion solution, i.e. the minimum matching error point, by evaluating all the candidates within the search window. However, because of its substantial computational load, it is not practical for many applications, especially those that operate in real time and under power constraints. To overcome this drawback, many fast BME's have been proposed, such as three-step search (TSS) [2], four-step search (4SS) [3], cross search (CS) [4], block-based gradient descent search (BBGDS) [5], new three-step search (NTSS) [6], and diamond search (DS) [7]. Generally, these fast BME's attempt to locate optimal vectors with the same search pattern and strategy in a sequential manner guided by some gradient calculation. Note also that all of these methods normally start the search at the position of the current block. In addition, they must minimize the number of points being evaluated in order to reduce the extremely high complexity of the FS approach.

However, different video sequences usually have different contents and motion behaviors, so it is not possible to fit them all with a single search policy. Since the error surface, which is defined by a function of the distance between the block being considered and its candidates over the search window, is generally neither monotonic nor convex, these fast BME's may quickly fall into a local minimum as the number of search steps decreases. Inevitably, they can only improve their performance by inefficiently increasing the number of search points. Therefore, it is beneficial to choose the search pattern and strategy adaptively, and it is vitally important that the starting point of the search be based on an analysis of efficiently predicted motion vectors.


2. ADAPTIVE TWO-LEVEL MULTI-MODE SEARCH ALGORITHM (ATMMS)

2.1. Motion vector prediction

Since the error surface is not convex, the final solution depends on the initial motion vector. In this section, we discuss how to select the initial motion vector. In real-world video sequences, the motion vector of the current block is naturally highly correlated with those of its adjacent blocks, since they tend to move in the same direction, and with those of the corresponding neighboring blocks in the previous frame, assuming that the object moves sufficiently slowly. Therefore, predicting motion vectors from these spatial and temporal fields by evaluating the distortions can provide an efficient starting point.

One way to reduce the possibility of being trapped in a local minimum is to increase the block size. The motion field of natural sequences is piecewise continuous in the spatial domain [8]. Pixels inside the same video object move consistently in a certain direction. Generally, the motion vector obtained from a larger block is also more robust to noise than that from a smaller block. Predicting an approximate large-scale motion vector using larger blocks and refining the predicted motion vector using smaller blocks could increase the probability of reaching the globally optimal solution. It is less complex to apply the larger-scale motion estimation at a coarser resolution obtained by a 4:1 subsampling of pixels [9]. With the same block size, any fast BME can be applied to the coarser level within a search window half the size of that of the finer level, as shown in Fig. 1. In [7], the DS method is recommended because of its high efficiency. In this method, when two motion vectors produce similar mean absolute differences (MAD), the one that is closer is selected. The motion field produced by this method will be smoother than that obtained from the FS method.

[Figure 1 shows a 16 x 16 block with search window (-15, 15) at the finer level (original image) and a 16 x 16 block with search window (-7, 7) at the coarser level.]
Figure 1: Two level larger scale motion vector prediction

Based on the above analysis, four motion vectors are chosen to form a predicted motion vector set: those of the upper and the left blocks of the same frame, that of the same block of the previous frame, and that of the corresponding block in the coarser level:

$$S_{i,j}^{t} = \left\{ V_f^{t}(i-1, j),\; V_f^{t}(i, j-1),\; V_f^{t-1}(i, j),\; 2 V_c^{t}(\lfloor i/2 \rfloor, \lfloor j/2 \rfloor) \right\} \quad (1)$$

where $(i, j)$ is the block index in the finer level, $t$ is the frame index, and $V_f$ and $V_c$ are the motion vectors in the finer and coarser levels respectively. Since each block in the coarser level actually corresponds to 4 blocks in the finer level, the operation $2 V_c^{t}(\lfloor i/2 \rfloor, \lfloor j/2 \rfloor)$ gives the prediction from the coarser level. By evaluating the sum of absolute differences (SAD) of these candidates, the one with the minimum SAD is selected as the starting point for the multi-mode search routine described in the next section.

2.2. Multi-mode search

Conventionally, search points in fast BME's are restricted to the area surrounding the starting point, which can be inefficient in some cases. In fact, the resulting motion vector might just be a local minimum. However, if the search area can be adaptively changed according to some pre-analysis of the neighboring points of the current predicted starting point, the subsequent search becomes much more efficient and its complexity is reduced at the same time. In particular, let $(u_0, v_0)$ be the predicted optimum candidate obtained from the predicted motion vector set $S_{i,j}^{t}$ described in the previous section, which will be used as the starting point of the search algorithm. The SAD's of the four neighboring points, $(u_0 - 1, v_0)$, $(u_0 + 1, v_0)$, $(u_0, v_0 - 1)$, and $(u_0, v_0 + 1)$, are calculated and compared with that of the starting point $(u_0, v_0)$. Instead of choosing the point with the minimum SAD among these five points, the two points with the smallest SAD's are selected. For convenience of discussion, let $P_1$ and $P_2$ denote the points with the smallest and second smallest SAD's, respectively. The search modes are defined by the different combinations of the locations of $P_1$ and $P_2$, which fall into four categories as summarized below:

Mode A: $P_1$ is the center point.
Mode B: $P_2$ is the center point.
Mode C: Neither $P_1$ nor $P_2$ is the center point and they are diagonally connected.
Mode D: Neither $P_1$ nor $P_2$ is the center point and they are not connected.

Figure 2 illustrates the four possible search modes. If mode A is selected, only three more neighboring points in the direction of $P_2$ are added as new candidates and their SAD's are compared with that of $P_1$. The one with the minimum SAD is chosen as the final solution and the search stops. If mode B is selected, three new points next to $P_1$ are added and their SAD's are compared with that of $P_1$. The one with the minimum SAD (including $P_1$) is selected as a new starting point. This new starting point together with its four neighboring points is used to determine the next search mode, and the search continues. Similarly, if mode C or D is selected, five or six more candidates are considered in determining the new starting point of the next search routine. The search stops either after mode A has been selected or when it reaches the border of the search window. It should be noted that the new mode determination is less complex than the first one, since only two or three more points are introduced in the comparison: at least one point has already been tested and is available from the last search routine. This also reduces the computational complexity of the proposed search algorithm. Figure 3 illustrates the two different cases of new search points in the new search routine.

The ATMMS algorithm can be summarized as follows.

Step 1 Obtain the coarser resolution image by computing the mean of the non-overlapping $2 \times 2$ pixels of the finer level image. Set the block size to $16 \times 16$ for both levels. At the coarser level, get $V_c$ by applying the DS algorithm.

Step 2 Create a motion vector set $S_{i,j}^{t}$ as in Eq. (1), and choose the one with the smallest SAD as the starting point of the multi-mode search.

Step 3 Compare the SAD of the starting point with those of its four neighboring points to identify $P_1$ and $P_2$.

Step 4 Use the combination of $P_1$ and $P_2$ to determine the search mode as described above (see Figure 2).

Step 5 If the search reaches the border, $P_1$ is selected as the final solution and the search stops. If mode A is selected, incorporate the other neighboring points next to $P_2$ into consideration. The one with the minimum SAD is selected as the final solution and the search stops. If mode B, C or D is selected, the new points shown in Figure 2 are incorporated. The one with the minimum SAD is selected as the starting point of the next search routine, and the algorithm returns to Step 3.

Since the error surface is usually not convex within the neighboring area of the starting point, searching in the direction of $P_1$ only does not guarantee that the search will reach the global optimum.
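Steps 1 and 2 of the summary can be sketched as follows. This is a minimal illustration in plain Python lists, not the authors' implementation. The function names, the (dy, dx) tuple layout, the handling of blocks on the first row or column, and in particular the doubling of the coarser-level vector with indexing at (i // 2, j // 2) are assumptions made here, based on the statement that one coarser block corresponds to 4 finer blocks and that the coarser search window is half the size of the finer one.

```python
def downsample_2x2(frame):
    """Coarser-level image: mean of each non-overlapping 2 x 2 pixel group (Step 1)."""
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * r][2 * c] + frame[2 * r][2 * c + 1]
             + frame[2 * r + 1][2 * c] + frame[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def predicted_set(mv_cur, mv_prev, mv_coarse, i, j):
    """Candidate starting vectors for finer-level block (i, j), as (dy, dx) tuples.

    mv_cur    -- vectors already found in the current frame (None if not yet searched)
    mv_prev   -- vectors of the previous frame
    mv_coarse -- coarser-level vectors, one entry per 2 x 2 group of finer blocks
    """
    candidates = []
    if i > 0:
        candidates.append(mv_cur[i - 1][j])   # upper block, same frame
    if j > 0:
        candidates.append(mv_cur[i][j - 1])   # left block, same frame
    candidates.append(mv_prev[i][j])          # same block, previous frame
    cy, cx = mv_coarse[i // 2][j // 2]
    candidates.append((2 * cy, 2 * cx))       # coarser vector, rescaled (assumed x2)
    return candidates


def choose_start(candidates, cost):
    """Step 2: keep the candidate with the smallest matching cost (e.g. SAD)."""
    return min(candidates, key=cost)


# Toy data: a 2 x 2 grid of finer blocks and a single coarser block.
mv_cur = [[(0, 1), (0, 2)], [(1, 1), None]]
mv_prev = [[(0, 0), (0, 0)], [(0, 0), (3, 3)]]
mv_coarse = [[(1, 1)]]
cands = predicted_set(mv_cur, mv_prev, mv_coarse, 1, 1)
# Stand-in cost: distance to a pretend true motion of (2, 2); a real coder would use the SAD.
start = choose_start(cands, lambda mv: abs(mv[0] - 2) + abs(mv[1] - 2))
coarse = downsample_2x2([[1, 3, 5, 7], [5, 7, 9, 11]])
print(cands, start, coarse)
```

In an encoder the `cost` argument would be the SAD of the current block at the candidate offset, so that choosing the starting point costs at most four SAD evaluations per block.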


By introducing the combinations of $P_1$ and $P_2$, global optimality is still not guaranteed, but the probability of being trapped in a local minimum is reduced, which raises the probability of reaching the global minimum. For example, in mode A, $P_1$ is located at the center, which is also the starting point, and $P_2$ is one of the four corners. In this situation, most other fast BME's stop searching and take the center point as the final solution, which is not correct in some cases. In addition, some other BME's, such as DS, search the large diamond points around the center point, which is inefficient since many points with a small likelihood of being close to the global optimum are included. The best solution is more likely to occur in the direction of $P_2$ than in the other directions. This is why the points surrounding $P_2$ are also incorporated into the search. Similarly, in modes B, C and D, only three, five and six points, respectively, are included for the next search.
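The mode decision and the resulting point counts can be sketched as below. The classification rule follows the four mode definitions in the text. The exact positions of the new search points are given only in Figure 2, so the sketch assumes they are the untested unit-step neighbors of $P_2$ (mode A), of $P_1$ (mode B), or of both (modes C and D). That assumption reproduces the counts of three, three, five and six points stated above. All names are hypothetical.

```python
CENTRE = (0, 0)
CROSS = [(0, -1), (-1, 0), (1, 0), (0, 1)]  # unit-step neighbours, relative offsets


def classify_mode(sads):
    """sads maps the centre and its four cross neighbours to their SAD values.

    Returns (mode, p1, p2), with p1 / p2 the points of smallest / second smallest SAD.
    """
    ranked = sorted(sads, key=sads.get)
    p1, p2 = ranked[0], ranked[1]
    if p1 == CENTRE:
        mode = "A"
    elif p2 == CENTRE:
        mode = "B"
    elif abs(p1[0] - p2[0]) == 1 and abs(p1[1] - p2[1]) == 1:
        mode = "C"  # two neighbours that touch diagonally
    else:
        mode = "D"  # two neighbours on opposite sides of the centre
    return mode, p1, p2


def new_points(mode, p1, p2, tested):
    """Untested neighbours of P2 (mode A), of P1 (mode B), or of both (modes C, D).

    The choice of points is an assumption; the paper defines them in Figure 2.
    """
    seeds = {"A": [p2], "B": [p1], "C": [p1, p2], "D": [p1, p2]}[mode]
    points = []
    for sy, sx in seeds:
        for dy, dx in CROSS:
            q = (sy + dy, sx + dx)
            if q not in tested and q not in points:
                points.append(q)
    return points


examples = {
    "A": {(0, 0): 10, (1, 0): 12, (-1, 0): 30, (0, 1): 25, (0, -1): 40},
    "B": {(0, 0): 12, (1, 0): 10, (-1, 0): 30, (0, 1): 25, (0, -1): 40},
    "C": {(0, 0): 50, (1, 0): 10, (-1, 0): 30, (0, 1): 12, (0, -1): 40},
    "D": {(0, 0): 50, (1, 0): 10, (-1, 0): 12, (0, 1): 25, (0, -1): 40},
}
counts = {}
for label, sads in examples.items():
    mode, p1, p2 = classify_mode(sads)
    counts[mode] = len(new_points(mode, p1, p2, set(sads)))
print(counts)
```

Passing the set of already tested points is what makes later search routines cheaper than the first one: points inherited from the previous routine are never evaluated twice.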

[Figure 2 has four panels: (a) Mode A, (b) Mode B, (c) Mode C, (d) Mode D. Legend: point with the smallest SAD, denoted as P1; point with the second smallest SAD, denoted as P2; the points other than P1 and P2; new search point.]
Figure 2: Adaptive Multi-mode search

[Figure 3 has two panels, (a) and (b), each showing the new search points given the new central point.]
Figure 3: New points whose SAD's are to be calculated after a new starting point has been determined.

3. SIMULATION RESULTS

In the simulations, the block size is fixed at $16 \times 16$ pixels and the search window is set to $\pm 15$. Therefore there are $31 \times 31 = 961$ candidates for each block. In the coarser scale used for the estimation of the starting point, the block size is also fixed at $16 \times 16$. Hence the search window is set to $\pm 7$. The SAD is defined as follows [10]:

$$\mathrm{SAD}(x, y; u, v) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| I_t(x+i, y+j) - I_{t-1}(x+i+u, y+j+v) \right| \quad (2)$$

where $I_t$ denotes frame $t$, $(x, y)$ is the block position, and $u$ and $v$ are the horizontal and vertical block offsets, i.e. $-W \le u, v \le W$, where in this paper $W = 7$ or 15 and $N = 16$. Since the block sizes in the coarser and finer scales are the same, the computation costs for obtaining one SAD are the same. In this paper, the computational complexity of each method is therefore measured by the number of search points whose SAD's are calculated. For a fair comparison, we also include the number of search points in the coarser scale for the ATMMS method; because the coarser image has one quarter of the pixels, it contains approximately one quarter as many blocks as the finer scale.

In this paper, six representative video sequences (all in CIF format), which vary in motion content, are tested. Table 1 summarizes the testing video sequences and their numbers of frames.

Table 1: Testing video sequences in CIF format.

Sequence      Number of frames
Garden        121
Mobile        200
Coastguard    299
Tempete       251
Tennis        299
Foreman       250

Figure 4 shows the performance of the proposed search algorithm in terms of mean square error (MSE). It is evident that the ATMMS yields MSE values similar to those of the FS, with slight degradation, and lower than those of the DS method in most of the frames. Table 2 compares the average MSE's calculated for the different video sequences. In terms of the average MSE, the results obtained with the ATMMS method are consistently better than those of the DS method.
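As a point of reference for the complexity figures used in Section 3, the SAD of Eq. (2) and the exhaustive FS baseline can be sketched as below. This is an illustration under stated assumptions, not the code used for the reported simulations: frames are plain nested lists, offsets are written as (dy, dx) in row and column order rather than the paper's horizontal and vertical (u, v), and candidates that would leave the reference frame are simply skipped. With a 16 x 16 block and a search range of 15, the double loop visits the 961 candidates mentioned above whenever all of them lie inside the frame.

```python
def sad(cur, ref, top, left, dy, dx, block):
    """Sum of absolute differences between the current block and a shifted reference block."""
    total = 0
    for i in range(block):
        for j in range(block):
            total += abs(cur[top + i][left + j] - ref[top + dy + i][left + dx + j])
    return total


def full_search(cur, ref, top, left, block=16, search=15):
    """Evaluate every offset in [-search, search]^2 and keep the smallest SAD.

    Candidates that would fall outside the reference frame are skipped; this
    boundary policy is an assumption, not something the paper specifies.
    Returns ((dy, dx), best_sad, candidates_evaluated).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, evaluated = None, None, 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + block > height or c + block > width:
                continue
            cost = sad(cur, ref, top, left, dy, dx, block)
            evaluated += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost, evaluated


# Synthetic check on a small frame: the current frame is the reference
# shifted by a known amount, so the true offset is (-2, 3).
ref = [[(7 * r * r + 13 * c * c + 3 * r * c) % 251 for c in range(32)] for r in range(32)]
cur = [[ref[(r - 2) % 32][(c + 3) % 32] for c in range(32)] for r in range(32)]
mv, cost, n = full_search(cur, ref, top=12, left=12, block=8, search=4)
print(mv, cost, n)
```

The third return value is the quantity the paper uses as its complexity measure: one unit per SAD evaluation, independent of how the SAD itself is computed.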


Figure 5 shows the number of search points used to calculate the SAD in each frame. It is clear that the number of search points of the ATMMS method is significantly lower than that of the FS method (961 search points). In addition, its computational complexity is consistently lower than that of the DS method. Table 3 presents the average numbers of search points for the different testing video sequences. The computational complexity can also be quantified by the speed-up ratio, which is the ratio between the numbers of search points of the FS and of the method under test:

$$\text{Speed-up ratio} = \frac{N_{FS}}{N_{test}} \quad (3)$$

where $N_{FS}$ is the number of search points per block for the FS method, i.e. $(2W+1)^2 = 961$ for the search range $W = 15$ used in this paper, and $N_{test}$ is the average number of search points per block for the fast block-matching algorithm under test. Table 4 tabulates the speed-up ratios given by Eq. (3).

It can be seen from Figures 4 and 5 that the ATMMS performs very competitively in terms of low block MSE distortion while requiring far fewer search points than the DS for fast changing sequences: about 51% fewer for the “tennis” sequence and about 37% fewer for the “foreman” sequence (Table 3). The speed improvement is also substantial (about 51%) for sequences containing large quantities of small motions, such as the “tempete” sequence. The best performance in terms of MSE is achieved with fast camera panning sequences such as the “mobile” sequence (see Figure 4(c)).

4. CONCLUSION

In this paper, a new search method, ATMMS, for block-matching motion estimation in video compression is proposed. It incorporates both spatial and temporal information in the prediction of the search starting point in order to reduce the number of search points. Multiscale search is used to prevent the search from being trapped in local minima and is suitable for large moving objects. The search is divided into four different modes based on the locations of the two points with the smallest SAD's around the predicted starting point. Simulation results show that the proposed ATMMS method yields an MSE similar to that of the FS method and occasionally outperforms the DS method. It also reduces the computational complexity relative to the DS method by roughly 20 to 50%.

5. REFERENCES

[1] Abdul H. Sadka. “Compressed video communications”. John Wiley and Sons Ltd, 2002.

[2] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro. “Motion compensated interframe coding for video conferencing”. pages G5.3.1–G5.3.5, Nov. 29-Dec. 3 1981.

[3] L.M. Po and W.C. Ma. “A novel four-step search algorithm for fast block motion estimation”. IEEE Trans. Circuits Syst. Video Technol., 6:313–317, June 1996.

[4] M. Ghanbari. “The cross-search algorithm for motion estimation”. IEEE Trans. Commun., 38:950–953, July 1990.

[5] L.K. Liu and E. Feig. “A block-based gradient descent search algorithm for block motion estimation in video coding”. IEEE Trans. Circuits Syst. Video Technol., 6:419–423, Aug. 1996.

[6] R. Li, B. Zeng, and M.L. Liou. “A new three-step search algorithm for block motion estimation”. IEEE Trans. Circuits Syst. Video Technol., 4:438–442, Aug. 199.

[7] Shan Zhu and Kai-Kuang Ma. “A new diamond search algorithm for fast block-matching motion estimation”. IEEE Trans. Image Processing, 9:287–290, February 2000.

[8] John C.-H. Ju, Yen-Kuang Chen, and S.Y. Kung. “A fast rate-optimized motion estimation algorithm for low-bit-rate video coding”. IEEE Trans. Circuits Syst. Video Technol., 9:994–1002, October 1999.

[9] B. Liu and A. Zaccarin. “New fast algorithms for the estimation of block motion vectors”. IEEE Trans. Circuits Syst. Video Technol., 3:54–70, April 1993.

[10] Jo Yew Tham, Surendra Ranganath, Maitreya Ranganath, and Ashraf Ali Kassim. “A novel unrestricted center-biased diamond search algorithm for block motion estimation”. IEEE Trans. Circuits Syst. Video Technol., 8:369–377, Aug. 1998.

Table 2: Average MSE of different video sequences.

Sequence      FS         DS         ATMMS
Garden        82.7856    83.5344    83.2886
Mobile        77.7165    78.6313    77.7272
Coastguard    47.1426    47.5135    47.3279
Tempete       57.3849    58.2749    57.6580
Tennis        34.5140    34.6368    34.6160
Foreman       33.3816    34.4195    34.0811

Table 3: Average number of search points per block with respect to different video sequences.

Sequence      FS       DS       ATMMS
Garden        961.0    19.93    15.70
Mobile        961.0    22.38    12.74
Coastguard    961.0    20.06    16.12
Tempete       961.0    23.59    11.66
Tennis        961.0    23.45    11.51
Foreman       961.0    24.06    15.22

Table 4: Speed-up ratios over the FS method with different video sequences.

Sequence      FS     DS         ATMMS
Garden        1.0    48.2188    61.2102
Mobile        1.0    42.9401    75.4317
Coastguard    1.0    47.9063    59.6154
Tempete       1.0    40.7376    82.4185
Tennis        1.0    40.9808    83.4926
Foreman       1.0    39.9418    63.1406
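The entries of Table 4 follow from those of Table 3 through Eq. (3). The short script below recomputes them as a consistency check. It takes the FS count of 961 points per block from a search range of 15 and copies the DS and ATMMS averages from Table 3; the function and variable names are illustrative only.

```python
def speed_up(points_per_block, search_range=15):
    """Eq. (3): FS search points per block divided by those of the tested method."""
    n_fs = (2 * search_range + 1) ** 2  # 961 candidates for a range of 15
    return n_fs / points_per_block


# Average search points per block (DS, ATMMS), copied from Table 3.
table3 = {
    "Garden": (19.93, 15.70),
    "Mobile": (22.38, 12.74),
    "Coastguard": (20.06, 16.12),
    "Tempete": (23.59, 11.66),
    "Tennis": (23.45, 11.51),
    "Foreman": (24.06, 15.22),
}

ratios = {}
for name, (ds, atmms) in table3.items():
    ratios[name] = (speed_up(ds), speed_up(atmms))
    saving = 100.0 * (ds - atmms) / ds  # reduction in search points relative to DS
    print(f"{name:<11}{ratios[name][0]:9.4f}{ratios[name][1]:9.4f}{saving:7.1f}%")
```

The last printed column gives the reduction in search points of ATMMS relative to DS for each sequence, which is the basis for the reduction quoted in the conclusion.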

[Figure 4 has six panels, (a) to (f), each plotting MSE against frame number for FS, DS and ATMMS.]
Figure 4: Comparison of MSE for FS, DS and ATMMS using (a) Tempete (frames 1-140), (b) Foreman (frames 1-180), (c) Mobile (frames 60-190), (d) Garden (frames 10-80), (e) Coastguard (frames 140-260) and (f) Tennis (frames 40-100).

[Figure 5 has six panels, (a) to (f), each plotting the number of search points against frame number, with legend entries DS and ATMMS. One panel carries the title “Number of search points comparison for flo.cif (100 frames)”.]
Figure 5: Comparison of search points for FS, DS and ATMMS using (a) Tempete (251 frames), (b) Foreman (250 frames), (c) Mobile (200 frames), (d) Garden (121 frames), (e) Coastguard (299 frames) and (f) Tennis (299 frames).
