2003 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2003) Awaji Island, Japan, December 7-10, 2003
D3-6
A FAST ADAPTIVE TWO-LEVEL MULTI-MODE SEARCH ALGORITHM FOR MOTION ESTIMATION

Yilong Liu and Soontorn Oraintara
Department of Electrical Engineering, University of Texas at Arlington, 416 Yates St., Arlington, TX, 76019-0016 USA
Email: [email protected], [email protected]

ABSTRACT

Motion estimation is the main bottleneck in real-time video coding applications, and the search for fast and effective motion estimation algorithms remains a challenging problem. This paper describes a new block-matching algorithm that exploits spatial and temporal correlations, and proposes an adaptive two-level multi-mode search algorithm (ATMMS) based on an analysis of the neighboring area surrounding the predicted motion vectors. Compared with the well-known diamond search (DS) algorithm, our algorithm is robust and achieves better performance with fewer search points.

1. INTRODUCTION

With the rapid development of communication techniques, the transmission of video sequences has received growing emphasis. Given limited channel bandwidth and real-time processing requirements, it is necessary to apply efficient video source coding with a very high compression ratio. To achieve this objective, the temporal redundancy between adjacent frames in a video sequence has to be properly identified and eliminated. Block-matching motion estimation (BME), an efficient and popular method, has been widely adopted to decrease the temporal redundancy in many video compression standards, such as ITU-T H.263, ITU-T H.26L, ISO/IEC MPEG-1, MPEG-2 and MPEG-4 [1]. However, this process is typically computationally intensive, since the block that best matches the one being considered has to be selected from a large set of candidates. Therefore, it is highly desirable to find a fast, low-complexity block-based search technique that still maintains good reconstructed video quality.

The simplest BME is the full search (FS) algorithm, which gives the global optimum motion solution, i.e. the minimum matching error point, by evaluating all the candidates within the search window. However, it is not practical for many applications, especially those that operate in real time and under power constraints, because of its substantial computational load. To overcome this drawback, many fast BMEs have been proposed, such as three-step search (TSS) [2], four-step search (4SS) [3], cross search (CS) [4], block-based gradient descent search (BBGDS) [5], new three-step search (NTSS) [6], and diamond search (DS) [7]. Generally, these fast BMEs attempt to locate optimal vectors with the same search pattern and strategy in a sequential manner guided by some gradient calculation. Note also that all of these methods normally start the search at the position of the current block. In addition, they must minimize the number of points being evaluated in order to reduce the extremely high complexity of the FS approach.

However, different video sequences usually have different contents and motion behaviors, so it is not possible to fit them all with a single search policy. Since the error surface, which is defined by a function of the distance between the block being considered and its candidates over the search window, is generally neither monotonic nor convex, these fast BMEs may quickly fall into a local minimum as the number of search steps decreases. Inevitably, these fast BMEs can only improve their performance by increasing the number of search points, which is inefficient. Therefore, it is beneficial to choose the search pattern and strategy adaptively, and it is vitally important to choose the starting point of the search based on an analysis of efficiently predicted motion vectors.
2. ADAPTIVE TWO-LEVEL MULTI-MODE SEARCH ALGORITHM (ATMMS)

2.1. Motion vector prediction

Since the error surface is not convex, the final solution depends on the initial motion vector. In this section, we discuss how to select the initial motion vector. In real-world video sequences, the motion vector of the current block is highly correlated with those of its adjacent blocks, since they tend to move in the same direction, and with those of the corresponding neighboring blocks in the previous frame, assuming that the object moves sufficiently slowly. Therefore, predicting motion vectors from these spatial and temporal fields by evaluating the distortions can provide an efficient starting point.

One way to reduce the possibility of being trapped in a local minimum is to increase the block size. The motion field of natural sequences is piecewise continuous in the spatial domain [8]: pixels inside the same video object move consistently in a certain direction. Generally, the motion vector obtained from a larger block is also more robust to noise than that from a smaller block. Predicting an approximate large-scale motion vector using larger blocks and refining the predicted motion vector using smaller blocks could increase the probability of reaching the global optimum solution. It is less complex to apply the larger-scale motion estimation at a coarser resolution obtained by a 4:1 subsampling of pixels [9]. With the same block size, any fast BME can be applied to the coarser level within a search window half the size of that of the finer level, as shown in Fig. 1. In [7], the DS method is recommended because of its high efficiency. In this method, when two motion vectors produce similar mean absolute difference (MAD), the one that is closer will be selected. The motion field produced by this method will be smoother than that obtained from the FS method.

Figure 1: Two level larger scale motion vector prediction. (Finer level: original image, 16 x 16 block, search window (-15,15). Coarser level: 16 x 16 block, search window (-7,7).)

Based on the above analysis, four motion vectors are chosen to generate a predicted motion vector set: those of the upper and the left blocks of the same frame, that of the same block in the previous frame, and that of the corresponding block in the coarser level:

$$V_t(m,n) = \left\{\, v^f_t(m-1,n),\; v^f_t(m,n-1),\; v^f_{t-1}(m,n),\; v^c_t(\lfloor m/2 \rfloor, \lfloor n/2 \rfloor) \,\right\} \qquad (1)$$

where $(m,n)$ is the block index (row, column) in the finer level, $t$ is the frame index, and $v^f$ and $v^c$ are the motion vectors in the finer and coarser levels, respectively. Since each block in the coarser level actually corresponds to 4 blocks in the finer level, the operation $(\lfloor m/2 \rfloor, \lfloor n/2 \rfloor)$ gives the prediction from the coarser level. By evaluating the summation of absolute difference (SAD) of these candidates, the one with the minimum SAD is selected as the starting point for the multi-mode search routine described in the next section.

2.2. Multi-mode search

Conventionally, search points in fast BMEs are restricted to the area surrounding the starting point, which can be inefficient in some cases. In fact, the resulting motion vector might just be a local minimum. However, if the search area can be adaptively changed according to some pre-analysis of the neighboring points of the current predicted starting point, further search becomes much more efficient and its complexity is reduced at the same time. In particular, the predicted optimum candidate obtained from the predicted motion vector set described in the previous section is used as the starting point of the search algorithm. The SADs of its four neighboring points are calculated and compared with that of the starting point. Instead of choosing the point with the minimum SAD among these five points, the two points with the smallest SADs are selected. For convenience of discussion, let P1 and P2 denote the points with the smallest and second smallest SADs, respectively. The search modes are defined by the different combinations of the locations of P1 and P2, which fall into four search modes as summarized below:

Mode A: P1 is the center point.
Mode B: P2 is the center point.
Mode C: Neither P1 nor P2 is the center point and they are diagonally connected.
Mode D: Neither P1 nor P2 is the center point and they are not connected.

Figure 2 illustrates the four possible search modes. If mode A is selected, only three more neighboring points in the direction of P2 are added as new candidates, and their SADs are compared with that of P1. The one with the minimum SAD is chosen as the final solution of the process and the search stops. If mode B is selected, three new points next to P1 are added and their SADs are compared with that of P1. The one with the minimum SAD (including P1) is selected as a new starting point. This new starting point, together with its four neighboring points, is used to determine the next search mode, and the search continues. Similarly, if mode C or D is selected, five or six more candidates, respectively, are added for the determination of the new starting point of the next search routine. The search stops either after mode A has been selected or when it reaches the border of the search window. It should be noted that each new mode determination is less complex than the first one, since only two or three more points are introduced in the comparison: at least one point has already been tested and is available from the last search routine. This also reduces the computational complexity of the proposed search algorithm. Figure 3 illustrates the two different cases of new search points in the new search routine.

The ATMMS algorithm can be summarized as follows.

Step 1 Obtain the coarser resolution image by computing the mean of the non-overlapping 2 x 2 pixels of the finer level image. Set the block size to 16 x 16 for both levels. At the coarser level, get $v^c$ by applying the DS algorithm.

Step 2 Create a motion vector set as in Eq. (1), and choose the one with the smallest SAD as the starting point of the multi-mode search.

Step 3 Compare the SAD of the starting point with those of its four neighboring points to identify P1 and P2.

Step 4 Use the combination of P1 and P2 to determine the search mode as described above (see Figure 2).

Step 5 If the search reaches the border, P1 is selected as the final solution and the search stops. If mode A is selected, incorporate the other neighboring points next to P2 into consideration. The one with the minimum SAD is selected as the final solution and the search stops. If mode B, C or D is selected, the new points shown in Figure 2 are incorporated. The one with the minimum SAD is selected as the starting point of the next search routine, and the algorithm returns to Step 3.

Since the error surface within the neighboring area of the starting point is usually not convex, searching only in the direction of P1 does not guarantee that the search will reach the global optimum. By introducing the combinations of P1 and P2, although global optimality is still not guaranteed, the probability of being trapped in a local minimum is reduced, which enhances the probability of reaching the global minimum. For example, in mode A, P1 is located at the center, which is also the starting point, and P2 is one of the four corners. In this situation, most other fast BMEs stop searching and take the center point as the final solution, which is not correct in some cases. In addition, some other BMEs, such as DS, search the large diamond points around the center point, which is not efficient since many points with a small likelihood of being close to the global optimum are included. It is more likely that the best solution will occur in the direction of P2 than in the other directions. This is why the points surrounding P2 are also incorporated into the search. Similarly, in modes B, C and D, only three, five and six points, respectively, are included for the next search.
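Steps 2 to 4 of the summary can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the names `select_start`, `classify_mode` and `NEIGHBOUR_OFFSETS` are hypothetical, the SAD is passed in as a callable, and the four neighboring points are assumed to lie one pixel away horizontally and vertically, which the text does not state explicitly. The extra points tested in each mode (Step 5) are defined by Figure 2 and are not reproduced here.

```python
from typing import Callable, Iterable, List, Tuple

Point = Tuple[int, int]

# Assumption: the four neighbours form a unit-step cross around the centre.
NEIGHBOUR_OFFSETS: List[Point] = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def select_start(candidates: Iterable[Point], sad: Callable[[Point], float]) -> Point:
    """Step 2: pick the predicted motion vector with the smallest SAD."""
    return min(candidates, key=sad)


def classify_mode(center: Point, sad: Callable[[Point], float]) -> Tuple[str, Point, Point]:
    """Steps 3-4: find P1 and P2 among the centre and its four neighbours,
    then name the search mode from their relative positions."""
    cx, cy = center
    points = [center] + [(cx + dx, cy + dy) for dx, dy in NEIGHBOUR_OFFSETS]
    ranked = sorted(points, key=sad)  # stable sort: ties keep the centre-first order
    p1, p2 = ranked[0], ranked[1]
    if p1 == center:
        mode = "A"
    elif p2 == center:
        mode = "B"
    else:
        # Two non-centre points of the cross are either diagonal neighbours
        # of each other (mode C) or on opposite sides of the centre (mode D).
        dx, dy = abs(p1[0] - p2[0]), abs(p1[1] - p2[1])
        mode = "C" if (dx, dy) == (1, 1) else "D"
    return mode, p1, p2


def toy_sad(p: Point) -> float:
    """Toy error surface with its minimum at (3, 1); a stand-in for a real SAD."""
    return abs(p[0] - 3) + abs(p[1] - 1)


if __name__ == "__main__":
    start = select_start([(0, 0), (2, 1), (5, 5)], toy_sad)
    print(start, classify_mode(start, toy_sad))  # (2, 1) ('B', (3, 1), (2, 1))
```

On the toy surface the best predicted candidate is (2, 1); its right-hand neighbour has a smaller error, so P2 is the center and mode B is selected, which moves the starting point for the next routine.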
Figure 2: Adaptive Multi-mode search. Panels: (a) Mode A, (b) Mode B, (c) Mode C, (d) Mode D. Legend: point with the smallest SAD, denoted as P1; point with the second smallest SAD, denoted as P2; the points other than P1 and P2; new search point.

Figure 3: New points whose SAD's are to be calculated after a new starting point has been determined. Panels (a) and (b): new search points given the new central point.

3. SIMULATION RESULTS

In the simulations, the block size is fixed at 16 x 16 pixels and the search window is set at ±15. Therefore there are (2 x 15 + 1)^2 = 961 candidates for each block. In the coarser scale, used for the estimation of the starting point, the block size is also fixed at 16 x 16. Hence the search window is set at ±7. The SAD is defined as follows [10]:

$$\mathrm{SAD}(x,y;u,v) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} \left| I_t(x+i,\,y+j) - I_{t-1}(x+i+u,\,y+j+v) \right| \qquad (2)$$

where $(x,y)$ is the block position, $I_t$ denotes the pixel intensities of frame $t$, and $u$ and $v$ are the horizontal and vertical block offsets, i.e. $-w \le u, v \le w$, where in this paper $w = 7$ or 15 and $N = 16$. Since the block sizes in the coarser and finer scales are the same, their computation costs for obtaining the SAD are the same. In this paper, the number of search points whose SADs are calculated is used to compare the computational complexity of each method. For a fair comparison, we also include the number of search points in the coarser scale for the ATMMS method, which is approximately four times smaller than that of the finer scale.

Six representative video sequences (all in CIF format), which vary in motion content, are tested. Table 1 summarizes the testing video sequences and their numbers of frames.

Table 1: Testing video sequences in CIF format.
Sequence      Number of frames
Garden        121
Mobile        200
Coastguard    299
Tempete       251
Tennis        299
Foreman       250

Figure 4 shows the performance of the proposed search algorithm in terms of mean square error (MSE). It is evident that the ATMMS yields MSE values similar to those of the FS, with slight degradation, and performs better than the DS method in most of the frames. Table 2 compares the average MSEs calculated for different video sequences. In terms of the average MSE, the results obtained by using the ATMMS method are consistently better than those of the DS method.
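The SAD of Eq. (2), the full-search baseline it is compared against, and the 2 x 2 averaging of Step 1 can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' code: frames are lists of rows indexed `[row][column]`, the function names `sad`, `full_search` and `coarse_level` are hypothetical, and candidates whose block would fall outside the reference frame are simply skipped.

```python
from typing import List, Tuple

Frame = List[List[float]]


def sad(cur: Frame, ref: Frame, x: int, y: int, u: int, v: int, n: int = 16) -> float:
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is column x, row y, and the block of `ref` displaced
    by u columns and v rows."""
    total = 0.0
    for j in range(n):
        for i in range(n):
            total += abs(cur[y + j][x + i] - ref[y + j + v][x + i + u])
    return total


def full_search(cur: Frame, ref: Frame, x: int, y: int, w: int,
                n: int = 16) -> Tuple[Tuple[int, int], float, int]:
    """Test every in-frame offset with |u|, |v| <= w. Returns the best
    offset (u, v), its SAD, and the number of candidates evaluated."""
    height, width = len(ref), len(ref[0])
    best, best_sad, tested = (0, 0), float("inf"), 0
    for v in range(-w, w + 1):
        for u in range(-w, w + 1):
            # Assumption: skip candidates that leave the reference frame.
            if not (0 <= x + u <= width - n and 0 <= y + v <= height - n):
                continue
            tested += 1
            s = sad(cur, ref, x, y, u, v, n)
            if s < best_sad:
                best, best_sad = (u, v), s
    return best, best_sad, tested


def coarse_level(frame: Frame) -> Frame:
    """Mean of non-overlapping 2 x 2 pixel groups (4:1 subsampling)."""
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * r][2 * c] + frame[2 * r][2 * c + 1]
              + frame[2 * r + 1][2 * c] + frame[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]
```

For a block far enough from the frame border, `full_search` with `w = 15` evaluates all 31 x 31 = 961 candidates, which is the FS cost quoted in the text; the fast methods differ only in which of these offsets they choose to pass to `sad`.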
Figure 5 shows the numbers of search points used to calculate the SAD in each frame. It is clear that the number of search points of the ATMMS method is significantly reduced from that of the FS method (961 search points). In addition, its computational complexity is consistently lower than that of the DS method. Table 3 presents the average numbers of search points for the different testing video sequences. The computational complexity can also be quantified by the speed-up ratio, which is the ratio between the numbers of search points of the FS and of the tested method:

$$\text{Speed-up ratio} = \frac{N_{FS}}{N_{avg}} \qquad (3)$$

where $N_{FS}$ is the number of search points per block for the FS method, i.e. $(2w+1)^2$ (in this paper, $w = 15$), and $N_{avg}$ is the average number of search points per block for the fast block-matching algorithm being tested. Table 4 tabulates the speed-up ratios given by Eq. (3). It can be seen from Figures 4 and 5 that the ATMMS performs very competitively in terms of low block MSE distortion while reducing the number of search points by more than 50% relative to the DS for fast-changing sequences such as the "tennis" and the "foreman" sequences. The speed improvement is also quite substantial for sequences containing large quantities of small motions, such as the "tempete" sequence. The best performance in terms of MSE is achieved with fast camera panning sequences such as the "mobile" sequence (see Figure 4(c)).

4. CONCLUSION

In this paper, a new search method, ATMMS, for block-matching motion estimation in video compression is proposed. It incorporates both spatial and temporal information in the prediction of the search starting point in order to reduce the number of search points. Multiscale search is used to prevent the search from being trapped in local minima and is suitable for large moving objects. The search is divided into four different modes based on the set of predicted starting points. Simulation results show that the proposed ATMMS method yields an MSE similar to that of the FS method and occasionally outperforms the DS method. It also significantly reduces the computational complexity of the DS method, by 20 to 50%.

5. REFERENCES

[1] Abdul H. Sadka. “Compressed Video Communications”. John Wiley & Sons Ltd, 2002.
[2] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro. “Motion compensated interframe coding for video conferencing”. pages G5.3.1–G5.3.5, Nov. 29-Dec. 3 1981.
[3] L. M. Po and W. C. Ma. “A novel four-step search algorithm for fast block motion estimation”. IEEE Trans. Circuits Syst. Video Technol., 6:313–317, June 1996.
[4] M. Ghanbari. “The cross-search algorithm for motion estimation”. IEEE Trans. Commun., 38:950–953, July 1990.
Table 2: Average MSE of different video sequences.
Sequence      FS        DS        ATMMS
Garden        82.7856   83.5344   83.2886
Mobile        77.7165   78.6313   77.7272
Coastguard    47.1426   47.5135   47.3279
Tempete       57.3849   58.2749   57.6580
Tennis        34.5140   34.6368   34.6160
Foreman       33.3816   34.4195   34.0811
[5] L. K. Liu and E. Feig. “A block-based gradient descent search algorithm for block motion estimation in video coding”. IEEE Trans. Circuits Syst. Video Technol., 6:419–423, Aug. 1996.
[6] R. Li, B. Zeng, and M. L. Liou. “A new three-step search algorithm for block motion estimation”. IEEE Trans. Circuits Syst. Video Technol., 4:438–442, Aug. 1994.
[7] Shan Zhu and Kai-Kuang Ma. “A new diamond search algorithm for fast block-matching motion estimation”. IEEE Trans. Image Processing, 9:287–290, February 2000.
Table 3: Average number of search points per block with respect to different video sequences.
Sequence      FS      DS      ATMMS
Garden        961.0   19.93   15.70
Mobile        961.0   22.38   12.74
Coastguard    961.0   20.06   16.12
Tempete       961.0   23.59   11.66
Tennis        961.0   23.45   11.51
Foreman       961.0   24.06   15.22
[8] John C.-H. Ju, Yen-Kuang Chen, and S. Y. Kung. “A fast rate-optimized motion estimation algorithm for low-bit-rate video coding”. IEEE Trans. Circuits Syst. Video Technol., 9:994–1002, October 1999.
[9] B. Liu and A. Zaccarin. “New fast algorithms for the estimation of block motion vectors”. IEEE Trans. Circuits Syst. Video Technol., 3:54–70, April 1993.
[10] Jo Yew Tham, Surendra Ranganath, Maitreya Ranganath, and Ashraf Ali Kassim. “A novel unrestricted center-biased diamond search algorithm for block motion estimation”. IEEE Trans. Circuits Syst. Video Technol., 8:369–377, Aug. 1998.
Table 4: Speed up ratios over the FS method with different video sequences.
Sequence      FS    DS        ATMMS
Garden        1.0   48.2188   61.2102
Mobile        1.0   42.9401   75.4317
Coastguard    1.0   47.9063   59.6154
Tempete       1.0   40.7376   82.4185
Tennis        1.0   40.9808   83.4926
Foreman       1.0   39.9418   63.1406
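As a quick consistency check, the speed-up ratios of Table 4 can be recomputed from the average search-point counts of Table 3 using the ratio of Eq. (3), taking 961 = (2 x 15 + 1)^2 as the FS cost per block. The sketch below is only an illustration; the names `FS_POINTS`, `AVG_POINTS` and `speed_up` are hypothetical, and the numbers are copied from Table 3.

```python
# FS evaluates every offset in a +/-15 window: (2 * 15 + 1) ** 2 = 961 points.
FS_POINTS = (2 * 15 + 1) ** 2

# Average search points per block from Table 3: sequence -> (DS, ATMMS).
AVG_POINTS = {
    "Garden": (19.93, 15.70),
    "Mobile": (22.38, 12.74),
    "Coastguard": (20.06, 16.12),
    "Tempete": (23.59, 11.66),
    "Tennis": (23.45, 11.51),
    "Foreman": (24.06, 15.22),
}


def speed_up(avg_points: float) -> float:
    """Speed-up over full search: FS points divided by average points."""
    return FS_POINTS / avg_points


if __name__ == "__main__":
    for name, (ds, atmms) in AVG_POINTS.items():
        print(f"{name:<11}{speed_up(ds):9.4f}{speed_up(atmms):9.4f}")
```

For example, `speed_up(15.70)` gives about 61.2102 and `speed_up(19.93)` about 48.2188, matching the Garden row of Table 4.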
Figure 4: Comparison of MSE for FS, DS and ATMMS using (a) Tempete (frames 1-140), (b) Foreman (frames 1-180), (c) Mobile (frames 60-190), (d) Garden (frames 10-80), (e) Coastguard (frames 140-260) and (f) Tennis (frames 40-100). Each panel plots MSE against frame number.
Figure 5: Comparison of search points for FS, DS and ATMMS using (a) Tempete (251 frames), (b) Foreman (250 frames), (c) Mobile (200 frames), (d) Garden (121 frames), (e) Coastguard (299 frames) and (f) Tennis (299 frames). Each panel plots the number of search points against frame number, with curves for DS and ATMMS.