Probabilistic Confidence Measures for Block Matching Motion Estimation

Ioannis Patras, Member, IEEE, Emile A. Hendriks, and Reginald L. Lagendijk, Senior Member, IEEE
Abstract— This paper addresses the problem of deriving measures that express the degree of reliability of motion vectors estimated by a block-matching motion estimation method. We express the block matching motion estimation scheme in a probabilistic framework as a Maximum Likelihood estimation scheme. Subsequently, we derive the confidence measures in terms of the a posteriori probabilities and the likelihoods of the estimated vectors. The assumptions about the type of the likelihood, that is, about the underlying conditional probability distribution of the motion compensated intensity differences (e.g. Laplacian), are derived from the objective criterion of the block-matching estimator. All parameters are estimated from data that are derived as a by-product of the motion estimation scheme and our method, practically, introduces no additional computational cost. The derivation of the confidence measures is incorporated in a multiscale scheme. Experimental results are presented for image sequences with known ground-truth motion.

Index Terms— Block matching, motion estimation, probabilistic confidence measures, Maximum Likelihood estimation.
Manuscript received 28 April 2004; revised 04 September 2006. Ioannis Patras is with the Department of Electronic Engineering, Queen Mary, University of London (email: [email protected]). Emile A. Hendriks and Reginald L. Lagendijk are with the Information and Communication Theory Group, Delft University of Technology, The Netherlands (e-mail: {E.A.Hendriks, R.L.Lagendijk}@tudelft.nl).

I. INTRODUCTION

One of the issues that often arise in the area of motion estimation is that of the reliability of the estimated motion field. Motion estimators are known to be prone to errors originating from a variety of sources, such as occlusion phenomena, absence of texture, and support regions that straddle motion discontinuities. Confidence measures have been developed from the realization that it is possible to quantify the reliability of the estimation of a motion vector without explicitly identifying the source of error. The usage of such a quantification is twofold. On the one hand, it can be incorporated in the estimation procedure itself in order to increase the estimation accuracy [6] [18]. On the other hand, it can provide useful information to the process in which the motion field is intended to be used. Lundmark et al [8] use motion vector certainty in order to reduce the bit rate in video coding, Altunbasak et al [1] in order to discard unreliable motion vectors in a regression scheme, Sand and McMillan [12] in order to select good features to track, and Patras [10] used it as a weighting factor in a robust regression scheme in the context of motion-based segmentation. A number of methods have been developed in order to express the degree of confidence in an estimated motion field (e.g. [4] [13] [17]). In the first category belong methods that
were developed having in mind motion estimation techniques that rely on the optical flow constraint. Although these methods are very significant, their analysis is tailored to the optical flow constraint and therefore they are not applicable to the dominant motion estimation scheme for video technologies, the block matching motion estimation scheme. In this category, Simoncelli et al [14] formulate the motion estimation problem in a probabilistic framework and derive a scalar confidence measure as the trace of the covariance matrix of the a posteriori probability of the motion vector. The latter is derived by modeling the prior distribution of the motion vector, the noise in the estimation of the intensity derivatives, and the discrepancy between the "true" motion field and the apparent motion field. Dev et al [4] arrive at a similar measure by performing an error analysis of the assumed image motion model. In a second category belong methods that analyze the orientation of the spatial image derivatives within a block. Barron et al [3] and Ghosal and Vanek [6] derive confidence measures in terms of the eigenvalues of the covariance matrix of the spatial intensity derivatives. In a similar approach, Yoshida et al [18] propose a measure which quantifies the sensitivity of the block-based motion estimator in certain directions. Both [6] and [18] incorporate the confidence measures in the motion estimation scheme, either to impose anisotropic smoothness constraints ([6]) or to merge blocks with the same directional sensitivity in order to reliably estimate their motion ([18]). The main drawback of such methods is that their measures do not express the confidence in an estimated motion field, but rather the expected sensitivity of the motion estimation scheme in certain directions. Such measures cannot be used for the validation of motion vectors, for example, in occluded areas. In a third category belong methods that, like Lundmark et al [8], derive the confidence measure as the (weighted) sum of the motion compensated intensity differences in the block in question. The main drawback of such methods is that such a measure does not take into consideration the statistics of the motion compensated intensity differences. For example, it assigns high confidences to blocks in areas with low intensity variation, even though precisely at such areas the confidence should be low (aperture problem). Anandan [2] very early proposed measures that express the confidence in the estimated motion vector in certain directions. However, it is difficult to use his measures [3] since they a) do not have any interpretation in terms of their magnitude and b) require manual parameter tuning. Recently, To et al [16] proposed a confidence measure for a block-based scheme that relies on a phase-based matching error. Finally, Fermuller et al [5] provide an analysis of the biases of both gradient-based
and correlation-based methods based on different assumptions about noise in the measurements.

In this paper, an early version of which appears in [11], we concentrate on a Hierarchical Block Matching motion estimation scheme and derive confidence measures that express the confidence in the estimated motion field. Like Simoncelli et al [14], we formulate the motion estimation problem in a probabilistic framework. We estimate the a posteriori probability of the motion vector by an estimation of the prior distribution of the intensity, which is provided as a by-product of the search scheme. We incorporate our work in a multiscale scheme and, in the Bayesian framework, derive the confidence measures in terms of the a posteriori probabilities at the different levels of the hierarchy and the likelihood at the finest level. The type of the conditional probability distribution of the motion compensated intensity differences is derived from the objective criterion of the block-based motion estimator and all parameters are estimated from the data itself. The data are provided as a by-product of the motion estimation procedure and our method does not require the estimation of spatial or temporal derivatives, which are sensitive to noise.

The remainder of the paper is organized as follows. In section II the block matching motion estimation scheme is expressed in a probabilistic framework as a Maximum Likelihood estimator. Based on the objective criterion of the block matching estimator we derive the type of the likelihood, that is, the type of the probability distribution of the motion compensated intensity differences. Subsequently, we derive the confidence measures of the motion vectors in section III and in section IV we address computational issues. In section V we present experimental results for image sequences for which the motion is known and in section VI conclusions are drawn. Finally, in appendix A we concisely describe the image sequences that were used in the experimental results section.

II. BLOCK-BASED MOTION ESTIMATION IN THE PROBABILISTIC FRAMEWORK
Block-based motion estimators belong to a general class of estimators that utilize the Intensity Conservation Principle, the latter implying that a pixel and its correspondence in a successive frame are expected to have the same intensity value. They attempt to overcome the ill-posedness of the correspondence problem by adopting a support region in the form of a block, and estimate a motion vector for the whole block. The motion vector is estimated as the one that minimizes an objective criterion which, typically, is either the Mean Absolute Displaced Block Difference or the Mean Square Displaced Block Difference. While the analysis that follows is based on the former objective criterion, it is almost straightforward to derive the confidence measures for the latter. Formally, the motion vector v̂_i is estimated for each block B_i such that:

v̂_i = arg min_v D_i(v),   (1)

where

D_i(v) = Σ_{j∈B_i} |I(j) − I^-(j − v)|.   (2)

With i we denote the pixel in the center of the block, and with B_i the set of the pixels in the block. With I and I^- we denote the image intensities in the current and previous frame respectively.

Remark 1: The block-based motion estimator that minimizes the Sum of Absolute Differences is equivalent to a Maximum Likelihood estimator which assumes that the motion compensated intensity differences follow independent Laplacian distributions,¹ that is, under the assumption that p(I^-(j − v_i) | v_i = v̂, I(j)) = (λ/2) e^{−λ|I(j) − I^-(j − v̂)|}.

Proof:

v̂_i = arg min_v D_i(v) = arg max_v (λ/2)^{|B_i|} e^{−λ D_i(v)}   (3)
    = arg max_v Π_{j∈B_i} (λ/2) e^{−λ|I(j) − I^-(j − v)|}   (4)
    = arg max_v Π_{j∈B_i} p(I^-(j − v_i) | v_i = v, I(j))   (5)
    = arg max_v P(I^- | I, v_i = v),   (6)
where the dependence on the level h of the multiscale scheme is omitted for notational simplicity. The parameter λ in eq. 4 is related to the deviation of the Laplacian; thus, it does not influence the location of the minimum but only the "width" of the distribution. On the other hand, the larger the deviation, the lower the relative importance of the differences in the objective criterion. The latter is depicted in Fig. 1, where the ratio of the likelihoods (P(I^-|I, v_i = v)) of two different candidate motion vectors is drawn as a function of λ^{-1}. Each curve depicts the likelihood ratio for a certain value of the difference in the corresponding D(v)s (i.e. d = D(v1) − D(v2), for different d). It is apparent that the larger the deviation, the closer the likelihood ratio is to one. Thus, the larger the deviation, the lower our confidence in the candidate motion vector that generates the smaller of the D(v). Let us note that the likelihood ratio can be useful as a confidence measure only when precisely two candidate vectors are available. It is introduced here mainly in order to illustrate the importance of a good estimation of λ.
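To make eqs. (1)-(2) concrete, the following minimal sketch shows a full-search SAD block matcher that returns both the minimizing vector of eq. (1) and the table of D_i(v) values for all evaluated candidates; the latter is the by-product that the confidence measures of section III reuse. The function name, block size and search radius are illustrative choices and not part of the paper.

import numpy as np

def block_match(I_prev, I_cur, top, left, block=16, radius=7):
    """Full-search SAD matching (eqs. 1-2) for the block of I_cur whose
    top-left corner is (top, left). Returns the minimizing displacement
    v_hat = (dy, dx) and a dict mapping every candidate v to D_i(v)."""
    H, W = I_prev.shape
    cur = I_cur[top:top + block, left:left + block].astype(np.float64)
    D = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Per eq. (2) the reference block sits at j - v in the previous frame.
            y, x = top - dy, left - dx
            if 0 <= y and y + block <= H and 0 <= x and x + block <= W:
                ref = I_prev[y:y + block, x:x + block].astype(np.float64)
                D[(dy, dx)] = np.abs(cur - ref).sum()   # D_i(v): sum of absolute differences
    v_hat = min(D, key=D.get)                            # eq. (1): arg min_v D_i(v)
    return v_hat, D

A hierarchical estimator would call such a routine at every level h on filtered and subsampled versions of the frames.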
Fig. 1. Likelihood ratio of two candidate motion vectors v1 and v2 as a function of the inverse of λ. Curves are drawn for different values of the difference in the corresponding D(v) (d = D(v1) − D(v2), for d = 0.3, 1, 2, 5).

¹ Similarly, the block-based estimator under the SSD criterion is equivalent to the ML estimator when the differences follow i.i.d. Gaussians.
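As a small illustration of the behaviour shown in Fig. 1 (our own worked example, not code from the paper): from eq. (4), the likelihood ratio of two candidates with d = D(v1) − D(v2) is exp(−λ d), which approaches one as the deviation λ^{-1} grows.

import numpy as np

# Likelihood ratio exp(-lambda * d) for the curve values of Fig. 1.
for d in (0.3, 1.0, 2.0, 5.0):              # the four curves of Fig. 1
    for inv_lambda in (0.5, 1.0, 2.0, 3.0):  # points on the x-axis (1/lambda)
        ratio = np.exp(-d / inv_lambda)
        print(f"d={d:3.1f}  1/lambda={inv_lambda:3.1f}  ratio={ratio:.3f}")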
The usual assumption about λ is that it is the same for every block, or even that it is the same for all image sequences. In what follows we will make the hypothesis that λ depends on the local intensity variation. We will experimentally show that such a hypothesis provides by far better estimates of the deviation of the distribution of the motion compensated intensity differences. Our hypothesis is that λ is correlated with the amount of texture in the block. We model it as being inversely linear with the standard deviation of the intensity in the block in question, that is
λ_{B_i} = β / σ_{B_i},   (7)
where β is a model parameter that will be estimated from data that result as a by-product of the search scheme. Although the linearity of the proposed modelling is not guaranteed, it is to be expected that the higher the degree of texture, the higher the variation in the observed motion compensated intensity differences. We have tested the suitability of the proposed linear model on a number of sequences. Here we present results for two of them (Y1 and R1), which were chosen since the degree of texture and the motion characteristics vary significantly, as explained in Section V. In Fig. 2 we present the inverse of an estimation of λ as a function of the within-block deviation of the intensity σ_B. In the same figure we present the lines fitted with the Least Squares criterion, on the one hand under our modeling (eq. 7) and on the other hand under the usual assumption that λ is constant. It is apparent that λ is highly correlated with the within-block intensity deviation and that our modeling follows much more closely the true statistics. Furthermore, observe the differences in the values of the parameters of the fitted lines between Fig. 2(b) and Fig. 2(a). This advocates the need for the re-estimation of the model parameter β for different sequences. Finally, let us note that we have adopted a linear model (eq. 7) for its simplicity and due to the fact that it experimentally proved to follow the true statistics well. The fact that we adopt a linear model does not influence the derivations below and it can be replaced by any other model whose parameters can be estimated efficiently from the data in a way similar to the way that β is estimated (eq. 12), as described in section III.

Fig. 2. The inverse of the estimation of λ (i.e. λ̂^{-1}) as a function of the within-block standard deviation of the intensity σ_B, and the Least Squares line fitting under the assumptions i) that λ is constant and ii) that it is inversely proportional to σ_B. (a) Y1: fitted lines λ^{-1} = 0.7 (constant) and λ^{-1} = 0.2 σ_B (proposed); (b) R1: fitted lines λ^{-1} = 0.06 (constant) and λ^{-1} = 0.02 σ_B (proposed).

III. CONFIDENCE MEASURES

The block-matching motion estimator minimizes eq. 1 and, equivalently, as shown in the proof of Remark 1, maximizes the likelihood:

P(I^- | v_i = v, I) = (λ_{B_i} / 2)^{|B_i|} e^{−λ_{B_i} D_i(v)}.   (8)

It does so with a search scheme that evaluates different candidate motion vectors (v) in terms of the objective criterion. In what follows, we will utilize the data provided by these evaluations in order to estimate the a posteriori probability of the best (and therefore chosen) candidate motion vector (v̂_i). By using Bayes' theorem and the theorem of total probability [9] we have that

P(v_i = v̂_i | I^-, I) = P(I^- | v_i = v̂_i, I) P(v_i = v̂_i) / Σ_v P(I^- | v_i = v, I) P(v_i = v),   (9)
where with P(v_i = v) we denote the a priori probability of a motion vector. By an appropriate modeling we can incorporate domain knowledge and/or utilize smoothness constraints. However, such issues are not addressed in the classical block matching algorithm. Therefore they are also not considered in our analysis, which assumes a uniform a priori distribution. Let us for notational simplicity denote with g_i the a posteriori probability. Then, substituting eq. 8 in eq. 9 we obtain that:

g_i = [1 + Σ_{v≠v̂_i} e^{−λ_{B_i}(D_i(v) − D_i(v̂_i))}]^{-1}   (10)

    = [1 + Σ_{v≠v̂_i} e^{−(β/σ_{B_i})(D_i(v) − D_i(v̂_i))}]^{-1},   (11)
where the parameter β is assumed to be the same for each block. It is estimated as

β̂ = E_{B_i} [ σ_{B_i} / D_i(v̂_i) ],   (12)

that is, as the mean value of the ratio of the standard deviation of the intensity in block B_i to D_i(v̂_i). Clearly, the mean is estimated over all blocks B_i. Note that the estimation scheme of eq. 12 is with the Least Squares criterion, but more robust estimates can be easily obtained.
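A minimal sketch of eqs. (7), (11) and (12), assuming that the candidate differences D_i(v) and the within-block intensity deviations σ_{B_i} have been stored as by-products of the search; the variable names and the small guards against division by zero are our own additions.

import numpy as np

def estimate_beta(sigmas, D_best):
    """eq. (12): beta is the mean over all blocks of sigma_{B_i} / D_i(v_hat_i)."""
    sigmas = np.asarray(sigmas, dtype=np.float64)
    D_best = np.asarray(D_best, dtype=np.float64)
    return np.mean(sigmas / np.maximum(D_best, 1e-12))   # guard against a perfect match (D = 0)

def posterior(D_candidates, sigma_B, beta):
    """eq. (11): g_i = [1 + sum_{v != v_hat} exp(-(beta/sigma_B)(D_i(v) - D_i(v_hat)))]^{-1}."""
    D = np.asarray(list(D_candidates.values()), dtype=np.float64)
    D_hat = D.min()
    lam = beta / max(sigma_B, 1e-12)          # eq. (7): lambda_{B_i} = beta / sigma_{B_i}
    terms = np.exp(-lam * (D - D_hat))
    return 1.0 / terms.sum()                  # the v = v_hat term contributes exp(0) = 1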
Loosely speaking, the a posteriori probability, as defined in eq. 11, gives a measure of how prominent the local minimum at v̂_i is, in terms of how much lower D_i(v̂_i) is in comparison to the D_i(v) of the other candidate motion vectors². In order to arrive at a useful confidence measure we need to address the two following issues. The first issue is that, while the minimum at v̂_i might be prominent in comparison to the D_i(v) of the other candidate motion vectors, it might be high in absolute value. This occurs often, for example, at areas that are occluded and at which "prominent" minima but with high D_i(v) are observed. The second issue is that block matching motion estimation usually employs a multiscale scheme in order to be able to reliably estimate motions large in magnitude. The a posteriori probabilities that we have derived can provide a measure of confidence in the estimation at a single level of the hierarchy. Clearly, there is a need to combine the evidence provided at each level in order to derive a measure that expresses the confidence in the multiscale motion estimation.

Let us denote with g_i^h the a posteriori probability P(v_i^h = v̂_i^h | I^-, I) of the component of the motion vector estimated at level h. If the estimations at each level were independent, the probability that all of the v̂_i^h (1 ≤ h ≤ H) are estimated correctly would be equal to the product of the g_i^h. However, this is not the case since a) the search at the lower level h − 1 is initialized around the motion vectors that are estimated at the higher level h and b) the data (i.e. the images) at each level are filtered and subsampled versions of the ones at the lower level. Furthermore, we do not expect that our estimation of the a posteriori probability is very accurate, since it depends only on a few candidate motion vectors. A multiplication of the a posteriori probabilities would enhance the errors made at each level. For classification purposes, Tax et al [15] and Kittler et al [7] address similar issues when combining a posteriori probabilities estimated from different data. From theoretical and experimental analysis, they conclude that it is better to average the a posteriori probabilities instead of multiplying them when the data are strongly correlated and the estimation of the a posteriori probabilities is not accurate. Experimentally [11], we also came to the conclusion that in our case the average of the g_i^h (for 1 ≤ h ≤ H) is slightly more stable than their product. Formally, the confidence measure is defined as:

c_i = (P(I^- | v_i^1 = v̂_i^1, I) / H) Σ_{h=1}^{H} g_i^h,   (13)
where the first term (i.e. the likelihood at the finest level, P(I^- | v_i^1 = v̂_i^1, I)) gives a measure of how low the minimum of D_i(v̂_i) is. Finally, let us note that we introduced the assumption that the value of λ is inversely linear with the local intensity variation in the last step of the derivations of the confidence measures (i.e. eq. 9 to eq. 11). Therefore, our method can be easily adapted to another modeling of λ, as long as an estimation scheme for the parameters of the model can be provided in place of eq. 12.

² Note that the selection of the candidate vectors, and therefore the estimation of g_i, depends on the search scheme that the particular block matching motion estimator employs.
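The multiscale combination of eq. (13) can then be sketched as follows, assuming the per-level posteriors g_i^h have been computed as above and the finest-level likelihood is taken literally from eq. (8). Note that for realistic block sizes the raw likelihood of eq. (8) is numerically tiny, so a practical implementation might work in the log domain; this sketch keeps the formula as stated.

import numpy as np

def confidence(g_levels, D_best_finest, sigma_B_finest, beta, block_size):
    """eq. (13): c_i = P(I^- | v_i^1 = v_hat_i^1, I) / H * sum_h g_i^h."""
    H = len(g_levels)                                   # number of levels in the hierarchy
    lam = beta / max(sigma_B_finest, 1e-12)             # eq. (7) at the finest level
    n = block_size * block_size                         # |B_i|
    likelihood = (lam / 2.0) ** n * np.exp(-lam * D_best_finest)   # eq. (8); may underflow for large n
    return likelihood / H * np.sum(g_levels)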
IV. COMPUTATIONAL ISSUES

The a posteriori probabilities at each level h are estimated from the Displaced Block Differences for each candidate motion vector. Except for the computationally inexpensive estimation of the deviation of the intensity within each block, the rest of the data that are used are by-products of the search scheme. Therefore, from the data acquisition point of view we, practically, do not introduce any additional computational burden. On the other hand, the estimation of the confidence measures is not possible until an estimate of the value of β is available. Practically, this means that the Displaced Block Differences need to be stored until β has been estimated. The additional memory requirements depend on the cardinality of the set of motion vector candidates that the particular motion estimator employs; at each level h they are proportional to the number of candidates (N^h) multiplied by the number of blocks (|{B_i}|) at that level, that is, to

N^h |{B_i}|.   (14)

In the case that memory restrictions do not allow such a scheme we can resort to an approximation. More specifically, we can estimate the g_i^h's using the value of β estimated for the previous frame and simultaneously estimate the value of β that will be used for the estimation of the g_i^h's for the next frame. Such an approach, which assumes that the statistics of the motion compensated intensity differences change slowly in subsequent frames, introduces no additional computational costs or delays.

V. EXPERIMENTAL RESULTS

In order to examine the validity of the derived measure we have conducted a number of experiments with synthetically generated data. Here, we will present results for the image sequences C1, R1, Y1, S5 (Appendix A) and for the well-known "yosemite" image sequence. Each of them exhibits different characteristics in terms of the amount of texture and the motion magnitude. The multiscale motion estimator was extended to half pixel accuracy and for the above mentioned sequences 2, 1, 4, 5 and 3 levels were used respectively. Overlapping blocks were used and dense motion fields were derived. Finally, in order to show the influence of the a posteriori probability terms (i.e. of the g_i^h), we provide comparative results with a scheme that estimates the confidence measure in terms only of the likelihood at the lowest level (i.e. P(I^- | v_i^1 = v̂_i^1, I)). The original frames of the image sequences as well as the corresponding estimated motion fields are presented in appendix A.

Let us denote with ṽ_i the true motion vector at block i. In Fig. 3 we present the norm L1 of the true estimation error (i.e. ‖v̂_i − ṽ_i‖_1) as a function of the rank of the c_i's (100 pixels with consecutive ranks are used for each point in the plot). It is apparent that the derived confidence measure is largely correlated with the true estimation error for all of the test sequences. As Fig. 3 reveals, higher rank according to c_i implies, to a large extent, small estimation error. In order to demonstrate the accuracy of the derived measure we present in Fig. 3 the curve for the image sequence S5 obtained with
the best possible ranking measure, that is, the true estimation error itself ("S5 Ideal"). The curve S5, derived by the ranking according to the proposed confidence measure, follows the "S5 Ideal" curve quite closely. Finally, in Fig. 4(a) and Fig. 4(b) we present the norm L1 of the error as a function of the confidence measure. It is clear that the derived confidence measure adapts well to the presented sequences even though the type and magnitude of motion, the degree of texture and the degree of presence of occlusions differ significantly. For all of the sequences the only parameter (i.e. β) was estimated from the data by eq. 12.

In order to illustrate the localization accuracy, we present in Fig. 5 and Fig. 6 the norm L1 of the estimation error and the confidence measures as images for the sequences R1 and S5 respectively. For visibility purposes the images are linearly stretched. It is apparent that although R1 and S5 are very different sequences, for both of them the structure of the image of confidence measures follows very closely the structure of the true error in the motion estimation. Note that the large areas with low confidence in the S5 sequence are due to occlusions that originate from motions large in magnitude.

Finally, in order to illustrate the influence of the a posteriori terms in the proposed confidence measure, we present comparative results with a method that derives the confidence measure as the likelihood at the lowest level (i.e. c_i = P(I^- | v_i^1 = v̂_i^1, I)). We performed experiments under our modeling of eq. 7, in which λ varies inversely with the within-block intensity deviation, and under the usual assumption that λ is the same for all blocks. In Fig. 8 we present results for the R1 and the "yosemite" image sequences. It is apparent that the proposed confidence measure
clearly outperforms the likelihood. For the "yosemite" image sequence (Fig. 8(a)) the true error decreases as a function of the proposed confidence measure. In comparison, it decreases as a function of the likelihood only for low likelihoods, that is, at the far left part of Fig. 8(a). For the rest of the graph, the true error seems to increase as a function of the likelihood. Similarly, for the R1 image sequence (Fig. 8(b)) it is clear that although at the left part of Fig. 8(b) the error decreases nicely as the likelihood increases, at the right part (areas with higher likelihood) the true error increases abruptly. Finally, in order to illustrate the influence of the proposed modeling of λ, in Fig. 8(c) we present results for a confidence measure that is defined as the likelihood under the (usual) assumption that λ is constant. It is clear that there is correlation with the true error only at a very small portion of the left part of the graph, that is, only for blocks whose likelihood is very low. For the rest of the graph the likelihood seems to carry very little (if any) information about the true error.

Fig. 3. Norm L1 of the true error in motion estimation (in pixels) as a function of the rank of the confidence measure c: (a) C1 and R1; (b) S5 and Y1. For "S5 Ideal" the ranking is based on the true error instead of c.

Fig. 4. Norm L1 of the estimation error (in pixels) as a function of the confidence measure c: (a) C1 and R1; (b) S5 and Y1.

Fig. 5. True error in motion estimation and confidence measure for image sequence R1: (a) inversely stretched error norm L1 (high intensities indicate low true error); (b) stretched confidence values (high intensities indicate high confidence values).

Fig. 6. True error in motion estimation and confidence for image sequence S5: (a) inversely stretched error norm L1 (high intensities indicate low true error); (b) stretched confidence values (high intensities indicate high confidence values).

Fig. 7. Results for the "yosemite" image sequence: (a) Frame 2; (b) norm L1 of the estimation error (in pixels) as a function of the confidence measure c.

Fig. 8. Comparative results: norm L1 of the estimation error (in pixels) as a function of the rank of different confidence measures c: (a) "yosemite" image sequence; (b) "R1" image sequence; (c) "yosemite" image sequence, only likelihood (constant λ).
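For completeness, a short sketch of the evaluation protocol behind Fig. 3: pixels are ranked by the confidence measure c and the mean L1 motion error is reported over groups of 100 consecutive ranks. The function below is our own illustrative reconstruction of that protocol, not code from the paper.

import numpy as np

def error_vs_confidence_rank(conf, err_l1, bin_size=100):
    """Rank pixels by confidence (highest first) and return the mean L1
    motion error per group of 'bin_size' consecutive ranks (cf. Fig. 3)."""
    conf = np.asarray(conf, dtype=np.float64).ravel()
    err_l1 = np.asarray(err_l1, dtype=np.float64).ravel()
    order = np.argsort(-conf)                     # descending confidence
    err_sorted = err_l1[order]
    n_bins = len(err_sorted) // bin_size
    return np.array([err_sorted[k * bin_size:(k + 1) * bin_size].mean()
                     for k in range(n_bins)])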
VI. CONCLUSIONS

In this paper we have proposed new confidence measures for the block-matching motion estimator. We have expressed the block matching motion estimator in a probabilistic framework and derived the confidence measures in terms of the a posteriori probabilities and the likelihoods of the estimated vectors. Very good experimental results were obtained for image sequences with varying textures and motion patterns and magnitudes. The contributions of the paper can be summarized as follows:
• We propose a confidence measure that is well defined and clearly interpreted in the probabilistic framework.
• Our modeling of the parameter λ captures the statistics of the intensity differences much more accurately than the usual assumption that λ is a constant.
• All parameters are estimated from the data itself and, practically, the method introduces no additional computational burden.

VII. APPENDIX A
This appendix contains a short description of the synthetic image sequences that are used in this paper. Each sequence consists of three frames. Since the model-generated motion fields are real-valued, bicubic interpolation of the image intensities was used.

Translational Motion of the Background (Image sequence C1): In this experiment, we generate an image sequence in which the whole image is displaced by (5, 1) pixels per frame. The second frame of the sequence (part of the "Coastguard" image sequence) is depicted in Fig. 9(a) and the model-generated motion field (magnified by a factor of 2) in Fig. 9(c).

Translational Motion of the Background (Image sequence R1): In this experiment, we generate an image sequence in which the whole image is displaced by (1, 1) pixels per frame.
Fig. 9. Translational motion: "C1" and "R1" image sequences: (a) Frame 2 ("C1"); (b) Frame 2 ("R1"); (c) true motion field ("C1").
The second frame of the sequence (a frame of the "Rubic" image sequence) is depicted in Fig. 9(b). The main characteristic of the sequence is the lack of texture in large areas.

Affine Motion of the Background (Image sequence Y1): An image sequence is generated in which the background is displaced according to an affine parametric model (Table I). That is,

ṽ = [θ(1) θ(2); θ(4) θ(5)] i + [θ(3); θ(6)],   (15)

where θ contains the affine motion parameters and i are the block's central pixel coordinates. The second frame (a frame of the "yosemite" sequence) is depicted in Fig. 10(a) and the model-generated motion field (magnified by a factor of 2) in Fig. 10(b). The difficulties arise due to a) the relatively large extent of occlusions at the borders of the image and b) the violation of the assumption that the motion within each block is translational.

TABLE I
MOTION PARAMETERS FOR "Y1" AND "S5" IMAGE SEQUENCES

Image Sequence | Region     | θ(1) | θ(2)   | θ(3) | θ(4) | θ(5) | θ(6)
"Y1"           | Background | 0.1  | 0      | 10   | 0    | 0.1  | 3
"S5"           | Background | 0.01 | −0.005 | 25   | 0.05 | 0    | 1
"S5"           | Object     | 0    | 0.03   | −2   | 0.1  | 0    | −13
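A small sketch of eq. (15) for generating the model ground-truth motion fields from the parameters of Table I; the (x, y) coordinate convention and the frame size used in the example call are our own assumptions for illustration.

import numpy as np

def affine_flow(theta, height, width):
    """eq. (15): v~(i) = [[th1, th2], [th4, th5]] i + [th3, th6],
    evaluated at every pixel coordinate i = (x, y) (convention assumed)."""
    t1, t2, t3, t4, t5, t6 = theta
    x, y = np.meshgrid(np.arange(width), np.arange(height))
    vx = t1 * x + t2 * y + t3
    vy = t4 * x + t5 * y + t6
    return vx, vy

# Example: background motion of the "Y1" sequence (first row of Table I);
# the 352 x 288 frame size is an illustrative choice.
vx, vy = affine_flow((0.1, 0, 10, 0, 0.1, 3), height=288, width=352)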
Fig. 10. "Y1" image sequence: (a) Frame 2; (b) model-generated motion field.
Two objects, large affine motions (Image sequence S5): An image sequence is generated in which the background and an object are displaced according to two different affine parametric models (Table I). The affine parameters are chosen so that the magnitude of motion is quite large, thus large occlusions are present. Due to occlusion phenomena, areas on the left and on the right of the boat are visible only at the second frame of the sequence. The second frame is
depicted in Fig. 11(a) and the model-generated motion field in Fig. 11(c). The object's mask in the second frame is depicted in Fig. 11(b).

Fig. 11. "S5" image sequence: (a) Frame 2; (b) object mask; (c) model-generated motion field.
REFERENCES

[1] Y. Altunbasak, P.E. Eren, and A.M. Tekalp. Region-based parametric motion segmentation using color information. Graphical Models and Image Processing, 60(1):13–23, Jan. 1998.
[2] P. Anandan. A computational framework and an algorithm for the measurement of visual motion. IJCV, 2:283–310, 1989.
[3] J. Barron, D. Fleet, and S. Beauchemin. Performance of optical flow techniques. IJCV, 12(1):43–77, 1994.
[4] A. Dev, B.J.A. Krose, and F.C.A. Groen. Confidence measures for image motion estimation. In RWC Symposium, pages 199–206, 1997.
[5] C. Fermuller, D. Shulman, and Y. Aloimonos. The statistics of optical flow. Computer Vision and Image Understanding, 82:1–32, 2001.
[6] S. Ghosal and P. Vanek. A fast scalable algorithm for discontinuous optical-flow estimation. PAMI, 18(2):181–194, Feb. 1996.
[7] J. Kittler, M. Hatef, R.P.W. Duin, and J. Matas. On combining classifiers. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(3):226–239, Mar. 1998.
[8] A. Lundmark, H. Li, and R. Forchheimer. Motion vector certainty reduces bit rate in backward motion estimation video coding. In SPIE VCIP, Jun. 2000.
[9] A. Papoulis. Probability, Random Variables, and Stochastic Processes. Electrical and Electronic Engineering Series. McGraw-Hill, 3rd edition, 1991.
[10] I. Patras. Object-based Video Segmentation with Region Labeling. PhD thesis, Delft University of Technology, Nov. 2001.
[11] I. Patras, E.A. Hendriks, and R.L. Lagendijk. Confidence measures for block matching motion estimation. In Int'l Conf. Image Processing, Sep. 2002. Rochester, NY, USA.
[12] P. Sand and L. McMillan. Efficient selection of image patches with high motion confidence. In Int'l Conf. Image Processing, 2002. Rochester, NY, USA.
[13] E.P. Simoncelli. Distributed Representation and Analysis of Visual Motion. PhD thesis, Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, 1993.
[14] E.P. Simoncelli, E.H. Adelson, and D.J. Heeger. Probability distributions of optical flow. In Proc. IEEE Int'l Conf. Computer Vision and Pattern Recognition, pages 310–315, June 1991. Maui, Hawaii.
[15] D.M.J. Tax, M. van Breukelen, R.P.W. Duin, and J. Kittler. Combining multiple classifiers by averaging or by multiplying? Pattern Recognition, 33(9):1475–1485, 2000.
[16] L. To, M. Pickering, M. Frater, and J. Arnold. A motion confidence measure from phase information. In Int'l Conf. Image Processing, volume IV, pages 2583–2586, Oct. 2004.
[17] J. Weber and J. Malik. Robust computation of optical flow in a multi-scale differential framework. International Journal of Computer Vision, 14(1), 1995.
[18] T. Yoshida, H. Katoh, and Y. Sakai. Block matching motion estimation using block integration based on reliability metric. In ICIP, volume 2, pages 152–155, 1997.