Robust optic-flow estimation with Bayesian inference of model and hyper-parameters

P. Héas, C. Herzet, E. Mémin

Fluminance, INRIA Rennes-Bretagne Atlantique, Campus universitaire de Beaulieu, 35042 Rennes, France
{patrick.heas,cedric.herzet,etienne.memin}@inria.fr
Abstract. Selecting optimal models and hyper-parameters is crucial for accurate optic-flow estimation. This paper solves the problem in a generic variational Bayesian framework. The method is based on a conditional model linking the image intensity function, the velocity field and the hyper-parameters characterizing the motion model. Inference is performed at three levels by considering maximum a posteriori problems on marginalized probabilities. We assessed the performance of the proposed method on image sequences of fluid flows and of the "Middlebury" database. Experiments show that applying the proposed inference strategy to very simple models yields better results than manually tuning the smoothing parameters or the discontinuity-preserving cost functions of classical state-of-the-art methods.

Key words: Motion modeling, marginalized posterior, Bayesian inference, regularization coefficients, robust hyper-parameters, cost functions.
1 Introduction
Choosing appropriate models and fixing hyper-parameters is a tricky and often hidden process in optic-flow estimation. Most motion estimators proposed so far rely on successive trials and an empirical strategy for fixing the hyper-parameter values and choosing an adequate model. Besides its computational inefficiency, this strategy may produce catastrophic estimates without any relevant feedback for the end-user, especially when motions are difficult to apprehend, as for instance for complex deformations or non-conventional imagery. Imposing hard values on these parameters may also yield poor results when the lighting conditions or the underlying motions differ from those the system has been calibrated with. At the extreme, the estimate may be either too smooth or, at the opposite, exhibit nonexistent strong motion discontinuities. Bayesian analysis has nevertheless been intensively studied in the past for hyper-parameter estimation and for model selection [1][2]. In particular, in the context of interpolation of noisy data, a powerful hierarchical Bayesian model has been proposed in the seminal work of [3]. In optic-flow estimation, state-of-the-art inference techniques [4, 5] remain limited since they select the weights of different model candidates rather than really selecting one and/or do not consider
model deviations from Gaussianity. Such non-Gaussian models are nevertheless very common in computer vision, where we have to cope with motion discontinuities and observation outliers due to noise or varying lighting conditions. Non-Gaussian robust statistics are commonly used to manage such problems. Another problem raised by the use of robust norms is the choice of their hyper-parameters, since in general they are parametrical models. This choice is crucial, and different tunings of these parameters can lead to motion estimates which are drastically different. Finally, although it is crucial for accurate motion measurement, very little attention has been devoted in the computer vision literature to the problem of model selection for optic-flow estimation. In particular, except in a particular case [6], no proper Bayesian formulation has been proposed in the literature for the selection of optimal optic-flow data and regularization models.

With the aim of solving this crucial problem, this work presents a generic Bayesian modeling framework for robust optic-flow estimation. It yields the design of non-parametrical estimation methods, able to reliably decide among several data and regularization models with optimal tuning of regularization coefficients and robust model hyper-parameters. The effectiveness of our approach is illustrated on challenging image sequences of turbulent diffusive flows and computer vision scenes. In particular, the proposed method achieves better performance with very simple models than classical state-of-the-art algorithms.

The notational conventions adopted in this paper are as follows. Italic lowercase indicates a scalar quantity, as in a; boldface lowercase indicates a vector quantity, as in a; the kth element of vector a is denoted a(k); capital boldface letters indicate matrices, as in A; the element corresponding to the ith row and jth column of A is denoted A(i, j); we use the notation Λa to define a diagonal matrix whose elements are those of vector a; calligraphic letters, e.g., A, represent the set of values that a variable or vector can take on; capital normal letters, as A, denote random variables.
2 Short overview of optic-flow estimation

2.1 Data and Prior Models
Let I : (s, t) ∈ S × T → I(s, t) ∈ R be an image intensity function, where S ⊆ R² (resp. T ⊆ R) is the image spatial (resp. temporal) domain. Moreover, let the optic flow be defined by a function v : (s, t) ∈ S × T → v(s, t) ∈ R² which associates a two-dimensional motion vector to every spatio-temporal position. Using this formalism, the optic-flow problem can then be restated as the problem of identifying v(s, t) from the (partial) knowledge of I(s, t). Note that, in practice, complexity and storage constraints often limit the estimation of the motion field to a finite subset Sr × Tr of S × T. We will consider this scenario hereafter and use the notation v to denote the vector made up of the concatenation of the v(s, t)'s, ∀ s ∈ Sr, ∀ t ∈ Tr. The estimation of the optic flow requires a mathematical characterization of the link between the image intensity and the motion field. One standard way to relate v(s, t) to I(s, t) is the so-called "Optic Flow Constraint"
(OFC),
\[ \frac{\partial I}{\partial t}(s,t) + \nabla_s^T I(s,t)\, v(s,t) = 0, \qquad \forall s \in S_r,\ \forall t \in T_r, \qquad (1) \]
which is valid under rigid motion and stable lighting hypotheses. For other configurations, many other models have been proposed in the literature to relate the image intensity function to the sought motion fields [7]. All these models obey the same general formulation:
\[ \Phi_{I,v}(s,t) = 0, \qquad \forall s \in S_r,\ \forall t \in T_r, \qquad (2) \]
where ΦI,v is an operator on I and v. In the sequel, we will refer to ΦI,v as the data model and use the notation Φ to denote the vector made up of the concatenation of the ΦI,v(s, t)'s, ∀ s ∈ Sr, ∀ t ∈ Tr. Note that an important family of data models is defined by linear operators, i.e.,
\[ \Phi = A_\Phi v + b_\Phi, \qquad (3) \]
where AΦ and bΦ are respectively a matrix and a vector characterizing the operator. The system of equations defined in (2) is commonly underdetermined, i.e., it does not univocally specify a solution for v. A proper conditioning of the problem therefore requires including some additional constraints specifying the nature of the sought solution, e.g.,
\[ \Pi_v(w) = 0, \qquad w \in \mathcal{W}, \qquad (4) \]
where Πv denotes an operator on v which is parameterized by a (possibly multidimensional) index w. In the sequel, we will refer to Πv as the prior model and use the notation Π to denote the vector formed by the concatenation of the Πv(w)'s, ∀ w ∈ W. The choice of the prior model is commonly made (but does not have to be) so that some form of regularity is ensured. For example, a possible choice for Πv is as follows [8]:
\[ \Pi_v(s,t) \triangleq \nabla_s v(s,t), \qquad \forall s \in S_r,\ \forall t \in T_r, \qquad (5) \]
where ∇s v denotes the Jacobian of v and we made the identification w ≜ (s, t), W ≜ Sr × Tr. In other applications, Πv can enforce the solution to satisfy some physical constraints on motion regularity (see e.g., [6]). Among the possible prior operators, we will put a particular emphasis on the family of linear operators, i.e.,
\[ \Pi = A_\Pi v + b_\Pi, \qquad (6) \]
where AΠ and bΠ are respectively a matrix and a vector characterizing the operator. In practice, the data and the prior models may not perfectly describe the sought motion field. Looking for v satisfying both (2) and (4) may then lead to aberrant or unstable solutions. Hence, a common approach to avoid such issues consists in minimizing an energy functional composed of two terms balanced by a regularization coefficient γ:
\[ L(I, v, \gamma) = f_d(\Phi_{I,v}) + \gamma f_r(\Pi_v), \qquad (7) \]
where γ is a positive parameter. The "data term" fd(ΦI,v) (resp. "regularization term" fr(Πv)) penalizes discrepancies from the considered data model (2) (resp. prior model (4)), whereas γ tunes the tradeoff between the two terms. The choice of the cost functions fd and fr is of great importance since it implicitly defines the type of solution we are looking for. One possible choice for fd is, for example,
\[ f_d(\Phi_{I,v}) = \|\Phi\|_2^2 \triangleq \sum_{(s,t) \in S_r \times T_r} \left(\Phi_{I,v}(s,t)\right)^2, \qquad (8) \]
which measures the Euclidean distance of Φ to zero. In the context of strong model deviations, e.g., when dealing with observation outliers, the use of the ℓ2 norm may be inefficient. In such scenarios, "robust cost functions" are commonly considered to penalize model discrepancies:
\[ f_d(\Phi_{I,v}) = \rho_d(\Phi, \tau_d), \qquad (9) \]
where ρd : R^{|Sr||Tr|+1} → R is an even, continuously differentiable, concave function with some suitable properties and τd is a parameter. Robust cost functions are also commonly referred to as "M-estimators". Well-known instances of M-estimators include Leclerc's cost function and an approximation of the ℓ1 norm (see e.g., [9][10]). Similarly, fr(Πv) can be defined by either a quadratic norm or, in the context of strong motion spatial discontinuities, a robust cost function ρr(Π, τr), where τr is a parameter.
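To make these choices concrete, the short sketch below evaluates the quadratic penalty (8) next to two robust alternatives on the OFC residuals of a toy patch. It is our own illustration: the exact parameterizations of Leclerc's function and of the smooth ℓ1 approximation vary across the literature, so the forms used here are assumptions rather than the ones of [9][10].

```python
import numpy as np

def quadratic_cost(residual):
    """Quadratic penalty, eq. (8): sum of squared residuals."""
    return np.sum(residual ** 2)

def leclerc_cost(residual, tau):
    """Leclerc-type robust penalty (one common parameterization, assumed here):
    saturates for residuals much larger than tau, down-weighting outliers."""
    return np.sum(1.0 - np.exp(-(residual / tau) ** 2))

def smooth_l1_cost(residual, eps=1e-3):
    """Smooth approximation of the l1 norm (Charbonnier-type, assumed form)."""
    return np.sum(np.sqrt(residual ** 2 + eps ** 2))

# Example: OFC residuals Phi = I_t + I_x * u + I_y * v on four pixels.
I_t = np.array([0.1, -0.2, 0.05, 3.0])   # last pixel acts as an "outlier"
I_x = np.array([1.0, 0.5, -0.3, 0.8])
I_y = np.array([0.2, -1.0, 0.7, 0.1])
u, v = 0.3, -0.1
phi = I_t + I_x * u + I_y * v

print(quadratic_cost(phi))         # dominated by the outlier
print(leclerc_cost(phi, tau=0.5))  # outlier contribution saturates near 1
print(smooth_l1_cost(phi))         # grows only linearly with the outlier
```

The point of the comparison is only that the quadratic term is dominated by the single outlier, whereas both robust penalties bound its influence.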
2.2 Standard optic-flow estimation
Practical estimation of the optic flow requires finding tractable and accurate implementations of the following problem:
\[ \hat{v} = \arg\min_v L(I, v, \gamma). \qquad (10) \]
Different cases can be distinguished according to whether the data/prior operators are linear or not, and whether the cost functions fd and fr implement a quadratic norm or robust cost functions. When the operators are linear and the cost functions quadratic, the problem is convex. There is therefore one unique minimum, which can be efficiently reached by numerical procedures such as the Conjugate Gradient Squared (CGS) algorithm [11] or multi-grid algorithms [12]. When fd and fr are robust cost functions, the direct application of standard optimization algorithms may lead to cumbersome procedures. Instead, one common technique consists in expressing (10) as a sequence of quadratic problems. This approach is based on the concavity of M-estimators and Fenchel-Legendre duality. In particular, it can be shown [9][13] that
\[ \min_v L(I, v, \gamma) = \min_{v,z} B(I, v, z, \gamma), \qquad (11) \]
where
\[ B(I, v, z, \gamma) \triangleq \Phi^T \Lambda_{z_d} \Phi + \psi_\Phi(z_d) + \gamma \left( \Pi^T \Lambda_{z_r} \Pi + \psi_\Pi(z_r) \right), \]
$z \triangleq [z_d^T\ z_r^T]^T$, and ψΦ(zd) (resp. ψΠ(zr)) is the Fenchel-Legendre dual function of fd (resp. fr). Minimizing B instead of L often eases the resolution of the optimization problem. Indeed, considering iterative conditional minimization of B, we have
\[ v^{(n)} = \arg\min_v B(I, v, z^{(n)}, \gamma), \qquad (12) \]
\[ z^{(n+1)} = \arg\min_z B(I, v^{(n)}, z, \gamma). \qquad (13) \]
Now, since B is a quadratic function with respect to Φ and Π, (12) can be solved efficiently by applying standard optimization procedures. Moreover, (13) usually possesses an easy analytical solution. In the case of linear models, we have:
\[ z_d^{(n+1)} = \frac{1}{2\tau_d} \Lambda^{-2}_{A_\Phi v^{(n)} + b_\Phi}\, \nabla_\Phi\, \rho_d(A_\Phi v^{(n)} + b_\Phi, \tau_d), \qquad (14) \]
\[ z_r^{(n+1)} = \frac{1}{2\tau_r} \Lambda^{-2}_{A_\Pi v^{(n)} + b_\Pi}\, \nabla_\Pi\, \rho_r(A_\Pi v^{(n)} + b_\Pi, \tau_r). \qquad (15) \]
Hence, the minimization of B via (12)-(13) reduces to solving a sequence of tractable quadratic problems.
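As an illustration of the alternation (12)-(13), the following sketch, which is our own toy example rather than the paper's implementation, minimizes a robust energy of the form (7) for a linear data model Φ = AΦ v + bΦ and a linear prior Π = AΠ v. The Leclerc-type weight rule below plays the role of the closed-form updates (14)-(15) (with a slightly different parameterization), and each v-update is a weighted least-squares problem.

```python
import numpy as np

def leclerc_weights(x, tau):
    """Half-quadratic weights for the Leclerc penalty rho(x) = 1 - exp(-(x/tau)^2):
    z_i = rho'(x_i) / (2 x_i) = exp(-(x_i/tau)^2) / tau^2 (standard IRLS rule)."""
    return np.exp(-(x / tau) ** 2) / tau ** 2

def robust_flow_like_solve(A_phi, b_phi, A_pi, gamma, tau_d, tau_r, n_iter=20):
    """Alternate between a quadratic v-step (analogue of (12)) and a weight-step
    (analogue of (13)-(15)) for the energy rho_d(A_phi v + b_phi) + gamma * rho_r(A_pi v)."""
    n = A_phi.shape[1]
    v = np.zeros(n)
    for _ in range(n_iter):
        # weight update: down-weight large data and prior residuals
        z_d = leclerc_weights(A_phi @ v + b_phi, tau_d)
        z_r = leclerc_weights(A_pi @ v, tau_r)
        # quadratic v update: weighted normal equations
        H = A_phi.T @ (z_d[:, None] * A_phi) + gamma * A_pi.T @ (z_r[:, None] * A_pi)
        g = -A_phi.T @ (z_d * b_phi)
        v = np.linalg.solve(H, g)
    return v

# Toy usage: 5 "observations" of a 2-dimensional field, one gross outlier in b_phi.
rng = np.random.default_rng(0)
A_phi = rng.normal(size=(5, 2))
v_true = np.array([1.0, -2.0])
b_phi = -(A_phi @ v_true)
b_phi[0] += 10.0                      # outlier
A_pi = np.eye(2)                      # trivial "smoothness" operator for the toy case
print(robust_flow_like_solve(A_phi, b_phi, A_pi, gamma=1e-3, tau_d=1.0, tau_r=10.0))
```

With the outlier strongly down-weighted after the first iteration, the remaining equations essentially determine v, which is the behaviour the half-quadratic scheme is meant to produce.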
3 A Bayesian framework for model and hyper-parameter selection
In the previous section we emphasized that the optic-flow estimation problem requires making assumptions about: i) the data and prior models, ΦI,v and Πv; ii) the cost functions, fd and fr; iii) the hyper-parameters, γ, τd and τr. The choice of these quantities often dramatically influences the performance achieved by the estimation algorithms. Quite surprisingly, this problem has been mainly overlooked in the current literature. In this section, we propose a Bayesian inference method to make proper decisions about the models, the cost functions and the values of the hyper-parameters. In a first part, we give a Bayesian reformulation of the standard optic-flow problem (10) which motivates our subsequent derivations. We then devise a Bayesian method for the estimation of the models, cost functions and hyper-parameters based on the so-called "Bayesian evidence framework" [3].

3.1 Bayesian formulation of the optic-flow estimation problem
In this section, we emphasize that the general optic-flow estimation problem
\[ (v^\star, z^\star) = \arg\min_{v,z} B(I, v, z, \alpha, \beta), \qquad (16) \]
can also be expressed as a maximum a posteriori (MAP) problem. For the sake of conciseness, we focus exclusively on the case of linear operators (3), (6), but keep in mind that non-linear operators can be made (locally) linear by a first-order Taylor expansion. We consider the following probabilistic model relating I, v and z:
\[ p(b_\Phi \mid z_d, v, \beta, \Phi_{I,v}) \triangleq Z_{b_\Phi}^{-1} \exp\left( -\frac{(b_\Phi + A_\Phi v)^T \beta \Lambda_{z_d} (b_\Phi + A_\Phi v)}{2} \right), \qquad (17) \]
\[ p(z_d \mid \beta, \Phi_{I,v}) \triangleq Z_{z_d}^{-1} \exp\left( -\frac{\beta\, \psi_\Phi(z_d)}{2} \right) \sqrt{\det(\beta \Lambda_{z_d})^{-1}}, \qquad (18) \]
\[ p(v \mid z_r, \alpha, \Pi_v) \triangleq Z_v^{-1} \exp\left( -\frac{(v - m_\Pi)^T \Gamma_\Pi^{-1} (v - m_\Pi)}{2} \right), \qquad (19) \]
\[ p(z_r \mid \alpha, \Pi_v) \triangleq Z_{z_r}^{-1} \exp\left( -\frac{\alpha\, \psi_\Pi(z_r)}{2} \right) \sqrt{\det(A_\Pi^T \alpha \Lambda_{z_r} A_\Pi)^{-1}}, \qquad (20) \]
where Z_{bΦ}, Z_{zd}, Z_v and Z_{zr} are normalization constants, α, β are two positive parameters and
\[ \Gamma_\Pi \triangleq \left( \alpha A_\Pi^T \Lambda_{z_r} A_\Pi \right)^{-1}, \qquad m_\Pi = \Gamma_\Pi A_\Pi^T \Lambda_{z_r} b_\Pi. \qquad (21) \]
Equation (17) defines a family of Gaussian distributions on bΦ parameterized by zd; the probability of each Gaussian of this family is given in (18) and depends on ψΦ, the M-estimator dual function. It is interesting to note that bΦ is a function of the observed image I; p(bΦ|zd, v, β, ΦI,v) can therefore be seen as a probabilistic "observation model" relating I to v. Similarly, (19) defines a family of Gaussians parameterized by zr; (20) is a prior on the probability of occurrence of each instance of this family. p(v|zr, α, Πv) can therefore be regarded as a probabilistic "prior model" on v. Based on these definitions, we can now define the following MAP estimation problem:
\[ (v^\star, z^\star) = \arg\max_{(v,z)} \left\{ \log p(b_\Phi, v, z \mid \alpha, \beta, \Phi_{I,v}, \Pi_v) \right\}, \qquad (22) \]
where
\[ p(b_\Phi, v, z \mid \alpha, \beta, \Phi_{I,v}, \Pi_v) = p(b_\Phi \mid v, z_d, \beta, \Phi_{I,v})\, p(z_d \mid \beta, \Phi_{I,v})\, p(v \mid z_r, \alpha, \Pi_v)\, p(z_r \mid \alpha, \Pi_v). \qquad (23) \]
It is quite easy to see (by direct substitution) that (22) is equivalent to (16) if we set γ ≜ α/β. This connection gives a physical interpretation to the assumptions which are implicitly made when considering the standard problem (16). For example, the optimization of B with respect to zd (resp. zr) is equivalent to selecting the best probabilistic data (resp. prior) model among a family of Gaussians with different covariance matrices.
3.2 Bayesian inference for robust optic-flow estimation
Since standard optic-flow estimation algorithms based on (16) implicitly consider the probabilistic model (17)-(20), it is legitimate to wonder whether this model can also be exploited to infer the data/prior models, the cost functions and the hyper-parameters. In this section, we will assume so and propose a Bayesian methodology to estimate these quantities based on model (17)-(20). Note that this Bayesian approach differs from the learning strategies proposed in [14], since
it neither requires training data nor ground truth. We will use the notation ωd to refer to the couple (ΦI,v, fd) specifying the data model and cost function. Similarly, ωr will refer to (Πv, fr). Finally, we will use the following short-hand notations: θ ≜ [α, β, τd, τr]^T and ω ≜ [ωd, ωr]^T. We will infer v, z, θ and ω from the following set of problems:
\[ (v^\star, z^\star) = \arg\max_{(v,z)} \left\{ \log p_{B_\Phi, V, Z \mid \Theta, \Omega}(b_\Phi, v, z \mid \theta^\star, \omega^\star) \right\}, \qquad (24) \]
\[ \theta^\star = \arg\max_{\theta} \left\{ \log p_{B_\Phi \mid Z, \Theta, \Omega}(b_\Phi \mid \hat{z}(\theta, v^\star), \theta, \omega^\star) \right\}, \qquad (25) \]
\[ \omega^\star = \arg\max_{\omega} \left\{ \log p_{B_\Phi \mid Z, \Omega}(b_\Phi \mid \hat{z}(\theta^\star, v^\star), \omega) \right\}, \qquad (26) \]
with $\hat{z}(\theta, v) = [\hat{z}_d^T(\theta, v)\ \hat{z}_r^T(\theta, v)]^T$,
\[ \hat{z}_d(\theta, v) = \frac{1}{2\tau_d} \Lambda^{-2}_{A_\Phi v + b_\Phi}\, \nabla_\Phi\, \rho_d(A_\Phi v + b_\Phi, \tau_d), \qquad (27) \]
\[ \hat{z}_r(\theta, v) = \frac{1}{2\tau_r} \Lambda^{-2}_{A_\Pi v + b_\Pi}\, \nabla_\Pi\, \rho_r(A_\Pi v + b_\Pi, \tau_r). \qquad (28) \]
The system (24)-(26) is inspired by the Bayesian evidence framework proposed in [3]. It defines three levels of inference. In the first level (24), v and z are estimated by relying on the hyper-parameter and model estimates (θ*, ω*). In the second level (25), the dependence on v is marginalized out and θ is inferred by assuming ω = ω*. Finally, in the last level (26), ω* is computed by maximizing a likelihood function in which both the dependence on v and θ has been removed. Note that the set of problems defined in (24)-(26) is slightly different from the one presented in [3] since the dependence on z is not removed in (25)-(26). Instead, we constrain z to have a particular structure, namely (27)-(28). As we will see in the remainder of this section, this digression from the original Bayesian evidence framework allows a tractable implementation of the inference algorithm. On the other hand, it also forces an interconnection between all levels of inference: θ* depends on v* through ẑ(θ, v*), whereas v* is the maximum of a function depending on θ*, etc. We consider the following iterative procedure to find an estimate satisfying all the equations of the system (24)-(26):
\[ \hat{\omega} = \arg\max_{\omega} \left\{ \log p_{B_\Phi \mid Z, \Omega}(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \omega) \right\}, \qquad (29) \]
where the sequence $\{\theta_\omega^{(n)}, v_\omega^{(n)}\}_{n=0}^{\infty}$ is defined as follows:
\[ (v_\omega^{(n)}, z^{(n)}) = \arg\max_{(v,z)} \left\{ \log p(b_\Phi, v, z \mid \theta_\omega^{(n-1)}, \omega) \right\}, \qquad (30) \]
\[ \theta_\omega^{(n)} = \theta_\omega^{(n-1)} + \mu P\, \nabla_\theta \log p(b_\Phi \mid \hat{z}(\theta_\omega^{(n-1)}, v_\omega^{(n)}), \theta_\omega^{(n-1)}, \omega), \qquad (31) \]
where µ is a positive step factor and P a positive definite matrix. In practice, convergence is obtained after about 10 iterations on average. Matrix P is chosen to be a finite-difference approximation of the Hessian, so that (31) constitutes an iteration of a quasi-Newton ascent method (see [11]). The step µ is fixed according to the strong Wolfe conditions [11]. Clearly, any fixed point of recursions (29)-(31) satisfies (24)-(26). We detail hereafter the strategy we considered to implement each step of the proposed algorithm:
Step (30): The problem to solve is equivalent to the standard optic-flow estimation problem (16) (see Section 3.1) and can be solved by the iterative conditional maximizations (12)-(13).

Step (31): The update requires the gradient of $\log p(b_\Phi \mid \hat{z}(\theta, v_\omega^{(n)}), \theta, \omega)$, which can be efficiently computed by noticing that [15]:
\[ \nabla_\theta \log p(b_\Phi \mid z, \theta, \omega) = \int p(v \mid z, \theta, \omega)\, \nabla_\theta \log p(b_\Phi, v \mid z, \theta, \omega)\, dv. \qquad (32) \]
This leads to
\[ \frac{\partial}{\partial \alpha} \log p(b_\Phi \mid z, \theta, \omega) = -\frac{1}{2} \left( \alpha^{-1} + \left\langle (v - m_\Pi)^T A_\Pi^T \Lambda_{z_r} A_\Pi (v - m_\Pi) \right\rangle \right), \]
\[ \frac{\partial}{\partial \beta} \log p(b_\Phi \mid z, \theta, \omega) = -\frac{1}{2} \left( \beta^{-1} + \left\langle (b_\Phi - A_\Phi v)^T \Lambda_{z_d} (b_\Phi - A_\Phi v) \right\rangle \right), \]
\[ \frac{\partial}{\partial \tau_d} \log p(b_\Phi \mid z, \theta, \omega) = -\frac{1}{2} \left( \mathrm{tr}\left( \Lambda_{z_d}^{-1} \Lambda_{\frac{\partial z_d}{\partial \tau_d}} \right) + \left\langle (b_\Phi - A_\Phi v)^T \beta \Lambda_{\frac{\partial z_d}{\partial \tau_d}} (b_\Phi - A_\Phi v) \right\rangle \right), \]
\[ \frac{\partial}{\partial \tau_r} \log p(b_\Phi \mid z, \theta, \omega) = -\frac{1}{2} \left( \mathrm{tr}\left( \Lambda_{z_r}^{-1} \Lambda_{\frac{\partial z_r}{\partial \tau_r}} \right) + \left\langle (v - m_\Pi)^T \alpha A_\Pi^T \Lambda_{\frac{\partial z_r}{\partial \tau_r}} A_\Pi (v - m_\Pi) \right\rangle \right), \]
where we use the notation ⟨·⟩ to denote the expectation with respect to p(v|z, θ, ω). Note that v only appears in linear and quadratic forms in the expressions of the partial derivatives defined above. As a consequence, the latter derivatives are only a function of the mean and covariance of p(v|z, θ, ω). Now, it is easy to see that p(v|z, θ, ω) is a Gaussian distribution with mean and covariance defined as
\[ m_{v|z,\theta,\omega} \triangleq \langle v \rangle = \Gamma_{v|z,\theta,\omega} \left( A_\Phi^T \Lambda_{z_d} b_\Phi + A_\Pi^T \Lambda_{z_r} b_\Pi \right), \qquad (33) \]
\[ \Gamma_{v|z,\theta,\omega} \triangleq \left\langle (v - m_{v|z,\theta,\omega})(v - m_{v|z,\theta,\omega})^T \right\rangle = \left( \alpha A_\Phi^T \Lambda_{z_d} A_\Phi + \beta A_\Pi^T \Lambda_{z_r} A_\Pi \right)^{-1}. \qquad (34) \]
Therefore, the computation of the gradient of p(v|z, θ, ω) only requires tractable linear operations.

Step (29): The decision on the model and the cost function ω is made by maximizing $\log p(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \omega)$. We assume that ω takes its values in a finite set, so that solving (29) only requires evaluating $\log p(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \omega)$ for these values. We use Laplace's method to derive an approximation of p(bΦ|z, ω). This method approximates the integral of a function by fitting a Gaussian at its maximum and computing the volume under the Gaussian. For a k-dimensional variable x and a function f(x), Laplace's approximation reads:
\[ \int f(x)\, dx \simeq f(\hat{x})\, (2\pi)^{k/2} \left[ \det\{ -\nabla_x^2 \log f(\hat{x}) \} \right]^{-1/2}, \qquad (35) \]
where $\nabla_x^2$ represents the Hessian operator and $\hat{x} = \arg\max_x f(x)$. Hence, if p(θ) is a flat non-informative prior, we get the following approximation:
\[ p_{B_\Phi \mid Z, \Omega}(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \omega) = \int p(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \theta, \omega)\, p(\theta)\, d\theta, \qquad (36) \]
\[ \propto p(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \theta_\omega^{(\infty)}, \omega)\, \left( -\det H_\theta \right)^{-1/2}, \qquad (37) \]
where $H_\theta = \nabla_\theta^2 \log p(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \theta_\omega^{(\infty)}, \omega)$. Finally, we obtain:
\[ \log p_{B_\Phi \mid Z, \Omega}(b_\Phi \mid \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)}), \omega) \simeq -\frac{1}{2} \log(-\det H_\theta) - \frac{1}{2} \log \det \Gamma_{v|z,\theta,\omega}^{-1} - \frac{1}{2} m_{v|z,\theta,\omega}^T \Gamma_{v|z,\theta,\omega}^{-1} m_{v|z,\theta,\omega} + \frac{1}{2} b_\Phi^T \Lambda_{z_d} b_\Phi + \frac{1}{2} m_\Pi^T A_\Pi^T \Lambda_{z_r} A_\Pi m_\Pi, \qquad (38) \]
where $m_{v|z,\theta,\omega}$ and $\Gamma_{v|z,\theta,\omega}$ are the a posteriori mean and covariance of v defined in (33)-(34) and evaluated at $\theta = \theta_\omega^{(\infty)}$ and $z = \hat{z}(\theta_\omega^{(\infty)}, v_\omega^{(\infty)})$.
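The third inference level thus reduces to fitting each candidate ω with the two lower levels and keeping the one with the largest approximate evidence. The sketch below is our own structural illustration of that loop: the motion/hyper-parameter estimation and the evidence criterion (38) are passed in as callables rather than re-derived, and the Laplace helper mirrors (35) for a scalar hyper-parameter only, with the second derivative obtained by finite differences in the spirit of the matrix P above.

```python
import numpy as np

def laplace_log_integral(log_f, theta_hat, h=1e-4):
    """Laplace approximation (35) of log of the integral of f(theta) for a scalar theta:
    log f(theta_hat) + 0.5*log(2*pi) - 0.5*log(-(log f)''(theta_hat)),
    with the second derivative estimated by central finite differences."""
    d2 = (log_f(theta_hat + h) - 2.0 * log_f(theta_hat) + log_f(theta_hat - h)) / h ** 2
    return log_f(theta_hat) + 0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(-d2)

def select_model(candidate_models, estimate, log_evidence):
    """Third level (29): fit every candidate omega with the two lower inference
    levels and keep the one with the largest approximate log-evidence.

    estimate(omega) -> (v_inf, theta_inf, z_hat), i.e. levels one and two;
    log_evidence(omega, v_inf, theta_inf, z_hat) -> Laplace-approximated criterion (38).
    Both are supplied by the caller; their concrete form is application-dependent."""
    best = None
    for omega in candidate_models:
        v_inf, theta_inf, z_hat = estimate(omega)
        score = log_evidence(omega, v_inf, theta_inf, z_hat)
        if best is None or score > best[0]:
            best = (score, omega, v_inf)
    return best  # (best score, selected model, corresponding motion estimate)

# Sanity check of the Laplace helper on a Gaussian integrand with sigma = 0.3:
# the exact log-integral is log(sqrt(2*pi)*0.3), about -0.285, reproduced below.
print(laplace_log_integral(lambda t: -t ** 2 / (2.0 * 0.3 ** 2), 0.0))
```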
4 Experiments
Fig. 1. Generated images I(t), ground truth, our estimate and the one of [16] in color representation. Color and intensity code vector orientations and magnitudes [17].
In the following, basic experiments have been designed in order to provide a proof of concept of the capabilities of the Bayesian inference technique. In our experiments, hyper-parameters and motion have been estimated at the different levels of a standard multi-resolution algorithm. The inference of the models has been performed only at the finest resolution level.
4.1 Fluid motion image sequence
We first consider a synthetic sequence of scalar images of 256 × 256 pixels representing the evolution of a two-dimensional turbulent flow [18]. The dynamical process was obtained by direct numerical simulation of the Navier-Stokes equations coupled with the advection-diffusion equation of a passive scalar: $\frac{\partial I(s,t)}{\partial t} + \nabla^T I(s,t)\, v(s,t) = \nu \Delta I(s,t)$, where ν represents an unknown diffusion coefficient. Fig. 1 presents a scalar image of the sequence together with the ground-truth motion. In our simulation we defined the set of possible values for ω as follows. The advection-diffusion equation with different values of ν constitutes the set of possible data models. A 1-st order regularizer (5) is used to implement the prior model. Both the ℓ2 and Leclerc's cost functions are possible choices for the data and prior cost functions. We considered the estimation of the hyper-parameters α, β, τd and τr.

Fig. 1 and Fig. 2 display the posterior motion estimate v̂, the Mean End Point (MEP) error and the Mean Barron Angular (MBA) [17] error obtained by applying the proposed Bayesian inference framework. Comparing with the results of [16] displayed in the same figures, one can notice that the use of Bayesian inference with a simple robust 1-st order regularization outperforms the most accurate state-of-the-art fluid motion estimators [18]. Therefore, fitting an inappropriate regularizer while selecting data models by Bayesian inference yields better results than fine regularizers adjusted by manually tuning hyper-parameters.
Fig. 2. Motion field, MEP and MBA errors [17] corresponding to estimation with (left) or without (right) Bayesian inference. Left: Bayesian inference with a robust diffusion model and a robust 1-st order regularizer (MEP error: 0.27, MBA error: 9.09°). Right: results of [16], without Bayesian inference, obtained with a robust OFC and a quadratic Curl 2-nd order regularizer (MEP error: 0.28, MBA error: 9.86°).
Fig. 3. Left: the energy of the model probability w.r.t. the coefficient ν (green) is minimal for the value ν̂ that also minimizes the MEP error (red). Right: the energy of the regularization-coefficient probability (green) reaches its minimum close to the minimum of the MEP error curve (red).
Fig. 3 shows that the inferred diffusion and regularization coefficients also minimize the MEP error.
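For completeness, a minimal sketch of how the residual of this data-model family can be formed is given below; it is our own illustration, with finite-difference and boundary choices that are assumptions rather than the paper's implementation. Each candidate value of ν defines one member of the family.

```python
import numpy as np

def advection_diffusion_residual(I0, I1, u, v, nu):
    """Residual of the advection-diffusion data model
    Phi = dI/dt + I_x * u + I_y * v - nu * Laplacian(I),
    using forward differences in time, central differences in space and
    replicated borders (our discretization choices, not the paper's)."""
    I_t = I1 - I0
    I = 0.5 * (I0 + I1)                     # evaluate spatial terms at the temporal midpoint
    I_y, I_x = np.gradient(I)               # np.gradient returns (d/drows, d/dcols)
    Ip = np.pad(I, 1, mode="edge")
    lap = Ip[1:-1, :-2] + Ip[1:-1, 2:] + Ip[:-2, 1:-1] + Ip[2:, 1:-1] - 4.0 * I
    return I_t + I_x * u + I_y * v - nu * lap

# A smooth pattern translated by one pixel: the residual is small for nu = 0
# and grows as nu moves away from the (here diffusion-free) truth.
yy, xx = np.mgrid[0:64, 0:64]
I0 = np.sin(xx / 6.0) * np.cos(yy / 9.0)
I1 = np.roll(I0, 1, axis=1)                 # pattern shifted by one pixel along x
u = np.ones_like(I0)                        # flow consistent with that shift (our convention)
v = np.zeros_like(I0)
for nu in (0.0, 0.05, 0.5):
    phi = advection_diffusion_residual(I0, I1, u, v, nu)
    print(nu, float(np.mean(phi[2:-2, 2:-2] ** 2)))  # interior mean, avoiding the wrap seam
```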
4.2 Computer vision scenes
In this section, we assess the performance of the proposed Bayesian inference method with image sequences from the Middlebury database [17]. We show the power of the proposed method by emphasizing its effectiveness for very simple observation and prior models: the basic model for the data (monochromatic model, OFC equation (1) without any image-gradient preservation) and a basic 1-st order regularizer (5). Obviously, using more sophisticated models would likely improve our results.

"Venus" sequence: We considered the ℓ2, ℓ1 or Leclerc's function for fr, while we chose Leclerc's cost function for fd. Our scheme selects the ℓ1 norm as the best prior cost function. The estimated motion field v̂ obtained with ω̂ and θ̂ is shown in Fig. 4 together with the maps of data outliers and motion spatial discontinuities related to the estimates τ̂d and τ̂r. As shown in the table of Fig. 5, the errors obtained with Bayesian inference for these very simple observation and prior models are comparable to the errors of manually tuned hyper-parameters of an affine regularization model or of a specialized data term dedicated to such scenes composed of rigid objects [9, 12].

"Dimetrodon" sequence: We considered the ℓ2 and ℓ1 norms or Leclerc's cost function for fd and fr. Our scheme selects the ℓ1 norm for both fd and fr and adjusts the hyper-parameters.
Fig. 4. Ground truth, motion estimate v̂, likelihood robust weights ẑd, and prior robust weights ẑr (horizontal and vertical cliques). The top (resp. bottom) line represents the results for frames 10-11 of the "Venus" (resp. "Dimetrodon") sequence.
The left table in Fig. 5 shows that this combination performs best in terms of MEP and MBA errors. The estimated motion field v̂ and the error maps obtained with ω̂ and θ̂ are displayed in Fig. 4, together with the maps of data outliers and motion spatial discontinuities related to the estimates τ̂d and τ̂r. The right table in Fig. 5 shows that Bayesian inference enables a higher accuracy or, at least, results comparable to more refined methods with manually tuned parameters.
fd        fr        α̂/β̂     τ̂d     τ̂r     −log p(bΦ|ẑ, ω̂)   End-point error   Barron error
ℓ2        ℓ2        12.48    0      0      443013             0.201             3.656
Leclerc   ℓ2        14.97    0.32   0      418662             0.199             3.542
ℓ1        ℓ2        1.85     20.0   0      337990             0.191             3.309
ℓ2        Leclerc   9.58     0      2.00   437326             0.206             3.760
Leclerc   Leclerc   15.01    0.32   1.34   418602             0.199             3.542
ℓ1        Leclerc   1.83     20.0   0.39   338097             0.191             3.308
ℓ2        ℓ1        3.24     0      10.0   434656             0.258             4.883
Leclerc   ℓ1        12.28    0.34   10.0   417340             0.204             3.657
ℓ1        ℓ1        1.70     20.0   10.0   335564             0.190             3.303

Method                 Venus    Dimetrodon
Bayesian inference     8.348    3.303
Bruhn & al [12]        8.732    10.993
Black & Anandan [9]    7.641    9.261
Lucas-Kanade [19]      14.614   10.272
Media Player™          15.485   15.824
Zitnick & al [20]      11.423   30.105
Fig. 5. Left: selection of the most likely cost functions for the data and regularization terms. Right: comparison with the state of the art based on the MBA error criterion.
5 Conclusion
A generic and efficient Bayesian inference scheme has been proposed for selecting models and hyper-parameters in robust optic-flow estimation. Motion fields and models, together with their hyper-parameters, are treated as interdependent random variables. The optic flow, the regularization coefficients, the M-estimator parameters, and the prior and likelihood motion models are simultaneously inferred in this context by maximizing the marginalized posterior distributions. Experiments show that the proposed Bayesian inference scheme succeeds in selecting appropriate models and hyper-parameters. In particular, using very simple models, we achieve an accuracy comparable to state-of-the-art results. An intensive evaluation adapting models to the Middlebury optic-flow database could provide a fair judgement of the proposed framework's performance.
References

1. Jaynes, E.T.: Bayesian methods: General background (1986)
2. Molina, R., Katsaggelos, A.K., Mateos, J.: Bayesian and regularization methods for hyperparameter estimation in image restoration. IEEE Trans. Image Processing 8 (1999) 231–246
3. MacKay, D.J.C.: Bayesian interpolation. Neural Computation 4 (1992) 415–447
4. Krajsek, K., Mester, R.: Bayesian model selection for optical flow estimation. In: DAGM-Symposium. (2007) 142–151
5. Nir, T., Bruckstein, A.M., Kimmel, R.: Over-parameterized variational optical flow. Int. J. Comput. Vision 76 (2008) 205–216
6. Heas, P., Memin, E., Heitz, D., Mininni, P.: Bayesian selection of scaling laws for motion modeling in images. In: International Conference on Computer Vision (ICCV'09), Kyoto, Japan (2009)
7. Liu, T., Shen, L.: Fluid flow and optical flow. Journal of Fluid Mechanics 614 (2008) 253
8. Horn, B., Schunck, B.: Determining optical flow. Artificial Intelligence 17 (1981) 185–203
9. Black, M., Anandan, P.: The robust estimation of multiple motions: Parametric and piecewise-smooth flow fields. Computer Vision and Image Understanding 63 (1996) 75–104
10. Geman, D., Reynolds, G.: Constrained restoration and the recovery of discontinuities. IEEE Trans. Pattern Anal. Mach. Intell. 14 (1992) 367–383
11. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer Series in Operations Research. Springer-Verlag, New York, NY (1999)
12. Bruhn, A., Weickert, J., Kohlberger, T., Schnorr, C.: A multigrid platform for real-time motion computation with discontinuity-preserving variational methods. International Journal of Computer Vision 70 (2006) 257–277
13. Jordan, M.I., Ghahramani, Z., Jaakkola, T.S., Saul, L.: An introduction to variational methods for graphical models. Machine Learning 37 (1999) 183–233
14. Sun, D., Roth, S., Lewis, J.P., Black, M.: Learning optical flow. In: Proceedings of the 10th European Conference on Computer Vision. (2008) 83–97
15. Noels, N., Steendam, H., Moeneclaey, M.: The true Cramer-Rao bound for phase-independent carrier frequency estimation from a PSK signal. In: IEEE Global Telecommunications Conference 2002. (2002) 1137–1141
16. Yuan, J., Schnoerr, C., Memin, E.: Discrete orthogonal decomposition and variational fluid flow estimation. Journ. of Math. Imaging & Vision 28 (2007) 67–80
17. Baker, S., Scharstein, D., Lewis, J., Roth, S., Black, M., Szeliski, R.: A database and evaluation methodology for optical flow. In: Int. Conf. on Comp. Vis., ICCV 2007. (2007)
18. Carlier, J., Wieneke, B.: Report 1 on production and diffusion of fluid mechanics images and data. Fluid project deliverable 1.2. http://www.fluid.irisa.fr (2005)
19. Lucas, B., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: Int. Joint Conf. on Artificial Intel. (IJCAI). (1981) 674–679
20. Zitnick, C., Jojic, N., Kang, S.B.: Consistent segmentation for optical flow estimation. In: Tenth IEEE International Conference on Computer Vision (ICCV'05). Volume 2. (2005) 1308–1315