
Essential Matrix Estimation via Newton-type Methods

Uwe Helmke¹, Knut Hüper², Pei Yean Lee³, John Moore³

Abstract

In this paper camera parameters are assumed to be known and a novel approach for essential matrix estimation is presented. We estimate the essential matrix from point correspondences between a stereo image pair. The technical approach we take is a generalization of the classical Newton method. It is well known that the set of essential matrices forms a smooth manifold, and it is natural to minimise a suitable cost function whose global minimum is the essential matrix we are looking for. In this paper we present several algorithms generalising the classical Newton method, in the sense that (i) one of our methods can be considered as an intrinsic Newton method operating on the Riemannian manifold of all essential matrices, while (ii) the other two methods approximate the first one but are more efficient from a numerical point of view. Understanding the algorithms requires a careful analysis of the underlying geometry of the problem.

Keywords

Essential matrix, stereo vision, motion and structure recovery, 2-jet approximation, Newton method, quadratic convergence.

¹ Department of Mathematics, University of Würzburg, D-97074 Würzburg, Germany. Email: [email protected]
² National ICT Australia Ltd, Locked Bag 8001, Canberra ACT 2601, Australia. Email: [email protected]
³ Department of Systems Engineering, RSISE, Australian National University, Canberra ACT 0200, Australia, and National ICT Australia Ltd, Locked Bag 8001, Canberra ACT 2601, Australia. Email: {peiyean,john.moore}@syseng.anu.edu.au

I. Introduction

Epipolar geometry has important applications in 3D computer vision. These include determining the 3D motion and structure of an object moving relative to a camera, estimating the relative orientation and location of two cameras observing the same 3D objects, and camera calibration. Robust and accurate computation of the fundamental or essential matrix is crucial in all of these applications.

In this paper we assume the camera parameters are known and present a new approach to essential matrix estimation. The new algorithm can be viewed as a Gauss-on-manifold method and is an approximation to a more sophisticated, locally quadratically convergent Gauss-Newton-on-manifold method. Each step of our algorithms consists of a two-stage process. The first stage solves a least-squares problem in Euclidean space, producing an iterate off the manifold, considered as an embedded submanifold. The result of this optimisation procedure is then projected back onto the manifold. Understanding the algorithms requires a careful analysis of the underlying geometry of the problem. We (i) compute the fixed points of our algorithms, (ii) discuss the differentiability properties of our methods, and (iii) present an analysis resulting in local quadratic convergence in three cases.

II. Epipolar Geometry and the Essential Manifold

A. Epipolar geometry

Two images of the same scene are related by epipolar geometry, as illustrated in Figure 1. The images can be taken by two cameras, or by a single mobile camera at two different positions centered at C and C′. Given an object point M and its two-dimensional projections m̂ and m̂′ on the two image planes, the three points define an epipolar plane Π, which intersects the image planes I and I′ in the epipolar lines l_m̂′ and l_m̂. The images of the camera centers C, C′ are captured on the image planes I, I′ at the epipoles e, e′, respectively.


[Figure 1. The epipolar geometry: an object point M projects to m1 and m2 on the image planes I1 and I2 of two cameras with centers C1 and C2 related by the rigid motion (R, T); e1, e2 are the epipoles and l1, l2 the epipolar lines.]

Algebraically, epipolar geometry is represented by a fundamental matrix F, as

    m̂′⊤ F m̂ = 0,    (1)

where m̂ = [u v 1]⊤ and m̂′ = [u′ v′ 1]⊤. The fundamental matrix is a (3 × 3)-matrix of rank two. It is used to establish correspondences between two uncalibrated images. Here, image points m̂ and m̂′ are described in pixel image coordinates, and for now we assume that the image data are noise free.

When the camera calibration matrices K and K′ are known, the image pair m̂ ↔ m̂′ can be expressed in normalized image coordinates as m = K⁻¹m̂, m′ = K′⁻¹m̂′, and (1) can be reformulated in terms of an essential matrix E as

    m′⊤ E m = 0,    (2)

where

    E = K′⊤ F K.    (3)

From (1) and (2) it is clear that the essential matrix E is equivalent to a fundamental matrix F whenever the image points are expressed in normalized coordinates.
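These relations are straightforward to exercise numerically. The following is a minimal sketch, assuming NumPy; the helper names and the calibration values are ours, chosen purely for illustration:

    import numpy as np

    def normalize(K, m_hat):
        # pixel image coordinates -> normalized image coordinates, m = K^{-1} m_hat
        return np.linalg.solve(K, m_hat)

    def essential_from_fundamental(F, K, Kp):
        # eq. (3): E = K'^T F K
        return Kp.T @ F @ K

    def epipolar_residual(E, m, mp):
        # eq. (2): m'^T E m, zero for a noise-free correspondence
        return float(mp @ E @ m)

    # hypothetical calibration: focal length 800, principal point (320, 240)
    K = Kp = np.array([[800.0, 0.0, 320.0],
                       [0.0, 800.0, 240.0],
                       [0.0, 0.0, 1.0]])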

B. Essential manifold

To set up a notational framework for our results, we review in this section basic facts on the geometry of the essential matrix; see [1], [3], [4]. For simplicity of the subsequent analysis, however, we present these results using the terminology of Lie groups and Lie algebras. Let O3 denote the Lie group of real (3 × 3)-orthogonal matrices, with determinant equal to plus or minus one. Let SO3 denote the Lie group of real (3 × 3)-orthogonal matrices with determinant equal to one, and let so3 denote the corresponding Lie algebra, i.e., the set of real (3 × 3)-skew-symmetric matrices so3 := {X ∈ R^{3×3} | X = −X⊤}. Let RP² denote the real projective plane, i.e., the set of lines through the origin in R³. Recall that an essential matrix is a real (3 × 3)-matrix in factored form

    E = ΩΘ.    (4)

Choosing a basis of so3 as

    Ω1 := [ 0  0  0 ]      Ω2 := [  0  0  1 ]      Ω3 := [ 0 −1  0 ]
          [ 0  0 −1 ]            [  0  0  0 ]            [ 1  0  0 ]     (5)
          [ 0  1  0 ]            [ −1  0  0 ]            [ 0  0  0 ]

we then have

    Ω = Σ_{i=1}^{3} ωᵢ Ωᵢ ≠ 0,   Θ ∈ SO3.    (6)

Here Θ is the rotation matrix describing the rotation between the two cameras, and ω = [ω1 ω2 ω3]⊤ is the translation vector between the two cameras. Once E is known, the factorization (4) can be made explicit, as shown below. It is well known, [1], [4], that essential matrices are characterized by the property that they have exactly one positive singular value, of multiplicity two; consequently, E must have rank 2. In particular, normalized essential matrices are those of Frobenius norm equal to √2, which are therefore characterized by having the set of singular values {1, 1, 0}. The normalized essential manifold is defined as

    E := { ΩΘ | Ω ∈ so3, Θ ∈ SO3, ‖Ω‖² = tr(ΩΩ⊤) = 2 }.    (7)

This is the basic nonlinear constraint set on which the proposed algorithms are defined.

C. Characterization of the normalized essential manifold

First, we show that a non-zero (3 × 3)-matrix E is essential if and only if

    E = U Σ V⊤,    (8)

where

    Σ = [ s  0  0 ]
        [ 0  s  0 ] ,   s > 0,   U, V ∈ SO3.    (9)
        [ 0  0  0 ]

Note that E is a normalized essential matrix when s = 1. Assume E = ΩΘ with Ω and Θ as in (6). The equality E = ΩΘ implies EE⊤ = ΩΘΘ⊤Ω⊤ = −Ω², with corresponding set of eigenvalues

    λ(EE⊤) = { s², s², 0 },   where   s := √( Σ_{i=1}^{3} ωᵢ² ).

The set of singular values of E is then σ(E) = {s, s, 0}. For the converse, consider

    Ψ := [ 0 −s  0 ]                Γ := [  0  1  0 ]
         [ s  0  0 ] ∈ so3   and         [ −1  0  0 ] ∈ SO3.
         [ 0  0  0 ]                     [  0  0  1 ]

One has

    Ψ⊤ = Γ [ s  0  0 ]
           [ 0  s  0 ] .    (10)
           [ 0  0  0 ]

Hence, for the singular value decomposition of E,

    E = U [ s 0 0; 0 s 0; 0 0 0 ] V⊤ = (U Γ⊤ Ψ⊤ Γ U⊤) (U Γ⊤ V⊤) = Ω Θ,

with Ω := U Γ⊤ Ψ⊤ Γ U⊤ ∈ so3 and Θ := U Γ⊤ V⊤ ∈ SO3, as required.
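The constructive proof translates directly into code. A sketch, assuming NumPy (function names are ours; the sign corrections are harmless because the third singular value of an essential matrix vanishes):

    import numpy as np

    GAMMA = np.array([[0.0, 1.0, 0.0],
                      [-1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])

    def factor_essential(E):
        # SVD with U, V in SO(3); sigma1 = sigma2 = s, sigma3 = 0
        U, S, Vt = np.linalg.svd(E)
        if np.linalg.det(U) < 0:
            U[:, 2] *= -1.0       # harmless: the third singular value is zero
        if np.linalg.det(Vt) < 0:
            Vt[2, :] *= -1.0
        s = S[0]
        Psi = np.array([[0.0, -s, 0.0],
                        [s, 0.0, 0.0],
                        [0.0, 0.0, 0.0]])
        Omega = U @ GAMMA.T @ Psi.T @ GAMMA @ U.T   # skew-symmetric factor
        Theta = U @ GAMMA.T @ Vt                    # rotation factor, via (10)
        return Omega, Theta

    # quick self-check on E0 = diag(1, 1, 0)
    Omega, Theta = factor_essential(np.diag([1.0, 1.0, 0.0]))
    assert np.allclose(Omega @ Theta, np.diag([1.0, 1.0, 0.0]))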

The next result characterizes the normalized essential manifold E as the smooth manifold of (3 × 3)-matrices with fixed set of singular values {1, 1, 0}; see [2] for details on the geometry of such manifolds.

Proposition II.1: Let

    E0 := [ I2  0 ]
          [ 0   0 ] ,    (11)

    E := { U E0 V⊤ | U, V ∈ SO3 }.    (12)

Then E is a smooth five-dimensional manifold diffeomorphic to RP² × SO3. □

The orthogonal matrices appearing in the above SVD of a given essential matrix are not uniquely determined. However, the possible choices are easily described, leading to an explicit description of factorizations.

D. Tangent space of the essential manifold

We now consider the tangent spaces to E.

Theorem II.1: The tangent space T_E E at the normalized essential matrix E = U E0 V⊤ is

    T_E E = { U (ΩE0 − E0Ψ) V⊤ | Ω, Ψ ∈ so3 }

          = { U [ 0             ω12 − ψ12   −ψ13 ]
                [ −(ω12 − ψ12)  0           −ψ23 ] V⊤  |  ωij, ψij ∈ R, i, j ∈ {1, 2, 3} }    (13)
                [ −ω13          −ω23         0   ]

with the usual notation Ω = (ωij) and Ψ = (ψij).

Proof: For any E = U E0 V⊤ ∈ E let αE : SO3 × SO3 → E be the smooth map defined by αE(Û, V̂) = Û E V̂⊤. The tangent space T_E E is the image of the linear map

    D αE(I3, I3): so3 × so3 → R^{3×3},   (Ω̂, Ψ̂) ↦ Ω̂E − EΨ̂,    (14)

i.e., the image of the derivative of αE evaluated at the identity (I3, I3) ∈ SO3 × SO3. By setting Ω := U⊤Ω̂U and Ψ := V⊤Ψ̂V the result follows; see [2], p. 89, for details. □

Corollary II.1: The kernel of the mapping D αE0(I3, I3): so3 × so3 → R^{3×3} is the set of matrix pairs (Ω, Ψ) ∈ so3 × so3 with

    Ω = Ψ = [ 0   x  0 ]
            [ −x  0  0 ] ,   x ∈ R.    (15)
            [ 0   0  0 ]
□

Corollary II.2: The affine tangent space T_E^aff E at the normalized essential matrix E = U E0 V⊤ is

    T_E^aff E = { U [ 1    −x3   −x5 ]
                    [ x3    1     x4 ] V⊤  |  x1, …, x5 ∈ R }.    (16)
                    [ −x2   x1    0  ]
□


E. Parameterisation of the essential manifold

Computations on a manifold are often conveniently carried out in terms of a local parameterisation. For our later convergence analysis we therefore need a local parameterisation of the essential manifold.

Lemma II.1: Let N(0) ⊂ R⁵ denote a sufficiently small open neighborhood of the origin in R⁵. Let U, V ∈ SO3 be arbitrary, let x = [x1, …, x5]⊤ ∈ R⁵, and let E0 be defined as in (11). Consider the mappings

    Ω1: R⁵ → so3,   x ↦ (1/√2) [ 0       −x3/√2   x2 ]
                               [ x3/√2    0      −x1 ]     (17)
                               [ −x2      x1      0  ]

and

    Ω2: R⁵ → so3,   x ↦ (1/√2) [ 0        x3/√2   x5 ]
                               [ −x3/√2   0      −x4 ] .    (18)
                               [ −x5      x4      0  ]

Consider also

    µ: N(0) → E,   x ↦ U e^{Ω1(x)} E0 e^{−Ω2(x)} V⊤.    (19)

Then the mapping µ is a diffeomorphism of N(0) onto its image µ(N(0)).

Proof: Smoothness of µ is obvious. We show that µ is an immersion at 0. For this it is sufficient to show that the derivative

    D µ(0): R⁵ → T_{µ(0)} E    (20)

is injective. For arbitrary h⊤ = [h1, …, h5] ∈ R⁵ we get

    D µ(0)h = U ( Ω1(h)E0 − E0Ω2(h) ) V⊤
            = (1/√2) U [ 0       −√2 h3   −h5 ]
                       [ √2 h3    0        h4 ] V⊤,
                       [ −h2      h1       0  ]

which implies injectivity in an obvious manner. The result follows. □

Remark II.1: In this paper we consider the essential manifold as an orbit of the group SO3 × SO3 acting on E0 by equivalence. Via the differential of this group action, the canonical Riemannian metric on SO3 × SO3 induces a Riemannian metric on the essential manifold, called the normal Riemannian metric on E; see e.g. [2] for details of this construction. Moreover, by exploiting Corollary II.1 one can show that under this group action geodesics on SO3 × SO3, namely one-parameter subgroups, are mapped to geodesics on E. We refer to [6], Theorem 5.9.2, for a proof of this fact in a more general context. It turns out that the curves on E we use in the sequel, i.e.,

    γ: t ↦ U e^{tΩ1(x)} E0 e^{−tΩ2(x)} V⊤,   x ∈ R⁵,    (21)

are indeed geodesics on E with respect to the normal Riemannian metric. In addition, the inverse µ⁻¹ defines a so-called normal Riemannian coordinate chart. Such a chart has the remarkable feature that the Riemannian metric expressed in this chart, evaluated at zero, is represented by the identity.
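For later reference, the maps (17)-(19) are easy to code. A sketch, assuming NumPy and SciPy's expm (names are ours); by linearity of Ω1 and Ω2, the curve t ↦ mu(t*x, U, V) is then exactly the geodesic (21):

    import numpy as np
    from scipy.linalg import expm

    S2 = np.sqrt(2.0)
    E0 = np.diag([1.0, 1.0, 0.0])

    def Omega1(x):
        # eq. (17)
        x1, x2, x3, x4, x5 = x
        return np.array([[0.0, -x3 / S2, x2],
                         [x3 / S2, 0.0, -x1],
                         [-x2, x1, 0.0]]) / S2

    def Omega2(x):
        # eq. (18)
        x1, x2, x3, x4, x5 = x
        return np.array([[0.0, x3 / S2, x5],
                         [-x3 / S2, 0.0, -x4],
                         [-x5, x4, 0.0]]) / S2

    def mu(x, U, V):
        # eq. (19): local parameterisation of E around U E0 V^T
        return U @ expm(Omega1(x)) @ E0 @ expm(-Omega2(x)) @ V.T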


F. Cost function

Let M^(i) := m^(i) m′^(i)⊤, where m^(i), m′^(i) ∈ R³ correspond to the normalized i-th image point pair in the left and right camera, respectively, for which the correspondence is assumed to be known. Consider the smooth function

    f: E → R,   f(E) = (1/2) Σ_{i=1}^{n} ( m′^(i)⊤ E m^(i) )² = (1/2) Σ_{i=1}^{n} tr²(M^(i) E).    (22)

This cost function attains the value zero if and only if there is an essential matrix which fulfills the epipolar constraint for each image point pair. That is, in the noise-free case the global minimum value is zero. In the noisy case the zero value will in general not be attained. It nevertheless makes sense to search for minima of this cost function even in the presence of noise. The minima can then be interpreted as least-squares approximations to the true essential matrix, ignoring for the moment any statistical interpretations or refinements.
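A direct transcription of (22), assuming NumPy; ms and mps are sequences of normalized image points m^(i), m′^(i), and the names are ours:

    import numpy as np

    def M_i(m, mp):
        # M^(i) = m^(i) m'^(i)T, so that tr(M^(i) E) = m'^(i)T E m^(i)
        return np.outer(m, mp)

    def f(E, ms, mps):
        # eq. (22): one half the sum of squared epipolar residuals
        return 0.5 * sum(float(mp @ E @ m) ** 2 for m, mp in zip(ms, mps))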

G. 2-jet of the cost function f on E in terms of the local parameterisation

The 2-jet, or second-order Taylor polynomial, of f around the point E = U E0 V⊤ ∈ E, expressed in local parameters using the smooth parameterisation µ as in (19), is defined as

    j0^(2)(f ∘ µ)(x): N(0) → R,   x ↦ f(µ(tx))|_{t=0} + (d/dt) f(µ(tx))|_{t=0} + (1/2) (d²/dt²) f(µ(tx))|_{t=0}.    (23)

That is,

    j0^(2)(f(µ(x))) = (1/2) Σ_{i=1}^{n} tr²(M^(i) E) + Σ_{i=1}^{n} tr(M^(i) E) tr( M^(i) U (Ω1(x)E0 − E0Ω2(x)) V⊤ )
      + (1/2) Σ_{i=1}^{n} tr²( M^(i) U (Ω1(x)E0 − E0Ω2(x)) V⊤ )
      + (1/2) Σ_{i=1}^{n} tr(M^(i) E) tr( M^(i) U (Ω1²(x)E0 + E0Ω2²(x) − 2Ω1(x)E0Ω2(x)) V⊤ ).    (24)

As expected, the 2-jet contains three terms:

(i) A constant,

    (f ∘ µ)(tx)|_{t=0} = (1/2) Σ_{i=1}^{n} tr²(M^(i) E) = const.

(ii) A linear one,

    (d/dt) f(µ(tx))|_{t=0} = Σ_{i=1}^{n} tr(M^(i) E) tr( M^(i) U (Ω1(x)E0 − E0Ω2(x)) V⊤ )    (25)
      = (∇(f ∘ µ)(0))⊤ · x = ⟨ grad f(U E0 V⊤), U (Ω1(x)E0 − E0Ω2(x)) V⊤ ⟩_{n.RM},    (26)

which can be interpreted either (I) as the transposed Euclidean gradient of f ∘ µ: R⁵ → R evaluated at zero acting on x ∈ R⁵, or (II) as the Riemannian gradient of f: E → R evaluated at U E0 V⊤ ∈ E, paired by the normal Riemannian metric with the tangent element U (Ω1(x)E0 − E0Ω2(x)) V⊤ ∈ T_{U E0 V⊤} E.

(iii) A quadratic term in x. The quadratic term actually consists of a sum of two terms. The first one, namely the term following the factor 1/2 in the second line of (24),

    Σ_{i=1}^{n} tr²( M^(i) U (Ω1(x)E0 − E0Ω2(x)) V⊤ ) = x⊤ · Ĥ_{f(µ(0))} · x,    (27)

is a quadratic form on R⁵, with the corresponding matrix Ĥ_{f(µ(0))} positive (semi)definite for all U, V ∈ SO3. We can interpret this summand as the positive (semi)definite part of

    H_{f(µ(0))} = Ĥ_{f(µ(0))} + H̃_{f(µ(0))},    (28)

with H_{f(µ(0))} the Hessian matrix of f ∘ µ: R⁵ → R evaluated at zero. For the quadratic term in (23) there is then a further interpretation:

    (d²/dt²) f(µ(tx))|_{t=0} = H_{f(γ(t))}( γ̇(t), γ̇(t) )|_{t=0},    (29)

i.e., H_{f(γ(t))} is the Hessian operator of f: E → R represented along the geodesics γ: R → E, γ(t) = µ(tx), γ(0) = U E0 V⊤.
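Equations (26) and (27) give the Euclidean gradient of f ∘ µ at zero and the positive (semi)definite part Ĥ of its Hessian in closed form. A sketch, reusing Omega1, Omega2 and E0 from the earlier sketch; the curvature part H̃ of (28) is deliberately omitted here, which is precisely the Gauss approximation used in Section VI:

    import numpy as np

    def grad_and_gauss_hessian(U, V, Ms):
        E = U @ E0 @ V.T
        # basis tangent directions xi_j = U(Omega1(e_j) E0 - E0 Omega2(e_j)) V^T
        xis = [U @ (Omega1(e) @ E0 - E0 @ Omega2(e)) @ V.T for e in np.eye(5)]
        g = np.zeros(5)
        H_hat = np.zeros((5, 5))
        for M in Ms:
            r = np.trace(M @ E)                            # tr(M^(i) E)
            d = np.array([np.trace(M @ xi) for xi in xis])
            g += r * d                                     # eq. (26)
            H_hat += np.outer(d, d)                        # eq. (27)
        return g, H_hat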

H. Projecting onto the manifold E

As will be explained in more detail below, our algorithms are iterative in nature. Each algorithmic step consists of two partial steps: the first is an optimisation procedure defined on an appropriate affine tangent space to E, and the second is a nonlinear projection back onto the manifold.

H.1 Approximating arbitrary matrices by essential ones

We need an explicit description of the best approximant in E of an arbitrary matrix X ∈ R^{3×3}.

Theorem II.2: Let X ∈ R^{3×3} with ordered singular value decomposition X = U Σ V⊤, i.e., Σ = diag(σ1, σ2, σ3), σ1 ≥ σ2 ≥ σ3 ≥ 0, and U, V ∈ O3. If σ3 is simple, the unique best approximant X̂ ∈ E with respect to the Frobenius norm, i.e.,

    X̂ := arg min_{Z ∈ E} ‖X − Z‖,    (30)

is given by

    X̂ = U E0 V⊤.    (31)

Proof: The proof proceeds along the same lines as for the corresponding Eckart-Young-Mirsky theorem for symmetric matrices; see [2]. The rather straightforward details are omitted here. □

In the sequel we will use the notation πSVD(X) = X̂.
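Theorem II.2 makes πSVD a three-line routine. A sketch, assuming NumPy; recall that uniqueness (and, as shown in Theorem IV.2 below, smoothness) requires the smallest singular value of X to be simple:

    import numpy as np

    def pi_svd(X):
        # eq. (31): replace the ordered singular values by {1, 1, 0}
        U, S, Vt = np.linalg.svd(X)
        return U @ np.diag([1.0, 1.0, 0.0]) @ Vt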

H.2 Projecting back by means of the parameterisation µ

Let X = U (E0 + Ω1E0 − E0Ω2) V⊤ with Ω1, Ω2 ∈ so3 be an arbitrary element of T^aff_{U E0 V⊤} E. We can simply project X back onto the manifold by

    X ↦ U e^{Ω1} E0 e^{−Ω2} V⊤.    (32)

It is straightforward to verify that this mapping is a projection; moreover, it is smooth. Let Ω1 and Ω2 be defined as in (17) and (18), respectively, and let T^aff E be the affine tangent bundle of E. For fixed x ∈ R⁵ consider the smooth mapping

    πµ: T^aff E → E,   U ( E0 + Ω1(x)E0 − E0Ω2(x) ) V⊤ ↦ U e^{Ω1(x)} E0 e^{−Ω2(x)} V⊤.    (33)

Obviously, for fixed x the mapping πµ maps straight lines in T^aff_{U E0 V⊤} E through U E0 V⊤, such as

    l_{U E0 V⊤}: t ↦ U ( E0 + tΩ1(x)E0 − E0 tΩ2(x) ) V⊤,    (34)

to smooth curves πµ( l_{U E0 V⊤}(t) ) ⊂ E. As mentioned above, the resulting curves on E, namely

    πµ( l_{U E0 V⊤}(t) ) = U e^{tΩ1(x)} E0 e^{−tΩ2(x)} V⊤ = U e^{Ω1(tx)} E0 e^{−Ω2(tx)} V⊤,    (35)

are geodesics on E with respect to the so-called normal Riemannian metric on E. One can therefore think of the projection πµ defined by (33) as a Riemannian one. Moreover, the parameterisation µ given by (19) defines a so-called Riemannian normal coordinate chart µ⁻¹, sending a suitably chosen open neighborhood of E ∈ E diffeomorphically onto an open neighborhood of the origin of R⁵.

H.3 Cayley-like projection

As a further alternative one might approximate the matrix exponential of skew-symmetric matrices by its first-order diagonal Padé approximant, more commonly called the Cayley transformation:

    cay: so3 → SO3,   Ω ↦ ( I + (1/2)Ω ) ( I − (1/2)Ω )⁻¹.    (36)

The Cayley mapping on so3 is well known to be a local diffeomorphism around 0 ∈ so3. Moreover, it approximates the exponential mapping exp: so3 → SO3, Ω ↦ e^Ω, up to second order. We therefore also consider the smooth projection mapping

    πcay: T^aff E → E,   U (E0 + Ω1E0 − E0Ω2) V⊤ ↦ U cay(Ω1) E0 cay(−Ω2) V⊤.    (37)
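The projections πµ and πcay in code, as a sketch reusing Omega1, Omega2 and E0 from the parameterisation sketch above:

    import numpy as np
    from scipy.linalg import expm

    def cay(Omega):
        # eq. (36): first-order diagonal Pade approximant of the matrix exponential
        I = np.eye(3)
        return (I + 0.5 * Omega) @ np.linalg.inv(I - 0.5 * Omega)

    def pi_mu(U, V, x):
        # eq. (33)
        return U @ expm(Omega1(x)) @ E0 @ expm(-Omega2(x)) @ V.T

    def pi_cay(U, V, x):
        # eq. (37)
        return U @ cay(Omega1(x)) @ E0 @ cay(-Omega2(x)) @ V.T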

III. Algorithm

A. Objective Function

The cost function we consider first in this paper is analysed to some extent in [5]. We recall here the critical points of f: E → R defined by (22).

Lemma III.1: Let

    f: E → R,   f(E) = (1/2) Σ_{i=1}^{n} tr²(M^(i) E).    (38)

The element E = U E0 V⊤ ∈ E is a critical point of f if and only if, for all Ψ1, Ψ2 ∈ so3,

    Σ_{i=1}^{n} tr(M^(i) E) tr( M^(i) U (Ψ1E0 − E0Ψ2) V⊤ ) = 0.    (39)
□

B. Algorithm

We consider the algorithm as the self map

    s = π2 ∘ π1: E → E    (40)

consisting of an optimisation step followed by a projection. Here

    π1: E → R^{3×3},   E = U E0 V⊤ ↦ U ( E0 + Ω1(x)E0 − E0Ω2(x) ) V⊤,    (41)

where x ∈ R⁵, as a function of E, solves the problem of minimizing the 2-jet of the objective function f, i.e.,

    x = arg min_{y ∈ N(0)} j0^(2)(f ∘ µ)(y),    (42)

with µ(0) = E. The second mapping π2 denotes a projection

    π2: R^{3×3} → E,   X ↦ proj(X),    (43)

where proj is chosen to be one of the projections discussed in Section II-H. One algorithmic step of s therefore consists of two partial steps: π1 sends a point E on the essential manifold E to an element of the affine tangent space T_E^aff E, and π2 projects that element back onto E.
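Schematically, one step of s then reads as follows; this sketch reuses grad_and_gauss_hessian and the projections coded above, and a singular Ĥ would of course need special handling in practice:

    import numpy as np

    def algorithm_step(U, V, Ms, project=pi_mu):
        # pi_1: minimize the 2-jet, eq. (42), here via the Gauss step
        g, H_hat = grad_and_gauss_hessian(U, V, Ms)
        x_opt = -np.linalg.solve(H_hat, g)
        # pi_2: project back onto E; for pi_svd one would instead form
        # X = U(E0 + Omega1(x)E0 - E0 Omega2(x))V^T and return pi_svd(X)
        return project(U, V, x_opt)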


C. Fixed Points of the Algorithm

The following theorem holds.

Theorem III.1: Let π2 be either πcay or πSVD, but not πµ. The only fixed points of the corresponding algorithm s = π2 ∘ π1 are those elements of E which are minima of the objective function f: E → R. □

Unfortunately, the situation is more involved if the projection we use is πµ. Consider the following example. For arbitrary U, V ∈ SO3 let E = U E0 V⊤ and suppose

    xopt = √2 [0 π 0 0 π]⊤,    (44)

so that

    Ω1(xopt) = Ω2(xopt) = [  0  0  π ]
                          [  0  0  0 ] .    (45)
                          [ −π  0  0 ]

Then

    e^{Ω1(xopt)} = e^{Ω2(xopt)} = [ −1  0   0 ]
                                  [  0  1   0 ] ,
                                  [  0  0  −1 ]

and

    π1(E) = U [  1  0  −π ]
              [  0  1   0 ] V⊤ ≠ E,    (46)
              [ −π  0   0 ]

but

    πµ(π1(E)) = U E0 V⊤ = E.    (47)
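The example is easily verified numerically, e.g. with the following sketch (assuming NumPy/SciPy):

    import numpy as np
    from scipy.linalg import expm

    E0 = np.diag([1.0, 1.0, 0.0])
    Om = np.array([[0.0, 0.0, np.pi],
                   [0.0, 0.0, 0.0],
                   [-np.pi, 0.0, 0.0]])       # Omega1(x_opt) = Omega2(x_opt), eq. (45)

    X = E0 + Om @ E0 - E0 @ Om                # pi_1(E) for U = V = I, eq. (46)
    print(np.allclose(X, E0))                 # False: the optimisation step leaves E
    print(np.allclose(expm(Om) @ E0 @ expm(-Om), E0))   # True: pi_mu maps it back, eq. (47)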

One can easily find further examples; the reason such examples exist is that the injectivity radius of the exponential map exp: so3 → SO3 is finite, namely equal to π.

IV. Smoothness Properties of the Algorithm

A. Smoothness of the optimisation step π1

This is obvious under the assumption that the Hessian of f ∘ µ is everywhere invertible. Indeed, under this assumption the linear system to be solved in each optimisation step has a unique solution.

B. Smoothness of the projections πSVD, πµ and πcay

Theorem IV.1: The projections πµ and πcay are smooth mappings.

Proof: This is obvious. □

Theorem IV.2: Let

    U := { X ∈ R^{3×3} | the smallest singular value of X is simple }.    (48)

Then U ⊂ R^{3×3} is an open subset and the projection

    πSVD: U → E,   X ↦ X̂

is smooth.

Proof: Consider

    M := { (U, Σ, V) ∈ O3 × R^{3×3} × O3 | Σ = [ α β 0; β γ 0; 0 0 δ ],  2|δ| < α + γ − √((α − γ)² + 4β²) }.

Note that

    σmin( [ α β; β γ ] ) = (1/2) ( α + γ − √((α − γ)² + 4β²) ).    (49)

Thus the condition on δ implies that δ is the eigenvalue of Σ with smallest absolute value. Let

    M0 := { (U, E0, V) ∈ O3 × R^{3×3} × O3 | E0 = [ I2 0; 0 0 ] }

and

    Γ := { S = [ ∆ 0; 0 1 ] ∈ R^{3×3} | ∆ ∈ O2 }.

Then

    σ: Γ × M → M,   (S, (U, Σ, V)) ↦ (US, S⊤ΣS, VS)

defines a smooth, proper Lie group action with smooth orbit space M/Γ, and the quotient map

    P: M → U,   (U, Σ, V) ↦ UΣV⊤,

is a principal fibre bundle with structure group Γ. Obviously, σ leaves M0 invariant and therefore restricts to a smooth quotient map

    P: M0 → E,   (U, E0, V) ↦ U E0 V⊤.

Moreover, the projection map F: M → M0, F(U, Σ, V) = (U, E0, V), is smooth and the diagram

              F
        M ────────→ M0
        │            │
       P│            │P        (50)
        ↓            ↓
        U ────────→ E
             πSVD

is commutative. By standard arguments this implies that πSVD is smooth. □

V. Convergence Analysis of the Algorithm

Let E∗ denote a fixed point of s = π2 ∘ π1, i.e., E∗ is a minimum of the function f. We will compute the first derivative of s at this fixed point. By the chain rule and the fact that π1(E∗) = π2(E∗) = E∗, we have for all tangent elements ξ ∈ T_{E∗} E

    D s(E∗) · ξ = D π2(E∗) · D π1(E∗) · ξ.    (51)

Considering s in local coordinates amounts to studying the self map

    µ⁻¹ ∘ s ∘ µ: R⁵ → R⁵.    (52)

Therefore, rewriting (51) in terms of the parameterisation defined by

    µ: R⁵ ⊃ N(0) → E,   y ↦ U∗ e^{Ω1(y)} E0 e^{−Ω2(y)} V∗⊤,    (53)

with

    µ(0) = E∗ = U∗ E0 V∗⊤,    (54)

and Ω1 and Ω2 as in (17) and (18), respectively, we get

    D(µ⁻¹ ∘ s ∘ µ)(0) · h = D µ⁻¹(E∗) · D s(E∗) · D µ(0) · h = (D µ(0))⁻¹ · D π2(E∗) · D π1(E∗) · D µ(0) · h.    (55)

Now

    π1 ∘ µ: R⁵ ⊃ N(0) → R^{3×3},   y ↦ U∗ e^{Ω1(y)} ( E0 + Ω1(xopt(y))E0 − E0Ω2(xopt(y)) ) e^{−Ω2(y)} V∗⊤,    (56)

where

    xopt: R⁵ → R⁵,   y ↦ arg min_{z ∈ R⁵} j0^(2) ϕ(z, y),    (57)

and

    ϕ: Ñ(0) × N(0) → R,   ϕ(x, y) = f( U∗ e^{Ω1(y)} e^{Ω1(x)} E0 e^{−Ω2(x)} e^{−Ω2(y)} V∗⊤ ).    (58)

Here Ñ(0) ⊂ N(0) is a suitably chosen open neighborhood of zero, and the 2-jet of ϕ in (57) is understood to be the one with respect to the first argument of ϕ. Exploiting linearity of the mappings Ω1 and Ω2 and using the well-known formula for differentiating the matrix exponential, we compute the first derivative of π1 in local coordinates as

    D(π1 ∘ µ)(0) · h = U∗ ( Ω1(h)E0 − E0Ω2(h) ) V∗⊤ + U∗ ( Ω1(D xopt(0) · h)E0 − E0Ω2(D xopt(0) · h) ) V∗⊤
                     = U∗ ( Ω1(h + D xopt(0) · h)E0 − E0Ω2(h + D xopt(0) · h) ) V∗⊤.    (59)

We need an expression for D xopt(0) · h. Define

    ψ: N(0) × N(0) × R⁵ → R,   ψ(x, y, k) = D1 ϕ(x, y) · k,    (60)

i.e., for all k ∈ R⁵ the following holds true:

    ψ(xopt(y), y, k) = 0,    (61)

and similarly

    ψ(xopt(0), 0, k) = ψ(0, 0, k) = 0.    (62)

Taking the derivative of (61) with respect to the variable y, evaluated at zero and acting on an arbitrary h ∈ R⁵, gives

    D1 ψ(xopt(y), y, k) · D xopt(y) · h |_{y=0} + D2 ψ(xopt(y), y, k) · h |_{y=0} = 0.    (63)

The linear system (63) has a unique solution in terms of D xopt(y) · h |_{y=0} because the linear mapping

    D1 ψ(xopt(y), y, k) |_{y=0}    (64)

is invertible. The reason is simply that the expression (64) equals the Hessian of f ∘ µ evaluated at the point 0, which is invertible by assumption. Therefore, applying the Implicit Function Theorem to ψ(x, y, k) implies not only smoothness of xopt, but also yields the explicit expression

    D xopt(0) · h = − ( D1 ψ(0, 0, k) )⁻¹ · D2 ψ(0, 0, k) · h = −h.    (65)

Plugging (65) into (59) gives the result

    D(π1 ∘ µ)(0) · h = 0.    (66)

We can therefore state the main mathematical result of this paper.

Theorem V.1: If the algorithm s converges to the fixed point E∗, then it converges locally quadratically fast to E∗.

Proof: Plugging (66) into (55) shows that for all h ∈ R⁵

    D(µ⁻¹ ∘ s ∘ µ)(0) · h = (D µ(0))⁻¹ · D π2(E∗) · D π1(E∗) · D µ(0) · h = 0,    (67)

irrespective of which projection π2 we use. Let (E^(k)) denote the sequence of essential matrices generated by the algorithm, and let x^(k) = µ⁻¹(E^(k)) denote the corresponding elements in R⁵. For sufficiently large j we may assume that for all k ≥ j the iterates x^(k) stay in a sufficiently small neighborhood of the origin in R⁵. Vanishing of the first derivative then implies local quadratic convergence by the Taylor-type argument

    ‖(µ⁻¹ ∘ s ∘ µ)(x^(k))‖ ≤ sup_{y ∈ N(0)} ‖D²(µ⁻¹ ∘ s ∘ µ)(y)‖ · ‖x^(k)‖².    (68)
□

A. Discussion

A few remarks are in order here. Our proof of quadratic convergence was essentially independent of the chosen cost function. A more detailed exploitation of this fact is under consideration and will be published elsewhere. If πµ is used for the second algorithmic step π2, then one can show that the overall algorithm is nothing other than a Riemannian manifold version of Newton's method, the Riemannian metric being the so-called normal one. Although it is well known that, under mild assumptions, the Riemannian manifold version of Newton's method is locally quadratically convergent (see [7], Theorem 3.4, p. 57), our results are more than just an application of that result. We would also like to mention that the latter version of our algorithm differs from the approach taken in [5]. The Riemannian metric those authors use is different, and therefore their geodesics do not coincide with ours. Whereas [5] exploits the local structure of the essential manifold as a product of Stiefel manifolds, we prefer to think of this manifold as an orbit of SO3 × SO3 acting on R^{3×3} by equivalence, i.e., as the manifold of all (3 × 3)-matrices with set of singular values equal to {1, 1, 0}. Some features of these different approaches are summarised as follows.

A.1 Manifold structure

Ma et al., [5]: The essential manifold E is locally diffeomorphic to the product of two Stiefel manifolds,

    E ≅_loc S² × SO3.    (69)

Our approach: We exploit the global diffeomorphism of E to the set of matrices having singular values {1, 1, 0},

    E ≅ SO3 · [ 1 0 0 ]
              [ 0 1 0 ] · SO3.    (70)
              [ 0 0 0 ]

A.2 Geodesics emanating from E = ΩΘ = U E0 V⊤ ∈ E

Ma et al.:

    t ↦ ( e^{∆t} Ω e^{−∆t}, Θ e^{Γt} ),    (71)

where ∆, Γ ∈ so3 and [∆, [∆, Ω]] = −(1/2)‖∆‖²Ω.

Our approach:

    t ↦ U e^{∆t} E0 e^{−Γt} V⊤,    (72)

where

    ∆ = [ 0    −x3   x2 ]           Γ = [ 0     x3   x5 ]
        [ x3    0   −x1 ]   and         [ −x3   0   −x4 ] ,   x1, …, x5 ∈ R.
        [ −x2   x1   0  ]               [ −x5   x4   0  ]

A.3 Riemannian metric g: T_E E × T_E E → R

Ma et al.: The Euclidean one, induced by the canonical submanifold structure of each factor,

    S² ⊂ R³   and   SO3 ⊂ R^{3×3},    (73)

or equivalently, the normal one induced by the similarity group action on the first factor,

    SO3 × S² → S²,   (U, Ω) ↦ UΩU⊤,    (74)

and right translation on the second factor,

    SO3 × SO3 → SO3,   (V, Θ) ↦ ΘV⊤.    (75)

Explicitly, for two elements ξ1, ξ2 ∈ T_{(Ω,Θ)} E of the tangent space, with ξᵢ = ([∆ᵢ, Ω], ΘΓᵢ),

    g( ([∆1, Ω], ΘΓ1), ([∆2, Ω], ΘΓ2) ) = tr(∆1⊤∆2) + tr(Γ1⊤Γ2),    (76)

with ∆ᵢ, Γᵢ ∈ so3 and [∆ᵢ, [∆ᵢ, Ω]] = −(1/2)‖∆ᵢ‖²Ω for i = 1, 2.

Our approach: The normal one, induced by the equivalence group action

    SO3 × SO3 × R^{3×3} → R^{3×3},   ((U, V), E) ↦ U E V⊤.    (77)

Explicitly, for two elements ξ1, ξ2 ∈ T_{U E0 V⊤} E of the tangent space, with ξᵢ = U(∆ᵢE0 − E0Γᵢ)V⊤,

    g( U(∆1E0 − E0Γ1)V⊤, U(∆2E0 − E0Γ2)V⊤ ) = tr(∆1⊤∆2) + tr(Γ1⊤Γ2),    (78)

where, for i = 1, 2,

    ∆ᵢ = [ 0        −x3^(i)   x2^(i) ]           Γᵢ = [ 0         x3^(i)   x5^(i) ]
         [ x3^(i)    0       −x1^(i) ]   and          [ −x3^(i)   0       −x4^(i) ] ,   x1^(i), …, x5^(i) ∈ R.
         [ −x2^(i)   x1^(i)   0      ]                [ −x5^(i)   x4^(i)   0      ]

In fact, the tangent map of µ defined by (19) maps frames {e1, …, e5} in R⁵, orthonormal with respect to the Euclidean metric, into frames of T_E E, orthonormal with respect to the normal Riemannian metric:

    eᵢ ↦ D µ(0) · eᵢ = U ( Ω1(eᵢ)E0 − E0Ω2(eᵢ) ) V⊤ =: U ξᵢ V⊤,    (79)

with

    ⟨ U ξᵢ V⊤, U ξⱼ V⊤ ⟩_{n.RM} = tr( Ω1(eᵢ)⊤ Ω1(eⱼ) ) + tr( Ω2(eᵢ)⊤ Ω2(eⱼ) ) = eᵢ⊤ eⱼ = δᵢⱼ.    (80)

One might suspect that the Riemannian metric we use is induced by restriction from a Riemannian metric defined on the embedding space R^{3×3}. This is actually not the case; moreover, one can show that such a metric on R^{3×3} does not exist.
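Equation (80) can be checked mechanically with the Omega1, Omega2 helpers from the parameterisation sketch:

    import numpy as np

    G = np.array([[np.trace(Omega1(ei).T @ Omega1(ej)) + np.trace(Omega2(ei).T @ Omega2(ej))
                   for ej in np.eye(5)] for ei in np.eye(5)])
    print(np.allclose(G, np.eye(5)))   # True: the frame {xi_1, ..., xi_5} is orthonormal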

VI. Implementation of Algorithms

Start with an initial estimate of the essential matrix E = U E0 V⊤ obtained from the standard 8-point algorithm.

Step 1. Carry out the optimization step π1:
◦ Compute the gradient ∇f(µ(0)) and the Hessian H_{f(µ(0))}.
◦ If H_{f(µ(0))} > 0, compute the Newton step xopt = −H_{f(µ(0))}⁻¹ ∇f(µ(0)); otherwise compute the Gauss step xopt = −Ĥ_{f(µ(0))}⁻¹ ∇f(µ(0)).

Step 2. Carry out the projection step π2. There are three alternative projections:
◦ πSVD: Let xopt = [x1 x2 … x5]⊤ and form the optimal affine tangent vector ξopt ∈ T_E^aff E,

    ξopt = U [ 1       −x3      −x5/√2 ]
             [ x3       1        x4/√2 ] V⊤ = Û [ σ1 0 0; 0 σ2 0; 0 0 σ3 ] V̂⊤,
             [ −x2/√2   x1/√2    0     ]

and compute the projected estimate of the essential matrix Ê = Û E0 V̂⊤.
◦ πµ: Û = U e^{Ω1(xopt)}, V̂ = V e^{Ω2(xopt)}, Ê = Û E0 V̂⊤.
◦ πcay: Û = U cay(Ω1(xopt)), V̂ = V cay(Ω2(xopt)), Ê = Û E0 V̂⊤.

Step 3. Set E = Ê, U = Û, V = V̂, and go back to Step 1 if ‖∇f(µ(0))‖ > ε, a prescribed accuracy.
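Putting the pieces together, the following sketch implements the loop above with the projection πµ, assuming NumPy/SciPy and the helpers Omega1, Omega2 and grad_and_gauss_hessian from the earlier sketches; the 8-point initializer and the positivity test on the full Hessian are outside this sketch, so it always takes the Gauss step:

    import numpy as np
    from scipy.linalg import expm

    def estimate_essential(U, V, Ms, eps=1e-12, max_iter=50):
        E0 = np.diag([1.0, 1.0, 0.0])
        for _ in range(max_iter):
            g, H_hat = grad_and_gauss_hessian(U, V, Ms)   # Step 1
            if np.linalg.norm(g) <= eps:                  # Step 3 stopping rule
                break
            x = -np.linalg.solve(H_hat, g)                # Gauss step
            U = U @ expm(Omega1(x))                       # Step 2: projection pi_mu
            V = V @ expm(Omega2(x))
        return U @ E0 @ V.T

With noise-free correspondences one should observe the number of correct digits in E roughly doubling per iteration, consistent with Theorem V.1.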

VII. Acknowledgment

The authors thank Jochen Trumpf, National ICT Australia, Canberra, for many fruitful discussions. The first two authors were partially supported by DAAD PPP Australia-Germany under grant D/0243869. The last two authors were partially supported by the Australian-German grant 49020-17. National ICT Australia is funded by the Australian Department of Communications, Information Technology and the Arts and the Australian Research Council through Backing Australia's Ability and the ICT Centre of Excellence Program.

References

[1] R. Hartley and A. Zisserman. Multiple View Geometry. Cambridge Univ. Press, Cambridge, 2000.
[2] U. Helmke and J.B. Moore. Optimization and Dynamical Systems. CCES. Springer, London, 1994.
[3] K. Kanatani. Statistical Optimization for Geometric Computation: Theory and Practice. Elsevier, Amsterdam, 1996.
[4] Q.-T. Luong and O.D. Faugeras. The fundamental matrix: theory, algorithms and stability analysis. Int. J. of Computer Vision, 17(1):43-75, 1996.
[5] Y. Ma, J. Košecká, and S. Sastry. Optimization criteria and geometric algorithms for motion and structure estimation. Int. J. of Computer Vision, 44(3):219-249, 2001.
[6] R. Mahony. Optimization Algorithms on Homogeneous Spaces. PhD thesis, Australian National University, Canberra, March 1994.
[7] S.T. Smith. Geometric Optimization Methods for Adaptive Filtering. PhD thesis, Harvard University, Cambridge, May 1993.
