Four points in two or three calibrated views: theory and practice

David Nistér †

Sarnoff Corporation, CN5300, Princeton, NJ 08530, USA ([email protected])

Frederik Schaffalitzky §

Robotics Research Group, University of Oxford, UK ([email protected])

Abstract. Suppose two perspective views of four world points are given and that the intrinsic parameters are known but the camera poses and the world point positions are not. We prove that the epipole in each view is then constrained to lie on a curve of degree ten. We derive the equation for the curve and establish many of the curve's properties. For example, we show that the curve has four branches through each of the image points and that it has four additional points on each conic of the pencil of conics through the four image points. We show how to compute the four curve points on each conic in closed form. We show that orientation constraints allow only parts of the curve and find that there are impossible configurations of four corresponding point pairs. We give a novel algorithm that solves for the essential matrix given three corresponding points and one of the epipoles. We then use the theory to create the most efficient solution yet to the notoriously difficult problem of solving for the pose of three views given four corresponding points. The solution is a search over a one-dimensional parameter domain, where each point in the search can be evaluated in closed form. The intended use for the solution is in a hypothesise-and-test architecture to solve for structure and motion.

† Now at the Center for Visualization and Virtual Environments, Computer Science Department, University of Kentucky.
§ Formerly at the Research School of Information Sciences and Engineering, Australian National University.

1. Introduction

Solving for unknown camera locations and scene structure given multiple views of a scene has been a central task in computer vision for several decades and in photogrammetry for almost two centuries. If the intrinsic parameters, such as the focal lengths of the cameras, are known a priori, the cameras are said to be calibrated. In the calibrated case, it is possible to determine up to ten solutions for the relative pose between two views, given five corresponding points (Nistér, 2003a; Faugeras, 1993). In the uncalibrated case, at least seven

Prepared through collaborative participation in the Robotics Consortium sponsored by the U. S. Army Research Laboratory under the Collaborative Technology Alliance Program, Cooperative Agreement DAAD19-01-2-0012. The U. S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.

© 2005 Kluwer Academic Publishers. Printed in the Netherlands.

final.tex; 15/08/2005; 15:02; p.1


corresponding points are required to obtain up to three solutions for the fundamental matrix, which is the uncalibrated equivalent of relative pose (Hartley and Zisserman, 2000). We will find it useful to characterise the solutions in terms of their epipoles, that is, the image in one view of the perspective center of the other view or, equivalently, the image of the translation direction.

If we have one point correspondence fewer than the minimum, we can expect to get a continuum of solutions. In the uncalibrated case it is well known (Maybank, 1993) that six point correspondences give rise to a cubic curve of possible epipoles. However, to the best of our knowledge, the case of four point correspondences between two calibrated views has not been studied previously. We will show that four point correspondences between two calibrated views constrain the epipole in each image to lie on a decic (degree ten) curve. Moreover, if we disregard orientation constraints, each point on the decic curve is a possible epipole. The decic curve varies with the configuration of the points and cameras and can take on a wide variety of beautiful and intriguing shapes. Some examples are shown in Figure 1.

We apply the theory to create an efficient solution to the 3-view 4-point perspective pose problem (3v4p problem for short), which amounts to finding the relative poses of three calibrated perspective cameras given four corresponding point triplets. The 3v4p problem is notoriously difficult to solve, but has a unique solution in general (Holt and Netravali, 1995; Quan et al., 2003b; Quan et al., 2003a). It is in fact overconstrained by one, meaning that in general four point triplets cannot be realised as the three calibrated images of four common world points. Adjustment methods typically fail to solve the 3v4p problem and no practical numerical solution is known. Our theory leads to an efficient solution to the 3v4p problem that is based on a one-dimensional exhaustive search.
The search procedure evaluates the points on the decic curve arising from two of the three views. Each point can be evaluated and checked for three-view consistency with closed form calculations, leading to an overall computational cost for solving the 3v4p problem of the order of milliseconds. Reminiscent of (Schaffalitzky et al., 2000), the solution minimises an image based error, which is highly desirable. The algorithm can also be used to determine if four image point triplets are realisable as the three calibrated images of four common world points.

Many more point correspondences than the minimal number are needed to obtain reliable solutions for structure and motion. The intended use for our 3v4p solution is as a hypothesis generator in a hypothesise-and-test architecture such as (Nistér, 2003a; Nistér, 2003b; Fischler and Bolles, 1981). Many samples of four corresponding point triplets are taken and the solutions are scored based on their support among the whole set of observations.

The rest of the paper is organised as follows. In Section 2, we establish some notation and highlight some known results. In Section 3, we describe the geometric construction that serves as the basis for the main discoveries of the paper. In Sections 4 and 5, we work out the consequences of the geometric construction. In Section 6, we derive an algebraic expression for the decic curve. In Section 7 we establish some properties of the variety of possible epipoles and in Section 8 we state and prove our main result. In Section 9, we investigate some of the implications of the orientation constraints. In Section 11 a link to the uncalibrated 5-point case is established. In Section 12 the 3v4p algorithm is given. Section 14 concludes.

Figure 1. Some examples of decic curves of possible epipoles given four points in two calibrated images.

2. Preliminaries

We broadly follow the notational conventions in (Hartley and Zisserman, 2000; Semple and Kneebone, 1952). Image points are represented by homogeneous 3-vectors $x$. World points are represented by homogeneous 4-vectors $X$. Plane conics are represented by $3 \times 3$ symmetric matrices and we often refer to such a matrix as a conic. The symbol $\sim$ denotes equality up to scale. We will use the notation $A^*$ to denote the adjugate matrix of $A$, which is the transpose of the cofactor matrix of $A$. We will use $|A|$ to denote the determinant and $\mathrm{tr}(A)$ to denote the trace of the matrix $A$. We assume the reader has some background in multi-view geometry and is familiar with concepts such as camera matrices, the absolute conic, the image of the absolute conic (IAC) under a camera projection and its dual (the DIAC).

When discussing more than one view, we generally use prime notation to indicate quantities that are related to the second image; for example $x$ and $x'$ might be corresponding image points in the first and second view, respectively. Similarly, we use $e$ and $e'$ to denote the two epipoles and $\omega$ and $\omega'$ to denote the IACs in two views. Given $n$ corresponding image points $x_i \leftrightarrow x'_i$ in two views, the epipolar constraint (Maybank, 1993; Faugeras, 1993) is expressed in:

THEOREM 1. The projective parameters of the $n$ rays joining $e$ to $x_i$ are homographically related to the projective parameters of the $n$ rays joining $e'$ to $x'_i$.

The situation is illustrated in Figure 2. The condition asserts the existence of a 1D homography that relates the pencil of lines through $e$ to the pencil of lines through $e'$. This homography is called the epipolar line homography.

Figure 2. Illustration of theorem 1. The pencils of rays from the epipoles to the image points are homographically related.

The epipolar constraint relates corresponding image points. For the pair $\omega$, $\omega'$ of corresponding conics we have the Kruppa constraints:

THEOREM 2. The two tangents from $e$ to the IAC $\omega$ are related by the epipolar line homography to the two tangents from $e'$ to the IAC $\omega'$.

The constraint is illustrated in Figure 3. Figure 4 illustrates the epipolar and Kruppa constraints in the same figure.


The algebraic constraints treat the pre-image of an image point as an infinite line extending both backwards and forwards from the projection centre. Hence the projective constraint allows the world point $X$ to be on the backward part of the line. However, the image rays are in reality half-lines extending only in the forward direction, because only points in front of a camera can be imaged. Moreover, unless our images might have been mirrored, we typically know which direction is forward. The constraint that $X$ should be on the forward part is referred to as the orientation constraint. The orientation constraints imply that the epipolar line homography is oriented and thus preserves the orientation of the rays in theorem 1. See (Werner, 2003) for more details.

Figure 3. Illustration of theorem 2, usually called the Kruppa constraints. The epipolar tangents to the images of the absolute conics are related by the epipolar line homography.

Figure 4. Epipolar and Kruppa constraints in one diagram.

3. The Geometric Construction

Assume that we have two perspective views of four common but unknown world points and that the intrinsic parameters of the cameras are known but the poses unknown. Let the image correspondences be $x_i \leftrightarrow x'_i$. In general, no three out of the four image points in either image are collinear and we shall henceforth exclude the collinear case from further consideration.


Accordingly, we may then choose (Semple and Kneebone, 1952) projective coordinates in each image such that the four points have the same coordinates in both images. In other words, we may assume that $x_i = x'_i$, and we shall do this henceforth, thinking of both image planes as co-registered into one coordinate system. It then follows from Steiner's and Chasles's theorems (Semple and Kneebone, 1952) that the constraint from theorem 1 can be converted into:

THEOREM 3. The epipoles $e$, $e'$ and the four image points must lie on a conic $B$. Conversely, two epipoles $e$, $e'$ that are conconic with the four image points can satisfy the epipolar constraint.

An illustration is given in Figure 5. This conic will be important in what follows and the reader should take note of it now. When $e$ and $e'$ are conconic with the four image points, there is a unique epipolar line homography that makes the four lines through $e$ correspond to the four lines through $e'$. One way to appreciate $B$ is to note that we can parameterize the pencil of lines through $e$ (or $e'$) by the points of $B$ and that corresponding lines of the two pencils meet the conic $B$ in the same point.

Figure 5. Illustration of theorem 3. When the two image planes are co-registered with the homography relating the four image points, the epipoles are conconic with the four image points.

Armed with this observation we can translate the Kruppa constraints into:

THEOREM 4. The calibration constraints are equivalent to the condition that the two lines tangent to $\omega$ that pass through $e$ intersect $B$ in the same two additional points as the two lines tangent to $\omega'$ through $e'$.

This geometric construction will serve as a foundation for the rest of our development. The situation is depicted in Figure 6. Loosely speaking, the two projections (through the epipoles) of the IACs onto $B$ must coincide.
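The co-registration step and theorem 3 can be illustrated numerically. The following is a minimal sketch, not part of the original development: it assumes a synthetic scene with cameras $[I \mid 0]$ and $[R \mid t]$ in normalised coordinates, and the helper names (`projective_basis`, `homography_from_4pts`, `pencil_basis`, `conic_through`) are hypothetical. It maps the second image onto the first with the homography fixed by the four points, builds the conic of the pencil that passes through $e$, and evaluates how far the co-registered $e'$ is from that conic.

```python
import numpy as np

def projective_basis(p):
    """3x3 matrix sending the standard projective frame to the points p[0..3] (rows)."""
    A = p[:3].T                          # columns are the first three points
    coeffs = np.linalg.solve(A, p[3])    # p[3] = A @ coeffs
    return A * coeffs                    # rescale each column

def homography_from_4pts(src, dst):
    """Homography H with H @ src[i] ~ dst[i]; assumes no three points are collinear."""
    return projective_basis(dst) @ np.linalg.inv(projective_basis(src))

def line_pair(l, m):
    """Degenerate conic consisting of the two lines l and m."""
    return np.outer(l, m) + np.outer(m, l)

def pencil_basis(x):
    """Two line-pair conics spanning the pencil through the four points x[0..3]."""
    B1 = line_pair(np.cross(x[0], x[1]), np.cross(x[2], x[3]))
    B2 = line_pair(np.cross(x[0], x[2]), np.cross(x[1], x[3]))
    return B1, B2

def conic_through(e, B1, B2):
    """Member of the pencil spanned by B1, B2 that passes through e."""
    return (e @ B2 @ e) * B1 - (e @ B1 @ e) * B2

# Synthetic calibrated two-view setup (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (4, 3)) + np.array([0.0, 0.0, 5.0])  # world points
a = 0.3
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.2, 0.1])

x = X                 # view 1: camera [I | 0], homogeneous image points
xp = X @ R.T + t      # view 2: camera [R | t]
e = -R.T @ t          # image of the second projection centre in view 1
ep = t                # image of the first projection centre in view 2

H = homography_from_4pts(xp, x)   # co-register view 2 onto view 1
ep_reg = H @ ep

B = conic_through(e, *pencil_basis(x))
residual = (ep_reg @ B @ ep_reg) / (np.linalg.norm(B) * (ep_reg @ ep_reg))
print("normalised conconic residual:", residual)
```

The residual is normalised by the scale of $B$ and of the co-registered epipole, so that it is meaningful as a relative quantity; for noise-free data it should vanish up to rounding.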


Figure 6. The geometric construction corresponding to theorem 4. The images of the IACs $\omega$, $\omega'$ made by projecting through the epipoles and onto $B$ have to coincide. This construction is the basis for the rest of our development.

4. Projection onto the Conic

To make progress from theorem 4 we will work out how to perform the projection of an IAC $\omega$ onto a conic $B = B(e)$ that is determined by an epipole $e$ and the four image points. One can think of the projection as being defined by the two points where the tangents to the IAC from an epipole meet $B$. But the two tangents do not come in any particular order, which is a nuisance. To avoid this complication we use the line joining the two intersection points on $B$ as our representation. This is accomplished by:

THEOREM 5. The projection of the IAC $\omega$ onto the proper conic $B$ through the epipole $e$ is given by the intersections between the line $(\omega \diamond B)e$ and $B$, where we define the conic

\[ (\omega \diamond B) \equiv 2 B \omega^* B - \mathrm{tr}(\omega^* B)\, B. \tag{1} \]

Proof:¹ Using the properties of trace, it can be verified that Equation (1) is a projectively invariant formula for a conic. Therefore, if the theorem holds for one projective coordinate system, it holds for all. So we may choose (Semple and Kneebone, 1952) homogeneous coordinates $x, y, z$ such that $B$ is the conic $xz - y^2 = 0$, which has matrix

\[ B = \begin{pmatrix} 0 & 0 & 1 \\ 0 & -2 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]

and which can be parameterized by $(1, \theta, \theta^2)^\top$ where $\theta$ is a scalar. If we let $\theta$ correspond to $e$ and let $\lambda$ parameterise an additional point $p$ on $B$, the line through the two points defined by $\theta$ and $\lambda$ is then given by the cross product

\[ l = l(\theta, \lambda) = (1, \theta, \theta^2)^\top \times (1, \lambda, \lambda^2)^\top = (\lambda - \theta)\,(\theta\lambda,\ -(\theta + \lambda),\ 1)^\top \]

and we can ignore the scale factor $(\lambda - \theta)$. The condition for this line to be tangent to the IAC $\omega$ is that $l^\top \omega^* l = 0$. If we simply expand this and write $\omega^* = C = (C_{ij})$, we arrive at

\[ (1, \theta, \theta^2) \begin{pmatrix} C_{33} & -2C_{23} & C_{22} \\ -2C_{23} & 2(C_{13} + C_{22}) & -2C_{12} \\ C_{22} & -2C_{12} & C_{11} \end{pmatrix} (1, \lambda, \lambda^2)^\top = 0. \]

By inspection, the $3 \times 3$ matrix in this expression is exactly one half of $(\omega \diamond B)$ (using again the special form of $B$ chosen earlier). Therefore, the condition that $l = e \times p$ be tangent to $\omega$ is equivalent to $e^\top (\omega \diamond B)\, p = 0$. Hence, the projection of $\omega$ onto $B$ through the point $e$ is defined by the intersections of the line $(\omega \diamond B)e$ with $B$. The symmetric matrix $(\omega \diamond B)$ thus represents a conic locus that has the properties stated in the theorem. The theorem follows.

¹ This theorem is a stronger version of theorems 2 and 3, pages 179-180 in (Semple and Kneebone, 1952), which do not give a formula for $(\omega \diamond B)$.

Note that the line $(\omega \diamond B)e$ is the polar line of $e$ with respect to $(\omega \diamond B)$. We are using the pole-polar relationship defined by the conic locus of $(\omega \diamond B)$ to perform the projection. It is possible, though not necessary for our purposes, to relate the conic $(\omega \diamond B)$ to classical terminology (Semple and Kneebone, 1952) by saying that $\omega$ is the harmonic envelope of $B$ and $(\omega \diamond B)$. A given $\omega$ defines a correspondence $B \leftrightarrow (\omega \diamond B)$ between plane conics in the sense that $\omega \diamond (\omega \diamond B) \sim B$. Note also that the $\diamond$ operator is not commutative: projecting $\omega$ onto $B$ is different from projecting $B$ onto $\omega$.

Equation (1) shows that $(\omega \diamond B)$ belongs to the pencil of conics determined by $B$ and $B\omega^* B$. In fact the conic $(\omega \diamond B)$ passes through the four points where the double tangents to $\omega$ and $B$ touch $B$. These four points lie on both $B$ and $B\omega^* B$. Moreover, the double tangents of $\omega$ and $(\omega \diamond B)$ touch $(\omega \diamond B)$ at the same four points. These properties enable a geometric construction of $(\omega \diamond B)$ from $\omega$ and $B$. The situation is illustrated in Figure 7. We combine theorems 4 and 5 to arrive at

THEOREM 6. Given that $e$ and $e'$ are conconic with the four image points, the Kruppa constraints are equivalent to the constraint that the polar lines $(\omega \diamond B)e$ and $(\omega' \diamond B)e'$ coincide.
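Theorem 5 lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses the special conic $xz - y^2 = 0$, a randomly generated positive definite matrix standing in for the DIAC $\omega^*$, and hypothetical helper names (`diamond`, `line_conic_intersections`). It intersects the polar line $(\omega \diamond B)e$ with $B$ and measures whether the lines from $e$ to the two intersection points are tangent to $\omega$. Because the IAC has no real points, the intersection points are complex in general, so the arithmetic is carried out over the complex numbers.

```python
import numpy as np

def diamond(omega_star, B):
    """(omega <> B) = 2 B omega* B - tr(omega* B) B, Equation (1)."""
    return 2.0 * B @ omega_star @ B - np.trace(omega_star @ B) * B

def line_conic_intersections(line, B):
    """The two (possibly complex) points where a real line meets the conic B."""
    _, _, Vt = np.linalg.svd(line.reshape(1, 3))
    a, b = Vt[1], Vt[2]                       # two points spanning the line
    # (a + s b)^T B (a + s b) = 0 is a quadratic in s.
    roots = np.roots([b @ B @ b, 2.0 * (a @ B @ b), a @ B @ a])
    return [a + s * b for s in roots]

# The conic xz - y^2 = 0 used in the proof, and a point e on it.
B = np.array([[0.0, 0.0, 1.0],
              [0.0, -2.0, 0.0],
              [1.0, 0.0, 0.0]])
theta = 0.7
e = np.array([1.0, theta, theta**2])

# Assumed stand-in for the DIAC: any symmetric positive definite matrix.
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
omega_star = M @ M.T + np.eye(3)

polar = diamond(omega_star, B) @ e
tangency = []
for p in line_conic_intersections(polar, B):
    l = np.cross(e, p)                        # line joining e and p
    # l is tangent to omega exactly when l^T omega* l = 0 (no conjugation).
    tangency.append(abs(l @ omega_star @ l)
                    / (np.linalg.norm(l) ** 2 * np.linalg.norm(omega_star)))
print("normalised tangency residuals:", tangency)

# The correspondence B <-> (omega <> B) is an involution up to scale.
twice = diamond(omega_star, diamond(omega_star, B))
ratio = twice[0, 2] / B[0, 2]
print("involution holds up to scale:", np.allclose(twice, ratio * B))
```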


Figure 7. The conic locus $(\omega \diamond B)$ from theorem 5. Note, in the bottom diagram, that the line $(\omega \diamond B)e$ is the polar line of $e$ with respect to $(\omega \diamond B)$. This means that we are using the pole-polar relationship defined by the conic locus $(\omega \diamond B)$ to perform the projection. Referring to the top diagram, Equation (1) shows that $(\omega \diamond B)$ belongs to the pencil of conics determined by $B$ and $B\omega^* B$. This is a manifestation of the fact that $(\omega \diamond B)$ goes through the four points where the double tangents between $\omega$ and $B$ touch $B$. These four points lie on both $B$ and $B\omega^* B$. Moreover, the double tangents between $\omega$ and $(\omega \diamond B)$ touch $(\omega \diamond B)$ at the same four points. In the bottom diagram, the line $(\omega \diamond B)e$ and its pole $(\omega \cdot B)e$ with respect to $B$ are shown (see main text for the definition of $(\omega \cdot B)$). Both can be used to represent the projection of $\omega$ onto $B$ through $e$.

5. The Four Solutions

Note that $(\omega \diamond B) = B(\omega \cdot B)$, where we define the homography²

\[ (\omega \cdot B) \equiv 2\omega^* B - \mathrm{tr}(\omega^* B)\, I. \tag{2} \]

² In the classical terminology of (Semple and Kneebone, 1952) this is strictly speaking a "collineation", but in multi-view geometry and computer vision the accepted term is "homography", so we will use that in this paper.

In view of this, we can "cancel" a $B$ in theorem 6 and arrive at:

THEOREM 7. The epipoles are related by the relation $(\omega' \cdot B)e' \sim (\omega \cdot B)e$, which implies the seventh degree mapping (remember we are implicitly thinking of $B$ as a function of $e$)

\[ e \mapsto e' \sim (\omega' \cdot B)^* (\omega \cdot B)\, e. \tag{3} \]

The Kruppa constraints single out four solutions³ for the epipole $e$ on each proper conic $B$. The solutions are the intersections of $B$ with the conic

\[ C \equiv (\omega \cdot B)^\top (\omega' \cdot B)^{*\top} B\, (\omega' \cdot B)^* (\omega \cdot B). \tag{4} \]

On the three conics $B$ of the pencil for which $|(\omega' \cdot B)| = 0$, the four solutions group into two pairs of coincident solutions.

³ Apart from the four image points.

Proof: Any two conics $B_1$, $B_2$ from the pencil can be used as a basis for the pencil and we have

\[ B(e) = (e^\top B_2 e)\, B_1 - (e^\top B_1 e)\, B_2, \tag{5} \]

so that $B(e)$ can be expressed quadratically in terms of $e$. By theorem 6, $(\omega' \diamond B)e' \sim (\omega \diamond B)e$, which for proper $B$ is equivalent to

\[ (\omega' \cdot B)e' \sim p \sim (\omega \cdot B)e \tag{6} \]

for some common point $p$. If we assume that $|(\omega' \cdot B)| \neq 0$, this is equivalent to Equation (3), which is seen to be a 7-th degree mapping: $B$ is quadratic in $e$, so $(\omega \cdot B)$ is quadratic and the adjugate $(\omega' \cdot B)^*$ is quartic in $e$. Since $e'$ has to be on $B$, i.e. $e'^\top B e' = 0$, we get that $e$ must be on the conic $C$ defined by Equation (4).

The rest of the theorem requires detailed consideration of the case when $(\omega' \cdot B)$ becomes rank 2, in which case we show that $C$ degenerates to a repeated line. Observe that with points and cameras in general position, there is no reason for $(\omega \cdot B)$ and $(\omega' \cdot B)$ to degenerate simultaneously. So we can turn things around and use Equation (6) to see that our constraints are equivalent to

\[ (\omega \cdot B)^* (\omega' \cdot B)\, e' \sim e \tag{7} \]

and that $e'$ is on the rank 2 conic

\[ (\omega' \cdot B)^\top (\omega \cdot B)^{*\top} B\, (\omega \cdot B)^* (\omega' \cdot B), \tag{8} \]

which when intersected with $B$ yields four distinct solutions for $e'$. If we again consider (6), we see that because of the degeneracy $(\omega' \cdot B)$ maps a whole line into the same $p$. Hence, if $e'$ maps into a $p$ that corresponds to a solution, there is exactly one other solution for $e'$ that maps into the same $p$, namely the other intersection of the conic $B$ with the line that maps to $p$. Thus, the solutions for $e$ group into pairs. Finally, by (6) the two solutions for $e$ are such that $(\omega \cdot B)e$ maps into $p$ and we see that

\[ (\omega' \cdot B)^* (\omega \cdot B)\, e \sim (\omega' \cdot B)^* p \sim (\omega' \cdot B)^* (\omega' \cdot B)\, e' = |(\omega' \cdot B)|\, e' = 0, \tag{9} \]

which implies that the solutions for $e$ also lie on $C$, which in this case is a repeated line.

The situation is illustrated in Figure 8.

Figure 8. As described in theorem 7, the possible epipoles according to the projective and calibration constraints can be found on proper conics $B$ through the four points as the intersections between $B$ and $C$.

If we use Equation (5) to express $B$ in terms of $e$, then $e$ will lie on $B$ by definition. Thus, $e^\top B e = 0$ is a tautology. This is not the case with $C$, however, and we get:

THEOREM 8. An epipole hypothesis $e$ that gives rise to a proper conic $B$ satisfies the 16-th degree equation

\[ e^\top C e = 0 \tag{10} \]

if and only if it satisfies the projective and calibration constraints.

An example plot of the locus of points that satisfy Equation (10) is shown in Figure 9. As indicated by the following theorem, the locus includes the six lines through all pairs of image points:

THEOREM 9. For degenerate $B$ consisting of a line-pair, the homography $(\omega \cdot B)$ interchanges the lines of the line-pair. As a result, the curve defined by Equation (10) contains the six lines through pairs of the four image points. In other words, the 16-ic expression contains $|B|$ as a factor.
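Theorems 7 and 8 can be exercised on synthetic data. The sketch below is illustrative only and rests on several assumptions: a made-up scene with cameras $[I \mid 0]$ and $[R \mid t]$ in normalised coordinates (so both DIACs are the identity before co-registration), a co-registering homography $H$ under which the second DIAC becomes $H H^\top$, and hypothetical helper names (`adj`, `projective_basis`, `dot_op`). It evaluates, at the true epipoles, the relation $(\omega' \cdot B)e' \sim (\omega \cdot B)e$ and the left hand side of Equation (10).

```python
import numpy as np

def adj(A):
    """Adjugate of a 3x3 matrix (transpose of the cofactor matrix)."""
    r0, r1, r2 = A
    return np.stack([np.cross(r1, r2), np.cross(r2, r0), np.cross(r0, r1)], axis=1)

def projective_basis(p):
    """3x3 matrix sending the standard projective frame to the points p[0..3] (rows)."""
    A = p[:3].T
    return A * np.linalg.solve(A, p[3])

def dot_op(D, B):
    """(omega . B) = 2 omega* B - tr(omega* B) I, Equation (2), with D = omega*."""
    return 2.0 * D @ B - np.trace(D @ B) * np.eye(3)

# Synthetic scene (illustrative values).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (4, 3)) + np.array([0.0, 0.0, 5.0])
a = 0.3
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.2, 0.1])
x, xp = X, X @ R.T + t          # homogeneous image points in views 1 and 2
e, ep = -R.T @ t, t             # true epipoles in views 1 and 2

# Co-register view 2 onto view 1; dual conics transform as H D H^T.
H = projective_basis(x) @ np.linalg.inv(projective_basis(xp))
H /= np.linalg.norm(H)
ep_reg = H @ ep
D = np.eye(3)                   # DIAC of view 1
Dp = H @ H.T                    # DIAC of view 2 in the common frame

# Conic B(e) of the pencil through the four points, Equation (5).
def line_pair(i, j, k, m):
    l, n = np.cross(x[i], x[j]), np.cross(x[k], x[m])
    return np.outer(l, n) + np.outer(n, l)

B1, B2 = line_pair(0, 1, 2, 3), line_pair(0, 2, 1, 3)
B = (e @ B2 @ e) * B1 - (e @ B1 @ e) * B2
B /= np.linalg.norm(B)

U, Up = dot_op(D, B), dot_op(Dp, B)

# Theorem 7: (omega' . B) e' and (omega . B) e should be the same point.
p, pp = U @ e, Up @ ep_reg
parallel_residual = np.linalg.norm(np.cross(p, pp)) / (np.linalg.norm(p) * np.linalg.norm(pp))

# Theorem 8: e^T C e = 0 with C from Equation (4).
C = U.T @ adj(Up).T @ B @ adj(Up) @ U
q = adj(Up) @ U @ e             # e^T C e equals q^T B q
eq10_residual = abs(e @ C @ e) / np.linalg.norm(q) ** 2

print("theorem 7 residual:", parallel_residual)
print("equation (10) residual:", eq10_residual)
```

Both residuals are normalised so that they are insensitive to the arbitrary scales of the homogeneous quantities involved.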


Proof: Theorem 4 applies for any $B$, but theorem 5 only applies when $B$ is proper. If $B$ consists of a line-pair and $e$ is on one of the lines, $(\omega \diamond B)e$ coincides with the other line. This can be seen by direct calculation or by continuity: as $e$ moves so that $B$ approaches a line-pair, the two points defined by projecting $\omega$ onto $B$ through $e$ approach the other line of the line-pair. Hence, the line $(\omega \diamond B)e$ becomes the other line. In a similar fashion, the point $(\omega \cdot B)e$, which is the pole of $(\omega \diamond B)e$ with respect to $B$, becomes a point on the other line. Thus, we get $C \sim B$ when $B$ is a line-pair. Hence, all points $e$ apart from the four image points on any line through two of the four image points must satisfy Equation (10). Since the expression is algebraic, this must then also be true for the four image points and the theorem follows.

We can now sharpen theorem 8.

THEOREM 10. Any epipole $e$ satisfying the projective and calibration constraints has to satisfy the 16-th degree equation $e^\top C e = 0$. An epipole hypothesis $e$ for which $|B(e)| \neq 0$ satisfies the projective and calibration constraints if and only if $e$ satisfies the remaining decic factor of $e^\top C e = 0$.

Proof: If $e$ does not lie on any of the six lines defined by pairs of the four image points, $e$ gives rise to a proper conic $B$ and then theorem 8 applies. If $e$ is a point on one of the six lines, then the equation is clearly satisfied according to theorem 9.

Hence, the expression (10) defines a superset of the possible epipoles as determined by the projective and calibration constraints. However, not all the points on the six lines are allowed by the calibration constraint. In fact, since theorem 4 still applies, we will work out the consequences of the geometric construction specifically for degenerate $B$. We will then show algebraically in Section 6 how the factor $|B|$ can be eliminated from Equation (10) in order to arrive at a decic expression for $e$.
Later, we will show that the decic expression is exactly the curve of possible epipoles.

THEOREM 11. Apart from the four image points, there are at most two possible epipoles $e$ on any line joining a pair of the four image points.

Proof: When $B$ is a line-pair, theorem 4 still applies. For a valid solution, both epipoles $e$ and $e'$ have to be on the same line, because the epipolar line homography cannot separate coincident rays. The geometric construction becomes that of projecting $\omega$ through $e$ and $\omega'$ through $e'$ onto the other line and requiring that the images coincide. Let $x_1$ be the point where the lines of the line-pair meet. Choose another point $x_2$ on the same line as the epipoles and a third point $x_3$ on the other line. Choose the coordinate system such that $x_1, x_2, x_3$ are the reference points, i.e. $[\,x_1\ x_2\ x_3\,]$ is the identity matrix. The epipole $e$ can be written

\[ e = \theta x_1 + x_2, \tag{11} \]

where $\theta$ is a scalar parameter. Let $\lambda$ parameterise a point $(\lambda x_1 + x_3)$ on the other line. The line through this point and $e$ is

\[ l(\theta, \lambda) = (\theta x_1 + x_2) \times (\lambda x_1 + x_3) = \theta (x_1 \times x_3) + \lambda (x_2 \times x_1) + (x_2 \times x_3) \sim (-1,\ \theta,\ \lambda)^\top. \tag{12} \]

This line belongs to the DIAC $\omega^*$ when $l^\top \omega^* l = 0$, or more explicitly

\[ (-1,\ \theta,\ \lambda)\ \omega^*\ (-1,\ \theta,\ \lambda)^\top = 0, \tag{13} \]

which is equivalent to

\[ (\lambda^2,\ \lambda,\ 1)\ v(\theta) = 0, \tag{14} \]

where we define

\[ v(\theta) \equiv \left( \omega^*_{33},\ \ 2(\theta\omega^*_{23} - \omega^*_{13}),\ \ \theta^2\omega^*_{22} - 2\theta\omega^*_{12} + \omega^*_{11} \right)^\top. \tag{15} \]

Since the possible values of $\lambda$ for a given $\theta$ constitute the image of $\omega$ and $v(\theta)$ holds the coefficients of a quadratic equation for $\lambda$, we can use $v(\theta)$ to represent the image of $\omega$. Moreover, if we define $v'(\theta')$ in an analogous manner, the images of $\omega$ and $\omega'$ coincide if and only if

\[ v(\theta) \times v'(\theta') = 0. \tag{16} \]
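As a rough numerical companion to Equations (15) and (16), the following sketch solves $v(\theta) \times v'(\theta') = 0$ for a pair of assumed DIACs. It exploits the structure of the cross product: its third component is linear in $\theta$ and $\theta'$, so $\theta'$ can be eliminated and the second component becomes a quadratic in $\theta$. The matrices `w` and `wp` are random positive definite stand-ins for $\omega^*$ and $\omega'^*$ expressed in the reference frame of $x_1, x_2, x_3$, and the function names are hypothetical. The roots may be complex.

```python
import numpy as np

def v_of(w, theta):
    """v(theta) from Equation (15); w is the DIAC in the reference frame."""
    return np.array([w[2, 2],
                     2.0 * (theta * w[1, 2] - w[0, 2]),
                     theta**2 * w[1, 1] - 2.0 * theta * w[0, 1] + w[0, 0]])

def epipoles_on_line(w, wp):
    """Solve v(theta) x v'(theta') = 0; returns a list of (theta, theta') pairs."""
    # Third component v1 v2' - v2 v1' = 0 is linear: theta' = a + b * theta
    # (this sketch assumes wp[1, 2] != 0 and w[2, 2] != 0).
    a = (wp[0, 2] - w[0, 2] * wp[2, 2] / w[2, 2]) / wp[1, 2]
    b = w[1, 2] * wp[2, 2] / (w[2, 2] * wp[1, 2])
    # Second component v3 v1' - v1 v3' = 0 becomes a quadratic in theta.
    c2 = wp[2, 2] * w[1, 1] - w[2, 2] * wp[1, 1] * b**2
    c1 = (-2.0 * wp[2, 2] * w[0, 1]
          - w[2, 2] * (-2.0 * wp[0, 1] * b + 2.0 * wp[1, 1] * a * b))
    c0 = (wp[2, 2] * w[0, 0]
          - w[2, 2] * (wp[0, 0] - 2.0 * wp[0, 1] * a + wp[1, 1] * a**2))
    return [(th, a + b * th) for th in np.roots([c2, c1, c0])]

# Assumed stand-ins for the two DIACs.
rng = np.random.default_rng(2)
M, Mp = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
w = M @ M.T + np.eye(3)
wp = Mp @ Mp.T + np.eye(3)

pairs = epipoles_on_line(w, wp)
residuals = []
for theta, theta_p in pairs:
    v, vp = v_of(w, theta), v_of(wp, theta_p)
    residuals.append(np.linalg.norm(np.cross(v, vp))
                     / (np.linalg.norm(v) * np.linalg.norm(vp)))
print("solutions (theta, theta'):", pairs)
print("cross-product residuals:", residuals)
```

Since the first entries of $v$ and $v'$ are nonzero here, vanishing of the second and third components of the cross product already forces the two vectors to be parallel, which is why only those two components are solved for.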

Finally, we observe that $(v \times v')_3$ is linear in $\theta$, $\theta'$ and $(v \times v')_2$ is quadratic in the same. Hence there are at most two solutions for $\theta$, which can be found as the intersection between a conic and a line in $(\theta, \theta')$-space. The theorem follows.

6. The Decic Expression

This section shows how to remove the factor $|B|$ from Equation (10) to arrive at the 10-ic expression (31). This endeavor is algebraically involved and the reader may want to skip directly to Equation (31). The end result will be an expression for a $3 \times 3$ matrix $G(e)$, depending on $e$, such that the 10-ic curve is given by $e^\top G(e)\, e = 0$. We shall also leave functional dependencies on $e$ and $e'$ implicit in order to keep the notation as compact as possible. Define

\[ D \equiv \omega^* \tag{17} \]
\[ t \equiv \mathrm{tr}(DB) \tag{18} \]
\[ U \equiv (\omega \cdot B) = 2DB - \mathrm{tr}(DB)\, I \tag{19} \]

and analogously for the primed entities. Then Equation (10) can be written

\[ e^\top U^\top U'^{*\top} B\, U'^* U e = 0. \tag{20} \]

Figure 9. The 16-th degree expression in Equation (10) defines a superset of the possible epipoles as determined by the projective and calibration constraints. However, it also includes the six lines through all pairs of image points as factors, which can be eliminated (see main text). The top plot shows the 16-th degree curve, including the six lines, and the bottom plot shows the curve without the six lines.

We will make use of the following facts about matrix adjugates

\[ A^{**} = |A|\, A \tag{21} \]
\[ (AB)^* = B^* A^* \tag{22} \]
\[ A A^* = |A|\, I \tag{23} \]
\[ (A + sI)^* = A^* + s^2 I + \mathrm{tr}(A)\, s I - sA \tag{24} \]
\[ A(A - \mathrm{tr}(A) I) = A^* - \mathrm{tr}(A^*)\, I \tag{25} \]

where $A$ and $B$ are $3 \times 3$ matrices and $s$ is a scalar. The first three follow from the cofactor formula for matrix inverses, the fourth one from direct calculation and the last one can be deduced from the Cayley-Hamilton theorem. In light of equations (19) and (24) we have

\[ U^* = (2DB - tI)^* = 4B^* D^* - t^2 I + 2tDB = 4B^* D^* + tU \tag{26} \]

and therefore get

\[ e^\top U^\top U'^{*\top} B\, U'^* U e = (Ue)^\top (4D'^* B^* + t' U'^\top)\, B\, (4B^* D'^* + t' U')\, Ue = 4|B|\, e^\top U^\top (4D'^* B^* D'^* + 2t' D'^* U')\, U e + t'^2\, e^\top U^\top U'^\top B\, U' U e. \tag{27} \]

We focus on the last term. Observe that

\[ U' U = 4D'BDB - 2t'DB - 2tD'B + tt' I. \]
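The adjugate identities (21) to (25) and Equation (26) are easy to spot-check numerically before relying on them in the remainder of the derivation. The sketch below is only a sanity check on random $3 \times 3$ matrices, not a proof; the helper `adj` and the random test matrices are assumptions introduced for the illustration.

```python
import numpy as np

def adj(A):
    """Adjugate of a 3x3 matrix (transpose of the cofactor matrix)."""
    r0, r1, r2 = A
    return np.stack([np.cross(r1, r2), np.cross(r2, r0), np.cross(r0, r1)], axis=1)

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
Bm = rng.normal(size=(3, 3))
s = 0.7
I = np.eye(3)
det, tr = np.linalg.det, np.trace

checks = {
    "(21)": np.allclose(adj(adj(A)), det(A) * A),
    "(22)": np.allclose(adj(A @ Bm), adj(Bm) @ adj(A)),
    "(23)": np.allclose(A @ adj(A), det(A) * I),
    "(24)": np.allclose(adj(A + s * I), adj(A) + s**2 * I + tr(A) * s * I - s * A),
    "(25)": np.allclose(A @ (A - tr(A) * I), adj(A) - tr(adj(A)) * I),
}

# Equation (26): U* = 4 B* D* + t U for symmetric D and B, with t = tr(DB).
S = rng.normal(size=(3, 3))
T = rng.normal(size=(3, 3))
D = S + S.T          # symmetric stand-in for omega*
B = T + T.T          # symmetric stand-in for the conic B
t = tr(D @ B)
U = 2.0 * D @ B - t * I
checks["(26)"] = np.allclose(adj(U), 4.0 * adj(B) @ adj(D) + t * U)

for label, ok in checks.items():
    print(label, "holds" if ok else "FAILS")
```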


By direct expansion of the left and right hand sides, and since $e^\top B e = 0$, one can verify that

\[ e^\top U^\top U'^\top B\, U' U e = e^\top \big( 16\, BDBD'B\,(D'B - t'I)(DB - tI) + 4t'^2\, BDB\,(DB - tI) + 4t^2\, BD'B\,(D'B - t'I) \big)\, e. \tag{28} \]

Now observe that, by equations (22) and (25) applied to the matrix $DB$, we have

\[ DB(DB - tI) = B^* D^* - \mathrm{tr}(B^* D^*)\, I \tag{29} \]

and if we use this in (28), together with the analogous identity for $D'B$, we get

\[ e^\top U^\top U'^\top B\, U' U e = 4|B|\, e^\top \big( 4BDD'^*(DB - tI) - 4\,\mathrm{tr}(B^* D'^*)\, D^* + t'^2 D^* + t^2 D'^* \big)\, e. \tag{30} \]

If we insert this into (27) and divide out $4|B|$ we get

\[ e^\top G(e)\, e = 0 \tag{31} \]

where

\[ G(e) = 4U^\top D'^* B^* D'^* U + 2t'\, U^\top D'^* U' U + 4t'^2\, BDD'^*(DB - tI) - 4t'^2\, \mathrm{tr}(B^* D'^*)\, D^* + t'^4 D^* + t^2 t'^2 D'^*. \tag{32} \]

Expanding the second term of $G(e)$, identifying symmetric parts, removing terms proportional to $B$ (which vanish because $e^\top B e = 0$) and again using Equation (25), we get

\[ 2t'\, U^\top D'^* U' U = (4t'BD - 2tt'I)(2|D'|B - t'D'^*)(2DB - tI) = 16|D'|t'\, B(DB - tI)DB - 8t'^2\, BDD'^*(DB - tI) - 2t^2 t'^2 D'^* = 16|B||D'|t'\, D^* - 8t'^2\, BDD'^*(DB - tI) - 2t^2 t'^2 D'^*, \tag{33} \]

where the equalities are understood to hold for the quadratic form in $e$. If we insert this into Equation (32), we get

\[ G(e) = 4U^\top D'^* B^* D'^* U - 4t'^2\, BDD'^*(DB - tI) + sD^* - t^2 t'^2 D'^*, \tag{34} \]

where we have defined the scalar $s = 16|B||D'|t' + t'^4 - 4t'^2\, \mathrm{tr}(B^* D'^*)$.

The conic matrix $G(e)$ can readily be seen to be octic in terms of $e$, since $D$ is constant and $B$, $U$ and $t$ are all quadratic in $e$. Thus, Equation (31) describes a decic curve. Some examples are shown in Figures 1 and 11.
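The factorisation behind Equation (31) can be checked numerically: for any $e$, the 16-th degree expression of Equation (10) should equal $4|B|$ times $e^\top G(e)\, e$ with $G(e)$ taken from Equation (34). The sketch below is a minimal illustration under assumed inputs: four hand-picked image points, random symmetric positive definite matrices standing in for $D = \omega^*$ and $D' = \omega'^*$ in the co-registered frame, and hypothetical function names. Note that the matrix returned is not symmetric; only its quadratic form in $e$ matters.

```python
import numpy as np

def adj(A):
    """Adjugate of a 3x3 matrix (transpose of the cofactor matrix)."""
    r0, r1, r2 = A
    return np.stack([np.cross(r1, r2), np.cross(r2, r0), np.cross(r0, r1)], axis=1)

def decic_matrix(e, B1, B2, D, Dp):
    """G(e) from Equation (34), together with B(e) from Equation (5)."""
    I = np.eye(3)
    B = (e @ B2 @ e) * B1 - (e @ B1 @ e) * B2
    t, tp = np.trace(D @ B), np.trace(Dp @ B)
    U = 2.0 * D @ B - t * I
    Bs, Ds, Dps = adj(B), adj(D), adj(Dp)
    s = (16.0 * np.linalg.det(B) * np.linalg.det(Dp) * tp
         + tp**4 - 4.0 * tp**2 * np.trace(Bs @ Dps))
    G = (4.0 * U.T @ Dps @ Bs @ Dps @ U
         - 4.0 * tp**2 * B @ D @ Dps @ (D @ B - t * I)
         + s * Ds - t**2 * tp**2 * Dps)
    return G, B

# Four image points (no three collinear) and the two line-pair basis conics.
x = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.3, 1.7, 1.0]])

def line_pair(i, j, k, m):
    l, n = np.cross(x[i], x[j]), np.cross(x[k], x[m])
    return np.outer(l, n) + np.outer(n, l)

B1, B2 = line_pair(0, 1, 2, 3), line_pair(0, 2, 1, 3)

# Assumed stand-ins for the DIACs in the common frame.
rng = np.random.default_rng(3)
M, Mp = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
D, Dp = M @ M.T + np.eye(3), Mp @ Mp.T + np.eye(3)

e = np.array([0.4, -0.9, 1.0])          # arbitrary epipole hypothesis
G, B = decic_matrix(e, B1, B2, D, Dp)

# 16-th degree expression of Equation (10) for comparison.
I = np.eye(3)
U = 2.0 * D @ B - np.trace(D @ B) * I
Up = 2.0 * Dp @ B - np.trace(Dp @ B) * I
C = U.T @ adj(Up).T @ B @ adj(Up) @ U

lhs = e @ C @ e
rhs = 4.0 * np.linalg.det(B) * (e @ G @ e)
print("e^T C e        =", lhs)
print("4|B| e^T G e   =", rhs)
```

Evaluating $e^\top G(e)\, e$ on a grid of hypotheses and tracing its zero set is one way such a sketch could be extended to plot the curve, although that is not attempted here.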


7. Further Properties of the Curve

We will argue that the curve defined by the decic expression (31) is exactly the set of possible epipoles under the projective and calibration constraints. We will be helped by the following.

THEOREM 12. Given three point correspondences $x_i \leftrightarrow x'_i$ and the epipole $e$ in one image, the projective and calibration constraints lead to four solutions for the other epipole $e'$.

A proof of this by a novel constructive algorithm is given in Appendix B.

THEOREM 13. The set of possible epipoles according to the projective and calibration constraints has exactly four branches through each of the four image points.⁴ The branches do not "terminate" but pass smoothly through the image points.⁵ For points in general position, 0, 2 or 4 of the branches can be real.

Proof: By theorem 12, for a general epipole position $e$, there are four solutions for the essential matrix that are consistent with three given point pairs. When $e$ coincides with one of the image points, $x_1$ say, all four solutions that obey the other three image point pairs also satisfy $x_1 \leftrightarrow x'_1$, since the epipolar line $l'$ joining $e'$ and $x'_1$ always maps into a line $l$ through $x_1$. Moreover, the line $l$ determined by the other three image point pairs changes continuously if we change $e$ and hence $l$ has to be the tangent direction of the corresponding curve branch. It is in fact the same as the tangent at $x_1$ to the conic determined by the four image points and the corresponding solution for $e'$. The tangent direction of each branch can thus be computed in closed form with the algorithm in Appendix B. It is clear from the algorithm that points in general position cannot give rise to an odd number of real solutions. Using the algorithm, we have found examples of cases with 0, 2 and 4 real solutions. We have even found examples with four real solutions that can satisfy the orientation constraints.

The situation is illustrated in Figure 10. The case of two real branches is by far the most common, but in Figure 11 a pair of actual numerical examples with zero and four real branches are shown.

⁴ Note that our definitions of the backprojection and the projective constraints allow the world points to coincide with the projection centres of the cameras.

⁵ If we consider the joint epipole $(e, e')$, the possible joint epipoles describe a curve in $P^2 \times P^2$ which we assume is smooth. This joint epipole curve projects into $P^2$ such that four points map to $x_1$. What we really mean when we talk about tracing a curve branch in an image is tracing the curve in $P^2 \times P^2$. For example, the four branches through an image point can be resolved into four non-intersecting branches in $P^2 \times P^2$. The additional six nodal points of the decic can be resolved similarly.


Figure 10. By theorem 13, there are four branches of the curve of possible epipoles through each of the four image points. Each curve branch is tangent to the conic $B$ that includes the four image points and the epipole $e'$ corresponding to $e$ coincident with the image point, according to theorem 13 and the algorithm in Appendix B.

THEOREM 14. The set of possible epipoles according to the projective and calibration constraints cannot be contained in an algebraic curve of degree less than ten. The same is true about the parts of the set that do not belong to any of the six lines through pairs of the four image points.

Proof: If we combine theorem 13 and theorem 7 we see that there are twenty possible epipoles e on any proper conic B if multiplicity and complex solutions are taken into account. This is seen by counting: each of the four image points represents a solution with multiplicity four and then there are four solutions apart from those. By Bézout's theorem, an algebraic curve of degree n has at most 2n intersections with a conic, unless it contains the conic as a factor. Twenty intersections therefore require 2n ≥ 20, i.e. n ≥ 10, and since B is variable, the first part of the theorem follows. It follows from theorem 11 that none of the four branches through one of the four image points is created by one of the lines through a pair of the four image points. Hence, the argument holds also for the second part of the theorem.

THEOREM 15. The decic expression (31) has no repeated factors. Moreover, it does not have any of the six lines through pairs of the four image points as factors. Moreover, the decic has exactly the same four branches through each of the four image points as the set of possible epipoles according to the projective and calibration constraints.


Proof: If the decic had repeated factors then theorem 10 would indicate that the set of possible epipoles away from the six lines could be covered by an algebraic curve of degree less than ten, which would violate theorem 14. A similar argument rules out the possibility that the decic includes one of the six lines as a factor. The last part of the theorem is then obtained by combining theorem 10 and theorem 13.

THEOREM 16. Apart from the four image points, the decic curve (31) passes exactly twice through any of the six lines through pairs of the four image points.

Proof: By Bézout's theorem, the decic curve has ten intersections with any one of the lines. By theorem 15, exactly eight of the intersections are accounted for by the two times four branches through the pair of image points on the line. Hence there are exactly two additional ones.

THEOREM 17. Apart from the four image points, there are exactly two possible epipoles according to the projective and calibration constraints on any of the six lines through pairs of the four image points. All ten solutions on any one of the lines are given by the intersections of the line with the decic curve (31).

Proof: By theorem 16, the decic intersects any one of the lines in two points apart from the pair of image points on the line. It follows from theorem 10 that the geometric construction suggested by theorem 4 can be satisfied with epipoles e on the decic arbitrarily close to the intersection between the decic and the line. If we move e across the line, the corresponding epipole e′ and the images of ω, ω′ through e, e′, respectively, onto B(e) change continuously. Hence, it follows from continuity of the geometric construction that the point where the decic intersects the line also satisfies the geometric construction. Thus, it is also a possible epipole. So by theorem 11, the two intersections have to be the only two possible epipoles on the line apart from the pair of image points.

8. Main Result

We now state our main result.

THEOREM 18. A point e is a possible epipole according to the projective and calibration constraints if and only if it lies on the decic curve defined by Equation (31). Moreover, this condition is equivalent to requiring that e lies on the conic G(e) defined in Equation (32). For general e on the curve, the other epipole e′ is a seventh degree function e′ ∼ U′* U e of e, given in Equation (3).


Figure 11. Examples of decic curves with zero and four real branches through the image points. The image points are marked with small circles. The figures on the right are close-ups of the ones on the left.

Proof: The first part of the theorem follows from combining theorem 10 and theorem 17. The second part follows directly. The final statement was shown as part of the proof of theorem 7.

We can also get a more complete version of theorem 7 that applies even when the conic B from Equation (5) is degenerate.

THEOREM 19. Given any point x distinct from the four image points, the possible epipoles on the conic B(x) according to the projective and calibration constraints are exactly the four image points plus the intersections between the two conics B(x) and G(x).

Proof: By theorem 18, a point is a possible epipole iff it lies on G(e). An epipole hypothesis e apart from the four image points generates the same conic B(e) = B(x) as x iff it lies on B(x). Since G(e) can be written as a function of B only, all e on B(x) apart from the four image points generate the same G(e) = G(x). Thus, the points on B(x) apart from the four image points satisfy the decic iff they lie on G(x). Finally, the four image points lie on B(x) by construction and they are always possible epipoles.

THEOREM 20. The curve of possible epipoles according to the projective and calibration constraints has exactly ten nodal points (self-intersections).


Figure 12. Some more examples of decic curves of possible epipoles according to the projective and calibration constraints.

The four image points each have multiplicity four. In addition, there are exactly three pairs of nodal points with multiplicity two.⁶ The three pairs of nodal points occur on the three conics B for which |(ω′ · B)| = 0. These conics are exactly the three conics B of the pencil that have an inscribed quadrangle that is also circumscribed to ω′.

Proof: Recall theorem 6 and observe that on proper conics B, the solution e has multiplicity two exactly when the line l = (ω♦B)e obtained by projecting ω onto B through e also can be obtained by projecting ω′ onto B in two distinct ways through two distinct points e′ on B. This is illustrated in Figure 13. By Poncelet's Porism (Semple and Kneebone, 1952), given proper conics B and ω′, we have two possibilities. Either there is no quadrangle inscribed in B that is also circumscribed to ω′, or there is one such quadrangle with any point on B as one of its vertices. In the former case, no epipole hypothesis e′ on B in the second image ever generates the same line l′ = (ω′♦B)e′ as some other epipole hypothesis. Hence no solution for e can then have multiplicity two. In the latter case, every epipole hypothesis e′ generates the same line l′ as exactly one other epipole hypothesis. Thus, in this case the solutions e on B always have multiplicity two. The latter case has to happen exactly when |(ω′ · B)| = 0 and we see that this must be the same as the condition that there is a quadrangle inscribed in B that is also circumscribed to ω′. The remaining parts of the theorem follow from theorems 7, 13 and 17.

⁶ By the degree-genus formula (Kirwan, 1995) the genus of the curve is therefore ((10 − 1)(10 − 2) − 4 × 4(4 − 1) − 6 × 2(2 − 1))/2 = 6 so, in particular, the curve is not rational.
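The genus count in footnote 6 can be checked with a few lines of arithmetic. The sketch below assumes the four image points behave as ordinary points of multiplicity four and the six remaining nodes as ordinary double points, which is how the footnote applies the degree-genus formula; the function name `genus_plane_curve` is ours.

```python
def genus_plane_curve(degree, multiplicities):
    """Genus of a plane curve with ordinary multiple points (degree-genus formula)."""
    # Genus of a smooth plane curve of this degree.
    smooth_genus = (degree - 1) * (degree - 2) // 2
    # Each ordinary point of multiplicity m lowers the genus by m(m-1)/2.
    correction = sum(m * (m - 1) // 2 for m in multiplicities)
    return smooth_genus - correction

# Four image points of multiplicity four plus six ordinary nodes.
singularities = [4] * 4 + [2] * 6
genus = genus_plane_curve(10, singularities)
print(genus)
```

A rational curve would have genus zero, so a positive result is consistent with the footnote's remark that the decic is not rational.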

Figure 13. For there to be multiple e′ corresponding to one e, there has to be a quadrangle inscribed in B that is also circumscribed to ω′. By Poncelet's porism, there is either no such quadrangle, or a whole family of them.

9. Bringing in the Orientation Constraint

We first observe that given a point on the decic curve, it is straightforward to check if it can satisfy the orientation constraints. The orientation constraints are simply that the space points should lie on the forward part of the half-rays emanating from their corresponding image points. The situation is illustrated in Figure 14. For general epipoles e on the decic, Equation (3) determines a unique epipole e′ in the other image. If e coincides with one of the four image points, the algorithm in Appendix B determines up to four real solutions for e′. Once the pair e, e′ of epipoles is determined, the epipolar line homography is also uniquely given by the point correspondences. Thus, we can determine the essential matrix.

If we ignore possible positive scaling by requiring that the baseline between the cameras be of unit length, the essential matrix corresponds to four possible 3D configurations. If we choose one of the configurations, the other three can be obtained by rotating one of the views 180 degrees around the baseline to obtain the so-called twisted pair or by swapping the positions of the cameras so that the orientation of the baseline is reversed, or both. The projective constraints are satisfied for each of the four configurations, since the two rays emanating from a corresponding pair of image points are coplanar and hence have a point in common in projective space. The orientation constraint is now exactly that the common point lies on the forward part of both rays. It is well known (Nistér, 2003a) that a corresponding point pair singles out one of the four 3D configurations corresponding to an essential matrix. The orientation constraints can be satisfied exactly when all four point correspondences indicate the same configuration.

Note that the baseline separates an epipolar plane into two half-planes. We will find it useful to think of the orientation constraint as consisting of two parts. The first part is the condition that the two forward half-rays of an image correspondence lie in the same half-plane. When this condition is satisfied, the common space point is either on the forward part of both rays, or on the backward part of both rays. This condition on a single point correspondence narrows down the possible 3D configurations from four to two. We now observe that this condition can be satisfied for all point correspondences exactly when the epipolar line homography is oriented. Intuitively, the epipolar line homography is oriented when it aligns epipolar half-rays with the correct orientation, which means exactly that the half-rays point into the same half-plane. We will hence refer to this part of the orientation constraint as the oriented epipolar constraint. The second part of the orientation constraint is the condition that the two forward half-rays should converge in their common half-plane. We will refer to this as the convergence constraint. In summary, we have

THEOREM 21. The oriented epipolar constraint asserts that there is a 3D configuration for which the forward half-rays indicated by each corresponding pair of image points lie in a common half-plane. The convergence constraint then guarantees that there is such a configuration where the forward half-rays of each pair meet in their common half-plane. The oriented epipolar constraint and the convergence constraint in conjunction are equivalent to the orientation constraints.

Figure 14. The orientation constraint is simply that the space points should be in the forward direction on their respective image rays. It can be partitioned into first requiring that the forward half-rays point into the same half-plane and then stipulating that the half-rays converge in that half-plane.
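The four 3D configurations sharing one essential matrix, described in Section 9, can be illustrated numerically. The sketch below is not taken from the paper: it assumes the common convention E = [t]× R with a unit baseline t, models the twisted pair as a half turn about t, and models the camera swap as t → −t. All function names are ours.

```python
import math

def skew(t):
    """Cross-product matrix [t]x of a 3-vector t."""
    x, y, z = t
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def half_turn_about(t):
    """Rotation by 180 degrees about the unit axis t, i.e. 2 t t^T - I."""
    return [[2.0 * t[i] * t[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def essential(R, t):
    """Essential matrix under the assumed convention E = [t]x R."""
    return matmul(skew(t), R)

def four_configurations(R, t):
    """One configuration, its baseline reversal, its twisted pair, and both."""
    R_twisted = matmul(half_turn_about(t), R)
    neg_t = [-c for c in t]
    return [(R, t), (R, neg_t), (R_twisted, t), (R_twisted, neg_t)]

def equal_up_to_sign(A, B, tol=1e-12):
    """True if A == B or A == -B entrywise, within tol."""
    same = all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))
    flipped = all(abs(A[i][j] + B[i][j]) < tol for i in range(3) for j in range(3))
    return same or flipped

# Illustrative pose: rotation of 0.3 rad about the z-axis, unit baseline along x.
c, s = math.cos(0.3), math.sin(0.3)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]
E_ref = essential(R, t)
configs = four_configurations(R, t)
all_agree = all(equal_up_to_sign(essential(Rc, tc), E_ref) for Rc, tc in configs)
print(all_agree)
```

Since an essential matrix is only defined up to scale, agreement up to sign is what "the same essential matrix" means here; the orientation constraints are what separate the four candidates.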


THEOREM 22. The satisfiability of the oriented epipolar constraint can only change at those points e of the decic curve for which e or one of its possible corresponding e′ coincides with one of the four image points, i.e. only at the four image points or at the up to 4 × 4 real points that correspond to one of the four image points according to theorem 13 and the algorithm in Appendix B.

Proof: When we move e along the decic curve and neither e nor its corresponding e′ coincides with one of the four image points, e′ and the epipolar line homography change continuously with e. Moreover, since the epipolar constraint is satisfied for all points on the decic, the ray orientations cannot change unless one of the epipoles coincides with one of the four image points. The situation is illustrated in Figure 16.

THEOREM 23. A branch of the decic curve through one of the four image points can have at most one side allowed by the oriented epipolar constraint. One side of the branch is allowed if and only if the epipolar line homography accompanying the pair of epipoles e, e′ corresponding to the branch is oriented with respect to the other three image correspondences.

Proof: Consider a branch of the decic through one of the image points, x say. For that particular branch, e′ and the epipolar line homography change continuously with e, so the orientations of the rays to the other three image points do not change at x. However, the orientation of the ray to x changes when e passes through x. Hence, at most one side of the branch can satisfy the oriented epipolar constraint and one side does if and only if the epipolar line homography is oriented with respect to the other three points.

The following theorem is illustrated in Figure 15.

THEOREM 24. The projective constraints imply that the epipoles e, e′ lie on a common conic B. The four image points segment every conic B into four parts. On a particular B, whether the epipolar line homography is oriented or not depends solely on which segments e and e′ belong to. On a particular B there are four possible cases. 1: e and e′ are required to belong to the same segment. 2: e and e′ have to be assigned to the pair of segments attached to a particular point. 3: e and e′ have to be assigned to a particular pair of separated segments. 4: No allowed assignment exists. When B varies, Cases 3 and 4 can swap when B degenerates. The other cases stay the same throughout the entire pencil of conics.

Proof: As e or e′ moves around B, the epipolar line homography changes continuously and ray directions switch when and only when e or e′ moves across one of the four image points. Observe now that the co-registration of the four image points from the two images can in general only be done up to sign, i.e. only in projective space and not in oriented projective space. Moreover, the conic B induces a cyclic order on its points. For the epipolar line homography to be oriented, e and e′ have to separate in the cyclic order those image points that have equal sign from those image points that have different sign after the co-registration. The four different cases are illustrated in Figure 15. Case 1 occurs when all points have equal or different signs after co-registration. Case 2 occurs when one or three points have equal signs after co-registration. Case 3 or 4 occurs when two points have equal signs after co-registration. Case 3 occurs when the two points with equal sign are not separated by the other two in the cyclic order. Case 4 occurs otherwise. It is clear that Case 1 and Case 2 only depend on the co-registration and not on the particular choice of B. Cases 3 and 4 can only swap when the cyclic order changes, which can only happen when B degenerates.
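The case analysis in the proof of theorem 24 reduces to a small lookup on the sign pattern. The sketch below is only an illustration: it assumes the caller already knows, for each of the four image points listed in the cyclic order induced by B, whether its sign is equal after co-registration. The function name and the boolean encoding are ours.

```python
def orientation_case(equal_sign_in_cyclic_order):
    """Return the case number (1 to 4) of theorem 24.

    The argument lists, in the cyclic order induced by the conic B,
    whether each of the four image points has equal sign (True) or
    different sign (False) after co-registration.
    """
    signs = list(equal_sign_in_cyclic_order)
    if len(signs) != 4:
        raise ValueError("expected exactly four image points")
    equal_count = sum(1 for s in signs if s)
    if equal_count in (0, 4):
        return 1  # all equal or all different
    if equal_count in (1, 3):
        return 2  # one point stands apart from the other three
    # Two equal, two different. Opposite positions in the cyclic order
    # mean the equal-sign pair is separated by the other two points.
    separated = signs[0] == signs[2]
    return 4 if separated else 3

print(orientation_case([True, True, False, False]))
print(orientation_case([True, False, True, False]))
```

Only the last branch depends on the cyclic order, which matches the statement that Cases 3 and 4 can swap when B degenerates while Cases 1 and 2 cannot.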

Figure 15. Illustration to theorem 24. Four different cases arise. The cases depend on if the co-registration of the four image points x results in the same or different signs for zero, one or two points and on the cyclic order induced between the points by the conic B.


Figure 16. Examples of curves of possible epipoles after the oriented epipolar constraint and the calibration constraints have been enforced. The small circles mark the four image points, while the squares mark the up to 4 × 4 real points that correspond to one of the four image points according to theorem 13 and the algorithm in Appendix B. As indicated by theorem 22, the curve has no loose ends apart from those points. The curves are rendered as the orthographic projection of a half-sphere and all curve segments that appear loose actually reappear at the antipode. There are also configurations of four point pairs for which the set of possible epipoles is completely empty, i.e. not all configurations of four points in two calibrated views are possible according to the oriented epipolar constraint.

10. The Convergence Constraint

For points on the decic, we know that there is an epipolar geometry such that the epipolar lines are in the same plane. It is then easy to see that rays can only change between convergent and divergent when they become parallel. Moreover, this can only occur when the angles between the image points and the epipoles become equal in both images. Thus, we have

THEOREM 25. The image rays from x and x′ can only change between convergent and divergent when the squared cosine of the angle between x and e is the same as for the angle between x′ and e′.


The Euclidean scalar product ⟨x|y⟩ between image directions x and y is encoded up to scale in the IAC ω as

$$\langle x|y\rangle = x^\top \omega\, y. \tag{35}$$

The cosine between the image vectors is given by

$$\frac{\langle x|y\rangle}{\sqrt{\langle x|x\rangle}\,\sqrt{\langle y|y\rangle}}. \tag{36}$$

Hence, if we require the squared cosine between e and x to be equal to the squared cosine between e′ and x, we get⁷

$$(x^\top \omega' x)(e'^\top \omega' e')(x^\top \omega e)^2 = (x^\top \omega x)(e^\top \omega e)(x^\top \omega' e')^2. \tag{37}$$
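Equations (35) to (37) can be exercised numerically. In the sketch below the helper names are ours, and the example takes both IACs to be the identity, which corresponds to assuming normalised image coordinates; neither choice comes from the paper.

```python
def bilinear(x, omega, y):
    """Scalar product x^T omega y of Equation (35), for 3-vectors and a 3x3 matrix."""
    return sum(x[i] * omega[i][j] * y[j] for i in range(3) for j in range(3))

def squared_cosine(x, y, omega):
    """Square of the cosine in Equation (36)."""
    return bilinear(x, omega, y) ** 2 / (bilinear(x, omega, x) * bilinear(y, omega, y))

def parallel_ray_residual(x, e, e_prime, omega, omega_prime):
    """Left side minus right side of Equation (37); zero when the squared cosines agree."""
    lhs = (bilinear(x, omega_prime, x) * bilinear(e_prime, omega_prime, e_prime)
           * bilinear(x, omega, e) ** 2)
    rhs = (bilinear(x, omega, x) * bilinear(e, omega, e)
           * bilinear(x, omega_prime, e_prime) ** 2)
    return lhs - rhs

# Assumed example: identity IACs and a co-registered point on the optical axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [0.0, 0.0, 1.0]
e = [1.0, 0.0, 1.0]           # 45 degrees from x
e_equal = [0.0, 1.0, 1.0]     # also 45 degrees from x
e_other = [0.0, 2.0, 1.0]     # a different angle to x

residual_equal = parallel_ray_residual(x, e, e_equal, identity, identity)
residual_other = parallel_ray_residual(x, e, e_other, identity, identity)
print(squared_cosine(x, e, identity), residual_equal, residual_other)
```

Working with the polynomial residual rather than the ratio of cosines avoids the square roots and divisions of Equation (36), which is why the condition can be written as an algebraic curve.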

If we assume that x is not on any of the IACs and choose the scale of the representation of ω and ω′ relative to x such that (x⊤ω′x) = (x⊤ωx) = 1, we get

$$(e'^\top \omega' e')(x^\top \omega e)^2 - (e^\top \omega e)(x^\top \omega' e')^2 = 0. \tag{38}$$

If we use our previous notation U = (ω · B) and the equation

$$e' \sim U'^{*} U e \tag{39}$$

to substitute for e′, we get

$$(e^\top U^\top U'^{*\top} \omega' U'^{*} U e)(x^\top \omega e)^2 - (e^\top \omega e)(x^\top \omega' U'^{*} U e)^2 = 0, \tag{40}$$

which is a 16th degree expression in e. Its 160 intersections with the decic curve are the only places where the image rays from x can change between converging and diverging. It is interesting to note that six of the intersections are the six points where the four double tangents between ω and ω′ meet. This is because those points e are such that the projections of ω and ω′ through e are identical regardless of what they are projected onto. Hence, e′ = e is a solution and such e are self-corresponding in Equation (3). It is therefore evident that such e both lie on the decic curve and preserve their angle to x, i.e. lie on the curve defined by Equation (40).

⁷ Remember that we are assuming the points are co-registered.

Figure 17. Examples of curves of possible epipoles after all the constraints have been enforced.

11. Relation to the Uncalibrated Case

If we add more point correspondences than four, one or two additional correspondences can be seen as a degenerate case of the calibration constraint in theorem 4. It is a classical result, also discussed in (Werner, 2003), that given five points, the projective constraints give rise to a 5th degree Cremona mapping between the epipoles. We can get an explicit formula for the mapping as a degenerate case of Equation (3). If we perform the thought-experiment of shrinking ω to a point x (i.e. ω becomes a conic envelope consisting of a repeated point), the two tangents to ω through e become coincident. Hence the two points arising from projection onto B become a common point (ω · B)e, which is obtained by intersecting B with the line through x and e. It is a classical result (Semple and Kneebone, 1952) that the pairwise correspondence induced on a conic B by lines through a vertex x is an involution, i.e. (ω · B) is a self-inverse homography. Hence when ω and ω′ are shrunk to the points x, x′ of an additional point correspondence, Equation (3) becomes

$$e' \sim (\omega' \cdot B)(\omega \cdot B)\, e, \tag{41}$$

which is the 5th degree Cremona transformation between the epipoles. Thus, we obtain a relation to the uncalibrated case. More such connections can be made, e.g. the fundamental matrix can be expressed by the formula

$$F \sim [e']_\times \left( (2 e^\top B e')\, B^{*} [e]_\times + |B|\, e e^\top [e']_\times \right), \tag{42}$$

which we state without proof.

Figure 18. If we perform the thought-experiment of shrinking ω and ω′ to the points x, x′ of an additional point correspondence, we get a formula for the 5th degree Cremona mapping between the epipoles arising in the uncalibrated 5-point case.

12. The 3v4p Algorithm

Given four point correspondences in three calibrated views, we choose two views and trace out the decic curve for those two views with a one-dimensional sweep driven by a parameter θ. For each value of θ, all computations can be carried out very efficiently in closed form. The parameter is used to indicate one conic B from the pencil of conics. Given B, we can calculate the conic G from Equation (32) as a function of B and the intersections between G and B can then be found in closed form as the roots of a quartic polynomial as described in Appendix A. This yields up to four solutions for e. For each solution, the corresponding e′ can be found through Equation (3). If we rotate both coordinate systems so that the epipoles are moved to the origin, finding the epipolar line homography is just a simple matter of solving for a 1-D rotation with possible reflection. Thus, we get the essential matrix for the two views corresponding to each solution. Following (Nistér, 2003a), we can then select a camera configuration for the two views and get the locations of the four points through triangulation. Up to four solutions for the pose of the third view can then be found by solving the three point perspective pose problem (Haralick et al., 1994) for three of the points. The orientation constraints are used to disqualify solutions for which the space points are not on the forward part of the image rays. Finally, the fourth point can be projected into the third view.

For the correct value of θ and the correct solution, the projection of the fourth point should coincide with its observed image position. Moreover, this will only occur for valid solutions and in general there is a unique solution. Thus, θ is swept through the pencil of conics and the solution resulting in the reprojection closest to the observed fourth point position is selected. Note that this procedure has the desirable property of minimising an image based error. The problem is overconstrained by one degree of freedom. This means that with noisy correspondences, the reprojection will not coincide exactly with the observed position even for the correct solution. Note that from one value of θ, we get up to sixteen possible camera configurations and up to sixteen reprojections. The reprojections trace out a remarkably complex curve in the third view as θ is varied. An example is shown in Figure 19, giving an insight into the tremendous complexity of the 3v4p problem. For random sets of four point correspondences, the curve of reprojections will in general not pass through the fourth image point.

Before the calculation is started the four points are co-registered with a homography. The homography is used to transform the DIAC from the second view into the coordinate system of the first. There are many ways to parameterise the pencil of conics. A simple choice is to use two of the degenerate conics as basis for the pencil.
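One way to realise the pencil parameterisation mentioned at the end of Section 12 is sketched below. It assumes homogeneous 3-vectors for the four co-registered image points and uses the line pairs (x1x2, x3x4) and (x1x3, x2x4) as the two degenerate basis conics; the trigonometric weighting by θ and all function names are our own illustrative choices, not necessarily those of the authors' implementation.

```python
import math

def cross(a, b):
    """Cross product: the line through two points in homogeneous coordinates."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def degenerate_conic(l, m):
    """Symmetric matrix of the line pair l, m, so that x^T C x = 2 (l.x)(m.x)."""
    return [[l[i] * m[j] + m[i] * l[j] for j in range(3)] for i in range(3)]

def pencil_conic(points, theta):
    """Conic cos(theta) * B1 + sin(theta) * B2 through the four points."""
    x1, x2, x3, x4 = points
    B1 = degenerate_conic(cross(x1, x2), cross(x3, x4))
    B2 = degenerate_conic(cross(x1, x3), cross(x2, x4))
    c, s = math.cos(theta), math.sin(theta)
    return [[c * B1[i][j] + s * B2[i][j] for j in range(3)] for i in range(3)]

def conic_value(B, x):
    """Evaluate x^T B x; zero when x lies on the conic."""
    return sum(x[i] * B[i][j] * x[j] for i in range(3) for j in range(3))

# Assumed example: the corners of the unit square as the four image points.
points = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
B = pencil_conic(points, 0.4)
residuals = [conic_value(B, x) for x in points]
print(residuals)
```

Because a conic is only defined up to scale, θ in [0, π) already covers the whole pencil, so the one-dimensional sweep can be restricted to that interval.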

Figure 19. The reprojections of the fourth image point in the third view, traced out by sweeping through the pencil of conics. The remarkable complexity of the curve gives an insight into the difficulty of the 3v4p problem. The top three curves show an example of the reprojections without use of the orientation constraints. The bottom two curves show examples of the reprojections after the orientation constraints have been enforced. The true fourth point reprojection is indicated by a small circle.

13. Experimental Results

Figure 20 shows two typical examples of the cost function. The best way we have found to execute the 1-D search is with an initial fixed granularity, followed by iterative refinement around multiple local minima and finally selecting the best point. This preferred approach is compared to several others in Figure 21 for ideal configurations. The method works well on ideal configurations and pursuing multiple local minima gives equivalent failure rates with an order of magnitude fewer actual search points. At 40 steps in the basic search, the preferred approach takes around 1 ms on a 3 GHz machine. At 1000 steps, the average execution time is around 12 ms. When noise comes into the picture, the faster version with 40 steps turns out to be as good as the slower versions. We therefore give results for the faster version in the rest of the experiments, where we indicate that the 3v4p solution is quite practical. We do this by showing that it is not far behind the 5-point method, which has been thoroughly tested in (Nistér, 2004). An additional point is generated and the 5-point method is used between two views, disambiguating using the third as described in (Nistér, 2004). The test geometry and its parameters are shown in Figure 22. Unless otherwise noted, the parameters are as shown in Table I. We concentrate on the root mean square deviation of the estimated translation directions from the true directions. Performance with respect to noise under easy and default conditions is shown in Figures 23 and 24, respectively. For completeness, rotational errors are shown in Figure 25. Translational errors with respect to true translation direction, magnitude of the baseline and depth of the scene are shown in Figures 26, 27 and 28 respectively.

14. Conclusion

We have given necessary and sufficient conditions for the projective and calibration constraints to be satisfied given four corresponding points in two calibrated images. The possible epipoles are exactly those on a decic curve. We have shown that the second epipole is related to the first by a seventh degree expression. We also related this mapping to the uncalibrated 5-point


Figure 20. Examples of the cost function. Left: ’Random’ configuration. Right: Small straight line motion.
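The search strategy described in Section 13 (a fixed-granularity pass, refinement around several local minima, then selection of the best point) can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: the cost function is a toy stand-in for the reprojection error, the refinement uses a simple ternary shrink, and the names `coarse_to_fine_minimum`, `steps`, `refinements` and `keep` are ours.

```python
import math

def coarse_to_fine_minimum(cost, lo, hi, steps=40, refinements=20, keep=3):
    """Coarse grid search followed by refinement around the best local minima."""
    width = (hi - lo) / steps
    thetas = [lo + (i + 0.5) * width for i in range(steps)]
    values = [cost(t) for t in thetas]
    # Local minima of the sampled cost, treating the parameter domain as periodic.
    minima = [i for i in range(steps)
              if values[i] <= values[i - 1] and values[i] <= values[(i + 1) % steps]]
    minima.sort(key=lambda i: values[i])
    best_theta, best_value = None, math.inf
    for i in minima[:keep]:
        a, b = thetas[i] - width, thetas[i] + width
        # Ternary shrink; assumes the cost is unimodal inside the bracket.
        for _ in range(refinements):
            m1 = a + (b - a) / 3.0
            m2 = b - (b - a) / 3.0
            if cost(m1) < cost(m2):
                b = m2
            else:
                a = m1
        theta = 0.5 * (a + b)
        value = cost(theta)
        if value < best_value:
            best_theta, best_value = theta, value
    return best_theta, best_value

def toy_cost(theta):
    """Hypothetical cost with a shallow local minimum and a deeper global one."""
    return min((theta - 0.7) ** 2 + 0.05, (theta - 2.1) ** 2)

best_theta, best_value = coarse_to_fine_minimum(toy_cost, 0.0, math.pi)
print(best_theta, best_value)
```

Refining several candidate minima instead of only the best coarse sample guards against a coarse grid that happens to favour the wrong basin, which is the effect Figure 21 compares.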

Figure 21. Fraction of trials resulting in a numerical error above 10⁻⁴ as the actual number of search points increases. The large hollow circles denote a basic search followed by refinement around the best point. The small dots denote adaptive search with different parameter values. The large filled dots denote a basic search followed by refinement at multiple local minima.

Table I. The challenging while realistic default parameters used in the experiments.

Depth               0.5
Baseline            0.1
Image Dimensions    352 × 288 (CIF)
Noise Std-dev       1 Pixel
Field of View       45 Degrees


Figure 22. The parameters of the test geometry used in the experiments. The distance to the scene volume is used as the unit of measure. The depth of the volume in which scene points are randomized is varied, as is the length of the baseline, the direction of motion, the amount of image noise and the number of points. The second camera centre is placed halfway to the third camera centre.

Figure 23. Same as in Figure 24, but under easy conditions (Depth=2, Baseline=0.3).

Figure 24. Lower quartile error in translation direction with respect to noise in pixels of a CIF image based on 10⁴ trials per data point. The filled dots denote the 3v4p method and the plus marks denote the 5p method. Left: Sideway Motion. Right: Forward Motion.


Figure 25. Lower quartile error in rotation with respect to noise. Left: Sideway Motion. Right: Forward Motion.

Figure 26. Translational error with respect to true translation direction given in degrees from the forward direction. Left: Easy conditions. Right: Default conditions.

Figure 27. Translational error with respect to magnitude of the baseline. Left: Sideway Motion. Right: Forward Motion.

Figure 28. Translational error with respect to scene depth. Left: Sideway Motion. Right: Forward Motion.


case, which resulted in an explicit expression for the fifth degree Cremona mapping arising from five point correspondences. We have shown that if the orientation constraints are taken into account, only a subset of the decic curve corresponds to possible epipoles. As a result, we have found that there are configurations of four pairs of corresponding points that cannot occur in two calibrated images. We have shown that points on the decic curve can be generated in closed form and that it is possible to trace out the curve very efficiently with a one-dimensional sweep. This yields the most efficient solution yet to the notoriously difficult problem of solving for the relative orientation of three calibrated views given four corresponding points. In passing, we have given a novel algorithm for finding the essential matrix given three point correspondences and one of the epipoles.

Appendix

A. Conic Intersection

Assume that we have two conics. They can both be expressed as quadratic equations in the homogeneous image coordinates x = (y z w)⊤. Assume that the image coordinate system has been transformed so that we can safely set w = 1. After a simple Gauss-Jordan elimination we can then write the two quadratic equations as the homogeneous equation system

$$\begin{array}{c|cccc} & y^2 & yz & y & 1 \\ \hline \text{(I)} & 1 & & \alpha & \Gamma \\ \text{(II)} & & 1 & \beta & \Delta \end{array}$$

where α and β are scalars and Γ and ∆ are quadratic polynomials in z. Note that Equation (II) is y(z + β) = −∆. If we insert that into Equation (I) multiplied by (z + β)², we get

$$\Delta^2 - \alpha \Delta (z + \beta) + \Gamma (z + \beta)^2 = 0. \tag{43}$$

This is a quartic polynomial in z from which the up to four real solutions can be extracted in closed form.

B. Three Points Plus Epipole

To support our arguments we need to show that given three point correspondences xᵢ ↔ x′ᵢ and the epipole e in one image, the projective and calibration constraints lead to four solutions for the other epipole e′. To do this, we give a novel algorithm that constructs the four solutions. If the epipole in the first image is known, we can rotate the image coordinate system so that it is on the origin. Then the essential matrix is of the form

$$E = \begin{bmatrix} E_1 & E_2 & 0 \\ E_3 & E_4 & 0 \\ E_5 & E_6 & 0 \end{bmatrix}. \tag{44}$$

A point correspondence x′ ↔ x contributes the constraint

$$\tilde{X}^\top \tilde{E} = 0, \tag{45}$$

where

$$\tilde{X} = \begin{bmatrix} x'_1 x_1 & x'_1 x_2 & x'_2 x_1 & x'_2 x_2 & x'_3 x_1 & x'_3 x_2 \end{bmatrix}^\top \tag{46}$$

and

$$\tilde{E} = \begin{bmatrix} E_1 & E_2 & E_3 & E_4 & E_5 & E_6 \end{bmatrix}^\top. \tag{47}$$

If the vectors X̃⊤ from three point correspondences are stacked, we get a 3 × 6 matrix. Ẽ must be in its 3-dimensional nullspace. Let Y, Z, W be a basis for the nullspace. Then Ẽ is of the form

$$\tilde{E} = yY + zZ + wW, \tag{48}$$

where y, z, w are some scalars. Since an essential matrix E is characterised by having two equal singular values and one zero singular value, we have exactly the two additional constraints

$$E_1 E_2 + E_3 E_4 + E_5 E_6 = 0 \tag{49}$$

$$E_1^2 + E_3^2 + E_5^2 = E_2^2 + E_4^2 + E_6^2. \tag{50}$$

These constraints represent two conics and four solutions for (y z w)⊤.
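The two conics that close Appendix B can be intersected with the elimination of Appendix A. The sketch below builds the quartic of Equation (43) from an already reduced system; it assumes polynomials are stored as coefficient lists in ascending powers of z and does not include the closed-form quartic root extraction. The function names and the numerical example are ours.

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(*polys):
    """Sum of polynomials of possibly different lengths."""
    out = [0.0] * max(len(p) for p in polys)
    for p in polys:
        for i, a in enumerate(p):
            out[i] += a
    return out

def poly_eval(p, z):
    """Evaluate a polynomial by Horner's rule."""
    value = 0.0
    for a in reversed(p):
        value = value * z + a
    return value

def quartic_from_reduced_conics(alpha, beta, gamma, delta):
    """Coefficients of Delta^2 - alpha*Delta*(z + beta) + Gamma*(z + beta)^2."""
    z_plus_beta = [beta, 1.0]
    term1 = poly_mul(delta, delta)
    term2 = [-alpha * c for c in poly_mul(delta, z_plus_beta)]
    term3 = poly_mul(gamma, poly_mul(z_plus_beta, z_plus_beta))
    return poly_add(term1, term2, term3)

def y_from_z(beta, delta, z):
    """Back-substitute a root z into Equation (II): y (z + beta) = -Delta(z)."""
    return -poly_eval(delta, z) / (z + beta)

# Assumed example with a known common point (y, z) = (2, 1):
# (I)  y^2 + alpha*y + Gamma(z) = 0,   (II)  y*z + beta*y + Delta(z) = 0.
alpha, beta = 1.0, 3.0
gamma = [-7.0, 0.0, 1.0]   # Gamma(z) = z^2 - 7
delta = [-9.0, 0.0, 1.0]   # Delta(z) = z^2 - 9
quartic = quartic_from_reduced_conics(alpha, beta, gamma, delta)
print(quartic, poly_eval(quartic, 1.0), y_from_z(beta, delta, 1.0))
```

Each real root z of the quartic gives y by back-substitution into Equation (II), provided z + β is nonzero; the degenerate case z = −β needs separate handling.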

References

Faugeras, O. D.: 1993, Three-Dimensional Computer Vision: a Geometric Viewpoint. MIT Press.

Fischler, M. A. and R. C. Bolles: 1981, 'Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography'. Comm. Assoc. Comp. Mach. 24(6), 381–395.

Haralick, R. M., C. Lee, K. Ottenberg, and M. Nölle: 1994, 'Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem'. International Journal of Computer Vision 13(3), 331–356.

Hartley, R. I. and A. Zisserman: 2000, Multiple View Geometry in Computer Vision. Cambridge University Press.

Holt, R. and A. Netravali: 1995, 'Uniqueness of Solutions to Three Perspective Views of Four Points'. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(3).

Kirwan, F.: 1995, Complex Algebraic Curves. Cambridge University Press.

Maybank, S. J.: 1993, Theory of reconstruction from image motion. Berlin: Springer-Verlag.

Nistér, D.: 2003a, 'An Efficient Solution to the Five-Point Relative Pose Problem'. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, Vol. 2. pp. 195–202.

Nistér, D.: 2003b, 'Preemptive RANSAC for Live Structure and Motion Estimation'. In: Proceedings of the International Conference on Computer Vision. pp. 199–206.

Nistér, D.: 2004, 'An Efficient Solution to the Five-Point Relative Pose Problem'. IEEE Transactions on Pattern Analysis and Machine Intelligence 26(6), 756–770. To appear.

Quan, L., B. Triggs, and B. Mourrain: 2003a, 'Some Results on Minimal Euclidean Reconstruction from Four Points'. Journal of Mathematical Imaging and Vision. To appear.

Quan, L., B. Triggs, B. Mourrain, and A. Ameller: 2003b, 'Uniqueness of Minimal Euclidean Reconstruction from 4 Points'. Unpublished article.

Schaffalitzky, F., A. Zisserman, R. I. Hartley, and P. H. S. Torr: 2000, 'A Six Point Solution for Structure and Motion'. In: Proceedings of the European Conference on Computer Vision. pp. 632–648, Springer-Verlag.

Semple, J. G. and G. T. Kneebone: 1952, Algebraic Projective Geometry. Oxford University Press.

Werner, T.: 2003, 'Constraints on Five Points in Two Images'. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, Vol. 2. pp. 203–208.

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U. S. Government.
