Reconstruction and Prediction from Three Images of Uncalibrated Cameras

Anders Heyden
Dept of Mathematics, Lund University
Box 118, S-221 00 Lund, Sweden
email: [email protected]
Abstract
This paper deals with the problem of reconstructing the locations of a number of points in space from three different images taken by uncalibrated cameras. It is assumed that the correspondences between the points in the different images are known. In the case of six points, this paper shows that there are in general three solutions to the problem of determining the shape of the object, but some of them may be complex and some may not be physically realisable (e.g. points behind the camera). The solutions are given by a third degree polynomial whose coefficients depend on the coordinates of the points in the images. It is also shown how a priori information about the object, such as planarity of subsets of the points, can be used in the reconstruction. In this case the reconstruction is unique and is obtained by a linear method. Furthermore, it is shown how additional points in the first two images can be used to predict the location of the corresponding point in the third image, without calculating the epipoles. Finally, a linear method for reconstruction from at least seven point matches is given.
1. Introduction
A central problem in scene analysis is the reconstruction of 3D objects from 2D images obtained by projections. In this paper we concentrate on the case of three images consisting of several points each, with known correspondences. The objective is to calculate the shape of the object using the shapes of the images. We present a method where no camera calibration is needed, which makes it possible to reconstruct the object up to a projective transformation. In the case of six points this problem was solved independently in [6] and [3], by different methods. Quan [6] uses projective coordinates in the images, eliminates the camera matrices and finally calculates the projective coordinates of the object. In his approach a third degree polynomial gives the three different reconstructions. An analogous polynomial appears in [3] and also in the reconstruction of an object consisting of seven points from two different images, see [8].
The problem of predicting new points in the third image has been treated by Shashua [1] and by Faugeras and Robert [2]. Shashua uses affine invariants and obtains two linear constraints on the coordinates of the points in the third image. Faugeras and Robert use the fundamental matrices between the images to obtain two linear constraints. However, using the epipoles may result in an unstable prediction (e.g. when the epipolar lines are parallel). The reconstruction problem with several points has been treated by Shashua [1], and we will give a similar method here. It is shown that a linear solution is possible in the case of seven points as well as when there are more than seven points. The difference between our method and Shashua's is that we need to estimate only 15 variables, compared to Shashua's 27.

Our approach is based on the concepts of shape and depth. As a by-product of the reconstruction the kinetic depths, see [8], can easily be calculated. These indicate how the camera has moved between the images. We can also use the same ideas to predict points in the third image without calculating the epipoles or the fundamental matrix. Furthermore, it is easy in this framework to incorporate additional information such as planarity of subsets of the points in the object.

We start with a brief introduction to depth and shape in section 2. Then the problem is formulated using these concepts in section 3. In section 4 the solution is given. How a priori information can be used is described in section 5, and how to predict new points is described in section 6. In section 7 we derive the linear method for reconstruction from more than 6 points, in section 8 we give an example with real images, and the conclusions are given in section 9.

2. Depth and shape
In this section we present some basic properties of the concepts of depth and shape. For a more detailed treatment see [7]. We start with the definition of shape.

Definition 2.1.
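A configuration is an ordered set of points in 3-space, $X = (X^1, \ldots, X^n)$. Let $x^i$ denote the coordinates of $X^i$ in some basis, $i = 1, \ldots, n$. Then the shape of $X$ is defined as the linear space
\[
s(X) = \Bigl\{ \xi \;\Bigm|\; \sum_{i=1}^{n} \xi_i x^i = 0,\ \sum_{i=1}^{n} \xi_i = 0 \Bigr\}
     = N \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x^1 & x^2 & \cdots & x^n \end{pmatrix},
\tag{1}
\]
where $N$ denotes the nullspace of the matrix.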
Remark. An important property of $s(X)$ is that it is independent of the coordinate representation of the points. It is even a complete affine invariant. Another property is that it can be used to determine whether the points are collinear, coplanar or not coplanar. In fact the following holds.

Lemma 2.1. Let $X$ be a configuration of $n$ points. Then
\[
\dim s(X) =
\begin{cases}
n - 2 & \text{if the points are collinear,} \\
n - 3 & \text{if the points are coplanar, but not collinear,} \\
n - 4 & \text{if the points are not coplanar.}
\end{cases}
\]
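Lemma 2.1 can be checked on small examples by computing the rank of the matrix in Eq. (1), since $\dim s(X) = n - \operatorname{rank}$. The following is a minimal sketch, assuming exact rational arithmetic; the helper names `rank` and `shape_dimension` are hypothetical and not taken from the paper.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def shape_dimension(points):
    """dim s(X) = n - rank of the matrix in Eq. (1): a row of ones
    stacked on top of the coordinate columns of the n points."""
    n = len(points)
    matrix = [[1] * n] + [[pt[k] for pt in points] for k in range(len(points[0]))]
    return n - rank(matrix)

# Three illustrative configurations (made-up coordinates).
collinear = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4)]
coplanar = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 3, 0)]
general = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (2, 3, 5)]
```

For these inputs the three cases of the lemma give $n - 2 = 3$, $n - 3 = 2$ and $n - 4 = 2$ respectively.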
Definition 2.2. A perspective transformation (or perspectivity) with center $Z$ and image plane $\pi$, $Z \notin \pi$, is a mapping with the property that every point on a line through $Z$ is mapped onto the intersection of the line with $\pi$, see Figure 1. If one configuration $X$ is mapped onto another $Y$, where $Y$ is planar, then for some $\alpha = (\alpha_1, \ldots, \alpha_n)$ with $\alpha_i \neq 0$, $i = 1, \ldots, n$, it holds that
\[
\overrightarrow{ZX^i} = \alpha_i \overrightarrow{ZY^i}, \quad i = 1, \ldots, n.
\]
The vector $\alpha$ is called the depth of $X$ with respect to $Y$. A projectivity is a composition of perspective transformations.

Remark. The center $Z$ may be a point at infinity, and then the perspective transformation is just a parallel projection in the direction given by $Z$.

Fig. 1. A perspective transformation.

The importance of these concepts is illustrated in the following theorems (for more details see [7]).

Theorem 2.1. If $X$ and $Y$ are planar configurations, then the following statements are equivalent:
(1) There exists a perspectivity $P$ such that $P(X)$ and $Y$ have equal shape, and $X$ has depth $\alpha$ with respect to $P(X)$.
(2) $\operatorname{diag}(\alpha)\,s(X) = s(Y)$.

This theorem says that whenever an $X$-configuration with a given shape $s(X)$ can be mapped by a perspectivity onto a $Y$-configuration with a given shape $s(Y)$, that mapping must have the depth $\alpha$ given by the theorem, independently of $Z$ and $\pi$.
Theorem 2.2. If $X$ and $Y$ are point configurations, then the following statements are equivalent:
(1) There exists a perspectivity $P$ such that $P(X)$ and $Y$ have equal shape, and $X$ has depth $\alpha$ with respect to $P(X)$.
(2) $\operatorname{diag}(\alpha)\,s(X) \subseteq s(Y)$.

This theorem says that we can determine whether a two dimensional point configuration is a projective image of a three dimensional point configuration by looking at linear spaces.

3. Problem formulation
The problem we will consider is the following, see [8] for a more detailed description.

Problem 3.1. Reciprocal Chasles' Problem (RCP). Given three planar six point configurations $Y_1$, $Y_2$ and $Y_3$, determine all configurations $X_1$, $X_2$ and $X_3$ such that there exist projectivities $P_i$ fulfilling
\[
Y_i = P_i(X_i), \quad i = 1, 2, 3, \qquad s(X_1) = s(X_2) = s(X_3).
\tag{2}
\]
Remark. The original (RCP) dealt with two planar seven point configurations. This is a straightforward generalisation.

Remark. There is no difference between assuming $s(X_1) = s(X_2) = s(X_3)$ and assuming $X_1 = X_2 = X_3$, because the affine transformations allowed between the $X_i$ cannot be distinguished from an extra affine transformation in $P_i$.

Remark. We may even assume that there are projective transformations between the $X_i$, because these can be incorporated in $P_i$, which is unknown. This means, however, that the kinetic depths, which will be introduced later, lose their physical interpretation as quotients of depths.

Theorem 2.2 shows that Eq. (2) is equivalent to the existence of $\alpha_1$, $\alpha_2$ and $\alpha_3$ such that
\[
\operatorname{diag}(\alpha_1)s(X) \subseteq s(Y_1), \quad
\operatorname{diag}(\alpha_2)s(X) \subseteq s(Y_2), \quad
\operatorname{diag}(\alpha_3)s(X) \subseteq s(Y_3), \quad
s(X) = s(X_1) = s(X_2) = s(X_3),
\tag{3}
\]
or equivalently, if $\operatorname{diag}(\alpha_i)$ is nonsingular,
\[
s(X) \subseteq \operatorname{diag}(\alpha_i^{-1})\,s(Y_i), \quad i = 1, 2, 3.
\tag{4}
\]
This can be rewritten as
\[
s(X) \subseteq \operatorname{diag}(\alpha_1^{-1})s(Y_1) \cap \operatorname{diag}(\alpha_2^{-1})s(Y_2) \cap \operatorname{diag}(\alpha_3^{-1})s(Y_3).
\tag{5}
\]
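Here it is assumed that $X$ is three dimensional, and thus the left hand side has dimension $6 - 4 = 2$ according to Lemma 2.1. If the spaces $\operatorname{diag}(\alpha_i^{-1})s(Y_i)$, $i = 1, 2, 3$, do not coincide we must have equality in Eq. (5). Multiplying by $\operatorname{diag}(\alpha_1)$ and introducing the vectors $p$ and $q$ through $\operatorname{diag}(p) = \operatorname{diag}(\alpha_1)^{-1}\operatorname{diag}(\alpha_2)$ and $\operatorname{diag}(q) = \operatorname{diag}(\alpha_1)^{-1}\operatorname{diag}(\alpha_3)$ gives
\[
s(X) \subseteq s(Y_1) \cap \operatorname{diag}(p)^{-1}s(Y_2) \cap \operatorname{diag}(q)^{-1}s(Y_3),
\tag{6}
\]
where $p$ and $q$ are the kinetic depths between images 1 and 2 and between images 1 and 3, respectively, see [8]. Hence the problem is reduced to the following, see [8].

Problem 3.2. Weak Chasles' Problem (WCP). Given planar configurations $Y_1$, $Y_2$ and $Y_3$, find $p$ and $q$ such that
\[
\dim\bigl(s(Y_1) \cap \operatorname{diag}(p)^{-1}s(Y_2) \cap \operatorname{diag}(q)^{-1}s(Y_3)\bigr) = 2.
\tag{7}
\]
Taking orthogonal complements we arrive at the equivalent condition
\[
\dim\bigl(s(Y_1)^{\perp} + s(Y_2)^{\perp}\operatorname{diag}(p) + s(Y_3)^{\perp}\operatorname{diag}(q)\bigr) = 4.
\tag{8}
\]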
Here $s(Y_i)^{\perp}$, $i = 1, 2, 3$, are the orthogonal complements of $s(Y_i)$, and we have used that the orthogonal complement of $\operatorname{diag}(\alpha)^{-1}s(X)$ is $s(X)^{\perp}\operatorname{diag}(\alpha)$. From the second equality in Eq. (1) it follows that $s(Y_i)^{\perp}$ can be represented as the range of the matrix $Y_i^T$, where $Y_i$ contains as columns the coordinates in an affine system of the six points in image $i$. Forming in the same way a matrix $X^T$ spanning the 4-dimensional space in Eq. (8), the columns of $X$ define one possible reconstruction. All other reconstructions are then obtained by projective transformations of this first one. The special reconstruction corresponding to $X$ is the one obtained if the first camera is assumed to be at infinity, that is, the first projectivity $P_1$ is a parallel projection.

Given this affine representative of the projective reconstruction we can calculate the projective coordinates as follows. Let the affine coordinates of the six points be $x^1 = (1, 0, 0, 0)$, $x^2 = (0, 1, 0, 0)$, $x^3 = (0, 0, 1, 0)$, $x^4 = (0, 0, 0, 1)$, $x^5 = (x_1, y_1, z_1, w_1)$ and $x^6 = (x_2, y_2, z_2, w_2)$. New representatives are obtained by multiplying $x^1$ by $x_1$, $x^2$ by $y_1$, $x^3$ by $z_1$ and $x^4$ by $w_1$, and then dividing the first component of each vector by $x_1$, the second by $y_1$, the third by $z_1$ and the fourth by $w_1$. This gives the following representation: $x^1$, $x^2$, $x^3$ and $x^4$ are the same as before, $x^5 = (1, 1, 1, 1)$ and $x^6 = (x_2/x_1,\ y_2/y_1,\ z_2/z_1,\ w_2/w_1)$. This representation is in projective canonical form. Observe that all operations above on these coordinates are projective transformations. The operations can also be described as multiplication of $X$ by diagonal matrices.

4. Solution of Weak Chasles' Problem
In this section we outline a method to solve the Weak Chasles' Problem defined in Problem 3.2. We start with Eq. (8), which can be formulated as a rank condition on a matrix:
\[
\operatorname{rank}\bigl[\, Y_1^T \;\big|\; \operatorname{diag}(p)\,Y_2^T \;\big|\; \operatorname{diag}(q)\,Y_3^T \,\bigr] = 4.
\tag{9}
\]
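The rank condition in Eq. (9) can be illustrated on synthetic data. The sketch below assumes a simple model that is not spelled out in the paper: each camera is a hypothetical 4x3 integer matrix, the depth of a point is the sum of its three projected coordinates (so that the affine image coordinates sum to 1), and the kinetic depths are taken as quotients of these depths. All numbers and helper names are made up for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def project(points, camera):
    """Project 4-coordinate points with a 4x3 camera matrix; return the
    affine image coordinates (summing to 1) and the depth of each point."""
    images, depths = [], []
    for x in points:
        z = [sum(Fraction(x[r]) * camera[r][c] for r in range(4)) for c in range(3)]
        depth = sum(z)
        images.append([v / depth for v in z])
        depths.append(depth)
    return images, depths

def stacked_rank(y1, y2, y3, p, q):
    """Rank of [Y1^T | diag(p) Y2^T | diag(q) Y3^T], one row per point."""
    rows = [list(y1[j]) + [p[j] * v for v in y2[j]] + [q[j] * v for v in y3[j]]
            for j in range(len(y1))]
    return rank(rows)

# Hypothetical object (six points) and three hypothetical cameras.
points = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (1, 1, 1, 1), (3, 1, 2, 1)]
cam1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 2, 3]]
cam2 = [[2, 0, 1], [0, 1, 0], [1, 0, 1], [1, 2, 1]]
cam3 = [[1, 1, 0], [0, 2, 1], [1, 0, 3], [2, 1, 1]]

y1, depth1 = project(points, cam1)
y2, depth2 = project(points, cam2)
y3, depth3 = project(points, cam3)

# Kinetic depths as quotients of depths relative to the first image.
p = [b / a for a, b in zip(depth1, depth2)]
q = [c / a for a, c in zip(depth1, depth3)]

rank_true = stacked_rank(y1, y2, y3, p, q)           # rank with the correct kinetic depths
rank_unit = stacked_rank(y1, y2, y3, [1] * 6, [1] * 6)  # rank with all depths set to 1
```

With the correct kinetic depths the stacked matrix has rank 4, whereas replacing them by ones gives a larger rank for this data, which is what makes Eq. (9) a usable constraint on $p$ and $q$.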
Introduce affine coordinates in the image planes and in 3-space such that
\[
Y_1^T = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ a_1&b_1&c_1\\ a_2&b_2&c_2\\ a_3&b_3&c_3 \end{pmatrix},\quad
Y_2^T = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ d_1&e_1&f_1\\ d_2&e_2&f_2\\ d_3&e_3&f_3 \end{pmatrix},\quad
Y_3^T = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ g_1&h_1&k_1\\ g_2&h_2&k_2\\ g_3&h_3&k_3 \end{pmatrix},\quad
X^T = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\ x_1&y_1&z_1&w_1\\ x_2&y_2&z_2&w_2 \end{pmatrix}.
\]
This gives the following equivalent formulation of Eq. (9):
\[
R\begin{pmatrix}
1&0&0&p_1&0&0&q_1&0&0\\
0&1&0&0&p_2&0&0&q_2&0\\
0&0&1&0&0&p_3&0&0&q_3\\
a_1&b_1&c_1&p_4d_1&p_4e_1&p_4f_1&q_4g_1&q_4h_1&q_4k_1\\
a_2&b_2&c_2&p_5d_2&p_5e_2&p_5f_2&q_5g_2&q_5h_2&q_5k_2\\
a_3&b_3&c_3&p_6d_3&p_6e_3&p_6f_3&q_6g_3&q_6h_3&q_6k_3
\end{pmatrix}
= R\begin{pmatrix}
1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\ x_1&y_1&z_1&w_1\\ x_2&y_2&z_2&w_2
\end{pmatrix},
\]
where $R$ means range, that is, the linear space spanned by the columns of the matrix. This equation means that all nine vectors in the left hand matrix can be written as linear combinations of the four vectors in the right hand matrix. Thus the first column vector in the left matrix can be written as the first vector in the right matrix plus $a_1$ times the fourth, and similarly for the second and third column vectors on the left. This gives, with $w_1 = s$ and $w_2 = t$,
\[
\begin{aligned}
x_1 &= a_2 - a_1 s, & y_1 &= b_2 - b_1 s, & z_1 &= c_2 - c_1 s, \\
x_2 &= a_3 - a_1 t, & y_2 &= b_3 - b_1 t, & z_2 &= c_3 - c_1 t.
\end{aligned}
\tag{10}
\]
Thus the problem is reduced to
\[
R\begin{pmatrix}
p_1&0&0&q_1&0&0\\
0&p_2&0&0&q_2&0\\
0&0&p_3&0&0&q_3\\
p_4d_1&p_4e_1&p_4f_1&q_4g_1&q_4h_1&q_4k_1\\
p_5d_2&p_5e_2&p_5f_2&q_5g_2&q_5h_2&q_5k_2\\
p_6d_3&p_6e_3&p_6f_3&q_6g_3&q_6h_3&q_6k_3
\end{pmatrix}
= R\begin{pmatrix}
1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\
a_2 - a_1 s & b_2 - b_1 s & c_2 - c_1 s & s\\
a_3 - a_1 t & b_3 - b_1 t & c_3 - c_1 t & t
\end{pmatrix}.
\]
Writing the six vectors in the left matrix as linear combinations of the four in the right matrix gives
\[
\begin{aligned}
p_5 d_2 &= p_1(a_2 - a_1 s) + p_4 d_1 s, & p_6 d_3 &= p_1(a_3 - a_1 t) + p_4 d_1 t, \\
p_5 e_2 &= p_2(b_2 - b_1 s) + p_4 e_1 s, & p_6 e_3 &= p_2(b_3 - b_1 t) + p_4 e_1 t, \\
p_5 f_2 &= p_3(c_2 - c_1 s) + p_4 f_1 s, & p_6 f_3 &= p_3(c_3 - c_1 t) + p_4 f_1 t,
\end{aligned}
\tag{11}
\]
and another six equations from the last three vectors in the left hand matrix,
\[
\begin{aligned}
q_5 g_2 &= q_1(a_2 - a_1 s) + q_4 g_1 s, & q_6 g_3 &= q_1(a_3 - a_1 t) + q_4 g_1 t, \\
q_5 h_2 &= q_2(b_2 - b_1 s) + q_4 h_1 s, & q_6 h_3 &= q_2(b_3 - b_1 t) + q_4 h_1 t, \\
q_5 k_2 &= q_3(c_2 - c_1 s) + q_4 k_1 s, & q_6 k_3 &= q_3(c_3 - c_1 t) + q_4 k_1 t.
\end{aligned}
\tag{12}
\]
The first six equations, in Eq. (11), are linear in $p_i$. They have a nontrivial solution if and only if
\[
\det U_1 = \det\begin{pmatrix}
a_2 - a_1 s & 0 & 0 & d_1 s & -d_2 & 0 \\
a_3 - a_1 t & 0 & 0 & d_1 t & 0 & -d_3 \\
0 & b_2 - b_1 s & 0 & e_1 s & -e_2 & 0 \\
0 & b_3 - b_1 t & 0 & e_1 t & 0 & -e_3 \\
0 & 0 & c_2 - c_1 s & f_1 s & -f_2 & 0 \\
0 & 0 & c_3 - c_1 t & f_1 t & 0 & -f_3
\end{pmatrix} = 0,
\tag{13}
\]
and the second six, in Eq. (12), are linear in $q_i$ and have a nontrivial solution if and only if
\[
\det U_2 = \det\begin{pmatrix}
a_2 - a_1 s & 0 & 0 & g_1 s & -g_2 & 0 \\
a_3 - a_1 t & 0 & 0 & g_1 t & 0 & -g_3 \\
0 & b_2 - b_1 s & 0 & h_1 s & -h_2 & 0 \\
0 & b_3 - b_1 t & 0 & h_1 t & 0 & -h_3 \\
0 & 0 & c_2 - c_1 s & k_1 s & -k_2 & 0 \\
0 & 0 & c_3 - c_1 t & k_1 t & 0 & -k_3
\end{pmatrix} = 0.
\tag{14}
\]
Definition 4.3. The matrices $U_1$ and $U_2$ in Eq. (13) and Eq. (14) are called the universal matrices.

Then we have the following theorem.

Theorem 4.3. The Weak Chasles' Problem (WCP) has a solution if and only if there exists a common solution to the equations
\[
\det(U_i) = 0, \quad i = 1, 2,
\]
where $U_i$ are the universal matrices.

Remark. Observe that one common solution to $\det(U_i) = 0$, $i = 1, 2$, is $s = 0$ and $t = 0$. However, this implies that all points in $X$ except the fourth lie in a common plane, which cannot be a generic case. A further property of this solution is that, in general, all $p_i$ except $p_4$ are zero and all $q_i$ except $q_4$ are zero. There are also the following three obvious solutions:
\[
(s, t) = (a_2/a_1,\ a_3/a_1), \qquad (s, t) = (b_2/b_1,\ b_3/b_1), \qquad (s, t) = (c_2/c_1,\ c_3/c_1).
\]
These solutions imply in turn that all points in $X$ except the first, second and third one, respectively, lie in a common plane. Again this cannot be a generic case. A further property of these solutions is that, in general, all $p_i$ and $q_i$ except those with $i = 1$, $i = 2$ and $i = 3$, respectively, are zero. We have to be aware of these solutions when solving the equations in Theorem 4.3, since their existence introduces difficulties in numerical algorithms.

Observe that when we have solved Eq. (13) and Eq. (14) for $s$ and $t$, the kinetic depth vectors $p$ and $q$ can be computed as the nullspaces of the universal matrices $U_i$. These kinetic depth vectors tell us something about how the camera has been moved between the exposures of the three images, and can be interpreted as depth flow vectors, see [8].
Computing the determinants of the universal matrices in Eq. (13) and Eq. (14) gives two equations
\[
\begin{aligned}
l_1 s + l_2 t + l_3 s^2 + l_4 st + l_5 t^2 + l_6 s^2 t + l_7 s t^2 &= 0, \\
m_1 s + m_2 t + m_3 s^2 + m_4 st + m_5 t^2 + m_6 s^2 t + m_7 s t^2 &= 0,
\end{aligned}
\tag{15}
\]
where the coefficients $l_i$ and $m_i$ are polynomial expressions in $a_i$, $b_i$, $c_i$, $d_i$, $e_i$, $f_i$, $g_i$, $h_i$ and $k_i$. Regarded as polynomials in $t$, the equations in Eq. (15) have a common solution if and only if the resultant is zero,
\[
\det\begin{pmatrix}
l_1 s + l_3 s^2 & l_2 + l_4 s + l_6 s^2 & l_5 + l_7 s & 0 \\
0 & l_1 s + l_3 s^2 & l_2 + l_4 s + l_6 s^2 & l_5 + l_7 s \\
m_1 s + m_3 s^2 & m_2 + m_4 s + m_6 s^2 & m_5 + m_7 s & 0 \\
0 & m_1 s + m_3 s^2 & m_2 + m_4 s + m_6 s^2 & m_5 + m_7 s
\end{pmatrix} = 0.
\tag{16}
\]
This is a seventh degree polynomial equation in $s$ of the form
\[
n_7 s^7 + n_6 s^6 + n_5 s^5 + n_4 s^4 + n_3 s^3 + n_2 s^2 + n_1 s = 0.
\tag{17}
\]
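Eq. (17) has no constant term, so $s = 0$ is a root, and the remark after Theorem 4.3 lists three more roots that are known in advance. Dividing out known roots is a standard deflation step; the sketch below does it by synthetic division in exact arithmetic. The function names and the test polynomial are illustrative assumptions, and with measured (noisy) coefficients the exact remainder check would have to be replaced by a tolerance.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += Fraction(x) * Fraction(y)
    return out

def deflate(coeffs, root):
    """Divide a polynomial by (s - root) using Horner's scheme.
    Returns (quotient coefficients, remainder)."""
    quotient = []
    carry = Fraction(0)
    for c in coeffs:
        carry = carry * root + c
        quotient.append(carry)
    remainder = quotient.pop()
    return quotient, remainder

def reduce_to_cubic(n_coeffs, known_roots):
    """Remove the known roots from the degree-7 polynomial one at a time."""
    poly = list(n_coeffs)
    for r in known_roots:
        poly, remainder = deflate(poly, Fraction(r))
        if remainder != 0:
            raise ValueError("value %s is not a root of the polynomial" % r)
    return poly

# Made-up example: a cubic times s and three further linear factors.
cubic = [1, -6, 11, -6]  # (s - 1)(s - 2)(s - 3)
known = [Fraction(0), Fraction(1, 2), Fraction(1, 3), Fraction(5)]
septic = cubic
for r in known:
    septic = poly_mul(septic, [1, -r])
recovered = reduce_to_cubic(septic, known)
```

Here `known` plays the role of $0$, $a_2/a_1$, $b_2/b_1$ and $c_2/c_1$, and `recovered` is the remaining cubic whose roots are the candidate values of $s$.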
Considering Eq. (13) and Eq. (14) we immediately see three common solutions, as pointed out in the remark above: $s = a_2/a_1$, $s = b_2/b_1$ and $s = c_2/c_1$. Hence the three factors $s - a_2/a_1$, $s - b_2/b_1$ and $s - c_2/c_1$ can be eliminated from Eq. (17), as well as the factor $s$. This gives a polynomial equation in $s$ of degree 3,
\[
a s^3 + b s^2 + c s + d = 0,
\]
where the coefficients $a$, $b$, $c$ and $d$ are polynomial expressions in $a_i$, $b_i$, $c_i$, $d_i$, $e_i$, $f_i$, $g_i$, $h_i$ and $k_i$. The coefficients can be computed in MAPLE but they are very complicated. The degree of these polynomials is 24 and there are 27 variables, which gives fewer than $10^{14}$ terms. Since polynomial equations of degree three can be solved by a closed formula, this shows that we have closed form expressions for the three solutions to RCP. Of course these closed expressions are very complicated, and using them is certainly not the best method to compute the solutions. Instead we first calculate $l_i$ and $m_i$, then $n_i$, and finally $a$, $b$, $c$ and $d$. This is simple to implement because, given the image coordinates, each of these steps is a straightforward calculation.

5. Using a priori information of the object
If we have some a priori information about the object, it is desirable to use it in the reconstruction. The kind of a priori information that is of interest here is planarity. If we know that four specific points are coplanar, say points 1, 2, 3 and 6, then $w_2 = 0$. Thus Eq. (10) gives $t = 0$. With $t = 0$, Eq. (13) and Eq. (14) reduce, by Eq. (15), to
\[
l_1 s + l_3 s^2 = (l_1 + l_3 s)s = 0, \qquad m_1 s + m_3 s^2 = (m_1 + m_3 s)s = 0.
\]
If the points 1, 2, 3 and 5 are not coplanar, then $s = 0$ can be excluded. Solving the two remaining linear equations in one variable in the least squares sense (after dividing each equation by its coefficient of $s$) gives
\[
s = -\frac{1}{2}\Bigl(\frac{l_1}{l_3} + \frac{m_1}{m_3}\Bigr).
\tag{18}
\]
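Eq. (18) is the mean of the two individual roots $-l_1/l_3$ and $-m_1/m_3$. For comparison, the sketch below also shows the least squares solution of the two unnormalised equations, which is an alternative that is not taken from the paper. Function names and numbers are illustrative assumptions.

```python
def s_from_planarity(l1, l3, m1, m3):
    """Eq. (18): mean of the roots of l1 + l3*s = 0 and m1 + m3*s = 0."""
    return -0.5 * (l1 / l3 + m1 / m3)

def s_unnormalised_least_squares(l1, l3, m1, m3):
    """Alternative (not from the paper): minimise
    (l1 + l3*s)**2 + (m1 + m3*s)**2 over s."""
    return -(l1 * l3 + m1 * m3) / (l3 * l3 + m3 * m3)

# Consistent made-up coefficients: both equations have the root s = 2.
s_exact = s_from_planarity(-2.0, 1.0, -6.0, 3.0)

# Perturbed made-up coefficients: the individual roots are 2.2 and 1.8.
s_mean = s_from_planarity(-2.2, 1.0, -5.4, 3.0)
s_ls = s_unnormalised_least_squares(-2.2, 1.0, -5.4, 3.0)
```

On consistent data the two estimates agree. On perturbed data they differ, because the unnormalised variant weights each equation by the size of its coefficient of $s$.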
Now we have the reconstruction and can find the kinetic depth vectors by finding the right nullspaces of the universal matrices. For noisy data, the determinant is in general not exactly zero. To overcome this, a singular value decomposition can be used. We conclude this section with three further observations.

1. Suppose we know that two collections of four points are coplanar in the object, say points 1, 2, 3, 6 and points 1, 2, 4, 5. Then $z_1 = 0$ and $w_2 = 0$, and as before $t = 0$. Now Eq. (10) also gives $z_1 = c_2 - c_1 s = 0$, i.e. $s = c_2/c_1$, and the reconstruction is completed. Finally we can solve for the kinetic depth vectors by a singular value decomposition of the universal matrices, just as in the previous case. Observe that there exists no canonical projective frame of the configuration $X$ when $z_1 = 0$ and $w_2 = 0$, because no subcollection of five points is projectively independent. Here projective independence of five points means that no subcollection of four points is coplanar.

2. If we know that five points are in a common plane, say points 1, 2, 3, 5 and 6, then $w_1 = 0$ and $w_2 = 0$. This gives $s = t = 0$ and we can proceed as before. Also in this case it is impossible to find the projective canonical frame, because the points are not in general position.

3. Finally, if all points are in a common plane, then we can use Theorem 2.2 directly to find $q$; see [5] for another method to solve these problems.
6. Predicting points in the third image
Suppose we have three images of six points together with an additional point in each of the first two images. We will show that it is then possible to predict this point in the third image. If we know which one of the three reconstructions to use, we know $s$ and $t$ as well as the kinetic depths $p$ and $q$ up to proportionality. Consider now the seventh point, with affine coordinates $(a_4, b_4, c_4)$ in the first image and $(d_4, e_4, f_4)$ in the second image. By means of the universal matrix corresponding to points 1, 2, 3, 4, 5 and 7 and images one and two, we get
\[
\begin{pmatrix}
a_2 - a_1 s & 0 & 0 & d_1 s & -d_2 & 0 \\
a_4 - a_1 u & 0 & 0 & d_1 u & 0 & -d_4 \\
0 & b_2 - b_1 s & 0 & e_1 s & -e_2 & 0 \\
0 & b_4 - b_1 u & 0 & e_1 u & 0 & -e_4 \\
0 & 0 & c_2 - c_1 s & f_1 s & -f_2 & 0 \\
0 & 0 & c_4 - c_1 u & f_1 u & 0 & -f_4
\end{pmatrix}
\begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \\ p_7 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\tag{19}
\]
Here we have assumed the following affine coordinates in the object:
\[
X = \begin{pmatrix}
1 & 0 & 0 & 0 & x_1 & x_2 & x_3 \\
0 & 1 & 0 & 0 & y_1 & y_2 & y_3 \\
0 & 0 & 1 & 0 & z_1 & z_2 & z_3 \\
0 & 0 & 0 & 1 & s & t & u
\end{pmatrix}.
\tag{20}
\]
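The second, fourth and sixth equations in Eq. (19) give
\[
\begin{aligned}
(a_4 - a_1 u)p_1 + d_1 u p_4 - d_4 p_7 &= 0, \\
(b_4 - b_1 u)p_2 + e_1 u p_4 - e_4 p_7 &= 0, \\
(c_4 - c_1 u)p_3 + f_1 u p_4 - f_4 p_7 &= 0,
\end{aligned}
\tag{21}
\]
which can be written as
\[
\begin{pmatrix}
a_4 p_1 & d_1 p_4 - a_1 p_1 & -d_4 \\
b_4 p_2 & e_1 p_4 - b_1 p_2 & -e_4 \\
c_4 p_3 & f_1 p_4 - c_1 p_3 & -f_4
\end{pmatrix}
\begin{pmatrix} 1 \\ u \\ p_7 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\tag{22}
\]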
Here we know everything except $u$ and $p_7$. Thus it is possible to find the nullspace of the matrix in Eq. (22), which gives $u$ and $p_7$. Observe that this is an overdetermined system (three equations in two unknowns), which can be helpful in determining which one of the three solutions for $s$ to use. We can get an indication of which of them is the correct one by studying Eq. (22) for the three different solutions for $p_i$, $i = 1, \ldots, 6$, obtained from the three different solutions for $s$ via the universal matrices.

Consider now the universal matrix corresponding to points 1, 2, 3, 4, 5 and 7 and images one and three. It can easily be seen from Eq. (13) that the determinant of the universal matrix is linear in the coordinates in the last column. However, we will make use of the actual reconstruction done above and project the reconstructed object down to the third image. This can be done from the following set of equations, which follows exactly as Eq. (21) from the universal matrix corresponding to points 1, 2, 3, 4, 5 and 7 and images one and three:
\[
\left.
\begin{aligned}
(a_4 - a_1 u)q_1 + g_1 u q_4 - g_4 q_7 &= 0 \\
(b_4 - b_1 u)q_2 + h_1 u q_4 - h_4 q_7 &= 0 \\
(c_4 - c_1 u)q_3 + k_1 u q_4 - k_4 q_7 &= 0
\end{aligned}
\right\}
\;\Longrightarrow\;
q_7 \begin{pmatrix} g_4 \\ h_4 \\ k_4 \end{pmatrix}
=
\begin{pmatrix}
(a_4 - a_1 u)q_1 + g_1 u q_4 \\
(b_4 - b_1 u)q_2 + h_1 u q_4 \\
(c_4 - c_1 u)q_3 + k_1 u q_4
\end{pmatrix}
\tag{23}
\]
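The two steps of the prediction, solving Eq. (22) for $u$ and $p_7$ and then evaluating Eq. (23), can be sketched as follows. This is a minimal illustration under stated assumptions: the overdetermined system is solved by ordinary least squares through its 2x2 normal equations (the paper does not prescribe a solver), the function and argument names are hypothetical, and all numbers are synthetic.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def solve_u_p7(pt4_img1, pt7_img1, pt4_img2, pt7_img2, p):
    """Least squares solution of Eq. (22) for (u, p7).
    pt4_img1 = (a1, b1, c1), pt7_img1 = (a4, b4, c4),
    pt4_img2 = (d1, e1, f1), pt7_img2 = (d4, e4, f4), p = (p1, p2, p3, p4)."""
    const = [pt7_img1[i] * p[i] for i in range(3)]                       # first column
    col_u = [pt4_img2[i] * p[3] - pt4_img1[i] * p[i] for i in range(3)]  # column multiplying u
    col_w = [-pt7_img2[i] for i in range(3)]                             # column multiplying p7
    g11, g12, g22 = dot(col_u, col_u), dot(col_u, col_w), dot(col_w, col_w)
    r1, r2 = -dot(col_u, const), -dot(col_w, const)
    det = g11 * g22 - g12 * g12
    return (r1 * g22 - g12 * r2) / det, (g11 * r2 - g12 * r1) / det

def predict_third_image(pt4_img1, pt7_img1, pt4_img3, q, u):
    """Right hand side of Eq. (23), normalised so the coordinates sum to 1.
    pt4_img3 = (g1, h1, k1), q = (q1, q2, q3, q4)."""
    raw = [(pt7_img1[i] - pt4_img1[i] * u) * q[i] + pt4_img3[i] * u * q[3]
           for i in range(3)]
    q7 = sum(raw)  # equals q7 because (g4, h4, k4) sum to 1
    return [r / q7 for r in raw]

# Synthetic data built so that Eq. (21) holds exactly for u = 0.5.
pt4_img1, pt7_img1, pt4_img2 = (0.2, 0.3, 0.5), (0.4, 0.4, 0.2), (0.3, 0.3, 0.4)
p = (1.0, 1.1, 0.9, 1.2)
u_true = 0.5
rhs = [(pt7_img1[i] - pt4_img1[i] * u_true) * p[i] + pt4_img2[i] * u_true * p[3]
       for i in range(3)]
p7_true = sum(rhs)
pt7_img2 = [r / p7_true for r in rhs]

u_est, p7_est = solve_u_p7(pt4_img1, pt7_img1, pt4_img2, pt7_img2, p)
prediction = predict_third_image(pt4_img1, pt7_img1, (0.25, 0.35, 0.4),
                                 (1.0, 0.95, 1.05, 1.1), u_est)
```

With real measurements the residual of the least squares fit could serve as the consistency indicator mentioned above for choosing among the three solutions for $s$.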
Here $(g_4, h_4, k_4)$ are affine coordinates for the seventh point in the third image. Thus we can uniquely determine the image coordinates, since the sum of the coordinates is 1.

7. Linear reconstruction
Looking at the first three equations in Eq. (11) and Eq. (12) (those involving $s$), it can be seen that they are linear in $s$, $p_5$ and $q_5$ and can be written as
\[
\begin{pmatrix}
p_1 a_2 & p_4 d_1 - p_1 a_1 & -d_2 & 0 \\
p_2 b_2 & p_4 e_1 - p_2 b_1 & -e_2 & 0 \\
p_3 c_2 & p_4 f_1 - p_3 c_1 & -f_2 & 0 \\
q_1 a_2 & q_4 g_1 - q_1 a_1 & 0 & -g_2 \\
q_2 b_2 & q_4 h_1 - q_2 b_1 & 0 & -h_2 \\
q_3 c_2 & q_4 k_1 - q_3 c_1 & 0 & -k_2
\end{pmatrix}
\begin{pmatrix} 1 \\ s \\ p_5 \\ q_5 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\tag{24}
\]
Observe that similar equations are obtained from the last three equations in Eq. (11) and Eq. (12) (those involving $t$). We just have to change $(d_2, e_2, f_2)$ to $(d_3, e_3, f_3)$, $(g_2, h_2, k_2)$ to $(g_3, h_3, k_3)$, $s$ to $t$, and $p_5$ and $q_5$ to $p_6$ and $q_6$. Introducing
\[
\begin{aligned}
t &= (p_4 d_1 - p_1 a_1,\ p_4 e_1 - p_2 b_1,\ p_4 f_1 - p_3 c_1), \\
\bar t &= (q_4 g_1 - q_1 a_1,\ q_4 h_1 - q_2 b_1,\ q_4 k_1 - q_3 c_1),
\end{aligned}
\tag{25}
\]
Eq. (24) has a solution if and only if
\[
\operatorname{rank}\begin{pmatrix}
p_1 x & t_1 & \bar x & 0 \\
p_2 y & t_2 & \bar y & 0 \\
p_3 z & t_3 & \bar z & 0 \\
q_1 x & \bar t_1 & 0 & \hat x \\
q_2 y & \bar t_2 & 0 & \hat y \\
q_3 z & \bar t_3 & 0 & \hat z
\end{pmatrix} < 4,
\tag{26}
\]
where now $(x, y, z)$, $(\bar x, \bar y, \bar z)$ and $(\hat x, \hat y, \hat z)$ denote coordinates of arbitrary but corresponding points in the three images. This rank condition can be expressed as the vanishing of the $4 \times 4$ subdeterminants of the matrix in Eq. (26). These subdeterminants are linear in the coordinates of each image, and it turns out that there are four linearly independent such expressions, see also [1] and [4]. Furthermore, there are 15 different coefficients $k_i$ that can be expressed in $p_i$, $q_i$, $t_i$ and $\bar t_i$. They are
\[
\begin{aligned}
k_1 &= t_1 q_1 - \bar t_1 p_1, & k_2 &= \bar t_2 p_1, & k_3 &= t_1 q_2, & k_4 &= \bar t_3 p_1, & k_5 &= t_1 q_3, \\
k_6 &= t_2 q_2 - \bar t_2 p_2, & k_7 &= t_2 q_1, & k_8 &= \bar t_1 p_2, & k_9 &= \bar t_3 p_2, & k_{10} &= t_2 q_3, \\
k_{11} &= t_3 q_3 - \bar t_3 p_3, & k_{12} &= t_3 q_1, & k_{13} &= \bar t_1 p_3, & k_{14} &= t_3 q_2, & k_{15} &= \bar t_2 p_3.
\end{aligned}
\tag{27}
\]
The four linear expressions can now be written as
\[
\begin{aligned}
k_5 \hat x \bar z z - k_4 \hat x \bar z x - k_{11} \hat x \bar x z + k_{12} \hat z \bar x x - k_{13} \hat z \bar x z - k_1 \hat z \bar z x &= 0, \\
k_4 \hat y \bar z x - k_5 \hat y \bar z z + k_{11} \hat y \bar x z - k_{14} \hat z \bar x y + k_{15} \hat z \bar x z - k_2 \hat z \bar z x + k_3 \hat z \bar z y &= 0, \\
k_9 \hat x \bar z y - k_{10} \hat x \bar z z + k_{11} \hat x \bar y z - k_{12} \hat z \bar y x + k_{13} \hat z \bar y z + k_7 \hat z \bar z x - k_8 \hat z \bar z y &= 0, \\
k_9 \hat y \bar z y - k_{10} \hat y \bar z z + k_{11} \hat y \bar y z - k_{14} \hat z \bar y y + k_{15} \hat z \bar y z + k_6 \hat z \bar z y &= 0,
\end{aligned}
\tag{28}
\]
which are obtained from the determinants of the matrices formed by rows 1, 3, 4, 6, rows 1, 3, 5, 6, rows 2, 3, 4, 6 and rows 2, 3, 5, 6, respectively, of the matrix in Eq. (26). Since there are four linearly independent expressions in the coefficients in Eq. (27), it is possible to estimate these 15 coefficients linearly from 7 point matches in three images: the first three basis points give no information and the other four give four equations each. It can be shown that it is possible to calculate $p_i$, $q_i$, $t_i$ and $\bar t_i$ from the 15 expressions in Eq. (27), also linearly, see [4], and then we have the reconstruction. To see this, observe that it is possible to eliminate $t_i$ from $k_3$, $k_5$, $k_7$, $k_{10}$, $k_{12}$ and $k_{14}$, which gives
\[
k_5 q_2 - k_3 q_3 = 0, \qquad k_{10} q_1 - k_7 q_3 = 0, \qquad k_{14} q_1 - k_{12} q_2 = 0
\tag{29}
\]
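Eq. (29) is a homogeneous 3x3 system of rank two in $(q_1, q_2, q_3)$, so its solution is a single direction. One simple way to obtain it, sketched below, is the cross product of two rows of the coefficient matrix. This is an illustrative choice rather than the paper's stated procedure (for noisy data a singular value decomposition would be the more robust alternative), and the function names and numbers are made up.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def q_from_k(k):
    """Solve Eq. (29) for (q1, q2, q3) up to scale.
    k maps the indices 3, 5, 7, 10, 12, 14 to the coefficients of Eq. (27)."""
    rows = [
        (0.0, k[5], -k[3]),     # k5*q2 - k3*q3 = 0
        (k[10], 0.0, -k[7]),    # k10*q1 - k7*q3 = 0
        (k[14], -k[12], 0.0),   # k14*q1 - k12*q2 = 0
    ]
    # Any two independent rows determine the nullvector; keep the cross
    # product with the largest norm to avoid a nearly degenerate pair.
    candidates = [cross(rows[0], rows[1]), cross(rows[0], rows[2]), cross(rows[1], rows[2])]
    best = max(candidates, key=lambda v: sum(x * x for x in v))
    norm = sum(x * x for x in best) ** 0.5
    return tuple(x / norm for x in best)

# Synthetic coefficients generated from made-up t and q via Eq. (27).
t = (0.5, -1.2, 0.8)
q_true = (0.42, 0.41, 0.40)
k = {3: t[0] * q_true[1], 5: t[0] * q_true[2],
     7: t[1] * q_true[0], 10: t[1] * q_true[2],
     12: t[2] * q_true[0], 14: t[2] * q_true[1]}
q_est = q_from_k(k)
```

The result is a unit vector proportional to `q_true`; its overall sign and scale are not determined by Eq. (29).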
These equations are linear in $q_i$. In the same way we can eliminate $\bar t_i$ from $k_2$, $k_4$, $k_8$, $k_9$, $k_{13}$ and $k_{15}$, giving three equations, similar to those in Eq. (29), that are linear in $p_i$. Finally, when $p_i$ and $q_i$ are known, every $k_i$ in Eq. (27) is linear in $t_i$ and $\bar t_i$. This can of course be done in exactly the same way if there are more than 7 point matches. Observe that $t$ and $\bar t$ can only be determined up to one unknown scale factor. When this is done we can recover the correct lengths from Eq. (25). Then it is possible to calculate the reconstruction from Eq. (24), which gives the last coordinate $s$. Finally the other coordinates are obtained from Eq. (10).

This method is similar to the one used by Shashua in [1]. He obtains four linear expressions with 27 different coefficients, which he calls the fundamental tensor. Thanks to our parametrisation of the problem we obtain just 15 components.

8. An example
Consider the three images of a cube in Figure 2.

Fig. 2. Three images of a cube, with point correspondences and numbered points.

Applying the algorithm to the points 1 to 6 we have obtained the following three kinetic depth vectors from image one to image two,
\[
\begin{aligned}
p^{(1)} &= (-0.0376,\ 0.0894,\ -0.0613,\ -0.0212,\ 0.9922,\ 0.0450), \\
p^{(2)} &= (0.4156,\ 0.4004,\ 0.4221,\ 0.4060,\ 0.3979,\ 0.4070), \\
p^{(3)} &= (-0.7430,\ 0.0027,\ -0.5945,\ 0.2917,\ -0.0432,\ -0.0870),
\end{aligned}
\]
and the following from image two to image three,
\[
\begin{aligned}
q^{(1)} &= (-0.0349,\ -0.0109,\ -0.0834,\ -0.0321,\ 0.9919,\ -0.0823), \\
q^{(2)} &= (0.4240,\ 0.4179,\ 0.4100,\ 0.4058,\ 0.3870,\ 0.4038), \\
q^{(3)} &= (-0.2639,\ 0.2076,\ -0.4802,\ 0.7755,\ -0.0707,\ -0.2242).
\end{aligned}
\]
Thus only the second solution is physically realisable (all kinetic depths are positive), and the reconstruction gives in this case the following projective coordinates for the sixth point:
\[
(0.3404,\ 0.6601,\ 0.6696,\ 0.0008).
\]
This seems reasonable because the sixth point is in the same plane as the first, second and third, which have projective coordinates $(1, 0, 0, 0)$, $(0, 1, 0, 0)$ and $(0, 0, 1, 0)$ respectively. Thus the fourth coordinate of the sixth point should vanish. In fact the projective coordinates of the sixth point are $(0.3333, 0.6667, 0.6667, 0)$ if the object is assumed to be a cube. The projective coordinate vectors are only determined up to multiplication by a scalar, and here we have normalised them such that the length is equal to 1. The angle between these vectors can be calculated, and it is 0.58 degrees.

Assume now that we know that the points 1, 2, 3 and 6 are coplanar in the object. Using the method described in section 5 we get the following kinetic depth vectors,
\[
\begin{aligned}
p &= (0.4146,\ 0.4018,\ 0.4200,\ 0.4059,\ 0.3986,\ 0.4081), \\
q &= (0.4241,\ 0.4188,\ 0.4095,\ 0.4053,\ 0.3872,\ 0.4035),
\end{aligned}
\]
and the following reconstruction:
\[
(0.3398,\ 0.6607,\ 0.6694,\ 0).
\]
This time the angle between the reconstruction and the correct projective coordinates is 0.53 degrees, a slight improvement.

If we use the seventh point in images one and two, we can, by means of the reconstruction obtained above, predict the coordinates of the seventh point in image three, as described in section 6. The coordinates were predicted to be $(515.4, 243.8)$, which can be compared to the coordinates measured in the image, $(519, 235)$. This seems reasonable.

Finally, we can calculate $k_i$ in Eq. (27) linearly from Eq. (28) using all seven points, and then calculate $p_i$, $q_i$, $t_i$ and $\bar t_i$ from $k_i$. Then Eq. (25) and Eq. (24) can be used to calculate the reconstruction. This gives the following reconstruction for the last two points,
\[
(0.3351,\ 0.6680,\ 0.6644,\ -0.0005), \qquad (0.3340,\ 0.6621,\ 0.0005,\ 0.6709),
\]
where the first five points form a standard projective basis. This can be compared with the coordinates $(1, 2, 2, 0)$ and $(1, 2, 0, 2)$ obtained if the object is assumed to be a cube. The angles between these vectors are 0.18 and 0.36 degrees respectively.
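The error measure used in this section, the angle between two projective coordinate vectors, can be reproduced directly from the numbers reported above. The sketch below is illustrative; the function name is hypothetical, and the absolute value of the cosine is taken on the assumption that a projective vector and its negative represent the same point.

```python
import math

def projective_angle_deg(u, v):
    """Angle in degrees between two projective coordinate vectors.
    The vectors are only defined up to scale, so lengths are divided out
    and the sign of the scalar product is ignored."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cosine))

# Reconstructions reported in the text, compared with the cube coordinates.
angle_six_points = projective_angle_deg((0.3404, 0.6601, 0.6696, 0.0008), (1, 2, 2, 0))
angle_coplanar = projective_angle_deg((0.3398, 0.6607, 0.6694, 0.0), (1, 2, 2, 0))
angle_linear_6 = projective_angle_deg((0.3351, 0.6680, 0.6644, -0.0005), (1, 2, 2, 0))
angle_linear_7 = projective_angle_deg((0.3340, 0.6621, 0.0005, 0.6709), (1, 2, 0, 2))
```

Rounded to two decimals, these four values agree with the angles of 0.58, 0.53, 0.18 and 0.36 degrees quoted in the text.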
9. Conclusions
In this paper we have outlined methods for reconstruction and prediction by means of three images taken by uncalibrated cameras. We have shown that a reconstruction up to a projective transformation can be done if the so-called universal matrices have vanishing determinants. We have shown that there exist in general three different solutions to the reconstruction problem given 6 point matches. These solutions can be obtained from a third degree polynomial whose coefficients are polynomial functions of the image coordinates. It has also been shown how the depth flow vectors for the respective image pairs can be computed. It is also possible within this framework to estimate locations of additional points in the third image, without using the epipolar structure. We have instead made a reconstruction and projected this onto the third image. This prediction can be carried out even in cases where the epipolar constraints fail because of parallel epipolar lines in the third image. We have also shown that it is possible to use information about the object, such as coplanarity, to construct linear algorithms for reconstruction. Finally, it is possible to use linear methods for reconstruction when there are more than 6 point matches in the three images, and we have demonstrated our methods with an experiment on real data.

Acknowledgement: This work has been done within the ESPRIT-BRA project VIVA.
References
[1] Shashua, A., Trilinearity in Visual Recognition by Alignment, ECCV'94, Lecture Notes in Computer Science, Vol 800, Ed. J-O. Eklundh, Springer-Verlag 1994, pp. 479-484.
[2] Faugeras, O. D. & Robert, L., What can two images tell us about a third one?, ECCV'94, Lecture Notes in Computer Science, Vol 800, Ed. J-O. Eklundh, Springer-Verlag 1994, pp. 485-492.
[3] Heyden, A., Reconstruction from Three Images of Six Point Objects, Proc. Symposium on Image Analysis, SSAB, Halmstad, Sweden, 1994, pp. 49-52.
[4] Heyden, A., Reconstruction from Image Sequences by Means of Relative Depths, Proc. 5:th ICCV, Boston, USA, 1995.
[5] Heyden, A., Spanne, S., Sparr, G., Proximity Measures for Recognition, Technical Report, Lund, CODEN:LUFTD2(TFMA- -94)/7004- -SE, Sweden, 1994.
[6] Quan, L., Invariants of 6 Points from 3 Uncalibrated Images, ECCV'94, Lecture Notes in Computer Science, Vol 801, Ed. J-O. Eklundh, Springer-Verlag 1994, pp. 459-470.
[7] Sparr, G., Depth-Computations from Polyhedral Images, ECCV'92, Lecture Notes in Computer Science, Vol 588, Ed. G. Sandini, Springer-Verlag 1992, pp. 378-386. Also in Image and Vision Computing, Vol 10, 1992, pp. 683-688.
[8] Sparr, G., A Common Framework for Kinetic Depth, Reconstruction and Motion for Deformable Objects, ECCV'94, Lecture Notes in Computer Science, Vol 801, Ed. J-O. Eklundh, Springer-Verlag 1994, pp. 471-482.