Tutorial on Rectification of Stereo Images

Andrea Fusiello
Dipartimento di Matematica e Informatica, Università di Udine, Italia
[email protected]

In R. Fisher, editor, CVonline: On-Line Compendium of Computer Vision [Online]. Available: http://www.dai.ed.ac.uk/CVonline/, 1998.
1 Introduction

Given a pair of stereo images, rectification determines a transformation of each image plane such that pairs of conjugate epipolar lines become collinear and parallel to one of the image axes. The rectified images can be thought of as acquired by a new stereo rig, obtained by rotating the original cameras around their optical centres. The important advantage of rectification is that computing correspondences, a 2-D search problem in general, is reduced to a 1-D search problem, typically along the horizontal raster lines of the rectified images. Good starting points for exploring the literature on rectification include [1, 5, 6, 8, 4, 9]. After reviewing some concepts related to the pinhole camera model and the epipolar geometry, we will discuss the rectification process in detail.
2 Camera model and epipolar geometry

We assume prior knowledge of the projective camera model and of the epipolar geometry. These concepts will be recalled briefly here. The notation follows [3].
2.1 Camera model
The camera is modelled by its optical centre $c$ and its retinal plane (or image plane) $\mathcal{R}$. In each camera, a 3-D point $w = (x, y, z)^\top$ in world coordinates (where the world coordinate frame is fixed arbitrarily) is projected into an image point $m = (u, v)^\top$ in camera coordinates, where $m$ is the intersection of $\mathcal{R}$ with the line containing $w$ and $c$. In projective (or homogeneous) coordinates, the transformation from $w$ to $m$ is modelled by the linear transformation $\tilde{P}$:

    \tilde{m} = \tilde{P}\,\tilde{w},    (1)

where

    \tilde{m} = (U, V, S)^\top, \qquad \tilde{w} = (x, y, z, 1)^\top,    (2)
    m = (U/S,\; V/S)^\top \quad (\text{if } S \neq 0).    (3)
The points $w$ for which $S = 0$ define the focal plane and are projected to infinity.
Figure 1: Pinhole camera model.

Each pinhole camera is therefore modelled by its perspective projection matrix (PPM) $\tilde{P}$, which can be decomposed into the product
    \tilde{P} = A\,[I \,|\, 0]\,G.    (4)

The matrix $A$ gathers the intrinsic parameters of the camera, and has the following form:

    A = \begin{pmatrix} \alpha_u & 0 & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{pmatrix},    (5)

where $\alpha_u, \alpha_v$ are the focal lengths in horizontal and vertical pixels, respectively, and $(u_0, v_0)$ are the coordinates of the principal point. The matrix $G$ is composed of a $3 \times 3$ rotation matrix $R$ and a translation vector $t$, encoding the orientation and position of the camera (extrinsic parameters) in the world reference frame, respectively:
    G = \begin{pmatrix} R & t \\ 0 & 1 \end{pmatrix}.    (6)

Let us write the PPM as

    \tilde{P} = \begin{pmatrix} q_1^\top & q_{14} \\ q_2^\top & q_{24} \\ q_3^\top & q_{34} \end{pmatrix} = (P \,|\, \tilde{p}).    (7)
The plane $q_3^\top w + q_{34} = 0$ ($S = 0$) is the focal plane, and the two planes $q_1^\top w + q_{14} = 0$ and $q_2^\top w + q_{24} = 0$ intersect the retinal plane in the vertical ($U = 0$) and horizontal ($V = 0$) axes of the retinal coordinates, respectively.
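As a concrete illustration of Eqs. (1)-(7), the following MATLAB fragment (not part of the original tutorial; all numeric values are made up for the example) builds a PPM from assumed intrinsic and extrinsic parameters and projects a 3-D point:

au = 1000; av = 1000; u0 = 320; v0 = 240;   % assumed intrinsic parameters, Eq. (5)
A  = [au 0 u0; 0 av v0; 0 0 1];
R  = eye(3); t = [0; 0; 5];                 % assumed extrinsic parameters
G  = [R t; 0 0 0 1];                        % Eq. (6)
P  = A * [eye(3) zeros(3,1)] * G;           % PPM, Eq. (4)
w  = [0.1; -0.2; 10];                       % a 3-D point in world coordinates
mh = P * [w; 1];                            % homogeneous image point, Eq. (1)
m  = mh(1:2) / mh(3)                        % pixel coordinates, Eq. (3)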
The optical centre $c$ is the intersection of the three planes introduced in the previous paragraph; therefore

    \tilde{P} \begin{pmatrix} c \\ 1 \end{pmatrix} = 0    (8)

and

    c = -P^{-1}\tilde{p}.    (9)

The optical ray associated to an image point $m$ is the line $c\,m$, i.e. the set of points $\{w : \tilde{m} = \tilde{P}\tilde{w}\}$. The equation of this ray can be written in parametric form as

    w = c + \lambda P^{-1}\tilde{m},    (10)

with $\lambda$ an arbitrary real number.
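A short MATLAB sketch of Eqs. (9) and (10) follows (again not from the tutorial; it reuses the matrix P of the previous fragment and an arbitrary image point):

c = -inv(P(:,1:3)) * P(:,4);                % optical centre, Eq. (9)
m = [350; 260; 1];                          % an image point in homogeneous coordinates
lambda = 2;                                 % arbitrary parameter along the ray
w_on_ray = c + lambda * inv(P(:,1:3)) * m   % a point on the optical ray, Eq. (10)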
2.2 Epipolar geometry
Let us consider a stereo rig composed of two pinhole cameras (Fig. 2). Let $c_1$ and $c_2$ be the optical centres of the two cameras. A 3-D point $w$ is projected onto both image planes, to points $m_1$ and $m_2$, which constitute a conjugate pair. Given a point $m_1$ in the left image plane, its conjugate point in the right image is constrained to lie on a line called the epipolar line (of $m_1$). This line is the projection through $c_2$ of the optical ray of $m_1$; indeed, $m_1$ may be the projection of any point on its optical ray.
Figure 2: Epipolar geometry.

Furthermore, one should observe that all the epipolar lines in one image plane pass through a common point, called the epipole, which is the projection of the conjugate optical centre:

    \tilde{e}_2 = \tilde{P}_2 \begin{pmatrix} c_1 \\ 1 \end{pmatrix}.    (11)
The parametric equation of the epipolar line of $\tilde{m}_1$ can be written as

    \tilde{m}_2 = \tilde{e}_2 + \lambda P_2 P_1^{-1} \tilde{m}_1.    (12)
In image coordinates this becomes:

    u = [m_2]_1 = \frac{[\tilde{e}_2]_1 + \lambda [\tilde{v}]_1}{[\tilde{e}_2]_3 + \lambda [\tilde{v}]_3}    (13)

    v = [m_2]_2 = \frac{[\tilde{e}_2]_2 + \lambda [\tilde{v}]_2}{[\tilde{e}_2]_3 + \lambda [\tilde{v}]_3}    (14)

where $\tilde{v} = P_2 P_1^{-1} \tilde{m}_1$ and $[\cdot]_i$ is the projection operator extracting the $i$-th component of a vector.
Figure 3: Top row: a stereo pair (Copyright SYNTIM-INRIA). Bottom row: the rectified pair. The right pictures plot the epipolar lines corresponding to the point marked in the left pictures.

When $c_1$ is in the focal plane of the right camera, the right epipole is at infinity, and the epipolar lines form a bundle of parallel lines in the right image. Analytically, the direction of each epipolar line can be obtained by taking the derivative of the parametric equations (13), (14) with respect to $\lambda$:
    \frac{du}{d\lambda} = \frac{[\tilde{v}]_1 [\tilde{e}_2]_3 - [\tilde{v}]_3 [\tilde{e}_2]_1}{([\tilde{e}_2]_3 + \lambda [\tilde{v}]_3)^2}    (15)

    \frac{dv}{d\lambda} = \frac{[\tilde{v}]_2 [\tilde{e}_2]_3 - [\tilde{v}]_3 [\tilde{e}_2]_2}{([\tilde{e}_2]_3 + \lambda [\tilde{v}]_3)^2}    (16)
Note that the denominator is the same in both components, hence it does not affect the direction of the vector. The epipole is rejected to infinity when $[\tilde{e}_2]_3 = 0$. In this case, the direction of the epipolar lines in the right image no longer depends on $\tilde{v}$, and all the epipolar lines become parallel to the vector $([\tilde{e}_2]_1, [\tilde{e}_2]_2)^\top$. A very special case occurs when both epipoles are at infinity; this happens when the line containing $c_1$ and $c_2$ (the baseline) is contained in both focal planes, or, equivalently, when the retinal planes are parallel to the baseline. The epipolar lines then form a bundle of parallel lines in both images. Any pair of images can be transformed so that epipolar lines are parallel and horizontal in each image, as in Fig. 3. This procedure is called rectification.
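To make Eqs. (11)-(14) concrete, here is a small MATLAB sketch (not in the original tutorial) that, assuming the two PPMs Po1 and Po2 of a calibrated rig are given, computes the right epipole and samples the epipolar line of a left-image point:

c1 = -inv(Po1(:,1:3)) * Po1(:,4);         % left optical centre, Eq. (9)
e2 = Po2 * [c1; 1];                       % right epipole, Eq. (11)
m1 = [100; 150; 1];                       % a point in the left image (homogeneous)
v  = Po2(:,1:3) * inv(Po1(:,1:3)) * m1;   % direction term of Eq. (12)
for lambda = [0.5 1 2]                    % points on the epipolar line, Eqs. (13)-(14)
    p = e2 + lambda * v;
    fprintf('u = %g, v = %g\n', p(1)/p(3), p(2)/p(3));
end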
3 Rectification

We will assume that the stereo rig is calibrated, i.e. that the PPMs are known. This assumption is not strictly necessary [5, 8], but leads to a simpler technique. The idea behind rectification [2] is to define two new perspective projection matrices which preserve the optical centres but have image planes parallel to the baseline. In addition, we want the epipolar line in the right image of the point $m_1 = (u_1, v_1)^\top$ of the left image to be the horizontal line $v_2 = v_1$. Consider Fig. 4, where the old retinal plane $\mathcal{R}_o$ and the new one $\mathcal{R}_n$ are depicted: image rectification is the operation of transforming from $\mathcal{R}_o$ to $\mathcal{R}_n$.
Figure 4: Rectification.
3.1 Constraining the rectifying PPMs
We will compute the new PPMs $\tilde{P}_{n1}$ and $\tilde{P}_{n2}$ as the solution of a system of equations, which expresses the constraints arising from the rectification requirements, plus other constraints necessary to ensure a unique solution. In the following, we will denote by $\tilde{P}_{o1}, \tilde{P}_{o2}$ the old PPMs. Let

    \tilde{P}_{n1} = \begin{pmatrix} a_1^\top & a_{14} \\ a_2^\top & a_{24} \\ a_3^\top & a_{34} \end{pmatrix}, \qquad \tilde{P}_{n2} = \begin{pmatrix} b_1^\top & b_{14} \\ b_2^\top & b_{24} \\ b_3^\top & b_{34} \end{pmatrix}    (17)

be the sought rectifying PPMs.
Common focal plane. If the cameras share the same focal plane, the common retinal plane is constrained to be parallel to the baseline and the epipolar lines are parallel. The two rectifying PPMs have the same focal plane iff

    a_3 = b_3, \qquad a_{34} = b_{34}.    (18)
Position of the optical centres. The optical centres of the rectifying perspective projections must be the same as those of the original projections:

    \tilde{P}_{n1} \begin{pmatrix} c_1 \\ 1 \end{pmatrix} = 0, \qquad \tilde{P}_{n2} \begin{pmatrix} c_2 \\ 1 \end{pmatrix} = 0,    (19)

where, according to Eq. (9), $c_1$ and $c_2$ are given by

    c_1 = -P_{o1}^{-1}\,\tilde{p}_{o1}, \qquad c_2 = -P_{o2}^{-1}\,\tilde{p}_{o2}.    (20)
Eq. (19) gives six linear constraints:

    a_1^\top c_1 + a_{14} = 0
    a_2^\top c_1 + a_{24} = 0
    a_3^\top c_1 + a_{34} = 0
    b_1^\top c_2 + b_{14} = 0
    b_2^\top c_2 + b_{24} = 0
    b_3^\top c_2 + b_{34} = 0.    (21)
Alignment of conjugate epipolar lines. The vertical coordinate of the projection of a 3-D point onto the rectifying retinal plane must be the same in both images, i.e.:

    \frac{a_2^\top w + a_{24}}{a_3^\top w + a_{34}} = \frac{b_2^\top w + b_{24}}{b_3^\top w + b_{34}}.    (22)

Using Eq. (18) we obtain the constraints

    a_2 = b_2, \qquad a_{24} = b_{24}.    (23)
Note. Notice that the equations written to this point are sufficient to guarantee rectification: to prove this, let us verify that the epipolar lines are parallel and horizontal. When $[\tilde{e}_1]_3 = 0$ (the epipole is at infinity) the epipolar lines are parallel to the vector $([\tilde{e}_1]_1, [\tilde{e}_1]_2)^\top$. As we know, each epipole is the projection of the conjugate optical centre, i.e.

    \tilde{e}_1 = \tilde{P}_{n1} \begin{pmatrix} c_2 \\ 1 \end{pmatrix} = \begin{pmatrix} a_1^\top c_2 + a_{14} \\ a_2^\top c_2 + a_{24} \\ a_3^\top c_2 + a_{34} \end{pmatrix}    (24)

    \tilde{e}_2 = \tilde{P}_{n2} \begin{pmatrix} c_1 \\ 1 \end{pmatrix} = \begin{pmatrix} b_1^\top c_1 + b_{14} \\ b_2^\top c_1 + b_{24} \\ b_3^\top c_1 + b_{34} \end{pmatrix}.    (25)

Hence the epipolar lines are horizontal when:

    a_1^\top c_2 + a_{14} \neq 0
    a_2^\top c_2 + a_{24} = 0
    a_3^\top c_2 + a_{34} = 0
    b_1^\top c_1 + b_{14} \neq 0
    b_2^\top c_1 + b_{24} = 0
    b_3^\top c_1 + b_{34} = 0.    (26)

The four equality constraints are satisfied as long as Equations (19), (18) and (23) hold, as one can easily verify. Although rectification is guaranteed, the orientation of the retinal plane still has one degree of freedom. Moreover, the constraints written up to now are not enough to obtain a unique PPM. We shall therefore choose the intrinsic parameters explicitly to obtain enough equations.
Orientation of the rectifying retinal plane. We choose the rectifying focal planes to be parallel to the intersection of the two original focal planes, i.e.

    a_3^\top (f_1 \wedge f_2) = 0,    (27)

where $f_1$ and $f_2$ are the third rows of $P_{o1}$ and $P_{o2}$ respectively. Notice that the corresponding equation, $b_3^\top (f_1 \wedge f_2) = 0$, is redundant thanks to Eq. (18).
Orthogonality of the rectifying reference frames. The intersections of the retinal plane with the planes $a_1^\top w + a_{14} = 0$ and $a_2^\top w + a_{24} = 0$ correspond to the $v$ and $u$ axes, respectively, of the retinal reference frame (Figure 1). In order for this reference frame to be orthogonal, the planes must be perpendicular to each other; hence, taking into account Eq. (23),

    a_1^\top a_2 = 0, \qquad b_1^\top a_2 = 0.    (28)

Principal points. The principal point $(u_0, v_0)$ is given by [3]:

    u_0 = a_1^\top a_3, \qquad v_0 = a_2^\top a_3.    (29)
We set the two principal points to $(0, 0)$ and use Eqs. (18) and (23) to obtain the constraints

    a_1^\top a_3 = 0
    a_2^\top a_3 = 0
    b_1^\top a_3 = 0.    (30)
Focal lengths in pixels. The horizontal and vertical focal lengths in pixels, respectively, are given by

    \alpha_u = \| a_1 \wedge a_3 \|, \qquad \alpha_v = \| a_2 \wedge a_3 \|.    (31)

By setting arbitrarily the values of $\alpha_u$ and $\alpha_v$ we obtain the constraints

    \| a_1 \wedge a_3 \|^2 = \alpha_u^2
    \| a_2 \wedge a_3 \|^2 = \alpha_v^2
    \| b_1 \wedge a_3 \|^2 = \alpha_u^2,    (32)

which, by virtue of the equivalence $\| x \wedge y \|^2 = \| x \|^2 \| y \|^2 - (x^\top y)^2$ and Eq. (30), can be rewritten as

    \| a_1 \|^2 \| a_3 \|^2 = \alpha_u^2
    \| a_2 \|^2 \| a_3 \|^2 = \alpha_v^2
    \| b_1 \|^2 \| a_3 \|^2 = \alpha_u^2.    (33)
Set the scale factor. PPMs are defined up to a scale factor; a common choice to fix the latter is to set

    \| a_3 \| = 1, \qquad \| b_3 \| = 1.    (34)

3.2 Solving for the rectifying PPM
We organise the constraints introduced in the previous section into the following four systems:

    a_3^\top c_1 + a_{34} = 0
    a_3^\top c_2 + a_{34} = 0
    a_3^\top (f_1 \wedge f_2) = 0
    \| a_3 \| = 1    (35)

    a_2^\top c_1 + a_{24} = 0
    a_2^\top c_2 + a_{24} = 0
    a_2^\top a_3 = 0
    \| a_2 \| = \alpha_v    (36)

    a_1^\top c_1 + a_{14} = 0
    a_1^\top a_2 = 0
    a_1^\top a_3 = 0
    \| a_1 \| = \alpha_u    (37)

    b_1^\top c_2 + b_{14} = 0
    b_1^\top a_2 = 0
    b_1^\top a_3 = 0
    \| b_1 \| = \alpha_u    (38)
plus

    a_2 = b_2
    a_{24} = b_{24}
    a_3 = b_3
    a_{34} = b_{34}.    (39)
The first four systems all have the same structure: each one is a $3 \times 4$ linear homogeneous system subject to a quadratic constraint, that is,

    A x = 0, \qquad \| x' \| = k,    (40)

where $x'$ is the vector composed of the first three components of $x$, and $k$ is a real number. The four systems above are solved in sequence, from top to bottom. The solution of each system is obtained by first computing (for example by SVD factorisation [7]) a one-parameter family of solutions of $A x = 0$ of the form $x = \lambda x_0$, where $x_0$ is a nontrivial solution and $\lambda$ is an arbitrary real number, and then letting $\lambda = k / \| x_0' \|$.
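The following MATLAB fragment (a sketch added here for illustration, mirroring the solution method just described; the matrix A and the scalar k are assumed to be given) solves one system of the form (40):

[U, S, V] = svd(A);
x0 = V(:, end);               % nontrivial solution of A x = 0 (right null vector)
lambda = k / norm(x0(1:3));   % scale so that the first three components have norm k
x = lambda * x0;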
3.3 The rectifying transformation
Now that, for each camera, the rectifying PPM is known, we want to compute the linear transformation (in projective coordinates) that maps the retinal plane of the old PPM, $\tilde{P}_o = (P_o \,|\, \tilde{p}_o)$, onto the retinal plane of the new PPM, $\tilde{P}_n = (P_n \,|\, \tilde{p}_n)$. We will see that this transformation is the $3 \times 3$ matrix $T = P_n P_o^{-1}$. For any 3-D point $w$ we can write

    \tilde{m}_o = \tilde{P}_o \tilde{w}, \qquad \tilde{m}_n = \tilde{P}_n \tilde{w}.    (41)

We know from Eq. (10) that the equation of the optical ray associated to $\tilde{m}_o$ is

    w = c_o + \lambda P_o^{-1} \tilde{m}_o,    (42)

hence

    \tilde{m}_n = \tilde{P}_n \begin{pmatrix} c_o + \lambda P_o^{-1}\tilde{m}_o \\ 1 \end{pmatrix} = \tilde{P}_n \begin{pmatrix} c_o \\ 1 \end{pmatrix} + \lambda P_n P_o^{-1}\tilde{m}_o.

Assuming that rectification does not alter the optical centre ($c_n = c_o$), the first term vanishes by Eq. (8) and, up to the irrelevant scale factor $\lambda$, we obtain:

    \tilde{m}_n \simeq P_n P_o^{-1}\,\tilde{m}_o.    (43)

The transformation $T$ is then applied to the original image to produce a rectified image, as in Fig. 3. Note that the pixels (integer-coordinate positions) of the rectified image correspond, in general, to non-integer positions on the original image plane. Therefore, the grey levels of the rectified image are computed by bilinear interpolation.
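As a rough sketch of this last step (not the tutorial's own code; it assumes a grayscale image I whose pixel coordinates start at (1,1) and a rectifying transformation T acting on homogeneous pixel coordinates), the rectified image can be produced by inverse mapping and bilinear interpolation:

[rows, cols] = size(I);
[u, v] = meshgrid(1:cols, 1:rows);                 % pixel grid of the rectified image
p = inv(T) * [u(:)'; v(:)'; ones(1, numel(u))];    % map each new pixel back onto the old plane
uo = reshape(p(1,:) ./ p(3,:), rows, cols);
vo = reshape(p(2,:) ./ p(3,:), rows, cols);
Irect = interp2(double(I), uo, vo, 'linear', 0);   % bilinear interpolation, 0 outside the image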
4 Summary of the algorithm

The process of rectification can be summarised as follows. Given a stereo pair of images I1, I2 and PPMs Po1, Po2 (obtained by calibration): compute [T1,T2,Pn1,Pn2] = rectify(Po1,Po2) (see the function below); rectify the images by applying T1 and T2. Reconstruction can be performed from the rectified images directly, using Pn1, Pn2.

function [T1,T2,Pn1,Pn2] = rectify(Po1,Po2)
% RECTIFY compute rectification matrices
%
% [T1,T2,Pn1,Pn2] = rectify(Po1,Po2) computes the
% rectifying projection matrices "Pn1", "Pn2", and
% the rectifying transformations of the retinal plane
% "T1", "T2" (in homogeneous coordinates). The arguments
% are the two old projection matrices "Po1" and "Po2".

% focal lengths
% (extp(a,b) is the external product of vectors a,b)
au = norm(extp(Po1(1,1:3)', Po1(3,1:3)'));
av = norm(extp(Po1(2,1:3)', Po1(3,1:3)'));

% optical centres
c1 = - inv(Po1(:,1:3))*Po1(:,4);
c2 = - inv(Po2(:,1:3))*Po2(:,4);

% retinal planes
fl = Po1(3,1:3)';
fr = Po2(3,1:3)';
nn = extp(fl,fr);

% solve the four systems
A = [ [c1' 1]' [c2' 1]' [nn' 0]' ]';
[U,S,V] = svd(A);
r = 1/(norm(V([1 2 3],4)));
a3 = r * V(:,4);

A = [ [c1' 1]' [c2' 1]' [a3(1:3)' 0]' ]';
[U,S,V] = svd(A);
r = norm(av)/(norm(V([1 2 3],4)));
a2 = r * V(:,4);

A = [ [c1' 1]' [a2(1:3)' 0]' [a3(1:3)' 0]' ]';
[U,S,V] = svd(A);
r = norm(au)/(norm(V([1 2 3],4)));
a1 = r * V(:,4);

A = [ [c2' 1]' [a2(1:3)' 0]' [a3(1:3)' 0]' ]';
[U,S,V] = svd(A);
r = norm(au)/(norm(V([1 2 3],4)));
b1 = r * V(:,4);

% adjustment
H = [ 1 0 0
      0 1 0
      0 0 1 ];

% rectifying projection matrices
Pn1 = H * [ a1 a2 a3 ]';
Pn2 = H * [ b1 a2 a3 ]';

% rectifying image transformation
T1 = Pn1(1:3,1:3) * inv(Po1(1:3,1:3));
T2 = Pn2(1:3,1:3) * inv(Po2(1:3,1:3));
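A possible usage example (not in the original text): Po1 and Po2 come from calibration, I1 and I2 are the input images, and warp is a hypothetical helper implementing the inverse-mapping interpolation sketched in Section 3.3:

[T1, T2, Pn1, Pn2] = rectify(Po1, Po2);
J1 = warp(I1, T1);    % rectified left image
J2 = warp(I2, T2);    % rectified right image
% conjugate points now lie on the same horizontal line of J1 and J2,
% and reconstruction can be performed directly with Pn1 and Pn2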
Notice that, owing to a sign ambiguity in the solution of the homogeneous systems (the singular vectors returned by the SVD are defined only up to sign), the rectified images can exhibit a reflection along the vertical or horizontal axis. This can be detected by checking whether or not the ordering between the two diagonal corners of the image is preserved. If a reflection occurs, it can be trivially compensated by pre-multiplying both rectifying PPMs by a matrix H of the form

    H = \begin{pmatrix} s_u & 0 & 0 \\ 0 & s_v & 0 \\ 0 & 0 & 1 \end{pmatrix},    (44)

with $s_u, s_v \in \{-1, 1\}$. Figure 5 is another example of rectification of a generic stereo pair.
Figure 5: Top row: a stereo pair (Copyright SYNTIM-INRIA). Bottom row: the rectified pair. The right pictures plot the epipolar lines corresponding to the point marked in the left pictures.

The MATLAB code of the function rectify and the C implementation of the image rectification algorithm can be found online at http://www.dimi.uniud.it/~fusiello/rect.html.
Acknowledgements
This tutorial is a revised version of [4]. Manuel Trucco read the draft and made valuable comments.
References

[1] N. Ayache. Artificial Vision for Mobile Robots: Stereo Vision and Multisensory Perception, chapter 3. The MIT Press, 1991.

[2] N. Ayache. Artificial Vision for Mobile Robots: Stereo Vision and Multisensory Perception. The MIT Press, 1991.

[3] O. Faugeras. Three-Dimensional Computer Vision: A Geometric Viewpoint. The MIT Press, Cambridge, 1993.

[4] A. Fusiello, E. Trucco, and A. Verri. Rectification with unconstrained stereo geometry. In A. F. Clark, editor, Proceedings of the British Machine Vision Conference, pages 400-409. BMVA Press, September 1997.

[5] R. Hartley and R. Gupta. Computing matched-epipolar projections. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 549-555, 1993.

[6] D. V. Papadimitriou and T. J. Dennis. Epipolar line estimation and rectification for stereo image pairs. IEEE Transactions on Image Processing, 3(4):672-676, April 1996.

[7] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, second edition, 1992.

[8] L. Robert, C. Zeller, O. Faugeras, and M. Hebert. Applications of non-metric vision to some visually-guided robotics tasks. In Y. Aloimonos, editor, Visual Navigation: From Biological Systems to Unmanned Ground Vehicles, chapter 5, pages 89-134. Lawrence Erlbaum Associates, 1997.

[9] E. Trucco and A. Verri. Introductory Techniques for 3-D Computer Vision, chapter 7. Prentice-Hall, 1998.