An image rectification method on image sequence of monocular motion vision

Yunting Li(a), Jun Zhang*(a), Wenwen Hu(a), Xiaomao Liu*(b), Jinwen Tian(a)

(a) Science and Technology on Multi-spectral Information Processing Laboratory, Huazhong University of Science and Technology (HUST), Wuhan 430074, P. R. China
(b) School of Mathematics and Statistics, HUST, Wuhan 430074, P. R. China

ABSTRACT

Image rectification reduces the search space of stereo matching from two dimensions to one and thereby greatly improves matching efficiency. In this paper, a simple and convenient method, designed specifically for the image sequences of monocular motion vision, is proposed to rectify a calibrated image sequence. The method is based on coordinate system transformation, which avoids heavy and complex computation, and it rectifies an image sequence (three images) at once, which makes it efficient for sequence processing. The rectification consists of several steps. First, we establish a reference coordinate system from the three camera positions: the Z axis of the reference coordinate system o-XYZ is the normal vector of the plane in which the three positions lie, the X axis coincides with the baseline from position 2 to position 1, and the Y axis follows the right-hand rule. Second, we set the x and z axes of the reference image space coordinate system o-xyz to coincide with the X and Z axes of the reference coordinate system, and the y axis to coincide with the line from position 2 to position 3. Finally, we derive a homography matrix that realizes the image rectification. Both image data and computer simulations show that the method is effective.

Keywords: image rectification; calibrated cameras; coordinate transformation; homography matrix

1. INTRODUCTION

Monocular motion vision, as the name implies, uses a single moving camera to generate an image sequence and perceives depth through the parallax between images. Monocular motion vision systems are used in visual navigation, for example on unmanned aerial vehicles (UAVs) [1] and other remote sensing platforms, because a wide-baseline binocular stereo system cannot be mounted on such camera carriers. Visual navigation tasks include measuring speed, height, orientation and posture, as well as three-dimensional reconstruction and other requirements of autonomous navigation. A key problem is stereo matching. On a monocular motion vision system the images are taken at different times: the camera intrinsic parameters remain unchanged, while the external parameters depend on the carrier's posture and speed. Images taken at different times can therefore be regarded as different views of the same scene by different cameras. Since epipolar geometry constrains the image pairs, many computer vision methods can be used to search for corresponding points. Image rectification is a well-known and important component of computer vision. Its purpose is to make the

*[email protected], [email protected]; MIPPR 2013: Pattern Recognition and Computer Vision, edited by Zhiguo Cao, Proc. of SPIE Vol. 8919, 89190M · © 2013 SPIE · CCC code: 0277-786X/13/$18 · doi: 10.1117/12.2031242


corresponding epipolar lines of an image pair parallel to the horizontal or vertical direction of the image. In that case a stereo matching algorithm can exploit the epipolar constraint to reduce the search space from two dimensions to one and greatly improve search efficiency, so image rectification is an important pre-processing step for stereo matching. Various image rectification methods have been proposed. Ayache [2] derived the required homography matrix from the condition that corresponding epipolar lines should share the same vertical or horizontal coordinate after rectification, but this method needs several corresponding point pairs to compute the epipolar lines, which makes the computation expensive. Hartley [3] determined the projective matrix from the constraints that the vertical difference between corresponding points be minimized and that the epipoles lie at infinity; this method is based on the fundamental matrix, involves many unknowns, and also requires corresponding point pairs in advance. Loop [4] decomposed the rectifying transformation into similarity, shearing and projective components and attempted to reduce the projective distortion, but all matrices involved must remain positive during rectification. Fusiello [5] improved Ayache's algorithm, Zhu [6] improved Hartley's, and Sui [7] and Mallon [8] improved Loop's, but the complex computation of the epipolar lines remained unavoidable. More recently, Fusiello [9] introduced a quasi-Euclidean rectification algorithm that needs some calibrated parameters; however, its robustness is low. Lin [10] proposed a more robust rectification algorithm in which the transformation matrix is computed from corresponding points and then optimized with the Levenberg-Marquardt method and an Evolutionary Programming algorithm.
Wan [11] proposed a simple and effective image rectification algorithm based on spherical rectification for PTZ (Pan-Tilt-Zoom) cameras, but it is limited to PTZ cameras. Kumar [12] proposed an approach for rectifying stereo pairs with different zoom and image resolution, but the computation is still complex. Su [13] proposed a rectification algorithm for calibrated image pairs based on geometric transformation; it simplifies rectification but requires many parameters from camera calibration. In this work, a simple and convenient method, designed for the image sequences of monocular motion vision, is proposed to rectify a calibrated image sequence. The method is based on coordinate system transformation, which avoids heavy and complex computation, and it rectifies an image sequence (three images) at once, which makes it efficient for sequence processing. The rectification consists of several steps. First, we establish a reference coordinate system from the three camera positions: its Z axis is the normal vector of the plane in which the three positions lie, its X axis coincides with the baseline from position 2 to position 1, and its Y axis follows the right-hand rule. Second, we set the x and z axes of the reference image space coordinate system to coincide with the X and Z axes of the reference coordinate system, and the y axis to coincide with the line from position 2 to position 3. Finally, we derive a homography matrix that realizes the image rectification. This paper is organized as follows. Section 2 briefly introduces the preliminaries on which the method is based. Section 3 presents our method in three steps. Section 4 reports experiments and results. Section 5 concludes.

2. PRELIMINARIES

2.1 Camera model

The common camera model is the pinhole model, which is a perspective projection followed by an affine transformation


in the image plane. We write the intrinsic parameter matrix K of the camera model as equation (1), in which $(u_0, v_0)$ is the pixel coordinate of the principal point of the image plane (i.e., the intersection of the image plane with the optical axis), $f_u$ and $f_v$ are the focal lengths along the column and row directions, and $s$ is the parameter describing the skewness of the two image axes [14].

$$K = \begin{bmatrix} f_u & s & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (1)$$
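As an illustration, the intrinsic matrix of equation (1) can be assembled and applied in a few lines. This is a sketch using the focal-length and principal-point values of the cameras in Table 1; the scene point is chosen arbitrarily.

```python
import numpy as np

def intrinsic_matrix(fu, fv, s, u0, v0):
    """Assemble K = [[fu, s, u0], [0, fv, v0], [0, 0, 1]] as in equation (1)."""
    return np.array([[fu,  s,   u0],
                     [0.0, fv,  v0],
                     [0.0, 0.0, 1.0]])

K = intrinsic_matrix(fu=560.0, fv=560.0, s=0.0, u0=512.0, v0=512.0)

# Project a point given in the camera frame: homogeneous pixel = K @ [X/Z, Y/Z, 1].
P_cam = np.array([1.0, 2.0, 10.0])   # an illustrative point 10 m in front of the camera
pixel = K @ (P_cam / P_cam[2])       # divide by depth, then apply K
print(pixel[:2])                     # (u, v) = (568, 624)
```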

An arbitrary point in the 3D world is denoted by P. The camera coordinate system is denoted by o-XYZ; its origin is the optical center of the camera, and its Z axis coincides with the optical axis. The X and Y axes coincide with the vertical and horizontal axes of the image plane c-uv. The image plane lies at a distance f from the origin of o-XYZ, where f is the (effective) focal length of the camera. Additionally, the world coordinate system, denoted O-XwYwZw, coincides with the geodetic coordinate system: the Xw, Yw and Zw axes point East, North and up, respectively. All coordinate systems in this paper are right-handed. Let $P_w = [X_w, Y_w, Z_w]^T$ denote the coordinates of P in O-XwYwZw and $I = [u, v, 1]^T$ the corresponding homogeneous image coordinates. Fig.1 shows the relationship among o-XYZ, O-XwYwZw and the image plane.

Fig.1 The camera model and coordinate system

On a monocular motion vision system, images are taken at different times; the camera intrinsic parameters remain unchanged, while the external parameters depend on the camera carrier's posture and speed. The carrier's posture relative to the world coordinate system can be obtained from GPS navigation and consists of the pitch angle $\theta_n$, yaw angle $\varphi_n$, and roll angle $\gamma_n$ [15], where the subscript n indicates the time. In this paper we use $\theta_n$, $\varphi_n$ and $\gamma_n$ to construct an external parameter matrix from the world coordinate system to the camera coordinate system. The external parameter matrix $R_n$ is defined as:

R n  Re R  y,  n  R  x, n  R  z,n 

(2)

Where R  r , s  is a

3  3 rotation matrix denotes rotate s angle around r axis. R e is a rotation matrix to assure the Z axis coincides with optical axis. R e is equate to R  x, 90  in the forward-looking case and R  x, 180  in the 0

0

down-looking case. Combining (1) and (2), according to geometric properties of optical imaging, we have:

Zn  I n  K  R n   Pw  On 

(3)

Where O n is the camera position in O-XwYwZw at time n. Z n is the third element of vector R n   Pw  On  . 2.2 Epipolar geometry The epipolar geometry describes the relations that exist between stereo images. Every point in a plane that passes through both centers of projection will be projected in each image on the intersection of this plane with the corresponding image plane. According to the position of two cameras’ optical axes, the configuration of stereo vision system is divided into two types: ideal-type and general-type (Fig.2). In the ideal-type, the optical axes of two cameras


should be parallel to each other. The general type is unconstrained, but it is the common configuration of stereo vision systems.

Fig.2 The configuration of stereo vision system (a) ideal-type (b) general-type

As shown in Fig.2, the epipolar lines are not collinear in the general type of stereo vision system, so extra computation is needed when searching for corresponding points along the epipolar lines. In the ideal type, however, the corresponding epipolar lines stay at the same height and the optical axes are parallel, so corresponding points can be searched directly along the horizontal direction. Our image rectification method transforms the general type into the ideal type.
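Before moving on, the projection model of equations (1)-(3) can be sketched in code. The rotation order follows equation (2), but the angle signs follow one common convention and may differ from the paper's GPS/IMU conventions; K, the angles and the camera position use camera n-1's values from Table 1, and the world point is an arbitrary illustration.

```python
import numpy as np

def rot(axis, deg):
    """Rotation matrix R(r, s) by `deg` degrees around axis 'x', 'y' or 'z'."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    if axis == 'x':
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
    if axis == 'y':
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def external_matrix(pitch, yaw, roll, looking='forward'):
    """R_n = R_e R(y, roll) R(x, pitch) R(z, yaw), equation (2)."""
    Re = rot('x', 90.0) if looking == 'forward' else rot('x', 180.0)
    return Re @ rot('y', roll) @ rot('x', pitch) @ rot('z', yaw)

K = np.array([[560.0, 0, 512], [0, 560.0, 512], [0, 0, 1.0]])
Rn = external_matrix(pitch=4.0, yaw=-15.0, roll=1.0)
On = np.array([2800.0, 600.0, 2500.0])
Pw = np.array([2900.0, 1600.0, 2450.0])   # an arbitrary world point in front of the camera

# Equation (3): Z_n * I_n = K * R_n * (P_w - O_n); v[2] equals Z_n because the
# third row of K is [0, 0, 1].
v = K @ Rn @ (Pw - On)
I_n = v / v[2]                            # homogeneous pixel coordinates (u, v, 1)
```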

3. THE IMAGE RECTIFICATION METHOD

In this paper, the image rectification method consists of three steps. First, we establish a reference coordinate system from the three camera positions: its Z axis is the normal vector of the plane in which the three positions lie, its X axis coincides with the baseline from position 2 to position 1, and its Y axis follows the right-hand rule. Second, we set the x and z axes of the reference image space coordinate system to coincide with the X and Z axes of the reference coordinate system, and the y axis to coincide with the line from position 2 to position 3. Finally, we derive a homography matrix that realizes the image rectification.

3.1 Establish a reference coordinate system

As we know, on a monocular motion vision system the camera has a different posture at each time, so the first step of image rectification is to transform the different camera coordinate systems into a common world coordinate system that relates them to each other. According to the external parameter matrix $R_n$ of the camera carrier at each time, we can establish virtual cameras in the common world coordinate system. Given an arbitrary point P's pixel location $I_n$, according to equation (3) we can obtain the coordinates of P in the common world coordinate system $O_n$-$X_n^{com}Y_n^{com}Z_n^{com}$:

$$P_n^{com} = Z_n \cdot R_n^T \cdot K^{-1} \cdot I_n \qquad (4)$$

where $P_n^{com}$ is the coordinates of P in $O_n$-$X_n^{com}Y_n^{com}Z_n^{com}$.
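A minimal sketch of the back-projection in equation (4). Here $R_n$ is taken as the identity and the depth $Z_n$ is an arbitrary value, since both are scene-dependent; the pixel is illustrative.

```python
import numpy as np

# Equation (4): undo the camera rotation so every view shares a common,
# world-aligned orientation.
K = np.array([[560.0, 0, 512], [0, 560.0, 512], [0, 0, 1.0]])
Rn = np.eye(3)                         # stand-in for the external matrix of equation (2)
I_n = np.array([600.0, 700.0, 1.0])    # a pixel in homogeneous coordinates
Z_n = 10.0                             # assumed depth of the point along the optical axis

# P_n^com = Z_n * R_n^T * K^{-1} * I_n
P_com = Z_n * Rn.T @ np.linalg.inv(K) @ I_n
```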

Fig.3 Common reference coordinate system

As shown in Fig.3, the common world coordinate systems of the different virtual cameras coincide with the world coordinate system O-XwYwZw, differing from each other only by a translation. The optical axes of the virtual cameras are now parallel


to each other. If we project $P_n^{com}$ into image pixels, we can see that the virtual image planes are parallel to each other, but the corresponding epipolar lines are not parallel to the horizontal or vertical direction. We therefore establish a reference coordinate system from the three camera positions, first obtaining the normal vector $N_{plane}$ of the plane in which the three positions lie:

N plane  OnOn1  OnOn1

(5)

On , On1 , On1 are three camera positions in O- XwYwZw and (a  b) is the cross-product of 3-dimentional vector a and b . N plane is the Z axis of the reference coordinate system On -X nref Ynref Znref . To obtain the reference coordinate system Where

On -X nref Ynref Z nref , there is a two-step process: (1) Transform the Z ncom axis coincides with the normal vector N plane . (2) Transform the

X 'n axis coincides with the baseline from position O n to position O . n1

Fig.4 Transform the common world coordinate system (a) first transformation of the coordinate system (b) second transformation of the coordinate system

As shown in Fig.4 (a), the first transformation of the common world coordinate system $O_n$-$X_n^{com}Y_n^{com}Z_n^{com}$ is to rotate by $\alpha$ around its $X_n^{com}$ axis and then by $\beta$ around its $Y_n^{com}$ axis, which yields a temporary coordinate system $O_n$-$X'_nY'_nZ'_n$. Similar to equation (2), the transformation matrix $R_t$ is defined as:

$$R_t = R(y, \beta) \cdot R(x, \alpha) \qquad (6)$$

where $\alpha$ and $\beta$ are estimated from the normal vector $N_{plane} = [n_x, n_y, n_z]^T$:

$$\tan\alpha = \frac{n_y}{n_z} \qquad (7); \qquad \tan\beta = \frac{n_x}{\sqrt{n_y^2 + n_z^2}} \qquad (8)$$
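Equations (5)-(8) can be sketched as follows. The camera positions are taken from Table 1; the sign of $\beta$ is chosen so that, with the rotation conventions used here, $R_t$ maps the plane normal onto the +Z axis (the paper's own sign conventions may differ).

```python
import numpy as np

# Three camera positions (from Table 1) and the plane normal of equation (5).
O_nm1 = np.array([2800.0, 600.0, 2500.0])
O_n   = np.array([3000.0, 800.0, 2600.0])
O_np1 = np.array([3100.0, 1000.0, 2500.0])

N = np.cross(O_np1 - O_n, O_nm1 - O_n)        # equation (5)
nx, ny, nz = N

alpha = np.arctan2(ny, nz)                    # equation (7): tan(alpha) = ny / nz
# Equation (8) gives tan(beta) = nx / sqrt(ny^2 + nz^2); the minus sign below is
# the choice that aligns N with +Z under these rotation conventions.
beta = np.arctan2(-nx, np.hypot(ny, nz))

def rot(axis, a):
    """Rotation matrix by angle a (radians) around axis 'x' or 'y'."""
    c, s = np.cos(a), np.sin(a)
    if axis == 'x':
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

Rt = rot('y', beta) @ rot('x', alpha)         # equation (6)
# Rt @ N now lies along the +Z axis with length |N|.
```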

The second transformation is to rotate by $\psi$ around the $Z'_n$ axis of the temporary coordinate system $O_n$-$X'_nY'_nZ'_n$, as shown in Fig.4 (b). Similar to equation (2), the transformation matrix $R_s$ is defined as:

$$R_s = R(z, \psi) \qquad (9); \qquad \tan\psi = \frac{t_y}{t_x} \qquad (10)$$

where $\psi$ is estimated from the positions $O_{n-1}$ and $O_n$: $t = R_t(O_{n-1} - O_n) = [t_x, t_y, t_z]^T$. As a result, we obtain the reference coordinate system $O_n$-$X_n^{ref}Y_n^{ref}Z_n^{ref}$. Now the corresponding epipolar lines of images n-1 and n stay at the same row, as shown in Fig.5, but there remains a problem: the corresponding epipolar lines of images n and n+1 are not at the same column yet.
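The in-plane alignment of equations (9)-(10) amounts to a rotation about Z that zeroes the y component of the transformed baseline. A minimal sketch, taking $R_t$ as the identity and an illustrative baseline that already lies in the XY plane:

```python
import numpy as np

def rot_z(a):
    """Rotation matrix by angle a (radians) around the Z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

Rt = np.eye(3)                        # stand-in for equation (6)
O_n = np.array([0.0, 0.0, 0.0])       # illustrative positions
O_other = np.array([3.0, 4.0, 0.0])   # the other endpoint of the baseline

t = Rt @ (O_other - O_n)              # t = [tx, ty, tz]
psi = np.arctan2(t[1], t[0])          # equation (10): tan(psi) = ty / tx
Rs = rot_z(-psi)                      # equation (9); sign chosen to zero the y part

aligned = Rs @ t                      # baseline now lies along the X axis
```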


Znref Znref1

Xnref1

on Xnref

on1

Znref

epipolar line

Ynref

 on

Znref1

Ynref1

X

baseline

ref n

on1

Xnref1

Ynref

on1

ref n1

Y

Fig.5 the reference coordinate system

Fig.6 transformation of the image

3.2 Transform image space coordinate system

This step makes the corresponding epipolar lines of images n and n+1 lie in the same column. The coordinate transformation translates each pixel by $v\tan\eta$ along the horizontal direction of the image while leaving the vertical direction unchanged, as shown in Fig.6. The transformation matrix $R_{tr}$ is defined as:

$$R_{tr} = \begin{bmatrix} 1 & \tan\eta & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (11); \qquad \tan\eta = \frac{s_x}{s_y} \qquad (12)$$

where $\eta$ is estimated from the positions $O_n$ and $O_{n+1}$: $s = K \cdot R_s \cdot R_t \cdot (O_{n+1} - O_n) = [s_x, s_y, s_z]^T$. After the transformation, the corresponding epipolar lines of images n and n+1 are at the same column.
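The shear of equations (11)-(12) can be sketched directly. Here $\tan\eta$ is set to an arbitrary value; in the method it comes from the vector $s = K R_s R_t (O_{n+1} - O_n)$.

```python
import numpy as np

tan_eta = 0.25   # illustrative value; equation (12) gives tan(eta) = sx / sy

# Equation (11): shift each pixel by v*tan(eta) along the row direction,
# leaving the vertical coordinate fixed.
Rtr = np.array([[1.0, tan_eta, 0.0],
                [0.0, 1.0,     0.0],
                [0.0, 0.0,     1.0]])

pixel = np.array([100.0, 40.0, 1.0])   # homogeneous (u, v, 1)
sheared = Rtr @ pixel                  # u' = u + v*tan(eta), v' = v
```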

3.3 Deduce a homography matrix

The final step is to re-project the image points. To do so, we construct a homography matrix. Given an arbitrary point P's pixel location $I_n$, combining (3), (4), (6), (9) and (11), according to the geometric properties of optical imaging, we obtain the pixel location of P in the new image coordinates:

$$I_n' = \frac{1}{Z_n'} \cdot R_{tr} \cdot K \cdot R_s \cdot R_t \cdot R_n^T \cdot K^{-1} \cdot I_n \qquad (13)$$

where $Z_n'$ is the third element of the vector $R_{tr} \cdot K \cdot R_s \cdot R_t \cdot R_n^T \cdot K^{-1} \cdot I_n$. Here we make the new camera's intrinsic parameters equal to the original camera's. The coordinates of the new image point in the reference coordinate system $O_n$-$X_n^{ref}Y_n^{ref}Z_n^{ref}$ are now obtained through the homography matrix:

$$I_n' = H_n \cdot I_n \qquad (14)$$

$$H_n = \frac{1}{Z_n'} \cdot R_{tr} \cdot K \cdot R_s \cdot R_t \cdot R_n^T \cdot K^{-1} \qquad (15)$$

The projection between the original image coordinates $I_n$ and the new image coordinates $I_n'$ is described by the homography matrix $H_n$. After re-projection, the original image information is transferred to the ideal-type cameras' image plane and the image rectification is finished.
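Putting the pieces together, a sketch of applying the homography of equations (13)-(15) to a pixel. All factor matrices are identity placeholders standing in for the results of equations (2), (6), (9) and (11); with real inputs they would be built as in the previous steps.

```python
import numpy as np

K   = np.array([[560.0, 0, 512], [0, 560.0, 512], [0, 0, 1.0]])
Rn  = np.eye(3)   # external matrix, equation (2)
Rt  = np.eye(3)   # tilt correction, equation (6)
Rs  = np.eye(3)   # in-plane rotation, equation (9)
Rtr = np.eye(3)   # shear, equation (11)

def rectify_pixel(I_n):
    """Apply I_n' = (1/Z_n') Rtr K Rs Rt Rn^T K^{-1} I_n, equations (13)-(15)."""
    H_unnorm = Rtr @ K @ Rs @ Rt @ Rn.T @ np.linalg.inv(K)
    v = H_unnorm @ I_n
    return v / v[2]   # divide by Z_n', the third element

I_n = np.array([600.0, 700.0, 1.0])
I_rect = rectify_pixel(I_n)
```

With all factors set to the identity the homography reduces to the identity, so the pixel is unchanged; replacing the placeholders with the matrices of the earlier steps yields the actual rectifying map.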

4. EXPERIMENTS

The proposed method has been tested on both image data and computer simulated data.

4.1 Image data

The simulated images are created with Visual C++ 6.0 and OpenCV 1.0. We construct three virtual cameras that image two virtual cuboids (in the forward-looking case). Table 1 shows the manually set intrinsic and extrinsic parameters of the cameras. Fig.7 presents the simulated images before and after rectification, and the vertical coordinates and vertical disparities of the corners marked in the images are shown in Table 2. From both the figure and the table we can see


that the corresponding corners lie on the same horizontal lines and their vertical disparities equal zero after rectification (Fig.7 (b)), while they do not before rectification (Fig.7 (a)). Table 2 leads to the same conclusion. Therefore, our method is feasible.

Table.1 Camera parameters

                                   Camera n-1          Camera n            Camera n+1
intrinsic    f_u (pixel)           560                 560                 560
parameters   f_v (pixel)           560                 560                 560
             s                     0                   0                   0
             u_0 (pixel)           512                 512                 512
             v_0 (pixel)           512                 512                 512
extrinsic    theta_i (degree)      4.00                5.00                10.00
parameters   phi_i (degree)        -15.00              -16.00              -20.00
             gamma_i (degree)      1.00                0.00                0.00
             O_i (m)               [2800 600 2500]^T   [3000 800 2600]^T   [3100 1000 2500]^T

Fig.7 The simulated image pairs (a) Before rectification (left: image n-1, middle: image n, right: image n+1) (b) After rectification (top-left: image n, top-right: image n-1, bottom: image n+1)

4.2 Computer Simulations

To verify our method further, we simulated 20 image pairs with different extrinsic parameters but the same intrinsic parameters, and calculated the arithmetic mean and standard deviation of all corresponding corners' horizontal disparities after rectification. As shown in Fig.8, both the mean and the standard deviation are less than 1 pixel, which shows that our method is effective.

Table.2 Pixel disparities

Coordinates before rectification (pixel):
NO.   Camera n-1   Camera n   Disparity   Camera n   Camera n+1   Disparity
1     181          199        -18         143        116          27
2     181          200        -19         257        249          8
3     52           48         4           241        258          -17
4     51           46         5           312        340          28

Coordinates after rectification (pixel):
NO.   Camera n-1   Camera n   Disparity   Camera n   Camera n+1   Disparity
1     322          322        0           1022       1022         0
2     409          409        0           920        920          0
3     523          523        0           1305       1305         0
4     533          533        0           1187       1187         0

5. CONCLUSIONS

In this paper, we present a method that rectifies images so that the corresponding epipolar lines of the image planes stay at the same height. The method needs the intrinsic and extrinsic parameters of the cameras before rectification. First, we establish virtual cameras with optical axes parallel to one axis of a common reference coordinate system, and then we turn their parallel optical axes perpendicular to the baseline. Finally, we derive a homography matrix that realizes the image rectification. This method avoids the complex computation of epipolar lines or the fundamental matrix, and it is well suited to


monocular motion vision systems. Both computer simulations and real image pairs have been used to validate our method, and very good results have been obtained.

Fig.8 The repeated simulated experiments (mean and standard deviation of the disparities versus experiment number)

6. ACKNOWLEDGEMENTS This work was partially supported by the National Natural Science Foundation of China (NSFC) under the Grant No. 61273279 and No. 61273241.

References

1. K. P. Valavanis, Advances in Unmanned Aerial Vehicles: State of the Art and the Road to Autonomy, Springer, The Netherlands, 2007.
2. N. Ayache, C. Hansen, "Rectification of images for binocular and trinocular stereovision", in Proceedings of the 9th International Conference on Pattern Recognition, vol. 1, pp. 11-16, 1988.
3. R. Hartley, "Theory and practice of projective rectification", International Journal of Computer Vision, 35(2), 115-127 (1999).
4. C. Loop, Z. Zhang, "Computing rectifying homographies for stereo vision", in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 125-131, 1999.
5. A. Fusiello, E. Trucco, A. Verri, "A compact algorithm for rectification of stereo pairs", Machine Vision and Applications, 12(1), 16-22 (2000).
6. M. Zhu, Y. Ge, S. Huang, W. Chen, "Stereo vision rectification based on epipolar lines match and three variables projective matrix", in Proceedings of the 2007 IEEE International Conference on Integration Technology, pp. 133-138, 2007.
7. L. Sui, J. Zhang, D. Cui, "Image rectification using affine epipolar geometric constraint", in Proceedings of the International Symposium on Computer Science and Computational Technology, vol. 2, pp. 582-588, 2008.
8. J. Mallon, P. F. Whelan, "Projective rectification from the fundamental matrix", Image and Vision Computing, 23(7), 643-650 (2005).
9. A. Fusiello, L. Irsara, "Quasi-Euclidean uncalibrated epipolar rectification", in Proceedings of the 19th International Conference on Pattern Recognition, pp. 1-4, 2008.
10. G. Lin, X. Chen, W. Zhang, "A robust epipolar rectification method of stereo pairs", in 2010 International Conference on Measuring Technology and Mechatronics Automation, vol. 1, pp. 322-326, 2010.
11. D. Wan, J. Zhou, "Self-calibration of spherical rectification for a PTZ-stereo system", Image and Vision Computing, 28(3), 367-375 (2010).
12. S. Kumar, C. Micheloni, C. Piciarelli, G. L. Foresti, "Stereo rectification of uncalibrated and heterogeneous images", Pattern Recognition Letters, 31(11), 1445-1452 (2010).
13. H. Su, B. He, "Stereo rectification of calibrated image pairs based on geometric transformation", I. J. Modern Education and Computer Science, 3(4), 17-24 (2011).
14. Z. Zhang, "Flexible camera calibration by viewing a plane from unknown orientations", in Proceedings of the 7th IEEE International Conference on Computer Vision (Kerkyra), vol. 1, pp. 666-673, 1999.
15. K. Mikolajczyk, C. Schmid, "A performance evaluation of local descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615-1629 (2005).

