ACTEA 2009
July 15-17, 2009 Zouk Mosbeh, Lebanon
An Automated Camera Calibration Framework for Desktop Vision Systems

Hamed Rezazadegan Tavakoli and Hamid Reza Pourreza

Abstract— Camera calibration is one of the fundamental problems of machine vision, and there have been many efforts to provide autonomous calibration algorithms. One of the major barriers to autonomy is feature detection and extraction. In this paper, the architecture of an autonomous camera calibration framework is studied. The autonomy of the calibration framework originates in its hardware setup, which makes automatic feature detection and extraction possible. It is shown that the calibration framework is accurate.
I. INTRODUCTION

Camera calibration refers to the process of determining a camera's geometric and optical characteristics (intrinsic parameters) and/or the position and orientation of the camera frame relative to a certain world coordinate system (extrinsic parameters) [1]. Camera calibration is the most fundamental part of many computer vision systems, as it is the only means of providing metric information; with a metric understanding, a vision system becomes capable of the measurements that many applications rely on.

Different camera calibration algorithms are available, and they differ mostly in parameterization or solution technique. There are also classifications of algorithms by parameterization and solution method, such as those provided by Heikkila [2] or by Weng, Cohen, and Herniou [3]. The disadvantage of these classifications is that there is always an 'other' category for upcoming new algorithms. For example, a new class could be added for methods that rely on soft-computing techniques such as neural networks, support vector machines, genetic algorithms, and fuzzy methods; examples can be found in [4]-[7]. There are also methods that rely on geometrical characteristics such as vanishing points and lines, e.g. [8]-[12]. Besides, these classifications overlap. The classification of Zhang [8], which groups methods by the dimension of the calibration target, does not suffer from such weaknesses.

In this paper, the focus is on the development of an automatic, accurate camera calibration framework. A set of calibration algorithms known as self-calibration (zero-dimensional) methods are well known for their autonomy, but they are not as accurate as classic methods such as those presented in [1]-[3], [13].

H. Rezazadegan Tavakoli is currently an individual researcher in the field of machine vision and machine intelligence and a member of the Young Researchers Club, Islamic Azad University, Mashhad Branch. [email protected]

H.R. Pourreza is with the Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, 91775-1111, Iran. [email protected]
The framework presented here is as accurate as classic methods while being fully autonomous. It deals with both lens distortion and the internal parameters of the camera. The approach used to provide autonomy utilizes active targets; by an active target, we mean a target that is controllable by the calibration algorithm. This concept gives a new synthesis to active calibration algorithms, and the active approach also makes possible a novel method of approximating the center of radial distortion.

In the next section, the active aspect and the realization of active targets are presented. Section three contains information about the calibration framework, its components, and the algorithms used. The last section contains the experiments, followed by the conclusion.

II. ACTIVE CALIBRATION

Active calibration requires interaction with the environment. Active camera calibration mechanisms traditionally interact with the environment through camera movements [9], and such algorithms have gained attention in the field of robot vision; examples can be found in [10], [11]. It is, however, possible to have an active calibration algorithm while the camera itself is not active at all and sits fixed on a tripod. The idea of such an algorithm is that the information gained from each frame can be used to signal the calibration target for the next frame. This requires the calibration target to be active and controllable by the calibration algorithm. The term active calibration can be used for both kinds of method, even though the two are totally different. A sketch of this feedback loop is given below.
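As an illustration of the loop just described, here is a minimal sketch in Python; `show_pattern`, `grab_frame`, `extract_features`, `next_pattern`, `update`, and `converged` are hypothetical stand-ins for the pattern generator, the camera, and the calibration logic, not components specified by the paper.

```python
def active_calibration_loop(show_pattern, grab_frame, extract_features,
                            next_pattern, update, converged, state=None):
    """Drive an active target: each captured frame decides the next pattern."""
    while not converged(state):
        pattern = next_pattern(state)    # algorithm chooses the next target pattern
        show_pattern(pattern)            # active target displays it
        frame = grab_frame()             # camera captures the displayed frame
        state = update(state, extract_features(frame))  # feed observations back
    return state
```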
An active target could be a light-emitting diode (LED) carried by a controlled robotic arm, or a board of LEDs on which switching elements on and off shapes patterns. Approaches that rely on mechanical instruments are not versatile, flexible, precise, or economical, and the same is true of a board of LEDs. Another approach is the use of a monitor to display patterns.

A. Active Targets

It is possible to use a computer program to generate different patterns and display them on an LCD monitor. With this technique, switching from one pattern to another is easy and fast, giving maximum flexibility, adaptability, and precision. It should also be considered that the popularity of LCD monitors has made them available in every laboratory, institute, and home, and that they are not very expensive.

A monitor, depending on its settings, can provide different precisions. As an example, a monitor with a resolution of 1024 × 768 and a 317 mm × 236 mm viewable screen has pixels approximately 0.31 mm tall and 0.31 mm wide, which means a pattern can be moved with a precision of 0.31 mm. The precision obviously increases at higher resolutions.

The advantage of a monitor driven by a pattern generator program is that patterns can be controlled and changed adaptively, according to the circumstances, throughout the calibration process. The camera is kept in sync with the pattern generator and the calibration program, which makes fully automatic image acquisition and feature extraction possible. The key to automatic feature extraction is that a pattern can be screened over multiple frames; this makes feature extraction from a region of interest easy, especially if only one region of interest is displayed in each frame.

III. CALIBRATION FRAMEWORK

In this section, the different aspects of the calibration framework are explained: first lens distortion handling, then the estimation of the intrinsic parameters, and finally the overall framework architecture.

A. Radial Distortion

Geometrical distortion affects the positioning of image points: when lens distortion is present, the actual position of a point differs from its imaged position. There are different methods for estimating the lens distortion parameters, and they fall into two categories: quantitative methods [1]-[3], [12]-[14] and qualitative methods [15]-[18]. The approach used in the framework is a qualitative one. Qualitative methods rely on invariant image properties, such as the straightness of lines [18], and use these properties to compensate for the distortion. The advantage of such techniques is that they do not rely on camera information; one qualitative approach, however, requires some extra information about the camera [19].

1) Center of Radial Distortion: The framework uses a novel approach that estimates the center of radial distortion in advance. The approach relies on the fact that a line passing through the center of radial distortion stays straight, which gives rise to the following theorem:

Theorem 1: Under radial distortion, two concurrent lines l_1, l_2 stay straight if and only if their intersection point p is positioned on the distortion center o.

Fig. 1 provides a visualization of the theorem. The proof is beyond the scope of this paper; the reader is referred to [20] for an in-depth treatment. A simple search algorithm is proposed for finding the distortion center: a cross calibration target is moved in front of the camera, and the search looks for the position at which both lines of the cross image as straight lines.
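The following is a minimal sketch of one way such a search could be implemented, under stated assumptions: `show_cross` and `trace_cross` are hypothetical helpers that move the on-screen cross and return the imaged points of its two lines, and the caller supplies the candidate screen positions. The paper does not prescribe the search strategy at this level of detail.

```python
import numpy as np

def line_straightness(points):
    """RMS residual of the total-least-squares line fit (0 = perfectly straight)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The smallest singular value is the root-sum-square distance to the line.
    return np.linalg.svd(centered, compute_uv=False)[-1] / np.sqrt(len(pts))

def find_distortion_center(show_cross, trace_cross, candidates):
    """Keep the cross position whose two imaged lines are straightest (Theorem 1)."""
    best, best_err = None, np.inf
    for pos in candidates:
        show_cross(pos)                  # pattern generator moves the cross target
        line1, line2 = trace_cross()     # imaged point sets of the two lines
        err = line_straightness(line1) + line_straightness(line2)
        if err < best_err:
            best, best_err = pos, err
    return best
```

A coarse-to-fine grid of candidate positions would keep the number of displayed frames manageable.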
Fig. 1. Center of radial distortion: x_i is the image of x. The lines of the cross image would be straight if p_i lies on o, i.e., if p and o_c (the optical center) are aligned.
2) Distortion Model: A polynomial model is used to approximate the first two coefficients of radial distortion. Unlike conventional methods, the center of radial distortion is approximated beforehand, and line straightness is used as the measure of distortion. The polynomial model is defined by the following equation:

$r = r_d \left( 1 + k_1 r_d^2 + k_2 r_d^4 + \cdots + k_n r_d^{2n} \right)$    (1)

where $r_d$ is the distorted radius measured from the center of distortion, $r$ is the corrected radius, and the $k_i$ are the radial distortion coefficients.
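As a sketch of how model (1) with n = 2 is applied once the center is known, the hypothetical helper below maps distorted pixel coordinates to corrected ones; in practice, $k_1$ and $k_2$ would then be chosen so as to maximize the straightness of imaged lines.

```python
import numpy as np

def undistort_points(pts, center, k1, k2):
    """Apply the two-coefficient radial model of (1) about a known center."""
    c = np.asarray(center, dtype=float)
    d = np.asarray(pts, dtype=float) - c          # coordinates about the center
    r_d2 = np.sum(d * d, axis=1, keepdims=True)   # squared distorted radius r_d^2
    scale = 1.0 + k1 * r_d2 + k2 * r_d2 ** 2      # r / r_d from equation (1)
    return c + d * scale                          # corrected coordinates
```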
B. Intrinsic Parameters

It is known that a point $\hat{X} = [X, Y, Z, 1]^T$ in the 3D world can be related to its corresponding point $\hat{x} = [x, y, 1]^T$ in the 2D image using (2),

$w\hat{x} = K[R|t]\hat{X}$    (2)

where $w$ is an arbitrary scale factor, $R$ is a rotation matrix, $t$ is a translation vector, and $K$ is the camera's intrinsic matrix defined by (3),

$K = \begin{bmatrix} f_x & s & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}$    (3)
where $f_x$ and $f_y$ are the focal lengths in terms of pixel dimensions, $s$ is the skew, and $u_0$ and $v_0$ are the principal point coordinates in terms of pixel dimensions. Under the assumption $Z = 0$, equation (2) can be reduced to $w\hat{x} = H\hat{X}$ with $\hat{X} = [X, Y, 1]^T$, where $H$ is given by (4),

$H = K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix}$    (4)
where $r_i$ is the $i$th column of the rotation matrix; $H$ is known as the homography matrix.

1) Solving Intrinsic Parameters: Having $\hat{x}$ and $\hat{X}$ from observation, the homography is calculated using the Gold Standard algorithm after applying an isotropic normalization [21]. Using the knowledge that $r_1$ and $r_2$ are orthonormal, the following constraints on the intrinsic parameters are inferred:

$h_1^T K^{-T} K^{-1} h_2 = 0$    (5)

$h_1^T K^{-T} K^{-1} h_1 = h_2^T K^{-T} K^{-1} h_2$    (6)

where $h_i$ is the $i$th column of $H$. It is possible to obtain $\omega = K^{-T} K^{-1}$ from these constraints using a direct linear approach; afterwards, the intrinsic parameters are calculated using Cholesky factorization. However, because the framework uses only one target plane, only $f_x$ and $f_y$ are estimated; it is assumed that $s = 0$ and that the principal point is the center of the image. Having the intrinsic parameters, the extrinsic parameters are calculated using (4). Finally, all the parameters are optimized using the Levenberg-Marquardt technique, minimizing the following distance:

$\sum_{i=1}^{m} \lVert x_i - \hat{x}(K, R, t, X_i) \rVert$    (7)
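Under these assumptions (a single plane, $s = 0$, principal point fixed at the image center), constraints (5) and (6) become linear in $1/f_x^2$ and $1/f_y^2$ and can be solved directly. The sketch below assumes the homography H has already been estimated; the function name and interface are illustrative, not the paper's.

```python
import numpy as np

def focal_lengths_from_homography(H, u0, v0):
    """Solve f_x, f_y from (5)-(6) for one plane, with zero skew and a
    known principal point (u0, v0)."""
    # Shift out the principal point so the remaining intrinsics are diag(f_x, f_y, 1).
    h1 = np.array([H[0, 0] - u0 * H[2, 0], H[1, 0] - v0 * H[2, 0], H[2, 0]])
    h2 = np.array([H[0, 1] - u0 * H[2, 1], H[1, 1] - v0 * H[2, 1], H[2, 1]])
    # With a = 1/f_x^2 and b = 1/f_y^2, (5) and (6) are linear in (a, b).
    A = np.array([[h1[0] * h2[0],           h1[1] * h2[1]],
                  [h1[0] ** 2 - h2[0] ** 2, h1[1] ** 2 - h2[1] ** 2]])
    c = -np.array([h1[2] * h2[2], h1[2] ** 2 - h2[2] ** 2])
    a, b = np.linalg.solve(A, c)   # a, b > 0 for a non-degenerate view
    return 1.0 / np.sqrt(a), 1.0 / np.sqrt(b)
```

The resulting $f_x$, $f_y$ would then seed the Levenberg-Marquardt refinement of (7).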
where $\hat{x}(K, R, t, X_i)$ is the projection of point $X_i$ onto the image plane according to the reduced form of (2).

C. Framework Architecture

Fig. 2 shows the framework architecture. The camera calibration framework consists of two major independent programs: the pattern generator, and a program that performs all the computation, referred to as the computational program in the rest of this paper. The two programs communicate over a communication channel. A communication center is in charge of transferring information and commands between the two programs, and an interpreter is in charge of coding and decoding messages from numerical strings into meaningful structures and vice versa.

The pattern generator consists of a graphic unit, a pixel-metric converter, and a communication center. The graphic unit is in charge of displaying patterns, which are generated by means of feature points; the type of pattern, the feature points, and the region of interest are requested by the computational program. A frame is displayed when a display signal is received; in response, the pattern generator displays the requested frame, sends a displayed signal, and waits for the next request. This ensures that the requested frame is captured. The computational program can obtain metric and pixel-based information about the monitor by requesting it from the pixel-metric converter unit. A minimal sketch of this handshake follows.
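The paper does not specify a wire format for the communication channel, so the following sketch is purely illustrative: it assumes newline-delimited JSON messages over a connected socket, and uses the 0.31 mm pixel pitch from Section II as a default.

```python
import json

def pattern_generator_loop(conn, display, pixel_pitch_mm=(0.31, 0.31)):
    """Toy pattern-generator service: show each requested frame, then send a
    'displayed' acknowledgment so the requester knows capture can proceed."""
    for line in conn.makefile():                     # one JSON message per line
        msg = json.loads(line)
        if msg["cmd"] == "display":
            display(msg["pattern"])                  # graphic unit shows the frame
            conn.sendall(b'{"ack": "displayed"}\n')  # frame is now on screen
        elif msg["cmd"] == "pixel_metric":           # pixel-metric converter query
            reply = json.dumps({"pitch_mm": pixel_pitch_mm}) + "\n"
            conn.sendall(reply.encode())
```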
Fig. 2. Camera calibration framework architecture.
The computational program consists of five components plus a communication center: image acquisition, feature extraction, the geometrical lens distortion handler, the camera parameter handler, and the decision unit. Image acquisition is responsible for capturing frames. The geometrical lens distortion handler is responsible for finding the distortion center and the radial distortion coefficients using the techniques explained above. The camera parameter handler is responsible for approximating the internal parameters using undistorted images. The decision unit is in charge of these components: it decides how information sent from the pattern generator should be routed, and it handles the requests and data of the different components, deciding on the destination to which they should be sent (e.g., which component should receive the extracted feature information). The decision unit is needed because the computational program is not as simple as the pattern generator, and a simple interpreter is not enough to handle all the information.

The approach to camera calibration is based on undistorting the image using a qualitative technique and then using the undistorted information to find the camera's intrinsic parameters. First, a simple test is performed to detect the presence of distortion; next, the center of distortion is estimated and the distortion coefficients are approximated; finally, the intrinsic parameters are approximated.

IV. EXPERIMENTS

The video camera used in these experiments is a Sony camcorder (DCR-TRV460E) equipped with a 1/6″ CCD sensor and a 2.5-50 mm Sony lens.
Fig. 3. Calibration setup used in the experiments.
The lens focal length was kept at 2.5 mm, the widest possible setting, in all the experiments. The camera is capable of USB streaming, so no analog-to-digital converter was needed; the frames were grabbed directly at a resolution of 640 × 480 in RGB color space and later converted to grayscale. A 15″ TFT monitor with a native resolution of 1024 × 768 (Sony SDM-HS53/H) was used to display the patterns generated by the pattern generator. A user-defined color setting with maximum backlight was used throughout the experiments, and the camera's optical axis was kept nearly orthogonal to the monitor. Fig. 3 shows the hardware setup used in the experiments.

The calibration framework's performance was compared with its counterpart developed at the computational vision group of Caltech by Jean-Yves Bouguet¹. The pattern used with Caltech's toolbox was a chessboard pattern provided by the toolbox, printed on paper and fixed on a surface; nearly thirty frames were then taken from different angles by moving the camera freely by hand. The feature extraction used was the corner-based extraction provided by the toolbox. The semi-automatic corner detector was selected, in which the four outer corners of the calibration target are selected manually and the positions of the remaining corners are approximated from the number of squares; the corners are then refined using an iterative scheme.

It has been reported that a higher number of interest points results in a more accurate calibration [22]. Consequently, a pattern consisting of nearly two hundred feature points, positioned essentially at random, was used in the case of the calibration framework. The calibration results of the proposed framework and of Caltech's toolbox are provided in Table I. Caltech's toolbox calculates the first two coefficients of tangential distortion; in the proposed framework these are not needed, because the center of distortion is known.

The accuracy evaluation was done by approximating the angle between two intersecting planes. The targets used are shown in Fig. 4; the ground truth is 90° ± 1°. Eight corresponding points were selected by hand, and the angle between the planes was then calculated.

Fig. 4. Target used in angle estimation.

¹Available at: http://www.vision.caltech.edu/bouguetj/calib_doc/index.html
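One way this evaluation could be reproduced, assuming the eight hand-picked correspondences have already been triangulated to 3D points lying on the two planes (the paper does not detail the reconstruction step), is to fit a plane to each point set and measure the angle between the fitted normals:

```python
import numpy as np

def plane_normal(points3d):
    """Unit normal of the total-least-squares plane through 3D points (via SVD)."""
    pts = np.asarray(points3d, dtype=float)
    _, _, vt = np.linalg.svd(pts - pts.mean(axis=0))
    return vt[-1]                         # direction of smallest variance

def angle_between_planes_deg(points_a, points_b):
    """Angle between the planes fitted to two 3D point sets, in degrees."""
    cos = float(np.dot(plane_normal(points_a), plane_normal(points_b)))
    # Depending on normal orientation this yields theta or 180 - theta.
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```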
TABLE I
CALIBRATION RESULT

Parameter   Framework DLT   Framework Optimized   Caltech's Toolbox
f_x         724.6332        713.4747              863.31337
f_y         743.4705        732.942               884.06995
s           0               -0.2157               0
u_0         240             242.4362              237.76127
v_0         320             322.1570              340.13527
c_x         321.6408        321.6408              --
c_y         247.4743        247.4743              --
k_1         -0.01450        -0.01450              -0.17007
k_2         -0.00126        -0.00126              0.48422
p_1         --              --                    -0.01077
p_2         --              --                    -0.01077

'Framework DLT' is the result of the direct linear transformation before optimization, and 'Framework Optimized' is the final result of the framework. c_x and c_y are the center of radial distortion, k_i is the ith radial distortion coefficient, and p_i is the ith tangential distortion coefficient.

TABLE II
RESULT OF ANGLE ESTIMATION
Method                Angle
Framework DLT         95.6303°
Framework Optimized   94.9125°
Caltech's Toolbox     99.6314°
The evaluation results are summarized in Table II. As shown, the framework's final optimized answer is the most accurate.

V. CONCLUSION

In this paper, an autonomous calibration framework was introduced. The framework is as accurate as well-known methods while being fully autonomous. Because of its autonomy and its capability of using as many feature points as desired, the intrinsic parameter approximation is stable. It was shown that the framework can outperform some well-known calibration toolboxes, mainly because of its accurate automatic feature extraction. In classic calibration, the region of interest for each feature point must be selected by the user in order to obtain a careful calibration, which makes the calibration process tedious; the proposed framework does not suffer from such a defect.

This paper also presents a new synthesis of active calibration, in which the target is active and controlled by the algorithm; the proposed center-of-radial-distortion algorithm relies on this idea. Moreover, the realization of the active target by means of a pattern generator and a monitor, as explained, makes the framework a suitable one for desktop vision systems (DVS), where the user is a novice.

REFERENCES

[1] R.Y. Tsai, A Versatile Camera Calibration Technique for High-accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses, IEEE Journal of Robotics and Automation, vol. RA-3, 1987, pp 323-344.
[2] J. Heikkila, Geometric Camera Calibration Using Circular Control Points, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, 2000, pp 1066-1077.
[3] J. Weng, P. Cohen and M. Herniou, Camera Calibration with Distortion Models and Accuracy Evaluation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, 1992, pp 965-980.
[4] Q. Ji and Y. Zhang, Camera Calibration with Genetic Algorithms, IEEE Transactions on Systems, Man and Cybernetics, Part A, vol. 31, 2001, pp 120-130.
[5] M.S. Mousavi and R.J. Schalkoff, ANN Implementation of Stereo Vision Using a Multi-Layer Feedback Architecture, IEEE Transactions on Systems, Man and Cybernetics, vol. 24, 1994, pp 1220-1238.
[6] C.V. Jawahar and P.J. Narayanan, "Towards Fuzzy Calibration", in AFSS 2002 International Conference on Fuzzy Systems, Calcutta, India, 2002, pp 305-313.
[7] R. Mohamed, A. Ahmed, A. Eid and A. Farag, "Support Vector Machines for Camera Calibration Problem", in 2006 IEEE International Conference on Image Processing, 2006, pp 1029-1032.
[8] Z. Zhang, Camera Calibration with One-Dimensional Objects, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, 2004, pp 892-899.
[9] A. Basu and K. Ravi, Active Camera Calibration Using Pan, Tilt and Roll, IEEE Transactions on Systems, Man and Cybernetics, Part B, vol. 27, 1997, pp 559-566.
[10] K. Daniilidis and J. Ernst, Active Intrinsic Calibration Using Vanishing Points, Pattern Recognition Letters, vol. 17, 1996, pp 1179-1189.
[11] P.F. McLauchlan and D.W. Murray, Active Camera Calibration for a Head-Eye Platform Using the Variable State-Dimension Filter, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, 1996, pp 15-22.
[12] J. Kannala and S.S. Brandt, A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, 2006, pp 1335-1340.
[13] Z. Zhang, A Flexible New Technique for Camera Calibration, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, 2000, pp 1130-1134.
[14] G.-Q. Wei and S.D. Ma, Implicit and Explicit Camera Calibration: Theory and Experiments, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, 1994, pp 469-480.
[15] J.P. Barreto, R. Swaminathan and J. Roquette, "Non Parametric Distortion Correction in Endoscopic Medical Images", in 3DTV-CON: The True Vision, Capture, Transmission and Display of 3D Video, Kos, Greece, 2007.
[16] H. Li and R. Hartley, "A Non-Iterative Method for Lens Distortion Correction from Point Matches", in OmniVis'05 (workshop in conjunction with ICCV'05), Beijing, 2005.
[17] H. Farid and A.C. Popescu, Blind Removal of Lens Distortion, Journal of the Optical Society of America, 2001.
[18] F. Devernay and O. Faugeras, Straight Lines Have to Be Straight: Automatic Calibration and Removal of Distortion from Scenes of Structured Environments, Machine Vision and Applications, vol. 13, 2001, pp 14-24.
[19] J. Wang, F. Shi, J. Zhang and Y. Liu, A New Calibration Model of Camera Lens Distortion, Pattern Recognition, vol. 41, 2008, pp 607-615.
[20] H.R. Tavakoli, Automatic Camera Calibration Mechanism, M.Sc. thesis, Department of Computer and Artificial Intelligence, Islamic Azad University of Mashhad, 2008.
[21] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed., Cambridge University Press, 2003.
[22] W. Sun and J.R. Cooperstock, An Empirical Evaluation of Factors Influencing Camera Calibration Accuracy Using Three Publicly Available Techniques, Machine Vision and Applications, vol. 17, 2006, pp 51-67.