Adaptive Visual Servoing Using Point and Line Features With an Uncalibrated Eye-in-Hand Camera

Hesheng Wang, Member, IEEE, Yun-Hui Liu, Senior Member, IEEE, and Dongxiang Zhou
Abstract—This paper presents a novel approach for image-based visual servoing of a robot manipulator with an eye-in-hand camera when the camera parameters are not calibrated and the 3-D coordinates of the features are not known. Both point and line features are considered. This paper extends the concept of the depth-independent interaction (or image Jacobian) matrix, developed in earlier work for visual servoing using point features and fixed cameras, to the problem using eye-in-hand cameras and point and line features. By using the depth-independent interaction matrix, it is possible to linearly parameterize, by the unknown camera parameters and the unknown coordinates of the features, the closed-loop dynamics of the system. A new algorithm is developed to estimate the unknown parameters online by combining the Slotine–Li method with the idea of structure from motion in computer vision. By minimizing the errors between the real and estimated projections of the features on multiple images captured during motion of the robot, this new adaptive algorithm can guarantee convergence of the estimated parameters to the real values up to a scale. On the basis of the nonlinear robot dynamics, we prove asymptotic convergence of the image errors by the Lyapunov theory. Experiments have been conducted to demonstrate the performance of the proposed controller.

Index Terms—Adaptive control, eye-in-hand, uncalibrated, visual servoing.
Manuscript received May 8, 2007; revised November 19, 2007. This paper was recommended for publication by Associate Editor S. Hirai and Editor L. Parker upon evaluation of the reviewers' comments. This work is supported in part by the Hong Kong Research Grants Council (RGC) under Grant 414406 and Grant 414707 and in part by the National Natural Science Foundation of China (NSFC) under Project 60334010 and Project 60475029.
H. Wang is with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong (e-mail: [email protected]).
Y.-H. Liu is with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong, and also with the National University of Defense Technology, Changsha 410073, China (e-mail: [email protected]).
D. Zhou is with the Joint Center for Intelligent Sensing and Systems, National University of Defense Technology, Changsha 410073, China.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TRO.2008.2001356

I. INTRODUCTION

Image-based eye-in-hand visual servoing is the problem of controlling the projections of image features to desired positions on the image plane of a camera mounted on a robot manipulator by controlling the motion of the manipulator [1]. Compared to a camera fixed in the workspace, an eye-in-hand camera enables the manipulator to view the workspace more flexibly. To implement a visual servo controller, an important step is to calibrate the intrinsic and extrinsic parameters of the camera. It is well known that camera calibration is costly and tedious. A survey of camera self-calibration can be found in [2].
To avoid camera calibration, many efforts have been made on uncalibrated visual servoing, for which the methods can be classified into kinematics-based controllers and dynamic controllers. Kinematics-based methods decouple the design of the motion controller of the manipulator from the design of the visual controller under the assumption that the manipulator can perfectly perform the position or velocity control required in visual servoing. They change visual servoing into a problem of designing the velocity or position of the end-effector of the robot using the visual feedback signals. In kinematics-based methods, the typical idea for uncalibrated visual servoing is to estimate the image Jacobian or the depths of the features online. Papanikolopoulos et al. [3] developed an online estimation method to determine the depth information of the feature point. Malis [4], [5] proposed 2-1/2-D visual servoing to deal with uncalibrated camera intrinsic parameters. Yoshimi and Allen [7] proposed an estimator of the image Jacobian for a peg-in-hole alignment task. Hosoda and Asada [6] designed an online algorithm to estimate the image Jacobian. Ruf et al. [8] proposed position-based visual servoing by adaptive kinematic prediction. Pomares et al. [9] proposed adaptive visual servoing by simultaneous camera calibration. Since the design of the visual servo controller does not take the nonlinear robot dynamics into account, the stability of the overall system is not guaranteed even if the visual controller is stable, as pointed out in [10]–[18]. Dynamic methods design the joint input directly, instead of the velocity of the end-effector, using the visual feedback, and include the nonlinear robot dynamics in the control loop. Carelli et al. [19] proposed an adaptive controller to estimate robot parameters for the eye-in-hand setup. Kelly et al. [20] developed an adaptive controller for visual servoing of planar manipulators. The controller developed by Hashimoto et al. [21], [22] incorporated the nonlinear robot dynamics in the controller design and employed a motion estimator to estimate the motion of an object in a plane. In our earlier work, we proposed a new adaptive controller for uncalibrated visual servoing of a 3-D manipulator using a fixed camera [23]–[27]. The underlying idea of our controller is the concept of the depth-independent interaction (or image Jacobian) matrix, which avoids the depth that appears in the denominator of the perspective projection equation. Most existing visual servo controllers cope with point features, while polyhedral objects are much more common in the physical world. Lines are easier to detect and track and are more robust to image noise than points. Only a few works have extended visual servoing to lines. Espiau et al. [28], Urban et al. [29], and Andreff et al. [30] proposed different methods for visual servoing from lines, while assuming that the camera parameters are calibrated. Malis et al. [31] proposed intrinsic-parameters-free
visual servoing with respect to straight lines. However, the dynamics of the robots were not considered in these methods. Mahony and Hamel [32] proposed a dynamic visual servo controller for aerial robots using lines while assuming that the camera parameters are calibrated.

This paper extends our controller developed in [24] to image-based visual servoing of point and line features using an uncalibrated eye-in-hand camera. Two new challenges arise here. First, in an eye-in-hand setup, in addition to the camera parameters, the 3-D coordinates of the features are not known either. Therefore, we need to estimate both the camera parameters and the 3-D coordinates online. Second, in addition to point features, this paper also copes with visual servoing of line features. The basic idea in the controller design is to use the depth-independent interaction (or image Jacobian) matrix so that the depth appearing in the denominator of the perspective projection equation can be avoided. To derive the depth-independent interaction matrix for line features, we propose a new scheme to represent the projection of lines, which is similar to the Plücker coordinates. This new representation enables us to linearly parameterize the depth-independent interaction matrix as well as the closed-loop dynamics. To simultaneously estimate the camera parameters and the unknown 3-D coordinates of the features, we combine the Slotine–Li method with an online estimator designed on the basis of the idea of structure from motion in computer vision. The estimator uses the image sequence captured during motion of the manipulator to estimate the 3-D structure and the perspective projection matrix online by minimizing the Frobenius norm of the estimated projection errors. Based on the nonlinear dynamics of the manipulator, we have proved by the Lyapunov theory that the image errors are convergent to zero and that the unknown camera parameters and 3-D coordinates of the features are convergent to the real values up to a scale. Experiments have been conducted to validate the proposed methods.

This paper differs from our earlier work [24] in the following aspects. First, this paper copes with eye-in-hand problems, whereas the previous works addressed problems using fixed cameras. Second, this paper assumes that the 3-D coordinates of the features are unknown, while in [24], the coordinates of the feature points with respect to the robot are given. Third, this paper copes with both point and line features and extends the concept of the depth-independent interaction matrix to line features.

The contributions of this paper can be summarized as follows. First, we propose a new adaptive controller for image-based visual servoing of both point and line features using an uncalibrated eye-in-hand camera. Second, a new algorithm is developed for online simultaneous estimation of the unknown camera parameters and the 3-D structure of the features. Finally, the dynamic stability of the proposed controller is rigorously proved by the Lyapunov method.

II. KINEMATICS

A. Problem Definition

Consider an eye-in-hand setup (see Fig. 1), in which a vision system is mounted on the end-effector to monitor a set of image features. Assume that the image features are fixed ones but their
Fig. 1. Eye-in-hand setup for visual servoing.
3-D coordinates in the space are not known. Suppose that the camera is a pinhole camera with perspective projection. Furthermore, we assume that the intrinsic parameters of the camera and the extrinsic parameters, i.e., the homogeneous transform matrix between the camera and the end-effector, are unknown. Assume that there are sets of features in the environment. The features under consideration are either points or lines. The problems addressed are as follows.
1) Problem 1 (Visual servoing using point features): Given the desired projections of the feature points on the image plane, design a proper joint input for the manipulator such that the projections of the feature points on the image plane are asymptotically convergent to the desired positions.
2) Problem 2 (Visual servoing using line features): Given the desired projections of a set of line features on the image plane, design a proper input for the manipulator such that the projections of the feature lines asymptotically align with the desired ones.
To simplify the discussion, we will assume that the features always remain in the field of view.

B. Notations

The notations adopted here are the following: a bold capital letter and a bold lowercase letter represent a matrix and a vector, respectively. An italic letter represents a scalar quantity. A matrix, vector, or scalar accompanied by a bracket (t) implies that its value varies with time. Furthermore, let $I_{k\times k}$ and $0_{k\times l}$ denote the $k \times k$ identity matrix and the $k \times l$ zero matrix, respectively.

C. Perspective Projection of Point Features

In Fig. 1, three coordinate frames, namely the robot base frame, the end-effector frame, and the camera frame, have been set up to represent motion of the manipulator. Denote the joint angles of the manipulator by the $n \times 1$ vector $q(t)$, where $n$ is the number of DOFs. Denote the 3-D coordinates of the feature w.r.t. the robot base and the camera frames by ${}^{b}x$ and ${}^{c}x$, respectively. Note that ${}^{b}x$ is a constant vector for a fixed feature point. Denote the homogeneous transform matrix of the end-effector with respect to the base frame by $T_e(t)$, which can be calculated from the kinematics of the manipulator. Denote the homogeneous transformation matrix of the camera frame with respect to the end-effector frame by $T_c$, which represents
the camera extrinsic parameters. From the forward kinematics of the manipulator, we have
$$\begin{pmatrix} {}^{c}x(t)\\ 1\end{pmatrix} = T_c^{-1} T_e^{-1}(t)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix} \tag{1}$$
and
$$T_e^{-1}(t) = \begin{pmatrix} R(t) & \xi(t)\\ 0 & 1\end{pmatrix} \tag{2}$$
where $R(t)$ and $\xi(t)$ are the rotation matrix and the translation vector from the robot base frame to the end-effector frame. Let $y = (y_1, y_2)^T$ denote the coordinates of the projection of the feature point on the image plane. Under the perspective projection model,
$$\begin{pmatrix} y(t)\\ 1\end{pmatrix} = \frac{1}{{}^{c}z(t)}\, M T_e^{-1}(t)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix} \tag{3}$$
where ${}^{c}z(t)$ denotes the depth of the feature point and the $3\times 4$ matrix $M$ is called the perspective projection matrix, which depends on the intrinsic and extrinsic parameters only. Equation (3) can be rewritten as
$$y(t) = \frac{1}{{}^{c}z(t)}\, P T_e^{-1}(t)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix} \tag{4}$$
where $P$ is the matrix consisting of the first two rows of the matrix $M$. Let $m_i^T$ denote the $i$th row vector of the matrix $M$, so that $P$ has the following form:
$$P = \begin{pmatrix} m_1^T\\ m_2^T\end{pmatrix}. \tag{5}$$
The depth of the feature point is given by
$${}^{c}z(t) = m_3^T T_e^{-1}(t)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix}. \tag{6}$$
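To make (3)–(6) concrete, the following sketch (not from the paper; the numbers and the decomposition $M = K[I\;0]T_c^{-1}$ are illustrative assumptions) builds a full-rank perspective projection matrix from assumed intrinsics and an assumed camera-to-end-effector transform, and evaluates the projection y(t) and depth ᶜz(t) of a fixed point ᵇx:

```python
import numpy as np

# --- Hypothetical camera intrinsics and transforms (not values from the paper) ---
au, av, u0, v0 = 2000.0, 2000.0, 320.0, 240.0     # focal lengths (pixels), principal point
K = np.array([[au, 0.0, u0],
              [0.0, av, v0],
              [0.0, 0.0, 1.0]])

def homogeneous(R, t):
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

T_e = homogeneous(np.eye(3), np.array([0.3, 0.0, 0.2]))   # end-effector w.r.t. base
T_c = homogeneous(np.eye(3), np.array([0.0, 0.05, 0.1]))  # camera w.r.t. end-effector

# One standard way to form the projection matrix; (3) only requires M to be full rank.
M = K @ np.hstack([np.eye(3), np.zeros((3, 1))]) @ np.linalg.inv(T_c)

b_x = np.array([1.0, -0.1, 0.2, 1.0])             # fixed feature point, homogeneous

m = M @ np.linalg.inv(T_e) @ b_x                  # c_z(t) * (y(t); 1), cf. (3)
c_z = m[2]                                        # depth, eq. (6)
y = m[:2] / c_z                                   # image projection, eq. (4)
print("depth:", c_z, "projection:", y)
```

The controller developed below never needs K, T_c, or ᵇx individually; it only relies on the linear parameterization of products such as ᶜz(t)(y(t); 1).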
It is important to note [34] the following.

1) Property 1: A necessary and sufficient condition for matrix $M$ to be a perspective projection matrix is that it has full rank.

2) Property 2: The vector ${}^{c}z(t)\binom{y(t)}{1}$ can be represented as a linear form of a parameter vector $\varphi_p$ corresponding to the products of the unknown camera parameters and the unknown 3-D world coordinates of the feature point ${}^{b}x$, i.e.,
$${}^{c}z(t)\begin{pmatrix} y(t)\\ 1\end{pmatrix} = \Phi_p(q(t))\,\varphi_p. \tag{7}$$
The dimension of the vector $\varphi_p$ is $39\times 1$.

Proof: Considering (2), we rewrite (7) in the following form:
$${}^{c}z(t)\begin{pmatrix} y(t)\\ 1\end{pmatrix} = M T_e^{-1}(t)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix} = \begin{pmatrix}\Omega & \chi\end{pmatrix}\begin{pmatrix} R(t)\,{}^{b}x + \xi(t)\\ 1\end{pmatrix} = \Omega R(t)\,{}^{b}x + \Omega\xi(t) + \chi \tag{8}$$
where $\Omega$ is the upper-left $3\times 3$ submatrix of the perspective projection matrix $M$ and $\chi$ is the fourth column vector of the matrix, i.e., $M = (\Omega\;\;\chi)$. In (8), $\Omega$ is a constant unknown matrix with nine independent components, and hence, there are 27 independent products of the components of matrix $\Omega$ and ${}^{b}x$ in the first term, and nine independent parameters, the components of the matrix $\Omega$, in the second term. The third term $\chi$ is a constant unknown vector, which has three independent components. Therefore, we can derive (7), where the dimension of the parameter vector is 39 (= 27 + 9 + 3).

Remark 1: From (3) and (7), the parameters can be determined only up to a scale. Therefore, we can fix one parameter so that 38 parameters need to be estimated. Denote by $p_z$ the z-component of the translation between the end-effector and the camera frames, which is the entry in the third row and fourth column of $M$, i.e., $p_z = M_{34}$. The sign of this parameter is known after the system is set up. Here, we fix the estimated value of this parameter as follows:
$$\hat p_z = \begin{cases} 1, & \text{if } p_z > 0\\ -1, & \text{if } p_z < 0.\end{cases} \tag{9}$$
Then, define the remaining 38 parameters as $\theta_p = \varphi_p/p_z$. After fixing the parameter $p_z$, (3) can be rewritten as follows:
$$\begin{pmatrix} y(t)\\ 1\end{pmatrix} = \frac{1}{{}^{c}z(t)/p_z}\,\frac{M}{p_z}\,T_e^{-1}(t)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix}.$$
To simplify the notation, we redefine ${}^{c}z(t)/p_z$ and $m_i^T/p_z$ in these equations as ${}^{c}z(t)$ and $m_i^T$, respectively.

By differentiating (6), we have
$${}^{c}\dot z(t) = \underbrace{\frac{\partial\Big(m_3^T T_e^{-1}(t)\binom{{}^{b}x}{1}\Big)}{\partial q}}_{a^T(t)}\,\dot q(t) = a^T(t)\,\dot q(t) \tag{10}$$
where $a(t)$ is a vector determined by the parameters and the joint position. By differentiating (4), we obtain the following relation:
$$\dot y(t) = \frac{1}{{}^{c}z(t)}\left(P\,\frac{d}{dt}\!\left[T_e^{-1}(t)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix}\right] - y(t)\,{}^{c}\dot z(t)\right) = \frac{1}{{}^{c}z(t)}\,A(t)\,\dot q(t) \tag{11}$$
where the matrix $A(t)$ is given by
$$A(t) = \underbrace{\begin{pmatrix} m_1^T - y_1(t)\,m_3^T\\ m_2^T - y_2(t)\,m_3^T\end{pmatrix}}_{D(y(t))}\,\frac{\partial\Big(T_e^{-1}(t)\binom{{}^{b}x}{1}\Big)}{\partial q}. \tag{12}$$
It should be noted that the image Jacobian matrix is the matrix $(1/{}^{c}z(t))A(t)$; matrix $A(t)$ differs from the image Jacobian matrix only by this scale factor. Here, we call $A(t)$ the depth-independent interaction matrix.
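As a numerical illustration of (11)–(12) (a sketch only; the 2-DOF kinematics, the matrix M, and the feature point below are hypothetical), the code forms D(y(t)) from the rows of M, multiplies it by a finite-difference approximation of the partial derivative of T_e^{-1}(t)(ᵇx; 1) with respect to q, and checks that ẏ ≈ (1/ᶜz)A(t)q̇:

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def T_e(q):
    """Toy forward kinematics (illustrative 2-DOF arm): end-effector pose w.r.t. base."""
    T1 = np.eye(4); T1[:3, :3] = rot_z(q[0]); T1[:3, 3] = [0.3, 0.0, 0.2]
    T2 = np.eye(4); T2[:3, :3] = rot_z(q[1]); T2[:3, 3] = [0.2, 0.0, 0.0]
    return T1 @ T2

# Assumed (unknown to the controller) projection matrix M and feature point b_x.
M = np.array([[2000.0, 0.0, 320.0, 50.0],
              [0.0, 2000.0, 240.0, -30.0],
              [0.0, 0.0, 1.0, 0.8]])
b_x = np.array([0.6, 0.1, 0.1, 1.0])

def project(q):
    m = M @ np.linalg.inv(T_e(q)) @ b_x
    return m[:2] / m[2], m[2]                     # y(t), c_z(t)

def A_matrix(q, eps=1e-6):
    """Depth-independent interaction matrix, eq. (12): A = D(y) * d(T_e^{-1} x)/dq."""
    y, _ = project(q)
    D = np.vstack([M[0] - y[0] * M[2],            # m_1^T - y_1 m_3^T
                   M[1] - y[1] * M[2]])           # m_2^T - y_2 m_3^T
    h = lambda qq: np.linalg.inv(T_e(qq)) @ b_x
    dh = np.column_stack([(h(q + eps * e) - h(q - eps * e)) / (2 * eps)
                          for e in np.eye(len(q))])
    return D @ dh

q, qdot, dt = np.array([0.2, -0.4]), np.array([0.5, 0.3]), 1e-5
y0, cz = project(q)
y1, _ = project(q + dt * qdot)
print("finite-difference ydot:", (y1 - y0) / dt)
print("(1/c_z) A qdot        :", A_matrix(q) @ qdot / cz)   # eq. (11)
```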
Proposition 1: Assume that the Jacobian matrix $J(q(t))$ of the manipulator is not singular. For any vector ${}^{b}x$, the matrix $\partial\Big(T_e^{-1}(t)\binom{{}^{b}x}{1}\Big)/\partial q$ has a rank of 3.

Proof: Substituting (2) into the previous matrix, we obtain
$$\frac{\partial\Big(T_e^{-1}(t)\binom{{}^{b}x}{1}\Big)}{\partial q} = \frac{\partial\big(R(t)\,{}^{b}x + \xi(t)\big)}{\partial q} = \Big(\,\mathrm{sk}\{R(t)\,{}^{b}x + \xi(t)\}\quad I_{3\times 3}\,\Big)\begin{pmatrix} R(t) & 0\\ 0 & R(t)\end{pmatrix}J(q(t)) \tag{13}$$
where sk is a matrix operator and, for a vector $x = [x_1\;x_2\;x_3]^T$, $\mathrm{sk}(x)$ can be written in matrix form as
$$\mathrm{sk}(x) = \begin{pmatrix} 0 & -x_3 & x_2\\ x_3 & 0 & -x_1\\ -x_2 & x_1 & 0\end{pmatrix}.$$
Obviously, the matrix in (13) has a rank of 3.

Property 3: The depth-independent interaction matrix $A(t)$ has a rank of 2 if the matrix $M$ is a perspective projection matrix. The detailed proof can be found in [24].

Property 4: For any vector $\rho$, the product $A(t)\rho$ can be written as a linear form of the unknown parameters with a constant $\sigma_p$, i.e.,
$$A(t)\rho = Q(\rho, y(t))\,\theta_p + \sigma_p \tag{14}$$
where $Q(\rho, y(t))$ does not depend on the parameters.

D. Perspective Projection of Line Features

This section reveals the geometric properties of a line feature under the perspective projection. As shown in Fig. 2, let $O_c$ be the center of the camera frame and $O_i$ be the origin of the physical retina coordinate frame of the camera. A 3-D straight line $L$ is projected onto the image plane as a line $l$. $P$ and $P_1$ are two points on the 3-D line, while $p$ and $p_1$ are their corresponding projections on the image plane. $u$ is the unit direction vector of line $L$ and ${}^{i}u$ is its projection on the image plane.

Fig. 2. Projection geometry of a line feature.

In this paper, we define a new representation of image lines, similar to the so-called normalized Euclidean Plücker coordinates [33]. In the normalized Euclidean Plücker coordinates, a line is represented by its direction vector $u$ and a vector $h$ defined as follows:
$$h = (O_c P)\times u = \begin{pmatrix} {}^{i}y_p(t)\\ f\end{pmatrix}\times\begin{pmatrix} {}^{i}u\\ 0\end{pmatrix} \tag{15}$$
where $P$ is the closest point on the line to the origin of the camera frame $O_c$. To calculate $h$ from the projection of the line on the image plane, it is necessary to know the focal length $f$, which is not available when the camera is not calibrated. To solve this problem, we define another frame $g$, which is parallel to the physical retina frame but whose origin is shifted by 1 from the retina frame along the negative z-axis of the camera frame. We define $n(t)$ as the unit normal vector of the plane defined by the projection $l$ and the origin $O_g$. The vector $n(t)$ can uniquely determine the projection $l$ of the line. The vector $n(t)$ is directly calculated from the projection $l$ as follows:
$$n(t) = \frac{\begin{pmatrix} {}^{i}y_p(t)\\ 1\end{pmatrix}\times\begin{pmatrix} {}^{i}u\\ 0\end{pmatrix}}{\left\|\begin{pmatrix} {}^{i}y_p(t)\\ 1\end{pmatrix}\times\begin{pmatrix} {}^{i}u\\ 0\end{pmatrix}\right\|}. \tag{16}$$
To derive the geometry of the perspective projection of lines based on the new representation, we define the following vector:
$$b(t) = {}^{c}z_P\,{}^{c}z_{P_1}\,\big(O_g p\times O_g p_1\big) \tag{17}$$
where ${}^{c}z_P$ and ${}^{c}z_{P_1}$ are the depths of two arbitrary 3-D points $P$ and $P_1$ on the line with respect to the camera frame, respectively, and $O_g p$ and $O_g p_1$ are vectors from the origin $O_g$ to the projections $p$ and $p_1$ on the image plane, respectively. Then,
$$O_g p = \begin{pmatrix} y_p(t)\\ 1\end{pmatrix} = \frac{1}{{}^{c}z_P}\,M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_P\\ 1\end{pmatrix} \tag{18}$$
$$O_g p_1 = \begin{pmatrix} y_{p_1}(t)\\ 1\end{pmatrix} = \frac{1}{{}^{c}z_{P_1}}\,M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_{P_1}\\ 1\end{pmatrix}. \tag{19}$$
By substituting (18) and (19) into (17), we can obtain
$$\begin{aligned} b(t) &= M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_P\\ 1\end{pmatrix}\times M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_{P_1}\\ 1\end{pmatrix}\\ &= M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_P\\ 1\end{pmatrix}\times\left(M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_{P_1}\\ 1\end{pmatrix} - M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_P\\ 1\end{pmatrix}\right)\\ &= M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_P\\ 1\end{pmatrix}\times M T_e^{-1}(t)\begin{pmatrix} {}^{b}u\\ 0\end{pmatrix}\\ &= \big\{\Omega\big(R(t)\,{}^{b}x_P + \xi(t)\big) + \chi\big\}\times\Omega R(t)\,{}^{b}u. \end{aligned} \tag{20}$$
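A small numerical sketch of the line representation in (16)–(17) (the image points, direction, and depths below are made up): it computes n(t) from two projected points via the normalized cross product and verifies that b(t) from (17) is parallel to n(t), which is what makes the normalization used later possible:

```python
import numpy as np

# Two image points on the projected line l, expressed in frame g as (y; 1).
Og_p  = np.array([0.12, -0.05, 1.0])
Og_p1 = np.array([0.40,  0.22, 1.0])

# Image direction of the line, written as (i_u; 0) as in (15)-(16).
i_u = np.append(Og_p1[:2] - Og_p[:2], 0.0)

# Eq. (16): unit normal of the plane through O_g and the projected line.
n = np.cross(Og_p, i_u)
n = n / np.linalg.norm(n)

# Eq. (17): b(t) = c_zP * c_zP1 * (O_g p x O_g p1), with arbitrary positive depths.
c_zP, c_zP1 = 0.8, 1.3
b = c_zP * c_zP1 * np.cross(Og_p, Og_p1)

# b is normal to the same plane, so n and b/||b|| coincide (cf. (28) below).
print("n            :", n)
print("b normalized :", b / np.linalg.norm(b))
```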
Property 5: The vector $b(t)$ can be represented as a linear form of a parameter vector $\varphi_l$ corresponding to the products of the unknown camera parameters and the unknown coordinates ${}^{b}x_P$ and ${}^{b}u$, i.e.,
$$b(t) = \Phi_l(q(t))\,\varphi_l. \tag{21}$$
The dimension of the vector $\varphi_l$ is 108.

Proof: We rewrite (20) in the following form:
$$b(t) = \Omega R(t)\,{}^{b}x_P\times\Omega R(t)\,{}^{b}u + \Omega\xi(t)\times\Omega R(t)\,{}^{b}u + \chi\times\Omega R(t)\,{}^{b}u. \tag{22}$$
The terms on the right-hand side of (22) can be represented as follows:
$$\Omega R(t)\,{}^{b}x_P\times\Omega R(t)\,{}^{b}u = \begin{pmatrix} {}^{b}x_P^T R^T(t)\big(\psi_2\psi_3^T - \psi_3\psi_2^T\big)R(t)\,{}^{b}u\\ {}^{b}x_P^T R^T(t)\big(\psi_3\psi_1^T - \psi_1\psi_3^T\big)R(t)\,{}^{b}u\\ {}^{b}x_P^T R^T(t)\big(\psi_1\psi_2^T - \psi_2\psi_1^T\big)R(t)\,{}^{b}u\end{pmatrix} \tag{23}$$
$$\Omega\xi(t)\times\Omega R(t)\,{}^{b}u = \begin{pmatrix} \xi^T(t)\big(\psi_2\psi_3^T - \psi_3\psi_2^T\big)R(t)\,{}^{b}u\\ \xi^T(t)\big(\psi_3\psi_1^T - \psi_1\psi_3^T\big)R(t)\,{}^{b}u\\ \xi^T(t)\big(\psi_1\psi_2^T - \psi_2\psi_1^T\big)R(t)\,{}^{b}u\end{pmatrix} \tag{24}$$
$$\chi\times\Omega R(t)\,{}^{b}u = \mathrm{sk}(\chi)\,\Omega R(t)\,{}^{b}u. \tag{25}$$
Here, the vector $\psi_i^T$ represents the $i$th row vector of matrix $\Omega$. The matrix $(\psi_i\psi_j^T - \psi_j\psi_i^T)$ is a $3\times 3$ skew-symmetric matrix and has only three independent nonzero components. Therefore, there are 54 independent products of the components of matrix $\Omega$, ${}^{b}x_P$, and ${}^{b}u$ in (23) (refer to the Appendix for the detailed proof), and 27 independent products (parameters) of the components of the matrix $\Omega$ and ${}^{b}u$ in (24). In (25), $\mathrm{sk}(\chi)\Omega$ is a constant unknown matrix with nine independent components, and hence, there are 27 independent products (parameters) of the components of matrix $\mathrm{sk}(\chi)\Omega$ and the vector ${}^{b}u$. From (23)–(25), we can derive (22), where the dimension of the parameter vector is 108 (= 54 + 27 + 27).

Proposition 2: There always exists one nonzero parameter among the 108 unknown parameters.

Proof: Denote by $p_l$ the parameter from (23) that can be written as
$$p_l = {}^{b}x_{Px}\,f(M_{ij})\,{}^{b}u_x \tag{26}$$
where $f(M_{ij})$ is a constant value calculated from the components of the perspective projection matrix. Assume that the line is not perpendicular to the x-axis, and let ${}^{b}x_{Px}$ denote the component of ${}^{b}x_P$ along the x-axis. The sign of ${}^{b}x_{Px}$ could be positive or negative because the point $P$ may lie at any position on the line. Therefore, there always exists one point $P$ that guarantees $p_l \neq 0$.

Remark 2: Since the parameters can only be estimated up to a scale, we can fix one parameter so that 107 parameters need to be estimated. By Proposition 2, we can select one parameter and fix its estimated value as $\hat p_l = 1$. Then, define the remaining 107 parameters as $\theta_l = \varphi_l/p_l$. Equation (20) can be rewritten as
$$\frac{b(t)}{p_l} = M T_e^{-1}(t)\begin{pmatrix} {}^{b}x_P\\ 1\end{pmatrix}\times M T_e^{-1}(t)\begin{pmatrix} {}^{b}u/p_l\\ 0\end{pmatrix}. \tag{27}$$
To simplify the notation, we redefine $b(t)/p_l$ and ${}^{b}u/p_l$ as $b(t)$ and ${}^{b}u$, respectively. The normal vector can also be represented in the following form:
$$n(t) = \frac{b(t)}{\|b(t)\|}. \tag{28}$$
By differentiating (27) and combining it with (20), we have
$$\dot b(t) = F(t)\begin{pmatrix} R(t) & 0\\ 0 & R(t)\end{pmatrix}J(q)\,\dot q \tag{29}$$
where
$$F(t) = \Big(\;\mathrm{sk}\{\Omega R(t)\,{}^{b}u\}\Omega\;\;\;\; -\mathrm{sk}\{\Omega R(t)\,{}^{b}u\}\Omega\,\mathrm{sk}\{R(t)\,{}^{b}x_P + \xi(t)\} + \mathrm{sk}\{\Omega(R(t)\,{}^{b}x_P + \xi(t)) + \chi\}\Omega\,\mathrm{sk}\{R(t)\,{}^{b}u\}\;\Big). \tag{30}$$
It is important to note that
$$\|b(t)\|^2 = b^T(t)\,b(t). \tag{31}$$
By differentiating both sides of (31),
$$2\,\|b(t)\|\,\frac{d}{dt}\|b(t)\| = 2\,b^T(t)\,\dot b(t). \tag{32}$$
Then,
$$\frac{d}{dt}\|b(t)\| = \frac{b^T(t)}{\|b(t)\|}\,\dot b(t) = n^T(t)\,\dot b(t) = c(n(t), q(t), \theta_l)\,\dot q(t). \tag{33}$$
The derivative of the normalized vector is as follows:
$$\dot n(t) = \frac{1}{\|b(t)\|}\dot b(t) - \frac{(d/dt)\|b(t)\|}{\|b(t)\|^2}\,b(t) = \frac{1}{\|b(t)\|}\big(I_{3\times 3} - n(t)n^T(t)\big)\dot b(t) = \frac{1}{\|b(t)\|}\,L(n(t), q(t), \theta_l)\,\dot q(t). \tag{34}$$
In (34), $\|b(t)\|$ corresponds to the depth. The matrix $L(n(t), q(t), \theta_l)$ is called the depth-independent interaction matrix and its dimension is $3\times n$. The matrix depends on the camera parameters, the unknown constant vector ${}^{b}x_P$, and the direction vector ${}^{b}u$ of the line. Similarly, we have the following property.

Property 6: For an $n\times 1$ vector $\rho$, we have
$$L(n(t), q(t), \theta_l)\,\rho = \Xi(n(t), q(t), \rho)\,\theta_l + \sigma_l \tag{35}$$
where the matrix $\Xi(n(t), q(t), \rho)$ does not depend on the unknown camera parameters and the unknown direction and coordinates of the line in 3-D space, and $\sigma_l$ is a constant.

III. ADAPTIVE-IMAGE-BASED VISUAL SERVOING USING POINT FEATURES

In the controller design, we will use the feedback of both the position and velocity (state) of the manipulator and the image errors (output).
A. Robot Dynamics

The dynamic equation of a robot manipulator has the form
$$H(q(t))\,\ddot q(t) + \Big(\frac{1}{2}\dot H(q(t)) + C(q(t), \dot q(t))\Big)\dot q(t) + G(q(t)) = \tau \tag{36}$$
where $H(q(t))$ is the positive definite and symmetric inertia matrix, $C(q(t), \dot q(t))$ is a skew-symmetric matrix, the term $G(q(t))$ represents the gravitational force, and $\tau$ is the joint input of the robot manipulator. The first term on the left side of (36) is the inertial force, and the second term represents the Coriolis and centrifugal forces.
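For simulation or controller prototyping, (36) can be integrated once H, C, G, and Ḣ are available. The sketch below is only an illustration with a toy constant-inertia 2-DOF model (all values hypothetical), not the dynamics of the experimental manipulator:

```python
import numpy as np

def step_dynamics(q, qdot, tau, H, C, G, Hdot, dt=1e-3):
    """One integration step of (36): H qdd + (0.5 Hdot + C) qd + G = tau."""
    qddot = np.linalg.solve(H(q), tau - (0.5 * Hdot(q, qdot) + C(q, qdot)) @ qdot - G(q))
    qdot = qdot + dt * qddot          # semi-implicit Euler
    q = q + dt * qdot
    return q, qdot

# Toy constant-inertia model (hypothetical), so Hdot = 0 and C = 0.
H    = lambda q: np.diag([0.05, 0.02])
C    = lambda q, qd: np.zeros((2, 2))
Hdot = lambda q, qd: np.zeros((2, 2))
G    = lambda q: np.array([0.0, 0.1 * np.cos(q[1])])

q, qdot = np.zeros(2), np.zeros(2)
for _ in range(1000):
    tau = G(q) - 1.0 * qdot           # gravity compensation + velocity damping (cf. (38))
    q, qdot = step_dynamics(q, qdot, tau, H, C, G, Hdot)
print("q after 1 s:", q, "qdot:", qdot)
```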
Fig. 3. Projections of the feature point on the multiple image planes.
B. Controller Design

Denote the desired position of the feature point on the image plane by $y_d$, which is a constant vector. The image error is given by
$$\Delta y(t) = y(t) - y_d. \tag{37}$$
Denote an estimate of the unknown parameters $\theta_p$ by $\hat\theta_p(t)$. Using the estimate, we propose the following controller, inspired by [36]:
$$\tau = G(q(t)) - K_1\dot q(t) - \Big(\hat A^T(t) + \frac{1}{2}\hat a(t)\Delta y^T(t)\Big)B\Delta y(t). \tag{38}$$
The first term cancels the gravitational force. The second term is a velocity feedback. The last term represents the visual feedback. $\hat A(t)$ is the estimated depth-independent interaction matrix calculated from the estimated parameters, $\hat a(t)$ is an estimate of the vector $a(t)$ in (10), and $K_1$ and $B$ are positive definite gain matrices. It is important to note that $1/{}^{c}z(t)$ does not appear in the controller and its effect in the perspective projection is compensated by the quadratic form of $\Delta y(t)$ in the controller. Using the depth-independent interaction matrix and including the quadratic term differentiates our controller from other existing ones. By substituting (38) into (36), we obtain
$$H(q(t))\ddot q(t) + \Big(\frac{1}{2}\dot H(q(t)) + C(q(t), \dot q(t))\Big)\dot q(t) = -K_1\dot q(t) - \Big(A^T(t) + \frac{1}{2}a(t)\Delta y^T(t)\Big)B\Delta y(t) - \Big(\big(\hat A^T(t) - A^T(t)\big) + \frac{1}{2}\big(\hat a(t) - a(t)\big)\Delta y^T(t)\Big)B\Delta y(t). \tag{39}$$
From Property 4, the last term can be represented as a linear form of the estimation errors of the parameters as follows:
$$-\Big(\big(\hat A^T(t) - A^T(t)\big) + \frac{1}{2}\big(\hat a(t) - a(t)\big)\Delta y^T(t)\Big)B\Delta y(t) = Y_p(q(t), y(t))\,\Delta\theta_p(t) \tag{40}$$
where ∆θ p (t) = θˆp (t) − θ p is the estimation error and the regressor Yp (q(t), y(t)) does not depend on the unknown parameters.
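A direct transcription of the control law (38) (a sketch; the estimated interaction matrix Â(t), the estimate â(t), the gravity term, and the gains below are placeholder values rather than quantities produced by the estimator):

```python
import numpy as np

def point_feature_control(q_dot, y, y_d, A_hat, a_hat, G_q, K1, B):
    """Control law (38):
       tau = G(q) - K1 qdot - (A_hat^T + 0.5 a_hat dy^T) B dy,  with dy = y - y_d."""
    dy = y - y_d
    return G_q - K1 @ q_dot - (A_hat.T + 0.5 * np.outer(a_hat, dy)) @ (B @ dy)

# Hypothetical quantities for a 3-DOF arm and one feature point.
n_dof = 3
A_hat = np.array([[120.0, -40.0, 15.0],
                  [ 10.0,  95.0, -5.0]])          # 2 x n estimated interaction matrix
a_hat = np.array([0.02, -0.01, 0.03])             # estimate of a(t) in (10)
tau = point_feature_control(
    q_dot=np.array([0.1, -0.05, 0.0]),
    y=np.array([352.0, 210.0]), y_d=np.array([340.0, 199.0]),
    A_hat=A_hat, a_hat=a_hat,
    G_q=np.zeros(n_dof),
    K1=40.0 * np.eye(n_dof),
    B=1e-5 * np.eye(2))
print("joint torque command:", tau)
```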
C. Estimation of the Parameters

With the motion of the robot manipulator, the camera moves and captures images of the feature point from different viewpoints. What we need to estimate here is the perspective projection matrix, i.e., the intrinsic and extrinsic parameters of the camera, and the 3-D coordinates of the feature point. The problem is similar to the projective structure from motion problem in computer vision [34]. It should be pointed out that the adaptive algorithm differs from the methods in structure from motion in two points. First, all structure from motion methods are offline algorithms, but our algorithm estimates the camera parameters and the 3-D structure online. Second, our objective is not to accurately estimate the parameters or the structure.

In the projective structure from motion, one needs to capture a number (denote the number by $k$) of images of the feature point at different viewpoints, which are obtained at different configurations of the robot manipulator. Denote by $t_j$ the time instant when the $j$th image of the feature point was captured, by $y(t_j) = (y_1(t_j), y_2(t_j))^T$ the image coordinates of the feature point on the $j$th image, and by $q(t_j)$ the corresponding joint position of the manipulator (see Fig. 3). Our objective is to find estimates $\hat M$ and ${}^{b}\hat x$ that minimize the Frobenius norm of the errors given by
$$\sum_{j=1}^{k}\big\|e_p(t_j)\big\| = \sum_{j=1}^{k}\left\| {}^{c}\hat z(t_j)\,y(t_j) - \hat P(t)\,T_e^{-1}(t_j)\begin{pmatrix} {}^{b}\hat x\\ 1\end{pmatrix}\right\|. \tag{41}$$
Note that the $k$ images are selected during motion of the manipulator. It is well known in computer vision that, given a sufficient number $k$ of images, it is only possible to estimate the perspective projection matrix $M$ and the coordinates ${}^{b}x$ up to a scale. Since there are 38 unknowns in $\theta_p$, 19 images are necessary for estimating the parameters. Note that from (4), for the true $M$ and ${}^{b}x$,
$${}^{c}z(t_j)\,y(t_j) - P\,T_e^{-1}(t_j)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix} = 0. \tag{42}$$
Therefore, we have
$${}^{c}\hat z(t_j)\,y(t_j) - \hat P(t)\,T_e^{-1}(t_j)\begin{pmatrix} {}^{b}\hat x(t_j)\\ 1\end{pmatrix} = y(t_j)\big({}^{c}\hat z(t_j) - {}^{c}z(t_j)\big) - \left(\hat P(t)\,T_e^{-1}(t_j)\begin{pmatrix} {}^{b}\hat x(t_j)\\ 1\end{pmatrix} - P\,T_e^{-1}(t_j)\begin{pmatrix} {}^{b}x\\ 1\end{pmatrix}\right) = W_p(x(t_j), y(t_j))\,\Delta\theta_p(t). \tag{43}$$
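The projection errors e_p(t_j) in (41)–(43) can be evaluated from stored frames as in the sketch below (the frame data, M̂, and ᵇx̂ are placeholders; the construction of W_p from the linear parameterization is not shown):

```python
import numpy as np

def projection_errors(M_hat, x_hat, frames):
    """Errors e_p(t_j) of eq. (41) over the stored frames.
    Each frame is (T_e_inv, y): the inverse end-effector pose T_e^{-1}(t_j) and the
    measured image point y(t_j) recorded when that frame was captured."""
    errors = []
    for T_e_inv, y in frames:
        h = T_e_inv @ np.append(x_hat, 1.0)     # T_e^{-1}(t_j) (x_hat; 1)
        z_hat = M_hat[2] @ h                    # estimated depth, cf. (6)
        errors.append(z_hat * y - M_hat[:2] @ h)
    return errors

# Hypothetical current estimates and two stored frames (values are placeholders).
M_hat = np.array([[1800.0, 0.0, 300.0, 10.0],
                  [0.0, 1800.0, 300.0,  5.0],
                  [0.0,    0.0,   1.0,  1.0]])
x_hat = np.array([1.0, -0.1, 0.2])
T2 = np.eye(4); T2[:3, 3] = [0.02, -0.01, 0.03]
frames = [(np.eye(4), np.array([310.0, 305.0])),
          (T2,        np.array([317.0, 299.0]))]
for e in projection_errors(M_hat, x_hat, frames):
    print("e_p(t_j):", e)
```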
From the result in computer vision [34], we have the following proposition.

Proposition 3: If a sufficient number of images are selected during motion of the manipulator, then
$$W_p(x(t_j), y(t_j))\,\Delta\theta_p = 0, \qquad \forall j = 1, 2, \ldots, k \tag{44}$$
implies that the estimated parameters are equal to the real values up to a scale, i.e.,
$$\hat\theta_p(t) = \theta_p = \frac{\varphi_p}{p_z}. \tag{45}$$

Proposition 4: If $W_p(x(t_j), y(t_j))\,\Delta\theta_p = 0$, we can avoid the trivial solution $\hat\theta_p(t) = 0$.

Proof: By fixing the parameter $\hat p_z = \pm 1$, if $\hat\theta_p(t) = 0$, we obtain
$${}^{c}\hat z(t) = \hat m_3^T\,T_e^{-1}(t)\begin{pmatrix} {}^{b}\hat x(t)\\ 1\end{pmatrix} = \pm 1$$
which implies that $W_p(x(t_j), y(t_j))\,\Delta\theta_p(t) = \pm y(t_j)$. This is a contradiction, since we know that $W_p(x(t_j), y(t_j))\,\Delta\theta_p(t) = 0$. So, we can avoid the trivial solution.

The adaptive algorithm for estimating the unknown parameters is developed based on two ideas. First, the Slotine–Li [35] method is adopted to cancel the regressor term. Second, the adaptive algorithm carries out online minimization of the Frobenius norm of the errors $e_p(t_j)$ defined on a number of images captured during motion of the manipulator. Therefore, in parameter estimation, we integrate the Slotine–Li method with an online process of perspective structure from motion. The adaptive rule is as follows:
$$\frac{d}{dt}\hat\theta_p(t) = -\Gamma^{-1}\left(Y_p^T(q(t), y(t))\,\dot q(t) + \sum_{j=1}^{k} W_p^T(x(t_j), y(t_j))\,K_3\,e_p(t_j)\right) \tag{46}$$
where $\Gamma$ and $K_3$ are positive definite and diagonal gain matrices.
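One discrete-time step of the adaptive rule (46) could look like the following sketch (the regressor Y_p, the matrices W_p(t_j), and the errors e_p(t_j) are filled with random placeholders here; in the actual system they come from (40) and (43)):

```python
import numpy as np

def adaptive_update(theta_hat, Yp, qdot, Wp_list, ep_list, Gamma, K3, dt):
    """One Euler step of the adaptive rule (46):
       d/dt theta_hat = -Gamma^{-1} ( Yp^T qdot + sum_j Wp(t_j)^T K3 e_p(t_j) )."""
    grad = Yp.T @ qdot
    for Wp, ep in zip(Wp_list, ep_list):
        grad += Wp.T @ (K3 @ ep)
    return theta_hat - dt * np.linalg.solve(Gamma, grad)

# Hypothetical sizes: 3 joints, 38 free parameters, 2 stored frames.
n_dof, n_par = 3, 38
rng = np.random.default_rng(0)
theta_hat = rng.normal(size=n_par)
Yp = rng.normal(size=(n_dof, n_par))              # regressor of (40), placeholder
Wp_list = [rng.normal(size=(2, n_par)) for _ in range(2)]
ep_list = [0.1 * rng.normal(size=2) for _ in range(2)]
theta_hat = adaptive_update(theta_hat, Yp, np.array([0.1, 0.0, -0.05]),
                            Wp_list, ep_list,
                            Gamma=5000.0 * np.eye(n_par), K3=1e-3 * np.eye(2),
                            dt=0.015)
print("updated theta_hat[:5]:", theta_hat[:5])
```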
D. Stability Analysis

This section analyzes the stability of the proposed controller.

Theorem 1: Assume that the feature points are in the field of view. Under the controller (38) and the adaptive algorithm (46) for parameter estimation, the image error of the feature point is convergent to zero, i.e.,
$$\lim_{t\to\infty}\Delta y(t) = 0. \tag{47}$$
Furthermore, if a sufficient number of images are used in the adaptive algorithm, then the estimated parameters are convergent to the real ones up to a scale.

Proof: Introduce the following positive function:
$$V(t) = \frac{1}{2}\Big\{\dot q^T(t)H(q(t))\dot q(t) + {}^{c}z(t)\,\Delta y^T(t)B\Delta y(t) + \Delta\theta_p^T(t)\Gamma\Delta\theta_p(t)\Big\}. \tag{48}$$
Note that ${}^{c}z(t) > 0$ when the feature is in the field of view. Multiplying (39) from the left by $\dot q^T(t)$ results in
$$\dot q^T(t)H(q(t))\ddot q(t) + \frac{1}{2}\dot q^T(t)\dot H(q(t))\dot q(t) = -\dot q^T(t)K_1\dot q(t) - \dot q^T(t)A^T(q(t))B\Delta y(t) - \frac{1}{2}\dot q^T(t)a(t)\big(\Delta y^T(t)B\Delta y(t)\big) + \dot q^T(t)Y_p(q(t), y(t))\Delta\theta_p(t). \tag{49}$$
From (11), we have
$$\dot q^T(t)A^T(t) = {}^{c}z(t)\,\dot y^T(t) = {}^{c}z(t)\,\Delta\dot y^T(t). \tag{50}$$
Multiplying (46) from the left by $\Delta\theta_p^T(t)\Gamma$, we obtain
$$\Delta\theta_p^T(t)\Gamma\Delta\dot\theta_p(t) = -\Delta\theta_p^T(t)Y_p^T(q(t), y(t))\dot q(t) - \sum_{j=1}^{k} e_p^T(t_j)K_3\,e_p(t_j). \tag{51}$$
Differentiating the function $V(t)$ results in
$$\dot V(t) = \dot q^T(t)\Big(H(q(t))\ddot q(t) + \frac{1}{2}\dot H(q(t))\dot q(t)\Big) + \Delta\dot\theta_p^T(t)\Gamma\Delta\theta_p(t) + {}^{c}z(t)\,\Delta\dot y^T(t)B\Delta y(t) + \frac{1}{2}{}^{c}\dot z(t)\,\Delta y^T(t)B\Delta y(t). \tag{52}$$
Note that
$${}^{c}\dot z(t) = a^T(t)\,\dot q(t). \tag{53}$$
By combining (49)–(53), we have
$$\dot V(t) = -\dot q^T(t)K_1\dot q(t) - \sum_{j=1}^{k} e_p^T(t_j)K_3\,e_p(t_j). \tag{54}$$
Obviously, we have $\dot V(t) \le 0$, and hence, $\dot q(t)$, $\Delta y(t)$, and $\Delta\theta_p(t)$ are all bounded. From (39) and (46), we know that $\ddot q(t)$ and $\dot{\hat\theta}_p(t)$ are bounded, respectively. Differentiating the function $\dot V(t)$ results in
$$\ddot V(t) = -\ddot q^T(t)K_1\dot q(t) - 2\sum_{j=1}^{k}\dot e_p^T(t_j)K_3\,e_p(t_j). \tag{55}$$
Then, we can conclude that $\ddot V(t)$ is bounded. Consequently, from Barbalat's lemma, we have
$$\lim_{t\to\infty}\dot q(t) = 0, \qquad \lim_{t\to\infty} e_p(t_j) = 0. \tag{56}$$
The convergence of $e_p(t_j)$ implies that the estimated parameters $\hat\theta_p(t)$ are convergent to the real values. In order to prove the convergence of the image error, consider the invariant set of the system when $\dot V(t) = 0$. From the closed-loop dynamics (39), at the invariant set,
$$\Big(\hat A^T(t) + \frac{1}{2}\hat a(t)\Delta y^T(t)\Big)B\Delta y(t) = 0. \tag{57}$$
It is reasonable to assume that the manipulator is not at a singular configuration so that $J(q(t))$ is of full rank. Note that
$$\hat A^T(t) + \frac{1}{2}\hat a(t)\Delta y^T(t) = \left(\frac{\partial\Big(T_e^{-1}(t)\binom{{}^{b}\hat x(t)}{1}\Big)}{\partial q}\right)^{T}\underbrace{\Big(\hat m_1 + (0.5\Delta u(t) - u(t))\hat m_3\;\;\;\;\hat m_2 + (0.5\Delta v(t) - v(t))\hat m_3\Big)}_{E(t)} \tag{58}$$
where $u(t)$ and $v(t)$ denote the two components of $y(t)$ and $\Delta u(t)$ and $\Delta v(t)$ those of $\Delta y(t)$. The matrix $\partial\Big(T_e^{-1}(t)\binom{{}^{b}\hat x(t)}{1}\Big)/\partial q$ is not singular from Proposition 1. Using a proof similar to that of Property 3, we can show that the matrix $E(t)$ has a rank of 2 if $\hat M$ is of rank 3. Therefore, from (57), it is obvious that at the invariant set, the position error on the image plane must be zero, i.e.,
$$\Delta y(t) = 0. \tag{59}$$
Therefore, we can conclude the convergence of the position error of the feature point projections on the image plane to zero.

E. Extension to Multiple Feature Points

This controller can be directly extended to visual servoing using a number $S$ of feature points. In control of multiple feature points, the dimension of the depth-independent interaction matrix will increase, and hence, the major difference will be in computation time. A controller similar to that in (38) can be designed by
$$\tau = G(q(t)) - K_1\dot q(t) - \sum_{i=1}^{S}\Big(\hat A_i^T(t) + \frac{1}{2}\hat a_i(t)\Delta y_i^T(t)\Big)B\Delta y_i(t) \tag{60}$$
where $\hat A_i(t)$ and $\hat a_i(t)$ have forms similar to (12) and (10) for point $i$, respectively, and $\Delta y_i(t)$ is the image position error of the $i$th feature point. The image errors are convergent to zero when the number $n$ of DOFs of the manipulator is larger than or equal to $2S$. When $n = 6$ and $S \ge 3$, the convergence of the image errors can also be guaranteed. It is well known that three noncollinear projections of fixed feature points on the image plane uniquely define the 3-D position of the camera, and hence, the projections of all other fixed feature points are uniquely determined. Therefore, if the projections of three feature points are convergent to their desired positions on the image plane, so are the projections of all other feature points. We can conclude the convergence of the image errors when the manipulator has DOFs equal to or larger than $2S$, or when the manipulator has 6 DOFs.

IV. ADAPTIVE-IMAGE-BASED VISUAL SERVOING USING LINE FEATURES

This section copes with visual servoing of the manipulator using line features.

A. Controller Design

The image information to be used here is the normal vector $n(t)$ of the plane determined by the projection of the feature line and the origin $O_g$. Suppose that a constant desired value $n_d$ is given. The image error is defined as
$$\Delta n(t) = n(t) - n_d. \tag{61}$$
The control objective is to force $\Delta n(t)$ to zero by properly designing the control input. We adopt an idea similar to that used for control of point features. An adaptive algorithm is employed to estimate the unknown camera parameters and 3-D coordinates of the feature line. The controller is given as follows:
$$\tau = G(q(t)) - K_1\dot q(t) - \Big(\hat L^T(t) + \frac{1}{2}\hat c^T(t)\Delta n^T(t)\Big)B\Delta n(t) \tag{62}$$
where $\hat L(t)$ is the estimated depth-independent interaction matrix, $\hat c(t)$ is an estimate of the vector $c(t)$, and $K_1$ and $B$ are positive definite gain matrices. By substituting the controller into the dynamic equation of the robot, we have the following closed-loop dynamics:
$$H(q(t))\ddot q(t) + \Big(\frac{1}{2}\dot H(q(t)) + C(q(t), \dot q(t))\Big)\dot q(t) = -K_1\dot q(t) - \Big(L^T(t) + \frac{1}{2}c^T(t)\Delta n^T(t)\Big)B\Delta n(t) + \Big(L^T(t) + \frac{1}{2}c^T(t)\Delta n^T(t)\Big)B\Delta n(t) - \Big(\hat L^T(t) + \frac{1}{2}\hat c^T(t)\Delta n^T(t)\Big)B\Delta n(t). \tag{63}$$
From Property 6, the last two terms can be represented as a linear form of the estimation errors of the parameters as follows:
$$\Big(L^T(t) + \frac{1}{2}c^T(t)\Delta n^T(t)\Big)B\Delta n(t) - \Big(\hat L^T(t) + \frac{1}{2}\hat c^T(t)\Delta n^T(t)\Big)B\Delta n(t) = Y_l(q(t), n(t))\,\Delta\theta_l(t) \tag{64}$$
where $\Delta\theta_l(t) = \hat\theta_l(t) - \theta_l$ is the estimation error and the regressor $Y_l(q(t), n(t))$ does not depend on the unknown parameters.
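The line-feature control law (62) mirrors (38); a sketch with placeholder values for L̂(t), ĉ(t), and the gains (all hypothetical):

```python
import numpy as np

def line_feature_control(q_dot, n, n_d, L_hat, c_hat, G_q, K1, B):
    """Control law (62):
       tau = G(q) - K1 qdot - (L_hat^T + 0.5 c_hat^T dn^T) B dn,  with dn = n - n_d."""
    dn = n - n_d
    return G_q - K1 @ q_dot - (L_hat.T + 0.5 * np.outer(c_hat, dn)) @ (B @ dn)

# Hypothetical values for a 3-DOF arm; L_hat is 3 x n, c_hat is stored as an n-vector.
L_hat = np.array([[0.4, -0.1,  0.0],
                  [0.1,  0.3, -0.2],
                  [0.0,  0.1,  0.5]])
tau = line_feature_control(
    q_dot=np.array([0.05, 0.0, -0.02]),
    n=np.array([0.0, 0.6, 0.8]),      # measured unit normal of the line's plane
    n_d=np.array([0.6, 0.0, 0.8]),    # desired unit normal
    L_hat=L_hat, c_hat=np.array([0.2, -0.1, 0.05]),
    G_q=np.zeros(3), K1=100.0 * np.eye(3), B=5.0 * np.eye(3))
print("joint torque command:", tau)
```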
B. Adaptive Algorithm

To estimate the unknown parameters, we also select $k$ images captured by the camera at different time instants during motion of the manipulator and minimize an error vector based on the projections of the feature line on the images. The error vector to be used must satisfy several conditions. First, the error vector can be represented as a linear form of the unknown parameters. Second, the error vector must be zero for the true parameters. Third, the estimates of the unknown parameters should be equal to the true values, or differ from them only by a scale, if the error vectors are all zero for a sufficiently large number of images. The error vector defined next satisfies these conditions:
$$e_l(t_j) = \hat b(t_j) - \big(\hat b^T(t_j)\,n(t_j)\big)\,n(t_j) = W_l(q(t_j), n(t_j))\,\Delta\theta_l(t). \tag{65}$$
Here, $t_j$ represents the time instant when the $j$th image was captured. The vector $\hat b(t_j)$ is calculated from the position and orientation of the camera at time $t_j$ and the estimate $\hat\theta_l(t)$, and $n(t_j)$ is obtained from the image coordinates of the feature line on the image captured at time $t_j$. The matrix $W_l(q(t_j), n(t_j))$ does not depend on the unknown parameters and has a rank of 2. Since it is only possible to estimate the unknown parameters up to a scale and one of the parameters has been fixed, 54 images are necessary for estimating the 107 unknown parameters $\theta_l$. Let $k$ denote the number of images captured by the camera at different configurations of the manipulator. The adaptive algorithm is given by
$$\frac{d}{dt}\hat\theta_l(t) = -\Gamma^{-1}\left(Y_l^T(q(t), n(t))\,\dot q(t) + \sum_{j=1}^{k} W_l^T(q(t_j), n(t_j))\,K_3\,e_l(t_j)\right) \tag{66}$$
where $\Gamma$ and $K_3$ are positive definite gain matrices.

C. Stability Analysis

This section proves the asymptotic stability of the system under the proposed controller.

Theorem 2: Given a sufficient number of images in the adaptive algorithm, the proposed controller (62) and the adaptive algorithm (66) give rise to
$$\lim_{t\to\infty}\Delta n(t) = 0. \tag{67}$$

Proof: Introduce the following positive function:
$$V(t) = \frac{1}{2}\Big\{\dot q^T(t)H(q(t))\dot q(t) + \|b(t)\|\,\Delta n^T(t)B\Delta n(t) + \Delta\theta_l^T\Gamma\Delta\theta_l\Big\}. \tag{68}$$
Multiplying (63) from the left by $\dot q^T(t)$ results in
$$\dot q^T(t)H(q(t))\ddot q(t) + \frac{1}{2}\dot q^T(t)\dot H(q(t))\dot q(t) = -\dot q^T(t)K_1\dot q(t) - \dot q^T(t)\Big(L^T(t) + \frac{1}{2}c^T(t)\Delta n^T(t)\Big)B\Delta n(t) + \dot q^T(t)Y_l(q(t), n(t))\Delta\theta_l(t). \tag{69}$$
From the relation (34) between the joint velocity and the visual velocity, we have
$$\dot q^T(t)L^T(t) = \|b(t)\|\,\dot n^T(t). \tag{70}$$
Multiplying (66) from the left by $\Delta\theta_l^T(t)\Gamma$, we obtain
$$\Delta\theta_l^T(t)\Gamma\Delta\dot\theta_l(t) = -\Delta\theta_l^T(t)Y_l^T(q(t), n(t))\dot q(t) - \sum_{j=1}^{k} e_l^T(t_j)K_3\,e_l(t_j). \tag{71}$$
Differentiating the function $V(t)$ results in
$$\dot V(t) = \dot q^T(t)H(q(t))\ddot q(t) + \frac{1}{2}\dot q^T(t)\dot H(q(t))\dot q(t) + \Delta\theta_l^T(t)\Gamma\Delta\dot\theta_l(t) + \frac{1}{2}\frac{d}{dt}\|b(t)\|\,\Delta n^T(t)B\Delta n(t) + \|b(t)\|\,\Delta\dot n^T(t)B\Delta n(t). \tag{72}$$
By combining (69)–(72), we have
$$\dot V(t) = -\dot q^T(t)K_1\dot q(t) - \sum_{j=1}^{k} e_l^T(t_j)K_3\,e_l(t_j). \tag{73}$$
Obviously, we have $\dot V(t) \le 0$, and hence, $\dot q(t)$, $\Delta n(t)$, and $\Delta\theta_l(t)$ are all bounded. From (63) and (66), we know that $\ddot q(t)$ and $\dot{\hat\theta}_l(t)$ are bounded, respectively. Differentiating the function $\dot V(t)$ results in
$$\ddot V(t) = -2\ddot q^T(t)K_1\dot q(t) - 2\sum_{j=1}^{k}\dot e_l^T(t_j)K_3\,e_l(t_j). \tag{74}$$
Then, we can conclude that $\ddot V(t)$ is bounded. Consequently, from Barbalat's lemma, we have
$$\lim_{t\to\infty}\dot q(t) = 0, \qquad \lim_{t\to\infty}\dot{\hat\theta}_l(t) = 0, \qquad \lim_{t\to\infty} e_l(t_j) = 0. \tag{75}$$
In order to prove the convergence of the image error, consider the invariant set of the system when $\dot V(t) = 0$. From the closed-loop dynamics (63), at the invariant set,
$$\Big(\hat L^T(t) + \frac{1}{2}\hat c^T(t)\Delta n^T(t)\Big)B\Delta n(t) = 0 \tag{76}$$
where
$$\hat L^T(t) + \frac{1}{2}\hat c^T(t)\Delta n^T(t) = J^T(q)\begin{pmatrix} R^T & 0\\ 0 & R^T\end{pmatrix}\hat F^T(t)\left(I_{3\times 3} - \Big(\frac{1}{2}n(t) + \frac{1}{2}n_d(t)\Big)n^T(t)\right)^{T}. \tag{77}$$
From the LaSalle theorem, we have the convergence of the system to the invariant set in (76).
Fig. 4. Experimental setup.
To further prove that $\lim_{t\to\infty}\Delta n(t) = 0$, the matrix in (77) should have a rank of 2 because $n(t)$ is a unit vector. It is reasonable to assume that $J(q)$ is of full rank. When a sufficient number of images are used in the adaptive algorithm, the convergence of $e_l(t_j)$ implies that the estimated parameters $\hat\theta_l(t)$ are convergent to the real values up to a scale. Then, we can conclude that the rank of the matrix $\hat F^T(t)$ in (30) is 2.

Next, we consider the rank of $\big(I_{3\times 3} - (\frac{1}{2}n(t) + \frac{1}{2}n_d(t))n^T(t)\big)^T$. If $\Delta n(t) \neq 0$, we have
$$\det\left(I_{3\times 3} - \Big(\frac{1}{2}n(t) + \frac{1}{2}n_d(t)\Big)n^T(t)\right) = 1 - n^T(t)\Big(\frac{1}{2}n(t) + \frac{1}{2}n_d(t)\Big) = \frac{1}{2}n^T(t)\Delta n(t) = \frac{1}{2}\big(1 - n^T(t)n_d(t)\big) \neq 0. \tag{78}$$
So the rank of $\big(I_{3\times 3} - (\frac{1}{2}n(t) + \frac{1}{2}n_d(t))n^T(t)\big)^T$ is 3. Therefore, the matrix in (77) has a rank of 2. Since $n(t)$ is a unit vector, we can conclude that $\lim_{t\to\infty}\Delta n(t) = 0$ from (76).
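The determinant identity used in (78) is easy to check numerically; the sketch below draws random unit vectors n and n_d and compares det(I − (½n + ½n_d)nᵀ) with ½(1 − nᵀn_d):

```python
import numpy as np

rng = np.random.default_rng(1)
n = rng.normal(size=3);   n /= np.linalg.norm(n)      # unit normal n(t)
n_d = rng.normal(size=3); n_d /= np.linalg.norm(n_d)  # desired unit normal n_d

M = np.eye(3) - np.outer(0.5 * n + 0.5 * n_d, n)
print(np.linalg.det(M), 0.5 * (1.0 - n @ n_d))        # the two values coincide, cf. (78)
```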
V. EXPERIMENTS

We have implemented the proposed controller on a 3-DOF robot manipulator (see Fig. 4) at the Networked Sensors and Robot Laboratory of The Chinese University of Hong Kong. The moment of inertia of the first link of the manipulator about its vertical axis is 0.005 kg·m², and the masses of the second and third links are 0.167 and 0.1 kg, respectively. The lengths of the second and third links are 0.145 and 0.1285 m, respectively. The joints of the manipulator are driven by Maxon brushed dc motors. The powers of the motors are 20, 10, and 10 W, respectively. The gear ratios at the joints are 480:49, 12:1, and 384:49, respectively. Since the gear ratios are small and the input motor powers are small as well, the nonlinear forces have a strong effect on the robot motion, even though the manipulator is light. We use three incremental optical encoders with a resolution of 2000 pulses/turn to measure the joint angles of the motors. The joint velocities are obtained by differentiating the joint angles. A Prosilica camera connected to an Intel Pentium IV PC is used to capture images at a rate of 100 frames per second (fps). Three experiments have been carried out on this system to verify the performance of the controller.

A. Control of One Feature Point

The first experiment controls the projection of a single feature point. We first set a position of the feature point and record its image position as the initial position, then move the robot to another position and record that image of the feature point as the desired one (see Fig. 5).

Fig. 5. Initial and desired (black square) position of the feature point on the image plane.

The control gains are $K_1 = 40$, $B = 0.00001$, $K_3 = 0.001$, and $\Gamma = 5000$. The calibrated values of the intrinsic parameters of the camera are $a_u = 2182$, $a_v = 2186$, $u_0 = 340$, and $v_0 = 199$. The initial estimated transformation matrix of the robot base frame with respect to the camera frame is
$$\hat T = \begin{pmatrix} -0.3 & 0 & 0.95 & 0\\ 0 & -1 & 0 & 0\\ 0.95 & 0 & 0.3 & 1\\ 0 & 0 & 0 & 1\end{pmatrix}.$$
The initial estimate of the feature point with respect to the robot base frame is $\hat x = [\,1\;\; -0.1\;\; 0.2\,]^T$ m. The initial estimated intrinsic parameters are $\hat a_u(0) = 2000$, $\hat a_v(0) = 2000$, $\hat u_0(0) = 300$, and $\hat v_0(0) = 300$. In this experiment, 19 frames were used in the adaptive algorithm. The sampling time in the experiment is 15 ms. Fig. 6(a) and (b) demonstrates the position errors and the trajectory of the feature point on the image plane, respectively. The results confirmed the convergence of the image error to zero under the control of the proposed method. Fig. 7 plots the profiles of the 38 estimated parameters. Fig. 8 illustrates the 3-D trajectory of the end-effector. This experiment showed that it is possible to achieve convergence of the image error without knowledge of the camera parameters and the 3-D position of the feature point.
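For reference, one plausible way to read the reported initial guesses is as a seed projection-matrix estimate M̂(0) = K̂(0)[I 0]T̂, from which the initial parameter entries would follow via the linear parameterization of Property 2. This bookkeeping is an assumption made here for illustration, not the authors' implementation:

```python
import numpy as np

# Initial intrinsic guesses reported in the experiment.
au0, av0, u00, v00 = 2000.0, 2000.0, 300.0, 300.0
K0 = np.array([[au0, 0.0, u00],
               [0.0, av0, v00],
               [0.0, 0.0, 1.0]])

# Reported initial transform estimate (base frame w.r.t. camera frame).
T0 = np.array([[-0.3,  0.0, 0.95, 0.0],
               [ 0.0, -1.0, 0.0,  0.0],
               [ 0.95, 0.0, 0.3,  1.0],
               [ 0.0,  0.0, 0.0,  1.0]])

# Assumed seeding of the projection-matrix estimate: M_hat(0) = K(0) [I 0] T0.
M_hat0 = K0 @ np.hstack([np.eye(3), np.zeros((3, 1))]) @ T0
x_hat0 = np.array([1.0, -0.1, 0.2])     # reported initial feature-point estimate (m)
print(M_hat0)
```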
Fig. 6. Experimental result for one feature point. (a) Image error of the feature point. (b) Trajectory of the feature point on the image plane.
B. Control of Three Feature Points

In the second experiment, we control three feature points whose initial estimated coordinates with respect to the robot base frame are $\hat x_1 = [\,1\;\; -0.1\;\; 0.2\,]^T$ m, $\hat x_2 = [\,1\;\; -0.15\;\; 0.25\,]^T$ m, and $\hat x_3 = [\,1\;\; -0.15\;\; 0.15\,]^T$ m, respectively. The control gains used in the experiments are $K_1 = 35$, $B = 0.000005$, $K_3 = 0.001$, and $\Gamma = 10000$. The true values and initial estimates of the camera intrinsic and extrinsic parameters are the same as those in the previous experiment. In this experiment, seven frames are used in the adaptive algorithm. The initial and desired positions of the feature points are shown in Fig. 9. The image errors of the feature points on the image plane are demonstrated in Fig. 10. The experimental results confirmed convergence of the image errors of the feature points. The residual image errors are within one pixel.
Fig. 7. Profiles of the estimated parameters.
Fig. 8. 3-D trajectory of the end-effector.
Fig. 9. Initial and desired (black square) positions.
Fig. 10. Image errors of the three feature points.
Fig. 11. Initial (lower left) and desired line.
Fig. 12. Normal vector errors.
C. Control of Line Feature

In this experiment, the robot is first moved to a goal position and the camera is shown a 3-D line as the reference view. Then, the camera is moved to an initial position to start the experiment, as shown in Fig. 11.

Remark 3: The fact that the initial and desired lines are parallel to each other is due to the fact that the robot manipulator used only has 3 DOFs. The robot is not able to rotate the camera about its optical axis, so no matter how the manipulator moves, all the projections of a fixed 3-D line on the image plane are parallel to each other.

The control gains used are $K_1 = 100$, $B = 5$, and $K_3 = 0.01$. The initial estimated intrinsic parameters are $\hat a_u(0) = 3000$, $\hat a_v(0) = 3000$, $\hat u_0(0) = 300$, and $\hat v_0(0) = 300$. The initial estimate of the 3-D line is $({}^{b}\hat x_P^T, {}^{b}\hat u^T) = ((0.02\;\;0.4\;\;0.1)^T, (0.6\;\;0.6\;\;0.01)^T)$. In this experiment, 54 frames are used in the adaptive algorithm. The Hough transform is used to identify the image lines. The sampling time in the experiment is 27 ms. The errors of the normal vector are shown in Fig. 12. The results confirmed the expected asymptotic convergence of the errors to zero. Fig. 13 shows the profiles of the first eight unknown parameters.
Fig. 13. Profile of the estimated parameters.

VI. CONCLUSION

This paper presents new adaptive controllers for dynamic image-based visual servoing of a robot manipulator using uncalibrated eye-in-hand visual feedback. Both point and line features are considered. To cope with the nonlinear dependence of the image Jacobian on the unknown parameters, the controllers employ the depth-independent interaction matrix to linearly parameterize the dynamics. A new representation is developed for line features to facilitate the linear parameterization. A new adaptive algorithm has been developed to estimate the unknown camera parameters and the 3-D structure of the features. On the basis of the nonlinear robot dynamics, we have proved by the Lyapunov method the convergence of the image feature errors to zero and the convergence of the estimated parameters to the real values up to a scale. Experimental results confirmed the good performance of the proposed methods.

APPENDIX

Proof of Property 5: Let
$$R(t) = \begin{pmatrix} R_{11} & R_{12} & R_{13}\\ R_{21} & R_{22} & R_{23}\\ R_{31} & R_{32} & R_{33}\end{pmatrix}, \qquad {}^{b}u = \big({}^{b}u_1\;\;{}^{b}u_2\;\;{}^{b}u_3\big)^T, \qquad {}^{b}x_P = \big({}^{b}x_1\;\;{}^{b}x_2\;\;{}^{b}x_3\big)^T$$
and
$$\big(\psi_2\psi_3^T - \psi_3\psi_2^T\big) = \begin{pmatrix} 0 & -\psi_3 & \psi_2\\ \psi_3 & 0 & -\psi_1\\ -\psi_2 & \psi_1 & 0\end{pmatrix}.$$
Consider the first component of the vector
$$\begin{pmatrix} {}^{b}x_P^T R^T(t)\big(\psi_2\psi_3^T - \psi_3\psi_2^T\big)R(t)\,{}^{b}u\\ {}^{b}x_P^T R^T(t)\big(\psi_3\psi_1^T - \psi_1\psi_3^T\big)R(t)\,{}^{b}u\\ {}^{b}x_P^T R^T(t)\big(\psi_1\psi_2^T - \psi_2\psi_1^T\big)R(t)\,{}^{b}u\end{pmatrix}$$
in (23):
$$\begin{aligned}
{}^{b}x_P^T R^T(t)&\big(\psi_2\psi_3^T - \psi_3\psi_2^T\big)R(t)\,{}^{b}u\\
&= \begin{pmatrix} R_{11}{}^{b}x_1 + R_{12}{}^{b}x_2 + R_{13}{}^{b}x_3\\ R_{21}{}^{b}x_1 + R_{22}{}^{b}x_2 + R_{23}{}^{b}x_3\\ R_{31}{}^{b}x_1 + R_{32}{}^{b}x_2 + R_{33}{}^{b}x_3\end{pmatrix}^{T}
\begin{pmatrix} 0 & -\psi_3 & \psi_2\\ \psi_3 & 0 & -\psi_1\\ -\psi_2 & \psi_1 & 0\end{pmatrix}
\begin{pmatrix} R_{11}{}^{b}u_1 + R_{12}{}^{b}u_2 + R_{13}{}^{b}u_3\\ R_{21}{}^{b}u_1 + R_{22}{}^{b}u_2 + R_{23}{}^{b}u_3\\ R_{31}{}^{b}u_1 + R_{32}{}^{b}u_2 + R_{33}{}^{b}u_3\end{pmatrix}\\
&= \psi_1\big((R_{31}R_{22} - R_{32}R_{21}){}^{b}x_1{}^{b}u_2 + (R_{31}R_{23} - R_{33}R_{21}){}^{b}x_1{}^{b}u_3 + (R_{32}R_{21} - R_{31}R_{22}){}^{b}x_2{}^{b}u_1\\
&\qquad\quad + (R_{32}R_{23} - R_{33}R_{22}){}^{b}x_2{}^{b}u_3 + (R_{33}R_{21} - R_{31}R_{23}){}^{b}x_3{}^{b}u_1 + (R_{33}R_{22} - R_{32}R_{23}){}^{b}x_3{}^{b}u_2\big)\\
&\quad + \psi_2\big((R_{11}R_{32} - R_{31}R_{12}){}^{b}x_1{}^{b}u_2 + (R_{11}R_{33} - R_{31}R_{13}){}^{b}x_1{}^{b}u_3 + (R_{12}R_{31} - R_{32}R_{11}){}^{b}x_2{}^{b}u_1\\
&\qquad\quad + (R_{12}R_{33} - R_{32}R_{13}){}^{b}x_2{}^{b}u_3 + (R_{13}R_{31} - R_{33}R_{11}){}^{b}x_3{}^{b}u_1 + (R_{13}R_{32} - R_{33}R_{12}){}^{b}x_3{}^{b}u_2\big)\\
&\quad + \psi_3\big((R_{21}R_{12} - R_{11}R_{22}){}^{b}x_1{}^{b}u_2 + (R_{21}R_{13} - R_{11}R_{23}){}^{b}x_1{}^{b}u_3 + (R_{22}R_{11} - R_{12}R_{21}){}^{b}x_2{}^{b}u_1\\
&\qquad\quad + (R_{22}R_{13} - R_{12}R_{23}){}^{b}x_2{}^{b}u_3 + (R_{23}R_{11} - R_{13}R_{21}){}^{b}x_3{}^{b}u_1 + (R_{23}R_{12} - R_{13}R_{22}){}^{b}x_3{}^{b}u_2\big).
\end{aligned} \tag{79}$$
This equation implies that there are only 18 (3 × 6) independent parameters in the first component of the vector in (23). A similar derivation can prove that the other two components of the vector also contain 18 independent parameters. Therefore, the vector in (23) contains 54 independent parameters.

REFERENCES

[1] S. Hutchinson, G. D. Hager, and P. I. Corke, "A tutorial on visual servo control," IEEE Trans. Robot. Autom., vol. 12, no. 5, pp. 651–670, Oct. 1996.
[2] E. E. Hemayed, "A survey of camera self-calibration," in Proc. IEEE Conf. Adv. Video Signal Based Surveillance, 2003, pp. 351–357.
[3] N. P. Papanikolopoulos, B. J. Nelson, and P. K. Khosla, "Six degree-of-freedom hand/eye visual tracking with uncertain parameters," IEEE Trans. Robot. Autom., vol. 11, no. 5, pp. 725–732, Oct. 1995.
[4] E. Malis, F. Chaumette, and S. Boudet, "2-1/2-D visual servoing," IEEE Trans. Robot. Autom., vol. 15, no. 2, pp. 238–250, Apr. 1999.
[5] E. Malis, "Visual servoing invariant to changes in camera-intrinsic parameters," IEEE Trans. Robot. Autom., vol. 20, no. 1, pp. 72–81, Feb. 2004.
[6] K. Hosoda and M. Asada, "Versatile visual servoing without knowledge of true Jacobian," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 1994, pp. 186–191.
[7] B. H. Yoshimi and P. K. Allen, "Alignment using an uncalibrated camera system," IEEE Trans. Robot. Autom., vol. 11, no. 4, pp. 516–521, Aug. 1995.
[8] A. Ruf, M. Tonko, R. Horaud, and H.-H. Nagel, "Visual tracking of an end effector by adaptive kinematic prediction," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 1997, pp. 893–898.
[9] J. Pomares, F. Chaumette, and F. Torres, "Adaptive visual servoing by simultaneous camera calibration," in Proc. IEEE Int. Conf. Robot. Autom., 2007, pp. 2811–2816.
[10] Y. Shen, D. Song, Y. H. Liu, and K. Li, "Asymptotic trajectory tracking of manipulators using uncalibrated visual feedback," IEEE/ASME Trans. Mechatronics, vol. 8, no. 1, pp. 87–98, Mar. 2003.
[11] H. Wang and Y. H. Liu, "Adaptive image-based trajectory tracking of robots," in Proc. IEEE Int. Conf. Robot. Autom., 2005, pp. 564–2569.
[12] B. Bishop and M. W. Spong, "Adaptive calibration and control of 2D monocular visual servo system," in Proc. IFAC Symp. Robot Control, 1997, pp. 525–530.
[13] C. C. Cheah, M. Hirano, S. Kawamura, and S. Arimoto, "Approximate Jacobian control for robots with uncertain kinematics and dynamics," IEEE Trans. Robot. Autom., vol. 19, no. 4, pp. 692–702, Aug. 2003.
[14] L. Deng, F. Janabi-Sharifi, and W. J. Wilson, "Stability and robustness of visual servoing methods," in Proc. IEEE Int. Conf. Robot. Autom., 2002, pp. 1604–1609.
[15] L. Hsu and P. L. S. Aquino, "Adaptive visual tracking with uncertain manipulator dynamics and uncalibrated camera," in Proc. 38th IEEE Int. Conf. Decis. Control, 1999, pp. 1248–1253.
[16] R. Kelly, F. Reyes, J. Moreno, and S. Hutchinson, "A two-loops direct visual control of direct-drive planar robots with moving target," in Proc. IEEE Int. Conf. Robot. Autom., 1999, pp. 599–604.
[17] D. Kim, A. A. Rizzi, G. D. Hager, and D. E. Koditschek, "A "robust" convergent visual servoing system," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 1995, pp. 348–353.
[18] W. E. Dixon, "Adaptive regulation of amplitude limited robot manipulators with uncertain kinematics and dynamics," IEEE Trans. Autom. Control, vol. 52, no. 3, pp. 488–493, Mar. 2007.
[19] R. Carelli, O. Nasisi, and B. Kuchen, "Adaptive robot control with visual feedback," in Proc. Amer. Control Conf., 1994, pp. 1757–1760.
[20] R. Kelly, R. Carelli, O. Nasisi, B. Kuchen, and F. Reyes, "Stable visual servoing of camera-in-hand robotic systems," IEEE/ASME Trans. Mechatronics, vol. 5, no. 1, pp. 39–48, Mar. 2000.
[21] K. Hashimoto, K. Nagahama, and T. Noritsugu, "A mode switching estimator for visual servoing," in Proc. IEEE Int. Conf. Robot. Autom., 2002, pp. 1610–1615.
[22] K. Nagahama, K. Hashimoto, T. Noritsugu, and M. Takaiawa, "Visual servoing based on object motion estimation," in Proc. IEEE Int. Conf. Robot. Autom., 2002, pp. 245–250.
[23] Y. H. Liu, H. Wang, and K. K. Lam, "Dynamic visual servoing of robots in uncalibrated environments," in Proc. IEEE Int. Conf. Robot. Autom., 2005, pp. 3142–3147.
[24] Y. H. Liu, H. Wang, C. Wang, and K. Lam, "Uncalibrated visual servoing of robots using a depth-independent image Jacobian matrix," IEEE Trans. Robot., vol. 22, no. 4, pp. 804–817, Aug. 2006.
[25] H. Wang and Y. H. Liu, "Uncalibrated visual tracking control without visual velocity," in Proc. IEEE Int. Conf. Robot. Autom., 2006, pp. 2738–2743.
[26] H. Wang, Y. H. Liu, and D. Zhou, "Dynamic visual tracking for manipulators using an uncalibrated fixed camera," IEEE Trans. Robot., vol. 23, no. 3, pp. 610–617, Jun. 2007.
[27] H. Wang and Y. H. Liu, "Dynamic visual servoing of robots using uncalibrated eye-in-hand visual feedback," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2006, pp. 3797–3802.
[28] B. Espiau, F. Chaumette, and P. Rives, "A new approach to visual servoing in robotics," IEEE Trans. Robot. Autom., vol. 8, no. 3, pp. 313–326, Jun. 1992.
[29] J. P. Urban, G. Motyl, and J. Gallice, "Real-time visual servoing using controlled illumination," Int. J. Robot. Res., vol. 13, no. 1, pp. 93–100, 1994.
[30] N. Andreff, B. Espiau, and R. Horaud, "Visual servoing from lines," in Proc. IEEE Int. Conf. Robot. Autom., 2000, pp. 2070–2075.
[31] E. Malis, J.-J. Borrelly, and P. Rives, "Intrinsics-free visual servoing with respect to straight lines," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2002, pp. 384–389.
[32] R. Mahony and T. Hamel, "Image-based visual servo control of aerial robotic systems using linear image features," IEEE Trans. Robot., vol. 21, no. 2, pp. 227–239, Apr. 2005.
[33] J. Plücker, "On a new geometry of space," Philos. Trans. Roy. Soc. Lond., vol. 155, pp. 725–791, 1865.
[34] D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall, 2003.
[35] J. J. Slotine and W. Li, "On the adaptive control of robot manipulators," Int. J. Robot. Res., vol. 6, pp. 49–59, 1987.
[36] M. Takegaki and S. Arimoto, "A new feedback method for dynamic control of manipulators," ASME J. Dyn. Syst., Meas. Control, vol. 103, pp. 119–125, 1981.
Hesheng Wang (S’05–A’07–M’08) received the B.Eng. degree in electrical engineering from Harbin Institute of Technology, Harbin, China, in 2002, and the M.Phil. and Ph.D. degrees in automation and computer-aided engineering from The Chinese University of Hong Kong, Hong Kong, in 2004 and 2007, respectively. He is currently a Postdoctoral Researcher in the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong. His current research interests include visual servoing, adaptive robot control, and computer vision. Dr. Wang received the Best Student Conference Paper at the IEEE International Conference on Integration Technology in 2007.
Yun-Hui Liu (S’90–M’92–SM’98) received the B.Eng. degree in applied dynamics from Beijing Institute of Technology, Beijing, China, in 1985, the M.Eng. degree in mechanical engineering from Osaka University, Osaka, Japan, in 1989, and the Ph.D. degree in mathematical engineering and information physics from the University of Tokyo, Tokyo, Japan, in 1992. From 1992 to 1995, he was with the Electrotechnical Laboratory, Ministry of International Trade and Industry (MITI), Japan. Since February 1995, he has been with The Chinese University of Hong Kong, Hong Kong, where he is currently a Professor in the Department of Mechanical and Automation Engineering and the Director of the Joint Center for Intelligent Sensing and Systems. He is also a ChangJiang Professor of the National University of Defense Technology, Changsha, China. He has authored or coauthored over 100 papers published in refereed journals and refereed conference proceedings. His current research interests include adaptive control, visual servoing, multifingered robotic hands, sensor networks, virtual reality, Internet-based robotics, and machine intelligence. Dr. Liu received the 1994 and 1998 Best Paper Awards from the Robotics Society of Japan and the Best Conference Paper Awards of 2000 and 2003 of the IEEE Electro/Information Technology Conferences. He was an Associate Editor of the IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION from 2001 to 2004. He was the General Chair of the 2006 IEEE/Robotics Society of Japan (RSJ) International Conference on Intelligent Robots and Systems (IROS).
Dongxiang Zhou received the B.Eng. and M.Eng. degrees in physical electronics and optoelectronics from the Southeast University, Nanjing, China, in 1989 and 1992, respectively, and the Ph.D. degree in information and communication engineering from the National University of Defense Technology, Changsha, China, in 2000. He is currently an Associate Professor in the School of Electronic Science and Engineering, National University of Defense Technology. From 2004 to 2006, he was a Postdoctoral Fellow and a Research Associate at the University of Alberta, Edmonton, AB, Canada. His current research interests include image processing, computer vision, and integrated intelligent systems.