Int. J. Systems, Control and Communications, Vol. 1, No. 1, 2008

Online estimation of Image Jacobian Matrix for uncalibrated dynamic hand-eye coordination

Jianbo Su* and Yanjun Zhang
Department of Automation, Shanghai Jiaotong University, Shanghai, 200240, China
Fax: +86-21-3420 4022
E-mail: [email protected]
E-mail: [email protected]
*Corresponding author

Zhiwei Luo
Bio-Mimetic Control Research Center, RIKEN, Anagahora, Shimoshidami, Moriyama-ku, Nagoya, 463-0003, Japan
E-mail: [email protected]

Abstract: This paper focuses on three different strategies for online estimation of the Image Jacobian Matrix (IJM) in uncalibrated robotic visual servoing, under the different system configurations and coordination tasks that are prevalent in current research. The least square estimation method and the constant IJM policy are proposed for monocular visual feedback, while a Kalman filter-based method is proposed for a stereovision system. A PI control law and optimal control theory are respectively adopted for the coordination controllers to suit different control purposes. Extensive simulations and experiments are provided to evaluate the performance of the proposed methods.

Keywords: calibration-free; Image Jacobian Matrix; IJM; Kalman filter; least-square estimation; visual servoing.

Reference to this paper should be made as follows: Su, J., Zhang, Y. and Luo, Z. (2008) 'Online estimation of Image Jacobian Matrix for uncalibrated dynamic hand-eye coordination', Int. J. Systems, Control and Communications, Vol. 1, No. 1, pp.31–52.

Biographical notes: Jianbo Su, PhD, is a Professor in the Department of Automation, Shanghai Jiaotong University, Shanghai, China. He obtained his PhD from Southeast University, Nanjing, China. He was a visiting scholar at the Chinese University of Hong Kong, the Technical University of Munich, and the Research Institute of Physics and Chemistry (RIKEN) of Japan, in 1999, 2002 and 2004, respectively. His research interests include robotics and pattern recognition.

Yanjun Zhang is a PhD student at Shanghai Jiaotong University. His research interests include plug-and-play robotics, visual servoing and networked robotics.

Zhiwei Luo, PhD, is a Professor at the Research Institute of Physics and Chemistry (RIKEN) of Japan. His research interests include medical robots, control sciences and engineering.

Copyright © 2008 Inderscience Enterprises Ltd.
1 Introduction

The Image Jacobian Matrix (IJM) model has been widely accepted in calibration-free robotic hand-eye coordination. It directly maps visual feedback to robot control, without knowledge of the hand-eye relations, whose determination always requires tedious and error-prone calibration processes (Feddema and George Lee, 1990; Meng and Zhuang, 2001). The IJM is a linear matrix that relates a differential movement in the robot coordinate system to that in the image coordinate system. With the help of the IJM, the complex nonlinearity of the hand-eye relationship can be approximated by simple linearities through proper spatial and temporal combinations. Since the IJM is time-varying, online estimation is always necessary (Hespanha et al., 1998). Research in this area has shown that the performance of a calibration-free robotic hand-eye coordination system mainly relies on the speed and accuracy of the online IJM estimation.

Online estimation of the IJM requires online data accumulation. This is normally realised by introducing some uncorrelated exploratory movements of the robot hand in the neighbourhood of its current work position and observing the resultant projections in the image plane(s). Exploratory movements are radically irrelevant, thus redundant, to the robot movements planned for task execution. When the coordination is under a static condition, it is feasible to include redundant exploratory movements during task fulfillment, although the efficiency of the whole hand-eye coordination system is accordingly degraded (Yoshimi and Allen, 1995). However, when the coordination is in a dynamic environment, and/or executing a dynamic task such as tracking a moving object, exploratory movements cannot be involved, since they destroy the continuity of task execution that is critical for successful dynamic coordination. Therefore, how to estimate the IJM in real time without introducing redundant exploratory movements is the most significant issue for dynamic coordination of an uncalibrated robotic hand-eye system. Besides, the computational expense, as well as the time delay, of online IJM estimation should also be carefully accounted for in system control design and performance evaluation.

There has been a flurry of activity in online acquisition of the IJM for dynamic hand-eye coordination. A straightforward attempt is based on online calibration of the hand-eye relations (Drummond and Cipolla, 2002; Horaud et al., 1998). It is also understood that when the robotic workspace is small, the IJM may not change dramatically across the whole workspace. A constant IJM can then be acceptable during the whole coordination process to save computational expense (Piepmeier and Lipkin, 2003). The unmodelled errors from employing a constant IJM are compensated by a coordination controller designed from adaptive control theory (Papanikolopoulos
et al., 1993) or optimal control theory (Hashimoto et al., 1998). These approaches apparently do not fall into the category of online IJM estimation.

Truly online IJM estimation methods tend to take advantage of the task-fulfillment movements of the robot (Sutanto et al., 1997), in order to eliminate exploratory movements of the robot hand. Most of the algorithms developed are in the framework of the least square estimation method (Hosoda and Asada, 1994; Piepmeier et al., 2004). However, this recipe may give rise to a singularity problem, since task-fulfillment movements are not guaranteed to be sequentially uncorrelated with each other. It is reported that estimating the increment of the IJM in each visual sampling interval, instead of the IJM itself, may decrease the occurrence of singularities (Hosoda and Asada, 1994). Several algorithms have been proposed thereby, such as the Broyden iterative algorithm in Jagersand et al. (1997). However, all these estimation methods are brittle to system noise and external environmental disturbances, so the system performance is not ultimately reliable enough for applications. In Su et al. (2004a), the IJM is viewed as the system's unmodelled dynamics and estimated by a newly-invented state observer. Nevertheless, determining the parameters and nonlinear functions involved in the state observer is not straightforward, which inhibits easy use of the method.

Efforts on reducing the online computational burden of IJM estimation rely on artificial neural networks trained offline. A variety of neural networks have been deployed and tested, such as BP neural networks (Hashimoto et al., 1992), self-organised neural networks (Sharma and Srinivasa, 1998) and fuzzy neural networks (Su et al., 2004c). However, since the IJM is a linear model, the capacity of neural networks to approximate a nonlinear function is not well exploited, and the resultant system performance is also limited. In addition, the structures of the neural networks seem to be closely related to the hand-eye system configuration. It is not yet obvious which type of neural network is more compatible with a prescribed system setup and a particular task for achieving better performance. Moreover, offline training for mapping the IJM with neural networks is computationally comparable with hand-eye relation calibration.

In this paper, we thoroughly investigate online IJM estimation methods through extensive simulations and experiments. To this end, we take advantage of a robot manipulator and two cameras, by which two kinds of hand-eye system configuration can be deployed, as shown in Figure 1. In Figure 1(a), the hand-eye robot system is coordinated under monocular visual feedback from either the camera globally mounted above the robotic workspace or the camera locally mounted in the robot hand. A least-square estimation of the IJM is discussed under global visual feedback, whereas a constant IJM policy is adopted for the local visual feedback. We exhibit the performance of these two IJM estimation methods by carrying out a 2-D tracking task. Moreover, a switch control scheme is accordingly proposed to match the different visual feedback and IJM estimation fashions so that better tracking performance can be achieved, which is discussed in detail in Section 2. If the robot hand is required to manipulate in 3-D space, stereovision feedback is necessary to provide 3-D depth information of the object.
In Figure 1(b), two cameras, fixed with respect to the robot base, constitute a stereovision system, with which the robot hand is coordinated. Section 3 investigates the IJM estimation schemes under stereovision feedback. The least square method is first studied. Due to the higher
complexity of the system structure and model, the least square method is not satisfactory in dealing with system noise and external disturbances, and the singularity problem in the iterative estimation procedure is more serious even if the policy of estimating increments of the IJM is adopted. Consequently, by constructing an instrumental dynamic system with the elements of the IJM, a Kalman filter-based method is proposed to formulate a standard iterative estimation procedure, which achieves more stable and robust performance. These two methods are evaluated by simulations and experiments with a 3-D tracking task. Comparisons with the well-known Sutanto's method (Sutanto et al., 1997) are also provided under noisy conditions, followed by conclusions in Section 4.

Figure 1 Overview of the robotic hand-eye coordination systems: (a) system with a global camera and a hand camera to construct two types of hand-eye coordination system with monocular visual feedback and (b) system with two global cameras to construct a hand-eye coordination system with stereo visual feedback
2 IJM estimation under monocular visual feedback

This section studies IJM estimation under monocular visual feedback from a camera mounted either above the robot workspace or at the robot hand, as shown in Figure 1(a). The global camera fixed above the robotic workspace provides visual feedback over a large scope, but cannot see the object when the hand reaches it because of visual occlusion. On the other hand, the camera in the robot hand has higher resolution in observing the object, but only a limited visual field, so that it can observe merely a small area around the robot hand. Since only one camera is used for coordination control at each time instant, the task is to drive the robot hand in 2-D translation to approach and track an object moving in a 2-D work-plane. When the object is far away from the robot hand, the robot hand coordinates with the global camera to rapidly converge to the object across the large scope of the robot workspace. When the object is within the visual field of the hand camera, the robot hand is switched to coordinate with the hand camera. The task is finished when the object is kept at a prescribed position, e.g., the centre, in the image plane of the hand camera with predefined accuracy.
2.1 Global visual feedback

We first consider the IJM estimation under the global visual feedback. It is reasonable to assume that the robot hand and the moving object are both always within the visual
field of the camera during the coordination process. Two Cartesian coordinate systems are attached to the robot base and the global camera, respectively, to measure the positions of the robot hand and the object in space.
2.1.1 Image Jacobian Matrix

Since the robot hand is controlled in the robot base frame, it is essential to derive the IJM in terms of the robot hand so that a differential movement in the robot base frame can be related to that in the image frame. Suppose the position of the robot hand is $[x_b, y_b, Z]^T$ in the robot base frame, where $Z$ is the depth of the hand and is kept unchanged for 2-D tracking. Hence movements of the robot hand remain in the $x$-$y$ plane of the robot base frame. The control input for the robot hand is $p(t) = [x_b(t), y_b(t)]^T$. If the projection of the robot hand in the image plane is $s(t) = [u(t), v(t)]^T$, we have
$$\dot{s}(t) = J_g(t) \cdot \dot{p}(t), \qquad (1)$$
where
$$J_g(t) = \begin{bmatrix} \dfrac{\partial u}{\partial x_b} & \dfrac{\partial u}{\partial y_b} \\[4pt] \dfrac{\partial v}{\partial x_b} & \dfrac{\partial v}{\partial y_b} \end{bmatrix} \qquad (2)$$
is defined as the IJM. Suppose the robot hand is at $[x_c, y_c, z_c]^T$ in the camera's coordinate system; then
$$u = f\,\frac{x_c}{z_c}, \qquad v = f\,\frac{y_c}{z_c}, \qquad (3)$$
where $f$ is the focal length of the camera. The camera's coordinate system and the robot base frame are related by
$$\begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} = R \cdot \begin{bmatrix} x_b \\ y_b \\ Z \end{bmatrix} + T, \qquad (4)$$
where
$$R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}, \qquad T = \begin{bmatrix} T_x \\ T_y \\ T_z \end{bmatrix}$$
are both constant since the global camera is fixed in the robot base frame. From equations (3) and (4), we have
$$\dot{s} = \begin{bmatrix} \dot{u} \\ \dot{v} \end{bmatrix} = \begin{bmatrix} \dfrac{f}{z_c} & 0 & -\dfrac{u}{z_c} \\[4pt] 0 & \dfrac{f}{z_c} & -\dfrac{v}{z_c} \end{bmatrix} \begin{bmatrix} \dot{x}_c \\ \dot{y}_c \\ \dot{z}_c \end{bmatrix} = \begin{bmatrix} \dfrac{f}{z_c} & 0 & -\dfrac{u}{z_c} \\[4pt] 0 & \dfrac{f}{z_c} & -\dfrac{v}{z_c} \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \\ r_{31} & r_{32} \end{bmatrix} \dot{p}. \qquad (5)$$
According to equations (1) and (5), the IJM under the global visual feedback has the form
$$J_g = \frac{1}{z_c}\begin{bmatrix} f r_{11} - u\, r_{31} & f r_{12} - u\, r_{32} \\ f r_{21} - v\, r_{31} & f r_{22} - v\, r_{32} \end{bmatrix}, \qquad (6)$$
which exactly shows how $J_g$ changes in terms of the variations of $z_c$, $u$ and $v$ due to the robot hand's movements.
2.1.2 Iterative estimation of IJM

We adopt the scheme of iterating the IJM by its increment at each visual sampling moment, i.e., $J_g(k) = J_g(k-1) + \Delta J_g(k)$. Assume the depth variation of the robot hand in the camera's coordinate system between two sequential visual sampling moments can be disregarded, i.e., $z_c(k) \approx z_c(k-1)$. (Since the global camera is normally far away from the robot's work-plane, this assumption is reasonable and has been used extensively in the literature.) From equation (6), we have:
$$\begin{aligned}
\Delta J_g(k) &= \frac{1}{z_c(k)}\begin{bmatrix} f r_{11} - u(k)\, r_{31} & f r_{12} - u(k)\, r_{32} \\ f r_{21} - v(k)\, r_{31} & f r_{22} - v(k)\, r_{32} \end{bmatrix} - \frac{1}{z_c(k-1)}\begin{bmatrix} f r_{11} - u(k-1)\, r_{31} & f r_{12} - u(k-1)\, r_{32} \\ f r_{21} - v(k-1)\, r_{31} & f r_{22} - v(k-1)\, r_{32} \end{bmatrix} \\
&\approx \frac{1}{z_c(k)}\begin{bmatrix} u(k-1) - u(k) \\ v(k-1) - v(k) \end{bmatrix}\begin{bmatrix} r_{31} & r_{32} \end{bmatrix} \\
&= \Delta s(k) \cdot w(k)^T, \qquad (7)
\end{aligned}$$
in which
$$\Delta s(k) = \begin{bmatrix} \Delta u(k) \\ \Delta v(k) \end{bmatrix} = \begin{bmatrix} u(k) - u(k-1) \\ v(k) - v(k-1) \end{bmatrix}, \qquad w(k) = \begin{bmatrix} w_1(k) \\ w_2(k) \end{bmatrix} = \begin{bmatrix} -\dfrac{r_{31}}{z_c(k)} \\[4pt] -\dfrac{r_{32}}{z_c(k)} \end{bmatrix}.$$
Thus from equation (7), we can see that the estimation of $\Delta J_g$ is transformed into the estimation of $w(k)$, since $\Delta s(k)$ is known from image processing. The estimation of $w(k)$ resorts to equation (1) in its discrete form at the $k$th time moment,
$$\Delta s(k) = J_g(k) \cdot \Delta p(k) = (J_g(k-1) + \Delta J_g(k)) \cdot \Delta p(k) = (J_g(k-1) + \Delta s(k) \cdot w(k)^T) \cdot \Delta p(k). \qquad (8)$$
If we replace $J_g(k-1)$ in equation (8) with its estimate $\hat{J}_g(k-1)$ from the previous iteration, the estimate of $w(k)$ can be obtained from a least-square algorithm:
$$\hat{w}(k) = \frac{\Delta p(k) \cdot \Delta s(k)^T \cdot (\Delta s(k) - \hat{J}_g(k-1) \cdot \Delta p(k))}{\|\Delta s(k)\|^2 \, \|\Delta p(k)\|^2}, \qquad (9)$$
which leads to
$$\hat{J}_g(k) = \hat{J}_g(k-1) + \Delta \hat{J}_g(k) = \hat{J}_g(k-1) + \Delta s(k) \cdot \hat{w}(k)^T. \qquad (10)$$
Equations (9) and (10) constitute an online iterative procedure for IJM estimation. The initial value $J_g(0)$ of the iterative procedure can be determined by initial exploration movements. Given two exploration movements $\Delta p_1$ and $\Delta p_2$ of the robot hand that are perpendicular to each other in the robot base frame, with projections $\Delta s_1$ and $\Delta s_2$ in the image plane, respectively, then
$$J_g(0) = [\Delta s_1 \ \ \Delta s_2] \cdot [\Delta p_1 \ \ \Delta p_2]^{-1}. \qquad (11)$$
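For concreteness, the iterative procedure of equations (9)–(11) can be written in a few lines of code. The following Python fragment is a minimal sketch using only NumPy; the function names and the guard against vanishing displacements are our own additions rather than part of the original algorithm.

```python
import numpy as np

def init_ijm(dp1, ds1, dp2, ds2):
    """Initial IJM estimate from two perpendicular exploration moves, equation (11)."""
    DP = np.column_stack([dp1, dp2])   # 2x2: hand displacements in the base frame
    DS = np.column_stack([ds1, ds2])   # 2x2: resulting image displacements
    return DS @ np.linalg.inv(DP)

def update_ijm(J_prev, dp, ds, eps=1e-9):
    """One least-square IJM iteration, equations (9) and (10).

    J_prev : previous estimate of J_g (2x2)
    dp     : hand displacement over the last sampling interval, shape (2,)
    ds     : observed image displacement of the hand, shape (2,)
    """
    denom = (ds @ ds) * (dp @ dp)      # ||ds||^2 * ||dp||^2
    if denom < eps:                    # no informative motion: keep the old estimate
        return J_prev
    residual = ds - J_prev @ dp        # image motion unexplained by J_prev
    w_hat = dp * (ds @ residual) / denom      # equation (9)
    return J_prev + np.outer(ds, w_hat)       # equation (10)
```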
2.2 Local visual feedback

The camera fixed at the robot hand observes only a local area around the hand, with higher resolution than the global camera. However, the camera moves along with the robot hand, and it is difficult to recover the 3-D camera motion or object motion from the object's movements in the camera's image plane. This gives rise to an inherent difficulty in IJM estimation under this system configuration.

We attach a Cartesian coordinate system to the local hand camera. The object motion, instead of the robot hand motion as in the global visual feedback case, is considered in deriving the IJM, since the hand camera observes the motion of the object with respect to the robot hand. Denote the object position in the image plane of the local hand camera by $g = [u, v]^T$; its change is related to the object motion $\dot{b}$ and the robot hand motion $\dot{p}$ in the robot base frame by
$$\dot{g} = J_l \cdot (\dot{b} - \dot{p}). \qquad (12)$$
Comparing equation (12) with (1), we can see that the object motion $\dot{b}$ in equation (12) is not known and is impractical to measure online. So the IJM cannot be estimated by the procedure stated in Section 2.1.2, unless continuous redundant online data accumulation procedures are involved, with prohibitively high computational expense. Nevertheless, since the camera in the hand is used to guide the robot hand to track the moving object within a small scope of the robot workspace, the IJM under this condition usually does not change significantly. Therefore it is worthwhile to adopt a constant $J_l$ here to save computational expense and enhance the robot's reaction speed, leaving the modelling error from $J_l$ to be compensated by the controller design. This is possibly the cheapest way to solve the calibration-free hand-eye coordination problem under this system configuration. From equation (12), we can see that if $\dot{b} = 0$, i.e., the object is static, then
$$\dot{g} = -J_l \cdot \dot{p}. \qquad (13)$$
A reasonable estimate of the constant $J_l$ can then be obtained from the following procedure:

•   Move the robot hand to the central area of the robot workspace. Make sure the hand camera observes a static object, with the object projected in the central region of its image plane.

•   Drive the robot hand to move by $\Delta p_1$ in a random direction while keeping the object observed by the camera. Record the resultant object position change $\Delta g_1$ in the image plane.

•   Drive the robot hand to move by $\Delta p_2$ in the direction perpendicular to that of $\Delta p_1$. Record the resultant object position change $\Delta g_2$ in the image plane.

Consequently, an estimate of $J_l$ from the above procedure is
$$J_l = -[\Delta g_1 \ \ \Delta g_2] \cdot [\Delta p_1 \ \ \Delta p_2]^{-1}. \qquad (14)$$
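Under the same assumptions, the constant hand-camera Jacobian of equation (14) is a one-line computation. In this sketch the minus sign carries over from equation (13): the camera moves with the hand, so a static object appears to move opposite to the hand.

```python
import numpy as np

def estimate_jl(dp1, dg1, dp2, dg2):
    """Constant hand-camera IJM from two perpendicular moves, equation (14).

    dp1, dp2 : the two hand displacements in the base frame, shape (2,)
    dg1, dg2 : the corresponding object displacements in the hand image, shape (2,)
    """
    DG = np.column_stack([dg1, dg2])
    DP = np.column_stack([dp1, dp2])
    return -DG @ np.linalg.inv(DP)     # minus sign from equation (13)
```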
2.3 Controller designs and switch

It is obvious that controller design is closely related to the coordination purpose. Therefore, it is very helpful to clarify the coordination purposes according to the system setup and task requirements before deploying control theories in controller design. When the robot is coordinated with the global camera, the purpose of the control is to drive the robot hand to approach the object so that the object can be observed by the hand camera. We take advantage of an optimal controller to generate the control signals for the robot hand. If the global camera observes the object at $g(t)$ and the robot hand at $s(t)$ in its image plane, then the system control error is defined as
$$e_g(t) = g(t) - s(t). \qquad (15)$$
The control for the robot hand is obtained by minimising the following cost function:
$$F = \frac{1}{2}\, e_g^T e_g. \qquad (16)$$
Substituting equation (15) into (16) and considering equation (1) yields the following control law in discrete form:
$$p(k+1) = p(k) + J_g(k)^{-1} \cdot (\hat{g}(k+1) - s(k)), \qquad (17)$$
where a one-step predictor is employed for the object movement observed in the image plane: $\hat{g}(k+1) = g(k) + (g(k) - g(k-1)) = 2g(k) - g(k-1)$.

When the robot is coordinated with the local (hand) camera, the control purpose is to drive the robot hand so that the object is projected at a prescribed position in the image plane of its attached camera. We adopt a PI controller to achieve zero static error for the system. Defining $e_l(t) = g^d - g(t)$, where $g^d$ and $g(t)$ are respectively the desired position and the true position of the object in the image plane of the hand camera, we have the following control law in discrete form:
$$p(k+1) = J_l^{-1} \cdot \left( c_1 e_l(k) + c_2 \sum_{i=1}^{k} e_l(i) \right), \qquad (18)$$
where $c_1$ and $c_2$ are the coefficients of the PI controller. When the system is to approach and track the moving object, the controller of equation (17) is used first. If the position error between the object and the robot hand is within a threshold in the global camera's image plane and the object is observed by the local hand camera, then the controller of equation (18) is activated to take over the hand control, with visual feedback from the hand camera as well. Figure 2 shows the flow chart, in which $\varepsilon_g$ is a predefined threshold triggering the controller switch and $\varepsilon_l$ is a threshold specifying task accomplishment; a code sketch of this loop follows the figure. Note that the selection of $\varepsilon_g$ is critical for a successful switch of the controllers at the proper time instant.
Figure 2 Flowchart of system control and controller switches
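To summarise Section 2, the switching logic of Figure 2 combined with the control laws (17) and (18) can be sketched as follows. This is only an illustration under stated assumptions: the observation and motion callbacks are hypothetical placeholders, and the PI law (18) is applied here as an incremental command to the current hand position, which is one plausible reading of equation (18).

```python
import numpy as np

def track(observe_global, observe_hand, move_by, Jg, Jl, g_des,
          c1=1.5, c2=0.1, eps_g=30.0, eps_l=5.0, max_steps=1000):
    """Switching tracking loop following Figure 2 (callbacks are placeholders).

    observe_global() -> (g, s): object and hand positions in the global image plane.
    observe_hand()   -> object position in the hand image plane, or None if unseen.
    move_by(dp)      : command an incremental hand motion dp in the base frame.
    """
    g_prev = None
    e_sum = np.zeros(2)                  # integral term of the PI controller (18)
    for _ in range(max_steps):
        g, s = observe_global()
        g_loc = observe_hand()
        if np.linalg.norm(g - s) > eps_g or g_loc is None:
            # Global phase: optimal control law (17) with the one-step predictor
            # g_hat(k+1) = 2 g(k) - g(k-1).
            g_hat = 2.0 * g - g_prev if g_prev is not None else g
            dp = np.linalg.solve(Jg, g_hat - s)
            g_prev = g
            # (Jg would be refreshed here with update_ijm() from Section 2.1.2.)
        else:
            # Local phase: PI control law (18), read as an incremental command.
            e_l = g_des - g_loc
            if np.linalg.norm(e_l) < eps_l:
                return True              # task accomplished within threshold eps_l
            e_sum += e_l
            dp = np.linalg.solve(Jl, c1 * e_l + c2 * e_sum)
        move_by(dp)
    return False
```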
2.4 Simulations

In the simulations, the work-plane where the object moves is defined to be the xoy plane of the robot base frame. The robot hand is controlled to move on a plane 70 mm above the xoy plane. The focal length of the global camera is 15 mm and the image resolution is 5 × 5/500 × 500 (mm/pixel). Its pose with respect to the robot base frame is
$$R = \begin{bmatrix} 0 & 0.9569 & -0.2588 \\ 0.8192 & -0.1485 & -0.5540 \\ -0.5736 & -0.2120 & -0.7912 \end{bmatrix}, \qquad T = [800 \ \ 600 \ \ 800]^T \ \mathrm{mm}.$$
The focal length of the hand camera is 6 mm, and the image resolution is 5 × 5/300 × 300 (mm/pixel). Its pose with respect to the robot hand is
$$R = \begin{bmatrix} 0.7660 & -0.6428 & 0 \\ -0.6428 & -0.7660 & 0 \\ 0 & 0 & -1 \end{bmatrix}, \qquad T = [0 \ \ 0 \ \ {-50}]^T \ \mathrm{mm}.$$
In the simulations, all internal and external parameters of both cameras are unknown to the coordination controllers. Zero-mean Gaussian noise of at most 3 pixels is also added to the image measurements. $\varepsilon_g$ is empirically set to 30 pixels and $\varepsilon_l$ is set to 5 pixels.

In the first simulation, the object moves along straight lines with a midway direction change. Its initial position is (100, 400) mm in the robot base frame and its speed is 0.1 m/s. The robot hand is initially located at (200, 100) mm in the robot base frame. The parameters of the PI controller are empirically set to $c_1 = 1.5$ and $c_2 = 0.1$. Figure 3 shows the simulation results, in which Figure 3(a) shows the tracking process in the robot base frame and Figure 3(b) shows the tracking errors. It is easy to see that the tracking is successful, with the error suppressed quickly to near zero. The tracking error fluctuates at about the 20th step, when the controller is switched,
and the 70th step, when the moving direction of the object changes sharply. This shows the effectiveness of the switching control scheme, which gives a faster response to object maneuvering motions under the constant IJM policy.

In the second simulation, the object moves in a circle. The initial position of the object is (100, 500) mm in the robot base frame and its linear speed is 0.1 m/s. The robot hand is at (200, 100) mm initially. The PI controller parameters are selected as $c_1 = 1.5$ and $c_2 = 0.3$. The simulation results are shown in Figure 4, which also illustrates satisfactory tracking performance, although the static tracking error is, as expected, slightly larger than that in Figure 3.

Figure 3 Simulation results for tracking a linearly moving object with a direction change: (a) tracking process in robot base coordinate system and (b) tracking error
Figure 4 Simulation results for tracking a circularly moving object: (a) tracking process in robot base coordinate system and (b) tracking error
2.5 Experiments

Experiments are conducted on a testbed with an ADEPT 604S robot manipulator of four degrees of freedom. The hand is constrained to move in the xoy plane of the
robot base frame. To simplify image processing for object recognition, a red block is attached to the robot hand and a green block is used as the object. Figure 5 shows the experiment results. The object moves along a straight line with a midway veer. At the beginning of the experiment, two exploration motions of the robot hand are activated to give an initial guess for $J_g$. The controller, along with the visual feedback, is switched from the optimal controller with the global camera to the PI controller with the hand camera. The thresholds in control are set the same as in the simulations. A switch point can be seen in the figure, indicated by a jerk in the tracking trajectory.

From the experiments, we can see that the system performance is satisfactory for the 2-D tracking task, which verifies the effectiveness of the IJM estimation methods for calibration-free robotic hand-eye coordination under the two kinds of monocular visual feedback. The global camera supplies visual feedback over the whole work-plane, which enables the robot hand to converge to the object quickly and at the same time compensates for the limited visual field of the hand camera. On the other hand, the hand camera overcomes the problem of the robot hand occluding the object in the global camera's view, and gives more accurate visual feedback for object tracking to improve the overall tracking performance.

Figure 5 Experiment results for tracking a linearly moving object with a direction change: (a) tracking process in robot base coordinate system and (b) tracking error
3 IJM estimation under stereo visual feedback

A more challenging topic in robotic hand-eye coordination is to drive the robot hand to track an object moving in 3-D space. A stereovision system is therefore used to give 3-D measurements of the robot workspace. The hand-eye system discussed here is composed of one robot hand and two cameras fixed in the robot workspace to construct a stereovision system, as shown in Figure 1(b). It is postulated that the relative poses of the two cameras are properly fixed in 3-D space so that they complementarily provide well-posed 3-D measurements of the robot workspace. If the position of the robot hand in the robot base frame is $p = [x_b, y_b, z_b]^T$ and its projections in the two cameras' image planes are respectively $s_1 = [u_1, v_1]^T$
and $s_2 = [u_2, v_2]^T$, which form a stacked vector $s = [u_1, v_1, u_2, v_2]^T$, then we have
$$\dot{s}(t) = J_s \cdot \dot{p}(t), \qquad (19)$$
where $J_s \in \mathbb{R}^{4\times 3}$ is the IJM under stereo visual feedback. It is easy to see that $J_s$ has a higher dimension than $J_g$ and $J_l$ in Section 2; therefore, its estimation is more complicated. In this section, we first study the least square estimation method for $J_s$, and then propose a Kalman filter-based estimation method.
3.1 Least square method

We take advantage of the robot movements that are planned for task execution to estimate $J_s$. Suppose the robot hand moves in the robot base frame by $\Delta p_{k-3}$, $\Delta p_{k-2}$ and $\Delta p_{k-1}$ in the last three visual sampling intervals, respectively, and the corresponding projections observed in the stereovision image planes are $\Delta s_{k-3}$, $\Delta s_{k-2}$ and $\Delta s_{k-1}$. A reasonable estimate of $J_s$ at the $k$th visual sampling moment is
$$J_s(k) = [\Delta s_{k-3} \ \ \Delta s_{k-2} \ \ \Delta s_{k-1}] \cdot [\Delta p_{k-3} \ \ \Delta p_{k-2} \ \ \Delta p_{k-1}]^{-1} = DS(k) \cdot DP(k)^{-1}, \qquad (20)$$
if $DP(k)$ is of full rank and thereby invertible. Unfortunately, $DP(k)$ is not always invertible, because its three component movements $\Delta p_{k-3}$, $\Delta p_{k-2}$ and $\Delta p_{k-1}$ are essentially planned for the robot hand to execute a particular task. They are most probably correlated with each other when, e.g., the object moves along the direction of a specific axis of the robot base frame and has no motion components along the other axes. In this case, $DP$ is singular. We utilise the Singular Value Decomposition (SVD) of $DP$:
$$DP = V \Sigma U^T, \qquad (21)$$
in which $V$ is an orthogonal matrix of eigenvectors of the product $DP \cdot DP^T$, $U$ is an orthogonal matrix of eigenvectors of the product $DP^T DP$, and $\Sigma$ is a diagonal matrix of the descendingly-ordered nonnegative singular values of $DP$. With the help of equation (21), we have an estimate of $J_s$:
$$J_s(k) = DS(k) \cdot DP(k)^+ + \alpha(k) \cdot (I - DP(k) \cdot DP(k)^+), \qquad (22)$$
where $DP(k)^+ = U \Sigma^+ V^T$ is the pseudo-inverse of $DP(k)$ in the sense of SVD, and $\Sigma^+$, the pseudo-inverse of $\Sigma$, is obtained from $\Sigma^T$ by inverting its nonzero elements. $\alpha(k)$ in equation (22) is a scaling factor adjusting the weight that the singular part of $DP(k)$ contributes to the estimate of $J_s$ at moment $k$. Inherently, the $J_s$ obtained from equation (22) connotes a null space that gives rise to uncontrollable degree(s)-of-freedom in the control of the robot hand, which is not recoverable even after the object turns to move in an orthogonal direction. Therefore, if $DP(k)$ happens to be singular, an exploration movement orthogonal to the last two movements $\Delta p_{k-2}$ and $\Delta p_{k-1}$, or two orthogonal exploration movements orthogonal to the last movement $\Delta p_{k-1}$, should be added to drive $DP(k)$ away from singularity.
Obviously, these exploration movements will degrade the efficiency of the task-fulfillment process.

Control of the robot hand under stereovision feedback can also be obtained from the optimal control law. An index function of the square of the image error between the object and the robot hand, as in equation (16), is minimised, which results in
$$p(k+1) = p(k) + \lambda(k) \cdot J_s(k)^+ \cdot e_s(k), \qquad (23)$$
where $J_s(k)^+ = (J_s(k)^T J_s(k))^{-1} J_s(k)^T$ is the pseudo-inverse of $J_s(k)$, $e_s(k)$ is the error between the object and the robot hand in the two image planes, as in equation (15), and $\lambda(k)$ is defined as
$$\lambda(k) = \min\left\{ \frac{\Delta p_{\max}}{\|J_s(k)^+ \cdot e_s(k)\|},\ 1 \right\},$$
in which $\Delta p_{\max}$ is the upper bound for the velocity increment of the robot hand in each visual sampling period.
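A compact sketch of equations (20)–(23) follows. One caveat: equation (22) leaves the component of the estimate in the null space of $DP(k)$ free; the sketch fills it with the previous estimate J_prev scaled by alpha, which is our own choice, consistent with the general least-square solution $J = DS \cdot DP^+ + W(I - DP \cdot DP^+)$ for an arbitrary matrix $W$, so that unobserved directions retain their old values rather than collapsing to zero.

```python
import numpy as np

def estimate_js(DS, DP, J_prev, alpha=1.0, tol=1e-6):
    """Least-square estimate of the 4x3 stereo IJM, equations (20)-(22).

    DS : 4x3, last three stacked image displacements [ds_{k-3}, ds_{k-2}, ds_{k-1}]
    DP : 3x3, last three hand displacements          [dp_{k-3}, dp_{k-2}, dp_{k-1}]
    The null-space term of equation (22) is filled with alpha * J_prev (an
    assumption made here; the paper leaves the weighting to a factor alpha(k)).
    """
    V, sigma, Ut = np.linalg.svd(DP)              # DP = V Sigma U^T, equation (21)
    sig_inv = np.array([1.0 / s if s > tol else 0.0 for s in sigma])
    DP_pinv = Ut.T @ np.diag(sig_inv) @ V.T       # SVD pseudo-inverse of DP
    return DS @ DP_pinv + alpha * J_prev @ (np.eye(3) - DP @ DP_pinv)

def control_step(p, Js, e_s, dp_max):
    """Damped optimal control step, equation (23)."""
    step = np.linalg.pinv(Js) @ e_s               # Js^+ e_s
    lam = min(dp_max / (np.linalg.norm(step) + 1e-12), 1.0)
    return p + lam * step
```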
3.2 Kalman filter-based method

It has been clarified in Section 3.1 that the estimation of the IJM by the least square method is methodologically not satisfactory, in that the singularity problem in the estimation procedure is not inherently avoidable. When estimation singularity occurs, the task execution process has to be interrupted and an exploratory movement process has to be triggered for the robot hand, which surely deteriorates the performance of the whole system. Moreover, image processing is always an error-prone process, which imposes further difficulties on improving the accuracy of IJM estimation. Fortunately, modern control theory has provided many efficient tools for estimating a system's unmodelled dynamics, such as the Kalman filter, which has proved to be robust to system noise and external disturbances. In this subsection, we study how the Kalman filter can be utilised in IJM estimation to obviate the singularity problem, in addition to exploiting its robustness to system noise.

Construct an instrumental system whose state is an augmented vector formed by concatenating the row elements of $J_s(k)$ at the $k$th time moment:
$$x(k) = \left[ \frac{\partial s_1}{\partial p} \quad \frac{\partial s_2}{\partial p} \right]^T, \qquad (24)$$
where
$$\frac{\partial s_i}{\partial p} = \left[ \frac{\partial u_i}{\partial x_b} \ \ \frac{\partial u_i}{\partial y_b} \ \ \frac{\partial u_i}{\partial z_b} \ \ \frac{\partial v_i}{\partial x_b} \ \ \frac{\partial v_i}{\partial y_b} \ \ \frac{\partial v_i}{\partial z_b} \right], \quad (i = 1, 2). \qquad (25)$$
Define $y(k) = s(k+1) - s(k)$; from equation (19), we have the instrumental system
$$\begin{cases} x(k+1) = x(k) + \eta(k) \\ y(k) = C(k) \cdot x(k) + \epsilon(k), \end{cases} \qquad (26)$$
where $\eta(k)$ and $\epsilon(k)$ are respectively the noises in the system state and output; both are assumed to be white Gaussian. $C(k)$ in equation (26) is the measurement matrix
$$C(k) = \begin{bmatrix} \Delta p(k)^T & 0 & 0 & 0 \\ 0 & \Delta p(k)^T & 0 & 0 \\ 0 & 0 & \Delta p(k)^T & 0 \\ 0 & 0 & 0 & \Delta p(k)^T \end{bmatrix}_{4\times 12},$$
in which $\Delta p(k) = p(k+1) - p(k)$. The standard Kalman filter algorithm can be applied to equation (26) to observe its state (Chui and Chen, 1987):
$$\begin{aligned}
Q(k+1) &= P(k) + R_\eta \\
K(k+1) &= Q(k+1)\, C^T(k)\, [C(k)\, Q(k+1)\, C^T(k) + R_\epsilon]^{-1} \\
P(k+1) &= [I - K(k+1)\, C(k)]\, Q(k+1) \\
\hat{x}(k+1) &= \hat{x}(k) + K(k+1)\, [y(k+1) - C(k)\, \hat{x}(k)],
\end{aligned} \qquad (27)$$
in which $R_\eta$ and $R_\epsilon$ are the covariance matrices of $\eta(k)$ and $\epsilon(k)$, respectively. Thus a reasonable estimate of the IJM is given by $\hat{x}(k+1)$ from the iteration of equation (27). The initial value $\hat{x}(0)$ can be resolved from $\hat{J}(0)$, which is obtained by a procedure similar to equation (11) with several uncorrelated exploratory movements prior to task execution. Since we no longer need the robot movements of the last three visual sampling intervals to form a matrix that must be invertible, the estimation singularity problem is obviated in the Kalman filter fashion of IJM estimation. Moreover, the Kalman filter has already proven its superb properties in dealing with system noise, which is a solid basis for attaining a more accurate estimate of the IJM. This is actually another motivation for applying the Kalman filter to online IJM estimation.
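The filter of equations (24)–(27) reduces to a few lines of NumPy. The sketch below assumes the noise covariances R_eta and R_eps are supplied as tuning parameters; everything else follows the equations directly.

```python
import numpy as np

def build_C(dp):
    """Measurement matrix C(k) of equation (26): block rows of dp^T."""
    C = np.zeros((4, 12))
    for i in range(4):
        C[i, 3 * i:3 * i + 3] = dp
    return C

class IJMKalman:
    """Kalman-filter estimator of the stacked 12-vector x(k) of equation (24),
    implementing the iteration of equation (27)."""

    def __init__(self, x0, P0, R_eta, R_eps):
        self.x, self.P = x0.copy(), P0.copy()    # state estimate and covariance
        self.R_eta, self.R_eps = R_eta, R_eps    # process / measurement noise cov.

    def update(self, dp, y):
        """dp = p(k+1) - p(k); y = s(k+1) - s(k), the 4-vector of image increments."""
        C = build_C(dp)
        Q = self.P + self.R_eta                                 # predicted covariance
        K = Q @ C.T @ np.linalg.inv(C @ Q @ C.T + self.R_eps)   # Kalman gain
        self.P = (np.eye(12) - K @ C) @ Q
        self.x = self.x + K @ (y - C @ self.x)                  # state correction
        return self.x.reshape(4, 3)     # rows: du1/dp, dv1/dp, du2/dp, dv2/dp
```

With 3-D hand increments dp and 4-D stacked image increments y from the two cameras, each call returns the current estimate of $J_s$ without ever inverting a displacement matrix, which is precisely why the singularity problem of Section 3.1 does not arise.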
3.3 Simulations

In the following simulations, the two cameras are respectively located at [200 650 850] mm and [−200 650 850] mm with respect to the robot base frame. Their optical axes are oriented towards the points [−42 340 0] mm and [42 340 0] mm in the xoy plane of the robot base frame. The two cameras have the same focal length of 15 mm and the image resolutions are both 5 × 5/500 × 500 (mm/pixel). The speed of the object is 0.2 m/s. The maximum velocity of the robot hand is set to 0.8 m/s due to the physical constraints of the robot actuators. All these parameters are unknown to the IJM estimation and the coordination controller. In addition, the image positions of the object and the hand are corrupted by zero-mean Gaussian noise of at most 4 pixels in both the horizontal and vertical directions.

The first simulation is for IJM estimation with the least square method by equations (20) and (22). The results are shown in Figure 6, in which Figure 6(a) is for an object moving along a straight line with a midway direction turn, while Figure 6(b) is for an object moving along a spiral curve. It is easy to see from the figures that the tracking procedure is successful, though not so smooth and consistent, which implies
the IJM estimated by the least square method is correct, though not as robust as that in Section 2.4. Since the IJM under stereo visual feedback has a higher dimension than that under monocular visual feedback, this result is to be expected.

Figure 6 Coordination procedure with the IJM estimated by least square method: (a) tracking a linearly moving object with a direction turn in 3-D space and (b) tracking a spirally moving object in 3-D space
The performance of IJM estimation by the Kalman filter is evaluated in the second simulation. The simulation parameters are set the same as those in the first one. To emphasise the comparison of IJM estimation performance across the different fashions, the control law of equation (23) is also adopted here. The results are shown in Figure 7.
Figure 7(a) is the tracking procedure for a linearly moving object with a direction turn and Figure 7(b) shows the tracking errors. Figure 7(c) is the tracking procedure for a spirally moving object and Figure 7(d) shows the tracking errors. Compared with Figure 6, we can easily see that the tracking performance of the system with the IJM estimated by the Kalman filter is much better than that with the least square method. This implies that the Kalman filter method is advantageous over the least square method for IJM estimation under stereovision feedback, since both simulations use the same controller. Specifically, the Kalman filter deals with the noise from image processing better than the least square method does.

Figure 7 Coordination procedure with IJM estimated by Kalman filter: (a) tracking a linearly moving object with direction turn; (b) tracking errors; (c) tracking a spirally moving object and (d) tracking errors
To further evaluate the performance of the Kalman filter for IJM estimation, we compare it with Sutanto's method reported in Sutanto et al. (1997). The same simulation parameters are used and the same noises are added. Figure 8 shows the tracking results with the IJM estimated by Sutanto's method, in which Figure 8(a) shows the tracking procedure for a linearly moving object with a direction change and Figure 8(b) shows the tracking procedure for a circularly moving object. It is obvious that the tracking procedures in Figure 7(a) and (c) are both smoother and more accurate than those in
Figure 8(a) and (b). This indicates that the proposed Kalman filter method is better than Sutanto's method for online IJM estimation.

Figure 8 Tracking with the IJM estimated by Sutanto's method: (a) tracking a linearly moving object with direction change and (b) tracking a circularly moving object
3.4 Experiments

The experiment system is set up as shown in Figure 9, in which an ADEPT 604S robot manipulator of four degrees of freedom is used, with an upper camera fixed above the workspace and a side camera fixed beside the robot manipulator to construct a stereovision system. The image sizes for image processing are both 320 × 320 pixels. To simplify image processing and object recognition, the robot hand is recognised by a red block attached to it and the object is a green block. Both are positioned by the centres of the blocks. The image sampling period is 100 ms on a 700 MHz Pentium III processor. The object's speed is about 0.1 m/s in the experiments.

Figure 9 Testbed in experiments
The first experiment was conducted with the IJM estimated by the least square method. The tracking procedure for a circularly moving object observed by the upper
camera is shown in Figure 10, in which the robot hand approaches the object after 12 iterations. The tracking procedure is preceded by a three-step exploration to determine the initial value of the IJM, which is not shown in the figure. The control is successful, but the system controller takes a long time to adjust the tracking errors before converging. This indicates that IJM estimation by the least square method is not very stable, which also causes the fluctuations that accompany the whole tracking procedure. Note that the object and the robot hand are represented by the projection centres of the blocks, each of which has a radius of about 10 pixels in the image planes. When the tracking procedure completes and the two colour blocks actually touch each other, they are thus still about 20 pixels apart in the image planes.

Figure 10 Tracking experiment with IJM estimated by least square method
Figure 11 Tracking experiment with IJM estimated by Kalman filter: (a) tracking process in upper camera and (b) tracking process in side camera
The second experiment evaluates the performance of the Kalman filter method for IJM estimation. The same experiment is conducted and the results are shown in Figure 11, in which Figure 11(a) is from the upper camera and Figure 11(b) is from
the side camera. The tracking is stable after 11 iterations. Due to the size of the representing blocks, the robot hand does not overlap with the object in the image plane when the tracking task is finished. Comparing Figure 11 with Figure 10, it is evident that the tracking procedure under the Kalman-filter-estimated IJM is much faster and smoother than that under the least-square-estimated IJM. This illustrates that the Kalman filter performs a more stable IJM estimation, which enables the robot hand to converge quickly to the object within a prescribed error threshold, without adjusting fluctuations.

Figure 12 Testbed for experiment with changed cameras' poses
Figure 13 Tracking experiment with IJM estimated by Kalman filter under changed camera poses: (a) tracking process in upper camera and (b) tracking process in side camera
To further evaluate the performance of the Kalman filter in IJM estimation, one more experiment is carried out to check its independence of camera calibration. The robotic hand-eye coordination system with changed camera positions is shown in Figure 12. The same tracking task is accomplished with the same controller and the same parameters as above. The results are shown in Figure 13, where Figure 13(a) is from the upper camera and Figure 13(b) is from the side camera. The task is successfully
finished after 11 iterations, without explicit difference in tracking performance from that shown in Figure 11. This implies that the proposed control scheme and IJM estimation method are independent of the system configuration and the hand-eye relations, i.e., exactly calibration-free. Figure 14 shows the scenes captured by the two cameras at the start and end moments of the tracking task. It is clear that 3-D motion of the robot hand can be achieved with the help of the two-camera stereovision feedback. The views from the upper camera (Figure 14(a) and (b)) illustrate the motion of the robot hand in the x and y directions of the robot base frame, whereas the views from the side camera (Figure 14(c) and (d)) evidently illustrate the motion of the robot hand in the z and y directions.

Figure 14 Scenes captured by the two cameras at the start and end time moments: (a) start scene in upper camera; (b) end scene in upper camera; (c) start scene in side camera and (d) end scene in side camera
4 Conclusions

This paper highlights, in a systematic way, IJM estimation methods for the uncalibrated robotic hand-eye coordination problem with dynamic tracking tasks. Different methods are explored according to the system configuration. Three kinds of strategies are proposed for online estimation of the IJM:

•   a constant IJM for the robot to coordinate with a monocular hand camera

•   the least square estimation method for the robot to coordinate with either a monocular global camera or a stereovision system

•   the Kalman filter method for the robot to coordinate with a stereovision system.
These methods are thoroughly investigated with object tracking tasks to show their properties. It is obvious that the Kalman filter-based fashion is more accurate and achieves much better performance in the presence of system noise and external disturbances. To arrive at a satisfactory system performance, the coordination controller is also essential, besides the IJM estimation. A switching control scheme between an optimal controller and a PI controller was considered in this paper to achieve better overall tracking performance. These two control fashions are adopted according to the specific coordination purposes and tasks. Of course, there may exist a more proper controller that outperforms the others when combined with a specific IJM estimation method. This remains an open problem for future research.
References

Chui, C.K. and Chen, G. (1987) Kalman Filtering with Real-Time Applications, Springer-Verlag, New York, NY, USA.

Drummond, T. and Cipolla, R. (2002) 'Real-time tracking of complex structures with online camera calibration', Image and Vision Computing, Vol. 20, pp.427–433.

Feddema, J.T. and George Lee, C.S. (1990) 'Adaptive image feature prediction and control for visual tracking with a hand-eye coordinated camera', IEEE Trans. System, Man and Cybernetics, Vol. 20, No. 5, pp.1172–1183.

Hashimoto, H., Kubota, T., Sato, M. and Harashima, F. (1992) 'Visual control of robotic manipulator based on neural networks', IEEE Trans. Industrial Electronics, Vol. 39, No. 6, pp.490–496.

Hashimoto, K., Ebine, T. and Kimura, H. (1998) 'Visual servoing with hand-eye manipulator – optimal control approach', IEEE Trans. Robot. and Automat., Vol. 12, No. 5, pp.766–774.

Hespanha, J., Dodds, Z., Hager, G.D. and Morse, A.S. (1998) 'What can be done with an uncalibrated stereo system?', Proc. IEEE Int. Conf. on Robotics and Automation, pp.1366–1372.

Horaud, R., Dornaika, F. and Espiau, B. (1998) 'Visually guided object grasping', IEEE Trans. Robot. and Automat., Vol. 14, No. 4, pp.525–532.

Hosoda, K. and Asada, M. (1994) 'Versatile visual servoing without knowledge of true Jacobian', Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, pp.186–193.

Jagersand, M., Fuentes, O. and Nelson, R. (1997) 'Experimental evaluation of uncalibrated visual servoing for precision manipulation', Proc. IEEE Int. Conf. on Robotics and Automation, pp.2874–2880.

Meng, Y. and Zhuang, H. (2001) 'Self-calibration of camera-equipped robot manipulators', The International Journal of Robotics Research, Vol. 20, November, pp.909–921.

Papanikolopoulos, N.P., Khosla, P.K. and Kanade, T. (1993) 'Visual tracking of a moving target by a camera mounted on a robot: a combination of vision and control', IEEE Trans. Robot. and Automat., Vol. 9, No. 1, pp.14–35.

Piepmeier, J.A. and Lipkin, H. (2003) 'Uncalibrated eye-in-hand visual servoing', International Journal of Robotics Research, Vol. 22, Nos. 10–11, October, pp.805–819.

Piepmeier, J.A., McMurray, G.V. and Lipkin, H. (2004) 'Uncalibrated dynamic visual servoing', IEEE Trans. on Robot. and Autom., Vol. 20, No. 1, pp.143–147.

Sharma, R. and Srinivasa, N. (1998) 'A framework for active vision-based robot control using neural networks', Robotica, Vol. 16, pp.309–327.
Su, J.B., Ma, H.Y., Qiu, W.B. and Xi, Y.G. (2004a) 'Task-independent robotic uncalibrated hand-eye coordination based on the extended state observer', IEEE Trans. on System, Man and Cybernetics, B, Vol. 34, No. 4, pp.1917–1922.

Su, J.B., Pan, Q.L. and Luo, Z.W. (2004c) 'Full-DOF calibration-free robotic hand-eye coordination based on fuzzy neural network', Lecture Notes in Computer Science, Vol. 3174, pp.7–12.

Sutanto, H., Sharma, R. and Varma, V. (1997) 'Image based autodocking without calibration', Proc. IEEE Int. Conf. on Robotics and Automation, pp.974–979.

Yoshimi, B.H. and Allen, P.K. (1995) 'Alignment using an uncalibrated camera system', IEEE Trans. Robot. Automat., Vol. 11, No. 4, pp.516–521.