3D Point Tracking and Pose Estimation of a Space Object Using Stereo Images

Nassir W. Oumer and Giorgio Panin
Institute of Robotics and Mechatronics, German Aerospace Center (DLR)
Email: {nassir.oumer, giorgio.panin}@dlr.de

Abstract. In this paper, we present a novel method for stereo camera-based 3D point tracking, integrated into a bank of iterated, extended Kalman filters (IEKF), one associated with each visual feature. The approach utilizes optical flow and stereo correspondence of visible, predominantly specular features on a target satellite surface, in order to estimate the translational and rotational velocities of the rigid body. The motion of each 3D point can be predicted, since all points are constrained by the common rigid-body motion. A robust pose estimator, based on dual quaternions and median statistics, is then applied to the estimated points. In case of temporarily missing measurements, the last estimated body velocity is used to predict the next poses. Results are shown on images of a satellite-mockup simulator, to demonstrate the performance for on-orbit servicing.

1. Introduction

Visual tracking of a space object, such as a malfunctioning satellite, is currently an active research topic. One of the foreseen applications is on-orbit servicing: there exist plenty of defective satellites in space, occupying precious orbits. The ultimate goal is either to repair or to deorbit them. This may be done by launching a servicer satellite carrying a robot arm. The relative pose of the defective satellite with respect to the servicer needs to be tracked and predicted over time, so that the servicer can approach it for visual servoing. However, under direct sunlight the highly specular nature of the surface, due to metallic parts or the multilayer insulation (MLI) wrapped around the body, poses strong difficulties for tracking.

For tracking purposes, [13] utilizes a CAD model of the defective satellite and implements an iterative closest point (ICP) algorithm. Similarly, a wireframe model was used in [3]. Joint estimation of pose and dynamic parameters, based on a laser-camera system and a CAD model, is experimentally demonstrated in [1]. The aforementioned approaches are suitable if a priori knowledge about the geometry of the satellite and the initial pose is given. Without such knowledge, the problem becomes more difficult. On one hand, Hillenbrand et al. [5] developed a least-squares method for pose estimation using range data, which also identifies the six inertial parameters of the target. A model-free, stereo camera-based approach using an extended Kalman filter was also used to estimate structure, relative pose and motion of non-cooperative targets in [10]. In this scheme, inertial parameters are identified through a hypothesis-based likelihood score over a finite set of possibilities, and results from numerical simulations have been presented. On the other hand, traditional tracking approaches using feature points can be applied under relatively variable lighting conditions. Standard features [11] can be selected for sparse optical flow, with a multi-resolution implementation satisfying most of the required real-time efficiency and accuracy. Since the above-mentioned work, many extensions have appeared. A few of them incorporate photometric and geometric cues of a scene [9], improving computational efficiency through an inverse compositional approach [2, 16]. Gouiffès et al. [4] also considered specular highlights and brightness variations. Such approaches require feature correspondences across stereo images, to reconstruct 3D points which, in turn, are registered in order to estimate the rigid motion. For this purpose, various pose estimation algorithms, based on the SVD [7, 12], unit quaternions [6] or dual quaternions [14], can be used.

Our work is inspired by the existing correlation between image flow and disparity, as demonstrated in [15], and by stereo-motion fusion, using the ratio of disparity rate to disparity, for recovering the rigid motion parameters [8]. We present a novel method for sparse 3D structure recovery and tracking, exploiting specular keypoints of the rigid target object, with the capability of predicting features in the absence of image measurements. This is particularly relevant for on-orbit servicing, where direct sunlight saturates the CCD sensor of the cameras. The approach is based on a bank of iterated, extended Kalman filters (IEKF), using point-wise kinematic models, and therefore does not require any model of the global target geometry or dynamics.

2. Rigid body motion estimation from image sequences

The motion of a 3D surface point $p = [X, Y, Z]^T$ of a rigid body, translating and rotating with velocities $V$ and $\omega$ in the camera frame, respectively, is expressed by

$$\dot{p} = -V - \omega \times p \qquad (1)$$

while the projection onto camera pixel coordinates, through a pin-hole model, is given by

$$x = f_x \frac{X}{Z} + c_x, \qquad y = f_y \frac{Y}{Z} + c_y \qquad (2)$$

where $f_x, f_y$ are the focal lengths, and $c_x, c_y$ the principal point coordinates in pixels. By taking the total time derivative of Eq. (2) and substituting Eq. (1), the motion field and the rigid-body velocities are related by

$$u = A^T b \qquad (3)$$

where $u = [u, v]^T$ is the feature velocity in the image, $b = [V_x, V_y, V_z, \omega_x, \omega_y, \omega_z]^T$ is the rigid-body velocity, and

$$A^T = \begin{bmatrix} -f_x/Z & 0 & \frac{x-c_x}{Z} & \frac{(x-c_x)(y-c_y)}{f_y} & -f_x\big(1+\big(\frac{x-c_x}{f_x}\big)^2\big) & \frac{f_x(y-c_y)}{f_y} \\ 0 & -f_y/Z & \frac{y-c_y}{Z} & f_y\big(1+\big(\frac{y-c_y}{f_y}\big)^2\big) & -\frac{(x-c_x)(y-c_y)}{f_x} & -\frac{f_y(x-c_x)}{f_x} \end{bmatrix} \qquad (4)$$

At each frame, visible keypoints are detected and tracked, as well as matched across stereo views in order to estimate depth Z. Detected features are separately tracked over the sequence; consequently, the rigid body velocity, which is used in the prediction stage of a Bayesian filter, can be estimated by minimizing

$$\hat{b} = \arg\min_b \left\| A^T b - u \right\|^2 \qquad (5)$$
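For illustration, the following minimal numpy sketch stacks the two rows of $A^T$ from Eq. (4) for each tracked feature and solves Eq. (5) in the least-squares sense. The function names and the data layout (pixel positions, flow vectors and depths per feature) are our own assumptions, not part of the original implementation.

```python
import numpy as np

def interaction_rows(x, y, Z, fx, fy, cx, cy):
    """Two rows of A^T (Eq. 4) for one feature at pixel (x, y) with depth Z."""
    xn = (x - cx) / fx          # normalized image coordinates
    yn = (y - cy) / fy
    row_u = [-fx / Z, 0.0, (x - cx) / Z,
             (x - cx) * yn, -fx * (1.0 + xn ** 2), fx * yn]
    row_v = [0.0, -fy / Z, (y - cy) / Z,
             fy * (1.0 + yn ** 2), -(y - cy) * xn, -fy * xn]
    return np.array([row_u, row_v])

def estimate_body_velocity(pts, flows, depths, fx, fy, cx, cy):
    """Solve Eq. (5) for b = [Vx, Vy, Vz, wx, wy, wz] over all features."""
    A = np.vstack([interaction_rows(x, y, Z, fx, fy, cx, cy)
                   for (x, y), Z in zip(pts, depths)])
    u = np.asarray(flows, dtype=float).ravel()   # stacked [u1, v1, u2, v2, ...]
    b, *_ = np.linalg.lstsq(A, u, rcond=None)
    return b
```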

3. 3D structure estimation using IEKFs

The single-point dynamics at discrete time $k$ is

$$p_k = F_k\, p_{k-1} + G\, V_{k-1} + w_k \qquad (6)$$

with

$$F_k = \begin{bmatrix} 1 & \Delta t\,\omega_{z,k} & -\Delta t\,\omega_{y,k} \\ -\Delta t\,\omega_{z,k} & 1 & \Delta t\,\omega_{x,k} \\ \Delta t\,\omega_{y,k} & -\Delta t\,\omega_{x,k} & 1 \end{bmatrix} \qquad (7)$$

$$G = \begin{bmatrix} -\Delta t & 0 & 0 \\ 0 & -\Delta t & 0 \\ 0 & 0 & -\Delta t \end{bmatrix} \qquad (8)$$

$$V_k = \begin{bmatrix} V_{x,k} & V_{y,k} & V_{z,k} \end{bmatrix}^T \qquad (9)$$

where $\Delta t$ is the time interval between two consecutive image captures. The measurement model is

$$z_k = h(p_{k|k-1}) + v_k \qquad (10)$$

where $z_k$ are the feature pairs of the stereo camera, tracked over the sequence and reprojected under the pin-hole model (2). The measurement errors $v_k$ and the motion noise $w_k$ are assumed to be zero-mean, additive Gaussian vectors. For rectified cameras with baseline $T_x$, we have

$$h = \begin{bmatrix} f_{xl}\, X/Z + c_{xl} \\ f_{yl}\, Y/Z + c_{yl} \\ f_{xr}\, (X + T_x)/Z + c_{xr} \\ f_{yr}\, Y/Z + c_{yr} \end{bmatrix} \qquad (11)$$

where $l$ and $r$ indicate the left and right stereo camera, respectively. The $y$ coordinate of the rectified system is common; hence one element of $h$ is redundant and can be discarded. The Jacobian $H_k$ is computed as

$$H_k = \left.\frac{\partial h}{\partial p}\right|_{p = p_{k|k-1}} \qquad (12)$$

with

$$\frac{\partial h}{\partial p} = \begin{bmatrix} f_{xl}/Z & 0 & -f_{xl}\, X/Z^2 \\ 0 & f_{yl}/Z & -f_{yl}\, Y/Z^2 \\ f_{xr}/Z & 0 & -f_{xr}\,(X+T_x)/Z^2 \\ 0 & f_{yr}/Z & -f_{yr}\, Y/Z^2 \end{bmatrix} \qquad (13)$$

For each point, we iterate the filter until convergence. The main advantage of Bayesian filtering here is that, when a complete occlusion occurs or the target momentarily moves out of the field of view, the motion can still be predicted and used later to re-initialize tracking. Furthermore, at each step, features may be selected independently of those chosen from the previous image, for dynamic occlusion handling. Finally, we use robust median statistics for pose estimation, based on the work of [14].
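To make the per-point filter concrete, here is a minimal sketch of the prediction step (Eqs. 6–9) and the iterated update (Eqs. 10–13), with the redundant right-camera $y$ measurement dropped as described above. The noise covariances Q and R, the camera-parameter tuple and the iteration control are illustrative assumptions.

```python
import numpy as np

def predict(p, P, V, w, dt, Q):
    """Prediction, Eqs. (6)-(9): F_k = I - dt*[w]_x, G = -dt*I."""
    wx, wy, wz = w
    F = np.array([[1.0,       dt * wz, -dt * wy],
                  [-dt * wz,  1.0,      dt * wx],
                  [dt * wy,  -dt * wx,  1.0]])
    p_pred = F @ p - dt * np.asarray(V)
    P_pred = F @ P @ F.T + Q
    return p_pred, P_pred

def h_and_H(p, fxl, fyl, cxl, cyl, fxr, cxr, Tx):
    """Stereo pin-hole measurement (Eq. 11) and Jacobian (Eq. 13),
    keeping only one of the two (common) y coordinates."""
    X, Y, Z = p
    h = np.array([fxl * X / Z + cxl,
                  fyl * Y / Z + cyl,
                  fxr * (X + Tx) / Z + cxr])
    H = np.array([[fxl / Z, 0.0,     -fxl * X / Z ** 2],
                  [0.0,     fyl / Z, -fyl * Y / Z ** 2],
                  [fxr / Z, 0.0,     -fxr * (X + Tx) / Z ** 2]])
    return h, H

def iekf_update(p_pred, P_pred, z, R, cam, n_iter=5, tol=1e-6):
    """Iterated EKF update: relinearize h around the refined estimate."""
    p = p_pred.copy()
    for _ in range(n_iter):
        h, H = h_and_H(p, *cam)
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        p_new = p_pred + K @ (z - h - H @ (p_pred - p))
        if np.linalg.norm(p_new - p) < tol:
            p = p_new
            break
        p = p_new
    P = (np.eye(3) - K @ H) @ P_pred
    return p, P
```

Relinearizing $h$ around the refined estimate, rather than the prediction, is what distinguishes the iterated update from a single EKF step.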

Figure 1. The DLR satellite mockup for rendezvous and capture

Figure 2. Projection of the recovered 3D points: correctly reprojected points (green) coincide with tracked features (red).

4. Experimental results

The method has been evaluated at the European Proximity Operations Simulator (EPOS), using two industrial robots simulating a servicer and a client (Fig. 1). The client consists of a rigid structure covered with reflective golden foil. A direct high-power floodlight, simulating the sun, illuminates the surface of the client at different incident angles. We generated several ground-truth trajectories, consisting of roto-translations at a close range of a few meters, while stereo cameras with a small baseline (12 cm) capture images at 1 Hz. About 180 keypoints are detected on the first frame and tracked over the rest of the sequence [11]. The keypoints used for velocity estimation are separately detected and matched in each stereo image, to avoid drift. For all test trajectories, we initialize the extended Kalman filters with zero position for the 3D points and an identity covariance matrix, before the target structure appears in the camera. The reprojection of the reconstructed, sparse 3D points on the image is shown in Fig. 2, where measurements and reprojections are marked with circles of different colors. Fig. 3 displays partial results of absolute pose estimation, obtained by registering the sparse 3D points from the first and the current frame.
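The absolute pose is estimated in the paper with the dual-quaternion method of [14], combined with median statistics. As a stand-in illustration of this registration step, the sketch below uses the SVD-based least-squares solution of [7, 12] with a simple median/MAD inlier selection; it is an assumption for illustration, not the authors' exact estimator.

```python
import numpy as np

def rigid_registration_svd(P, Q):
    """Least-squares R, t such that Q ~ R P + t, via SVD [7, 12]."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)              # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                     # proper rotation (det = +1)
    t = cq - R @ cp
    return R, t

def robust_pose(P, Q, k=2.5):
    """Initial fit, median/MAD-based inlier selection, then refit."""
    R, t = rigid_registration_svd(P, Q)
    res = np.linalg.norm(Q - (P @ R.T + t), axis=1)
    mad = np.median(np.abs(res - np.median(res))) + 1e-12
    inliers = res < np.median(res) + k * mad
    return rigid_registration_svd(P[inliers], Q[inliers])
```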

5. Conclusions

We presented a feature-based, model-free approach for 3D motion estimation of a non-cooperative satellite. First, the translational and rotational velocities of the target are estimated in the camera frame, by utilizing motion flow and stereo correspondences. The estimated rigid motion is used by a bank of iterated Kalman filters to propagate the point cloud to the next frame, where the predicted points are updated with tracked stereo measurements. By registering point clouds, the six DOF of motion are estimated through a robust dual-quaternion method. The experimental results obtained on a simulation mockup demonstrate the robustness and accuracy of the proposed method. Currently, the algorithm requires depth information to estimate the rigid-body velocities, and we are exploring how to integrate this estimation into the filter.

Acknowledgments

We would like to thank Dr. Toralf Boge and Tilman Wimmer for their valuable support in realizing the generated motion trajectories at the EPOS facility.

Figure 3. Left column: estimated absolute Z-axis rotation [deg] and translation [cm] (blue), with ground truth (red), over time [s]. Right column: the respective absolute errors.

References

[1] F. Aghili and K. Parsa. An adaptive vision system for guidance of a robotic manipulator to capture a tumbling satellite with unknown dynamics. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3064–3071, 2008.
[2] S. Baker and I. Matthews. Lucas-Kanade 20 years on: A unifying framework. International Journal of Computer Vision, 56(3):221–255, 2004.
[3] C. Miravet, L. Pascual, E. Krouch, and J. Delcura. An image-based sensor system for autonomous rendez-vous with uncooperative satellites. In 7th International ESA Conference on Guidance, Navigation and Control Systems, June 2008.

[4] M. Gouiffès, C. Collewet, C. Fernandez-Maloigne, and A. Trémeau. Feature points tracking: Robustness to specular highlights and lighting changes. In 9th European Conference on Computer Vision (ECCV), pages 82–93, 2006.
[5] U. Hillenbrand and R. Lampariello. Motion and parameter estimation of a free-floating space object from range data for motion prediction. In 8th International Symposium on Artificial Intelligence, Robotics and Automation in Space (i-SAIRAS), 2005.
[6] B. K. P. Horn. Closed-form solution of absolute orientation using unit quaternions. Journal of the Optical Society of America A, 4(4):629–642, 1987.
[7] K. S. Arun, T. S. Huang, and S. D. Blostein. Least-squares fitting of two 3-D point sets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9:698–700, 1987.
[8] S. Negahdaripour. Motion recovery from image sequences using only first order optical flow information. International Journal of Computer Vision, pages 163–184, 1992.
[9] S. Negahdaripour. Revised definition of optical flow: Integration of radiometric and geometric cues for dynamic scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(9), 1998.
[10] S. Segal, A. Carmi, and P. Gurfil. Vision-based relative state estimation of non-cooperative spacecraft under modeling uncertainty. In IEEE Aerospace Conference, pages 1–8, 2011.
[11] J. Shi and C. Tomasi. Good features to track. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 593–600, 1994.
[12] S. Umeyama. Least-squares estimation of transformation parameters between two point patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(4):376–380, 1991.
[13] F. Terui. Model based visual relative motion estimation and control of a spacecraft utilizing computer graphics. In 21st International Symposium on Space Flight Dynamics, Toulouse, France, 2009.
[14] M. W. Walker and L. Shao. Estimating 3-D location parameters using dual number quaternions. CVGIP: Image Understanding, 54(3):358–367, 1991.
[15] A. M. Waxman and J. H. Duncan. Binocular image flows: Steps toward stereo-motion fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), 1986.
[16] T. Zinsser, C. Gräßl, and H. Niemann. Efficient feature tracking for long video sequences. In Pattern Recognition, 26th DAGM Symposium, 2004.
