An EKF-SLAM Method with Filter Consistency Test for Mobile Robots using a 3D Camera Cang Ye, Senior Member, IEEE, Yimin Zhao
Abstract—This paper presents a new Extended Kalman Filter (EKF) for Simultaneous Localization and Mapping (SLAM) of a mobile robot using a 3D Time-Of-Flight (TOF) camera. The proposed method employs an egomotion estimation algorithm to estimate the camera's egomotion, which is then used as the motion model by the EKF to estimate the camera's pose in the world coordinate system by tracking a set of visual features from the previous intensity images. Unlike existing work that attempts to maintain the EKF's consistency by building local maps and then joining the maps into a global one, the proposed method tests the EKF's consistency at each step and restarts the filter whenever it fails the test. By doing so, the proposed EKF retains its consistency and therefore improves the accuracy of SLAM. The filter consistency is evaluated online by a chi-square test on the time-averaged normalized innovation squared of each stable feature's observation. Experimental results in various indoor and outdoor terrains demonstrate that the proposed approach substantially improves the accuracy of SLAM and thus the accuracy of the 3D maps of the environments.
I. INTRODUCTION
Simultaneous Localization and Mapping (SLAM) [1] is an essential capability for a mobile robot to explore unknown environments. In a SLAM problem, a robot has to utilize relative measurements gathered by its onboard sensor(s) to localize itself and map the environment gradually [2] due to the absence of global information. In the literature, a great deal of work [3], [4], [5], [6], [7] has been devoted to addressing the SLAM problem. The prevailing SLAM methods can be classified into two categories: state-filter-based SLAM [3], [4], [5] and pose-graph SLAM [6], [7]. EKF-SLAM is a representative method in the first category. In the context of online Pose Estimation (PE) in this paper, a graph SLAM method needs to compute multiple egomotions between the current step and numerous earlier steps, while an EKF-based method only needs to compute one egomotion between the current step and the previous step. As computing egomotion may incur expensive computation, the graph method's advantage over the EKF method in computational cost is somewhat compromised, making the EKF method a good alternative if a limited number of features is used for state update. Also, an EKF method requires much less storage space than a graph method. In this work, we choose to use an EKF-SLAM method for 6-DOF PE of a robot equipped with a 3D TOF camera.
This work was supported in part by the National Science Foundation Grant IIS1017672, NIH Grant R01EB018117 and NASA Grant NNX13AD32A. C. Ye and Y. M. Zhao are with the System Engineering Department, University of Arkansas at Little Rock, Little Rock, AR 72204, USA (e-mail: [email protected]; [email protected]).
As the proposed EKF-SLAM method tracks only a limited number of visual features sampled from previous images, its computational cost is affordable. The main problem that may hamper the method's application in a large environment is that the EKF may become inconsistent due to the linearization errors of the motion and observation models. The linearization errors increase the robot state uncertainty over time and eventually drive the EKF out of consistency, resulting in a large PE error. Recently, EKF-SLAM consistency has drawn researchers' attention and some approaches have been proposed to analyze EKF-SLAM inconsistency [8], [9], [10], [11], [12], [13], [14]. Bailey et al. [8] demonstrate that EKF-SLAM consistency can be maintained for a long period of time if the robot's heading uncertainty (2-D case) is small and the filter is robust to process/observation noises and nonlinearities and is assisted by stabilising noise. Huang and Dissanayake [10] investigate the fundamental cause of the inconsistency for two basic scenarios of a 2-D EKF-SLAM problem with point landmarks observed by a range-and-bearing sensor and provide theoretical proofs. They reveal that EKF-SLAM inconsistency is due to: i) the violation of some fundamental constraints governing the relationship between various Jacobians when they are evaluated at the current state estimate; and ii) the use of relative location information from robot to landmarks to update the estimates of the absolute robot state and landmark locations. It is also shown that the robot orientation uncertainty makes the largest contribution to EKF-SLAM inconsistency. In theory, the convergence property of EKF-SLAM hinges on the Jacobians being evaluated at the true robot pose with the true landmark positions. However, these true values are unavailable in a real-world SLAM problem. Consequently, the Jacobians must be evaluated using the estimated values, resulting in inconsistent state estimates.
A larger robot orientation uncertainty leads to larger landmark uncertainties and thus more significant inconsistency. The above-mentioned work provides some insights into the EKF-SLAM inconsistency issue. In principle, the inconsistency induced by linearization errors is inherent to EKF-SLAM and cannot be removed as long as there is uncertainty in the robot pose. In our case, the noises in egomotion estimation and visual feature observations are not exactly zero-mean Gaussian. This may exacerbate the inconsistency problem and drive the EKF out of consistency faster. To mitigate EKF-SLAM inconsistency, Huang et al. [12] propose to compute the state-propagation Jacobians using the robot position estimate prior to state update (instead of the updated estimate). Simulation results show that this way of computing the Jacobians results in better EKF consistency. However, it is unclear what causes the consistency improvement. To date, the practical remedy is to build local maps with a fixed size (m time steps) and then join them into one [15], [16]. However, it is difficult to determine a suitable m-value ahead of time to ensure EKF consistency for each local map.
In [19], we also demonstrated that the Pose Change Estimation (PCE) error of the VRO can be approximated by a Gaussian distribution. Therefore, an EKF may be employed to reduce the accumulated pose error.
In this paper, we propose a new method to retain the consistency of EKF-SLAM. The method performs an online EKF consistency test at each step using the time-averaged normalized innovation squared [17] of each stable feature. If the test fails, the EKF is reset and a new EKF-SLAM process begins. The method is similar to, but distinct from, [18] in that it automatically determines an appropriate m-value for each EKF-SLAM process (i.e., each submap building process).
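The consistency test described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the per-feature NIS history structure are assumptions, while the two-sided acceptance interval [0.1026, 5.9915] (the 0.05 and 0.95 quantiles of a chi-square distribution with 2 degrees of freedom), the 5-step stability threshold and the 70% pass ratio are the values given later in Section IV.

```python
import numpy as np

# Two-sided 90% acceptance interval for a chi-square variable with
# 2 degrees of freedom (alpha = 0.1, values used in Section IV).
R1, R2 = 0.1026, 5.9915

def nis(innovation, S):
    """Normalized innovation squared of one feature observation."""
    return float(innovation.T @ np.linalg.solve(S, innovation))

def filter_is_consistent(nis_histories, min_track=5, pass_ratio=0.7):
    """EKF consistency test: each stable feature (tracked >= min_track
    steps) must have its time-averaged NIS inside [R1, R2].  The filter
    is declared consistent when more than pass_ratio of the stable
    features pass; otherwise the caller should reset the EKF."""
    stable = [h for h in nis_histories if len(h) >= min_track]
    if not stable:
        return True  # nothing to test yet
    passed = sum(1 for h in stable if R1 < np.mean(h) < R2)
    return passed / len(stable) > pass_ratio
```

A time-averaged NIS well above R2 indicates the filter underestimates its covariance (overconfidence); one below R1 indicates overestimation.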
We have also shown in [19] that a non-zero incident angle of the camera’s light beam results in a smaller signal amplitude and a larger noise in PCE. In our application here, the IR light’s incident angle is always non-zero due to the SR4000’s initial pitch angle. This may induce a relatively large PCE error that may affect the EKF-SLAM consistency and therefore deteriorate its PE performance.
The paper is organized as follows. In Section II, the robot platform and the 3D camera used for PE are introduced. In Section III, the proposed PE method is described. In Section IV, the EKF consistency check and filter reset are described. In Section V, the experimental results are presented, and the paper is concluded in Section VI.
II. ROBOT PLATFORM AND SR4000
A Pioneer P3-AT robot as depicted in Fig. 1 is used for the experiments in this work. The robot uses a 3D TOF camera, the SwissRanger SR4000, for perception. The camera is mounted on a camera post with a −30° tilt angle. It has a spatial resolution of 176×144 pixels and a field of view of 43.6°×34.6°. The camera illuminates its environment with modulated infrared light. Based on phase-shift measurement, it detects ranges up to 10 meters (with 1 cm accuracy) for every point sensed by the imaging pixels. The SR4000 produces both range and intensity images simultaneously at a rate of up to 54 frames per second. In this work, the intensity and range data are used for robot PE and terrain mapping, and the range data are also used for obstacle avoidance.
To address this issue, in our previous work [20], we proposed an EKF method for indoor SLAM of a smart cane. The method extracts the Floor Plane (FP) from the SR4000's range data and measures the distance between the camera and the FP, which is treated as the observation of the camera's y-coordinate. The scheme limits the error and uncertainty of the y-coordinate and thus those of the cane's pose. However, this approach is not applicable to outdoor terrain where no FP can be extracted. In this paper, we propose an EKF with online consistency evaluation and filter reset for robot PE. In the proposed method, the state vector x_k at step k is the concatenation of the robot's state (pose) x_sk and a stacked vector of m SIFT features maintained in a local map M_k = [f_k^1; f_k^2; ...; f_k^m]. x_sk is a 7-dimensional vector consisting of the position r_k = [x_k; y_k; z_k] and the orientation quaternion q_k = [q1; q2; q3; q4] of the robot in the world coordinate system. The state covariance matrix P_k contains the correlation information between the elements of the state vector. The world coordinate system X_W Y_W Z_W and the camera coordinate system X_C Y_C Z_C are depicted in Fig. 1. With the robot on a horizontal plane, X_W points rightward, Y_W downward and Z_W forward. X_W Y_W Z_W is fixed at the beginning (k=0) of navigation. The state of the robot (i.e., the camera) in X_W Y_W Z_W is then initialized as

  x_s0 = [0; 0; 0; q_0(W_C R)],    (1)

where W_C R is the rotation of X_C Y_C Z_C with reference to X_W Y_W Z_W and q_0(W_C R) is its quaternion representation at k=0.
A. EKF State Estimation Loop
The proposed method is iterative and each iteration consists of six stages, as shown in Fig. 2. They are briefly described as follows:
Fig. 1. The P3-AT robot equipped with the SR4000 camera
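The state initialization of Eq. (1) can be instantiated with a short sketch. The scalar-first quaternion convention and the choice of the X axis as the camera's tilt axis are assumptions made here for illustration; the −30° mounting tilt is from Section II.

```python
import math

def quat_from_pitch(pitch_rad):
    """Unit quaternion [qw, qx, qy, qz] for a rotation of pitch_rad
    about the X axis (assumed tilt axis of the camera in Fig. 1)."""
    half = 0.5 * pitch_rad
    return [math.cos(half), math.sin(half), 0.0, 0.0]

# Initial robot/camera state per Eq. (1): zero position, orientation set
# to the quaternion of the camera-to-world rotation at k = 0.
TILT_DEG = -30.0  # SR4000 mounting tilt from Section II
q0 = quat_from_pitch(math.radians(TILT_DEG))
x_s0 = [0.0, 0.0, 0.0] + q0  # 7-dimensional state [x; y; z; q1; q2; q3; q4]
```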
III. THE PROPOSED METHOD
In our previous work [19], we proposed an SR4000-based egomotion estimation method, called Visual-Range Odometry (VRO), for determining the camera's 6-DOF Pose Change (PC) between two camera views. By integrating the VRO's PC over time, we may obtain the camera's pose in the world coordinate system. However, this dead-reckoning approach may accrue a large pose error over time.
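The dead-reckoning integration mentioned above can be sketched as follows. This is a generic illustration under the scalar-first quaternion convention, not the VRO implementation itself: it simply chains the per-step pose changes (Δr, Δq) that a VRO-like front end would return, which is why errors accumulate without bound.

```python
import math
import numpy as np

def quat_mult(q, p):
    """Hamilton product of two quaternions [w, x, y, z]."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = p
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2])

def rot_matrix(q):
    """Rotation matrix of a unit quaternion [w, x, y, z]."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)]])

def integrate_vro(pose_changes, r0=np.zeros(3), q0=np.array([1., 0., 0., 0.])):
    """Dead-reckon the camera pose by chaining per-step pose changes
    (dr, dq): rotate each translation increment into the world frame,
    then compose the orientation.  No filtering is applied."""
    r, q = r0.copy(), q0.copy()
    for dr, dq in pose_changes:
        r = r + rot_matrix(q) @ np.asarray(dr, float)
        q = quat_mult(q, np.asarray(dq, float))
        q /= np.linalg.norm(q)  # guard against numerical drift
    return r, q
```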
1) Preprocessing: In this stage, PC calculation (by the VRO) and map management are performed. The PC is computed by the VRO method by matching the SIFT features between images k and k-1. To keep the computation affordable, the method maintains a feature map that contains a set of SIFT features. A map management scheme is devised to discard obsolete features and add new features. The VRO and the map management scheme are described in Section III.B.
2) SIFT feature correspondences: In this stage, the method predicts the state and its covariance using the PC for step k. It then computes the predicted observation and its covariance for each SIFT feature. They are used to determine a region on
the image (at step k) within which a SIFT feature match should be located. If a SIFT feature match is found within the region, a so-called Individually Compatible Match (ICM) is found and a feature correspondence is created. This process is performed on all predicted feature observations and results in a set of ICMs denoted by z_ICM.

Input: x̂_{k-1|k-1}, P_{k-1|k-1} (EKF estimate at step k-1), Frame_k, Frame_{k-1}, τ (threshold determining low-innovation inliers)
Output: x̂_{k|k}, P_{k|k} (EKF estimate at step k)
A) Preprocessing
1:  u_{k-1} = [Δr_{k-1}, Δq_{k-1}] = VRO(Frame_k, Frame_{k-1})
2:  M = MAP_MANAGEMENT(M)
B) EKF prediction and individually compatible matches
3:  [x̂_{k|k-1}, P_{k|k-1}] = PREDICT(x̂_{k-1|k-1}, P_{k-1|k-1}, u_{k-1})
4:  [ĥ_{k|k-1}, S_{k|k-1}] = PREDICT_OBSERVATION(x̂_{k|k-1}, P_{k|k-1})
5:  z_ICM = SEARCH_IC_MATCHES(ĥ_{k|k-1}, S_{k|k-1})
C) 3-point hypothesis generation and evaluation
6:  z_k^li = [ ]
7:  for i = 0 to n_hyp (n_hyp starts with a big value and is updated in the loop)
8:      z_i = SELECT_3_TRACKED_FEATURES(z_ICM)
9:      x̂_i = EKF_STATE_UPDATE(z_i, x̂_{k|k-1})
10:     ĥ_i = PREDICT_ALL_MEASUREMENTS(x̂_i)
11:     z_i^hs = FIND_MATCHES_WITHIN_THRESHOLD(z_ICM, ĥ_i, τ)
12:     if size(z_i^hs) > size(z_k^li)
13:         z_k^li = z_i^hs
14:         ε = size(z_k^li) / size(z_ICM)
15:         n_hyp = log(1−μ) / log(1−ε³)
D) Partial update using low-innovation inliers
16: [x̂_{k|k}, P_{k|k}] = EKF_UPDATE(x̂_{k|k-1}, P_{k|k-1}, z_k^li)
E) Partial update using high-innovation inliers
17: z_k^hi = [ ]
18: for every matched z_j with innovation above the threshold τ
19:     [ĥ_j, S_j] = INDIVIDUAL_FEATURE_PREDICTION(x̂_{k|k}, P_{k|k}, j)
20:     ν_j = z_j − ĥ_j
21:     if ν_j^T S_j^{-1} ν_j < χ²_{2,0.01}
22:         z_k^hi = ADD_MATCH_J(z_k^hi, z_j)
23: if size(z_k^hi) > 0
24:     [x̂_{k|k}, P_{k|k}] = EKF_UPDATE(x̂_{k|k}, P_{k|k}, z_k^hi)
F) EKF consistency check and filter reset
25: C_k = 0  (consistency count)
26: for j = 1 to n_s (n_s: number of stable features)
27:     [ĥ^j_{k|k-1}, S^j_{k|k-1}] = PREDICT_OBSERVATION(x̂_{k|k-1}, P_{k|k-1})
28:     ε_k^j = (z̃_k^j − ĥ^j_{k|k-1})^T (S^j_{k|k-1})^{-1} (z̃_k^j − ĥ^j_{k|k-1})  (NIS for z̃_k^j)
29:     ϵ_k^j = (1/p) Σ_{i=1}^{p} ε^j_{k−i+1}  (time-averaged NIS for z̃_k^j)
30:     if χ²_{2,0.01} < ϵ_k^j < χ²_{2,0.95} then C_k = C_k + 1
31: if C_k / n_s ≤ 0.7
32:     restart a new EKF
Fig. 2. Diagram of the proposed method
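The adaptive hypothesis count used in the Fig. 2 listing (n_hyp = log(1−μ)/log(1−ε³), updated from the current inlier ratio ε) can be written as a small helper. The success probability μ = 0.99 and the cap n_max are illustrative assumptions, not values from the paper.

```python
import math

def adaptive_num_hypotheses(n_inliers, n_matches, mu=0.99, n_max=1000):
    """Number of 3-point RANSAC hypotheses needed so that, with
    probability mu, at least one all-inlier 3-sample is drawn, given
    the current inlier ratio eps = n_inliers / n_matches."""
    eps = n_inliers / n_matches
    if eps <= 0.0:
        return n_max  # no inliers yet: keep sampling up to the cap
    if eps >= 1.0:
        return 1      # every match is an inlier: one sample suffices
    return min(n_max,
               math.ceil(math.log(1.0 - mu) / math.log(1.0 - eps ** 3)))
```

As the best-so-far support grows, ε grows and the required number of iterations shrinks, which is what terminates the loop early.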
3) 3-point hypothesis generation and evaluation: In this stage, a RANSAC process draws a set of three samples from the feature correspondences and generates a state hypothesis by updating the state vector using the samples. The support of the hypothesis is then calculated as the number of features whose innovation values (under the state hypothesis) are smaller than a threshold τ. This process of hypothesis generation and testing is performed a number of times and the most supported hypothesis is retained. The feature correspondences that support this hypothesis are identified as the Low-Innovation Inliers (LIIs) and the rest are outliers.
4) Partial update 1: In this stage, the LIIs z_k^li from stage 3 are used to perform a partial update of the EKF and obtain the updated state that is supported by the majority of the feature correspondences.
5) Partial update 2: In this stage, the outliers from stage 3 are examined using the updated state to find additional inliers z_k^hi. These inliers, called High-Innovation Inliers (HIIs), are then used to perform a further state update.
6) EKF consistency test and filter reset: In this stage, the time-averaged normalized innovation squared of each stable feature (defined in Section IV) among the inliers (generated in stages 3 and 5) is evaluated by a chi-square test to assess the EKF consistency. If the consistency test fails, the EKF is reset and a new EKF process starts. The details of the EKF consistency test are given in Section IV.
The input to the EKF-SLAM algorithm is the state estimate x̂_{k-1|k-1}, its covariance P_{k-1|k-1}, the threshold τ for inlier determination, and the SR4000's data at steps k and k-1. The output is the state estimate x̂_{k|k} and its covariance matrix P_{k|k}. In this work, P_{0|0} is a diagonal matrix whose diagonal elements are the PC variances of the VRO. In the remainder of this section, some technical details of the proposed method are described.
B. Pose Change Calculation and Map Management
In this step, the PC between steps k-1 and k, as well as its covariance, is calculated. Here, the VRO procedure in [21] is adopted with a modification to reduce the pose error. In the VRO process, SIFT features are extracted from the intensity image at step k and matched to the SIFT features of the intensity image at step k-1. After finding the matched features, the corresponding 3D points are obtained from the range data. A RANSAC process is then employed to find the inliers, i.e., the maximum set of data points that supports a hypothesized transformation between the two point sets. From the inliers, the robot PC, Δr_{k-1} and Δq_{k-1}, is computed. The modification made in this paper is to use the SR4000's confidence value of each range measurement to discard unreliable SIFT features before the RANSAC process.
The confidence value represents how reliable the range measurement is. In this work, we discard those SIFT features whose confidence values are smaller than 1/2 of the maximum confidence value. This treatment improves the accuracy of the VRO. To calculate the covariance of the PC, we use the bootstrap approach described in [22]. As the robot moves, new features are detected and added to the feature map. This may result in inordinate growth of the map size. To avoid this problem, the feature map is examined at the beginning of each EKF iteration to detect obsolete features and remove them from the map. For each feature, the ratio of the number of times it is observed to the number of times it is predicted to be visible in the camera's field of view is calculated. A feature with a ratio below 0.5 is deemed obsolete and discarded from the feature map. When the total number of tracked features in the map falls below 50, a feature initialization process is executed to add new features to the map. In order to add features with a relatively long life span, we use a Gaussian-kernel-weighted sampling scheme to randomly draw features from the inliers identified by the VRO process. The Gaussian kernel has its mean at the image center and a standard deviation that places all image pixels within the 2σ range (i.e., the 95% confidence ellipse). New features are randomly selected from the inliers of the VRO using the roulette-wheel selection scheme based on the Gaussian kernel. The sampling scheme exploits the fact that a feature closer to the image center is more likely to remain visible over the next few time steps (i.e., to have a longer life span in the feature map) and is thus selected and added to the map with a higher probability. This reduces the chance of initializing into the map short-lived features that are not useful for EKF tracking.
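The two map-management operations above can be sketched as follows. This is a minimal sketch, not the authors' code: the feature record fields (`n_observed`, `n_predicted_visible`) and the exact kernel width are assumptions; the 0.5 pruning ratio, the 176×144 image size and the center-weighted roulette-wheel draw are from the text.

```python
import numpy as np

def prune_obsolete(features):
    """Drop features whose observed / predicted-visible ratio fell below
    0.5 (the map-management rule of Section III.B).  Each feature is a
    dict with hypothetical fields 'n_observed' and 'n_predicted_visible'."""
    return [f for f in features
            if f['n_predicted_visible'] == 0
            or f['n_observed'] / f['n_predicted_visible'] >= 0.5]

def sample_new_features(candidates_uv, n_new, img_w=176, img_h=144, rng=None):
    """Roulette-wheel draw of new features from the VRO inliers, weighted
    by a Gaussian kernel centred on the image so that features near the
    centre (likely to stay visible longer) are preferred.  Returns the
    indices of the selected candidates."""
    rng = rng if rng is not None else np.random.default_rng()
    cx, cy = img_w / 2.0, img_h / 2.0
    sigma = max(cx, cy) / 2.0  # one plausible choice: 2*sigma reaches the border
    uv = np.asarray(candidates_uv, dtype=float)
    d2 = (uv[:, 0] - cx) ** 2 + (uv[:, 1] - cy) ** 2
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return rng.choice(len(uv), size=min(n_new, len(uv)),
                      replace=False, p=w / w.sum())
```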
C. Motion Model and Jacobians
The motion model of the robot is given by:

  [r_k; q_k; M_k] = f_s(x_{k-1}, u_{k-1}) = [r_{k-1} + R(q_{k-1})Δr_{k-1}; q_{k-1} ⊗ Δq_{k-1}; M_{k-1}],    (2)

where ⊗ denotes the quaternion product and R(q_{k-1}) is the rotation matrix determined by the quaternion q_{k-1}. The input to the motion model, u_{k-1} = [Δr_{k-1}, Δq_{k-1}], represents the PC from step k-1 to k. The Jacobians of the motion model are F_{k-1} = ∂x_k/∂x_{k-1} and G_{k-1} = ∂x_k/∂(Δr_{k-1}, Δq_{k-1}), both evaluated at x̂_{k-1|k-1}. The state vector and its covariance are predicted as follows:

  x̂_{k|k-1} = f_s(x̂_{k-1|k-1}, u_{k-1})
  P_{k|k-1} = F_{k-1} P_{k-1|k-1} F_{k-1}^T + G_{k-1} Q_{k-1} G_{k-1}^T,    (3)

where Q_{k-1} is the covariance matrix of u_{k-1}.
D. Observation Model
In this paper, we use SIFT features as observations in the EKF process. When a new feature is added to the map, it is initialized with its 3D coordinates in the world coordinate system. Its subsequent observation on the camera's image plane is calculated using the state and the camera model h. A pinhole camera model is used. To account for the optical distortion of the SR4000, an optical calibration process was performed to obtain the camera's intrinsic and extrinsic parameters. The observation of each feature at step k is predicted by h using x̂_{sk|k-1} and the feature's coordinates. The predicted observation and its covariance are given by

  ĥ_{k|k-1} = h(x̂_{k|k-1})
  S_{k|k-1} = H_k P_{k|k-1} H_k^T + R_k,    (4)

where H_k = ∂h/∂x evaluated at x̂_{k|k-1} is the Jacobian of the observation model and R_k is the covariance of the observation noise.
E. Hypothesized State Generation and State Update
The 6-DOF PE problem requires three feature correspondences. However, due to the noise in the camera's data, using more correspondences may reduce the noise effect and improve the estimation accuracy. The process of finding the maximum number of correspondences out of the ICMs is achieved by a 3-point RANSAC procedure as follows: (1) three ICMs are randomly drawn to generate a hypothesized state update x̂_i, based on which we calculate the predicted observation ĥ_i that includes predicted observations for all features; (2) a threshold τ is used to find the set of features z_i^hs whose innovations are below τ under hypothesis x̂_i. These features are the supports of the state hypothesis x̂_i. The two-step procedure is repeated a number of times to find the hypothesis with the maximum number of supports. The corresponding features are identified as inliers and denoted by z_k^li. These inliers are called Low-Innovation Inliers (LIIs). They will be used to perform a partial state update to the EKF.
F. Partial State Update Using Low-Innovation Inliers
In this stage, z_k^li is used to perform a partial state update by:

  K_k = P_{k|k-1} H_k^T S_{k|k-1}^{-1}
  x̂_{k|k} = x̂_{k|k-1} + K_k (z_k^li − h(x̂_{k|k-1}))
  P_{k|k} = (I − K_k H_k) P_{k|k-1}.    (5)

G. Partial Update Using High-Innovation Inliers
Some feature correspondences that were classified as outliers in the previous step may become inliers [16] after the above partial state update. These features are called High-Innovation Inliers (HIIs). In this stage, we first apply a χ² test to the outlier points resulting from the previous stage. Points passing the test are identified as HIIs and their corresponding observations z_k^hi are used to perform another partial update by (5). The values of x̂_{k|k} and P_{k|k} obtained from Section III.F are taken as x̂_{k|k-1} and P_{k|k-1} for this update.
IV. EKF CONSISTENCY TEST AND FILTER RESET
After each EKF state update, the EKF consistency is evaluated. If the result is satisfactory, the EKF process continues. Otherwise, the current process is terminated and a new EKF process begins at step k-1 with x̂_{k-1|k-1}, initializing P_{k-1|k-1} with the PC variance of the VRO (between frames k-1 and k). The EKF consistency measure is calculated as follows: let z_k be the set of observations containing the LIIs and HIIs at step k. The set of features out of z_k that have been tracked for more than l steps is identified and denoted by z̃_k. Features in z̃_k are called stable features. For the jth feature z̃_k^j in z̃_k, the predicted observation ĥ^j_{k|k-1} and its covariance S^j_{k|k-1} have been computed earlier by (4). They are used to
calculate the Normalized Innovation Squared (NIS) for z̃_k^j as follows:

  ε_k^j = (z̃_k^j − ĥ^j_{k|k-1})^T (S^j_{k|k-1})^{-1} (z̃_k^j − ĥ^j_{k|k-1}).    (6)
Theoretically, ε_k^j has a χ² distribution with 2 degrees of freedom. The finite-sample consistency property requires that the innovation (based on a finite number of samples) be acceptable as zero-mean and have a magnitude commensurate with the innovation covariance yielded by the EKF [17]. This means that the ensemble average of ε_k^j (over multiple independent samples) has a χ² distribution with 2 degrees of freedom. Assuming that the innovation sequence is ergodic, the ensemble average can be replaced by the time average given by

  ϵ_k^j = (1/p) Σ_{i=1}^{p} ε^j_{k−i+1},    (7)
where p (p ≥ l) is the number of steps that z̃_k^j has been tracked. The confidence interval [r1, r2] for ϵ_k^j to accept the hypothesis H0 that ϵ_k^j is χ²_2-distributed with probability 1−α can be determined using a χ² table. For α = 0.1, we have [r1, r2] = [χ²_{2,0.05}, χ²_{2,0.95}] = [0.1026, 5.9915]. If ϵ_k^j ∈ [r1, r2], z̃_k^j passes the two-sided chi-square test. Otherwise, it fails the test. A right-side/left-side failure indicates an underestimated/overestimated covariance matrix. In our method, the consistency test is performed on all features in z̃_k that satisfy p ≥ 5 (i.e., l = 5). If over 70% of the features pass the test, the EKF is considered consistent. Otherwise, it is inconsistent and thus reset. A new EKF process is then started with new SIFT features initialized at time step k-1.
V. EXPERIMENTAL RESULTS
We carried out numerous experiments in various indoor and outdoor environments. The results show that the EKF-SLAM consistency test and reset scheme substantially improves the PE accuracy. In our experiments, we drove the robot in a looped path, i.e., returned the robot exactly to the start point, and collected a number of datasets. We ran three methods, the VRO-based dead reckoning (VRO), the EKF without filter reset (EKF) and the EKF with Filter Reset (EKF-FR), on each of the datasets and compared their performances. The VRO dead-reckoning results are used as the baselines to showcase the PE improvement of the proposed method. It is noted that this work is not intended to deal with the loop-closure problem; we used a looped trajectory simply because it provides a zero-position-error ground truth for evaluating the PE accuracy of the SLAM methods. In fact, the robot does not move in a looped path in our intended applications. We use the robot's Endpoint Position Error Norm (EPEN), the norm of the camera's final position error expressed as a percentage of the camera's trajectory length, as the performance metric.
Due to the unavailability of ground truth, the camera's path length is estimated by the EKF-FR method. The results of the eight experiments are tabulated in Table I. To demonstrate the performance improvements of the EKF and EKF-FR over the VRO, the two methods' EPEN Reduction (EPENR) for each experiment is computed and tabulated. The average values of these performance metrics, indicating the overall performance of the methods over the eight experiments, are also computed.

TABLE I
PERFORMANCES OF VRO, EKF AND EKF-FR

Experiment        EPEN (%)                                 EPENR (%)
(path-length)     VRO          EKF          EKF-FR        EKF     EKF-FR
1 (37.27 m)       3.56 (0.12)  3.33 (0.05)  1.84 (0.07)    6.5     48.3
2 (39.17 m)       5.87 (0.10)  6.85 (0.09)  4.44 (0.10)  -16.7     24.4
3 (44.53 m)       7.67 (0.15)  6.32 (0.04)  1.27 (0.05)   17.6     83.4
4 (41.28 m)       6.76 (0.10)  4.81 (0.09)  3.63 (0.09)   28.8     46.3
5 (19.72 m)       6.68 (0.03)  5.95 (0.15)  4.51 (0.12)   10.9     32.5
6 (17.66 m)       6.31 (0.20)  4.56 (0.08)  1.01 (0.05)   27.7     84.0
7 (16.45 m)       6.61 (0.11)  1.86 (0.10)  0.77 (0.02)   71.9     88.4
8 (17.59 m)       6.22 (0.12)  1.89 (0.12)  1.27 (0.07)   69.6     79.6
Average           7.10         4.45         2.34           27.0     60.9

Each number in parentheses is the standard deviation of the 10 runs of each method on the same dataset. Experiments 1-4: outdoor; 5-8: indoor.
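The EPEN metric used in Table I can be computed as sketched below. This is a generic illustration, assuming a looped trajectory whose start point serves as the ground-truth endpoint, with the path length taken from the estimated trajectory as described in the text.

```python
import numpy as np

def epen(positions):
    """Endpoint Position Error Norm: norm of the final position error of
    a looped trajectory (the start point is the ground-truth endpoint),
    expressed as a percentage of the estimated path length."""
    p = np.asarray(positions, dtype=float)
    endpoint_error = np.linalg.norm(p[-1] - p[0])
    path_length = np.sum(np.linalg.norm(np.diff(p, axis=0), axis=1))
    return 100.0 * endpoint_error / path_length
```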
From the table, we can see that: (1) the filter reset scheme of the EKF-FR substantially reduces the EPENs and results in an overall much better PE performance; (2) the EKF method sometimes produces only slightly better results (experiments 1 and 5), and sometimes an even worse result (experiment 2), than the VRO because the EKF became inconsistent at some points in these experiments, whereas the EKF-FR always produces much better results. We developed visualization software to examine the proposed method's behavior with respect to feature matching, inlier/outlier detection and the filter consistency test. The software displays the location and trajectory of the camera, the feature map, unmatched features, inliers and outliers, and thus visualizes the state estimation process. This visualization software was helpful in both the algorithm development and validation phases of this work. Fig. 3 shows a screenshot of the video clip (generated from experiment 6) at the time when the robot returned to the start point. In Fig. 3, there are only unmatched features and LIIs.
Fig. 3. A screenshot of the video clip of experiment 6 showing SIFT feature tracking and inlier detection. Left: a '+' with an ellipse denotes the projected position of a feature point on the intensity image. The ellipse depicts the position uncertainty within which a matched SIFT feature is anticipated. Blue indicates an unmatched feature point, red an LII, and pink an HII. Right: the robot's trajectory in the X-Z plane. The feature points (in the feature map) are depicted by red '+'; their uncertainty ellipses are not shown.
The robot trajectories estimated by the three methods for experiment 6 (indoor terrain) and experiment 3 (outdoor terrain) are plotted over the terrain maps (generated using the EKF-FR results) and shown in Fig. 4 and Fig. 5, respectively, from which it can be seen that the VRO has the largest PE
Fig. 4. Experimental results: Robot trajectories estimated by the 3 methods on indoor mock terrain (experiment 6). Sand bags are placed on the ground and covered by outdoor mats to create mock terrain.
error, followed by the EKF method, and the EKF-FR method has the smallest PE error.
Fig. 5. Experimental results: robot trajectories estimated by the 3 methods on outdoor terrain (experiments 3).
VI. CONCLUSION
In this paper, we propose a new EKF-SLAM method for 6-DOF robot PE. The method integrates the robot pose changes (measured by the VRO method) into its absolute pose by tracking a set of SIFT features sampled from the previous images. To mitigate the effect of EKF-SLAM inconsistency, we devise a scheme that evaluates the EKF consistency at each step by a chi-square test on the time-averaged NIS of each stable feature's observation. The scheme resets the current EKF process and starts a new EKF-SLAM process whenever the filter fails the consistency test. Experimental results on various indoor and outdoor terrains demonstrate that the proposed method yields much smaller robot pose errors than those of the dead-reckoning method and the standard EKF-SLAM method.
REFERENCES
[1] H. Durrant-Whyte and T. Bailey, “Simultaneous localization and mapping: part I,” IEEE Robot. Automat. Mag., vol. 13, no. 2, pp. 99-110, 2006.
[2] T. Bailey and H. Durrant-Whyte, “Simultaneous localization and mapping (SLAM): part II,” IEEE Robot. Automat. Mag., vol. 13, no. 3, pp. 108-117, 2006.
[3] S. Thrun et al., “Simultaneous localization and mapping with sparse extended information filters,” Int. J. Robot. Res., vol. 23, no. 7-8, pp. 693-716, 2004.
[4] A. J. Davison et al., “MonoSLAM: real-time single camera SLAM,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 6, pp. 1052-1067, 2007.
[5] G. Dissanayake, P. Newman, S. Clark, H. Durrant-Whyte, and M. Csorba, “A solution to the simultaneous localization and map building (SLAM) problem,” IEEE Trans. Robot. Automat., vol. 17, no. 3, pp. 229-241, 2001.
[6] G. Grisetti et al., “A tutorial on graph-based SLAM,” IEEE Intell. Transp. Syst. Mag., vol. 2, no. 4, pp. 31-34, 2010.
[7] P. Henry et al., “RGB-D mapping: using Kinect-style depth cameras for dense 3D modeling of indoor environments,” Int. J. Robot. Res., vol. 31, no. 5, pp. 647-663, 2012.
[8] T. Bailey, J. Nieto, J. Guivant, M. Stevens, and E. Nebot, “Consistency of the EKF-SLAM algorithm,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 2006, pp. 3562-3568.
[9] J. A. Castellanos, J. Neira, and J. D. Tardos, “Limits to the consistency of EKF-based SLAM,” in Proc. IFAC Symp. Intelligent Autonomous Vehicles, 2004.
[10] S. Huang and G. Dissanayake, “Convergence and consistency analysis for extended Kalman filter based SLAM,” IEEE Trans. Robot., vol. 23, no. 5, pp. 1036-1049, 2007.
[11] J. Andrade-Cetto and A. Sanfeliu, “The effects of partial observability when building fully correlated maps,” IEEE Trans. Robot., vol. 21, no. 4, pp. 771-777, 2005.
[12] G. Huang, A. Mourikis, and S. Roumeliotis, “Analysis and improvement of the consistency of extended Kalman filter based SLAM,” in Proc. IEEE Int. Conf. Robotics and Automation, 2008, pp. 473-479.
[13] Y. H. Lee et al., “The analysis of effect of observation models and data associations on the consistency of EKF SLAM,” in Proc. Int. Conf. on Ubiquitous Robots and Ambient Intelligence, 2011, pp. 359-361.
[14] S. Huang and G. Dissanayake, “Convergence and consistency analysis for extended Kalman filter based SLAM,” IEEE Trans. Robot., vol. 23, no. 5, pp. 1036-1049, 2007.
[15] J. Tardos, J. Neira, P. Newman, and J. Leonard, “Robust mapping and localization in indoor environments using sonar data,” Int. J. Robot. Res., vol. 21, no. 4, pp. 311-330, 2002.
[16] J. Castellanos, R. Martinez-Cantin, J. Tardos, and J. Neira, “Robocentric map joining: improving the consistency of EKF-SLAM,” Robot. Auton. Syst., vol. 55, no. 1, pp. 21-29, 2007.
[17] Y. Bar-Shalom, X. R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation: State Estimation in Discrete-Time Linear Dynamic Systems. New York: John Wiley & Sons, 2001.
[18] J. Tardos, J. Neira, P. Newman, and J. Leonard, “Robust mapping and localization in indoor environments using sonar data,” Int. J. Robot. Res., vol. 21, no. 4, pp. 311-330, 2002.
[19] C. Ye and M. Bruch, “A visual odometry method based on the SwissRanger SR-4000,” in Proc. Unmanned Systems Technology XII, SPIE Defense, Security, and Sensing Symposium, Orlando, FL, 2010.
[20] A. Tamjidi, C. Ye, and S. Hong, “6-DOF pose estimation of a portable navigation aid for the visually impaired,” in Proc. IEEE Int. Symp. Robotic and Sensors Environments, 2013, pp. 178-183.
[21] S. Hong, C. Ye, M. Bruch, and R. Halterman, “Performance evaluation of a pose estimation method based on the SwissRanger SR4000,” in Proc. IEEE Int. Conf. Mechatronics and Automation, 2012, pp. 499-504.
[22] M. Deans, C. Kunz, R. Sargent, E. Park, and L. Pedersen, “Combined feature based and shape based visual tracker for robot navigation,” in Proc. IEEE Aerospace Conf., 2005, pp. 339-346.