2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI 2017) November 16 - 18, 2017, Daegu, Korea

Localization Based on Multiple Visual-Metric Maps

Adi Sujiwo1, Eijiro Takeuchi1, Luis Yoichi Morales2, Naoki Akai2, Yoshiki Ninomiya2, Masato Edahiro1

Abstract— This paper presents a fusion of monocular camera-based metric localization, IMU, and odometry in the dynamic environments of public roads. We build multiple vision-based maps and use them simultaneously in the localization phase. In the mapping phase, visual maps are built by employing ORB-SLAM together with accurate metric positioning from LiDAR-based NDT scan matching. This external positioning is utilized to correct for the scale drift inherent in all vision-based SLAM methods. In the localization phase, these embedded positions are used to estimate the vehicle pose in metric global coordinates using solely a monocular camera. Furthermore, to increase system robustness, we also propose the use of multiple maps and sensor fusion with odometry and IMU based on a particle filter. Experimental tests were performed on public roads over as much as 170 km at different times of day to evaluate and compare the localization results of vision-only, GNSS, and sensor fusion methods. The results show that the sensor fusion method offers lower average errors than GNSS and better coverage than the vision-only approach.

Index Terms— Simultaneous Localization and Mapping, Robot Vision Systems

I. INTRODUCTION

Since the DARPA Grand [1] and Urban [2] challenges showed that autonomous car navigation is feasible, autonomous driving has caught the attention of the automotive industry and the robotics community. Current prototype vehicles use expensive sensor configurations based on LiDARs for localization, while vision-based systems are only used for road detection and lane keeping. The motivation of this work is a push toward a reliable and inexpensive localization system that is interoperable with existing motion planners. As a candidate, a monocular camera satisfies the cost requirement but cannot work with a motion planner on its own, not to mention other limitations such as sensitivity to environmental changes. The primary argument against vision systems for localization is that they are prone to scale drift in the long run [3]; this prevents vision systems from providing the metric pose output required for interoperability with motion planning [4]. In the approach presented in this work, metric information is embedded into individual frames of the visual maps and utilized during vehicle localization. The availability of metric information makes it possible to combine the output of the vision localizer with other sensor modalities which usually work in metric coordinates, i.e., GNSS and odometry. Most importantly, a consistent metric frame enables direct system integration with path and motion planners.

Fig. 1. A car equipped with high-grade sensors builds several accurate visual metric maps. Consumer-grade cars with monocular camera and odometry utilize the maps to localize themselves.

To attach real-world metric information to the frames of the visual maps, we use a LiDAR and high-definition maps as an external localization framework during the mapping phase. In the past years, the availability of high-definition maps has been increasing, and mapping coverage is expected to extend further [5]. We expect a few vehicles equipped with expensive LiDAR equipment to build the visual maps, while the rest would only require consumer-grade cameras to localize themselves. Fig. 1 shows the concept of this study, which consists of two main steps: off-line mapping and on-line localization. The off-line mapping uses a single monocular camera running the ORB-SLAM framework [6] to build a number of visual feature maps. As maps from vision-based SLAM lack accurate metric information, a laser-based localization system based on Normal Distributions Transform (NDT) scan matching [7] runs in parallel to the visual SLAM. NDT localization is consistently accurate because it matches against high-definition 3D point cloud maps. The number of produced maps is arbitrary, but it may reflect various situations (for example, one map may be made from vehicle runs in winter while others may contain frames from summer or autumn). Meanwhile, the on-line localization requires only a single monocular camera. As visual features are computed by ORB-SLAM in each frame, the corresponding metric information is retrieved and scale and deformation correction is performed to achieve metric localization. The contributions of this work are threefold:

1 Graduate School of Information Science, Nagoya University [email protected] 2 Intelligent Vehicle Research Division, Institute of Innovation for Future Society, Nagoya University



1) The proposal of a localization approach based on multiple visual feature maps with metric information.
2) A sensor fusion framework made possible by the availability of metric information.
3) Experimental evaluation of the system in outdoor environments over 170 km of traversed data.

The rest of the paper is organized as follows. Section II introduces related work on visual localization. In Section III, we describe our metric localization system. Section IV explains the experimental settings and Section V presents the results and discussion. Finally, conclusions are given in Section VI.

II. RELATED WORKS

Localization and mapping has shifted away from static environments toward dynamic ones, where different solutions are tested against long-term datasets. This section describes related work on long time-interval mapping, localization and navigation based on laser and vision approaches.

Early efforts for vision-based mapping and localization were centered on topological methods, such as FAB-MAP [8]. Another notable work is [9], which proposed a method for topological global localization based on adaptive mapping with an omni-directional camera, where system performance is measured in terms of correct localization percentage. Later, the work in [10] presented a visual mapping system using a stereo camera which updates a metric map; it showed the importance of robust visual place recognition to update the map and recover from localization errors. In a large dataset context, the work in [11] presented a monocular vision approach for long-term navigation in dynamic environments based on multiple visual maps (experiences) taken at different times and under different environmental conditions, capable of localizing a vehicle over several kilometers. An extension of that work presented a probabilistic approach to prioritize the loading of maps in order to reduce computation time [12]. Other visual approaches exist, such as Maddern et al. [13], who presented a graphical appearance SLAM with loop closure and no metric representation. The work in [14] proposed a visual localization solution utilizing HOG-based descriptions of the images; an improvement was presented by Vysotska et al. in [15], where GNSS priors were used. The work in [16] proposed a dual-likelihood localization system (with similar and different likelihood condition models) capable of selecting the correct model online. An on-line landmark selection scheme for visual localization, which selects only relevant information, was also proposed in [17].

Most current methods for robot motion planning and control require an accurate state of the robot in the environment. In turn, this requirement dictates that localization must output the position in metric space, which is difficult to realize with vision-based topological methods. The major problem of visual localization in metric space is that monocular visual SLAM solutions relying on SfM (Structure from Motion) are prone to scale drift in the long run, because 3D point recovery from multiple 2D views has an unknown scale factor. Therefore, most metric-based SLAM methods in the autonomous vehicle community are based on LiDAR sensors.

The works most closely related to the research presented in this paper are those described by Mühlfellner et al. in [18] and Grimmett et al. in [19], where visual and metric information are used. The former work presents a localization and mapping approach for changing environments.

A discussion of the trade-off between localization performance and computational complexity is presented, and a metric evaluation based on an INS system is provided. Meanwhile, the latter work integrates metric and semantic maps in an automated parking scenario. Recently, another similar work was presented in [20]. In that paper, 3D feature points reconstructed by local bundle adjustment are directly matched with a 3D LiDAR map; in other words, the method is based on geometric-information scan matching even though a monocular camera is used. Since the geometry of an environment is more stable than its appearance, it is robust against seasonal changes. Our proposed method also uses a 3D LiDAR map, but our objective is to ensure consistency between the multiple visual maps and the metric frame. This improves robustness against appearance changes caused by seasonal variation.

III. VISUAL-METRIC LOCALIZATION

Our visual localization system is based on our previous work [21], which will be described briefly here. The system is based on ORB-SLAM [6], which works using a monocular camera. In our work, the estimated poses from the LiDAR-based method are assumed to be ground truth, since their accuracy is sufficient to evaluate the vision-based system.

A. Visual SLAM Using ORB-SLAM

ORB-SLAM builds the map using 3D reconstruction from multiple sequential views of a scene [6], based on structure from motion (SfM). SfM is the process of estimating sparse 3D structures from 2D image sequences concurrently with camera motion estimation [22]. In the process, ORB-SLAM employs visual odometry to track static 3D structures and infer the camera motion. A resulting ORB-SLAM map is built from a single run of the vehicle in a given environment. This map essentially consists of a set of distinct keyframes and keypoints. Each keyframe stores a camera pose in 3D and a list of 2D feature points that are used for place recognition. Meanwhile, each keypoint represents the 3D position of a single ORB feature point. These keypoints are used to fine-tune the rough pose estimate derived from place recognition in the localization phase. It must be noted that the 3D coordinates output by ORB-SLAM are not metric and are prone to scale distortion.
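For illustration only, the map structure described above can be pictured with the following minimal sketch in Python; the class and field names (Keyframe, MapPoint, VisualMetricMap, pose_metric, ...) are ours and do not correspond to ORB-SLAM's internal data structures.

```python
# Illustrative sketch of a visual-metric map; all names are assumptions.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple
import numpy as np

@dataclass
class Keyframe:
    stamp: float                               # capture time [s]
    pose_slam: np.ndarray                      # 4x4 camera pose in the (non-metric) SLAM frame
    features_2d: List[Tuple[float, float]]     # 2D ORB feature points, used for place recognition
    pose_metric: Optional[np.ndarray] = None   # NDT pose embedded during mapping (Sec. III-B)

@dataclass
class MapPoint:
    position: np.ndarray                       # 3D position of a single ORB feature point
    descriptor: bytes = b""                    # ORB descriptor (placeholder)

@dataclass
class VisualMetricMap:
    keyframes: List[Keyframe] = field(default_factory=list)
    map_points: List[MapPoint] = field(default_factory=list)
```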

Fig. 2. Off-line visual mapping with metric information: "n" visual-metric relations are computed.


Fig. 3. High-definition 3D point cloud map.

Fig. 5. Coordinate conversion between ORB-SLAM and metric coordinates.

B. Visual-Metric Relation

In order to correct for scale drift, an accurate pose in metric space for each keyframe of the ORB-SLAM map is required as a reference in the localization phase. In this research, we employ the 3D NDT scan-matching method with a 3D LiDAR to provide this pose reference. NDT scan matching requires a high-definition 3D point cloud map (an example is shown in Fig. 3), which allows us to estimate the vehicle pose with an accuracy on the order of centimeters.

As described in Fig. 2, the logged data, which consist of point clouds and image frames, are processed through two different flows by NDT and ORB-SLAM. The image frame sequences from the monocular camera are processed by ORB-SLAM to build the map for that particular run. Meanwhile, in parallel, NDT consumes point cloud data from the 3D LiDAR scans and matches them against the previously-built 3D point cloud map to acquire the vehicle's metric pose. This metric pose is then embedded into each resulting keyframe of the ORB-SLAM map, in addition to the keyframe's own estimated pose. To generate multiple maps, this process is repeated for each mapping run. As a formalization, and as a reference for the scale correction later in the localization phase, we refer to keyframe poses in feature space and metric space as P_k and P_n, respectively. Each pose P consists of a translation vector t = (x, y, z)^T and a rotation matrix R.
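A minimal sketch of this embedding step, continuing the Keyframe structure sketched in Sec. III-A, is shown below. It assumes keyframes and NDT estimates are associated by nearest timestamp; the function name and tolerance value are illustrative and not taken from the actual implementation, which stores the pose inside the ORB-SLAM map itself.

```python
# Sketch: attach the nearest-in-time NDT metric pose to every keyframe.
# In practice one could also interpolate between the two neighbouring NDT
# poses; a simple nearest-neighbour association is shown here.
from bisect import bisect_left

def embed_metric_poses(keyframes, ndt_stamps, ndt_poses, max_dt=0.05):
    """keyframes : objects with .stamp and .pose_metric (see Sec. III-A sketch)
       ndt_stamps: sorted timestamps of the NDT estimates [s]
       ndt_poses : 4x4 metric poses, same order as ndt_stamps
       max_dt    : association tolerance [s]; NDT poses arrive at 10 Hz."""
    for kf in keyframes:
        i = bisect_left(ndt_stamps, kf.stamp)
        # candidate NDT estimates: the ones just before and just after the keyframe
        candidates = [j for j in (i - 1, i) if 0 <= j < len(ndt_stamps)]
        if not candidates:
            continue
        j = min(candidates, key=lambda j: abs(ndt_stamps[j] - kf.stamp))
        if abs(ndt_stamps[j] - kf.stamp) <= max_dt:
            kf.pose_metric = ndt_poses[j]
    return keyframes
```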

C. Visual Localization in Metric Frame

In the localization phase, the system only uses a monocular camera to obtain the metric pose, as shown in Fig. 4. Here, ORB-SLAM is used for: 1) place recognition, as a means of global localization for initialization and for recovery in the event of tracking loss; and 2) visual odometry, matching each image frame's features against map keyframes to estimate the pose. Both functions of ORB-SLAM yield a camera pose in feature space. By leveraging the metric poses previously embedded in each keyframe, we can convert the position reported by ORB-SLAM into world metric space, assuming that scale changes over small movements are uniform (see Fig. 5 for reference).

To perform scale correction, the robot pose in metric space P_c is predicted by scaling up the ORB-SLAM pose P_o. First, we search for the nearest keyframe in the ORB-SLAM frame, P_k, and its offset keyframe P'_k; their positions in the metric frame are P_n and P'_n. Next, the scale factor s is calculated as

    s = \frac{\lVert t'_n - t_n \rVert}{\lVert t'_k - t_k \rVert}.    (1)

P'_r is the transformation from P_n to P_c, computed by the following expression:

    P'_r = \begin{pmatrix} R(q_r) & s\,t_r \\ 0^T & 1 \end{pmatrix},    (2)

where R(q_r) is the rotation component of P_r. Lastly, the final pose in the metric frame is derived from

    P_c = P_n P'_r.    (3)
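The scale correction can be written compactly with 4x4 homogeneous matrices. The sketch below assumes P_r is the relative transform from the nearest keyframe P_k to the current ORB-SLAM pose P_o (the text only states that P'_r maps P_n to P_c, so this is our reading); variable names are illustrative.

```python
# Sketch of the scale correction in Eqs. (1)-(3), using 4x4 homogeneous matrices.
import numpy as np

def scale_corrected_pose(P_o, P_k, P_k_off, P_n, P_n_off):
    """P_o            : current camera pose from ORB-SLAM (feature space, 4x4)
       P_k, P_k_off   : nearest keyframe and its offset keyframe in feature space
       P_n, P_n_off   : the metric poses embedded in those two keyframes"""
    # Eq. (1): local scale from the keyframe offsets in both frames
    s = np.linalg.norm(P_n_off[:3, 3] - P_n[:3, 3]) / \
        np.linalg.norm(P_k_off[:3, 3] - P_k[:3, 3])
    # relative motion in feature space (assumed): P_r = P_k^-1 * P_o
    P_r = np.linalg.inv(P_k) @ P_o
    # Eq. (2): keep the rotation, scale the translation
    P_r_prime = np.eye(4)
    P_r_prime[:3, :3] = P_r[:3, :3]
    P_r_prime[:3, 3] = s * P_r[:3, 3]
    # Eq. (3): final pose in the metric frame
    return P_n @ P_r_prime
```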

D. Localization Using Multiple Maps

Fig. 4. On-line camera-only localization computes “n” metric poses. A final pose is estimated using a particle filter with odometry information.

During our experiments, we found that maps of the same location but created at different times delivered distinct results. The use of multiple maps gives the system redundancy by increasing its coverage. This is done by running multiple instances of the ORB-SLAM process with the same input data, each instance using a distinct map created at a different time; a minimal sketch of this scheme is given below.

It is important to emphasize the role of the scale- and deformation-corrected metric coordinates assigned to each keyframe of ORB-SLAM. With this correction, sensor fusion with other metric sensor modalities can be performed right after visual feature matching and metric coordinate retrieval.
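As referenced above, a minimal sketch of the multi-map scheme follows; the localizer interface (a track method returning a metric pose or None when lost) is assumed for illustration and is not the actual ROS interface used in the system.

```python
# Illustrative sketch only: each map is served by its own localizer instance
# (in the actual system, separate ORB-SLAM processes). Every camera frame is
# passed to all instances, and the metric poses of those that are not lost
# are collected as measurements for the fusion step of Sec. III-E.
def localize_on_all_maps(frame, localizers):
    measurements = []
    for loc in localizers:           # one localizer per visual-metric map
        pose = loc.track(frame)      # hypothetical call: metric pose or None if lost
        if pose is not None:
            measurements.append(pose)
    return measurements              # 0..n poses, n = number of maps
```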


TABLE I
TIME AND CONDITION FOR MAPPING AND TESTING RUNS

Datasets    Start   Stop    Weather
01-21 *     13:40   15:00   Overcast
01-21 N *   15:10   17:00   Overcast
01-26 *     09:00   11:00   Sunny
01-29       11:10   12:30   Rainy
02-03       09:00   11:00   Sunny
02-05       13:40   15:00   Sunny
02-09       13:40   15:00   Rainy
02-12       08:30   09:50   Cloudy
02-12 N     13:20   14:20   Cloudy
02-24       10:30   11:50   Sunny
(* runs used for mapping)

Fig. 6. Experiment car and equipped sensors.

E. Augmenting ORB-SLAM Localization with Odometry

To combine the output of the visual metric localization with odometry data, we use a particle filter as described in [23]. We assume that the vehicle moves in a 2D plane; the robot state at time t is represented by its position and orientation in the plane as x_t = (x_t, y_t, θ_t). Control data from the vehicle wheels arrive as linear and angular velocities, represented as u_t = (v_t, ω_t). The particle filter maintains a sample set of M "particles". These samples are updated by the control variables u_t and the ORB-SLAM measurements z, which are the estimated poses from all available maps. The particle filter then resamples particles proportionally to their fitness against z, expressed as a weight w.

In order to account for errors in v and ω, we introduce noise into v and ω. The noise is assumed to be Gaussian with standard deviations α_1 and α_2, which are device-specific and must be determined by experiment. Meanwhile, for the measurement model, each particle's weight is determined from its distance to all ORB-SLAM measurements. Here, each ORB-SLAM measurement is assumed to be independent and may contain noise. The weight w_n of a particle with state X_t = (x', y', θ') against ORB-SLAM measurement Z_n = (x_n, y_n, θ_n) (n being the number of maps) is computed with respect to the measurement noise Σ, the covariance matrix representing the measurement errors of ORB-SLAM in the lateral, longitudinal and yaw directions:

    \Sigma = \begin{pmatrix} \sigma_x & 0 & 0 \\ 0 & \sigma_y & 0 \\ 0 & 0 & \vartheta \end{pmatrix},    (4)

where

    w_n = \exp\left(-\tfrac{1}{2} (X_t - Z_n)^T \Sigma^{-1} (X_t - Z_n)\right).

The final particle weight is given by w = max(w_1, w_2, ..., w_n). The values of σ_x, σ_y and ϑ should be determined experimentally; in our implementation, we use the average error values of the metric vision localization for these noise parameters.
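The measurement update above can be sketched as follows; function and variable names are illustrative, and the yaw difference is wrapped to [-π, π], a detail the text does not spell out.

```python
# Sketch of the measurement model: each particle is weighted against every
# available map measurement Z_n with the Gaussian model of Eq. (4), and the
# best-matching map determines the particle weight.
import numpy as np

def particle_weight(particle, measurements, sigma_x, sigma_y, sigma_theta):
    """particle: (x, y, theta); measurements: list of (x_n, y_n, theta_n)."""
    Sigma_inv = np.linalg.inv(np.diag([sigma_x, sigma_y, sigma_theta]))
    weights = []
    for z in measurements:
        d = np.asarray(particle, dtype=float) - np.asarray(z, dtype=float)
        d[2] = np.arctan2(np.sin(d[2]), np.cos(d[2]))   # wrap yaw difference
        weights.append(np.exp(-0.5 * d @ Sigma_inv @ d))
    # w = max(w_1, ..., w_n); zero if no map currently provides a pose
    return max(weights) if weights else 0.0
```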

IV. EXPERIMENTAL SETTINGS

A. Vehicle Settings

Figure 6 shows the experimental vehicle. This research uses a LiDAR (Velodyne HDL64E-S2), a monocular camera (Point Grey Grasshopper3), a DGNSS-IMU (JAVAD DELTAG3T with IMU), and the CAN information coming from the vehicle. The Velodyne HDL64E-S2 is mounted on the top; its measurement range is around 120 m, its vertical angular range is 26.9°, and its measurement period is 10 Hz. The GNSS is used to obtain the initial position for NDT localization.

B. Datasets

Datasets for the evaluation of our system were collected by driving the vehicle around the city of Nagoya on multiple days. The trajectory is shown in Fig. 7. For each run, the image sequence for the localization tests and the LiDAR scans for the ground truth were collected simultaneously. Images were captured in full HD resolution at a rate of 20 Hz, but downscaled to 800 × 600 pixels for processing. We also captured raw CAN data to be processed into odometry information, which is used for sensor fusion. In general, the traversed trajectories form different loop patterns varying from 9 to 30 km. The track is dominated by a suburban environment, with occasional high-rise buildings along the road.

The time and weather conditions of each mapping and testing run are recorded in Table I. Each vehicle run was taken on a different day, with weather conditions ranging from cloudy and rainy to sunny. From these runs, three were selected for the mapping process (marked with an asterisk in Table I). All other runs were designated as localization runs using these three maps.

C. Evaluation Criteria

To evaluate our system, two numerical criteria are used to gauge localization quality: coverage and accuracy.

1) Coverage: We define the coverage of a single map in one vehicle run as the percentage of time during which the map is able to localize without being lost. This measurement provides a rough description of the capability of the maps and the localization system when coping with changing conditions.

2) Accuracy: The quality of the localization results is evaluated by metric errors, which represent the distance of the predicted vehicle position from the ground truth provided by NDT scan matching. These errors are further separated into lateral and longitudinal components, with a view to coupling the localization system with a motion planner. For each run, we report mean and maximum values to illustrate the capability of the system.
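The two criteria can be sketched as follows, assuming time-synchronized lists of estimated and ground-truth (NDT) poses; the exact coverage bookkeeping shown here is our reading of the definition above, and all names are illustrative.

```python
# Sketch of the evaluation criteria: coverage and lateral/longitudinal errors.
import math

def coverage(estimates):
    """Percentage of frames for which the localizer produced a pose (not lost)."""
    return 100.0 * sum(e is not None for e in estimates) / len(estimates)

def lateral_longitudinal_error(estimate, truth):
    """Project the 2D position error onto the ground-truth heading.
       estimate, truth: (x, y, yaw) in the metric frame."""
    dx, dy = estimate[0] - truth[0], estimate[1] - truth[1]
    yaw = truth[2]
    longitudinal = math.cos(yaw) * dx + math.sin(yaw) * dy   # along the track
    lateral = -math.sin(yaw) * dx + math.cos(yaw) * dy       # across the track
    return lateral, longitudinal
```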


Fig. 7. Vehicle path used for evaluation.

Fig. 8. Mapping situations.

V. RESULTS AND DISCUSSION

Our algorithm is implemented in the ROS (Robot Operating System) framework and runs on a mid-range PC. On average, a single map requires around 3-4 GB of memory.

Fig. 9. Visualization of localization at work.


A. Mapping Process

As described above, the mapping process requires the vehicle to record image sequences together with the camera position in the world. We then process each image sequence through ORB-SLAM to produce keyframes and map points, and append the corresponding vehicle pose to each keyframe. These augmented keyframes and map points are saved to map files on disk for subsequent localization.

Two mapping runs were conducted on the same day under cloudy weather, and another one on a different day under sunny weather. As shown in Fig. 10, the maps from the cloudy day provided better coverage than the one from the sunny day. We suppose that this lower coverage was caused by prevalent visual disturbances that occurred both in mapping and in localization. Fig. 8 shows common situations that occurred during the mapping runs. Most of the track covers suburban areas (A) and wide roads (B), with forest as another substantial part (C). As expected on a sunny day, the vehicle may encounter visual disturbances such as lens smear and flares (D).

B. Localization Results Overview

Fig. 9 depicts a visualization of the localization process. The colored axes show the vehicle position obtained from the sensor fusion of visual localization and odometry, overlaid on the metric feature map that includes road markings. In the inset, visual feature tracking is being performed by ORB-SLAM, projecting the map points that are currently visible in the image.

Fig. 10. Comparison of coverage from each map for each vehicle run.

C. Coverage

Each map delivers different coverage performance for localization. Fig. 10 shows how the coverage of each map changes in every run, with map 3 giving the lowest coverage. During its creation, we noticed that the camera experienced heavy lens smear and flares. This condition also occurred during localization run 02-03. Meanwhile, maps 1 and 2, which were created on the same day, delivered similar coverage.

As an extreme example, we plot the coverage of all maps in run 02-03 (the run with the lowest coverage across all maps) in Fig. 11. In this particular run, there are significant parts of the track in which none of the maps was able to provide localization. From this observation, it is interesting that there is a possibility to improve coverage by jointly using the results from different maps in a consistent manner.


Fig. 11. Trajectory coverage for 02-03 dataset.

Fig. 12. Severe case of large lateral and longitudinal errors in run 02-03.

TABLE III
ACCURACY OF SENSOR FUSION BETWEEN MULTI-MAP VISUAL LOCALIZATION AND ODOMETRY (METER)

            Lateral                    Longitudinal
Datasets    Avg.   Max.    Std.        Avg.   Max.    Std.
01-29       0.20   2.98    0.25        1.08   5.01    0.86
02-03       0.54   48.50   2.78        1.46   48.34   2.58
02-05       0.21   3.08    0.26        0.86   3.80    0.61
02-09       0.27   4.73    0.38        1.09   4.60    0.88
02-12       0.22   3.73    0.26        1.04   4.73    0.88
02-12 N     0.25   6.48    0.32        1.27   4.36    0.87
02-24       0.22   4.55    0.27        0.97   5.73    0.83

D. Accuracy of Vision-Only Localization

From the coverage graph, we would expect good map coverage to deliver accurate metric localization. Unfortunately, this is not always the case. Fig. 12 depicts a scene where some large but very brief "jumps" took place. These localization errors came from map 3, which, as described above, had problems during its creation. However, the other maps may also exhibit similar behavior occasionally. The error statistics of vision-only localization are shown in Table II. In general, the average errors, both lateral and longitudinal, are small. However, map 3 has a substantially larger standard deviation in the longitudinal direction than the other maps.

E. Accuracy of Sensor Fusion

Knowing that a single map is unable to cover the whole track, we also performed sensor fusion experiments to test the viability of combining multiple maps with odometry derived from CAN data to provide a single localization result. Table III shows the average, maximum and standard deviation of the errors of our sensor fusion approach during all localization test runs. As the table shows, the multi-map sensor fusion approach is able to suppress the large errors that may result from any single map.

F. Performance Comparison Between Sensor Fusion and Vision-Only Localization

To compare the performance of sensor fusion and vision-only localization, we take run 02-03 as an example, since it is representative of large errors in both methods, and plot the cumulative distribution function (CDF) of the lateral and longitudinal errors in Figs. 13 and 14, respectively. In each figure, the horizontal axis represents the size of the error in meters, while the vertical axis represents the fraction of error samples at or below the corresponding error size. Fig. 13 shows that the lateral error distribution of sensor fusion converges to unity faster than that of vision-only localization. In other words, the sensor fusion localization over multiple maps is more accurate than any single map used for vision-only localization. A similar situation occurs in Fig. 14, which shows the cumulative distribution of the longitudinal errors of all localization methods. From these plots, it is clear that the sensor fusion of multiple vision maps and odometry is able to suppress the large errors that may occur in vision-only localization. This error suppression has its limit, namely when visual localization is lost for a long time.

Table IV shows the GNSS localization results on the same datasets (N/A = not available). The average lateral error of our vision-based localization method is smaller than that of the GNSS localization results. Although the average longitudinal error of our method is close to that of the GNSS results, the standard deviation of the GNSS longitudinal error is smaller than that of our method. These results suggest that the precision of the longitudinal estimate could be improved by fusing GNSS data in the particle filter.
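For reference, CDF curves like those in Figs. 13 and 14 can be reproduced from per-frame absolute errors with a sketch such as the one below; the variable names are hypothetical and not part of our implementation.

```python
# Sketch: empirical CDF of localization errors for several methods.
import numpy as np
import matplotlib.pyplot as plt

def plot_error_cdf(errors_by_method, xlabel, xmax=10.0):
    """errors_by_method: dict mapping a label ('fused', 'map1', ...) to an
       array of absolute errors in meters."""
    for label, errors in errors_by_method.items():
        x = np.sort(np.abs(errors))
        y = np.arange(1, len(x) + 1) / len(x)   # fraction of samples <= x
        plt.step(x, y, where="post", label=label)
    plt.xlim(0, xmax)
    plt.xlabel(xlabel)
    plt.ylabel("CDF")
    plt.legend()
    plt.show()
```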


TABLE II
ACCURACY OF VISION-ONLY LOCALIZATION (METER)

                      Lateral                                Longitudinal
            Map 1        Map 2        Map 3        Map 1         Map 2        Map 3
Datasets    Avg.  Std.   Avg.  Std.   Avg.  Std.   Avg.  Std.    Avg.  Std.   Avg.   Std.
01-29       0.21  0.33   0.17  0.30   0.28  1.06   0.51  1.39    0.40  0.79   1.45   12.04
02-03       0.22  0.46   0.21  0.63   0.20  0.47   0.51  1.51    0.43  3.82   0.67   2.09
02-05       0.28  0.57   0.16  0.32   0.17  0.27   0.43  1.66    0.36  0.81   0.54   2.08
02-09       0.31  0.68   0.16  0.26   0.24  0.94   0.49  1.50    0.41  0.96   1.13   10.17
02-12       0.30  0.64   0.18  0.27   0.21  0.51   0.46  1.21    0.49  0.99   0.76   5.27
02-12 N     0.28  0.50   0.23  0.46   0.20  0.42   0.51  1.28    0.44  0.79   0.70   4.90
02-24       0.28  0.58   0.24  0.41   0.50  1.28   0.74  2.62    0.24  0.71   2.06   11.36

TABLE IV
GNSS LOCALIZATION ERRORS

            Lateral                    Longitudinal
Dataset     Avg.   Max.    Std.        Avg.   Max.    Std.
01-29       1.64   15.70   1.47        1.12   14.85   1.23
02-03       N/A    N/A     N/A         N/A    N/A     N/A
02-05       1.09   9.64    0.87        0.88   8.22    0.67
02-09       0.97   8.76    0.62        0.79   7.10    0.55
02-12       1.03   3.81    0.52        0.77   4.88    0.52
02-12 N     1.11   8.49    0.68        0.88   5.03    0.57
02-24       1.23   13.06   0.93        0.95   13.07   0.77

Fig. 13. Lateral Error CDF Plot for Run 02-03 (CDF vs. lateral error [m]; curves for fused, map1, map2, map3).

Fig. 14. Longitudinal Error CDF Plot for Run 02-03 (CDF vs. longitudinal error [m]; curves for fused, map1, map2, map3).

VI. CONCLUSIONS

This paper presented a sensor fusion approach to monocular visual localization in metric space under dynamic environments such as public roads. First, the method combines visual maps, built off-line, with accurate LiDAR-based positioning. With such maps available, on-line metric pose estimation using a single monocular camera was achieved. Moreover, an approach for sensor fusion with other metric sensing modalities, such as odometry, was described. The experimental results compare the performance of vision-only and vision-odometry fusion estimation. Vision-only localization provided coverage only under good visual conditions, while vision-odometry sensor fusion maintained full coverage and provided smoother pose estimates. For comparison, our vision-based localization was shown to be more accurate than a high-end GNSS. However, truly autonomous navigation based on low-cost sensors (in this case, monocular vision and IMU) remains difficult because of reliability issues regarding the large errors and low coverage that may occur.

ACKNOWLEDGMENT

This work is supported by the MEXT COI stream program, A Diverse and Individualized Social Innovation Hub - The "Mobility Society" for the Elderly: Lead to an Active and Joyful Lifestyle. Our fork of ORB-SLAM is available as part of Autoware, open source software for urban autonomous driving [24].

REFERENCES


[1] S. Thrun, et al., "Stanley: The robot that won the DARPA grand challenge," Journal of Field Robotics, vol. 23, no. 1, pp. 661–692, June 2006.
[2] C. Urmson, et al., "Autonomous driving in urban environments: Boss and the urban challenge," Journal of Field Robotics Special Issue on the 2007 DARPA Urban Challenge, Part I, vol. 25, no. 1, pp. 425–466, June 2008.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.
[4] B. Paden, M. Cap, S. Z. Yong, D. Yershov, and E. Frazzoli, "A survey of motion planning and control techniques for self-driving urban vehicles," IEEE Transactions on Intelligent Vehicles, vol. 1, no. 1, pp. 33–55, 2016.
[5] J. Ristevski, "Building a global 3D routing map," in GIM International: The Global Magazine for Geomatics, vol. 9, no. 8, August 2015, pp. 23–25.
[6] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardos, "ORB-SLAM: a versatile and accurate monocular SLAM system," IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147–1163, 2015.
[7] E. Takeuchi and T. Tsubouchi, "A 3-D scan matching using improved 3-D normal distributions transform for mobile robotic mapping," in 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2006, pp. 3068–3073.
[8] M. Cummins and P. Newman, "FAB-MAP: Probabilistic localization and mapping in the space of appearance," The International Journal of Robotics Research, vol. 27, no. 6, pp. 647–665, 2008.
[9] F. Dayoub and T. Duckett, "An adaptive appearance-based map for long-term topological localization of mobile robots," in 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sept 2008, pp. 3364–3369.

[10] K. Konolige and J. Bowman, "Towards lifelong visual maps," in 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2009, pp. 1156–1163.
[11] W. Churchill and P. Newman, "Practice makes perfect? managing and leveraging visual experiences for lifelong navigation," in Proc. IEEE International Conference on Robotics and Automation (ICRA), Minnesota, USA, May 2012.
[12] C. Linegar, W. Churchill, and P. Newman, "Work smart, not hard: Recalling relevant experiences for vast-scale but time-constrained localisation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, May 2015.
[13] W. Maddern, M. Milford, and G. Wyeth, "Towards persistent indoor appearance-based localization, mapping and navigation using CAT-Graph," in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2012, pp. 4224–4230.
[14] T. Naseer, M. Ruhnke, L. Spinello, C. Stachniss, and W. Burgard, "Robust visual SLAM across seasons," in Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2015.
[15] O. Vysotska, T. Naseer, L. Spinello, W. Burgard, and C. Stachniss, "Efficient and effective matching of image sequences under substantial appearance changes exploiting GPS priors," in 2015 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2015, pp. 2774–2779.
[16] S. Lowry and M. J. Milford, "Building beliefs: Unsupervised generation of observation likelihoods for probabilistic localization in changing environments," in Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, Sept 2015, pp. 3071–3078.
[17] M. Buerki, I. Gilitschenski, E. Stumm, R. Siegwart, and J. Nieto, "Appearance-based landmark selection for efficient long-term visual localization," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea, October 2015.
[18] P. Mühlfellner, M. Bürki, M. Bosse, W. Derendarz, R. Philippsen, and P. Furgale, "Summary maps for lifelong visual localization," Journal of Field Robotics, vol. 33, no. 5, pp. 561–590, 2016.
[19] H. Grimmett, et al., "Integrating metric and semantic maps for vision-only automated parking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, May 2015.
[20] T. Caselitz, B. Steder, M. Ruhnke, and W. Burgard, "Monocular camera localization in 3D LiDAR maps," in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct 2016, pp. 1926–1931.
[21] A. Sujiwo, T. Ando, E. Takeuchi, Y. Ninomiya, and M. Edahiro, "Monocular vision-based localization using ORB-SLAM with LiDAR-aided mapping in real-world robot challenge," Journal of Robotics and Mechatronics, vol. 28, no. 4, pp. 479–490, Aug. 2016.
[22] J. Fuentes-Pacheco, J. Ruiz-Ascencio, and J. M. Rendón-Mancha, "Visual simultaneous localization and mapping: a survey," Artificial Intelligence Review, vol. 43, no. 1, pp. 55–81, 2015.
[23] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics (Intelligent Robotics and Autonomous Agents). The MIT Press, 2005.
[24] S. Kato, E. Takeuchi, Y. Ishiguro, Y. Ninomiya, K. Takeda, and T. Hamada, "An open approach to autonomous vehicles," IEEE Micro, vol. 35, no. 6, pp. 60–68, 2015.

