Body-relative Navigation Using Uncalibrated Cameras
Olivier Koch
[email protected]
Seth Teller
[email protected]
Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory
March 17, 2009
Motivation
Navigation guidance for humans
– Finding our way in complex or new environments
Why is it hard?
– No external source of localization (GPS)
– Unknown environment (no map)
Why should you care?
– Soldiers in the field
– Visually impaired users
– Guidance in public places (hospitals, museums)
Vision-based navigation
Four Point Grey Firefly MV cameras (640x480 8-bit grayscale images), FOV: 360° (h) x 90° (v)
Why vision?
✓ Light, inexpensive, compact
✓ Rich information (vs laser rangefinders)
✓ No temporal drift (vs inertial sensors)
Uncalibrated cameras
Four Point Grey Firefly MV cameras (640x480 8-bit grayscale images), FOV: 360° (h) x 90° (v)
Why use uncalibrated cameras?
➢ Intrinsic calibration is tedious
➢ Extrinsic calibration is hard for body-worn applications
Problem statement
Input: live video stream from a wearable set of uncalibrated cameras
Output: body-relative guidance for:
– Homing (going back to the start point)
– Replay (from start point to end point)
– Point-to-point navigation
Sample dataset
Method overview
Exploration path: undirected graph (place graph)
– Node: physical location in the world
– Edge: physical path between two nodes, traversed by the user
✓ Makes no assumption on user motion between nodes
Method overview
Local node orientation: direction of the user leaving the node
➢ Assumes smooth user motion
Local node observations (visual features)
➢ Assumes distinctive feature visibility
✓ No global coordinate frame
Method overview
Node-to-node “hopping” problem
✓ Does not require metric mapping of the environment
➢ Assumes that the user stays in the graph during guidance
– Determine the location of the user in the graph (local node estimation)
– Guide the user at that location (rotation guidance)
Method overview
Loop closure detection
Limitations & advantages
Limitations
➢ User leaving the exploration path
➢ Smooth user motion
➢ Distinctive feature visibility
Advantages
✓ Provides intuitive, body-relative guidance
✓ Requires no extrinsic or intrinsic camera calibration
✓ Scales to arbitrarily large environments
Related work
Visual Simultaneous Localization and Mapping (SLAM)
– Davison et al., MonoSLAM: Real-Time Single Camera SLAM, PAMI ’07
– J. Neira et al., Data association in O(n) for Divide and Conquer SLAM, RSS ’07
– Wolf et al., Robust Vision-Based Localization by Combining an Image Retrieval System with Monte Carlo Localization, IEEE Transactions on Robotics ’05
– Konolige, Agrawal et al., Mapping, Navigation and Learning for Off-road Traversal, Journal of Field Robotics ’08
Metric and topological localization
– Zhang & Kosecka, Hierarchical Building Recognition, Image and Vision Computing ’07
– B. Kuipers, Using the topological skeleton for scalable global metrical map-building, IROS ’04
Related work
Appearance-based navigation
– Cummins & Newman, Probabilistic Appearance Based Navigation and Loop Closing, ICRA ’07
– Collett, Landmark learning and guidance in insects, Phil. Trans. Roy. Soc. London, 1992
– Chen & Birchfield, Qualitative vision-based mobile robot navigation, ICRA ’06
– Zhang & Kleeman, Robust appearance-based visual route following for navigation in large-scale outdoor environments, IJRR ’09
Method overview
The place graph
World as an undirected graph G = (V, E)

Object | Represents…                     | Data structure
Node   | Location in the world           | Visual features (e.g. SIFT)
Edge   | Physical path between two nodes | N/A
The place graph
Similarity function Ψ()
– Input: two sets of features F1, F2
– Output: average L2-distance over all feature matches between F1 and F2
Create a new node whenever Ψ > δ
In practice, a new node is created about every three seconds (5 meters) at human walking speed.
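The node-creation rule can be illustrated with a minimal sketch. It assumes that features are plain descriptor vectors, that matching is done by nearest neighbour, and that Ψ is evaluated between the most recent node and the current frame; the function names `similarity` and `build_place_graph` are hypothetical and not taken from the slides.

```python
import numpy as np

def similarity(f1, f2):
    """Psi: average L2 distance over nearest-neighbour matches from f1 to f2.

    f1, f2: arrays of shape (n, d) and (m, d) holding feature descriptors.
    Matching by plain nearest neighbour is an assumption of this sketch.
    """
    # Pairwise L2 distances, shape (n, m).
    dists = np.linalg.norm(f1[:, None, :] - f2[None, :, :], axis=2)
    # Each feature in f1 is matched to its closest feature in f2.
    return float(dists.min(axis=1).mean())

def build_place_graph(frames, delta):
    """Create a node whenever Psi(last node, current frame) > delta."""
    nodes = [frames[0]]          # node i stores the features seen at node i
    edges = []                   # undirected edges as (i, j) index pairs
    for feats in frames[1:]:
        if similarity(nodes[-1], feats) > delta:
            nodes.append(feats)
            edges.append((len(nodes) - 2, len(nodes) - 1))
    return nodes, edges
```

Note that Ψ as defined here grows when two feature sets look less alike, which is why a large value triggers a new node.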
Local node estimation
Input: position of the user in the map at time t-1; observations at time t
Output: position in the map at time t
User motion = Markov process
Recursive Bayesian estimation
– State x_k: position in the map at time k (node label)
– Measurement z_k: observations at time k (SIFT)
Local node estimation
Prediction:
$$p(x_k \mid z_{k-1}) = \sum_{x_{k-1}} p(x_k \mid x_{k-1})\, p(x_{k-1} \mid z_{k-1})$$
Update:
$$p(x_k \mid z_k) = \lambda\, p(z_k \mid x_k)\, p(x_k \mid z_{k-1})$$
Observation model:
$$p(z_k \mid x_k) \sim \frac{1}{\varepsilon + \psi(x_k, z_k)}$$
Motion model:
$$p(x_k \mid x_{k-1}) = \mathcal{N}(0, \sigma)$$
– Compute the pdf over a local neighborhood of the current position only
– No new node creation
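One way to read the prediction and update steps is as a discrete Bayes filter over node labels. The sketch below makes two simplifying assumptions that are not in the slides: nodes lie along a single path, so that the index offset stands in for graph distance, and the pdf is computed over all nodes rather than a local neighborhood. The names `predict` and `update` are hypothetical.

```python
import numpy as np

def predict(prior, sigma):
    """Prediction step: spread belief with a Gaussian motion model N(0, sigma).

    Nodes are indexed along a single path here (a simplifying assumption);
    the offset j - i stands in for the graph distance between nodes i and j.
    """
    n = len(prior)
    idx = np.arange(n)
    offsets = idx[:, None] - idx[None, :]            # x_k - x_{k-1}
    trans = np.exp(-0.5 * (offsets / sigma) ** 2)    # unnormalised p(x_k | x_{k-1})
    trans /= trans.sum(axis=0, keepdims=True)        # each column sums to 1
    return trans @ prior

def update(predicted, psi, eps=1e-3):
    """Update step: weight by p(z_k | x_k) ~ 1 / (eps + psi), then normalise."""
    posterior = predicted / (eps + psi)
    return posterior / posterior.sum()               # lambda is this normaliser
```

Here `psi` holds one value of ψ(x_k, z_k) per node, so nodes whose stored features are close to the current observations receive a high likelihood.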
Rotation Guidance
Input: current position of the user in the graph; current observations
Output: guidance to the next node in the user’s body frame
Approach: visual learning
The relative orientation problem
Problem: estimate the relative user orientation between two visits of the same location (in 2D)
The relative orientation problem
Assuming intrinsic & extrinsic camera calibration
– World features = bearing measurements (α, β, …)
The relative orientation problem
Assuming no intrinsic & extrinsic camera calibration
– Bearing $\alpha$ → coarse bearing $\bar{\alpha}$
– $\bar{\alpha}$: average of all possible bearing measurements on the camera
Four cameras, each covering 90° of FOV
The relative orientation problem
Principle: for a large number of measurements, and given the assumptions below, using the coarse bearing $\bar{\alpha}$ instead of $\alpha$ yields a statistically valid estimate of the relative orientation γ.
Assumptions
– Observations are uniformly distributed in image space
– Observations are made from the same vantage point during revisit
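The principle can be checked numerically with a small simulation. The sketch below assumes four ideal cameras that each cover exactly one 90° sector, feature bearings drawn uniformly over 360°, and a pure rotation between the two visits; a plain mean of wrapped differences is used, which is only appropriate for rotations well inside ±180°. All names and the chosen rotation of 30° are illustrative assumptions.

```python
import numpy as np

def coarse_bearing(alpha_deg, n_cams=4):
    """Replace a bearing by the centre of the camera sector that sees it."""
    width = 360.0 / n_cams
    return (np.floor((alpha_deg % 360.0) / width) + 0.5) * width

def wrap(angle_deg):
    """Wrap angles to the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0

def estimate_rotation(alpha_first, alpha_revisit):
    """Average the differences of coarse bearings over all matched features."""
    diff = wrap(coarse_bearing(alpha_first) - coarse_bearing(alpha_revisit))
    return float(diff.mean())

rng = np.random.default_rng(0)
gamma = 30.0                                   # true user rotation (degrees)
alpha_first = rng.uniform(0.0, 360.0, 20000)   # uniformly distributed features
alpha_revisit = alpha_first - gamma            # same vantage point, rotated user
gamma_hat = estimate_rotation(alpha_first, alpha_revisit)
```

Each individual difference is either 0° or 90° in this setup, yet the fraction of features that change camera grows linearly with the rotation, so the average recovers γ when many features are available.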
The match matrix
– H(i,j) = user rotation associated with a match between camera i and camera j
– H is a coarse approximation of the full camera calibration
– H is anti-symmetric
– H satisfies the “circular equality”
Learning the match matrix
Training phase
– Learn the match matrix from training data
– Done once for a given camera configuration
– Does not depend on the training environment
Training algorithm
– User rotates in place in an arbitrary environment
– Algorithm “learns” the match matrix
Learning the match matrix
User rotates in place n_r times in an arbitrary environment (n_r = 2)
For each pair of frames (f_i, f_j) in the training sequence:
– Estimate the corresponding user rotation η (e.g. assuming constant rotation speed)
– Compute feature matches between f_i and f_j
– For each match m between a feature on camera r and a feature on camera s, update H(r,s) with η
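The training loop can be sketched as follows. The slides do not say how H(r,s) is "updated" with η; this sketch assumes a circular mean of all rotations observed for the camera pair. The input format (a list of `(r, s, eta)` triples) and the function names are hypothetical.

```python
import numpy as np

def learn_match_matrix(observations, n_cams=4):
    """Estimate H from (r, s, eta) triples gathered during training.

    Each triple says: a feature seen on camera r in frame f_i was matched to
    a feature on camera s in frame f_j, and the user rotated by eta degrees
    between the two frames. "Update H(r,s) with eta" is read here as a
    circular mean, which is an assumption of this sketch.
    """
    sin_sum = np.zeros((n_cams, n_cams))
    cos_sum = np.zeros((n_cams, n_cams))
    for r, s, eta in observations:
        sin_sum[r, s] += np.sin(np.radians(eta))
        cos_sum[r, s] += np.cos(np.radians(eta))
    # Cells that received no match stay at 0 because arctan2(0, 0) == 0.
    return np.degrees(np.arctan2(sin_sum, cos_sum))

def rotation_from_frames(i, j, n_frames, n_turns=2):
    """Rotation between frames i and j assuming constant rotation speed."""
    return (360.0 * n_turns * (j - i) / n_frames + 180.0) % 360.0 - 180.0
```

A circular mean is chosen so that rotations near ±180° average correctly instead of cancelling out.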
Learning the match matrix
Training algorithm
➢ Runs in an arbitrary environment
➢ Done once for a given camera configuration
➢ Fast (a few minutes) and simple
➢ Quadratic complexity in # frames and # features/frame
Rotation guidance using the match matrix
Loop closure detection
1. Compute node similarity
2. Extract similar sequences
3. Update place graph
Loop closure detection
1. Compute node similarity
“Bags of words”
– Word w = (c_w, r_w)
– Words store a list of node labels
Incremental vocabulary
– Optimized search using a search tree
✓ Fully incremental
✓ No a priori vocabulary
Filliat, Interactive Learning of Visual Topological Navigation, IROS’08
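An incremental vocabulary of this kind might look like the sketch below. It assumes that c_w is a word center and r_w a radius (the slide does not spell this out), uses one shared radius for all words, and replaces the search tree with a linear scan for brevity. The class `IncrementalVocabulary` and its methods are hypothetical.

```python
import numpy as np

class IncrementalVocabulary:
    """Visual words created on the fly; each word lists the nodes it was seen in."""

    def __init__(self, radius):
        self.radius = radius     # r_w, shared by all words in this sketch
        self.centers = []        # c_w for each word
        self.node_lists = []     # node labels stored by each word

    def add(self, descriptor, node_label):
        """Assign a descriptor to an existing word, or create a new word."""
        for w, center in enumerate(self.centers):   # linear scan, not a tree
            if np.linalg.norm(descriptor - center) <= self.radius:
                self.node_lists[w].append(node_label)
                return w
        self.centers.append(np.asarray(descriptor, dtype=float))
        self.node_lists.append([node_label])
        return len(self.centers) - 1

    def node_votes(self, descriptors):
        """Count, per node label, how many query descriptors hit its words."""
        votes = {}
        for d in descriptors:
            for w, center in enumerate(self.centers):
                if np.linalg.norm(d - center) <= self.radius:
                    for label in set(self.node_lists[w]):
                        votes[label] = votes.get(label, 0) + 1
                    break
        return votes
```

Because words are created only when no existing word is close enough, no vocabulary has to be trained ahead of time, and the vote counts give a simple node-to-node similarity.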
Loop closure detection
2. Extract similar sequences
Smith & Waterman algorithm
– Inspired by molecular biology
– Output: similar node subsequences
✓ Robust loop closure detection
➢ Does not detect “instantaneous” loop closure
Ho & Newman, Detecting Loop Closure with Scene Sequences, IJCV’07
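The sequence extraction step can be illustrated with a textbook Smith-Waterman local alignment applied to a node-to-node score matrix. How node similarities are converted into positive and negative scores, and the gap penalty, are assumptions of this sketch rather than details from the slides.

```python
import numpy as np

def smith_waterman(scores, gap=1.0):
    """Find the best local alignment through a node-to-node score matrix.

    scores[i, j] is positive when nodes i and j look alike and negative
    otherwise (how similarities are turned into scores is an assumption).
    Returns the best alignment score and the list of aligned (i, j) pairs.
    """
    n, m = scores.shape
    table = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i, j] = max(0.0,
                              table[i - 1, j - 1] + scores[i - 1, j - 1],  # match
                              table[i - 1, j] - gap,                       # skip node i
                              table[i, j - 1] - gap)                       # skip node j
    # Trace back from the best cell until the score drops to zero.
    i, j = (int(v) for v in np.unravel_index(np.argmax(table), table.shape))
    best, pairs = float(table[i, j]), []
    while i > 0 and j > 0 and table[i, j] > 0:
        if table[i, j] == table[i - 1, j - 1] + scores[i - 1, j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif table[i, j] == table[i - 1, j] - gap:
            i -= 1
        else:
            j -= 1
    return best, pairs[::-1]
```

Since the alignment needs several consecutive similar nodes to accumulate a high score, a single matching node is not enough, which is consistent with the remark that "instantaneous" loop closures are not detected.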
Loop closure detection
3. Update place graph
– Merge node sequences
Match matrix
Anti-symmetry: error ~ 14.5°
Circular equality: error ~ 1.5°
$$H_0 = \begin{pmatrix} 0 & 90 & 180 & -90 \\ -90 & 0 & 90 & 180 \\ 180 & -90 & 0 & 90 \\ 90 & 180 & -90 & 0 \end{pmatrix}$$
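The two properties can be checked numerically on a match matrix. The sketch below assumes that "circular equality" means H(i,j) + H(j,k) = H(i,k) modulo 360°, which is one plausible reading and not a definition given in the slides; the error measures (mean absolute deviation in degrees) are likewise hypothetical and may differ from those behind the reported figures.

```python
import numpy as np

def wrap(angle_deg):
    """Wrap angles to [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0

def antisymmetry_error(H):
    """Mean absolute deviation from H(i,j) = -H(j,i), modulo 360 degrees."""
    return float(np.abs(wrap(H + H.T)).mean())

def circular_error(H):
    """Mean absolute deviation from H(i,j) + H(j,k) = H(i,k), modulo 360."""
    # triple[i, j, k] = H[i, j] + H[j, k] - H[i, k]
    triple = H[:, :, None] + H[None, :, :] - H[:, None, :]
    return float(np.abs(wrap(triple)).mean())

# Ideal match matrix for four cameras spaced 90 degrees apart.
H0 = np.array([[  0.,  90., 180., -90.],
               [-90.,   0.,  90., 180.],
               [180., -90.,   0.,  90.],
               [ 90., 180., -90.,   0.]])
```

Under these assumptions the ideal matrix satisfies both properties exactly, so the same functions applied to a learned matrix would measure how far training drifted from the ideal.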
Potential sources of error
– Constant rotation speed assumption during training
– Non-homogeneous feature distribution in image space
– Baseline due to translation during revisit
– Feature mismatches
Rotation guidance vs IMU
– User rotates in place in an arbitrary environment
– Compare rotation guidance against IMU
– Standard deviation: 8.5° (max error: 20°)
Large-scale rotation baseline
Figure panels: first image; second image (matches in blue); aligned images
– High-resolution camera pointing upward
– Average difference between SIFT feature orientations
– Standard error (vs IMU) < 2°
– Ground truth throughout the exploration path
– Requires no intrinsic/extrinsic camera calibration
Rotation guidance vs ground-truth
– 200 checkpoints
– Standard error: 10.5° (max error: 15°)
Local node estimation vs ground-truth
Real-world explorations
GALLERIA dataset
CORRIDORS dataset
Off-path trajectories (GALLERIA dataset)
Legend: place graph node; guidance output; failure case
Rotation guidance overlaid on 2D map. Values are exact, not notional.
Off-path trajectories (CORRIDORS dataset)
Rotation guidance (CORRIDORS dataset)
Rotation guidance (CORRIDORS dataset)
First visit (SIFT features in yellow)
Revisit (body-relative rotation guidance)
Loop closure (CORRIDORS dataset)
Exploration path manually overlaid on 2D map: 1,500 meters (30 min.)
Place graph (spring-mass model): 500 nodes before loop closure, 197 nodes after
Conclusion
Assumptions
➢ Large number of visual features visible at all times
➢ Uniform distribution of observations in image space
➢ Rigid-body transformation between cameras is nominally fixed but may change slightly
➢ Training phase (short, done once for a camera configuration)
Advantages
✓ Requires no extrinsic or intrinsic camera calibration
✓ Scales to large environments (several km)
✓ Provides guidance in the user’s body frame
✓ Robust to off-path trajectories and high-frequency user motion
Future work
– Extend to 3D motion (stair ascent/descent)
– User study with multiple human users
– Application to robotics
Questions