2013 21st Mediterranean Conference on Control & Automation (MED) Platanias-Chania, Crete, Greece, June 25-28, 2013
Closed-Form Solution for Absolute Scale Velocity Estimation Using Visual and Inertial Data with a Sliding Least-Squares Estimation

Vincenzo Lippiello and Rafik Mebarki

Abstract— In this paper a method for the on-line absolute-scale velocity estimation of a system composed of a single camera and an inertial measurement unit is presented. The proposed formulation uses spherical image measurements acquired from at least three camera positions, together with inertial measurements, to estimate the system velocity while also solving the absolute scale problem. A new multi-rate formulation based on a sliding least-squares estimation is proposed, which is capable of providing the velocity estimate also in cases of constant and zero velocity. The effectiveness of the proposed approach is shown through extensive simulations.
I. INTRODUCTION

The adoption of vision systems in advanced robotics operating in unknown environments, e.g. mobile inspection robots and Unmanned Aerial Vehicles (UAVs), is now common and widespread. Although many active sensors are available for environment mapping and obstacle recognition, e.g. ultrasonic sensors, 3D cameras and laser range finders, vision systems are passive and outperform active navigation systems in terms of cost, weight, power consumption and size. All these aspects, together with the richness of information provided by a camera, are prompting several researchers to investigate the adoption of these sensors to cope with several open problems, ranging from low-level control to autonomous navigation and environment awareness, e.g. Simultaneous Localization and Mapping (SLAM). In recent years there has been a growing interest in adopting vision systems for UAV guidance and navigation. Thanks to the achieved results, a new generation of aerial service robots [1], [2] is becoming effective also for interaction tasks with the environment [3], [4]. The main problem with using a vision system is the loss of the depth information of the observed image features. The adoption of triangulation techniques from multiple views [5] is a possible solution to this drawback, which often results in visual odometry [6] or visual SLAM [7]. In fact, visual SLAM [8], [9], stereo vision [10], [11], [12] and Structure-From-Motion [13], [14] techniques are generally used as visual navigation systems to localize and estimate the UAV egomotion.

The research leading to these results has been supported by the SHERPA and ARCAS collaborative projects, which have received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreements ICT-600958 and ICT-287617, respectively. The authors are solely responsible for its content. It does not represent the opinion of the European Community and the Community is not responsible for any use that might be made of the information contained therein.

The authors are with PRISMA Lab, Department of Electrical Engineering and Information Technology, Università degli Studi di Napoli Federico II, via Claudio 21, 80125, Naples, Italy {lippiello, rafik.mebarki}@unina.it
Although these approaches typically adopt scale-propagation solutions to keep the scale factor approximately constant, the use of monocular vision for egomotion estimation only allows velocity estimation up to a scale factor. In [15] the fusion of visual and inertial measurements provided by an Inertial Measurement Unit (IMU) is proposed, but the assumption of scaled visual position measurements is made. Structure-From-Motion and motion estimation filtering based on inertial measurements are combined to evaluate the absolute scale velocity in [16]. However, the scale for the visual reading is achieved through an initialization requiring a given set of known 3D points. Integral Extended Kalman Filter (EKF)-SLAM techniques are proposed in [17] and in [18], where the motion model has been replaced with a motion estimation based on IMU data. The computational complexity of all these approaches is considerable, due to the involvement of a complete SLAM, and does not scale with the size of the map. To cope with this limitation, an algorithm whose complexity is linear in the number of considered features is proposed in [19], such that scalability is improved by removing older features. A non-linear observer for pose estimation, working with a static set of image features in the field of view and inertial measurements, is proposed in [20] and tested in simulation. In [21] a framework for simultaneous recovery of scene structure and camera motion by combining visual and inertial cues is presented, which adapts ideas from non-linear state estimation, such as the EKF, for structure and motion recovery.

A method for the on-line absolute-scale velocity estimation of a system composed of a single camera and an IMU is presented in this paper. The proposed formulation is inspired by the method proposed in [22] and then modified in [23], also by using spherical image coordinates. The techniques proposed in these previous papers suffer, however, from ill-conditioning as soon as constant or zero velocity is reached. This would require the introduction of exceptions and thresholds, which are hard to tune even for a limited range of motion conditions. To overcome these shortcomings, a Sliding Least-Squares (SLS) formulation is proposed in this paper. The effectiveness of the proposed approach is shown through simulations carried out on a statistically significant number of cases.

II. PROBLEM DEFINITION

A moving system composed of a mono-camera and an IMU is considered. The inertial and the camera reference frames are denoted with I : O − xyz and C : OC − xC yC zC, respectively (see Fig. 1).
Fig. 1. Reference frames.

Fig. 2. Camera (blue) and IMU (red) measurement reference frames.
Axis zC coincides with the optical axis of the camera, while axes xC and yC lie within the camera image plane. The IMU is considered N times faster than the camera: while the acceleration measurements are provided by the IMU every period T, the visual information is acquired every N T period of time. Moreover, it is assumed that the camera and the IMU reference frames are coincident and that the IMU is ideal, i.e., it provides gravity- and bias-free acceleration and gyroscopic measurements. Notice that, through a calibration procedure, it is easy to refer IMU data to the camera frame. Therefore, only the camera frame C will be considered in the rest of the section.

The orientation of the camera frame, also available by using IMU measurements suitably combined into an EKF, is expressed in the inertial frame I through the well-known Tait-Bryan (Euler) angles of roll, pitch, and yaw φ = (ϕ, θ, ψ).

Adopting a pin-hole camera model¹, and assuming the camera calibration parameters known, the image feature vector f = [x y z]^T, i.e., the position of the observed feature with respect to (wrt) C, can be expressed using normalized image coordinates X and Y as follows

f = z \tilde{f},        (1)

where \tilde{f} = [X \; Y \; 1]^T is the homogeneous representation of the normalized image coordinates. The corresponding spherical image coordinate \hat{f} of the feature vector is a unit vector depending only on the visual measurements X and Y, which is evaluated as follows

\hat{f} = \frac{1}{\|\tilde{f}\|}\,\tilde{f}.        (2)

Notice that f can also be expressed as

f = d\,\hat{f},        (3)

where d = \|f\| is the distance of the feature wrt the camera. The image features considered in this paper are corners, which can be matched in a sequence of images through the widely-used pyramidal Lucas-Kanade algorithm [25], [26].

When the camera is moving, a sequence of camera frames is considered. Let \hat{f}^1_1 and \hat{f}^2_2 be the spherical coordinates of a single feature corresponding to two consecutive images labeled with 1 and 2. They are represented in their corresponding camera reference frames C1 and C2, respectively (conventionally, for vectors and matrices the reference frame is indicated as a superscript).

¹ Other camera models can be considered depending on the available hardware, e.g. see [24] for the case of fisheye lenses.
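As a small illustration, the mapping (1)-(3) from normalized image coordinates to the spherical coordinates used throughout the paper can be sketched as follows (a minimal NumPy sketch; the function name and the sample values are only illustrative):

```python
import numpy as np

def spherical_coords(X, Y):
    """Spherical image coordinate f_hat from normalized image coordinates (X, Y), cf. (2)."""
    f_tilde = np.array([X, Y, 1.0])           # homogeneous normalized coordinates, cf. (1)
    return f_tilde / np.linalg.norm(f_tilde)  # unit vector depending only on the visual measurement

# Example: a feature observed at normalized coordinates (0.2, -0.1)
f_hat = spherical_coords(0.2, -0.1)
# The full feature vector f = d * f_hat, cf. (3), requires the unknown distance d
```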
Let also \phi^1_2 be the corresponding angular change of the camera orientation. Then \hat{f}^1_2, representing the spherical image feature measured in camera frame C2 and expressed in C1, can be written as follows

\hat{f}^1_2 = R^1_2\,\hat{f}^2_2,        (4)

where R^1_2 = R(\phi^1_2) is the rotation matrix representing the rotation performed by the camera from C1 to C2, in the form

R(\phi) = \begin{bmatrix}
c_\varphi c_\theta & c_\varphi s_\theta s_\psi - s_\varphi c_\psi & c_\varphi s_\theta c_\psi + s_\varphi s_\psi \\
s_\varphi c_\theta & s_\varphi s_\theta s_\psi + c_\varphi c_\psi & s_\varphi s_\theta c_\psi - c_\varphi s_\psi \\
-s_\theta & c_\theta s_\psi & c_\theta c_\psi
\end{bmatrix},

with the shorthand c_\alpha = \cos\alpha and s_\alpha = \sin\alpha.
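The element layout of R(φ) above corresponds to the composition Rz(ϕ) Ry(θ) Rx(ψ); this composition order is inferred from the printed entries and should be checked against the adopted convention. A short sketch that builds the matrix this way and applies (4):

```python
import numpy as np

def rot_matrix(phi, theta, psi):
    """R(phi) composed as Rz(phi) @ Ry(theta) @ Rx(psi); reproduces the element layout above."""
    Rz = np.array([[np.cos(phi), -np.sin(phi), 0.0],
                   [np.sin(phi),  np.cos(phi), 0.0],
                   [0.0,          0.0,         1.0]])
    Ry = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
                   [ 0.0,           1.0, 0.0],
                   [-np.sin(theta), 0.0, np.cos(theta)]])
    Rx = np.array([[1.0, 0.0,          0.0],
                   [0.0, np.cos(psi), -np.sin(psi)],
                   [0.0, np.sin(psi),  np.cos(psi)]])
    return Rz @ Ry @ Rx

# Transferring a spherical feature between camera frames, cf. (4): f1_hat = R12 @ f2_hat
```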
Finally, the acceleration measurement a, as well as the system velocity v, is always expressed in the current reference frame (e.g. a_j = a^j_j, where j refers to the camera frame C_j at the time instant t_j).

The considered estimation problem consists in the evaluation of the unknown system velocity v from the available sequences of camera measurements \hat{f} and of IMU measurements a and φ. Note that more than a single image correspondence, as well as different measurement sequence lengths, can be considered to improve the estimation robustness wrt measurement noise.

III. PROBLEM FORMULATION

In this section the formalism proposed in [23] is described, with an explicit extension to a multi-frame multi-feature correspondence. Considering a camera motion, as shown in Fig. 2, and assuming that t_k is the last sample time with available visual measurements, the previous available visual measurements are referred to the sample times t_{k−sN}, with s ∈ N (i.e. s identifies each visual station). Hence, denoting with r^j_i the relative displacement of C_i wrt C_j, and considering a single image feature match between the frames at sampling times k and k_s = k − sN, the following relation can be written

d_{k_s}\,\hat{f}^{k_s}_{k_s} = \left( R^k_{k_s} \right)^{\mathsf T} \left( d_k\,\hat{f}^k_k - r^k_{k_s} \right),        (5)

where

R^k_{k_s} = \begin{bmatrix} e^k_{x,k_s} & e^k_{y,k_s} & e^k_{z,k_s} \end{bmatrix}

is the rotation matrix defining the orientation of C_{k_s} wrt C_k, and e^k_{x,k_s}, e^k_{y,k_s}, and e^k_{z,k_s} are its unit column vectors. The relative
displacement r^k_{k_s} can be expressed in terms of v_k wrt the current frame C_k and of the integration of the sN acceleration samples a_j provided by the IMU between t_{k−sN} and t_k. Let us consider the relative displacement and velocity between two consecutive IMU frames:

r^{j-1}_j = v_{j-1} T + \tfrac{1}{2} a_{j-1} T^2        (6)
v^{j-1}_j = v_{j-1} + a_{j-1} T        (7)
r^j_{j-1} = -R^j_{j-1} r^{j-1}_j = -v^j_{j-1} T - \tfrac{1}{2} R^j_{j-1} a_{j-1} T^2        (8)
v_j = R^j_{j-1} v^{j-1}_j = v^j_{j-1} + R^j_{j-1} a_{j-1} T.        (9)

Plugging (9) into (8) yields

r^j_{j-1} = -v_j T + \tfrac{1}{2} R^j_{j-1} a_{j-1} T^2.        (10)

Then, the displacement between two consecutive visual frames can be expressed by adding all the displacements corresponding to the intermediate time intervals, with only IMU data available. This yields

r^k_{k_s} = -sN T\, v_k + \tfrac{1}{2} \bar{a}^k_{k_s} T^2,        (11)

with

\bar{a}^k_{k_s} = \sum_{j=1}^{sN} \left( 2(sN - j) + 1 \right) R^k_{k-j}\, a_{k-j},        (12)

which can be expressed in a recursive formulation as follows

r^k_{k_s} = \begin{cases} -N T\, v_k + \tfrac{1}{2} \bar{a}^k_{k-N} T^2 + r^{k-N}_{k_{s-1}} & \text{if } s > 0 \\ 0 & \text{if } s = 0. \end{cases}        (13)

Finally, plugging (11) into (5) and in view of (3), the following system of two equations in four unknowns, v_k and d_k, for the i-th image point correspondence between frames k and k_s is obtained:

X_{i,k_s} = \frac{ \left( e^k_{x,k_s} \right)^{\mathsf T} \left( d_{i,k} \hat{f}^k_{i,k} + sN T\, v_k - \tfrac{1}{2} \bar{a}^k_{k_s} T^2 \right) }{ \left( e^k_{z,k_s} \right)^{\mathsf T} \left( d_{i,k} \hat{f}^k_{i,k} + sN T\, v_k - \tfrac{1}{2} \bar{a}^k_{k_s} T^2 \right) }        (14)

Y_{i,k_s} = \frac{ \left( e^k_{y,k_s} \right)^{\mathsf T} \left( d_{i,k} \hat{f}^k_{i,k} + sN T\, v_k - \tfrac{1}{2} \bar{a}^k_{k_s} T^2 \right) }{ \left( e^k_{z,k_s} \right)^{\mathsf T} \left( d_{i,k} \hat{f}^k_{i,k} + sN T\, v_k - \tfrac{1}{2} \bar{a}^k_{k_s} T^2 \right) }.        (15)
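The weighted IMU pre-integration of (11)-(12) lends itself to a direct implementation. The following is a minimal sketch (variable names and data layout are assumptions; v_k is of course the unknown of the estimation problem, so (11) is only used here to express the model):

```python
import numpy as np

def abar(R_k_list, a_list):
    """
    Weighted IMU pre-integration a_bar^k_{ks}, cf. (12).
    R_k_list[j-1] is R^k_{k-j} (orientation of frame k-j wrt the current frame k),
    a_list[j-1]   is a_{k-j}  (acceleration measured at t_{k-j}, in its own frame),
    for j = 1, ..., sN.
    """
    sN = len(a_list)
    acc = np.zeros(3)
    for j in range(1, sN + 1):
        acc += (2 * (sN - j) + 1) * (R_k_list[j - 1] @ a_list[j - 1])
    return acc

def rel_displacement(v_k, a_bar, sN, T):
    """Model of the relative displacement r^k_{ks} between visual frames, cf. (11)."""
    return -sN * T * v_k + 0.5 * a_bar * T**2
```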
In the general case, considering n_s visual stations (with s = 1, ..., n_s) and n_f image features (with i = 1, ..., n_f), and stacking the distances d_{i,k} of the n_f image features wrt the current frame C_k into the vector d_k, a system of 2 n_s n_f equations in 3 + n_f unknowns, v_k and d_k, is obtained. This system is linear in the unknowns and can be easily arranged in the classical form

A \Theta = b,        (16)

where \Theta = \begin{bmatrix} v_k^{\mathsf T} & d_k^{\mathsf T} \end{bmatrix}^{\mathsf T}, and the matrix A is obtained by stacking the n_s n_f elementary 2 × (3 + n_f) matrices A_{s,i}

A_{s,i} = \begin{bmatrix} \left( e^k_{x,k_s} - X_{i,k_s} e^k_{z,k_s} \right)^{\mathsf T} \\ \left( e^k_{y,k_s} - Y_{i,k_s} e^k_{z,k_s} \right)^{\mathsf T} \end{bmatrix} \begin{bmatrix} sN T\, I_3 & F_i(\hat{f}^k_{i,k}) \end{bmatrix},        (17)

with s = 1, ..., n_s and i = 1, ..., n_f, where F_i(ν) is a 3 × n_f matrix with the vector ν in the i-th column and all other elements equal to zero, and I_3 is the identity matrix of the indicated dimension. Analogously, the vector b is obtained by stacking the n_s n_f elementary (2 × 1) vectors b_{s,i} defined as follows:

b_{s,i} = \begin{bmatrix} \left( e^k_{x,k_s} - X_{i,k_s} e^k_{z,k_s} \right)^{\mathsf T} \tfrac{1}{2} \bar{a}^k_{k_s} T^2 \\ \left( e^k_{y,k_s} - Y_{i,k_s} e^k_{z,k_s} \right)^{\mathsf T} \tfrac{1}{2} \bar{a}^k_{k_s} T^2 \end{bmatrix}.        (18)
The system (16) for n_s = 2 and n_f = 1 becomes a square system of 4 equations in 4 unknowns. However, by increasing n_s and/or n_f, a least-squares based solution can be easily obtained, which is more robust wrt noise, but with some limitations, as explained in the following. If n_s is increased, the number of unknowns does not change, i.e. the complexity of the system solution remains the same, and the baseline employed for the triangulation considered in the equation system is enlarged, resulting in a better conditioned numerical problem. However, in this case more IMU samples need to be integrated, resulting in a poor solution if the quality of the IMU is low, as is typical of commercial UAVs. On the other hand, by increasing n_f the same number of IMU data is employed but the number of unknowns increases linearly: the matrix A assumes a sparse structure and the solution of the system quickly becomes inefficient; moreover, the complexity of the image feature matching algorithm increases and its outlier rejection capability decreases. Taking into account these considerations, a trade-off is required. For example, n_s = 2, 3, or 4 can be adopted if a very good IMU is available, and n_f ≤ 3 for most cases.

An important advantage of this formulation is that the solution is based on the integration of a finite number of IMU measurements, without the need to involve the previous estimation. In other words, the acceleration provided by the IMU is integrated only over a finite and short time interval. In this way, for example, the presence of a residual bias in the acceleration measurements will affect only the single velocity estimate and will not produce an accumulation of the error across successive estimates.

It is important to note that system (16) becomes singular in two cases: 1) when the selected image features are aligned along the motion direction; in this case it is sufficient to select a new candidate feature set; 2) when the velocity of the camera is constant (or zero), i.e. when the value of the integral of the acceleration over two camera observation points is very small, and hence the motion remains unobservable. To cope with the latter case, a suitable threshold can be employed to detect the occurrence of stationary vehicle configurations, by monitoring the result of the IMU integration.
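To make the construction of (16)-(18) concrete, the following sketch assembles the stacked matrix A and vector b from the per-station measurements and solves the least-squares problem in the well-conditioned case (data layout and names are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def build_system(stations, f_hat_k, N, T):
    """
    Assemble A*Theta = b, cf. (16)-(18), with Theta = [v_k (3,), d_k (nf,)].

    stations: list of dicts, one per visual station s = 1..ns, each with
        'R'     : (3,3) rotation R^k_{ks} (columns e_x, e_y, e_z),
        'XY'    : (nf,2) normalized image coordinates (X_{i,ks}, Y_{i,ks}),
        'a_bar' : (3,)  weighted IMU pre-integration a_bar^k_{ks}, cf. (12).
    f_hat_k : (nf,3) spherical coordinates of the features in the current frame k.
    """
    nf = f_hat_k.shape[0]
    rows_A, rows_b = [], []
    for s, st in enumerate(stations, start=1):
        ex, ey, ez = st['R'][:, 0], st['R'][:, 1], st['R'][:, 2]
        drift = 0.5 * st['a_bar'] * T**2
        for i in range(nf):
            X, Y = st['XY'][i]
            for g in (ex - X * ez, ey - Y * ez):
                row = np.zeros(3 + nf)
                row[:3] = s * N * T * g        # velocity block, cf. (17)
                row[3 + i] = g @ f_hat_k[i]    # distance block F_i(f_hat), cf. (17)
                rows_A.append(row)
                rows_b.append(g @ drift)       # cf. (18)
    return np.asarray(rows_A), np.asarray(rows_b)

# Least-squares solution of (16); only well conditioned while the camera accelerates:
# A, b = build_system(stations, f_hat_k, N, T)
# Theta_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
```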
IV. SLIDING LEAST-SQUARES SOLUTION

The adoption of a threshold for the online detection of constant velocity is a solution with several practical limitations. First of all, it depends on the motion condition, i.e. on the range of velocity to be estimated, and on the noise level of the available IMU. Moreover, when the motion falls below the threshold, all the information provided by the measurements is discarded, hence losing more than is really necessary. To achieve a more effective estimation of the system velocity, we propose a Sliding Least-Squares (SLS) estimation algorithm [27]. The peculiarity of the SLS algorithm is that it takes into account in the estimation only the part of the measurements that conveys useful information.

Consider the different measurements b_j and A_j recorded and saved over a window of size N_w (j = k − N_w + 1 up to j = k). Their weighted correlations are calculated as follows [27]

\Gamma = \sum_{j=k-N_w+1}^{k} \frac{\beta^{k-j}}{m_j^2}\, A_j^{\mathsf T} A_j        (19)

w = \sum_{j=k-N_w+1}^{k} \frac{\beta^{k-j}}{m_j^2}\, A_j^{\mathsf T} b_j,        (20)

where β is a forgetting factor assigned to the different measurements, in such a way as to weigh the fresh ones more. The scalar m_j is the max norm of the matrix A_j^{\mathsf T} A_j, that is m_j = \| A_j^{\mathsf T} A_j \|_{\max}, which is employed for the normalization between different measurements. The estimation objective is to obtain an estimate \hat{\Theta} that best fits the model relationship (16) for all of the registered measurements.

If the algorithm were a weighted non-recursive method, the estimate would be obtained as \hat{\Theta} = \Gamma^{-1} w. However, when the measurements do not convey enough useful information, the matrix Γ tends to be ill-conditioned. The SLS algorithm instead deals with such an occurrence by processing only the valuable part of the information differently from the other part, which is suspected to cause the ill-conditioning. This latter part is detected using the eigenvalues, through a discrimination performed according to a threshold ǫ: an eigenvalue, or its normalized value, smaller than the threshold is considered as related to the singularity. In detail, consider the eigenvalue decomposition of the matrix Γ, which is symmetric according to (19):

\Gamma = Q \Lambda Q^{\mathsf T},        (21)

where the diagonal matrix Λ contains the eigenvalues λ_i, with i = 1, ..., 3 + n_f, which are non-negative (λ_i ≥ 0) and arranged in non-increasing order (λ_i ≥ λ_{i+1}). The (3 + n_f) × (3 + n_f) matrix Q is orthogonal (Q^{-1} = Q^{\mathsf T}) and is given as

Q = \begin{bmatrix} q_1 & \cdots & q_{3+n_f} \end{bmatrix},        (22)

where q_i is the eigenvector associated with λ_i. According to the SLS algorithm, the estimate \hat{\Theta}_k at sampling time k is thus given by

\hat{\Theta}_k = \begin{cases} \Gamma^{-1} w & \text{if } \lambda_{3+n_f} > \epsilon \\ 0 & \text{if } \lambda_4 \le \epsilon \\ \left( \sum_{i=1}^{l} \frac{1}{\lambda_i}\, q_i q_i^{\mathsf T} \right) w + \left( \sum_{i=l+1}^{3+n_f} q_i q_i^{\mathsf T} \right) \hat{\Theta}_{k-1} & \text{otherwise,} \end{cases}        (23)

where in the "otherwise" case the index l is the maximum value for which λ_l > ǫ. Note that when λ_{3+n_f} > ǫ all the other eigenvalues are also larger than ǫ, and when λ_4 ≤ ǫ no significant motion is present in the visual measurements and a motionless condition is considered. In the remaining case, the system is moving with a constant velocity, hence the available measurements can be employed only to estimate the motion direction [first term of the last case in (23)], while the scale factor is automatically propagated from the previous estimate [second term of the last case in (23)]. Finally, the threshold ǫ can be chosen as a fraction of the largest eigenvalue λ_1 or, equivalently, the eigenvalues can be normalized wrt λ_1. In both cases, the tuning of the threshold becomes relative and no longer depends on the system velocity range, which was not the case in the previous formulation.
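A compact sketch of the SLS update (19)-(23) is reported below. The window handling, the forgetting-factor value, and the interpretation of the max norm m_j as the largest absolute entry of A_j^T A_j are assumptions made for illustration:

```python
import numpy as np

def sls_estimate(window, theta_prev, nf, beta=0.95, eps_rel=1e-6):
    """
    One Sliding Least-Squares step, cf. (19)-(23).
    window     : list of (A_j, b_j) pairs, oldest first, covering the last Nw samples.
    theta_prev : previous estimate Theta_hat_{k-1}, used to propagate the scale factor.
    beta       : forgetting factor (value assumed for illustration).
    """
    n = 3 + nf
    Gamma, w = np.zeros((n, n)), np.zeros(n)
    Nw = len(window)
    for idx, (A_j, b_j) in enumerate(window):
        m_j = np.abs(A_j.T @ A_j).max()            # max-norm normalization, cf. (19)-(20)
        weight = beta ** (Nw - 1 - idx) / m_j**2   # fresher measurements weigh more
        Gamma += weight * (A_j.T @ A_j)
        w += weight * (A_j.T @ b_j)

    lam, Q = np.linalg.eigh(Gamma)                 # Gamma is symmetric, cf. (21)
    lam, Q = lam[::-1], Q[:, ::-1]                 # non-increasing eigenvalue order
    eps = eps_rel * lam[0]                         # relative threshold wrt lambda_1

    if lam[n - 1] > eps:                           # full information: plain weighted LS
        return np.linalg.solve(Gamma, w)
    if lam[3] <= eps:                              # no significant motion observed
        return np.zeros(n)
    l = int(np.sum(lam > eps))                     # last informative eigenvalue
    Q_ok, Q_bad = Q[:, :l], Q[:, l:]
    # direction from fresh data, scale propagated from the previous estimate, cf. (23)
    return Q_ok @ ((Q_ok.T @ w) / lam[:l]) + Q_bad @ (Q_bad.T @ theta_prev)
```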
Fig. 3. Time history of the velocity trajectory components (x–green, y–blue, and z–red) employed during the simulation.

V. SIMULATIONS

The method proposed in the previous section is tested by means of simulations performed with MATLAB/Simulink. To evaluate the effectiveness of the SLS solution, it is compared with the inversion of the model (16) combined with a threshold on the motion detected by the IMU measurements. Both solutions are statistically evaluated on a significant number of simulations.

The vision system is assumed to be down-looking towards a planar surface containing sparse features. The idea is to emulate a case similar to a UAV equipped with an IMU and a camera that looks towards the floor. The camera is assumed to be equipped with a 180° lens system and with a sensor with a resolution of 1280×1024 pixels, running at 10 Hz. The image features are randomly distributed over the floor and cover an area of 10 × 10 m². A white Gaussian noise with a standard deviation of 0.01 mm is added to the image measurements in the normalized image plane. The IMU is assumed to run at 100 Hz (i.e. N = 10). A white Gaussian noise with a standard deviation of 0.40 m/s² is added to the acceleration measurements, which is a value compatible with most available IMUs equipping commercial UAVs.
Fig. 4. Time history of the velocity estimation (colored lines) and of the ground truth (black thin lines) obtained by inverting (16) and by applying a threshold of 1 cm on the travel detected by the IMU measurements. From top to bottom: the cases with n_s = 2, n_s = 3, and n_s = 4, all with n_f = 5.
The basic case of n_f = 1 and n_s = 2 has proven to be very sensitive to the presence of image noise, while it seems to cope suitably with noise on the acceleration. For this reason, the case of n_f = 5, which does not introduce a significant increase in the computational load, is considered. Moreover, three different cases with n_s = 2, n_s = 3, and n_s = 4 are tested to quantify the effect of the choice of the number of visual stations.

The ground truth velocity trajectory is shown in Fig. 3. The trajectory has two time intervals, 2 s long each, at the beginning and at the end, with zero velocity, a middle time interval (from 4 s to 6 s) with constant velocity, and two time intervals (from 2 s to 4 s and from 6 s to 8 s) with sinusoidal motion components. The mean value of the velocity magnitude is 0.635 m/s, while the velocity in the cruise interval is 1.3 m/s.

Figure 4 shows the velocity estimation results obtained by inverting the model (16) with a classical generalized inverse, and applying a threshold of 1 cm on the norm of r^k_{k−N} for each integration path.
Fig. 5. Time history of the velocity estimation (colored lines) and of the ground truth (black thin lines) obtained by using the proposed SLS estimation method with ǫ_k = 10⁻⁶ λ_{1,k}. From top to bottom: the cases with n_s = 2, n_s = 3, and n_s = 4, all with n_f = 5.
This value of the threshold has been chosen to the best of the authors' possibilities, according to the chosen velocity trajectory and the IMU noise level. Obviously, each different trajectory requires a specific tuning of this threshold. Three different case studies are considered: n_s = 2, n_s = 3, and n_s = 4, all with n_f = 5. First of all, it can be noticed that the adoption of the threshold determines mainly two drawbacks: 1) the presence of a residual bias when the velocity becomes constant or zero; 2) the unobservability of velocity values that produce a travel under the threshold. Moreover, the benefit achievable by increasing n_s is rapidly lost due to the increased level of noise introduced by the IMU measurements into the model (16). For these reasons, the adoption of n_s > 4 can be considered only if the employed IMU presents a low level of noise.

The improvements achieved with the adoption of the proposed SLS algorithm are shown in Fig. 5. A relative threshold ǫ_k = 10⁻⁶ λ_{1,k} is employed at each sampling time for all the considered cases. It is remarkable how the residual bias of the velocity estimation is automatically avoided when the velocity becomes zero. Moreover, when the velocity becomes constant, the ill-conditioning of the system (16) is
automatically avoided by considering only the part of the system that contains significant information.

TABLE I
RMS CORRESPONDING TO THE VELOCITY TRAJECTORY ESTIMATIONS OF FIGS. 4 AND 5, COMPUTED OVER 100 SIMULATIONS.

Case study          | Without SLS [m/s] | With SLS [m/s] | Improvement [%]
n_s = 2, n_f = 5    | 0.085             | 0.046          | 45.9
n_s = 3, n_f = 5    | 0.055             | 0.042          | 23.6
n_s = 4, n_f = 5    | 0.050             | 0.038          | 24.0
To reduce the statistical variability present in the proposed results, which is caused by the random noise and image features, a statistical comparison between the cases with and without the SLS method is presented in Table I. In particular, a set of 100 simulations has been carried out. Tests presenting statistically incoherent values (i.e. farther from the mean value than 2 times the standard deviation) have been discarded. The Root Mean Square (RMS) of the norm of the velocity estimation error has been computed over the statistically robust set and is shown in Table I for all the considered case studies. It is clear that the proposed method outperforms the standard one in all conditions, with a relative improvement ranging from 23.6% to 45.9%.

VI. CONCLUSION

A method for the on-line absolute-scale velocity estimation of a system composed of a single camera and an IMU has been presented in this paper. The proposed formulation makes use of spherical image measurements acquired from at least three camera positions and of inertial measurements to estimate the system velocity while solving the absolute scale problem. A new multi-rate formulation based on a sliding least-squares estimation has been proposed. Besides its capability of providing good velocity estimates even in cases of constant and zero velocity, it relaxes the previous constraint, which required a specific threshold tuning for each velocity trajectory. The effectiveness of the proposed approach has been shown with simulations in several case studies.

REFERENCES

[1] L. Marconi, F. Basile, G. Caprari, R. Carloni, P. Chiacchio, C. Hurzeler, V. Lippiello, R. Naldi, J. Nikolic, B. Siciliano, S. Stramigioli, and E. Zwicker, "Aerial service robotics: The AIRobots perspective," in 2nd International Conference on Applied Robotics for the Power Industry, pp. 64–69, 2012. [2] L. Marconi, R. Naldi, A. Torre, J. Nikolic, C. Huerzeler, G. Caprari, E. Zwicker, B. Siciliano, V. Lippiello, R. Carloni, and S. Stramigioli, "Aerial service robots: An overview of the AIRobots activity," in 2nd International Conference on Applied Robotics for the Power Industry, pp. 76–77, 2012. [3] V. Lippiello and F. Ruggiero, "Exploiting redundancy in Cartesian impedance control of UAVs equipped with a robotic arm," in IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3768–3773, 2012. [4] V. Lippiello and F. Ruggiero, "Cartesian impedance control of a UAV with a robotic arm," in 10th IFAC Symposium on Robot Control, pp. 704–709, 2012.
[5] R. I. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, second ed., 2004. [6] D. Nister, O. Naroditsky, and J. Bergen, “Visual odometry,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 652–659, 2004. [7] A. Davison, I. Reid, N. Molton, and O. Stasse, “MonoSLAM: Realtime single camera SLAM,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007. [8] J. Kim and S. Sukkarieh, “Real-time implementation of airborne inertial-SLAM,” Robotics and Autonomous Systems, vol. 55, no. 1, pp. 62–71, 2007. [9] F. Caballero, L. Merino, J. Ferruz, and A. Ollero, “Vision-based odometry and SLAM for medium and high altitude flying UAVs,” Journal of Intelligent and Robotic Systems, vol. 54, pp. 137–161, 2009. [10] O. Amidi, T. Kanade, and K. Fujita, “A visual odometer for autonomous helicopter flight,” Robotics and Autonomous Systems, vol. 28, no. 23, pp. 185–193, 1999. [11] A. Johnson, J. Montgomery, and L. Matthies, “Vision guided landing of an autonomous helicopter in hazardous terrain,” in IEEE International Conference on Robotics and Automation, pp. 3966–3971, 2005. [12] V. Lippiello and B. Siciliano, “Wall inspection control of a VTOL unmanned aerial vehicle based on a stereo optical flow,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4296– 4302, 2012. [13] T. Kanade, O. Amidi, and Q. Ke, “Real-time and 3D vision for autonomous small and micro air vehicles,” in IEEE Conference on Decision and Control, December 2004. [14] F. Kendoul, I. Fantoni, and K. Nonami, “Optic flow-based vision system for autonomous 3D localization and control of small aerial vehicles,” Robotics and Autonomous Systems, vol. 57, no. 67, pp. 591– 602, 2009. [15] L. Armesto, J. Tornero, and M. Vincze, “Fast ego-motion estimation with multi-rate fusion of inertial and vision,” The International Journal of Robotics Research, vol. 26, no. 6, pp. 577–589, 2007. [16] P. Gemeiner, P. Einramhof, and M. Vincze, “Simultaneous motion and structure estimation by fusion of inertial and vision data,” The International Journal of Robotics Research, vol. 26, no. 6, pp. 591– 605, 2007. [17] E. Eade and T. Drummond, “Scalable monocular SLAM,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 469–476, 2006. [18] D. Strelow and S. Singh, “Motion estimation from image and inertial measurements,” The International Journal of Robotics Research, vol. 23, no. 12, pp. 1157–1195, 2004. [19] A. Mourikis and S. Roumeliotis, “A multi-state constraint Kalman filter for vision-aided inertial navigation,” in IEEE International Conference on Robotics and Automation, pp. 3565–3572, 2007. [20] G. Baldwin, R. Mahony, and J. Trumpf, “A nonlinear observer for 6 DOF pose estimation from inertial and bearing measurements,” in IEEE International Conference on Robotics and Automation, pp. 2237–2242, 2009. [21] D. Aufderheide and W. Krybus, “A visual-inertial approach for camera egomotion estimation and simultaneous recovery of scene structure,” in IEEE International Conference on Virtual Environments HumanComputer Interfaces and Measurement Systems, pp. 6–11, 2010. [22] L. Kneip, A. Martinelli, S. Weiss, D. Scaramuzza, and R. Siegwart, “Closed-form solution for absolute scale velocity determination combining inertial measurements and a single feature correspondence,” in IEEE International Conference on Robotics and Automation, pp. 4546–4553, 2011. [23] V. Lippiello, G. 
Loianno, and B. Siciliano, “MAV indoor navigation based on a closed-form solution for absolute scale velocity estimation using optical flow and inertial data,” in IEEE Conference on Decision and Control and European Control Conference, pp. 3566–3571, 2011. [24] D. Scaramuzza, A. Martinelli, and R. Siegwart, “A toolbox for easily calibrating omnidirectional cameras,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5695–5701, 2006. [25] B. D. Lucas and T. Kanade, “An iterative image registration technique with an application to stereo vision,” in 7th International Joint Conference on Artificial Intelligence, pp. 674–679, 1981. [26] J.-Y. Bouguet, “Pyramidal implementation of the lucas kanade feature tracker description of the algorithm,” 2000. [27] M. de Mathelin and R. Lozano, “Robust adaptive identification of slowly time-varying parameters with bounded disturbances,” Automatica, vol. 35, no. 7, pp. 1291–1305, 1999.