Motion and Structure from One Dimensional Optical Flow

Enrico De Micheli
Istituto di Cibernetica e Biofisica, Consiglio Nazionale delle Ricerche, Via Dodecaneso 33, 16146 Genova, Italy

Fabrizio Giachero
D.I.S.I., Università di Genova, Viale Benedetto XV 3, 16100 Genova, Italy
Abstract

This paper is based on the observation that if the viewing camera is appropriately mounted on a vehicle which moves on a planar surface, the problem of motion and structure recovery from optical flow becomes linear and, in principle, can be solved locally. It is shown that angular velocity and depth can be computed from one component only of the optical flow, and that depth estimated from the vertical component is more accurate than depth estimated from the horizontal component. Experiments on synthetic and real sequences support the presented analysis.

1063-6919/94 $3.00 © 1994 IEEE

1 Introduction

Many of the techniques for the reconstruction of motion and three-dimensional (3D) structure of the viewed scene consist of two steps: first the optical flow is computed from the sequence of images, then the equations which relate optical flow to motion and structure are solved. The key point of the paper is the observation that, if the motion of the vehicle is restricted to a planar surface, the optical flow equations can be drastically simplified. It is found that the computation of the full 2D optical flow over the entire image plane is unnecessary. The angular velocity can be optimally recovered from the horizontal component only of the optical flow along the vertical line which goes through the optical center of the image plane. Once the angular velocity has been determined, depth can be estimated by means of one component only of the optical flow, and the vertical component is found to be better suited than the horizontal component. Therefore, only one-dimensional (1D) optical flow [3] needs to be computed to obtain reliable information on the viewed motion and structure.

2 Planar navigation

In this section the general problem of passive navigation is restricted to the case in which the viewing system moves on a planar surface. No assumption is made on the spatial structure of the viewed scene, and the motion is otherwise arbitrary. Let the center of projection be the origin of the reference system (X, Y, Z) attached to the camera, f the focal length, and let the normal vector of the image plane be parallel to the Z-axis. The optical flow (u, v) at a point p = (x = fX/Z, y = fY/Z, f) on the image plane can be thought of as the perspective projection of the 3D velocity field V, representing the relative motion between the camera and the environment, at P = (X, Y, Z). In the case of arbitrary rigid motion, V can be written in terms of the translational velocity T = (T_X, T_Y, T_Z) and the angular velocity Ω = (Ω_X, Ω_Y, Ω_Z) between the camera and the environment. Thus, the two components u and v of the optical flow can be written as [1, 2]

$$u = \frac{T_X f - T_Z x}{Z} + \frac{(x^2 + f^2)\,\Omega_Y - y\,(x\,\Omega_X + f\,\Omega_Z)}{f},$$

$$v = \frac{T_Y f - T_Z y}{Z} - \frac{(y^2 + f^2)\,\Omega_X - x\,(y\,\Omega_Y + f\,\Omega_Z)}{f}.$$

In this general setting, the problem of recovering the viewed motion and structure from optical flow at n points is nonlinear and can be stated as the problem of solving n pairs of equations for u and v in 6 + n unknowns, i.e. the six motion parameters T and Ω and the depth Z at each of the n points. If the motion of the camera is restricted to a planar surface π, say normal to the Y-axis, the optical flow equations can be greatly simplified. The hypothesis of planar motion is clearly relevant to many natural and cultural scenarios. Two further assumptions will be needed: (i) the image plane of the viewing system is orthogonal to π, and (ii) the optical axis is parallel to
the instantaneous direction of translation. In the case of robot motion, hypothesis (ii) means that the optical axis of the camera has to be aligned with the robot wheels. By virtue of (i) and (ii) the translational and angular velocities can be written as T = (0, 0, T) and Ω = (0, Ω, 0). The substitution of these expressions into the equations of the optical flow yields

$$u = \Omega f + x\left(\frac{\Omega x}{f} + \frac{1}{\tau}\right), \qquad v = y\left(\frac{\Omega x}{f} + \frac{1}{\tau}\right). \tag{1}$$

With these assumptions the problem becomes linear and consists of 2 equations in 2 unknowns, Ω and 1/τ. Note that τ = −Z/T, the time required to cover the distance Z at the speed |T|, reflects the well known ambiguity between absolute distance and speed of a viewed point and can be thought of as the depth Z in time units. The analysis of the existence and uniqueness of the solution to Eqs. 1 is trivial. With the exception of the points for which y vanishes, that is, the points on the horizontal line l_h which goes through the optical center, the solution of Eqs. 1 in terms of both components of the optical flow takes the form

$$\Omega = \frac{u y - v x}{f y}, \qquad \frac{1}{\tau} = \frac{f^2 v - x\,(u y - v x)}{f^2 y}. \tag{2}$$

2.1 Angular velocity

By differentiating Eq. 2a, if Δu and Δv are the errors in the computation of u and v, the error ΔΩ on the angular velocity can be written as

$$\Delta\Omega = \frac{1}{f}\left(\Delta u - \frac{x}{y}\,\Delta v\right). \tag{3}$$

From Eq. 3 it can be concluded that ΔΩ is minimum at the points with x = 0, that is, the points which lie on the vertical line l_v through the center of projection. For x = 0, Ω = u/f, which implies that u is constant over l_v. It follows that, in order to optimally estimate the angular velocity, it is sufficient (a) to compute the horizontal component of the optical flow along the vertical line l_v, and (b) to estimate Ω as

$$\Omega = \frac{\langle u \rangle}{f}, \tag{4}$$

where ⟨u⟩ is the average of u along l_v. It is remarkable that Eq. 4 holds for arbitrary (planar) motion and is independent of the 3D structure of the scene.

2.2 3D structure

By substituting the expression for Ω into the equation for τ, this reads

$$\frac{1}{\tau} = \frac{v}{y} - \frac{x\,\Omega}{f}. \tag{5}$$

Therefore, once the angular velocity has been recovered, the 3D structure of the scene can be reconstructed by computing only the vertical component of the optical flow. For most typical values of f and T ≠ 0, the term |xΩ/f| is usually much smaller than |v/y|. This agrees with the intuitive fact that a rotation around a vertical axis has a small (and purely perspective) effect on the vertical component of the apparent motion. The stability analysis of Eq. 5 is straightforward. In obvious notation, the error on the inverse of the estimated depth, Δ(1/τ), writes

$$\Delta\!\left(\frac{1}{\tau}\right) = \frac{\Delta v}{y} - \frac{x}{f}\,\Delta\Omega. \tag{6}$$

Thus, the larger the distance from the horizontal line l_h, the smaller the error on the depth computed from Eq. 5. From Eq. 6 it follows that the error increases with |x|, but since |v/y| is usually much larger than |xΩ/f| this dependence should be hardly noticeable. The symmetry between the X- and Y-axes is broken by the assumption of planar motion and by the particular positioning of the viewing camera on the moving vehicle. The first asymmetric finding was that Eqs. 1 cannot be solved along the horizontal line l_h of the image plane. The second asymmetric result was that the horizontal component of the optical flow is better suited than the vertical component for the computation of the angular velocity. The third one is that the two components of the optical flow cannot be equivalently used for depth estimation. By using Eqs. 2, let us write τ as a function of Ω and u:

$$\frac{1}{\tau} = \frac{u}{x} - \left(\frac{f}{x} + \frac{x}{f}\right)\Omega. \tag{7}$$

Eq. 7 is only superficially similar to Eq. 5. Since a rotation around a vertical axis has a large effect on the horizontal component of the apparent motion, the second term on the right-hand side of Eq. 7 is now also important. The stability analysis of Eq. 7 yields

$$\Delta\!\left(\frac{1}{\tau}\right) = \frac{\Delta u}{x} - \frac{x^2 + f^2}{f x}\,\Delta\Omega. \tag{8}$$

From Eq. 8 it follows that the accuracy in the reconstruction of the 3D structure from the horizontal component of the optical flow depends rather critically on the horizontal location of the image point (the error is unbounded for x → 0). Since the vertical location is also critical (the optical flow equations cannot be solved along l_h), the use of the horizontal component of the optical flow for depth estimation is unlikely to be effective.
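The planar-motion pipeline of Eqs. 1, 4 and 5 can be checked with a short numeric sketch; the focal length, angular velocity and depths below are illustrative assumptions, not values from the paper:

```python
# Sketch of the planar-motion recovery of section 2. Symbols f, Omega, tau
# follow the text; the point grid and all numeric values are assumptions.
f = 500.0          # focal length (pixels)
Omega = 0.02       # angular velocity around the Y-axis (rad/frame)

def flow(x, y, inv_tau):
    """Optical flow of Eqs. 1 for planar motion T = (0,0,T), Omega = (0,Omega,0)."""
    u = Omega * f + x * (Omega * x / f + inv_tau)
    v = y * (Omega * x / f + inv_tau)
    return u, v

# Eq. 4: Omega is the average of u/f along the vertical line l_v (x = 0),
# independently of the depth at each point.
ys = [-100.0, -50.0, 25.0, 80.0]
inv_taus = [1.0 / 40, 1.0 / 25, 1.0 / 60, 1.0 / 33]   # arbitrary structure
us = [flow(0.0, y, it)[0] for y, it in zip(ys, inv_taus)]
Omega_hat = sum(us) / len(us) / f

# Eq. 5: with Omega known, 1/tau follows from the vertical component alone.
x, y, inv_tau = 120.0, -90.0, 1.0 / 50
u, v = flow(x, y, inv_tau)
inv_tau_hat = v / y - x * Omega_hat / f

print("Omega_hat =", Omega_hat, " 1/tau_hat =", inv_tau_hat)
```

With noise-free flow both quantities are recovered exactly, which is the linearity claim of section 2 in miniature.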
2.3 Non planar motion

The previous analysis can be extended to the simple but interesting case of non planar motion, in which the viewing camera can also rotate around the X-axis, i.e. Ω = (Ω_X, Ω_Y, 0). By inserting Ω into the equations of the optical flow, and by following the same arguments of section 2.1, the two components of the angular velocity can be robustly recovered by suitably averaging the components of the optical flow along the two lines l_v (i.e. x = 0) and l_h (i.e. y = 0), to obtain

$$\Omega_Y = \frac{\langle u \rangle_{l_v}}{f}, \qquad \Omega_X = -\frac{\langle v \rangle_{l_h}}{f}. \tag{9}$$

Once Ω has been recovered, the depth values can be obtained in two alternative ways:

$$\frac{1}{\tau} = \frac{v}{y} - \frac{x\,\Omega_Y}{f} + \frac{y^2 + f^2}{f y}\,\Omega_X, \tag{10}$$

$$\frac{1}{\tau} = \frac{u}{x} + \frac{y\,\Omega_X}{f} - \frac{x^2 + f^2}{f x}\,\Omega_Y. \tag{11}$$

Even in this case, Eqs. 10 and 11 are not equivalent. The structural instability in the recovery of τ is brought by the third term on the right-hand side of Eqs. 10 and 11. Since the magnitude of the "unstable terms" is determined by the absolute values of the components of the angular velocity, the equation which will give more stable depth estimates can be selected by simply checking the ratio |Ω_X/Ω_Y|. Consequently, for |Ω_X/Ω_Y| < 1 the unstable term of Eq. 10 is smaller than that of Eq. 11, and thus Eq. 10 will yield less noisy depth estimates. Obviously, for |Ω_X/Ω_Y| > 1 better results will be obtained through Eq. 11, while for |Ω_X/Ω_Y| ≈ 1 Eqs. 10 and 11 are indeed almost equivalent.

3 Experimental results

Figure 1 displays the 3D reconstruction obtained from one horizontal row of two synthetically produced optical flows, simulating the motion of a slanted planar surface which is moving toward the image plane. To better simulate motion in the real world, the two synthetic motions of the planar surface were corrupted by Gaussian white noise. Notice that the results displayed in Fig. 1 are in very good agreement with the theoretical analysis of the previous section. While depth estimates from the vertical component of the optical flow are almost independent of both the angular velocity and the horizontal location of the image point, the accuracy of the depth estimates from the horizontal component is much lower, becomes meaningless near x = 0, and decreases noticeably when Ω increases. In the second experiment the viewing camera was mounted on a robot navigating within a laboratory. Figure 2a shows the scene viewed from the robot, which is composed of three boxes placed in front of the robot against a quite cluttered background. The distances between the camera and the three obstacles were, from left to right in the image, 1.00 m, 1.87 m, and 2.5 m respectively, while the distance between the camera and the background was approximately 3.2 m. The motion of the robot consisted of a random sequence of translations and rotations, whose magnitudes were randomly chosen within a certain range. The goal of the experiment was twofold. During its translational motion, the robot had to compensate for the random appearance of rotational movements in order to keep going straight. This task was accomplished by simply evaluating Ω_Y from the horizontal component

Figure 1: Top view of depth estimates from synthetic optical flows corrupted by white noise. Depth values were evaluated along one horizontal line on the image plane by using Eq. 5 (filled circles) and Eq. 7 (open triangles). The optical flows reproduced the motion of a slanted planar surface. The slanted straight lines mark the true location of the moving plane. The motion parameters were (a): T = (0, 0, 100)/frame, Ω = (.02, .03, 0) deg/frame; (b): T = (0, 0, 100)/frame, Ω = (.02, .3, 0) deg/frame. From Eq. 4 we had (a): Ω_Y = .0297 ± .0001 deg/frame, (b): Ω_Y = .299 ± .001 deg/frame.
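The qualitative contrast shown in Figure 1 between Eq. 5 (stable) and Eq. 7 (unstable near x = 0) can be reproduced with a small Monte Carlo sketch; the noise level, point position and motion parameters below are assumed for illustration only:

```python
# Assumed-value illustration of why Eq. 5 beats Eq. 7 for depth: with Omega
# known, flow noise enters Eq. 5 as dv/y and Eq. 7 as du/x, so a point with
# |x| << |y| is handled far better by the vertical component (Eq. 5).
import random

random.seed(0)
f, Omega, inv_tau = 500.0, 0.02, 1.0 / 40
x, y = 10.0, 100.0                               # near the vertical line l_v
u = Omega * f + x * (Omega * x / f + inv_tau)    # Eqs. 1 (planar motion)
v = y * (Omega * x / f + inv_tau)

sigma = 0.1                                      # flow noise (pixels/frame)
err5 = err7 = 0.0
trials = 2000
for _ in range(trials):
    un = u + random.gauss(0.0, sigma)
    vn = v + random.gauss(0.0, sigma)
    est5 = vn / y - x * Omega / f                # Eq. 5
    est7 = un / x - (f / x + x / f) * Omega      # Eq. 7
    err5 += (est5 - inv_tau) ** 2
    err7 += (est7 - inv_tau) ** 2

rms5 = (err5 / trials) ** 0.5
rms7 = (err7 / trials) ** 0.5
print(f"RMS error on 1/tau: Eq.5 {rms5:.5f}  Eq.7 {rms7:.5f}")
```

The RMS error of Eq. 7 comes out roughly |y/x| times larger than that of Eq. 5, matching the spread of the open triangles in Figure 1.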
Move | Nominal rotation (deg/frame) | Nominal translation (cm/frame) | Estimated Ω_Y (deg/frame)
1 | 0 | 10 | -1.32 ± .05
2 | -4 | 0 | -4.73 ± .10
3 | 0 | 10 | 1.38 ± …
4 | 0 | 15 | 1.52 ± .05
5 | 4 | 0 | 4.07 ± .05
6 | 0 | 15 | -1.12 ± .06
7 | -6 | 0 | -6.24 ± .02
8 | 0 | 10 | -1.34 ± .09
Table 1: Angular velocity from the image sequence acquired from the robot. The second and third columns reproduce the temporal sequence of pure rotations and pure translations which composed the motion of the robot. The agreement between the nominal values reported and the true movements of the robot is within about 1 deg/frame for the rotations and within about 1 cm/frame for the translations. The angular velocities shown in the fourth column have been computed from the horizontal component u of the optical flow with Eq. 4.
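Returning to the non planar analysis of section 2.3, the two depth equations and the selection rule between them can be sketched numerically; the focal length, angular velocities, point position and depth below are assumed values chosen so that |Ω_X/Ω_Y| < 1:

```python
# Sketch of section 2.3 (assumed values): with noise-free flow both Eqs. 10
# and 11 recover 1/tau exactly; the rule |Omega_X / Omega_Y| then picks the
# equation whose unstable third term carries the smaller angular component.
f = 500.0
Ox, Oy = 0.005, 0.02          # Omega_X, Omega_Y (rad/frame), |Ox/Oy| < 1
x, y, inv_tau = 80.0, -120.0, 1.0 / 45

# Optical flow for T = (0, 0, T) and Omega = (Omega_X, Omega_Y, 0)
u = x * inv_tau + (x * x + f * f) * Oy / f - x * y * Ox / f
v = y * inv_tau - (y * y + f * f) * Ox / f + x * y * Oy / f

inv_tau_10 = v / y - x * Oy / f + (y * y + f * f) * Ox / (f * y)   # Eq. 10
inv_tau_11 = u / x + y * Ox / f - (x * x + f * f) * Oy / (f * x)   # Eq. 11

# Selection rule: prefer Eq. 10 when |Ox| < |Oy|, Eq. 11 when |Ox| > |Oy|
preferred = 10 if abs(Ox) < abs(Oy) else 11
print("1/tau via Eq.10:", inv_tau_10, " via Eq.11:", inv_tau_11,
      " preferred:", preferred)
```

Here the rule selects Eq. 10, whose unstable term is proportional to the smaller component Ω_X.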
Figure 2: (a): One frame of a sequence acquired within the laboratory from the camera mounted on the mobile platform "Robuter" by Robosoft. The robot moved according to the sequence of translations and rotations described in Tab. 1. (b), (c) and (d): Top view of the 3D map recovered from the lower, central and upper scanlines displayed in (a), respectively. Depth values have been computed from the vertical component of the optical flow. The optical flow was computed according to the algorithm described in [4]. The values displayed on the Z axis indicate the average distance from the camera of the "plateaux" corresponding to the different objects present in the scene.
of the optical flow through Eq. 4. Table 1 summarizes the results obtained over the sequence of one trial. The nominal rotation, which corresponds to the true rotation of the robot with an uncertainty of about ±1 degree, is displayed in the second column. In the case of pure translation the amount of the translation is shown in the third column. The fourth column shows the estimates of Ω_Y obtained by averaging the u component of the optical flow over five equidistant locations on the vertical axis l_v of the image plane. It can be seen that the agreement between estimated and nominal angular velocity is quite good over the entire sequence. The inaccuracy of the robot motion, which is due in large part to the limits of the motor control system and to the irregularities of the ground floor, is evidenced by the detection of a small amount of rotation even in the presence of a nominal pure translation. The second goal of the experiment was to recover a raw description of the 3D structure of the viewed scene. Figures 2b, c, and d reproduce the depth estimates which were obtained by using only the v component of the optical flow along the three horizontal scanlines shown in Fig. 2a. The data depicted were obtained at the first translational move of the motion sequence displayed in Tab. 1. The map of Fig. 2b correctly displays the nearly constant depth at the scanline that corresponds to the floor, while from Figs. 2c and d the three obstacles are clearly detectable.
References

[1] Longuet-Higgins, H.C., and Prazdny, K., 1980. The interpretation of a moving retinal image. Proc. R. Soc. London B 208, 385-397.
[2] Prazdny, K., 1981. Egomotion and relative depth from optical flow. Biol. Cybern. 36, 87-102.
[3] De Micheli, E., Uras, S., and Torre, V., 1992. The accuracy of the computation of optical flow and of the recovery of motion parameters. IEEE Trans. Pattern Anal. Mach. Intell. 15(5), 434-447.
[4] Campani, M., and Verri, A., 1990. Computing optical flow from an overconstrained system of linear algebraic equations. Proc. 3rd Intern. Conf. Comput. Vision, Osaka, 22-26.