AIAA 2005-6092
AIAA Guidance, Navigation, and Control Conference and Exhibit, 15-18 August 2005, San Francisco, California

Vision-based Approach to Obstacle Avoidance

Yoko Watanabe∗, Eric N. Johnson† and Anthony J. Calise‡
Georgia Institute of Technology, Atlanta, GA, 30332

∗Graduate Research Assistant, yoko [email protected]
†Lockheed Martin Assistant Professor of Avionics Integration, [email protected]
‡Professor, [email protected]

Copyright © 2005 by the Authors. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.

This paper describes a 3D obstacle modeling system which uses a 2D vision sensor. Prior work tracks feature points in a sequence of images and estimates their positions; in this paper, obstacle edges are used instead. Using an image segmentation technique, edges are detected as line segments. These edges are then modeled in 3D space from the measured line segments using known camera motions. The z-test method is used for correlating estimated line data with measurements. Line addition and deletion algorithms are also explained. Simulation results show that simple structures are accurately modeled by the suggested line-based estimator. Finally, this method is applied to a 3D terrain mapping problem.
I. Introduction
Unmanned aerial vehicles (UAVs) play an important role in military operations and have significant potential for commercial applications. UAVs are expected to operate in dangerous areas, such as disaster sites or enemy territory, and they can provide real-time information to the user. Various problems in UAV automation are still under investigation. One of these is obstacle detection and avoidance. If a vehicle operates in close proximity to unknown terrain or structures, its navigation system has to automatically detect obstacles, and its guidance and control systems must avoid collisions with them.

For obstacle detection, it is ideal to obtain 3D site mapping data of the terrain over which the UAV flies. Laser rangefinders can provide very accurate environmental data,1 but they are too large and heavy to install on small UAVs, and they are very expensive. Thus, a single 2D camera is chosen as the sensor for obstacle detection. A camera is a reasonable choice because it is low-cost, meets the size and weight constraints of most small UAVs, and can provide sufficient information about the vehicle's unknown operational environment in real time.

This paper considers the design of a 3D site modeling system using a single 2D camera. In some studies, vision-based terrain modeling is achieved by tracking many feature points in a sequence of images and updating estimates of their actual 3D positions.2,3 Unlike these studies, this paper describes an estimator design based on line information instead of points. In general, edges of obstacles may appear as curved lines of finite length in an image. In particular, most artificial structures such as buildings have straight edges, which appear as a set of straight line segments in an image. Therefore, our objective is to estimate actual obstacle edge lines from the line segments detected in an image through an image segmentation technique, and to create a 3D model of the obstacles. It is notable that the line-based estimator uses more structural information (points and their connectivity) than a point-based estimator to create an obstacle model.

First, every line segment in a given measurement set is matched with estimated line data. The statistical z-test value is introduced to perform this correlation.4 The z-test value is taken for a certain error index, and it is inversely related to the likelihood that a given measurement corresponds to the chosen line data. When using the z-test, both the estimation error and measurement error covariances are taken into account. For each measurement, the z-test value is calculated and the line which attains the least value is chosen. After a line is assigned, an extended Kalman filter (EKF) is applied to update the two endpoint positions of each line from the residuals of the two endpoints of the measured line segment assigned to the data. Line addition and deletion procedures are also carried out. If some measurements are not matched with any estimated line data, they are considered to be newly detected lines and are added to the database. On the other hand, if there is line data which is visible but not detected, the data is deleted. Finally, 3D models are constructed from the lines. Simulation results show that the line-based estimator is able to create a 3D model of an obstacle with sufficient accuracy. Furthermore, this work has been applied to an uneven grid terrain mapping simulation, and this result also verifies the performance of the line-based estimator.
II. Problem Formulation
A vision-based obstacle modeling problem is formulated in this section. The problem is to create a 3D model of obstacle edges using information from a single 2D vision sensor mounted on a UAV, as shown in Figure 1. Camera motion (or vehicle motion, when the camera is fixed to the vehicle) is assumed to be known. Only static obstacles are considered, and their edges are assumed to appear as a set of straight lines in an image. In addition, an image processor which detects line segments in the image from the vision sensor is required.
Figure 1. Vision-Based Obstacle Modelling
III. Estimator Design

A. Measurement
As stated in Section II, the image processor detects line segments in an image which correspond to obstacle edges. Figure 2 shows an example of these measurements. Line segments (shown as dotted lines) are detected from visual information (shown as solid lines) through an image segmentation technique. Each measurement is expressed by its two endpoint positions in image coordinates (x, y). The image coordinates are expressed in terms of the relative position of an obstacle edge to the camera center in the camera frame (X_C, Y_C, Z_C) as follows:

\[ x = \frac{X_C}{Z_C}, \qquad y = \frac{Y_C}{Z_C} \tag{1} \]

where the X_C and Y_C axes are parallel to the x and y axes respectively, and the Z_C axis is the optical axis, with the origin at the center of the camera.
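For concreteness, the projection of Equation (1), combined with the local-to-camera transformation that appears later as Equation (7), can be sketched in a few lines of Python. This is an illustrative sketch only; the function name and array conventions are ours, not part of the paper.

```python
import numpy as np

def project_to_image(X, X_c, L_CL):
    """Project a 3D endpoint X (local frame) into image coordinates.

    X    : (3,) endpoint position in the local frame
    X_c  : (3,) camera position in the local frame
    L_CL : (3, 3) direction cosine matrix, local frame -> camera frame
    """
    X_C = L_CL @ (X - X_c)              # relative position in the camera frame, Eq. (7)
    return np.array([X_C[0] / X_C[2],   # x = XC / ZC, Eq. (1)
                     X_C[1] / X_C[2]])  # y = YC / ZC
```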
B. Line Assignment
Once the line segments are measured, they are matched with estimated lines in a database. The statistical z-test value is utilized for the line assignment.4 The z-test value is taken for a given error index J, and it is the square of J divided by its variance. That is, the z-test value is inversely related to the likelihood that the chosen line data corresponds to a given measured line segment in an image. If there is a large error between the measurement and the estimated line data, but the measurement also has a large uncertainty, then the probability of their correspondence should be higher than in the case where the measurement has a small uncertainty. Therefore, each line segment should be assigned to the line which attains the least z-test value, or equivalently the highest likelihood. After projecting all the visible estimated lines onto the current image plane, signed distances d1, d2 perpendicular to each projected line, and e1, e2 parallel to each projected line, are calculated to examine the correspondence between the data and the measurement (see Figure 3).
Figure 2. Outputs of Image Processor
The distances e1, e2 are taken to the closest estimated endpoints from each measured endpoint. When the measurement and the estimated line data are perfectly matched, these distances become zero. The error index function is defined as

\[ J = d_1^2 + d_2^2 + c\,(e_1^2 + e_2^2) \tag{2} \]

where c is a constant weight. Let x_1 and x_2 be the endpoint positions of a line data projected onto the image plane, and z_1 and z_2 be the endpoint positions of a measured line segment, as shown in Figure 3. Then, all the distances d_1, d_2, e_1 and e_2 are determined by x_1, x_2, z_1, z_2 as follows:

\[ d_1 = \frac{(z_1 - x_1) \times (x_2 - x_1)}{\|x_2 - x_1\|}, \qquad d_2 = \frac{(z_2 - x_1) \times (x_2 - x_1)}{\|x_2 - x_1\|} \tag{3} \]

\[ e_1 = \frac{(z_1 - x_1) \cdot (x_2 - x_1)}{\|x_2 - x_1\|}, \qquad e_2 = \frac{(z_2 - x_1) \cdot (x_2 - x_1)}{\|x_2 - x_1\|} \tag{4} \]

Since we want to take the lateral distances from the closer estimated endpoint, we redefine e_1, e_2 as follows:

\[ e_i = \begin{cases} e_i & \text{if } |e_i| \le |l - e_i| \\ l - e_i & \text{if } |e_i| > |l - e_i| \end{cases}, \qquad i = 1, 2 \tag{5} \]

where

\[ l = \|x_2 - x_1\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{6} \]

Let X_1 and X_2 be the two endpoint positions of an estimated line in a local frame. The relationship between X_1, X_2 and x_1, x_2 is obtained from the known camera motion. Let X_c be the camera position in the local frame, and L_{CL} be the direction cosine matrix from the local fixed frame to the camera frame. Then, the relative positions of X_1 and X_2 to the camera are expressed in the camera frame as

\[ X_{C_i} = \begin{bmatrix} X_{C_i} \\ Y_{C_i} \\ Z_{C_i} \end{bmatrix} = L_{CL}(X_i - X_c), \qquad i = 1, 2 \tag{7} \]
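The quantities in Equations (2)-(6) translate directly into code. The following Python sketch (with function names of our own choosing; c is the constant weight of Equation (2)) computes the error index J for one measured segment against one projected estimated line.

```python
import numpy as np

def cross2(a, b):
    """2D cross product (scalar z-component)."""
    return a[0] * b[1] - a[1] * b[0]

def error_index(x1, x2, z1, z2, c):
    """Error index J of Eq. (2); x1, x2 are projected estimated endpoints,
    z1, z2 are measured endpoints, all in image coordinates."""
    u = x2 - x1
    l = np.linalg.norm(u)                           # Eq. (6)
    d1 = cross2(z1 - x1, u) / l                     # Eq. (3)
    d2 = cross2(z2 - x1, u) / l
    e1 = np.dot(z1 - x1, u) / l                     # Eq. (4)
    e2 = np.dot(z2 - x1, u) / l
    e1 = e1 if abs(e1) <= abs(l - e1) else l - e1   # Eq. (5): closer endpoint
    e2 = e2 if abs(e2) <= abs(l - e2) else l - e2
    return d1**2 + d2**2 + c * (e1**2 + e2**2)      # Eq. (2)
```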
Figure 3. Line Assignment
The image coordinates x_1, x_2 are then derived from Equation (1) in Section A. Since the estimation error covariance P for X_1, X_2 and the measurement error covariance R for z_1, z_2 are known, the covariance of the error function J is derived as

\[ C = C_X P C_X^T + C_z R C_z^T \tag{8} \]

where

\[ C_X = \frac{\partial J}{\partial [X_1^T\ X_2^T]^T} = \frac{\partial J}{\partial [d_1\ d_2\ e_1\ e_2]^T} \, \frac{\partial [d_1\ d_2\ e_1\ e_2]^T}{\partial [x_1^T\ x_2^T]^T} \, \frac{\partial [x_1^T\ x_2^T]^T}{\partial [X_1^T\ X_2^T]^T} \tag{9} \]

\[ C_z = \frac{\partial J}{\partial [z_1^T\ z_2^T]^T} = \frac{\partial J}{\partial [d_1\ d_2\ e_1\ e_2]^T} \, \frac{\partial [d_1\ d_2\ e_1\ e_2]^T}{\partial [z_1^T\ z_2^T]^T} \tag{10} \]

The matrices on the right-hand side of Equations (9) and (10) can be calculated as follows:

\[ \frac{\partial J}{\partial [d_1\ d_2\ e_1\ e_2]^T} = 2 \begin{bmatrix} d_1 & d_2 & c e_1 & c e_2 \end{bmatrix} \]

\[ \frac{\partial [d_1\ d_2\ e_1\ e_2]^T}{\partial [x_1^T\ x_2^T]^T} =
\begin{bmatrix}
-\frac{y_2 - z_{1y}}{l} + \frac{(x_2 - x_1) d_1}{l^2} & \frac{x_2 - z_{1x}}{l} + \frac{(y_2 - y_1) d_1}{l^2} & \frac{y_1 - z_{1y}}{l} - \frac{(x_2 - x_1) d_1}{l^2} & \frac{z_{1x} - x_1}{l} - \frac{(y_2 - y_1) d_1}{l^2} \\
-\frac{y_2 - z_{2y}}{l} + \frac{(x_2 - x_1) d_2}{l^2} & \frac{x_2 - z_{2x}}{l} + \frac{(y_2 - y_1) d_2}{l^2} & \frac{y_1 - z_{2y}}{l} - \frac{(x_2 - x_1) d_2}{l^2} & \frac{z_{2x} - x_1}{l} - \frac{(y_2 - y_1) d_2}{l^2} \\
\frac{2x_1 - x_2 - z_{1x}}{l} + \frac{(x_2 - x_1) e_1}{l^2} & \frac{2y_1 - y_2 - z_{1y}}{l} + \frac{(y_2 - y_1) e_1}{l^2} & \frac{z_{1x} - x_1}{l} - \frac{(x_2 - x_1) e_1}{l^2} & \frac{z_{1y} - y_1}{l} - \frac{(y_2 - y_1) e_1}{l^2} \\
\frac{2x_1 - x_2 - z_{2x}}{l} + \frac{(x_2 - x_1) e_2}{l^2} & \frac{2y_1 - y_2 - z_{2y}}{l} + \frac{(y_2 - y_1) e_2}{l^2} & \frac{z_{2x} - x_1}{l} - \frac{(x_2 - x_1) e_2}{l^2} & \frac{z_{2y} - y_1}{l} - \frac{(y_2 - y_1) e_2}{l^2}
\end{bmatrix} \]

\[ \frac{\partial [d_1\ d_2\ e_1\ e_2]^T}{\partial [z_1^T\ z_2^T]^T} =
\begin{bmatrix}
\frac{y_2 - y_1}{l} & -\frac{x_2 - x_1}{l} & 0 & 0 \\
0 & 0 & \frac{y_2 - y_1}{l} & -\frac{x_2 - x_1}{l} \\
\frac{x_2 - x_1}{l} & \frac{y_2 - y_1}{l} & 0 & 0 \\
0 & 0 & \frac{x_2 - x_1}{l} & \frac{y_2 - y_1}{l}
\end{bmatrix} \]

\[ \frac{\partial [x_1^T\ x_2^T]^T}{\partial [X_1^T\ X_2^T]^T} =
\begin{bmatrix}
\frac{1}{Z_{C_1}} & 0 & -\frac{X_{C_1}}{Z_{C_1}^2} & 0 & 0 & 0 \\
0 & \frac{1}{Z_{C_1}} & -\frac{Y_{C_1}}{Z_{C_1}^2} & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1}{Z_{C_2}} & 0 & -\frac{X_{C_2}}{Z_{C_2}^2} \\
0 & 0 & 0 & 0 & \frac{1}{Z_{C_2}} & -\frac{Y_{C_2}}{Z_{C_2}^2}
\end{bmatrix}
\begin{bmatrix}
L_{CL} & O \\
O & L_{CL}
\end{bmatrix} \]

If e_i = l - e_i is chosen in (5), \(\partial e_i / \partial *\) is modified to \(\partial e_i / \partial * = \partial (l - e_i) / \partial *\). Then, the z-test value is finally obtained by

\[ z_{test} = \frac{J^2}{C} \tag{11} \]
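In place of the analytic Jacobians above, the covariance C of Equation (8) and the z-test value of Equation (11) can also be approximated numerically. The sketch below uses simple forward differences; the interfaces are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def gradient_fd(f, x, eps=1e-6):
    """Forward-difference gradient of a scalar function f at x."""
    g = np.zeros(len(x))
    f0 = f(x)
    for i in range(len(x)):
        xp = np.array(x, dtype=float)
        xp[i] += eps
        g[i] = (f(xp) - f0) / eps
    return g

def ztest_value(J_val, X, z, P, R, J_of_X, J_of_z):
    """z-test value of Eq. (11). J_of_X(X) and J_of_z(z) evaluate the error
    index J as a function of the stacked endpoint state X = [X1; X2] (6,)
    and the stacked measurement z = [z1; z2] (4,), respectively."""
    C_X = gradient_fd(J_of_X, X)       # approximates Eq. (9)
    C_z = gradient_fd(J_of_z, z)       # approximates Eq. (10)
    C = C_X @ P @ C_X + C_z @ R @ C_z  # Eq. (8)
    return J_val**2 / C
```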
As stated above, each line segment is assigned to the line which attains the least z-test value among all estimated lines in the database. However, if that least z-test value is still large, the measurement does not match well with the assigned line and hence does not correspond closely to any line in the database. In this case, the measured line segment is considered to be a newly detected line, and new line data is added to the database. As discussed in Section D, we use a threshold on the z-test value to determine whether lines should be added or deleted.
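A minimal assignment loop, under the assumed interface that each database line can evaluate its z-test value against a measurement (e.g., with ztest_value above), might look as follows; z_max is the addition threshold just described. This is a sketch, not the paper's code.

```python
def assign_measurement(z, lines, z_max):
    """Return the index of the database line with the least z-test value for
    measurement z, or None if even the best match exceeds the threshold
    z_max (the measurement is then treated as a newly detected line)."""
    best_idx, best_val = None, float("inf")
    for i, line in enumerate(lines):
        val = line.ztest(z)            # assumed per-line helper
        if val < best_val:
            best_idx, best_val = i, val
    return best_idx if best_val <= z_max else None
```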
C. Extended Kalman Filter
After all measurements are assigned, an extended Kalman filter (EKF) is applied to update the estimates of the two endpoint positions X_1 and X_2 in the local frame from the residuals d_1 and d_2 defined in (3). Let

\[ X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \qquad d = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} \]

Then, the EKF update procedure is written as follows:5

\[ \hat{X}_k = \hat{X}_{k-1} + \sum K_k d \tag{12} \]

\[ P_k = P_{k-1} - \sum K_k H_k P_{k-1} \tag{13} \]

\[ K_k = P_{k-1} H_k^T \left( H_k P_{k-1} H_k^T + R_k \right)^{-1} \tag{14} \]

where \(\hat{X}_k\) is the estimate of X, P_k is its estimation error covariance matrix, and K_k is the Kalman gain at time step k. The measurement matrix H_k and the measurement error covariance matrix R_k are defined by

\[ H_k = \left. \frac{\partial d}{\partial X} \right|_{X = \hat{X}_{k-1}} = \left. \frac{\partial d}{\partial [x_1^T\ x_2^T]^T} \, \frac{\partial [x_1^T\ x_2^T]^T}{\partial X} \right|_{X = \hat{X}_{k-1}} \]

\[ R_k = \left. \frac{\partial d}{\partial [z_1^T\ z_2^T]^T} \, R \, \left( \frac{\partial d}{\partial [z_1^T\ z_2^T]^T} \right)^T \right|_{z_1 = z_{1k},\ z_2 = z_{2k}} \]

where R is the measurement error covariance for \([z_1^T\ z_2^T]^T\). H_k and R_k have already been calculated in the course of computing C_X and C_z in Equations (9) and (10). The summation in the update laws of Equations (12)-(13) is taken over all measured line segments assigned to the line estimate. Since all the measurements are independent, we can take this summation to update \(\hat{X}_k\) and P_k.

Usually, the EKF also includes a prediction procedure. However, because we are considering static obstacles (\(\dot{X} = 0\)), there is no change in the state estimate during prediction. Furthermore, if no process noise is assumed, the error covariance P_k does not change in the prediction procedure either.
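A sketch of the update of Equations (12)-(14) for one estimated line follows, summing the corrections over all segments assigned to that line. The callables H_of, R_of, and d_of stand in for the Jacobian and residual evaluations above; their signatures are assumptions for illustration.

```python
import numpy as np

def ekf_update(X_hat, P, assigned, H_of, R_of, d_of):
    """One EKF update (Eqs. 12-14) of a line's stacked endpoint state.

    X_hat    : (6,) current estimate of X = [X1; X2]
    P        : (6, 6) estimation error covariance
    assigned : measured segments z = [z1; z2] assigned to this line
    """
    dX = np.zeros_like(X_hat)
    dP = np.zeros_like(P)
    for z in assigned:
        H = H_of(X_hat, z)                             # (2, 6) measurement matrix
        Rk = R_of(z)                                   # (2, 2) measurement covariance
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + Rk)  # Eq. (14)
        dX += K @ d_of(X_hat, z)                       # summand of Eq. (12)
        dP += K @ H @ P                                # summand of Eq. (13)
    # Static obstacles: no prediction step, so the estimate and covariance
    # change only through the update.
    return X_hat + dX, P - dP
```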
D. Line Addition and Deletion
Since the camera's field of view changes due to its motion, procedures have been included for adding and deleting lines as obstacles enter and leave the field of view. As mentioned in Section B, unassigned line segments are considered newly detected lines, and a new line corresponding to each such measurement should be added to the database. In order to create a new line, the initial estimate of its endpoint positions in the 3D local frame has to be determined from 2D information. When no other information about the line is available, a new line is created in the database by assuming that the line lies on the zero-altitude surface. With this assumption, the initial estimates of the endpoint positions in the local frame can be obtained from the measurement and the current camera state, as sketched below. On the other hand, if a line in the database should be visible but is not detected by the vision sensor, then the line may no longer exist and is deleted from the database. To ensure that only lines which no longer exist are deleted, only lines with no measurement assigned for more than N (≥ 1) consecutive time steps are removed.
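The zero-altitude initialization amounts to back-projecting each measured endpoint along its camera ray until it intersects the ground plane. A minimal sketch, assuming a local frame whose Z axis points down (so zero altitude is Z = 0); the frame convention and function name are ours:

```python
import numpy as np

def init_line_on_ground(z1, z2, X_c, L_CL):
    """Initial 3D endpoint estimates for a newly detected line, assuming the
    line lies on the zero-altitude surface (illustrative sketch).

    z1, z2 : measured endpoints in image coordinates (x, y)
    X_c    : (3,) camera position in the local frame
    L_CL   : (3, 3) direction cosine matrix, local frame -> camera frame
    """
    def backproject(z):
        ray = L_CL.T @ np.array([z[0], z[1], 1.0])  # camera ray in the local frame
        s = -X_c[2] / ray[2]                        # scale so the ray reaches Z = 0
        return X_c + s * ray
    return backproject(z1), backproject(z2)
```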
IV. Simulation Results
Simulation results of 3D obstacle modeling using the line-based estimator described in Section III are shown here. Figure 4 compares the actual obstacle and its estimated model for three different simple obstacles: a line, a plane, and a pyramid. A circle above each obstacle represents the camera trajectory. The camera has a fixed mounting angle of π/6 from the vertical downward axis. From the figures at the first time step t = 0.02, we can see that the initial estimates of detected lines lie on the zero-altitude surface. Then, after the camera flies over each obstacle, all three obstacles are modeled with sufficient accuracy. Consider the plane obstacle case: the result in Figure 4 is obtained when the whole obstacle is always within the camera's field of view, which means the line addition/deletion algorithms introduced in Section D are not needed. Figure 5 shows the estimation results for a planar obstacle with two different camera trajectories. One is a circular trajectory, similar to the one in Figure 4 but with a smaller radius. The other is a zig-zag trajectory with the camera looking straight downward. Both camera trajectories are close to the obstacle, so the camera can see only a part of the obstacle at each time instant. The results show that the line addition algorithm works properly. Though the resulting obstacle model has some missing parts, most of the obstacle is accurately estimated.
Figure 4. Obstacle Modelling Results (line, plane, and pyramid obstacles; estimates shown at t = 0.02, 5, 10, 15, and 20)
Figure 5. Obstacle Modelling Result (Line Addition/Deletion) (circle and zig-zag camera trajectories; estimates shown at t = 0.02, 8, 16, 20, and 24)
V. Application

A. 3D Grid Terrain Mapping Simulator
The line-based estimator can be applied to 3D terrain mapping by partitioning the terrain into a grid of many cells and applying the line detection techniques described above. Figure 6 shows a vision-based 3D terrain mapping simulator. The top-right window illustrates the current states, including the camera position and attitude, the estimated terrain model, and the actual grid terrain. The bottom-right window shows the camera view and the outputs of the image segmentation. A navigation system for the camera motion and an image processor have been implemented in the simulator. From image segmentation,6 the image processor outputs the position of each point in the image and the connectivity between all points. The connectivity corresponds to line segments, which are the measurements for the line-based estimator designed in this paper, as sketched below.
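Converting that output into segment measurements is a simple mapping; the data layout below (a point-id-to-position dictionary and a list of connected id pairs) is an assumption for this sketch, not the simulator's actual interface.

```python
def segments_from_connectivity(points, connectivity):
    """Build line-segment measurements from the image processor's output.

    points       : dict mapping point id -> (x, y) image position
    connectivity : iterable of (i, j) pairs of connected point ids
    """
    return [(points[i], points[j]) for i, j in connectivity]
```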
Figure 6. 3D Grid Terrain Modelling Simulation
B. Simulation Result
All the algorithms discussed in this paper are implemented in the vision-based 3D grid terrain mapping simulator, and the estimation results are shown in this section. The camera moves over the terrain in a circle of radius r = 100 with a constant speed V = 20. The simulation runs for 200 seconds, during which the camera completes approximately six full circles. Figure 7 shows position estimation errors for three different sample points: Points 73, 139, and 227. Figure 8 illustrates the 3D terrain model constructed by the line-based estimator at t = 0, 50, 100, 150, and 200. Since the initial estimate for each terrain edge lies on the zero-altitude surface, the terrain model at the beginning is flat. As the camera flies over the terrain, the estimation errors converge to zero, as shown in Figure 7. The last picture in Figure 8 shows that a sufficiently accurate terrain model is obtained after 200 seconds. The estimation convergence is slow in this simulation because the camera is moving fast and each terrain edge is in the field of view for only a short time interval per revolution. The rate of convergence can be improved by taking a larger initial covariance P_0 in the EKF.
Figure 7. Estimation Error for Sample Points (errors eX, eY, eZ for Points 73, 139, and 227 over 0-200 seconds)
Figure 8. Grid Terrain Estimates (snapshots at t = 0, 50, 100, 150, and 200)
VI. Future Work
This work represents initial steps toward the development of a vision-based autonomous obstacle avoidance system for UAVs. The next step will be to create a guidance system that uses the 3D obstacle models constructed by the line-based estimator described in this paper. The guidance system will be designed to achieve predefined mission objectives while avoiding detected obstacles. It has been observed that obstacle edge estimation results depend on the camera motion with respect to the obstacles, so the guidance policy will likely make use of the estimation accuracy. Its goal will be to enable the UAV to fly to a destination while avoiding obstacles and collecting sufficient information to build a partial 3D site map. Development of an obstacle edge estimator for moving obstacles is also future work. For example, if other UAVs operate in close proximity to our UAV, they could be considered moving obstacles. Designing an estimator for moving obstacles is challenging, since the system will need to account for obstacle motion, which is not currently included in the estimator formulation.
VII. Conclusion
A 3D obstacle modeling system which uses a single 2D camera has been designed and implemented using line matching. First, an algorithm which assigns each detected line segment to an estimated line was introduced. The z-test value, which is inversely related to the likelihood of correspondence between measurements and estimates, is utilized in the line assignment algorithm, and each measured line segment is matched with the line which attains the least z-test value. However, if the least z-test value is not sufficiently small, that measurement remains unassigned. After the line matching, an extended Kalman filter is applied to update each line estimate in the database. The EKF is formulated to estimate the two endpoint positions of a line in the local frame from the residuals between the two endpoints of a measured line segment and the projected line on the image plane. In addition, procedures for line addition, for newly detected lines, and for line deletion, for lines estimated incorrectly, were explained. Simulation results modeling three different simple obstacles verify the performance of the estimator. This work has also been applied to a 3D grid terrain modeling simulation, where it correctly estimated the 3D terrain model. In future work, a guidance policy which achieves obstacle avoidance as well as site modeling will be considered, as will an extension for estimating moving obstacles.
VIII. Acknowledgement
This work was supported in part by AFOSR MURI #F49620-03-1-0401, Active Vision Control Systems for Complex Adversarial 3-D Environments, and by the Office of Naval Research, #N00014-05-M-0002.
References
1. Miller, J. R., and Amidi, O., "3-D Site Mapping with the CMU Autonomous Helicopter," 5th International Conference on Intelligent Autonomous Systems (IAS-5), June 1998.
2. Morita, T., and Kanade, T., "A Sequential Factorization Method for Recovering Shape and Motion from Image Streams," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 8, Aug. 1997.
3. Meingast, M., Geyer, C., and Sastry, S., "Vision Based Terrain Recovery for Landing Unmanned Aerial Vehicles," IEEE Conference on Decision and Control, Dec. 2004.
4. Hayter, A. J., Probability and Statistics, Duxbury, 2002.
5. Brown, R. G., and Hwang, P. Y. C., Introduction to Random Signals and Applied Kalman Filtering, John Wiley & Sons, 1997.
6. Tannenbaum, A., "Three snippets of curve evolution theory in computer vision," Mathematical and Computer Modeling, Vol. 24, pp. 103-119, 1996.