Hough based Terrain Classification for Realtime Detection of Drivable Ground

Jann Poppinga, Andreas Birk, and Kaustubh Pathak
Robotics, EECS, Jacobs University Bremen∗
D-28759 Bremen, Germany
http://robotics.iu-bremen.de
{j.poppinga, a.birk, k.pathak}@iu-bremen.de

∗ International University Bremen until spring 2007

Abstract

The usability of mobile robots for surveillance, search and rescue missions can be significantly improved by intelligent functionalities that decrease the cognitive load on the operator or even allow autonomous operations, e.g., when communication fails. Mobility in this regard is not only a mechatronic problem but also a perception, modeling and planning challenge. Here, the perception issue of detecting drivable ground is addressed, an important issue for safety, security, and rescue robots, which have to operate in a vast range of unstructured, challenging environments. The simple yet efficient approach is based on the Hough transform of planes. The idea is to design the parameter space such that drivable surfaces can be easily detected by the number of hits in the bins corresponding to drivability. A decision tree on the bin properties increases robustness as it allows handling uncertainties, especially sensor noise. In addition to the binary distinction of drivable/non-drivable ground, a classification of terrain types is possible. The algorithm is applied to 3D data obtained from two different sensors, namely a time-of-flight camera and a stereo camera. Experimental results are presented for indoor and outdoor terrains, demonstrating robust realtime detection of drivable ground. Seven datasets recorded under widely varying conditions are used. About 6,800 snapshots of range data are processed in total. It is shown that drivability can be robustly detected with success rates ranging between 83% and 100%. Computation is extremely fast, in the order of 5 to 50 msec.

1 Introduction

Existing fieldable solutions for safety, security, and rescue robotics (SSRR) are optimized for core locomotion and ruggedness (Shah and Choset, 2004; Murphy, 2004; Davids, 2002). Even with plain tele-operation the systems are already very useful devices (Snyder, 2001; Abouaf, 1998). But any added intelligent functionality can significantly improve their usability (Birk and Carpin, 2006; Murphy et al., 2001). First of all, there is a tremendous cognitive load on human operators during SSRR missions (Scholtz et al., 2004), which can be significantly eased by adding automated support for mobility, perception and planning. Second, various degrees of intelligence up to fully autonomous behavior are desirable, for example to overcome dropouts in the communication systems or to allow a single operator to handle a whole team of robots. Terrain classification for determining the negotiability of a surface is in this context an important asset for an SSRR robot. It not only forms the basis of any autonomous mobility behavior, it can also be used to annotate maps. In general, terrain classification is becoming more and more important as mobile robots increasingly operate in unstructured environments.

Early work on this topic can be found in the field of walking machines, where the negotiation of rough terrain is of obvious interest. The Ambler legged robot, for example, generates elevation maps that are used by a planning module to select suited footfall locations (Hoffman and Krotkov, 1993; Hoffman and Krotkov, 1991). Walking machines have in this respect the tremendous advantage that, unlike wheeled or tracked locomotion systems, they do not rely on more or less continuous flooring. On the downside, legged robots suffer from disadvantages with respect to payload and power consumption, i.e., operation time. Tracked systems are hence predominantly used in application oriented domains where versatile locomotion is needed that can handle rough terrain (Hardarsson, 1997; Wong, 2001).

One option for terrain classification in general is to acquire 3D data and to generate a complete 3D environment model (Howard et al., 2004; Unnikrishnan and Hebert, 2003; Lacroix et al., 2002; Lacroix et al., 2001; Thrun et al., 2000; Gennery, 1999), which is then used to determine negotiable paths. In doing so, simple environment cues like texture and color can help (Talukder et al., 2002; Howard et al., 2001). But these approaches are nevertheless computationally expensive and therefore not really suited for realtime application on mobile robots (Niwa et al., 2004). Also, an efficient generation of the 3D environment model is not independent of the navigation process; ideally it steers it by some next-best-view planning (Lozano Albalate et al., 2002).

Another line of research is to use the interaction between the terrain and the locomotion system for classification purposes (Larson et al., 2004; Iagnemma et al., 2004; Iagnemma et al., 2002). This is obviously not a full-fledged alternative to distinguishing drivable from non-drivable ground ahead of the robot's current position, but it is beneficial to adjust control and it can help in planning. This holds especially with respect to walking machines, where the estimation of the compliance or slope of the ground at the moment of foot contact is crucial (Kurazume et al., 2002; Lewis and Bekey, 2002; Yoneda and Hirose, 1995). But as mentioned before, for a wheeled or tracked system, it is more important to determine some way ahead whether a surface is negotiable.

The most impressive results regarding the detection of drivable terrain were achieved in the DARPA Grand Challenge 2005, where five vehicles out of the 23 finalist teams managed to autonomously drive a 212 km desert course (Iagnemma and Buehler, 2006a; Iagnemma and Buehler, 2006b). The conditions and constraints for these feats are somewhat different from the SSRR scenarios that form the basis for the work presented in this article. The vehicles for the Grand Challenge have a significant payload. This allows high computation power by using multiple onboard computers. Furthermore, each vehicle carries a rich selection of high end range sensors, especially multiple high quality laser range finders and navigational radar. On the other hand, the vehicles travel fast and the risks of false assessments are extremely high. The main concern of the Grand Challenge approaches is accordingly to generate high quality, long range representations of the environment. The Red Team, which ended with its two vehicles on 2nd and 3rd place, mainly relies on long range navigational radar and a gimbaled Riegl lidar for binary obstacle detection up to 50 m. Some terrain classification is only done in closer range of 20 m using two SICK LMS 291 laser range finders. This classification does not exploit 3D information. It operates only on single scans into which lines are fitted. This is used to calculate a cost heuristic for path-planning (Urmson et al., 2006).

The winning Stanford Racing Team (Thrun et al., 2006) is able to generate high quality 3D point clouds, which are then turned into occupancy maps for path-planning by looking at local elevation differences. Key elements for generating the point cloud are a Kalman filter based vehicle state estimation and a Markov model for pose drift, which takes the sequential character of the data acquisition into account. The approach is heavily parameter dependent. The system is hence trained by a human driver, who is instructed to only pass over obstacle free terrain. The intensive supervised optimization of the parameters then leads to a drop of false positives from 12.6% to 0.002% (Thrun et al., 2006). In addition, images from a monocular camera are used for long range obstacle detection. A fairly standard computer vision algorithm uses color information to estimate drivable terrain. The information from the short range laser generated map is then projected into the image and used for adaptation of the parameters (Thrun et al., 2006).

Here, a novel approach to terrain classification is presented, which is based on the following idea. Range images, here from simple 3D sensors in the form of an optical time-of-flight camera and a stereo camera, are processed with a Hough transform. Concretely, a discretized parameter space for planes is used. The parameter space is designed such that each drivable surface leads to a single maximum, whereas non-drivable terrain leads to data-points spread over the space. In addition to this binary distinction, more refined classifications of the distributions are possible, allowing the recognition of different categories like plain floor, ramp, rubble, obstacle, and so on. This transform can be computed very efficiently and allows a robust classification in real-time, as illustrated with exhaustive indoor and outdoor experiments.

Our interest in the problem is motivated by work on intelligent behaviors up to full autonomy on a line of rescue robots developed in the Robotics Laboratory of Jacobs University Bremen (formerly known as IUB) since 2001. The latest type of robots from Jacobs University are the so-called Rugbots, short for "rugged robot" (Birk et al., 2006c). These robots have been tested on various occasions including a field test demo dealing with a hazardous material road accident, the European Land Robotics Trials (ELROB) and several RoboCup competitions (Birk et al., 2006a; Birk, 2005; Birk et al., 2004; Birk et al., 2002). Our latest system has demonstrated its capabilities at the RoboCup 2006 world championship in Bremen, where several fully autonomous runs were successfully done in the so-called rescue robot league (Birk et al., 2006b). The RoboCup rescue league (Kitano and Tadokoro, 2001) features a very challenging environment (Fig. 2), including several standardized test elements that can be used to assess the quality of the robots (Jacoff et al., 2003b; Jacoff et al., 2003a).

Figure 1: A Jacobs University robot at a drill dealing with a hazardous material road accident (upper row) and at the European Land Robotics Trials, ELROB-2006 (lower row).

Figure 2: The RoboCup rescue competition features a very complex test environment (left), which includes several standardized test elements. The team demonstrated a combined usage of a teleoperated with a fully autonomous robot at the world championship 2006 (right).

The rest of this article is structured as follows. In section two, the approach and its implementation are described, including some information on the sensors that are used. Section three presents and analyzes the results of real world experiments conducted to evaluate the approach. Section four concludes the article.

2 Approach and Implementation

2.1 3D Range Sensors

Classical obstacle and free space detection for mobile robots is based on two-dimensional range sensors like laser scanners. This is feasible as long as the robot operates in simple environments mainly consisting of flat floors and plain walls. The generation of complete 3D environment models is the other extreme, which requires significant processing power as well as high quality sensors. Furthermore, 3D mapping is still in its infancy and it is non-trivial to use the data for path planning. The approach presented here lies in the middle of the two extremes. A single 3D range snapshot is processed to classify the terrain, especially with respect to drivability. This information can be used in various standard ways like reactive obstacle avoidance as well as 2D map building. The approach is very fast and it is an excellent candidate for replacing standard 2D approaches to sensor processing for obstacle avoidance and occupancy grid mapping in non-trivial environments.

Snapshots of 3D range data form the basis of the approach. 3D laser scanners have become increasingly popular and a possible choice for this purpose (Wulf and Wagner, 2003; Surmann et al., 2003; Wulf et al., 2004). However, they are typically based on 2D devices, which are supplemented with an actuator to cover the additional dimension. The overall acquisition of the data is hence very slow, typically in the order of several seconds per snapshot. As shown later on, our classification algorithm takes in the order of a few milliseconds on standard computer hardware to process the data. This can only be fully exploited with sensors with high update rates, e.g., for fast reactive obstacle avoidance.

Figure 3: The autonomous version of a Rugbot with some important onboard sensors pointed out: a pan-tilt-zoom camera, a laser range finder (LRF), an inclined LRF, a webcam, a thermo camera, a stereo camera, and a Swissranger. The Swissranger SR-3000 and the stereo camera deliver the 3D data for the terrain classification.

Figure 4: Sample scene shot with different cameras: (a) standard camera image; (b) Swissranger SR-3000 distance image (gray values encode the measured distance); (c) Videre STOC stereo camera disparity image; (d) the corresponding point cloud of the stereo camera.

Table 1: Specification of the two 3D sensors.

                          Swissranger             Stereo Camera
  Manufacturer            CSEM                    Videre Design
  Model                   SR-3000                 Stereo-on-Chip (STOC)
  Principle               Time of Flight (TOF)    Stereo images' disparity
  Range                   600 - 7500 mm           686 - ∞ mm
  Horiz. Field of View    47°                     65.5°
  Vert. Field of View     39°                     51.5°
  Resolution              176 × 144               640 × 480

Sensors with fast update rates are used in the work presented here. Laser scanner data is nevertheless of potential interest as it is of high quality, i.e., the noise levels are small. The classification rates of our approach can be expected to be even higher with laser scanner data than with the fast but noise-prone sensors used in the experiments presented later on. A Swissranger SR-3000 time-of-flight range camera (CSEM, 2006; Lange and Seitz, 2001) and a Videre STOC stereo camera (Videre-Design, 2006; Konolige and Beymer, 2006) are used in the experiments presented in this article. Their locations on the robot are indicated in figure 3. Both sensors allow update rates of around 30 Hz. The most important feature of this fast data acquisition is that robot motion does not influence it. As shown with experimental results later on, it does not matter for the classification whether the robot is driving or not. The technical details of the sensors are summarized in Table 1. Figure 4 shows typical outputs. As mentioned before, the advantage of fast update rates comes at the cost of rather high noise rates. Nevertheless, robust classification is possible as shown later on.

2.2 The Hough transform

The Hough transform is a feature detection heuristic, originally introduced for lines. It has been popularized and generalized by (Ballard, 1981) to all parameterized shapes like circles, squares and the like. Though it is typically used for 2D images, it can be extended to any dimension due to its general nature. In this article, it is used to detect infinite planes in a 3D point cloud returned by 3D range sensors. Generally, the Hough transform can be used to detect shapes in point clouds of arbitrary but matching dimensionality. Ideally, these shapes can be described by as few parameters as possible. Originally, 2D lines were used, which can be defined by their angle with the x-axis and their distance to the origin. A point in the parameter space then represents one of the shapes being searched for. Each point in the input data set can be used as a kind of vote for all parameter combinations of the shapes that pass through it. The parameter space is discretized by dividing it into equally sized so-called bins. For each point p in the input dataset, the counts in the bins corresponding to parameters of shapes passing through p are incremented. If a particular shape is present in the input data, the bin corresponding to its parameters accumulates a high count of hits, i.e., of input points voting for it.

In the following, the general form of the algorithm for the detection of a geometric shape with m parameters in n-dimensional space is shortly recapitulated. The input data is given as a point cloud PC, i.e., a set of points in n-dimensional Cartesian space. The general Hough transform as illustrated in algorithm 1 iterates over all points p in PC and increments for each of them all bins that belong to parameter values of shapes passing through p. In doing so, the parameter v_m is quantized, i.e., it is calculated based on the values for all the other m - 1 parameters and a point from the input data.

Algorithm 1 General form of the Hough transform algorithm searching for an m-parameter shape in an n-dimensional point cloud PC. The m-dimensional parameter space is discretized as array PS.

for all points p ∈ PC do
  for all values v_1 for parameter P_1 do
    ...
      for all values v_(m-1) for parameter P_(m-1) do
        v_m ← calculateVm(p, v_1, ..., v_(m-1))
        PS[v_1]...[v_m]++
      end for
    ...
  end for
end for
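To make the voting scheme concrete, the following minimal Python sketch (not part of the original implementation; all names and parameter values are illustrative) instantiates Algorithm 1 for the classical 2D-line case, using the standard normal form d = x cos(theta) + y sin(theta) with the angle theta as the free parameter and the signed distance d as the quantized one.

import numpy as np

def hough_lines(points, n_theta=180, d_max=100.0, d_res=1.0):
    # Discretized parameter space PS: one axis per parameter (theta, d).
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    n_d = int(2 * d_max / d_res)
    ps = np.zeros((n_theta, n_d), dtype=int)
    for x, y in points:                                 # "for all points p in PC"
        for i, theta in enumerate(thetas):              # free parameter v_1
            d = x * np.cos(theta) + y * np.sin(theta)   # quantized parameter v_m
            j = int((d + d_max) / d_res)
            if 0 <= j < n_d:
                ps[i, j] += 1                           # vote for the corresponding bin
    return ps, thetas

# Usage: points on the line y = 1 all vote for the same bin (theta = 90 deg, d = 1).
pts = [(x, 1.0) for x in np.linspace(-5, 5, 50)]
ps, thetas = hough_lines(pts)
i, j = np.unravel_index(np.argmax(ps), ps.shape)
print("strongest line: theta = %.1f deg" % np.degrees(thetas[i]))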

2.3 Plane Parameterization

Here, the shapes for the Hough transform are planes. The axes are chosen as shown in figure 5. The planes are characterized by the following three parameters:

Two angles: As shown in figure 5, the first angle ρx is the angle between the intersection line of the given plane with the xz-plane and the x-axis. The second angle ρy is the angle between the intersection line of the given plane with the yz-plane and the y-axis.

Signed distance to the origin: The signed distance d has the same absolute value as the distance of the plane to the origin, and its sign is the same as that of the z-axis intercept of the plane.

Algorithm 2 Hough transform applied for plane detection, searching for a plane with its three parameters in a 3D point cloud PC. The 3D parameter space is discretized as array PS.

for all points p ∈ PC do
  for all angles ρx do
    for all angles ρy do
      n ← (−sin(ρx) cos(ρy), −cos(ρx) sin(ρy), cos(ρx) cos(ρy))^T
      d ← n · p
      PS[ρx][ρy][d]++
    end for
  end for
end for

Figure 5: The definition of the angles ρx and ρy with respect to the x- (front), y-, and z- (up) axes and the origin O.

So, given a point from the point cloud returned by one of the sensors and two angles, it is possible to compute the signed distance to the origin. In Algorithm 2, the distance is used accordingly as the quantized parameter. In other words, the signed distance d corresponds to the quantized parameter v_m in Algorithm 1.
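As a concrete illustration, the following Python sketch (illustrative only, not the authors' code) implements the voting loop of Algorithm 2, vectorized over the point cloud for each pair of angles. The angle grids in the usage example are placeholders, while the default distance range of [-1 m, 2 m] with 0.1 m resolution follows the values given in section 2.4.

import numpy as np

def plane_hough(pc, rho_x_vals_deg, rho_y_vals_deg, d_min=-1.0, d_max=2.0, d_res=0.1):
    """Vote for planes (rho_x, rho_y, d) in a 3D point cloud pc of shape (N, 3)."""
    n_d = int(round((d_max - d_min) / d_res))
    ps = np.zeros((len(rho_x_vals_deg), len(rho_y_vals_deg), n_d), dtype=int)
    for i, rx in enumerate(np.radians(rho_x_vals_deg)):
        for j, ry in enumerate(np.radians(rho_y_vals_deg)):
            # Normal vector of the candidate plane orientation, as in Algorithm 2.
            n = np.array([-np.sin(rx) * np.cos(ry),
                          -np.cos(rx) * np.sin(ry),
                           np.cos(rx) * np.cos(ry)])
            d = pc @ n                        # signed distances of all points at once
            k = np.floor((d - d_min) / d_res).astype(int)
            valid = (k >= 0) & (k < n_d)      # discard votes outside the distance range
            np.add.at(ps[i, j], k[valid], 1)  # accumulate the votes per distance bin
    return ps

# Usage: a synthetic horizontal floor 0.5 m below the sensor votes into a single bin.
floor = np.column_stack([np.random.uniform(0, 3, 1000),
                         np.random.uniform(-1, 1, 1000),
                         np.full(1000, -0.5)])
ps = plane_hough(floor, rho_x_vals_deg=[-45, 0, 45], rho_y_vals_deg=[-45, 0, 45])
print(np.unravel_index(np.argmax(ps), ps.shape))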

2.3.1 Ground classification

The basic idea for the ground classification is simple: important classes of terrain are characterized by one plane. If the robot is facing no obstacles and an even terrain, a plane representing the ground should be detectable. This ground plane has parameters that correspond to the pose of the sensor relative to the floor. The corresponding bin of the parameter space should have significantly more hits than any other bin. Similarly, if the robot faces a ramp, an elevated plateau, or if it is standing at an edge to lower ground, the corresponding planes should be easily detectable. If no single plane is detected, it can be presumed that the robot is facing a kind of non-drivable terrain.

One option is to detect walls and other obstacles perpendicular to the floor by looking at the corresponding bins. An easier alternative used here is to exclude the corresponding areas of the parameter space. So if points of the input data lie on a wall or other obstacles perpendicular to the floor, their votes are not stored, as the corresponding bins do not exist. The situation is then the same as for other non-drivable areas: no plane bin has extraordinarily many hits.

In the following, five classes of terrain are used: "floor", "ramp", "canyon", "plateau", and "obstacle". For all except the last one, there is one characterizing plane. The "floor" is flat horizontal ground at the level of the robot. The "ramp" is inclined terrain in front of the robot, i.e., when traversing it only the robot's pitch changes passively. The "canyon" and the "plateau" are even ground below and respectively above the current position of the robot. The classes, especially ramp, canyon and plateau, can be further subdivided to take the physical capabilities of the robot into account. For our Rugbots, plateaus and canyons with a step of more than 0.2 m and ramps with a combined angle of more than 35° are considered to be "not passable". Other ramps, plateaus, and canyons are deemed to be "passable". There is of course the option to have a more refined distinction, for example to adapt the robot's driving speed or to compute a cost function for path planning. A minimal passability check along these lines is sketched below.
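The following Python fragment is a minimal sketch (not from the article) of how the stated limits, a 0.2 m step for plateaus and canyons and a 35° combined angle for ramps, could be turned into a passable/not passable decision for a detected characterizing plane. The function name is hypothetical, and interpreting the "combined angle" as the tilt of the plane normal from the vertical is an assumption, since the article does not spell out the exact formula.

import math

def passable(terrain_class, step_height_m=0.0, rho_x_deg=0.0, rho_y_deg=0.0):
    """Rough passability rule for a detected characterizing plane (illustrative only)."""
    if terrain_class in ("plateau", "canyon"):
        return abs(step_height_m) <= 0.2           # step limit stated in the article
    if terrain_class == "ramp":
        rx, ry = math.radians(rho_x_deg), math.radians(rho_y_deg)
        # Assumption: "combined angle" = tilt of the plane normal from the vertical.
        n = (-math.sin(rx) * math.cos(ry), -math.cos(rx) * math.sin(ry),
             math.cos(rx) * math.cos(ry))
        norm = math.sqrt(sum(c * c for c in n))
        tilt_deg = math.degrees(math.acos(n[2] / norm))
        return tilt_deg <= 35.0                    # ramp limit stated in the article
    return terrain_class == "floor"                # floor is passable, obstacle is not

print(passable("ramp", rho_x_deg=20, rho_y_deg=10))    # True: gentle ramp
print(passable("plateau", step_height_m=0.3))          # False: step too high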

2.4 Processing of the Hough Space for Classification

Algorithm 3 The classification algorithm: it uses three simple criteria organized in a decision-tree-like manner. #S is the cardinality of S, bin_max is the bin with the most hits. The algorithm returns after the first assignment to class.

if #bin_floor > t_h · #PC then
  class ← floor
else
  if (#{bin | #bin > t_m · #bin_max} < t_n) and (#bin_max > t_p · #PC) then
    class ← type(bin_max) ∈ {floor, plateau, canyon, ramp}
  else
    class ← obstacle
  end if
end if

The computation of the classification is based on three simple components. The main criterion is which bin has the most hits. Furthermore, a threshold is used to detect obstacles, i.e., the case when no bin has significantly more hits than the others. There are two thresholds t_h and t_p for this purpose, to accommodate different terrain types. As the number of points in the input data can vary, the thresholds are relative to the cardinality of the processed point cloud. For perfect input data, these two criteria are sufficient. But as the snapshots from the sensors tend to be very noisy, a third heuristic is used. This heuristic tries to take the shape of the distribution of the hits into account. It estimates how peaked the distribution is by determining the cardinality of {bin | #bin > t_m · #bin_max}, i.e., the set of bins with more hits than a certain fraction t_m of the number of hits of the top bin. This cardinality is then compared to a threshold t_n.

The criteria are arranged in a decision-tree-like manner as shown in Algorithm 3. The moment a type is assigned to the variable "class", the algorithm terminates. The order of detection may be rearranged, e.g., to first test for obstacles instead of checking for unobstructed floor first. But due to the extremely small computation requirements for this decision tree, the possible effects of rearrangement are negligible. The four parameters t_h, t_p, t_m, and t_n seem to be quite uncritical, as discussed in more detail in the section presenting experimental results. They have been determined using a few test cases and performed very well on several thousand input snapshots from seven large datasets recorded under various environment conditions. Their concrete values are: t_m = 2/3, t_h = 1/3, t_n = 7, and t_p = 1/5.

The parameters of the discretization of the Hough space are the minimum, maximum and resolution on the three axes. The range on the forward (x-)axis is set to [-45°, 18°] (positive values running downhill), since the sensors only detect very few points on ramps that have a stronger downward inclination than roughly 18°. The range of the y-axis is set to [-45°, 45°]. A ramp with, say, 30° roll might not be drivable from the position it was detected from, yet it can be an indication for the robot to reposition itself to tackle the ramp. The range for the distance dimension is [-1 m, 2 m]. The distance resolution is set to 0.1 m. All these parameters were simply chosen based on the capabilities and properties of our robot. The angular resolution of the discretization is the most interesting of the Hough space parameters. This parameter influences the run time as well as the potential accuracy of the algorithm. Different values were hence used for the detailed performance analysis presented in section 3.3.
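To make the decision tree concrete, here is a small Python sketch (an illustration, not the authors' code) that applies Algorithm 3 with the thresholds t_m = 2/3, t_h = 1/3, t_n = 7, and t_p = 1/5 to a filled Hough accumulator. The helper that maps the winning bin to a terrain type and the index of the floor bin are assumed inputs.

import numpy as np

T_M, T_H, T_N, T_P = 2/3, 1/3, 7, 1/5   # thresholds as given in the article

def classify(ps, n_points, floor_bin, bin_type):
    """ps: Hough accumulator; floor_bin: index of the bin corresponding to the floor;
    bin_type: function mapping a bin index to 'floor'/'plateau'/'canyon'/'ramp'."""
    if ps[floor_bin] > T_H * n_points:                  # unobstructed floor dominates
        return "floor"
    max_bin = np.unravel_index(np.argmax(ps), ps.shape)
    peak = ps[max_bin]
    n_strong = int(np.count_nonzero(ps > T_M * peak))   # how peaked is the distribution?
    if n_strong < T_N and peak > T_P * n_points:
        return bin_type(max_bin)                        # a single characterizing plane
    return "obstacle"                                   # no dominant plane found

# Usage with the plane_hough() sketch from section 2.3 (hypothetical indices):
# label = classify(ps, n_points=len(floor), floor_bin=(1, 1, 5),
#                  bin_type=lambda b: "floor")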

3 Experiments and results

3.1 Experiment conditions and settings

The approach was intensively tested with real world datasets containing more than 7,500 snapshots of range data gathered by the two sensors in a large variety of real world situations. The data includes indoor and outdoor scenarios, different lighting conditions, and so on. Examples reflecting the large variety are shown in figures 6 and 7. The data was collected in seven independent runs with the robot moving more or less randomly in the environment. As the sensors acquire a snapshot almost instantaneously, there is no need to stop the robot to get a sample. The main properties of the different runs are listed in table 2.

Note the large variation in the average number of points per point cloud in the different datasets, especially for the stereo camera. The stereo disparity computation relies heavily on the texture and feature information in the picture. Outdoor photographs can be particularly rich in these features; refer for example to figures 7(d) and 8. However, the ground in outdoor scenes and walls in indoor scenes are often featureless, and consequently the related data is often missing in stereo snapshots, as illustrated in figure 8.

Figure 6: Photos of the different types of scenes encountered in indoor (left) and outdoor (right) experiments: (a) slightly obstructed floor; (b) grass; (c) ramp; (d) hill; (e) random step field; (f) bush. The photos are taken by a webcam which is mounted right next to the 3D sensors.

Figure 7: The data returned by the Swissranger (left) and the stereo camera (right) for the outdoor scenes shown in Fig. 6: (a) grass, Swissranger color coded image; (b) grass, 3D point cloud returned by the stereo camera; (c) hill, Swissranger color coded image; (d) hill, 3D point cloud returned by the stereo camera; (e) bush, Swissranger color coded image; (f) bush, 3D point cloud returned by the stereo camera.

Table 2: The seven datasets used in the experiments.

  sensor   dataset   description                  point-clouds (PC)   aver. points per PC
  stereo   set1      inside, rescue arena         408                 5058
  stereo   set2      outside, university campus   318                 71744
  stereo   set3      outside, university campus   414                 39762
  TOF      set4      inside, rescue arena         449                 23515
  TOF      set5      outside, university campus   470                 16725
  TOF      set6      outside, university campus   203                 25171
  TOF      set7      outside, university campus   5461                24790

Both the stereo camera and the Swissranger indicate points in their range images where information is missing. A preprocessing step is hence used, which estimates whether there are sufficiently many meaningful data-points in a snapshot. Table 3 shows the amount of snapshots ruled out due to insufficient data. The total number of snapshots used as actual input for classification is still about 6,800. Please note that despite the high amount of excluded data, the stereo camera is useful as a supplementary sensor to the Swissranger, which tends to fail in scenarios with direct exposure to very bright sunlight, which does not affect the stereo camera. The four parameters t_m = 2/3, t_h = 1/3, t_n = 7, and t_p = 1/5 were determined once based on the analysis of a few example snapshots, which were considered to be typical. The parameters were not altered during the experiments.
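A possible form of this preprocessing is sketched below in Python; the article does not give the exact criterion, so the fraction of valid points used here and the convention that missing measurements are marked as NaN are assumptions.

import numpy as np

def has_sufficient_data(pc, min_valid_fraction=0.25):
    """Reject snapshots in which too few range measurements are meaningful.
    pc: (N, 3) point array; missing measurements are assumed to be marked as NaN."""
    if len(pc) == 0:
        return False
    valid = ~np.isnan(pc).any(axis=1)    # rows with a complete 3D measurement
    return valid.mean() >= min_valid_fraction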

Table 3: Percentages of snapshots excluded via preprocessing. Especially the stereo camera data suffers if there are few features in a scene (figure 8), but it can complement the Swissranger SR-3000, which can fail in different environment situations, e.g., in scenes with direct exposure to very bright sunlight.

  sensor   dataset   point-clouds   excluded data
  stereo   set1      408            92%
  stereo   set2      318            75%
  stereo   set3      414            70%
  TOF      set4      449            1%
  TOF      set5      470            2%
  TOF      set6      203            2%
  TOF      set7      5461           0%

Figure 8: An example scene where the stereo camera delivers few data points; especially ground information is missing: (a) the webcam image of the scene; (b) the 3D point cloud returned by the stereo camera. The ground does not show any features, so it has no depth information.

In the following subsection, some examples of Hough spaces are presented to illustrate the working principles of the classification.

3.2 Hough space examples

The real world data is obviously subject to noise, and additionally, planes in the world can lie in an area of the parameter space just on the boundary between two bins. It can hence not be expected that even a perfect plane will produce hits in a single bin only. In this subsection, the working principles of our approach are illustrated by discussing a few typical examples. This is followed by a global performance analysis based on the 6,800 snapshots in subsection 3.3. For the discussion in this subsection, the parameter space of the Hough transform of a snapshot is depicted in two different ways: a two-dimensional histogram and an isometric pseudo-three-dimensional depiction of the parameter space. Each of them is discussed separately, as each has its advantages in terms of illustration.

In figure 9, there are 2D histograms depicting the bins of the parameter space for three more or less typical snapshots. The origin of the histogram is in the top left corner, the down-pointing axis contains the distances, the right-pointing axis contains both ρx and ρy. This is accomplished by first fixing ρy and iterating ρx, then increasing ρy and iterating ρx again, and so on. This layout is depicted graphically in sub-figure 9(a). The bin which corresponds to the floor is indicated by a little frame. The magnitude of the bins is represented by shade, where white corresponds to all magnitudes above the threshold t_m = 2/3. All the other shades are scaled uniformly; thus the different histograms are better comparable than with a scheme where just the top bin is white.

The following discussion motivates why a threshold on only the absolute number of hits is not sufficient and the shape of the distribution has to be taken into account as well. In the floor histograms, a single bin sticks out, even with some slightly obstructed ground. But it can be noticed that the histogram of the ramp differs from the floor histograms. First of all, the bins with many hits are concentrated on the right side of the histogram. This is obviously the case because planes with a greater inclination get more hits, proportionally to their similarity to the actual ramp. The second difference is that the strip of significantly filled bins is lower. This is because the ramp is at a greater distance to the camera than the floor. Of the three histograms, the one of the random step field has its bins with relatively high magnitude distributed most evenly. In the ramp histogram, there are also a lot of bins represented in white, but this is because the assignment of shade to magnitude is done in an absolute way. So in the ramp histogram, bins with relatively medium magnitude are painted in white. In the random-step-field histogram, there are 14 white bins, in the ramp histogram there are 35.

Figure 9: Two-dimensional depictions of the three-dimensional parameter space for several example snapshots: (a) layout of the bins in the depictions of the parameter spaces; (b) plain floor; (c) slightly obstructed floor; (d) ramp; (e) random step field. Distances are on the vertical axis, angles are on the horizontal axis, where ρy iterates just once and ρx iterates repeatedly. Hits in the bins are represented by grayscale, the darker the fewer hits.

Further illustrations of the reasons why the shape of the distribution has to be taken into account are presented in figures 10 and 11. They show the distribution of the bin values in d, ρx, ρy space in a volumetric visualization. Several planes normal to the d axis are drawn, and iso-contours of the bin values are drawn on each such plane. In this way, the bins having high values become highlighted. Common to all diagrams is that the areas with the higher magnitudes are distributed in a diagonal shape. This is caused by the following effect. Suppose the input consists mostly of points from the floor. The planes with the most points in them are then very similar to the floor. For a plane that is not very similar to the floor to encompass many points, it has to intersect the floor plane at as small an angle as possible. With growing distance to the floor, the angle has to become greater in order for the plane to intersect the floor plane in the area of the data points. In the region of the parameter space where the angles are just big enough, the values are the highest for every distance. For the ramp, the parameter space is a warped version of the one for the floor, because the region of the data points is smaller, so there are fewer possibilities to intersect with it. This is also the reason why the region of non-zero bins is so small in this parameter space (see figure 11(c)). Furthermore, the area of the highest overall magnitude is higher on the ρx-axis, due to the ramp being tilted.

3.3 General Performance Analysis

The results presented in this subsection are based on those approximately 6,800 snapshots from the seven datasets recorded under different environment conditions which had meaningful range data. The data was labeled by a human to provide ground truth references to measure the accuracy of the classification.

Figure 10: Volumetric depiction of examples of Hough transformed Swissranger data with grayscale-coded iso-contours on planes perpendicular to the distance (d) axis in the d, ρx, ρy space: (a) indoor unobstructed floor; (b) outdoor grassy terrain; (c) indoor ramp; (d) outdoor hill; (e) indoor random step field; (f) outdoor bush. The darker areas denote areas of higher counts.

Figure 11: Volumetric depiction of examples of Hough transformed stereo camera data with grayscale-coded iso-contours on planes perpendicular to the distance (d) axis in the d, ρx, ρy space: (a) indoor unobstructed floor; (b) outdoor grassy terrain; (c) indoor ramp; (d) outdoor hill; (e) indoor random step field; (f) outdoor bush. The darker areas denote areas of higher counts.

Figure 12: Average processing time and cardinality of point clouds. The average time per classification directly depends on the average number of 3D points per snapshot in each dataset.

Table 4: Success rates and computation times for drivability detection.

  sensor   dataset   success rate   false negative   false positive   time (msec)
  stereo   set1      1.000          0.000            0.000            4
  stereo   set2      0.987          0.000            0.013            53
  stereo   set3      0.977          0.016            0.007            29
  TOF      set4      0.831          0.169            0.000            11
  TOF      set5      1.000          0.000            0.000            8
  TOF      set6      1.000          0.000            0.000            12
  TOF      set7      0.830          0.031            0.139            12

As it would be very tedious to label the several thousand snapshots one by one, the seven datasets were broken down into "scene sets". Each scene consists of some larger sequence of snapshots for which the same label can be applied. The properties of the different scenes are discussed later on in the context of fine grain classification.

The first and foremost result is with respect to coarse classification, namely the binary distinction between drivable and non-drivable terrain. For this purpose an angular resolution of 45° is used, i.e., only two bins per axis. As shown in table 4, the approach can robustly detect drivable ground in a very fast manner. The success rates of classifying the input data correctly, i.e., of assigning the same label as in the human generated ground truth data, range between 83% and 100%. Cases where the algorithm classifies terrain as non-drivable although the ground truth label marks it as drivable are counted as false negatives; when drivability is detected although the ground truth label points to non-drivability, a false positive is counted.
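For reference, the following Python snippet (illustrative, not from the article) counts successes, false negatives, and false positives in this sense for a list of predicted and ground truth drivability labels.

def drivability_rates(predicted, ground_truth):
    """predicted, ground_truth: sequences of booleans (True = drivable)."""
    n = len(predicted)
    fn = sum(1 for p, g in zip(predicted, ground_truth) if not p and g)  # missed drivable ground
    fp = sum(1 for p, g in zip(predicted, ground_truth) if p and not g)  # falsely declared drivable
    return {"success": (n - fn - fp) / n,
            "false negative": fn / n,
            "false positive": fp / n}

# Example: 10 snapshots with one error of each type.
print(drivability_rates([True] * 9 + [False], [True] * 8 + [False, True]))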

As mentioned before, the stereo camera has the drawback that it does not deliver data in featureless environments, but it allows an almost perfect classification, ranging between 98% and 100%. The Swissranger has a very low percentage of excluded data (see table 3), but it behaves poorly in strongly sunlit situations. In these cases, the snapshot is significantly distorted but not automatically recognizable as such. Such situations occurred during the recording of the outdoor data of set4 and set7, causing the lower success rates for the Swissranger in these two cases. The two sensors can hence supplement each other very well.

The processing can be done very fast, namely in the order of 5 to 50 msec. Please note that these run times are for the full approach, i.e., they include the preprocessing (which is always successful for this data) as well as the computation of the Hough transform and the execution of the decision tree. As illustrated in figure 12, the variations in processing time are due to the variations in the number of points per snapshot. Any standard 2D obstacle sensor would have done significantly worse in the test scenarios. It would especially have failed to detect perpendicular obstacles as well as rubble on the floor. The approach can hence serve as a serious alternative to 2D obstacle detection for motion control and mapping in real world environments.

The next question is of course to what extent the approach is capable of a more fine grain classification of terrain types. Tables 5 and 6 show the different scene sets and their general properties. Plateaus and canyons onto which our robot can drive, like a curb between a meadow and a concrete ground, had to be included in the label "floor" to make the manual labeling of the data feasible.

Table 5: Human generated ground truth labels for the different scenes in the stereo camera datasets.

  scene set   description                    human label   # of PC   aver. # points per PC
  1           lab floor with black plastic   floor         32        5058
  2           bush1 very near                obstacle      30        22151
  3           bush2 very near                obstacle      1         71646
  4           grass meadow                   floor         2         11367
  5           hill with grass                ramp          47        107173
  6           tree1 very near                obstacle      1         15267
  7           grass, background sky          floor         1         32686
  8           tree2 very near                obstacle      30        27139
  9           tree3 very near                obstacle      41        25141
  10          railing very near              obstacle      2         77342
  11          concrete slope                 ramp          27        10770
  12          wall very near                 obstacle      23        113368

Table 6: Human generated ground truth labels for the different scenes in the Swissranger datasets.

  scene set   description                      human label   # of PC   aver. # points per PC
  13          lab floor, background barrels    floor         55        23484
  14          lab floor with black plastic     floor         33        23256
  15          boxes very near                  obstacle      43        25344
  16          lab floor, dark cave             floor         75        25344
  17          wooden ramp with plastic cover   ramp          113       19691
  18          red-cloth hanging down           obstacle      124       25344
  19          bush1 very near                  obstacle      32        18121
  20          bush2 very near                  obstacle      49        24604
  21          car very near                    obstacle      44        17396
  22          concrete ground1                 floor         68        11372
  23          grass meadow                     floor         70        13812
  24          hill with grass and earth        ramp          59        7235
  25          concrete with rubble             obstacle      90        20375
  26          tree1 very near                  obstacle      50        23512
  27          concrete ground2                 floor         28        24726
  28          grass, background far wall       floor         35        25342
  29          mix of grass and concrete        floor         50        25004
  30          grass, background near wall      floor         86        25344
  31          grass, background sky            floor         13        25343
  32          grass, background building1     floor         233       25343
  33          grass, background building2     floor         892       25250
  34          stone ramp                       ramp          150       25287
  35          hill1                            ramp          169       19853
  36          tree2 very near                  obstacle      728       25344
  37          tree3 very near                  obstacle      904       22625
  38          railing very near                obstacle      616       25344
  39          grass, background building3     floor         538       25343
  40          concrete slope                   ramp          548       25202
  41          grass hill                       ramp          987       25343
  42          wall very near                   obstacle      655       25344

Table 7: Classification rates and run times for stereo camera data processed at different angular resolutions of the Hough space. Though drivability can be robustly detected with stereo, finer classification performs rather badly with this sensor.

  scene set   human label   class. rate 45°   time 45° (msec)   class. rate 15°   time 15° (msec)   class. rate 9°   time 9° (msec)
  1           floor         1.00              4                 0.00              20                0.00             51
  2           obstacle      1.00              16                1.00              92                1.00             232
  3           obstacle      1.00              60                1.00              320               1.00             850
  4           floor         1.00              15                1.00              60                1.00             160
  5           ramp          0.00              78                0.00              431               0.00             1089
  6           obstacle      0.00              10                0.00              50                0.00             130
  7           floor         0.00              1                 0.00              30                0.00             70
  8           obstacle      1.00              22                1.00              109               1.00             279
  9           obstacle      1.00              19                1.00              100               1.00             255
  10          obstacle      0.00              35                0.50              180               0.50             455
  11          ramp          0.00              8                 0.00              42                0.00             109
  12          obstacle      1.00              84                1.00              457               1.00             1163
  average                   0.58              29                0.54              158               0.54             404

Table 8: Classification rates and run times for Swissranger data processed at different angular resolutions of the Hough space. Unlike with the stereo camera, a higher angular resolution improves finer classification.

  scene set   human label   class. rate 45°   time 45° (msec)   class. rate 15°   time 15° (msec)   class. rate 9°   time 9° (msec)
  13          floor         0.98              12                0.96              59                0.96             148
  14          floor         1.00              12                0.97              60                0.97             148
  15          obstacle      0.79              10                1.00              64                1.00             160
  16          floor         0.00              13                0.00              64                0.00             161
  17          ramp          0.00              10                0.00              49                0.30             124
  18          obstacle      0.00              12                1.00              64                1.00             160
  19          obstacle      0.00              9                 0.00              44                0.00             113
  20          obstacle      0.00              12                0.00              61                0.24             155
  21          obstacle      0.00              8                 0.00              44                0.50             109
  22          floor         1.00              5                 1.00              29                1.00             72
  23          floor         1.00              8                 1.00              34                1.00             88
  24          ramp          0.00              3                 0.22              18                0.00             45
  25          obstacle      1.00              10                1.00              51                1.00             128
  26          obstacle      1.00              11                1.00              60                1.00             146
  27          floor         1.00              13                1.00              64                1.00             160
  28          floor         1.00              13                1.00              65                1.00             158
  29          floor         1.00              13                1.00              65                0.98             162
  30          floor         1.00              11                1.00              63                1.00             160
  31          floor         1.00              10                1.00              62                1.00             158
  32          floor         1.00              12                0.00              64                0.00             160
  33          floor         0.00              12                0.00              64                0.00             160
  34          ramp          0.00              14                0.00              64                1.00             160
  35          ramp          0.00              10                0.00              51                0.00             128
  36          obstacle      0.00              12                1.00              63                1.00             160
  37          obstacle      0.00              11                1.00              57                1.00             143
  38          obstacle      0.00              13                1.00              64                1.00             169
  39          floor         1.00              13                0.00              65                0.00             163
  40          ramp          0.00              12                0.00              66                0.00             160
  41          ramp          0.00              12                0.00              61                0.00             161
  42          obstacle      1.00              12                1.00              63                1.00             157
  average                   0.55              12                0.64              63                0.70             158

It can be expected that the angular resolution of the Hough space discretization has an influence on the finer classification. Hence, experiments with 9°, 15°, and 45° are conducted. Table 7 shows the results for the stereo camera data. It can be noticed that the success rates of 54% to 58% are rather poor, especially when compared to drivability detection, where 98% to 100% are achieved. The main reason seems to be the high rate of noise in the stereo data (see also figure 11). This is also supported by the fact that the classification rates are hardly influenced by the angular resolution of the discretization of the Hough space.

The situation is a bit different for more fine grain classification with the Swissranger, which provides less noisy data (see also figure 10). As shown in table 8, a higher angular resolution allows at least some more fine grain classification for this sensor, with success rates of up to 70%. But the limitations in more fine grain classification seem to lie in the noise level of the data for the Swissranger as well. It is expected that the high quality data from a 3D laser scanner would allow a very fine grain classification with the presented approach. Pursuing such experiments is left for future work. This type of classification could for example be used for detailed semantic map annotation. But data acquisition with a 3D laser scanner is very slow, namely in the order of several seconds per single snapshot. This means that the robot has to be stopped and that the update rates are very low. A stereo camera or a Swissranger, in contrast, allows very fast data acquisition on a moving robot. The detection of whether the robot can drive over the terrain covered by the sensor is extremely fast and very robust with the presented approach. It is hence an interesting alternative for obstacle detection in non-trivial environments, which can be used for reactive locomotion control as well as mapping.

4 Conclusions and Future Work

A simple but fast and robust approach to classify terrain for mobility is presented. It uses range data from a 3D sensor, namely a time-of-flight Swissranger or a stereo camera. The range data is processed by a Hough transform with a three dimensional parameter space for representing planes. A decision tree on the counts of the bins in the Hough space is used to distinguish drivable from non-drivable ground. In addition to this basic distinction, a more fine grain classification of terrain types is in principle possible with the approach. An autonomous robot can use this information for example to annotate its map with way points or to compute a risk assessment of a possible path.

Extensive experiments with different prototypical indoor and outdoor ground types are presented, which can be clearly classified with respect to drivability. Seven datasets recorded indoor and outdoor under very varying conditions are used for testing. About 6,800 snapshots of range data are processed in total. It is shown that drivability can be robustly detected with success rates ranging between 83% and 100% for the Swissranger and between 98% and 100% for the stereo camera. The complete processing time for classifying one range snapshot is in the order of 5 to 50 msec. The detection of safe ground can hence be done in realtime on the moving robot, allowing the approach to be used for reactive motion control or mapping.

The current decision tree is hand-crafted based on the physical capabilities of the mobile robot used in the experiments. As mentioned above, it performs very well for drivability. But a hand-crafted extension to more fine grain terrain classification, i.e., to distinguish more than two terrain types, has its limits, as also reported in this article. An obvious next step for future work is to try to improve the fine grain classification accuracy by machine learning, i.e., by generating the decision tree and its parameters on the fly. This can be done in a straightforward manner by using one part of the human labeled ground truth data as training data and another part as test data. More challenging but also more general is the generation of the decision tree in an unsupervised manner, i.e., without human labeled training data. Other terrain classification approaches, e.g., based on texture information extracted by computer vision, as well as information from the low-level motion control of the robot may be used to provide feedback for this learning process.

5 Acknowledgments

The authors gratefully acknowledge the financial support of Deutsche Forschungsgemeinschaft (DFG). Please note the name-change of our institution. The Swiss Jacobs Foundation invests 200 Million Euro in International University Bremen (IUB) over a five-year period starting from 2007. To date this is the largest donation ever given in Europe by a private foundation to a science institution. In appreciation of the benefactors and to further promote the university’s unique profile in higher education and research, the boards of IUB have decided to change the university’s name to Jacobs University Bremen. Hence the two different names and abbreviations for the same institution may be found in this article, especially in the references to previously published material.

References

Abouaf, J. (1998). Trial by fire: teleoperated robot targets Chernobyl. Computer Graphics and Applications, IEEE, 18(4):10–14.

Ballard, D. H. (1981). Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13(3):111–122.

Birk, A. (2005). The IUB 2004 rescue robot team. In Nardi, D., Riedmiller, M., and Sammut, C., editors, RoboCup 2004: Robot Soccer World Cup VIII, volume 3276 of Lecture Notes in Artificial Intelligence (LNAI). Springer.

Birk, A. and Carpin, S. (2006). Rescue robotics - a crucial milestone on the road to autonomous systems. Advanced Robotics Journal, 20(5):595–695.

Birk, A., Carpin, S., Chonnaparamutt, W., Jucikas, V., Bastani, H., Delchev, I., Krivulev, I., Lee, S., Markov, S., and Pfeil, A. (2006a). The IUB 2005 rescue robot team. In Noda, I., Jacoff, A., Bredenfeld, A., and Takahashi, Y., editors, RoboCup 2005: Robot Soccer World Cup IX, Lecture Notes in Artificial Intelligence (LNAI). Springer.

Birk, A., Carpin, S., and Kenn, H. (2004). The IUB 2003 rescue robot team. In Polani, D., Browning, B., Bonarini, A., and Yoshida, K., editors, RoboCup 2003: Robot Soccer World Cup VII, volume 3020 of Lecture Notes in Artificial Intelligence (LNAI). Springer.

Birk, A., Kenn, H., Rooker, M., Akhil, A., Vlad, B. H., Nina, B., Christoph, B.-S., Vinod, D., Dumitru, E., Ioan, H., Aakash, J., Premvir, J., Benjamin, L., and Ge, L. (2002). The IUB 2002 rescue robot team. In Kaminka, G., Lima, P. U., and Rojas, R., editors, RoboCup-02: Robot Soccer World Cup VI, LNAI. Springer.

Birk, A., Markov, S., Delchev, I., and Pathak, K. (2006b). Autonomous rescue operations on the IUB Rugbot. In IEEE International Workshop on Safety, Security, and Rescue Robotics (SSRR). IEEE Press.

Birk, A., Pathak, K., Schwertfeger, S., and Chonnaparamutt, W. (2006c). The IUB Rugbot: an intelligent, rugged mobile robot for search and rescue operations. In IEEE International Workshop on Safety, Security, and Rescue Robotics (SSRR). IEEE Press.

CSEM (2006). The SwissRanger, Manual V1.02. CSEM SA, 8048 Zurich, Switzerland.

Davids, A. (2002). Urban search and rescue robots: from tragedy to technology. Intelligent Systems, IEEE, 17(2):81–83.

Gennery, D. B. (1999). Traversability analysis and path planning for a planetary rover. Autonomous Robots, 6(2):131–146.

Hardarsson, F. (1997). Locomotion for difficult terrain. Technical Report TRITA-MMK-1998:3, Mechatronics Lab, Dept. of Machine Design.

Hoffman, R. and Krotkov, E. (1991). Perception of rugged terrain for a walking robot: true confessions and new directions. In IEEE/RSJ International Workshop on Intelligent Robots and Systems (IROS), volume 3, pages 1505–1510.

Hoffman, R. and Krotkov, E. (1993). Terrain mapping for outdoor robots: robust perception for walking in the grass. In IEEE International Conference on Robotics and Automation (ICRA), volume 1, pages 529–533.

Howard, A., Seraji, H., and Tunstel, E. (2001). A rule-based fuzzy traversability index for mobile robot navigation. In IEEE International Conference on Robotics and Automation (ICRA), volume 3, pages 3067–3071.

Howard, A., Wolf, D. F., and Sukhatme, G. S. (2004). Towards 3D mapping in large urban environments. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sendai, Japan.

Iagnemma, K., Brooks, C., and Dubowsky, S. (2004). Visual, tactile, and vibration-based terrain analysis for planetary rovers. In IEEE Aerospace Conference, volume 2, pages 841–848.

Iagnemma, K. and Buehler, M. (2006a). Special issue on the DARPA Grand Challenge, part 1. Journal of Field Robotics, 23(8).

Iagnemma, K. and Buehler, M. (2006b). Special issue on the DARPA Grand Challenge, part 2. Journal of Field Robotics, 23(9).

Iagnemma, K., Shibly, H., and Dubowsky, S. (2002). On-line terrain parameter estimation for planetary rovers. In IEEE International Conference on Robotics and Automation (ICRA), volume 3, pages 3142–3147.

Jacoff, A., Messina, E., Weiss, B., Tadokoro, S., and Nakagawa, Y. (2003a). Test arenas and performance metrics for urban search and rescue robots. In Proceedings of the Intelligent and Robotic Systems (IROS) Conference.

Jacoff, A., Weiss, B., and Messina, E. (2003b). Evolution of a performance metric for urban search and rescue. In Performance Metrics for Intelligent Systems (PERMIS), Gaithersburg, MD.

Kitano, H. and Tadokoro, S. (2001). RoboCup Rescue: a grand challenge for multiagent and intelligent systems. AI Magazine, 22(1):39–52.

Konolige, K. and Beymer, D. (2006). SRI Small Vision System, user's manual, software version 4.2. SRI International.

Kurazume, R., Yoneda, K., and Hirose, S. (2002). Feedforward and feedback dynamic trot gait control for quadruped walking vehicle. Autonomous Robots, 12(2):157–172.

Lacroix, S., Mallet, A., Bonnafous, D., Bauzil, G., Fleury, S., Herrb, M., and Chatila, R. (2001). Autonomous rover navigation on unknown terrains: functions and integration. In Experimental Robotics VII, volume 271 of Lecture Notes in Control and Information Sciences, pages 501–510.

Lacroix, S., Mallet, A., Bonnafous, D., Bauzil, G., Fleury, S., Herrb, M., and Chatila, R. (2002). Autonomous rover navigation on unknown terrains: functions and integration. International Journal of Robotics Research, 21(10-11):917–942.

Lange, R. and Seitz, P. (2001). Solid-state time-of-flight range camera. IEEE Journal of Quantum Electronics, 37(3):390–397.

Larson, A., Voyles, R., and Demir, G. (2004). Terrain classification through weakly-structured vehicle/terrain interaction. In IEEE International Conference on Robotics and Automation (ICRA), volume 1, pages 218–224.

Lewis, M. A. and Bekey, G. A. (2002). Gait adaptation in a quadruped robot. Autonomous Robots, 12(3):301–312.

Lozano Albalate, M., Devy, M., Miguel, J., and Marti, S. (2002). Perception planning for an exploration task of a 3D environment. In 16th International Conference on Pattern Recognition, volume 3, pages 704–707.

Murphy, R. R. (2004). Trial by fire. IEEE Robotics and Automation Magazine, 11(3):50–61.

Murphy, R. R., Casper, J., and Micire, M. (2001). Potential tasks and research issues for mobile robots in RoboCup Rescue. In Stone, P., Balch, T., and Kraetszchmar, G., editors, RoboCup-2000: Robot Soccer World Cup IV, volume 2019 of Lecture Notes in Artificial Intelligence (LNAI), pages 339–334. Springer Verlag.

Niwa, Y., Yukita, S., and Hanaizumi, H. (2004). Depthmap-based obstacle avoidance on rough terrain. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), volume 2, pages 1612–1617.

Scholtz, J., Young, J., Drury, J. L., and Yanco, H. A. (2004). Evaluation of human-robot interaction awareness in search and rescue. In Proceedings of the International Conference on Robotics and Automation (ICRA 2004), pages 2327–2332. IEEE Press.

Shah, B. and Choset, H. (2004). Survey on urban search and rescue robots. Journal of the Robotics Society of Japan (JRSJ), 22(5):40–44.

Snyder, R. G. (2001). Robots assist in search and rescue efforts at WTC. IEEE Robotics and Automation Magazine, 8(4):26–28.

Surmann, H., Nuechter, A., and Hertzberg, J. (2003). An autonomous mobile robot with a 3D laser range finder for 3D exploration and digitalization of indoor environments. Robotics and Autonomous Systems, 45(3-4):181–198.

Talukder, A., Manduchi, R., Castano, R., Owens, K., Matthies, L., Castano, A., and Hogg, R. (2002). Autonomous terrain characterisation and modelling for dynamic control of unmanned vehicles. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), volume 1, pages 708–713.

Thrun, S., Burgard, W., and Fox, D. (2000). A real-time algorithm for mobile robot mapping with applications to multi-robot and 3D mapping. In ICRA, pages 321–328.

Thrun, S., Montemerlo, M., Dahlkamp, H., Stavens, D., Aron, A., Diebel, J., Fong, P., Gale, J., Halpenny, M., Hoffmann, G., Lau, K., Oakley, C., Palatucci, M., Pratt, V., Stang, P., Strohband, S., Dupont, C., Jendrossek, L.-E., Koelen, C., Markey, C., Rummel, C., Niekerk, J. v., Jensen, E., Alessandrini, P., Bradski, G., Davies, B., Ettinger, S., Kaehler, A., Nefian, A., and Mahoney, P. (2006). Stanley: The robot that won the DARPA Grand Challenge. Journal of Field Robotics, 23(9):661–692.

Unnikrishnan, R. and Hebert, M. (2003). Robust extraction of multiple structures from non-uniformly sampled data. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), volume 2, pages 1322–1329. IEEE Press.

Urmson, C., Anhalt, J., Bartz, D., Galatali, M. C., Gutierrez, A., Harbaugh, S., Johnston, J., Kato, H., Koon, P. L., Messner, W., Miller, N., Mosher, A., Peterson, K., Ragusa, C., Ray, D., Smith, B. K., Snider, J. M., Spiker, S., Struble, J. C., Ziglar, J., and Whittaker, W. R. L. (2006). A robust approach to high-speed navigation for unrehearsed desert terrain. Journal of Field Robotics, 23(8):467–508.

Videre-Design (2006). Stereo-on-a-chip (STOC) stereo head, user manual 1.1. Videre Design.

Wong, J. Y. (2001). Theory of Ground Vehicles. John Wiley and Sons, Inc., 3rd edition.

Wulf, O., Brenneke, C., and Wagner, B. (2004). Colored 2D maps for robot navigation with 3D sensor data. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), volume 3, pages 2991–2996. IEEE Press.

Wulf, O. and Wagner, B. (2003). Fast 3D-scanning methods for laser measurement systems. In International Conference on Control Systems and Computer Science (CSCS14).

Yoneda, K. and Hirose, S. (1995). Dynamic and static fusion gait of a quadruped walking vehicle on a winding path. Advanced Robotics, 9(2):125–136.