NEUROETHOLOGICAL CONCEPTS AT WORK: INSECT-INSPIRED METHODS FOR VISUAL ROBOT NAVIGATION

Ralf Möller1, Andrew Vardy2, Lorenz Gerstmayr1, Frank Röben1, Sven Kreft1

1 Computer Engineering, Faculty of Technology, Bielefeld University, 33594 Bielefeld, Germany
2 Computer Science / Engineering & Applied Science, Memorial University of Newfoundland, St. John's A1B 3X5, Canada
Abstract
We present an outline of our research in visual robot navigation which is directly inspired by a number of concepts from neuroethology, specifically from the visual navigation abilities of insects. These concepts include UV/visible color contrast mechanisms for illumination invariance, holistic visual homing methods, the application of matched filters in visual homing methods, and the construction of topological rather than metrical maps for navigation, applied, for example, in the trajectory control of cleaning robots.
1 Introduction

Numerous models have been proposed for visual navigation in a variety of animal species (Healy, 1998). Models proposed for smaller, "simpler" invertebrates such as insects are generally more complete and based on firmer evidence than those for larger mammals such as humans. Insects exhibit a broad repertoire of navigational methods, ranging from path integration to visual landmark navigation to possibly even map-like spatial memories (Collett and Collett, 2002; Menzel et al., 2005; Collett and Collett, 2006). Our work rests on the assumption that insects use visual landmarks for guidance along a route or toward a goal, and can accomplish this without a cognitive map, i.e. without a map-like representation giving the positions of landmarks in a global reference frame. It appears that a combination of clever sensors, image recognition, and guidance mechanisms can suffice to allow insects such as bees and ants to navigate successfully over large areas.

Meanwhile, robotics has fully embraced map-like representations. One of the most active areas in mobile robotics research is Simultaneous Localization and Mapping (SLAM). The goal of SLAM is to build a metrical map of the environment; however, to add newly perceived features to the map, the robot must simultaneously be localized within the map. The principal approach to SLAM is probabilistic: uncertain measurements of robot and landmark positions are incorporated into a map that represents not only position, but also positional uncertainty (Thrun et al., 2005). Landmarks may be dense arrays of distance values or features extracted from range or visual sensors (Chen et al., 2007). While SLAM algorithms have been demonstrated to work well in small closed environments, most approaches have a high computational cost which limits their ability to operate in unbounded environments.

There are also major qualitative differences between the characteristics of insect visual systems and the visual systems of typical robots. The eyes of insects such as bees and ants have low resolution (on the order of 2° per ommatidium), whereas robots are typically equipped with cameras of much higher resolution. The success of some of the robot homing algorithms described below suggests that robots, like insects, may be able to do without high-resolution images. The visual systems of insects are also tuned to cues not typically used in robotics, such as the polarization pattern of skylight (an indicator of orientation if the sun is hidden behind clouds) and the putative use of UV-green contrast to distinguish ground from sky.

Thus, there exists a wide gap between the way in which insects sense and model their world, and the way adopted by mainstream robotics. The research program outlined here attempts to bridge this gap by instantiating models of insect navigation as robot control algorithms. In addition, we have proposed new models for visual navigation that are either inspired by biological models or at least plausible as being implemented in the limited neural hardware of an insect. In Sec. 2 we outline our work on peripheral processing producing an illumination-invariant representation of a scene, Sec. 3 presents insect-inspired robot homing methods, and Sec. 4 discusses the use of these methods in cleaning applications.

2 Peripheral processing

The tiny size of insect nervous systems forced natural selection to find parsimonious and specialized solutions for many behavioral competences.
With respect to perceptual tasks, this pressure apparently resulted in a shift from central to peripheral processing (Wehner, 1987). We conjectured that the illumination-invariant detection of landmarks is an instance of such a peripheral solution (Möller, 2002). Models of visual guidance in insects assume that a goal location is characterized by a snapshot image and that the home direction is computed by comparing snapshot and current view (Cartwright and Collett, 1983). Such a comparison is only possible by simple means if the image representation is independent of changes in illumination, caused, for example, by movement of the sun or by changes in cloud cover between the capture of snapshot and current view. This can be accomplished in the sensory periphery by computing a contrast measure between two spectral channels, one in the UV range and the other in the visible or infrared range. Subtracting the signals of two channels with logarithmic response to light intensity appears to be a trick that evolution has discovered to make the signal independent of the overall light intensity; this principle underlies the insect polarized-light compass, which has previously been applied as an additional compass sense for mobile robots (Lambrinos et al., 2000). The same principle could also be at work for landmark detection.

We constructed a hand-held device with several logarithmic spectral channels and collected data from natural objects and sky under varying illumination conditions (Kollmeier et al., 2007). Fig. 1 shows that a simple linear threshold is sufficient to separate landmarks as foreground from blue sky or clouds as background, both for a UV-green contrast (a candidate mechanism for insects) and, even better, for a UV-infrared contrast (a technical solution, not an insect model). After thresholding, terrestrial landmarks appear as dark silhouettes in front of a white sky, independent of the illumination of the landmarks and the conditions of the sky. We are currently constructing a panoramic dual-channel sensor which will exploit this mechanism for visual navigation of mobile robots in outdoor environments.

Figure 1: Sky and clouds (filled circles) and terrestrial objects (open squares) can be linearly separated (solid line) in a UV-green (left) or UV-infrared (right) contrast measure (Kollmeier et al., 2007).
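To make the mechanism concrete, the following sketch (our illustration, not the published sensor implementation; the array names and the threshold value are assumptions) computes the log-channel contrast and applies a linear threshold to segment landmarks from sky:

```python
import numpy as np

def segment_landmarks(uv, green, threshold=0.0):
    """Illumination-invariant landmark/sky segmentation via UV-green contrast.

    uv, green: 2D arrays of raw (positive) channel intensities of the same scene.
    threshold: position of the linear separation in log-contrast space; in
               Kollmeier et al. (2007) it is derived from calibration data,
               the default here is only a placeholder.
    Returns a boolean mask, True where a terrestrial landmark is detected.
    """
    # Logarithmic channel responses; their difference cancels the common
    # (multiplicative) illumination factor, so the contrast mainly reflects
    # the spectral properties of the scene point, not the light level.
    contrast = np.log(uv + 1e-6) - np.log(green + 1e-6)

    # Sky and clouds have a comparatively high UV content, terrestrial
    # objects a low one, so a single linear threshold separates them.
    return contrast < threshold
```

The same scheme applies to the UV-infrared contrast by substituting an infrared channel for the green one.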
3 Local visual homing methods

Local visual homing is the ability of an agent to return to a nearby goal location by comparing the currently perceived image of the landmark panorama to a snapshot image stored at the goal, and by deriving a home vector that points from the current position towards the goal. Following the home vector makes the current view gradually more similar to the snapshot, thus guiding the agent to the goal. How insects accomplish this task has not yet been fully resolved, but a number of computational models have been developed which are parsimonious enough to be considered candidates for insect visual homing. In previous work we showed that correspondence methods (e.g. block matching or differential optical flow) produce surprisingly precise home vectors, even though their main assumption, small feature shifts, is usually violated in visual homing (Vardy and Möller, 2005). Especially attractive, both as models of insect navigation and as efficient robot homing methods, is the class of holistic homing models, in which the home vector is not computed by establishing correspondences between local features in snapshot and current view, but by analyzing the appearance of the entire images. For example, Zeil et al. (2003) observed that, in natural environments, a pixel-wise image distance function between compass-aligned pairs of images varies smoothly and monotonically with increasing spatial distance between their vantage points.
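A minimal sketch of such an image distance function, and of the visual compass mentioned later in this section, is given below (our formulation: a root-mean-square pixel difference, and a brute-force search over column shifts for the compass):

```python
import numpy as np

def image_distance(current, snapshot):
    """Pixel-wise RMS distance between two compass-aligned panoramic images
    (2D arrays of equal size). In natural scenes this distance grows smoothly
    with the spatial distance between the two vantage points."""
    diff = current.astype(float) - snapshot.astype(float)
    return np.sqrt(np.mean(diff ** 2))

def visual_compass(current, snapshot):
    """Estimate the azimuthal offset between two panoramic images by testing
    all column-wise rotations of the current view and returning the shift
    (in columns) that minimizes the image distance to the snapshot."""
    shifts = range(current.shape[1])
    distances = [image_distance(np.roll(current, s, axis=1), snapshot)
                 for s in shifts]
    return int(np.argmin(distances))
```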
This can be exploited for visual homing: following the spatial gradient of the image distance function leads the agent to the goal (descent in image distances, DID). The corresponding neural mechanism is extremely simple and fits well with what is known about the architecture of insect visual brains. An impractical requirement of the DID method, namely the necessity to sample the image distance at three different points in space for the computation of the gradient, could be overcome by our matched-filter descent in image distances model (MFDID; Möller and Vardy, 2006). MFDID uses the current view and two matched filters (flow fields corresponding to two small perpendicular translatory movements) to predict two additional images, and computes the image distances between the snapshot and each of the three current-view images (true and predicted) to estimate the gradient. Matched filters are a neuroethological concept describing arrangements of sensors or processing stages which are tuned to a specific task (Wehner, 1987). An extension of MFDID uses the Newton method to lessen the influence of anisotropies in the landmark distribution on the precision of the home vector (Möller et al., 2007).
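The sketch below illustrates the MFDID idea under strong simplifications: the matched filters are reduced to a purely azimuthal shift of the image columns, all landmarks are assumed to lie at the same distance, and the step size is a free parameter. It is not the published implementation, only a schematic of the gradient estimate:

```python
import numpy as np

def rms_distance(a, b):
    """Pixel-wise RMS image distance (as in the DID sketch above)."""
    return np.sqrt(np.mean((a.astype(float) - b.astype(float)) ** 2))

def predicted_view(view, direction, gain=2.0):
    """Predict the panoramic view after a small translation in the given
    direction (azimuth in radians). Simplified matched filter: only the
    azimuthal component of the translational flow template is applied,
    with an equal-distance assumption for all landmarks; 'gain' is the
    maximum column shift in pixels (a placeholder parameter)."""
    h, w = view.shape
    azimuths = np.linspace(0.0, 2.0 * np.pi, w, endpoint=False)
    shift = gain * np.sin(azimuths - direction)           # flow in columns
    columns = (np.arange(w) - np.round(shift)).astype(int) % w
    return view[:, columns]                               # inverse warp

def mfdid_home_vector(current, snapshot):
    """Estimate the home vector by matched-filter descent in image distances:
    compare the snapshot with the true current view and with two views
    predicted for small translations along two perpendicular directions,
    and return the negative finite-difference gradient of the distance."""
    d0 = rms_distance(current, snapshot)
    dx = rms_distance(predicted_view(current, 0.0), snapshot)
    dy = rms_distance(predicted_view(current, np.pi / 2.0), snapshot)
    return -np.array([dx - d0, dy - d0])   # points roughly towards the goal
```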
Fig. 2 (left) visualizes the distance measure of DID and the home vectors obtained from Newton-MFDID for a database of images collected in an unmodified apartment room (grid spacing: 10 cm). The method produces good home directions even if snapshot and current view are very dissimilar and strongly lowpass-filtered (Fig. 2, right). The azimuthal alignment of panoramic snapshot and current view, a prerequisite of DID and MFDID, can also be obtained by a visual compass within the DID framework (Zeil et al., 2003; Möller et al., 2007).

Figure 2: Local visual homing. Grey values: image distance between current view and snapshot (square). Arrows: home vectors from Newton-MFDID. Panoramic images: snapshot (square) and a current view (circle), each shown as the original database image and as the lowpass-filtered image used for homing.

4 Long-range visual homing and maps: Application for Cleaning Robots

Since guidance strategies characterize places by the configuration of the surrounding landmarks, these landmarks need to be visible in order to return to the home position successfully. Each snapshot position is therefore surrounded by a catchment area: only from positions within this area is successful homing to the snapshot position possible. To achieve long-range navigation, several snapshots need to be stored; local visual homing methods can then be used to navigate from place to place. By storing several snapshots and their interrelations, a topological map is obtained. Topological maps are a sparse representation of the environment in which places are represented as vertices annotated with sensory information (here a snapshot), and edges link places that can be reached directly from each other (Filliat and Meyer, 2003); see Fig. 3.

To test our homing methods in a real-world application that requires long-range navigation, we started to work in the field of autonomous robot cleaning. Here, one of the main goals is to cover the whole accessible area completely while keeping the proportion of repeated coverage as small as possible. Most research prototypes and commercially available floor-cleaning robots for household use rely on random search strategies (Prassler et al., 2000, 2003). With random strategies, it takes a long time to clean the whole accessible area, and the resulting overlap is considerable. Optimal cleaning strategies are closely related to complete coverage planning (Fiorini and Prassler, 2000; Choset, 2001), usually based on meandering or spiraling search patterns. However, most of these algorithms require the footprint of the room to be known. On-line coverage algorithms exist (e.g. Gabriely and Rimon, 2001; Hazon et al., 2006), but we are not aware of a vision-based approach like ours.

We developed a mainly vision-based strategy which enables the robot to clean an area systematically. Fig. 3 illustrates the method; it relies on the assumption that the robot is able to drive along an initial straight lane, for example by taking the bearing of a visual feature. While moving forward, the robot successively stores snapshots in the topological map. When it encounters an obstacle, it turns by 90°, moves forward for a small distance, and turns by 90° again, thereby starting a new lane parallel to the previous one. Now, while moving forward, it not only takes new snapshots, but also computes home vectors (Fig. 3: arrows) by comparing these images to snapshots taken at neighboring positions on the previous lane. The home vectors are computed using Newton-MFDID (Möller et al., 2007).
Based on two home vectors and the known distance between the snapshots along the previous lane (together with a visual compass), the current distance from the previous lane (Fig. 3: line) can be estimated by triangulation. This estimate is passed to a controller that keeps the robot on a course nearly parallel to the previous lane. The topological map constructed during the cleaning run (Fig. 3: circles and thin lines) records which areas have been visited so far and can be used for the return to the charging station. The method was tested in simulations and shows good performance (Fig. 3: thick line); although deviations from the desired lanes are visible, even after ten or more consecutive lanes we never observed the robot crossing a previous one. Encouraged by these promising results, we are currently running robot experiments.
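A minimal geometric sketch of this triangulation is shown below. It assumes the home-vector bearings are already expressed in a compass frame whose x-axis runs along the previous lane; the function name and coordinate convention are ours:

```python
import numpy as np

def lane_distance(bearing1, bearing2, baseline):
    """Estimate the perpendicular distance of the robot from the previous lane.

    bearing1, bearing2: directions (radians) of the home vectors from the robot
        towards two snapshot positions on the previous lane, in a compass frame
        whose x-axis is parallel to that lane.
    baseline: travelled distance between the two snapshot positions.
    """
    # Put the previous lane on the x-axis: snapshot 1 at (0, 0),
    # snapshot 2 at (baseline, 0). The robot lies at the intersection of the
    # two rays that start at the snapshots and point back along the home vectors.
    d1 = np.array([np.cos(bearing1 + np.pi), np.sin(bearing1 + np.pi)])
    d2 = np.array([np.cos(bearing2 + np.pi), np.sin(bearing2 + np.pi)])
    s1 = np.array([0.0, 0.0])
    s2 = np.array([baseline, 0.0])

    # Solve s1 + t1*d1 = s2 + t2*d2 for the ray parameters (2x2 linear system).
    t1, _ = np.linalg.solve(np.column_stack((d1, -d2)), s2 - s1)
    robot = s1 + t1 * d1

    # The previous lane is the x-axis, so its distance is simply |y|.
    return abs(robot[1])
```

For example, with snapshots 0.5 m apart and home-vector bearings of about 236° and 315°, the estimated lane distance is 0.3 m.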
Figure 3: Long-range visual homing and maps. Thick line: robot trajectory (real simulation results, see text). Circles: vertices of the topological map, annotated with a snapshot image; thin lines: edges of the topological map; arrows: home vectors to the previous lane used for lane distance control (illustration only).

References

Cartwright BA, Collett TS: Landmark learning in bees. Journal of Comparative Physiology A 151 (4), 521–543, 1983.
Chen Z, Samarabandu J, Rodrigo R: Recent advances in simultaneous localization and map-building using computer vision. Advanced Robotics 21 (3), 233–265, 2007.
Choset H: Coverage for robotics — a survey of recent results. Annals of Mathematics and Artificial Intelligence 31 (1–4), 113–126, 2001.
Collett M, Collett T: Insect navigation: No map at the end of the trail? Current Biology 16 (2), R48–R51, 2006.
Collett T, Collett M: Memory use in insect visual navigation. Nature Reviews Neuroscience 3, 542–552, 2002.
Filliat D, Meyer J: Map-based navigation in mobile robots: part I. Cognitive Systems Research 4 (4), 243–282, 2003.
Fiorini P, Prassler E: Cleaning and household robots: a technology survey. Autonomous Robots 9 (3), 227–235, 2000.
Gabriely Y, Rimon E: Spanning-tree based coverage of continuous areas by a mobile robot. Annals of Mathematics and Artificial Intelligence 31 (1–4), 77–98, 2001.
Hazon N, Mieli F, Kaminka G: Towards robust on-line multi-robot coverage. In: Proceedings of the ICRA 2006, pp. 1710–1715, 2006.
Healy S: Spatial Representation in Animals. Oxford University Press, 1998.
Kollmeier T, Röben F, Schenck W, Möller R: Spectral contrasts for landmark navigation. Journal of the Optical Society of America A 24 (1), 1–10, 2007.
Lambrinos D, Möller R, Labhart T, Pfeifer R, Wehner R: A mobile robot employing insect strategies for navigation. Robotics and Autonomous Systems, special issue: Biomimetic Robots 30 (1–2), 39–64, 2000.
Menzel R, et al.: Honey bees navigate according to a map-like spatial memory. Proceedings of the National Academy of Sciences of the United States of America 102 (8), 3040–3045, 2005.
Möller R: Insects could exploit UV-green contrast for landmark navigation. Journal of Theoretical Biology 214 (4), 619–631, 2002.
Möller R, Vardy A: Local visual homing by matched-filter descent in image distances. Biological Cybernetics 95 (5), 413–430, 2006.
Möller R, Vardy A, Kreft S, Ruwisch S: Visual homing in environments with anisotropic landmark distribution. Autonomous Robots 23 (3), 231–245, 2007.
Prassler E, Hägele M, Siegwart R: International contest for cleaning robots: fun event or a first step towards benchmarking service robots. In: Proceedings of the International Conference on Field and Service Robotics, pp. 447–456, 2003.
Prassler E, Ritter A, Schaeffer C, Fiorini P: A short history of cleaning robots. Autonomous Robots 9 (3), 211–226, 2000.
Thrun S, Burgard W, Fox D: Probabilistic Robotics. MIT Press, 2005.
Vardy A, Möller R: Biologically plausible visual homing methods based on optical flow techniques. Connection Science 17 (1–2), 47–89, 2005.
Wehner R: 'Matched filters' — neural models of the external world. Journal of Comparative Physiology A 161, 511–531, 1987.
Zeil J, Hoffmann MI, Chahl JS: Catchment areas of panoramic images in outdoor scenes. Journal of the Optical Society of America A 20 (3), 450–469, 2003.